Embedding mRuby: retrieving mrb_parser_message after parse error

I'm trying to embed mRuby in a Max MSP object. One of the first things I want to set up is error logging in the Max IDE console window. To that end, after I parse the code (stored in a C string) with mrb_parse_string, I expect errors to be available in the parser's error_buffer array, but the structures in this array are always empty (lineno and column set to 0 and message set to NULL) even when there is an error.
Is there a special way to set up the parser before parsing the code so that it fills its error_buffer array properly when an error occurs? I've looked into the mirb source, but it doesn't look like it. I'm lost. Here is the code I'm using, taken from a small C program I'm using as a test:
mrb_state *mrb;
char *code;
struct mrb_parser_state *parser;
int i;
parser = mrb_parse_string(mrb, code, mrbc_context_new(mrb));
if (parser->nerr > 0) {
    for (i = 0; i < parser->nerr; i++) {
        printf("line %d:%d: %s\n", parser->error_buffer[i].lineno,
               parser->error_buffer[i].column,
               parser->error_buffer[i].message);
    }
    return -1;
}
When passed the following faulty Ruby code:
[1,1,1]]
the previous code outputs:
line 1:8: syntax error, unexpected ']', expecting $end
line 0:0: (null)
I don't know where the first line comes from, since I compiled mRuby with MRB_DISABLE_STDIO defined, as line 14 and following in mrbconf.md suggest, but it is accurate.
The second line is the actual output from my code and shows that the error_buffer of the returned mrb_parser_state structure is empty, which is surprising since the parser did see an error.

Sorry, I totally misunderstood your question.
So you want to:
1. Capture the script's syntax errors instead of printing them.
2. Make MRB_DISABLE_STDIO work.
For the 1st issue,
struct mrb_parser_state *parser;
parser = mrb_parse_string(mrb, code, mrbc_context_new(mrb));
should be replaced with:
struct mrbc_context *cxt;
struct mrb_parser_state *parser;
cxt = mrbc_context_new(mrb);
cxt->capture_errors = TRUE;
parser = mrb_parse_string(mrb, code, cxt);
as mirb does.
For the 2nd issue, I don't know your build_config.rb, so I can't say much about it.
Some notes to make things accurate:
MRB_DISABLE_STDIO is a compile flag for building mruby, so you need to pass it in build_config.rb like:
cc.defines << %w(MRB_DISABLE_STDIO)
(see build_config_ArduinoDue.rb)
line 1:8: syntax error, unexpected ']', expecting $end
is the parse error reported by the mruby parser ([1,1,1]] must be [1,1,1]).
And 1:8 means the 8th column of the 1st line (which points to the unnecessary ]), so it seems to me that your C code is working correctly.
(For reference, here is the syntax error CRuby reports for your code:
https://wandbox.org/permlink/KRIlW2956TnS6puD )
prog.rb:1: syntax error, unexpected ']', expecting end-of-input
[1,1,1]]
^

Related

strange errors when using parsertree

I am playing around with the parsing of code using the java15 syntax.
I noticed that when parsing an entire class, I get an error if the class file ends with an empty line. I wrote some code to remove these empty lines before parsing, but is there a more structural solution? Or am I missing something?
Related: when I try to parse a single method, as soon as I change the placement of the braces { } (on a separate line or not, for example), I receive an error.
|std:///ParseTree.rsc|(14967,5,<455,80>,<455,85>): ParseError(|java+class:///smallsql/database/language/Language_it|(10537,0,<152,0>,<152,0>))
at parse(|std:///ParseTree.rsc|(14967,5,<455,80>,<455,85>))
at $root$(|prompt:///|(0,7,<1,0>,<1,7>))
at *** somewhere ***(|std:///ParseTree.rsc|(14967,5,<455,80>,<455,85>))
at parse(|std:///ParseTree.rsc|(14967,5,<455,80>,<455,85>))
at $root$(|prompt:///|(0,7,<1,0>,<1,7>))
I'm guessing you are parsing using code like so:
parse(#CompilationUnit, file)
[CompilationUnit] file
But a CompilationUnit does not start or end with layout, so leading or trailing whitespace (such as a final empty line) is rejected.
To solve this, you should declare the start non-terminal of a grammar like so:
start syntax CompilationUnit = ... ;
layout L = [\t\n\r\ ]*; // it's necessary to have L be nullable
and this will (internally and hidden) generate for you a new production, like so:
syntax start[CompilationUnit] = L CompilationUnit top L;
With this, you can then parse a whole file which might end with layout:
parse(#start[CompilationUnit], file)
[start[CompilationUnit]] file
To extract the CompilationUnit, the top field comes in handy:
CompilationUnit u = parse(#start[CompilationUnit], file).top

How to remove non-ascii char from MQ messages with ESQL

CONCLUSION:
For some reason the flow wouldn't let me convert the incoming message to a BLOB by changing the Message Domain property of the Input Node, so I added a Reset Content Descriptor node before a Compute Node containing the code from the accepted answer. The line that parses the XML and creates the XMLNSC child for the message gave me a 'CHARACTER:Invalid wire format received' error, so I took that line out and added another Reset Content Descriptor node after the Compute Node instead. Now the flow parses the message and replaces the offending characters with spaces, so it no longer crashes.
Here is the code for the added Compute Node:
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
DECLARE NonPrintable BLOB X'0001020304050607080B0C0E0F101112131415161718191A1B1C1D1E1F7F808182838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDFE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF1F2F3F4F5F6F7F8F9FAFBFCFDFEFF';
DECLARE Printable BLOB X'20202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020';
DECLARE Fixed BLOB TRANSLATE(InputRoot.BLOB.BLOB, NonPrintable, Printable);
SET OutputRoot = InputRoot;
SET OutputRoot.BLOB.BLOB = Fixed;
RETURN TRUE;
END;
UPDATE:
The message is being parsed as XML using XMLNSC. I thought that would cause a problem, but it does not appear to.
Now I'm using PHP. I've created a node to plug into the legacy flow. Here's the relevant code:
class fixIncompetence {
    function evaluate ($output_assembly,$input_assembly) {
        $output_assembly->MRM = $input_assembly->MRM;
        $output_assembly->MQMD = $input_assembly->MQMD;
        $tmp = htmlentities($input_assembly->MRM->VALUE_TO_FIX, ENT_HTML5|ENT_SUBSTITUTE,'UTF-8');
        if (!empty($tmp)) {
            $output_assembly->MRM->VALUE_TO_FIX = $tmp;
        }
        // Ensure there are no null MRM fields. MessageBroker is strict.
        foreach ($output_assembly->MRM as $key => $val) {
            if (empty($val)) {
                $output_assembly->MRM->$key = '';
            }
        }
    }
}
Right now I'm getting a vague error about read-only messages, but it wasn't working before that either.
Original Question:
For some reason I am unable to impress upon the senders of our MQ
messages that smart quotes, en dashes, em dashes, and the like crash our XML
parser.
I managed to make a working solution with SQL queries, but it wasted
too many resources. Here's the last thing I tried, but it didn't work
either:
CREATE FUNCTION CLEAN(IN STR CHAR) RETURNS CHAR BEGIN
SET STR = REPLACE('–',STR,'&ndash;');
SET STR = REPLACE('—',STR,'&mdash;');
SET STR = REPLACE('·',STR,'&middot;');
SET STR = REPLACE('“',STR,'&ldquo;');
SET STR = REPLACE('”',STR,'&rdquo;');
SET STR = REPLACE('‘',STR,'&lsqo;');
SET STR = REPLACE('’',STR,'&rsquo;');
SET STR = REPLACE('•',STR,'&bull;');
SET STR = REPLACE('°',STR,'&deg;');
RETURN STR;
END;
As you can see I'm not very good at this. I have tried reading about
various ESQL string functions without much success.
In ESQL you can use the TRANSLATE function.
The following is a snippet I use to clean up a BLOB containing non-ASCII low hex values so that it can then be cast into a usable character string.
You should be able to modify it to change your undesired characters into something more benign. Basically, each hex value in NonPrintable gets translated into its positional equivalent in Printable, in this case always a full stop, i.e. x'2E' in ASCII. You'll need to make your BLOBs long enough to cover the desired range of hex values.
DECLARE NonPrintable BLOB X'000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F';
DECLARE Printable BLOB X'2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E';
SET WorkBlob = TRANSLATE(WorkBlob, NonPrintable, Printable);
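For readers more familiar with Python, the positional mapping that TRANSLATE performs can be illustrated with bytes.translate. This is only an analogy, not ESQL; the byte ranges and the clean helper below are assumptions chosen for the example.

```python
# Positional byte translation, analogous in spirit to the ESQL TRANSLATE call.
# Assumption: control bytes (0x00-0x1F) and everything from 0x7F up are unwanted.
non_printable = bytes(range(0x00, 0x20)) + bytes(range(0x7F, 0x100))
printable = b"." * len(non_printable)  # both tables must have the same length

# Byte i of non_printable maps to byte i of printable.
table = bytes.maketrans(non_printable, printable)


def clean(payload: bytes) -> bytes:
    """Replace every byte listed in non_printable with its positional counterpart."""
    return payload.translate(table)


sample = b"caf\xe9 \x93quoted\x94\x00"
print(clean(sample))  # b'caf. .quoted..'
```

As in the ESQL snippet, bytes that do not appear in the first table pass through unchanged.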
BTW, if messages with invalid characters only come in every now and then, I'd probably specify BLOB on the input node and then use something similar to the following to invoke the XMLNSC parser.
CREATE LASTCHILD OF OutputRoot DOMAIN 'XMLNSC'
PARSE(InputRoot.BLOB.BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);
With the exception terminal wired up, you can then correct the BLOBs of any messages containing parser-breaking invalid characters before attempting to reparse.
Finally, my best wishes: I've had a number of battles over the years with being forced to correct invalid message content in the "Integration Layer"; after all, that's what it's meant to do.

What do I need to add to use monadUserState with alex when parsing?

I am trying to write a program that will understand a language where embedded comments are allowed. Such as:
/* Here's a comment
/* This comment is further embedded */ second comment is closed
Must close first comment */
This should be recognized as a single comment (and as such the lexer should not stop at the first */ it sees unless it has seen only one comment opening before it).
This would be easy to handle in C: I could simply keep a counter that is incremented when a comment opens and decremented when a comment closes. If the counter is at 0, we're in the "code section".
However, without mutable state in Haskell, it's a little more challenging.
I've read up on monadUserState, which supposedly lets you keep track of state for exactly this type of parsing. However, I can't find much reading material on it aside from the tutorial page on alex.
When I try to compile it gives the error
templates\wrappers.hs:213:16: Not in scope: `alexEOF`
It should be noted that I directly changed from the "basic" wrapper to the "monadUserState" without changing my code (I don't know what to add in order to use it). It says that this must be initialized in the user code:
data AlexState = AlexState {
    alex_pos :: !AlexPosn,    -- position at current input location
    alex_inp :: String,       -- the current input
    alex_chr :: !Char,        -- the character before the input
    alex_bytes :: [Byte],     -- rest of the bytes for the current char
    alex_scd :: !Int,         -- the current startcode
    alex_ust :: AlexUserState -- AlexUserState will be defined in the user program
}
I'm a bit of a lexing noob and I'm not at all sure what I should be adding here to make it at least compile... then I can worry about the logic of the thing.
Update: Working example available here: http://lpaste.net/119212
The file "tiger.x" in the alex GitHub repo contains an example of how to track embedded comments using the monadUserState wrapper.
Unfortunately, that example doesn't compile, but the ideas there should work.
Basically, these lines perform embedded comment processing:
<0> "/*" { enterNewComment `andBegin` state_comment }
<state_comment> "/*" { embedComment }
<state_comment> "*/" { unembedComment }
<state_comment> . ;
<state_comment> \n { skip }
As for alexEOF, the idea is to add an EOF token to your token data type:
data Tokens = ... | EOF
and define alexEOF as:
alexEOF = return EOF
See the file tests/tokens_monadUserState_bytestring.x in the alex repo for an example of this.
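As a language-neutral illustration of the depth counter described in the question, here is a minimal Python sketch (not alex code). The function name strip_nested_comments is hypothetical, and the sketch assumes comments are delimited only by /* and */. In an alex lexer the same counter would presumably be kept in AlexUserState rather than in a local variable.

```python
def strip_nested_comments(source: str) -> str:
    """Return source with /* ... */ comments removed, honouring nesting."""
    depth = 0   # how many comment openers are currently unclosed
    out = []
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "/*":
            depth += 1          # entering a (possibly nested) comment
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1          # leaving one level of comment
            i += 2
        else:
            if depth == 0:      # depth 0 means we are in the code section
                out.append(source[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated comment")
    return "".join(out)


print(strip_nested_comments("a /* x /* y */ z */ b"))
```

Only text seen while the counter is at 0 is kept, which is the behaviour the question asks for.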

Python parsing error message functions

The code below was created by me with the help of many SO veterans:
The code takes an entered math expression and splits it into operators and operands for later use. I have created two functions: the parsing function that splits, and the error function. I am having problems with the error function because it won't display my error messages, and I feel the function is being ignored when the code runs. An error should print if an expression such as 3//3+4 is entered, where there are two operators together, or where there are more than two operators in the expression overall, but the error messages don't print. My code is below:
def errors():
    numExtrapolation,opExtrapolation=parse(expression)
    if (len(numExtrapolation) == 3) and (len(opExtrapolation) !=2):
        print("Bad1")
    if (len(numExtrapolation) ==2) and (len(opExtrapolation) !=1):
        print("Bad2")
def parse(expression):
    operators= set("*/+-")
    opExtrapolate= []
    numExtrapolate= []
    buff=[]
    for i in expression:
        if i in operators:
            numExtrapolate.append(''.join(buff))
            buff= []
            opExtrapolate.append(i)
            opExtrapolation=opExtrapolate
        else:
            buff.append(i)
    numExtrapolate.append(''.join(buff))
    numExtrapolation=numExtrapolate
    #just some debugging print statements
    print(numExtrapolation)
    print("z:", len(opExtrapolation))
    return numExtrapolation, opExtrapolation
    errors()
Any help would be appreciated. Please don't introduce new code that is any more advanced than the code already here. I am looking for a solution to my problem... not large new code segments. Thanks.
The call to errors() appears inside the body of parse(), after the return statement, so it never runs. Hopefully that is a typo.
For this particular input, an empty string is appended to numExtrapolate because there is no operand between / and /. That makes its length 4, so neither of your checks matches (they only test for lengths 3 and 2). So guard the append with a check like this:
if buff:
    numExtrapolate.append(''.join(buff))
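Putting the guard in place, a minimal sketch of the corrected code might look like this. The shortened variable names, the expression parameter on errors(), and the single "one operator fewer than operands" check are assumptions for illustration, not the asker's original logic.

```python
def parse(expression):
    operators = set("*/+-")
    ops = []
    nums = []
    buff = []
    for ch in expression:
        if ch in operators:
            if buff:  # skip the empty operand between adjacent operators
                nums.append(''.join(buff))
            buff = []
            ops.append(ch)
        else:
            buff.append(ch)
    if buff:  # same guard for the final operand
        nums.append(''.join(buff))
    return nums, ops


def errors(expression):
    nums, ops = parse(expression)
    # Assumption: a well-formed expression has one operator fewer than operands.
    if len(ops) != len(nums) - 1:
        print("Bad expression:", expression)
        return True
    return False


errors("3//3+4")  # called at module level, outside parse()
```

Here errors() is called at the top level, so it actually runs, and 3//3+4 is reported because it yields three operands but three operators.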

" '}' expected near '=' " error appearing in a line that otherwise appears perfect

When I attempt to run my script, I get an error returning on a variable assignment. I've re-checked my syntax many times and it doesn't seem to be a mistake I made there--I even had somebody else look at it just in case. However, the error that returns continuously points me to the syntax, and I can't seem to find a solution to this problem.
Here is the whole troublesome function:
function registerquestlines()
    if player["testline"] == nil then
        player["testline"] = {"prog" = {true,false,false}, "quests" = {"testline1", "testline2", "testline3"}, "prog#" = 1}
    end
end
Again, the error I get is: '}' expected near '=' on the line in which I assign values to player["testline"].
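A table initializer key is either an unquoted name or a bracketed expression, not a bare quoted string. Because prog# is not a valid Lua name, that key needs the bracketed form, ["prog#"] = 1. For example: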
{prog = {true,false,false}}
{["prog"] = {true,false,false}}
