unexpected character while parsing XML from batch script - xml-parsing

I am getting this error when I am trying to parse XML from batch Script
error :
< was unexpected at this time.
xml:
<driver type=".dbdriver">
<attributes>localhost;1521;XE;false</attributes>
<driverType>Oracle thin</driverType>
</driver>
<password>7ECE6B7E7D2AF514C55BAE8B3A6B51E7</password>
<user>JR</user>
batch scrpit:
for /f "tokens=3 delims=><" %%j in ('type %SETTINGSPATH% ^| find "<user>"') do set user=%%j
This code is supposed to read user value from XML which is just "JR" and on some machines I am getting this values; but some machines are not showing this value and showing this error.
Please guide.

Parsing XML with batch is often problematic and always risky. A valid XML documented could be legitamately reformatted in any number of ways that would break you parser. But if you really want to continue to use batch...
That error message occurs when you have an unescaped and unquoted < character in your IN() clause. The "<user>" is already quoted, so that normally should not be a problem. The problem must stem from the value contained in %SETTINGSPATH%. Either the value must have an unquoted and unescaped <, or there must be an odd number of quotes in the value. The odd number of quotes would cause the <user> to no longer be quoted.
The only other possibility is that you have not shown us all your code, and the error is occuring someplace else.

This will never work reliably. The reason for this is that you are trying to process Xml using wrong tools. There is an infinite number of textual representations of an Xml document that have the same semantical meaning. As a result a space here or a new line there will not change the semantics of your document but will break your script even though all the tools that process the input as Xml will continue to work correctly. Use PowerShell or vbscript/jscript where you can use Xml capabilities or you will always have problems like this since you should not use a brush to drive screws.

Related

ANTLR : Generate back source file from AST

I have modified the PLSQL parser given by [Porcelli] (https://github.com/porcelli/plsql-parser). I am using this parser to parse PlSql files. After successful parsing, I am printing the AST. Now, I want to edit the AST and print back the original plsql source with edited information. How can I achieve this? How can I get back source file from AST with comments, newline and whitespace. Also, formatting should also be remain as original file.
Any lead towards this would be helpful.
Each node in an AST comes with an index member which gives you the token position in the input stream (token stream actually). When you examine the indexes in your AST you will see that not all indexes appear there (there are holes in the occuring indexes). These are the positions that have been filtered out (usually the whitespaces and comments).
Your input stream however is able to give you a token at a given index and, important, to give you every found token, regardless of the channel it is in. So, your strategy could be to iterate over the tokens from your token stream and print them out as they come along. Additionally, you can inspect your AST for the current index and see if instead a different output must be generated or additional output must be appended.
The simple answer is "walk the tree, and spit out text that corresponds to the nodes".
ANTLR offers "StringTemplates" as a basic kind of help, but in fact there's a lot of
fine detail that needs to be addressed: indentation, literals and their formats, comments,...
See my SO answer on Compiling an AST back to source code for a lot more detail.
One thing not addressed there is the general need to reproduce the original character encoding of the file (if you can, sometimes you can't, e.g., you had an ASCII file but inserted a string containing a Unicode character).

Is there a list of Special characters to be avoided in JCL/MVS Script variables

I have a program that generates random pin codes. These pins are generated in Java, then stored in the mainframe, via a NATURAL program. From there, they are eventually physically printed by a batch JCL job that calls an MVS SCRIPT to print the form, with the pin code on it.
I've come across some issues with special characters before such as: |{}![]^~<>; that for one reason or another do not print properly. I've also removed 0OQ1l for OCR reasons.
Recently, an error came to my attention with another character that doesn't print properly, . but the . character only fails when it is the first character of the PIN Code.
So since I've run into this issue I thought I would see if I could find other special jcl, Natural or MVS Script characters that might interfere with my programs operation so that I can test them now and hopefully not run into this issue later or have to fallback to only using OCR'ed AlphaNumeric characters.
EDIT
Java - Web Application Runs Under Tomcat 6.x on a Solaris Server.
Natural - The Natural Program is called using webmethods Broker generated classes (POJOs).
My understanding is it uses RPC for actual communication.
The program verifies some data and stores the Pin in combination with a GUID on a record, in ADABAS.
There is a batch job that runs to print the forms. The Batch job is written in JCL.
My Understanding from the maintainer of the Batch Job, and the Forms stuff is the actual language to describe the forms themselves and how they get printed is an outdated/unsupported language called MVS SCRIPT.
the Bottom section of the script looks like this:
//**********************************************************************
//* PRINT SORTED FORMS TO #### USING MVS SCRIPT
//**********************************************************************
PRINTALL EXEC PGM=DSMSPEXEC,PARM='LIST'
//* less 'interesting' lines omitted
SYSPRINT DD SYSOUT=*
PRINT1 DD SYSOUT=A, OUTPUT=*.C####,
RECFM=VBM,LRECL=####,BLKSIZE=####
//* less 'interesting' lines omitted
//SYSIN DD *
AUTH /* redacted */
SCRIPT FROM(MYFORMS) (MESSAGE(ID TRACE) CONT -
FILE(PRINT1) PROFILE(redacted) -
NOSEGLIB DEVICE(PG4A) CHARS(X0A055BC))
.C#### is an actual number and is a variable that points to the chosen printer.
NOTE: I'm a Web Programmer, I don't speak mainframe, JCL, MVS, etc.
I think you will find the program (pgm=) is DSMSPEXC and not DSMSPEXEC.
I am guessing (could be wrong) we are talking about Script/DCF (which later became IBM Bookmaster / Bookmanager on other platforms).
Script/DCF is basically a GML based language. It was from GML that SGML was derived (HTML and XML are prominent examples of SGML languages).
In Script : starts a tag, . ends a tag. There are also macro's which have a . in column 1
.* ".*" in column 1 starts a line comment
.* .fo off is Format off (like <pre> in html)
.fo off
.* Starting an ordered list
:ol.
:li.Item in orded list
:eol.
i.e.
Script HTML
: < - Starts tag
. > - end of tag Script/DCF is generally pretty tolerant of .
& & - Starts a variable
There are variables (&gml. = :) for most of the special characters.
Characters to worry about are
: - always
& - always
. - in column one or after a :.
Other characters should be ok provided there are no translation errors. There charset X0A055BC (Mainframe SONORAN SANS SERIF ??) might not have all the special chars in it.
There are manuals around for Script/DCF tags.
Your data is not going to affect the JCL in any way.
I don't know about ADABAS or NATURAL. If you ask here, http://www.ibmmainframeforum.com/viewforum.php?f=25, specifically about that part, with as much detail as you can provide, there is a very expert guy, RDZbrog, who can probably answer that for you.
For the SCRIPT/VS itself, as Bruce Martin has pointed out, there may be some issues. With .xx and :xx there is not a clash with normal text. But you don't have normal text. With the &, which indicates a SCRIPT variable, it is more likely to be problematic and at any location.
I would fire some test data through: Your PINs with position one being all available punctuation preceding "fo" and "ol", and the same with those sequences "embedded" in your PINs. Also include a double & and a triple &.
Your query should be resolved in your specification. It is not, but I'm sure you'll get all the documentation updated when you get a resolution.

Incremental Parsing from Handle in Haskell

I'm trying to interface Haskell with a command line program that has a read-eval-print loop. I'd like to put some text into an input handle, and then read from an output handle until I find a prompt (and then repeat). The reading should block until a prompt is found, but no longer. Instead of coding up my own little state machine that reads one character at a time until it constructs a prompt, it would be nice to use Parsec or Attoparsec. (One issue is that the prompt changes over time, so I can't just check for a constant string of characters.)
What is the best way to read the appropriate amount of data from the output handle and feed it to a parser? I'm confused because most of the handle-reading primatives require me to decide beforehand how much data I want to read. But it's the parser that should decide when to stop.
You seem to have two questions wrapped up in here. One is about incremental parsing, and one is about incremental reading.
Attoparsec supports incremental parsing directly. See the IResult type in Data.Attoparsec.Text. Parsec, alas, doesn't. You can run your parser on what you have, and if it gives an error, add more input and try again, but you really don't know if the error was an unrecoverable parse error, or just needing for more input.
In your case, usualy REPLs read one line at a time. Hence you can use hGetLine to read a line - pass it to Attoparsec, and if it parses evaluate it, and if not, get another line.
If you want to see all this in action, I do this kind of thing in Plush.Job.Output, but with three small differences: 1) I'm parsing byte streams, not strings. 2) I've set it up to pull as much as is available from the input and parse as many items as I can. 3) I'm reading directly from file descriptos. But the same structure should help you do it in your situation.

Erlang a special value doule quote

When i run my program,a error hanppened, and when i look into the log, appears this {k,3108,"s"},{k,3109,"}, how can a one double quote as a varible's value.
In the text font it is a little hard to see exactly what you actually got in the log but I am guessing it is:
{k,3108,"s"},{k,3109,''}
The first true double quotes make an Erlang string (which is really a list of integers) while the second is actually a pair of ' which is the quote character for atoms. In this case it is the atom with the empty name which is allowed. This is what #shk indicated.
But without more information from you it is really hard to give a proper answer.

FitNesse: Can't see the difference between the expected and the actual result in failed assertion

I'm using FitNesse to test web service responses using check to compare the expected to the actual response.
In some cases the check is failing and I can't see what the differences are between the expected and the actual that is causing it to fail.
Here's a screenshot from what it's telling me in a specific instance (of many similar instances):
Feel free to point out the obvious; it's probably staring at me in the face and I'm looking so hard I can't see it!
I would check that the expected and actual strings are both written with the same text encoding. I've seen this error plenty of times when the text comparing failed due to a comma or apostrophe being written in different encodes.
It is possible that your string contains extra spaces in the actual value. FitNesse, being html based, will not respect leading or trailing spaces. It might not handle any extra spaces inside the actual either. So this can cause the result to be different, but not visibly so.
See if you can add some debug messages that would help you see the extra spaces, or at least count the number of characters in both strings.
This question doesn't specify whether Slim or Fit are being used, or which Slim server/plugin if using Slim, but I found the following to be true for me using FitNesse release 20130530 and fitSharp release 2.2:
Non-ASCII characters and { apostrophes / single quote characters } in input arguments/parameters that are strings are HTML encoded. The values in my FitNesse test tables are HTML encoded, but only the required syntax characters and (double) quotes; not the non-ASCII characters (and FitNesse doesn't seem to have any problems storing those values).
EOL characters in the input arguments that are strings consist of a linefeed character only
I imagine that because I'm using .NET, EOLs in my return values consist of carriage return and linefeed characters.
Because of [1], I'm HTML-encoding non-ASCII characters (but not the HTML syntax characters or quotes). Because of [2] and [3], I'm now removing carriage return characters from my fixture return values. Both changes seem to have resolved this issue for me and expected and actual values are now reported as being the same.
Whitespace has troubled me often. The resulting HTML just collapses whitespace, but the compare in code does not.
I now use a fixture to make differences more explicit to me. Example usage: http://fhoeben.github.io/hsac-fitnesse-fixtures/examples-results/HsacExamples.SlimTests.UtilityFixtures.CompareFixtureTest.html
Newer versions of FitNesse (since 20151230) do a diff on the expected and actual result values. Has that helped you at all?

Resources