Creating an AST or an M3 can take some time, depending on the size of the project you are trying to load. Is there a way to store the AST or M3 in a file, so that the next time you need it you can load the complete thing from the file instead of creating it again?
You can read and write any value from/to disk using ValueIO, like so:
rascal>writeBinaryValueFile(|home:///myFile.txt|, myValue)
ok
rascal>readBinaryValueFile(#myType, |home:///myFile.txt|)
myType: myValue
Or in a more readable textual format:
rascal>writeTextValueFile(|home:///myFile.txt|, myValue)
ok
rascal>readTextValueFile(#myType, |home:///myFile.txt|)
myType: myValue
There are also JSON and CSV (de)serializers, to be found in lang::json::IO and lang::csv::IO.
Related
I need to process a bunch of XML documents. They are quite complex in their structure (i.e. loads of nodes), but the processing consists of changing the values for a few nodes and saving the file under a different name.
I am looking for a way to do that without having to reconstruct the output XML by explicitly instantiating all the types and passing in all of the unchanged values, but simply by copying them from the input. If the types generated automatically by the type provider were record types, I could simply create the output with let output = { input with changedNode = myNewValue }, but with the type provider I have to do let output = MyXml.MyRoot(input.UnchangedNode1, input.UnchangedNode2, myNewValue, input.UnchangedNode3, ...). This is further complicated by my changed values being in some of the nested nodes, so I have quite a lot of fluff to pass in to get to them.
The F# Data type providers were primarily designed to provide easy access when reading data, so they do not have a very good story for writing data (partly, the issue is that the underlying JSON representation is quite different from the underlying XML representation).
For XML, the type provider just wraps the standard XElement types, which happen to be mutable. This means that you can actually navigate to the elements using provided types, but then use the underlying LINQ to XML to mutate the value. For example:
open FSharp.Data
open System.Xml.Linq

type X = XmlProvider<"<foos><foo a=\"1\" /><foo a=\"2\" /></foos>">

let doc = X.GetSample()

// Change the 'a' attribute of all 'foo' nodes to 1234
for f in doc.Foos do
    f.XElement.SetAttributeValue(XName.Get "a", 1234)

// Print the modified document
doc.ToString()
This is likely not perfect - sometimes, you'll need to change the parent element (like here, the provided f.A property is not mutable), but it might do the trick. I don't know whether this is the best way of solving the problem in general, or whether something like XSLT might be easier - it probably depends on the concrete transformations.
I am using the CSV type provider to read data from a local CSV file.
I want to export the data as JSON, so I am taking each row and serializing it using the Json.NET library with JsonConvert.SerializeObject(x).
The problem is that each row is modeled as a tuple, meaning that the column headers do not become property names when serializing. Instead I get Item1="..."; Item2="..." etc.
How can I export to JSON without 'hand-rolling' a class/record type to hold the values and maintain the property names?
The type provider works by providing compile-time type safety. The code that is actually compiled maps (at compile time) the nice accessors to tupled values (for performance reasons, I guess), so at run time the JSON serializer sees only tuples.
AFAIK there is no way around hand-rolling records. (That is, unless we eventually get type providers that are allowed to take types as parameters, which would allow a Lift<T>-style provider, or the CSV type provider implementation is adjusted accordingly.)
I am currently trying to build an iOS game for myself and my classmates that matches a word with its definition.
I'm having a hard time figuring out how to convert a list of words and their definitions in a .docx file into something (JSON, XML, ...) that I can then read into an Array or Dictionary.
Most of the words in the .docx have the following format:
" Word (): Definition. "
This would be easier with Excel. Excel already has a function to export to XML, which should make your life a lot easier than getting all the words out of the .docx and then converting them to either JSON or XML.
http://www.excel-easy.com/examples/xml.html
What is the difference between XML Serialization and XML Parsing? When should we use each one?
Parsing is, generally speaking, the processing of an input stream into meaningful data structures; in the XML context, parsing is the process of reading a sequence of characters conforming to the grammar and other constraints of the XML spec into whatever internal representation of XML your program uses.
Serialization is the opposite process: processing the internal data structures of a program (in this context, your internal representation of an XML document) and creating a character sequence (typically written to an output stream) that conforms to the angle-bracket syntax of the spec.
Use a parser to read XML from a character stream into data structures; use a serializer to write data structures out into a character stream.
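For a concrete sketch of both directions in code, here is a minimal C++ example; it assumes the third-party pugixml library, and the element names and values are invented for illustration:

#include <iostream>
#include "pugixml.hpp"  // third-party XML library, not part of the standard

int main() {
    // Parse: a character sequence in, an internal representation out.
    pugi::xml_document doc;
    pugi::xml_parse_result ok =
        doc.load_string("<greeting lang=\"en\">hello</greeting>");
    if (!ok) return 1;

    // Work with the internal data structures.
    pugi::xml_node node = doc.child("greeting");
    node.attribute("lang").set_value("fr");
    node.text().set("bonjour");

    // Serialize: the internal representation in, a character sequence out.
    doc.save(std::cout);
}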
I don't know much about XML, but here's what I know about serialization and parsing.
parsing - reading data from storage, such as a text file, and turning it into something your program can work with
serializing - (serialize) translating data into a storable or transmittable format, and (de-serialize) translating that format back into data… e.g. you want to translate a struct into readable content, stream that content across a network, and translate it back into a struct.
here's a new one…
marshalling - (marshal and unmarshal) similar to serializing, except marshalling is used to translate data into a different format… e.g. you want to translate a stream of bytes into a 32-bit structure (four single bytes into one four-byte value)
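To make the serialize/de-serialize round trip concrete, here is a minimal C++ sketch; the Point struct and the space-separated text format are invented for the example:

#include <cstdint>
#include <iostream>
#include <sstream>

// Hypothetical struct to round-trip through a readable format.
struct Point { std::int32_t x; std::int32_t y; };

int main() {
    Point original{3, 4};

    // Serialize: data -> readable text (this could be streamed over a network)
    std::ostringstream sink;
    sink << original.x << ' ' << original.y;

    // De-serialize: readable text -> data
    std::istringstream source(sink.str());
    Point restored{};
    source >> restored.x >> restored.y;

    std::cout << restored.x << ',' << restored.y << '\n';  // prints 3,4
}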
in easy terms (for beginners)
TL;DR
XML parsing (or XML deserialization) ==> input: valid XML, output: data structures
XML serialization ==> input: data structures, output: valid XML
XML parsing (a.k.a. XML de-serialization)
You take a .xml file (example.xml) as input and process it with your programming language of choice, so that your program can do something useful with the data in that file. Your program transforms the information from the file into data structures that your programming language can deal with (e.g. lists, arrays, objects, etc.).
XML serialization
Your program (in any programming language) transforms information represented as data structures (lists, arrays, objects, etc.) into valid XML output, which can be saved to a file or transmitted to another program.
NOTE: Technically, the input (when we are talking about parsing) and the output (when we are talking about serialization) do not have to be files. As said in the more professional answer above, they can be any input/output stream, too. And files don't have to have the .xml extension; they can have any file extension whose content is valid XML (e.g. .svg is also an XML-based format). The key to understanding is that when we do XML parsing we have valid XML on the input side and data structures on the output side, and when we do XML serialization we have data structures on the input side and valid XML on the output side.
To give an example from the Python world: you can use built-in packages (like xml.etree.ElementTree) or third-party libraries (like lxml (recommended) or xmltodict) to do both: parse (deserialize) and create (serialize) XML data.
I have a series of messages that are defined by independent structs. These structs share a common header and are sent between applications. I am creating a decoder that will take the raw data captured in the messages built from these structs and decode/parse it into plain text.
I have over 1000 different messages that need to be decoded, so I am not sure if defining all the struct formats in XML and then using XSL or some translation is the way to go, or if there is a better way to do this.
There are times when I will need to decode logs containing over a million messages so performance is a concern.
Any recommendations for techniques/tools/algorithms to go about creating the decoder/parser?
struct:
struct {
    dword messageid;
    dword datavalue1;
    dword datavalue2;
} struct1;
Example raw data:
010101010A0A0A0A0F0F0F0F
Decoded message (desired output):
message id: 0x01010101, datavalue1: 0x0A0A0A0A, datavalue2: 0x0F0F0F0F
I am using C++ to do this development.
Regarding "performance" - if you are using disk IO and possible display IO I doubt your parser/decoder will have much effect unless you use a truly horrible algorithm.
I am also unsure about what the problem is - Given the question right now - you have 3 DWORDs in a struct and you claim that there are over 1000 unique messages based on these values.
Your decoded message does not imply to me that you need any kind of parsing - just straight output seems to work (convert from bytes to ascii representation of a hex value)
If you do have a mapping from a value to a string, then a big switch statement is simple. Alternatively, if you want to be able to add these dynamically or change the display, then I would provide the key/value pairs (the mapping) in a config file (text, XML, etc.) and do a lookup as the log file/raw data is read, as in the sketch below.
A map (std::map, say) is what I would use in that case.
Perhaps if you provide another specific example of the values and decoded output I can come up with a more appropriate suggestion.
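To make the straight-output plus lookup idea concrete, here is a minimal C++ sketch; Struct1, decodeStruct1, and the dword alias are assumptions made for the example, and real code would also need to deal with endianness and struct padding:

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <functional>
#include <map>
#include <vector>

// Assumption: 'dword' is a 32-bit unsigned integer and the captured
// bytes use the same layout as the structs.
using dword = std::uint32_t;

struct Struct1 {
    dword messageid;
    dword datavalue1;
    dword datavalue2;
};

// Each decoder prints one message and returns its size in bytes.
using Decoder = std::function<std::size_t(const unsigned char*)>;

std::size_t decodeStruct1(const unsigned char* p) {
    Struct1 m;
    std::memcpy(&m, p, sizeof m);  // avoids unaligned access
    std::printf("message id: 0x%08X, datavalue1: 0x%08X, datavalue2: 0x%08X\n",
                static_cast<unsigned>(m.messageid),
                static_cast<unsigned>(m.datavalue1),
                static_cast<unsigned>(m.datavalue2));
    return sizeof m;
}

int main() {
    // The raw bytes from the question.
    std::vector<unsigned char> capture = {
        0x01, 0x01, 0x01, 0x01,
        0x0A, 0x0A, 0x0A, 0x0A,
        0x0F, 0x0F, 0x0F, 0x0F,
    };

    // The map replaces a 1000-case switch; entries could just as well be
    // registered from a config file instead of hard-coded here.
    std::map<dword, Decoder> decoders = {{0x01010101u, decodeStruct1}};

    std::size_t off = 0;
    while (off + sizeof(dword) <= capture.size()) {
        dword id;
        std::memcpy(&id, capture.data() + off, sizeof id);
        auto it = decoders.find(id);
        if (it == decoders.end()) break;  // unknown message id
        off += it->second(capture.data() + off);
    }
}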
If you already have the message definitions in the syntax you've used in your example, you should definitely not try to convert them manually into some other syntax (XML or otherwise).
Instead, you should try to write a compiler that takes these message definitions and compiles them into a decoder function.
These days, the recommendation is to use ANTLR as the parser generator, using any of the ANTLR languages for the actual compiler (Java, Python, Ruby, C#, C++). That compiler should then output C code, which does the entire decoding and pretty-printing.
You can use yacc or ANTLR, add appropriate parsing rules, populate some data structure (a tree, maybe) while parsing, then traverse the data structure and do whatever you like.
I'm going to assume that all you need to do is format the records and output them.
Use a custom code generator. The generated code will look something like this:
typedef struct { dword messageid; } Header;

//repeated for each record type
typedef struct {
    dword messageid;
    // <members here>
} Record_##;
//END

void Process(Input inp, Output out) {
    char buffer[BIG_ENOUGH];
    char *offset = &buffer[BIG_ENOUGH];
    while (notEnd) {
        if (&offset[sizeof(LargestStruct)] >= &buffer[BIG_ENOUGH]) {
            // move the remaining bytes to the start of the buffer
            // and refill the tail from inp
        }
        // peek at the common header to pick the record type
        Header *hpt = (Header*)offset;
        switch (hpt->messageid)
        {
        //repeated for each record type
        case <record ID for given type>:
        {
            Record_##* rpt = (Record_##*)offset;
            out.format("name1: %t, ...\n", rpt->name1, ...);
            offset += sizeof(Record_##);
            break;
        }
        //END
        }
    }
}
Most of that is boilerplate, so writing a program to generate it shouldn't be too hard.
If you need more processing, I think this idea could be tweaked some to make that work as well.
Edit: after re-reading the question, it looks like you might have the structs defined already. In that case you can just #include them and use them directly. However, then you end up with the issue of how to parse the structs to generate the input to the formatting function. Awk or sed might be handy there.