Saxon XSD or Schema parser in Java - parsing

Is there any way to parse a schema or XSD file using Saxon? I need to display all possible XPaths for a given XSD.
I found a way using org.apache.xerces, but I wanted to implement the logic in Saxon, as it supports XSLT 3.0 (we want to use the same library for XSLT-related functionality as well).
Thanks in advance.

Saxon-EE of course includes an XSD processor that parses schema documents. I think your question is not about the low-level process of parsing the documents, but about the higher-level process of querying the schemas once they have been parsed.
Saxon-EE offers several ways to access the components of a compiled schema programmatically.
You can export the compiled schema as an SCM file in XML format. This format isn't well documented but its structure corresponds very closely to the schema component model defined in the W3C specifications.
You can access the compiled schema from XPath using extension functions such as saxon:schema(); see http://www.saxonica.com/documentation/index.html#!functions/saxon/schema
You can also access the schema at the Java level: the methods are documented in the Javadoc, but they are really designed for internal use, rather than for the convenience of this kind of application.
Of course, getting access to the compiled schema doesn't by itself solve your problem of displaying all valid paths. Firstly, the set of all valid paths is in general infinite (because types can be recursive, and because of wildcards). Secondly, features such as substitution groups and types derived by extension make it challenging even when the result is finite. But in principle, the information is there: from an element name with a global declaration, you can find its type, and from its type you can find the set of valid child elements, and so on recursively.
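For completeness, a minimal sketch of loading and compiling a schema through Saxon's s9api interface (this assumes Saxon-EE on the classpath; the file name equipment.xsd is just a placeholder):

    import java.io.File;
    import javax.xml.transform.stream.StreamSource;
    import net.sf.saxon.s9api.Processor;
    import net.sf.saxon.s9api.SaxonApiException;
    import net.sf.saxon.s9api.SchemaManager;

    public class CompileSchema {
        public static void main(String[] args) throws SaxonApiException {
            // true requests the licensed (EE) processor; schema awareness
            // is not available in Saxon-HE
            Processor processor = new Processor(true);
            SchemaManager manager = processor.getSchemaManager();

            // Parse and compile the schema document; the compiled schema
            // is cached in the Processor for subsequent use
            manager.load(new StreamSource(new File("equipment.xsd")));

            // From here you can validate instances via a SchemaValidator,
            // or query the compiled components from XPath using the
            // saxon:schema() extension function mentioned above
        }
    }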

Related

What is the advantage of F# Type Providers over traditional 'type providers'?

From the MSDN page on F# Type Providers:
An F# type provider is a component that provides types, properties,
and methods for use in your program.
So it is like a .NET class library? What is the difference? And:
Writing these types manually is very time-consuming and difficult to
maintain.
Does a Type Provider write itself automatically? There is more:
Similarly, a type provider for WSDL web services will provide the
types, properties, and methods you need to work directly with any WSDL
web service.
There are utilities for generating types from a WSDL URL; again, what is the advantage provided by type providers here?
My first thought was that F# type providers provide types at runtime, like .NET Remoting, but that does not seem to be the case. What are the advantages of using them?
In many ways, code generation is a natural comparison for type providers. However, type providers have several desirable properties that code generation lacks:
Type providers can be used in F# scripts without ever having to context switch. With a code generator, you'd have to invoke the generator, reference the code, etc. With a type provider you reference the type provider assembly (which is just like referencing any other F#/.NET assembly) and then use the provided types right away. This is really a game changer for interactive scripting.
As Gustavo mentions, erased types allow type providers to handle situations where traditional code generation would generate too much code (e.g. Freebase has thousands of types, which is no problem for a type provider).
Type providers can support invalidation, so that if a data source changes the compiler will immediately recheck the file.
Likewise, with a code generator it's possible for the generated code to get out of sync with the data source; type providers can prevent this problem from occurring, inspecting the data source each time your program is compiled (though many type providers also provide the option of using a cached schema for convenience).
Type providers are arguably easier to implement, though it probably depends on the scenario and the author's background.
You can generate types from a WSDL or from a DB using a code generation tool, such as the ones integrated into Visual Studio. Type providers do basically the same, but integrate that process directly into compilation. This way you don't need to worry about regenerating the types when the schema changes.
Additionally, type providers support doing this with erased types, which are "virtual" types that don't really exist. This means that instead of generating 500 types and a big assembly, only what is actually used is generated, which means smaller assemblies and support for importing huge and recursive schemas, like Freebase.

What options exist to store configuration data in Delphi?

I want to store and load diverse program data in a Delphi project. This data ranges from simple strings to more complex recurring configuration object data.
As we all know, INI files provide a fast and easy way to store program data, but they are limited to key-value representations.
XML is often the weapon of choice when it comes to requirements like this, but I want to know if there is an alternative to XML.
Recently I found superobject for Delphi, which seems to be a lot easier to handle than XML. Is there anything to be said against using JSON for such non-web tasks?
Are you aware of other options in Delphi that support storing and loading data as plain text (like INI, XML, or JSON)?
In fact it doesn't matter which storage format you choose (INI, XML, JSON, whatever). Build an abstract Configuration class that fits all your needs, and only after that decide on the concrete class and the concrete storage format, based on ease of implementation and perhaps human readability.
In some cases you also want to have different configuration aspects (global, machine, user).
With your Configuration class you can easily mix them together (use global if not user-defined) and can also mix storage formats (global config from a DB, machine config from the Registry, user config from a file), as sketched below.
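A minimal sketch of that layering idea, shown in Java only because the pattern itself is language-agnostic (all names here are illustrative, not taken from any library):

    // Abstract over the storage format so callers never see INI/XML/JSON details
    interface ConfigStore {
        String read(String key); // returns null if the key is absent
    }

    // Layered lookup: user settings override machine settings,
    // which override global defaults
    class LayeredConfig {
        private final ConfigStore[] layers;

        LayeredConfig(ConfigStore user, ConfigStore machine, ConfigStore global) {
            this.layers = new ConfigStore[] { user, machine, global };
        }

        String get(String key, String fallback) {
            for (ConfigStore layer : layers) {
                String value = layer.read(key);
                if (value != null) {
                    return value;
                }
            }
            return fallback;
        }
    }

Each ConfigStore implementation can then wrap whatever backend you like (a DB table, the Registry, a plain text file) without the rest of the program knowing.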
Good old INI files work great for me, in combination with the built-in TIniFile and TMemIniFile classes in the IniFiles unit.
Benefits of INI files:
Not binary.
Easier to move from machine to machine than Registry settings.
Easy to inspect and view.
Unlike XML, it's simple and human readable.
INI files are easy to modify either by hand or by tool, and they are almost bulletproof: whereas it's easy to produce malformed JSON or XML that is completely unreadable, it's hard to do more than damage one section of an INI file. Simplicity wins.
Drawbacks:
Unlike XML and the Registry, it's more or less "two levels": sections and items (see the fragment below).
TMemIniFile doesn't order the results in any controllable way. I often wish I could control the order of items in my INI files; when they are written by a human being I would like the order to be preserved, and TMemIniFile does not preserve order. Thus I find I do not love TMemIniFile as much as I love plain old TIniFile.
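For illustration, a made-up INI fragment showing those two levels:

    [Database]          ; section
    Server=localhost    ; item (key=value)
    Port=5432

    [UI]
    Theme=dark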

XML merging with a Lua script

I have one task... of course I am not expecting you people to give me a ready-made solution, but some outline will be very helpful. Please help, as Lua is a new language for me.
So the task is:
I have three XML files. All of them store data about the same objects, say equipment. Apart from the name of the equipment, the parameters each XML file stores are different.
Now I want to make a generic XML file that carries all the data (all parameters) about the equipment.
Please note that the name will be unique, and thus it will act as the key parameter.
I want to achieve this with a Lua script.
Lua does not do XML "by default". It is a language designed to be embedded into other systems, so it may be that the system you have embedded it in is able to parse the XML files and pass them on to Lua. If that's the case, translate the XML documents to Lua tables on the host system, give them to Lua, manipulate them in Lua, and return the resulting Lua table so that the host can transform it back to XML.
Another option, if available, would be installing a binary library for parsing XML, such as LuaXML. If you are able to install it on your system, you should be able to manipulate the XML files more or less easily, directly from Lua. But this possibility depends on the system you have embedded Lua into; a lot of systems don't allow the installation of additional libraries.

Framework for building structured binary data parsers?

I have some experience with Pragmatic-Programmer-type code generation: specifying a data structure in a platform-neutral format and writing templates for a code generator that consume these data structure files and produce code that pulls raw bytes into language-specific data structures, does scaling on the numeric data, prints out the data, etc. The nice pragmatic(TM) ideas are that (a) I can change data structures by modifying my specification file and regenerating the source (which is DRY and all that) and (b) I can add additional functions that can be generated for all of my structures just by modifying my templates.
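To make this concrete, the kind of code I mean the generator to emit looks roughly like this hand-written Java sketch (the record layout and the scale factor are invented for the example):

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;

    // A hypothetical record: uint16 raw temperature plus int32 counter,
    // little-endian, with the temperature scaled to degrees Celsius
    class SensorRecord {
        final double temperatureC;
        final int counter;

        SensorRecord(double temperatureC, int counter) {
            this.temperatureC = temperatureC;
            this.counter = counter;
        }

        // The part a generator would emit from a specification file:
        // pull raw bytes into typed fields and apply scaling
        static SensorRecord parse(byte[] raw) {
            ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
            int rawTemp = buf.getShort() & 0xFFFF;  // uint16
            int counter = buf.getInt();             // int32
            return new SensorRecord(rawTemp * 0.01, counter); // scale to degrees
        }

        @Override
        public String toString() {
            return "temperature=" + temperatureC + "C counter=" + counter;
        }
    }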
What I had used was a Perl script called Jeeves which worked, but it's general purpose, and any functions I wanted to write to manipulate my data I was writing from the ground up.
Are there any frameworks that are well suited to creating parsers for structured binary data? What I've read of ANTLR suggests that it's overkill. My current target languages of interest are C#, C++, and Java, if it matters.
Thanks as always.
Edit: I'll put a bounty on this question. If there are any areas that I should be looking at (keywords to search on) or other ways of attacking this problem that you've developed yourself, I'd love to hear about them.
You may also look at a relatively new project, Kaitai Struct, which provides a language for this purpose and also has a good IDE:
Kaitai.io
You might find ASN.1 interesting, as it provides an abstract way to describe the data you might be processing. If you use ASN.1 to describe the data abstractly, you need a way to map that abstract data to concrete binary streams, for which ECN (Encoding Control Notation) is likely the right choice.
The New Jersey Machine Toolkit is actually focused on binary data streams corresponding to instruction sets, but I think that's a superset of just binary streams. It has very nice facilities for defining fields in terms of bit strings, and for automatically generating accessors and generators for them. This might be particularly useful if your binary data structures contain pointers to other parts of the data stream.

F# type providers, how do they work

I don't quite get type providers after watching Don Syme's PDC video:
http://player.microsoftpdc.com/Session/04092962-4ed1-42c6-be07-203d42115274
Do I understand this correctly? You can get ready-made type providers for Twitter, Excel...
What if I have a custom XML structure? Do I need to implement my own type provider for that, and how is this different from creating my own custom mapper?
Say you have some arbitrary data entity out in the world. For this example, let's say it's a spreadsheet.
Let's also say you have some way to get/infer schema/metadata for that data - that is, you can know types (e.g. double versus string) and relationships (e.g. this column means 'salary') and metadata (e.g. this sheet is for the June 2009 budget).
Type providers let you code up a kind of 'shim library' that knows about some kind of data entity (e.g. a spreadsheet) and use that library as part of the compiler/IDE toolchain, so that you can write code like
mySpreadsheet.ByRowAndColumn.C4
or something, and get Intellisense (autocompletion) and tooltips (e.g. describing cell C4 as Salary for Bob) and static typing (e.g. have it be a double or a string or whatever it is). Essentially this gives you the tooling affordances of statically-typed object models with the ease-of-use leverage of various dynamic or code-generation systems, with some improvements on both. The 'cost' is that someone has to write the shim library (the 'type provider'), but many such providers are very general (e.g. one that speaks OData or Excel or WMI or whatnot) and so a small handful of type provider libraries makes vast quantities of the world's data available in your programming language with static typing and first-class tooling support.
The architecture is an open compiler, where provider-authors implement a small interface that allows them to inject new names/types into the programming context. A type provider might be just another library you pass to the compiler (a reference in your project, -r-ed), with extra metadata that marks it as a type provider that participates in the compilation/IDE/codegen portions of development.
I don't know exactly what a "custom mapper" is in your xml example to draw a comparison.
I understand that this is an old question, but type providers are now available (as F# 3.0 has been released). There is a white paper explaining them too, and we have a code drop from Microsoft that lets you see under the hood:
http://www.infoq.com/news/2012/09/fsharp-type-providers
Type providers use F#'s quotations to act as (effectively) compiler plugins that can generate code based on metadata at compile time.
This allows you to (for example) read in some JSON, a database schema, or some XSD, and then generate F# classes to model the domain that metadata represents.
In terms of creating them, I wrote a few blog posts that might be of interest starting with Type Providers from the Ground Up.
