Difficulty using Saxon in Java code for .sch to .xsl conversion

I'm trying to do Schematron validation using Saxon.
First, I want to compile a .sch file into an .xsl file. Then I want to validate an .xml file with that generated .xsl file.
I found the Saxon command-line usage shown below and used it successfully.
But I need to perform these steps from Java code.
I tried the code below, but I could not work out how to pass the .sch file (edefter_yevmiye.sch) and iso_svrl_for_xslt2.xsl into the code.
I searched the internet but did not find enough information.
Is there sample Java code for converting .sch to .xsl, or could you guide me please?
My Java code:
**Compiling .sch to .xsl**
net.sf.saxon.s9api.Processor processor1 = new net.sf.saxon.s9api.Processor(false);
net.sf.saxon.s9api.XsltCompiler xsltCompiler1 = processor1.newXsltCompiler();
xsltCompiler1.setXsltLanguageVersion("2.0");
xsltCompiler1.setSchemaAware(true);
net.sf.saxon.s9api.XsltExecutable xsltExecutable1 = xsltCompiler1.compile(new StreamSource(new FileInputStream(new File("File1.xsl"))));
net.sf.saxon.s9api.XsltTransformer xsltTransformer1 = xsltExecutable1.load();
xsltTransformer1.setSource(new StreamSource(new FileInputStream(new File("File2.sch"))));
**Validation**
net.sf.saxon.s9api.Processor processor2 = new net.sf.saxon.s9api.Processor(false);
net.sf.saxon.s9api.XsltCompiler xsltCompiler2 = processor2.newXsltCompiler();
xsltCompiler2.setXsltLanguageVersion("2.0");
xsltCompiler2.setSchemaAware(true);
net.sf.saxon.s9api.XsltExecutable xsltExecutable2 = xsltCompiler2.compile(new StreamSource(new FileInputStream(new File("File1.xslt"))));
net.sf.saxon.s9api.XsltTransformer xsltTransformer2 = xsltExecutable2.load();
xsltTransformer2.setSource(new StreamSource(new FileInputStream(new File("src.xml"))));
net.sf.saxon.s9api.Destination dest2 = new Serializer(System.out);
xsltTransformer2.setDestination(dest2);
xsltTransformer1.setDestination(xsltTransformer2);
xsltTransformer1.transform();
Command line usage
Compiling:
java -jar saxon9he.jar -o:output.xsl -s:some.sch iso_svrl_for_xslt2.xsl
Validation:
java -jar saxon9he.jar -o:warnings.xml -s:some.xml output.xsl

You're using your second transformation as the destination for the first, but that means that the output of the first transformation is used as the source document for the second, whereas you want to use it, I think, as the stylesheet for the second transformation.
The simplest way to do this is probably to set an XdmDestination for the first transformation, and then with this destination object, do destination.getXdmNode().asSource() to get the input to the compile() method for the second transformation.
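A minimal sketch of that idea, using only the JDK's built-in JAXP API (XSLT 1.0) so that it runs without extra jars. The generator stylesheet and the rule document are tiny hypothetical stand-ins for iso_svrl_for_xslt2.xsl and the .sch file, and the class and method names are made up for illustration. The result of the first transformation is captured in memory (here as a string) and then compiled as the stylesheet for the second transformation. With Saxon's s9api, the corresponding step is the XdmDestination and getXdmNode().asSource() route described above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ChainedStylesheetDemo {
    // Hypothetical stand-in for iso_svrl_for_xslt2.xsl: reads a <rule> document
    // and emits a stylesheet that counts the elements named by @element.
    static final String GENERATOR =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/rule'>"
      + "<xsl:element name='xsl:stylesheet'>"
      + "<xsl:attribute name='version'>1.0</xsl:attribute>"
      + "<xsl:element name='xsl:output'><xsl:attribute name='method'>text</xsl:attribute></xsl:element>"
      + "<xsl:element name='xsl:template'>"
      + "<xsl:attribute name='match'>/</xsl:attribute>"
      + "<xsl:element name='xsl:value-of'>"
      + "<xsl:attribute name='select'>count(//<xsl:value-of select='@element'/>)</xsl:attribute>"
      + "</xsl:element>"
      + "</xsl:element>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stand-in for the .sch file.
    static final String RULES = "<rule element='item'/>";

    static String run(String generatorXsl, String rulesXml, String inputXml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();

            // Stage 1: apply the generator to the rules; keep the generated stylesheet in memory.
            Transformer stage1 = factory.newTransformer(new StreamSource(new StringReader(generatorXsl)));
            StringWriter generated = new StringWriter();
            stage1.transform(new StreamSource(new StringReader(rulesXml)), new StreamResult(generated));

            // Stage 2: compile the stage 1 OUTPUT as a stylesheet (not as a source document),
            // then run it against the XML to be checked.
            Templates compiled = factory.newTemplates(new StreamSource(new StringReader(generated.toString())));
            StringWriter report = new StringWriter();
            compiled.newTransformer().transform(new StreamSource(new StringReader(inputXml)), new StreamResult(report));
            return report.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

Calling run(GENERATOR, RULES, xml) on a document containing item elements should return their count as text. The key point is that the stage 1 result feeds the compile step of stage 2 rather than its source.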

Related

Generate multiple output files from one input

I'm trying to create a code generator that takes a JSON file as input and generates multiple classes in multiple files.
My question is: is it possible to create multiple files from one input using build from dart-lang?
Yes, it is possible. Many tools currently available on pub.dev do code generation. For creating a simple custom code generator, check out the code_builder package provided by the core Dart team.
You can use dart_style as well to format the output of the code_builder results.
Here is a simple example of the package in use (from the package's example):
import 'package:code_builder/code_builder.dart';
import 'package:dart_style/dart_style.dart';
final _dartfmt = DartFormatter();
// The string of the generated code for AnimalClass
String animalClass() {
final animal = Class((b) => b
..name = 'Animal'
..extend = refer('Organism')
..methods.add(Method.returnsVoid((b) => b
..name = 'eat'
..body = refer('print').call([literalString('Yum!')]).code)));
return _dartfmt.format('${animal.accept(DartEmitter())}');
}
With this example you can use the dart:io API to create a File and write the output of animalClass() to it:
import 'dart:io';
final animalDart = File('animal.dart');
// write the new file to the disk
animalDart.createSync();
// write the contents of the class to the file
animalDart.writeAsStringSync(animalClass());
You can use the File API to read a .json from the path, then use jsonDecode on the contents of the file to access the contents of the JSON config.

Parsing LLVM IR code (with debug symbols) to map it back to the original source

I'm thinking about building a tool to help me visualise the generated LLVM-IR code for each instruction/function on my original source file.
Something like this but for LLVM-IR.
The steps to build such a tool so far seem to be:
Start with an LLVM-IR AST builder.
Parse the generated IR code.
On caret position, get the AST element.
Read the element's scope, line, column and file, and highlight it in the original source file.
Is this the correct way to approach it? Am I trivialising it too much?
I think your approach is quite correct. The UI part will probably take quite long to implement, so I'll focus on the LLVM part.
Let's say you start from an input file containing your LLVM-IR.
Step 1 process module:
Read the file content into a string, then build a module from it and process it to get the debug info:
llvm::MemoryBuffer* buf = llvm::MemoryBuffer::getMemBuffer(llvm::StringRef(fileContent)).release();
llvm::SMDiagnostic diag;
llvm::Module* module = llvm::parseIR(buf->getMemBufferRef(), diag, *context).release();
llvm::DebugInfoFinder* dif = new llvm::DebugInfoFinder();
dif->processModule(*module);
Step 2 iterate on instructions:
Once done with that, you can simply iterate over the functions, basic blocks and instructions:
// range-based loops over the module built in step 1
for (llvm::Function &f : *module)
{
    for (llvm::BasicBlock &b : f)
    {
        for (llvm::Instruction &inst : b)
        {
            const llvm::DebugLoc &dl = inst.getDebugLoc();
            if (!dl) continue; // this instruction carries no debug location
            unsigned line = dl.getLine();
            // accordingly populate some dictionary between your instructions and source code
        }
    }
}
Step 3 update your UI
This is another story...
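For a quick experiment before wiring up the LLVM API, the instruction-to-line dictionary from step 2 can be approximated by scanning the textual IR directly. The sketch below is a regex-based shortcut, not the LLVM API approach described above: it assumes the usual textual form of debug attachments (", !dbg !N" on an instruction and "!N = !DILocation(line: ...)" in the metadata), ignores scopes and inlining, and uses hypothetical class and method names.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IrLineMapper {
    // Metadata node such as: !12 = !DILocation(line: 3, column: 5, scope: !7)
    private static final Pattern LOCATION =
        Pattern.compile("^(!\\d+) = !DILocation\\(line: (\\d+)");

    // Indented instruction with a debug attachment such as: %r = add i32 %a, 1, !dbg !12
    private static final Pattern ATTACHMENT =
        Pattern.compile("^\\s+(.*?), !dbg (!\\d+)");

    /** Maps each instruction's text to the source line of its !dbg location. */
    static Map<String, Integer> map(String ir) {
        String[] lines = ir.split("\n");

        // First pass: collect the source line of every DILocation node.
        Map<String, Integer> lineById = new HashMap<>();
        for (String text : lines) {
            Matcher m = LOCATION.matcher(text);
            if (m.find()) {
                lineById.put(m.group(1), Integer.parseInt(m.group(2)));
            }
        }

        // Second pass: resolve each instruction's !dbg reference.
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String text : lines) {
            Matcher m = ATTACHMENT.matcher(text);
            if (m.find() && lineById.containsKey(m.group(2))) {
                result.put(m.group(1), lineById.get(m.group(2)));
            }
        }
        return result;
    }
}
```

Two passes are used because the metadata nodes normally appear after the function bodies that refer to them.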

Tika returning incorrect lines of text for a PDF with lots of tables

I am using Tika to extract text from a PDF file that has a lot of tables.
java -jar tika-app-0.9.jar -t https://s3.amazonaws.com/centraldoc/alg1.pdf
It returns some invalid text and sometimes drops the white space between two words; for example, it returns
"qu inakli fmyathematical ideas to the real world" instead of "Link mathematical ideas to the real world".
Is there a way to minimize this kind of error, or is there another library that I can use? Does it make sense to use OCR to process this kind of PDF?
Try to control order when using PDFBox parser: PDFTextStripper has a flag that controls the order of lines in the document. By default (in PDFBox) it's set to false for performance reasons (no order preserved), but Tika changed its behavior between releases switching this flag on and off.
More details on exactly this problem are in my blog post "Extracting text from PDF files with Apache Tika 0.9 (and PDFBox under the hood)".
To get text from PDF to display in the right order, I had to set the SortByPosition flag to true... (tika-app-1.19.jar)
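import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.parser.pdf.PDFParserConfig;
import org.apache.tika.sax.BodyContentHandler;

BodyContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
ParseContext context = new ParseContext();
PDFParser pdfParser = new PDFParser();
PDFParserConfig config = pdfParser.getPDFParserConfig();
config.setSortByPosition(true); // needed for text in correct order
pdfParser.setPDFParserConfig(config);
// "is" is an InputStream opened on the PDF
pdfParser.parse(is, handler, metadata, context);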

How to call an input file which is already in the package

In my Hadoop MapReduce application I have one input file. When I execute my application's jar, I want the input file to be picked up automatically. To do this I wrote a class that specifies the input, the output and the file itself, and at the place where I load the file I want to specify its path. I used this code:
QueriesTest.class.getResourceAsStream("/src/main/resources/test")
but it does not work (it cannot read the input file from the generated jar),
so I tried this instead:
URL url = this.getClass().getResource("/src/main/resources/test")
but here I get a problem with the URL. Please help me out. I am using Hadoop 0.21.
I'm not sure what you want to tell us with your resource loading, but the usual way to add an input file is this:
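import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

Configuration conf = new Configuration();
Job job = new Job(conf);
Path in = new Path("YOUR_PATH_IN_HDFS");
FileInputFormat.addInputPath(job, in);
job.setInputFormatClass(TextInputFormat.class); // could be a sequencefile also
// set the other stuff
job.waitForCompletion(true);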
Make sure your file resides in HDFS then.

How to write a simple .txt content processor in XNA?

I don't really understand how the content importer/processor works in XNA.
I need to read a text file (Content/levels/level1.txt) of the form:
x x
x x
x x
where x's are just integers, into an int[,] array.
Any tips on writing a SIMPLE .txt importer? Searching Google and MSDN, I only found .x/.fbx file importer examples, and they seem too complicated.
Do you actually need to process the text file? If not, then you can probably skip most of the content pipeline.
Something like:
string filename = "Content/TextFiles/sometext.txt";
string path = Path.Combine(StorageContainer.TitleLocation, filename);
string lineOfText;
StreamReader sr = new StreamReader(path);
while ((lineOfText = sr.ReadLine()) != null)
{
// do something
}
Also, be sure to set the "Build Action" to "None" and the "Copy to Output Directory" to "Copy if newer" on the text files you've added. This tells the content pipeline not to compile the text file but rather copy it to the output directory for use as is.
I got this (more or less) from the RacingGame sample provided by Microsoft. It foregoes much of the content pipeline and simply loads and processes text files (XML) for much of its level data.
XNA 4.0 uses
System.IO.Stream stream = TitleContainer.OpenStream("tilename.txt");
See http://msdn.microsoft.com/en-us/library/bb199094.aspx and also http://blogs.msdn.com/b/shawnhar/archive/2010/12/09/reading-files-in-xna-game-studio-4-0.aspx
There doesn't seem to be a lot of info out there, but this blog post does indicate how you can load .txt files through code using XNA.
Hopefully this can help you get the file into memory, from there it should be straightforward to parse it in any way you like.
XNA 3.0 - Reading Text Files on the Xbox
http://www.ziggyware.com/readarticle.php?article_id=69 is probably a good place to start. It covers creating a basic content processor.
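Once the lines of the level file are in memory, turning them into a grid of integers is mostly string splitting. The sketch below shows that step in Java; the same split-and-parse logic carries over to C# and an int[,] array. It assumes whitespace-separated integers with one row per line, skips blank lines, and uses a hypothetical LevelParser name.

```java
import java.util.Arrays;
import java.util.List;

final class LevelParser {
    /** Parses whitespace-separated integers, one grid row per non-blank line. */
    static int[][] parse(List<String> lines) {
        return lines.stream()
            .map(String::trim)
            .filter(line -> !line.isEmpty())           // ignore blank lines
            .map(line -> Arrays.stream(line.split("\\s+"))
                               .mapToInt(Integer::parseInt)
                               .toArray())             // one int[] per row
            .toArray(int[][]::new);
    }
}
```

A malformed token makes Integer.parseInt throw NumberFormatException, which is usually the desired behavior for hand-edited level files.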
