How to provide an empty Source in xslTransformer.transform() method? - xslt-2.0

I have an xslt 2.0 file which is being used to transform a csv file to an xml file. The xsl has been taken from here:
http://p2p.wrox.com/xslt/40898-transform-csv-file-xml.html#post164344
Now I am trying to execute this through the Java Transformer API (using the Saxon 9 transformer factory). Since the CSV file is passed into the stylesheet as a parameter, I have nothing to pass as the Source argument of the transform method. The javadocs for Transformer.transform state the following:
"An empty Source is represented as an empty document as constructed by DocumentBuilder.newDocument(). The result of transforming an empty Source depends on the transformation behavior; it is not always an empty Result."
I created an empty document and ran the transformation as shown below:
TransformerFactory transformerFactory = TransformerFactory.newInstance("net.sf.saxon.TransformerFactoryImpl",null);
Source xsltSource = new StreamSource("file:///C:/my.xsl");
Transformer xsltTransformer = transformerFactory.newTransformer(xsltSource);
xsltTransformer.setParameter("pathToCSV", "'file:///C:/input.csv'");
StringWriter writer = new StringWriter();
xsltTransformer.transform(new DOMSource(DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument()), new StreamResult(writer));
The above code does not output anything. I suspect the empty document given as input is being processed instead of the CSV file referenced by the following lines in the stylesheet:
<xsl:param name="pathToCSV" />
<xsl:variable name="input" select="unparsed-text($pathToCSV)"/>
Could anyone give me pointers on how to accomplish what I am trying to achieve?

Consider using the Saxon API http://saxonica.com/documentation/html/using-xsl/embedding/s9api-transformation.html instead of the JAXP API if you want XSLT 2.0 features such as starting with a named template, which the XSLT you linked to requires. Alternatively, if you want to use JAXP with an empty dummy document, you at least need to add a template like this:
<xsl:template match="/">
<xsl:call-template name="main"/>
</xsl:template>
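A minimal sketch of the JAXP route, assuming a small stand-in stylesheet rather than the one linked in the question. The stylesheet text, the class name EmptySourceDemo and the "reading ..." output are made up for illustration. It uses whatever TransformerFactory the JDK finds, so the stand-in sticks to XSLT 1.0 constructs. The point is only that an empty document still has a root node, so the match="/" template fires and hands over to "main".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EmptySourceDemo {
    // Hypothetical stand-in stylesheet: match="/" only delegates to the named template "main".
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='pathToCSV'/>"
      + "<xsl:template match='/'><xsl:call-template name='main'/></xsl:template>"
      + "<xsl:template name='main'>reading <xsl:value-of select='$pathToCSV'/></xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String csvUri) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            // The parameter is passed as a plain value, not as an XPath expression.
            transformer.setParameter("pathToCSV", csvUri);
            StringWriter out = new StringWriter();
            // Empty source: a DOM document with no children, as the javadoc describes.
            DOMSource empty = new DOMSource(
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
            transformer.transform(empty, new StreamResult(out));
            return out.toString();
        } catch (TransformerException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("file:///C:/input.csv"));
    }
}
```

Note that setParameter takes the value itself, so a string such as "'file:///C:/input.csv'" would arrive in the stylesheet with the single quotes included.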

Related

XSLT 2.0: Check if string within a node-set is contained in another string

I have a requirement in which the input XML that is received has different error description for the same error code. I need to compare whether a part of the text is contained within the error description in order to do some filtering. Below is the snippet of what I am trying to do.
Created a variable to store a list of all the partial text to be checked within the error description.
<xsl:variable name="partialTextList">
<errorDesc value="insufficient funds" />
<errorDesc value="amount cannot exceed" />
</xsl:variable>
Created a key to access the variable
<xsl:key name="kErrorDesc" match="errorDesc" use="@value" />
The input XML to this XSL will have something like
<Error>
<Code>123</Code>
<Desc>Transaction cannot be processed as account has insufficient funds.</Desc>
</Error>
OR
<Error>
<Code>123</Code>
<Desc>The withdrawal amount cannot exceed account balance.</Desc>
</Error>
Is it possible to use the contains function to check whether <Desc> contains one of the values from partialTextList?
I tried to look up a solution for this comparison but was not able to find one. Most solutions check whether the <Desc> value is present in the list, not the other way round.
Any help is appreciated.
In the context of e.g. xsl:template match="Error" you can certainly check $partialTextList/errorDesc/@value[contains(current()/Desc, .)] or move it to the pattern xsl:template match="Error[$partialTextList/errorDesc/@value[contains(current()/Desc, .)]]" if you like.
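To see the direction of the comparison in isolation, here is a rough sketch that evaluates the same predicate from Java with the JDK's XPath API. It is not XSLT 2.0: the variable $desc stands in for current()/Desc, the list is parsed from a string instead of coming from an xsl:variable, and the class and method names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PartialTextCheck {
    // Same entries as the partialTextList variable in the question.
    static final String LIST =
        "<partialTextList>"
      + "<errorDesc value='insufficient funds'/>"
      + "<errorDesc value='amount cannot exceed'/>"
      + "</partialTextList>";

    // True if at least one @value occurs as a substring of desc.
    static boolean matchesAny(String desc) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind $desc (assumed name) to the description text being tested.
        xpath.setXPathVariableResolver(
            name -> "desc".equals(name.getLocalPart()) ? desc : null);
        try {
            // A non-empty node-set converts to true, an empty one to false.
            return (Boolean) xpath.evaluate(
                "/partialTextList/errorDesc/@value[contains($desc, .)]",
                new InputSource(new StringReader(LIST)),
                XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matchesAny(
            "The withdrawal amount cannot exceed account balance."));
    }
}
```

The predicate keeps the description as the first argument of contains and the list entry as the second, which is the "vice-versa" direction the question asks about.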

Encoding in POJO to/from XML conversion within Camel

We have been very successful in carrying out POJO to/from XML conversion within Camel. The following code shows a typical case of how we use Camel. Our application listens to an Oracle AQ. The queue entry is an XML string. The XML is converted to a POJO class (MyClass), and we then transform the MyClass object with data from another source. After this transformation, the POJO is converted back to a string and sent to another system (here we save it to a file).
<route id="testing">
<from uri="oracleaq:queue:FUSEQUEUE"/>
<convertBodyTo type="generated.MyClass"/>
<bean ref="mainReqprocessor" method="Modify"/>
<convertBodyTo type="java.lang.String"/>
<setHeader headerName="Exchange.FILE_NAME">
<simple>output.xml</simple>
</setHeader>
<to uri="file:C:\\Temp\\OUT"/>
</route>
Everything worked fine until yesterday, when we introduced HTML tags into one of the text fields of the POJO class. We wrapped the text in a CDATA section: "<![CDATA[" + str + "]]>". But when the POJO is converted to a string, the opening and closing brackets of the CDATA section are still escaped, as shown below. Because of this, the resulting XML string is no longer valid XML and therefore cannot be converted back to MyClass by the other application. This is not the desired behavior. How can I avoid the escaping of the CDATA opening and closing brackets? [Note: the first < and the last > of the CDATA section are escaped.]
<TEXT>
&lt;![CDATA[&lt;html&gt;&lt;div&gt;&lt;pre&gt;COMPONENT PARTS.&lt;/br&gt;&lt;/div&gt;&lt;/pre&gt;&lt;/html&gt;]]&gt;
</TEXT>
Although you have a marshalling/unmarshalling problem, you don't mention how you convert the XML to a POJO and back. That information is important for anyone trying to help.
If you are using JAXB for the conversion, this Q/A could perhaps help you:
JAXB Marshalling Unmarshalling with CDATA

Saxon 9.8: Which patterns are supported in EXPath File Module function file:list?

Good afternoon,
I am working with Java Saxon 9.8.0.4. I would like to use the EXPath File Module function "file:list" with its third "pattern" parameter, but I am unsure which style of pattern is supported.
I read both the Saxon documentation and the EXPath documentation, but I still do not know which patterns are supported in Saxon 9.8.0.4. It would be great if regular expressions were supported, but I understand that is overkill for most users. I tried several blind tests, but only the * and ? wildcards work for me, as defined in the EXPath documentation.
Yes, I can quite easily do regexp postprocessing in for-each, but knowing more about the list function would help.
Thank You in advance for Your help, Stepan
P.S.: My use case is to get all files without an extension ("test" and not "test.txt") recursively from a large and deep directory structure and process all matching files with XSL-T 3.0. Most of these files have identical file names, so I cannot copy them into one folder as pre-processing for a single Saxon -s:directory -o:directory invocation, and invoking Java (Saxon) once per file is of course a terrible time overhead. So I would like to read all matching files into a sequence and process each item of that sequence using for-each (the files are text files and I read them using unparsed-text). And no, GAWK is not a solution, as I already have the whole XML-to-SQL transformation infrastructure in XSL-T, because 95 % of the files are XML.
--ADDED code and explanation below:
Example of my test files.
XML file "a.xml":
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="a.xsl"?>
<root/>
XSL-T file "a.xsl":
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
xmlns:expathFile="http://expath.org/ns/file"
exclude-result-prefixes="xs saxon"
version="3.0">
<xsl:output method="text" />
<xsl:template match="/root">
<xsl:variable name="list" select="expathFile:list('C:\temp\temp\test\', false(), '^.*$')"/>
<xsl:for-each select="$list">
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
My folder "C:\temp\temp\test\" contains 6 test files: "a.txt", "b.txt", "c.txt", "e", "f", "g".
But after testing with the online Java regexp tester at "http://www.regexplanet.com/advanced/java/index.html", I found that the problem is solely on my side, because Java regular expressions behave a little differently from PCRE (Perl), sed, and gawk regular expressions. So it is my fault and I need to learn Java regular expressions.
Saxon uses the same code for this pattern as for the filter in select="pattern" in collection URIs, which is described at http://www.saxonica.com/documentation/index.html#!sourcedocs/collections
Extracting the relevant details:
The pattern used in the select parameter can use glob-like syntax, for
example *.xml selects all files with extension "xml". More generally,
the pattern is converted to a regular expression by prepending "^",
appending "$", replacing "." by "\.", "*" by ".*", and "?" by ".?",
and it is then used to match the file names appearing in the directory
using the Java regular expression rules. So, for example, you can
write ?select=*.(xml|xhtml) to match files with either of these two
file extensions. Note however, that special characters used in the URL
(that is, characters such as backslash and curly braces that are not
allowed in the query part of a URI) must be escaped using the %HH
convention. For example, vertical bar needs to be written as %7C. This
escaping can be achieved using the encode-for-uri() function.
Note that Saxon's collection() function now also supports match=pattern in the URI, where the pattern is a standard XPath 3.1 regular expression.
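The conversion rules quoted above can be sketched in a few lines of Java. This is only an illustration of those rules as written, not Saxon's actual code, and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class GlobPattern {
    // Apply the quoted rules: "." -> "\.", "*" -> ".*", "?" -> ".?",
    // then anchor the result with "^" and "$".
    static String toRegex(String glob) {
        StringBuilder regex = new StringBuilder("^");
        for (char c : glob.toCharArray()) {
            switch (c) {
                case '.': regex.append("\\."); break;
                case '*': regex.append(".*"); break;
                case '?': regex.append(".?"); break;
                default:  regex.append(c);   // everything else is passed through as regex syntax
            }
        }
        return regex.append('$').toString();
    }

    // Keep the file names that match the converted pattern under Java regex rules.
    static List<String> filter(List<String> names, String glob) {
        Pattern pattern = Pattern.compile(toRegex(glob));
        return names.stream()
                    .filter(name -> pattern.matcher(name).matches())
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.txt", "b.txt", "c.txt", "e", "f", "g");
        System.out.println(toRegex("*.txt"));
        System.out.println(filter(names, "*.txt"));
    }
}
```

Under these rules a pattern that is already a full regular expression gets rewritten as well. For example '^.*$' would become ^^\..*$$, which only matches names that start with a literal dot. If file:list really applies the same conversion, that could explain why such a pattern matches none of the six test files.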

difficulty using saxon in java code for .sch to .xsl conversion

I'm trying to do Schematron validation using Saxon.
First, I want to compile a .sch file into an .xsl file. Then I want to validate an .xml file with the generated .xsl file.
I found the Saxon command-line usage shown below and used it successfully.
But I need to perform these steps from Java code.
I tried some code like the example below, but I could not work out how to pass the file with the .sch extension (edefter_yevmiye.sch) and iso_svrl_for_xslt2.xsl to the code.
I searched the internet but did not find enough information.
Is there sample Java code for converting .sch to .xsl, or could you guide me please?
My java code
**Compiling .sch to .xsl**
net.sf.saxon.s9api.Processor processor1 = new net.sf.saxon.s9api.Processor(false);
net.sf.saxon.s9api.XsltCompiler xsltCompiler1 = processor1.newXsltCompiler();
xsltCompiler1.setXsltLanguageVersion("2.0");
xsltCompiler1.setSchemaAware(true);
net.sf.saxon.s9api.XsltExecutable xsltExecutable1 = xsltCompiler1.compile(new StreamSource(new FileInputStream(new File("File1.xsl"))));
net.sf.saxon.s9api.XsltTransformer xsltTransformer1 = xsltExecutable1.load();
xsltTransformer1.setSource(new StreamSource(new FileInputStream(new File("File2.sch"))));
**Validation**
net.sf.saxon.s9api.Processor processor2 = new net.sf.saxon.s9api.Processor(false);
net.sf.saxon.s9api.XsltCompiler xsltCompiler2 = processor2.newXsltCompiler();
xsltCompiler2.setXsltLanguageVersion("2.0");
xsltCompiler2.setSchemaAware(true);
net.sf.saxon.s9api.XsltExecutable xsltExecutable2 = xsltCompiler2.compile(new StreamSource(new FileInputStream(new File("File1.xslt"))));
net.sf.saxon.s9api.XsltTransformer xsltTransformer2 = xsltExecutable2.load();
xsltTransformer2.setSource(new StreamSource(new FileInputStream(new File("src.xml"))));
net.sf.saxon.s9api.Destination dest2 = new Serializer(System.out);
xsltTransformer2.setDestination(dest2);
xsltTransformer1.setDestination(xsltTransformer2);
xsltTransformer1.transform();
Command line usage
Compiling:
java -jar saxon9he.jar -o:output.xsl -s:some.sch iso_svrl_for_xslt2.xsl
Validation:
java -jar saxon9he.jar -o:warnings.xml -s:some.xml output.xsl
You're using your second transformation as the destination for the first, but that means that the output of the first transformation is used as the source document for the second, whereas you want to use it, I think, as the stylesheet for the second transformation.
The simplest way to do this is probably to set an XdmDestination for the first transformation, and then with this destination object, do destination.getXdmNode().asSource() to get the input to the compile() method for the second transformation.
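The same shape can be sketched with plain JAXP so that it runs on a stock JDK without Saxon: DOMResult and DOMSource stand in for XdmDestination and asSource(), and a tiny made-up "compiler" stylesheet stands in for iso_svrl_for_xslt2.xsl. All stylesheet content, element names and the class name are hypothetical. The point is only that the output of step 1 is compiled as the stylesheet for step 2, instead of being fed to it as the source document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ChainedStylesheetDemo {
    // Hypothetical "compiler": turns <rule element="X"/> into a stylesheet that counts X elements.
    static final String COMPILER_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:out='urn:example:alias'>"
      + "<xsl:namespace-alias stylesheet-prefix='out' result-prefix='xsl'/>"
      + "<xsl:template match='/rule'>"
      + "<out:stylesheet version='1.0'>"
      + "<out:output method='text'/>"
      + "<out:template match='/'>"
      + "<out:value-of select='count(//{@element})'/>"
      + "</out:template>"
      + "</out:stylesheet>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String validate(String rulesXml, String instanceXml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();

            // Step 1: run the compiler over the rules and keep the result as an in-memory tree.
            DOMResult generated = new DOMResult();
            factory.newTransformer(new StreamSource(new StringReader(COMPILER_XSL)))
                   .transform(new StreamSource(new StringReader(rulesXml)), generated);

            // Step 2: compile that tree as a stylesheet and apply it to the instance document.
            Transformer validator = factory.newTransformer(new DOMSource(generated.getNode()));
            StringWriter report = new StringWriter();
            validator.transform(new StreamSource(new StringReader(instanceXml)),
                                new StreamResult(report));
            return report.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("<rule element='item'/>", "<order><item/><item/></order>"));
    }
}
```

With s9api the two steps would keep the same order: the first transformer writes to an XdmDestination, and the node it holds is passed to compile() for the second.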

Object Detection using FERNS

I am new to image processing and have just started working in OpenCV. I was trying to do object detection using GenericDescriptorMatcher of type fern. But I don't know what to pass as the params_filename. What should be the format of the file? What parameters do I write in the file and in what format?
Ptr<GenericDescriptorMatcher> descriptorMatcher = GenericDescriptorMatcher::create("FERN", params_filename);
The opencv-2.x.x/samples/cpp directory should contain an example 'fern_params.xml', which in opencv-2.4.8 has the following XML content:
<?xml version="1.0"?>
<opencv_storage>
<nclasses>0</nclasses>
<patchSize>31</patchSize>
<signatureSize>INT_MAX</signatureSize>
<nstructs>50</nstructs>
<structSize>9</structSize>
<nviews>1000</nviews>
<compressionMethod>0</compressionMethod>
</opencv_storage>

Resources