ANT FilterChains and FilterReaders - ant

I have a file depends.txt containing 2015001, 2015002, 2015003. I created an ANT target that has the following code. I tried searching for how to use the stringtokenizer element, but the descriptions are vague. I would like to run the target and get
2015001
2015002
2015003
Any help is greatly appreciated. Thanks.
`<loadfile srcFile="depends.txt" property="depends"/>
<filterchain>
<tokenfilter>
<stringtokenizer delims="," />
</tokenfilter>
</filterchain>`

Ant's filterreaders can modify the input read by Ant and the tokenfilter is one of them. The tokenfilter doesn't do anything by itself but rather coordinates two different actors - a tokenizer and a filter.
The filter is the thing that performs the real action and the tokenizer is responsible for feeding the filter with chunks of text it works on. The separation of tokenizer and filters allows the same algorithm - say uniq - to be applied to either words or lines depending on the tokenizer.
In your example you only specify a tokenizer but no filter, so the output is the same as the input. IIUC you only want to strip the comma characters; in that case
<loadfile srcFile="depends.txt" property="depends">
<filterchain>
<deletecharacters chars=","/>
</filterchain>
</loadfile>
should do the trick.
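If you actually want each number on its own line, as the question describes, a tokenizer combined with a filter might look like the sketch below. It assumes that the tokenfilter's delimoutput attribute replaces the delimiter after each token with a newline and that the nested trim filter strips the spaces around each token. It has not been run against your build, so treat it as a starting point.

```xml
<loadfile srcFile="depends.txt" property="depends">
  <filterchain>
    <!-- delimoutput replaces the delimiter found after each token; "\n" is read as a newline -->
    <tokenfilter delimoutput="\n">
      <!-- tokenizer: split the input at commas -->
      <stringtokenizer delims=","/>
      <!-- filter applied to every token: strip surrounding whitespace -->
      <trim/>
    </tokenfilter>
  </filterchain>
</loadfile>
<echo message="${depends}"/>
```

Here the tokenizer decides what a chunk is and the filter decides what happens to each chunk, which is the division of labour described above.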

Related

Verifying that two list properties in an Ant script have the same number of elements (and using them in parallel)

I have a property that defines a list of files and goes through them with
<foreach target="target-name" param="file"
parallel="true" trim="true">
<path>
<filelist dir="${dir}" files="${files}" />
</path>
</foreach>
But I also have another property, which defines a respective "package" for each file. How do I use the second list in parallel? I would like to:
Verify that ${packages} contains the same number of elements as ${files}.
Provide the nth element of ${packages} to the target-name target, which uses the single file from the filelist.
Or just ensure that only one file argument is provided, if I cannot verify the packages.
The properties are user configurable and will be provided from a properties file, so I don't know them in advance.
It sounds like I am overreaching the capabilities of 'ant', but this is an existing script that I need to just modify to at least make sure that it cannot be run with two files and just one package. If nothing else, I would just need to detect that situation.
In order to verify that ${packages} contains the same number of elements as ${files}, you may use <countfilter> over both ${packages} and ${files} to count the number of commas (or whatever delimiter you use) and compare the values. You may then run the <foreach> only when the counts are the same.
See CountFilter
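The same check can also be sketched with core Ant tasks only. This is a different technique from <countfilter>: it assumes Ant 1.7 or later, where <resourcecount> and the <tokens> resource collection are available, and the property names files.count and packages.count are made up for illustration.

```xml
<!-- count the comma-separated entries in ${files} -->
<resourcecount property="files.count">
  <tokens>
    <string value="${files}"/>
    <stringtokenizer delims=","/>
  </tokens>
</resourcecount>

<!-- count the comma-separated entries in ${packages} -->
<resourcecount property="packages.count">
  <tokens>
    <string value="${packages}"/>
    <stringtokenizer delims=","/>
  </tokens>
</resourcecount>

<!-- stop the build before <foreach> runs if the two lists differ in length -->
<fail message="files has ${files.count} entries but packages has ${packages.count}">
  <condition>
    <not>
      <equals arg1="${files.count}" arg2="${packages.count}"/>
    </not>
  </condition>
</fail>
```

Placing these tasks ahead of the <foreach> covers the "at least detect that situation" requirement without touching target-name.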
Obtaining, in the ith iteration of <foreach>, the ith elements of ${files} and ${packages} respectively (assuming the order in ${packages} matches the order in ${files}) does not seem straightforward, mainly because <foreach> has only one param attribute and therefore passes only one element of the input list per iteration, split by the specified delimiter.
A workaround might be possible by using <foreach> over one list, say ${files}, and using the other list, i.e. ${packages}, in target-name (by extracting one element from the list for every iteration of <foreach>).
How you implement this is up to you. One example (in target-name):
<propertyregex property="curr_pkg" input="${packages}" regexp="^(.*?)," select="\1" />
<propertyregex property="packages" input="${packages}" regexp="^.*?," replace="" override="true"/>
In every iteration of <foreach>, and so in every invocation of target-name, the first element of ${packages} will be available in ${curr_pkg} and will also be removed from ${packages} (or use a substitute property if you don't want to modify ${packages}).

JAXB unmarshalling empty tags [duplicate]

I changed the datatype of some parameters in my xsd file from string to their real type, in this case double. Now I'm facing the problem that around here the comma is used as the decimal separator and not the point as defined by the W3C (http://www.w3.org/TR/xmlschema-2/#double), causing errors all over during deserialization (C#, VS2008).
My question: Can I use the W3C schema for validation but with a different separator for double values?
Thanks for your help.
You cannot do that if you want to continue to use the XML Schema simple types. xs:decimal and the types derived from it, as well as xs:float and xs:double, are locked down to using a period, as the spec you linked says.
If you want to use a comma as a separator, you need to define your own simple type, for example:
<xs:simpleType name="MyDecimal">
<xs:restriction base="xs:string">
<xs:pattern value="\d+(,\d+)?"/>
</xs:restriction>
</xs:simpleType>
Before you do that, though, be careful; XML is a data storage format, not a presentation format. You might want to think about whether you sort this out after loading or during XSLT transformation, etc.

JAXB: How to keep consecutive spaces as they are in source XML during unmarshalling

I am using JAXB to unmarshall an XML message. It seems to replace multiple consecutive spaces by a single space.
<testfield>this is a test</testfield>
(several spaces between a and test)
Upon unmarshalling, the above becomes:
this is a test
How do I keep consecutive spaces as they are in source XML?
From the msdn page:
Document authors can use the xml:space attribute to identify portions
of documents where white space is considered important. Style sheets
can also use the xml:space attribute as a hook to preserve white space
in presentation. However, because many XML applications do not
understand the xml:space attribute, its use is considered advisory.
You can try adding xml:space="preserve" so that the spaces are not replaced:
<poem xml:space="default">
<author xml:space="default">
<givenName xml:space="default">Alix</givenName>
<familyName xml:space="default">Krakowski</familyName>
</author>
<verse xml:space="preserve">
<line xml:space="preserve">Roses are red,</line>
<line xml:space="preserve">Violets are blue.</line>
<signature xml:space="default">-Alix</signature>
</verse>
</poem>
http://msdn.microsoft.com/en-us/library/ms256097%28v=vs.110%29.aspx

Approximate matching in voicexml

I don't know if I could get an answer here... The problem I am trying to solve is this: the system listens to the user's input and judges whether it contains the word "loop".
Does VoiceXML support grammars for this kind of task? It seems that it can only pick up words from a predefined list. The user can say:
using a loop, loop, for loop, looping through the array...
Is there a way for me to only check whether the sentence contains "loop"?
Thanks in advance.
You can create your own grammar and attach it to your field:
<field name="loopField">
<prompt>What's your way to say loop?</prompt>
<grammar src="mygrammar.gram" type="application/srgs+xml" />
<help> Please say your employee number. </help>
</field>
See more on W3C grammar here
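As a rough idea of what mygrammar.gram could contain, here is a minimal SRGS sketch that accepts an utterance as long as "loop" or "looping" appears somewhere in it. It relies on the special GARBAGE rule from the SRGS specification to absorb the surrounding words; how well GARBAGE works depends on the speech platform, so this is an assumption to verify on yours. The rule id containsLoop is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="containsLoop">
  <rule id="containsLoop" scope="public">
    <!-- optional filler before the keyword, e.g. "using a" or "for" -->
    <item repeat="0-1"><ruleref special="GARBAGE"/></item>
    <!-- the words the application actually cares about -->
    <one-of>
      <item>loop</item>
      <item>looping</item>
    </one-of>
    <!-- optional filler after the keyword, e.g. "through the array" -->
    <item repeat="0-1"><ruleref special="GARBAGE"/></item>
  </rule>
</grammar>
```

If the platform does not support GARBAGE, the fallback is to list the expected phrases explicitly as <item> alternatives.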

Get ant concat to ignore BOMs?

I have an ant build that concatenates my JavaScript into one file and then compresses it. The problem is that Visual Studio's default encoding attaches a BOM to every file. How do I configure ant to strip out BOMs that would otherwise appear in the middle of the resulting concatenated file?
My googling revealed this discussion, which is the exact problem I'm having but doesn't provide a solution: http://marc.info/?l=ant-user&m=118598847927096
The Unicode byte order mark codepoint is U+FEFF. This concatenation command will strip out all BOM characters when concatenating two files:
<concat encoding="UTF-8" outputencoding="UTF-8" destfile="nobom-concat.txt">
<filelist dir="." files="bom1.txt,bom2.txt" />
<filterchain>
<deletecharacters chars="&#xFEFF;" />
</filterchain>
</concat>
This form of the concat command tells the task to decode the files as UTF-8 character data. I'm assuming UTF-8 as this is usually where Java/BOM issues occur.
In UTF-8, the BOM is encoded as the bytes EF BB BF. If you needed it to appear at the start of the resultant file, you could use a subsequent concatenation to prefix the output file with a BOM again.
Encoded values for U+FEFF in other UTF encodings are listed here.
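A sketch of the subsequent concatenation mentioned above, assuming the first concat wrote nobom-concat.txt. The output name withbom-concat.txt is made up, and carrying the BOM in a nested <header> element is an untested assumption about the simplest way to prefix it.

```xml
<concat encoding="UTF-8" outputencoding="UTF-8" destfile="withbom-concat.txt">
  <!-- header text is written before the file contents; &#xFEFF; is the BOM codepoint -->
  <header filtering="no">&#xFEFF;</header>
  <!-- the BOM-free result of the first concat -->
  <fileset file="nobom-concat.txt"/>
</concat>
```

The header is marked filtering="no" so that any filterchain added to this task later would not strip the BOM again.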

Resources