Ant: exclude a file based on its content - ant

Is there any way to exclude files from an ant fileset based on the file content?
We do have test servers where code files are mixed up with files that have been generated by a CMS.
Usually, the files are placed in different folders, but there is a risk that real code files are in the middle of generated code.
The only way to identify generated files is to open them and look at their content. If a file contains a keyword, it should be excluded.
Does anyone know a way to perform this with Ant?
From the answer provided by Preet Sangha, I should use a filterchain. However, I'm missing a step here.
Let's say I load a text file of exclusions to be performed:
<loadfile property="exclusions" srcFile="exclusions.txt" />
But I don't know how to integrate it into my current copy task:
<copy todir="${test.dir}">
<fileset dir="${src.dir}">
</fileset>
</copy>
I tried to add the following exclude to the fileset but it does not do anything:
<exclude name="${exclusions}"/>
I'm sure I'm missing a simple step...

Have a look at the not and contains selectors.
The documentation for the not selector contains an example of pretty much exactly what you're trying to do.
<copy todir="${test.dir}">
<fileset dir="${src.dir}">
<not>
<contains text="your-keyword-here"/>
</not>
</fileset>
</copy>
There's also the containsregexp selector, which might be useful if your criteria for exclusion are more complicated.
There are plenty more selectors you can use to refine your selection if needed.
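As a rough sketch of the containsregexp variant, assuming the generated files carry some marker text (the "Generated by ..." string below is made up for illustration, as is the regular expression):

```xml
<copy todir="${test.dir}">
  <fileset dir="${src.dir}">
    <not>
      <!-- hypothetical marker: replace with whatever your CMS actually writes -->
      <containsregexp expression="Generated by (CMS|cms-export)"/>
    </not>
  </fileset>
</copy>
```

Both contains and containsregexp have to read each candidate file, so on a large tree it may be worth narrowing the fileset with ordinary include patterns first.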

I don't know ant but reading the docs....
Can you build a list of files using a filterchain, and pass it to the excludesfile attribute of a fileset?
or
perhaps create a fileset with a filterchain that uses a filterreader and linecontainsregexp

Related

<zipfileset> vs. <fileset> in ant

The Ant build tool provides two different types, <fileset/> and <zipfileset/>.
According to the documentation, <zipfileset/> allows us to extract files from a .zip file
if we use the src attribute.
My question is: if we use the dir attribute to select files, what is the difference between <zipfileset/> and <fileset/>?
e.g.
<zipfileset dir="conf/Gateway">
<include name="jndi.properties" />
</zipfileset>
and
<fileset dir="conf/Gateway">
<include name="jndi.properties" />
</fileset>
One useful difference between the two types, if you're building an archive (a ZIP, WAR, or JAR for example), is that a zipfileset has a prefix attribute you can use to relocate the given files to a different folder in the archive. For example, if the following is included in a bigger set of fileset and zipfileset elements:
<zipfileset dir="conf/Gateway" prefix="properties">
<include name="jndi.properties" />
</zipfileset>
then the file conf/Gateway/jndi.properties will actually be included in the output as properties/jndi.properties. You can achieve the same end in other ways, but this is occasionally useful.
Otherwise, just use the type that seems most appropriate for the task at hand.
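To make the comparison concrete, here is a small sketch that puts the three variants side by side in one <zip> task. The destination file, the ${dist.dir} property, and lib/defaults.zip are hypothetical names, not taken from the question:

```xml
<zip destfile="${dist.dir}/bundle.zip">
  <!-- plain fileset: entries keep their path relative to dir -->
  <fileset dir="conf/Gateway" includes="jndi.properties"/>
  <!-- zipfileset with dir: selected the same way, but stored under properties/ -->
  <zipfileset dir="conf/Gateway" includes="*.xml" prefix="properties"/>
  <!-- zipfileset with src: entries are read from inside an existing archive -->
  <zipfileset src="lib/defaults.zip" includes="**/*.properties" prefix="defaults"/>
</zip>
```

With only dir and no prefix or other archive-specific attributes, a zipfileset should select the same files as the equivalent fileset.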

Why does ANT update the contents of a fileset after it was created, and can I override this?

I think this may be easiest explained by an example, so here goes:
<target name="test">
<fileset id="fileset" dir="target">
<include name="*"/>
</fileset>
<echo>${toString:fileset}</echo>
<touch file="target/test"/>
<echo>${toString:fileset}</echo>
</target>
Outputs:
test:
[echo]
[touch] Creating target/test
[echo] test
What I ideally want is to have the fileset stay the same so I can have a before/after set (in order to get a changed set using <difference>, so if you know of a way to skip right to that...).
I've tried using <filelist> instead, but I can't get this correctly populated and compared in the <difference> task (they're also hard to debug since I can't seem to output their contents). I also tried using <modified/> to select files in the fileset, but it doesn't seem to work at all and always returns nothing.
Even if there is an alternative approach I would appreciate a better understanding of what ANT is doing in the example above and why.
The fileset is evaluated on the fly. When a file is added, it shows up in the set the next time you use it.
You may be able to evaluate it and keep the result in a property using pathconvert. This can then be converted back to a fileset using pathtofileset (from Ant-Contrib).
A fileset is something like a selector. It's a set of "instructions" (inclusions, exclusions, patterns) that describe how to obtain a set of files.
Each time you actually do something with the fileset (like printing the files it "references"), the actual set of files is computed based on the "instructions" contained in the fileset.
As Jayan pointed out it might be worth posting the final outcome as an answer, so here's a simplified version with the key parts:
<fileset id="files" dir="${target.dir}"/>
<pathconvert property="before.files" pathsep=",">
<fileset refid="files"/>
</pathconvert>
<!-- Other Ant code changes the file-system. -->
<pathconvert property="after.files" pathsep=",">
<fileset refid="files"/>
</pathconvert>
<filelist id="before.files" files="${before.files}"/>
<filelist id="after.files" files="${after.files}"/>
<difference id="changed.files">
<filelist refid="before.files"/>
<filelist refid="after.files"/>
</difference>

Ant: use include and exclude together

OK, this seems like it should be really simple. I'm using Apache Ant 1.8, and I have a target which does:
<delete file="output/program.tar.bz2"/>
<tar basedir="input" destfile="output/program.tar.bz2" compression="bzip2">
<tarfileset dir="input">
<include name="goodfolder1/**"/>
<include name="goodfolder2/**"/>
<exclude name="**/badfile"/>
<exclude name="**/*.badext"/>
</tarfileset>
</tar>
I want it to make a .tar.bz2 of input/goodfolder1 and input/goodfolder2, excluding files named "badfile", and excluding files with extension ".badext". It's giving me a .tar.bz2, but it's including badfile and *.badext -- the excludes seem to be ignored.
The order of include/exclude doesn't seem to make a difference. I tried wrapping the includes/excludes in a <patternset> (the docs say it's implicit?), but it made no difference.
I'm sure there's something simple I'm missing, since the manual has a very similar example, though in a somewhat different context.
EDIT: It looks like it could be related to the dir="input" attribute: it's adding everything in "input", and then adding everything in the tarfileset to that. Files I want appear twice in the program.tar.bz2, but files that are excluded only appear once. But dir is mandatory, and I don't see how this is different from the examples in the manual.
Ah, the <tarfileset> itself is what was causing my problem.
If I remove that, and put the includes/excludes directly inside the <tar>, it works fine.
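The behaviour described in the edit is consistent with basedir turning the <tar> task itself into an implicit fileset over all of input, on top of whatever the nested <tarfileset> contributes. Under that reading, the other way out is to keep the <tarfileset> and drop basedir instead. A sketch using the paths from the question:

```xml
<tar destfile="output/program.tar.bz2" compression="bzip2">
  <!-- no basedir on <tar>, so only the nested tarfileset contributes entries -->
  <tarfileset dir="input">
    <include name="goodfolder1/**"/>
    <include name="goodfolder2/**"/>
    <exclude name="**/badfile"/>
    <exclude name="**/*.badext"/>
  </tarfileset>
</tar>
```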
I think you will need to use two <tarfileset> elements, one per folder. The <include> and <exclude> elements are for files only, not for directories. Note that basedir is dropped from <tar> here, since it would add everything under input a second time, and prefix keeps the folder names inside the archive. Example:
<tar destfile="output/program.tar.bz2" compression="bzip2">
<tarfileset dir="input/goodfolder1" prefix="goodfolder1">
<exclude name="**/badfile"/>
<exclude name="**/*.badext"/>
</tarfileset>
<tarfileset dir="input/goodfolder2" prefix="goodfolder2">
<exclude name="**/badfile"/>
<exclude name="**/*.badext"/>
</tarfileset>
</tar>

Why doesn't the `**/*ant*` exclude pattern work, when `**/*ant*/**` works fine?

Let's say I have this in one of my targets:
<path id="files">
<fileset dir="${env.DIRECTORY}" casesensitive="false">
<include name="**/*.html"/>
<exclude name="**/*ant*"/>
</fileset>
</path>
I'd like to group all the html files, except the ones containing the string ant. The way I wrote it above, it does not work. I also tried specifying the exclude like this:
<exclude name="*ant*"/>
Please notice that the fileset has its case sensitivity turned off. However, if I write:
<exclude name="**/*ant*/**"/>
This does work. Why don't the first and second versions of exclude work?
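The first and second patterns don't do what you want because they only test the last path element (and *ant* only at the top level of dir), while you are searching for a directory name containing ant.
The third pattern matches every file that has an element containing ant anywhere in its path, including ant in the file name itself.
You can also refer to the Ant documentation on patterns.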
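As a hedged illustration of how the three patterns differ under Ant's pattern rules, here is the fileset from the question with example paths in the comments. The paths are invented for illustration; in practice you would keep only one of the excludes:

```xml
<fileset dir="${env.DIRECTORY}" casesensitive="false">
  <include name="**/*.html"/>
  <!-- name of a top-level entry contains "ant": antler.html,
       but not docs/antler.html -->
  <exclude name="*ant*"/>
  <!-- last path element contains "ant", at any depth: docs/antler.html,
       but not ant-site/index.html -->
  <exclude name="**/*ant*"/>
  <!-- any path element contains "ant": ant-site/index.html,
       docs/plants/a.html, and also docs/antler.html -->
  <exclude name="**/*ant*/**"/>
</fileset>
```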

How to suppress ant jar warnings for duplicates

I don't want ant's jar task to notify me every time it skips a file because the file has already been added. I get reams of this:
[jar] xml/dir1/dir2.dtd already added, skipping
Is there a way to turn this warning off?
This is an older question, but there is one obvious way to avoid the duplicates warning: do not include the duplicate files. You could do this in one of two ways:
Exclude the duplicate files in some fashion, or
Copy the files to a staging area, so that the copy task deals with duplicates, not the jar task.
So, instead of doing:
<jar destfile="${dist}/lib/app.jar">
<fileset dir="a" includes="xml/data/*.xml,xml/data/*.dtd"/>
<fileset dir="b" includes="xml/data/*.xml,xml/data/*.dtd"/>
<fileset dir="c" includes="xml/data/*.xml,xml/data/*.dtd"/>
</jar>
do one of:
<jar destfile="${dist}/lib/app.jar">
<fileset dir="a" includes="xml/data/*.xml,xml/data/*.dtd"/>
<fileset dir="b" includes="xml/data/*.xml"/>
<fileset dir="c" includes="xml/data/*.xml"/>
</jar>
or
<copy todir="tmpdir">
<fileset dir="a" includes="xml/data/*.xml,xml/data/*.dtd"/>
<fileset dir="b" includes="xml/data/*.xml,xml/data/*.dtd"/>
<fileset dir="c" includes="xml/data/*.xml,xml/data/*.dtd"/>
</copy>
<jar destfile="${dist}/lib/app.jar">
<fileset dir="tmpdir" includes="xml/data/*.xml,xml/data/*.dtd"/>
</jar>
<delete dir="tmpdir"/>
Edit: Based on the comment to this answer, there is a third option, although it is a fair bit more work. You can always get the source of the jar task and modify it so that it does not print the warnings. You could keep this as a local modification to your tree, move it to a different package and maintain it yourself, or try to get the patch pushed back upstream.
I don't know of any options on the jar task to suppress these messages, unless you run the whole build with the -quiet switch, in which case you may not see other information you want.
In general, if you have lots of duplicate files it is a good thing to be warned about them, since a different copy from the one you expect may end up in the archive. This possibly indicates that a previous target of the build has not done its job as well as it might, though obviously without more details it is impossible to say.
Out of interest why do you have the duplicate files?

Resources