I want to rename files using Ant while maintaining their directory structure.
For example, assume the following directory structure:
- copy
- new
- testthis.a
Using the code below, I can rename files containing the word "this" to "that.a" with the copy task, but they all end up directly in the "paste" directory, losing their directory structure.
<copy todir="paste" overwrite="true">
<fileset dir="copy"/>
<regexpmapper from="^(.*)this(.*)\.a$$" to="that.a"/>
</copy>
Output:
- paste
- that.a
If I change the regexpmapper to this (notice the \1 before that.a):
<regexpmapper from="^(.*)this(.*)\.a$$" to="\1that.a"/>
This generates the correct directory structure, but it always prepends the text that comes before "this" to "that.a".
Output:
- paste
- new
- testthat.a
Is there any way to rename files while maintaining their directory structure, without prepending or appending any word?
Is there another mapper that can be used for this?
Any help would be appreciated.
<copy todir="paste" verbose="true">
<fileset dir="copy" includes="**/*this*.a"/>
<regexpmapper from="((?:[^/]+/)*)[^/]+$$" to="\1that.a" handledirsep="true"/>
</copy>
First, setting handledirsep="true" lets us use forward slashes to match backslashes. This makes the regular expression a bit cleaner.
Next, I'll explain the gnarly regex by breaking it into parts.
I explode ((?:[^/]+/)*) into...
(
(?:
[^/]+
/
)
*
)
What the parts mean:
( -- capture group 1 starts
(?: -- non-capturing group starts
[^/]+ -- greedily match as many non-directory separators as possible
/ -- match a single directory-separator character
) -- non-capturing group ends
* -- repeat the non-capturing group zero-or-more times
) -- capture group 1 ends
The above parts repeatedly match as many subdirectories as possible. The ( and ) put all of the matches into capture group 1. Later, capture group 1 can be used in the to attribute of <regexpmapper> with a \1 backreference.
If there are no / directory separators in a path, then the above parts won't match anything and capture group 1 will be an empty string.
Moving to the end of the regex, the $$ anchors the regex to the end of each path selected by the <fileset>.
In the double dollar-sign expression, $$, the first $ escapes the second $. This is necessary because Ant would treat a single $ as the start of a property reference.
The [^/]+ matches just the filename because it matches all characters at the end of the path that aren't directory separators (/).
Example
Given the following directory structure...
- copy (dir)
- new (dir)
- notthis.b
- testthis.a
- anythis.a
...Ant outputs...
[copy] Copying 2 files to C:\ant\paste
[copy] Copying C:\ant\copy\anythis.a to C:\ant\paste\that.a
[copy] Copying C:\ant\copy\new\testthis.a to C:\ant\paste\new\that.a
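As a quick sanity check outside Ant, the mapping can be reproduced in a few lines of Python. This is only a sketch: map_path is a hypothetical helper name, the backslash-to-slash replacement stands in for handledirsep="true", and Python's re module is assumed to behave like Ant's default regexp engine for this simple pattern.

```python
import re

# Same pattern as the mapper's "from" attribute; Ant's "$$" is a plain "$" here.
PATTERN = re.compile(r"((?:[^/]+/)*)[^/]+$")


def map_path(path):
    """Return the mapped target path, or None if the pattern does not match."""
    # Stand-in for handledirsep="true": treat backslashes as forward slashes.
    normalized = path.replace("\\", "/")
    match = PATTERN.match(normalized)
    if match is None:
        return None
    # Equivalent of to="\1that.a": keep the directories, replace the file name.
    return match.group(1) + "that.a"


if __name__ == "__main__":
    for source in ("anythis.a", "new/testthis.a", "new\\testthis.a"):
        print(source, "->", map_path(source))
```

Note that the sketch always returns forward slashes, whereas Ant writes the target using the platform's own separator.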
Try this regexpmapper:
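<regexpmapper from="^(.*)/([^/]*)this(.*)\.a$$" to="\1/that\3.a"/>
This captures the path (\1) and the filename prefix (\2) separately, so you can keep the directory structure while dropping the prefix.
Note that \3 only holds whatever sits between "this" and the .a extension; because \.a lies outside the group, the extension has to be written out in the replacement string.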
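To see what each group of this from pattern captures, here is a small Python sketch. The helper name capture_groups is made up for illustration, and Python's re module is assumed to match this pattern the same way Ant's default engine does.

```python
import re

# The "from" pattern of the mapper; Ant's "$$" is written as a plain "$" here.
FROM = re.compile(r"^(.*)/([^/]*)this(.*)\.a$")


def capture_groups(path):
    """Return the three captured groups as a tuple, or None if there is no match."""
    match = FROM.match(path)
    return match.groups() if match else None


if __name__ == "__main__":
    for source in ("new/testthis.a", "a/b/mythisfile.a", "anythis.a"):
        print(source, "->", capture_groups(source))
```

A path without any directory separator does not match at all, because the pattern requires a literal / before the file name.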
Related
Is it possible to use Ant to delete all lines in a text file that come after the first occurrence of a specific keyword?
Example
Line1
Line2
Line3
Line4
Line5
.....
Line1000
I want to delete everything in that file that comes after the "Line3" keyword, keeping that line itself.
Ant's replaceregexp task can handle this pretty easily:
<replaceregexp
file="input.txt"
match="(.*?Line3).*"
replace="\1"
flags="s"
/>
Brief explanation: The regex pattern captures everything up to and including the first "Line3" in a group (the lazy .*? stops at the earliest occurrence, whereas a greedy .* would run on to the last one), then continues to match the rest of the input. The replacement consists of only the captured group, effectively deleting the part you don't want. The s flag is switched on so that the . wildcard also matches newlines.
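Because the question asks about the first occurrence of the keyword, it is worth checking how a greedy .* and a lazy .*? differ on input where the keyword appears more than once. The following Python sketch uses re.DOTALL as the counterpart of the s flag; the function names and the 40-line sample are illustrative assumptions, not part of the Ant task.

```python
import re


def truncate_greedy(text, keyword):
    # Greedy .* backtracks from the end, so it keeps text up to the LAST keyword.
    pattern = "(.*" + re.escape(keyword) + ").*"
    return re.sub(pattern, r"\1", text, count=1, flags=re.DOTALL)


def truncate_lazy(text, keyword):
    # Lazy .*? expands only as far as needed, so it stops at the FIRST keyword.
    pattern = "(.*?" + re.escape(keyword) + ").*"
    return re.sub(pattern, r"\1", text, count=1, flags=re.DOTALL)


if __name__ == "__main__":
    # "Line3" also occurs inside "Line30" .. "Line39".
    sample = "\n".join("Line%d" % i for i in range(1, 41)) + "\n"
    print(len(truncate_greedy(sample, "Line3").splitlines()), "lines kept (greedy)")
    print(len(truncate_lazy(sample, "Line3").splitlines()), "lines kept (lazy)")
```

With a file such as the one in the question, where "Line3" also appears inside "Line30", "Line300" and so on, only the lazy form cuts at the third line.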
I am following instructions from this link on how to append Stata files via a foreach loop. I think that it's pretty straightforward.
However, when I try to refer to each f in datafiles in my foreach loop, I receive the error:
invalid `
I've set my working directory and the data is in a subfolder called csvfiles. I am trying to call each file f in the csvfiles subfolder using my local macro datafiles and then append each file to an aggregate Stata dataset called data.dta.
I've included the code from my do file below:
clear
local datafiles: dir "csvfiles" files "*.csv"
foreach f of local datafiles {
preserve
insheet using “csvfiles\`f'”, clear
** add syntax here to run on each file**
save temp, replace
restore
append using temp
}
rm temp
save data.dta, replace
The backslash character has special meaning in Stata: it prevents the interpretation of any following character that has a special meaning to Stata. In particular, a left single quote character (`) that follows a backslash will not be interpreted as the start of a macro reference.
But all is not lost: Stata will allow you to use the forward slash character in path names on any operating system, and on Windows will take care of doing what must be done to appease Windows. Replacing your insheet command with
insheet using "csvfiles/`f'", clear
should solve your problem.
Note that the instructions you linked to do exactly that; some of the code includes backslashes in path names, but where a macro is included, forward slashes are used instead.
We have almost 2.5 million files of archive data that need to be organized according to the year in which they were created. We need to move files from their current folder on our NAS to another folder that is the year in which the file was created. The destination folder is a four character year value (2003, 2004, etc.). The filename is in the format AAAAAAAAA_YYYYMMDD_BBBBBB.dfa where YYYY is the year value in which the file was created. The file extension can be either .dfa or .dfc. Folders for the appropriate year already exist, but files that are incorrectly placed in the wrong year must be moved to the appropriate year folder.
I need a batch file that will move files from their current location to the appropriate year folder on the NAS, but do not know how to parse the year value from the filename to move the file to the proper year.
Could someone help me with a batch file or script that will do this?
The following batch file walks through the root directory given by variable ROOT recursively and moves files into the appropriate year-folder:
@echo off
rem specify the root directory here (the directory containing the year folders):
set ROOT="."
rem define the file search pattern(s) here:
set PATTERNS="*.dfa" "*.dfc"
rem set this to non-empty for flexible file name parsing:
set FLEXMODE=
rem set this to a log file path
rem (log contains date/time, TRUE/FALSE for move success/failure, source, dest.):
set LOGF=".\movement.log"
setlocal EnableDelayedExpansion
rem loop through every file recursively
for /R %ROOT% %%F in (%PATTERNS%) do (
rem extract parent folder name
set PARENT=%%~dpF
set PARENT=!PARENT:~-5,4!
rem parse file name, extract year portion
if defined FLEXMODE (
for /F "tokens=2 delims=_" %%N in ("%%~nF") do (
set YEAR=%%N
set YEAR=!YEAR:~,4!
)
) else (
set YEAR=%%~nF
set YEAR=!YEAR:~10,4!
)
rem check whether parent folder name equals year portion of file name
if not "!PARENT!"=="!YEAR!" (
rem move file if not in appropriate year folder (no overwrite)
if not exist "%%~dpF..\!YEAR!\%%~nxF" (
move /Y "%%~fF" "%%~dpF..\!YEAR!" > nul
echo !DATE!, !TIME!    TRUE    "%%~fF"    "%%~dpF..\!YEAR!\%%~nxF"
) else (
echo !DATE!, !TIME!    FALSE    "%%~fF"    "%%~dpF..\!YEAR!\%%~nxF"
)
) >> %LOGF%
)
endlocal
Pre-Requisites:
set the variables in the beginning block accordingly:
ROOT: full root directory path;
PATTERNS: file pattern(s) to use for searching;
FLEXMODE: set this to a non-empty value if the AAAAAAAAA portion in your file names AAAAAAAAA_YYYYMMDD_BBBBBB.* may vary in length; in such cases, the first underscore _ is used to find the year YYYY; otherwise (empty value), the year is extracted by its position;
LOGF: path and name of a log file that will contain four columns (separated by 4 spaces): date and time, TRUE/FALSE to indicate success/failure, source file path, destination file path; files that are already placed correctly are not logged here;
the year folders are immediate children of the given root directory;
all files are located within a year-folder (wrong or right year);
you have sufficient access privileges within the entire root directory tree;
for script testing, place rem in front of the move command and review the log file;
Explanation:
core element is the for /R loop that walks through the directory hierarchy;
for each file, variable PARENT is set to the immediate parent directory name (the ~dp modifier in %%~dpF extracts drive and path, including a trailing backslash);
depending on FLEXMODE, the year YYYY portion is extracted from the file name (if FLEXMODE is defined, a for /F loop is used to parse the file name and split it into underscore-delimited tokens, the first 4 characters of the second token are the year, stored in YEAR; if FLEXMODE is empty, 4 characters at offset 10 are extracted and stored in YEAR);
next the extracted YEAR is checked against the parent directory name PARENT; if equal, do nothing else and go to next for /R iteration; otherwise, the file is moved but only if the destination does not yet exist;
finally, a log string is generated and returned by echo, which is then redirected to the specified log file;
for modifying variable values (PARENT, YEAR) and reading them within a loop structure (or within other code blocks), delayed expansion is required;
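For readers who prefer to prototype the logic before running it against 2.5 million files, the steps above can be sketched in Python. This is not a port of the batch file: year_from_name and move_misplaced are hypothetical names, and the sketch assumes the same layout (year folders directly under the root, every file inside some year folder).

```python
import shutil
from pathlib import Path


def year_from_name(name, flexible=False):
    """Extract YYYY from a name shaped like AAAAAAAAA_YYYYMMDD_BBBBBB.ext."""
    stem = Path(name).stem
    if flexible:
        # FLEXMODE equivalent: first 4 characters of the second "_" token.
        return stem.split("_")[1][:4]
    # Fixed layout: 9 characters plus "_" put the year at offset 10.
    return stem[10:14]


def move_misplaced(root, patterns=("*.dfa", "*.dfc"), flexible=False):
    """Move files into the sibling year folder; return (moved, src, dst) tuples."""
    log = []
    # Collect candidates first so files moved during the run are not revisited.
    candidates = sorted(
        p for pat in patterns for p in Path(root).rglob(pat) if p.is_file()
    )
    for src in candidates:
        year = year_from_name(src.name, flexible)
        if src.parent.name == year:
            continue  # already in the right folder, nothing to log
        dst = src.parent.parent / year / src.name
        # Only move into an existing year folder, and never overwrite.
        ok = dst.parent.is_dir() and not dst.exists()
        if ok:
            shutil.move(str(src), str(dst))
        log.append((ok, src, dst))
    return log
```

Calling move_misplaced on a copy of a small subtree first, and inspecting the returned list, plays the same role as putting rem in front of the move command.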
I am trying to concatenate multiple files (say, 15 txt files) into a single file at the same time, using separate Ant calls.
Say there are 15 concat tasks running at the same time.
However, the output file is not what I expected.
The data in the output file is corrupted.
Does anyone have an idea how to solve this problem?
Example:
Input 1:
a=1
b=2
c=3
Input 2:
d=4
e=5
f=6
Output:
a=1
b=2
d=4
e
c=3=5
f=6
You can do this with the concat task, which takes resource collections such as filesets as nested elements, allowing you to concatenate all the files in a single task call. Example:
<concat destfile="${build.dir}/output.txt">
<fileset file="${src.dir}/input1.txt" />
<fileset file="${src.dir}/input2.txt" />
</concat>
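The idea behind the single call is that one writer appends the inputs one after another, so their contents cannot interleave. A minimal Python sketch of that idea follows; concat_files is a hypothetical helper, not something Ant provides.

```python
from pathlib import Path


def concat_files(sources, dest):
    """Append each source file to dest in order, using a single writer."""
    with open(dest, "wb") as out:
        for src in sources:
            # Each input is written completely before the next one starts.
            out.write(Path(src).read_bytes())
```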
I have the following strings as input for scheduler file
Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-hr454\SRISM.xml
Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-lr454\Swap_MUL.xml
Z:\cnt_development\cnt\test\Test-cases-blr\v80-WM\scheduler\FRQ\AUTO\sml-lr456\Swap_MU.xml
I need to extract everything from v80-WM onwards,
i.e. the regex must be able to select the following strings:
v80-WM\scheduler\FRQ\AUTO\sml-hr454\SRISM.xml
v80-WM\scheduler\FRQ\AUTO\sml-lr454\Swap_MUL.xml
v80-WM\scheduler\FRQ\AUTO\sml-lr456\Swap_MU.xml
Currently I am using the following regex, which finds the last occurrence of "Q" in the string and trims from there; I then use a workaround to construct the results shown above.
<echo message="runpART ... Scheduler File ${schedulerFile}"/>
<propertyregex property="cfg.arg" input="${schedulerFile}" regexp="([^Q]*).xml" select="\1" casesensitive="false"/>
I need help extracting the string from "v80-WM" through ".xml".
Any input would be helpful.
That's good. The v80-WM gives you a fixed "starting point".
Using this as your regular expression should do it:
^.*(v80-WM.*)
What it means:
^.* = match anything until you get to (the caret isn't really necessary, but I like making the regexp more strict)
v80-WM
.* = then match the rest
The parens include the v80-WM name and everything that comes after so you don't have to reconstruct it.
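A quick way to try this outside Ant is the Python sketch below. It assumes the intended pattern is ^.*(v80-WM.*), with group 1 playing the role of select="\1"; extract_tail is a made-up helper name.

```python
import re

# Assumed pattern: skip everything before the marker, capture from the marker on.
PATTERN = re.compile(r"^.*(v80-WM.*)")


def extract_tail(path):
    """Return the part of the path starting at v80-WM, or None if it is absent."""
    match = PATTERN.search(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        r"Z:\cnt_development\cnt\test\Test-cases-blr"
        r"\v80-WM\scheduler\FRQ\AUTO\sml-hr454\SRISM.xml"
    )
    print(extract_tail(sample))
```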