I find something interesting when I try to read text files from a folder:
$files = Get-ChildItem "C:\MyFolder" -Recurse | ?{$_.Extension -like '.txt'}
foreach ($file in $files) {
[string]$fileFullName = $file.FullName
$content = Get-Content -Path $fileFullName
....
}
I get exceptions when a file name contains [ and ] characters, for example a[1].txt.
I guess that everything is an object in PS, so as I understand it, the [..] in a file name may be treated as index access. Is there any way to deal with files that have brackets in their names?
You want -LiteralPath:
$files = Get-ChildItem "C:\MyFolder" -Recurse | ? { $_.Extension -like '.txt' }
foreach ($file in $files) {
[string]$fileFullName = $file.FullName
$content = Get-Content -LiteralPath $fileFullName
$content
}
From the detailed help you will see:
-LiteralPath
Specifies the path to an item. Unlike Path, the value of LiteralPath is used exactly as it is typed. No characters are
interpreted as wildcards. If the path includes escape characters,
enclose it in single quotation marks. Single quotation marks tell
Windows PowerShell not to interpret any characters as escape
sequences.
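A small sketch of the difference, assuming a scratch folder under the temp directory (the folder name bracket-demo and the file contents are made up for illustration):

```powershell
# Hypothetical scratch folder; any writable location would do.
$dir = Join-Path ([IO.Path]::GetTempPath()) 'bracket-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null

# Create a file whose name contains wildcard characters.
$file = Join-Path $dir 'a[1].txt'
[IO.File]::WriteAllText($file, 'hello')

# -Path treats [1] as a wildcard character class, so it looks for a file
# named a1.txt rather than a[1].txt.
$viaPath = Get-Content -Path $file -ErrorAction SilentlyContinue

# -LiteralPath uses the name exactly as given.
$viaLiteral = Get-Content -LiteralPath $file

# Clean up the scratch folder.
Remove-Item -LiteralPath $dir -Recurse -Force
```

Most file cmdlets (Get-ChildItem, Copy-Item, Remove-Item, Test-Path) take the same -LiteralPath parameter, so the same fix applies to them.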
I am trying to parse an INF; specifically, driver version from the file. I am new to PowerShell, so I've gotten only this far.
The file looks like this:
[Version]
Signature = "$WINDOWS NT$"
Class = Bluetooth
ClassGuid = {e0cbf06c-cd8b-4647-bb8a-263b43f0f974}
Provider = %PROVIDER_NAME%
CatalogFile = ibtusb.cat
DriverVer=11/04/2014,17.1.1440.02
CatalogFile=ibtusb.cat
The second last line has the information I am looking for. I am trying to parse out just 17.1.1440.02.
One file may contain multiple lines with DriverVer=..., but I am only interested in the first instance.
Right now I have the following script.
$path = "C:\FilePath\file.inf"
$driverVersion = Select-String -Pattern "DriverVer" -path $path
$driverVersion[0] # lists only first instance of 'DriverVer'
$driverVersion # lists all of the instances with 'DriverVer'
Output is:
Filepath\file.inf:7:DriverVer=11/04/2014,17.1.1440.02
But I am only looking for 17.1.1440.02
Make your expression more specific and make the part you want to extract a capturing group.
$pattern = 'DriverVer\s*=\s*(?:\d+/\d+/\d+,)?(.*)'
Select-String -Pattern $pattern -Path $path |
select -Expand Matches -First 1 |
% { $_.Groups[1].Value }
Regular expression breakdown:
DriverVer\s*=\s* matches the string "DriverVer" followed by any amount of whitespace, an equals sign and again any amount of whitespace.
(?:\d+/\d+/\d+,)? matches an optional date followed by a comma in a non-capturing group ((?:...)).
(.*) matches the rest of the line, i.e. the version number you want to extract. The parentheses without the ?: make it a capturing group.
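To see the capturing group in isolation, here is a minimal sketch that applies the same pattern to the sample line with the -match operator instead of Select-String (the variable names $line and $version are only for illustration):

```powershell
$pattern = 'DriverVer\s*=\s*(?:\d+/\d+/\d+,)?(.*)'
$line    = 'DriverVer=11/04/2014,17.1.1440.02'

# -match fills the automatic $Matches table; index 1 is the first capturing group.
if ($line -match $pattern) {
    $version = $Matches[1]
}

$version   # 17.1.1440.02
```

Because the date group is optional, a line such as DriverVer = 17.1.1440.02 without a date would yield the same version string.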
Another option (if the version number is always preceded by a date) would be to just split the line at the comma and select the last field (index -1):
Get-Content $path |
Where-Object { $_ -like 'DriverVer*' } |
Select-Object -First 1 |
ForEach-Object { $_.Split(',')[-1] }
I have a large number of files in numerous directories with this type of naming convention: "filename_yymmdd.csv", etc. I need to remove the underscore and the yymmdd, so the new file name would be "filename.csv". I need to search recursively for .csv files and remove the date and underscore in PowerShell V2.0.
$pattern = '(.*)_\d{6}(\.csv)$'
Get-ChildItem -Recurse | ? { $_.Name -match $pattern } |
Rename-Item -NewName { $_.Name -replace $pattern, '$1$2' }
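Before renaming anything on disk, it can help to try the replacement on plain strings. The sketch below uses a pattern with the dot escaped and the end anchored; the sample names are invented for illustration:

```powershell
$pattern = '(.*)_\d{6}(\.csv)$'

# Made-up names: one that should change and two that should not.
$names = 'filename_140311.csv', 'report_2014.csv', 'notes.csv'

# $1 is everything before the underscore, $2 is the extension.
$renamed = $names | ForEach-Object { $_ -replace $pattern, '$1$2' }

$renamed
```

Rename-Item also accepts -WhatIf, which prints the renames it would perform without touching the files.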
I need to extract part of each line in this file. For example, I need to extract the following
Main, Branches\Branch1
into one variable, and I cannot have duplicate values.
Is this possible with PowerShell?
This is the file:
This is a garbage line
This is another garbage line
c:\Folder\Main\Folder\..\Folder
c:\Folder\Main\Folder\..\Folder
c:\Folder\Branches\Branch1\Folder\..\Folder
c:\Folder\Branches\Branch1\Folder\..\Folder
c:\Folder\Branches\Branch1\Folder\..\Folder
c:\Folder\Main\Folder\..\Folder
c:\Folder\Main\Folder\..\Folder
this is the final line..
But of course ...
Assuming $files contains your lines:
$files = Get-content "your file"
You can use the following to be sure that there is no duplicate :
$files | Sort-Object -Unique
Then you can use Test-path to be sure that path exists
$files | Sort-Object -Unique | where {Test-Path $_ -ErrorAction SilentlyContinue}
This will extract those values from the sample data using a -like filter to take out the garbage and a -replace to do the extract. The sort -unique will remove the duplicates, but it won't keep the extracted values in the same order they were in the file.
(get-content testfile.txt) -like 'c:\Folder*' -replace 'c:\\Folder\\(.+?)\\Folder.+','$1' |
sort -unique
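If the original order matters, one possible variation is to swap the sort for Select-Object -Unique, which keeps the first occurrence of each value in the order it appears. This is a sketch on an inline copy of a few sample lines rather than the real file:

```powershell
# A few lines from the sample, inlined so the snippet is self-contained.
$lines = @(
    'This is a garbage line',
    'c:\Folder\Main\Folder\..\Folder',
    'c:\Folder\Branches\Branch1\Folder\..\Folder',
    'c:\Folder\Main\Folder\..\Folder'
)

# Same -like filter and -replace as above; Select-Object -Unique preserves order.
$values = @($lines -like 'c:\Folder*' -replace 'c:\\Folder\\(.+?)\\Folder.+', '$1' |
    Select-Object -Unique)

# Join into a single variable, as asked in the question.
$joined = $values -join ', '
$joined
```

Note that Select-Object -Unique compares strings case-sensitively, so Main and main would both be kept.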
I have a csv document with multiple headers like:
"Date","RQ","PM","SME","Activity","Status code"
"2/2/12","6886","D_WV","John Smith","Recent","2004"
and a text document that is just a list of status codes, one per line.
I am trying to figure out how to remove all lines from the CSV that contain the status codes from the text file.
So far I have tried using:
$m = gc textfile.txt
Select-String data.csv -Pattern $m -NotMatch
However that leaves me with extra data such as
data.csv:1:"Date","RQ","PM","SME","Activity","Status code"
data.csv:2:"2/2/12","6886","D_WV","John Smith","Recent","2004"
I have also tried:
gc data.csv | ? { $_ -notlike $m }
That uses the proper formatting but does not want to remove any of the values. Any help is much appreciated.
Those matchinfo objects from select-string can be confusing.
Does this do what you need?
$m = gc textfile.txt
select-string data.csv -pattern $m -notmatch |
select -expand line
I'd suggest a different approach to avoid false positives:
$m = Get-Content textfile.txt
Import-Csv data.csv `
| ? { $m -notcontains $_."Status code" } `
| Export-Csv output.csv -NoTypeInformation
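To illustrate the kind of false positive meant here, the sketch below uses an inline CSV in which the second data row is invented: its RQ column happens to contain 2004 while its status code is 3001. A plain text match on 2004 would drop that row too, whereas comparing only the "Status code" column keeps it:

```powershell
# Inline sample; the second data row is made up to show the false positive.
$csv = @'
"Date","RQ","PM","SME","Activity","Status code"
"2/2/12","6886","D_WV","John Smith","Recent","2004"
"2/3/12","2004","D_WV","Jane Doe","Recent","3001"
'@

# Stand-in for the contents of textfile.txt.
$codes = @('2004')

$rows = $csv | ConvertFrom-Csv

# Only the "Status code" column is compared, so the 2004 in RQ is ignored.
$kept = @($rows | Where-Object { $codes -notcontains $_.'Status code' })

$kept
```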
All, I'm very new to powershell and am hoping someone can get me going on what I think would be a simple script.
I need to parse a text file, capture certain lines from it, and save those lines as a csv file.
For example, each alert is in its own text file. Each file is similar to this:
--start of file ---
Name John Smith
Dept Accounting
Codes bas-2349,cav-3928,deg-3942
iye-2830,tel-3890
Urls hxxp://blah.com
hxxp://foo.com, hxxp://foo2.com
Some text I dont care about
More text i dont care about
Comments
---------
"here is a multi line
comment I need
to capture"
Some text I dont care about
More text i dont care about
Date 3/12/2013
---END of file---
For each text file, I want to write only Name, Codes, and Urls to a CSV file. Could someone help me get going on this?
I'm more a PERL guy so I know I could write a regex for capturing a single line beginning with Name. However I am completely lost on how I could read the "Codes" line when it might be one line or it might be X lines long until I run into the Urls field.
Any help would be greatly appreciated!
Text parsing usually means regex. With regex, sometimes you need anchors to know when to stop a match and that can make you care about text you otherwise wouldn't. If you can specify that first line of "Some text I don't care about" you can use that to "anchor" your match of the URLs so you know when to stop matching.
$regex = @'
(?ms)Name (.+)?
Dept .+?
Codes (.+)?
Urls (.+)?
Some text I dont care about.+
Comments
---------
(.+)?
Some text I dont care about
'@
$file = 'c:\somedir\somefile.txt'
[IO.File]::ReadAllText($file) -match $regex
if ([IO.File]::ReadAllText($file) -match $regex)
{
$Name = $matches[1]
$Codes = $matches[2] -replace '\s+',','
$Urls = $matches[3] -replace '\s+',','
$comment = $matches[4] -replace '\s+',' '
}
$Name
$Codes
$Urls
$comment
If the file is not too big to be processed in memory, the simple way is to read it as an array of strings. (What "too big" means depends on your system. Anything sub-gigabyte should work without too much of a hiccup.)
After you've read the file, set up head and tail counters that both point to element zero. Move the tail pointer forward row by row until you find the date row. You can match data with regexps. Now you know the start and end of a single record. For the next record, set the head counter to tail+1 and the tail to tail+2, and start scanning rows again. Lather, rinse, repeat until the end of the array is reached.
When a record is matched, you can extract the name with a regex. Codes and Urls are a bit trickier. Match the Codes row with a regex, then extract it and all the following rows until one no longer matches the code pattern. The same goes for the Urls data. If the file always has whitespace padding on rows that continue a previous Urls or Codes row, you could also match on the leading whitespace with a regexp to pick up those continuation rows.
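The continuation-row idea could be sketched roughly as follows. The helper name Get-FieldValue is hypothetical, and the sketch assumes that continuation rows always start with whitespace and that a field ends at the first row that does not:

```powershell
# Hypothetical helper: returns a field's value plus any indented continuation rows.
function Get-FieldValue {
    param([string[]]$Lines, [string]$Field)

    $start   = '^{0}\s+(.*)$' -f [regex]::Escape($Field)
    $values  = @()
    $inField = $false

    foreach ($line in $Lines) {
        if ($line -match $start) {
            # First row of the field, e.g. "Codes   bas-2349,..."
            $inField = $true
            $values += $Matches[1].Trim()
        }
        elseif ($inField -and $line -match '^\s+(\S.*)$') {
            # Indented row: more data for the current field.
            $values += $Matches[1].Trim()
        }
        elseif ($inField) {
            # First non-indented row after the field: stop scanning.
            break
        }
    }
    $values -join ','
}

# Inline copy of the sample record, with padding assumed on continuation rows.
$lines = @(
    'Name    John Smith',
    'Dept    Accounting',
    'Codes   bas-2349,cav-3928,deg-3942',
    '        iye-2830,tel-3890',
    'Urls    hxxp://blah.com',
    '        hxxp://foo.com, hxxp://foo2.com',
    'Some text I dont care about'
)

$name  = Get-FieldValue -Lines $lines -Field 'Name'
$codes = Get-FieldValue -Lines $lines -Field 'Codes'
```

This covers a single record per file. For several records in one array, the head and tail counters described above would decide which slice of rows is passed in.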
Maybe something like this would do it:
foreach ($Line in gc file.txt) {
switch -regex ($Line) {
'^(Name|Dept|Codes|Urls)' {
$Capture = $true
break
}
'^[A-Za-z0-9_-]+' {
$Capture = $false
break
}
}
if ($Capture) {
$Line
}
}
If you want the end result as a CSV file then you may use the Export-Csv cmdlet.
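As a rough illustration of that last step, the sketch below builds one object from already-extracted values (hard-coded here for the example) and uses ConvertTo-Csv, which produces the same text Export-Csv would write to a file:

```powershell
# Values hard-coded for illustration; in practice they would come from the parsing step.
$record = New-Object PSObject -Property @{
    Name  = 'John Smith'
    Codes = 'bas-2349,cav-3928'
    Urls  = 'hxxp://blah.com'
}

# Hashtable key order is not guaranteed, so Select-Object fixes the column order.
$csvLines = $record |
    Select-Object Name, Codes, Urls |
    ConvertTo-Csv -NoTypeInformation

$csvLines
```

Replacing ConvertTo-Csv with Export-Csv and a path writes the same rows to disk. Values that contain commas are quoted, so multi-value fields such as Codes stay in one column.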
Assuming that c:\temp\file.txt contains:
Name John Smith
Dept Accounting
Codes bas-2349,cav-3928,deg-3942
iye-2830,tel-3890
Urls hxxp://blah.com
hxxp://foo.com
hxxp://foo2.com
Some text I dont care about
More text i dont care about
.
.
Date 3/12/2013
You can use regular expressions like this :
$a = Get-Content C:\temp\file.txt
$b = [regex]::match($a, "^.*Codes (.*)Urls (.*)Some.*$", "Multiline")
$codes = $b.groups[1].value -replace '[ ]{2,}',','
$urls = $b.groups[2].value -replace '[ ]{2,}',','
If all files have the same structure you could do something like this:
$srcdir = "C:\Test"
$outfile = "$srcdir\out.csv"
$re = '^Name (.*(?:\r\n .*)*)\r\n' +
'Dept .*(?:\r\n .*)*\r\n' +
'Codes (.*(?:\r\n .*)*)\r\n' +
'Urls (.*(?:\r\n .*)*)' +
'[\s\S]*$'
Get-ChildItem $srcdir -Filter *.txt | % {
[io.file]::ReadAllText($_.FullName)
} | Select-String $re | % {
$f = $_.Matches | % { $_.Groups } | ? { $_.Index -gt 0 }
New-Object -TypeName PSObject -Prop @{
'Name' = $f[0].Value;
'Codes' = $f[1].Value;
'Urls' = $f[2].Value;
}
} | Export-Csv $outfile -NoTypeInformation