I need to get numbers out of a website script after a certain string - powershell-2.0

I'm trying to get a certain string of numbers (the numbers vary in length on each reload) out of a script tag on a website. However, I am struggling to figure out how to do it as I am stuck with PowerShell v2 and cannot upgrade it higher.
I've managed to get the full script by loading the site in IE and getting the elements by tag name "script", and I've tried some regex to find the string but can't quite figure it out.
I have also tried stripping the characters off the front and back of the script, that's when I realised the lengths of the numbers change each time.
Part of the script is:
var value = document.wizform.selActivities.options[document.wizform.selActivities.selectedIndex].value;
if (value == "Terminate") {
if (confirm("Are you sure you want to terminate the selected business process(es)?")) {
document.wizform.action = "./Page?next=page.actionrpt&action=terminate&pos=0&1006999619";
javascript:document.wizform.submit();
}
} else if (value == "TerminateAndRestart") {
if (confirm("Are you sure you want to terminate and restart the selected business process(es)?")) {
document.wizform.action = "./Page?next=page.actionrpt&action=terminateandrestart&pos=0&237893352";
javascript:document.wizform.submit();
}
}
The part I want to capture is the numbers here
document.wizform.action = "./Page?next=page.actionrpt&action=terminateandrestart&pos=0&237893352";
The PowerShell code I have so far is
$checkbox = $ie.Document.getElementsByTagName("script") | Where-Object {
$_.outerHTML -like "*./Page?next=page.actionrpt&action=terminate*"
} # | select -Expand outerHTML
$content = $checkbox
$matches = [regex]::Matches($content, '".\action=terminate\.([^"]+)')
$matches | ForEach-Object {
$_.Groups[1].Value
}
What I would like is for PowerShell to hold just the number in a variable, so in the example above I would like to end up with either 0&237893352 or just 237893352 (the 0& prefix does not change, so I can add it back in afterwards if I need to).

Use a positive lookbehind assertion for matching the particular action you're interested in:
$re = '(?<=action=terminateandrestart&pos=)0&\d+'
$content |
Select-String -Pattern $re |
Select-Object -Expand Matches |
Select-Object -Expand Value
(?<=...) is a regular expression construct called "positive lookbehind assertion" that allows for matching something that is preceded by a particular string (in your case "action=terminateandrestart&pos=") without making that string part of the returned match. This way you can look for the string "action=terminateandrestart&pos=" followed by "0&" and one or more digits (\d+) and return only "0&" and the digits.
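If you only want the digits, one possible variation is to move the "0&" into the lookbehind as well. This is a minimal sketch rather than part of the code above: the sample line is copied from the question, and $script and $number are names made up for the illustration. It uses [regex]::Match, which is available in PowerShell v2.

```powershell
# Sample line from the question; in practice this would be the script text
# taken from the page (for example the outerHTML of the matching element).
$script = 'document.wizform.action = "./Page?next=page.actionrpt&action=terminateandrestart&pos=0&237893352";'

# The lookbehind now also covers "0&", so the match itself is only the digits.
$re = '(?<=action=terminateandrestart&pos=0&)\d+'

$number = [regex]::Match($script, $re).Value
$number
```

If the prefix is needed later it can be put back with "0&$number".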


Search string with specific termination DXL

I'm trying to search in DXL for a string that ends with specific characters, but I can't find a way to do it.
For example, I'm looking for
" A: 23.1.23.2.4"
But if the text contains the character "~" at the end, the find function does not work.
For example, the skip list contains "A: 12.2.1.4.5~ text text text text".
I just need to know whether object.text contains A: 12.2.1.4.5
string string_text = "A: 12.2.1.4.5"
if(find(skip[i],string_text,string_text)){
modify_attributes(req_text)
}else{
output << "string not found : "
}
Use a regular expression, like this:
void modify_attributes (string fulltext) {print "modifying.."}
string fulltext = "A: 12.2.1.4.5~ text text text text"
// dots are escaped so that they match a literal "." and not any character
Regexp searchme = regexp2 "A: 12\\.2\\.1\\.4\\.5"
if(searchme (fulltext)){
modify_attributes(fulltext)
}else{
print "string not found "
}
The find method for Skip lists is fast (O(1), if I am not mistaken; a textbook skip list gives O(log n)), but for it to work the key you ask for has to match exactly.
So, to benefit from the speed of value retrieval by the find method, I suggest you look at the part of your code where you put entries into your Skip, and only store "clean" keys that you know you will want to ask for later.
Of course, that only works if you are able to do so, i.e. you don't get the Skip from somewhere you have no control over.

Is this reserve declaration hard to understand or defective?

I've got a problem with using a reserve (backslash) declaration for priority disambiguation. Below is a self-contained example. The production 'IPv4Address' is a strict subset of 'Domain0'. In parsing URLs, though, you want dotted-quad addresses to be handled differently than domain names, so you want to split 'Domain0' into two parts; 'Domain1' is one of those two parts. The test suite included, however, is failing at 't3()', where 'Domain1' is accepting an IP address, which looks like it should be excluded.
Is this a problem with the reserve declaration, or is this a defect in the current version of Rascal? I'm on the 0.10.x unstable branch at present, per advice to see if that corrected a different problem (with the Tutor). I haven't checked with the stable branch since keeping them both installed means parallel Eclipse environments, which I haven't been motivated to do.
module grammar_test
import ParseTree;
syntax Domain0 = { Subdomain '.' }+;
syntax Domain1 = Domain0 \ IPv4Address ;
lexical Subdomain = [0-9A-Za-z]+ | [0-9A-Za-z]+'-'[a-zA-Z0-9\-]*[a-zA-Z0-9] ;
lexical IPv4Address = DecimalOctet '.' DecimalOctet '.' DecimalOctet '.' DecimalOctet ;
lexical DecimalOctet = [0-9] | [1-9][0-9] | '1'[0-9][0-9] | '2'[0-4][0-9] | '25'[0-5] ;
test bool t1()
{
return parseAccept(#IPv4Address, "192.168.0.1");
}
test bool t2()
{
return parseAccept(#Domain0, "192.168.0.1");
}
test bool t3()
{
return parseReject(#Domain1, "192.168.0.1");
}
bool parseAccept( type[&T<:Tree] begin, str input )
{
try
{
parse(begin, input, allowAmbiguity=false);
}
catch ParseError(loc _):
{
return false;
}
return true;
}
bool parseReject( type[&T<:Tree] begin, str input )
{
try
{
parse(begin, input, allowAmbiguity=false);
}
catch ParseError(loc _):
{
return true;
}
return false;
}
This example has been cut down from larger code. I first encountered the error in a larger scope. Using the rule "IPv4Address | Domain1" was throwing an Ambiguity exception, which I tracked down to the behavior that "Domain1" was accepting something it shouldn't be. Curiously "IPv4Address > Domain1" was also throwing Ambiguity, but I'm guessing this has the same root cause as the present isolated example.
The difference operator for keyword reservations currently only works correctly if the right-hand side is a finite language expressed as a disjunction of literal keywords like "if" | "then" | "while", or a non-terminal which is defined like that: lexical X = "if" | "then" | "while". Then you can write A \ X for some effect.
For other types of non-terminals the parser is still generated, but the \ constraint has no effect. You wrote Domain0 \ IPv4Address, and IPv4Address does not satisfy the above assumption.
(We should either add a warning about that or generate a parser which can implement the full semantics of language difference; but that's for another time).
Admittedly, such a powerful difference operator could be used to express some order of preference between non-terminals. Alas.
Possible (sketches of) solutions:
two-pass solution: parse the input using the more general Subdomain syntax, then pattern match and rewrite all quadruples to IPv4Address in a single pass
maximal munch solution: adapt the grammar using follow restrictions to implement eager behavior for the IPv4Address, like {Subdomain !>> [.][0-9] "."}+ or something in that vein.

PowerShell Parse INF file

I am trying to parse an INF; specifically, driver version from the file. I am new to PowerShell, so I've gotten only this far.
The file looks like this:
[Version]
Signature = "$WINDOWS NT$"
Class = Bluetooth
ClassGuid = {e0cbf06c-cd8b-4647-bb8a-263b43f0f974}
Provider = %PROVIDER_NAME%
CatalogFile = ibtusb.cat
DriverVer=11/04/2014,17.1.1440.02
CatalogFile=ibtusb.cat
The second last line has the information I am looking for. I am trying to parse out just 17.1.1440.02.
One file may contain multiple lines with DriverVer=..., but I am only interested in the first instance.
Right now I've the following script.
$path = "C:\FilePath\file.inf"
$driverVersion = Select-String -Pattern "DriverVer" -path $path
$driverVersion[0] # lists only first instance of 'DriverVer'
$driverVersion # lists all of the instances with 'DriverVer'
Output is:
Filepath\file.inf:7:DriverVer=11/04/2014,17.1.1440.02
But I am only looking for 17.1.1440.02
Make your expression more specific and make the part you want to extract a capturing group.
$pattern = 'DriverVer\s*=\s*(?:\d+/\d+/\d+,)?(.*)'
Select-String -Pattern $pattern -Path $path |
select -Expand Matches -First 1 |
% { $_.Groups[1].Value }
Regular expression breakdown:
DriverVer\s*=\s* matches the string "DriverVer" followed by any amount of whitespace, an equals sign and again any amount of whitespace.
(?:\d+/\d+/\d+,)? matches an optional date followed by a comma in a non-capturing group ((?:...)).
(.*) matches the rest of the line, i.e. the version number you want to extract. The parentheses without the ?: make it a capturing group.
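To see how the optional date group behaves, here is a small sketch that runs the same pattern against two in-memory lines instead of a file. The second sample line (no date, spaces around the equals sign) is made up for the illustration, as are the names $samples and $versions.

```powershell
$pattern = 'DriverVer\s*=\s*(?:\d+/\d+/\d+,)?(.*)'

# First line is from the question; the second is a hypothetical variant
# without a date, to exercise the optional non-capturing group.
$samples = @(
    'DriverVer=11/04/2014,17.1.1440.02',
    'DriverVer = 17.1.1440.02'
)

# -match fills the automatic $Matches table; index 1 is the capturing group.
$versions = @($samples | ForEach-Object {
    if ($_ -match $pattern) { $Matches[1] }
})

$versions
```

Both lines should yield the same version string, because the date and comma are consumed by the optional group when present and skipped when absent.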
Another option (if the version number is always preceded by a date) would be to just split the line at the comma and select the last field (index -1):
Get-Content $path |
Where-Object { $_ -like 'DriverVer*' } |
Select-Object -First 1 |
ForEach-Object { $_.Split(',')[-1] }

Variable range operator in PowerShell

I am trying to use a range operator to input a series of numbers for use in a PowerShell script. Here is my code:
$computers = servername + [1-9]
I would like the $computers variable to iterate over 1-9, i.e. servername1, servername2, etc. Any ideas?
1..9 | % { $computers += "servername$_`n" }
And the variable $computers will contain:
servername1
servername2
servername3
[...]
Try running only the 1..9 part on your command line and it'll be easier to see what's going on. You could also read up on arrays in PowerShell with Get-Help about_Arrays - look for the part about "range operator" near the beginning.
The following line of code does the same thing (and seems cleaner to me) and might be easier to understand as well.
$computers = 1..9 | foreach { "servername$_" }
Or simply 1..9 | foreach { "servername$_" } to see it on screen without saving it in a variable.
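One practical difference between the two approaches is that the += version builds a single newline-separated string, while the pipeline version gives you an array. A quick sketch of what that array looks like (the "Checking" message is only a placeholder for whatever you do per server):

```powershell
# Build an array of nine names with the range operator.
$computers = 1..9 | ForEach-Object { "servername$_" }

$computers.Count     # number of names generated
$computers[0]        # first element
$computers[-1]       # last element

# Because it is an array, each name can be handled on its own.
foreach ($name in $computers) {
    "Checking $name"
}
```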

Powershell parse parts of a text file and save to CSV

All, I'm very new to powershell and am hoping someone can get me going on what I think would be a simple script.
I need to parse a text file, capture certain lines from it, and save those lines as a csv file.
For example, each alert is in its own text file. Each file is similar to this:
--start of file ---
Name John Smith
Dept Accounting
Codes bas-2349,cav-3928,deg-3942
iye-2830,tel-3890
Urls hxxp://blah.com
hxxp://foo.com, hxxp://foo2.com
Some text I dont care about
More text i dont care about
Comments
---------
"here is a multi line
comment I need
to capture"
Some text I dont care about
More text i dont care about
Date 3/12/2013
---END of file---
For each text file, I want to write only Name, Codes, and Urls to a CSV file. Could someone help me get going on this?
I'm more of a Perl guy, so I know I could write a regex for capturing a single line beginning with Name. However, I am completely lost on how to read the "Codes" line when it might be one line or X lines long before I run into the Urls field.
Any help would be greatly appreciated!
Text parsing usually means regex. With regex, sometimes you need anchors to know when to stop a match and that can make you care about text you otherwise wouldn't. If you can specify that first line of "Some text I don't care about" you can use that to "anchor" your match of the URLs so you know when to stop matching.
$regex = @'
(?ms)Name (.+)?
Dept .+?
Codes (.+)?
Urls (.+)?
Some text I dont care about.+
Comments
---------
(.+)?
Some text I dont care about
'@
$file = 'c:\somedir\somefile.txt'
[IO.File]::ReadAllText($file) -match $regex
if ([IO.File]::ReadAllText($file) -match $regex)
{
$Name = $matches[1]
$Codes = $matches[2] -replace '\s+',','
$Urls = $matches[3] -replace '\s+',','
$comment = $matches[4] -replace '\s+',' '
}
$Name
$Codes
$Urls
$comment
If the file is not too big to be processed in memory, the simple way is to read it as an array of strings. (What counts as too big depends on your system; anything under a gigabyte should work without much of a hiccup.)
After you've read the file, set up head and tail counters pointing to element zero. Move the tail pointer forward row by row until you find the date row. You can match data with regexps. Now you know the start and end of a single record. For the next record, set the head counter to tail+1 and the tail to tail+2, and start scanning rows again. Lather, rinse, repeat until the end of the array is reached.
When a record is matched, you can extract the name with a regex. Codes and Urls are a bit trickier. Match the Codes row with a regex, then extract it and all following rows for as long as they match the code pattern. The same goes for the Urls data. If the file always has whitespace padding on rows that continue the previous Urls or Codes row, you could also match the leading whitespace with a regexp to pick up those continuation rows.
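A rough sketch of the continuation-row idea for a single record, assuming that continuation rows are indented with whitespace (the indentation in $lines below is my assumption about the real files). The names $lines, $record, $field and $urls are made up for the illustration; in practice $lines would come from Get-Content.

```powershell
# One record as an array of lines; indentation of continuation rows is assumed.
$lines = @(
    'Name John Smith',
    'Dept Accounting',
    'Codes bas-2349,cav-3928,deg-3942',
    '      iye-2830,tel-3890',
    'Urls hxxp://blah.com',
    '     hxxp://foo.com, hxxp://foo2.com',
    'Some text I dont care about',
    'Date 3/12/2013'
)

$record = @{}
$field  = $null   # field currently being collected, if any

foreach ($line in $lines) {
    if ($line -match '^(Name|Codes|Urls)\s+(.*)$') {
        # A wanted field starts on this row.
        $field = $Matches[1]
        $record[$field] = $Matches[2].Trim()
    }
    elseif ($field -and $line -match '^\s+(\S.*)$') {
        # Indented row: continuation of the field being collected.
        $record[$field] += ',' + $Matches[1].Trim()
    }
    else {
        # Any other row ends the current field.
        $field = $null
    }
}

# Normalise "a, b" to "a,b" so the separators are consistent.
$urls = $record['Urls'] -replace '\s*,\s*', ','

$record['Name']
$record['Codes']
$urls
```

This skips the head/tail bookkeeping because it only handles one record; with several records per file, the Date row would be the place to emit the collected record and start a new one.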
Maybe something like this would do it:
foreach ($Line in gc file.txt) {
switch -regex ($Line) {
'^(Name|Dept|Codes|Urls)' {
$Capture = $true
break
}
'^[A-Za-z0-9_-]+' {
$Capture = $false
break
}
}
if ($Capture) {
$Line
}
}
If you want the end result as a CSV file then you may use the Export-Csv cmdlet.
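As a sketch of that last step: the CSV cmdlets work on objects, so the captured values first need to go into an object with one property per column. The values below are hard-coded from the sample file purely for illustration, and $row and $csv are made-up names. ConvertTo-Csv is used here so the result can be inspected without writing a file.

```powershell
# One object per input file; property names become the CSV column headers.
$row = New-Object PSObject -Property @{
    Name  = 'John Smith'
    Codes = 'bas-2349,cav-3928'
    Urls  = 'hxxp://blah.com'
}

# Select-Object fixes the column order, since hashtable order is not guaranteed.
$csv = @($row | Select-Object Name, Codes, Urls | ConvertTo-Csv -NoTypeInformation)

$csv
```

Swapping ConvertTo-Csv for Export-Csv with a path of your choice writes the same rows to disk. Values that contain commas are quoted, so the multi-value Codes field stays in one column.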
Assuming that c:\temp\file.txt contains:
Name John Smith
Dept Accounting
Codes bas-2349,cav-3928,deg-3942
iye-2830,tel-3890
Urls hxxp://blah.com
hxxp://foo.com
hxxp://foo2.com
Some text I dont care about
More text i dont care about
.
.
Date 3/12/2013
You can use regular expressions like this :
$a = Get-Content C:\temp\file.txt
$b = [regex]::match($a, "^.*Codes (.*)Urls (.*)Some.*$", "Multiline")
$codes = $b.groups[1].value -replace '[ ]{2,}',','
$urls = $b.groups[2].value -replace '[ ]{2,}',','
If all files have the same structure you could do something like this:
$srcdir = "C:\Test"
$outfile = "$srcdir\out.csv"
$re = '^Name (.*(?:\r\n .*)*)\r\n' +
'Dept .*(?:\r\n .*)*\r\n' +
'Codes (.*(?:\r\n .*)*)\r\n' +
'Urls (.*(?:\r\n .*)*)' +
'[\s\S]*$'
Get-ChildItem $srcdir -Filter *.txt | % {
[io.file]::ReadAllText($_.FullName)
} | Select-String $re | % {
$f = $_.Matches | % { $_.Groups } | ? { $_.Index -gt 0 }
New-Object -TypeName PSObject -Prop @{
'Name' = $f[0].Value;
'Codes' = $f[1].Value;
'Urls' = $f[2].Value;
}
} | Export-Csv $outfile -NoTypeInformation
