Parse Binary file with Powershell

I am trying to search through a binary file. After reviewing the file in a hex editor, I found recurring patterns throughout it. They appear before and after each file path:
/% ......C:\Users\\Desktop\test1.pdf..9
/% ......C:\Users\\Desktop\testtesttesttest.pdf..9
What I would like to do is find ..9 (hex = 000039), then back up until I find /% ...... (hex = 2F25A01C1000000000), then move forward x bytes so I can get the complete path. The code I have now is below:
$file = 'C:\Users\<username>\Desktop\bc03160ee1a59fc1.automaticDestinations-ms'
$begin_pattern = '2F25A01C1000000000' #/% ......
$end_pattern = '000039' #..9
$prevBytes = '8'
$bytes = [string]::join('', (gc $file -en byte | % {'{0:x2}' -f $_}))
[regex]::matches($bytes, $end_pattern) |
% {
$i = $_.index - $prevBytes * 2
[string]::join('', $bytes[$i..($i + $prevBytes * 2 - 1)])
}
Some of the output roughly translates to this:
ffff2e0000002f000000300000003b0000003200000033000000340000003500000036000000370000003800
655c4465736b746f705c466f72656e7369635f426f6f6b735c5b656e5d646566745f6d616e75616c2e706466
0000000000000000000000000000010000000a00000000000000000020410a000000000000000a00000000
ÿÿ./0;2345678?e\Desktop\deft_manual.pdf?
?sic Science, Computers, and the Internet.pdf
?ware\Desktop\Dive Into Python 3.pdf?

You can use the System.IO.BinaryReader class from PowerShell.
$path = "<yourPathToTheBinaryFile>"
$binaryReader = New-Object System.IO.BinaryReader([System.IO.File]::Open($path, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite))
Then you have access to all the methods like:
$binaryReader.BaseStream.Seek($pos, [System.IO.SeekOrigin]::Begin)
As far as I know, there is no easy way to find a pattern other than reading the bytes (using ReadBytes) and implementing the search yourself.
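The "read the bytes and search yourself" approach can be sketched in a few lines. This is a minimal Python sketch, not the PowerShell solution itself: it assumes the whole file fits in memory and that the paths are stored as single-byte text, as the hex dump in the question suggests. The function name extract_between is made up for illustration; the two marker values are the ones given in the question.

```python
BEGIN = bytes.fromhex("2F25A01C1000000000")  # "/% ......" marker from the question
END = bytes.fromhex("000039")                # "..9" marker from the question


def extract_between(data, begin=BEGIN, end=END):
    """Return the byte runs found between each begin marker and the next end marker."""
    results = []
    search_from = 0
    while True:
        end_at = data.find(end, search_from)
        if end_at == -1:
            break
        # "Back up" from the end marker to the closest begin marker before it.
        begin_at = data.rfind(begin, search_from, end_at)
        if begin_at != -1:
            results.append(data[begin_at + len(begin):end_at])
        search_from = end_at + len(end)
    return results


# Typical use (the file name is whatever jump list you are inspecting):
#   with open(path, "rb") as handle:
#       for raw in extract_between(handle.read()):
#           print(raw.decode("latin-1"))
```

Searching on the raw bytes avoids the hex-string conversion in the question, where a match can start in the middle of a byte because every byte becomes two characters.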

Removing the file paths and using the file number to perform some calculations while plotting

I am trying to read .txt files from a directory, named as follows:
x-23.txt
x-43.txt
x-83.txt
:
:
x-243.txt
I list these files using filename = system("ls ../Data/*.txt"). The goal is to load them and plot certain columns. At the same time, I want to reduce each file name to its number, as below, so that I can use it as the plot title and add it to or subtract it from a certain column:
23
43
83
:
:
243
For that, I tried the following:
dirname = '../Data/'
str = system('echo "'.dirname. '" | perl -pe ''s/x[\d-](\d+).txt/\1.\2/'' ')
cv = word(str, 1)
The lines above don't seem to strip the names down to the numbers. The complete code:
filelist1 = system("ls ../Data/*.txt")
print filelist1
dirname = '../Data/'
str = system('echo "'.dirname. '" | perl -pe ''s/x[\d-](\d+).txt/\1.\2/'' ')
cv = word(str, 1)
plot for [filename1 in filelist1] filename1 using (-cv/1000+ Tx($4)):(X($3)) with points pt 7 lc 6 title system('basename '.filename1),\
I am trying to use the file numbers "cv" after parsing the .txt files to subtract them from column Tx($4) while plotting.

directory = "../temp/"
filelist = system("cd ../temp/ ; ls *.txt")
files = words(filelist)
filename(i) = directory . word(filelist,i)
title(i) = word(filelist,i)[3 : strstrt(word(filelist,i),'.')-1]
plot for [i=1:files] filename(i) using ... title title(i)
Test case (edited to show pulling files from another directory):
gnuplot> print filelist
x-234.txt
x-23.txt
x-2.txt
x-34.txt
gnuplot> do for [i=1:files] { print i, ": ", filename(i) }
1: ../temp/x-234.txt
2: ../temp/x-23.txt
3: ../temp/x-2.txt
4: ../temp/x-34.txt
gnuplot> plot for [i=1:files] x*i title title(i)
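The substring trick in title(i) (skip the leading "x-", stop before the first ".") can also be written as a regular expression. The following is a small Python sketch of that parsing step only, under the assumption that every file is named x-<digits>.txt as in the question; file_number is a name chosen for illustration.

```python
import re


def file_number(path):
    """Return the integer in a name like ../Data/x-243.txt, or None if absent."""
    match = re.search(r"x-(\d+)\.txt$", path)
    return int(match.group(1)) if match else None
```

Returning an integer rather than a string makes it usable directly in arithmetic, such as the -cv/1000 offset in the question.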

ProgressBar Overlay Or ProgressBar from Copy Functions

Function GetTotalBytesOfCopyDestination{
param($destinationPath);
$colItems = (Get-ChildItem $destinationPath | Measure-Object -property length -sum)
return $colItems.sum;
}
Function GetBytesOfFile{
param($sourcePath);
return (Get-Item $sourcePath).length;
}
Function GetPosition{
param([double]$currentOfBytesSended)
param([double]$countsOfBytesWillSend)
$position = ($currentOfBytesSended / $countsOfBytesWillSend) * 100;
#range 0 - 100
#(15800 bytes / 1975633689 bytes)*100
#
return $position;
}
Function Copy-File {
#.Synopsis
# Copies all files and folders in $source folder to $destination folder, but with .copy inserted before the extension if the file already exists
param($source,$Destination2)
# create destination if it's not there ...
mkdir $Destination2 -force -erroraction SilentlyContinue
[double]$currentOfBytesSended = 0;
[double]$countsOfBytesWillSend = 0;
[double]$countsOfBytesWillSend = GetTotalBytesOfCopyDestination($source);
$progressbar6.Maximum = 100;
$progressbar6.Step = 1;
foreach($original in ls $source -recurse) {
$result = $original.FullName.Replace($source,$Destination2)
while(test-path $result -type leaf){ $result = [IO.Path]::ChangeExtension($result,"copy$([IO.Path]::GetExtension($result))") }
[System.Windows.Forms.Application]::DoEvents()
if($original.PSIsContainer) {
mkdir $result -ErrorAction SilentlyContinue
} else {
copy $original.FullName -destination $result
[System.Windows.Forms.Application]::DoEvents()
$currentCopyingFileSizeInBytes = 0;
$currentCopyingFileSizeInBytes = GetBytesOfFile($original.FullName);
$currentOfBytesSended = [double]$currentOfBytesSended + [double]$currentCopyingFileSizeInBytes;
#$currentOfBytesSended += $currentCopyingFileSizeInBytes;
$progressbar6.Value=GetPosition([double]$currentOfBytesSended, [double]$countsOfBytesWillSend);
[System.Windows.Forms.Application]::DoEvents()
#$progressbar6.PerformStep();
$progressbar6.Refresh();
}
}
}
What I'm trying to get is a Copy-File function that copies files and directories from a remote machine to the local machine while advancing a progress bar based on the total number of bytes to copy and the number of bytes already copied. When I run it, I get this error:
ERROR: GetPosition : Cannot process argument transformation on parameter 'currentOfBytesSended'. Cannot convert the "System.Object[]" value of type "System.Object[]"
ERROR: to type "System.Double".
Assit App.pff (337): ERROR: At Line: 337 char: 35
ERROR: + $progressbar6.Value=GetPosition([double]$currentOfBytesSended, [double]$coun ...
ERROR: + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ERROR: + CategoryInfo : InvalidData: (:) [GetPosition], ParameterBindingArgumentTransformationException
ERROR: + FullyQualifiedErrorId : ParameterArgumentTransformationError,GetPosition
ERROR:
>> Script Ended

First of all, welcome to StackOverflow. Try to break up your questions into individual questions, rather than grouping many of them together. The format of this site is better suited to one question, one answer, unless maybe there is guidance that states otherwise.
Excluding Folders
You can ignore folders by using the Exclude parameter on the Get-ChildItem command.
Imagine that you have a folder structure similar to the following:
c:\gci
|
|\a
| \a.txt
|\b
| \b.txt
|\c
| \c.txt
If you only want to get the contents of c:\gci\a and c:\gci\c, you can exclude c:\gci\b using the command:
Get-ChildItem -Path c:\gci -Exclude b* -Recurse;
Keep in mind that this will also exclude other items starting with "b" such as c:\gci\bcd.txt.
Progress Bar
You can create a progress bar using the Write-Progress command. You will have to write your own, custom logic to determine what tasks to report progress on, and what the percentage of progress is. You can do this based on the number of bytes copied vs. the total number of bytes to be copied, the number of files copied vs. the total number of files, or some other sort of metric.
There's no simple answer to this one. You will have to perform the calculations yourself, and call Write-Progress at the appropriate times.
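The byte-based calculation described above can be sketched as follows. This is a Python illustration of the bookkeeping, not a PowerShell implementation: copy_tree_with_progress and the report callback are hypothetical names, and in PowerShell the callback would be the place to call Write-Progress or update a progress bar control. The sketch copies files only, so empty directories are not recreated.

```python
import os
import shutil


def copy_tree_with_progress(source, destination, report):
    """Copy every file under source and call report(percent) after each one."""
    files = []
    for root, _dirs, names in os.walk(source):
        for name in names:
            files.append(os.path.join(root, name))

    # The total must be known up front so each step can be expressed as a percentage.
    total = sum(os.path.getsize(path) for path in files)
    copied = 0
    for path in files:
        target = os.path.join(destination, os.path.relpath(path, source))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(path, target)
        copied += os.path.getsize(path)
        report(100.0 if total == 0 else copied / total * 100)
```

Note that the total is computed recursively here, so it matches the set of files that is actually copied; a total taken from the top-level folder only would make the percentage exceed 100.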

Parse and change the output of a system through Powershell

First, I have to say that I have little to no experience with PowerShell so far. An upstream system generates output in the wrong format for me, so I want to use PowerShell to change it. The output I get from the system looks like this:
TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')^|^...^|^Y^|^NOT IN^|^('8','9','10','11','12')
TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')^|^N^|^LIKE^|^('6','7','8','9')^|^...^|^Y^|^NOT IN^|^('1','2','15','16','17')^|^Y^|^NOT IN^|^('18','19','20','21','22')
Each line has a starting part (TEST1^|^9999^|^) followed by 1 to n tuples (example: Y^|^NOT IN^|^('1','2','3')^|^).
This is how I want it to look:
TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')
TEST1^|^9999^|^N^|^LIKE^|^('4','5','6','7')
TEST1^|^9999^|^Y^|^NOT IN^|^('8','9','10','11','12')
TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')
TEST2^|^9998^|^N^|^LIKE^|^('6','7','8','9')
TEST2^|^9998^|^Y^|^NOT IN^|^('1','2','15','16','17')
TEST2^|^9998^|^Y^|^NOT IN^|^('18','19','20','21','22')
So each tuple should be printed on its own line, with the starting part prepended.
My approach is to do the equivalent of AWK in PowerShell, but so far I don't understand how to handle an undetermined number of tuples and repeat the starting block for each.
Thank you in advance for your help!

I'd split the lines at ^|^ and recombine the fields of the resulting array in a loop. Something like this:
$sp = '^|^'
Get-Content 'C:\path\to\input.txt' | % {
$a = $_ -split [regex]::Escape($sp)
for ($i=2; $i -lt $a.length; $i+=3) {
"{0}$sp{1}$sp{2}$sp{3}$sp{4}" -f $a[0,1,$i,($i+1),($i+2)]
}
} | Set-Content 'C:\path\to\output.txt'
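For comparison, the same split-and-recombine idea can be sketched in Python. It assumes, like the loop above, that the first two fields form the starting part and that every tuple has exactly three fields; explode is a name chosen for illustration.

```python
SEP = "^|^"


def explode(line):
    """Turn one wide record into one row per 3-field tuple, repeating the first two fields."""
    fields = line.split(SEP)
    head = fields[:2]
    return [SEP.join(head + fields[i:i + 3]) for i in range(2, len(fields), 3)]
```

Because the separator is split as a literal string, no regex escaping of ^ and | is needed here.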

The data looks quite regular so you could loop over it using | as the delimiter and counting the following cells in 3s:
$data = @"
TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')^|^Y^|^NOT IN^|^('8','9','10','11','12')
TEST2^|^9998^|^Y^|^NOT IN^|^('4','5','6')^|^N^|^LIKE^|^('6','7','8','9')^|^Y^|^NOT IN^|^('1','2','15','16','17')^|^Y^|^NOT IN^|^('18','19','20','21','22')
"@
$data.split("`n") | % {
$ds = $_.split("|")
$heading = "$($ds[0])|$($ds[1])"
$j = 0
for($i = 2; $i -lt $ds.length; $i += 1) {
$line += "|$($ds[$i])" -replace "\^(\((?:'\d+',?)+\))\^?",'$1'
$j += 1
if($j -eq 3) {
write-host $heading$line
$line = ""
$j = 0
}
}
}

Parsing an arbitrary-length string record into row records is quite error prone. A simple solution is to process the data row by row and build the output.
Here is a simple illustration of how to process a single row. Processing the whole input file and writing the output is left as an exercise for the reader.
$s = "TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')^|^N^|^LIKE^|^('4','5','6','7')^|^Y^|^NOT IN^|^('8','9','10','11','12')"
$t = $s.split('\)', [StringSplitOptions]::RemoveEmptyEntries)
$testNum = ([regex]::match($t[0], "(?i)(test\d+\^\|\^\d+)")).value # Hunt for 1st column values
$t[0] = $t[0] + ')' # Restore the split char that Split removed
for($i=1;$i -lt $t.Length; ++$i) { $t[$i] = $testNum + $t[$i] + ')' } # Prepend 1st column and restore the split char
$t
TEST1^|^9999^|^Y^|^NOT IN^|^('1','2','3')
TEST1^|^9999^|^N^|^LIKE^|^('4','5','6','7')
TEST1^|^9999^|^Y^|^NOT IN^|^('8','9','10','11','12')

text file -> json -> ios array

I have a file with the following format:
/Users/devplayerx/Sandbox/pics/images/001012DG-161.JPG
pixelWidth: 1600
pixelHeight: 1050
filename: 001012DG-161.JPG
/Users/devplayerx/Sandbox/pics/images/001019DG-151 COPY.JPG
pixelWidth: 1600
pixelHeight: 1050
filename: 001019DG-151 COPY.JPG
and would like to, ultimately, have an iOS dictionary with the filename as key, and either a dictionary or array with the pixelWidth and pixelHeight as value. I was considering converting my text file into a JSON file, and then parse it using NSJSONSerialization, but I'm not sure how to convert my text file into JSON. Also, I'd like to remove the full path from the text file, since it's not needed.

Here is a perl script that seems to do the job:
#!/usr/bin/perl
use strict;
use warnings;
open FILE,"< yourfile.txt" or die "I/O error : $!\n";
my $w = 0;
my $h = 0;
my $f = "";
print "{\n";
while (my $line = <FILE>)
{
if ($f)
{
print ",\n";
$f = "";
}
if ($line =~ /pixelWidth: ([0-9]+)/)
{
$w = $1;
}
if ($line =~ /pixelHeight: ([0-9]+)/)
{
$h = $1;
}
if ($line =~ /filename: (.*)$/)
{
$f = $1;
print "\t\"$f\" : [ $w, $h ]"
}
}
print "\n}\n";
close FILE;
Note that I'm not an expert in perl so maybe it can be improved, but using it on your input file seems to produce your expected JSON as below:
prompt$ perl scr.pl
{
"001012DG-161.JPG" : [ 1600, 1050 ],
"001019DG-151 COPY.JPG" : [ 1600, 1050 ]
}
Note that once you have your JSON file, you may optionally convert it into a PLIST file using the plutil tool. For example perl scr.pl | plutil -convert binary1 -o yourfile.plist - will create yourfile.plist from the JSON produced by perl scr.pl (my script above). You can then easily read this file in your code using [NSDictionary dictionaryWithContentsOfFile:pathToYourFilePlist] and directly have access to your data as an NSDictionary in one line.
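If Python is available, the same conversion can be sketched with the standard json module, which also takes care of quoting and escaping the file names. This is a minimal sketch assuming the exact "key: value" layout shown in the question; parse_image_listing is a name chosen for illustration.

```python
import json


def parse_image_listing(text):
    """Map each filename to [pixelWidth, pixelHeight] from the listing format above."""
    images = {}
    width = height = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # full-path lines carry no "key: value" pair and are dropped
        if key == "pixelWidth":
            width = int(value)
        elif key == "pixelHeight":
            height = int(value)
        elif key == "filename":
            images[value] = [width, height]
    return images


def listing_to_json(text):
    """Serialize the parsed listing as a JSON object keyed by filename."""
    return json.dumps(parse_image_listing(text), indent=2)
```

The result has the same shape as the Perl output, so it can be read with NSJSONSerialization or piped through plutil in the same way.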

The json object could look like this:
{ picture: { path : "thePath", pixelWidth: 1600, pixelHeight: 1050, filename : "name" }}
I would then convert it by taking the rows and putting them in a list, looping through the list and eventually spitting everything out in another text file using the above object notation.
What language do you need to write the tooling to convert it to json in?

You can use NSInputStream to read the text file, on each iteration you can build your dictionary the way you want.
After that just use NSJSONSerialization.

What is the best way to encode 1mb of data into a decodable analog image?

The generated image is space restricted, so it cannot be too big.
By analog I mean that it should be printed out, much like a QR code, except storing far more data.
It should be a B/W image.
The data can be compressed (obviously better, as long as decompression does not take long, so compression can be minimal).
You can assume any lower print density; bits per pixel would follow directly from the solution you propose. Please go ahead and assume :)

I'm not sure how you could exactly encode this to fit on paper... I think the best way is to convert your digital information into a more compact analog representation. To clarify... What has been proposed is to take the digital information and represent it (with bits) digitally on an analog medium. What I'm trying to say is take the digital info, convert it and represent it in an analog form on an analog medium. How to do this? I have no idea, but I have a lead for you.
Remember the Voyager Golden Records? They fit 116 (115?) images onto the disks, both simple B&W images and full size, color images with high(ish) resolution. They did it with an analog box of wavelengths representing digital bits over 512 (2^9) lines.
Not sure if that helps, but maybe it gives you (or someone else) an idea?

Assuming no overhead for error correction, one bit per pixel, and 1mm resolution, you're looking at roughly a 2.9 x 2.9 meter image (1 MB is about 8.4 million bits). This assumes no compression (how compressible your data is really depends) and no real encoding scheme. This is not practical.
I'd break up your data into ~5kb chunks, encode them as QR codes together with their sequence numbers, and use the collection of QR codes. Each image has built-in error correction. You'll end up with maybe two hundred 15*15cm images. At least you'll be able to use regular paper and printers for it.
The reason I suggest QR codes is that they have error correction built in, so you have more tolerance for smeared ink, wrinkled or torn paper, or a less than perfect camera/scanner. Plus, the overhead of the error correction isn't bad, and you're getting something like 90% use out of your paper, assuming some maximum resolution that you're willing to use for a pixel, regardless of encoding scheme. Plus, they're fast to decode, hence their name.
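The chunk-and-number step can be sketched independently of any QR library. The Python sketch below only splits the data and prefixes each chunk with a 4-byte header (chunk index and chunk count); rendering each chunk as a QR code is left to whatever encoder you use. The function names and the header layout are assumptions for illustration. Note that a single QR code holds at most 2,953 bytes in byte mode (version 40, lowest error correction level), so the payload size has to stay below that rather than at the ~5 kB mentioned above.

```python
import struct

HEADER = ">HH"  # big-endian: chunk index, total number of chunks (illustrative layout)


def split_into_chunks(data, payload_size):
    """Split data into numbered chunks of at most payload_size payload bytes each."""
    total = (len(data) + payload_size - 1) // payload_size
    return [
        struct.pack(HEADER, index, total)
        + data[index * payload_size:(index + 1) * payload_size]
        for index in range(total)
    ]


def reassemble(chunks):
    """Rebuild the original data from chunks scanned in any order."""
    parts = {}
    total = 0
    for chunk in chunks:
        index, total = struct.unpack(HEADER, chunk[:4])
        parts[index] = chunk[4:]
    if len(parts) != total:
        raise ValueError("some chunks are missing")
    return b"".join(parts[index] for index in range(total))
```

Carrying the sequence number inside each code means the pages can be scanned in any order, and a missing page is detected instead of silently producing corrupt output.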

How about encoding your 1 MB into an image and printing it at a high enough resolution and color depth? This is a simple, straightforward approach.
Recovering the data depends greatly on the capabilities of your scanner, camera, or other "analog" reader (optical resolution, etc.).
There is no error correction in this example; redundancy can be included in the data itself.
Other approaches: store data on paper.
The Perl script to create the image is below (it makes use of convert from imageMagick):
#!/usr/bin/perl
# this script take any data and make an image with it (format .png)
# deps
# convert from ImageMagick
our $dbug=0;
#--------------------------------
# -- Options parsing ...
#
my $if = undef;
my $of = undef;
while (@ARGV && $ARGV[0] =~ m/^-/)
{
$_ = shift;
#/^-(l|r|i|s)(\d+)/ && (eval "\$$1 = \$2", next);
if (/^-v(?:erbose)?/) { $verbose= 1; }
elsif (/^-?if?=?([\w.]+)?/) { $if= $1?$1:shift; }
elsif (/^-?of?=?([\w.]+)?/) { $of= $1?$1:shift; }
else { die "Unrecognized switch: $_\n"; }
}
#understand variable=value on the command line...
eval "\$$1='$2'"while $ARGV[0] =~ /^(\w+)=(.*)/ && shift;
my $data;
if (! defined $of) {
if (@ARGV) { $of = pop @ARGV }
else { $of = '-' }
}
if (defined $if) {
local *IN;
local $/ = undef;
open IN,'<',$if;
$data = <IN>;
close IN;
} else {
if ($#ARGV > 0) { $if = '<>'; }
elsif ($#ARGV == 0) { $if = $ARGV[0]; }
else { $if = '-'; }
local $/ = undef;
$data = <>;
close STDIN;
}
my $size = length($data);
my $pi = atan2(0,-1);
#my $iratio = 4/3; # y/x
my $iratio = $pi; # x/y
my $xy = $size/3 * ($iratio);
my $x = sqrt($xy);
if ($verbose) {
printf STDERR "if: %s\n",$if;
printf STDERR "of: %s\n",$of;
printf STDERR "size: %s\n",$size;
printf STDERR "x: %.3f\n",$x;
printf STDERR "y: %.3f\n",$x / $iratio;
}
my $y = int($x / $iratio + $iratio);
$x = int( ( $size / 3 + $y - 1) / $y );
my $n = $x*$y*3;
my $delta = $n - $size;
if ($delta < 0) {
$x++;
$n = $x*$y*3;
$delta = $n - $size;
}
my $pad = "\x00" x $delta;
if ($verbose) {
printf STDERR "payload: %sx%s = %s\n",$x,$y,$n;
printf STDERR "delta: %s\n", $delta;
}
my $hdr = <<"EOS";
P6
$x $y
255
EOS
#my $fname = $file; $fname =~ s,.*/,,;
#my $bname = $fname; $bname =~ s/\.[^\.]*$//;
#printf STDERR "fname: %s\n",$fname;
#printf STDERR "fname: %s.png\n",$bname;
local *PPM; open PPM,"| convert -compress LZW -strip -quality 90 ppm:- png:$of";
print PPM $hdr;
binmode(PPM);
print PPM $data;
print PPM $pad;
close PPM;
exit $?;
1; # $Source: /my/perl/scripts/dat2png.pl$
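The core of the script (three data bytes per RGB pixel, zero padding up to a full rectangle, a binary PPM header in front) can be sketched in Python without ImageMagick. This is only an illustration of the packing step: to_ppm and from_ppm are names chosen here, the width is passed in rather than derived from an aspect ratio, and the original length has to be stored separately because the padding is not self-describing.

```python
import math


def to_ppm(data, width):
    """Pack data into a binary PPM (P6) image, three bytes per RGB pixel."""
    pixels = max(1, math.ceil(len(data) / 3))
    height = math.ceil(pixels / width)
    padding = width * height * 3 - len(data)  # zero-fill the last row(s)
    header = b"P6\n%d %d\n255\n" % (width, height)
    return header + data + b"\x00" * padding


def from_ppm(ppm, size):
    """Recover the first size payload bytes from an image written by to_ppm."""
    magic, _dims, maxval, pixels = ppm.split(b"\n", 3)
    if magic != b"P6" or maxval != b"255":
        raise ValueError("not a PPM written by to_ppm")
    return pixels[:size]
```

This round trip only holds for the lossless file; once the image is printed and scanned, recovery depends on the reader and on whatever redundancy was added to the data.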
