dd a compressed *.xz image into a partition - beagleboneblack

I'm trying to copy a compressed image into a partition inside a Beaglebone.
Usually, it is a 2 step process:
xz -d console.img.xz # console.img is created
dd if=console.img of=/dev/mmcblk0p3
Is there a way, I can do it in a single step without uncompressing the file *.img.xz? This is because after uncompressed the image, it is too big for the current partition.

xzcat console.img.xz | dd of=/dev/mmcblk0p3 status=progress
xz -dc console.img.xz | dd of=/dev/mmcblk0p3 status=progress

This seems to work, if that is what you mean:
xz -d < console.img.xz - | dd of=/dev/mmcblk0p3

Related

How visualize j48 tree in weka

I have an output tree in weka but can't view it (right click ...). Is there a tool to generate the resulting tree in an understandable way from the copy of the log (figures)?
The above textual representation cannot be converted into other formats, unless you write your own parser.
However, if you use the -g option on the command-line, the tree will get output on stdout in dot-notation. You can then take this output and convert it into other formats, like PNG or PDF using the GraphViz software.
You can run Weka from the command line if you have java installed.
On my Windows machine from the Weka-3-9-5 directory:
C:\Weka-3-9-5> java -cp weka.jar weka.classifiers.trees.J48 -C 0.25 -M 2 -t .\data\iris.arff
This gives you the output that you current have with the trees. However:
C:\Weka-3-9-5> java -cp weka.jar weka.classifiers.trees.J48 -C 0.25 -M 2 -t .\data\iris.arff -g
gives you a different format:
digraph J48Tree {
N0 [label="petalwidth" ]
...
}
and you can feed this to GraphViz to get a nice printed tree.
I put the digraph output into a tree.txt file and then generated a png image file through GraphViz:
C:\GraphViz> dot -Tpng tree.txt > tree.png

Output file much larger than input files after cat + grep

I have 18 csv files, all between 1mb and 14mb. The sum of all files is 64mb. I want to create a new csv file that contains a subset of those files-- only the lines featuring the pattern "Hello" (or "HELLO", or "hello" ...). Here's what I'm doing
cat *.csv | head -n 1 > new.csv # I want to create a header first
cat *.csv | grep -i "hello" >> new.csv
I'm running Debian on WSL. The output file is much, much larger than the original 64mb (I stopped the process after 1+ hour, and the file was 300+ GB).
How can a subset of a text file be larger than the original files? Does it have anything to do with WSL?
This is not an OS issue. When you redirect your output to new.csv, shell creates that file first, before the glob expression *.csv is evaluated. That means the expansion of *.csv would include new.csv as well. That seems like the root cause of the recursive grep issue you are facing.
You are reading all the files twice, which is not necessary. You can make your operation a lot simpler and efficient with a single awk command:
awk 'NR==1 {print} tolower($0) ~ /hello/ {print}' *.csv > csv.new
mv csv.new new.csv
since the output file is named csv.new it won't interfere with the glob *.csv
NR==1 picks up the first line (header) from the very first file
The awk command can be written more succinctly as:
awk 'NR==1 || tolower($0) ~ /hello/' *.csv > csv.new
You are using *.csv and redirecting the output to new.csv which falls under *.csv which is causing recursion in grep result. perhaps you can try,
grep -i hello *.csv --exclude="new.csv" >> new.csv

Ignoring directories from a file

I am in the process of creating a script that lists all files opened via lsof output. I would like to checksum specific files and ignore directories from that output but am at a loss to do so EFFECTIVELY. For example: (I'm using FreeBSD btw)
lsof | awk '/\//{print $9}' | sort -u | head -n 5
prints:
/
/bin/sleep
/dev/bpf
What I'd like to do is: FROM that output, ignore any directories and perform an md5 on FILES (not directories).
Any pointers?
Give a try to following perl command:
lsof | perl -MDigest::MD5=md5_hex -ane '
$f = $F[ $#F ];
-f $f and printf qq|%s %s\n|, $f, md5_hex( $f )
'
It filters lsof output to plain files (-f). Take a look into perlfunc to change it to add different kind of files.
It outputs each file and its md5 separated by a space character. An example in my system is like:
/usr/lib/libm-2.17.so a2d3b2de9a1f59fb99427714fefb49ca
/usr/lib/libdl-2.17.so d74d8ac16c2d13128964353d4be7061a
/usr/lib/libnsl-2.17.so 34b6909ec60c337c21b044642b9baa3d
/usr/lib/ld-2.17.so 3d0e7b5b5c4e59c5c4b6a858cc79fcf1
/usr/sbin/lsof b9b8fbc8f296e47969713f6369d97c0d
/usr/lib/locale/locale-archive 3ea56273193198a718b9a5de33d553db
/usr/lib/libc-2.17.so ba51eeb4025b7f5d7f400f1968f4b5f9
/usr/lib/ld-2.17.so 3d0e7b5b5c4e59c5c4b6a858cc79fcf1
...

tar pre-run to evaluate expected size or amount of files

The problem:
I have a back-end process that at some point he collect and build a big tar file.
This tar receive few directories and an exclude files.
the process can take up to few minutes and i want to report in my front-end process (GUI) about the progress of the taring process (This is a big issue for a user that press download button and it seems like nothing is happening...).
i know i can use -v -R in the tar command and count files and size progress but i am looking for some kind of tar pre-run mode / dry run to help me evaluate either the expected number of files or the expected tar size.
the command I am using: tar -jcf 'FILE.tgz' 'exclude_files' 'include_dirs_and_files'
10x for everyone who is willing to assist.
You can pipe the output to the wc tool instead of actually making a file.
With file listing (verbose):
[git#server]$ tar czvf - ./test-dir | wc -c
./test-dir/
./test-dir/test.pdf
./test-dir/test2.pdf
2734080
Without:
[git#server]$ tar czf - ./test-dir | wc -c
2734080
Why don't you run a
DIRS=("./test-dir" "./other-dir-to-test")
find ${DIRS[#]} -type f | wc -l
beforehand. This gets all the files (-type f) one per line and counts the number of files. DIRS is an array in bash, so you can store the folders in a variable
If you want to know the size of all the stored files, you can use du
DIRS=("./test-dir" "./other-dir-to-test")
du -c -d 0 ${DIRS[#]} | tail -1 | awk -F ' ' '{print $1}'
This prints the disk usage with du, calculates a grand total (-c flag), gets the last line (example 4378921 total), and uses just the first column with awk

Creating a tar stream of arbitrary data and size

I need to create an arbitrarily large tarfile for testing but don't want it to hit the disk.
What's the easiest way to do this?
You can easily use python to generate such a tarfile:
mktar.py:
#!/usr/bin/python
import datetime
import sys
import tarfile
tar = tarfile.open(fileobj=sys.stdout, mode="w|")
info = tarfile.TarInfo(name="fizzbuzz.data")
info.mode = 0644
info.size = 1048576 * 16
info.mtime = int(datetime.datetime.now().strftime('%s'))
rand = open('/dev/urandom', 'r')
tar.addfile(info,rand)
tar.close()
michael#challenger:~$ ./mktar.py | tar tvf -
-rw-r--r-- 0/0 16777216 2012-08-02 13:39 fizzbuzz.data
You can use tar with -O option tar -O, like this tar -xOzf foo.tgz bigfile | process
https://www.gnu.org/software/tar/manual/html_node/Writing-to-Standard-Output.html
PS: However, it could be, that you will not get the benefits you intend to gain as tar starts writing stdout only after it has read through the entire compressed file. You can demonstrate this behavior by starting a large file extraction and following the file size over time; it should be zero most of the processing time and start growing at very late stage. On the other hand I haven't researched this extensively, there might be some work around, or I might be just plain wrong with my first hand out-of-memory experience.

Resources