I have an output file called filename.mat0 which contains a large list of data points for a number of different variables for a number of different time steps. I want to use something like the grep command to retrieve all the instances for a given variable, i.e. variable_A, then sum the total value associated with variable_A, then take an average.
The number of time steps is constant so variable_A, variable_B, etc all appear 100 times in my .mat file.
Please can you suggest the best way to do this?
An example of the output data is:
Timestep1 Variable_A 10
Timestep1 Variable_B 20
Timestep1 Variable_C 30
Timestep2 Variable_A 40
Timestep2 Variable_B 50
Timestep2 Variable_C 60
Timestep3 Variable_A 70
Timestep3 Variable_B 80
Timestep3 Variable_C 90
Desired output:
Variable_A = 40
Referencing to this.
awk should be able to solve the problem. Check the link for how to use awk.
The below command should be okay for your case, but it is not easy to use if there is many Variable .Hope anyone more familiar with awk can suggest how to improve.
awk '{if ($2 == "Variable_A"){ total += $3; count++ }} END { print "Variable_A = " total/count }' sample.mat > avg_a.txt
Above command will do for each row, check if column 2(correspond to $2) equals "Variable_A", if yes, sum the value in column 3(correspond to $3) and add a count. After processing all rows, print the average to a text file.
For further question
In order to show multiple variables average in same file, you can make use of array and for loop in AWK. Add elements to vars for more variables.
awk 'BEGIN {vars[0]="Variable_A"; vars[1]="Variable_B"; vars[2] ="Variable_C" } { for (i in vars) { if ($2 == vars[i]){ total[i] += $3; count[i]++ }}} END { for(i in vars) {print vars[i]" = " total[i]/count[i]}}' sample.mat > avg.txt
After reading:
Combine two files with unequal length on common column with multiple matches with linux command line
I wonder how you would do a full outer join.
(hopefully it's ok to start a new question with it)
file one
A 1
C 4
file two
A 2
B 5
file three
A 7
D 9
the result would be:
A 1 2 7
B N 5 N
C 4 N N
D N N 9
Is there an awk-one-line-solution like I saw for left outer join?
With GNU awk for true multi-dimensional arrays, ARGIND, and sorted_in:
$ cat tst.awk
{ vals[$1][ARGIND] = $2 }
END {
PROCINFO["sorted_in"] = "#ind_str_asc"
for (key in vals) {
printf "%s%s", key, OFS
for (fileNr=1; fileNr<=ARGIND; fileNr++) {
val = (fileNr in vals[key] ? vals[key][fileNr] : "N")
printf "%s%s", val, (fileNr<ARGIND ? OFS : ORS)
}
}
}
$ awk -f tst.awk file1 file2 file3
A 1 2 7
B N 5 N
C 4 N N
D N N 9
It's possible to solve this using the POSIX-standard join command. Given that the three files proposed in the question are named 1.txt, 2.txt and 3.txt:
join -a1 -a2 -eN -o 0,1.2,2.2 1.txt 2.txt | join -a1 -a2 -eN -o 0,1.2,1.3,2.2 - 3.txt
Note that while this provides the required output for the question as stated, it's quite a bit less flexible than Ed Morton's awk-based solution. For one thing, join needs its input files to be sorted on the shared field(s). And for another, it will only work for exactly three input files.
However, it's possibly simpler to use join in once-off ad hoc cases!
I want to create a CSV to import it on excel, containing all the packet details shown in wireshark.
Each row should correspond to a packet and the columns to the field details.
Using the following tshark command:
tshark -r mycapturefile.cap -E -V
I can show the information I need like:
Frame 1077: 42 bytes on wire (336 bits), 42 bytes captured (336 bits)
Encapsulation type: Ethernet (1)
Arrival Time: Aug 15, 2017 14:02:27.095521000 EDT
[Time shift for this packet: 0.000000000 seconds]
Epoch Time: 1502820147.095521000 seconds
and other packet details...
What I want is that information provided with -V, so the -T fields option in wireshark is discarded. Wireshark export options also don't provide the data I need, only the pdml format, but I think is more tedius to parse.
I have searched for a tool, a script or parser with no results. Since each packet is different, make a personal parser may be difficult/tedious and considering people can extract this information but provide no sources of how to do it, there must be a method or tool that can do it.
Do you know any tool, script or method that already do this?
Thanks in advance.
There is a ton of information coming down. You gotta use that -Y display filter to whittle it down. The resulting text can then be parsed.
Try -Y "frame.number == 1077" -V and then parse the text that is returned.
In my case I wanted certificate information.
Function GetCertsFromWireSharkPackets2 ($CERTTEXT){
foreach($Cert in($CERTTEXT|?{$_ -match "Source:.*\d{1,3}\.\d{1,3}\.\d{1,3}\.|Destination:.*\d{1,3}\.\d{1,3}\.\d{1,3}\.|Certificate:"} | %{$_.trim() -replace 'Source:','|Source:' -replace ":",'=' }) -join "`n"| %{$_.split('|')}|?{$_}) {
$Cert|%{$Props = [regex]::matches($_,"(?sim)(?<=^).*?(?=\=)").value ; $Dups = [regex]::matches($Props,"(?sim)\b(\w+)\s+\1\b").value.split(' ') ; $values = [regex]::matches($_,"(?sim)(?<=\=).*?(?=$)").value.trim()}
$PropsNoDups = ($Props -join "`n").replace(($Dups|select -first 1),'').split(10)|?{$_} ;
if(($PropsNoDups.count + $Dups.count) -ne $Props.count){$dups+=($dups|select -First 1)}
for($X=1;$X -lt $Dups.count;$X++){$dups[$X] +=$X}
$ValidProps = $PropsNoDups+$Dups ; $StitchCount = $Values.Count
$ValidP_V = For($x=0;$x -lt $StitchCount;$x++){ '"'+$ValidProps[$x] + '"="' + $Values[$x] +'"'} ;$ValidP_V =($ValidP_V -join "`n")|?{$_} ; $ExpText = "New-Object psobject -Property #{`n"+$ValidP_V+"`n}"
Invoke-Expression($ExpText)|select Source, Destination, Certificate, Certificate1, Certificate2, Certificate3
} }
#Click refresh on a few browser tabs to generate traffic.
$CERTTEXT = .\tshark.exe -i 'Wi-Fi' -Y "ssl.handshake.certificate" -V -a duration:30
GetCertsFromWireSharkPackets2 $CERTTEXT
Source : cybersandwich.com (107.170.193.139)
Destination : KirtCarson.com (222.168.3.118)
Certificate : 3082057e30820466a0030201020212030e2782075e8f90f5... (id-at-commonName=multi.zeall.us)
Certificate1 : 308204923082037aa00302010202100a0141420000015385... (id-at-commonName=Let's Encrypt Authority
X3,id-at-organizationName=Let's Encrypt,id-at-countryName=US)
Certificate2 :
Certificate3 :
I'm trying to understand how I might parse binary per 4 bits if it is possible.
For example:
I have 2-byte codes that need to be parsed to determine which instruction to use
#{1NNN} where the first 4 bits tell where which instruction, and NNN represents a memory location (i.e. #{1033} says jump to memory address #{0033}
It seems to be easy to do this with full bytes, but not with half bytes:
parse #{1022} [#{10} {#22}]
because #{1} isn't valid binary!
So far, I've used giant switch statements with: #{1033} AND #{F000} = #{1000} in order to process these, but wondering how a more mature reboler might do this.
This is a rather big entry, but it addresses your needs and shows off PARSE a bit.
This is basically a working, albeit simple VM which uses the memory layout you describe above.
I set up a simple block of RAM which is an actual program that it executes when I use PARSE with the emulator grammar rule... basically, it increments an address and then jumps to that address, skipping over an NOP.
it then hits some illegal op and dies.
REBOL [
title: "simple VM using Parse, from scratch, using no external libraries"
author: "Maxim Olivier-Adlhoch"
date: 2013-11-15
]
;----
; builds a bitset with all low-order bits of a byte set,
; so only the high bits have any weight
;----
quaternary: func [value][
bs: make bitset!
reduce [
to-char (value * 16)
'-
to-char ((value * 16) + 15)
]
]
;------
; get the 12 least significant bits of a 16 bit value
LSB-12: func [address [string! binary!] ][
as-binary (address AND #{0FFF})
]
;------
i32-to-binary: func [
n [integer!]
/rev
][
n: load join "#{" [form to-hex to-integer n "}"]
either rev [head reverse n][n]
]
;------
; load value at given address. (doesn't clear the opcode).
LVAL: func [addr [binary!]][
to-integer copy/part at RAM ( (to-integer addr) + 1) 2
]
;------
; implement the opcodes which are executed by the CPU
JMP: func [addr][
print ["jumping to " addr]
continue: at RAM ((to-integer addr) + 1) ; 0 based address but 1 based indexing ;-)
]
INC: func [addr][
print ["increment value at address: " addr]
new-val: 1 + LVAL addr
addr: 1 + to-integer addr
bin-val: at (i32-to-binary new-val) 3
change at RAM addr bin-val
]
DEC: func [addr][
print ["decrement value at address: " addr]
]
NOP: func [addr][
print "skipping Nop opcode"
]
;------
; build the bitsets to match op codes
op1: quaternary 1
op2: quaternary 2
op3: quaternary 3
op4: quaternary 4
;------
; build up our CPU emulator grammar
emulator: [
some [
[
here:
[ op1 (op: 'JMP) | op2 (op: 'INC) | op3 (op: 'DEC) | op4 (op: 'NOP)] ; choose op code
:here
copy addr 2 skip (addr: LSB-12 addr) ; get unary op data
continue:
(do reduce [op addr])
:continue
]
| 2 skip (
print ["^/^/^/ERROR: illegal opcode AT: " to-binary here " offset[" -1 + index? here "]"] ; graceful crash!
)
]
]
;------
; generate a bit of binary RAM for our emulator/VM to run...
0 2 4 6 8 ; note ... don't need comments, Rebol just skips them.
RAM: #{2002100540FF30015FFF}
RAM-blowup: { 2 002 1 005 4 0FF 3 001 5 FFF } ; just to make it easier to trace op & data
parse/all RAM emulator
print "^/^/Yes that error is on purpose, I added the 5FFF bytes^/in the 'RAM' just to trigger it :-)^/"
print "notice that it doesn't run the NOP (at address #0006), ^/since we used the JMP opcode to jump over it.^/"
print "also notice that the first instruction is an increment ^/for the address which is jumped (which is misaligned on 'boot')^/"
ask "press enter to continue"
the output is as follows:
increment value at address: #{0002}
jumping to #{0006}
decrement value at address: #{0001}
ERROR: illegal opcode AT: #{5FFF} offset[ 8 ]
Yes that error is on purpose, I added the 5FFF bytes
in the 'RAM' just to trigger it :-)
notice that it doesn't run the NOP (at address #0006),
since we used the JMP opcode to jump over it.
also notice that the first instruction is an increment
for the address which is jumped (which is misaligned on 'boot')
press enter to continue
Okay, this is a Windows specific question.
I need to be able to access the ink levels of a printer connected to a computer. Possibly direct connection, or a network connection.
I recognize that it will likely be different for each printer (or printer company at least) but where can I find the information of how they reveal ink levels to a PC. Also, what is the best language to read this information in?
Okay, this is a OS agnostic answer... :-)
If the printer isn't a very cheapo model, it will have built-in support for SNMP (Simple Network Management Protocol). SNMP queries can return current values from the network devices stored in their MIBs (Management Information Bases).
For printers there's a standard defined called Printer MIB. The Printer MIB defines standard names and tree locations (OIDs == Object Identifiers in ASN.1 notation) for prtMarkerSuppliesLevel which in the case of ink marking printers map to ink levels.
Be aware that SNMP also allows private extensions to the standard MIBs. Most printer vendors do hide many additional pieces of information in their "private MIBs", though the standard info should always be available through the queries of the Printer MIB OIDs.
Practically every programming language has standard libraries which can help you to make specific SNMP queries from your own application.
One such implementation is Open Source, called Net-SNMP, which also comes with a few powerfull commandline tools to run SNMP queries.
I think the OID to query all levels for all inks is .1.3.6.1.2.1.43.11.1.1.9 (this webpage confirms my believe) but I cannot verify that right now, because I don't have a printer around in my LAN at the moment. So Net-SNMP's snmpget command to query ink levels should be something like:
snmpget \
-c public \
192.168.222.111 \
".1.3.6.1.2.1.43.11.1.1.9"
where public is the standard community string and 192.168.222.111 your printer's IP address.
I have an SNMP-capable HP 8600 pro N911a around to do some digging, so the following commands may help you a bit. Beware that this particular model has some firmware problems, you can't query "magenta" with snmpget, but you see a value with snmpwalk (which does some kind of recursive drill-down).
OLD: You can query the names and sequence of values, but I couldn't find the "max value" to calculate a clean percentage so far ;(. I'm guessing so far the values are relative to 255, so dividing by 2.55 yields a percentage.
Update: Marcelo's hint was great! From Registers .8.* you can read the max level per cartridge, and I was totally wrong assuming the max value can only be an 8-bit value. I have updated the sample script to read the max values and calculate c
There is also some discussion over there at Cacti forums.
One answer confirms that the ink levels are measured as percent (value 15 is "percent" in an enumeration):
# snmpwalk -v1 -c public 192.168.100.173 1.3.6.1.2.1.43.11.1.1.7
SNMPv2-SMI::mib-2.43.11.1.1.7.0.1 = INTEGER: 15
SNMPv2-SMI::mib-2.43.11.1.1.7.0.2 = INTEGER: 15
SNMPv2-SMI::mib-2.43.11.1.1.7.0.3 = INTEGER: 15
SNMPv2-SMI::mib-2.43.11.1.1.7.0.4 = INTEGER: 15
You need to install the net-snmp package. If you're not on Linux you might need some digging for SNMP command line tools for your preferred OS.
# snmpwalk -v1 -c public 192.168.100.173 1.3.6.1.2.1.43.11.1.1.6.0
SNMPv2-SMI::mib-2.43.11.1.1.6.0.1 = STRING: "black ink"
SNMPv2-SMI::mib-2.43.11.1.1.6.0.2 = STRING: "yellow ink"
SNMPv2-SMI::mib-2.43.11.1.1.6.0.3 = STRING: "cyan ink"
SNMPv2-SMI::mib-2.43.11.1.1.6.0.4 = STRING: "magenta ink"
# snmpwalk -v1 -c public 192.168.100.173 1.3.6.1.2.1.43.11.1.1.9.0
SNMPv2-SMI::mib-2.43.11.1.1.9.0.1 = INTEGER: 231
SNMPv2-SMI::mib-2.43.11.1.1.9.0.2 = INTEGER: 94
SNMPv2-SMI::mib-2.43.11.1.1.9.0.3 = INTEGER: 210
SNMPv2-SMI::mib-2.43.11.1.1.9.0.4 = INTEGER: 174
# snmpwalk -v1 -c praxis 192.168.100.173 1.3.6.1.2.1.43.11.1.1.8.0
SNMPv2-SMI::mib-2.43.11.1.1.8.0.1 = INTEGER: 674
SNMPv2-SMI::mib-2.43.11.1.1.8.0.2 = INTEGER: 240
SNMPv2-SMI::mib-2.43.11.1.1.8.0.3 = INTEGER: 226
SNMPv2-SMI::mib-2.43.11.1.1.8.0.4 = INTEGER: 241
On my Linux box I use the following script to do some pretty-printing:
#!/bin/sh
PATH=/opt/bin${PATH:+:$PATH}
# get current ink levels
eval $(snmpwalk -v1 -c praxis 192.168.100.173 1.3.6.1.2.1.43.11.1.1.6.0 |
perl -ne 'print "c[$1]=$2\n" if(m!SNMPv2-SMI::mib-2.43.11.1.1.6.0.(\d) = STRING:\s+"(\w+) ink"!i);')
# get max ink level per cartridge
eval $(snmpwalk -v1 -c praxis 192.168.100.173 1.3.6.1.2.1.43.11.1.1.8.0 |
perl -ne 'print "max[$1]=$2\n" if(m!SNMPv2-SMI::mib-2.43.11.1.1.8.0.(\d) = INTEGER:\s+(\d+)!i);')
snmpwalk -v1 -c praxis 192.168.100.173 1.3.6.1.2.1.43.11.1.1.9.0 |
perl -ne '
my #c=("","'${c[1]}'","'${c[2]}'","'${c[3]}'","'${c[4]}'");
my #max=("","'${max[1]}'","'${max[2]}'","'${max[3]}'","'${max[4]}'");
printf"# $c[$1]=$2 (%.0f)\n",$2/$max[$1]*100
if(m!SNMPv2-SMI::mib-2.43.11.1.1.9.0.(\d) = INTEGER:\s+(\d+)!i);'
An alternative approach could be using ipp. While most of the printers I tried support both, I found one which only worked with ipp and one that only worked for me with snmp.
Simple approach with ipptool:
Create file colors.ipp:
{
VERSION 2.0
OPERATION Get-Printer-Attributes
GROUP operation-attributes-tag
ATTR charset "attributes-charset" "utf-8"
ATTR naturalLanguage "attributes-natural-language" "en"
ATTR uri "printer-uri" $uri
ATTR name "requesting-user-name" "John Doe"
ATTR keyword "requested-attributes" "marker-colors","marker-high-levels","marker-levels","marker-low-levels","marker-names","marker-types"
}
Run:
ipptool -v -t ipp://192.168.2.126/ipp/print colors.ipp
The response:
"colors.ipp":
Get-Printer-Attributes:
attributes-charset (charset) = utf-8
attributes-natural-language (naturalLanguage) = en
printer-uri (uri) = ipp://192.168.2.126/ipp/print
requesting-user-name (nameWithoutLanguage) = John Doe
requested-attributes (1setOf keyword) = marker-colors,marker-high-levels,marker-levels,marker-low-levels,marker-names,marker-types
colors [PASS]
RECEIVED: 507 bytes in response
status-code = successful-ok (successful-ok)
attributes-charset (charset) = utf-8
attributes-natural-language (naturalLanguage) = en-us
marker-colors (1setOf nameWithoutLanguage) = #00FFFF,#FF00FF,#FFFF00,#000000,none
marker-high-levels (1setOf integer) = 100,100,100,100,100
marker-levels (1setOf integer) = 6,6,6,6,100
marker-low-levels (1setOf integer) = 5,5,5,5,5
marker-names (1setOf nameWithoutLanguage) = Cyan Toner,Magenta Toner,Yellow Toner,Black Toner,Waste Toner Box
marker-types (1setOf keyword) = toner,toner,toner,toner,waste-toner
marker-levels has current toner/ink levels, marker-high-levels are maximus (so far I've only seen 100s here), marker-names describe meaning of each field (tip: for colors you may want to strip everything after first space, many printers include cartridge types in this field).
Note: the above is with cups 2.3.1. With 2.2.1 I had to specify the keywords as one string instead ("marker-colors,marker-h....). Or it can be left altogether, then all keywords are returned.
More on available attributes (may differ between printers): https://www.cups.org/doc/spec-ipp.html
More on executing ipp calls (including python examples): https://www.pwg.org/ipp/ippguide.html
I really liked tseeling's approach!
Complementarily, I found out that the max value for the OID ... .9 is not 255 as guessed by him, but it actually varies per individual cartridge. The values can be obtained from OID .1.3.6.1.2.1.43.11.1.1.8 (the results obtained by dividing by these values match the ones obtained by running hp-inklevels command from hplip.
I wrote my own script that output CSVs like below (suppose printer IP addr is 192.168.1.20):
# ./hpink 192.168.1.20
black,73,366,19.9454
yellow,107,115,93.0435
cyan,100,108,92.5926
magenta,106,114,92.9825
values are in this order: <color_name>,<level>,<maxlevel>,<percentage>
The script source (one will notice I usually prefer awk over perl when the puzzle is simple enough):
#!/bin/sh
snmpwalk -v1 -c public $1 1.3.6.1.2.1.43.11.1.1 | awk '
/.*\.6\.0\./ {
sub(/.*\./,"");
split($0,TT,/[ "]*/);
color[TT[1]]=TT[4];
}
/.*\.8\.0\./ {
sub(/.*\./,"");
split($0,TT,/[ "]*/);
maxlevel[TT[1]]=TT[4];
}
/.*\.9\.0\./ {
sub(/.*\./,"");
split($0,TT,/[ "]*/);
print color[TT[1]] "," TT[4] "," maxlevel[TT[1]] "," TT[4] / maxlevel[TT[1]] * 100;
}'