Limit of 4Kbytes when using print in AWK?

I'm trying to replace a blank line in a set of text files (*.txt) with "--" if the previous line matches a pattern. My code is
awk 'BEGIN{$headerfound=0} { if (/pattern/) {print> FILENAME ; $headerfound=1} else { if((/^\s*$/) && ($headerfound == 1)) { $headerfound=0; print "--" > FILENAME } else {print > FILENAME} } }' *.txt
But for some reason the output is limited to 4 Kbytes per file (if a file is larger, it gets clipped). Do you know where the limitation is?
Thanks,
Ariel

See @glennjackman's comments for problems in your script.
The clipping happens because you print to the file you are still reading: the first print > FILENAME truncates the input file, and awk keeps reading only from its input buffer (often around 4 Kbytes), so anything beyond that is lost. Since you are using GNU awk (you used \s, which is gawk-specific) you can instead use inplace editing and write your script as (spread out with white space to improve readability):
awk -i inplace '{
    if (/pattern/) {
        print
        headerfound=1
    } else {
        if ((/^\s*$/) && (headerfound == 1)) {
            headerfound=0
            print "--"
        } else {
            print
        }
    }
}' *.txt
but you can do the same thing much more concisely (and awk-ishly) as:
awk -i inplace '
/pattern/ { headerfound=1 }
headerfound && !NF { $0="--"; headerfound=0 }
1' *.txt
If you don't have inplace editing (it requires GNU awk 4.1 or later) then do it this way:
for file in *.txt; do
    awk '
    /pattern/ { headerfound=1 }
    headerfound && !NF { $0="--"; headerfound=0 }
    1' "$file" > tmp$$ &&
    mv tmp$$ "$file"
done
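As a quick sanity check of the logic, here's a tiny run with a hypothetical pattern HEADER (substitute your real regexp); only the first blank line after a match is replaced, and any later blank line is left alone because headerfound has been reset:
$ printf 'HEADER\n\nkeep me\n\n' > sample.txt
$ awk '/HEADER/{headerfound=1} headerfound && !NF{$0="--"; headerfound=0} 1' sample.txt
HEADER
--
keep me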

You can probably get away with:
suffix=".$$.tmp" '
awk -v suf="$suffix" '
FNR == 1 {outfile = FILENAME suf}
/pattern/ {headerfound = 1}
headerfound && /^[[:blank:]]*$/ {$1 = "--"}
{ print > outfile }
' *.txt
for f in *.txt; do
    echo mv "${f}$suffix" "$f"
done
Remove the echo from the for loop if you're satisfied it's working.
Missed the "just after" requirement (using Ed's use of NF to find a blank line):
awk -v suf="$suffix" '
FNR == 1 {outfile = FILENAME suf}
/pattern/ {lineno = FNR}
FNR == lineno+1 && NF == 0 {$0 = "--"}
{ print > outfile }
' *.txt
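If you want to eyeball the changes before renaming (optional; assumes a diff with -u support, which GNU diff has):
for f in *.txt; do
    diff -u "$f" "${f}$suffix"
done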

Related

Performant comparisons in awk?

I've got a python script that runs through some logs and figured it'd be instructive to do a few benchmarks against some other approaches before deploying this out. When looking at awk, I'm hoping to minimize overhead to get a 'fair' shake at beating the somewhat optimized python variant.
My log entries look like:
--------
SomeField=SomeValue
OptionallyAppearingField=WhoKnowsWhat
AnotherField=AnotherValue
ExtraStuff=OneBonusKey=1,SecondBonusKey=2,ThirdBonusKey=3,...
--------
And I'm keen to get the value of AnotherField when one of our ThirdBonusKeys exists and has a certain value (actually just the number 1).
The 'stupid' way here is to set our RS to '--------' and then just apply a regex to $0 twice, first to see if ThirdBonusKey=1 is in the record, and then to extract AnotherField=(desired_value).
But that seems like an unfair comparison, given it's just throwing a regex at the problem (twice!). Without a guaranteed ordering of fields to leverage awk's cool FS skills, is there a quicker or more appropriate approach here? It's possible that the answer is just "this is not a job for awk", and that's okay too, I guess.
Cyrus has kindly pointed out that the sketch of code I gave above is not technically code, and he's technically correct, so here's a reasonably stupid implementation:
awk 'BEGIN{RS="--------"} { if ($0 ~ /ThirdBonusKey=1/) { for(i=1;i<NF;i++) {if ($i ~ "AnotherField=") { print $i }}}}'
Given input
--------
SomeField=SomeValue
OptionallyAppearingField=WhoKnowsWhat
AnotherField=DesiredValue1
ExtraStuff=OneBonusKey=1,SecondBonusKey=2,ThirdBonusKey=1,...
--------
SomeField=SomeValue
OptionallyAppearingField=WhoKnowsWhat
AnotherField=DesiredValue2
ExtraStuff=OneBonusKey=1,SecondBonusKey=2,ThirdBonusKey=0,...
--------
SomeField=
ExtraStuff=
--------
we'd expect output
AnotherField=DesiredValue1
Most efficiently I expect:
$ awk '/^AnotherField=/{val=$0; next} /[=,]ThirdBonusKey=1(,|$)/{print val}' file
AnotherField=DesiredValue1
but more robustly and easier to enhance to do anything else you want later:
$ cat tst.awk
BEGIN { FS="[,=[:space:]]"; OFS="=" }
/^-+$/ {
if ( f["ExtraStuff_ThirdBonusKey"] == 1 ) {
print "AnotherField", f["AnotherField"]
}
delete f
next
}
{
if ( $1 == "ExtraStuff" ) {
pfx = $1
sub(/[^=]+=/,"")
f[pfx] = $0
pfx = pfx "_"
}
else {
pfx = ""
}
for (i=1; i<NF; i+=2) {
f[pfx $i] = $(i+1)
}
}
$ awk -f tst.awk file
AnotherField=DesiredValue1
That second script first stores all of the values in an array f[] so you can access the values by their names; here's what the contents of that array look like:
$ cat tst.awk
BEGIN { FS="[,=[:space:]]"; OFS="=" }
/^-+$/ {
for (i in f) printf "> f[%s]=%s\n", i, f[i]
if ( f["ExtraStuff_ThirdBonusKey"] == 1 ) {
print "AnotherField", f["AnotherField"]
}
print "----"
delete f
next
}
{
if ( $1 == "ExtraStuff" ) {
pfx = $1
sub(/[^=]+=/,"")
f[pfx] = $0
pfx = pfx "_"
}
else {
pfx = ""
}
for (i=1; i<NF; i+=2) {
f[pfx $i] = $(i+1)
}
}
$ awk -f tst.awk file
----
> f[OptionallyAppearingField]=WhoKnowsWhat
> f[AnotherField]=DesiredValue1
> f[ExtraStuff_SecondBonusKey]=2
> f[ExtraStuff_ThirdBonusKey]=1
> f[ExtraStuff_OneBonusKey]=1
> f[SomeField]=SomeValue
> f[ExtraStuff]=OneBonusKey=1,SecondBonusKey=2,ThirdBonusKey=1,...
AnotherField=DesiredValue1
----
> f[OptionallyAppearingField]=WhoKnowsWhat
> f[AnotherField]=DesiredValue2
> f[ExtraStuff_SecondBonusKey]=2
> f[ExtraStuff_ThirdBonusKey]=0
> f[ExtraStuff_OneBonusKey]=1
> f[SomeField]=SomeValue
> f[ExtraStuff]=OneBonusKey=1,SecondBonusKey=2,ThirdBonusKey=0,...
----
> f[SomeField]=
> f[ExtraStuff]=
----
Given that, you can create whatever conditions and/or print whatever combinations of fields you want, in any input or output order.
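Since the goal is a benchmark against the Python variant, the simplest fair comparison is to time each approach over the same log; a sketch, where file stands in for your real log and redirecting to /dev/null keeps terminal output out of the measurement:
$ time awk '/^AnotherField=/{val=$0; next} /[=,]ThirdBonusKey=1(,|$)/{print val}' file > /dev/null
$ time awk -f tst.awk file > /dev/null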

Is it possible to prevent perforce submit without filename?

My usual way to submit a file is:
p4 submit -d "some description" filename
I could do:
p4 submit
and use the editor, but I always have many files open, so that method is inconvenient.
Several times, I have mistakenly typed
p4 submit -d "some description"
(forgot the filename)
This submitted dozens of open files to production, with unintended consequences.
Time to panic and spend the afternoon doing damage control.
I would like to prevent p4 submit -d when the filename is not specified.
If you are using Linux you can define a function in your .bashrc file that validates the number of arguments and won't let you submit if you miss the 4th parameter (the filename).
function p4()
{
    # Sketch: refuse "p4 submit -d <description>" with no filename.
    if [ "$1" = "submit" ] && [ "$2" = "-d" ] && [ $# -lt 4 ]; then
        echo "p4 submit -d requires a filename" >&2
        return 1
    fi
    # Pass everything else through to the real p4 binary.
    command p4 "$@"
}
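With a guard like that sketch in place, the dangerous invocation fails fast instead of submitting every open file:
$ p4 submit -d "some description"
p4 submit -d requires a filename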
Thanks @pitseeker
I created a Perl wrapper "p4s" which checks the arguments and forwards the call to the real "p4 submit".
#!/usr/bin/perl
use warnings;
use strict;
use Capture::Tiny 'capture_merged';

die "Description and file is required!\n" if @ARGV < 2;
my ($description, @files) = @ARGV;
if ( -f $description ) {
    die "It looks like you forgot the description before the filenames";
}
my $cmd;
my %summary;
print `date`;
for my $file (@files) {
    if ( ! -f $file ) {
        $summary{$file} = "File $file not found!";
        next;
    }
    my $pwd = `pwd`;
    chomp $pwd;
    # print p4 filelog to screen
    print `ls -l $file`;
    $cmd = "p4 filelog $file | head -n 2";
    $cmd = "p4 fstat -T 'headRev' $file";
    print $cmd . "\n";
    my $filelog = `$cmd`;
    print "$filelog" . "\n";
    $cmd = "p4 diff -sa $file";
    my ($merged, $status) = Capture::Tiny::capture_merged {system($cmd)};
    if ( ! $merged ) {
        $summary{$file} = "Skipped since the local file does not differ from p4";
        next;
    }
    # p4 submit
    $cmd = "p4 submit -r -d \"$description\" $file";
    print $cmd . "\n";
    ($merged, $status) = Capture::Tiny::capture_merged {system($cmd)};
    chomp $merged;
    print $merged . "\n";
    if ( $merged =~ /No files to submit from the default changelist/ ) {
        $summary{$file} = "$merged (You may need to 'p4 add' or 'p4 edit' this file)";
        next;
    }
    $summary{$file} = "Success";
}
if ( scalar @files > 0 ) {
    print "\nSummary:\n";
    for my $file (@files) {
        printf "%s %s\n", $file, $summary{$file};
    }
}
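Usage mirrors p4 submit -d, with the description first and one or more filenames after it (the filenames here are just examples):
$ p4s "some description" foo.c bar.c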

Print Field into Terminal Command

Sorry for confusing you. What I actually want is to send a serial message to my Arduino, which is connected to /dev/ttyACM0; that can be done by typing this command into a terminal:
$ echo "Hello Arduino" > /dev/ttyACM0
so I need my awk script to send a command just like that.
Here is my PBH.awk file:
BEGIN{
    FS = "[ .]";
    RS = "\0";
    IGNORECASE = 1;
}
{
    for (i=1; i<NF; i++){
        if (i == 1){
            printf("Diketahui : %s\n", $18);
        }
        if ($i=="y" && $(i+1)=="="){
            printf(" Persamaan : %s %s %s %s %s %s %s %s %s %s %s\n",$(i),$(i+1),$(i+2),$(i+3),$(i+4),$(i+5),$(i+6),$(i+7),$(i+8),$(i+9),$(i+10))
            inisialisasi = "stty -F /dev/ttyACM0 cs8 9600 ignbrk -brkint -icrnl -imaxbel -opost -onlcr -isig -icanon -iexten -echo -echoe -echok -echoctl -echoke noflsh -ixon -crtscts"
            kirim = "echo \"Field2 contains: $2""\" > /dev/ttyACM0"
            print | inisialisasi
            print | kirim
        }
    }
}
and here is the soalPBH.txt:
Persamaan gelombang berjalan pada seutas tali dinyatakan dengan y = 0,02 sin (20 π t – 0,2 π x). Jika x dan y dalam cm dan t dalam sekon, tentukan:
Then I run my awk script with
$ awk -f PBH.awk soalPBH.txt
My program doesn't send the text in field number 2.
Is there something wrong with this?
kirim = "echo \"Field2 contains: $2""\" > /dev/ttyACM0"
It's VERY unclear what you're trying to do. Is this it:
$ cat file
field1 field2 field3
$ awk '{printf "echo \"Hello Arduino %s\" > /dev/ttyACM0\n", $2}' file
echo "Hello Arduino field2" > /dev/ttyACM0
If not, clarify your question and provide some clear sample input and expected output.
Given your updated question, just move the quotes so $2 is outside of them instead of inside, i.e. close the string, concatenate $2, and reopen it (" $2 " instead of $2):
kirim = "echo \"Field2 contains: " $2 "\" > /dev/ttyACM0"
Then tell us if you still have a problem.
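For reference, a minimal sketch of the corrected send, using system() instead of piping print into the command (it assumes /dev/ttyACM0 is writable and has already been set up with stty):
# close the string, concatenate $2, then reopen it, and run the command
kirim = "echo \"Field2 contains: " $2 "\" > /dev/ttyACM0"
system(kirim)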

parsing issue with comma separated csv file

I am trying to extract the 4th column from a CSV file (comma separated, skipping the first 2 header lines) using this command,
awk 'NR <2 {next}{FS =","}{print $4}' filename.csv | more
However, it doesn't work, because the first column contains commas, so the 4th field is not really the 4th column. Below is an example of a row:
"sdfsdfsd, sfsdf", 454,fgdfg, I_want_this_column,sdfgdg,34546, 456465, etc
Unless you have specific reasons for using awk, I would recommend using a CSV parsing library. Many scripting languages have one built-in (or at least available) and they'll save you from these headaches.
If your first column always has quotes:
$ awk 'BEGIN{ FS="\042[ ]*," } { m=split($2,a,","); print a[3] } ' file
I_want_this_column
If the column you want is always the second to last:
$ awk -F"," '{print $(NF-1)}' file
I_want_this_column
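If you have GNU awk 4.0 or later, you can also use FPAT to describe what a field is (rather than what separates fields), which copes with the quoted, comma-containing first column; for your example row this prints the 4th field (with its leading space, which you may want to trim), and NR > 2 skips the two header lines:
$ awk -v FPAT='([^,]+)|("[^"]+")' 'NR > 2 { print $4 }' filename.csv
 I_want_this_column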
You can try this demo script to break down the columns
awk 'BEGIN{ FS="," }
{
for(i=1;i<=NF;i++){
# save normal
if($i !~ /^[ ]*\042|[ ]*\042[ ]*$/){
a[++j]=$i
}
# if quotes at the end
if(f==1 && $i ~ /[ ]*\042[ ]*$/){
s=s","$i
a[++j]=s
#reset
s="";f=0
}
# if quotes in front
if($i ~ /^[ ]*\042/){
s=s $i
f=1
}
if(f==1 && ( $i !~/\042/ ) ){
s=s","$i
}
}
}
END{
# print columns
for(p=1;p<=j;p++){
print "Field "p,": "a[p]
}
} ' file
output
$ cat file
"sdfsdfsd, sfsdf", "454,fgdfg blah , words ", I_want_this_column,sdfgdg
$ ./shell.sh
Field 1 : "sdfsdfsd, sfsdf"
Field 2 : fgdfg blah
Field 3 : "454,fgdfg blah , words "
Field 4 : I_want_this_column
Field 5 : sdfgdg
You shouldn't use awk here. Use Python's csv module, Perl's Text::CSV or Text::CSV_XS module, or another real CSV parser.
Related question -
parse csv file using gawk
If you can't avoid awk, this piece of code does the job you need:
BEGIN {FS=",";}
{
f=0;
j=0;
for (i = 1; i <=NF ; ++i) {
if (f) {
a[j] = a[j] "," $(i);
if ($(i) ~ "\"$") {
f = 0;
}
}
else {
++j;
a[j] = $(i);
if ((a[j] ~ "^\"[^\"]*$")) {
f = 1;
}
}
}
for (i = 1; i <= j; ++i) {
gsub("^\"","",a[i]);
gsub("\"$","",a[i]);
gsub("\"\"","\"",a[i]);
print "i = \"" a[i] "\"";
}
}
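Assuming you save that script as, say, parsecsv.awk (the name is just an example), run it with:
$ awk -f parsecsv.awk filename.csv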
Working with CSV files that have quoted fields with commas inside can be difficult with the standard UNIX text tools.
I wrote a program called csvquote to make the data easy for them to handle. In your case, you could use it like this:
csvquote filename.csv | awk 'NR <2 {next}{FS =","}{print $4}' | csvquote -u | more
or you could use cut and tail like this:
csvquote filename.csv | tail -n +3 | cut -d, -f4 | csvquote -u | more
The code and docs are here: https://github.com/dbro/csvquote

How do I know if PDF pages are color or black-and-white?

Given a set of PDF files among which some pages are color and the rest are black & white, is there any program to find out which of the given pages are color and which are black & white? This would be useful, for instance, when printing out a thesis, to only spend extra on printing the color pages. Bonus points for someone who takes double-sided printing into account, and sends an appropriate black-and-white page to the color printer if it is followed by a color page on the opposite side.
This is one of the most interesting questions I've seen! I agree with some of the other posts that rendering to a bitmap and then analyzing the bitmap will be the most reliable solution. For simple PDFs, here's a faster but less complete approach.
1. Parse each PDF page
2. Look for color directives (g, rg, k, sc, scn, etc.)
3. Look for embedded images, analyze for color
My solution below does #1 and half of #2. The other half of #2 would be to follow up with user-defined color, which involves looking up the /ColorSpace entries in the page and decoding them -- contact me offline if this is interesting to you, as it's very doable but not in 5 minutes.
First the main program:
use CAM::PDF;

my $infile = shift;
my $pdf = CAM::PDF->new($infile);

PAGE:
for my $p (1 .. $pdf->numPages) {
    my $tree = $pdf->getPageContentTree($p);
    if (!$tree) {
        print "Failed to parse page $p\n";
        next PAGE;
    }
    my $colors = $tree->traverse('My::Renderer::FindColors')->{colors};
    my $uncertain = 0;
    for my $color (@{$colors}) {
        my ($name, @rest) = @{$color};
        if ($name eq 'g') {
        } elsif ($name eq 'rgb') {
            my ($r, $g, $b) = @rest;
            if ($r != $g || $r != $b) {
                print "Page $p is color\n";
                next PAGE;
            }
        } elsif ($name eq 'cmyk') {
            my ($c, $m, $y, $k) = @rest;
            if ($c != 0 || $m != 0 || $y != 0) {
                print "Page $p is color\n";
                next PAGE;
            }
        } else {
            $uncertain = $name;
        }
    }
    if ($uncertain) {
        print "Page $p has user-defined color ($uncertain), needs more investigation\n";
    } else {
        print "Page $p is grayscale\n";
    }
}
And then here's the helper renderer that handles color directives on each page:
package My::Renderer::FindColors;

sub new {
    my $pkg = shift;
    return bless { colors => [] }, $pkg;
}
sub clone {
    my $self = shift;
    my $pkg = ref $self;
    return bless { colors => $self->{colors}, cs => $self->{cs}, CS => $self->{CS} }, $pkg;
}
sub rg {
    my ($self, $r, $g, $b) = @_;
    push @{$self->{colors}}, ['rgb', $r, $g, $b];
}
sub g {
    my ($self, $gray) = @_;
    push @{$self->{colors}}, ['rgb', $gray, $gray, $gray];
}
sub k {
    my ($self, $c, $m, $y, $k) = @_;
    push @{$self->{colors}}, ['cmyk', $c, $m, $y, $k];
}
sub cs {
    my ($self, $name) = @_;
    $self->{cs} = $name;
}
sub CS {    # stroke colorspace
    my ($self, $name) = @_;
    $self->{CS} = $name;
}
sub _sc {
    my ($self, $cs, @rest) = @_;
    return if !$cs; # syntax error
    if ($cs eq 'DeviceRGB') { $self->rg(@rest); }
    elsif ($cs eq 'DeviceGray') { $self->g(@rest); }
    elsif ($cs eq 'DeviceCMYK') { $self->k(@rest); }
    else { push @{$self->{colors}}, [$cs, @rest]; }
}
sub sc {
    my ($self, @rest) = @_;
    $self->_sc($self->{cs}, @rest);
}
sub SC {
    my ($self, @rest) = @_;
    $self->_sc($self->{CS}, @rest);
}
sub scn { sc(@_); }
sub SCN { SC(@_); }
sub RG { rg(@_); }
sub G { g(@_); }
sub K { k(@_); }
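Assuming both listings are saved together in one file, say pdfcolor.pl (the name is illustrative, and CAM::PDF must be installed from CPAN), you would run it as follows and get one verdict line per page:
$ perl pdfcolor.pl thesis.pdf
Page 1 is grayscale
Page 2 is color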
Newer versions of Ghostscript (version 9.05 and later) include a "device" called inkcov. It calculates the ink coverage of each page (not for each image) in Cyan (C), Magenta (M), Yellow (Y) and Black (K) values, where 0.00000 means 0%, and 1.00000 means 100% (see Detecting all pages which contain color).
For example:
$ gs -q -o - -sDEVICE=inkcov file.pdf
0.11264 0.11605 0.11605 0.09364 CMYK OK
0.11260 0.11601 0.11601 0.09360 CMYK OK
If the CMY values are not 0 then the page is color.
To just output the pages that contain color, use this handy one-liner:
$ gs -o - -sDEVICE=inkcov file.pdf |tail -n +4 |sed '/^Page*/N;s/\n//'|sed -E '/Page [0-9]+ 0.00000 0.00000 0.00000 / d'
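The same filtering can be done with awk (a sketch; like the -q example above, it assumes gs emits exactly one "CMYK OK" line per page, in page order):
$ gs -q -o - -sDEVICE=inkcov file.pdf |
  awk '/CMYK OK/ { n++; if ($1+$2+$3 > 0) printf "page %d has color\n", n }'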
It is possible to use the ImageMagick tool identify. If used on PDF pages, it converts each page to a raster image first. Whether a page contains color can be tested using the -format "%[colorspace]" option, which for my PDF printed either Gray or RGB. IMHO identify (or whatever tool it uses in the background; Ghostscript?) chooses the colorspace depending on the presence of color.
An example is:
identify -format "%[colorspace]" $FILE.pdf[$PAGE]
where PAGE is the page number starting from 0, not 1. If the page selection is not used, all pages will be collapsed into one, which is not what you want.
I wrote the following bash script which uses pdfinfo to get the number of pages and then loops over them, outputting the pages which are in color. I also added a feature for double-sided documents where you might need the non-colored back side of a colored page as well.
Using the output (a space-separated list), the colored PDF pages can be extracted with pdftk:
pdftk $FILE cat $PAGELIST output color_${FILE}.pdf
#!/bin/bash
FILE=$1
PAGES=$(pdfinfo ${FILE} | grep 'Pages:' | sed 's/Pages:\s*//')

GRAYPAGES=""
COLORPAGES=""
DOUBLECOLORPAGES=""

echo "Pages: $PAGES"
N=1
while (test "$N" -le "$PAGES")
do
    COLORSPACE=$( identify -format "%[colorspace]" "$FILE[$((N-1))]" )
    echo "$N: $COLORSPACE"
    if [[ $COLORSPACE == "Gray" ]]
    then
        GRAYPAGES="$GRAYPAGES $N"
    else
        COLORPAGES="$COLORPAGES $N"
        # For double sided documents also list the page on the other side of the sheet:
        if [[ $((N%2)) -eq 1 ]]
        then
            DOUBLECOLORPAGES="$DOUBLECOLORPAGES $N $((N+1))"
            #N=$((N+1))
        else
            DOUBLECOLORPAGES="$DOUBLECOLORPAGES $((N-1)) $N"
        fi
    fi
    N=$((N+1))
done
echo $DOUBLECOLORPAGES
echo $COLORPAGES
echo $GRAYPAGES
#pdftk $FILE cat $COLORPAGES output color_${FILE}.pdf
The script from Martin Scharrer is great. It contained a minor bug: it counted a pair of directly consecutive color pages twice. I fixed that. In addition, the script now counts the pages and lists the grayscale pages for double-paged printing. It also prints the pages comma separated, so the output can be used directly for printing from a PDF viewer. I've added the code below, but you can download it here, too.
Cheers,
timeshift
#!/bin/bash
if [ $# -ne 1 ]
then
    echo "USAGE: This script needs exactly one parameter: the path to the PDF"
    kill -SIGINT $$
fi

FILE=$1
PAGES=$(pdfinfo ${FILE} | grep 'Pages:' | sed 's/Pages:\s*//')

GRAYPAGES=""
COLORPAGES=""
DOUBLECOLORPAGES=""
DOUBLEGRAYPAGES=""
OLDGP=""
DOUBLEPAGE=0
DPGC=0
DPCC=0
SPGC=0
SPCC=0

echo "Pages: $PAGES"
N=1
while (test "$N" -le "$PAGES")
do
    COLORSPACE=$( identify -format "%[colorspace]" "$FILE[$((N-1))]" )
    echo "$N: $COLORSPACE"
    if [[ $DOUBLEPAGE -eq -1 ]]
    then
        DOUBLEGRAYPAGES="$OLDGP"
        DPGC=$((DPGC-1))
        DOUBLEPAGE=0
    fi
    if [[ $COLORSPACE == "Gray" ]]
    then
        GRAYPAGES="$GRAYPAGES,$N"
        SPGC=$((SPGC+1))
        if [[ $DOUBLEPAGE -eq 0 ]]
        then
            OLDGP="$DOUBLEGRAYPAGES"
            DOUBLEGRAYPAGES="$DOUBLEGRAYPAGES,$N"
            DPGC=$((DPGC+1))
        else
            DOUBLEPAGE=0
        fi
    else
        COLORPAGES="$COLORPAGES,$N"
        SPCC=$((SPCC+1))
        # For double sided documents also list the page on the other side of the sheet:
        if [[ $((N%2)) -eq 1 ]]
        then
            DOUBLECOLORPAGES="$DOUBLECOLORPAGES,$N,$((N+1))"
            DOUBLEPAGE=$((N+1))
            DPCC=$((DPCC+2))
            #N=$((N+1))
        else
            if [[ $DOUBLEPAGE -eq 0 ]]
            then
                DOUBLECOLORPAGES="$DOUBLECOLORPAGES,$((N-1)),$N"
                DPCC=$((DPCC+2))
                DOUBLEPAGE=-1
            elif [[ $DOUBLEPAGE -gt 0 ]]
            then
                DOUBLEPAGE=0
            fi
        fi
    fi
    N=$((N+1))
done
echo " "
echo "Double-paged printing:"
echo " Color($DPCC): ${DOUBLECOLORPAGES:1:${#DOUBLECOLORPAGES}-1}"
echo " Gray($DPGC): ${DOUBLEGRAYPAGES:1:${#DOUBLEGRAYPAGES}-1}"
echo " "
echo "Single-paged printing:"
echo " Color($SPCC): ${COLORPAGES:1:${#COLORPAGES}-1}"
echo " Gray($SPGC): ${GRAYPAGES:1:${#GRAYPAGES}-1}"
#pdftk $FILE cat $COLORPAGES output color_${FILE}.pdf
ImageMagick has some built-in methods for image comparison.
http://www.imagemagick.org/Usage/compare/#type_general
There are some Perl APIs for ImageMagick, so maybe if you cleverly combine these with a PDF to Image converter you can find a way to do your black & white test.
I would try to do it as follows, although there might be other, easier solutions, and I'm curious to hear them; I just want to give it a try:
Loop through all pages
Extract the pages to an image
Verify the color range of the image
For the page count, you can probably translate that without too much effort to Perl. It's basically a regex. It's also said that:
r"(/Type)\s?(/Page)[/>\s]"
You simply have to count how many times this regular expression occurs in the PDF file, minus the times you find the string "<>" (empty pages which are not rendered).
To extract the pages to images, you can use ImageMagick. Or see this question.
Finally, to determine whether a page is black and white, it depends on whether you mean literally black and white or grayscale. For black and white, you should only have, well, black and white in the whole image. For grayscale, it's really not my speciality, but I guess you could check whether the averages of the red, the green and the blue are close to each other, or whether the original image and a grayscale-converted one are close to each other.
Hope it gives some hints to help you go further.
Here is the ghostscript solution for Windows, which requires grep from GnuWin (http://gnuwin32.sourceforge.net/packages/grep.htm):
Monochrome (Black and White) pages:
gswin64c -q -o - -sDEVICE=inkcov DOCUMENT.pdf | grep "^ 0.00000 0.00000 0.00000" | find /c /v ""
Color pages:
gswin64c -q -o - -sDEVICE=inkcov DOCUMENT.pdf | grep -v "^ 0.00000 0.00000 0.00000" | find /c /v ""
Total pages (you get this one easier from any pdf reader):
gswin64c -q -o - -sDEVICE=inkcov DOCUMENT.pdf | find /c /v ""
