How to concatenate a datafile with four columns

My data file has four columns because opencv saves elements in an n x 4 form in xml when exporting a cv::Mat to xml.
How can I concatenate these four columns into one?
reset
set terminal postscript eps enhanced
set output 'imp.eps'
set pm3d
set palette model HSV defined ( 0 0 1 1, 1 1 1 1 )
set style data histogram
set colorbox user
set origin 0.0,0.10
set size ratio 0 1,0.8
set colorbox horizontal origin graph 0.0, -0.15, 0 size graph 1, 0.05, 0 noborder
set xrange [0:4096]
plot 'var_imp_first_run.data' using 1
set output
I have 4094 elements, and this example creates a histogram in which only the first 1024 are plotted. I need to append column 2 over x=1024:2048. Please ignore the colorbox stuff; I'm just playing around to learn gnuplot.
I found that the following solves the problem above.
set xrange [0:4096]
plot newhistogram at 0, 'var_imp_first_run.data' every:: 7 using 1,\
newhistogram at 1024,'var_imp_first_run.data' every:: 7 using 2,\
newhistogram at 2048,'var_imp_first_run.data' every:: 7 using 3,\
newhistogram at 3072,'var_imp_first_run.data' every:: 7 using 4;
Then I found out that it was not what I wanted, because the data are arranged as entries,
x1,x2,x3,x4
x5,x6,x7,x8
and so on.
So what I really need is a histogram that plots the rows before columns. Is it possible?

If you want it to read
x1
x2
x3
instead of
x1,x2,x3
Then (assuming you are on *nix) use
plot '< cat var_imp_first_run.data | tr "," "\n"' using 1
The leading < tells gnuplot to run the following command and read its output through a pipe:
cat var_imp_first_run.data | tr "," "\n"
Here, tr simply translates each comma into a newline.
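If tr is not available, the same row-before-column flattening can be done in a small preprocessing step. The following is only a sketch, assuming the values are comma-separated as in the x1,x2,x3,x4 layout above; the function name flatten_rows is made up for illustration and the sample string stands in for the contents of the data file.

```python
def flatten_rows(text, sep=","):
    """Return every value in row-major order: x1, x2, x3, x4, x5, ..."""
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split the row into fields and drop empty ones (e.g. a trailing separator)
        values.extend(f.strip() for f in line.split(sep) if f.strip())
    return values


# Stand-in for the file contents (assumed comma-separated)
sample = "1,2,3,4\n5,6,7,8\n"

# One value per line, which is the single-column layout gnuplot can plot with "using 1"
print("\n".join(flatten_rows(sample)))
```

Writing that output to a file gives a one-column data file that can be plotted directly with using 1.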

Related

Gnuplot filled curves adds unwanted "bottom" border

I am trying to visualize a time series data set on one plot as a pseudo 3d figure. However, I am having some trouble getting the filledcurves capability working properly. It seems to be adding an unwanted border at the "bottom" of my functions and I do not know how to fix this.
This is my current set up: I have nb_of_frames different files that I want to plot on one figure. Without the filledcurves option, I can do something like this
plot for [i=1:nb_of_frames] filename(i) u ($1):(50.0 * $2 + (11.0 - (i-1)*time_step)) w l linewidth 1.2 lt rgb "black" notitle
which produces a figure like this:
[Figure: plot with no fill options]
Instead of doing this, I want to use the filledcurves option to bring my plots "forward" and highlight the function that is more "forward" which I try to do with:
plot for [i=1:nb_of_frames] filename(i) u ($1):(50. * $2 + (11. - (i-1)*time_step)) w filledcurves fc "white" fs solid 1.0 border lc "black" notitle
This produces a figure as follows:
This is very close to what I want, but it seems that the border option adds a line underneath the function which I do not want. I have tried several variants of with filledcurves y1=0.0 with different values of y1, but nothing seems to work.
Any help would be appreciated. Thank you for your time.

Here is another workaround for gnuplot 5.2.
Apparently, gnuplot closes the filled area from the last point back to the first point. Hence, if you specify border, this closing line will also get a border, which is undesired here (at least until gnuplot 5.4rc2, as @Ethan says).
A straightforward solution would be to plot the data with filledcurves without border and then again with lines. However, since this is a series of shifted data, the two styles have to be plotted alternately. Unfortunately, gnuplot cannot switch plotting styles within a for loop (at least I don't know how). As a workaround, you have to build your plot command in a previous loop and use it with a macro @ (check help macros) in the plot command. I hope you can adapt the example below to your needs.
Code:
### filledcurves without bottom border
reset session
set colorsequence classic
$Data <<EOD
1 0
2 1
3 2
4 1
5 4
6 5
7 2
8 1
9 0
EOD
myData(i) = sprintf('$Data u ($1-0.1*%d):($2+%d/5.)',i,i)
myFill = ' w filledcurves fc "0xffdddd" fs solid 1 notitle'
myLine = ' w l lc rgb "black" notitle'
myPlotCmd = ''
do for [i=11:1:-1] {
myPlotCmd = myPlotCmd.myData(i).myFill.", ".myData(i).myLine.", "
}
plot @myPlotCmd
### end of code
Result:

I can reproduce this in gnuplot 5.2.8 but not in the output from the release candidate for version 5.4. So I think that some bug-fix or change was applied during the past year or so. I realize that doesn't help while you are using version 5.2, but if you can download and build from source for the 5.4 release candidate that would take care of it.
Update
I thought of a work-around, although it may be too complicated to be worth it.
You can treat this as a 2D projection of a 3D fence plot constructed using plot style with zerrorfill. In this projection the y coordinate is the visual depth. X is X. Three quantities are needed on z: the bounding line, the bottom, and the top. I.e. 5 fields in the using clause: x depth zline zbase ztop.
unset key
set view 90, 180
set xyplane at 0
unset ytics
set title "3D projection into the xz plane\nplot with zerrorfill" offset 0,-2
set xlabel "X axis" offset 0,-1
set zlabel "Z"
splot for [i=1:25] 'foo.dat' using ($1+i):(i/100.):($2-i):(-i):($2-i) \
with zerrorfill fc "light-cyan" lc "black" lw 2

How to get the accurate font size(height) in pdf

I have a sample pdf (attached), and it includes a text object and a rectangle object that have almost the same height. Then I checked the content of the pdf using iText RUPS, as below:
1 1 1 RG
1 1 1 rg
0.12 0 0 0.12 16 50 cm
q
0 0 m
2926 0 l
2926 5759 l
0 5759 l
0 0 l
W
n
Q
1 1 1 RG
1 1 1 rg
q
0 0 m
2926 0 l
2926 5759 l
0 5759 l
0 0 l
W
n
/F1 205.252 Tf
BT
0 0 0 RG
0 0 0 rg
/DeviceGray CS
/OC /oc1 BDC
0 -1 1 0 1648 5330 Tm
0 Tc
100 Tz
(Hello World) Tj
ET
Q
q
0 0 m
2926 0 l
2926 5759 l
0 5759 l
0 0 l
W
n
0 0 0 RG
0 0 0 rg
/DeviceGray CS
6 w
1 j
1 J
1649 5324 m
1649 4277 l
1800 4277 l
1800 5324 l
1649 5324 l
S
EMC
Q
Obviously the user space matrix is determined by [0.12 0 0 0.12 16 50], and the height for the rectangle is (1800-1649)*0.12*1=18.12, and for the font size I use 205.252*0.12=24.63024. Since the two values are not close, my problem is how to get the height/size of the font?
sample.pdf

OK - I took a look at your file and you're basically hosed. That's the scientific answer, now let me clarify :)
Bad PDF!
The PDF you have up there as a sample contains a font that is not embedded. That "/F1 Tf" command you have there points to the font "ArialMT" in the resources dict for that page. Because the font has not been embedded, you only have two options:
Try to find the actual font on the system and extract the necessary information from there.
Live with the information in the PDF. Let's start with that.
Font Descriptor
Here is an image from pdfToolbox examining the font in the PDF file (caution: I'm associated with this tool):
I've cut off some of the "Widths" table, but other than that this is all of the information you have in the PDF document for this font. And this means you can access the widths for each glyph, but you don't have access to the heights of each glyph. The only information you have regarding heights is the font bounding box which is the union of all glyph bounding boxes. In other words, the font bounding box is guaranteed to be big enough to contain any glyph from the font (both horizontally and vertically).
System Information
You don't say why you need this information, so it becomes a little harder to advise further. But if you can't get the information from the PDF, your only options are to live with the inaccurate information from the PDF or to turn to the system your code is running on to get you more.
If you have the ArialMT font installed, you could basically try to find the font file and then parse the TrueType font file to find the bounding boxes for each glyph. I've done that, it's not funny.
Or you can see if your system can't provide you with the information in a better way. Many operating systems / languages have text calls that can get accurate measurements for you. If not, you can brute force it by rendering the text you want in black on a white image and then examining the pixels to see where you hit and thus how big the largest glyph in your text string was.
Wasteful though that last option sounds, it's probably the quickest and easiest to implement and it - depending on your needs - may actually be the best option all around.
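The pixel-examination step of that brute-force approach can be sketched as follows. This is a minimal illustration, not a complete solution: it assumes the text has already been rendered by some other means into a grayscale bitmap (a list of rows, 255 for white and 0 for black), and the name ink_bounds, the threshold and the tiny sample bitmap are all made up for the example.

```python
def ink_bounds(bitmap, threshold=128):
    """Return (top, bottom, left, right) of all pixels darker than threshold.

    bitmap is a list of rows of grayscale values; returns None if nothing is dark.
    """
    rows = [y for y, row in enumerate(bitmap) if any(p < threshold for p in row)]
    if not rows:
        return None
    cols = [x for row in bitmap for x, p in enumerate(row) if p < threshold]
    return rows[0], rows[-1], min(cols), max(cols)


# Hypothetical 5x6 render: 255 = white background, 0 = black glyph pixels
bitmap = [
    [255, 255, 255, 255, 255],
    [255,   0, 255, 255, 255],
    [255,   0,   0, 255, 255],
    [255,   0, 255, 255, 255],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
]

top, bottom, left, right = ink_bounds(bitmap)
height_px = bottom - top + 1  # height of the tallest inked extent, in pixels
print(top, bottom, left, right, height_px)
```

Dividing the pixel height by the render resolution converts it back into the units you rendered in.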

I have a sample pdf (attached), and it includes a text object and a rectangle object that have almost the same height.
Indeed, your PDF is displayed like this:
But looking at this one quickly realizes that the glyphs in your text "Hello World" do not extend beneath the base line like a 'g', 'j' or some other glyphs would:
(The base line is the line through the glyph origins)
Since the two values are not close, my problem is how to get the height/size of the font
Obviously the space required for such descenders beneath the base line must also be part of the font size.
Thus, it is completely correct and not a problem that the height of the box (18.12) is considerably smaller than the font size (24.63024).
BTW, this corresponds with the specification, which describes a font size of 1 as arranged so that the nominal height of tightly spaced lines of text is 1 unit, cf. section 9.2.2 "Basics of Showing Text" of ISO 32000-1. Tightly spaced lines obviously need to include not only glyph parts above the base line but also those below. The font size also includes a small gap between such lines, as even tightly spaced lines are not expected to touch each other.
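For reference, the two numbers being compared can be reproduced from the content stream quoted in the question. The sketch below only redoes that arithmetic; the helper names matmul and vertical_scale are made up for illustration, and the matrices are the cm and Tm operands from the question. It combines the text matrix with the current transformation matrix and takes the length of the resulting vertical axis vector as the scale factor.

```python
import math


def matmul(m, n):
    """Multiply two PDF matrices given as (a, b, c, d, e, f), row-vector convention."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2)


def vertical_scale(m):
    """Length of the transformed unit vector along the text's vertical axis."""
    return math.hypot(m[2], m[3])


ctm = (0.12, 0, 0, 0.12, 16, 50)   # operands of "cm" in the question
tm = (0, -1, 1, 0, 1648, 5330)     # operands of "Tm": a 90 degree rotation, no scaling
tf_size = 205.252                  # operand of "Tf"

trm = matmul(tm, ctm)              # text space -> default user space
font_size = tf_size * vertical_scale(trm)
rect_extent = (1800 - 1649) * 0.12  # rectangle extent across the rotated text

print(round(font_size, 5), round(rect_extent, 2), round(rect_extent / font_size, 3))
```

The ratio printed last shows how much of the font size the rectangle covers, which is the gap explained above.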

Gnuplot multiplot positions all text incorrectly in epslatex terminal

I am trying to use the multiplot feature of gnuplot to make an inset graphic on the main plot. I can generate the plot exactly as I want with term='wxt' except for the axis labels, which require LaTeX formatting for generating the desired symbols. When I submit the same commands to term='epslatex', the plot is fine, but all text (axis and tic mark labels) is positioned incorrectly.
I thought using the set size & origin commands might have confused the epslatex terminal output, so I attempted to use the layout command and make the plots side by side just to see if the text would print correctly. It did not.
I'm using gnuplot 4.6 patch 4, and Linux Mint 17.
My script is below. The commented sections indicate the original script that used set size and origin commands to manually place the second plot as an inset, rather than side by side.
set term epslatex color font ",16"
unset key
set termoption dash
set style line 1 lc rgb 'blue' lw 2 lt 1
set style line 2 lc rgb 'red' lw 2 lt 3
set style line 3 lc rgb 'green' lw 2 lt 5
set style line 4 lc rgb 'magenta' lw 2 lt 7
set style line 5 lc rgb 'black' lw 1 lt 0
set output "gr-thresholds.tex"
#set size 1,1
# set multiplot
set multiplot layout 1,2
# bigger plot
set autoscale
set ytics scale default autofreq
set xrange[0:14]
set yrange[0:1.7]
set xlabel 'r (\AA)'
set ylabel '$g(r)$'
#set size 1,1
#set origin 0,0
plot "foo1.csv" w l ls 2, \
"foo2.csv" w l ls 3 , \
"foo3.csv" w l ls 1, \
"foo4.csv" w l ls 4
#small inset
#set size 0.4, 0.4
#set origin 0.5,0.15
set xrange[1.2:2.2]
set yrange[0:0.8]
set ytics 0, 0.2, 2
set xlabel ""
set ylabel ""
plot "foo1.csv" w l ls 2, \
"foo2.csv" w l ls 3 , \
"foo3.csv" w l ls 1, \
"foo4.csv" w l ls 4
unset multiplot
set output
The figure that was generated:

It might be a problem with the way you generate a pdf. The two commands dvipdfm and dvipdf produce different outcomes.
If I take your code, but plot sin(x) instead, and use the following in the terminal:
$ latex file.tex
$ dvipdfm file.dvi
I also get a mismatch between the axes and the plots.
If I use dvipdf however everything looks fine:
$ dvipdf file.dvi

Ok,
Per Tom Fenech's suggestion, I made a minimum code sample to reproduce the error, and the issue that arose is a machine state problem. To generate my graphs, I had run the script twice, once using term wxt and then again using term epslatex.
The problem is that somewhere the state of the gnuplot environment is changed and is not reset by this script. Specifically, the first time through, the default placement of the text labels is fine. The second time through, the range and labels are still attached to the size and origin from the last plot, which is the inset. I thought this was due to the order of the commands set origin/size relative to x/ylabel and x/y range, but simply running the below code twice without restarting gnuplot will generate two different plots. The first time is exactly what I wanted, the second time will skew the labels as shown above.
So I have a "solution", but it is fragile. I would appreciate it if someone could explain what I need to do to make this script run multiple times without restarting gnuplot each time.
Cheers,
--Jim
set term epslatex color font ",16"
unset key
f(x) = sin(x)
set output "sin.tex"
set multiplot
set size 1,1
set origin 0,0
set xrange[0:14]
set yrange[0:6]
set xlabel 'r (\AA)'
set ylabel '$g(r)$'
plot f(x)
#small inset
set size 0.4, 0.4
set origin 0.5,0.15
set xrange[1:3]
set yrange[0:4]
set ytics 0, 0.2, 2
set xlabel ""
set ylabel ""
plot f(x)
unset multiplot
set output

How to remove unknown slits when using filledcurve in Gnuplot (epslatex)?

I'm trying to use filledcurve in gnuplot 4.6, patchlevel 1. The following is a sample script:
set term epslatex
set output "figure.tex"
set xlabel "\\huge{x-axis}"
set ylabel "\\huge{y-axis}"
set format xy "\\LARGE{%.0f}"
set xrange [0.0:10.0]
set yrange [0.0:100.0]
set xtics 2.0
set ytics 20.0
set xtics offset 0, -0.3
f1(x) = x**1
f2(x) = x**2
f3(x) = x**3
set nokey
plot '+' using 1:(f2($1)):(f3($1)) with filledcurve lt 1 lc rgb "gray60",\
'+' using 1:(f1($1)):(f2($1)) with filledcurve lt 1 lc rgb "gray40",\
'+' using 1:(0.0):(f1($1)) with filledcurve lt 1 lc rgb "gray20"
I don't know why, but there are annoying white slits between the bars. I cannot get rid of them even if I increase the number of samples with set samples.
Is there a way to remove these slits?

Unfortunately, this is a viewer problem related to the drawing of adjacent filled polygons, see also problematic Moire pattern in image produced with gnuplot pm3d and pdf output or the bug report #1259 cairolatex pdf fill patterns.
In your case you can use a workaround:
When you have only two columns in the using statement, the area is drawn as closed polygon and doesn't show these artifacts (filledcurves closed). So you must fill the area between each curve and the x1 axis (with filledcurves x1).
Because of a bug in the clipping of curves which exceed the y-range, you must do the clipping of the f3 curve yourself (i.e. use f3($1) > 100 ? 100 : f3($1)). This bug is fixed in the development version.
So your script is:
set term epslatex standalone
set output "figure.tex"
set xlabel "\\huge x-axis"
set ylabel "\\huge y-axis"
set format xy "\\LARGE %.0f"
set xrange [0.0:10.0]
set yrange [0.0:100.0]
set xtics 2.0
set ytics 20.0
set xtics offset 0, -0.3
f1(x) = x**1
f2(x) = x**2
f3(x) = x**3
set nokey
plot '+' using 1:(f3($1) > 100 ? 100 : f3($1)) with filledcurve x1 lt 1 lc rgb "gray60",\
'+' using 1:(f2($1)) with filledcurve x1 lt 1 lc rgb "gray40",\
'+' using 1:(f1($1)) with filledcurve x1 lt 1 lc rgb "gray20"
set output
system('latex figure.tex && dvips figure.dvi && ps2pdf figure.ps')
with the result (using 4.6.1):
Note also that LaTeX commands like \huge don't take arguments, but are switches. Test e.g. \huge{A}BC; this will make all letters huge. Usually you must limit the scope of \huge with braces like {\huge ABC}, but if the whole label is affected, it is enough to use set xlabel "\\huge x-axis". That doesn't change anything in your case, but may give you trouble in other circumstances :)

Irregular gnuplot x-values

I have some data (benchmarking results of k-selection algorithms) which has irregular x-values. I have them labeled explicitly (1, 50, 100, 500, 1000, 2500, and 5000).
Because these values are not linearly increasing (or exponentially increasing, although making the x-axis a logscale does improve things a bit, but leaves a huge gap between 1 and 50) they are oddly spread out and clumped together. Is there a way to scale the x-axis so that these data points are drawn at an even spacing?
Below is a sample of the grid spacing (labels and legends are not visible in the eps file, these are drawn later by the graphicx package) as well as the gnuplot commands I'm using.
set terminal epslatex
set xlabel 'k'
set ylabel 'seconds'
set tic scale 0
set key Left outside
set xtics rotate
set xtics ('1' 1, '50' 50, '100' 100, '500' 500, '1000' 1000, '2500' 2500, '5000' 5000)
set logscale y
set logscale x
set style data linespoints
set output 'selection.tex'
plot '../data/selection.dat' u 1:($2/1000) t 'Sort', \
'../data/selection.dat' u 1:($3/1000) t 'Partial Sort', \
'../data/selection.dat' u 1:($4/1000) t 'Heap', \
'../data/selection.dat' u 1:($5/1000) t 'Partial Heap', \
'../data/selection.dat' u 1:($6/1000) t 'Order Statistics'

The cleanest way to do it is to use the xticlabels function in the using specification:
plot '../data/selection.dat' u ($2/1000):xticlabels(1) t 'Sort', \...
This takes the value from the first column and uses it as the x axis labels, while using the second column as data.
This is a bit like using the command
plot 'file' u 2
which just plots the data from the second column against a dummy index (0, 1, 2, 3, ...). This gives an even spacing to the data points, which seems to be what you want here.
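To see why this evens out the spacing, it can help to write down what ends up on the x axis. The snippet below is just an illustration of the idea, not gnuplot's implementation: each k value from the question is placed at its row index, and the original value is kept only as the tic label.

```python
# The irregular x values from the question
k_values = [1, 50, 100, 500, 1000, 2500, 5000]

# Each row is drawn at its index instead of at its value
positions = list(range(len(k_values)))

# The value survives only as a label attached to that position
tics = [(str(k), pos) for pos, k in zip(positions, k_values)]

# Distance between neighbouring points is now always the same
gaps = {b - a for a, b in zip(positions, positions[1:])}

print(tics)
print(gaps)
```

Because positions are indices, the x axis no longer carries numeric meaning, so a logscale on x is no longer useful for this plot.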
