How to get the blktrace tool to show the D - issued action - blktrace

This question is about the blktrace tool. On several Ubuntu 3.16.0 machines in our lab I need to trace the software vs device block IO performance. We sometimes use our custom nvme driver and sometimes the standard one. Here is an excerpt of the blkparse output (with the standard nvme driver):
259,0 2 189505 9.997188463 8160 Q R 126875648 + 248 [fio]
259,0 2 189506 9.997191290 8160 Q R 126875896 + 8 [fio]
259,0 2 189507 9.997215574 8160 Q R 363057152 + 248 [fio]
259,0 2 189508 9.997218444 8160 Q R 363057400 + 8 [fio]
259,0 2 189509 9.997219210 8160 C R 216536568 + 8 [0]
259,0 2 189510 9.997220497 8160 C R 126875896 + 8 [0]
259,0 2 189511 9.997230160 8160 C R 363057400 + 8 [0]
259,0 2 189512 9.997248050 8160 Q R 147316736 + 248 [fio]
259,0 2 189513 9.997250930 8160 Q R 147316984 + 8 [fio]
259,0 2 189514 9.997277161 0 C R 147316984 + 8 [0]
This shows the Queued and Complete actions but not the D - issued actions that I am interested in. That is the problem. I need more actions (events) shown. This is from
blktrace /dev/nvme0n1
Meanwhile, on other Linux machines it works, and it even works on the same machine if I trace a different device, such as
blktrace /dev/sda
That works as shown in this excerpt:
8,0 18 69 17.778827207 8538 Q RA 306186592 + 8 [ls]
8,0 18 70 17.778827767 8538 G RA 306186592 + 8 [ls]
8,0 18 71 17.778828037 8538 I R 306186592 + 8 [ls]
8,0 18 72 17.778828284 8538 D R 306186592 + 8 [ls]
8,0 18 73 17.778832181 8538 A RA 306186600 + 8 <- (8,1) 306184552
8,0 18 74 17.778832397 8538 Q RA 306186600 + 8 [ls]
8,0 18 75 17.778832951 8538 G RA 306186600 + 8 [ls]
8,0 18 76 17.778833221 8538 I R 306186600 + 8 [ls]
8,0 18 77 17.778833441 8538 D R 306186600 + 8 [ls]
8,0 18 78 17.778837161 8538 A RA 306186608 + 8 <- (8,1) 306184560
This last one (with /dev/sda) shows all the different actions, which is great.
So how do I get the detailed blktrace for the nvme0n1 device? And why does it not automatically show the other actions (besides Q and C)?

You should be able to see D as long as your nvme device has a block device interface. Try the libaio engine in fio and run a random write. Make sure to set the NOOP IO scheduler in /sys/block/nvme../scheduler
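To check quickly which actions a trace actually contains, a small script can tally the action column of the blkparse output. This is only a sketch: `count_actions` is a made-up helper name, and it assumes the default blkparse line layout (device, CPU, sequence, timestamp, PID, action, RWBS, ...) seen in the excerpts from the question.

```python
from collections import Counter


def count_actions(blkparse_text):
    """Count blkparse action codes (Q, G, I, D, C, A, ...) per device."""
    counts = {}
    for line in blkparse_text.splitlines():
        fields = line.split()
        # Assumed default layout of an event line:
        # maj,min cpu seq timestamp pid action rwbs sector + size [process]
        if len(fields) < 7 or "," not in fields[0]:
            continue  # skip anything that does not look like an event line
        device, action = fields[0], fields[5]
        counts.setdefault(device, Counter())[action] += 1
    return counts


# A few lines copied from the two excerpts in the question
sample = """\
259,0 2 189505 9.997188463 8160 Q R 126875648 + 248 [fio]
259,0 2 189509 9.997219210 8160 C R 216536568 + 8 [0]
8,0 18 69 17.778827207 8538 Q RA 306186592 + 8 [ls]
8,0 18 72 17.778828284 8538 D R 306186592 + 8 [ls]
"""

for device, actions in count_actions(sample).items():
    print(device, dict(actions), "D seen:", "D" in actions)
```

Running it over a full trace before and after changing the scheduler or the fio engine shows at a glance whether D events started to appear for the nvme device.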


How to Transpose Multiple Columns' Values by Group, between Group Delimiters in an Adjacent Column, in Google Sheets?

I have the following minimal example data (in reality 100's of groups) in range A1:P9 (same data in range A14:A22):
With Sample A1:AR9:
2
61
219
2
4
2
:
61
219
26
26
26
94
21
33
4
26
26
26
94
2
2
:
154
26
40
19
3
2
21
33
14
1
2
3
:
87
39
54
38
26
32
38
26
32
87
39
54
38
26
23
23
4
6
28
2
154
26
2
2
40
19
14
87
39
54
38
26
32
38
26
32
87
39
54
38
26
1
23
2
23
4
4
3
6
20
28
Or Sample A14:AQ22:
2
61
219
2
:
61
219
4
:
26
26
26
94
2
:
21
33
4
26
26
26
94
2
:
154
26
2
:
40
19
3
2
21
33
14
:
87
39
54
38
26
32
38
26
32
87
39
54
38
26
1
:
23
2
:
23
4
:
3
6
20
2
154
26
2
2
40
19
14
87
39
54
38
26
32
38
26
32
87
39
54
38
26
1
23
2
23
4
4
3
6
20
28
I need the output as shown in range Q1:AR3 or as in range Q14:AQ16.
Basically, for each group delimited by the values in Column A, I would need:
The intermediary adjacent values in Column B to be transposed horizontally
And the adjacent content of Columns C to P (14 Columns, at least) to be "joined" together horizontally and sequentially "per group", including the content of the delimiter's row (in Column A).
As a bonus it would be really nice to have the Transposed data followed by a :, and each sub Content of Columns C to P to be also separated by a | (as shown in screenshot Q1:AR3 or Q14:AR16).
(Or if it's more feasible, alternatively, the simpler to read 2nd model as in A14:AQ22).
I have a really hard time putting together a formula to come to the expected result.
All I could think of was:
Transposing Column B's content by getting the rows of the adjacent Cells with values in column A,
Concatenating with the Column letter,
Duplicating it in a new column, and Filtering out the blank intermediary cells,
Then shifting the duplicated column 1 cell up,
Then concatenating within a TRANSPOSE formula to get the range of the groups,
Then finally transposing all the groups from Column B in a new Column
(very convoluted but I couldn't find better way).
To get to that input:
=TRANSPOSE(B1:B3)
=TRANSPOSE(B4:B5)
=TRANSPOSE(B7:B9)
That was already a very manual and error prone process, and still I could not successfully think of how to do the remaining content joining of Column C to P in a formula.
I tested the following approach, but it does not work, and it would be very tedious to fix and to apply to large datasets:
=TRANSPOSE(B1:B3)&": "&JOIN( " | " , FILTER(C1:P1, NOT(C2:P2 = "") ))&JOIN( " | " , FILTER(C2:P2, NOT(C2:P2 = "") ))&JOIN( " | " , FILTER(C43:P3, NOT(C3:P3 = "") ))
=TRANSPOSE(B4:B5)&": "&JOIN( " | " , FILTER(C4:P4, NOT(C4:P4 = "") ))&JOIN( " | " , FILTER(C5:P5, NOT(C5:P5 = "") ))
=TRANSPOSE(B6:B9)&": "&JOIN( " | " , FILTER(C6:P6, NOT(C6:P6 = "") ))&JOIN( " | " , FILTER(C7:P7, NOT(C7:P7 = "") ))&JOIN( " | " , FILTER(C8:P8, NOT(C8:P8 = "") ))&JOIN( " | " , FILTER(C8:P8, NOT(C9:P9 = "") ))
Is there a better approach to reach the expected result? Preferably with a formula or, if that is not possible, with a script.
Any help is greatly appreciated.
For Sample 1 try this out:
=LAMBDA(norm,MAP(UNIQUE(norm),LAMBDA(ζ,{TRANSPOSE(FILTER(B1:B9,norm=ζ)),":",SPLIT(BYROW(TRANSPOSE(FILTER(BYROW(C1:P9,LAMBDA(r,TEXTJOIN("ζ",1,r))),norm=ζ)),LAMBDA(rr,TEXTJOIN("γ|γ",1,rr))),"ζγ")})))(SORT(SCAN(,SORT(A1:A9,ROW(A1:A9),),LAMBDA(a,c,IF(c="",a,c))),ROW(A1:A9),))
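For readers who would rather follow the grouping logic outside of Sheets, here is a rough Python sketch of the same idea. It assumes (my reading of the double SORT around SCAN) that each delimiter value in Column A sits on the last row of its group. The function name `group_rows` and the data are made up for illustration, and unlike the formula it groups consecutive rows rather than matching label values.

```python
def group_rows(rows):
    """Collect rows into groups; a non-empty label in column A closes a group.

    rows: list of (label, b_value, cells) tuples, one per sheet row, where
    cells holds the values of the remaining columns ("" for a blank cell).
    """
    groups, b_values, parts = [], [], []
    for label, b_value, cells in rows:
        b_values.append(b_value)
        filled = [c for c in cells if c != ""]  # ignore blanks, like TEXTJOIN(..., 1, ...)
        if filled:
            parts.append(filled)
        if label != "":  # delimiter row: emit the finished group
            out = b_values + [":"]
            for i, part in enumerate(parts):
                if i > 0:
                    out.append("|")  # separator between the rows of one group
                out.extend(part)
            groups.append(out)
            b_values, parts = [], []
    return groups  # rows after the last delimiter are dropped


# Made-up stand-in data, not the sample from the question
rows = [
    ("", "a", ["x1", "", "x2"]),
    ("", "b", ["y1", "", ""]),
    ("g1", "c", ["z1", "z2", ""]),
    ("", "d", ["p1", "", ""]),
    ("g2", "e", ["q1", "", ""]),
]
for group in group_rows(rows):
    print(group)
```

Each printed list corresponds to one output row: the transposed Column B values, a ":" cell, then the joined contents of the other columns with "|" between source rows.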

Get a list of function results until result > x

I basically want the same thing as this OP:
Is there a J idiom for adding to a list until a certain condition is met?
But I can't get the answers to work with OP's function or my own.
I will rephrase the question and write about the answers at the bottom.
I am trying to create a function that will return a list of Fibonacci numbers less than 2.000.000. (without writing "while" inside the function).
Here is what I have tried:
First, I picked a way to calculate Fibonacci numbers from this site:
https://code.jsoftware.com/wiki/Essays/Fibonacci_Sequence
fib =: (i. +/ .! i.#-)"0
echo fib i.10
0 1 1 2 3 5 8 13 21 34
Then I made an arbitrary list I knew was larger than what I needed. :
fiblist =: (fib i.40) NB. THIS IS A BAD SOLUTION!
Finally, I removed the numbers that were greater than what I needed:
result =: (fiblist < 2e6) # fiblist
echo result
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1.34627e6
This gets the right result, but is there a way to avoid using some arbitrary number like
40 in "fib i.40" ?
I would like to write a function, such that "func 2e6" returns the list of fibonacci numbers below 2.000.000. (without writing "while" inside the function).
echo func 2e6
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1.34627e6
here are the answers from the other question:
first answer:
2 *^:(100&>@:])^:_"0 (1 3 5 7 9 11)
128 192 160 112 144 176
second answer:
+:^:(100&>)^:(<_) ] 3
3 6 12 24 48 96 192
As I understand it, I just need to replace the functions used in the answers, but I don't see how
that can work. For example, if I try:
echo (, [: +/ _2&{.)^:(100&>@:])^:_ i.2
I get an error.
I approached it this way. First I want to have a way of generating the nth Fibonacci number, and I used f0b from your link to the Jsoftware Essays.
f0b=: (-&2 +&$: -&1) ^: (1&<) M.
Once I had that I just want to put it into a verb that will check to see if the result of f0b is less than a certain amount (I used 1000) and if it was then I incremented the input and went through the process again. This is the ($:@:>:) part. $: is Self-Reference. The right 0 argument is the starting point for generating the sequence.
($:@:>: ^: (1000 > f0b)) 0
17
This tells me that 17 Fibonacci numbers (indices 0 through 16) are below my limit. I use that information to generate the Fibonacci numbers by applying f0b to each item in i. ($:@:>: ^: (1000 > f0b)) 0 by using rank 0 (f0b"0)
f0b"0 i. ($:@:>: ^: (1000 > f0b)) 0
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
In your case you wanted the ones under 2000000
f0b"0 i. ($:@:>: ^: (2000000 > f0b)) 0
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269
... and then I realized that you wanted a verb to be able to answer your original question. I went with dyadic where the left argument is the limit and the right argument generates the sequence. Same idea but I was able to make use of some hooks when I went to the tacit form. (> f0b) checks if the result of f0b is under the limit and ($: >:) increments the right argument while allowing the left argument to remain for $:
2000000 (($: >:) ^: (> f0b)) 0
32
fnum=: (($: >:) ^: (> f0b))
2000000 fnum 0
32
f0b"0 i. 2000000 fnum 0
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269
I have little doubt that others will come up with better solutions, but this is what I cobbled together tonight.
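For comparison only, the same "keep generating until the result reaches the limit" idea can be sketched outside J. The Python below is not a translation of the J verbs above; the names `fib` and `fibs_below` are made up. The generator itself is unbounded, and the stopping condition lives entirely in `takewhile`, so no arbitrary length like 40 is needed.

```python
from itertools import takewhile


def fib():
    """Yield the Fibonacci sequence 0, 1, 1, 2, 3, ... with no upper bound."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def fibs_below(limit):
    """Return every Fibonacci number strictly below limit."""
    return list(takewhile(lambda n: n < limit, fib()))


print(fibs_below(2_000_000))
```

For a limit of 2000000 this gives 32 numbers ending in 1346269, which matches the J results above.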

torch / lua: retrieving n-best subset from Tensor

I have the following code, which stores the index with the maximum score for each question in pred and converts it to a string.
I want to do the same for the n-best indices for each question, not just the single index with the maximum score, and convert them to strings. I also want to display the score for each index (or each converted string).
So scores will have to be sorted, and pred will have to hold multiple rows/columns instead of 1 x nqs. The corresponding score value for each entry in pred must also be retrievable.
I am clueless as to lua/torch syntax, and any help would be greatly appreciated.
nqs=dataset['question']:size(1);
scores=torch.Tensor(nqs,noutput);
qids=torch.LongTensor(nqs);
for i=1,nqs,batch_size do
    xlua.progress(i, nqs)
    r=math.min(i+batch_size-1,nqs);
    scores[{{i,r},{}}],qids[{{i,r}}]=forward(i,r);
end
tmp,pred=torch.max(scores,2);
answer=json_file['ix_to_ans'][tostring(pred[{i,1}])]
print(answer)
Here is my attempt, I demonstrate its behavior using a simple random scores tensor:
> scores=torch.floor(torch.rand(4,10)*100)
> =scores
9 1 90 12 62 1 62 86 46 27
7 4 7 4 71 99 33 48 98 63
82 5 73 84 61 92 81 99 65 9
33 93 64 77 36 68 89 44 19 25
[torch.DoubleTensor of size 4x10]
Now, since you want the N best indexes for each question (row), let's sort each row of the tensor:
> values,indexes=scores:sort(2)
Now, let's look at what the return tensors contain:
> =values
1 1 9 12 27 46 62 62 86 90
4 4 7 7 33 48 63 71 98 99
5 9 61 65 73 81 82 84 92 99
19 25 33 36 44 64 68 77 89 93
[torch.DoubleTensor of size 4x10]
> =indexes
2 6 1 4 10 9 5 7 8 3
2 4 1 3 7 8 10 5 9 6
2 10 5 9 3 7 1 4 6 8
9 10 1 5 8 3 6 4 7 2
[torch.LongTensor of size 4x10]
As you see, the i-th row of values is the sorted version (in increasing order) of the i-th row of scores, and each row in indexes gives you the corresponding indexes.
You can get the N best values/indexes for each question (i.e. row) with
> N_best_indexes=indexes[{{},{indexes:size(2)-N+1,indexes:size(2)}}]
> N_best_values=values[{{},{values:size(2)-N+1,values:size(2)}}]
Let's see their values for the given example, with N=3:
> return N_best_indexes
7 8 3
5 9 6
4 6 8
4 7 2
[torch.LongTensor of size 4x3]
> return N_best_values
62 86 90
71 98 99
84 92 99
77 89 93
[torch.DoubleTensor of size 4x3]
So, the k-th best value for question j is values[{{j},{values:size(2)-k+1}}] (equivalently N_best_values[{{j},{N-k+1}}]), and its corresponding index in the scores matrix is given by this row, column values:
row=j
column=indexes[{{j},{indexes:size(2)-k+1}}] (equivalently N_best_indexes[{{j},{N-k+1}}]).
For example, the first best value (k=1) for the second question is 99, which lies at the 2nd row and 6th column in scores. And you can see that values[{{2},{values:size(2)}}] is 99, and that indexes[{{2},{indexes:size(2)}}] gives you 6, which is the column index in the scores matrix.
Hope that I explained my solution well.
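If it helps to see the sort-then-slice idea without Torch, here is a plain Python sketch over the same example scores. `n_best` is a made-up helper, indexes are kept 1-based to match Lua, and results are listed best first rather than in increasing order. On ties (the two 62s in the first row) it may pick a different column than Torch's sort does.

```python
def n_best(scores, n):
    """Return, per row, the n highest scores as (1-based column, score), best first."""
    result = []
    for row in scores:
        # order the column positions by their score, highest first
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:n]
        result.append([(j + 1, row[j]) for j in order])
    return result


# The random scores tensor from the example above
scores = [
    [9, 1, 90, 12, 62, 1, 62, 86, 46, 27],
    [7, 4, 7, 4, 71, 99, 33, 48, 98, 63],
    [82, 5, 73, 84, 61, 92, 81, 99, 65, 9],
    [33, 93, 64, 77, 36, 68, 89, 44, 19, 25],
]
for question, best in enumerate(n_best(scores, 3), start=1):
    print(question, best)
```

Each pair carries both the column index (which can be fed to the ix_to_ans lookup) and its score, so the score stays retrievable alongside the converted string.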

How do I capture images in OpenCV and saving in pgm format?

I am brand new to programming in general, and am working on a project for which I need to capture images from my webcam (possibly using OpenCV), and save the images as pgm files.
What's the simplest way to do this? Willow Garage provides this code for image capturing:
http://opencv.willowgarage.com/wiki/CameraCapture
Using this code as a base, how might I modify it to:
capture an image from the live cam every 2 seconds
save the images to a folder in pgm format
Thanks so much for any help you can provide!
First of all, please use the newer site, opencv.org. Using outdated references leads to a chain effect: new users see old references, read old docs and post old links again.
There's actually no reason to use the old C API. Instead, you can use the newer C++ interface, which, among other things, handles capturing video gracefully. Here's a shortened version of the example from the docs on VideoCapture:
#include "opencv2/opencv.hpp"
using namespace cv;
int main(int, char**)
{
    VideoCapture cap(0); // open the default camera
    if(!cap.isOpened()) // check if we succeeded
        return -1;
    for(;;)
    {
        Mat frame;
        cap >> frame; // get a new frame from camera
        // do any processing
        imwrite("path/to/image.png", frame);
        if(waitKey(30) >= 0) break; // you can increase delay to 2 seconds here
    }
    // the camera will be deinitialized automatically in VideoCapture destructor
    return 0;
}
Also, if you are new to programming, consider using the Python interface to OpenCV, the cv2 module. Python is often considered simpler than C++, and using it you can play around with OpenCV functions right in an interactive console. Capturing video with cv2 looks something like this (adapted code from here):
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while(True):
    # Capture frame-by-frame
    ret, frame = cap.read()
    # do what you want with frame
    # and then save to file
    cv2.imwrite('path/to/image.png', frame)
    if cv2.waitKey(30) & 0xFF == ord('q'): # you can increase delay to 2 seconds here
        break
# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
Since ffriend's answer is only partially true, I'll add some more to it (in C++). The author of this question asks explicitly for exporting to PGM (PXM file format that stores each pixel in 8 bits) and not PNG (as ffriend describes in his/her reply). The main issue here is that the official documentation for imwrite is IMHO not clear about this matter at all:
For PPM, PGM, or PBM, it can be a binary format flag ( CV_IMWRITE_PXM_BINARY ), 0 or 1. Default value is 1.
If we read the sentence in normal English, we have a list of options: CV_IMWRITE_PXM_BINARY, 0 or 1. There is no mention that those can and actually are supposed to be combined! I had to experiment a little bit (I also needed to store 8-bit images for my project) and finally got to the desired solution:
std::vector<int> compression_params; // Stores the encoding parameters as (flag, value) pairs
compression_params.push_back(CV_IMWRITE_PXM_BINARY); // Flag: the PXM binary-format switch
compression_params.push_back(0); // Value: 0 = plain-text (ASCII) output, i.e. P2 for PGM; 1 = binary
const std::string imageFilename = "myPGM.pgm"; // Some file name; the .pgm extension selects the format
cv::imwrite(imageFilename, myImageMatrix, compression_params); // Write matrix to file
My investigation was also fueled by this question where the author was (maybe still is) struggling with the very same issue and also by some basic information on the PXM format, which you can find here.
The result (only part of the image) is displayed below:
P2
32 100
255
121 237 26 102 88 143 67 224 160 164 238 8 119 195 138 16 176 244 72 106 72 211 168 45 250 161 37 1 96 130 74 8
126 122 227 86 106 120 102 150 185 218 164 232 111 230 207 191 39 222 236 78 137 71 174 96 146 122 117 175 34 245 6 125
124 121 241 67 225 203 118 209 227 168 175 40 90 19 197 190 40 254 68 90 88 242 136 32 123 201 37 35 99 179 198 163
97 161 129 35 48 140 234 237 98 73 105 77 211 234 88 176 152 12 68 93 159 184 238 5 172 252 38 68 200 130 194 216
160 188 21 53 16 79 71 54 124 234 34 92 246 49 0 17 145 102 72 42 105 252 81 63 161 146 81 16 72 104 66 41
189 100 252 37 13 91 71 40 123 246 33 157 67 96 71 59 17 196 96 110 109 116 253 30 42 203 69 53 97 188 90 68
101 36 84 5 41 59 80 8 107 160 168 9 194 8 71 55 152 132 232 102 12 96 213 24 134 208 1 55 64 43 74 22
92 77 30 44 139 96 70 152 160 146 142 8 87 243 11 91 49 196 104 250 72 67 159 44 240 225 69 29 34 115 42 2
109 176 145 90 137 172 65 25 162 57 169 92 214 211 72 94 149 20 104 56 27 67 218 17 203 182 5 124 138 2 130 48
121 225 25 106 89 76 69 189 34 25 173 8 114 83 72 52 145 154 64 40 91 2 251 53 251 237 20 124 82 2 194 42 ...
Which is exactly what is required in this case. You can see the "P2" marking at the top and also the values are clearly from 0 to 255, which is exactly 8 bits per pixel.
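To make the layout of that plain-text file concrete, here is a tiny Python sketch that writes the same P2 structure by hand: the magic number, then width and height, then the maximum grey value, then the pixel rows. It does not use OpenCV at all; `to_ascii_pgm` and the 3x2 test image are made up for illustration.

```python
def to_ascii_pgm(pixels, max_value=255):
    """Serialise a 2-D list of grey levels as plain-text (P2) PGM."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P2", f"{width} {height}", str(max_value)]
    for row in pixels:
        if len(row) != width or any(not 0 <= v <= max_value for v in row):
            raise ValueError("rows must have equal length and values in 0..max_value")
        lines.append(" ".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


image = [[0, 128, 255], [64, 192, 32]]  # made-up 3x2 test image
print(to_ascii_pgm(image), end="")
```

The header order explains the excerpt above: "32 100" means 32 pixels per row and 100 rows, and 255 is the largest value an 8-bit pixel can take.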
I read most of the answers, but none of them satisfied my requirement. Here's how I implemented it.
This program uses the webcam as a camera and takes a picture when you press 'c'. The condition can be changed so that pictures are taken automatically after a certain interval.
# Created on Sun Aug 12 12:29:05 2018
# @author: CodersMine
import cv2
video_path = 0 # 0 internal cam, 1 external webcam
cap = cv2.VideoCapture(video_path)
img_ctr = 0 # To Generate File Names
while(True):
    ret, frame = cap.read()
    cv2.imshow("imshow",frame)
    key = cv2.waitKey(1)
    if key==ord('q'): # Quit
        break
    if key==ord('c'): # Capture
        cv2.imshow("Captured",frame)
        flag = cv2.imwrite(f"image{img_ctr}.png", frame)
        print(f"Image Written {flag}")
        img_ctr += 1
# Release the Camera
cap.release()
cv2.destroyAllWindows()
If you don't need super-accurate 2 seconds, then simply put a sleep(2) or Sleep(2000) in the while(1) loop to wait for 2 seconds before each grab.
Write images with cvSaveImage(); just put the extension .pgm on the filename and it will use pgm.
I believe that the format is chosen from the extension of the filename - so assuming your opencv libs are linked against the appropriate libs you can do something like: (this is from memory, might not be correct.)
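CvCapture* capture = cvCaptureFromCam(0);
IplImage* frame = cvQueryFrame(capture);
while (frame) { // stop when no frame can be grabbed
    cvSaveImage("foo.pgm", frame); // pass the IplImage* itself, not its address
    sleep(2);
    frame = cvQueryFrame(capture);
}
// frames returned by cvQueryFrame belong to the capture; do not release them
cvReleaseCapture(&capture);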

Unexpected behavior of io:fread in Erlang

This is an Erlang question.
I have run into some unexpected behavior by io:fread.
I was wondering if someone could check whether there is something wrong with the way I use io:fread or whether there is a bug in io:fread.
I have a text file which contains a "triangle of numbers" as follows:
59
73 41
52 40 09
26 53 06 34
10 51 87 86 81
61 95 66 57 25 68
90 81 80 38 92 67 73
30 28 51 76 81 18 75 44
...
There is a single space between each pair of numbers and each line ends with a carriage-return new-line pair.
I use the following Erlang program to read this file into a list.
-module(euler67).
-author('Cayle Spandon').
-export([solve/0]).
solve() ->
    {ok, File} = file:open("triangle.txt", [read]),
    Data = read_file(File),
    ok = file:close(File),
    Data.
read_file(File) ->
    read_file(File, []).
read_file(File, Data) ->
    case io:fread(File, "", "~d") of
        {ok, [N]} ->
            read_file(File, [N | Data]);
        eof ->
            lists:reverse(Data)
    end.
The output of this program is:
(erlide@cayle-spandons-computer.local)30> euler67:solve().
[59,73,41,52,40,9,26,53,6,3410,51,87,86,8161,95,66,57,25,
6890,81,80,38,92,67,7330,28,51,76,81|...]
Note how the last number of the fourth line (34) and the first number of the fifth line (10) have been merged into a single number 3410.
When I dump the text file using "od" there is nothing special about those lines; they end with cr-nl just like any other line:
> od -t a triangle.txt
0000000 5 9 cr nl 7 3 sp 4 1 cr nl 5 2 sp 4 0
0000020 sp 0 9 cr nl 2 6 sp 5 3 sp 0 6 sp 3 4
0000040 cr nl 1 0 sp 5 1 sp 8 7 sp 8 6 sp 8 1
0000060 cr nl 6 1 sp 9 5 sp 6 6 sp 5 7 sp 2 5
0000100 sp 6 8 cr nl 9 0 sp 8 1 sp 8 0 sp 3 8
0000120 sp 9 2 sp 6 7 sp 7 3 cr nl 3 0 sp 2 8
0000140 sp 5 1 sp 7 6 sp 8 1 sp 1 8 sp 7 5 sp
0000160 4 4 cr nl 8 4 sp 1 4 sp 9 5 sp 8 7 sp
One interesting observation is that some of the numbers for which the problem occurs happen to be on a 16-byte boundary in the text file (but not all, for example 6890).
I'm going to go with it being a bug in Erlang, too, and a weird one. Changing the format string to "~2s" gives equally weird results:
["59","73","4","15","2","40","0","92","6","53","0","6","34",
"10","5","1","87","8","6","81","61","9","5","66","5","7",
"25","6",
[...]|...]
So it appears that it's counting a newline character as a regular character for the purposes of counting, but not when it comes to producing the output. Loopy as all hell.
A week of Erlang programming, and I'm already delving into the source. That might be a new record for me...
EDIT
A bit more investigation has confirmed for me that this is a bug. Calling one of the internal methods that's used in fread:
> io_lib_fread:fread([], "12 13\n14 15 16\n17 18 19 20\n", "~d").
{done,{ok,"\f"}," 1314 15 16\n17 18 19 20\n"}
Basically, if there's multiple values to be read, then a newline, the first newline gets eaten in the "still to be read" part of the string. Other testing suggests that if you prepend a space it's OK, and if you lead the string with a newline it asks for more.
I'm going to get to the bottom of this, gosh-darn-it... (grin) There's not that much code to go through, and not much of it deals specifically with newlines, so it shouldn't take too long to narrow it down and fix it.
EDIT^2
HA HA! Got the little blighter.
Here's the patch to the stdlib that you want (remember to recompile and drop the new beam file over the top of the old one):
--- ../erlang/erlang-12.b.3-dfsg/lib/stdlib/src/io_lib_fread.erl
+++ ./io_lib_fread.erl
@@ -35,9 +35,9 @@
fread_collect(MoreChars, [], Rest, RestFormat, N, Inputs).
fread_collect([$\r|More], Stack, Rest, RestFormat, N, Inputs) ->
- fread(RestFormat, Rest ++ reverse(Stack), N, Inputs, More);
+ fread(RestFormat, Rest ++ reverse(Stack), N, Inputs, [$\r|More]);
fread_collect([$\n|More], Stack, Rest, RestFormat, N, Inputs) ->
- fread(RestFormat, Rest ++ reverse(Stack), N, Inputs, More);
+ fread(RestFormat, Rest ++ reverse(Stack), N, Inputs, [$\n|More]);
fread_collect([C|More], Stack, Rest, RestFormat, N, Inputs) ->
fread_collect(More, [C|Stack], Rest, RestFormat, N, Inputs);
fread_collect([], Stack, Rest, RestFormat, N, Inputs) ->
@@ -55,8 +55,8 @@
eof ->
fread(RestFormat,eof,N,Inputs,eof);
_ ->
- %% Don't forget to count the newline.
- {more,{More,RestFormat,N+1,Inputs}}
+ %% Don't forget to strip and count the newline.
+ {more,{tl(More),RestFormat,N+1,Inputs}}
end;
Other -> %An error has occurred
{done,Other,More}
Now to submit my patch to erlang-patches, and reap the resulting fame and glory...
Besides the fact that it seems to be a bug in one of the erlang libs I think you could (very) easily circumvent the problem.
Given the fact your file is line-oriented I think best practice is that you process it line-by-line as well.
Consider the following construction. It works nicely on an unpatched erlang and because it uses lazy evaluation it can handle files of arbitrary length without having to read all of it into memory first. The module contains an example of a function to apply to each line - turning a line of text-representations of integers into a list of integers.
-module(liner).
-author("Harro Verkouter").
-export([liner/2, integerize/0, lazyfile/1]).
% Applies a function to all lines of the file
% before reducing (foldl).
liner(File, Fun) ->
lists:foldl(fun(X, Acc) -> Acc++Fun(X) end, [], lazyfile(File)).
% Reads the lines of a file in a lazy fashion
lazyfile(File) ->
{ok, Fd} = file:open(File, [read]),
lazylines(Fd).
% Actually, this one does the lazy read ;)
lazylines(Fd) ->
case io:get_line(Fd, "") of
eof -> file:close(Fd), [];
{error, Reason} ->
file:close(Fd), exit(Reason);
L ->
[L|lazylines(Fd)]
end.
% Take a line of space separated integers (string) and transform
% them into a list of integers
integerize() ->
fun(X) ->
lists:map(fun(Y) -> list_to_integer(Y) end,
string:tokens(X, " \n")) end.
Example usage:
Eshell V5.6.5 (abort with ^G)
1> c(liner).
{ok,liner}
2> liner:liner("triangle.txt", liner:integerize()).
[59,73,41,52,40,9,26,53,6,34,10,51,87,86,81,61,95,66,57,25,
68,90,81,80,38,92,67,73,30|...]
And as a bonus, you can easily fold over the lines of any (lineoriented) file w/o running out of memory :)
6> lists:foldl( fun(X, Acc) ->
6> io:format("~.2w: ~s", [Acc,X]), Acc+1
6> end,
6> 1,
6> liner:lazyfile("triangle.txt")).
1: 59
2: 73 41
3: 52 40 09
4: 26 53 06 34
5: 10 51 87 86 81
6: 61 95 66 57 25 68
7: 90 81 80 38 92 67 73
8: 30 28 51 76 81 18 75 44
Cheers,
h.
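The same line-by-line idea carries over to other languages. As a rough illustration only (not part of the Erlang module above), a Python generator can yield one list of integers per line, so only the current line is held in memory. `read_rows` is a made-up name, and the in-memory stream stands in for triangle.txt with the CR-LF endings described in the question.

```python
import io


def read_rows(stream):
    """Yield one list of integers per non-empty line of an open text stream."""
    for line in stream:
        numbers = line.split()  # split() discards spaces, "\r" and "\n" alike
        if numbers:
            yield [int(n) for n in numbers]


# Stand-in for the first five lines of triangle.txt, with CR-LF line endings
sample = io.StringIO("59\r\n73 41\r\n52 40 09\r\n26 53 06 34\r\n10 51 87 86 81\r\n")
flat = [n for row in read_rows(sample) for n in row]
print(flat)
```

Because every line is tokenised on its own, a number at the end of one line can never be glued to the first number of the next.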
I noticed that there are multiple instances where two numbers are merged, and it appears to be at the line boundaries on every line starting at the fourth line and beyond.
I found that if you add a whitespace character to the beginning of every line starting at the fifth, that is:
59
73 41
52 40 09
26 53 06 34
 10 51 87 86 81
 61 95 66 57 25 68
 90 81 80 38 92 67 73
 30 28 51 76 81 18 75 44
...
The numbers get parsed properly:
39> euler67:solve().
[59,73,41,52,40,9,26,53,6,34,10,51,87,86,81,61,95,66,57,25,
68,90,81,80,38,92,67,73,30|...]
It also works if you add the whitespace to the beginning of the first four lines, as well.
It's more of a workaround than an actual solution, but it works. I'd like to figure out how to set up the format string for io:fread such that we wouldn't have to do this.
UPDATE
Here's a workaround that won't force you to change the file. This assumes that all numbers are two digits (< 100):
read_file(File, Data) ->
    case io:fread(File, "", "~d") of
        {ok, [N]} ->
            if
                N >= 100 ->
                    % N is two numbers merged across a newline, e.g. 3410
                    First = N div 100,
                    Second = N rem 100,
                    % Data is built in reverse, so the later number goes at the head
                    read_file(File, [Second, First | Data]);
                true ->
                    read_file(File, [N | Data])
            end;
        eof ->
            lists:reverse(Data)
    end.
Basically, the code catches any of the numbers which are the concatenation of two across a newline and splits them into two.
Again, it's a kludge that implies a possible bug in io:fread, but that should do it.
UPDATE AGAIN The above will only work for two-digit inputs, but since the example writes all numbers (even those < 10) in a two-digit format, that will work for this example.
