Octave thinks, that image I want to imread() doesn't exist - image-processing

I'm writing a function in Octave to easily add particles on an image, but I have a problem.
function [ out ] = enparticle( mainImg, particleNames, particleData, frames, fpp, sFrame, eFrame )
%particleData format:
% [ p1Xline p1StartHeight p1EndHeight;
% p2Xline p2StartHeight p2EndHeight;
% p3Xline p3StartHeight p3EndHeight;
% ... ]
%particleNames format:
% [ p1Name;
% p2Name;
% p3Name;
% ... ]
pAmount = size(particleData, 1);
for i= 1:pAmount
tmp = particleNames(i,:)
[ pIMG pMAP pALPHA ] = imread( tmp );
end
end
When I run this simple code with
enparticle( "ffield.png", [ "p_dot.png"; "p_star.png"; "p_dot.png" ], [ 100 50 100; 200 50 100; 300 50 100 ], 30, 10, 5, 25 )
I get this written in console
tmp = p_dot.png
error: imread: unable to find file p_dot.png
error: called from
imageIO at line 71 column 7
imread at line 106 column 30
enparticle at line 24 column 23
When I try to imread() file this way, Octave thinks, that there is no file named like this. But it is actually. In the same folder as script file.
The most curious thing is that, when I change
tmp = particleNames(i,:)
to
tmp = particleNames(:,:)
and Octave assigns all names to tmp as array, it magically find all the files with passed names.
But it's not the way I want it to work, because all files will be replaced, or merged, or sth along image processing then.
Why I'm trying to do it that way is corelated with fact, that I want to put every frame (of image and alpha) separately into a cell array later.
I totally don't have any clue, about what I do wrong there and can't google it anywhere also :(

The code:
filenames = [ "p_dot.png"; "p_star.png"; "p_dot.png" ]
does not do what you think it does. This will create a 2 dimensional
array of characters. See
octave> size (filenames)
ans =
3 10
Of interest note is that it lists 10 columns. Take a look at your
filenames and you will notice that their file names are of different
lengths, two have length 9 and one has length 10. But this is just
like a numeric matrix, the only difference is you have ascii
characters. And like a numeric matrix, all rows must have the length.
What is happening, is that the shortest rows get padded with
whitespace. You can confirm this by checking the ascii decimal code
of your filenames:
octave> int8 (filenames)
ans =
112 95 100 111 116 46 112 110 103 32
112 95 115 116 97 114 46 112 110 103
112 95 100 111 116 46 112 110 103 32
Note how the first and third row end in '32'. In ASCII, that's the
space character (see the wikipedia article about
ASCII which has the tables)
So the problem is not that imread does not find a file named
'p_dot.png', the problem is that it does not find a file named
'p_dot.png '.
You should not be using character arrays for this. Instead, use a
cell array. In a cell array of char arrays. Like so:
filenames = {"p_dot.png", "p_start.png", "p_dot.png"}
for i = 1:numel (filenames)
fname = filenames{i};
[pIMG, pMAP, pALPHA] = imread (fname);
## do some stuff
endfor

Related

missing data in time series

As im so new to this field and im trying to explore the data for a time series, and find the missing values and count them and study a distribution of their length and fill in these gaps, the thing is i have, let's say 10 file.txt and for each file i have 2 columns as follows:
C1 C2
944 0
920 1
920 2
928 3
912 7
920 8
920 9
880 10
888 11
920 12
944 13
and so on... lets say till 100 and not necessarily the 10 files have the same number of observations.
so here for example the missing values and not necessarily appears in all files that i have, missing value are: 4,5 and 6 in C2 and the corresponding 1st column C1(measured in milliseconds, so the value of 928ms is not a time neighbor of 912ms). So i want to find those gaps(the total missing values in all 10 files) and show a histogram of their lengths.
i wrote a piece of code in R, but the problem is that i don't get the exact total number that i should have for the missing values.
path = "files path"
out.file<-data.frame(TS = 0, Index = 0, File = '')
file.names <- dir(path, pattern =".txt")
for(i in 1:length(file.names)){
file <- cbind(read.table(file.names[i],
header=F,
sep ="\t",
stringsAsFactors=FALSE),
file.names[i])
colnames(file) <- c('TS', 'Index', 'File')
out.file <- rbind(out.file, file)
}
d = dim(out.file)[1]
misDa = 0
for(i in 2:(d-1)){
if(abs(out.file$Index[i]-out.file$Index[i+1]) > 1)
misDa = misDa+1
}
Hard to give specific hints without having a more extensive example of your data that contains some of the actual NAs.
If you are using R (like it seems) the naniar and the imputeTS packages offer nice functions for missing data visualizations.
Some examples from the naniar package, which is especially good for multivariate data (more plot examples):
Some examples from the imputeTS package, which is especially good for time series data (additional plot examples):

Caffe Error - Data Transformer Check Failed: datum_channels > 0 (0 vs. 0)

If anyone knows what each of the three numbers in this error message means exactly, then I'd definitely appreciate a description, so I can avoid an audit of the data_transformer.cpp code.
Yes, I understand the broad brushstrokes of the error.. a mismatch between the LMDB entries and the Caffe data layer reading the entries. But in order to fix the error, I need to understand what the first zero in the error message means. And the second zero, and the third.
Context
Caffe data layer is reading from an LMDB source
LMDB entries contain custom, non-image data comprising (c x h x w) 1 x 1024 x 300 float values
I0816 20:28:34.749409 103 layer_factory.hpp:77] Creating layer data
I0816 20:28:34.768201 103 db_lmdb.cpp:35] Opened lmdb ../data/emails/inbox
I0816 20:28:34.768442 103 net.cpp:84] Creating Layer data
I0816 20:28:34.768995 103 net.cpp:380] data -> data
I0816 20:28:34.769502 103 net.cpp:380] data -> label
F0816 20:28:34.770326 103 data_transformer.cpp:465] Check failed: datum_channels > 0 (0 vs. 0)
UPDATE
I resorted to a byte-level scrub of the LMDB records being read by Caffe to understand the reason for this error. Below is a sample record:
Key = string value of "00000000_"
Record comprises serialized header and data in row-major form. For ease, all values are in decimal form:
Header
Byte 0 = 8 Byte 1 = 1 (field 1 of type int32 = 1)
Byte 2 = 16 Byte 3 = 128 Byte 4 = 8 (field 2 of type int32 = 1024)
Byte 5 = 24 Byte 6 = 172 Byte 7 = 2 (field 3 of type int32 = 300)
Byte 8 = 40 Byte 9 = 0 (field 5 of type int32 = 0)
Byte 10 = 56 Byte 11 = 0 (field 7 of type boolean = 0)
Data (first three float values among the 300 in the first row)
Byte 12 = 53 Byte 13 = 64 Byte 14 = 164 Byte 15 = 95 Byte 16 = 190
Byte 17 = 53 Byte 18 = 176 Byte 19 = 254 Byte 20 = 143 Byte 21 = 190
Byte 22 = 53 Byte 23 = 24 Byte 24 = 96 Byte 25 = 191 Byte 26 = 62
Field = 53 signifies a fixed length, 32-bit float value in the ensuing four bytes. These values are non-VarInt and in (or should be in) little endian form.
Unless I'm overlooking something completely fundamental, why would Caffe be seeing zero channels in this data stream?
This line in data_transformer.cpp raises your error:
CHECK_GT(datum_channels, 0);
Caffe checks that the Datum read from LMDB is valid. The check that fails you is verifying datum_channels is strictly greater than zero.
Your error implies you have an "empty" Datum: an entry with zero channels.
I suggest you go through your lmdb and verify all entries are valid.
LMDB environments cannot have more than one database, else Caffe will not know what to do, attempt to read the first DB's information and throw an error along the lines mentioned above when the information is not in the expected caffe.proto format.
Because of legacy reasons, my LMDB environment was set up to have named databases, which meant that a separate database containing database names was also present. So even though one DB with data of interest resided in the specified file path, there were actually two DBs. Caffe was reading the first, non-Caffe data DB.
Bottom Line: no named LMDB databases!

CGPDFScanner - \x15 character while scanning

I am trying to extract the text of page 5 in pdf.
The pdf have a font YLJAAA+CMSY10 which has no mappings (CMap) or even encodings (default encoding or /Differences). While extracting text, after string "tetex package" CGPDFScanner returns "\x15" character which is encountered many times. When this character is encountered current font is the above mentioned font which has nothing to extract the text from pdf string.
What is this \x15 character?
Thanks.
I found 2 (not "many") occurrences of this:
[ (\025) ] TJ
which is a number in octal – this is the number that is \x15 in hexadecimal.
The font definition for "YLJAA+CMSY10" in the PDF carries no special encoding, so it has the default encoding for "CMSY" ("Computer Modern Symbol"):
114 0 obj
<<
/Type /Font
/Subtype /Type1
/BaseFont 210 0 R % -> "/YLJAAA+CMSY10"
/FirstChar 0
/FontDescriptor 211 0 R
/LastChar 127
/Widths 204 0 R
>>
211 0 obj
<<
/Ascent 750
/CapHeight 683
/CharSet (/bullet/greaterequal/arrowright/arrowdblright/element/negationslash/backslash/radical)
/Descent 0
/Flags 4
/FontBBox [ -29 -960 1116 775 ]
/FontFile 205 0 R
/FontName 210 0 R % -> '/YLJAAA+CMSY10'
/ItalicAngle -14
/StemV 85
/XHeight 430
>>
endobj
In itself, this still says nothing definitive: a PDF producer may reorder glyphs and encodings at will, as long as it does the same with the embedded font). Assuming the font set is not reordered, checking a random list of CMxx encodings shows that the character code 0x1F could well be GREATER-THAN OR EQUAL TO (Unicode U+2265).
Acrobat agrees; inspecting the font in the PDF shows that character code 21 (decimal) is named 'GREATER-THAN OR EQUAL' and looks like it as well.

xcode : retrieving one line of xcode based on search query

Here is a sample of my CSV
10820 0 0 0 0
10900 2 4 4 4
11000 21 50 54 58
11100 23 54 59 63
11200 25 59 63 68
11300 27 63 68 73
11400 29 68 73 78
11500 31 72 78 83
11600 32 76 82 88
11700 34 81 87 93
I'm looking to create to use xcode to retreive one line of code from this very lengthy CSV based on the first line.
For example:
if the user enters "10900", the second line columns will be returned.
If the user returns 11650, the 11600 line columns will be returned...always taking the lower line when the input value is less then the following line.
Any help would be appreciated. I've seen code to parse an entire CSV file, but I'm thinking this may be a big memory drain, right now my CSV has 2000 lines of values, which are all in ascending order based on the first column.
You have to load a file into memory anyways to find correct value.
With such a big CSV file I would recommend to turn CSV file into binary file (plist file for example) and put it as binary into your application - instead of parsing it each time in RunTime. It has much better performance and it's easier to work with that since you are working directly with NSDictonaries an NSArrays.
If you don't want to do it for some reason, the next solution is to use something like CHCSVParser:
https://github.com/davedelong/CHCSVParser
It provides optimization for loading only part of file at a time - which is the optimization you might be looking for.

Parsing a Large Text in Sections in Matlab

I have a large text file as below imported in MATLAB:
Run Lat Long Time
1 32 32 34
1 23 22 21
2 23 12 11
2 11 11 11
2 33 11 12
up to 10 runs etc.
So I'm trying to break up each section in the file: section 1, section 2, etc and write it to 10 different text files. File 1 will have data from Run 1. File 2 will have data from Run 2.
What you're looking for is Matlab's textread function. I'll give you the pieces you need and frame out the logic, but you'll need to connect the pieces yourself :)
Your read would look something like this
[head1, head2, head3, head4] = textread(file_name,'%s %s %s %s',1);
[run, lat, long, time] = textread(file_name,'%u %u %u %u');
and your write method would use a loop to iterate over the values in
unique(run)
creating a file with
fout = fopen([base_file_name_out num2str(run_number)]);
and writing to it the values contained in
lat_this_run=Lat(run==run_number);
using the method
fprintf(fout,'%u %u %u\n', lat_this_run, long_this_run, time_this_run)
If your data is already loaded into matlab and named A, you could do:
>> a = max(A(:,1));
>> AA={};
>> for i = 1:a
AA{i}=A(find(A(:,1)==i),:)
name=sprintf('%d.txt',i);
dlmwrite(name,AA{i},'\t');
end
The output will be .txt files containing tab-delimited data.

Resources