How can a series of signals (a batch of sequences, not a single sequence) be classified into several groups of labels with DL4J? - deeplearning4j

I have 60 sample sequences of length 200 each. Every sample is labeled in 6 label groups, and each label takes one of 10 values. I'd like to get a prediction for each label group when feeding a 200-step (or even shorter) sample to the network.
I tried to build my own network based on the https://github.com/eclipse/deeplearning4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/recurrent/seqclassification/UCISequenceClassificationExample.java example, which, however, pads the labels. I use no padding for the labels and I get an exception like this:
Exception in thread "main" java.lang.IllegalStateException: Sequence lengths do not match for RnnOutputLayer input and labels:Arrays should be rank 3 with shape [minibatch, size, sequenceLength] - mismatch on dimension 2 (sequence length) - input=[1, 200, 12000] vs. label=[1, 1, 10]

In fact, the labels are required to have the same time dimension as the features, i.e. 200 steps long. So I have to apply some technique, such as zero padding, to all 6 label channels. On the other hand, my input was also wrong: I put all 60*200 values there, whereas it should be [1, 200, 60], while the 6 labels are [1, 200, 10] each.
The question is in which part of the 200-step label sequence I should place the real label value: at [0], at [199], or maybe at the typical parts of the signal each label is associated with? My training runs that should check this are still in progress. What kind of padding is better, zero padding or repeating the label value? It is still not clear, and I can't find a paper explaining what works best.
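To make the shape bookkeeping concrete, here is a minimal NumPy sketch (not DL4J code; the array names are made up) of one sample and one label group laid out as [minibatch, size, sequenceLength], with the real label stored only at the final step plus a per-timestep label mask so the padded steps are ignored:

import numpy as np

seq_len, n_classes, label_value = 200, 10, 3   # label_value is just an example

features = np.random.randn(1, 1, seq_len)      # [minibatch, channels, seqLength]

labels = np.zeros((1, n_classes, seq_len))     # zero padding everywhere...
labels[0, label_value, seq_len - 1] = 1.0      # ...one-hot label at the last step

labels_mask = np.zeros((1, seq_len))           # per-timestep label mask:
labels_mask[0, seq_len - 1] = 1.0              # only the last step counts towards the loss

print(features.shape, labels.shape, labels_mask.shape)

With such a mask the choice between zero padding and repeating the label value should stop mattering, because the masked steps never contribute to the loss; this appears to be what the ALIGN_END alignment mode in the linked example sets up automatically.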

Related

Duplicates in vertical axis in the Thingsboard chart

I am new to Thingsboard.
Today I created a chart to see temperature from a few sensors.
The problem is that I can see a few marks on the vertical axis which are the same:
i.e. two pairs of 28, two pairs of 26, etc.
To me it is quite strange… Maybe it was designed this way and I simply don't understand why…
This is caused by rounding: By default, the Y-Axis doesn't show decimal values.
In this example, I have 4 values: 20.1, 19.7, 20, 20, which are all rounded to 20 on the Y-axis:
If you change the number of decimals displayed in the settings, it works as expected:

What does the "source hidden state" refer to in the Attention Mechanism?

The attention weights are computed as:
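(The formula image is not reproduced here; presumably it is the attention-weight softmax from the TensorFlow NMT tutorial, written here in LaTeX:)

\alpha_{ts} = \frac{\exp(\mathrm{score}(h_t, \bar{h}_s))}{\sum_{s'=1}^{S} \exp(\mathrm{score}(h_t, \bar{h}_{s'}))}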
I want to know what the h_s refers to.
In the tensorflow code, the encoder RNN returns a tuple:
encoder_outputs, encoder_state = tf.nn.dynamic_rnn(...)
As I understand it, h_s should be the encoder_state, but the github/nmt code gives a different answer?
# attention_states: [batch_size, max_time, num_units]
attention_states = tf.transpose(encoder_outputs, [1, 0, 2])
# Create an attention mechanism
attention_mechanism = tf.contrib.seq2seq.LuongAttention(
num_units, attention_states,
memory_sequence_length=source_sequence_length)
Did I misunderstand the code? Or does h_s actually mean the encoder_outputs?
The formula is probably from this post, so I'll use an NN picture from the same post:
Here, the h-bar(s) are all the blue hidden states from the encoder (the last layer), and h(t) is the current red hidden state from the decoder (also the last layer). In the picture t=0, and you can see which blocks are wired to the attention weights with dotted arrows. The score function is usually one of these:
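(The image listing the score functions is not reproduced; the usual Luong-style options are, in LaTeX:)

\mathrm{score}(h_t, \bar{h}_s) = h_t^\top \bar{h}_s \quad \text{(dot)}
\mathrm{score}(h_t, \bar{h}_s) = h_t^\top W_a \bar{h}_s \quad \text{(general)}
\mathrm{score}(h_t, \bar{h}_s) = v_a^\top \tanh(W_a [h_t; \bar{h}_s]) \quad \text{(concat)}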
The Tensorflow attention mechanism matches this picture. In theory, a cell's output is in most cases its hidden state (one exception is the LSTM cell, where the output is the short-term part of the state, and even in this case the output suits the attention mechanism better). In practice, tensorflow's encoder_state is different from encoder_outputs when the input is padded with zeros: the state is propagated from the previous cell state while the output is zero. Obviously, you don't want to attend to trailing zeros, so it makes sense to have zero h-bar(s) for these cells.
So encoder_outputs are exactly the arrows that go from the blue blocks upward. Later in the code, the attention_mechanism is connected to the decoder_cell, so that its output goes through the context vector to the yellow block in the picture.
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
decoder_cell, attention_mechanism,
attention_layer_size=num_units)
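For completeness, a hedged sketch of the next step from the same tutorial (batch_size and dtype are assumed to be defined elsewhere): encoder_outputs serve as the attention memory above, while encoder_state is only used to seed the decoder's initial state.

# encoder_state still gets used, but only to initialize the wrapped decoder cell
decoder_initial_state = decoder_cell.zero_state(batch_size, dtype).clone(
    cell_state=encoder_state)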

What is the block size in HDF5?

Quoting from the HDF5 hyperslab documentation:
The block array determines the size of the element block selected from
the dataspace.
The example shows that setting the parameters as follows:
start offset is specified as [1,1], stride is [4,4], count is [3,7], and block is [2,2]
will result in 21 2x2 blocks, where the selections will be at (1,1), (5,1), (9,1), (1,5), (5,5). I can understand that: because the starting point is (1,1), the selection starts at that point; since the stride is (4,4), it moves 4 in each dimension; and since the count is (3,7), it increments 3 times by 4 in direction X and 7 times by 4 in direction Y, i.e. in its corresponding dimension.
But what I don't understand is what the block size is doing. Does it mean that I will get 21 2x2 blocks? That would mean each block contains 4 elements, but the count is already set to 3 in one dimension, so how is that possible?
A hyperslab selection created through H5Sselect_hyperslab() lets you create a region defined by a repeating block of elements.
This is described in section 7.4.2.2 of the HDF5 user's guide found here (scroll down a bit to 7.4.2.2). The H5Sselect_hyperslab() reference manual entry might also be helpful.
Here is a diagram from the UG:
And here are the values used in that figure:
offset = (0,1)
stride = (4,3)
count = (2,4)
block = (3,2)
Notice how the repeating unit is a 3x2 element block. So yes, you will get 21 2x2 blocks in your case. There will be a grid of three blocks in one dimension and seven in the other, each spaced 4 elements apart in each direction. The first block will be offset by 1,1.
The most confusing thing about this API call is that three of the parameters have elements as their units, while count has blocks as its unit.
Edit: Perhaps this will make how block and count are used more obvious...
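(The original illustration is not reproduced. As a stand-in, here is a small NumPy sketch, with a made-up dataspace size, that marks every element covered by the offset/stride/count/block values from the figure; it makes visible that count is measured in blocks while offset, stride and block are measured in elements:)

import numpy as np

# Parameters from the figure above
offset, stride, count, block = (0, 1), (4, 3), (2, 4), (3, 2)

shape = (8, 14)                        # made-up dataspace size, large enough for the selection
selected = np.zeros(shape, dtype=int)

for i in range(count[0]):              # count is in units of blocks...
    for j in range(count[1]):
        r = offset[0] + i * stride[0]  # ...offset and stride are in elements
        c = offset[1] + j * stride[1]
        selected[r:r + block[0], c:c + block[1]] = 1   # each block is block[0] x block[1] elements

print(selected)                        # a 2 x 4 grid of 3x2 blocks of ones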
HDFS's default block size is 64 MB, which can be increased according to our requirements. One mapper processes one block at a time.

spatial set operations in Apache Spark

Has anyone been able to do spatial operations with #ApacheSpark, e.g. the intersection of two sets that contain line segments?
I would like to intersect two sets of lines.
Here is a 1-dimensional example:
The two sets are:
A = {(1,4), (5,9), (10,17),(18,20)}
B = {(2,5), (6,9), (10,15),(16,20)}
The result intersection would be:
intersection(A,B) = {(1,1), (2,4), (5,5), (6,9), (10,15), (16,17), (18,20)}
A few more details:
- sets have ~3 million items
- the lines in a set cover the entire range
Thanks.
One approach to parallelize this would be to create a grid of some size, and group line segments by the grids they belong to.
So for a grid with cell size n, you could flatMap pairs of coordinates (the endpoints of line segments) to create (gridId, ((x,y), (x,y))) key-value pairs.
The segment (1,3), (5,9) would be mapped to ((1,1), ((1,3),(5,9))) for a grid size of 10 - that line segment only exists in grid "slot" 1,1 (the cell from 0-10,0-10). If you chose a smaller grid size, the line segment would be flatMapped to multiple key-value pairs, one for each grid slot it belongs to.
Having done that, you can groupByKey, and for each group, calculate intersections as normal.
It wouldn't exactly be the most efficient way of doing things, especially if you've got long line segments spanning multiple grid "slots", but it's a simple way of splitting the problem into subproblems that'll fit in memory.
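A rough PySpark sketch of that grouping idea, under stated assumptions (2-D segments, a fixed cell size, and a coarse cells_for() helper that lists every grid cell a segment's bounding box touches; all names and the toy data are invented):

from pyspark import SparkContext

CELL = 10  # grid cell size

def cells_for(seg):
    # every grid cell the segment's bounding box touches (coarse but safe)
    (x1, y1), (x2, y2) = seg
    xs = range(int(min(x1, x2)) // CELL, int(max(x1, x2)) // CELL + 1)
    ys = range(int(min(y1, y2)) // CELL, int(max(y1, y2)) // CELL + 1)
    return [(cx, cy) for cx in xs for cy in ys]

sc = SparkContext.getOrCreate()
a = sc.parallelize([(((1, 3), (5, 9)), 'A')])   # toy data: (segment, set label)
b = sc.parallelize([(((2, 2), (6, 8)), 'B')])

keyed = (a.union(b)
          .flatMap(lambda rec: [(cell, rec) for cell in cells_for(rec[0])]))

# group segments that share a grid cell; each group is small enough to intersect locally
per_cell = keyed.groupByKey().mapValues(list)
print(per_cell.collect())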
You could solve this with a full cartesian join of the two RDDs, but this would become incredibly slow at large scale. If your problem is smallish, sure, this is an easy and cheap approach. Just emit the overlap, if any, between every pair in the join.
To do better, I imagine that you can solve this by sorting the sets by start point, and then walking through both at the same time, matching one's current interval against the other's and emitting overlaps. Details left to the reader.
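For the 1-D intervals in the question, the walk over both sorted lists that is left to the reader could look like this plain-Python sketch (not distributed; the function name is made up). It emits the pairwise overlaps, so its output lacks the (1,1) entry from the expected result in the question, which looks like a typo there:

def intersect_sorted(a, b):
    # a and b are lists of (start, end) intervals, each sorted by start
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])   # overlap, if any, of the current pair
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        if a[i][1] < b[j][1]:        # advance whichever interval ends first
            i += 1
        else:
            j += 1
    return out

A = [(1, 4), (5, 9), (10, 17), (18, 20)]
B = [(2, 5), (6, 9), (10, 15), (16, 20)]
print(intersect_sorted(A, B))
# [(2, 4), (5, 5), (6, 9), (10, 15), (16, 17), (18, 20)]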
You can almost solve this by first mapping each tuple (x,y) in A to something like ((x,y),'A'), and the same for B, then taking the union and applying sortBy on the x values. Then you can mapPartitions to process a stream of labeled segments and implement your algorithm.
This doesn't quite work though since you would miss overlaps between values at the ends of partitions. I can't think of a good simple way to take care of that off the top of my head.

How to detect if a frame is odd or even on an interlaced image?

I have a device that is taking TV screenshots at precise times (it doesn't take incomplete frames).
Still, each screenshot is an interlaced image made from two different original frames.
Now, the question is if/how it is possible to identify which of the lines are newer/older.
I have to mention that I can take several sequential screenshots if needed.
Take two screenshots one after another, yielding a sequence of two images (1,2). Split each screenshot into two fields (odd and even) and treat each field as a separate image. If you assume that the images are interlaced consistently (pretty safe assumption, otherwise they would look horrible), then there are two possibilities: (1e, 1o, 2e, 2o) or (1o, 1e, 2o, 2e). So at the moment it's 50-50.
What you could then do is use optical flow to improve your chances. Say you go with the first option: (1e, 1o, 2e, 2o). Calculate the optical flow f1 between (1e, 2e). Then calculate the flow f2 between (1e, 1o) and f3 between (1o, 2e). If f1 is approximately the same as f2 + f3, then things are moving in the right direction and you've picked the right arrangement. Otherwise, try the other arrangement.
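A rough OpenCV sketch of that check, assuming the fields are available as equal-sized 8-bit grayscale arrays (all names are invented); it simply compares summed flow fields rather than composing them properly:

import cv2
import numpy as np

def flow(a, b):
    # dense optical flow from field a to field b (both 8-bit grayscale, same size)
    return cv2.calcOpticalFlowFarneback(a, b, None, 0.5, 3, 15, 3, 5, 1.2, 0)

def arrangement_error(e1, o1, e2):
    # how far flow(e1 -> e2) is from flow(e1 -> o1) + flow(o1 -> e2)
    f1 = flow(e1, e2)
    f2 = flow(e1, o1)
    f3 = flow(o1, e2)
    return float(np.mean(np.abs(f1 - (f2 + f3))))

# Pick whichever ordering gives the smaller error:
# err_a = arrangement_error(field_1e, field_1o, field_2e)   # hypothesis (1e, 1o, 2e, 2o)
# err_b = arrangement_error(field_1o, field_1e, field_2o)   # hypothesis (1o, 1e, 2o, 2e)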
Optical flow is a pretty general approach and can be difficult to compute for the entire image. If you want to do things in a hurry, replace optical flow with video tracking.
EDIT
I've been playing around with some code that can do this cheaply. I've noticed that if 3 fields are consecutive and in the correct order, the absolute error due to smooth, constant motion will be minimized. On the contrary, if they are out of order (or not consecutive), this error will be greater. So one way to do this is to take groups of 3 fields, check the error for each of the two orderings described above, and go with the ordering that yields the lower error.
I've only got a handful of interlaced videos here to test with, but it seems to work. The only downside is that it's not very effective unless there is substantial smooth motion, or when the number of frames used is low (less than 20-30).
Here's an interlaced frame:
Here's some sample output from my method (same frame):
The top image is the odd-numbered rows. The bottom image is the even-numbered rows. The number in the brackets is the number of times that image was picked as the most recent. The number to the right of that is the error. The odd rows are labeled as the most recent in this case because the error is lower than for the even-numbered rows. You can see that out of 100 frames, it (correctly) judged the odd-numbered rows to be the most recent 80 times.
You have several fields: F1, F2, F3, F4, etc. Weave F1-F2 for the hypothesis that F1 is an even field. Weave F2-F3 for the hypothesis that F2 is an even field. Now measure the amount of combing in each frame. Assuming that there is motion, there will be some combing with the correct interlacing but more combing with the wrong interlacing. You will have to do this at several points in time in order to find some fields where there is motion.
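A minimal NumPy sketch of that combing measure, assuming the fields are equal-sized grayscale arrays (all names are invented): weave the two candidate pairs and compare how much adjacent rows disagree.

import numpy as np

def weave(top_field, bottom_field):
    # interleave two fields row by row into a full frame (top field on the even rows)
    h, w = top_field.shape
    frame = np.empty((2 * h, w), dtype=np.float64)
    frame[0::2] = top_field
    frame[1::2] = bottom_field
    return frame

def combing(frame):
    # sum of absolute differences between adjacent rows; more combing -> larger value
    return float(np.abs(np.diff(frame, axis=0)).sum())

def f1_is_even(f1, f2, f3):
    # True if weaving F1-F2 (F1 as the even/top field) combs less than weaving F2-F3 that way
    return combing(weave(f1, f2)) < combing(weave(f2, f3))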
