I can't find information on the accuracy of different error detection techniques. Say I want to be able to correct 1, 2, or 3 bit errors in a 32-bit word; then I can use a modified Hamming code with 7 redundant bits.
But what about other coding techniques? I can't find any data on them. For example, which CRC polynomials will be able to detect 1 to 3 bit errors, and how many redundant bits will that require? What about other techniques?
Each n-bit CRC will detect every n-bit burst error.
Otherwise, an n-bit CRC will detect an arbitrary error with a probability of 1 − 2^−n.
For example, CRC-32 will detect every error where there are no more than 30 bits between the first flipped bit and the last flipped bit.
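For a concrete check, here is a minimal Python sketch (my own illustration, using the standard library's zlib.crc32 rather than a hand-rolled polynomial) showing that a burst confined to a 32-bit window always changes the CRC-32:

    import random
    import zlib

    def flip_bits(data: bytes, bit_positions) -> bytes:
        buf = bytearray(data)
        for pos in bit_positions:
            buf[pos // 8] ^= 1 << (pos % 8)
        return bytes(buf)

    message = b"example payload for CRC checking"
    checksum = zlib.crc32(message)

    # Corrupt the message with a burst spanning exactly 32 bits.
    start = random.randrange(len(message) * 8 - 32)
    corrupted = flip_bits(message, [start, start + 7, start + 31])

    # The receiver recomputes the CRC and compares.
    print("error detected:", zlib.crc32(corrupted) != checksum)  # True for any burst <= 32 bits

For wider error patterns, detection is only probabilistic, as stated above.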
I am working on an LSTM to generate music. My input data will be a BooleanTensor of size 88xLx3, 88 being the number of available notes, L being the length of each "piece", which will be on the order of 1k - 10k (TBD), and 3 being the parts for "lead melody", "accompaniment", and "bass". A value of 0 would indicate that a given note is not being played by that part (instrument) at that time, and a 1 would indicate that it is.
The problem is that each entry of a BooleanTensor takes 1 byte of space in memory instead of 1 bit, which wastes a lot of valuable GPU memory.
As a solution I thought of packing each BooleanTensor into a ByteTensor (uint8) of size 11xLx3 or 88x(L/8)x3.
My question is: Would packing the data as such have an effect on the learning and generation of the LSTM or would the ByteTensor-based data and model be equivalent to their BooleanTensor-based counterparts in practice?
I wouldn't really worry about the input taking X instead of Y bits, at least when it comes to GPU memory. Most of it is occupied by the network's weights and intermediate outputs, which will likely be float32 anyway (maybe float16). There is active research on training with lower precision (even binary training), but based on your question, it seems completely unnecessary. Lastly, you can always apply quantization to your production models, if you really need it.
With regard to the packing: it can have an impact, especially if you do it naively. The grouping you're suggesting doesn't seem to be a natural one, so it may be harder to learn patterns from the grouped data than otherwise. There will always be workarounds, but at that point this answer becomes an opinion, because it is almost impossible to anticipate what could work; and opinion-based questions/answers are off-topic around here :)
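As an illustration of one such workaround (my own sketch, not the asker's code, and using NumPy's packbits since PyTorch has no bit-packing primitive): pack the piano roll for storage, and unpack it back to 0/1 values just before feeding a batch to the network, so the model never sees the unnatural byte grouping.

    import numpy as np
    import torch

    L = 1000                                  # hypothetical piece length
    roll = np.random.randint(0, 2, size=(88, L, 3), dtype=np.uint8)  # 0/1 piano roll

    packed = np.packbits(roll, axis=0)        # shape (11, L, 3), 1/8 the memory
    restored = np.unpackbits(packed, axis=0)  # shape (88, L, 3), lossless round trip
    assert np.array_equal(roll, restored)

    # The LSTM still consumes the unpacked 0/1 values, typically as floats:
    batch = torch.from_numpy(restored).float()
    print(packed.nbytes, roll.nbytes)         # packed is ~1/8 of the original footprint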
I would like to understand whether using fixed point Q31 is better than floating-point (single precision) for DSP applications where accuracy is important.
More details: I am currently working with an ARM Cortex-M7 microcontroller and I need to perform an FFT with high accuracy using the CMSIS library. I understand that single precision has 24 bits of mantissa while Q31 has 31 bits, so the precision of Q31 should be better, but I read everywhere that for algorithms that require multiplication and so on, the floating-point representation should be used, and I do not understand why.
Thanks in advance.
Getting maximum value out of fixed point (that extra 6 or 7 bits of mantissa accuracy), as well as avoiding a ton of possible underflow and overflow problems, requires knowing precisely the bounds (min and max) of every arithmetic operation in your CMSIS algorithms for every valid set of input data.
In practice, a complete error analysis turns out to be difficult, and the added operations needed to rescale all intermediate values to optimal ranges reduce performance so much, that only a narrow set of cases seems worth the effort over using either IEEE single or double precision, which the M7 supports in hardware, and whose floating-point exponent range hides an enormous amount (but not all!) of intermediate-result numerical scaling issues.
But for some simpler DSP algorithms, analyzing and fixing the scaling isn't a problem. It's hard to tell which without working through the numeric range of every arithmetic operation in the algorithm you need. Sometimes the work required to use integer arithmetic has to be done anyway, because the available processor doesn't support floating-point arithmetic well, or at all.
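To make the scaling issue concrete, here is a rough Python simulation of Q31 arithmetic (my own sketch, not CMSIS code), showing why every intermediate result has to stay inside [-1, 1):

    Q31_ONE = 1 << 31

    def to_q31(x: float) -> int:
        return int(round(x * Q31_ONE))

    def from_q31(q: int) -> float:
        return q / Q31_ONE

    def q31_mul(a: int, b: int) -> int:
        # Full 62-bit product renormalized back to Q31; this is where the
        # extra precision over float32 lives, and where rounding enters.
        return (a * b) >> 31

    def q31_add_sat(a: int, b: int) -> int:
        # Addition can overflow the Q31 range; real fixed-point code must
        # either saturate (as here) or pre-scale its inputs.
        return max(-Q31_ONE, min(Q31_ONE - 1, a + b))

    a, b = to_q31(0.9), to_q31(0.8)
    print(from_q31(q31_mul(a, b)))      # ~0.72: the product stays in range
    print(from_q31(q31_add_sat(a, b)))  # saturates near 1.0 instead of giving 1.7

Floating point would simply return 1.7 here; with Q31 you have to know in advance that the sum can exceed 1 and scale the inputs down accordingly, which is exactly the per-operation bounds analysis described above.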
I have 100,000 points that I would like to cluster using the OPTICS algorithm in ELKI. I have an upper triangular distance matrix of about 5 billion entries for this point set. In the format that ELKI wants the matrix, it will take about 100 GB in memory. I am wondering whether ELKI can handle that sort of data load? Can anyone confirm that you have made this work before?
I frequently use ELKI with 100k points, up to 10 million.
However, for this to be fast you should use indexes.
For obvious reasons, any dense matrix-based approach will scale at best O(n^2) and need O(n^2) memory, which is why I cannot process these data sets with R, Weka, or scipy. They usually try to compute the full distance matrix first, and either fail halfway through, run out of memory halfway through, or fail with a negative allocation size (Weka does this when your data set overflows the 2^31 positive integers, i.e. at around 46k objects).
In the binary format, with float precision, the ELKI matrix should be around 100000*99999/2*4 + 4 bytes, maybe plus another 4 bytes for size information. This is 20 GB.
If you use the "easy to use" ascii format, then it will indeed be more. But if you use gzip compression it may end up being about the same size. It's common to have gzip compress such data to 10-20% of the raw size. In my experience gzip compressed ascii can be as small as binary encoded doubles.
The main benefit of the binary format is that it will actually reside on disk, and memory caching will be handled by your operating system.
Either way, I recommend not computing distance matrices at all in the first place.
Because if you decide to go from 100k to 1 million points, the raw matrix would grow to 2 TB, and at 10 million it would be 200 TB. If you want double precision, double that.
If you are using distance matrices, your method will be at best O(n^2), and thus will not scale. Avoiding the computation of all pairwise distances in the first place is an important speed factor.
I use indexes for everything. For kNN or radius-bound approaches (for OPTICS, use the epsilon parameter to make indexes effective! Choose a low epsilon!) you can precompute these queries once, if you are going to need them repeatedly.
On a data set I frequently use, with 75k instances and 27 dimensions, the file storing the precomputed 101 nearest neighbors plus ties, at double precision, is 81 MB (note: this can be seen as a sparse similarity matrix). By using an index to precompute this cache, it takes just a few minutes to build; and then I can run most kNN-based algorithms such as LOF on this 75k data set in 108 ms (plus 262 ms for loading the kNN cache and 2364 ms for parsing the raw input data, for a total runtime of about 3 seconds, dominated by parsing the double values).
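This is not ELKI, but the same idea in a hedged scikit-learn sketch (data and parameters are placeholders): OPTICS with a finite max_eps bounds each neighborhood query, so no full n x n distance matrix is ever materialized.

    import numpy as np
    from sklearn.cluster import OPTICS
    from sklearn.datasets import make_blobs

    # Placeholder data standing in for the 100k points of the question.
    X, _ = make_blobs(n_samples=100_000, centers=10, n_features=5, random_state=0)

    # A low, finite epsilon keeps the range queries (and the runtime) small;
    # with max_eps=np.inf the work degenerates toward O(n^2) pairwise distances.
    optics = OPTICS(min_samples=20, max_eps=2.0, metric="euclidean")
    labels = optics.fit_predict(X)
    print(len(np.unique(labels)))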
I'm writing a sliding window to extract features and feed it into CvSVM's predict function.
However, what I've stumbled upon is that the svm.predict function is relatively slow.
Basically the window slides through the image with a fixed stride length, over a number of image scales.
The speed of traversing the image plus extracting features for each window takes around 1000 ms (1 sec).
Inclusion of weak classifiers trained by AdaBoost resulted in around 1200 ms (1.2 secs).
However, when I pass the features (which have been marked as positive by the weak classifiers) to the svm.predict function, the overall speed slowed down to around 16000 ms (16 secs).
Trying to collect all 'positive' features first, before passing them to svm.predict utilizing TBB's threads, resulted in 19000 ms (19 secs), probably due to the overhead needed to create the threads, etc.
My OpenCV build was compiled to include both TBB (threading) and OpenCL (GPU) functions.
Has anyone managed to speed up OpenCV's SVM.predict function ?
I've been stuck on this issue for quite some time, since it's frustrating to run this detection algorithm through my test data for statistics and threshold adjustment.
Thanks a lot for reading through this!
(Answer posted to formalize my comments, above:)
The prediction algorithm for a kernel SVM takes O(nSV * f) time, where nSV is the number of support vectors and f is the number of features. The number of support vectors can be reduced by penalizing margin violations more heavily, i.e. by increasing the hyperparameter C (possibly at a cost in predictive accuracy).
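A hedged scikit-learn sketch (synthetic data, not the asker's OpenCV setup) showing how the support-vector count, and hence the prediction cost, shrinks as C grows:

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

    for C in (0.01, 1.0, 100.0):
        clf = SVC(kernel="rbf", C=C, gamma="scale").fit(X, y)
        # Prediction time is roughly proportional to this count times the feature size.
        print(f"C={C:>6}: support vectors = {clf.n_support_.sum()}")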
I'm not sure what features you are extracting, but from the size of your feature vector (3780) I would say you are extracting HOG. There is a very robust, optimized, and fast way of doing HOG "prediction" in the cv::HOGDescriptor class. All you need to do is:
extract your HOGs for training
put them in the svmLight format
use svmLight linear kernel to train a model
calculate the 3780 + 1 dimensional vector necessary for prediction
feed the vector to the setSVMDetector() method of a cv::HOGDescriptor object
use detect() or detectMultiScale() methods for detection
The following document has very good information about how to achieve what you are trying to do: http://opencv.willowgarage.com/wiki/trainHOG although I must warn you that there is a small problem in the original program, but it teaches you how to approach this problem properly.
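A hedged cv2 (Python) sketch of the last two steps; the detector file and image names are placeholders, and the weight vector is assumed to be the 3780 + 1 values exported from the svmLight model:

    import cv2
    import numpy as np

    # 3780 HOG weights + 1 bias term, as produced by a linear svmLight model.
    detector = np.loadtxt("linear_svm_weights.txt", dtype=np.float32)  # shape (3781,)

    hog = cv2.HOGDescriptor()          # default 64x128 window -> 3780-dim descriptor
    hog.setSVMDetector(detector)

    image = cv2.imread("test_frame.png")
    rects, weights = hog.detectMultiScale(image, winStride=(8, 8), scale=1.05)
    for (x, y, w, h) in rects:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

This path avoids calling svm.predict per window entirely, which is where most of the reported 16 seconds is spent.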
As Fred Foo has already mentioned, you have to reduce the number of support vectors. From my experience, 5-10% of the training base is enough to have a good level of prediction.
Other ways to make it faster:
Reduce the size of the feature vector. 3780 is way too much. I'm not sure what a feature of this size describes in your case, but in my experience, for example, a description of an image such as an automobile logo can effectively be packed into a size of 150-200:
PCA can be used to reduce the size of the feature vector as well as to reduce its "noise". There are examples of how it can be used with SVM;
if that doesn't help, try other principles of image description, for example LBP and/or LBP histograms;
LDA (alone or with SVM) can also be used.
Try a linear SVM first. It is much faster, and your feature size of 3780 dimensions gives more than enough "space" for good separation in higher dimensions, if your sets are linearly separable in principle. If that is not good enough, try an RBF kernel with a fairly standard setup like C = 1 and gamma = 0.1. Only after that try POLY, the slowest one.
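A hedged scikit-learn sketch of the two suggestions above (placeholder data, not the asker's features): shrink the 3780-dimensional vectors with PCA, then try a linear SVM first.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Placeholder training set: rows stand in for 3780-dimensional HOG vectors.
    X = np.random.rand(500, 3780).astype(np.float32)
    y = np.random.randint(0, 2, size=500)

    model = make_pipeline(PCA(n_components=200), LinearSVC(C=1.0))
    model.fit(X, y)
    print(model.predict(X[:5]))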
I am working on handwriting recognition and related things on the Visual Studio platform, using the OpenCV libraries. The input is in the form of binary scanned .tif images.
I have currently hit a roadblock trying to figure out a way to recognize struck-out words, i.e. words that have been struck out (cancelled) with a straight or curved line. I am not going to do individual character recognition, because that would be a waste of computation power.
Is there any way to recognize such occurrences in an alternate way?
Following are two ideas I've come up with, but I am not sure about them:
1> Use a mask like <0 0 0, 1 1 1, 0 0 0> that will help find all horizontal lines... but this rests on a very big assumption: the lines can be wavy and in any orientation.
2> Skeletonize the input and look for intersections. This will give me quite a few intersections, including those that occur due to the line used to strike out the word. Using some approximation like least squares etc., I can get an approximate line. But there is the problem that intersections can occur in many places, e.g. 2 intersections in 'b', etc.
Any suggestions?
Have you considered using the Hough transform to detect the strike lines?
An illustration of the Hough transform applied to handwriting gives a good intuition for the approach: the strike-out stroke shows up as a long, dominant line among the many short segments that make up the characters.
You can quickly test it with OpenCV. The function is called cvHoughLines2.
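cvHoughLines2 is the old C API; a hedged sketch with the modern Python bindings (file name and thresholds are placeholders) would look like this:

    import cv2
    import numpy as np

    img = cv2.imread("scanned_word.tif", cv2.IMREAD_GRAYSCALE)
    _, ink = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Probabilistic Hough transform: a large minLineLength favours long strike-out
    # strokes over the short segments that make up individual characters.
    lines = cv2.HoughLinesP(ink, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=5)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(img, (x1, y1), (x2, y2), 128, 2)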
Why not process contours? You could take advantage of poly (Teh-Chin) approximation and analyze only the few vectors resulting from the chain reconstruction. If you want to do more, then use a mixed pyramid/contour scheme, in order to get vector approximations at different levels of detail, starting from a rough resolution up to the finest.
Stop the refinement when you get a "reasonable" number of unique segments, apply normalization (see moments, in particular Hu's moments) to make a fingerprint of your sample, and finally adopt a strong classification system.
I suggest you look at the ML (Machine Learning) part of the OpenCV suite for a better reference on this latter part. For raster data, Haar wavelets + hidden Markov models work well; for vectors, maybe you could use something easier to set up (SOM, kNN, k-means).
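A hedged cv2 (Python, OpenCV 4-style findContours) sketch of the first step, with placeholder file name and epsilon:

    import cv2

    img = cv2.imread("scanned_word.tif", cv2.IMREAD_GRAYSCALE)
    _, ink = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # CHAIN_APPROX_TC89_L1 is OpenCV's Teh-Chin chain approximation.
    contours, _ = cv2.findContours(ink, cv2.RETR_LIST, cv2.CHAIN_APPROX_TC89_L1)
    for contour in contours:
        # A coarser epsilon gives fewer segments, i.e. a lower level of detail.
        epsilon = 0.01 * cv2.arcLength(contour, True)
        poly = cv2.approxPolyDP(contour, epsilon, True)
        print(len(poly), "segments")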
I would go with the individual character recognition. It may be a waste of computing power, but it could give the best results. Just find a way to get a value from the character recognition that indicates how confidently the character was recognized, then find a threshold for things that aren't characters. I think the cancelling will corrupt the character in a way that gives the recognizer trouble, and maybe you can use this fact to find the cancelled characters. To improve the results, look for many badly recognized characters in the same region of the text; often whole words are cancelled, so the bad recognition results will cluster.
If your performance is very bad in the end you can always come back and improve the algorithm later on.