How to create a confusion matrix without any packages - machine-learning

How do I create a confusion matrix without any packages? I want to be able to see the logic of creating one. This can be any language or pseudo-code.

Technically, a confusion matrix is just a regular matrix.
Just compute the intersection sizes and then label the rows and columns as desired.
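To make the "intersection sizes" idea concrete, here is a minimal sketch in plain Python with no packages. The function name confusion_matrix and the sample data are made up for illustration; rows are the actual labels and columns are the predicted labels, which is only one common convention.

```python
def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs; rows are actual labels, columns are predicted."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Every label seen in either list gets one row and one column.
    labels = sorted(set(actual) | set(predicted))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    # Each cell is the size of the intersection "actual == a and predicted == p".
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return labels, matrix


# Made-up example data (assumption, not from the question).
actual = [1, 0, 1, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1]
labels, matrix = confusion_matrix(actual, predicted)
for label, row in zip(labels, matrix):
    print(label, row)
```

The same function works unchanged for more than two classes, because the labels are discovered from the data rather than hard-coded.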

One option (maybe not the best in performance, but great for understanding the concept) is this one, which was the first one I implemented:
true_positives = 0
true_negatives = 0
false_negatives = 0
false_positives = 0
# predictions and real_values are equal-length sequences of 0/1 labels
for i in range(len(predictions)):
    if predictions[i] == 1 and real_values[i] == 1:
        true_positives += 1
    if predictions[i] == 0 and real_values[i] == 0:
        true_negatives += 1
    if predictions[i] == 0 and real_values[i] == 1:
        false_negatives += 1
    if predictions[i] == 1 and real_values[i] == 0:
        false_positives += 1


How to use autoDiffToGradientMatrix to solve for Coriolis Matrix in drake?

I am trying to get the Coriolis matrix for my robot (need the matrix explicitly for the controller) based on the following approach which I have found online:
plant_.CalcBiasTerm(*context, &Cv_);
auto jac = autoDiffToGradientMatrix(Cv_);
C = 0.5*jac.rightCols(n_v_);
where Cv_, plant_, context are AutoDiffXd and n_v_ is the number of generalized velocities. So basically I have a 62-joint robot loaded from URDF into drake which is a free body (floating base system). After finalizing the robot I am using the DiagramBuilder.Build() method and then the CreateDefaultContext() in order to get the context. Next, I am trying to set up the AutoDiff environment like this:
plant_autodiff = drake::systems::System<double>::ToAutoDiffXd(*multibody_plant);
context_autodiff = plant_autodiff->CreateDefaultContext();
context_autodiff->SetTimeStateAndParametersFrom(*diagram_context);
The code above is contained in an initialization setup code. In another method, which is called on update events, the following lines of code are written:
drake::AutoDiffVecXd c_auto_diff_ = drake::AutoDiffVecXd::Zero(62);
plant_autodiff->CalcBiasTerm(*context_autodiff, &c_auto_diff_);
MatrixXd jac = drake::math::autoDiffToGradientMatrix(c_auto_diff_);
auto C = 0.5*jac.rightCols(jac.size());
This setup compiles and runs, however the size of the jac matrix is 0, whereas I would expect 62x62. I am also extracting and then exposing the Coriolis vector, which is 62x1 and seems to be more or less correct. The c_auto_diff_ variable is 62x1 as well, but all the elements are 0.
I am clearly making a mistake, but I do not know where exactly.
Any help is appreciated,
Thank you all,
Robert
You are close. You need to tell the autodiff pipeline what you want to take the derivative with respect to. In this case, I believe you want
auto v = drake::math::initializeAutoDiff(Eigen::VectorXd::Zero(62));
plant_autodiff->SetVelocities(context_autodiff.get(), v);
By calling initializeAutoDiff, you are initializing the autodiff terms to the identity matrix, which is saying that you want to take the derivative with respect to v. Then you should get non-zero derivatives.
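To illustrate why the seeding matters, here is a toy forward-mode autodiff sketch in Python. It is not Drake's API: the Dual class, seed, and bias_like are hypothetical names, and bias_like is just a made-up velocity-dependent function. The point it shows is that an input whose derivative vector is empty produces an empty gradient, while inputs seeded with rows of the identity matrix produce a full Jacobian.

```python
class Dual:
    """A value together with its partial derivatives w.r.t. the chosen inputs."""

    def __init__(self, value, grad):
        self.value = value
        self.grad = list(grad)

    def __add__(self, other):
        return Dual(self.value + other.value,
                    [a + b for a, b in zip(self.grad, other.grad)])

    def __mul__(self, other):
        # Product rule: d(x*y) = x*dy + y*dx
        return Dual(self.value * other.value,
                    [self.value * b + other.value * a
                     for a, b in zip(self.grad, other.grad)])


def seed(values):
    """Seed input i with row i of the identity matrix (derivative w.r.t. itself)."""
    n = len(values)
    return [Dual(v, [1.0 if i == j else 0.0 for j in range(n)])
            for i, v in enumerate(values)]


def bias_like(v):
    """Hypothetical stand-in for a velocity-dependent term: [v0*v1, v1*v1]."""
    return [v[0] * v[1], v[1] * v[1]]


# Seeded inputs: every output carries one Jacobian row.
jacobian = [out.grad for out in bias_like(seed([2.0, 3.0]))]

# Unseeded inputs (empty derivative vectors): the gradient is empty too.
unseeded = [Dual(2.0, []), Dual(3.0, [])]
empty_jacobian = [out.grad for out in bias_like(unseeded)]

print(jacobian)
print(empty_jacobian)
```

Under this toy model, the empty gradient in the unseeded case plays the role of the size-0 jac matrix described in the question.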
Btw - I normally would use
plant_autodiff = multibody_plant->ToAutoDiffXd();
but I guess what you have must work, too!

Iterative programming using PCollectionViews

I wish to create a PCollection of, say, one hundred thousand objects (maybe even a million) and apply an operation to it a million times in a for loop, always on the same data but with DIFFERENT values for the PCollectionView, which is recalculated on each iteration of the loop. Is this a use case that DF can handle reasonably well? Is there a better way to achieve this? My concern is that PCollectionView has too much overhead, but it could be that this was a problem a year ago and is now a use case that DF supports well. In my case, I can hardcode the number of iterations of the for loop (as I believe that DF can't handle a number of iterations that is determined dynamically at run time). Here's some pseudocode:
PCollection<KV<Integer,RowVector>> rowVectors = ...
PCollectionView<Map<Integer, Float>> vectorX;
for (int i = 0; i < 1000000; i++) {
    PCollection<KV<Integer,Float>> dotProducts =
        rowVectors.apply(ParDo.of(new DoDotProduct()).withSideInputs(vectorX));
    vectorX = dotProducts.apply(View.asMap());
}
Unfortunately we only support up to 1000 transformations / stages. This would require 1000000 stages (or however many iterations your for loop performs).
Also you are correct in that we don't allow changes to the graph after the pipeline begins running.
If you want to do less than 1000 iterations, then using a map side input can work but you have to limit the number of map lookups you do per RowVector. You can do this by ensuring that each lookup has the whole column instead of walking the map for each RowVector. In this case you'd represent your matrix as a PCollectionView of a Map<ColumnIndex, Iterable<RowIndex, RowValue>>

Mathematical Operations on an Image Stack in ImageJ (Fiji)

I am writing an ImageJ/Fiji plugin in Jython using the PyDev plugin in Eclipse. The plugin will be the ImageJ version of an existing denoising program called CANDLE, which is written in MATLAB. Changing the value of every pixel (voxel) of an image in MATLAB is trivial:
InputImage = 2 * sqrt(InputImage + (3/8));
Median3DFilteredImage = 2 * sqrt(Median3DFiltered + (3/8));
Here "InputImage" and "Median3DFilteredImage" are 3D matrices, with the last dimension being time (slices). To reproduce this operation on an ImageJ image, I had to use two for loops: one to iterate through the image slices (3rd dimension) and another to iterate over all the pixels in a particular slice:
medFiltStack = medianFilteredImage.getStack()
newMedFiltStack = ImageStack(medianFilteredImage.width, medianFilteredImage.height)
InputStack = InputImage.getStack()
newInputStack = ImageStack(InputImage.width, InputImage.height)
for i in xrange(1, medianFilteredImage.getNSlices() + 1):
    ip = medFiltStack.getProcessor(i).convertToFloat()
    ip2 = InputStack.getProcessor(i).convertToFloat()
    pixels = ip.getPixels()
    pixels2 = ip2.getPixels()
    for j in xrange(len(pixels)):
        pixels[j] = 2 * javaMath.sqrt(pixels[j] + (3.0/8.0))
        pixels2[j] = 2 * javaMath.sqrt(pixels2[j] + (3.0/8.0))
    newMedFiltStack.addSlice(ip)
    newInputStack.addSlice(ip2)
medianFilteredImage = ImagePlus("MedianFiltered-Image", newMedFiltStack)
InputImage = ImagePlus("Input-Image", newInputStack)
My question is as follows: Is there a way to perform mathematical operations on an image Stack, i.e. on every pixel (voxel) in the image stack, without having to write code that explicitly visits every pixel in every slice of the image, i.e. for loops. It just seems to be a very primitive way of going about it and I am wondering if there isn't an optimal way of doing this operation. I also had to work with copies and then gave the new images the same names as before as opposed to working with the original images and editing them directly. So is there a way to edit the pixel values of the original images rather than copies of the images? Any help would be appreciated as there are plenty of more math operations that I have to perform. It would be super useful to find a way to do mathematical operations on images in an optimal way both in terms of the amount of code and if possible, in terms of speed.
In pure ImageJ 1.x, the answer is no: there's no other way than to visit every slice and get its ImageProcessor. That is how ImageJ1 deals with its limited number of dimensions (z, time, channel): you always have a (Hyper-)Stack of 2D planes.
There is, however, a more powerful way of dealing with n-dimensional images called ImgLib, which is included in Fiji together with ImageJ2.
To avoid reinventing the wheel, you should have a look at Jean-Yves Tinevez's great plugin Image Expression Parser. Use it headlessly with Fiji, or just have a look at its source code (it uses a previous version, ImgLib1, but the idea is the same: you avoid hard-coding the dimensions by using Java generics). See e.g. the sqrt function:
public final <R extends RealType<R>> float evaluate(final R alpha) {
    return (float) Math.sqrt(alpha.getRealDouble());
}

What algorithm can I use to turn a drunkards walk into a correlated RNG?

I'm a novice programmer (the only reason I say this is because I'm not super familiar with all the terms yet) and I'm trying to make walls generate in respect to the wall before it. I've posted a question about it on here before
Randomly generated tunnel walls that don't jump around from one to the next
and sort of got the answer. What I was mainly looking for was the for loop that was used (I think). The problem is I didn't know how to implement it properly without getting errors.
My problem ended up being "I couldn't figure out how to incorporate this into it. I have 41 walls altogether that I'm using and the walls are named Left1 and Right1. I had something like this
CGFloat Left1 = 14;
for( int i = 0; i < 41; i++ ){
    CGFloat offset = (CGFloat)arc4random_uniform(2*100) - 100;
    Left1 += offset;
    Right1 = Left1 + 100;
}
but it was telling me in yellow text that Local declaration of "Left1" hides instance variable, and then in red text it says "Assigning to 'UIImageView *__strong' from incompatible type 'float'. I'm not sure how to fix this"
and I wasn't sure how to fix it. I realize (I think) that arc4random and arc4random_uniform are pretty much the same thing, as far as I know, with slight differences, but not the difference I'm looking for.
As I said before, I'm pretty novice, so any example would really be helpful, especially with the variables I'm trying to use. Thank you.
You want a "hashing" function, and preferably a "cryptographic" one because they tend to be significantly higher quality - at the expense of requiring additional CPU resources. But on modern hardware the extra CPU power usually isn't a problem.
The basic idea is that you can give any data to the function, and it will spit out a result that looks completely random, but is always the same result if you provide the same input.
Have a read up on them here:
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Cryptographic_hash_function
There are hundreds of different algorithms in common use; which one is best depends on what you need.
Personally I recommend sha256. A quick search for "sha256 ios" here on Stack Overflow will show you how to make one with the CommonCrypto library. The gist is that you create an NSString or NSData object that contains every offset, then run the entire thing through sha256. The result will be a 256-bit number that is, for practical purposes, indistinguishable from random.
If 256 bits is too much, just cut it up. For example, you could grab just the first 16 bits of the number, and you will have an equally random-looking 16-bit number.
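As a rough illustration of the idea (in Python rather than Objective-C, using the standard hashlib module instead of CommonCrypto), the sketch below hashes some input, keeps the first 16 bits, and maps them to an offset. The function name hashed_offset, the "seed:index" input format, and the span of 200 are assumptions chosen to resemble the question's offsets, not part of the original answer.

```python
import hashlib


def hashed_offset(seed, index, span=200):
    """Map (seed, index) to a repeatable offset in [-span/2, span/2)."""
    # Same input always gives the same digest, so the offset is reproducible.
    digest = hashlib.sha256(f"{seed}:{index}".encode("utf-8")).digest()
    # Keep only the first 16 bits of the 256-bit result.
    first16 = int.from_bytes(digest[:2], "big")
    # Reduce to the wanted range; the modulo adds a slight bias, fine for a sketch.
    return first16 % span - span // 2


offsets = [hashed_offset("tunnel", i) for i in range(5)]
print(offsets)
```

Calling hashed_offset twice with the same seed and index returns the same value, which is the property the answer relies on.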

Running time and memory

If you cannot see the code of a function but know that it takes arguments, is it possible to find its running time and memory usage? If so, how would you do it? Is there a way to use Big O in this case?
No, it's not possible to find either the memory use or the performance of a function by just looking at its parameters. For example, the same function
void DoSomething(int x, int y, int z);
can be implemented in O(1) time and memory:
void DoSomething(int x, int y, int z) { }
or as a very, very expensive function taking O(x*y*z) time:
void DoSomething(int x, int y, int z)
{
    int a = 0;
    for (int i = 0; i < x; i++) {
        for (int j = 0; j < y; j++) {
            for (int k = 0; k < z; k++) {
                a++;
            }
        }
    }
    Console.WriteLine(a);
}
And there are many other possibilities. So it's not possible to tell from the signature alone how expensive the function is.
Am I allowed to run the function at all? Multiple times?
I would execute the function with a range of parameter values and measure the running time and (if possible) the memory consumption for each run. Then, assuming the function takes n arguments, I would plot each data point on an (n+1)-dimensional plot and look for trends from there.
First of all, it is an interview question, so you'd better never say no.
If I were in the interview, here is my approach.
I may ask the interviewer a few questions, as an interview is meant to be interactive.
Because I cannot see the code, I suppose I can at least run it, hopefully, multiple times. This would be my first question: can I run it? (If I cannot run it, then I can do literally nothing with it, and I give up.)
What is the function used for? This may give a hint of the complexity, if the function is written sanely.
What are the types of the arguments? Are some of them primitive types? Try some combinations of them. Are some of them "complex" (e.g. containers)? Try some different size combinations. Are some of them related (e.g. one for a container, and one for the size of the container)? Then some test runs can be saved. Besides, I hope the legal ranges of the arguments are given, so I won't waste time on illegal guesses. Lastly, testing some edge cases may help.
Can you run the function from your own code? Something like this:
start = clock();
//call the function;
end = clock();
time = end-start;
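The timing pattern above could be sketched in Python as follows, with memory measured as well. The names measure and black_box are hypothetical; black_box merely stands in for the function whose code cannot be seen. Note that tracemalloc only sees allocations made through Python's allocator and that tracing slows the call down, so the numbers are rough.

```python
import time
import tracemalloc


def measure(func, *args):
    """Return (elapsed_seconds, peak_bytes) for one call of func(*args)."""
    tracemalloc.start()
    start = time.perf_counter()
    func(*args)
    elapsed = time.perf_counter() - start
    # get_traced_memory() returns (current, peak) in bytes since start().
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


def black_box(n):
    """Hypothetical stand-in for the function whose code we cannot see."""
    return sum(i * i for i in range(n))


# Run with growing inputs and look at how time and memory change.
for n in (1_000, 10_000, 100_000):
    elapsed, peak = measure(black_box, n)
    print(n, elapsed, peak)
```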
Being an interview question, you should never answer like "no it cannot be done".
What you need is the ability to run the code. Once you can run the code, call the same function with different parameters and measure the memory and time required. You can then plot these data and get a good estimate.
For big-O type notations, you can follow the same approach and plot the results against the data set size. Then try to fit this curve to the known complexity curves such as n, n^2, n^3, n*log(n), (n^2)*log(n), etc. using a least-squares fit.
Lastly, remember that all these methods are approximations only.
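One way the curve-fitting step might look, as a minimal sketch: for each candidate curve f(n), fit a single scale factor c by least squares so that time is approximately c * f(n), then pick the candidate with the smallest squared error. The candidate list, the name best_fit, and the synthetic timings are assumptions for illustration; real measurements are noisy, and this remains an approximation.

```python
import math

# Candidate complexity curves to compare against the measurements.
CANDIDATES = {
    "n": lambda n: n,
    "n*log(n)": lambda n: n * math.log(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
}


def best_fit(sizes, times):
    """Fit times ~ c * f(n) per candidate; return the name with least squared error."""
    best_name, best_err = None, float("inf")
    for name, f in CANDIDATES.items():
        xs = [f(n) for n in sizes]
        # Closed-form least-squares solution for the single scale factor c.
        c = sum(x * t for x, t in zip(xs, times)) / sum(x * x for x in xs)
        err = sum((t - c * x) ** 2 for x, t in zip(xs, times))
        if err < best_err:
            best_name, best_err = name, err
    return best_name


# Synthetic "measurements" that grow quadratically (made-up data).
sizes = [10, 20, 40, 80, 160]
synthetic_times = [3e-6 * n ** 2 for n in sizes]
print(best_fit(sizes, synthetic_times))
```

For the synthetic quadratic data above, the n^2 candidate gives the smallest error and is the one selected.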
No, you cannot: doing this in general would solve the Halting Problem, since the code might run forever (O(infinity)). Solving this problem would therefore also solve the Halting Problem, which is proven to be impossible.
