Bio.PDB.Superimposer - what is RMS? - biopython

I am comparing two structures and am a bit confused about the meaning of the result parameters:
What is the value stored in super_imposer.rms?
I guess it's the RMSD, but why is it not named as such?

super_imposer.rms is indeed the root-mean-square deviation.
If you consult the source of Bio.PDB.Superimposer, you'll see that the rms attribute is the result of a call to get_rms(). The docstring for get_rms() reads:
Root mean square deviation of superimposed coordinates.
The Tutorial does say that "The RMSD is stored in the rmsd attribute," however. No idea why this discrepancy exists.
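For reference, the quantity itself is easy to compute by hand. The following is a minimal sketch in plain Python, assuming the two coordinate sets are already superimposed and paired atom by atom; the function name rmsd and the sample coordinates are made up for illustration and are not part of Bio.PDB. It does not perform the fitting step that Superimposer carries out before reporting its value.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: every atom is shifted by exactly 1 along z
fixed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
moved = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(fixed, moved))
```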


how to get the hash value when using StaticWordValueEncoder in Mahout

I'm looking at an example in the Mahout in Action book. It uses the StaticWordValueEncoder to encode text using feature hashing.
When I encode "text to magically vectorize" with a standard analyser and probe = 1, the vector is {12:1.0, 54:1.0, 78:1.0}. However, I can't figure out which word each hash index refers to.
Is there any method to get the [hash, original word] as a pair? e.g. hash 12 refers to the word "text"?
Mahout in Action has this paragraph on the subject:
"The value of a continuous variable gets added directly to one or more locations that are allocated for the storage of the value. The location or locations are determined by the name of the feature. This hashed feature approach has the distinct advantage of requiring less memory and one less pass through the training data, but it can make it much harder to reverse engineer vectors to determine which original feature mapped to a vector location."
I am not sure how the reverse engineering can be done (it is certainly a difficult task, as the author says). Perhaps someone can shed some light on this.
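One general way to recover the mapping, when the vocabulary is known, is to hash every candidate word again and record which index it lands on. The sketch below illustrates the idea in plain Python. It is not Mahout code: hash_index uses CRC32 as a stand-in for whatever hash the encoder really applies, and NUM_FEATURES and the probe handling are assumptions made for illustration, so the indices it prints will not match the {12, 54, 78} from the question.

```python
import zlib

NUM_FEATURES = 100  # hypothetical vector size, not taken from the book


def hash_index(word, probe=0, num_features=NUM_FEATURES):
    # Stand-in hash (CRC32); Mahout's encoder uses its own hash function
    return zlib.crc32(f"{probe}:{word}".encode("utf-8")) % num_features


def build_reverse_index(words, probes=1):
    """Map each vector index to the set of words that hash to it."""
    reverse = {}
    for word in words:
        for probe in range(probes):
            reverse.setdefault(hash_index(word, probe), set()).add(word)
    return reverse


tokens = "text to magically vectorize".split()
reverse = build_reverse_index(tokens)
for idx in sorted(reverse):
    # An index can list several words, because hashing allows collisions
    print(idx, sorted(reverse[idx]))
```

Because several words can share an index, the reverse table gives candidates for each location rather than a guaranteed unique answer.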

How to fix Amos error: "observed variable is represented by an ellipse in the path diagram"?

I received the following question by email and have seen a lot of students with this problem:
I am trying to fit a structural equation model in Amos, but when I click "calculate estimates", I get the following error: "observed variable [variable name] is represented by an ellipse in the path diagram". Could you please advise me of what I am doing wrong?
IBM Help discusses this error but isn't that helpful.
In practice, I've seen this error come up a number of times. It can occur because you have incorrectly specified a variable as latent that you wanted to be observed. However, more commonly, it is the result of giving an inappropriate name to a latent variable. Specifically, it is relatively easy to give a latent factor a name that is the same as an observed variable in your data file.
For example, one time I had some personality variables in a dataset and the extraversion items were called E1, E2, E3, and so on. These are common names for residuals. So when giving residuals these names, there was a conflict with the names in the data file.
Another even more common cause is when you name a latent factor an appropriate name (e.g., selfesteem, extraversion, jobsatisfaction, etc.) and you have already created a scale score in your data file with the same name. This also causes the conflict.
The basic solution is just to give the latent variable a unique name that doesn't conflict with one in the data file. So for example, name the variable selfesteem_factor rather than selfesteem if you already have a variable called selfesteem.
I recently experienced the same problem. I followed Jeromy's advice and it worked. The error message is caused by giving the same name to a latent variable and an observed variable. In my case, I had a latent variable, trust, but I had also created a summated scale for trust (making it an observed variable), so I got the same error message. When I changed the name of the latent variable, the model ran properly.

Why does ELKI need db.in file in addition to distance matrix? Also what should db.in file contain?

I tried to follow this tutorial on using ELKI with pre-computed distances for clustering.
http://elki.dbs.ifi.lmu.de/wiki/HowTo/PrecomputedDistances
I used the following set of command line options:
-dbc.filter FixedDBIDsFilter -dbc.startid 0 -algorithm clustering.OPTICS
-algorithm.distancefunction external.FileBasedDoubleDistanceFunction
-distance.matrix /path/to/matrix -optics.minpts 5 -resulthandler ResultWriter
ELKI fails with a configuration error saying that a db.in file is needed to make the computation.
The following configuration errors prevented execution:
No value given for parameter "dbc.in":
Expected: The name of the input file to be parsed.
No value given for parameter "parser.distancefunction":
Expected: Distance function used for parsing values.
My question is: what is the db.in file? Why should I provide it in addition to the distance matrix file, since the pairwise distance matrix completely specifies all the information about the point cloud? (Also, I don't have access to any information other than the pairwise distances.)
What should I do about db.in? Should I override it, or specify some dummy information? Kindly help me understand.
Thank you.
This is documented in the ELKI HowTos:
http://elki.dbs.ifi.lmu.de/wiki/HowTo/PrecomputedDistances
Using without primary data
-dbc DBIDRangeDatabaseConnection -idgen.count 100
However, there is a bug (the patch is on the HowTo page and will be in the next release), so right now you can't fully use this; as a workaround you can use a text file that enumerates the objects.
The reason for this is that ELKI is designed to work on multi-relational data. It's not just processing matrices. Some algorithms may need, for example, a geographic representation of an object, some measurements for this object, and a label for evaluation. That is three relations.
What the DBIDRange data source essentially does is create a single "fake" relation that is just the DBIDs 0 to 99. On algorithms that don't need actual data, but only distances (e.g. LOF or DBSCAN or OPTICS), it is sufficient to have object IDs and a distance matrix.
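The two inputs described above (a file enumerating the objects and a distance file) can be generated with a few lines of plain Python. This is only a sketch under assumptions: it supposes the distance file uses whitespace-separated "id1 id2 distance" lines and that the enumeration file holds one number per line, so check the HowTo page for the exact layout your ELKI version expects. The helper names and the sample matrix are hypothetical.

```python
def id_lines(count, start_id=0):
    """One line per object: just its numeric ID (assumed layout)."""
    return [str(start_id + k) for k in range(count)]


def distance_lines(dist, start_id=0):
    """Upper triangle of a square distance matrix as 'id1 id2 distance' lines."""
    n = len(dist)
    lines = []
    for i in range(n):
        for j in range(i, n):
            lines.append(f"{start_id + i} {start_id + j} {dist[i][j]}")
    return lines


# Hypothetical 3-object symmetric distance matrix
matrix = [
    [0.0, 1.5, 2.0],
    [1.5, 0.0, 0.5],
    [2.0, 0.5, 0.0],
]
print("\n".join(id_lines(len(matrix))))
print("\n".join(distance_lines(matrix)))
```

The IDs start at 0 here to line up with the -dbc.startid 0 option used in the question.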

Unsupported format or combination of formats when using cv::reduce method in OpenCV

I am using OpenCV 2.4.2 and I am trying to take projections of two matrices (tmpl (32x44) and subj (32x44)) along rows and columns. I have initialised a result matrix as rowProjectionSubj(subj.rows,1,CV_8UC1). Then I call cv::reduce(subj,rowProjectionSubj,1,CV_REDUCE_SUM,-1);
Why is this complaining about a type mismatch? I have kept the types the same (by keeping dtype=-1 in cv::reduce). I get the tmpl and subj objects by calling cv::imread("image_path",0), i.e. reading the images in as grayscale.
I might not be right, but after I saw this:
http://answers.opencv.org/question/3698/cvreduce-gives-unsupported-format-exception/?answer=3701#post-id-3701
and with a little experiment and using an old friend called "register math", I realised that when you add two 8-bit numbers, the sum needs more than 8 bits of storage because it can produce a carry. So the result of reduce needs a wider type than the source: if the source is 8-bit unsigned, the destination should be at least 16-bit (unsigned or signed), and might as well be 32-bit if it is going to be used for products and similar calculations.
NOTE: The destination type must be stated EXPLICITLY in the cv::reduce call. Please follow my OpenCV link for further information.
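The register arithmetic can be checked for the 32x44 images from the question. The sketch below is plain Python rather than OpenCV code; it only works out the worst-case sums and the number of bits they need, assuming 8-bit unsigned pixels. The names are made up for illustration.

```python
ROWS, COLS = 32, 44   # image size from the question
MAX_PIXEL = 255       # largest value of an 8-bit unsigned pixel


def bits_needed(max_value):
    """Smallest number of bits that can hold max_value as an unsigned integer."""
    return max(1, max_value.bit_length())


# Reducing to a single column sums the 44 values in each row
max_row_sum = COLS * MAX_PIXEL
# Reducing to a single row sums the 32 values in each column
max_col_sum = ROWS * MAX_PIXEL

print(max_row_sum, bits_needed(max_row_sum))
print(max_col_sum, bits_needed(max_col_sum))
```

Both worst cases are far above 255, so an 8-bit destination cannot hold them. Passing a wider type such as CV_32S as the last argument, instead of -1, is the usual way to state the destination type explicitly.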

Tell Mathematica functions not to change the DataType of the parameters

I've been doing some image processing and I noticed that when I call a Mathematica function like GaussianFilter, it returns an image of type "Real" even though the image I passed was of type "Byte", which causes a huge increase in memory usage.
I'm aware I can change the type of the image after the call using Image[img,"Byte"], but that is tedious and adds processing overhead.
So is there a way to tell Mathematica not to change the type?
If Mma does not change the image type, you may get unexpected results. Consider (as a limit case) the binarized image of Lena:
BTW, anyone who used Lena as an example should read these two articles:
http://www.ecogito.net/articles/lena.html
http://www-2.cs.cmu.edu/~chuck/lennapg/lenna_visit.html
And optionally this one, of historical interest (not much to read, though):
(NSFW)http://www.lenna.org/full/len_full.html
