How to get distance between two atoms using for loop? - biopython

I have one PDB structure with 13 residues. I have to find the distances between atoms (only C, O, N, S) using a for loop: first between the first and second residues, then between the first and third, and so on up to the first and 13th residue, and then likewise for the remaining residues. How can I write a Python script for this using a for loop?

Using the xyz coordinates, you can calculate the distance between any two atoms. First parse the PDB file and store the coordinates. Then iterate over the list of atoms (for atom in list_of_atoms) and calculate the Euclidean distance between each pair.
http://en.wikipedia.org/wiki/Euclidean_distance#Three_dimensions
Biopython's Bio.PDB module also allows such calculation easily.
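The parse-then-loop approach described above can be sketched with the standard library only. This is a minimal illustration, not a replacement for Bio.PDB: it assumes well-formed fixed-column ATOM records with the element symbol in columns 77-78, and the names parse_atoms, residue_pair_distances, and make_atom_line, as well as the sample coordinates, are made up for the example.

```python
import math
from collections import defaultdict

WANTED_ELEMENTS = {"C", "O", "N", "S"}


def parse_atoms(pdb_lines):
    """Group (atom_name, (x, y, z)) by residue number, keeping C/O/N/S only."""
    residues = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Assumption: element symbol is present in columns 77-78.
        element = line[76:78].strip().upper()
        if element not in WANTED_ELEMENTS:
            continue
        res_seq = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residues[res_seq].append((line[12:16].strip(), xyz))
    return residues


def residue_pair_distances(residues):
    """Yield (res_i, atom_i, res_j, atom_j, distance) for every residue pair i < j."""
    numbers = sorted(residues)
    for idx, res_i in enumerate(numbers):
        for res_j in numbers[idx + 1:]:
            for name_i, xyz_i in residues[res_i]:
                for name_j, xyz_j in residues[res_j]:
                    yield res_i, name_i, res_j, name_j, math.dist(xyz_i, xyz_j)


def make_atom_line(serial, name, res_seq, x, y, z, element):
    """Hypothetical demo helper: build one fixed-column ATOM record."""
    return (
        f"ATOM  {serial:5d} {name:<4s} GLY A{res_seq:4d}" + " " * 4
        + f"{x:8.3f}{y:8.3f}{z:8.3f}" + "  1.00" + "  0.00" + " " * 10
        + f"{element:>2s}"
    )


# Made-up coordinates for three residues; the hydrogen is filtered out.
lines = [
    make_atom_line(1, "N", 1, 0.0, 0.0, 0.0, "N"),
    make_atom_line(2, "H", 1, 1.0, 0.0, 0.0, "H"),
    make_atom_line(3, "CA", 2, 3.0, 4.0, 0.0, "C"),
    make_atom_line(4, "O", 3, 0.0, 0.0, 2.0, "O"),
]
results = list(residue_pair_distances(parse_atoms(lines)))
for res_i, name_i, res_j, name_j, distance in results:
    print(f"{res_i}:{name_i} - {res_j}:{name_j} = {distance:.3f}")
```

For a real file, the list of lines would come from open("structure.pdb") (the file name is a placeholder). The outer two loops give the first-vs-second, first-vs-third, ... ordering asked for in the question.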

Related

Comsol: Infinite Element Domain module

I want to simulate a 2D heat transfer process in the subsurface on a region that is infinite in the r-direction. The most basic way to model this is to draw a geometry that is very long in the r-direction. I have done this, and the results are correct: they match the analytical solution. Comsol also offers a feature called Infinite Element Domain that is meant for exactly this kind of problem. With it, we define a bounded geometry on which to solve the PDE and add a small domain that acts as the Infinite Element Domain. However, in this case the results are not correct: they do not match the analytical solution. Is there anything that I am missing to correctly use the Infinite Element Domain in Comsol?
Any help or comment would be appreciated.
Edit:
I edited the post to be more specific.
Please consider the following figure where a fluid with high temperature is being injected into a region with lower temperature:
https://i.stack.imgur.com/BQycC.png
The equation to solve is:
https://i.stack.imgur.com/qrZcK.png
With the following initial and boundary conditions (note that the upper and lower boundary condition is no-flux):
https://i.stack.imgur.com/l7pHo.png
We want to obtain the temperature profile over the length of rw<r<140 m (rw is very small and is equal to 0.005 m here) at different times. One way to model this numerically in Comsol is to draw a rectangle that is 2000 m in the r-direction, and get results only in the span of r [rw,140] m:
https://i.stack.imgur.com/BKCOi.png
The results of this case are fine, because they match the analytical solution well.
Another way to model this is to replace the above geometry with a bounded one that is [rw, 140] m in the r-direction and then augment it with an Infinite Element domain that uses a mapped mesh, as follows:
https://i.stack.imgur.com/m9ksm.png
Here, I have set the thickness of the Infinite Element domain to 10 m in the r-direction. However, the results in this case do not match the analytical solution (or the above case, where the Infinite Element domain was not used). Is there anything that I am missing in Comsol? I have also changed some Infinite Element settings in Comsol, such as the physical width or distance, but I didn't see any changes in the results.
BTW, here are the results:
https://i.stack.imgur.com/cdaPH.png

Get a reaction resultant by automatically summing nodal reactions

In Abaqus, I want to compute the force resulting from a pressure I apply on a surface. This force is the sum of the nodal reactions of all nodes belonging to the surface.
Using history output, the only thing I can do is export the individual nodal reactions, which becomes awkward to handle when there are many nodes.
So, is there a simple way, in the CAE interface or in the .inp input file, to do this?
In Abaqus/Standard, you may print nodal and/or element output to the data file (.dat) using the *node print input file keyword. In Standard or Explicit, you may print to the results file (.fil/.sel) using *node file. The keyword can be used for an individual node or for an entire node set. You can control whether the values are totaled, whether the output is in a local or global coordinate system, whether a summary is also printed, and how frequently the output is written to file.
The options and defaults are slightly different between *node print and *node file - for example, the summary and totals args are only available for node print. See the docs for more detail.
These keywords have to be placed within a Step definition. Assuming you have already defined the nset of interest, you can do something like:
*node print, nset=my_nset, totals=yes, global=yes
RF,
*node file, nset=my_nset, global=yes, frequency=999
RF,
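If the individual nodal reactions have already been exported (the awkward route mentioned in the question), summing them afterwards is a small post-processing step. The sketch below is plain Python and does not use any Abaqus API; the function name sum_reactions, the node labels, and the force values are all made up for illustration.

```python
def sum_reactions(nodal_reactions):
    """Sum (RF1, RF2, RF3) tuples component-wise over all nodes."""
    totals = [0.0, 0.0, 0.0]
    for reaction in nodal_reactions.values():
        for axis, component in enumerate(reaction):
            totals[axis] += component
    return tuple(totals)


# Hypothetical exported data: node label -> (RF1, RF2, RF3)
exported = {
    101: (1.0, 0.0, -2.5),
    102: (0.5, 0.0, -2.5),
    103: (-1.5, 0.0, -5.0),
}
resultant = sum_reactions(exported)
print(resultant)
```

This only mirrors what totals=yes reports for a node set; letting the solver do the summation is still the simpler option when it is available.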

Why does ELKI need db.in file in addition to distance matrix? Also what should db.in file contain?

I tried to follow this tutorial on using ELKI with pre-computed distances for clustering.
http://elki.dbs.ifi.lmu.de/wiki/HowTo/PrecomputedDistances
I used the following set of command line options:
-dbc.filter FixedDBIDsFilter -dbc.startid 0 -algorithm clustering.OPTICS
-algorithm.distancefunction external.FileBasedDoubleDistanceFunction
-distance.matrix /path/to/matrix -optics.minpts 5 -resulthandler ResultWriter
ELKI fails with a configuration error saying a db.in file is needed to make the computation.
The following configuration errors prevented execution:
No value given for parameter "dbc.in":
Expected: The name of the input file to be parsed.
No value given for parameter "parser.distancefunction":
Expected: Distance function used for parsing values.
My question is: what is the db.in file? Why should I provide it in addition to the distance matrix file, since the pairwise distance matrix completely specifies all the information about the point cloud? (Also, I don't have access to any information other than the pairwise distances.)
What should I do about db.in? Should I override it, or specify some dummy information? Kindly help me understand.
Thank you.
This is documented in the ELKI HowTos:
http://elki.dbs.ifi.lmu.de/wiki/HowTo/PrecomputedDistances
Using without primary data
-dbc DBIDRangeDatabaseConnection -idgen.count 100
However, there is a bug (a patch is on the HowTo page, and the fix will be in the next release), so right now you can't fully use this; as a workaround you can use a text file that enumerates the objects.
The reason for this is that ELKI is designed to work on multi-relational data. It's not just processing matrices. Some algorithms may, for example, need a geographic representation of an object, some measurements for this object, and a label for evaluation. That is three relations.
What the DBIDRange data source essentially does is create a single "fake" relation that is just the DBIDs 0 to 99. On algorithms that don't need actual data, but only distances (e.g. LOF or DBSCAN or OPTICS), it is sufficient to have object IDs and a distance matrix.

Trying to implement an 8 point 1D DCT-II in labview; can only put one value in my output array

I am trying to implement a 1D DCT type II filter in Labview. The formula for this can be seen here
As you can see, each xk is a sum over n.
As far as I know, the nested for loop should handle the function, with the shift registers keeping a running total of the output. My problem lies with the output to the matrix xk. There is either only one output to the matrix, or each output overwrites the last one because there is no indexing. Trying to put the matrix inside the for loop results in an error between the shift register and the matrix:
You have connected two terminals of different types.
The source is a double and the sink is a 1D array of double
Anyone know how I can index the output to the array?
I believe this should work. Please check the math.
The inner for loop will run either 8 times or as many times as there are elements in the array xn. LabVIEW uses whichever number is smaller to determine the iteration count. So if xn is empty, the for loop won't run at all. If xn has 20 elements, the for loop will run 8 times.
Regardless, the outer loop will always run 8 times, so xk will have 8 elements total.
Also, shift registers that do not initialize a value at the beginning of a for or while loop can cause problems, unless you mean to do that. The value stored in the shift register after running the first time could be a problem the second time you go to run it.
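For checking the math, the same loop structure can be written out in text form. This is a Python sketch of the unnormalized textbook DCT-II, X[k] = sum over n of x[n] * cos(pi/N * (n + 1/2) * k); the formula linked in the question may carry extra scale factors, and the name dct_ii is only illustrative. The running total plays the role of the shift register and is reset for every k, while the append corresponds to indexing the result at the outer loop's output.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II of a sequence, computed with two nested loops."""
    size = len(x)
    output = []
    for k in range(size):
        running_total = 0.0  # reset per k, like an initialized shift register
        for n in range(size):
            running_total += x[n] * math.cos(math.pi / size * (n + 0.5) * k)
        output.append(running_total)  # one output element per outer iteration
    return output


# A constant 8-point input puts all of its energy in the k = 0 coefficient.
coefficients = dct_ii([1.0] * 8)
print([round(value, 6) for value in coefficients])
```

The scalar lives only inside the inner loop, and the array is built one element per outer iteration, which is why wiring the scalar directly to an array terminal gives the type mismatch quoted above.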

How to display the results of multiple comparisons

If you compare two sets of data (such as two files), the differences between these sets can be displayed in two columns, or two panes, such as WinMerge does.
But are there any visual paradigms to display the differences between multiple data sets?
Update
The starting point of my question was the assumption that displaying differences between 2 files is relatively easy, as I mentioned WinMerge, whereas comparing 3 or more text files turns out to be more complicated, as there will be more and more differences between, say, different versions of a document that have been created over time.
How would you highlight parts of the file that are the same in 2 versions, but different from other versions?
The data sets I have in mind are objects (A, B, C, ...) which may or may not exist and have properties (a, b, c, ...) which may be set or not set.
Example:
Set 1: A(a, b, c), B(b, c), C(c)
Set 2: A(a, b, c), B(b), C(c)
Set 3: A(a, b), B(b)
If you compare 2 sets, e.g. 1 and 2, the difference would be in B(c). Comparing sets 2 and 3 results in the difference A(c) and C().
If you compare all 3 sets, you end up with 3 comparisons (n * (n-1) / 2)
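The pairwise comparison in this example can be sketched in Python as follows. The dictionary layout, the function name diff, and the convention of using None to mark an object that is missing from one side are assumptions made for illustration.

```python
from itertools import combinations

# Objects map to the set of properties that are set on them.
data_sets = {
    "Set 1": {"A": {"a", "b", "c"}, "B": {"b", "c"}, "C": {"c"}},
    "Set 2": {"A": {"a", "b", "c"}, "B": {"b"}, "C": {"c"}},
    "Set 3": {"A": {"a", "b"}, "B": {"b"}},
}


def diff(left, right):
    """Return {object: differing properties}; None marks a missing object."""
    result = {}
    for name in sorted(left.keys() | right.keys()):
        if name not in left or name not in right:
            result[name] = None
        else:
            changed = left[name] ^ right[name]  # symmetric difference
            if changed:
                result[name] = changed
    return result


# n * (n - 1) / 2 comparisons: one per unordered pair of sets.
pairwise = {
    (first, second): diff(data_sets[first], data_sets[second])
    for first, second in combinations(data_sets, 2)
}
for pair, differences in pairwise.items():
    print(pair, differences)
```

The result is one difference table per pair, which is exactly the data that then has to be shown visually.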
I have a different view than some of those who provided Answers--i.e., that you need to further specify the problem. The abstraction level is about right. Further specification would make the problem easier, but the solution less useful.
A couple of years ago, I saw a graphic on ProgrammableWeb--it compared the results from a search on Yahoo with the results from the same search on Google. There's a lot of information to convey: some results are in both sets, some in just one, and the common results will have different positions in the respective engine's results, which somehow has to be shown.
I like the graphic and reimplemented it in Matplotlib (a Python scientific plotting library). Below is an example using some random points, as well as the Python code I used to generate it:
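import numpy as NP
from matplotlib import pyplot as PLT

# each pair holds one item's position on the upper line and on the lower line
xvals = NP.array([(2,3), (5,7), (8,6), (1.5,1.8), (3.0,3.8), (5.3,5.2),
                  (3.7,4.1), (2.9, 3.7), (8.4, 6.1), (7.1, 6.4)])
# every connector runs from height 5 (upper line) to height 3 (lower line)
yvals = NP.tile( NP.array([5,3]), [10,1] )
# assumed baselines: the original definitions of x, y and y2 are not shown
x = NP.array([0, 10])
y = NP.array([5, 5])
y2 = NP.array([3, 3])
fig = PLT.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y, "-", lw=3, color='b')
ax1.plot(x, y2, "-", lw=3, color='b')
# draw one connector per item
for a, b in zip(xvals, yvals):
    ax1.plot(a, b, '-o', ms=8, mfc='orange', color='g')
PLT.axis("off")
PLT.show()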
This model has some interesting features: (i) it actually deals with 'similarity' on a per-item basis (the vertically-oriented line connecting the dots) rather than aggregate similarity; (ii) the degree of similarity between two data points is proportional to the angle of the line connecting them--90 degrees if they are equal, with a decreasing angle as the difference increases; this is very intuitive; (iii) cases in which a point in one data set is not present in the second data set are easy to show--a point will appear on one of the two lines but without a line connecting it to a point on the other line.
This model works well for comparing search results because each search result has a 'score' (its index, or order in the Results List). For other types of data, you might have to assign a score to each data point--a similarity metric might work, I suppose (in a sense, that's actually what the search result order is: a distance from the top of the list).
Since there has been so much work into displaying a diff of two files, you might start by expressing your 'multiple data sets' in an appropriate text format, then using whatever you want to show a diff between those text formats.
But you should tell us more about your data sets!
I experimented a bit, and implemented two displays:
Matrix
Timeline
I agree with Peter, you should specify what type your data is and what you wish to bring out in the comparison.
Depending on the nature of the data/comparison you can consider different visualisations. Is your data ordered or unordered? How many things are you comparing, i.e. fine grain or gross comparison?
Examples:
Visualizing a comparison of unordered data could just be plotting the two histograms of your sets (i.e. distributions):
On the other hand, comparing a huge ordered dataset like DNA can be done innovatively.
Also, check out visual complexity, it's a great resource for interesting visualization.

Resources