How can I use the biopython Restriction package with all enzymes that are on REBASE?

I want to search a sequence for the enzyme CneH16IP, which has the recognition site GAYNNNNNCTTGY. When I import the list of enzymes included in the Restriction package, the enzyme is not included (code below):
from Bio import Restriction
from Bio.Restriction import *
dir()  # CneH16IP does not appear among the listed enzyme names
I am assuming that not all of the enzymes from REBASE are imported by default, so is there a way to import all of them as an option? Or to add new enzymes? Alternatively, should I just try to do a string search that allows the degenerate bases and discontinuous bases?
Thank you for any help!
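In the meantime, a degenerate-base string search can be done with plain regular expressions. Here is a minimal sketch using only the standard library; the IUPAC mapping is written out by hand (not taken from Biopython), and the example sequence is made up:

```python
import re

# Minimal IUPAC-to-regex table (hand-written here, not taken from Biopython)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def site_to_regex(site):
    """Translate a degenerate recognition site into a regex pattern."""
    return "".join(IUPAC[base] for base in site.upper())

# CneH16IP's reported site, searched in a made-up sequence
pattern = re.compile(site_to_regex("GAYNNNNNCTTGY"))
seq = "AAGATACGTCCTTGTCC"
hits = [m.start() for m in pattern.finditer(seq)]
print(hits)  # start positions of matches on this strand
```

Since the site is not palindromic, you would also want to run the same search on the reverse complement of the sequence to find sites on the other strand.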


Import specific section from another tex file

I'm trying to import a very specific section or subsection from another LaTeX document. I have an export tool that creates a nice tex document and generates sections from the headers. Calling the whole file is useful, but at some points in my combined document I need to call just one subsection, over and over again.
How do I call a specific subsection from a whole document?
I have a file called aa.tex, and using \subimport{}{aa} brings in the whole file.
The file looks similar to this:
\section{Test Descriptions}
\subsection{Setup}
Hardware and Software... CPU, GPU, RAM etc
\subsection{test1}
\subsubsection{Steps1}
a,b,c,
\subsubsection{Steps2}
a,b,c
I want to be able to call \subsection{setup} over and over again because of what I have to reference.
So, logically how do I only call \subsection{Setup} from aa.tex?
With the catchfilebetweentags package one can selectively input parts of a file.
Main file:
\documentclass{article}
\usepackage{catchfilebetweentags}
\begin{document}
zzz
\ExecuteMetaData[subdocument]{setup}
zzz
\ExecuteMetaData[subdocument]{setup}
\end{document}
subdocument.tex:
xxx
%<*setup>
\subsection{Setup}
%</setup>
xxx

ArangoDB - how to import neo4j database export into ArangoDB

Are there any utilities to import a database from Neo4j into ArangoDB? The arangoimp utility expects the data for edges and vertices to be in a different format than what Neo4j exports.
Thanks!
Note: This is not an answer per se, but a comment wouldn't allow me to structure the information I gathered in a readable way.
Resources online seem to be scarce with regard to the transition from Neo4j to ArangoDB.
One possible way is to combine APOC (https://github.com/neo4j-contrib/neo4j-apoc-procedures) and neo4j-shell-tools (https://github.com/jexp/neo4j-shell-tools)
1. Use APOC to create a Cypher export file for the database (see https://neo4j.com/developer/kb/export-sub-graph-to-cypher-and-import/)
2. Use the neo4j-shell-tools Cypher import with the -o switch -- this should generate CSV files
3. Analyse the CSV files, then either massage them with csvtool, or create JSON data with one of the numerous csv2json converters available (npm, ...) and massage those files with jq
4. Feed the files to arangoimp; repeat step 3 if necessary
There is also a graphml-to-json converter (https://github.com/uskudnik/GraphGL/blob/master/examples/graphml-to-json.py) available, so you could use the aforementioned neo4j-shell-tools to export to GraphML, convert that representation to JSON, and massage those files into the necessary format.
I'm sorry that I can't be of more help, but maybe these thoughts get you started.
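As an illustration of the CSV-to-JSON massaging described above, here is a minimal Python sketch that rewrites a relationship CSV into ArangoDB's edge format. The column names, the example data, and the vertex collection name are assumptions; the actual layout depends on which export tool you used:

```python
import csv
import io
import json

# Hypothetical relationship CSV; real column names depend on the export tool
neo4j_csv = """start_id,end_id,type
1,2,KNOWS
2,3,WORKS_WITH
"""

def to_arango_edges(csv_text, vertex_collection="nodes"):
    """Rewrite each CSV row into ArangoDB's edge format (_from/_to)."""
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        edges.append({
            "_from": f"{vertex_collection}/{row['start_id']}",
            "_to": f"{vertex_collection}/{row['end_id']}",
            "type": row["type"],
        })
    return edges

edges = to_arango_edges(neo4j_csv)
# One JSON document per line -- a line-wise layout that arangoimp can ingest
jsonl = "\n".join(json.dumps(e) for e in edges)
print(jsonl)
```

The `_from`/`_to` values assume the corresponding vertex documents were imported into a collection named `nodes` with their Neo4j IDs as keys; adjust to whatever key scheme your vertex import produces.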

Parse/ tree diagram for phrasal constituents of a specific sentence

I'm trying to do some language analysis on the opening paragraph of The Kite Runner by Khaled Hosseini, specifically looking at phrasal constituents. The first sentence is as follows:
"I became what I am today at the age of twelve, on a frigid overcast day in the winter of 1975."
I've got a pretty good idea of what the phrasal constituents are, but I'm a bit unsure how to draw the tree, as it seems like it should be split into two distinct branches at the comma after "twelve". I've uploaded an image of my tree so far, but I'm not sure whether it's correct. Any help would be greatly appreciated.
Thanks in advance :)
There is a library called constituent-treelib that can be used to construct, process and visualize constituent trees. First, we must install the library:
pip install constituent-treelib
Then, we can use it as follows to parse the sentence into a constituent tree, visualize it, and finally export the result to a PDF file:
from constituent_treelib import ConstituentTree
# Define a sentence
sentence = "You must construct additional pylons!"
# Define the language that should be considered
language = ConstituentTree.Language.English
spacy_model_size = ConstituentTree.SpacyModelSize.Medium
# Construct the necessary NLP pipeline by downloading and installing the required models (benepar and spaCy)
nlp = ConstituentTree.create_pipeline(language, spacy_model_size, download_models=True)
# Instantiate a ConstituentTree object and pass it both the sentence and the NLP pipeline
tree = ConstituentTree(sentence, nlp)
# Export the visualization of the tree into a PDF file
tree.export_tree('my_constituent_tree.pdf')
The result is the rendered constituent tree in my_constituent_tree.pdf.

How to use nat properties in agda

In Agda there's a module Data.Nat.Properties. It contains a lot of useful facts, which are hidden inside records such as isCommutativeSemiring. How can I extract, for example, *-associativity and use it?
Open the modules in question. For example:
open import Algebra
open import Data.Nat.Properties
open CommutativeSemiring commutativeSemiring
-- now you can use *-assoc, *-comm, etc.
If you want to browse the contents of a module, try the C-c C-o key combination, since the recursive opening and re-exporting of algebraic structures makes it hard to see what's available.

Adding extra arguments to Haskell functions before compiling

As part of a program which dynamically loads user-inputted strings as Haskell source code, I want to do some pre-processing on the user's input before compiling it.
One of the things I would like to be able to do is to search the source for occurrences of particular functions and add an extra argument to them. So, for example, I might want all occurrences of:
addThreeNumbers 3 5
To become:
addThreeNumbers 3 5 10
What is the best way of accomplishing such behavior? Is it complicated enough to warrant manipulating some sort of abstract syntax tree with functions in the GHC API / Template Haskell? Or is this something simple that can be accomplished with some sort of Haskell pre-processing / parsing library? If so, what libraries and resources would you recommend?
GHC 7.6's qualified imports, ghc-pkg hide, and GHC's -package option allow you to seamlessly add a layer between the importing file and the imported file.
Example:
Create a package with your own Data.Char, with a standard .cabal file, and cabal install it.
{-# LANGUAGE PackageImports #-}
module Data.Char (
      toUpper
    , Char
    , String
    -- ... also re-export everything else from the "base" Data.Char here;
    -- due to limitations of the current export facility you cannot simply write
    -- module Data.Char hiding (toUpper)
    ) where

import "base" Data.Char hiding (toUpper)
import qualified "base" Data.Char as OldChar

toUpper :: Char -> IO Char
toUpper c = do
    print "Oh Yeahhhhhhhhh"
    return $ OldChar.toUpper c
Hide the base package with ghc-pkg hide base -- this hides every module in base, so you need to wrap all of the ones you use.
> ghci -XNoImplicitPrelude -- the language flag is needed because the Prelude
>                          -- is in base and I did not make a wrapped Prelude
ghci> import Data.Char
ghci> toUpper 'c' -- The wrapped function
"Oh Yeahhhhhhhhh"
'C'
ghci> isSpace ' ' -- The unwrapped normal Data.Char function
True
Now you can use Template Haskell to wrap your functions and call any IO action you need to get external information. Users do not even need to change their function calls or module imports; the wrapped originals can live on under some variation of their name, such as an added 'internal'.
Being able to wrap module interfaces seamlessly also means you can change the implementation of an imported module without touching the package/module code or the existing code base you are working with; you only have to add a middle layer.
Edit response to question:
Sure, the GHC API lets you do all of that, but it is considerably more complex; fewer examples than I would like are floating around, and I seem to see more people having a hard time with it than success stories.
For evaluation of code, hint is suggested.
plugins is suggested for dynamic loading of modules.
haskell-src-exts is suggested for parsing and changing code. This is what stylish-haskell uses to make small modifications to code. It reportedly covers most (all?) of Haskell 2010 and many, but not all, GHC extensions, and is probably your best bet if you do not like the first solution I provided.
The GHC API is the only one fully compatible with GHC code as far as I know, but it is considerably more complex, less well documented, and more likely to change from GHC version to GHC version; at least there is no promise it will stay the same, from my limited experience. I suggested putting a module in the middle because it seemed the quickest to get working with good test coverage, took the least amount of new knowledge, and fulfilled the requirements I picked out of your question.