Spyder: Edit template.py file (Need variable for the current filename) - spyder

Yo, guys.
I would like to edit the template.py file in Spyder editor.
template.py:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@File : -
@Date : %(date)s
@Author : Jane Daw
@Contact : ****@gmail.com
"""
How can I retrieve the current file name?
Which Spyder variable holds it?

It appears there is no such variable. Here is a bug report, open since 2015-12-27, that asks for something similar. This issue also suggests that only date and username are supported.
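For illustration, here is a minimal sketch of how `%(name)s` placeholders like the one in the template expand. It assumes the template is filled with Python's ordinary %-formatting against a dict; that is a guess based on the placeholder syntax, not something confirmed above, and the `render` helper and the sample values are made up. It also shows why a placeholder with no matching key cannot simply be added to the template.

```python
from datetime import date

# Hypothetical stand-in for two lines of the template.py header.
TEMPLATE = "Date : %(date)s\nAuthor : %(username)s\n"


def render(template, values):
    """Fill every %(name)s placeholder from the dict `values`.

    A placeholder whose name is missing from `values` raises KeyError.
    """
    return template % values


header = render(TEMPLATE, {"date": date(2015, 12, 27).isoformat(), "username": "jane"})
print(header)
```

Under that assumption, a line such as `File : %(file)s` would fail with a KeyError unless the editor itself supplied a `file` entry.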

Related

How to get the PDB id of a mystery sequence?

I have a bunch of proteins from a data set called ProteinNet.
The sequences there have some sort of ID, but it is clearly not a PDB ID, so I need to find that some other way. For each protein I have the amino acid sequence. I'm using Biopython, but I'm not very experienced with it yet and couldn't find this in the guide.
So my question is: how do I find a protein's PDB ID, given that I have its amino acid sequence? (So that I can download the PDB file for the protein.)
Hi, a while ago I was experimenting with the RCSB PDB search API
and ended up with this piece of code (I can no longer find the examples on the RCSB PDB website):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Dec 27 16:20:43 2020
@author: Pietro
"""
import PDB_searchAPI_5
from PDB_searchAPI_5.rest import ApiException
import json
#"value":"STEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVLS"
# Defining the host is optional and defaults to https://search.rcsb.org/rcsbsearch/v1
# See configuration.py for a list of all supported configuration parameters.
configuration = PDB_searchAPI_5.Configuration(
    host = "http://search.rcsb.org/rcsbsearch/v1"
)
data_entry_1 = '''{
  "query": {
    "type": "terminal",
    "service": "sequence",
    "parameters": {
      "evalue_cutoff": 1,
      "identity_cutoff": 0.9,
      "target": "pdb_protein_sequence",
      "value": "STEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVLS"
    }
  },
  "request_options": {
    "scoring_strategy": "sequence"
  },
  "return_type": "entry"
}'''
# Enter a context with an instance of the API client
with PDB_searchAPI_5.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = PDB_searchAPI_5.SearchServiceApi(api_client)
    try:
        # Get RCSB PDB data schema as JSON schema extended with RCSB metadata.
        pippo = api_instance.run_json_queries_get(data_entry_1)
    except ApiException as e:
        print("Exception when calling SearchServiceApi->run_json_queries_get: %s\n" % e)
        exit()
print(type(pippo))
print(dir(pippo))
pippox = pippo.__dict__
print('\n bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb \n' ,pippox)
print('\n\n ********************************* \n\n')
print(type(pippox))
pippoy = pippo.result_set
print(type(pippoy))
for i in pippoy:
    print('\n',i,'\n', type(i))
print('\n LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL\n')
for i in pippoy:
    for key in i:
        print('\n', i['identifier'], ' score : ', i['score'])
The search module (import PDB_searchAPI_5) was generated with openapi-generator-cli-4.3.1.jar (link here).
The OpenAPI specs were at version 1.7.3 back then; they are now at 1.7.15, see https://search.rcsb.org/openapi.json
The data_entry_1 query was copied from the RCSB PDB website, but I can't find it there anymore.
It said something about MMseqs2 being the software that does the search. I played with:
"evalue_cutoff": 1,
"identity_cutoff": 0.9,
parameters but didn't find a way to select only 100% identity.
Here is how to install PDB_searchAPI_5 in a virtual environment:
pip install PDB-searchAPI-5-1.0.0.tar.gz
It was generated by openapi-generator-cli-4.3.1.jar with:
java -jar openapi-generator-cli-4.3.1.jar generate -g python -i pdb-search-api-openapi.json --additionalproperties=generateSourceCodeOnly=True,packageName=PDB_searchAPI_5
Don't put spaces in the --additionalproperties part (it took me a week to figure that out).
The README.md file is the most important part, as it explains how to use the OpenAPI client.
Your FASTA sequence goes here:
"value":"STEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVLS"
A score of 1 should be an exact match.
The Biopython BLAST module is probably easier, but it searches the NIH database instead of RCSB PDB. Sorry, I can't elaborate more on this: I still need to figure out what a JSON file is, and I wasn't able to find a better free tool that automatically generates a better OpenAPI Python client (I believe it is not such an easy task... but we always want more...).
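Since the identity cutoff did not narrow things down to exact matches, one option is to filter the returned hits by score afterwards. The sketch below assumes each hit is a dict with "identifier" and "score" keys, as the printing loop in the script suggests, and that a score of 1 really does mean an exact match (the answer only says it "should"). The helper name `exact_matches` and the sample hits are invented for illustration.

```python
def exact_matches(result_set, min_score=1.0):
    """Return the identifiers of hits whose score is at least min_score."""
    return [hit["identifier"] for hit in result_set if hit["score"] >= min_score]


# Made-up hits, shaped like the entries of result_set in the script above.
sample_hits = [
    {"identifier": "AAAA", "score": 1.0},
    {"identifier": "BBBB", "score": 0.93},
]
print(exact_matches(sample_hits))
```

In the script this would be called as `exact_matches(pippoy)`.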
To get the API documentation, try:
java -jar openapi-generator-cli-4.3.1.jar generate -g html -i https://search.rcsb.org/openapi.json --skip-validate-spec
This gives you an HTML document; for PDF, see https://mrin9.github.io/RapiPdf/
http://search.rcsb.org/openapi.json
works as well as https://search.rcsb.org/openapi.json, so you can look at the exchanges between client and server with Wireshark.
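To run the same search for many sequences, the hand-written JSON string can be built from a Python dict instead. This is only a sketch: the field names are copied from data_entry_1 above, while the function name `build_sequence_query` and its defaults are my own choice. It produces the query text and does not contact the server.

```python
import json


def build_sequence_query(sequence, identity_cutoff=0.9, evalue_cutoff=1):
    """Return a JSON string with the same layout as data_entry_1 for `sequence`."""
    query = {
        "query": {
            "type": "terminal",
            "service": "sequence",
            "parameters": {
                "evalue_cutoff": evalue_cutoff,
                "identity_cutoff": identity_cutoff,
                "target": "pdb_protein_sequence",
                "value": sequence,
            },
        },
        "request_options": {"scoring_strategy": "sequence"},
        "return_type": "entry",
    }
    return json.dumps(query)


# A short fragment of the example sequence, just to show the call.
payload = build_sequence_query("STEYKLVVVGAGGVGKS")
print(payload)
```

The resulting string can be passed wherever the script passes data_entry_1.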

How can I insert a file into another file when using the Spyder IDE?

When editing a file using the Spyder IDE editor I want to add the contents of another file, similar to what Emacs ctrl-x i does. For example:
main.py
import sys
def main():
    help_text = """ ### external file contents go here ###"""
    print(help_text)
if __name__ == '__main__':
    sys.exit(main())
insertme.txt
Help text someone else gave me.
My desired result is a main.py looking like below (after file insertion and a little clean up):
import sys
def main():
    help_text = """Help text someone else gave me."""
    print(help_text)
if __name__ == '__main__':
    sys.exit(main())
Going through help, online searches, etc. I can't find any direct way to do this (obviously I can do it other ways, but they are more time consuming). Is something like this directly possible with Spyder? If so, how?
(Spyder maintainer here) This is not possible in our editor, sorry.
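As one of the "other ways", the insertion can be scripted outside the editor. A minimal sketch, assuming the target file contains a unique placeholder string; the helper name `insert_text` is made up, and this is not a Spyder feature.

```python
def insert_text(source, placeholder, inserted):
    """Replace the first occurrence of `placeholder` in `source` with `inserted`.

    Surrounding whitespace of `inserted` is stripped; a missing placeholder
    raises ValueError so the file is never silently left unchanged.
    """
    if placeholder not in source:
        raise ValueError("placeholder not found")
    return source.replace(placeholder, inserted.strip(), 1)


main_py = 'help_text = """ ### external file contents go here ###"""\n'
insertme = "Help text someone else gave me.\n"
merged = insert_text(main_py, " ### external file contents go here ###", insertme)
print(merged)
```

In practice `main_py` and `insertme` would be read from main.py and insertme.txt with `open(...).read()`, and `merged` written back to main.py.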

Need to collect only emails using Ruby code

I've received a list of emails that I'd like to run an email campaign on; however, the list also contains some URLs, which complicates things.
Here's the standard formatting of an email address, for example:
news@ydr.com
I'd like to paste the list into the terminal and run a command that captures ONLY the email addresses, drops any URLs, and saves the result to a file.
Please advise! It is much appreciated :)
If you are just looking to catch most emails, this regex might work.
I got it from the question "How to validate an email address using a regular expression?"
The answers there also discuss the much more complicated RFC 822 email regex.
#!/usr/bin/env ruby
input = $stdin.readlines # ctrl + D after paste
input.each do |f|
  puts f if f[/^[a-zA-Z0-9_.+\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/]
end
# test input
# foo@bar.com
# www.cnn.com
# test.email@go.com
# turdburgler@mcdo.net
# http://www.google.com
To write emails to a file:
#!/usr/bin/env ruby
file = File.open("emails.txt", "w")
input = $stdin.readlines # ctrl + D after paste
input.each do |f|
  file.write(f) if f[/^[a-zA-Z0-9_.+\-]+@[a-zA-Z0-9\-]+\.[a-zA-Z0-9\-.]+$/]
end
file.close
Just to be clear, this is a Ruby script, which should be run like this.
Save the script as a file, e.g. email_parser.rb.
chmod +x email_parser.rb
./email_parser.rb # this waits on stdin; paste the list into the terminal here
While the terminal is waiting, paste in the list of emails, then press Ctrl+D to signal EOF. The program then goes through the list of emails and URLs and keeps only the lines that look like email addresses. With the updated script, the output is a file called emails.txt in the folder where you ran the script.

How am I able to pass a var into MeCab for Python?

The code is:
import MeCab
m = MeCab.Tagger("-O wakati")
text = raw_input("Enter Japanese here: ")
print m.parse(text)
The problem is that after entering the string into the raw_input it gives an error in IDLE:
Traceback (most recent call last):
  File "C:\Users\---\Desktop\---\Python\japanesetest.py", line 5, in <module>
    print m.parse(text)
  File "C:\Users\---\Desktop\---\Python\lib\site-packages\MeCab.py", line 220...
    def parse(self, *args): return _MeCab.Tagger_parse(self, *args)
TypeError: in method 'Tagger_parse', argument 2 of type 'char const *'
If I do this however:
import MeCab
m = MeCab.Tagger("-O wakati")
print m.parse('なるほど、マルコフ辞書のキーはタプルにしたほうがスッキリしますね。')
I get the proper result:
なるほど 、 マルコフ 辞書 の キー は タプル に し た ほう が スッキリ し ます ね 。
Things I have tried include unicode tags at the beginning, writing to a text file in unicode and parsing that text, and a million other things. I'm running Python 2.7 and MeCab 0.98. If this can't be answered, even a little light shed on the error would be appreciated.
I am able to run your snippet successfully using Python 2.7 and MeCab 0.98 in both IDLE and IPython command line.
import MeCab
m = MeCab.Tagger("-O wakati")
text = raw_input("Enter Japanese here: ")
Enter Japanese here: 私の車はとても高いです。
print m.parse(text)
私 の 車 は とても 高い です 。
However, when reading from a UTF file I get errors when trying to parse the text. For those cases I explicitly encode the text to shift-jis. You might try this technique. Below is an example.
# Assumes the file is UTF-8: decode the raw bytes to unicode first,
# otherwise Python 2 tries an implicit ASCII decode before encoding.
rawtext = open("UTF.file", "rb").read().decode('utf-8')
tagger = MeCab.Tagger()
encoded_text = rawtext.encode('shift-jis', errors='ignore')
print tagger.parse(encoded_text).decode('shift-jis', errors='ignore')
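The re-encoding step can be tried on its own, without MeCab installed. A minimal sketch, assuming the input bytes are UTF-8 and the dictionary expects Shift-JIS; the variable names are arbitrary, and the sample sentence is the one from the session above.

```python
# -*- coding: utf-8 -*-
text = u"私の車はとても高いです。"

# Bytes as they would come out of a UTF-8 file.
utf8_bytes = text.encode("utf-8")

# Decode to unicode first, then encode to Shift-JIS; characters that
# Shift-JIS cannot represent are dropped because of errors="ignore".
sjis_bytes = utf8_bytes.decode("utf-8").encode("shift_jis", errors="ignore")

# Decoding the Shift-JIS bytes gives the original text back.
round_trip = sjis_bytes.decode("shift_jis")
print(round_trip == text)
```

The two byte strings differ even though they represent the same sentence, which is why handing bytes in one encoding to a tagger expecting the other gives errors or garbage.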
This is my current workaround, and should help people coming across the same issue:
import MeCab
import codecs
write_to = codecs.open("pholder.txt", "w", "utf-8")
text = raw_input("Please insert Japanese text here: ")
write_to.write(text)
write_to.close()
read_from = open('pholder.txt').read()
mecab = MeCab.Tagger("-Owakati")
print mecab.parse(read_from)
The key here is calling .read() on the file returned by open(). Why? Maybe you can tell me. :/

Spreadsheet - encoding problem with reading cyrillic characters

I'm working on a rails app for a small shop. It needs to load an .xls file, parse it and maybe load to the database.
I use Spreadsheet gem to work with the file.
The problem is that the file contains Russian characters, which are displayed as "└ÛÛ.ExT H-1727F (ÓÝÓÙ¯Ò GP T304)"
The reference says I need to specify the encoding, but I don't know which one is used in this file. I tried "win-1251", but it gave me an error about being unable to find a "utf-8 to win-1251 converter".
I then set the encoding to "WINDOWS-1251", but it gave me this error:
U+00BE to WINDOWS-1251 in conversion from CP850 to UTF-8 to WINDOWS-1251
So then I tried CP850, which didn't throw an error, but the characters were still not readable.
There's not much code, really.
# -*- encoding : utf-8 -*-
...
def show
  require 'spreadsheet'
  Spreadsheet.client_encoding = 'UTF-8'
  book = Spreadsheet.open 'c:\rails\renergy23\public\price-16-04-11.xls'
  @sheet = book.worksheet 0
end
For simplicity I don't load it into the database right now. Instead I output it in my view:
- 30.times do |i|
  = @sheet.row i+10
  %br
http://dl.dropbox.com/u/4976861/price-16-04-11.xls
I kinda solved this after 1.5 months by first saving the document as .xlsx and then saving it again as .xls (97-2003). I couldn't use the .xlsx directly because of some weird "OLE signature incorrect" error.
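The error message mentions a conversion "from CP850", which hints that Windows-1251 bytes were being read as CP850. That hypothesis can be checked on the garbled string from the question by reversing the wrong decoding. The sketch is in Python rather than Ruby and only uses the standard codecs; that the file is really Windows-1251 is an assumption, not something the thread confirms.

```python
# -*- coding: utf-8 -*-
# The garbled text exactly as shown in the question.
garbled = "└ÛÛ.ExT H-1727F (ÓÝÓÙ¯Ò GP T304)"

# Undo the wrong decoding: recover the raw bytes that CP850 turned into
# these characters...
raw_bytes = garbled.encode("cp850")

# ...then decode those bytes as Windows-1251, the suspected real encoding.
repaired = raw_bytes.decode("cp1251")
print(repaired)
```

The result reads "Акк.ExT H-1727F (аналог GP T304)", i.e. readable Russian, which supports the idea that the file's text was Windows-1251 all along and the reader was treating it as CP850.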
