I am getting error in Keras Callback function - machine-learning

I get an error when I try to pass the callback I created to model.fit(Callback=[EarlyStopping, reduceLROnPlateau, Callback()]).
**This is the callback function:**
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('val_accuracy') > 0.90:
            print('\n Validation accuracy has reached upto \
90% so, stopping further training.')
            self.model.stop_training = True

Related

Display progress on dask.compute(*something) call

I have the following structure on my code using Dask:
@dask.delayed
def calculate(data):
    services = data.service_id
    prices = data.price
    return [services, prices]
output = []
for qid in notebook.tqdm(ids):
    r = calculate(parts[parts.quotation_id == qid])
    output.append(r)
It turns out that when I call dask.compute() on my output list, I get no progress indication. The Diagnostic UI doesn't "capture" this action, and I'm not even sure it's running properly (judging by my processor usage, I think it isn't).
result = dask.compute(*output)
I'm following the "best practices" article from the dask documentation:
https://docs.dask.org/en/latest/delayed-best-practices.html
What am I missing?
Edit: I think it's running, because I still get memory leak/high usage warnings. Still no progress indication.
As pointed out in the related post, dask has two methods for displaying the progress: one for "normal" dask, and one for dask.distributed.
Here's a reproducible example:
import random
from time import sleep
import dask
from dask.diagnostics import ProgressBar
from dask.distributed import Client, progress
# simulate work
@dask.delayed
def work(x):
    sleep(x)
    return True
# generate tasks
random.seed(42)
tasks = [work(random.randint(1,5)) for x in range(50)]
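Using plain dask:
ProgressBar().register()
dask.compute(*tasks)
This displays a text progress bar in the console while the tasks run.
Using dask.distributed:
client = Client()
futures = client.compute(tasks)
progress(futures)
This displays a progress bar for the submitted futures.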

Why does dask delayed do nothing?

I am using dask to process files line by line. However, dask does not seem to do anything. My code logic is as follows:
import dask
from dask import delayed
from time import sleep
@dask.delayed
def inc(x):
    sleep(1)
    print(x)
def test():
    for i in range(5):
        delayed(inc)(i)
dask.compute(test())
However, nothing is printed to the console. Why?
Your function test does not return anything.
Perhaps you meant to do something like
def test():
    out = []
    for i in range(5):
        out.append(inc(i))
    return out
(Note that you already decorated inc with delayed, so there is no need to call delayed(inc) again.)
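To make the point above concrete, here is a minimal sketch, assuming dask is installed. The name build_tasks is hypothetical, and inc is changed to return a value (instead of sleeping and printing) so that the result of the computation is visible.

```python
import dask


@dask.delayed
def inc(x):
    # Return a value so the effect shows up in the computed result.
    return x + 1


def build_tasks(n):
    # Collect the lazy objects and return them; without the return
    # statement the caller would only ever receive None.
    return [inc(i) for i in range(n)]


# Unpack the list so each delayed object is passed as its own argument.
results = dask.compute(*build_tasks(5))
print(results)
```

dask.compute returns a tuple with one entry per argument, which is why the list of delayed objects is unpacked with * here.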

Is there any fast and efficient way to get abstracts from pubmed?

I would like to download a large set of scientific abstracts, say for about 2000 PubMed IDs. My Python code is sloppy and seems rather slow. Is there a fast and efficient way to harvest these abstracts?
If this is already the fastest method, how do I measure its speed so that I can compare it against other methods, or compare my home and work setups (a different ISP may affect speed)?
My code is attached below.
import sqlite3
from Bio.Entrez import read,efetch,email,tool
from metapub import PubMedFetcher
import pandas as pd
import requests
from datetime import date
import xml.etree.ElementTree as ET
import time
import sys
reload(sys)
sys.setdefaultencoding('utf8')
Abstract_data = pd.DataFrame(columns=["name","pmid","abstract"])
def abstract_download(self,dict_pmids):
    """
    This method returns abstract for a given pmid and add to the abstract data
    """
    index=0
    baseUrl = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    for names in dict_pmids:
        for pmid in dict_pmids[names]:
            try:
                abstract = []
                url = baseUrl+"efetch.fcgi?db=pubmed&id="+pmid+"&rettype=xml"+
                response=requests.request("GET",url,timeout=500).text
                response=response.encode('utf-8')
                root=ET.fromstring(response)
                root_find=root.findall('./PubmedArticle/MedlineCitation/Article/Abstract/')
                if len(root_find)==0:
                    root_find=root.findall('./PubmedArticle/MedlineCitation/Article/ArticleTitle')
                for i in range(len(root_find)):
                    if root_find[i].text != None:
                        abstract.append(root_find[i].text)
                if abstract is not None:
                    Abstract_data.loc[index]=names,pmid,"".join(abstract)
                    index+=1
            except:
                print "Connection Refused"
                time.sleep(5)
                continue
    return Abstract_data
EDIT: The general error this script reports is seemingly a "Connection Refused". See ZF007's answer below for how this was solved.
The code below works. Your script hung on a malformed URL construction. In addition, whenever anything went wrong inside the try block, the script reported a refused connection. That was in fact not the case: the failure came from the code that processed the retrieved data. I've made some adjustments to get the code working for me and left comments where you need to adjust it, because I don't have the dict_pmids list.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys, time, requests, sqlite3
import pandas as pd
import xml.etree.ElementTree as ET
from metapub import PubMedFetcher
from datetime import date
from Bio.Entrez import read,efetch,email,tool
def abstract_download(pmids):
    """
    This method returns abstract for a given pmid and add to the abstract data
    """
    index = 0
    baseUrl = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    collected_abstract = []
    # code below disabled to get general abstract extraction from pubmed working. I don't have the dict_pmid list.
    """
    for names in dict_pmids:
        for pmid in dict_pmids[names]:
    move below working code to the right to get it in place with above two requirements prior to providing dict_pmid list.
    # from here code works up to the next comment. I don't have the dict_pmid list.
    """
    for pmid in pmids:
        print 'pmid : %s\n' % pmid
        abstract = []
        root = ''
        try:
            url = '%sefetch.fcgi?db=pubmed&id=%s&rettype=xml' % (baseUrl, pmid)
            # prints my url, so the line can be pasted into a web browser like firefox.
            print 'url', url
            response = requests.request("GET", url, timeout=500).text
            # check if I got a response.
            print 'response', response
            # response = response.encode('utf-8')
            root = ET.fromstring(response)
        except Exception as inst:
            # besides a refused connection, the "why" behind the failure comes in handy
            # to resolve issues if and when they happen.
            print "Connection Refused", inst
            time.sleep(5)
            continue
        root_find = root.findall('./PubmedArticle/MedlineCitation/Article/Abstract/')
        if len(root_find)==0:
            root_find = root.findall('./PubmedArticle/MedlineCitation/Article/ArticleTitle')
        # check if I found something
        print 'root_find : %s\n\n' % root_find
        for i in range(len(root_find)):
            if root_find[i].text != None:
                abstract.append(root_find[i].text)
        Abstract_data = pd.DataFrame(columns=["name","pmid","abstract"])
        # check if I found something
        #print 'abstract : %s\n' % abstract
        # code works up to the print statement ''abstract', abstract'; the rest is disabled because I don't have the dict_pmid list.
        if abstract is not None:
            # Abstract_data.loc[index] = names,pmid,"".join(abstract)
            index += 1
            collected_abstract.append(abstract)
    # change back to return Abstract_data when the dict_pmid list is provided.
    # return Abstract_data
    return collected_abstract
if __name__ == '__main__':
    sys.stdout.flush()
    reload(sys)
    sys.setdefaultencoding('utf8')
    pubmedIDs = range(21491000, 21491001)
    mydata = abstract_download(pubmedIDs)
    print 'mydata : %s' % (mydata)
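Not part of the answer above, but as a rough sketch of the URL handling it discusses: letting urllib.parse.urlencode assemble the query string avoids stray "+" or "&" characters, and joining several IDs with commas should cut the number of requests. This assumes the efetch endpoint accepts a comma-separated id list and retmode=xml. The names build_efetch_url, extract_abstracts, and SAMPLE are hypothetical; SAMPLE is hand-written and only mimics the layout of a response. It is written for Python 3, unlike the Python 2 script above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(pmids):
    # urlencode inserts the separators itself; safe="," keeps the
    # comma-separated ID list readable in the resulting URL.
    params = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "retmode": "xml",
    }
    return BASE_URL + "?" + urlencode(params, safe=",")


def extract_abstracts(xml_text):
    # Map each PMID to its abstract text; fall back to the article title
    # when a record has no abstract, as the script above does.
    result = {}
    root = ET.fromstring(xml_text)
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("./MedlineCitation/PMID")
        parts = ["".join(node.itertext()) for node in article.iter("AbstractText")]
        if not parts:
            parts = [article.findtext(".//ArticleTitle") or ""]
        result[pmid] = " ".join(parts)
    return result


# Hypothetical sample data, not a real PubMed response.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle><MedlineCitation><PMID>1</PMID><Article>
    <ArticleTitle>First title</ArticleTitle>
    <Abstract><AbstractText>Some <i>marked-up</i> text.</AbstractText></Abstract>
  </Article></MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><PMID>2</PMID><Article>
    <ArticleTitle>Second title</ArticleTitle>
  </Article></MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""

url = build_efetch_url([21491000, 21491001])
abstracts = extract_abstracts(SAMPLE)
```

Using itertext() rather than .text keeps abstract text that sits after inline markup, which the .text attribute alone would drop.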

Clock.schedule_interval doesn't schedule callback

I am testing the functionality of the kivy.clock.Clock.schedule_interval function.
My schedule_interval never calls the test function; instead, the program exits without any errors.
What am I misunderstanding? I modeled this test on the documentation.
from kivy.clock import Clock
class TestClass:
    def __init__(self):
        print("function __init__.")
        schedule = Clock.schedule_interval(self.test, 1)
    def test(self, dt):
        print("function test.")
if __name__ == '__main__':
    a = TestClass()
The expected output should be:
function __init__.
function test.
function test.
function test.
function test.
function test.
function test.
Instead I'm just getting:
function __init__.
The main problem is that your program exits before one second passes. I'm not sure, but I also assume that there has to be a Kivy app running for the Clock to work (I tried an empty while loop instead of running an app, but that didn't help).
Here's an easy fix that gives the desired output:
from kivy.clock import Clock
from kivy.base import runTouchApp
class TestClass:
    def __init__(self, **kwargs):
        print("function __init__.")
        schedule = Clock.schedule_interval(self.test, 1)
    def test(self, dt):
        print("function test.")
if __name__ == '__main__':
    test = TestClass()
    runTouchApp() # run an empty app so the program doesn't close
Otherwise consider making TestClass inherit from kivy's App and running it with TestClass().run() - you will achieve the same result.

Kivy - threads, queues, clocks and Python sockets

I'm brand new to Kivy, and also new to GUI, but not new to programming.
I am completely missing the boat, the canoe, and the airplane on using Kivy.
In 30 years of programming, from machine code, assembly, Fortran, C, C++, Java, Python, I've never tried to use a language such as Kivy whose documentation is this thin, because it's so new. I know it'll get better, but I'm trying to use it now.
In my code, I'm trying to implement queueing, so that I can obtain Python socket data. In normal Python programming, I would have IPC via a Queue: put data in, get data out.
From what I've read in various forums (I can't say I've found it in the documentation at kivy.org), I understand that the following rules apply:
Kivy needs to be in its own thread.
Nothing in Kivy should sleep.
Nothing in Kivy should do blocking IO.
After a LOT of Google searching, the only thing I've found that approaches being useful is an informative note here on Stack Overflow. However, while it almost solves my problem, the answer assumes I know more about Kivy than I do; I don't know how to incorporate the answer.
If someone could take the time to put together a COMPLETE short demo of using that example, or one of your own unique COMPLETE answers, I would much appreciate it!
Here's some short code I put together, but it doesn't work, because it blocks on the get() call.
from Queue import Queue
from kivy.lang import Builder
from kivy.app import App
from kivy.uix.boxlayout import BoxLayout
from kivy.properties import StringProperty
from kivy.clock import Clock
from threading import Thread
class ClockedQueue(BoxLayout):
    text1 = StringProperty("Start")
    def __init__(self):
        super(ClockedQueue,self).__init__()
        self.q = Queue()
        self.i=0
        Clock.schedule_interval(self.get, 2)
    def get(self,dt):
        print("get entry")
        val = self.q.get()
        print(self.i + val)
        self.i += 1
class ClockedQueueApp(App):
    def build(self):
        return ClockedQueue()
class SourceQueue(Queue):
    def __init__(self):
        q = Queue()
        for word in ['First','Second']:
            q.put(word)
        print("SourceQueue finished.")
def main():
    th = Thread(target=SourceQueue)
    th.start()
    ClockedQueueApp().run()
    return 0
if __name__ == '__main__':
    main()
Thanks!
Here's some short code I put together, but it doesn't work, because it blocks on the get() call.
So what you really want to do is get items from your queue in a non-blocking way?
There are multiple ways to do this. The simplest seems to be to just check if the queue has any items before getting one - Queue has several methods that help with this, including checking if it is empty or setting whether get is allowed to be blocking (by setting its first argument to False). If you just do this instead of calling get on its own, you won't block things waiting for the queue to have any items - if it's empty or you can't immediately get anything, you just do nothing.
I don't know what you want to do with the items you get from the queue, but if it's short operations that don't take long then you won't need anything more than this. For instance, you could Clock.schedule_interval the get method to happen every frame, do nothing if the queue is empty, or operate on the data if you get something back. No blocking, and no messing with your own threads.
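A minimal sketch of the non-blocking approach described above, using only the standard library so it runs without Kivy. The helper name drain and its handle parameter are hypothetical; in a Kivy app, something like it could be called from the callback scheduled with Clock.schedule_interval. On Python 2 the module is named Queue rather than queue.

```python
import queue  # named Queue on Python 2


def drain(q, handle):
    # Pull every item that is available right now and return how many
    # were handled. get_nowait() raises queue.Empty instead of blocking.
    handled = 0
    while True:
        try:
            item = q.get_nowait()
        except queue.Empty:
            return handled
        handle(item)
        handled += 1


q = queue.Queue()
for word in ["First", "Second"]:
    q.put(word)

seen = []
first_pass = drain(q, seen.append)   # handles both queued words
second_pass = drain(q, seen.append)  # queue is empty, returns immediately
```

Catching queue.Empty around get_nowait() avoids the gap between checking empty() and calling get(), which matters once another thread is filling the queue.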
You can also create your own thread and run the blocking code in it, which is the general way to deal with blocking issues, especially for tasks that can't be split into short sections that run between frames. I don't know the details of this, but it should just involve using Python threads normally. You can check the source of Kivy's UrlRequest for an example; it can download a web resource in a background thread.
Edit: Also your SourceQueue is messed up (you override its __init__ to make a new queue that you don't store anywhere), and your clock scheduling has a meaningless third argument false which isn't even defined. I don't know what's going on here, it probably affects what you're trying to do, but doesn't matter to my general answer above.
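To illustrate the SourceQueue remark, here is a small sketch, assuming the goal is simply for a background thread and the consumer to share one queue. Instead of subclassing Queue and creating a second, unstored queue inside __init__, the producer receives the shared queue as an argument. The names produce and shared are hypothetical.

```python
import queue
import threading


def produce(q, words):
    # Runs in the background thread and writes to the queue it was given,
    # so the consumer reads from the very same object.
    for word in words:
        q.put(word)


shared = queue.Queue()
worker = threading.Thread(target=produce, args=(shared, ["First", "Second"]))
worker.start()
worker.join()  # only for this demo; a GUI would keep running instead

received = []
while not shared.empty():
    received.append(shared.get_nowait())
```

The join() call is there only so the demo can read everything in one pass; in a GUI the main loop would poll the queue periodically while the thread keeps running.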
I was finally able to create something that worked.
Thanks everyone for your suggestions!
Here's the code (because I'm new, Stack Overflow wouldn't let me post it as an answer to my own question until 5:00 AM):
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# threads_and_kivy.py
#
'''threads_and_kivy.py
Trying to build up a foundation that satisfies the following:
    - has a thread that will implement code that:
        - simulates reading data from a Python socket
        - works on the data
        - puts the data onto a Python Queue
    - has a Kivy mainthread that:
        - via class ShowGUI
            - reads data from the Queue
            - updates a class variable of type StringProperty so it will
              update the label_text property.
'''
from threading import Thread
from Queue import Queue, Empty
import time
from kivy.app import App
from kivy.uix.boxlayout import BoxLayout
from kivy.properties import StringProperty
from kivy.lang import Builder
from kivy.clock import Clock
kv='''
<ShowGUI>:
    Label:
        text: str(root.label_text)
'''
Builder.load_string(kv)
q = Queue()
class SimSocket():
    global q
    def __init__(self, queue):
        self.q = queue
    def put_on_queue(self):
        print("<-----..threaded..SimSocket.put_on_queue(): entry")
        for i in range(10):
            print(".....threaded.....SimSocket.put_on_queue(): Loop " + str(i))
            time.sleep(1)  # just here to simulate an occasional data send
            self.some_data = ["SimSocket.put_on_queue(): Data Loop " + str(i)]
            self.q.put(self.some_data)
        print("..threaded..SimSocket.put_on_queue(): thread ends")
class ShowGUI(BoxLayout):
    label_text = StringProperty("Initial - not data")
    global q
    def __init__(self):
        super(ShowGUI, self).__init__()
        print("ShowGUI.__init__() entry")
        Clock.schedule_interval(self.get_from_queue, 1.0)
    def get_from_queue(self, dt):
        print("---------> ShowGUI.get_from_queue() entry")
        try:
            queue_data = q.get(timeout = 5)
            self.label_text = queue_data[0]
            for qd in queue_data:
                print("SimKivy.get_from_queue(): got data from queue: " + qd)
        except Empty:
            print("Error - no data received on queue.")
            print("Unschedule Clock's schedule")
            Clock.unschedule(self.get_from_queue)
class KivyGui(App):
    def build(self):
        return ShowGUI()
def main():
    global q
    ss = SimSocket(q)
    simSocket_thread = Thread(name="simSocket",target=ss.put_on_queue)
    simSocket_thread.start()
    print("Starting KivyGui().run()")
    KivyGui().run()
    return 0
if __name__ == '__main__':
    main()
