Error in ChatterBot when training from the chatterbot.corpus.english path

I am new to Python and want to learn more, so I am getting my feet wet with a simple chatbot.
I am running into errors when debugging the program.
import time
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

time.clock = time.time

chatbot = ChatBot('Bob')  # <------ Here (if I don't use the time import and the patch above,
                          #         the error "'time' has no attribute 'clock'" is thrown)
trainer = ChatterBotCorpusTrainer(chatbot)
trainer.train(
    'chatterbot.corpus.english'  # <----- Here I get: Exception has occurred: AttributeError:
                                 #        module 'collections' has no attribute 'Hashable'
)
bot = ChatBot(
    'Bob',
    storage_adapter='chatterbot.storage.SQLStorageAdapter',
    database_uri='sqlite:///database.sqlite3'
)

bot = ChatBot(
    'Bob',
    logic_adapters=[
        'chatterbot.logic.BestMatch',
        'chatterbot.logic.MathematicalEvaluation',
        'chatterbot.logic.TimeLogicAdapter'
    ],
)
response = bot.get_response('Greetings.')
print("Bot Response:", response)

name = input("Enter Your Name: ")
print("Welcome to the Bot Service! Let me know how can I help you?")

while True:
    request = input(name + ':')
    if request == 'Bye' or request == 'bye':
        print('Bot: Bye')
        break
    else:
        response = bot.get_response(request)
        print('Bot:', response)
So far I am trying to create my own corpus file on my computer, but I am not sure whether it will throw the same error, or whether this is a version-incompatibility issue. I am new here; any help would be great.
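For reference, this is the compatibility shim I am considering trying next, assuming the second error comes from collections.Hashable having moved to collections.abc in newer Python versions (I have not tested this yet):

import collections
import collections.abc
import time

# Hypothetical workaround: older ChatterBot releases reference attributes that
# newer Python versions moved or removed, so patch them back before importing.
if not hasattr(collections, 'Hashable'):
    collections.Hashable = collections.abc.Hashable
time.clock = time.time  # time.clock was removed in Python 3.8

from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

chatbot = ChatBot('Bob')
trainer = ChatterBotCorpusTrainer(chatbot)
trainer.train('chatterbot.corpus.english')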
Thanks

Related

Vertex AI - Deployment failed

I'm trying to deploy my custom-trained model using a custom-container, i.e. create an endpoint from a model that I created.
I'm doing the same thing with AI Platform (same model & container) and it works fine there.
On the first try I deployed the model successfully, but ever since, whenever I try to create an endpoint it says "deploying" for over an hour and then fails with the following error:
google.api_core.exceptions.FailedPrecondition: 400 Error: model server never became ready. Please validate that your model file or container configuration are valid. Model server logs can be found at (link)
The log shows the following:
* Running on all addresses (0.0.0.0)
WARNING: This is a development server. Do not use it in a production deployment.
* Running on http://127.0.0.1:8080
[05/Jul/2022 12:00:37] "GET /v1/endpoints/1/deployedModels/2025850174177280000 HTTP/1.1" 404 -
[05/Jul/2022 12:00:38] "GET /v1/endpoints/1/deployedModels/2025850174177280000 HTTP/1.1" 404 -
Where the last line is being spammed until it ultimately fails.
My flask app is as follows:
import base64
import os.path
import pickle
from typing import Dict, Any

from flask import Flask, request, jsonify
from streamliner.models.general_model import GeneralModel


class Predictor:
    def __init__(self, model: GeneralModel):
        self._model = model

    def predict(self, instance: str) -> Dict[str, Any]:
        decoded_pickle = base64.b64decode(instance)
        features_df = pickle.loads(decoded_pickle)
        prediction = self._model.predict(features_df).tolist()
        return {"prediction": prediction}


app = Flask(__name__)

with open('./model.pkl', 'rb') as model_file:
    model = pickle.load(model_file)

predictor = Predictor(model=model)


@app.route("/predict", methods=['POST'])
def predict() -> Any:
    if request.method == "POST":
        instance = request.get_json()
        instance = instance['instances'][0]
        predictions = predictor.predict(instance)
        return jsonify(predictions)


@app.route("/health")
def health() -> str:
    return "ok"


if __name__ == '__main__':
    port = int(os.environ.get("PORT", 8080))
    app.run(host='0.0.0.0', port=port)
The deployment code which I do through Python is irrelevant because the problem persists when I deploy through GCP's UI.
The model creation code is as follows:
def upload_model(self):
    model = {
        "name": self.model_name_on_platform,
        "display_name": self.model_name_on_platform,
        "version_aliases": ["default", self.run_id],
        "container_spec": {
            "image_uri": f'{REGION}-docker.pkg.dev/{GCP_PROJECT_ID}/{self.repository_name}/{self.run_id}',
            "predict_route": "/predict",
            "health_route": "/health",
        },
    }
    parent = self.model_service_client.common_location_path(project=GCP_PROJECT_ID, location=REGION)
    model_path = self.model_service_client.model_path(project=GCP_PROJECT_ID,
                                                      location=REGION,
                                                      model=self.model_name_on_platform)
    upload_model_request_specifications = {'parent': parent, 'model': model,
                                           'model_id': self.model_name_on_platform}
    try:
        print("trying to get model")
        self.get_model(model_path=model_path)
    except NotFound:
        print("didn't find model, creating a new one")
    else:
        print("found an existing model, creating a new version under it")
        upload_model_request_specifications['parent_model'] = model_path

    upload_model_request = model_service.UploadModelRequest(upload_model_request_specifications)
    response = self.model_service_client.upload_model(request=upload_model_request, timeout=1800)
    print("Long running operation:", response.operation.name)
    upload_model_response = response.result(timeout=1800)
    print("upload_model_response:", upload_model_response)
My problem is very close to this one with the difference that I do have a health check.
Why would it work on the first deployment and fail ever since? Why would it work on AI Platform but fail on Vertex AI?
This issue could be due to different reasons:

- Validate the container configuration port; it should use port 8080. This configuration is important because Vertex AI sends liveness checks, health checks, and prediction requests to this port on the container (see the sketch after this list). You can see this document about containers, and this other one about custom containers.
- Another possible reason is quota limits, which may need to be increased. You can verify this using this document.
- In the health and predict routes, use the MODEL_NAME you are using, like this example:
  "predict_route": "/v1/models/MODEL_NAME:predict",
  "health_route": "/v1/models/MODEL_NAME",
- Validate that the account you are using has enough permissions to read your project's GCS bucket.
- Validate the model location; it should be the correct path.

If none of the suggestions above work, you will need to contact GCP Support by creating a support case. It is impossible for the community to troubleshoot this without using internal GCP resources.
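To illustrate the port and route points above, here is a minimal server sketch. Reading the port and routes from the AIP_* environment variables is the documented custom-container contract, but the exact wiring below is an assumption, not the asker's code:

import os
from flask import Flask, request, jsonify

app = Flask(__name__)

# Vertex AI injects these variables into the container; fall back to the
# question's values when running locally.
AIP_HTTP_PORT = int(os.environ.get("AIP_HTTP_PORT", 8080))
AIP_HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
AIP_PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")


@app.route(AIP_HEALTH_ROUTE, methods=["GET"])
def health():
    # Health checks must return 200 here, otherwise the deployment fails
    # with "model server never became ready".
    return "ok", 200


@app.route(AIP_PREDICT_ROUTE, methods=["POST"])
def predict():
    instances = request.get_json()["instances"]
    # Placeholder logic; plug in the real predictor here.
    return jsonify({"predictions": [len(i) for i in instances]})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=AIP_HTTP_PORT)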
In case you haven't yet found a solution, you can try out custom prediction routines. They are really helpful as they strip away the need to write the server part of the code and let you focus solely on the logic of your ML model and any kind of pre- or post-processing. Here is the link to help you out: https://codelabs.developers.google.com/vertex-cpr-sklearn#0. Hope this helps.
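As a rough illustration of the shape of a custom prediction routine: the class and method names below follow the linked codelab, but treat them as assumptions and check them against the current google-cloud-aiplatform documentation before use.

# Sketch of a custom predictor for a CPR-based image; names are assumptions
# taken from the linked codelab, not verified against the asker's setup.
import pickle

from google.cloud.aiplatform.prediction.predictor import Predictor
from google.cloud.aiplatform.utils import prediction_utils


class MyPredictor(Predictor):
    def load(self, artifacts_uri: str) -> None:
        # Download the model artifact (e.g. model.pkl) and load it.
        prediction_utils.download_model_artifacts(artifacts_uri)
        with open("model.pkl", "rb") as f:
            self._model = pickle.load(f)

    def predict(self, instances):
        # Vertex AI handles the HTTP server; only the prediction logic lives here.
        return {"predictions": self._model.predict(instances).tolist()}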

Setting timezone in AsyncIOScheduler

I'm in the Pacific timezone and I'm creating a discord bot to send a message at 8am in CENTRAL time.
import os

import discord
from discord.ext import commands
from dotenv import load_dotenv
from rich import print
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.triggers.cron import CronTrigger

load_dotenv()
TOKEN = os.getenv('DISCORD_TOKEN')

intents = discord.Intents.default()
intents.members = True

bot = commands.Bot(command_prefix='!', intents=intents)


# Will become the good morning message
async def gm():
    c = bot.get_channel(channel_id_removed)
    await c.send("This will be the good morning message.")


@bot.event
async def on_ready():
    for guild in bot.guilds:
        print(
            f'{bot.user} is connected to the following guild:\n'
            f'\t{guild.name} (id: {guild.id})'
        )


# Initializing scheduler for time-of-day sending
scheduler = AsyncIOScheduler()

# Attempts to set the timezone
# scheduler = AsyncIOScheduler(timezone='America/Chicago')
# scheduler = AsyncIOScheduler({'apscheduler.timezone': 'America/Chicago'})
# scheduler.configure(timezone='America/Chicago')

# Set the time for sending
scheduler.add_job(gm, CronTrigger(hour="6", minute="0", second="0"))

# Starting the scheduler
scheduler.start()


@bot.event
async def on_member_join(member):
    general_channel = None
    guild_joined = member.guild
    print(guild_joined)
    general_channel = discord.utils.get(guild_joined.channels, name='general')
    print(f'General Channel ID: {general_channel}')
    if general_channel:
        embed = discord.Embed(title="Welcome!", description=f"Welcome to The Dungeon {member.mention}!!")
        await general_channel.send(embed=embed)


bot.run(TOKEN)
Environment:
Windows 10
Python 3.10.4
APScheduler 3.9.1
pytz 2022.1
pytz-deprecation-shim 0.1.0.post0
tzdata 2022.1
tzlocal 4.2
I'm just wondering if I'm doing something wrong? Or if what I'm trying to do simply isn't supported? It works if I use my local time so I know the function is ok.
You are using the asyncio scheduler but you're not running an asyncio event loop, so there is no way this could work. Copy/paste from the provided example:
from datetime import datetime
import asyncio
import os

from apscheduler.schedulers.asyncio import AsyncIOScheduler


def tick():
    print('Tick! The time is: %s' % datetime.now())


if __name__ == '__main__':
    scheduler = AsyncIOScheduler()
    scheduler.add_job(tick, 'interval', seconds=3)
    scheduler.start()
    print('Press Ctrl+{0} to exit'.format('Break' if os.name == 'nt' else 'C'))

    # Execution will block here until Ctrl+C (Ctrl+Break on Windows) is pressed.
    try:
        asyncio.get_event_loop().run_forever()
    except (KeyboardInterrupt, SystemExit):
        pass
The reason it is not working is because, while scheduler.start() instantiates an event loop as a side effect, it expects the loop to be run elsewhere so that the scheduler can do its work.
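In a discord.py bot the event loop is already running once bot.run() is called, so one way to apply this is to create and start the scheduler inside on_ready. A minimal sketch, assuming discord.py 2.x and APScheduler's documented timezone parameter (not tested against the asker's exact setup):

import discord
from discord.ext import commands
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.triggers.cron import CronTrigger

intents = discord.Intents.default()
bot = commands.Bot(command_prefix='!', intents=intents)


async def gm():
    channel = bot.get_channel(123456789)  # placeholder channel id
    await channel.send("Good morning!")


@bot.event
async def on_ready():
    # The bot's event loop is running by the time on_ready fires, so the
    # scheduler started here has a loop to attach to. Note that on_ready can
    # fire again after reconnects, so a guard against double-starting may be needed.
    scheduler = AsyncIOScheduler(timezone='America/Chicago')
    scheduler.add_job(gm, CronTrigger(hour=8, minute=0, timezone='America/Chicago'))
    scheduler.start()


bot.run('TOKEN')  # placeholder token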

Unable to run rasa agent inside rasa core server

I am trying to load and run a Rasa model inside my NLU server in Rasa 3; however, after loading the model with the Agent I am unable to perform inference with it.
@DefaultV1Recipe.register(
    [DefaultV1Recipe.ComponentType.INTENT_CLASSIFIER], is_trainable=False
)
class MyCustomComponent(GraphComponent, EntityExtractorMixin):
    def __init__(self, config):
        model_path = "model_path"
        self.model = Agent.load(model_path=model_path)

    def process(self, messages):
        for message in messages:
            result = self.model.parse_message(message.get("text"))
            message.set(
                "my_field",
                result.get("intent"),
                add_to_output=True,
            )
        return messages
Every time the parse_message method executes, it returns a coroutine, and I am not sure how to extract the results from it.
And if I try to go via asyncio.get_running_loop() and the loop.run_until_complete method, I get the following error:
asyncio.run() cannot be called from a running event loop
Any ideas on how this problem can be solved?
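For context, this is the kind of workaround I am experimenting with. It is untested, and using nest_asyncio for this is my own assumption rather than anything from the Rasa docs:

# Hypothetical workaround: patch the already-running event loop so that
# run_until_complete can be called from synchronous component code.
import asyncio
import nest_asyncio

nest_asyncio.apply()


def parse_sync(agent, text):
    # agent.parse_message(...) returns a coroutine; run it on the patched loop.
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(agent.parse_message(text))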
Thanks!

Is there any fast and efficient way to get abstracts from PubMed?

I would like to download abstract data in bulk for, let's say, about 2000 PubMed IDs. My Python code is sloppy and seems rather slow. Is there any fast and efficient method to harvest these abstracts?
If this is the fastest method, how do I measure it so that I can compare it against others, or against a home-versus-work situation (a different ISP may play a part in speed)?
My code is attached below.
import sqlite3
from Bio.Entrez import read,efetch,email,tool
from metapub import PubMedFetcher
import pandas as pd
import requests
from datetime import date
import xml.etree.ElementTree as ET
import time
import sys

reload(sys)
sys.setdefaultencoding('utf8')

Abstract_data = pd.DataFrame(columns=["name","pmid","abstract"])

def abstract_download(self,dict_pmids):
    """
    This method returns abstract for a given pmid and add to the abstract data
    """
    index=0
    baseUrl = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    for names in dict_pmids:
        for pmid in dict_pmids[names]:
            try:
                abstract = []
                url = baseUrl+"efetch.fcgi?db=pubmed&id="+pmid+"&rettype=xml"+
                response=requests.request("GET",url,timeout=500).text
                response=response.encode('utf-8')
                root=ET.fromstring(response)
                root_find=root.findall('./PubmedArticle/MedlineCitation/Article/Abstract/')
                if len(root_find)==0:
                    root_find=root.findall('./PubmedArticle/MedlineCitation/Article/ArticleTitle')
                for i in range(len(root_find)):
                    if root_find[i].text != None:
                        abstract.append(root_find[i].text)
                if abstract is not None:
                    Abstract_data.loc[index]=names,pmid,"".join(abstract)
                    index+=1
            except:
                print "Connection Refused"
                time.sleep(5)
                continue
    return Abstract_data
EDIT: The general error that occurs for this script is seemingly a "Connection Refused". See ZF007's answer below for how this was solved.
The code below works. Your script hung on malformed URL construction. Also, whenever anything went wrong inside the script, the reported response was a refused connection; that was in fact not the case, because it was the code processing the retrieved data that failed. I've made some adjustments to get the code working for me and left comments in place where you need to adjust things, since I don't have the dict_pmids list.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys, time, requests, sqlite3
import pandas as pd
import xml.etree.ElementTree as ET

from metapub import PubMedFetcher
from datetime import date
from Bio.Entrez import read,efetch,email,tool


def abstract_download(pmids):
    """
    This method returns abstract for a given pmid and add to the abstract data
    """
    index = 0
    baseUrl = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    collected_abstract = []

    # Code below disabled to get general abstract extraction from PubMed working. I don't have the dict_pmids list.
    """
    for names in dict_pmids:
        for pmid in dict_pmids[names]:

    Move the working code below to the right to get it in place with the above two loops prior to providing the dict_pmids list.
    # From here the code works up to the next comment. I don't have the dict_pmids list.
    """

    for pmid in pmids:

        print 'pmid : %s\n' % pmid

        abstract = []
        root = ''

        try:
            url = '%sefetch.fcgi?db=pubmed&id=%s&rettype=xml' % (baseUrl, pmid)
            # Check my URL... a line to paste into a web browser like Firefox.
            print 'url', url

            response = requests.request("GET", url, timeout=500).text
            # Check if I got a response.
            print 'response', response

            # response = response.encode('utf-8')
            root = ET.fromstring(response)

        except Exception as inst:
            # Besides a refused connection... the "why" it wasn't connected comes in handy
            # to resolve issues at hand if and when they happen.
            print "Connection Refused", inst
            time.sleep(5)
            continue

        root_find = root.findall('./PubmedArticle/MedlineCitation/Article/Abstract/')

        if len(root_find) == 0:
            root_find = root.findall('./PubmedArticle/MedlineCitation/Article/ArticleTitle')

        # Check if I found something.
        print 'root_find : %s\n\n' % root_find

        for i in range(len(root_find)):
            if root_find[i].text != None:
                abstract.append(root_find[i].text)

        Abstract_data = pd.DataFrame(columns=["name","pmid","abstract"])

        # Check if I found something.
        # print 'abstract : %s\n' % abstract

        # Code works up to the print statement 'abstract', abstract; the rest is disabled because I don't have the dict_pmids list.
        if abstract is not None:
            # Abstract_data.loc[index] = names,pmid,"".join(abstract)
            index += 1
            collected_abstract.append(abstract)

    # Change back to "return Abstract_data" when the dict_pmids list is administered.
    # return Abstract_data
    return collected_abstract


if __name__ == '__main__':
    sys.stdout.flush()

    reload(sys)
    sys.setdefaultencoding('utf8')

    pubmedIDs = range(21491000, 21491001)

    mydata = abstract_download(pubmedIDs)

    print 'mydata : %s' % (mydata)
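As an aside on speed: the E-utilities efetch endpoint accepts a comma-separated list of IDs, so fetching abstracts in batches of a few hundred per request is usually much faster than one request per PMID. A minimal sketch using Bio.Entrez (Python 3; the batch size and email address are placeholders):

from Bio import Entrez

Entrez.email = "you@example.com"  # placeholder; NCBI asks for a contact address


def fetch_abstracts(pmids, batch_size=200):
    """Fetch plain-text abstracts for a list of PubMed IDs, batched per request."""
    abstracts = []
    for start in range(0, len(pmids), batch_size):
        batch = pmids[start:start + batch_size]
        handle = Entrez.efetch(db="pubmed", id=",".join(map(str, batch)),
                               rettype="abstract", retmode="text")
        abstracts.append(handle.read())
        handle.close()
    return abstracts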

What are the current dependencies for a standalone GORM script?

I am trying to follow Graeme Rocher's example from GitHub:
https://gist.github.com/graemerocher/c25ec929d9bcd1adcbea
#Grab("org.grails:grails-datastore-gorm-hibernate4:3.0.0.RELEASE")
#Grab("org.grails:grails-spring:2.3.6")
#Grab("com.h2database:h2:1.3.164")
import grails.orm.bootstrap.*
import grails.persistence.*
import org.springframework.jdbc.datasource.DriverManagerDataSource
import org.h2.Driver
init = new HibernateDatastoreSpringInitializer(Person)
def dataSource = new DriverManagerDataSource(Driver.name, "jdbc:h2:prodDb;MVCC=TRUE;LOCK_TIMEOUT=10000;DB_CLOSE_ON_EXIT=FALSE", 'sa', '')
init.configureForDataSource(dataSource)
new Person(name: "Fred Flintstone").save(flush: true, failOnError: true)
println "Total people = ${Person.count()}"
#Entity
class Person {
String name
static constraints = {
name blank:false
}
}
I am getting
java.lang.RuntimeException: Error grabbing Grapes -- [download failed: com.googlecode.concurrentlinkedhashmap#concurrentlinkedhashmap-lru;1.3.1!concurrentlinkedhashmap-lru.jar, download failed: javax.transaction#jta;1.1!jta.jar, download failed: org.jboss.logging#jboss-logging;3.1.3.GA!jboss-logging.jar, download failed: org.javassist#javassist;3.18.1-GA!javassist.jar(bundle)]
I presume that this means that some set of dependencies has changed/gone away.
Is there a current working version of this code?
I had already tried using @GrabResolver() with various URLs (I should have said so, apologies).
I tried Jeff Beck's answer, but to no avail.
I went through a bit of a debugging cycle by setting a few debug flags to see what was going on (-Dgroovy.grape.report.downloads=true -Divy.message.logger.level=4).
Eventually I had to create a custom ~/.groovy/grapeConfig.xml (as per http://groovy.codehaus.org/Grape#Grape-CustomizeIvysettings) and add the Maven Central URL Jeff gave as a further 'ibiblio' entry, as sketched below.
THEN all was fine.
I don't know why @GrabResolver() isn't resolving, but this is a workaround.
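For reference, a sketch of what that grapeConfig.xml can look like: the stock Grape/Ivy settings layout with one extra ibiblio entry pointing at Maven Central (illustrative contents, not a verbatim copy of my file):

<ivysettings>
  <settings defaultResolver="downloadGrapes"/>
  <resolvers>
    <chain name="downloadGrapes" returnFirst="true">
      <filesystem name="cachedGrapes">
        <ivy pattern="${user.home}/.groovy/grapes/[organisation]/[module]/ivy-[revision].xml"/>
        <artifact pattern="${user.home}/.groovy/grapes/[organisation]/[module]/[type]s/[artifact]-[revision](-[classifier]).[ext]"/>
      </filesystem>
      <!-- extra entry pointing at Maven Central -->
      <ibiblio name="mvncentral" root="http://central.maven.org/maven2/" m2compatible="true"/>
      <ibiblio name="ibiblio" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>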
I got the following to work fine:
@GrabResolver(name='mvncentral', root='http://central.maven.org/maven2/')
@Grab("org.grails:grails-datastore-gorm-hibernate4:3.1.1.RELEASE")
@Grab("org.grails:grails-spring:2.4.3")
@Grab("com.h2database:h2:1.3.164")
import grails.orm.bootstrap.*
import grails.persistence.*
import org.springframework.jdbc.datasource.DriverManagerDataSource
import org.h2.Driver

init = new HibernateDatastoreSpringInitializer(Person)
def dataSource = new DriverManagerDataSource(Driver.name, "jdbc:h2:prodDb;MVCC=TRUE;LOCK_TIMEOUT=10000;DB_CLOSE_ON_EXIT=FALSE", 'sa', '')
init.configureForDataSource(dataSource)

println "Total people = " + Person.count()

@Entity
class Person {
    String name
    static constraints = {
        name blank:false
    }
}
I added a resolver different from the default jcenter one that ships with the latest version of Groovy. You may also need to clear cached versions of what Grape previously tried to download; here is a blog post about that.
