NLP for extracting actions from text

I'm hoping somebody can point me in the right direction to learn about separating out actions from a bunch of text.
Suppose I have this text
Drop off the dry cleaning, and go to the corner store and pick-up a jug of milk and get a pint of strawberries.
Then, go pick up the kids from school. First, get John who is in the daycare next to the library, and then get Sam who is two blocks away.
By the time you've got the kids, you'll need to stop by the doctor's office for the prescription. Tim's flight arrives at 4pm.
It's American Airlines flight 331 arriving from Dallas. It will be getting close to rush hour, so make sure you leave yourself enough time.
I'm trying to have it split up into
Drop off the dry cleaning,
and go to the corner store and pick-up a jug of milk and get a pint of strawberries.
Then, go pick up the kids from school. First, get John who is in the daycare next to the library, and then get Sam who is two blocks away.
By the time you've got the kids, you'll need to stop by the doctor's office for the prescription.
Tim's flight arrives at 4pm.
It's American Airlines flight 331 arriving from Dallas. It will be getting close to rush hour, so make sure you leave yourself enough time.
I haven't been able to find anything in my searches that is specifically action-based. It would need to be smarter than just picking out verbs, as there are sometimes multiple verbs associated with one action: for instance, the second item has 'go', 'pick-up', and 'get', but that is all part of a single action. Of course, 'Tim's flight' only suggests an action via the present participle, with the verb coming toward the end of the segment.
Any suggestions on where to look to do this kind of thing? Things to watch out for, recommended readings, etc.?

Simple approach: parse the text using [your favorite parser], then select the sentences or SBAR phrases that are in the imperative mood. The Stanford Parser just so happens to have "Improved recognition of imperatives" in its very latest release.
There's probably no need for machine learning beyond what is already incorporated in standard parser programs.
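For illustration, here is a minimal sketch of that idea, assuming spaCy and its small English model rather than the Stanford Parser; it only uses a rough "base-form verb with no explicit subject" test, not the parser's own imperative recognition.

```python
# Rough heuristic sketch: treat a sentence as imperative when its root verb
# is in base form (tag VB) and has no explicit subject.
import spacy

nlp = spacy.load("en_core_web_sm")

def imperative_sentences(text):
    commands = []
    for sent in nlp(text).sents:
        root = sent.root
        has_subject = any(tok.dep_ in ("nsubj", "nsubjpass") for tok in root.children)
        if root.pos_ == "VERB" and root.tag_ == "VB" and not has_subject:
            commands.append(sent.text)
    return commands

print(imperative_sentences(
    "Drop off the dry cleaning, and go to the corner store. "
    "Tim's flight arrives at 4pm."
))
# Keeps the first sentence, skips "Tim's flight arrives at 4pm."
```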

This domain is called Information Extraction.
The general approach to sentence understanding is to either:
extract a Part-of-Speech-tagged parse tree (Python spaCy.io, NLTK, CoreNLP, etc.), as sketched below
extract a word vector (e.g. word2vec)
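A minimal example of the first approach, assuming spaCy with its small English model installed; it just prints the POS tags and dependency arcs that an extraction rule would work over (models shipped with word vectors also expose a word2vec-style vector as token.vector).

```python
# Dump the Part-of-Speech tags and dependency parse that an action-extraction
# rule would operate on.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Go to the corner store and pick up a jug of milk.")

for token in doc:
    # word, coarse POS, fine-grained tag, dependency label, and syntactic head
    print(f"{token.text:10} {token.pos_:6} {token.tag_:4} {token.dep_:10} head={token.head.text}")
```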

Related

Twilio Studio Gather widget not detecting voice input

I am using a Twilio Studio flow to build an IVR, and I don't want to miss a single command from the customer. When the caller says a full sentence, the Gather Input widget works, but when they say a single word like "sales", the widget does not detect any speech and triggers "no input". Can someone suggest how to use the Gather Input widget in our flow so that it detects even a single word? I have used hints and set the language as well. I also tried speech models like "numbers and commands", but I am not sure how to use them.
I am trying to detect single words from customers in Twilio Studio as well.
Waiting for an answer, thanks.
You're not going to be able to do both well; you need to direct your user to either say something short or say something long. I recommend you go with short for best results. Additionally, if you don't want to miss anything your customer said, you might want to record your call and do post-call analysis to see if there was anything important there. Honestly, if this is your first time introducing speech to your customers, do this:
"Thank you for calling X, how may I help you?"
Customer says whatever.
"Let's try this a different way for X press 1, for Y press 2."
This helps you better understand what your customers would normally ask for in their own words, and then you'll have a better idea of whether you really need to capture long sentences or short words.
david
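For what it's worth, here is a rough sketch of that two-step flow using the Twilio Python helper library to emit TwiML (the equivalent settings exist as options on the Studio Gather widget). The /handle-speech and /handle-keypress URLs are illustrative, so double-check the details against the current Twilio docs.

```python
# Sketch: a short open-ended speech prompt first, then a keypad fallback menu
# so a missed utterance doesn't end the call.
from twilio.twiml.voice_response import Gather, VoiceResponse

response = VoiceResponse()

# Attempt 1: speech input with hints for the single words we care about.
speech = Gather(input="speech", hints="sales, support, billing",
                language="en-US", timeout=3, action="/handle-speech")
speech.say("Thank you for calling. How may I help you?")
response.append(speech)

# Fallback: plain DTMF menu, reached if no speech result comes back.
menu = Gather(input="dtmf", num_digits=1, action="/handle-keypress")
menu.say("Let's try this a different way. For sales press 1, for support press 2.")
response.append(menu)

print(response)  # prints the generated TwiML
```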

How to import all English words to a set in Swift/Xcode?

I'm just learning and working through the Apple Xcode/Swift guide, and am currently working on the "Apple Pie" project (in case anyone is familiar). If you're not, it's a "hangman" style game, where you guess the letters in a word each round, and have a total number of guesses before you lose. The guide asks you to add your own list of words to an array, but that feels tedious and boring. I know this is just a guide to help me learn, but I think it'd be a lot more fun to pull a random word from a set of "all" English words (at least several thousand) to guess from.
How would I go about importing a set of words like this, to where I don't have to type them all hard coded? I found references to an "npm" that seems to contain what I'm looking for, but have no idea what an npm is or how to add it to my program as a searchable set.
The best thing you can do is make a request to a website where many English words are stored, for example: http://www.mieliestronk.com/corncob_caps.txt. Create a file inside your app with all of those words and then build an array in code that you can choose from randomly.

Autoformat Text with Machine Learning

I am currently working on an issue regarding optimizing the workflow of an agency.
The agency receives around 30-40 PDF/Word documents, which should be converted into InDesign files that will be printed in newspapers. It's always the same pattern: job adverts with a logo, the job position, and some text.
Every week the same customers send us their adverts. Our employees usually take the layouts of the existing files and copy-paste the new text.
We apply some fixed formatting rules, like not letting words break across lines and keeping a set distance between the job title and the first paragraph. One important thing is to keep the height as small as possible in order to reduce costs for our clients. Because many of our employees are new or work part-time, we face a lot of turnover; therefore we want to standardize the process so that new adverts only require small changes. I guess you know what I mean.
Do you see a possibility to improve the process, for example with NLTK? I'm thinking of training an algorithm which recognizes the job title, bullet points, logo, etc. and automatically proposes a formatting for the text.
A colleague told me to just write a script which formats the InDesign document.
What do you think? Thanks so far.
Here is a brief example: [example picture]

Rails: How to query records in different languages

I have a Rails inventory app that is available to global users, allowing them to enter their own inventory information and query those of others.
a British person in London adds 10 units of "bicycle" to the inventory table
a Japanese person adds 2 units of 自転車 (bicycle in Japanese)
a Vietnamese person adds 5 units of xe dap (bicycle in Vietnamese)
The British person can query 'bicycle' and it will output all bicycles in the system (17 units) and can show the details of each in their original language, without the users classifying them beforehand. Likewise, the Japanese person can query '自転車', which will show all bicycles.
How can this be done?
The globalize gem requires users to manually translate each record so it's not the correct way. I've heard about machine learning and deep learning but I don't know if it's the right solution for this.
If Stack Overflow is not the right place to ask this, where should I ask? Quora does not allow long questions.
Machine learning does not seem like a proper solution in this context, since you don't have enough experience with it and it's a complex field to just start with and learn enough of to apply to a real-life problem.
Here are a few solutions you could implement today; as long as you understand the requirements and the ups and downs of each, you will have to figure those out by yourself.
Since I don't have enough information about your system, I'll try to generalize it to something that's likely.
Solutions:
1. Define a limited number of items for your system, like Bike, and add them to a config file or an items database, each item having its own unique id; when users have to add something, they will have to select from your list. Have an "Other" item as a catch-all, and maybe provide a note field so users can add whatever is needed to identify the item.
2. Similar to the above solution, but you give users a way to add new items into the system: you have 10 standard items, and every user can add items to the site (subject to moderation) that other users will then have access to.
3. Have a solid search system in place, like Elasticsearch (or anything else); when users create items, you index each item in the language it was entered in, then use the Google translation API (or another translation service) to translate it into all the languages you need and index those for search as well (a rough sketch of this follows below).
I think solution 1 is the best if you are able to implement it, followed by solution 2.
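As a very rough sketch of solution 3 (Python and a recent elasticsearch-py client are used purely for illustration; the same flow applies from a Rails app, translate() is a placeholder for whatever translation service you call, and the index and field names are made up for the example):

```python
# Index each item together with machine translations of its name, so a search
# in any supported language can match it.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
LANGUAGES = ["en", "ja", "vi"]

def translate(text, target_language):
    # Placeholder: call the Google translation API (or another service) here.
    raise NotImplementedError

def index_item(item_id, name, quantity):
    doc = {
        "quantity": quantity,
        "names": {lang: translate(name, lang) for lang in LANGUAGES},
    }
    es.index(index="inventory", id=item_id, document=doc)

def search_items(query):
    # Match the query against every translated name field.
    return es.search(index="inventory", query={
        "multi_match": {
            "query": query,
            "fields": [f"names.{lang}" for lang in LANGUAGES],
        }
    })
```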

Open Ears API says every sound it hears is a word, even a cough

I am trying to use OpenEars for a small part of my app. I have three or four keywords that I want to be able to "listen" for, something like "Add", "Subtract", etc. I am just using the sample app found here. I want to have a special case in the app when I hear "Add", etc., as opposed to a word that is not one of my four keywords. Right now I set my language to be only the four keywords, but whenever the OpenEars API hears anything, it picks between my four keywords. So if I cough, it picks the closest word out of the four.
How can I listen for a specific word without always choosing one of the keywords?
I was thinking I could have a whole bunch of words, a few hundred, and just check which word was spoken, with a special case for my four keywords, but I don't want to have to type out each word. Does OpenEars provide any default languages?
OpenEars developer here. Check out the dynamic grammar generation API that was just added in OpenEars 1.7 which may provide the right results for your requirements: http://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/
This approach might be more suitable for keyword detection and detection of fixed phrases. Please bring further questions to the OpenEars forums if you'd like to troubleshoot them with me.
