Approximate matching in VoiceXML

I don't know if I can get an answer here... The problem I am trying to solve is: the system listens to the user's input and judges whether that input contains the word "loop".
Does VoiceXML support grammars for this kind of task? It seems that a grammar can only pick out words from a fixed list. The user can say:
using a loop, loop, for loop, looping through the array, ...
Is there a way for me to only consider whether the sentence contains "loop"?
Thanks in advance.

You can create your own grammar and attach it to your field:
<field name="loopField">
  <prompt>What's your way to say loop?</prompt>
  <grammar src="mygrammar.gram" type="application/srgs+xml" />
  <help> Please say a phrase containing the word loop. </help>
</field>
See the W3C SRGS grammar specification for more details.
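One common way to do keyword spotting in an SRGS grammar is to surround the keyword with the special GARBAGE rule, which matches and discards any filler speech. Below is a sketch of what mygrammar.gram might contain; note that support for GARBAGE varies between voice platforms, so test on your own browser, and that this matches only the exact token "loop" (variants like "looping" would have to be listed as explicit alternatives).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" root="containsLoop" xml:lang="en-US">
  <rule id="containsLoop" scope="public">
    <!-- Optionally match any filler speech before the keyword. -->
    <item repeat="0-1"><ruleref special="GARBAGE"/></item>
    <item>loop</item>
    <!-- Optionally match any filler speech after the keyword. -->
    <item repeat="0-1"><ruleref special="GARBAGE"/></item>
  </rule>
</grammar>
```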

Related

Machine learning: which algorithm fits question answering?

I want to build an ML program that talks to the user and gets some input from them.
The ML program analyzes the input data (keywords), then predicts the best solution.
So, you are looking at an AI application which needs some sort of machine intelligence for processing natural language.
Let us say the language of choice here is English. There are many things to be considered before building such a system.
Dependency parsing
Word Sense Disambiguation
Verb Sense Disambiguation
Coreference Resolution
Semantic Role Labelling
Universe of knowledge.
In brief, you need to build all of the above essential modules before you can generate your response.
You need to decide what kind of problem you are working on: is it an open-domain or a closed-domain problem? In other words, what is the scope of knowledge of this application?
For example, Google Now is an open-domain problem which can take practically any possible input.
But some applications pertain to a particular task, like automating food orders in an app, where the scope of questions that can be asked is limited.
Once that is decided, you need to parse your input sentence, and dependency parsing is the way to go. You can use the Stanford CoreNLP suite to achieve most of the NLP tasks mentioned above.
Once the input sentence is parsed and you have the subjects, objects, etc., it is time to disambiguate the words in the sentence, as a particular word can have different meanings.
Then disambiguate the verb meaning, identifying the type of verb (for example, "return" could mean going back to a place or giving back something).
Then you need to perform coreference resolution, meaning mapping the nouns, pronouns and other entities in a given context. For example:
My name is John. I work at ABC company.
Here "I" in the second sentence refers to John.
This helps us answer questions like "Where does John work?". Since John was named only in the first sentence and his workplace was mentioned in the second, coreference resolution helps us map them together.
The next task at hand is semantic role labelling, which basically means labelling all the arguments in a sentence with respect to each of its verbs.
For example: John killed Mary.
Here the verb is "kill", and John and Mary are the arguments of that verb. John takes the role A0 and Mary the role A1. The definitions of these roles for each verb are given in a large frame-and-argument annotation framework created by the NLP community (e.g. PropBank). Here A0 means the person who killed and A1 means the person who was killed.
Now once you have identified A0 and A1, just look into the definition of the "kill" frame and return A0 for the killer and A1 for the victim.
Another important task at hand is to identify when your system must respond with an answer. For that, you need to know whether the given sentence is declarative/assertive or interrogative. You can check that simply by seeing if the input sentence ends with a question mark.
Now to answer your question:
Let us say your input to the application is:
Input 1: John killed Mary.
Clearly this is an assertive sentence, so just store it and process it as mentioned above.
Now the next input is:
Input 2: Who killed Mary?
This is an interrogative sentence, so you need to come up with a response.
Now find the semantic role labels of input 1 and input 2, and return the word of input 1 that matches the argument of "Who" in sentence 2.
Here "Who" in input 2 would be labeled A0, and "John" in input 1 is also labeled A0, so simply return John.
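The matching step just described can be sketched in a few lines of Python. The role labels here are hard-coded for illustration; a real system would obtain them from an SRL tool.

```python
# Toy sketch of answering "Who killed Mary?" from "John killed Mary."
# using pre-computed semantic role labels. In a real system the labels
# would come from a semantic role labeller; here they are hard-coded.

WH_WORDS = {"who", "what", "whom"}

def answer(fact_roles, question_roles):
    """Return the word in the fact whose role matches the wh-word's role."""
    # Find which role the wh-word occupies in the question.
    wh_role = next(
        (role for role, word in question_roles.items()
         if word.lower() in WH_WORDS),
        None,
    )
    if wh_role is None:
        return None
    # Return the fact's filler for that same role.
    return fact_roles.get(wh_role)

# Input 1 (assertive): John killed Mary.  ->  kill(A0=John, A1=Mary)
fact = {"A0": "John", "A1": "Mary"}
# Input 2 (interrogative): Who killed Mary?  ->  kill(A0=Who, A1=Mary)
question = {"A0": "Who", "A1": "Mary"}

print(answer(fact, question))  # John
```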
Most of the NLP modules mentioned can be implemented directly using Stanford CoreNLP; however, if you want to implement some algorithms on your own, you can go through recent publications at EMNLP, NIPS, ICML, CoNLL, etc. to understand them better and implement the one that best suits you.
Good luck !

AIML Parser PHP

I am trying to develop an artificial bot. I found that AIML is something that can be used to achieve such a goal, and I found these points regarding AIML parsing as done by Program O:
1.) All letters in the input are converted to UPPERCASE
2.) All punctuation is stripped out and replaced with spaces
3.) Extra whitespace characters, including tabs, are removed
From there, Program O performs a search in the database, looking for all potential matches to the input, including wildcards. The returned results are then “scored” for relevancy and the “best match” is selected. Program O then processes the AIML from the selected result, and returns the finished product to the user.
I am just wondering how to define the score and find the relevant answer closest to the user's input.
Any help or ideas will be appreciated.
#user3589042 (rather cumbersome name, don't you think?)
I'm Dave Morton, lead developer for Program O. I'm sorry I missed this at the time you asked the question. It only came to my attention today.
The way that Program O scores the potential matches pulled from the database is this:
Is the response from the aiml_userdefined table? yes=300/no=0
Is the category for this bot, or its parent (if it has one)? this=250/parent=0
Does the pattern have one or more underscore (_) wildcards? yes=100/no=0
Does the current category have a <topic> tag? yes(see below)/no=0
a. Does the <topic> contain one or more underscore (_) wildcards? yes=80/no=0
b. Does the <topic> directly match the current topic? yes=50/no=0
c. Does the <topic> contain a star (*) wildcard? yes=10/no=0
Does the current category contain a <that> tag? yes(see below)/no=0
a. Does the <that> contain one or more underscore (_) wildcards? yes=45/no=0
b. Does the <that> directly match the bot's previous response? yes=15/no=0
c. Does the <that> contain a star (*) wildcard? yes=2/no=0
Is the <pattern> a direct match to the user's input? yes=10/no=0
Does the <pattern> contain one or more star (*) wildcards? yes=1/no=0
Does the <pattern> match the default AIML pattern from the config? yes=5/no=0
The script then adds up the scores of all passed tests listed above, and also adds a point for each word in the category's <pattern> that also matches a word in the user's input. The AIML category with the highest score is considered the "best match". In the event of a tie, the script will select either the "first" highest-scoring category, the "last" one, or one at random, depending on the configuration settings. This selected category is then returned to other functions for parsing of the XML.
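Sketched in Python (Program O itself is PHP, and the category field names here are illustrative), the scoring above amounts to:

```python
# Toy re-implementation of Program O's category scoring as described
# above. The category dict keys are illustrative stand-ins for the
# database columns Program O actually uses.

def score_category(cat, user_input, current_topic):
    score = 0
    if cat.get("user_defined"):            # from the aiml_userdefined table
        score += 300
    if cat.get("owned_by_bot"):            # belongs to this bot, not its parent
        score += 250
    pattern = cat.get("pattern", "")
    if "_" in pattern:                     # underscore wildcard(s) in pattern
        score += 100
    topic = cat.get("topic")
    if topic is not None:                  # category has a <topic> tag
        if "_" in topic:
            score += 80
        if topic == current_topic:
            score += 50
        if "*" in topic:
            score += 10
    that = cat.get("that")
    if that is not None:                   # category has a <that> tag
        if "_" in that:
            score += 45
        if that == current_topic:
            score += 15
        if "*" in that:
            score += 2
    if pattern == user_input:              # direct pattern match
        score += 10
    if "*" in pattern:                     # star wildcard(s) in pattern
        score += 1
    if cat.get("is_default_pattern"):      # matches the configured default
        score += 5
    # One extra point per pattern word that also appears in the input.
    input_words = set(user_input.split())
    score += sum(1 for w in pattern.split()
                 if w in input_words and w not in ("*", "_"))
    return score
```

Scoring every candidate category with this function and taking the maximum reproduces the "best match" selection described above.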
I hope this answers your question.

md-highlight-text for multiple words

I am using md-highlight-text to highlight words in a list of checkbox labels based on a search. But I want to highlight multiple searched words. Is there no option/flag for this in the directive?
Code example from md site:
<input placeholder="Enter a search term..." ng-model="searchTerm" type="text">
<ul>
<li ng-repeat="result in results" md-highlight-text="searchTerm">
{{result.text}}
</li>
</ul>
Here I want to highlight multiple words typed in the input.
Because I found this in my top search result on the web, and because the question was a bit unclear, I'm going to answer this multiple ways.
Of course, as mentioned in lorenzo montanan's answer, you do need to provide some CSS for the highlight (I think so, at least).
If the OP (or you, the reader) was asking to highlight multiple words in the results, there is now a md-highlight-flags which could help (see md-highlight-text documentation) which currently works like this:
md-highlight-flags - string - A list of flags (loosely based on JavaScript RegExp flags).
Supported flags:
g: Find all matches within the provided text
i: Ignore case when searching for matches
$: Only match if the text ends with the search term
^: Only match if the text begins with the search term
If, however, you want to type in multiple words in the input and have the output highlight each word separately, without the words having to be in the same order, then md-highlight-text will not do that, and its maintainers are not interested in adding it (see: request for highlighting multiple input words and another request). One way to get it is to write your own filter directive.
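The core logic such a custom filter needs is small. Here is a language-agnostic sketch in Python (the function name and CSS class are made up); an Angular filter would do the same thing with a RegExp alternation built from the search terms.

```python
import re

def highlight(text, search_terms, css_class="highlight"):
    """Wrap every occurrence of any searched word in a styled span."""
    # Split the search input into words and escape regex metacharacters.
    words = [re.escape(w) for w in search_terms.split() if w]
    if not words:
        return text
    # Alternation matches each word independently, in any order.
    pattern = re.compile("|".join(words), re.IGNORECASE)
    return pattern.sub(
        lambda m: f'<span class="{css_class}">{m.group(0)}</span>', text)

print(highlight("The quick brown fox", "quick fox"))
```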

(XML Parsing) Which design pattern or approach is more efficient

I have a problem here, and it goes like this:
I have 50 classes (XML parsers) for different response/message types. To illustrate, a sample of the existing XML messages received is below:
<XML>
<transaction>
<messagetype>message</messagetype>
<message>Blah, Blah, Blah</message>
</transaction>
</XML>
Recently, a requirement to allow multi-transaction messages to be received has been imposed, so the message will now look like the one below:
<XML>
<transaction>
<messagetype>message</messagetype>
<message>Blah, Blah, Blah</message>
<messagetype>notification</messagetype>
<message>stopped</message>
<messagetype>notification</messagetype>
<message>started</message>
<messagetype>alert</messagetype>
<message>no service</message>
</transaction>
</XML>
What I want to know is which approach will be more efficient:
a. Create a new class/method to catch all types of requests, traverse all the XML elements and store them in an array, then iterate through the array and pass each element node to its respective parser.
b. Edit each parser to accommodate the changes. (This seems a very, very tedious job.)
c. Create one big parser, putting all the parsing logic there and traversing with switch cases (disregarding all the existing parsers).
Also, note that the element nodes can vary with each response, so the child nodes can number from 1 to N (where N is the limit).
Is there a viable solution to this kind of scenario? I do not wish to rewrite the existing code (one of the programmer's virtues), but if it's the only way, then so be it.
I am implementing this on iPhone using Objective-C
TIA
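For what it's worth, option (a) can be sketched as a small dispatcher. The sketch below is Python for brevity (the same shape translates to Objective-C with NSXMLParser); the per-type parser functions are hypothetical stand-ins for the 50 existing classes.

```python
# Option (a) as a dispatcher: split a multi-transaction message into
# (messagetype, message) pairs and route each pair to the existing
# per-type parser. Parser functions here are illustrative stand-ins.
import xml.etree.ElementTree as ET

def parse_message(text):
    return ("message", text)

def parse_notification(text):
    return ("notification", text)

# Map each <messagetype> value to its parser; unknown types are skipped.
PARSERS = {"message": parse_message, "notification": parse_notification}

def dispatch(xml_text):
    root = ET.fromstring(xml_text)
    results = []
    for tx in root.iter("transaction"):
        children = list(tx)
        # Pair each <messagetype> with the <message> that follows it.
        for mtype_el, msg_el in zip(children[0::2], children[1::2]):
            parser = PARSERS.get(mtype_el.text)
            if parser:
                results.append(parser(msg_el.text))
    return results
```

This keeps the 50 existing parsers untouched: only the new dispatcher knows about the multi-transaction format, which is why option (a) is usually preferred over (b) or (c).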

How do you think the "Quick Add" feature in Google Calendar works?

I'm thinking about a project which might use functionality similar to how "Quick Add" parses natural language into something that can be understood with some level of semantics. I'm interested in understanding this better and wondered what your thoughts are on how it might be implemented.
If you're unfamiliar with what "Quick Add" is, check out Google's KB about it.
6/4/10 Update
Additional research on "Natural Language Processing" (NLP) yields results which are MUCH broader than what I feel is actually implemented in something like "Quick Add". Given that this feature expects specific types of input rather than truly free-form text, I'm thinking it is a much narrower application of NLP. If anyone could suggest a narrower topic to research, rather than the entire breadth of NLP, it would be greatly appreciated.
That said, I've found a nice collection of resources about NLP including this great FAQ.
I would start by deciding on a standard way to represent all the information I'm interested in: event name, start/end time (and date), guest list, location. For example, I might use an XML notation like this:
<event>
<name>meet Sam</name>
<starttime>16:30 07/06/2010</starttime>
<endtime>17:30 07/06/2010</endtime>
</event>
I'd then aim to build up a corpus of diary entries about dates, annotated with their XML forms. How would I collect the data? Well, if I was Google, I'd probably have all sorts of ways. Since I'm me, I'd probably start by writing down all the ways I could think of to express this sort of stuff, then annotating it by hand. If I could add to this by going through friends' e-mails and whatnot, so much the better.
Now that I've got a corpus, it can serve as a set of unit tests. I need to code a parser to fit the tests. The parser should translate a string of natural language into the logical form of my annotation. First, it should split the string into its constituent words. This is called tokenising, and there is off-the-shelf software available to do it (for example, see NLTK). To interpret the words, I would look for patterns in the data: for example, text following 'at' or 'in' should be tagged as a location; 'for X minutes' means I need to add that number of minutes to the start time to get the end time. Statistical methods would probably be overkill here; it's best to create a series of hand-coded rules that express your own knowledge of how to interpret the words, phrases and constructions in this domain.
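As a concrete illustration of those hand-coded rules, here is a minimal sketch in Python. The patterns and defaults are invented for the example; a real parser would need many more rules.

```python
# Minimal rule-based sketch of the hand-coded approach described above:
# apply a few regex patterns ("at <time>", "for X minutes") to fill in
# the event fields. The default date and 60-minute fallback are
# assumptions made up for this example.
import re
from datetime import datetime, timedelta

def quick_add(text, default_date="07/06/2010"):
    event = {"name": None, "starttime": None, "endtime": None}
    # Rule: "for X minutes" gives the duration (default one hour).
    dur = re.search(r"\bfor (\d+) minutes\b", text)
    minutes = int(dur.group(1)) if dur else 60
    # Rule: "at HH:MM" gives the start time; end = start + duration.
    at = re.search(r"\bat (\d{1,2}:\d{2})\b", text)
    if at:
        start = datetime.strptime(f"{at.group(1)} {default_date}",
                                  "%H:%M %d/%m/%Y")
        event["starttime"] = start.strftime("%H:%M %d/%m/%Y")
        end = start + timedelta(minutes=minutes)
        event["endtime"] = end.strftime("%H:%M %d/%m/%Y")
    # Rule: whatever precedes the first recognised keyword is the name.
    name = re.split(r"\bat \d|\bfor \d", text)[0].strip()
    event["name"] = name or None
    return event

print(quick_add("meet Sam at 16:30 for 60 minutes"))
```

Each rule is an independent unit that the annotated corpus can test in isolation, which is exactly what makes the hand-coded approach manageable for a narrow domain like calendar entries.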
It would seem that there's really no narrow approach to this problem. I wanted to avoid having to pull along the entirety of NLP to figure out a solution, but I haven't found any alternative. I'll update this if I find a really great solution later.