Classification of relationships in words? - machine-learning

I'm not sure what's the best algorithm to use for the classification of relationships between words. For example, in a sentence such as "The yellow sun" there is a relationship between yellow and sun. The machine learning techniques I have considered so far are Bayesian statistics, rough sets, fuzzy logic, hidden Markov models, and artificial neural networks.
Any suggestions please?
thank you :)

It kind of sounds like you're looking for a dependency parser. Such a parser will give you the relationship between any word in a sentence and its semantic or syntactic head.
The MSTParser uses an online max-margin technique known as MIRA to classify the relationships between words. The MaltParser package does the same but uses SVMs to make parsing decisions. Both systems are trainable and provide similar classification and attachment performance; see Table 1 here.
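If you just want to see typed word-word relations quickly, here is a minimal sketch using spaCy. That's my choice of library, not one of the parsers named above; the model name is the standard small English model.

import spacy

# Load the small English pipeline (installed separately via
# `python -m spacy download en_core_web_sm`).
nlp = spacy.load("en_core_web_sm")
doc = nlp("The yellow sun")

for token in doc:
    # Each token carries a typed dependency relation to its head.
    print(f"{token.dep_}({token.head.text}, {token.text})")

# Expected output (exact labels can vary with the model version):
# det(sun, The)
# amod(sun, yellow)
# ROOT(sun, sun)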

As the user dmcer pointed out, dependency parsers will help you. There is a ton of literature on dependency parsing you can read. This book and these lecture notes are good starting points that introduce the conventional methods.
The Link Grammar Parser, which is somewhat like a dependency parser, uses Sleator and Temperley's Link Grammar syntax for producing word-word linkages. You can find more information on the original Link Grammar page and on the more recent AbiWord page (AbiWord maintains the implementation now).
For an unconventional approach to dependency parsing, you can read this paper that models word-word relationships analogous to subatomic particle interactions in chemistry/physics.

The Stanford Parser does exactly what you want. There's even an online demo. Here are the results for your example.
Your sentence
The yellow sun.
Tagging
The/DT yellow/JJ sun/NN ./.
Parse
(ROOT
(NP (DT The) (JJ yellow) (NN sun) (. .)))
Typed dependencies
det(sun-3, The-1)
amod(sun-3, yellow-2)
Typed dependencies, collapsed
det(sun-3, The-1)
amod(sun-3, yellow-2)
From your question it sounds like you're interested in the typed dependencies.
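If you want the same typed dependencies programmatically, here is a minimal sketch using Stanza, the Stanford NLP group's Python library. Using Stanza here is my assumption; the demo above runs the Java Stanford Parser, and the labels may differ slightly since Stanza uses Universal Dependencies.

import stanza

stanza.download("en")  # one-time model download
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")
doc = nlp("The yellow sun.")

for sent in doc.sentences:
    for word in sent.words:
        # word.head is the 1-based index of the head word; 0 means ROOT.
        head = sent.words[word.head - 1].text if word.head > 0 else "ROOT"
        print(f"{word.deprel}({head}-{word.head}, {word.text}-{word.id})")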

Well, no one knows what the best algorithm for language processing is, because the problem hasn't been solved. To be able to understand a human language is to create a full AI.
However, there have, of course, been attempts to process natural languages, and these might be good starting points for this sort of thing:
X-Bar Theory
Phrase Structure Rules (a toy sketch follows after this list)
Noam Chomsky did a lot of work on natural language processing, so I'd recommend looking up some of his work.
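To make the phrase-structure-rules idea concrete, here is a toy sketch using NLTK (my choice of toolkit), with a deliberately tiny grammar that derives the asker's example phrase:

import nltk

# A toy grammar of phrase structure rules covering "the yellow sun".
grammar = nltk.CFG.fromstring("""
    NP  -> Det Adj N
    Det -> 'the'
    Adj -> 'yellow'
    N   -> 'sun'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the yellow sun".split()):
    tree.pretty_print()  # draws (NP (Det the) (Adj yellow) (N sun))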

Related

Finding contradictory semantic sentences through natural language processing

I'm working on a project that aims to find conflicting semantic sentences (NLP, semantic search).
For example
Our text is: "I ate today. The lunch was very tasty. I was an honest guest."
Query: "I had lunch with my friend"
Can we give the query to a model and score each sentence's meaning against it, taking synonyms and antonyms into account?
The solution that came to my mind was to first find the synonymous sentences, extract the keywords from those sentences, then get the semantically opposite words, and finally find the sentences that match based on these opposite words.
Do you think this idea is possible? If you have a solution or experience in this area, please reply
Thanks
You have not mentioned the exact use case for your problem, so I am not sure if the solution I know will help your cause. But there is an approach in NLP (using deep learning) which helps to find whether two sentences are entailed, unrelated (neutral), or contradictory.
Below is a pretrained model which is trained specifically for this task:
https://huggingface.co/facebook/bart-large-mnli
The dataset on which the above model is trained is given here:
https://huggingface.co/datasets/glue/viewer/mnli/train
You can check the dataset to verify if your use case is related to the classification task performed on the dataset.
Since the model is already pretrained, you do not need to perform any training and can jump straight to evaluation. Once you are somewhat satisfied with the results, you can fine-tune the model a bit for your specific problem.
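Here is a minimal evaluation sketch with the Hugging Face transformers library, scoring one premise/hypothesis pair from your example. The surrounding code and variable names are mine, not part of the model card.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "I ate today. The lunch was very tasty. I was an honest guest."
hypothesis = "I had lunch with my friend"

# NLI models take a (premise, hypothesis) pair as a single input.
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)[0]
for i, p in enumerate(probs):
    # Labels come from the model config: contradiction / neutral / entailment.
    print(f"{model.config.id2label[i]}: {p:.3f}")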
We can talk in comments if you need more clarification.

How do I implement a Semantic Parser to evaluate on a dataset?

As part of my bachelor's thesis I want to evaluate two semantic parsers on one common dataset.
The two parsers are namely:
TRANX by Yin and Neubig: https://arxiv.org/abs/1810.02720 and
Seq2Action by Chen et al.: https://arxiv.org/abs/1809.00773
As I am relatively new to the topic I'm pretty much starting from scratch.
I don't know how to implement the parsers on my computer in order to run them on the ATIS or GeoQuery dataset.
It would be great if there is someone here who can help me with this problem as I have really no clue yet on how to approach this whole situation.

Neural Networks For Generating New Programming Language Grammars

I have recently had the need to create an ANTLR language grammar for the purpose of a transpiler (converting one scripting language to another). It occurs to me that Google Translate does a pretty good job translating natural language. We have all manner of recurrent neural network models, LSTMs, and GPT-2 generating grammatically correct text.
Question: Is there a model sufficient to train on grammar/code example combinations for the purpose of then outputting a new grammar file given an arbitrary example source-code?
I doubt any such model exists.
The main issue is that languages are generated from grammars, and it is next to impossible to convert back, due to the infinite number of parse trees (combinations) possible for various source codes.
So in your case, say you train on Python code (1,000 sample programs): the target grammar is the same for every training sample, so the model will always generate the same grammar irrespective of the example source code.
If you use training samples from a number of languages, the model still can't generate the grammar, as the space of possible grammars is infinite.
Your example of Google Translate works for real-life translation because small errors are acceptable, but those models don't rely on generating the root grammar for each language. There are some tools that can translate programming languages, for example, but they don't generate the grammar; they work from an existing grammar.
Update
How to learn grammar from code.
After comparing with some NLP concepts, I have a list of issues that may arise and ways to counter them.
Dealing with variable names, coding structures and tokens.
For understanding the grammar, we'll have to break down the code to its bare minimum form. This means understanding what each and every term in the code means. Have a look at this example
The already simple expression is reduced to a parse tree. We can see that the tree breaks down the expression and tags each number as a factor. This is really important to get rid of the human element of the code (such as variable names etc.) and dive into the actual grammar. In NLP this concept is known as part-of-speech tagging. You'll have to develop your own method to do the tagging, but it's easy given that you know the grammar of the language.
Understanding the relations
For this, you can tokenize the reduced code and train a model based on the output you are looking for. In case you want to generate code, make use of an n-gram model or an LSTM, like this example. The model will learn the grammar, but extracting it is not a simple task. You'll have to run separate code to try and extract all the possible relations learned by the model.
Example
Code snippet
# Sample code
int a = 1 + 2;
cout<<a;
Tags
# Sample tags and tokens
int a = 1 + 2 ;
[int] [variable] [operator] [factor] [expr] [factor] [end]
Leaving the operator, expr, and keyword tags as they are shouldn't matter if there is enough data present, but they will become part of the grammar.
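A minimal sketch of this tagging step in Python; the tag names and regex rules are illustrative assumptions that mirror the sample above, not a fixed standard:

import re

# Map raw code tokens to coarse grammar tags, stripping the "human
# element" (identifiers, literal values) out of the code.
KEYWORDS = {"int", "float", "return"}
RULES = [
    (re.compile(r"^\d+$"), "factor"),    # numeric literal
    (re.compile(r"^[+\-*/]$"), "expr"),  # arithmetic operator
    (re.compile(r"^=$"), "operator"),    # assignment
    (re.compile(r"^;$"), "end"),         # statement terminator
    (re.compile(r"^[A-Za-z_]\w*$"), "variable"),
]

def tag(token):
    if token in KEYWORDS:
        return token  # keywords act as their own tag
    for pattern, label in RULES:
        if pattern.match(token):
            return label
    return "unknown"

tokens = "int a = 1 + 2 ;".split()
print(" ".join(f"[{tag(t)}]" for t in tokens))
# [int] [variable] [operator] [factor] [expr] [factor] [end]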
This is a sample to help understand my idea. You can improve on this by taking a deeper look at the theory of computation and understanding how automata and the different classes of grammars work.
What you're describing is 'just' learning the structure of context-free grammars.
I'm not sure if this approach will actually work for your case, but it's a long-standing problem in NLP: grammar induction for Context-Free Grammars. An example introduction how to tackle this problem using statistical learning methods can be found in Charniak's Statistical Language Learning.
Note that what I described is about CFGs in general, but you might want to check induction for LL grammars, because parser generators mostly use these types of grammars.
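As a small illustration of the direction (not a solution to the hard induction problem), here is a sketch that reads CFG productions straight off example parse trees with NLTK; the toy tree and its tag names are my assumptions, mirroring the earlier answer:

import nltk
from nltk import Tree

# A hand-annotated toy parse tree for `int a = 1 + 2;`.
trees = [
    Tree.fromstring(
        "(stmt (type int) (var a) (op =)"
        " (expr (factor 1) (plus +) (factor 2)) (end ;))"
    ),
]

# Collect the productions used by the trees and estimate a PCFG.
productions = [p for t in trees for p in t.productions()]
grammar = nltk.induce_pcfg(nltk.Nonterminal("stmt"), productions)
print(grammar)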
I know nothing about ANTLR, but there are pretty good examples of translating natural language e.g. into valid SQL requests: http://nlpprogress.com/english/semantic_parsing.html#sql-parsing.

Is there a machine learning method to learn the structure of a sentence instead of the words?

I'm trying to use HMM to do named entity recognition but later on, I found most of the sentences that contain the entities are very structured. For example:
"What's Apple's price today?" Then, instead of teaching the model to learn each word within the sentence, can I teach it to learn the structure of the sentence? For instance, every word after "What's" or "What is" should be the name of a kind of fruit?
Thanks!
Instead of using an HMM, consider using a conditional random field. They are very similar to HMMs, but are the discriminative version (in Ng and Jordan's terminology, HMMs and Linear Chain CRFs form a generative/discriminative pair).
The benefit of doing this is that you can define features over your word observations, such as the POS tag of the current word, the POS tag of the previous word(s), etc., without making independence assumptions about those features. This allows you to incorporate structural and lexical features into the same decision framework.
Edit: Here's the original paper. Here's a very comprehensive tutorial.
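A minimal linear-chain CRF sketch using the sklearn-crfsuite package (my choice of toolkit; the feature names, toy sentence, and the B-ENT label are illustrative assumptions):

import sklearn_crfsuite

def word_features(sent, i):
    word, pos = sent[i]
    feats = {
        "word.lower": word.lower(),
        "pos": pos,                 # lexical and structural features together
        "is_title": word.istitle(),
    }
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        feats["-1:word.lower"] = prev_word.lower()
        feats["-1:pos"] = prev_pos  # previous tag, no independence assumption
    else:
        feats["BOS"] = True         # beginning of sentence
    return feats

# One toy training sentence as (word, POS) pairs with BIO entity labels.
train_sents = [[("What's", "WP"), ("Apple's", "NNP"), ("price", "NN"),
                ("today", "NN"), ("?", ".")]]
train_labels = [["O", "B-ENT", "O", "O", "O"]]

X = [[word_features(s, i) for i in range(len(s))] for s in train_sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, train_labels)
print(crf.predict(X))  # [['O', 'B-ENT', 'O', 'O', 'O']] once trained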
You could begin exploring that structure with something as simple as n-grams, or try something richer like grammar induction.
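For the n-gram route, even a few lines make the structural regularity visible; the toy sentences below are made up for illustration:

from collections import Counter

# Toy corpus of structured questions.
sentences = [
    "What's Apple's price today",
    "What's Google's price today",
    "What's Tesla's price now",
]

# Count which words fill the slot right after "What's".
slot = Counter(s.split()[1] for s in sentences if s.startswith("What's"))
print(slot)  # the slot after "What's" clusters into one word class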

How do recursive ascent parsers work?

How do recursive ascent parsers work? I have written a recursive descent parser myself but I don't understand LR parsers all that well. What I found on Wikipedia has only added to my confusion.
Another question is why recursive ascent parsers aren't used more than their table-based counterparts. It seems that recursive ascent parsers have greater performance overall.
The classic Dragon Book explains very well how LR parsers work. There is also Parsing Techniques: A Practical Guide, where you can read about them, if I remember correctly. The Wikipedia article (at least the introduction) is not right. LR parsers were created by Donald Knuth, who introduced them in his 1965 paper "On the Translation of Languages from Left to Right". If you read Spanish, there is a complete list of books here, posted by me. Not all of those books are in Spanish, either.
Before trying to understand how they work, you must understand a few concepts like FIRST, FOLLOW, and lookahead sets. I also really recommend understanding the concepts behind LL (top-down) parsers before trying to understand LR (bottom-up) parsers.
There is a family of LR parsers, notably LR(k), SLR(k), and LALR(k), where k is the number of lookahead tokens they need. Yacc supports LALR(1) parsers, but you can make tweaks, not theory-based ones, to make it work with more powerful kinds of grammars.
About performance, it depends on the grammar being analyzed. LR parsers execute in linear time, but how much space they need depends on how many states you build for the final parser.
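To make the recursive-ascent mechanism concrete, here is a toy sketch of my own (not taken from any of the references) for the grammar E -> E '+' 'n' | 'n'. Each LR(0) state becomes a function: a shift is a call into the next state's function, and a reduce returns the production length, which is counted down as the call stack unwinds until the pop lands on the state that owns the goto edge.

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0
        self.nonterminal = None  # set by each reduction
        self.value = None        # semantic value: number of 'n' operands

    def peek(self):
        return self.tokens[self.pos]

    def shift(self):
        self.pos += 1

    # state 0: S' -> .E | E -> .E '+' 'n' | E -> .'n'   (owns the goto on E)
    def state0(self):
        if self.peek() != "n":
            raise SyntaxError("expected 'n'")
        self.shift()
        r = self.state2()        # shift 'n'
        while True:
            r -= 1               # the frame that just returned is popped
            if r:
                return r         # keep popping below state 0 (never happens here)
            if self.nonterminal == "ACCEPT":
                return 0
            r = self.state1()    # goto on E from state 0

    # state 1: S' -> E. | E -> E. '+' 'n'   (no goto edges; pops pass through)
    def state1(self):
        if self.peek() == "+":
            self.shift()
            return self.state3() - 1
        if self.peek() == "$":
            self.nonterminal = "ACCEPT"
            return 1             # pop this state; state 0 accepts
        raise SyntaxError("expected '+' or end of input")

    # state 2: E -> 'n'.   reduce by a production of length 1
    def state2(self):
        self.nonterminal, self.value = "E", 1
        return 1

    # state 3: E -> E '+'. 'n'
    def state3(self):
        if self.peek() != "n":
            raise SyntaxError("expected 'n'")
        self.shift()
        return self.state4() - 1

    # state 4: E -> E '+' 'n'.   reduce by a production of length 3
    def state4(self):
        self.nonterminal = "E"
        self.value += 1
        return 3

p = Parser(["n", "+", "n", "+", "n"])
p.state0()
print(p.value)  # 3 operands reduced through E -> E '+' 'n'

Notice that the parse stack is now the host language's call stack, which is where the speed argument comes from: there is no explicit table interpretation loop.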
I'm personally having a hard time understanding how a function call can be faster -- much less "significantly faster" than a table lookup. And I suspect that even "significantly faster" is insignificant when compared to everything else that a lexer/parser has to do (primarily reading and tokenizing the file). I looked at the Wikipedia page, but didn't follow the references; did the author actually profile a complete lexer/parser?
More interesting to me is the decline of table-driven parsers with respect to recursive descent. I come from a C background, where yacc (or equivalent) was the parser generator of choice. When I moved to Java, I found one table-driven implementation (JavaCup), and several recursive descent implementations (JavaCC, ANTLR).
I suspect that the answer is similar to the answer of "why Java instead of C": speed of execution isn't as important as speed of development. As noted in the Wikipedia article, table-driven parsers are pretty much impossible to understand from code (back when I was using them, I could follow their actions but would never have been able to reconstruct the grammar from the parser). Recursive descent, by comparison, is very intuitive (which is no doubt why it predates table-driven by some 20 years).
The Wikipedia article on recursive ascent parsing references what appears to be the original paper on the topic ("Very Fast LR Parsing"). Skimming that paper cleared a few things up for me. Things I noticed:
The paper talks about generating assembly code. I wonder if you can do the same things they do if you're generating C or Java code instead; see sections 4 and 5, "Error recovery" and "Stack overflow checking". (I'm not trying to FUD their technique -- it might work out fine -- just saying that it's something you might want to look into before committing.)
They compare their recursive ascent tool to their own table-driven parser. From the description in their results section, it looks like their table-driven parser is "fully interpreted"; it doesn't require any custom generated code. I wonder if there's a middle ground where the overall structure is still table-driven but you generate custom code for certain actions to speed things up.
The paper referenced by the Wikipedia page:
"Very fast LR parsing" (1986)
http://portal.acm.org/citation.cfm?id=13310.13326
Another paper about using code-generation instead of table-interpretation:
"Very fast YACC-compatible parsers (for very little effort)" (1999)
http://www3.interscience.wiley.com/journal/1773/abstract
Also, note that recursive-descent parsing is not the fastest way to parse LL-grammar-based languages:
Difference between an LL and Recursive Descent parser?
