I have read several documents on how a CRF (conditional random field) works, but all the papers give only the formulas. Can anyone point me to a paper that explains CRFs with examples? For instance, suppose we have the sentence
"Mr. Smith was born in New York. He has been working for the last 20 years in Microsoft company."
If this sentence is given as training input, how does the model work during training, in terms of the CRF formula?
"Smith" is tagged as "PER", "New York" as "LOC", and "Microsoft company" as "ORG".
Moges.A
Here is a link to a set of slides made by Sasha Rush, a PhD student who is currently working on NLP at Google. One of the reasons I really like the slides is that they contain concrete examples and walk you through executions of important algorithms.
It is not a paper, but there is a free online course on probabilistic graphical models, and CRFs are one of the models it covers.
It is very thorough, and you'll get an intuitive understanding after completing it.
I don't think anybody will write such a tutorial. You can check an HMM tutorial, which is easier to understand and can be explained by example. The problem with a CRF is that it is a global optimization with many dependencies, so it is very hard to show step by step how we optimize the parameters and how we predict the labels. But the idea is very simple - maximization of dependency(clique) graph using sparsity...
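That said, a toy example can at least show what the formula computes. The sketch below is not taken from any of the linked material: it assumes a linear-chain CRF with only two kinds of features (word/tag and tag/tag), and every weight in it is made up by hand. In real training these weights are what gets learned, by raising log p(gold tags | words) = score(gold) - log Z over the training sentences. The sentence is shortened to the first clause of the question's example, without "Mr.".

```python
import itertools
import math

TAGS = ["O", "PER", "LOC"]
WORDS = ["Smith", "was", "born", "in", "New", "York"]
GOLD = ("PER", "O", "O", "O", "LOC", "LOC")

# Hand-picked (hypothetical) feature weights; training would learn these.
EMISSION = {
    ("Smith", "PER"): 2.0, ("was", "O"): 2.0, ("born", "O"): 2.0,
    ("in", "O"): 2.0, ("New", "LOC"): 1.5, ("York", "LOC"): 1.5,
}
TRANSITION = {("LOC", "LOC"): 1.0}  # a LOC tag tends to follow a LOC tag


def score(tags):
    """Unnormalized log-score of one complete tag sequence."""
    total = 0.0
    for i, (word, tag) in enumerate(zip(WORDS, tags)):
        total += EMISSION.get((word, tag), 0.0)
        if i > 0:
            total += TRANSITION.get((tags[i - 1], tag), 0.0)
    return total


def log_z_brute_force():
    """log Z by enumerating every possible tag sequence (3^6 here)."""
    all_seqs = itertools.product(TAGS, repeat=len(WORDS))
    return math.log(sum(math.exp(score(seq)) for seq in all_seqs))


def log_z_forward():
    """Same quantity via the forward algorithm (dynamic programming)."""
    alpha = {t: EMISSION.get((WORDS[0], t), 0.0) for t in TAGS}
    for word in WORDS[1:]:
        alpha = {
            t: EMISSION.get((word, t), 0.0) + math.log(sum(
                math.exp(alpha[p] + TRANSITION.get((p, t), 0.0)) for p in TAGS))
            for t in TAGS
        }
    return math.log(sum(math.exp(a) for a in alpha.values()))


log_z = log_z_forward()
gold_log_prob = score(GOLD) - log_z  # the quantity training tries to increase
best = max(itertools.product(TAGS, repeat=len(WORDS)), key=score)

print("log p(gold | words) =", gold_log_prob)
print("best tagging:", best)
```

Prediction here is done by brute force for clarity; a real implementation would use Viterbi, which has the same structure as `log_z_forward` with `max` in place of the sum.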
I am trying to fine-tune GPT-2 for a generative question answering task.
Basically I have my data in a format similar to:
Context : Matt wrecked his car today.
Question: How was Matt's day?
Answer: Bad
I was looking through the Hugging Face documentation to find out how I can fine-tune GPT-2 on a custom dataset, and I found the fine-tuning instructions at this address:
https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling
The issue is that they do not provide any guidance on how your data should be prepared so that the model can learn from it. They give different datasets that they have available, but none is in a format that fits my task well.
I would really appreciate it if someone with more experience could help me.
Have a nice day!
Your task as stated is ambiguous; it could be any of:
QnA via Classification (answer is categorical)
QnA via Extraction (answer is in the text)
QnA via Language Modeling (answer can be anything)
Classification
If all your examples have Answer: X, where X is categorical (i.e. always "Good", "Bad", etc.), you can do classification.
In this setup, you'd have text-label pairs:
Text
Context: Matt wrecked his car today.
Question: How was Matt's day?
Label
Bad
For classification, you're probably better off just fine-tuning a BERT-style model (something like RoBERTa).
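As a rough sketch of how the data could be prepared for this setup (the function name, the label mapping, and the second example are made up for illustration; only the first example comes from the question):

```python
# Build (text, label_id) pairs for a sequence classifier.
examples = [
    {"context": "Matt wrecked his car today.",
     "question": "How was Matt's day?", "answer": "Bad"},
    # Hypothetical second example, added only so there are two classes.
    {"context": "Anna got the job she wanted.",
     "question": "How was Anna's day?", "answer": "Good"},
]

# Map each distinct answer to an integer class id (sorted for stability).
label2id = {label: i for i, label in enumerate(sorted({e["answer"] for e in examples}))}


def to_text_label(example):
    """Join context and question into one input string; the answer is the class."""
    text = f"Context: {example['context']}\nQuestion: {example['question']}"
    return text, label2id[example["answer"]]


pairs = [to_text_label(e) for e in examples]
for text, label in pairs:
    print(repr(text), "->", label)
```

The resulting pairs can then be tokenized and fed to whatever classification fine-tuning script you use.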
Extraction
If all your examples have Answer: X, where X is a word (or several consecutive words) in the text, then it's probably best to do SQuAD-style fine-tuning with a BERT-style model. In this setup, your input is (basically) a set of text, start_pos, end_pos triplets:
Text
Context: In early 2012, NFL Commissioner Roger Goodell stated that the league planned to make the 50th Super Bowl "spectacular" and that it would be "an important game for us as a league".
Question: Who was the NFL Commissioner in early 2012?
Start Position, End Position
6, 8
Note: The start/end position values are of course positions of tokens, so these values will depend on how you tokenize your inputs.
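To make that dependence concrete, here is a minimal sketch that finds the answer span under one particular set of assumptions: plain whitespace tokenization, the "Context:" prefix counted as a token, and an exclusive end position. Under those assumptions it reproduces the 6, 8 above; a real subword tokenizer would give different numbers. `find_token_span` is a hypothetical helper, not a library function.

```python
def find_token_span(text, answer):
    """Return (start, end) token indices of `answer` in `text`, end exclusive.

    Uses whitespace tokenization; returns None if the answer is not found.
    """
    tokens = text.split()
    answer_tokens = answer.split()
    n = len(answer_tokens)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == answer_tokens:
            return start, start + n
    return None


text = (
    'Context: In early 2012, NFL Commissioner Roger Goodell stated that the '
    'league planned to make the 50th Super Bowl "spectacular" and that it '
    'would be "an important game for us as a league".'
)
span = find_token_span(text, "Roger Goodell")
print(span)  # (6, 8)
```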
In this setup, you're also better off using a BERT-style model. In fact, there are already models on huggingface hub trained on SQuAD (and similar datasets). They should already be good at these tasks out of the box (but you can always fine-tune on top of this).
Language Modeling
If all your examples have Answer: X, where X can basically be anything (it need not be contained in the text, and is not categorical), then you'd need to do language modeling.
In this setup, you have to use a GPT-style model, and your input would just be the whole text as is:
Context: Matt wrecked his car today.
Question: How was Matt's day?
Answer: Bad
There is no need for labels, since the text itself is the label (we're asking the model to predict the next word, for each word). Larger models like GPT-3 and https://cohere.com (full disclosure, I work at Cohere) should be good at these tasks without any fine-tuning (if you give them the right prompt + examples), but of course, these are accessed behind APIs. These platforms also allow you to fine-tune models (via language modeling), so you don't need to run any code yourself. I'm not sure how much mileage you'll get from fine-tuning a smaller model like GPT-2. If this project is for learning, then yeah, definitely go ahead and fine-tune a GPT-2 model! But if performance is key, I highly recommend using a solution like https://cohere.com, which will just work out of the box.
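If you do go the GPT-2 route, preparing the data mostly means turning each example into one string. A minimal sketch, where `format_example` is a made-up helper: appending an end-of-text marker after the answer is a common convention so the model learns where to stop, not something the fine-tuning script requires, and `<|endoftext|>` is GPT-2's end-of-text token.

```python
def format_example(context, question, answer=None, eos="<|endoftext|>"):
    """Format one QA example as a single training string.

    With answer=None this returns only the prompt, which is what you would
    feed the fine-tuned model at inference time.
    """
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    if answer is None:
        return prompt
    return f"{prompt} {answer}{eos}"


train_text = format_example("Matt wrecked his car today.", "How was Matt's day?", "Bad")
prompt_text = format_example("Matt wrecked his car today.", "How was Matt's day?")
print(train_text)
```

Writing one such string per example into a plain text file gives you a corpus in the same shape as the raw text the language-modeling examples expect.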
I'm working on a project that aims to find semantically conflicting sentences (NLP, semantic search).
For example
Our text is: "I ate today. The lunch was very tasty. I was an honest guest."
Query: "I had lunch with my friend"
We want to give the query to the model and find the sentences in the text that relate to it in meaning, both as synonyms and as antonyms.
The solution that came to my mind was to first find the synonymous sentences, extract the key words from them, look up the semantic opposites of those words, and then find the sentences that are semantically similar to those opposites.
Do you think this idea is feasible? If you have a solution or experience in this area, please reply.
Thanks
You have not mentioned the exact use case for your problem, so I am not sure whether the solution I know will help. But there is an approach in NLP (using deep learning) that determines whether two sentences are correlated, unrelated, or contradictory.
Below is the information about the pretrained model which is trained specifically for this task ->
https://huggingface.co/facebook/bart-large-mnli
The dataset on which the above model is trained is given here ->
https://huggingface.co/datasets/glue/viewer/mnli/train
You can check the dataset to verify if your use case is related to the classification task performed on the dataset.
Since the model is already pretrained, you do not need to perform any training and can jump straight to evaluation. Once you are reasonably satisfied with the results, you can fine-tune the model a bit for your specific problem.
We can talk in comments if you need more clarification.
I am new to machine learning and want to work on this problem statement.
I have some user comments about products, and based on those comments my model should summarize each one and give me a short output for that sentence.
Example :-
User commented "Device Battery is heating up", based on this comment my model should summarize this to "Battery issue".
User commented "Cracked Screen", based on this comment my model should summarize this to "Display issue".
Can anyone suggest which model would be the best fit for my problem statement? Any model code samples would be really helpful.
I have tried TF-IDF and an MB Naive Bayes classifier, but those were not helpful. I think topic modelling can help me here.
This sounds like a problem for an encoder-decoder DNN. You can translate the words to vectors, e.g. with word2vec, and feed the sentences into an encoder model. This will give you another vector for a second model (the decoder), which gives you the final classification.
For reference take a look at the deep learning course by Andrew Ng on Coursera (Sequence models).
I have data that represents comments from the operator on various activities performed on an industrial device. The comments could reflect either a routine maintenance/replacement activity or that some damage occurred and had to be repaired.
I have a set of 200,000 sentences that need to be classified into two buckets - Repair/Scheduled Maintenance (or undetermined). These have no labels, hence I am looking for an unsupervised learning based solution.
Some sample data is as shown below:
"Motor coil damaged .Replaced motor"
"Belt cracks seen. Installed new belt"
"Occasional start up issues. Replaced switches"
"Replaced belts"
"Oiling and cleaning done".
"Did a preventive maintainence schedule"
The first three sentences have to be labeled as Repair and the last three as Scheduled Maintenance.
What would be a good approach to this problem? Though I have some exposure to machine learning, I am new to NLP-based machine learning.
I see many papers related to this https://pdfs.semanticscholar.org/a408/d3b5b37caefb93629273fa3d0c192668d63c.pdf
https://arxiv.org/abs/1611.07897
but wanted to understand if there is any standard approach to such problems
Seems like you could use some reliable keywords (verbs, it seems, in this case) to create training samples for an NLP classifier. Or you could use K-Means or K-Medoids clustering with K = 2, which would do a pretty good job of separating the set. If you want to get really involved, you could use something like Latent Dirichlet Allocation, which is a form of unsupervised topic modeling. However, for a problem like this, on the small amount of data you have, the fancier you get the more frustrated with the results you will become, IMO.
Both OpenNLP and StanfordNLP have text classifiers for this, so I recommend the following if you want to go the classification route:
- Use key word searches to produce a few thousand examples of your two categories
- Put those sentences in a file with a label, based on the OpenNLP format (label, space, sentence, newline)
- Train a classifier with the OpenNLP DocumentClassifier, and I recommend stemming for one of your feature generators
- After you have the model, use it in Java and classify each sentence.
- Keep track of the scores, and quarantine low scores (you will have ambiguous classes I'm sure)
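The first two steps can be sketched in a few lines. This is only an illustration in Python (the classifier itself would still be trained with OpenNLP): the keyword lists are guesses based on the six sample sentences in the question, and the label names are made up, chosen without spaces so they fit the "label, space, sentence" line format.

```python
# Hypothetical keyword lists; tune these against your own data.
REPAIR_KEYWORDS = ("damage", "crack", "issue")
MAINTENANCE_KEYWORDS = ("replaced", "installed", "oiling", "cleaning", "preventive")


def weak_label(sentence):
    """Assign a noisy label by keyword search; damage words take priority."""
    lowered = sentence.lower()
    if any(k in lowered for k in REPAIR_KEYWORDS):
        return "Repair"
    if any(k in lowered for k in MAINTENANCE_KEYWORDS):
        return "Maintenance"
    return "Undetermined"


def to_training_line(sentence):
    """One training line: label, a space, then the sentence on a single line."""
    return f"{weak_label(sentence)} {' '.join(sentence.split())}"


samples = [
    "Motor coil damaged .Replaced motor",
    "Belt cracks seen. Installed new belt",
    "Occasional start up issues. Replaced switches",
    "Replaced belts",
    "Oiling and cleaning done",
    "Did a preventive maintainence schedule",
]
labels = [weak_label(s) for s in samples]
lines = [to_training_line(s) for s in samples if weak_label(s) != "Undetermined"]
print("\n".join(lines))
```

Sentences that come out as "Undetermined" are left out of the training file; they are exactly the ones you want the trained classifier to score later.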
If you don't want to go that route, I recommend using a text indexing technology du jour like SOLR or ElasticSearch, or your favorite RDBMS's text indexing, to perform a "More like this" type of function so you don't have to play the machine learning continuous model updating game.
I want to write a learning algorithm that can automatically create summaries of articles.
E.g., there are some fiction novels (one category, to be used as a filter) in PDF format. I want to automate the process of creating their summaries.
We can provide some sample data to implement it with a supervised learning approach.
Kindly suggest how I can implement this properly.
I am a beginner. I am taking Andrew Ng's course, am aware of some common algorithms (linear regression, logistic regression, neural nets), have done the Udacity statistics courses, and am ready to dive deeper into NLP, deep learning, etc., but my motive is to solve this. :)
Thanks in advance
The keyword is Automatic Summarization.
Generally, there are two approaches to automatic summarization: extraction and abstraction.
Extractive methods work by selecting a subset of existing words, phrases, or sentences in the original text to form the summary.
Abstractive methods build an internal semantic representation and then use natural language generation techniques to create a summary that is closer to what a human might generate.
Abstractive summarization is a lot more difficult. An interesting approach is described in A Neural Attention Model for Abstractive Sentence Summarization by Alexander M. Rush, Sumit Chopra, Jason Weston (source code based on the paper here).
A "simple" approach is used in Word (AutoSummarize tool):
AutoSummarize determines key points by analyzing the document and assigning a score to each sentence. Sentences that contain words used frequently in the document are given a higher score. You then choose a percentage of the highest-scoring sentences to display in the summary.
You can select whether to highlight key points in a document, insert an executive summary or abstract at the top of a document, create a new document and put the summary there, or hide everything but the summary.
If you choose to highlight key points or hide everything but the summary, you can switch between displaying only the key points in a document (the rest of the document is hidden) and highlighting them in the document. As you read, you can also change the level of detail at any time.
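The scoring idea in that description is easy to try out. The following is a rough sketch of frequency-based extractive summarization in that spirit, not Word's actual algorithm: the sentence splitter, the scoring by summed word counts, and the `ratio` parameter are all simplifying assumptions, and there is no stop-word removal.

```python
import re
from collections import Counter


def summarize(text, ratio=0.34):
    """Keep the highest-scoring sentences, in their original order.

    A sentence's score is the sum of the document-wide counts of its words.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    scores = [
        sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())) for s in sentences
    ]
    keep = max(1, round(len(sentences) * ratio))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]


text = (
    "The battery drains fast. "
    "The battery gets hot when the battery charges. "
    "I like the color."
)
summary = summarize(text)
print(summary)
```

Because scores are plain sums, long sentences and common words like "the" are favored; dividing by sentence length or dropping stop words are the usual first refinements.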
Anyway, automatic data (text) summarization is an active area of machine learning / data mining with much ongoing research. You should start by reading some good overviews:
Summarization evaluation: an overview by Inderjeet Mani.
A Survey on Automatic Text Summarization by Dipanjan Das and André F. T. Martins (emphasizes extractive approaches to summarization using statistical methods).