How to run huggingface Helsinki-NLP models

I am trying to use the Helsinki-NLP models from huggingface, but
I cannot find any instructions on how to do it.
The README files are computer generated and do not contain explanations.
Can someone point me to a getting-started guide, or show an example of how to run a model like opus-mt-en-es?

On the model's page here there's a Use in Transformers link that you can use to see the code for loading it with the transformers package, as shown below:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-es-en")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-es-en")
then use it as you would any transformer model:
inp = "Me llamo Wolfgang y vivo en Berlin"
input_ids = tokenizer(inp, return_tensors="pt").input_ids
outputs = model.generate(input_ids=input_ids, num_beams=5, num_return_sequences=3)
print("Generated:", tokenizer.batch_decode(outputs, skip_special_tokens=True))
Output:
Generated: ['My name is Wolfgang and I live in Berlin', 'My name is Wolfgang and I live in Berlin.', "My name's Wolfgang and I live in Berlin."]

To use it on the fly, you can check the Hugging Face course here. They provide pipelines that let you run translation in a couple of lines, for example:
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")
translator("your-text-to-translate-here")

Related

Generate files with one input to multiple outputs

I'm trying to create a code generator that takes a JSON file as input and generates multiple classes in multiple files.
My question is: is it possible to create multiple files for one input using build from Dart?
Yes, it is possible. There are currently many tools available on pub.dev that do code generation. For creating a simple custom code generator, check out the code_builder package provided by the core Dart team.
You can also use dart_style to format the output of the code_builder results.
Here is a simple example of the package in use (from the package's example):
import 'package:code_builder/code_builder.dart';
import 'package:dart_style/dart_style.dart';

final _dartfmt = DartFormatter();

// Returns the generated source code for the Animal class as a string.
String animalClass() {
  final animal = Class((b) => b
    ..name = 'Animal'
    ..extend = refer('Organism')
    ..methods.add(Method.returnsVoid((b) => b
      ..name = 'eat'
      ..body = refer('print').call([literalString('Yum!')]).code)));
  return _dartfmt.format('${animal.accept(DartEmitter())}');
}
In this example you can use the dart:io API to create a File and write the output of animalClass() to it:
import 'dart:io';

final animalDart = File('animal.dart');
// create the new file on disk
animalDart.createSync();
// write the contents of the generated class to the file
animalDart.writeAsStringSync(animalClass());
You can use the same File API to read the .json file from its path, then call jsonDecode (from dart:convert) on the file's contents to access the JSON config.

ValueError: [E143] Labels for component 'tagger' not initialized

I've been following this tutorial to create a custom NER. However, I keep getting this error:
ValueError: [E143] Labels for component 'tagger' not initialized. This can be fixed by calling add_label, or by providing a representative batch of examples to the component's initialize method.
This is how I defined the spacy model:
import spacy
from spacy.tokens import DocBin
from tqdm import tqdm
nlp = spacy.blank("ro") # load a new spacy model
source_nlp = spacy.load("ro_core_news_lg")
nlp.tokenizer.from_bytes(source_nlp.tokenizer.to_bytes())
nlp.add_pipe("tagger", source=source_nlp)
doc_bin = DocBin() # create a DocBin object
I just met the same problem. The picture showing how to set up the config file is misleading you.
If you just want to run through the tutorial, you can set up the config file like this:
only tick the checkbox for ner.
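Along the same lines, if NER is the only component you need, you can avoid pulling in the tagger at all. A minimal sketch of the question's setup, assuming you only want to source the ner component (a hypothetical adaptation, not the tutorial's exact config):

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("ro")                      # blank Romanian pipeline
source_nlp = spacy.load("ro_core_news_lg")   # trained pipeline to copy from
nlp.tokenizer.from_bytes(source_nlp.tokenizer.to_bytes())

# source only the component you actually need; leaving the tagger out
# avoids the E143 "labels not initialized" error for that component
nlp.add_pipe("ner", source=source_nlp)

doc_bin = DocBin()                           # collect training docs as before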

Jenkins Active Choices Plugin, Dynamically fill checkbox parameters from yaml file

I am new to Jenkins and keep exploring different plugins and articles every day to learn something new about it.
I found that parameters can be filled in before the build using the Active Choices parameter plugin, from a JSON file uploaded to the Jenkins server. I googled it but couldn't find the right explanation. The hint I got was to use a Groovy script in the Active Choices parameter. If it's possible, please let me know.
I appreciate @Noam Helmer for the reference, but how can I capture group1, group2, group3, group4, ... from the file below? The all, children, and hosts keys will stay the same no matter how many groups are added to the file.
all:
  children:
    group1:
      hosts:
        xyz4.axs:
    group2:
      hosts:
        xyz5.adf:
        xyz8.asf:
    group3:
      hosts:
        xyz3.asd:
        xyz6.ads:
        xyz7.asd:
I tried the following script but was unable to get the group names.
import org.yaml.snakeyaml.Yaml

List groups = []
Yaml parser = new Yaml()
def example = parser.load(("/var/lib/jenkins/workspace/groups/hosts" as File).text)
for (details in example) {
    groups.add(details)
}
println(groups)
Now, I don't know if I'm on the right track; please give me a hint.
Any help would be appreciated.
I finally found the answer; once again, thanks @NoamHelmer for replying.
import org.yaml.snakeyaml.Yaml

List groups = []
Yaml parser = new Yaml()
def example = parser.load(("path to file" as File).text)
// top-level keys (e.g. "all")
for (anf in example.keySet()) {
    groups.add(anf)
}
// group names under all.children (e.g. "group1", "group2", ...)
for (inf in example.all.children.keySet()) {
    groups.add(inf)
}
return groups
The output will be: [all, group1, group2, group3]

Unable to use openvino model

I am using the person-detection-action-recognition-0005 pre-trained model from OpenVINO to detect people and their actions.
https://docs.openvinotoolkit.org/latest/_models_intel_person_detection_action_recognition_0005_description_person_detection_action_recognition_0005.html
Based on the above link, I wrote a Python script to get detections.
This is the script:
import cv2

def main():
    print(cv2.__file__)
    frame = cv2.imread('/home/naveen/Downloads/person.jpg')
    actionNet = cv2.dnn.readNet('person-detection-action-recognition-0005.bin',
                                'person-detection-action-recognition-0005.xml')
    actionBlob = cv2.dnn.blobFromImage(frame, size=(680, 400))
    actionNet.setInput(actionBlob)

    # detection output
    actionOut = actionNet.forward(['mbox_loc1/out/conv/flat',
                                   'mbox_main_conf/out/conv/flat/softmax/flat',
                                   'out/anchor1', 'out/anchor2',
                                   'out/anchor3', 'out/anchor4'])

    # this is the part where I don't know how to get the person bboxes
    # and action labels for those persons from actionOut
    for detection in actionOut[2].reshape(-1, 3):
        print('sitting ' + str(detection[0]))
        print('standing ' + str(detection[1]))
        print('raising hand ' + str(detection[2]))
Now, I don't know how to get bounding boxes and action labels from the output variable (actionOut). I am unable to find any documentation or blog post explaining this.
Does someone have an idea or suggestion for how it can be done?
There is a demo called smart_classroom_demo: link.
This demo uses the network you are trying to run.
The parsing of the outputs is located here.
The implementation is in C++, but it should help you understand how the network's outputs are parsed.
Hope it helps.
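Until you port that parsing logic, a small sketch like the one below can at least confirm what each of the six outputs looks like before you try to decode boxes and actions. It reuses the file names and output names from the question's script; the shape comments are assumptions based on the SSD-style layout described in the model docs, not verified output semantics:

import cv2

net = cv2.dnn.readNet('person-detection-action-recognition-0005.bin',
                      'person-detection-action-recognition-0005.xml')
frame = cv2.imread('person.jpg')  # any test image containing a person
net.setInput(cv2.dnn.blobFromImage(frame, size=(680, 400)))

out_names = ['mbox_loc1/out/conv/flat',                    # assumed: SSD box deltas
             'mbox_main_conf/out/conv/flat/softmax/flat',  # assumed: detection confidences
             'out/anchor1', 'out/anchor2',                 # assumed: per-anchor action scores
             'out/anchor3', 'out/anchor4']
outs = net.forward(out_names)

# Inspect the raw shapes before writing any decoding logic.
for name, out in zip(out_names, outs):
    print(name, out.shape)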

Inception5h vs Inception V4, what is 5h

I have been following the GitHub repository for "TensorFlow on Android".
This link shows all the Inception models, but not inception5h.
The demo application for TensorFlow on GitHub uses inception5h, as shown here:
new_http_archive(
    name = "inception5h",
    build_file = "models.BUILD",
    url = "https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip",
    sha256 = "d13569f6a98159de37e92e9c8ec4dae8f674fbf475f69fe6199b514f756d4364"
)
Please explain:
1. Why is it inception5h and not Inception V5?
2. Why is inception5h not listed in the models link above?
Inception 5h is equivalent to Inception V1. This just comes down to a bit of confusion about which versioning scheme we were publishing things under :)
