I am using the person-detection-action-recognition-0005 pre-trained model from OpenVINO to detect people and their actions.
https://docs.openvinotoolkit.org/latest/_models_intel_person_detection_action_recognition_0005_description_person_detection_action_recognition_0005.html
From the above link, I wrote a Python script to get detections. This is the script:
import cv2

def main():
    print(cv2.__file__)
    frame = cv2.imread('/home/naveen/Downloads/person.jpg')
    actionNet = cv2.dnn.readNet('person-detection-action-recognition-0005.bin',
                                'person-detection-action-recognition-0005.xml')
    actionBlob = cv2.dnn.blobFromImage(frame, size=(680, 400))
    actionNet.setInput(actionBlob)
    # detection output
    actionOut = actionNet.forward(['mbox_loc1/out/conv/flat',
                                   'mbox_main_conf/out/conv/flat/softmax/flat',
                                   'out/anchor1', 'out/anchor2',
                                   'out/anchor3', 'out/anchor4'])
    # this is the part where I don't know how to get the person bbox
    # and action label for each person from actionOut
    for detection in actionOut[2].reshape(-1, 3):
        print('sitting ' + str(detection[0]))
        print('standing ' + str(detection[1]))
        print('raising hand ' + str(detection[2]))

if __name__ == '__main__':
    main()
Now I don't know how to get the bounding boxes and action labels from the output variable (actionOut), and I am unable to find any documentation or blog post explaining this.
Does someone have an idea or a suggestion of how it can be done?
There is a demo called smart_classroom_demo: link
This demo uses the network you are trying to run.
The parsing of the outputs is located here.
The implementation is in C++, but it should help you understand how the outputs of the network are parsed.
Hope it helps.
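For the action scores alone, here is a minimal Python sketch continuing from the script in the question (an illustration only, not the demo's full decoding; the label order follows the model's documentation). It picks the most likely action per anchor cell and does not reconstruct bounding boxes, which the demo derives from the mbox_loc1 deltas together with the SSD prior boxes:

import numpy as np

ACTION_LABELS = ['sitting', 'standing', 'raising hand']

# Scores from 'out/anchor1': one (sitting, standing, raising hand)
# triple per anchor cell, mirroring the loop in the question.
anchor_scores = actionOut[2].reshape(-1, 3)
best_action = np.argmax(anchor_scores, axis=1)
for cell, label_id in enumerate(best_action):
    print(cell, ACTION_LABELS[label_id], anchor_scores[cell, label_id])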
I am trying to use the Helsinki-NLP models from Hugging Face, but I cannot find any instructions on how to do so. The README files are computer-generated and do not contain explanations. Can someone point me to a getting-started guide, or show an example of how to run a model like opus-mt-en-es?
On the model's page here there's a Use in Transformers link that shows the code to load it with their transformers package, as shown below:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-es-en")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-es-en")
Then use it as you would any other transformers model:
inp = "Me llamo Wolfgang y vivo en Berlin"
input_ids = tokenizer(inp, return_tensors="pt").input_ids
outputs = model.generate(input_ids=input_ids, num_beams=5, num_return_sequences=3)
print("Generated:", tokenizer.batch_decode(outputs, skip_special_tokens=True))
Output:
Generated: ['My name is Wolfgang and I live in Berlin', 'My name is Wolfgang and I live in Berlin.', "My name's Wolfgang and I live in Berlin."]
To translate on the fly, you can check the Hugging Face course here. It covers pipelines, which wrap this whole flow; consider:

from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-es-en")
translator("your-text-to-translate-here")
I am currently using jupyter-manim, since it is the most convenient way for me to use Manim. I'm running my code on Kaggle, and every time I use TextMobject in Manim, it throws an error that says Latex error converting to dvi. See log output above or the log file: media/Tex/54dfbfee288272f0.log. I've tried the TexMobject and Text functions, but only the Text function works. The Text function is limited, however, and I'm not sure how to change its font. Is there a way to fix this, or is it something that comes with using jupyter-manim? All the other features seem to work, such as drawing shapes, animating scenes, etc.
%%manim
class Text(Scene):
    def construct(self):
        first_line = TextMobject('Hi')
        second_line = TexMobject('Hi')
        # Only one that works
        third_line = Text('Hi')
I tried your Manim program and it worked as expected for me. I would try making sure you:
include from manimlib.imports import * in your first line (importing the Manim library)
include self.play(...) so you can see the objects
I think you already have these, but I'm mentioning them in case you don't.
You may also be getting the error because you do not have a LaTeX distribution installed on your system (e.g. MiKTeX or TeX Live).
I think part of your problem is the name of the class you chose: calling it Text shadows Manim's own Text class. I had problems with your code until I changed the name from Text to TextTest. Here is a minimal working example that works fine in my Jupyter notebook (after running import jupyter_manim, of course).
%%manim TextTest -p -ql
from manim import *

class TextTest(Scene):
    def construct(self):
        first_line = TextMobject('Hi 1')
        second_line = TexMobject('Hi 2').shift(DOWN)
        third_line = Text('Hi 3').shift(UP)
        self.add(first_line)
        self.add(second_line)
        self.add(third_line)
        self.wait(1)
Also, you should be aware that TextMobject and TexMobject have been deprecated.
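If you are on a recent Manim Community release, the deprecated classes have been replaced by Tex and MathTex, so an equivalent scene would look roughly like this (a sketch against the current API; TextTest2 is just an arbitrary name):

%%manim TextTest2 -p -ql
from manim import *

class TextTest2(Scene):
    def construct(self):
        first_line = Tex('Hi 1')                   # replaces TextMobject
        second_line = MathTex('Hi 2').shift(DOWN)  # replaces TexMobject
        third_line = Text('Hi 3').shift(UP)
        self.add(first_line, second_line, third_line)
        self.wait(1)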
I am trying to use a Haar cascade classifier for object detection. I copied code for the Haar cascade workflow, but it's not working. It gives this error:
unknown url type: '//drive.google.com/drive/folders/11XfAPOgFv7qJbdUdPpHKy8pt6aItGvyg'
even though the link opens fine in a browser.
import urllib.request, urllib.error, urllib.parse
import cv2
import os

def store_raw_images():
    neg_images_link = '//drive.google.com/drive/folders/11XfAPOgFv7qJbdUdPpHKy8pt6aItGvyg'
    neg_image_urls = urllib.request.urlopen(neg_images_link).read().decode()
    pic_num = 1
    if not os.path.exists('neg'):
        os.makedirs('neg')
    for i in neg_image_urls.split('\n'):
        try:
            print(i)
            urllib.request.urlretrieve(i, "neg/"+str(pic_num)+".jpg")
            img = cv2.imread("neg/"+str(pic_num)+".jpg", cv2.IMREAD_GRAYSCALE)
            # should be larger than samples / pos pic (so we can place our image on it)
            resized_image = cv2.resize(img, (100, 100))
            cv2.imwrite("neg/"+str(pic_num)+".jpg", resized_image)
            pic_num += 1
        except Exception as e:
            print(str(e))

store_raw_images()
I am expecting the output to be a set of negative images for building a dataset for object detection.
I think the missing "https:" at the start of the URL is causing that specific error.
Furthermore, you cannot just read a Drive folder like a text file: the folder has to be shared, you would need its sharing link, and even then you would have to parse the HTML response, which may not work at all.
I strongly suggest using a plain HTTP server or the Google Drive Python API instead.
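For illustration, here is a minimal sketch of the HTTP-server route, assuming you host a plain-text file with one image URL per line (the list URL below is hypothetical):

import os
import urllib.request

import cv2

def store_raw_images():
    # Hypothetical URL of a plain-text file listing one image URL per line.
    list_url = 'https://example.com/neg_image_urls.txt'
    neg_image_urls = urllib.request.urlopen(list_url).read().decode()
    os.makedirs('neg', exist_ok=True)
    for pic_num, url in enumerate(neg_image_urls.splitlines(), start=1):
        try:
            path = 'neg/' + str(pic_num) + '.jpg'
            urllib.request.urlretrieve(url, path)
            # resize so every negative sample has the same dimensions
            img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
            cv2.imwrite(path, cv2.resize(img, (100, 100)))
        except Exception as e:
            print(str(e))

store_raw_images()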
I am getting the following error when trying to run camera_calibration.cpp from OpenCV's calib3d tutorial code samples in streaming mode: Input does not exist: Invalid input detected. Application stopping. The input line in the XML input file looks like this: <Input>0</Input>, which should select the "0th" camera on my system.
I fixed this by enclosing the camera index in double quotes: <Input>0</Input> became <Input>"0"</Input>. (Simplistic parser... oh well, hope this helps some people.)
I am using the detectMSERFeatures function from MATLAB's Computer Vision Toolbox and have been running into errors. I have a black-and-white image whose features I want to detect, but I either invert the image before running the feature detection or filter for red in an image; either way, I end up with a binary image that I am trying to pass to detectMSERFeatures. I know a binary image does not work directly, but I have tried several conversions to a usable format and none of them have worked. detectMSERFeatures picks up features if I use rgb2gray on the original image, but not on any of my converted images. Here is everything I have tried so far:
% Attempt 1: cast the inverted binary image to uint8
Target1 = imread('Decal0.JPG');
Target1bw = ~im2bw(Target1);
Target = uint8(Target1bw);
[m, n] = size(Target);
regionsTarget = detectMSERFeatures(Target, 'MaxAreaVariation', 0.15, ...
    'ThresholdDelta', 15, 'RegionAreaRange', [10000 (m*n)/2]);

% Attempt 2: convert the inverted binary image to double
Target1 = imread('Decal0.JPG');
Target1bw = ~im2bw(Target1);
Target = im2double(Target1bw);
regionsTarget = detectMSERFeatures(Target, 'MaxAreaVariation', 0.15, ...
    'ThresholdDelta', 15, 'RegionAreaRange', [10000 (m*n)/2]);

% Attempt 3: build a 3-channel image from the binary mask, then rgb2gray
Target1 = imread('Decal0.JPG');
Target1bw = ~im2bw(Target1);
Target2 = 255 * Target1bw;
[m, n] = size(Target2);
Target3 = zeros(m, n, 3);
Target3(:,:,1) = Target2;
Target3(:,:,2) = Target2;
Target3(:,:,3) = Target2;
Target3 = uint8(Target3);
Target = rgb2gray(Target3);
regionsTarget = detectMSERFeatures(Target, 'MaxAreaVariation', 0.15, ...
    'ThresholdDelta', 15, 'RegionAreaRange', [10000 (m*n)/2]);
What have I done incorrectly?
I brought the question up to MathWorks and it turned out to be a bug in MATLAB. Here is their response:
We have detected a bug in detectMSERFeatures when it handles binary images. A workaround is to use regionprops to detect the regions for binary images. Then, MSERRegions can be constructed as follows:
props = regionprops(im2bw(newGrayTarget), 'PixelList');
pixlist = {};
for i = 1:numel(props)
    pixlist = [pixlist; int32(props(i).PixelList)];
end
r = MSERRegions(pixlist);
Thanks for the help!