I've started a project titled "Multi-Modal (Image, Audio, Text) Analysis of Personality Traits", I need refernces to dataset that has image , audio and speech annotation of video which are also labelled w.r.t "Big Five Personality Traits", Thank you in advance!
I tried searching many research papers, most of them didn't have their code open or their data didn't meet my requirement and I tried to access chalearn first impression dataset but couldn't get access to it.
Related
it is my first time dealing with audio data and i'm having a bit of trouble extracting the freature of a real sample audio to pass on the model for evaluation especially that i didn't really understand how the data was pre-processed from the paper below and "The exact order of appearance of the features is not known"
as stated in the Attribute Information.
https://aclanthology.org/H90-1075.pdf
any help would be appreciated.
i tried searching for "spectral coefficients; contour features, sonorant features, pre-sonorant features, and post-sonorant features" but got with a lot of information to start with.
So I'm working on a project for my University where our current plan is to use the YouTube API and do some data analysis. We have some ideas since we're looking at the Terms of Service and the Developer Policies, but we're not entirely sure about a few things.
Our project does not focus on things such as monetary gain or predicting estimated income from a video or anything of that nature, or anything regarding trying to determine user data such as passwords/usernames, etc. It's much more about the content and statistics of the videos rather than anything else.
Our current ideas that we want to be sure would be ok to do and use:
Determine the category of a video given its title
Determine the category of a video given its tags
Determine the category of a video given its description
Determine the category of a video given its thumbnail
Some combination of above to create an ensemble model
Clustering on videos category/view counts
Sentiment analysis on comments
Trending topics over time
These are just a vague list for now but I would love to be able to reach out more to figure out what all we're allowed to use the data for.
I want to make an application for counting game statistics automatically. For that purpose, I need some sort of computer vision for handling screenshots of the game.
There are bunch of regions with different skills in always the same place that app needs to recognize. I assume that it should have a database of pictures or maybe some trained samples.
I've started to learn opencv lib, but not sure what will be better for this purpouse.
Would you please give me some hints or algorithms that I could use?
Here is the example of game screenshot.
You can covert it into gray scale and then use any haar cascade classifier to read the words in that image and then save it into any file format (csv) this way you can utilize your game pics for gathering data so that you can train your models
So I am working on a project for school and what we are trying to do is to teach a neural network to recognize buildings from non-buildings. The problem I am having right now is representing the data in a form, that would be "readable" by the classifier function.
The training data is a bunch of pictures + .wkt file with coordinates of buildings on a picture. So far we have been able to rescale the polygons, but kinda got stuck there.
Can you give any hints or ideas of how to bring this all to an appropriate form?
Edit: I do not need the code written for me, a link to an article on a similar subject or a book is more of stuff I am looking for.
You did not mention what framework you are using, but I will give an answer for caffe.
Your problem is very close to detecting objects within an image. You have full images with object (building in your case) bounding boxes.
The easiest way of doing this is through a python data layer which reads an image and a file with stored coordinates for that image and feeds that into your network. A tutorial on how to use it can be found here: https://github.com/NVIDIA/DIGITS/tree/master/examples/python-layer
To accelerate the process you may want to store image, coordinate pairs in your custom lmdb database.
Finally a good working example with complete caffe implementation can be found within Faster-RCNN library here: https://github.com/rbgirshick/caffe-fast-rcnn/
You should check roi_pooling_layer.cpp in their custom caffe branch and roi_data_layer on how the data is fed into the network.
I'm currently reading for BSc Creative Computing with the University of London and I'm in my last year of my studies. The only remaining module I have left in order to complete the degree is the Project.
I'm very interested in the area of content-based image retrieval and my project idea is based on that concept. In a nutshell, my idea is to help novice artists in drawing sketches in perspective with the use of 3D models as references. I intend to achieve this by rendering the side/top/front views of each 3D model in a collection, pre-process these images and index them. While drawing, the user gets a series of models (that have been pre-processed) that best match his/her sketch, which can be used as guidelines to further enhance the sketch. Since this approach relies on 3D models, it is also possible for the user to rotate the sketch in 3D space and continue drawing based on that perspective. Such approach could help comic artists or concept designers in quickly sketching their ideas.
While carrying out my research I came across LIRe and I must say I was really impressed. I've downloaded the LIRe demo v0.9 and I played around with the included sample. I've also developed a small application which automatically downloades, indexes and searches for similar images in order to better understand the inner workings of the engine. Both approaches returned very good results even with a limited set of images (~300).
Next experiment was to test the output response when a sketch rather than an actual image is provided as input. As mentioned earlier, the system should be able to provide a set of matching models based on the user's sketch. This can be achieved by matching the sketch with the rendered images (which are of course then linked to the 3D model). I've tried this approach by comparing several sketches to a small set of images and the results were quite good - see http://claytoncurmi.net/wordpress/?p=17. However when I tried with a different set of images, results weren't as good as the previous scenario. I used the Bag of Visual Words (using SURF) technique provided by LIRe to create and search through the index.
I'm also trying out some sample code that comes with OpenCV (I've never used this library and I'm still finding my way).
So, my questions are;
1..Has anyone tried implementing a sketch-based image retrieval system? If so, how did you go about it?
2..Can LIRe/OpenCV be used for sketch-based image retrieval? If so, how this can be done?
PS. I've read several papers about this subject, however I didn't find any documentation about the actual implementation of such system.
Any help and/or feedback is greatly appreciated.
Regards,
Clayton