I need someone to suggest a dataset for palm trees. I am doing a computer vision project on detecting palm trees, but given the short time I have, I need a ready-made, high-resolution dataset of palms. Any suggestions?
Probably your best bet is to use ImageNet. It's a huge dataset containing a large number of images and label categories. You can find a list of the categories and their corresponding codes here. Palm trees have the code n12582231, so you can use this link to access the URLs of the images which contain palm trees. You haven't stated what you're trying to do with respect to the palm trees (are you trying to localize them? Detect whether an image contains a palm tree? Identify different types?), but hopefully this gives you a good place to start. If you need more information with regard to using ImageNet, you should read through the ImageNet API.
Edit: Based on the feedback that you are trying to count palm trees, you might be able to use the bounding boxes included in ImageNet (more information here). Using the code for palm trees, you can download the bounding boxes for the palm tree images from this link. Unfortunately, these are just bounding boxes, not tight object masks, so they may not work for your purposes. However, there should be one bounding box per instance, so you should at least be able to use them to get your count.
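For example, a minimal sketch of fetching the image URL list for that synset (this assumes the ImageNet synset-URL endpoint behaves as documented in the API above; check it before relying on this):

```python
# Fetch the image URLs for the palm-tree synset (n12582231) from the
# ImageNet URL endpoint. The endpoint and its plain-text, one-URL-per-line
# response format are assumptions based on the ImageNet API docs.
import requests

WNID = "n12582231"  # palm tree synset code
url = f"http://www.image-net.org/api/text/imagenet.synset.geturls?wnid={WNID}"

resp = requests.get(url, timeout=30)
resp.raise_for_status()

image_urls = [line.strip() for line in resp.text.splitlines() if line.strip()]
print(f"Found {len(image_urls)} candidate palm-tree image URLs")
```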
This might be a very broad question, so I'm sorry in advance. I'd also like to point out that I'm new to the CV field, so my insight here is minimal.
I am trying to find correspondences between points in a FLIR (thermal) image and a VIS (visible-light) image. I'm currently building 40x40-pixel regions around keypoints and applying the LoG (Laplacian of Gaussian) to them, then comparing the results to find the most similar regions (a sketch of this step is at the end of this question).
For example, I have these data sets:
Where the columns represent, in this order:
the image for which I'm trying to find a correspondence
the candidate images
the LoG of the first column
the LoG of the second column
To the human eye, it is very clear that the third image is the best match for the first set, while the first image is the best match for the second set.
I have tried various ways of expressing a similarity/dissimilarity between these images, such as SSD, cross-correlation, or mutual information, but they all fail to be consistent (they only work in some cases).
Now, my actual question is:
What should I use to express the similarity between images in a more semantic way, so that shapes matter more in deciding the best match than the actual pixel intensities? Do you know of any technique that would aid me in my quest to find these matches?
Thank you!
Note: I'm using OpenCV with Python right now, but the programming language and library are not important.
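For concreteness, here is roughly how I'm computing the LoG patches (a sketch with OpenCV; the patch size, sigma, filenames, and keypoint coordinates are just placeholders for my current setup):

```python
# Cut a 40x40 patch around a keypoint and apply the Laplacian of Gaussian
# (Gaussian blur followed by the Laplacian). Sigma and the patch size are
# arbitrary placeholder choices, not tuned values.
import cv2
import numpy as np

def log_patch(image, x, y, half=20, sigma=1.5):
    """Return the LoG of the (2*half)x(2*half) patch centred on (x, y)."""
    h, w = image.shape[:2]
    if not (half <= x < w - half and half <= y < h - half):
        return None  # keypoint too close to the border for a full patch
    patch = image[y - half:y + half, x - half:x + half].astype(np.float32)
    blurred = cv2.GaussianBlur(patch, (0, 0), sigma)
    return cv2.Laplacian(blurred, cv2.CV_32F)

flir = cv2.imread("flir.png", cv2.IMREAD_GRAYSCALE)  # placeholder filenames
vis = cv2.imread("vis.png", cv2.IMREAD_GRAYSCALE)

a = log_patch(flir, 120, 80)
b = log_patch(vis, 130, 85)
if a is not None and b is not None:
    ssd = float(np.sum((a - b) ** 2))  # one of the measures that proved inconsistent
    print("SSD:", ssd)
```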
Shown above is a sample image of a runway that needs to be localized (i.e., a bounding box drawn around the runway).
I know how image classification is done in TensorFlow. My question is: how do I label this image for training?
I want the model to output four numbers to draw a bounding box.
In CS231n they say that we use a classifier and a localization head.
But how does my model know where the runway is in a 400x400 image?
In short: how do I LABEL this image for training, so that after training my model detects and localizes (draws a bounding box around) runways in input images?
Please feel free to give me links to lectures, videos, github tutorials from where I can learn about this.
**********Not CS231n********** I already took that lecture and couldn't understand how to solve this using their approach.
Thanks
If you want to predict bounding boxes, then the labels are also bounding boxes. This is what most object detection systems use for training. You can have just bounding box labels, or, if you want to detect multiple object classes, a class label for each bounding box as well.
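For example, a minimal sketch of what such a label can look like (the exact format, field names, and normalization here are up to you, or to whichever detection framework you pick):

```python
# One hypothetical bounding-box label: pixel corner coordinates plus a class.
labels = [
    {
        "filename": "runway_0001.jpg",   # hypothetical image name
        "class": "runway",
        "bbox": [52, 140, 310, 215],     # [x_min, y_min, x_max, y_max] in pixels
    },
]

# During training, the model regresses these four numbers, often normalized
# by the image size (e.g. divided by 400 for a 400x400 image).
x_min, y_min, x_max, y_max = labels[0]["bbox"]
normalized = [v / 400.0 for v in (x_min, y_min, x_max, y_max)]
print(normalized)
```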
Collect data from Google or any other source that contains runway photos (from a fairly close view). I would suggest using a pre-trained image classification network (like VGG, AlexNet, etc.) and fine-tuning it on the downloaded runway data.
After building a good image classifier on the runway dataset, you can use any popular algorithm to generate region proposals from the image.
Now take all the region proposals, pass them to the classification network one by one, and check whether the network classifies each proposal as positive or negative. If it classifies a proposal as positive, your object (a runway) is most probably present in that region. Otherwise it's not.
If there are a lot of region proposals in which the object is present according to the classifier, you can use non-maximum suppression to reduce the number of positive proposals, as in the sketch below.
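A plain sketch of non-maximum suppression, not tied to any particular framework (boxes are [x_min, y_min, x_max, y_max]; the scores would come from the fine-tuned classifier above):

```python
# Keep the highest-scoring proposal, drop every remaining proposal that
# overlaps it by more than iou_threshold, and repeat.
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x_min, y_min, x_max, y_max] boxes."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the proposals to keep."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

boxes = [[10, 10, 60, 60], [12, 12, 62, 62], [200, 80, 260, 140]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]: the near-duplicate box 1 is suppressed
```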
What image recognition technology is good at identifying a low resolution object?
Specifically I want to match a low resolution object to a particular item in my database.
Specific Example:
Given a picture with bottles of wine, I want to identify which wines are in it. Here's an example picture:
I already have a database of high resolution labels to match to.
Given a high-res picture of an individual bottle of wine, it was very easy to match it to its label using Vuforia (an image recognition service). However, the service doesn't work well for lower-resolution matching, like the bottles in the example image.
Research:
I'm new to this area of programming, so apologies for any ambiguities or obvious answers to this question. I've been researching, but there's a huge breadth of technologies out there for image recognition. Evaluating each one takes a significant amount of time, so I'll try to keep this question updated as I research them.
OpenCV: seems to be the most popular open-source computer vision library. Many modules; not sure which are applicable yet (a first matching sketch is below, after this list).
Haar cascade detection: helps with pre-processing an image by orienting a component correctly (e.g. making a wine label vertical)
OCR: good for reading text at decent resolutions - not good for low-resolution labels where a lot of text is not visible
Vuforia: a hosted service that does some types of image recognition. Mostly meant for augmented reality developers. Doesn't offer control over the algorithm. Doesn't work at this kind of resolution.
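For reference, this is the kind of OpenCV local-feature matching I'm considering (a rough sketch with ORB; the filenames are placeholders, and ORB may itself struggle at very low resolutions):

```python
# Extract ORB keypoints from the low-res bottle crop and from each high-res
# label in the database, match descriptors, and rank labels by the number
# of good matches.
import cv2

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

query = cv2.imread("bottle_crop.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder
kp_q, des_q = orb.detectAndCompute(query, None)

best_label, best_score = None, 0
for label_file in ["label_a.jpg", "label_b.jpg"]:  # the label database
    label = cv2.imread(label_file, cv2.IMREAD_GRAYSCALE)
    kp_l, des_l = orb.detectAndCompute(label, None)
    if des_q is None or des_l is None:
        continue
    matches = matcher.match(des_q, des_l)
    score = sum(1 for m in matches if m.distance < 40)  # keep close matches only
    if score > best_score:
        best_label, best_score = label_file, score

print("Best match:", best_label, "with", best_score, "good matches")
```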
I am trying to create a spatial representation of features. Basically, an image is subdivided into grids, e.g. 4 grids, and features are detected for each grid. The features are clustered into visual words, histograms are created for each grid, and then I can match the corresponding grids with histogram intersection. Here is the paper I am working from: http://www.vision.caltech.edu/Image_Datasets/Caltech101/cvpr06b_lana.pdf. First of all, how can I subdivide an image and detect features? I found GridAdaptedFeatureDetector in OpenCV, but I do not know how to get the features for a particular grid cell. I could define a region of interest and detect features separately and add them to the histogram, but this sounds complicated and time-consuming. Maybe there is an easier way to do it. Any ideas are appreciated. Thanks in advance.
Your question is basically how one could implement her paper. The good news is that Prof. Lazebnik has shared the source code of her Spatial Pyramid here:
http://web.engr.illinois.edu/~slazebni/research/SpatialPyramid.zip
Nevertheless, it is a MATLAB implementation that you would have to port to OpenCV if you want.
You can also take a look at her slides, and the dataset used is here.
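If you do port it, the per-grid step itself is straightforward in OpenCV: detect keypoints once over the whole image, then assign each keypoint to a grid cell by its coordinates. A sketch (assuming ORB features and a single 2x2 grid level; the paper's pyramid just repeats this at several grid resolutions):

```python
# Detect keypoints once, then bin each keypoint into a grid cell by its
# (x, y) position instead of running the detector per region of interest.
import cv2

image = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder filename
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(image, None)

rows = cols = 2  # one pyramid level: a 2x2 grid
h, w = image.shape
cell_feats = {(r, c): [] for r in range(rows) for c in range(cols)}

if descriptors is not None:
    for kp, desc in zip(keypoints, descriptors):
        x, y = kp.pt
        r = min(int(y * rows / h), rows - 1)
        c = min(int(x * cols / w), cols - 1)
        cell_feats[(r, c)].append(desc)

# Each cell's descriptors can now be quantized against your visual-word
# vocabulary and accumulated into that cell's histogram.
for cell, feats in cell_feats.items():
    print(cell, len(feats), "features")
```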
Imagine a huge 360° panoramic shot from a security camera. I need to find people in this image. Suppose I already have a great classifier that tells whether a picture is a picture of a single human being.
What I don't understand is how to apply this classifier to the panoramic shot, which can contain multiple people. Should I apply it to all possible regions of the image? Or is there some way to search for "interesting points" and feed only the regions around those points to my classifier?
What keywords should I google, or which algorithms should I read about, to find ways of searching for regions of an image that may (or may not) contain objects for subsequent classification?
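To make the brute-force option concrete, this is roughly what I mean by applying the classifier to all possible regions (a sliding-window sketch; classify_patch stands in for the classifier I assume I have, and the window size, stride, and filename are placeholders):

```python
# Slide a fixed-size window over the panorama at a fixed stride (in practice
# also over several scales) and run the person classifier on every window.
import cv2

def classify_patch(patch):
    """Placeholder for the assumed person/not-person classifier."""
    return False

panorama = cv2.imread("panorama.jpg")   # placeholder filename
win_w, win_h, stride = 64, 128, 32      # typical pedestrian-window numbers

detections = []
h, w = panorama.shape[:2]
for y in range(0, h - win_h + 1, stride):
    for x in range(0, w - win_w + 1, stride):
        patch = panorama[y:y + win_h, x:x + win_w]
        if classify_patch(patch):
            detections.append((x, y, x + win_w, y + win_h))

print(len(detections), "candidate person windows")
```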