I have an image with wooden trunks.
I have to detect each wooden trunk individually. It looks similar to the following image:
[image: wooden trunks]
Do you have any ideas about how to approach this?
Should I use AI? Or just machine learning, like an SVM? Or some pattern recognition algorithm?
Or could I train a model on a dataset like the following?
[image: training dataset]
I tried to detect circles/ellipses, but it didn't give good results.
I also read that wood reflects red light.
I don't have much experience with OpenCV, so I don't know which approach is best for this task.
Thank you for your help.
I think retraining YOLO seems like a good option:
https://github.com/AlexeyAB/darknet
You'll need about 2,000 labeled images, plus image augmentation. I've used this library for image augmentation for YOLO:
https://github.com/aleju/imgaug/
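For reference, here's a minimal sketch of how imgaug can augment an image together with its bounding boxes so the YOLO labels stay consistent (the image array and box coordinates below are placeholders):

```python
import numpy as np
import imgaug.augmenters as iaa
from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage

# One hypothetical trunk image with one labeled box (x1, y1, x2, y2).
image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a real photo
bbs = BoundingBoxesOnImage([BoundingBox(x1=100, y1=120, x2=220, y2=240)],
                           shape=image.shape)

seq = iaa.Sequential([
    iaa.Fliplr(0.5),               # horizontal flips
    iaa.Affine(rotate=(-15, 15)),  # small rotations
    iaa.Multiply((0.8, 1.2)),      # brightness jitter
])

# Augment the image and its boxes together so the labels stay aligned.
image_aug, bbs_aug = seq(image=image, bounding_boxes=bbs)
```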
I need to recognize 10 classes of variations (in color and size) of the same object (a bottle cap) while it is falling, taking into account that the camera sees different viewpoints of the object. I have split this into sub-tasks:
1) Trained a deep learning model to classify only the flat surface of the object; this attempt was successful.
[image: flat faces of 2 sample classes]
2) Instead of taking the fall into account, trained a model for possible perspective changes; this was not successful.
[image: perspective changes of 2 sample classes]
What are the approaches to recognizing the object even under perspective changes? I am not constrained to a single-camera solution and am open to ideas for approaching this problem of variable perspectives.
Any help would be really appreciated. Thanks in advance!
The answer I want to give you is: CapsNets
You should definitely check out the paper, where you will be introduced to some shortcomings of CNNs and how CapsNets try to fix them.
That said, I find it hard to believe that your architecture cannot solve the problem successfully when the perspective changes. Is your dataset extremely small? I'd expect the neural network to learn filters for the riffled edges, which can be seen from all perspectives.
If you're not limited to one camera, you could train a "normal" classifier, feed it multiple images in production, and average the predictions (a sketch of this follows). Or you could build an architecture that takes in multiple perspectives at once. You have to try for yourself what works best.
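A minimal sketch of the averaging idea, assuming `model` is a trained Keras-style classifier and `views` is a list of preprocessed image batches:

```python
import numpy as np

def predict_multiview(model, views):
    # Average the class probabilities over all camera views,
    # then pick the most likely class.
    probs = [model.predict(v) for v in views]
    return np.mean(probs, axis=0).argmax(axis=-1)
```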
Also, never underestimate the power of old-school image preprocessing. If you have 3 different perspectives, you could take the one that comes closest to the "flat" perspective. This may be as easy as picking the image with the largest colored area, i.e. the one where img.sum() is highest.
Another idea is to determine the color through explicit programming, which should be fairly easy, and then feed the network a grayscale image. Maybe your network latches onto the strong color correlation and ignores the shape altogether.
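A sketch of that idea with OpenCV (the median-hue estimate is a crude assumption, not a calibrated colour measurement):

```python
import cv2
import numpy as np

def shape_only_input(image_bgr):
    # Read the cap colour off explicitly: median hue as a crude estimate.
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    dominant_hue = float(np.median(hsv[..., 0]))
    # Hand the network a grayscale image so it must rely on shape.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return gray, dominant_hue
```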
I am interested in the possibility of training a TensorFlow model to modify images, but I'm not quite sure where to get started. Almost all of the examples/tutorials dealing with images are for image classification, but I think I am looking for something a little different.
Image classification training data typically includes the images plus a corresponding set of classification labels, but I am thinking of a case of an image plus a "to-be" version of the image as the "label". Is this possible? Is it really just a classification problem in disguise?
Any help on where to get started would be appreciated. Also, the solution does not have to use TensorFlow, so any suggestions on alternate machine learning libraries would also be appreciated.
For example, let's say we want to train TensorFlow to draw circles around objects in a picture.
Example inbound image: [image]
Label/expected output: [image]
How could I accomplish that?
I can second that; it's really hard to find information about image modification with TensorFlow :( But have a look here: https://affinelayer.com/pix2pix/
From my understanding, you do use a GAN, but instead of feeding the generator random data during training, you feed it a sample input.
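Before going full pix2pix, a plain encoder-decoder trained with an L1 pixel loss on (input, target) image pairs already does image-to-image learning; pix2pix essentially adds an adversarial loss on top. A minimal Keras sketch (the 256x256 size is a placeholder):

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(256, 256, 3))
x = layers.Conv2D(64, 4, strides=2, padding="same", activation="relu")(inputs)
x = layers.Conv2D(128, 4, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(64, 4, strides=2, padding="same",
                           activation="relu")(x)
outputs = layers.Conv2DTranspose(3, 4, strides=2, padding="same",
                                 activation="sigmoid")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mae")  # L1 loss against the target image
# model.fit(input_images, target_images, epochs=..., batch_size=...)
```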
Two popular ways (the ones that I know about) to make models generate/edit images are:
Deep Convolutional Generative Adversarial Networks
Back-propagation through a pre-trained image classification model (in a similar manner to Deep Dream): you feed the wanted label back from the final layer and apply gradient descent to the image only, keeping the model weights frozen. This was explained in more detail in the CS231n course (this lecture); a rough sketch follows.
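A minimal sketch of that second approach, assuming TF >= 2.2 (the target class index and step count are arbitrary):

```python
import tensorflow as tf

# Nudge an input image toward a target class by gradient ascent on the
# class logit; the classifier's weights stay frozen throughout.
model = tf.keras.applications.MobileNetV2(weights="imagenet",
                                          classifier_activation=None)
image = tf.Variable(tf.random.uniform((1, 224, 224, 3)))
target_class = 3  # hypothetical ImageNet class index

opt = tf.keras.optimizers.Adam(learning_rate=0.05)
for _ in range(100):
    with tf.GradientTape() as tape:
        logits = model(image, training=False)
        loss = -logits[0, target_class]  # minimizing this ascends the logit
    grads = tape.gradient(loss, [image])
    opt.apply_gradients(zip(grads, [image]))
    image.assign(tf.clip_by_value(image, 0.0, 1.0))  # keep pixels in range
```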
But I don't think either approach fits the circle-around-"3" example you gave. I think object detection and instance segmentation would be more helpful: detect the object you are looking for, extract its boundaries via segmentation, and post-process them to draw the circle you wish for (or any other shape).
Reference for the images: Intro to Deep Learning for Computer Vision
I'm working on a project to segment tissue, and so far so good. But now I want to segment the damaged tissue from the good tissue. Here is an example image. As you can see, the good tissue is smooth and the damaged tissue is not. I had the idea of detecting edges to do the segmentation, but it gives bad results.
I'm open to any suggestions.
Use a convolutional neural network, for example any of the prebuilt ones in the Caffe package. Label the different kinds of areas in as many images as you have, then use many (thousands of) small (32x32) patches from those to train the network (see the sketch below). This will produce much better results than any kind of handcrafted algorithm.
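A rough sketch of the patch extraction, assuming each training image comes with a per-pixel label mask (names are hypothetical):

```python
import numpy as np

def extract_patches(image, label_mask, size=32, stride=16):
    """Yield (patch, label) pairs, labelling each patch by its centre pixel."""
    h, w = image.shape[:2]
    for y in range(0, h - size, stride):
        for x in range(0, w - size, stride):
            patch = image[y:y + size, x:x + size]
            label = label_mask[y + size // 2, x + size // 2]
            yield patch, label
```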
A very simple approach, which can be used as an intermediate test, could be the following:
1) Blur the image to reduce noise. This is an important step; OpenCV provides an inbuilt method for it.
2) Find contours using the OpenCV method findContours().
3) If the perimeter of a contour is greater than a set threshold (you will have to pick a value), consider it smooth tissue; otherwise discard it.
This is a really simple approach, and a program for it can be written very quickly.
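A sketch of those three steps with OpenCV-Python (the threshold value and file path are placeholders you'd tune for your images):

```python
import cv2

img = cv2.imread("tissue.png", cv2.IMREAD_GRAYSCALE)

# Step 1: blur to reduce noise.
blurred = cv2.GaussianBlur(img, (5, 5), 0)

# Step 2: binarise, then find contours (OpenCV 4 return signature assumed).
_, binary = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

# Step 3: keep contours whose perimeter exceeds the threshold.
PERIMETER_THRESHOLD = 200.0  # tune for your images
smooth = [c for c in contours if cv2.arcLength(c, True) > PERIMETER_THRESHOLD]
```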
I have good-resolution face images and I would like to automatically detect the iris and determine its color. Is there any state-of-the-art (standard) way to detect the iris other than HoughCircles, which is not giving consistent results across images? One constraint I have is that I must use still images (no video is available).
I am using OpenCV-Python for image processing. Any help is highly appreciated.
I think the problem can be split into two parts:
Localisation of the iris regions
Estimating the colour
Step one is time-consuming, but I have done this at my workplace. You can train a Haar-cascade classifier on iris images (grayscale) and localise the iris within the eye region returned by the cascade classifier for the eyes. If you already have a collection of face images, you can use them; otherwise, try to collect as many samples as possible with the same image quality as the images you want to process.
Step two is relatively easy, but might not be "very easy" because of auto white balance etc.
If you want a simpler approach, try detecting the white regions of the eye and using them to localise the iris.
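For the localisation step, here's a sketch using OpenCV's stock Haar cascades as a stand-in for a custom-trained iris cascade (the median hue is only a crude proxy for iris colour, and the file path is a placeholder):

```python
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

img = cv2.imread("face.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
    roi_gray = gray[y:y + h, x:x + w]
    roi_color = img[y:y + h, x:x + w]
    for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(roi_gray):
        eye = roi_color[ey:ey + eh, ex:ex + ew]
        hsv = cv2.cvtColor(eye, cv2.COLOR_BGR2HSV)
        print("median hue in eye region:", np.median(hsv[..., 0]))
```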
I'm attempting to implement an easter egg in a mobile app I'm working on. The easter egg will be triggered when a logo is detected in the camera view. The logo I'm trying to detect is this one: [logo image].
I'm not quite sure what the best way to approach this is, as I'm pretty new to computer vision. I'm currently finding horizontal edges using the Canny algorithm and then finding line segments using the probabilistic Hough transform. The output looks as follows (blue lines represent the line segments detected by the probabilistic Hough transform):
The next step I was going to take would be to look for a group of around 24 lines (fitting within a nearly square rectangle), each of approximately the same length. I'd use these two signals to indicate the potential presence of the logo. I realise this is probably a very naive approach, so I'd welcome suggestions on how to detect this logo more reliably.
Thanks
You may want to go with SIFT using Rob Hess' SIFT library. It uses OpenCV and is also pretty fast. I'd guess it's easier than your current approach to logo detection :)
Also look into SURF, which claims to be faster and more robust than SIFT. This Feature Detection tutorial will help you.
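A sketch of template matching with SIFT, assuming opencv-python >= 4.4 (where SIFT moved into the main package); file paths are placeholders:

```python
import cv2

logo = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(logo, None)
kp2, des2 = sift.detectAndCompute(frame, None)

# Match descriptors and apply Lowe's ratio test to keep distinctive matches.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} good matches")  # many matches suggest the logo is present
```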
You may just want to use LogoGrab's technology. It's the best out there and offers all sorts of APIs (both mobile and HTTP). http://www.logograb.com/technologyteam/
I'm not quite sure you would find enough distinctive features in the logo for a SIFT/SURF approach. As an alternative, you can try training a Haar-like feature classifier and use it to detect the logo, just like OpenCV does for face detection.
You could also try TensorFlow's object detection API here:
https://github.com/tensorflow/models/tree/master/research/object_detection
The good thing about this API is that it contains state-of-the-art models for object detection and classification. The models TensorFlow provides are free to train, and some of them promise quite astonishing results. I have already trained a model for the company I work at that does quite an amazing job at logo detection in images and video streams. You can check out more of my work here:
https://github.com/kochlisGit/LogoLens