I want to do binary image classification and I thought ImageJ would be a good platform to do it, but alas, I am at a loss. The pseudocode for what I want to do is below. How do I implement this in ImageJ? It does not have to follow the pseudocode exactly, I just hope that gives an idea of what I want to do. (If there is a plugin or a more suitable platform that does this already, can you point me to that?)
For training:
    train_directory = get folder from user
    train_set = get all image files in train_directory
    class_file = get text file from user
    // each row of class_file contains image name and a "1" or "-1"
    // a "1" indicates that image belongs to the positive class
    model = build_model_from_train_set_and_class_file
    write_model_to_output_file
// end train

For testing:
    test_directory = get folder from user
    test_set = get all images in test_directory
    if (user wants to)
        class_file = get text file from user
        // normally for testing one would always know the class of the image
        // the if statement is just in case the user does not
    model = read_model_file
    output = classify_images_as_positive_or_negative
    write_output_to_file
Note that there should not be any preprocessing done by the user: no additional set of masks, no drawings, no additional labels, etc. The algorithm must be able to figure out what is common among the training images from the training images alone and build up a model appropriately. Of course, the algorithm can do any preprocessing it wants to; it just cannot rely on that preprocessing having been done already.
I tried using CVIPtools for this but it wants a mask on all of the images to do feature extraction.
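To make the intent more concrete, here is a rough sketch of the same workflow in Python with scikit-learn (just an illustration of what I mean, not what I expect the ImageJ solution to look like; the resized-pixel features and the linear SVM are placeholder choices):

# Illustrative sketch only: a scikit-learn stand-in for the pseudocode above.
# The feature extraction (resized grayscale pixels) is a placeholder choice.
import os
import pickle
import numpy as np
from PIL import Image
from sklearn.svm import LinearSVC

IMG_SIZE = 64

def load_features(image_path):
    # Resize to a fixed size and flatten the grayscale pixels into a vector.
    img = Image.open(image_path).convert("L").resize((IMG_SIZE, IMG_SIZE))
    return np.asarray(img, dtype=np.float32).ravel() / 255.0

def train(train_directory, class_file, model_file):
    # class_file: each row is "<image name> <1 or -1>"
    labels = {}
    with open(class_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                labels[parts[0]] = int(parts[1])
    X = [load_features(os.path.join(train_directory, name)) for name in labels]
    y = list(labels.values())
    model = LinearSVC().fit(np.array(X), np.array(y))
    with open(model_file, "wb") as out:
        pickle.dump(model, out)

def test(test_directory, model_file, output_file):
    with open(model_file, "rb") as f:
        model = pickle.load(f)
    with open(output_file, "w") as out:
        for name in sorted(os.listdir(test_directory)):
            features = load_features(os.path.join(test_directory, name))
            prediction = model.predict([features])[0]  # 1 or -1
            out.write("%s %d\n" % (name, prediction))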
I am working on a project to detect FOD (Foreign Object Debris) found on runways. FOD includes anything like nuts, bolts, screws, locking wire, plastic debris, stones, etc. that has the potential to damage an aircraft. I have searched the Internet for an image dataset, but no dataset related to FOD is available. My question is: how can I make my own dataset of images that can then be used for training?
Kindly guide me in making an image dataset for both classification and detection purposes, and also in the data pre-processing that will be required. Thanks!
Although the question is a bit vague regarding your requirements and the specs of your machine, I'll try to answer it. You'll need object detection for your task. There are many models available that you can use, such as Yolo, SSD, etc.
To create your own dataset, you can follow these steps:
Take lots of images of your objects of interest in various conditions, viewpoints and backgrounds (around 2,000 per class should be good enough).
Annotate (or mark) where your object is in each image. If you're using Yolo, use Yolo-mark for annotating; there are similar tools for SSD and other models. (A sketch of the Yolo label format follows these steps.)
Now you can begin training.
These steps should get you started or at least point you in the right direction.
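To give an idea of what the annotation step produces, here is a rough sketch of the Yolo label format: each image gets a .txt file with one line per object, holding the class index and the box centre/size normalised by the image dimensions. The file name, box and class index below are made up:

# Hypothetical example: write one Yolo-format label line for an image.
# Format per line: <class_id> <x_center> <y_center> <width> <height>,
# with all coordinates normalised by the image width/height.
image_width, image_height = 1920, 1080   # made-up image size
x, y, w, h = 850, 400, 120, 90           # made-up box in pixels (top-left x, y, width, height)
class_id = 0                             # e.g. 0 = "bolt"

line = "%d %.6f %.6f %.6f %.6f" % (
    class_id,
    (x + w / 2.0) / image_width,
    (y + h / 2.0) / image_height,
    float(w) / image_width,
    float(h) / image_height,
)
with open("runway_0001.txt", "w") as f:  # same base name as runway_0001.jpg
    f.write(line + "\n")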
You can build your own dataset with this code. I wrote it, and it works correctly.
You need to fill in your own DATADIR (and adjust CATEGORIES and IMG_SIZE to your data); the required imports are included below.
import os
import pickle

import cv2

DATADIR = "path/to/your/data"          # folder containing one sub-folder per class
CATEGORIES = ["positive", "negative"]  # sub-folder names = class names
IMG_SIZE = 100                         # images are resized to IMG_SIZE x IMG_SIZE

training_data = []
x_train = []
y_train = []

if __name__ == "__main__":
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img))
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])
            except Exception:
                pass  # skip unreadable files

    for features, label in training_data:
        x_train.append(features)
        y_train.append(label)

    # create pickle files
    pickle_out = open("x_train.pickle", "wb")
    pickle.dump(x_train, pickle_out)
    pickle_out.close()

    pickle_out = open("y_train.pickle", "wb")
    pickle.dump(y_train, pickle_out)
    pickle_out.close()
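If you later want to use the pickles, e.g. in a training script, loading them back might look roughly like this (a sketch, assuming the file names from the code above):

import pickle

import numpy as np

# Load the arrays written by the script above.
with open("x_train.pickle", "rb") as f:
    x_train = np.array(pickle.load(f)) / 255.0  # scale pixel values to [0, 1]
with open("y_train.pickle", "rb") as f:
    y_train = np.array(pickle.load(f))

print(x_train.shape, y_train.shape)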
If you're starting completely from scratch, you can use "Dataset Directory", available on the Play Store. The app helps you create custom datasets using your phone. You'll have to sign in to your Google Drive so that your dataset is stored in Drive rather than on your phone. It also supports labelling the data for classification and regression predictive models.
Currently, the app supports binary image classification and image regression.
Hope this helped!
Download link:
https://play.google.com/store/apps/details?id=com.applaud.datasetdirectory
I have been following the Turicreate tutorial link here
I am able to train a model successfully as per the instructions, and I am able to consume the model in the iOS app with the help of the code provided there. But I cannot figure out how to get the actual most similar images based on the distances this model returns. Also, when I run it on an iPhone, it returns offset and element values which I am not able to interpret. Please see the screenshot.
My aim in the iOS application is to take an input image, pass it to the model, and then show the actual 5 or 10 most similar images as output, not just the distances.
Be sure that you have an id column or have added the id column to your SFrame using:
reference_data = reference_data.add_row_number()
Here is a code snippet to create the image_similarity model and get the ten most similar images for a sample image. I use fifty as the k value to demonstrate how to pull out the n most similar images from a larger group in your similarity graph.
import turicreate as tc

# create the image similarity model from the reference images
model = tc.image_similarity.create(reference_data)

similarity_graph = model.similarity_graph(k=50)
similar_images = similarity_graph.edges

# pick a sample image from reference_data
sample_index = 3
sample_similar_images = similar_images[similar_images["__src_id"] == sample_index].sort("rank")

# get the 10 most similar images
most_similar_ids = sample_similar_images["__dst_id"][0:10]
most_similar_images = reference_data.filter_by(most_similar_ids, "id")
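As an alternative to building the whole similarity graph, you can also query the model directly for a single image. A small sketch, continuing with the model and reference_data from the snippet above (the row index 3 is just an example):

# Query the model for the 10 nearest neighbours of one reference image.
sample = reference_data[3:4]         # an SFrame with a single row
results = model.query(sample, k=10)  # columns: query_label, reference_label, distance, rank
most_similar_ids = results["reference_label"]
most_similar_images = reference_data.filter_by(most_similar_ids, "id")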
For details on interpreting the distance MultiArray from the CoreML model, see the sample code in the Turi Create documentation. The element value corresponds to the distance.
I understand that my question is not directly related to programming itself and looks more like research, but perhaps someone can advise here.
I have an idea for an app: the user takes a photo, and the app analyzes it, cuts out everything except the required object (a piece of clothing, for example), and saves it as a separate image. Until recently this was a very difficult task, because the developer would have to create and train a pretty good neural network. But now that Apple has released the iPhone X with a TrueDepth camera, half of the problem can be solved. As I understand it, the developer can remove the background much more easily, because the iPhone will know where the background is located.
So only a few questions are left:
I. What is the format of photos taken by the iPhone X with the TrueDepth camera? Is it possible to create a neural network that can use the depth information from the picture?
II. I've read about CoreML and tried some examples, but it's still not clear to me how the following behaviour can be achieved with an external neural network imported into CoreML:
The neural network gets an image as input data.
The NN analyzes it and finds the required object in the image.
The NN returns not only the detected type of object, but also the cropped object itself, or an array of coordinates/pixels of the area that should be cropped.
The application gets all the required information from the NN and performs the necessary actions to crop the image and save it to another file or whatever.
Any advice will be appreciated.
Ok, your question is actually directly related to programming :)
Ad I. The format is HEIF, but you access the image data (if you develop an iPhone app) by means of iOS APIs, so you can easily get the bitmap as a CVPixelBuffer.
Ad II.
1. The neural network gets an image as input data.
As mentioned above, you want to get your bitmap first, so create a CVPixelBuffer (check out this post for an example). Then you use the CoreML API. You want to use the MLFeatureProvider protocol: an object which conforms to it is where you put your vector data as an MLFeatureValue, under a key name picked by you (like "pixelData").
import CoreML

class YourImageFeatureProvider: MLFeatureProvider {
    let imageFeatureValue: MLFeatureValue
    var featureNames: Set<String> = []

    init(with imageFeatureValue: MLFeatureValue) {
        self.imageFeatureValue = imageFeatureValue
        featureNames.insert("pixelData")
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        guard featureName == "pixelData" else {
            return nil
        }
        return imageFeatureValue
    }
}
Then you use it like this; the feature value is created with the MLFeatureValue(pixelBuffer:) initializer:
let imageFeatureValue = MLFeatureValue(pixelBuffer: yourPixelBuffer)
let featureProvider = YourImageFeatureProvider(imageFeatureValue: imageFeatureValue)
Remember to crop/scale the image before this operation so that your network is fed a vector of the proper size.
2. The NN analyzes it and finds the required object in the image.
Use the prediction function on your CoreML model.
do {
    let outputFeatureProvider = try yourModel.prediction(from: featureProvider)
    // success! your output feature provider has your data
} catch {
    // your model failed to predict, check the error
}
3. The NN returns not only the detected type of object, but also the cropped object itself, or an array of coordinates/pixels of the area that should be cropped.
This depends on your model and whether you imported it correctly. Assuming you did, you access the output data by checking the returned MLFeatureProvider (remember that this is a protocol, so you would have to implement another one similar to what I made for you in step 1, something like YourOutputFeatureProvider), and there you have the bitmap and the rest of the data your NN spits out.
4. The application gets all the required information from the NN and performs the necessary actions to crop the image and save it to another file or whatever.
Just reverse step 1, going from MLFeatureValue -> CVPixelBuffer -> UIImage. There are plenty of questions on SO about this, so I won't repeat the answers.
If you are a beginner, don't expect to have results overnight, but the path is there. For an experienced dev I would estimate this at several hours of work (plus model training time and porting it to CoreML).
Apart from CoreML (maybe your model is too sophisticated and you won't be able to port it to CoreML), check out Matthijs Hollemans' GitHub (very good resources on different ways of porting models to iOS). He is also around here and knows a lot about the subject.
I need a solution to compare two scanned images.
I have an image of an unfilled application form. I need to compare it against other images of the same form and detect whether any of them is a completely unfilled form.
I tried Emgu CV's AbsDiff, MatchTemplate, etc., but none of them give me a 100% match, even if I scan the same form twice in the same scanner, presumably because of scanning noise. I can apply a tolerance, but the problem is that I need to find out whether the user has filled in anything at all, and with a tolerance small changes in the form will not be detected.
I also had a look at the Python Imaging Library, Accord.NET, etc. but couldn't find an approach for comparing this type of image.
Any suggestions on how to do this type of image comparison?
Is there any free or paid library available for this?
Only inspect the form fields. Without distractions it's easier to detect small changes. You also don't have to inspect each pixel. Use the histogram or mean color. I used SimpleCV:
from SimpleCV import *

form = Image("form.png")
form_field = form.crop(34, 44, 200, 30)  # x, y, width, height of one form field
mean_color = form_field.meanColor()[0]   # for simplicity only the red channel
if mean_color < 253:
    print "Field is filled"
else:
    print "Field is empty"
Alternatively, look for features, e.g. corners, blobs or keypoints. Keypoints will ignore noise and work better with poor scans:
key_points = form_field.findKeypoints()
if key_points:
    print "Field is filled"
    key_points.show(autocolor=True)
else:
    print "Field is empty"
I'm trying to create a model with a training dataset and want to label the records in a test data set.
All the tutorials and help I find online only cover using cross-validation with a single data set, i.e., the training dataset; I couldn't find how to use test data. I tried to apply the resulting model to the test set, but the test set seems to have a different number of attributes than the training set after pre-processing. This is a text classification problem.
At the end I get output like this:
18.03.2013 01:47:00 Results of ResultWriter 'Write as Text (2)' [1]:
18.03.2013 01:47:00 SimpleExampleSet:
5275 examples,
366 regular attributes,
special attributes = {
confidence_1 = #367: confidence(1) (real/single_value)
confidence_5 = #368: confidence(5) (real/single_value)
confidence_2 = #369: confidence(2) (real/single_value)
confidence_4 = #370: confidence(4) (real/single_value)
prediction = #366: prediction(label) (nominal/single_value)/values=[1, 5, 2, 4]
}
But what I want is for all my examples to be labelled.
It seems that my test data and training data have a different number of attributes; I see many messages like the following in the logs.
Mar 18, 2013 1:46:41 AM WARNING: Kernel Model: The given example set does not contain a regular attribute with name 'wireless'. This might cause problems for some models depending on this particular attribute.
But how do we solve this problem in text classification, where we cannot know the number and names of the attributes beforehand?
Can someone please give me some pointers?
You probably use a Process Documents operator to preprocess both the training and the test set. It is important that both of these operators are set up identically. To "synchronize" the wordlist, i.e. to consider the same set of words in both of them, connect the wordlist (wor) output of the Process Documents operator used for training to the corresponding wordlist input port of the Process Documents operator used for preprocessing the test set.