I am trying to create a handwritten digit recogniser that uses a Core ML model.
I am taking the code from another similar project:
https://github.com/r4ghu/iOS-CoreML-MNIST
But I need to incorporate my own ML model into this project.
This is my model (the input image is 299x299):
https://github.com/LOLIPOP-INTELLIGENCE/createml_handwritten
My question is: what changes do I need to make to that project so that it uses my Core ML model?
I tried changing the shapes to 299x299, but that gives me an error.
In viewDidLoad, you should change the number 28 to 299 in the call to CVPixelBufferCreate(). In the original app, the mlmodel expects a 28x28 image but your model uses 299x299 images.
However, there is something else you need to change as well: replace kCVPixelFormatType_OneComponent8 with kCVPixelFormatType_32BGRA or kCVPixelFormatType_32ARGB. The original model uses grayscale images but yours expects color images.
P.S. Next time include the actual error message in your question. That's an important piece of information for people who are trying to answer. :-)
I am working on a project to detect FOD (Foreign Object Debris) on runways. FOD includes anything, such as nuts, bolts, screws, locking wires, plastic debris, and stones, that has the potential to damage an aircraft. I have searched the Internet for an image dataset, but none related to FOD is available. How can I make my own image dataset that can then be used for training?
Please guide me in making an image dataset for both classification and detection, and in the data pre-processing that will be required. Thanks, and I look forward to your reply!
Although the question is a bit vague regarding your requirements and the specs of your machine, I'll try to answer it. You'll need object detection for this task. There are many models you can use, such as YOLO, SSD, etc.
To create your own dataset, you can follow these steps:
1. Take lots of images of your objects of interest in various conditions, viewpoints and backgrounds. (Around 2000 per class should be good enough.)
2. Annotate (or mark) where your object is in each image. If you're using YOLO, use Yolo_mark for annotating. There should be similar tools for SSD and other models.
3. Begin training.
These steps should get you started or at least point you in the right direction.
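As a rough illustration of what the annotation step produces for YOLO, here is a minimal sketch that converts a pixel bounding box into one line of a Darknet-style label file (class id, then box centre and size normalised by the image dimensions). The function name `to_yolo_line` and the sample numbers are made up for this example; check the output format expected by the annotation tool and model you actually use.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Format one bounding box as a YOLO label line.

    box is (x_min, y_min, x_max, y_max) in pixels (hypothetical input layout).
    The result is "class x_center y_center width height", all normalised to 0..1.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / image_width
    y_center = (y_min + y_max) / 2.0 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x200 px bolt (class 0) inside a 400x500 px photo.
    print(to_yolo_line(0, (100, 50, 300, 250), 400, 500))
```

Each image typically gets one text file with one such line per object in it.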
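You can build your own classification dataset with this code. I wrote it, and it works correctly.
Set DATADIR, CATEGORIES and IMG_SIZE for your data; the values below are only placeholders. DATADIR should contain one sub-folder per category.

    import os
    import pickle

    import cv2

    DATADIR = "path/to/your/dataset"     # placeholder: folder with one sub-folder per category
    CATEGORIES = ["class_a", "class_b"]  # placeholder: the sub-folder names
    IMG_SIZE = 100                       # placeholder: target width and height in pixels

    if __name__ == "__main__":
        training_data = []
        for category in CATEGORIES:
            path = os.path.join(DATADIR, category)
            class_num = CATEGORIES.index(category)  # integer label for this category
            for img in os.listdir(path):
                img_array = cv2.imread(os.path.join(path, img))
                if img_array is None:  # skip files OpenCV cannot read
                    continue
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])

        # Split the (image, label) pairs into separate lists
        x_train = []
        y_train = []
        for features, label in training_data:
            x_train.append(features)
            y_train.append(label)

        # Save both lists as pickle files
        with open("x_train.pickle", "wb") as pickle_out:
            pickle.dump(x_train, pickle_out)
        with open("y_train.pickle", "wb") as pickle_out:
            pickle.dump(y_train, pickle_out)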
If you're starting completely from scratch, you can use "Dataset Directory", available on the Play Store. The app helps you create custom datasets using your phone. You'll have to sign in to your Google Drive so that your dataset is stored in Drive rather than on your phone. It also lets you label each item for classification and regression models.
Currently, the app supports binary image classification and image regression.
Hope this helps!
Download link:
https://play.google.com/store/apps/details?id=com.applaud.datasetdirectory
Pretty much brand new to ML here. I'm trying to create a hand-detection CoreML model using turicreate.
The dataset I'm using is from https://github.com/aurooj/Hand-Segmentation-in-the-Wild , which provides images of hands from an egocentric perspective, along with masks for the images. I'm following the steps in turicreate's "Data Preparation" (https://github.com/apple/turicreate/blob/master/userguide/object_detection/data-preparation.md) step-by-step to create the SFrame. Checking the contents of the variables throughout this process, there doesn't appear to be anything wrong.
Following data preparation, I follow the steps in the "Introductory Example" section of https://github.com/apple/turicreate/tree/master/userguide/object_detection
I get the hint of an error when turicreate is performing iterations to create the model. There doesn't appear to be any Loss at all, which doesn't seem right.
After the model is created, I try to test it with a test_data portion of the SFrame. The results of these predictions are just empty arrays though, which is obviously not right.
After exporting the model as a CoreML .mlmodel and trying it out in an app, it is unable to recognize anything (not surprisingly).
Being completely new to model creation, I can't figure out what might be wrong. The dataset seems quite accurate to me. The only changes I made to the dataset were that some of the masks didn't have explicit file extensions (they are PNGs), so I added the .png extension. I also renamed the images to follow turicreate's tutorial formats (i.e. vid4frame025.image.png and vid4frame025.mask.0.png). Again, the SFrame creation process using this data seems correct at each step. I was able to follow the process with turicreate's tutorial dataset (bikes and cars) successfully. Any ideas on what might be going wrong?
I found the problem, and it basically stemmed from my unfamiliarity with Python.
In one part of the Data Preparation section, after creating bounding boxes out of the mask images, each annotation is assigned a 'label' indicating the type of object the annotation is meant to be. My data had a different name format than the tutorial's data, so rather than each annotation having 'label': 'bike', my annotations had 'label': 'vid4frame25', 'label': 'vid4frame26', etc.
Correcting this such that each annotation has 'label': 'hand' seems to have corrected this (or at least it's creating a legitimate-seeming model so far).
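For anyone hitting the same thing, the fix amounts to something like the sketch below. It uses plain Python dictionaries rather than an SFrame so that it runs on its own; the helper name `relabel_annotations` and the sample coordinates are made up for illustration, and the dictionary layout is assumed to follow the tutorial's annotation format.

```python
def relabel_annotations(annotations, label):
    """Return copies of the annotation dicts with every 'label' set to `label`."""
    return [dict(annotation, label=label) for annotation in annotations]


if __name__ == "__main__":
    # What I had by mistake: labels derived from the file names.
    broken = [
        {"coordinates": {"x": 120, "y": 80, "width": 60, "height": 40},
         "label": "vid4frame25"},
        {"coordinates": {"x": 200, "y": 150, "width": 50, "height": 45},
         "label": "vid4frame26"},
    ]
    fixed = relabel_annotations(broken, "hand")
    print(sorted({annotation["label"] for annotation in fixed}))
```

A quick sanity check before training is to list the distinct labels in the data: there should be one per object class, not one per image.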
I understand that my question is not directly related to programming itself and looks more like research, but perhaps someone here can advise.
I have an idea for an app: the user takes a photo, and the app analyzes it, cuts out everything except the required object (a piece of clothing, for example), and saves the result as a separate image. Until recently this was a very difficult task, because the developer had to create a pretty good neural network and train it. But now that Apple has released the iPhone X with its TrueDepth camera, half of the problems can be solved. As I understand it, the developer can remove the background much more easily, because the iPhone will know where the background is located.
So only a few questions remain:
I. What is the format of photos which are taken by iPhone X with true depth camera? Is it possible to create neural network that will be able to use information about depth from the picture?
II. I've read about CoreML and tried some examples, but it's still not clear to me how the following behaviour can be achieved with an external neural network imported into CoreML:
1. Neural network gets an image as an input data.
2. NN analyzes it, finds required object on the image.
3. NN returns not only the determined type of object, but the cropped object itself or an array of coordinates/pixels of the area that should be cropped.
4. Application gets all required information from NN and performs necessary actions to crop an image and save it to another file or whatever.
Any advice will be appreciated.
Ok, your question is actually directly related to programming:)
Ad I. The format is HEIF, but if you develop an iPhone app you access the image data through the iOS APIs, so you can easily get the bitmap as a CVPixelBuffer.
Ad II.
1. Neural network gets an image as an input data.
As mentioned above, you want to get your bitmap first, so create a CVPixelBuffer. Check out this post for example. Then you use the CoreML API. You want to use the MLFeatureProvider protocol. An object that conforms to it is where you put your vector data, as an MLFeatureValue under a key name picked by you (like "pixelData").
    import CoreML

    class YourImageFeatureProvider: MLFeatureProvider {
        let imageFeatureValue: MLFeatureValue
        var featureNames: Set<String> = []

        init(with imageFeatureValue: MLFeatureValue) {
            featureNames.insert("pixelData")
            self.imageFeatureValue = imageFeatureValue
        }

        func featureValue(for featureName: String) -> MLFeatureValue? {
            guard featureName == "pixelData" else {
                return nil
            }
            return imageFeatureValue
        }
    }
Then you use it like this; the feature value is created with the init(pixelBuffer:) initializer on MLFeatureValue (initWithPixelBuffer: in Objective-C):

    let imageFeatureValue = MLFeatureValue(pixelBuffer: yourPixelBuffer)
    let featureProvider = YourImageFeatureProvider(with: imageFeatureValue)
Remember to crop/scale the image before this operation so that your network is fed an input of the proper size.
2. NN analyzes it, finds required object on the image.
Use prediction function on your CoreML model.
    do {
        let outputFeatureProvider = try yourModel.prediction(from: featureProvider)
        // success! your output feature provider has your data
    } catch {
        // your model failed to predict, check the error
    }
3. NN returns not only the determined type of object, but the cropped object itself or an array of coordinates/pixels of the area that should be cropped.
This depends on your model and whether you imported it correctly. Assuming you did, you access the output data through the returned MLFeatureProvider. Remember that this is a protocol: you can query the returned object directly with featureValue(for:) using your model's output names, or wrap it in a type similar to what I made for you in step 1, something like YourOutputFeatureProvider. There you have a bitmap and the rest of the data your NN spits out.
4. Application gets all required information from NN and performs necessary actions to crop an image and save it to another file or whatever.
Just reverse step 1, so from MLFeatureValue -> CVPixelBuffer -> UIImage. There are plenty of questions on SO about this so I won't repeat answers.
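If your model returns a per-pixel mask rather than a ready-made crop, the cropping itself is simple arithmetic. Below is a language-neutral sketch, written in Python only for brevity, using nested lists in place of pixel buffers. The helper names `mask_bounding_box` and `crop` are hypothetical; on iOS you would do the equivalent with the image APIs.

```python
def mask_bounding_box(mask):
    """Return (x0, y0, x1, y1) around the non-zero cells of a 2D mask.

    x1 and y1 are exclusive. Returns None when the mask is empty or all zero.
    """
    if not mask or not mask[0]:
        return None
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


def crop(image, box):
    """Cut the rectangle `box` out of an image stored as a list of pixel rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    image = [[4 * y + x for x in range(4)] for y in range(4)]
    box = mask_bounding_box(mask)
    print(box, crop(image, box))
```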
If you are a beginner, don't expect results overnight, but the path is here. For an experienced dev I would estimate several hours to get this done (plus the time to train the model and port it to CoreML).
Apart from CoreML (maybe you will find your model too sophisticated and you won't be able to port it to CoreML), check out Matthijs Hollemans' GitHub (very good resources on different ways of porting models to iOS). He is also around here and knows a lot about the subject.
I run Caffe using an image_data_layer and don't want to create an LMDB or LevelDB for the data, but the compute_image_mean tool only works with LMDB/LevelDB databases.
Is there a simple solution for creating a mean file from a list of files (the same format that image_data_layer is using)?
You may notice that recent models (e.g., GoogLeNet) do not use a mean file the same size as the input image, but rather a 3-vector representing a mean value per image channel. These values are quite "immune" to the specific dataset used (as long as it is large enough and contains "natural images").
So, as long as you are working with natural images, you may use the same values that, e.g., GoogLeNet uses: B=104, G=117, R=123.
The simplest solution is to create an LMDB or LevelDB database of the image set.
The complicated solution is to write a tool similar to compute_image_mean that takes the image files as input, applies the transformations and computes the mean.
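If a per-channel mean is enough (rather than a full mean image), such a tool can be quite short. The following is a minimal sketch, not a replacement for compute_image_mean: it assumes the list file has one "path label" pair per line, as image_data_layer reads it, and that OpenCV is installed for loading. The function names are made up for this example, and the pure-Python averaging loop is slow on large datasets.

```python
import os


def parse_image_list(lines):
    """Parse 'relative/path.jpg label' lines into (path, label) pairs."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        path, label = line.rsplit(maxsplit=1)
        entries.append((path, int(label)))
    return entries


def channel_means(images):
    """Mean per channel over images given as rows of 3-channel pixels."""
    sums = [0.0, 0.0, 0.0]
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c in range(3):
                    sums[c] += float(pixel[c])
                count += 1
    if count == 0:
        raise ValueError("no pixels to average")
    return [s / count for s in sums]


def load_images(entries, root=""):
    """Yield images loaded with OpenCV, which returns pixels in B, G, R order."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    for path, _label in entries:
        img = cv2.imread(os.path.join(root, path))
        if img is not None:  # skip unreadable files
            yield img


if __name__ == "__main__":
    # "train.txt" and "images/" are placeholder names for your own list file and root folder.
    with open("train.txt") as f:
        entries = parse_image_list(f)
    print(channel_means(load_images(entries, root="images/")))
```

The three resulting numbers can then be given as mean_value entries in the layer's transform_param instead of a mean_file.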
I am trying to determine whether a piece of food packaging has a printing error or not. For example,
does the logo "McDonald's" have a misprint, such as a wrong label or a wrong color? (I cannot post a picture.)
What should I do? Please help me!
It's not a trivial task by any stretch of the imagination. Two images of the same object will always differ depending on lighting conditions, perspective, shooting angle, etc.
Basically you need to:
1. Process the 2 images into "digested" data - dominant color, shapes, etc.
2. Design and run your own similarity algorithm between the 2 objects
You may want to look at the feature detectors in OpenCV: SURF, SIFT, etc.
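To make the two steps concrete, here is a deliberately simple sketch: it "digests" each image into a coarse color histogram and compares the two with histogram intersection. This is a stand-in chosen for brevity, not SIFT or SURF, and it would only catch color errors, not shape or text misprints. The function names and the tiny pixel lists are invented for the example.

```python
def color_histogram(pixels, bins=4):
    """Normalised histogram of (R, G, B) pixels with `bins` levels per channel."""
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Quantise each 0..255 channel value into one of `bins` levels.
        ri = min(r // step, bins - 1)
        gi = min(g // step, bins - 1)
        bi = min(b // step, bins - 1)
        hist[(ri * bins + gi) * bins + bi] += 1.0
        count += 1
    if count == 0:
        raise ValueError("no pixels")
    return [h / count for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in 0..1: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


if __name__ == "__main__":
    reference = [(255, 0, 0)] * 4                  # all-red reference logo
    sample = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2  # half of it printed blue
    score = histogram_intersection(color_histogram(reference),
                                   color_histogram(sample))
    print(score)  # flag the package when the score drops below a chosen threshold
```

In practice you would first align and crop both photos to the logo area, and pick the threshold from known-good samples.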
I only just found your question, so I may be too late.
If not, I think your problem can easily be solved: a tool for this has existed for years and is called Sikuli.
While it is meant for testing purposes, I have been using it in the way you need: to compare a reference image with a production image. It is based on OpenCV and does this very well.