So as a fun project, I've been messing with the TensorFlow API (in Java unfortunately.. but I should be able to get some results out anyways). My first goal is to develop a model for 2D point cloud filtering. So I have written code that generates random clouds in 224x172 resolution, computes the result of a neighbor density filter, and stores both (see images below).
So basically I have generated data for both an input and expected output, which can be done as much as needed for a massive dataset.
I have both the input and output arrays stored as 224x172 binary arrays (0 for no point at index, 1 for a point at that index). So my input and output are both 224x172. At this point, I'm not sure how to translate my input to my expected result. I'm not sure how to weight each "pixel" of my cloud, or how to "teach" the program the expected result. Any suggestions/guidance on whether this is even possible for my given scenario would be appreciated!
Please don't be too hard on me... I'm a complete noob when it comes to machine learning.
Imagine, that Tensorflow is a set of building blocks (like LEGO) that allows constructing machine learning models. After the model is constructed, it could be trained and evaluated.
So basically your question could be divided into three steps:
1. I'm new to machine learning. Please guide me how to choose the model that fits the task.
2. I'm new to tensorflow. I have the idea of model (see 1) and I want to construct it via Tensorflow.
3. I'm new to Java's API of tensorflow. I know how to build model using tensorflow (2), but I'm stuck in Java.
This sounds scaring, but that's not too bad really. And I'd suggest you the following plan to do:
1. You need to look through the machine learning models to find the model that suits your case. So you need to ask yourself: what are the models that could be used for cloud filtering? And basically, do you really need some machine learning model? Why don't you use simple math formulas?
That are the questions you may ask to yourself.
Ok, assume you've found some model. For example, you've found a paper describing neural network able to solve your tasks. Then go to the next step
2. Tensorflow has a lot of examples and code snippets. So you could even
find the code with your model implemented yet.
The bad thing, that most code examples and API are Python based. But as you want to go into machine learning, I'd suggest you studying Python. It's easy to enter. It's very common to use python in the science world as it allows not to waste time on wrappers, configuration, etc. (as Java needs). We just start solving our task from the first line of the script.
As I've told initially about tensorflow and LEGO similarity, I'd like to add that there are more high-level additions to the tensorflow. So you work with the not building blocks but some kind of layers of blocks.
Something like tflearn. It's very good especially if you don't have deep math or machine learning background. It allows building machine learning models in a very simple and understandable way. So do you need to add some neural network layer? Here you are. And that's all without complex low-level tensor operations.
The disadvantage that you won't be able to load tflearn model from Java.
Anyway, we assume, at the end of this step you are able to build your model, to train it and to evaluate the model and prediction quality.
3. So you have your machine learning model, you understand Tensorflow mechanics, and if you still need to work with Java that should be much easier yet.
I note, that you won't be able to load tflearn model from Java. You can try to use jython to call python's functions directly from Java, though I haven't tried it.
And on this way (1-3) you will definitely have some more questions. So welcome to SO.
Related
To be more specific, The traditional chatbot framework consists of 3 components:
NLU (1.intent classification 2. entity recognition)
Dialogue Management (1. DST 2. Dialogue Policy)
NLG.
I am just confused that If I use a deep learning model(seq2seq, lstm, transformer, attention, bert…) to train a chatbot, Is it cover all those 3 components? If so, could you explain more specifically how it related to those 3 parts? If not, how can I combine them?
For example, I have built a closed-domain chatbot, but it is only task-oriented which cannot handle the other part like greeting… And it can’t handle the problem of Coreference Resolution (it seems doesn't have Dialogue Management).
It seems like your question can be split into two smaller questions:
What is the difference between machine learning and deep learning?
How does deep learning factor into each of the three components of chatbot frameworks?
For #1, deep learning is an example of machine learning. Think of your task as a graphing problem. You transform your data so it has an n-dimensional representation on a plot. The goal of the algorithm is to create a function that represents a line drawn on the plot that (ideally) cleanly separates the points from one another. Each sector of the graph represents whatever output you want (be it a class/label, related words, etc). Basic machine learning creates a line on a 'linearly separable' problem (i.e. it's easy to draw a line that cleanly separates the categories). Deep learning enables you to tackle problems where the line might not be so clean by creating a really, really, really complex function. To do this, you need to be able to introduce multiple dimensions to the mapping function (which is what deep learning does). This is a very surface-level look at what deep learning does, but that should be enough to handle the first part of your question.
For #2, a good quick answer for you is that deep learning can be a part of each component of the chatbot framework depending on how complex your task is. If it's easy, then classical machine learning might be good enough to solve your problem. If it's hard, then you can begin to look into deep learning solutions.
Since it sounds like you want the chatbot to go a bit beyond simple input-output matching and handle complicated semantics like coreference resolution, your task seems sufficiently difficult and a good candidate for a deep learning solution. I wouldn't worry so much about identifying a specific solution for each of the chatbot framework steps because the tasks involved in each of those steps blend into one another with deep learning (e.g. a deep learning solution wouldn't need to classify intent and then manage dialogue, it would simply learn from hundreds of thousands of similar situations and apply a variation of the most similar response).
I would recommend handling the problem as a translation problem - but instead of translating from one language to another, you're translating from the input query to the output response. Translation frequently needs to resolve coreference and solutions people have used to solve that might be an ideal course of action for you.
Here are some excellent resources to read up on in order to frame your problem and how to solve it:
Google's Neural Machine Translation
Fine Tuning Tasks with BERT
There is always a trade-off between using traditional machine learning models and using deep learning models.
Deep learning models require large data to train and there will be an increase in training time & testing time. But it will give better results.
Traditional ML models work well with fewer data with moderate performance comparatively. The inference time is also less.
For Chatbots, latency matters a lot. And the latency depends on the application/domain.
If the domain is banking or finance, people are okay with waiting for a few seconds but they are not okay with wrong results. On the other hand in the entertainment domain, you need to deliver the results at the earliest.
The decision depends on the application domain + the data size you are having + the expected precision.
RASA is something worth looking into.
Recently, we planned to build a system for image processing to extract info from images. At present we are using AWS Rekognition to do that. But, in some cases, we are not getting accurate information from AWS. So, we've planned to build our own custom one.
We've 4/5 months to do that. At least a POC version. Also, we've planned to use Tensorflow for that. We all have no prior experience about Machine Learning & Deep Learning but already have 5/6yrs of experience on Computer Programming by using different languages.
Currently, I'm studying ML from a course of Udemy & my approach to solve this problem is...
Learn Machine Learning(ML)
Learn Deep Learning(DL)
Above ML & DL maybe I'll be ready to understand the whole thing & can able to build a system for Image Processing.
In abstract what I've understood is, I've to write one Deep Learning program in Python by using Tensorflow. By using that Program I've to build a Model. Then I've to train that Model by using some training data. Then, when my Model achieves a certain level of accuracy I'll use some test data.
Now, there some places at where I've bit confused & here are my questions regarding that confusion...
I know tensorflow is a library but at some places, it's also mentioned as a system. So, is it really a library(piece of code) only & something more than that?
I got some Image Processing Python code in Tensorflow tutorial section (https://www.tensorflow.org/tutorials/image_recognition). We've tested that code & it's working exactly the way AWS Recognition service work. So, here my doubt is... can I use this Python code as it is in our production work?
After train a model with some training data does those training data get part of the whole system or Machine Learning Model extract some META info from those training data & keep with itself rather whole raw training data(in my case it'll be raw images).
Can I do all these ML+DL programmings over my Linux System? It has Pentium 4 with 8GB RAM.
Also, want to know... the approach which I've mentioned to build a solution for my problem is sufficient or I need to do something else also.
Need some guidance to clear out all these confusion.
Thanks
1 : tensor-flow is like anything else we have been worked with (like Numpy ) but only difference is we have to first defined what we want to use the use it , every thing in tensor-flow are running into a computational graph and evaluating every thing in that graph require a Session , we could call it library because it just piece of code and have interface in python , and system because of all those mechanism it uses
2 :
can I use this Python code as it is in our production work? Why not !
3:
yes you could do that with your system , but the main advantage of tensor-flow and theano , .. the tool like those is that you could run your code on GPU it a more faster way than on CPU because the GPU could handle a lot more matrix multiplication and stuff like that
4:
you know you don't have to learn all the machine learning stuff to built a image recognition system , it may be take years for you to understand whats going on there , Udemy course is very good source but you I highly recommend you to see the machine learning courses of coursera , there is to courses there about machine learning : the great Andrew NG course and Emily fox course , the first one is more theoretical than practical , but second on is more practical ,
and about the Deep learning , there is nothing fancy about Deep learning and it's just a method in machine learning , after you gain some experience in machine learning and understood some basic or you could do it right know , go to fast.ai , it has a really good course about deep learning for coder and it's also free
I hope this will help you
I'm new to machine learning and trying to figure out where to start and how to apply it to my app.
My app is pulling a bunch of health metrics and based on all of them is suggesting a dose of medication (some abstract medication, doesn't matter) to take. Taking a medication is affecting health metrics and I can see if my suggestion was right of if it needs adjustments to be more precise the next time. Medications are being taken constantly so I have a lot of results and data to work with.
Does that seem like a good case for machine learning and using some of neural networks to train and make better predictions? If so - could you recommend an example for Tensorflow or Keras?
So far I only found image recognition examples and not sure how to apply similar algorithms to my problem.
I'm also a beginner into machine learning, but based on my knowledge, one way would be to use supervised learning with Keras, which uses Tensorflow as a backend. Keras is a lot easier to program than Tensorflow, but eventually Tensorflow might as well do the trick (depending on your familiarity with machine learning libraries).
You mentioned that your algorithm suggests medication based on data (from the patient).
One way to predict medication is to store all your preexisting data in a CSV file, and use the CSV module to read it. This tutorial covers the basics of reading CSV files (https://pythonprogramming.net/reading-csv-files-python-3/).
Next, you can store the data in a multi-dimensional array, and run a neural network through it. Just make sure that you have sufficiently enough data (the more the better) in comparison with the size of your neural network.
Another way, as you mentioned, would be using Convolutional Neural Networks, which theoretically could and should work, but I have very little experience programming them, so I'm afraid I can't give you any advice for that (you can program CNNs in both Keras and Tensorflow).
I do wish you good luck in your project!
I'm doing a project to detect (classify) human activities using a ARM cortex-m0 microcontroller (Freedom - KL25Z) with an accelerometer. I intend to predict the activity of the user using machine learning.
The problem is, the cortex-m0 is not capable of processing training or predicting algorithms, so I would probably have to collect the data, train it in my computer and then embed it somehow, which I don't really know how to do it.
I saw some post in the internet saying that you can generate a matrix of weights and embed it in a microcontroller, so it would be a straightforward function to predict something ,based on the data you providing for this function. Would it be the right way of doing ?
Anyway my question is, how could I embedded a classification algorithm in a microcontroller?
I hope you guys can help me and give some guidance, I'm kind of lost here.
Thank you in advance.
I've been thinking about doing this myself to solve a problem that I've had a hard time developing a heuristic for by hand.
You're going to have to write your own machine-learning methods, because there aren't any machine learning libraries out there suitable for low-end MCUs, as far as I know.
Depending on how hard the problem is, it may still be possible to develop and train a simple machine learning algorithm that performs well on a low-end MCU. After-all, some of the older/simpler machine learning methods were used with satisfactory results on hardware with similar constraints.
Very generally, this is how I'd go about doing this:
Get the (labelled) data to a PC (through UART, SD-card, or whatever means you have available).
Experiment with the data and a machine learning toolkit (scikit-learn, weka, vowpal wabbit, etc). Make sure an off-the-shelf method is able to produce satisfactory results before moving forward.
Experiment with feature engineering and selection. Try to get the smallest feature set possible to save resources.
Write your own machine learning method that will eventually be used on the embedded system. I would probably choose perceptrons or decision trees, because these don't necessarily need a lot of memory. Since you have no FPU, I'd only use integers and fixed-point arithmetic.
Do the normal training procedure. I.e. use cross-validation to find the best tuning parameters, integer bit-widths, radix positions, etc.
Run the final trained predictor on the held-out testing set.
If the performance of your trained predictor was satisfactory on the testing set, move your relevant code (the code that calculates the predictions) and the model you trained (e.g. weights) to the MCU. The model/weights will not change, so they can be stored in flash (e.g. as a const array).
I think you may be limited by your hardware. You may want to get something a little more powerful. For your project you've chosen the M-series processor from ARM. This is the simplest platform that they offer, the architecture doesn't lend itself to the kind of processing you're trying to do. ARM has three basic classifications as follows:
M - microcontroller
R - real-time
A - applications
You want to get something that has strong hardware support for these complex calculations. You're starting point should be an A-series for this. If you need to do floating point arithmetic, you'll definitely need to start with the A-series and probably get one with NEON-FPU.
TI's Discovery series is a nice place to start, or maybe just use the Raspberry Pi (at least for the development part)?
However, if you insist on using the M0 I think you might be able to pull it off using something lightweight like ROS-C. I know there are packages with ROS that can do it, even though its mainly for robotics you may be able to adapt it to what you're doing.
Dependency Free ROS
Neural Networks and Machine Learning with ROS
I'm new to machine learning algorithm. I'm learning basic algorithms like regression, classification, clustering, sequence modelling, on-line algorithms. All the article that are available on internet shows how to use these algorithm with specific data. There is no article regarding deployment of those algorithm in production environment. So my questions are
1) How to deploy machine learning algorithm in production environment?
2) The typical approach follows in machine learning tutorial is to build the model using some training data, use it for testing data. But is it advisable to use that kind of model in production environment? Incoming data may keep changing so the model will be ineffective. What should be duration for the model refresh cycle to accommodate such changes?
I am not sure if this is a good question (since it is too general and not formulated good), but I suggest you to read about bias - variance tradeoff. Long story short, you could have low bias\high variance machine-learning model and get 100% accurate results on your test data (the data you used to implement a model), but you could cause your model to overfit the training data. As result, when you will try to use it on data which you haven't used during training it will lead to poor performance. On the other hand, you may have high bias\low variance model, which will be poorly fit to your training data and will also perform just as bad on new production data. Keeping this in mind general guideline will be:
1) Obtain some good amount of data which you could use to build a prototype of machine-learning system
2) Split your data into train set, cross-validation set and test set
3) Create a model which will have relatively low bias (good accuracy, actually - good F1 score) on your test data. Then try this model on cross-validation set to see the results. If the results are bad - you have a high variance problem, you used a model which overfit the data and can't generalize well. Re-write your model, play with model parameters or use different algorithm. Repeat until you get a good result on CV set
4) Since we played with the model in order to get a good result on CV set, you want to test your final model on test set. If it is good - that's it, you have a final version of model and could use it on prod environment.
Second question has no answer, it is based on your data and your application. But 2 general approaches might be used:
1) Do everything I mentioned earlier to build a model with a good performance on test set. Re-train your model on new data once in some period (try different periods, but you could try to re-train your model once you see that performance of model dropped down).
2) Use online-learning approach. This is not applicable for many algorithms, but for some cases it could be used. Generally, if you see that you could use stochastic gradient descent learning method - you could use online-learning and just keep your model up-to-date with the newest production data.
Keep in mind that even if you use #2 (online-learning approach) you can't be sure that your model will be good forever. Sooner or later the data you get may change significantly and you may want to use whole different model (for example switch to ANN instead of SWM or logistic regression).
DISCLAIMER: I work for this company, Datmo building a better workflow for ML. We’re always looking to help fellow developers working on ML so feel free to reach out to me at anand#datmo.com if you have any questions.
1) In order to deploy, you should first split up your code into preprocessing, training and test. This way you can easily encapsulate the required components for deployment. Usually, you will then want to take your preprocessing, test, as well as your weights file (the output of your training process) and put them in one folder. Next, you will want to host this on a server and wrap an API server around this. I would suggest a Flask Restful API so that you can use query parameters as your inputs and output your response in standard JSON blobs.
To host it on a server, you can use this article which talks about how you can deploy a Flask API on EC2.
You can load and model and serve it as API as given in this code.
2) Hard for me to answer without more details. It's highly dependent on the type of data and the type of model. For example, for deep learning, there is no such thing as online learning.
I am sorry that my comments does not include too much detail* since I am also a newbie in "deployment" of ML. But since the author is also new in ML, I hope these basic guidance could be helpful as well.
For "deployment", you should
Have ML algorithms: You may use free-tools, or develop your own tool using libraries in Python, R, Java, .Net, .. or use a system on cloud..)
Train those ML models using training datasets
Save those trained models (You should search this topic based on your development environment. There are some file formats that Tensorflow/Keras provide, or formats like pickle, ONNX,.. I would like to write a whole list here, with their supporting language & environment, advantage&disadvantage and loadability but I am also trying to investigate this topic, as a newbie)
And THEN, you can deploy these saved-models on production. On production you should either have your own-developed application to run the saved model (For example: an application that you developed with Python that takes trained&saved .pickle file and TestData as input; and simply gives "prediction for the test data" as output) or you should have an environment/framework that runs the saved models (search for ML environments/frameworks on cloud). At first, you should clarify your need: Do you need a stand-alone program on production, or will you serve a internal web-service, or via-cloud, etc.
For the second question; as above answers indicate the issue is "online training ability" of the models. Please additionally note that; for "online learning", your production environment has to feed your production tool/system with the real-correct label of the test data as well. Will you have that capability?
Note: All above are just small "comments" instead of a clear answer, but technically I am not able to write comments yet. Thanks for not de-voting :)
Regarding the first question, my service mlrequest makes deploying models to production simple. You can get started with a free API key that provides 50k model transactions a month.
This code will train and deploy, or update your model across 5 global data centers.
from mlrequest import Classifier
classifier = Classifier('my-api-key')
features = {'feature1': 'val1', 'feature2': 45}
training_data = {'features': features, 'label': 2}
r = classifier.learn(training_data=training_data, model_name='my-model', class_count=2)
This is how you make predictions, latency-routed to the nearest data center to get the quickest response.
features = {'feature1': 'val1', 'feature2': 77}
r = classifier.predict(features=features, model_name='my-model', class_count=2)
r.predict_result
Regarding your second question, it completely depends on the problem you are solving. Some models need to be frequently updated, while others almost never need to be updated.