Using a sequence in CreateML to record device motion - machine-learning

I want to train an MLClassifier to identify a specific device motion.
So I recorded the motion data and labeled every recorded sample accordingly. When that didn't quite work as I had hoped, I realized that I have to record the motion itself over time, not just a momentary snapshot.
So I packed 5 data sets (dictionaries) in a row, and that was my new training feature. Or so I thought: when I tried to train on the new data, I got this error while creating my classifier:
Value encountered in column 's' is of type 'dictionary' cannot be
mapped to a categorical value. Categorical values must be integer,
strings, or None.
Now I'm slowly giving up... Does anyone have a suggestion, or know why I can't use sequences (arrays) as features?
...
Btw, here is some sample data of my JSON:
[{"s":[{"rZ":-1.0,"p":0.2,"aY":-0.0,"rX":1.5,"y":0.1,"r":-1.3,"aZ":0.2,"rY":-2.8,"aX":0.6},{"rZ":-1.9,"p":0.2,"aY":0.0,"rX":2.0,"y":0.2,"r":-1.4,"aZ":0.0,"rY":-3.2,"aX":0.5},{"rZ":-1.8,"p":0.3,"aY":0.0,"rX":2.4,"y":0.2,"r":-1.5,"aZ":0.9,"rY":-4.8,"aX":0.5},{"rZ":-1.6,"p":0.3,"aY":0.0,"rX":2.5,"y":0.3,"r":-1.6,"aZ":0.9,"rY":-3.8,"aX":0.6},{"rZ":-1.8,"p":0.3,"aY":0.1,"rX":2.2,"y":0.3,"r":-1.7,"aZ":0.1,"rY":-3.0,"aX":0.6}],"v":0}]
And the code I use to create my model:
import CreateML

do {
    let a = try MLDataTable(contentsOf: dummyJSONurl)
    let recognizer = try MLClassifier(trainingData: a, targetColumn: "v")
} catch let er {
    print(er) // show the error instead of discarding it
}

You can't use sequences because MLClassifier isn't a classifier that can work on sequences. Perhaps Apple will add this in a future release but for now it appears that you'll have to use a more capable tool.
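One possible workaround, which is not part of the answer above and is only an assumption about what might suit this data, is to flatten each window of samples into plain numeric columns before building the table, so that no column holds a dictionary or an array. The sketch below does that preprocessing in Python on a shortened version of the sample JSON; the helper name flatten_window and the column naming scheme (key, underscore, step index) are hypothetical.

```python
import json

def flatten_window(record, window_key="s", label_key="v"):
    """Turn {"s": [{...}, {...}], "v": 0} into one flat row of scalar columns."""
    row = {}
    for step, sample in enumerate(record[window_key]):
        for name in sorted(sample):
            # e.g. key "aX" of the first sample becomes column "aX_0"
            row[f"{name}_{step}"] = sample[name]
    row[label_key] = record[label_key]
    return row

# Shortened stand-in for the JSON in the question (two samples, two keys each).
raw = '[{"s":[{"aX":0.6,"rZ":-1.0},{"aX":0.5,"rZ":-1.9}],"v":0}]'
rows = [flatten_window(record) for record in json.loads(raw)]
print(rows[0])
```

Each row then contains only numbers, so writing the rows back out as JSON or CSV should give a table with scalar columns. Whether a classifier trained on such flattened windows recognizes the motion well enough is a separate question that this sketch does not answer.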

Related

Find the importance of each column to the model

I have a ML.net project and as of right now everything has gone great. I have a motor that collects a power reading 256 times around each rotation and I push that into a model. Right now it determines the state of the motor nearly perfectly. The motor itself only has room for 38 values on it at a time so I have been spending several rotations to collect the full 256 samples for my training data.
I would like to cut the sample size down to 38 so every rotation I can determine its state. If I just evenly space the samples down to 38 my model degrades by a lot. I know I am not feeding the model the features it thinks are most important but just making a guess and randomly selecting data for the model.
Is there a way I can see the importance of each value in the array during the training process? I was thinking I could use IDataView for this and I found the below statement about it (link).
Standard ML schema: The IDataView system does not define, nor prescribe, standard ML schema representation. For example, it does not dictate representation of nor distinction between different semantic interpretations of columns, such as label, feature, score, weight, etc. However, the column metadata support, together with conventions, may be used to represent such interpretations.
Does this mean I can print out such things as weight for each column and how would I do that?
I have actually only been working with ML.net for a couple weeks now so I apologize if the question is naive, I assure you I have googled this as many ways as I can think to. Any advice would be appreciated. Thanks in advance.
EDIT:
Thank you for the answer; I was going down a completely useless path. I have been trying to get it to work by following the example you linked to. I have 260 columns with numbers and one column with the condition as one of five text strings. This condition is what I am trying to predict.
The first time I tried it, it threw the error "expecting single but got string". No problem: I used .Append(mlContext.Transforms.Conversion.MapValueToKey("Label", "Label")) to convert the labels to key values, and then it threw the error "expected Single, got Key UInt32". Any ideas on how to push that into this function?
At any rate, thank you for the reply. I guess my upvotes don't count yet, sorry. Hopefully I can upvote it later or someone else here can upvote it. Below is the code example.
// Create MLContext
MLContext mlContext = new MLContext();

// Load data
IDataView data = mlContext.Data.LoadFromTextFile<ModelInput>(TRAIN_DATA_FILEPATH, separatorChar: ',', hasHeader: true);

// 1. Get the column names of the input features.
string[] featureColumnNames =
    data.Schema
        .Select(column => column.Name)
        .Where(columnName => columnName != "Label").ToArray();

// 2. Define estimator with data pre-processing steps
IEstimator<ITransformer> dataPrepEstimator =
    mlContext.Transforms.Concatenate("Features", featureColumnNames)
        .Append(mlContext.Transforms.NormalizeMinMax("Features"))
        .Append(mlContext.Transforms.Conversion.MapValueToKey("Label", "Label"));

// 3. Create transformer using the data pre-processing estimator
ITransformer dataPrepTransformer = dataPrepEstimator.Fit(data); // error here

// 4. Pre-process the training data
IDataView preprocessedTrainData = dataPrepTransformer.Transform(data);

// 5. Define Stochastic Dual Coordinate Ascent machine learning estimator
var sdcaEstimator = mlContext.Regression.Trainers.Sdca();

// 6. Train machine learning model
var sdcaModel = sdcaEstimator.Fit(preprocessedTrainData);

ImmutableArray<RegressionMetricsStatistics> permutationFeatureImportance =
    mlContext
        .Regression
        .PermutationFeatureImportance(sdcaModel, preprocessedTrainData, permutationCount: 3);

// Order features by importance
var featureImportanceMetrics =
    permutationFeatureImportance
        .Select((metric, index) => new { index, metric.RSquared })
        .OrderByDescending(myFeatures => Math.Abs(myFeatures.RSquared.Mean));

Console.WriteLine("Feature\tPFI");
foreach (var feature in featureImportanceMetrics)
{
    Console.WriteLine($"{featureColumnNames[feature.index],-20}|\t{feature.RSquared.Mean:F6}");
}
I believe what you are looking for is called Permutation Feature Importance (PFI). It tells you which features matter most to the model by shuffling the values of one feature at a time and measuring how much that change degrades the model's performance metrics. The larger the drop, the more the model relies on that feature.
The doc "Interpret model predictions using Permutation Feature Importance" describes how to use this API in ML.NET.
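To make the idea concrete, here is a minimal sketch of permutation importance in plain Python. It is not the ML.NET API; the function names, the toy model, and the use of accuracy as the metric are all assumptions made for illustration.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=3, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in isolation."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy model that looks only at feature 0 and ignores feature 1.
predict = lambda row: int(row[0] > 0.5)
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [predict(row) for row in X]

scores = permutation_importance(predict, X, y)
print(scores)
```

In this toy setup, shuffling feature 1 leaves every prediction unchanged, so its importance is zero, while shuffling feature 0 noticeably lowers accuracy. With 256 power readings, the same ranking could in principle suggest which 38 sample positions to keep, though correlated neighboring readings can share or mask importance.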
You can also use a set of open-source packages that are much more sophisticated than what is found in ML.NET. I have an example on my GitHub of how to use R with advanced explainer packages to explain ML.NET models. You can get local-instance explanations as well as global model breakdowns, details, diagnostics, feature interactions, etc.
https://github.com/bartczernicki/BaseballHOFPredictionWithMlrAndDALEX

How to train one model for several devices

I have some tabular device data comprising a time column, some tabular features, and target classes.
Each device has around 500 rows (not identical across devices), and the target classes are the same for all devices.
I have the same kind of data for around 1000 devices.
I want to train one general model for all the devices to detect the class.
Can someone help me with an approach to training for the target variable? What kinds of models work in this situation?
If your device type is part of the data, you can train a decision tree. If the device type feature is important for classification, it will be added to the tree. First, create the device type features yourself: a binary column for each device type, as in one-hot encoding. There will be one binary column per device type (is_device_samsung, is_device_lg, is_device_iphone and so forth), so the number of columns created equals the number of device types. In each row, all but one of these columns will be 0, and the one indicating the current type will be 1. This does not guarantee that the device type will be part of the model, but it lets the learner decide that for you.
BTW - don't use get_dummies unless you know how to reuse it exactly as needed on the test data. It derives the columns from whatever values are present in the data it is given, so the test set can end up with different columns than the training set.
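A minimal sketch of that caveat in plain Python: the list of categories is learned once from the training data and then reused unchanged for the test data, so both sets get identical columns. The function names and the example device values are hypothetical.

```python
def fit_categories(values):
    """Learn the fixed, ordered list of categories from the training data only."""
    return sorted(set(values))

def one_hot(values, categories, prefix="is_device_"):
    """Encode values against a fixed category list; unseen values become all zeros."""
    return [{f"{prefix}{c}": int(v == c) for c in categories} for v in values]

train_devices = ["samsung", "lg", "iphone", "lg"]
test_devices = ["lg", "nokia"]  # "nokia" never appears in the training data

categories = fit_categories(train_devices)
train_rows = one_hot(train_devices, categories)
test_rows = one_hot(test_devices, categories)

print(test_rows[0])  # same columns, in the same order, as the training rows
print(test_rows[1])  # unseen device type: every indicator column is 0
```

Encoding an unseen device as all zeros is only one possible design choice; raising an error or adding an explicit "other" column would be equally defensible.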
Another option is to use the python-weka wrapper, which accepts nominal attributes:
Example:
import weka.core.jvm as jvm
from weka.core.converters import Loader
from weka.classifiers import Classifier

def get_weka_prob(inst):
    # Probability that the classifier c assigns to the class value 'DONE'
    dist = c.distribution_for_instance(inst)
    p = dist[next((i for i, x in enumerate(inst.class_attribute.values) if x == 'DONE'), -1)]
    return p

jvm.start()
loader = Loader(classname="weka.core.converters.CSVLoader")
data = loader.load_file(r'.\recs_csv\df.csv')
data.class_is_last()
datatst = loader.load_file(r'.\recs_csv\dftst.csv')
datatst.class_is_last()
# Pass the class name by keyword: the first positional parameter is a Java object
c = Classifier(classname="weka.classifiers.trees.J48", options=["-C", "0.1"])
c.build_classifier(data)
print(c)
probstst = [get_weka_prob(inst) for inst in datatst]
jvm.stop()
Weka models are Java models that are reached from Python through a Java bridge; the methods are Java methods that can be called using this bridge. To use the dataframe in sklearn instead, you would have to apply one-hot encoding to it. Note that the nominal attributes in Weka cannot contain any special characters, so use
df = df.replace([',', '"', "'", "%", ";"], '', regex=True)
for any nominal attribute before saving it to csv.
If you want to ensure that the model_type feature will be included in your model, you can trick it by adding a dummy model type and ensuring that the class column for this dummy model is always "1" or "True", depending on your class variable. If you have enough rows with this dummy model, J48 will open it as the first branch. Once the attribute is selected by J48, it will be branched for all of the model types, not just the dummy one.

Adding static data (not changing over time) to sequence data in LSTM

I am trying to build a model like the one in the following figure:
[figure: sequence data fed into an LSTM layer, static data fed into a feed-forward layer, then both merged]
I want to pass the sequence data to an LSTM layer and the static data (blood group, gender) to a separate feed-forward neural network layer. Later I want to merge them. However, I am confused about the dimensions here.
If my understanding is right (which I depict in the image), how can the 5-dimensional sequence data be merged with the 4-dimensional static data?
Also, how does an attention mechanism differ from this structure? (I found in the Keras documentation that an attention mechanism is a way to add static data to sequence data.)
Basically, I want to add the static data to the sequence data. Any other suggestion is appreciated.
I am not sure if I got what you are asking, but I will try.
Example in Keras:
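from keras.layers import Input, LSTM, Flatten, Dense, concatenate
from keras.models import Model
# timesteps, n_features, n_static, n_cell_lstm and classes are placeholders to set for your data
dynamic_input = Input(shape=(timesteps, n_features))
static_input = Input(shape=(n_static,))
static_out = static_input
x = LSTM(n_cell_lstm, return_sequences=True)(dynamic_input)
x = Flatten()(x)
dynamic_out = x
z = concatenate([dynamic_out, static_out])
z = Dense(64, activation='relu')(z)
main_output = Dense(classes, activation='softmax', name='main_output')(z)
model = Model(inputs=[dynamic_input, static_input], outputs=main_output)
In practice you use the LSTM architecture just as you would if you were using only the dynamic data, but at the end you add the information coming from the static data. Flattening the LSTM output gives a single vector, so the two branches do not need matching dimensions: concatenation simply appends the static values to that vector. Hope this helps.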

Objective-C: Cross correlation of two audio files

I want to perform a cross-correlation of two audio files (which are actually NSData objects). I found the vDSP_convD function in the Accelerate framework. NSData has a bytes property that returns a void pointer to the raw data, which I can pass as the filter and signal vector parameters.
I struggled with the other parameters. What is the length of these vectors, and what is the length of the result vector?
My guess is that it's the sum of the lengths of the filter and signal vectors.
Could anyone give me an example of using the vDSP_convD function?
Apple reference to the function is here
Thanks
After reading the book Learning Core Audio, I made a demo that measures the delay between two audio files. I used the new iOS 8 API to read samples from the audio files and to get good performance.
Github Project.
A call would look like this:
vDSP_conv(signal*, signalStride, filter*, filterStride, result*, resultStride, resultLength, filterLength);
where we have:
signal*: a pointer to the first element of your signal array
signalStride: the step size through your signal array; 1 is every element, 2 is every second element, and so on
filter*, filterStride, result*, resultStride: the same for the filter and result arrays
resultLength, filterLength: the number of elements in the result and filter arrays
How long do the arrays have to be?:
As stated in the docs you linked, the signal array has to hold lenResult + lenFilter - 1 elements, which is where it gets a little messy. You can find a demonstration of this by Apple here or a shorter answer by SO user Rasman here.
You have to zero-pad the signal array yourself so that the vector functions can apply the sliding window without further preparation.
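The length relationship and the zero padding can be sketched in plain Python. This is not the vDSP API, only an illustration of the same sliding-window sum under the assumption that each result element is the dot product of the filter with a window of the signal; the function names and the sample values are made up.

```python
def correlate(signal, filt, result_len):
    """result[n] = sum over k of signal[n + k] * filt[k], for n in 0..result_len-1."""
    p = len(filt)
    # The signal must hold result_len + len(filt) - 1 samples.
    assert len(signal) == result_len + p - 1
    return [sum(signal[n + k] * filt[k] for k in range(p)) for n in range(result_len)]

def pad_signal(samples, filt_len):
    """Zero-pad both ends so the window can slide across every possible lag."""
    pad = [0.0] * (filt_len - 1)
    return pad + list(samples) + pad

reference = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0]  # the reference shifted right by 2 samples

padded = pad_signal(delayed, len(reference))
result_len = len(padded) - len(reference) + 1
scores = correlate(padded, reference, result_len)

# Index len(reference) - 1 corresponds to zero lag because of the left padding.
lag = scores.index(max(scores)) - (len(reference) - 1)
print(lag)
```

The position of the largest score gives the delay between the two signals, here 2 samples. For real audio the arrays hold many thousands of samples, which is where a vectorized routine pays off.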
Note: You might consider using the fast Fourier transform (FFT) for this. When you work with audio files, I assume that you have quite a lot of data, and beyond a certain size there is a significant performance gain from using:
FFT -> multiplication of one spectrum by the complex conjugate of the other in the frequency domain (which corresponds to a correlation in the time domain) -> inverse FFT
Here you can find a useful piece of code for this!

Using test data set in RapidMiner

I'm trying to create a model with a training data set and want to label the records in a test data set.
All the tutorials and help I find online only cover cross-validation with a single data set, i.e., the training data set. I couldn't find out how to use test data. I tried to apply the resulting model to the test set, but after pre-processing the test set seems to have a different number of attributes than the training set. This is a text classification problem.
At the end I get some output like this
18.03.2013 01:47:00 Results of ResultWriter 'Write as Text (2)' [1]:
18.03.2013 01:47:00 SimpleExampleSet:
5275 examples,
366 regular attributes,
special attributes = {
confidence_1 = #367: confidence(1) (real/single_value)
confidence_5 = #368: confidence(5) (real/single_value)
confidence_2 = #369: confidence(2) (real/single_value)
confidence_4 = #370: confidence(4) (real/single_value)
prediction = #366: prediction(label) (nominal/single_value)/values=[1, 5, 2, 4]
}
But what I wanted is for all my examples to be labelled.
It seems that my test data and training data have a different number of attributes; I see many warnings like the following in the logs.
Mar 18, 2013 1:46:41 AM WARNING: Kernel Model: The given example set does not contain a regular attribute with name 'wireless'. This might cause problems for some models depending on this particular attribute.
But how do we solve such a problem in text classification, where we cannot know the number and names of the attributes beforehand?
Can someone please give me some pointers?
You probably use a Process Documents operator to preprocess both the training set and the test set. Here it is important that both of these operators are set up identically. To "synchronize" the wordlist, i.e. consider the same set of words in both of them, you have to connect the wordlist (wor) output of the Process Documents operator used for training to the corresponding input port of the Process Documents operator used for preprocessing the test set.
