Need help in reading a .csv file - machine-learning

I am new to machine learning and was trying out an ML model to identify facial expressions. I got the dataset from research.google, but I am not able to read the .csv file using pd.read_csv(). Is there a way to read a .jpg file from a .csv file?

Reading a .jpg from a .csv is an unusual request. That being said, you can try optical character recognition tools like Google's Tesseract framework. It has Python support and is pretty straightforward, and it works well with printed text.
Learn more about Tesseract; there is also a tutorial with Python.
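For reference, a minimal sketch of the Tesseract route in Python, assuming the pytesseract wrapper is installed and using a made-up image file name:

    from PIL import Image
    import pytesseract  # pip install pytesseract; also needs the Tesseract binary installed

    # Extract printed text from an image file (the file name is a placeholder).
    text = pytesseract.image_to_string(Image.open('page.jpg'))
    print(text)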

Related

On replacing the LJ-Speech dataset with your own

In most GitHub repositories for machine-learning-based text-to-speech, the LJ-Speech dataset is what is used and optimized for.
Having unsuccessfully tried to use my own wave files with them, I am interested in the right way to prepare your own dataset for a framework that is optimized around LJ-Speech.
With Mozilla TTS, you can have a look at the LJ-Speech script used to prepare the data to get an idea of what is needed for your own dataset:
https://github.com/erogol/TTS_recipes/blob/master/LJSpeech/DoubleDecoderConsistency/train_model.sh
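If it helps, the LJ-Speech layout itself is simple: a wavs/ folder of recordings plus a pipe-separated metadata.csv. A rough Python sketch of producing that metadata for your own recordings (the dataset name, ids, and transcriptions here are made up):

    import os

    # Hypothetical clips: (file id without extension, transcription).
    # The audio itself goes in MyDataset/wavs/<id>.wav, mirroring LJ-Speech.
    clips = [
        ('myclip-0001', 'Hello there.'),
        ('myclip-0002', 'This is the second utterance.'),
    ]

    os.makedirs('MyDataset/wavs', exist_ok=True)
    with open('MyDataset/metadata.csv', 'w', encoding='utf-8') as f:
        for clip_id, text in clips:
            # LJ-Speech metadata format: id|raw transcription|normalized transcription
            f.write(f'{clip_id}|{text}|{text}\n')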

Convert a .pb file (TensorFlow model file) to a human-readable format

I am new to TensorFlow. I have downloaded and run the image classifier provided on the TensorFlow website, and I can see the link that downloads the model from the web.
I need to read the .pb file in a human-readable format.
Is this possible? If yes, how?
Thanks!
If you mean the model architecture, then I recommend looking at the graph in TensorBoard, the graph visualisation tool provided with TensorFlow. I'm pretty sure that the demo code/tutorial already implements all the code required to import the graph into TensorBoard, so it should just be a case of running TensorBoard and pointing it to the log directory (this should be defined in the code near the top).
Then run tensorboard --logdir=/path/to/logs/ and click the Graphs tab. You will then see the various graphs for the different runs.
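In case the demo you are running does not already write a log directory, this is roughly what the export looks like with the TF1-style API (the log path is an assumption):

    import tensorflow as tf  # TF1-style API

    with tf.Session() as sess:
        # ... build or import the graph here ...
        writer = tf.summary.FileWriter('/tmp/inception_logs', sess.graph)
        writer.close()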
Alternatively, there are a couple of papers on Inception that describe the model and the theory behind it. One is available here.
Hope that helps you understand Inception a bit more.
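If you want a literal, human-readable dump of the .pb itself rather than a visualisation, the protobuf can also be parsed directly. A minimal TF1-style sketch (the file name is an assumption):

    import tensorflow as tf

    # Parse the serialized GraphDef and print every node's name and op type.
    graph_def = tf.GraphDef()
    with open('classify_image_graph_def.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    for node in graph_def.node:
        print(node.name, node.op)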

How to use a ckpt data model in the TensorFlow iOS example?

I am quite new to machine learning, and I am working on an iOS app for object detection using TensorFlow. I have been using the sample data model provided by the TensorFlow example in the form of a .pb file (graph.pb), which works just fine for object detection.
But my backend team has given me model2_BN.ckpt as the data model file. I have tried to research how to use this file, and I have no clue.
Is it possible to use the .ckpt file on the client side as the data model? If yes, how can I use it in the iOS TensorFlow example as the data model?
Please help.
Thanks
This one is from my backend developer:
The .ckpt is the model given by TensorFlow, which includes all the weights/parameters in the model. The .pb file stores the computational graph. To make TensorFlow work we need both the graph and the parameters. There are two ways to get the graph:
(1) Use the Python program that builds it in the first place (tensorflowNetworkFunctions.py).
(2) Use a .pb file (which would have to be generated by tensorflowNetworkFunctions.py).
The .ckpt file is where all the intelligence is.
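Building on that: the usual way to get an iOS-compatible .pb out of a .ckpt is to rebuild the graph, restore the checkpoint, and freeze the variables into constants. A rough TF1-style sketch (the output node name is an assumption; use your model's actual output):

    import tensorflow as tf

    # Rebuild the graph first (e.g. via tensorflowNetworkFunctions.py), then:
    with tf.Session() as sess:
        saver = tf.train.Saver()
        saver.restore(sess, 'model2_BN.ckpt')
        # Bake the checkpoint weights into the graph as constants.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), ['output_node'])  # output name is an assumption
        with tf.gfile.GFile('frozen_model.pb', 'wb') as f:
            f.write(frozen.SerializeToString())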

How to process XML files using RapidMiner for classification

I am new to RapidMiner. I have many XML files that I want to classify manually based on keywords. Then I would like to train a classifier like Naive Bayes or SVM on these data and calculate their performance using cross-validation.
Could you please let me know the different steps for this?
Do I need to use text-processing operators like tokenising, TF-IDF, etc.?
The steps would go something like this:
- Loop over files: iterate over all the files in a folder and read each one in turn.
- For each file:
  - read it in as a document;
  - tokenize it using operators like Extract Information or Cut Document containing suitable XPath queries, to output a row corresponding to the information extracted from the document.
- Create a document vector from all the rows. This is where TF-IDF or other approaches would be used; the choice depends on the problem at hand, with TF-IDF being the usual choice when it is important to give more weight to tokens that appear often in a relatively small number of the documents.
- Build the model and use cross validation to get an estimate of the performance on unseen data.
I have included a link to a process that you could use as the basis for this. It reads the RapidMiner repository, which contains XML files, so it is a good example of processing XML documents using text-processing techniques. Obviously, you would have to make some large modifications for your case.
Hope it helps.
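For comparison, here is the same tokenise / TF-IDF / Naive Bayes / cross-validation pipeline sketched in Python with scikit-learn (the folder name and keyword rule are made-up assumptions):

    import glob
    from xml.etree import ElementTree
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score

    def keyword_label(text):
        # Hypothetical manual rule; replace with your own keyword-based classes.
        return 'finance' if 'stock' in text.lower() else 'other'

    paths = sorted(glob.glob('docs/*.xml'))  # folder name is an assumption
    texts = [' '.join(ElementTree.parse(p).getroot().itertext()) for p in paths]
    labels = [keyword_label(t) for t in texts]

    # Tokenising and TF-IDF weighting happen inside the vectorizer.
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    print(cross_val_score(model, texts, labels, cv=10).mean())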
It is probably too late to reply, but it could help other people. There is an extension called the 'text mining extension'; I am using version 6.1.0. You can go to RapidMiner > Help > Update and install this extension. It will get all the files from one directory, and it has various text-mining algorithms that you can use.
I also found this tutorial video, which could be of some help to you as well:
https://www.youtube.com/watch?v=oXrUz5CWM4E

LibSVM file format for Shogun

I'm new to Shogun and I've been told that it's efficient with large datasets. I keep reading that Shogun supports the LibSVM data format, so I thought it would be easy to switch.
I noticed that Shogun needs the training data and labels to be set separately, whereas in LibSVM's file format they are both contained in one data file. How can I load the exact same data file that I created for LibSVM in Shogun (i.e. without separating data and labels)?
Check out the latest develop branch of the Shogun toolbox from GitHub; it now has native support for reading the LibSVM file format.
For more details, check the examples.
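If you would rather not depend on the develop branch, one workaround is to split the file yourself and hand Shogun the features and labels separately, e.g. with scikit-learn's LibSVM-format loader (the file name is an assumption):

    import numpy as np
    from sklearn.datasets import load_svmlight_file

    # One LibSVM-format file holds both the sparse features and the labels.
    X, y = load_svmlight_file('train.libsvm')  # file name is an assumption

    # Shogun wants features and labels as separate objects; the exact class
    # names vary by version, roughly along the lines of:
    #   features = RealFeatures(np.asarray(X.todense()).T)  # examples as columns
    #   labels = BinaryLabels(y)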
