ERP study using EEG signals in EEGLAB

I have done an experiment in which a user clicks when a familiar image pops up on the screen. The experiment is run for 10 trials, and each trial is stored in a separate file. I need clarification on the following questions.
Because I already have each trial in a different file, I don't have to extract epochs. Is my understanding correct?
If I don't have to extract epochs, how do I plot ERPs? In the tutorials I referred to, epoch creation is one of the steps involved in ERP creation.
When I try to create an ERP study, I am unable to find the option to include multiple files (files with epochs) as input. In this case, how do I import multiple epoch files?
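Conceptually, the ERP is just the average over the per-trial epochs, so once each file holds one stimulus-locked trial, the trials can be averaged directly. A minimal NumPy sketch with made-up stand-ins for the 10 trial files (the channel and sample counts are hypothetical):

```python
import numpy as np

# Stand-ins for the 10 trial files: each trial is a (n_channels, n_samples)
# array, as if loaded from its own epoch file.
rng = np.random.default_rng(0)
trials = [rng.standard_normal((32, 500)) for _ in range(10)]

stacked = np.stack(trials)   # shape (10, 32, 500): trials x channels x samples
erp = stacked.mean(axis=0)   # averaging over trials gives the ERP, (32, 500)
print(erp.shape)
```

In EEGLAB itself, the analogous route is to merge the per-trial datasets into one epoched dataset and then use the standard ERP plotting tools on it.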

Download from Google Colab in a scheduled way

I'm training a very resource-intensive CycleGAN.
Since training runs overnight, Colab sometimes wipes the virtual machine, and I lose all the checkpoints from my training runs.
I would like to add a step so that, for example, every 100 epochs the checkpoints are downloaded to my hard disk, so that I can reload them and resume training.
Is it possible to download files programmatically on Colab?
Variants of this question have been asked several times, and the consensus is to transfer results to one's Google Drive. I do something like what is described here, and it works well for me:
How to download file created in Colaboratory workspace?
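A common pattern is to mount Google Drive and copy the latest checkpoint there every N epochs. The sketch below uses a temporary directory as a stand-in for the Drive folder so it is self-contained; on Colab you would mount Drive first and point `BACKUP_DIR` at a folder inside it (the path in the comment is hypothetical):

```python
import os
import shutil
import tempfile

# On Colab you would instead run:
#   from google.colab import drive; drive.mount('/content/drive')
# and set BACKUP_DIR = '/content/drive/MyDrive/cyclegan_ckpts'  (hypothetical path)
BACKUP_DIR = tempfile.mkdtemp()

def save_checkpoint(path, epoch):
    # Stand-in for the real checkpoint save (e.g. torch.save / model.save).
    with open(path, "w") as f:
        f.write(f"weights at epoch {epoch}")

for epoch in range(1, 301):
    ckpt = "latest.ckpt"
    save_checkpoint(ckpt, epoch)
    if epoch % 100 == 0:  # every 100 epochs, copy the checkpoint to Drive
        shutil.copy(ckpt, os.path.join(BACKUP_DIR, f"epoch_{epoch}.ckpt"))

print(sorted(os.listdir(BACKUP_DIR)))
```

Because the copy goes to Drive rather than to the ephemeral VM disk, the checkpoints survive a VM wipe and can be reloaded to resume training.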

Is LIBSVM suitable for many categories and samples?

I'm building a text classifier, which should be able to give the probabilities that a document belongs to certain categories (e.g. 80% fiction, 30% marketing, etc.).
I believe LIBSVM does this via the "predict" method, but the problem is that I have approximately 20 categories to test for. I also have several hundred documents that can be used for training.
The problem is that the training file grows to 1–2 GB, which makes LIBSVM extremely slow.
How can this issue be solved? And should I go for Liblinear instead, or are there better options?
Regarding this specific question, I had to use LIBLINEAR, as LIBSVM kept running forever.
But if anyone wants to know how it eventually turned out: I switched from PHP/C++ to Python, which was tremendously easier, and I did not encounter any memory issues.
My case was multi-label classification. This article put me in the right direction, and the magpie project helped me accomplish the task.
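For anyone facing the same problem today, a linear model in scikit-learn gives per-category probabilities directly and scales far better than kernel SVMs on text. A minimal multi-label sketch (the corpus and labels are made up):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Tiny hypothetical corpus; the real training data would be the several
# hundred documents with their ~20 category labels.
docs = [
    "dragons and wizards in a far away land",
    "buy now, limited offer on our new product",
    "a wizard offers a limited deal on spells",
]
labels = [["fiction"], ["marketing"], ["fiction", "marketing"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)            # binary indicator matrix, one column per category
X = TfidfVectorizer().fit_transform(docs)

# One linear model per category -> independent per-category probabilities,
# which is exactly what "80% fiction, 30% marketing" requires.
clf = OneVsRestClassifier(LogisticRegression()).fit(X, Y)
proba = clf.predict_proba(X)             # shape (n_docs, n_categories)
print(proba.shape, list(mlb.classes_))
```

With linear models and sparse TF-IDF features, hundreds of documents and 20 categories train in well under a second, so the 1–2 GB training-file bottleneck disappears.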

How can I handle a regression question about time series by tsfresh package (python)?

Recently I have needed to forecast sales volume, and I found the great Python package tsfresh:
http://tsfresh.readthedocs.io/en/latest/text/quick_start.html
The quick-start example is a classification solution; I want to know how to use tsfresh for a regression problem. I want to extract features from my time-series data, but when I use the extract_features method I get a single set of features for the whole series, and I can't tell which time points the features belong to. This has been bothering me for a long time.
My training data

How do you add new categories and training to a pretrained Inception v3 model in TensorFlow?

I'm trying to take a pre-trained model such as Inception v3 (trained on the 2012 ImageNet data set) and extend it with several additional categories.
I have TensorFlow built from source with CUDA on Ubuntu 14.04, and the examples like transfer learning on flowers are working great. However, the flowers example strips away the final layer and removes all 1,000 existing categories, which means it can now identify 5 species of flowers, but can no longer identify pandas, for example. https://www.tensorflow.org/versions/r0.8/how_tos/image_retraining/index.html
How can I add the 5 flower categories to the existing 1,000 categories from ImageNet (and add training for those 5 new flower categories) so that I have 1,005 categories that a test image can be classified as? In other words, be able to identify both those pandas and sunflowers?
I understand one option would be to download the entire ImageNet training set and the flowers example set and to train from scratch, but given my current computing power, it would take a very long time, and wouldn't allow me to add, say, 100 more categories down the line.
One idea I had was to set the parameter fine_tune to false when retraining with the 5 flower categories so that the final layer is not stripped: https://github.com/tensorflow/models/blob/master/inception/README.md#how-to-retrain-a-trained-model-on-the-flowers-data , but I'm not sure how to proceed, and not sure if that would even result in a valid model with 1,005 categories. Thanks for your thoughts.
After much learning and working in deep learning professionally for a few years now, here is a more complete answer:
The best way to add categories to an existing model (e.g. Inception trained on the ImageNet LSVRC 1000-class dataset) is to perform transfer learning on a pre-trained model.
If you are just trying to adapt the model to your own data set (e.g. 100 different kinds of automobiles), simply perform retraining/fine tuning by following the myriad online tutorials for transfer learning, including the official one for Tensorflow.
While the resulting model can perform well, keep in mind that the tutorial classifier code is highly unoptimized (perhaps intentionally); you can speed it up several-fold by preparing it for production or simply improving the code.
However, if you're trying to build a general-purpose classifier that covers the default LSVRC data set (1,000 categories of everyday images) plus your own additional categories, you'll need access to the existing LSVRC images and must append your own data set to them. You can download the ImageNet dataset online, but access is getting spottier as time rolls on. In many cases the images are also highly outdated (check out the images for computers or phones for a trip down memory lane).
Once you have the LSVRC dataset, perform transfer learning as above, but include the 1,000 default categories along with your own images. For your own images, a minimum of 100 appropriate images per category is generally recommended (the more the better), and you can get better results if you enable distortions. Note that distortions dramatically increase retraining time, especially without a GPU, since the bottleneck files cannot be reused for each distortion; personally I think this is a shortcoming, as there is no reason the distorted bottlenecks couldn't also be cached, but that's a different discussion and can be added to the code manually.
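Concretely, the combined training directory handed to the retraining script would hold the 1,000 LSVRC classes and the new flower classes side by side, one folder per class; a sketch of the layout (the class names shown are illustrative):

```
combined_images/          # root directory passed to the retraining script
    panda/                # ... one folder per LSVRC class (1,000 in total)
    sports_car/
    ...
    daisy/                # the 5 flower classes appended alongside them
    dandelion/
    roses/
    sunflowers/
    tulips/
```

Retraining the final layer over this combined directory then yields a single 1,005-way classifier.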
Using these methods and incorporating error analysis, we've trained general purpose classifiers on 4000+ categories to state-of-the-art accuracy and deployed them on tens of millions of images. We've since moved on to proprietary model design to overcome existing model limitations, but transfer learning is a highly legitimate way to get good results and has even made its way to natural language processing via BERT and other designs.
Hopefully, this helps.
Unfortunately, you cannot add categories to an existing graph; you'll basically have to save a checkpoint and train that graph from that checkpoint onward.

Recommended download using google prediction

I run a download portal, and after a user downloads a file I would like to recommend other related categories. I'm thinking of using Google Prediction to do this, but I'm not sure how to structure the training data. I'm thinking of something like this:
category of the file downloaded (label), geo, gender, age
However, that seems incomplete, because the data doesn't contain any information about the file that was downloaded. I would appreciate some advice; I'm new to ML.
Here is a suggestion that might work...
For your training data, assuming you have the logs of downloads per user, create the following dataset:
download2 (serves as label), download1, features of users, features of download1
Then train a classifier to predict the class given a download and a user; the output classes and their corresponding scores represent the downloads to recommend.
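For example, given a per-user download log ordered in time, each consecutive pair of downloads yields one training row: the next download's category is the label, and the current download (plus any user features) forms the predictors. A pandas sketch with a made-up log:

```python
import pandas as pd

# Hypothetical download log: one row per (user, download), ordered in time.
log = pd.DataFrame({
    "user": ["u1", "u1", "u2", "u2", "u2"],
    "category": ["games", "music", "games", "video", "music"],
})

# Pair each download with the same user's NEXT download: the next category
# becomes the label; the current category (plus user features such as geo,
# gender, age) becomes the predictors.
log["next_category"] = log.groupby("user")["category"].shift(-1)
train = log.dropna(subset=["next_category"])  # last download per user has no label
print(train[["category", "next_category"]].to_dict("records"))
```

A classifier trained on rows like these predicts, for a given user and download, a score per category, and the highest-scoring categories are the recommendations.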
