Neural Network Structure [closed] - machine-learning

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am currently building a neural network library. I have constructed it as an object graph for simplicity. I am wondering if anyone can quantify the performance benefits of moving to an array-based approach. What I have now works very well for building networks of nearly arbitrary complexity. Regular (backpropagation-trained) networks as well as recurrent networks are supported. I am considering having trained networks "compile" into some "simpler" form such as arrays.
I just wanted to see if anyone out there had any practical advice or experience building neural networks that deployed well into production. Is there any benefit to having the final product be array-based instead of object-graph-based?
P.S. Memory footprint is less important than speed.

People have started using GPGPU techniques in AI, and having your neural net in matrix form could leverage the much faster matrix ops in your typical graphics card.
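As a rough illustration of what "compiling" an object graph into arrays could look like, here is a minimal sketch in plain Python. The `Neuron`, `InputNode`, `compile_layers` and `run_compiled` names are hypothetical (they are not from the asker's library), and the sketch assumes a fully connected feed-forward network in which every neuron of a layer reads the same upstream nodes in the same order. Recurrent networks are not covered.

```python
import math


class InputNode:
    """Leaf of the hypothetical object graph: just holds an input value."""

    def __init__(self):
        self.value = 0.0

    def output(self):
        return self.value


class Neuron:
    """Node in the hypothetical object graph: holds its incoming edges."""

    def __init__(self, inputs, weights, bias):
        self.inputs = inputs      # upstream node objects
        self.weights = weights    # one weight per upstream node
        self.bias = bias

    def output(self):
        total = self.bias
        for node, w in zip(self.inputs, self.weights):
            total += w * node.output()
        return math.tanh(total)


def compile_layers(layers):
    """Flatten layers of Neuron objects into (weight_matrix, bias_vector) pairs."""
    return [([list(n.weights) for n in layer], [n.bias for n in layer])
            for layer in layers]


def run_compiled(compiled, x):
    """Forward pass over the flattened form: one matrix-vector product per layer."""
    for weights, biases in compiled:
        x = [math.tanh(b + sum(w * xi for w, xi in zip(row, x)))
             for row, b in zip(weights, biases)]
    return x


# Build a tiny 2-3-1 feed-forward network as an object graph.
inputs = [InputNode(), InputNode()]
hidden = [Neuron(inputs, [0.5, -0.3], 0.1),
          Neuron(inputs, [-0.8, 0.2], 0.0),
          Neuron(inputs, [0.1, 0.9], -0.2)]
out = [Neuron(hidden, [0.7, -0.4, 0.3], 0.05)]

compiled = compile_layers([hidden, out])

# Evaluate the same sample both ways.
sample = [1.0, -2.0]
for node, v in zip(inputs, sample):
    node.value = v
graph_result = out[0].output()
array_result = run_compiled(compiled, sample)[0]
```

Both paths compute the same value; the flattened form is the one that maps directly onto matrix routines (or a GPU), while the object graph stays convenient for building and editing the network.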

This all depends on what language you are using - I assume you are using a C derivative.
In my implementations I've found the object graph approach far superior. There is some tradeoff in speed, but the ease of maintenance outweighs the object lookup calls. This all depends on whether you're looking for training speed or solving speed as well... I'm assuming you are most worried about training speed?
You can always end up micro-optimizing some of the object call issues if need be.
Considering your secondary motive of sub-netting the networks, I think it's even more important to be object based - it makes it much easier to take out portions of the work.

However you implement it, you must never forget:
http://xkcd.com/534/

It's been a while, but I recall that speed is usually only an issue during training of the Neural Network.

I don't have any personal experience writing such a library, but I can link you to some popular open-source projects which you could perhaps learn from. (Personally I would just use one of these existing libraries.)
Fast Artificial Neural Network Library
NeuronDotNet

Related

Which supervised machine learning classification method suits for randomly spread classes? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
If the classes are randomly spread or the data is noisy, which type of supervised ML classification model will give better results, and why?
It is difficult to say which classifier will perform best on general problems. It often requires testing of a variety of algorithms on a given problem in order to determine which classifier performs best.
Best performance is also dependent on the nature of the problem. There is a great answer in this stackoverflow question which looks at various scoring metrics. For each problem, one needs to understand and consider which scoring metric will be best.
All of that said, neural networks, Random Forest classifiers, Support Vector Machines, and a variety of others are all candidates for creating useful models given that classes are, as you indicated, equally distributed. When classes are imbalanced, the rules shift slightly, as most ML algorithms assume balance.
My suggestion would be to try a few different algorithms, tune their hyperparameters, and compare them for your specific application. You will often find that one algorithm is better, but not remarkably so. In my experience, how your data are preprocessed and how your features are prepared is often of far greater importance. Once again, this is a highly generic answer, as it depends greatly on your given application.
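A minimal sketch of the "try a few algorithms and compare" advice, using k-fold cross-validation in plain Python. The two classifiers (nearest centroid and 1-nearest neighbour) and the synthetic noisy data are stand-ins chosen only to keep the example self-contained. In practice you would plug in the models and data you actually care about.

```python
import random


def nearest_centroid(train, point):
    """Predict the label whose class mean is closest to `point`."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1

    def dist(label):
        return sum((s / counts[label] - p) ** 2
                   for s, p in zip(sums[label], point))

    return min(sums, key=dist)


def one_nearest_neighbour(train, point):
    """Predict the label of the single closest training example."""
    def dist(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], point))

    return min(train, key=dist)[1]


def cross_val_accuracy(classifier, data, k=5):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        correct = sum(classifier(train, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k


# Synthetic two-class data with overlapping (noisy) clusters.
rng = random.Random(0)
data = [([rng.gauss(0, 1.5), rng.gauss(0, 1.5)], 0) for _ in range(100)]
data += [([rng.gauss(2, 1.5), rng.gauss(2, 1.5)], 1) for _ in range(100)]
rng.shuffle(data)

results = {
    "nearest centroid": cross_val_accuracy(nearest_centroid, data),
    "1-nearest neighbour": cross_val_accuracy(one_nearest_neighbour, data),
}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

The important part is that every candidate is scored on the same folds with the same metric, so the comparison reflects the algorithms rather than the split.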

How to implement feature extraction in Julia [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am trying to build a binary classifier using machine learning, and I want to derive additional features for my data from the correlated features (numerical attributes) I already have. I have searched a lot but could not find a block of code that works for me.
What should I do?
I've looked into dimensionality reduction and found a library (Multivariate Statistics), but I did not really understand it and felt lost :D
No one can choose the exact method for you. There are many, many different ways of doing binary classification and feature extraction. If you feel overwhelmed by all the names that libraries such as Multivariate Statistics offer, take a look at a textbook on statistics and machine learning; understanding the methods is independent of the programming language.
Start with a simple method such as principal component analysis (PCA), which MultivariateStats.jl provides, then test others as you gain more knowledge of your data and the methods.
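To make the PCA suggestion concrete, here is a small language-independent sketch, written in Python rather than Julia and using only the standard library. It finds the first principal component by power iteration on the covariance matrix and projects the data onto it, giving one new feature built from the correlated attributes. The function name and the toy data are made up for illustration. A library such as MultivariateStats.jl does the same job more robustly.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, projected scores) for the top principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]

    # Power iteration: repeatedly apply cov and renormalise.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project each centred row onto the direction to get the new feature.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores


# Two strongly correlated numerical attributes (y is roughly 2x).
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, new_feature = first_principal_component(rows)
```

Here `direction` points along the line the two attributes share, and `new_feature` is a single column that can replace (or supplement) the two correlated ones as classifier input.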
Some Julia libraries to take a look at: JuliaStats (https://github.com/JuliaStats) with its parts
StatsBase for the most basic stuff
MultivariateStats for methods like PCA
StatsModels (and DataFrames) for statistical models
many more ....
For neural networks there are Flux.jl and Knet.jl
For clustering there is Clustering.jl
Then there are also bindings to the Python libraries TensorFlow (neural networks and more) and scikit-learn (all kinds of ML algorithms)
There are many more projects, but these are some that I think are important.

How to train a model to detect an event in a time line sequence of data [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I have a timeline of a user's data and I want to train a model to detect events.
For example, an event could be a gesture in a timeline of accelerometer data,
or
a timeline of looking at the time (looking at a watch), labeled as nervous or calm.
What machine learning algorithm will be appropriate for this problem?
Thanks
This task is known as Event Detection and can be performed using Natural Language Processing (NLP) techniques.
There is no 'appropriate' or 'not appropriate' algorithm. You have to extract various features (e.g. Part-of-Speech tags) that enable the algorithm(s) to detect events. Then, you need to evaluate the implemented algorithms/models (assuming that you have also tuned the corresponding parameters for each algorithm) and decide which one is the best (in terms of performance). Also, you need to decide which features are helpful and which are not.
These papers might be a good starting point:
Machine Learning Algorithms for Event Detection
Event Detection Challenges, Methods, and Applications
in Natural and Artificial Systems
There is no definitive answer as to what the best approach is. Based on experience, my favourite approach to modelling series in general is LSTM networks. These work great with time events as long as you have enough data. One option is to look for anomalies: for this you could use an LSTM that triggers when something 'unexpected' happens. Another option is to define different states (e.g. is.event = {0,1}) and train your LSTM as a normal classifier (check this question in Quora). You can use, for example, Keras to implement this easily in Python.
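For the "train it as a normal classifier" option, the timeline first has to be cut into labelled samples. A minimal sketch of that preparation step, assuming the events are known as index ranges (the `make_windows` helper and the accelerometer values are hypothetical):

```python
def make_windows(signal, event_spans, width, step=1):
    """Cut `signal` into fixed-width windows labelled 1 if they overlap an event.

    `event_spans` is a list of (start, end) index pairs, end exclusive.
    """
    samples = []
    for start in range(0, len(signal) - width + 1, step):
        end = start + width
        overlaps = any(s < end and start < e for s, e in event_spans)
        samples.append((signal[start:end], 1 if overlaps else 0))
    return samples


# Hypothetical accelerometer magnitudes with one gesture at indices 4..6.
signal = [0.1, 0.0, 0.2, 0.1, 2.5, 3.1, 2.8, 0.2, 0.1, 0.0]
events = [(4, 7)]
windows = make_windows(signal, events, width=3, step=1)
labels = [label for _, label in windows]
```

Each `(window, label)` pair is then an ordinary training example, so the same data can be fed to an LSTM or to any other sequence classifier you want to compare against.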
If data is not so abundant, you can also try other nice sequential models like HMM and HSMM. These are also supervised models that learn from sequential data. In the case of HSMM you also take into account the time spent in each state, which, depending on your data, can be of use. As far as I know scikit-learn only supports HMM; however, there is an HSMM library available here.
Finally, some remarks about processing your data. If you intend to do batch learning, any of the models suggested here should work fine. However, if you want to do on-line learning (meaning that you make predictions on the fly as data arrives), you will need to stick to LSTM, or perhaps check this alternative if you decide to use one of the Bayesian approaches: paper on-line hsmm
Hope this helps!

Why is Bayesian filtering better than Neural Networks when classifying spam? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
According to several people on StackOverflow Bayesian filtering is better than Neural Networks for detecting spam.
According to the literature I've read that shouldn't be the case. Please explain!
There is no mathematical proof or explanation of why applications of Neural Networks have not been as good at detecting spam as Bayesian filters. This does not mean that Neural Networks could not produce similar or better results, but the time it would take to tweak the Neural Network topology and train it to get even approximately the same results as a Bayesian filter is simply not justified. At the end of the day, people care about results and about minimizing the time and effort spent achieving them. When it comes to spam detection, Bayesian filters get you the best results with the least amount of effort and time. If a spam detection system using Bayesian filters detects 99% of the spam correctly, then there is very little incentive for people to spend a lot of time adjusting Neural Networks just so they can eke out an extra 0.5% or so.
"According to the literature I've read that shouldn't be the case."
It's technically correct. If properly configured, a Neural Network would get results as good as or even better than the Bayesian filters, but it's the cost/benefit ratio that makes the difference and ultimately sets the trend.
Neural Networks work mostly as a black-box approach. You determine your inputs and outputs. After that, finding a suitable architecture (a multilayer perceptron with 2 hidden layers, an RBF network, etc.) is done mostly empirically. There are suggestions for determining the architecture, but they are, well, suggestions.
This is good for some problems, since we, as domain analysts, do not always have enough information about the problem itself. The ability of an NN to find an answer anyway is a desirable thing.
A Bayesian Network, on the other hand, is designed mostly by a domain analyst. Since spam classification is a well-known problem, a domain analyst can tweak the architecture more easily. A Bayesian network would get better results more easily in this way.
Also, most NNs do not cope well with changing features and therefore almost always need to be retrained, which is an expensive operation.
A Bayesian network, on the other hand, may only need its probabilities updated.
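To show how little machinery a Bayesian filter needs, and what "only change probabilities" means in practice, here is a minimal naive Bayes sketch (naive Bayes being the usual form of Bayesian spam filter). The class name, the whitespace tokenisation and the four training messages are illustrative assumptions, not anyone's production filter.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Incremental update: only counts change, nothing is re-fitted."""
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        """Log prior plus summed log likelihoods of the words under `label`."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        for word in text.lower().split():
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def classify(self, text):
        return max(("spam", "ham"),
                   key=lambda label: self.log_score(text, label))


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win cash prize now", "spam")
spam_filter.train("cheap pills win big", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")

first = spam_filter.classify("win a prize")
second = spam_filter.classify("monday meeting")

# A new kind of spam appears: one more call updates the counts in place.
spam_filter.train("crypto giveaway crypto bonus", "spam")
after = spam_filter.classify("crypto giveaway")
```

Adapting to new spam is a single `train` call that bumps a few counters, whereas a neural network facing new input features would typically need its input layer changed and the whole model retrained.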

Neural Network: Handling unavailable inputs (missing or incomplete data) [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
Hopefully the last NN question you'll get from me this weekend, but here goes :)
Is there a way to handle an input that you "don't always know"... so it doesn't affect the weightings somehow?
Soo... if I ask someone if they are male or female and they would not like to answer, is there a way to disregard this input? Perhaps by placing it squarely in the centre? (assuming 1,0 inputs at 0.5?)
Thanks
You probably know this or suspect it, but there's no statistical basis for guessing or supplying the missing values by averaging over the range of possible values, etc.
For NNs in particular, there are quite a few techniques available. The technique I use, which I've coded myself, is one of the simpler ones, but it has a solid statistical basis and it's still used today. The academic paper that describes it is here.
The theory that underlies this technique is weighted integration over the incomplete data. In practice, no integrals are evaluated; instead they are approximated by closed-form solutions of Gaussian Basis Function networks. As you'll see in the paper (which is a step-by-step explanation), it's simple to implement in your backprop algorithm.
Neural networks are fairly resistant to noise - that's one of their big advantages. You may want to try putting inputs at (-1.0,1.0) instead, with 0 as the non-input input, though. That way the input to the weights from that neuron is 0.0, meaning that no learning will occur there.
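A small sketch of why the 0 encoding behaves that way. In standard backpropagation the gradient for a weight is the downstream error term times the input on that connection, so an input of exactly 0 contributes a zero update to its own weight. The helper names and the `delta` value below are made up for illustration.

```python
def encode_binary(answer):
    """Map a yes/no answer to +1/-1, and a missing answer (None) to 0."""
    if answer is None:
        return 0.0
    return 1.0 if answer else -1.0


def weight_gradients(inputs, delta):
    """Backprop gradient for one neuron's weights: dE/dw_i = delta * x_i."""
    return [delta * x for x in inputs]


# Hypothetical sample: first question answered, second one declined.
x = [encode_binary(True), encode_binary(None)]
grads = weight_gradients(x, delta=0.37)
```

Note that only the weights attached to the missing input are left alone for that sample; the bias and the weights on the other inputs still receive their usual updates.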
Probably the best book I've ever had the misfortune of not finishing (yet!) is Neural Networks and Learning Machines by Simon S. Haykin. In it, he talks about all kinds of issues, including the way you should distribute your inputs/training set for the best training, etc. It's a really great book!

Resources