Dynamically changing the neural network structure - machine-learning

I am trying to implement a NEAT-like algorithm, which involves dynamically changing the neural network structure, e.g. adding or deleting nodes and connections. I've been using TensorFlow for my previous work in supervised learning, but once a network is defined in TensorFlow, it cannot be changed. Is there any other framework available that provides this functionality?
Thanks.

Unless it's a framework designed specifically for NEAT, no, not really. The nature of symbolic execution means there is necessarily a "create the network" step followed by a "run/train the network" step. Depending on how frequently you're changing the network topology, though, TensorFlow could definitely still be viable: it will mean, every so often, saving all the parameters and building a new model -- but this might not be terrible, depending on your parameters.
If you don't like that, you can sort of hack something together more manually using masking. That is, have some neurons or connections "masked" out and effectively removed. You would do this by keeping a 0-1 valued mask for all your parameters that you pre-multiply into the parameters before applying them. Keep the "allowed" connections sparse, but densely connect everything else together as much as possible. This will give you some slowdown, since there are extra computations, but a tf.cond call might be able to save you most of that time by only executing conditionally. This can't get you totally free topology evolution, but it can be very flexible.
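For example, a minimal eager-mode TensorFlow sketch of the masking idea for a single dense layer (the sizes and the specific connection being "deleted" are just illustrative):

```python
import tensorflow as tf

n_in, n_out = 16, 8
w = tf.Variable(tf.random.normal([n_in, n_out]))
b = tf.Variable(tf.zeros([n_out]))
# 0/1 mask over connections; "remove" an edge by setting its entry to 0,
# "add" it back by setting it to 1. The mask itself is not trained.
mask = tf.Variable(tf.ones([n_in, n_out]), trainable=False)

def layer(x):
    # pre-multiply the mask into the weights before applying the layer
    return tf.nn.relu(tf.matmul(x, w * mask) + b)

# Example topology change: delete the connection from input 3 to output 5.
mask[3, 5].assign(0.0)
```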

Related

Turning some of the states in a Drake system into inputs

Consider an existing Drake System (for example MultibodyPlant). Is there a way to wrap that System inside a Diagram in such a way as to convert some of the states of the internal System to be inputs instead, i.e. set directly from input ports of the outer Diagram?
The motivation would be essentially to make a change in modeling decisions. For example, a quadrotor is sometimes considered to have its angular rates and collective thrust as inputs, instead of the collective thrust and body moments (or similarly, individual rotor commands).
In a more complex system, perhaps I might assume I have instantaneous control over certain velocities (e.g. internal, fully-actuated joints with fast dynamics), but still want to model the entire multibody system's dynamics accounting for the current choice of velocity in terms of Coriolis terms etc.
What I'm getting at is actually very similar to the modeling choice for the elevator in the Flat-Plate Glider Model - but I'd like to avoid manually implementing a LeafSystem because my system has nontrivial multibody dynamics.
My sense is that this may not be possible since I don't know of any way for a Diagram to interfere with the internal dynamics of a System, so "deleting" a state and promoting it to an input seems impossible. But I thought there might be some clever method to do this.
Thanks in advance!
I agree with your analysis. The simplest answer is "no" -- systems that declare state cannot be post-hoc exploded to have that state come from an input port. But for the specific examples that you mention, there are a few possibilities / related ideas.
The first is the notion of a prescribed motion constraint -- e.g. that one can set the positions/velocities of joints in a MultibodyPlant directly. We don't implement that yet, unfortunately, but it is a reasonable request that we've discussed occasionally (here is one example: https://github.com/RobotLocomotion/drake/issues/14694).
As you say, you could implement a PD controller just outside of the system to achieve the desired effect. The only real difference between this and the way we would implement the prescribed motion constraint internally is that, internally, we can choose the gains well and inform the solver about the constraint directly.
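A hedged pydrake sketch of that wrap-a-PD-controller idea (the model file and gains are placeholders, the parser call may differ by Drake version, and it assumes the plant is fully actuated with state ordered as [q; v]):

```python
import numpy as np
from pydrake.multibody.plant import AddMultibodyPlantSceneGraph
from pydrake.multibody.parsing import Parser
from pydrake.systems.controllers import PidController
from pydrake.systems.framework import DiagramBuilder

builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, time_step=0.0)
Parser(plant).AddModels("my_robot.urdf")   # placeholder model file
plant.Finalize()

n = plant.num_positions()                  # assumes num_actuators == num_positions
kp = np.full(n, 1e4)                       # stiff gains approximate a constraint
kd = np.full(n, 1e2)
pid = builder.AddSystem(PidController(kp, np.zeros(n), kd))

builder.Connect(plant.get_state_output_port(), pid.get_input_port_estimated_state())
builder.Connect(pid.get_output_port_control(), plant.get_actuation_input_port())
# The PID's desired-state port becomes the outer diagram's "input", playing the
# role of prescribed positions/velocities.
builder.ExportInput(pid.get_input_port_desired_state(), "prescribed_state")
diagram = builder.Build()
```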
Another possibility is to implement a force element that works "inside" the plant and accepts an input port. This would allow you to apply forces to implement your modeling ideas even in ways that are not possible through the actuation_input_port (e.g. not achievable by a declared actuator).
The glider example you link is a good one. In that case, I cared a lot about having a model that avoided even declaring the velocity of the elevator as state (since our verification methods scale in complexity with the dimension of the state space). For now, that still requires writing a bespoke LeafSystem implementation.

When true positives are rare

Suppose you're trying to use machine learning for a classification task like, let's say, looking at photographs of animals and distinguishing horses from zebras. This task would seem to be within the state of the art.
But if you take a bunch of labelled photographs and throw them at something like a neural network or support vector machine, what happens in practice is that zebras are so much rarer than horses that the system just ends up learning to say 'always a horse' because this is actually the way to minimize its error.
That may minimize the error, but it's also not a very useful result. What is the recommended way to tell the system 'I want the best guess at which photographs are zebras, even if this does create some false positives'? There doesn't seem to be a lot of discussion of this problem.
One of the things I usually do with imbalanced classes (or skewed data sets) is simply to generate more data. I think this is the best approach. You could go out in the real world and gather more data of the rare class (e.g. find more pictures of zebras), or you could generate more data by duplicating what you have with transformations (e.g. flipping horizontally).
You could also pick a classifier that uses an alternative evaluation (performance) metric instead of the usual one, accuracy. Look at precision, recall, and F1 score.
Week 6 of Andrew Ng's ML course talks about this topic: link
Here is another good web page I found on handling imbalanced classes: link
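As a small illustration of the class-weighting and precision/recall/F1 ideas (the synthetic data and logistic regression here just stand in for your image features and classifier):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic "95% horses / 5% zebras" data standing in for image features.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" up-weights the rare class in the loss.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)

# precision/recall/F1 per class, instead of a single accuracy number
print(classification_report(y_te, clf.predict(X_te)))
```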
With this type of unbalanced data problem, it is a good approach to learn the patterns associated with each class rather than simply comparing classes - this can be done with unsupervised learning first (such as with autoencoders). A good article on this is available at https://www.r-bloggers.com/autoencoders-and-anomaly-detection-with-machine-learning-in-fraud-analytics/amp/. Another suggestion: after running the classifier, the confusion matrix can be used to determine where additional data should be pursued (i.e. many zebra errors).
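A rough Python/Keras sketch of that autoencoder idea (the linked article uses R); here X_horses and X_new are placeholder feature arrays, and the layer sizes and threshold are only illustrative:

```python
import numpy as np
import tensorflow as tf

n_features = 64                      # placeholder feature dimension
ae = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),     # bottleneck
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(n_features),
])
ae.compile(optimizer="adam", loss="mse")
# Train on the common class only, so the model learns what a "typical horse" looks like.
ae.fit(X_horses, X_horses, epochs=20, batch_size=64)

# Large reconstruction error on new samples suggests "not a typical horse".
err = np.mean((X_new - ae.predict(X_new)) ** 2, axis=1)
threshold = np.percentile(err, 95)                   # illustrative cutoff
is_candidate_zebra = err > threshold
```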

Should I adapt loss weights for misclassified samples during epochs?

I am using FCN (Fully Convolutional Networks) and trying to do image segmentation. When training, there are some areas which are mislabeled; however, further training doesn't help much to make them go away. I believe this is because the network learns some features which might not be completely correct ones, but because there are enough correctly classified examples, it is stuck in a local minimum and can't get out.
One solution I can think of is to train for an epoch, then validate the network on the training images, and then adjust weights for the mismatched parts to penalize mismatches there more in the next epoch.
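A rough sketch of the scheme I have in mind (TF2/Keras-style; the model, images, and masks are placeholders, and the weighting scheme itself is just illustrative):

```python
import numpy as np
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True, reduction="none")
optimizer = tf.keras.optimizers.Adam()

def mismatch_weights(model, images, masks, boost=5.0):
    """Per-pixel weights: 1.0 everywhere, `boost` where the current model disagrees with the mask."""
    preds = model.predict(images, verbose=0).argmax(axis=-1)
    weights = np.ones_like(masks, dtype=np.float32)
    weights[preds != masks] = boost
    return weights

@tf.function
def train_step(model, x, y, w):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)          # (batch, H, W, num_classes)
        per_pixel = loss_fn(y, logits)            # (batch, H, W)
        loss = tf.reduce_mean(per_pixel * w)      # penalize mismatched pixels more
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```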
Intuitively, this makes sense to me - but I haven't found any writing on this. Is this a known technique? If yes, what is it called? If not, what am I missing (what are the downsides)?
It depends a lot on your network structure. If you are using the original FCN, then due to the pooling operations, segmentation performance on the boundaries of your objects is degraded. There have been quite a few variants on the original FCN for image segmentation, although they didn't go the route you're proposing.
Just to name a couple of examples: one approach is to use a Conditional Random Field (CRF) on top of the FCN output to refine the segmentation. You may search for the relevant papers to get more ideas on that. In some sense, it is close to your idea, but the difference is that the CRF is separate from the network, as a post-processing step.
Another very interesting work is U-Net. It employs an idea similar to the residual network (ResNet): skip connections that allow high-resolution features from lower levels to be integrated into higher levels, achieving more accurate segmentation.
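For a sense of the mechanism, here is a tiny Keras sketch of a U-Net-style skip connection (image size, filter counts, and class count are arbitrary): high-resolution encoder features are concatenated with upsampled decoder features so fine detail reaches the output.

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = layers.Input((128, 128, 3))
enc = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)   # high-res features
down = layers.MaxPooling2D()(enc)
mid = layers.Conv2D(32, 3, padding="same", activation="relu")(down)
up = layers.UpSampling2D()(mid)
merged = layers.Concatenate()([up, enc])                  # the skip connection
out = layers.Conv2D(2, 1, activation="softmax")(merged)   # 2-class segmentation map
model = tf.keras.Model(inputs, out)
```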
This is still a very active research area. So you may bring the next break-through with your own idea. Who knows! Have fun!
First, if I understand correctly, you want your network to overfit your training set? That's generally something you don't want to happen, because it would mean that while training, your network has found some "rules" that enable it to get great results on your training set, but also that it hasn't been able to generalize, so when you give it new samples it will probably perform poorly. Moreover, you never mention a test set -- have you divided your dataset into training and test sets?
Secondly, to give you something to look into, the idea of penalizing more where you don't perform well made me think of something called "AdaBoost" (it might be unrelated). This short video might help you understand what it is:
https://www.youtube.com/watch?v=sjtSo-YWCjc
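For reference, a minimal scikit-learn AdaBoost example (the data is synthetic and only illustrative): each boosting round re-weights the training samples the previous weak learners got wrong, which is the closest standard analogue of "penalize mismatches more next time".

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=1000, random_state=0)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.score(X, y))  # training accuracy, just to show the fitted ensemble
```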
Hope it helps

Machine learning applied to geometry

First, I have to say that I don't know machine learning very well.
I want to ask whether machine learning can help to fill a specified area with irregular geometries so as to use as much of the area as possible; this is called "nesting".
Machine learning works by comparing the current configuration to other ones previously shown for training, with their solutions, and trying to "interpolate" between them.
In your case, it means that you would need sufficiently many solved cases to reasonably cover the configuration space. This can be a serious obstacle, because it requires that you have a packing algorithm anyway.
Also, due to the discontinuous nature of the solution function, I doubt that an ML approach would be efficient and able to construct collision-free solutions in all cases.

Improving K Means on some data sets

Does anyone have an idea of how a simple K-means algorithm could be tuned to handle data sets of this form?
The most direct way to handle data of that form while still using k-means is to use a kernelized version of k-means. Two implementations of it exist in the JSAT library (see here: https://github.com/EdwardRaff/JSAT/blob/67fe66db3955da9f4192bb8f7823d2aa6662fc6f/JSAT/src/jsat/clustering/kmeans/ElkanKernelKMeans.java).
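For illustration (this is a from-scratch sketch, not the JSAT code linked above), here is a minimal kernel k-means on a toy two-rings data set; the RBF kernel and its bandwidth are arbitrary choices:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.metrics.pairwise import rbf_kernel

def kernel_kmeans(K, n_clusters=2, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(n_clusters, size=n)        # random initial assignment
    for _ in range(n_iter):
        # Squared distance of point i to cluster c's centroid in feature space:
        # K_ii - (2/|C|) * sum_{j in C} K_ij + (1/|C|^2) * sum_{j,l in C} K_jl
        dist = np.zeros((n, n_clusters))
        for c in range(n_clusters):
            mask = labels == c
            size = max(mask.sum(), 1)
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / size
                          + K[np.ix_(mask, mask)].sum() / size**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):        # converged
            break
        labels = new_labels
    return labels

X, _ = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
K = rbf_kernel(X, gamma=10.0)                         # illustrative kernel choice
labels = kernel_kmeans(K, n_clusters=2)
```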
As Nicholas said, another option is to create a new feature space in which to run k-means. However, this takes some prior knowledge of what kind of data you will be clustering.
Beyond that, you really just need to move to a different algorithm. k-means is a simple algorithm that makes simple assumptions about the world, and when those assumptions are too strongly violated (non-linearly-separable clusters being one of them), you just have to accept that and pick a more appropriate algorithm.
One possible solution to this problem is to add another dimension to your data set, for which there is a split between the two classes.
Obviously this is not applicable in many cases, but if you have applied some sort of dimensionality reduction to your data, then it may be something worth investigating.
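A small sketch of this extra-dimension idea (and the new-feature-space suggestion in the answer above), using concentric rings as a stand-in for data of this form: adding each point's distance from the data's center gives plain k-means an axis along which the two rings separate.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_circles

X, _ = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
# The extra dimension: distance of each point from the data's center.
radius = np.linalg.norm(X - X.mean(axis=0), axis=1, keepdims=True)
X_aug = np.hstack([X, radius])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_aug)
```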
