I have been using the GMM cluster package by Bouman, for which I did not find any adaptation module online. Before I start reading up on GMM adaptation theory and implementing it myself, I'd like to know whether there are any other open-source GMM projects online that cover training, testing, and adaptation to new data.
It might be late to answer this now, but for future reference, I suggest the Bob library (specifically bob.bio.gmm), which provides a wide range of functionality for manipulating Gaussian mixture models in speech-related applications, including MAP adaptation and UBM generation.
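For readers who want to see what MAP adaptation involves before picking a library, here is a minimal NumPy sketch of mean-only MAP adaptation of a diagonal-covariance GMM. It is not Bob's API: the function name `map_adapt_means`, its parameters, and the toy data are all assumptions made for illustration, and the relevance factor of 16 is just a commonly used default.

```python
import numpy as np

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Return MAP-adapted means of a diagonal-covariance GMM (the UBM).

    X: (T, D) data, weights: (K,), means and variances: (K, D).
    Hypothetical helper for illustration, not a library function.
    """
    # Log-density of every frame under every component, shape (T, K).
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss
    # Normalise in the log domain to obtain responsibilities.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    n_k = post.sum(axis=0)                              # soft counts, (K,)
    e_k = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]  # data means, (K, D)
    alpha = (n_k / (n_k + relevance))[:, None]          # adaptation weights
    # Interpolate between the new data statistics and the prior (UBM) means.
    return alpha * e_k + (1.0 - alpha) * means

# Toy example: a two-component UBM adapted to data centred near (1, 1).
rng = np.random.default_rng(0)
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[0.0, 0.0], [5.0, 5.0]])
ubm_var = np.ones((2, 2))
X = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(200, 2))
adapted = map_adapt_means(X, ubm_w, ubm_mu, ubm_var)
```

Components that see a lot of new data move towards it, while components with almost no responsibility stay close to their UBM means. Weights and variances are left untouched in this sketch.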
Related
I need advice on which libraries and game engines I should use for an ML project
My goal is to create a machine learning model for pruning trees. I believe I have to create a game with a generic tree model with some randomness, then create a reinforcement learning model and train it inside the game. The ML model must be able to first find the branch that must be cut and then find a path to move a robotic arm near that branch to cut it. I have experience in C++ and Java, but I prefer C++. Could you advise me on which library I should use for ML, and which language and game engine I should use for creating the game? I have a little experience with OpenGL. If it doesn't make any difference, my preferred language is C++, but I know that I should use the right tool for the right job, and Python is the leader in ML, so if it will save time and energy I have nothing against learning Python.
My recommendation is to learn and use Python for your ML project. Though there is some work in R, for your future in ML, your best bet is to learn and use Python. The community is great, and there are many frameworks that can work out-of-the-box.
After a quick search, I did find a framework called robotframework that is pretty highly starred on GitHub: https://github.com/robotframework/robotframework. I am not personally familiar with it, though. Note that it is a generic automation framework for acceptance testing and robotic process automation rather than a robot control library, so check whether it actually fits your use case.
In terms of tree-based algorithms, you might want to start exploring with XGBoost. It can be found here: https://github.com/dmlc/xgboost.
Unity provides two RL algorithms to train agents: PPO and SAC.
I have been searching for weeks for a way to write my own algorithms, and I only found a mention of a gym-unity wrapper that wraps Unity environments so that I could write my algorithms against the Gym interface. This wrapper has no useful documentation, so I don't know where to start.
My questions are:
(1) How can I import custom-written RL models into unity?
(2) Is there better documentation for the wrapper?
You could look at my repository genetic-unity, which implements evolutionary algorithms using the ML-Agents package.
I did not use their implemented agents (PPO and SAC) and I just used the interface between Unity and python to code my own algorithms, which is what you're looking for if I understand correctly.
You could start by looking at the genetic_algorithm.py file to see how I handle the Unity environment.
However, you should note that this work was done 9 months ago and the ML-Agents framework changes at a fast pace, so you may need to adapt it a little.
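To illustrate the general pattern of writing your own algorithm against a Gym-style interface, here is a minimal sketch that is independent of Unity. `ToyEnv`, `run_episode`, and `hill_climb` are hypothetical names invented for this example, and the environment uses the classic four-value `step` return. With ML-Agents, the environment object would instead come from the gym-unity wrapper (in the versions I know of, `UnityToGymWrapper` from `gym_unity.envs` around a `UnityEnvironment`), but check that against your installed version.

```python
import random

class ToyEnv:
    """Hypothetical stand-in exposing the classic Gym reset/step interface."""

    def __init__(self, horizon=20):
        self.horizon = horizon

    def reset(self):
        self.t = 0
        self.pos = 1.0
        return [self.pos]

    def step(self, action):
        self.pos += action
        self.t += 1
        reward = -abs(self.pos)          # best possible reward is 0, at pos == 0
        done = self.t >= self.horizon
        return [self.pos], reward, done, {}

def run_episode(env, gain):
    """Roll out a one-parameter linear policy and return the episode return."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        action = max(-1.0, min(1.0, gain * obs[0]))  # clip the action to [-1, 1]
        obs, reward, done, _ = env.step(action)
        total += reward
    return total

def hill_climb(env, iterations=200, sigma=0.3, seed=0):
    """Simple random-search hill climbing over the policy parameter."""
    rng = random.Random(seed)
    best_gain = 0.0
    best_return = run_episode(env, best_gain)
    for _ in range(iterations):
        candidate = best_gain + rng.gauss(0.0, sigma)
        ret = run_episode(env, candidate)
        if ret > best_return:            # keep the candidate only if it improves
            best_gain, best_return = candidate, ret
    return best_gain, best_return

best_gain, best_return = hill_climb(ToyEnv())
```

The algorithm only ever calls `reset` and `step`, so swapping the toy environment for a wrapped Unity environment should not require changes to the search loop itself, only to how observations and actions are shaped.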
I noticed that the Gradient Quantization compression method is already implemented in the TFF framework. How about non-traditional compression methods where we select a sub-model by dropping some parts of the global model? I came across the "Federated Dropout" compression method in the paper "Expanding the Reach of Federated Learning by Reducing Client Resource Requirements" (https://arxiv.org/abs/1812.07210). Any idea whether the Federated Dropout method is already supported in TensorFlow Federated? If not, any insights on how to implement it (the main idea of the method is dropping a fixed percentage of the activations and filters in the global model in order to exchange and train a smaller sub-model)?
Currently, there is no implementation of this idea available in the TFF code base.
Here is an outline of how you could do it. I recommend starting from examples/simple_fedavg:
1. Modify the top-level build_federated_averaging_process to accept two model_fns: one server_model_fn for the global model, and one client_model_fn for the smaller sub-model structure actually trained on clients.
2. Modify build_server_broadcast_message to extract only the relevant sub-model from the server_state.model_weights. This would be the mapping from server model to client model.
3. The client_update may actually not need to be changed (I am not 100% sure), as long as only the client_model_fn is provided from client_update_fn.
4. Modify server_update: the weights_delta will be the update to the client sub-model, so you will need to map it back to the larger global model.
In general, steps 2 and 4 are tricky, as they depend not only on which layers are in a model, but also on how they are connected. So it will be hard to create an easy-to-use general solution, but it should be fine to write these for a specific model structure you know in advance.
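The two mappings in steps 2 and 4 can be sketched for a single dense layer as follows. This is plain NumPy rather than TFF code, and the helper names (`select_kept_units`, `extract_submodel`, `apply_sub_delta`) are hypothetical. It also assumes the update is simply added to the server weights; the real sign convention and optimizer step depend on how server_update is written.

```python
import numpy as np

def select_kept_units(num_units, drop_rate, rng):
    """Pick the indices of the output units that survive the dropout."""
    num_kept = max(1, int(round(num_units * (1.0 - drop_rate))))
    return np.sort(rng.choice(num_units, size=num_kept, replace=False))

def extract_submodel(kernel, bias, kept):
    """Step 2: server model -> client sub-model (keep selected columns only)."""
    return kernel[:, kept].copy(), bias[kept].copy()

def apply_sub_delta(kernel, bias, kept, kernel_delta, bias_delta):
    """Step 4: map the sub-model update back into the full-size weights."""
    new_kernel, new_bias = kernel.copy(), bias.copy()
    new_kernel[:, kept] += kernel_delta   # dropped columns stay unchanged
    new_bias[kept] += bias_delta
    return new_kernel, new_bias

# Toy round: a dense layer with 4 inputs and 6 units, half of them dropped.
rng = np.random.default_rng(0)
server_kernel, server_bias = np.zeros((4, 6)), np.zeros(6)
kept = select_kept_units(6, drop_rate=0.5, rng=rng)
sub_kernel, sub_bias = extract_submodel(server_kernel, server_bias, kept)

# Stand-in for the averaged client update computed on the sub-model.
kernel_delta, bias_delta = np.ones_like(sub_kernel), np.ones_like(sub_bias)
server_kernel, server_bias = apply_sub_delta(
    server_kernel, server_bias, kept, kernel_delta, bias_delta)
```

In a multi-layer model, the same `kept` indices would also have to be used to slice the input dimension of the following layer, which is where the dependence on how layers are connected comes in.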
We have several compression schemes implemented in our simulator:
"FL_PyTorch: Optimization Research Simulator for Federated Learning."
https://burlachenkok.github.io/FL_PyTorch-Available-As-Open-Source/
https://github.com/burlachenkok/flpytorch
FL_PyTorch is a suite of open-source software written in Python that builds on top of PyTorch, one of the most popular research Deep Learning (DL) frameworks. We built FL_PyTorch as a research simulator for FL to enable fast development, prototyping, and experimenting with new and existing FL optimization algorithms. Our system supports abstractions that provide researchers with sufficient flexibility to experiment with existing and novel approaches to advance the state of the art. The work appears in the proceedings of the 2nd International Workshop on Distributed Machine Learning, DistributedML 2021. The paper, presentation, and appendix are available in the DistributedML’21 Proceedings (https://dl.acm.org/doi/abs/10.1145/3488659.3493775).
I am looking for implementations of online learning algorithms for time series. Do R, Python, MOA, or any other tools have these kinds of algorithms implemented?
TIA!
It's a little bit late, but in case someone is looking for the answer, I will share what I know:
PYTHON: sklearn clustering algorithms.
MiniBatchKMeans and Birch: both algorithm implementations have a partial_fit method allowing you to stream data through them in incremental updates (allowing online learning).
JAVA: MOA framework.
There are a lot of well-known stream clustering algorithms implemented (CluStream, DenStream, etc.). You can use it via:
terminal
user interface (see clustering demo)
code (Java API)
See the 'downloads' section on the MOA website, or check the source code directly on GitHub.
R: streamMOA: an R package that acts as a wrapper for the MOA [Java] classes. See the manual.
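For the Python option, the incremental use of partial_fit mentioned above could look like the following minimal sketch. The synthetic stream, batch size, and number of clusters are assumptions chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
# Hypothetical ground-truth centres used to simulate a data stream.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])

model = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0)

# Feed the model one mini-batch at a time, as data would arrive online.
for _ in range(50):
    which = rng.integers(0, 3, size=32)
    batch = centers[which] + rng.normal(scale=0.3, size=(32, 2))
    model.partial_fit(batch)

# The model can be queried at any point during the stream.
labels = model.predict(centers)
```

Birch exposes the same partial_fit method, so the loop would look the same with a different estimator.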
For some time, I have been using OpenCV. It has satisfied all my needs for feature extraction, matching, clustering (k-means so far), and classification (SVM). Recently, I came across Apache Mahout. However, most machine learning algorithms are already available in OpenCV as well. Are there any advantages to using Mahout over OpenCV if the work relates to videos and images?
This question might be put on hold since it is opinion based. I still want to add a basic comparison.
OpenCV can handle almost anything in vision and ML that has been researched or published. The vision literature builds on it, and it evolves along with the literature. Even newer ML algorithms such as TLD (http://www.tldvision.com/), which originated in MATLAB, can be implemented using OpenCV (http://gnebehay.github.io/OpenTLD/) with some effort.
Mahout is capable too, and it is specific to ML. It includes not only the well-known ML algorithms, but also more specialized ones. Say you came across a paper titled "Processing Apples with K-means Orientation Filtering". You might find OpenCV implementations of such a paper around the web, and even the actual algorithm might be open source and developed using OpenCV. With OpenCV, say it takes 500 lines of code, whereas with Mahout the paper's method might already be implemented as a single method, making everything easier.
An example of this is canopy clustering (http://en.wikipedia.org/wiki/Canopy_clustering_algorithm), which is harder to implement using OpenCV right now.
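To give an idea of what canopy clustering does, here is a minimal pure-Python sketch of the basic procedure as described on the linked Wikipedia page. The function name and toy points are assumptions for illustration, and this is neither Mahout's nor OpenCV's implementation. It also picks the next centre deterministically rather than at random, and it does not address the distributed setting that Mahout targets.

```python
import math

def canopy_clustering(points, t1, t2):
    """Group points into canopies using a loose (t1) and tight (t2) distance."""
    if not t1 > t2 >= 0:
        raise ValueError("expected t1 > t2 >= 0")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining[0]
        # Every remaining point within the loose distance joins this canopy.
        members = [p for p in remaining if math.dist(center, p) <= t1]
        canopies.append((center, members))
        # Points within the tight distance are removed from further processing.
        remaining = [p for p in remaining if math.dist(center, p) > t2]
    return canopies

points = [(0.0, 0.0), (0.5, 0.0), (1.5, 0.0), (10.0, 10.0), (10.5, 10.0)]
canopies = canopy_clustering(points, t1=2.0, t2=1.0)
```

Because points between the tight and loose distances stay in the pool, they can end up in more than one canopy, which is why canopy clustering is usually used as a cheap pre-clustering step before something like k-means.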
Since you are going to work with image data sets you will need to learn about HIPI, too.
To sum up, here is a simple pro-con table:
know-how (learning curve): OpenCV is easier, since you already know about it. Mahout+HIPI will take more time.
examples: The literature and the vision community commonly use OpenCV. Open-source algorithms are mostly created with the C++ API of OpenCV.
ml algorithms: Mahout is only about ml, whereas OpenCV is more generic. Still OpenCV has access to basic ml algorithms.
development: Mahout is easier to work with in terms of coding and time complexity (I am not sure about the latter, but I reckon it is).