Does anyone know how XEMP in DataRobot works? - machine-learning

DataRobot has an in-house algorithm named XEMP for prediction explanations. Unfortunately, they haven't released the algorithm, which makes it difficult to trust and use. There is a white paper that compares it with LIME, but that document is more product marketing than technical documentation.
Does anyone have a technical understanding of how this algorithm works, and what has been the experience of people using it?

Related

Tips for writing an algorithm for paraphrasing sentences (machine learning)

I am doing a project at university and need to train an algorithm to rephrase sentences. What can you advise for the implementation? Is it possible to use a translator to translate into another language in order to end up with a paraphrased sentence? I also want to use Word2Vec; is that a bad idea?
This kind of broad-advice question, about a very tough problem (paraphrasing text is still a very active research area), would be better answered by surveying the research literature.
A great site for searching relevant papers – and then finding other related papers once you've set some positive examples – is http://www.arxiv-sanity.com/.
Searching for [paraphrasing] or [summarization] would give you a running start in seeing major techniques & their limitations. And, once you start bookmarking papers by the little 'disk' icon, it can autosuggest important related papers... so even if your 1st few finds are tangential or far-from-usefulness, it can lead you to the seminal papers, & prevailing cutting-edge algorithms/libraries, pretty quickly.

Where do I learn credit card fraud detection with machine learning?

Can anyone suggest a good source to learn from?
I am a newbie in ML.
As I am a newbie, I have not done anything in this.
This might be an excellent place to start. You can create a new kernel straight from the dataset page, and the data will be ready for you when you enter the kernel. You can also look at other people's kernels who have used that dataset, and I bet you'll find plenty of helpful examples.
You'll get lots of hate for asking this kind of question, since it doesn't fit in S.O. question parameters, but I prefer to be a useful human.

Increasing the efficiency of equipment using Amazon Machine Learning

The problem statement is kind of vague, but I am looking for directions; because of our privacy policy I can't share exact details, so please help out.
We have a problem at hand where we need to increase the efficiency of equipment or, in other words, decide at which values across multiple parameters the machines should operate to produce optimal output.
My query is whether it is possible to come up with such numbers using Linear Regression or Multinomial Logistic Regression algorithms. If not, can you please specify which algorithms would be more suitable? Also, can you please point me to some active research on this kind of problem that is available in the public domain?
Does the type of problem I am asking about fall within the area of machine learning?
Lots of unknowns here but I’ll make some assumptions.
What you are attempting to do could probably be achieved with multiple linear regression. I have zero familiarity with the Amazon service (I didn’t even know it existed until you brought this up; it’s not available in Europe). However, a read of the documentation suggests that the Amazon service would be capable of doing this for you. The problem you will perhaps have is that it’s geared to people unfamiliar with this field, so a lot of its functionality might be removed or clumped together to prevent confusion. I am under the impression that you have turned to this service because you too are somewhat unfamiliar with this field.
Something that may suit your needs better is Response Surface Methodology (RSM), which I have applied to industrial optimisation problems that I think are similar to what you describe. RSM works best if you can obtain your data through an experimental design such as a Central Composite Design or a Box-Behnken design. I suggest you spend some time Googling these terms to get your head around them; I don’t think it’s an unmanageable burden to learn how to apply them with no prior experience in this area. Because your question is vague, only you can determine if this really is suitable. If you already have the data in an unstructured format, you can still fit a response surface, but it is less robust. There are plenty of open-access articles using these techniques, but ScienceDirect is conveniently down at the moment!
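To make the RSM idea a bit more concrete, here is a minimal sketch in plain Python, assuming two process parameters expressed in coded units. It builds a two-factor Central Composite Design, fits a second-order model by least squares, and solves for the stationary point of the fitted surface. The function names and the `simulated_response` function are hypothetical stand-ins; in practice the responses would come from actually running the equipment at each design point.

```python
def central_composite_design(alpha=2 ** 0.5):
    """Two-factor CCD in coded units: 4 factorial, 4 axial, 1 centre point."""
    factorial = [(a, b) for a in (-1.0, 1.0) for b in (-1.0, 1.0)]
    axial = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    return factorial + axial + [(0.0, 0.0)]


def quadratic_terms(x1, x2):
    """Model terms: intercept, linear, pure quadratic, interaction."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_response_surface(points, responses):
    """Least-squares fit of the second-order model via the normal equations."""
    rows = [quadratic_terms(*p) for p in points]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    return solve(xtx, xty)


def stationary_point(coef):
    """Point where the gradient of the fitted quadratic is zero."""
    _, b1, b2, b11, b22, b12 = coef
    return solve([[2 * b11, b12], [b12, 2 * b22]], [-b1, -b2])


def simulated_response(x1, x2):
    """Hypothetical process output; a real study would measure this."""
    return 50.0 - 2.0 * (x1 - 0.5) ** 2 - 3.0 * (x2 + 0.25) ** 2


design = central_composite_design()
responses = [simulated_response(x1, x2) for x1, x2 in design]
coefficients = fit_response_surface(design, responses)
optimum = stationary_point(coefficients)
print("fitted coefficients:", coefficients)
print("stationary point (coded units):", optimum)
```

Because the simulated response is itself a noise-free quadratic, the stationary point comes out close to (0.5, -0.25) in coded units. With real, noisy measurements you would also want to check that the stationary point is a maximum rather than a saddle point (for two factors: both pure quadratic coefficients negative and 4·b11·b22 greater than b12²) and that it lies inside the region you actually experimented in.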
Minitab is a software package that will do all the regression and RSM for you. Its strength is that it has a robust GUI and partially resembles Excel, so it is far less daunting to get into than something like R. It also has plenty of guides online. They offer a 30-day free trial, so it might be worth doing some background reading, collecting the tutorials you need and developing a plan of action before downloading the trial.
Hope that is some help.

Is there an algorithm to describe a portrait of a person in words?

I'm looking for an algorithm that analyzes a portrait photo of a person and outputs a descriptive text like "young man, rather long nose, green eyes".
It doesn't matter whether the output is very precise or not; it is for an art installation. But it should be possible to do it automatically.
I found this one: https://code.google.com/p/deep-learning-faces/, but it is impossible for me to fulfill the hardware and software requirements (NVIDIA Fermi GPUs & MATLAB).
Do you know of anything more accessible?
There are a few free face analyser APIs that are fairly easy to use:
Rekognition, by Orbeus
MP Face Analyzer SDK (evaluation) by MotionPortrait
Faceplusplus (linked above)
You might have to take measurements of an "average face" to make interpretations like "long nose". ToonifyMe is an app that caricatures faces using this approach.
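The "average face" comparison could be sketched roughly like this, assuming you already have a few normalised measurements for the portrait (for example, derived from the landmark positions one of the APIs returns). The measurement names, the reference means and standard deviations, and the thresholds are all made-up placeholders for illustration, not real anthropometric data.

```python
# Hypothetical reference statistics: (mean, standard deviation) of each
# measurement, expressed as a fraction of face height or width.
# Placeholder numbers only, not real anthropometric data.
AVERAGE_FACE = {
    "nose_length": (0.33, 0.03),
    "mouth_width": (0.40, 0.04),
    "eye_distance": (0.30, 0.02),
}

# For each measurement: adjective when below average, adjective when
# above average, and the noun used in the description.
WORDS = {
    "nose_length": ("short", "long", "nose"),
    "mouth_width": ("narrow", "wide", "mouth"),
    "eye_distance": ("close-set", "wide-set", "eyes"),
}


def describe_face(measurements):
    """Turn measurements into phrases such as 'rather long nose'."""
    phrases = []
    for name, value in measurements.items():
        mean, std = AVERAGE_FACE[name]
        low, high, noun = WORDS[name]
        # Distance from the average face, in standard deviations.
        z = (value - mean) / std
        if abs(z) < 1.0:
            continue  # close to average: nothing worth mentioning
        degree = "very" if abs(z) >= 2.0 else "rather"
        adjective = high if z > 0 else low
        phrases.append(f"{degree} {adjective} {noun}")
    return ", ".join(phrases)


portrait = {"nose_length": 0.375, "mouth_width": 0.30, "eye_distance": 0.30}
print(describe_face(portrait))
```

Features within one standard deviation of the average are skipped, so the text only mentions what stands out, which is probably what you want for an art installation anyway.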
Some of these APIs can actually work on a Pi. Rekognition does the analysis in the cloud, so that should be doable.
This is one of the hardest problems in computer vision. I'd recommend you watch the TED talk by Fei-Fei Li to get an understanding of it:
https://www.ted.com/talks/fei_fei_li_how_we_re_teaching_computers_to_understand_pictures
In short: if you want to use any of the state-of-the-art methods, you will need a lot of processing power. A lot more than just a single high-end graphics card; I'm talking about supercomputing here.
And unless you're really lucky and find a research group that has released their implementation, this also requires a huge amount of engineering.
I found this online service that describes faces: http://www.faceplusplus.com/
It has a very well documented API and seems to be free of charge. Or at least I didn't find any information about pricing.

OpenCV boosting differences

I am working with OpenCV on a recognition project and have a general question about the API and its terms. I've looked online and couldn't find anything specific, so I was wondering what the differences are between Discrete AdaBoost, Real AdaBoost, LogitBoost, and Gentle AdaBoost. If anyone could direct me to a pros and cons list or a general description of these, I could research which one would be useful.
Update
I have added a link to a PowerPoint file that goes over the different variations of the boosting techniques. Hope this helps someone else out there.
Adaboost powerpoint
Thanks in advance
There isn't really a simple "always use technique X", otherwise there wouldn't be a need for all the others. You really have to understand the details and experiment.
See the OpenCV discussion and a list of papers and technical summaries.
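For a feel of the baseline that the other variants modify, here is a minimal sketch of Discrete AdaBoost with decision stumps in plain Python. It is not OpenCV's implementation, and the function names and toy data are hypothetical. Each weak classifier outputs a hard label of +1 or -1, gets a vote weight based on its weighted error, and the samples it misclassified are up-weighted for the next round. Broadly, Real AdaBoost, Gentle AdaBoost and LogitBoost differ in that the weak learner returns a real-valued estimate rather than a hard label, and in how that estimate is fitted and used to update the weights.

```python
import math


def train_stump(X, y, w):
    """Return (weighted error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    """Hard +1/-1 prediction of a single stump."""
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def discrete_adaboost(X, y, rounds=10):
    """Train an ensemble of (vote weight, stump) pairs; labels are +1/-1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clip the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified samples are multiplied by exp(alpha), correctly
        # classified ones by exp(-alpha); then the weights are renormalised.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy 1-D data that no single stump can separate.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
ensemble = discrete_adaboost(X, y, rounds=3)
print([predict(ensemble, row) for row in X])
```

On this toy data three rounds are enough to classify all six training points, even though every individual stump gets at least two of them wrong under the initial uniform weights. Which variant works best on a real recognition task is something you would still have to find out by experiment.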

Resources