Encoding or Mapping [closed] - machine-learning

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I am a bit confused about how to handle categorical data for machine learning algorithms. I found several approaches on the internet: encoding only, encoding followed by OneHotEncoding, and mapping to numbers such as 1, 2, 3, etc.
Can someone help me to understand when to use each of those ways?

There are multiple ways to encode your categories, depending on the nature of your data. The choice also depends on the algorithm you're going to use, since the same encoding method does not suit every model. Depending on the encoding method (for example, one that uses the target values), you might even need to change your cross-validation strategy to avoid leakage.
Check this out - https://towardsdatascience.com/all-about-categorical-variable-encoding-305f3361fd02
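To make the difference between the approaches concrete, here is a minimal sketch in plain Python. It follows a common rule of thumb that is not stated in the answer above: map to integers when the categories have a natural order, and one-hot encode when they do not, so the model cannot read a ranking into the numbers. The column values and the `SIZE_ORDER` mapping are made up for illustration.

```python
# Hypothetical ordered category: the integers preserve small < medium < large.
SIZE_ORDER = {"small": 1, "medium": 2, "large": 3}


def ordinal_encode(values, mapping):
    """Replace each category with its integer rank from an explicit mapping."""
    return [mapping[v] for v in values]


def one_hot_encode(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one column is set per row
        rows.append(row)
    return categories, rows


sizes = ["small", "large", "medium", "small"]   # ordered: mapping is fine
colors = ["red", "green", "blue", "green"]      # unordered: one-hot

size_codes = ordinal_encode(sizes, SIZE_ORDER)
color_categories, color_rows = one_hot_encode(colors)

print(size_codes)
print(color_categories)
print(color_rows)
```

Mapping `colors` to 1, 2, 3 instead would tell a linear model that one color is "larger" than another, which is usually not what you want. In practice a library encoder would replace these helper functions.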

Related

Is this possible to predict the lottery numbers (not the most accurate)? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am looking for the correct machine learning approach for predicting lottery numbers. It does not need to give the most accurate answer, but it should at least produce some predicted output. I am implementing regression-based and neural network models for this. Is there any specific approach to follow?
It is impossible. Lottery numbers are random; to be more specific, the system is chaotic. You would need to know the initial configuration (positions, etc.) to insane (possibly infinite) precision to make any predictions. Basically, don't even try.
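The point about precision can be illustrated with a toy chaotic system. The sketch below uses the logistic map, which is only an analogy and not a model of a lottery machine; the starting value `0.2` and the perturbation `1e-12` are arbitrary choices. Two trajectories that start almost identically end up far apart after a few dozen steps.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map; r = 4 puts it in the chaotic regime."""
    return r * x * (1.0 - x)


def divergence(x0, delta, steps):
    """Track the gap between two trajectories that start `delta` apart."""
    a, b = x0, x0 + delta
    gaps = [abs(a - b)]
    for _ in range(steps):
        a, b = logistic_map(a), logistic_map(b)
        gaps.append(abs(a - b))
    return gaps


gaps = divergence(x0=0.2, delta=1e-12, steps=100)

# Index of the first step at which the two trajectories differ by more than 0.1.
first_large = next(i for i, g in enumerate(gaps) if g > 0.1)

print(f"initial gap: {gaps[0]:.3e}")
print(f"gap exceeds 0.1 at step {first_large}")
```

Each extra digit of precision in the starting value buys only a few more steps of useful prediction, which is why fitting a regression or a neural network to past draws does not help.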

iOS Generating Random Test Data [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
For testing, I needed to generate a list of random data values and put them into models for further use. But I found that no existing library provided this functionality.
The elegant solution I expected to find had to combine such simple things as:
the variety of data;
the variety of methods to reach this data;
the possibility to change the default data set to the custom one.
Since I hadn't found a suitable solution, I decided to create my own library (ref. https://github.com/codeitua/ios-data-factory).
It implements all the necessary methods for data generation (including random names, cities, addresses, dates, etc.) and data retrieval. Moreover, it has a "swifty" interface that makes it comfortable to use in any project.
I hope it will be helpful for everyone who has faced the same problem as me!

How to write a program that outputs source code [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
This might not be the right place to ask this, but I am interested in artificial neural networks and want to learn more.
How do you design a network and train it on source code so that it can come up with programs for, say, easy number theory problems?
What's the general name of this research field?
This is a hugely interesting, and very hard, problem area. It will probably take you months to read enough to even understand how to attack the problem. Here are a few things that might help you get started; they are meant more to show the problems you will face than to provide solutions:
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Then read this, and related papers:
https://arxiv.org/pdf/1410.5401v2.pdf
Next, you probably want to read the classic papers in program synthesis and generation at the parse tree/AST level (mostly out of MIT, I think, in the early 90s.)
Best of luck. This is not trivial.

What is the difference between feature engineering and feature extraction? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I am struggling to find the difference between the two concepts. From what I understand both refer to turning raw data into more comprehensive features to describe the problem at hand. Are they the same thing? If not could anyone please provide examples for both?
Feature extraction is usually used when the original data is in a very different form, in particular when you could not have used the raw data directly.
E.g., the original data were images. You extract the redness value, or a description of the shape of an object in the image. It's lossy, but at least you now get some result.
Feature engineering is the careful preprocessing of data into more meaningful features, even if you could have used the original data.
E.g., instead of using the variables x, y, z, you decide to use log(x)-sqrt(y)*z, because your engineering knowledge tells you that this derived quantity is more meaningful for solving your problem. You get better results than without it.
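The two examples above can be sketched in a few lines of Python. The definition of "redness" used here (the red channel's share of total intensity) and the sample values are assumptions made for illustration; the answer does not specify them.

```python
import math


def mean_redness(pixels):
    """Feature extraction: collapse a list of (r, g, b) pixels into one number.

    Redness is defined here, as an assumption, as the red channel's share
    of the total intensity. The image itself is discarded (lossy).
    """
    total = sum(r + g + b for r, g, b in pixels)
    if total == 0:
        return 0.0  # all-black image: no color information
    return sum(r for r, _, _ in pixels) / total


def derived_feature(x, y, z):
    """Feature engineering: combine usable variables into log(x) - sqrt(y) * z."""
    return math.log(x) - math.sqrt(y) * z


# A tiny made-up "image" of three pixels.
image = [(255, 0, 0), (200, 50, 50), (0, 0, 255)]
redness = mean_redness(image)

# x, y, z could have been fed to a model as-is; the derived quantity
# encodes domain knowledge instead.
feature = derived_feature(x=math.e, y=4.0, z=0.25)

print(redness, feature)
```

In the first function the raw pixels could not be given to a typical tabular model at all, while in the second the inputs were already usable and the transformation only makes them more informative.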
Feature engineering is transforming raw data into features/attributes that better represent the underlying structure of your data; it is usually done by domain experts.
Feature extraction is transforming raw data into the desired form.

Linked Data and Tagging [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
Do linked data applications use tagging for easier information retrieval? Where can I find information on this specific topic?
For semantic annotation (tagging) the following applications would be good starting points:
http://gate.ac.uk/
http://www.ontotext.com/kim
The GATE system in particular includes a lot of information and tutorials related to both POS tagging and ontology-based semantic tagging.
And yes, once your text has been semantically tagged, it is much easier to connect it to other pieces of text using the extra semantic metadata.
