Can we predict y when y value is not numeric? [closed] - machine-learning

Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 7 months ago.
Improve this question
I am using support vector regressor. I want to predict personality as shown in screenshot! Is it possible to predict when y is in string format? I used onehot encoder but its not working.

This is not a regression task, but classification. 'not working' is not very informative, however normally you'd just map classes to integers. Either sklearn.preprocessing.LabelEncoder, sklearn.preprocessing.label_binarize().argmax(axis=1), pandas.factorize() or manual mapping should get the job done.
Worth noting support vector machines don't handle multiclass problems natively, so you may encounter troubles depending on the exact model you use. At least the latest sklearn versions should handle it automatically when using models like sklearn.svm.LinearSVC, building N binary classifiers under the hood.
I'd also recommend getting acquainted with a more elegant way of ensembling SVMs for multiclass problems, using sklearn.multiclass.OutputCodeClassifier().

Related

what's the meaning of ^ in this function [closed]

Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 2 months ago.
Improve this question
Supervised learning is divided into two processes: learning and prediction, which are completed by learning system and prediction system. In the learning process, the learning system uses the given training data set to obtain a model through learning or training, which is expressed as conditional probability distribution or decision function
I just want to know why to add " ^ " on the top of "f(x)" or "P(Y|X)"
In statistics, that symbol is used to denote an "estimator" of a random variable.

Why are linear layers used in Binary Classification with Deep Learning? [closed]

Closed. This question is not about programming or software development. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 4 months ago.
Improve this question
In many examples of Binary Classification with Deep learning
Why are linear layers used? I've been trying to look around the internet for information on the reason for the use of linear layers
e.g.
https://github.com/StatsGary/PyTorch_Tutorials/blob/main/01_MLP_Thyroid_Classifier/PyTorch_Binary_From_Scratch.py
https://hutsons-hacks.info/building-a-pytorch-binary-classification-multi-layer-perceptron-from-the-ground-up
Linear layer is just another (a bit mathematically incorrect) name of a fully connected layer, the most standard, classic, and in some sense - powerful building block of neural networks. Networks built purely from fully connected layers are universal approximators, and thus a good starting point for any sort of investigation.

ML or rule based [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
I already have 85 accuracy on my sklearn text classifier. What are the advantages and disadvantages of making a rule based system? Can save doing double the work? Maybe you can provide me with sources and evidence for each side, so that I can make the decision baed on my cirucumstances. Again, I want to know when ruls-based approach is favorable versus when a ML based approach is favorable? Thanks!
Here is an idea:
Instead of going one way or another, you can set up a hybrid model. Look at typical errors your machine learning classifier makes, and see if you can come up with a set of rules that capture those errors. Then run these rules on your input, and if they applied, finish there; if not, pass the input on to the classifier.
In the past I did this with a probabilistic part-of-speech tagger. It's difficult to tune a probabilistic model, but it's easy to add a few pre- or post-processing rules to capture some consistent errors.
https://www.linkedin.com/feed/update/urn:li:activity:6674229787218776064?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A6674229787218776064%2C6674239716663156736%29
Yoel Krupnik (CTO & co-founder | smrt - AI For Accounting) writes:
I think it really depends on the specific problem. Some problems can be completely solved with rule based logic, some require machine learning (often in combination with rule based logic before or after).
Advantages of the rule based are that it doesn't require labeled training data, might quickly provide decent results used as a benchmark and helps you better understand the problem for future labeling / text manipulations required by the ML algorithm.

Prime numbers identifier with logistic regression [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
is it possible to use logistic regression to identify prime numbers?
i´m trying to project a system with supervised logistic regression with a predefined database numbers and it´s classification (1 = Prime, 0 = Not Prime), using this data i want the computer to use this type of alghorythm to identify other numbers that aren´t classified on DB,
is it possible, or i´m trying to do something impossible?
Given the right network configuration and enough time, I don't know why it would be impossible.
It seems others have had success with different models and you might get a better idea from them:
Early success on prime number testing via artificial networks is presented in A Compositional Neural-network Solution to Prime-number Testing, László Egri, Thomas R. Shultz, 2006. The knowledge-based cascade-correlation (KBCC) network approach showed the most promise, although the practicality of this approach is eclipsed by other prime detection algorithms that usually begin by checking the least significant bit, immediately reducing the search by half, and then searching based other theorems and heuristics up to 𝑓𝑙𝑜𝑜𝑟(𝑥‾‾√). However the work was continued with Knowledge Based Learning with KBCC, Shultz et. al. 2006.

How to do data exploration before choosing any Machine Learning algorithms [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
Any tools could help recognize the data distribution pattern, and then make the decision to choose ML algorithms?
Firstly, you have to understand Machine Learning as a field, and have some understanding of its sub fields. If you don't intuitively understand your tools, you won't be able to identify when to use them.
The idea you're talking about is called exploratory data analysis, and it can be very approachable if you think about it the right way. Think about it in terms of the scientific method:
First, look over the data, and any documentation about it.
Then, come to some hypotheses about the patterns that might exist.
Based on your understanding of ML, brainstorm some approaches that might give some insight into your hypotheses. For example, if you see that your proposed dependent value can have several distinct values, you have a classification problem, and based on your input data, you should choose an appropriate approach.
The tools that you might find useful are plentiful, but a good start could be the programming language R, or Python. Both are very strong data science tools. R has a greater learning curve, but is built with data science in mind. Python, on the other hand, is very easy to pick up, but you have more choices to make with regards to ML and data science libraries. With Python, look into Pandas for CSV and data manipulation, and Tensorflow, Theano or Scikit-Learn for data analysis and ML.
Hope this helps!

Resources