I am trying normalize the input of my CoreML model like below, it kind of does something to the array but its quite different then what SKLearn does(I give same input and watch output in these environments). So appereantly I do something wrong.
My Model is trained with Keras and SKlearn and it must do the same normalization as I did using SKLearn Normalizer, which is the default L2 normalizer. What I am doing below apperantly is not equalivant of sklearn, any ideas?
vDSP_normalizeD(vec, 1, &normalizedVec, 1, &mean, &std, vDSP_Length(count))
let (normalizedXVec, _, _) = normalize(vec: doubleArray)
Then here I convert normalizedXVec to MLMultiArray and use as input to my predictor
Note: I also tried to convert the normalizer from sklearn using coreml tools but I got errors as seen here:
vDSP_normalizeD uses the mean and standard deviation. That is not the same as L2.
The L2 normalization first computes the L2-norm of the vector, which is the same as sqrt(v[0]*v[0] + v[1]*v[1] + ... + v[n]*v[n]) and then it divides each element of the vector by that number.
Related
My LightGBM regressor model returns negative values.
For XGBoost there is objective='count:poisson' hyperparameter in order to prevent returning negative predicitons.
Is there any chance to do this ?
Github issue => https://github.com/microsoft/LightGBM/issues/5629
LightGBM also supports poisson regression. For example, consider the following Python code.
import lightgbm as lgb
import numpy as np
from matplotlib import pyplot
# random Poisson-distributed target and one informative feature
y = np.random.poisson(lam=15.0, size=1_000)
X = y + np.random.normal(loc=10.0, scale=2.0, size=(y.shape[0], ))
X = X.reshape(-1, 1)
# fit a Poisson regression model
reg = lgb.LGBMRegressor(
objective="poisson",
n_estimators=150,
min_data=1
)
reg.fit(X, y)
# get predictions
preds = reg.predict(X)
print("summary of predicted values")
print(f" * min: {round(np.min(preds), 3)}")
print(f" * max: {round(np.max(preds), 3)}")
# compare predicted distribution to the empirical one
bins = np.linspace(0, 30, 50)
pyplot.hist(y, bins, alpha=0.5, label='actual')
pyplot.hist(preds, bins, alpha=0.5, label='predicted')
pyplot.legend(loc='upper right')
pyplot.show()
This example uses Python 3.10 and lightgbm==3.3.3.
However... I don't recommend using Poisson regression just to achieve "no negative predictions". The Poisson loss function is intended to be used for cases where you believe your target is Poisson-distributed, e.g. it looks like counts of events observed over some regular interval like time or space.
Other options you might consider to try to achieve the behavior "never predict a negative number from LightGBM regression":
write a custom objective function in one of the interfaces that support it, like the R or Python package
post-process LightGBM's predictions, recoding negative values to 0
pre-process the target variable such that there are no negative values (e.g. dropping such observations, re-scaling, taking the absolute value)
LightGBM also facilitates an objective parameter which can be set to 'poisson'. Follow this link for more information.
An example for LGBMRegressor (scikit-learn API):
from lightgbm import LGBMRegressor
regressor = LGBMRegressor(objective='poisson')
I have studied some related questions regarding Naive Bayes, Here are the links. link1, link2,link3 I am using TF-IDF for feature selection and Naive Bayes for classification. After fitting the model it gave the prediction successfully. and here is the output
accuracy = train_model(model, xtrain, train_y, xtest)
print("NB, CharLevel Vectors: ", accuracy)
NB, accuracy: 0.5152523571824736
I don't understand the reason why Naive Bayes did not give any error in the training and testing process
from sklearn.preprocessing import PowerTransformer
params_NB = {'alpha':[1.0], 'class_prior':[None], 'fit_prior':[True]}
gs_NB = GridSearchCV(estimator=model,
param_grid=params_NB,
cv=cv_method,
verbose=1,
scoring='accuracy')
Data_transformed = PowerTransformer().fit_transform(xtest.toarray())
gs_NB.fit(Data_transformed, test_y);
It gave this error
Negative values in data passed to MultinomialNB (input X)
TL;DR: PowerTransformer, which you seem to apply only in the GridSearchCV case, produces negative data, which makes MultinomialNB to expectedly fail, es explained in detail below; if your initial xtrain and ytrain are indeed TF-IDF features, and you do not transform them similarly with PowerTransformer (you don't show something like that), the fact that they work OK is also unsurprising and expected.
Although not terribly clear from the documentation:
The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.
reading closely you realize that it implies that all the features should be positive.
This has a statistical basis indeed; from the Cross Validated thread Naive Bayes questions: continus data, negative data, and MultinomialNB in scikit-learn:
MultinomialNB assumes that features have multinomial distribution which is a generalization of the binomial distribution. Neither binomial nor multinomial distributions can contain negative values.
See also the (open) Github issue MultinomialNB fails when features have negative values (it is for a different library, not scikit-learn, but the underlying mathematical rationale is the same).
It is not actually difficult to demonstrate this; using the example available in the documentation:
import numpy as np
rng = np.random.RandomState(1)
X = rng.randint(5, size=(6, 100)) # random integer data
y = np.array([1, 2, 3, 4, 5, 6])
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB()
clf.fit(X, y) # works OK
# inspect X
X # only 0's and positive integers
Now, changing a single element of X to a negative number and trying to fit again:
X[1][0] = -1
clf.fit(X, y)
gives indeed:
ValueError: Negative values in data passed to MultinomialNB (input X)
What can you do? As the Github thread linked above suggests:
Either use MinMaxScaler(), which will bring all the features to [0, 1]
Or use GaussianNB instead, which does not suffer from this limitation
I'm a beginner and making a linear regression model, when I make predictions on the basis of test sets, it works fine. But when I try to predict something for a specific value. It gives an error. The tutorial I'm watching, they don't have any errors.
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
# Fitting Linear Regression to the dataset
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)
# Visualising the Linear Regression results
plt.scatter(X, y, color = 'red')
plt.plot(X, lin_reg.predict(X), color = 'blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
# Predicting a new result with Linear Regression
lin_reg.predict(6.5)
ValueError: Expected 2D array, got scalar array instead:
array=6.5.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
According to the Scikit-learn documentation, the input array should have shape (n_samples, n_features). As such, if you want a single example with a single value, you should expect the shape of your input to be (1,1).
This can be done by doing:
import numpy as np
test_X = np.array(6.5).reshape(-1, 1)
lin_reg.predict(test_X)
You can check the shape by doing:
test_X.shape
The reason for this is because the input can have many samples (i.e. you want to predict for multiple data points at once), or/and each sample can have many features.
Note: Numpy is a Python library to support large arrays and matrices. When scikit-learn is installed, Numpy should be installed as well.
I'm pretty sure this is a silly question but I can't find it anywhere else so I'm going to ask it here.
I'm doing semantic image segmentation using a cnn (unet) in keras with 7 labels. So my label for each image is (7,n_rows,n_cols) using the theano backend. So across the 7 layers for each pixel, it's one-hot encoded. In this case, is the correct error function to use categorical cross-entropy? It seems that way to me but the network seems to learn better with binary cross-entropy loss. Can someone shed some light on why that would be and what the principled objective is?
Binary cross-entropy loss should be used with sigmod activation in the last layer and it severely penalizes opposite predictions. It does not take into account that the output is a one-hot coded and the sum of the predictions should be 1. But as mis-predictions are severely penalizing the model somewhat learns to classify properly.
Now to enforce the prior of one-hot code is to use softmax activation with categorical cross-entropy. This is what you should use.
Now the problem is using the softmax in your case as Keras don't support softmax on each pixel.
The easiest way to go about it is permute the dimensions to (n_rows,n_cols,7) using Permute layer and then reshape it to (n_rows*n_cols,7) using Reshape layer. Then you can added the softmax activation layer and use crossentopy loss. The data should also be reshaped accordingly.
The other way of doing so will be to implement depth-softmax :
def depth_softmax(matrix):
sigmoid = lambda x: 1 / (1 + K.exp(-x))
sigmoided_matrix = sigmoid(matrix)
softmax_matrix = sigmoided_matrix / K.sum(sigmoided_matrix, axis=0)
return softmax_matrix
and use it as a lambda layer:
model.add(Deconvolution2D(7, 1, 1, border_mode='same', output_shape=(7,n_rows,n_cols)))
model.add(Permute(2,3,1))
model.add(BatchNormalization())
model.add(Lambda(depth_softmax))
If tf image_dim_ordering is used then you can do way with the Permute layers.
For more reference check here.
I tested the solution of #indraforyou and think that the proposed method has some mistakes. As the commentsection does not allow for proper code segments, here is what I think would be the fixed version:
def depth_softmax(matrix):
from keras import backend as K
exp_matrix = K.exp(matrix)
softmax_matrix = exp_matrix / K.expand_dims(K.sum(exp_matrix, axis=-1), axis=-1)
return softmax_matrix
This method will expect the ordering of the matrix to be (height, width, channels).
i wanted to code the linear kernel regression in sklearn so i made this code :
model = LinearRegression()
weights = rbf_kernel(X_train,X_test)
for i in range(weights.shape[1]):
model.fit(X_train,y_train,weights[:,i])
model.predict(X_test[i])
then i found that there is KernelRidge in sklearn :
model = KernelRidge(kernel='rbf')
model.fit(X_train,y_train)
pred = model.predict(X_train)
my question is:
1-what is the difference between these 2 codes?
2-in model.fit() that come after KernelRidge(), i found in the documentation that i can add a third argument "weight" to fit() function, i would i do that if i already applied a kernel function to the model?
What is the difference between these two code snippets?
Basically, they have nothing in common. Your first code snippet implements linear regression, with arbitrary set weights of samples. (How did you even come up with calling rbf_kernel this way?) This is still just a linear model, nothing more. You simply assigned (a bit randomly) which samples are important and then looped over features (?). This makes no sense at all. In general: what you have done with rbf_kernel is simply wrong; this is completely not how it is supposed to be used (and why it gave you errors when you tried to pass it to the fit method and you ended up doing a loop and passing each column separately).
Example of fitting such a model to data which is a cosine (thus 0 in mean):
I found in the documentation for the model.fit() function that comes after KernelRidge() that I can add a third argument, weight. Would I do that if I had already applied a kernel function to the model?
This is actual kernel method, kernel is not samples weighting. (One might use kernel function to assign weights, but this is not the meaning of kernel in "linear kernel regression" or in general "kernel methods".) Kernel is a method of introducing nonlinearity to the classifier, which comes from the fact that many methods (including linear regression) can be expressed as dot products between vectors, which can be substituted by kernel function leading to solving the problem in different space (Reproducing Hilbert Kernel Space), which might have very high complexity (like the infinite dimensional space of continuous functions induced by the RBF kernel).
Example of fitting to the same data as above:
from sklearn.linear_model import LinearRegression
from sklearn.kernel_ridge import KernelRidge
import numpy as np
from matplotlib import pyplot as plt
X = np.linspace(-10, 10, 100).reshape(100, 1)
y = np.cos(X)
for model in [LinearRegression(), KernelRidge(kernel='rbf')]:
model.fit(X, y)
p = model.predict(X)
plt.figure()
plt.title(model.__class__.__name__)
plt.scatter(X[:, 0], y)
plt.plot(X, p)
plt.show()