Problem with tuning & benchmarking "surv.svm" in mlr3

I get different error messages when I try to tune or benchmark "surv.svm".
For tuning I get the following error:
Error in kernelMatrix(Xtrain = sv, kernel_type = kernel_type, kernel_pars = kernel_pars, : additiv kernel can not be applied on constant column
For benchmarking I get the following error when poly_kernel is included:
Error in tcrossprod(K, Dc) : non-conformable arguments
When poly_kernel is removed, I get a different error message.
What is the problem and how can I solve it?
library(mlr3verse)
library(mlr3proba)          # survival tasks and the distrcompositor pipeline
library(mlr3extralearners)  # provides lrn("surv.svm")

task = tsk("actg")
learner = as_learner(ppl("distrcompositor",
  learner = lrn("surv.svm", type = "regression",
    kernel = to_tune(c("lin_kernel", "add_kernel", "rbf_kernel")),
    gamma.mu = to_tune(p_dbl(-3, 1, trafo = function(x) 10^x))),
  estimator = "kaplan", form = "ph"))
set.seed(82721)
inner_cv = rsmp("cv", folds = 2)
at_learner = AutoTuner$new(learner = learner,
  resampling = inner_cv,
  measure = msr("surv.cindex"),
  terminator = trm("evals", n_evals = 96),
  tuner = tnr("irace"))
at_learner$train(task)
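I have not run this exact setup, but the "constant column" message suggests that some feature becomes constant within a resampling fold, which the additive kernel cannot handle; a degenerate kernel matrix would also be consistent with the non-conformable tcrossprod error for poly_kernel. A sketch of a possible workaround, assuming mlr3pipelines is available: drop constant features and scale the rest before they reach the SVM.
library(mlr3pipelines)

# Remove features that are constant in the training data of each fold,
# then scale the remainder, before fitting the survival SVM.
svm_graph = as_learner(
  po("removeconstants") %>>%
  po("scale") %>>%
  lrn("surv.svm", type = "regression",
      kernel = to_tune(c("lin_kernel", "add_kernel", "rbf_kernel")),
      gamma.mu = to_tune(p_dbl(-3, 1, trafo = function(x) 10^x)))
)
learner = as_learner(ppl("distrcompositor", learner = svm_graph,
                         estimator = "kaplan", form = "ph"))
The rest of the AutoTuner code would stay unchanged; whether this also resolves the benchmark errors depends on whether they share the same root cause.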

Related

How can I get the feature names after several fit_transform's from sklearn?

I'm running a machine learning model that requires multiple transformations. I applied polynomial features, interactions, and feature selection using SelectKBest:
import category_encoders as ce
import lightgbm as lgb
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import PolynomialFeatures

transformer = ColumnTransformer(
    transformers=[("cat", ce.cat_boost.CatBoostEncoder(y_train), cat_features)]
)
X_train_transformed = transformer.fit_transform(X_train, y_train)
X_test_transformed = transformer.transform(X_test)

poly = PolynomialFeatures(2)
X_train_polynomial = poly.fit_transform(X_train_transformed)
X_test_polynomial = poly.transform(X_test_transformed)

interaction = PolynomialFeatures(2, interaction_only=True)
X_train_interaction = interaction.fit_transform(X_train_polynomial)
X_test_interaction = interaction.transform(X_test_polynomial)

feature_selection = SelectKBest(chi2, k=55)
train_features = feature_selection.fit_transform(X_train_interaction, y_train)
test_features = feature_selection.transform(X_test_interaction)

model = lgb.LGBMClassifier()
model.fit(train_features, y_train)
However, I want to get the feature names, and I have no idea how to get them.

Why do all the emission means (mu) of my HMM in pyro converge to the same number?

I'm trying to create a Gaussian HMM model in pyro to infer the parameters of a very simple Markov sequence. However, my model fails to infer the parameters, and something weird happened during the training process. Using the same sequence, hmmlearn successfully infers the true parameters.
Full code can be accessed here:
https://colab.research.google.com/drive/1u_4J-dg9Y1CDLwByJ6FL4oMWMFUVnVNd#scrollTo=ZJ4PzdTUBgJi
My model is modified from the example here:
https://github.com/pyro-ppl/pyro/blob/dev/examples/hmm.py
I manually created a first-order Markov sequence with 3 states; the true means are [-10, 0, 10] and the sigmas are [1, 2, 1].
Here is my model:
import torch
import pyro
import pyro.distributions as dist
from pyro import poutine
from pyro.ops.indexing import Vindex
from torch.distributions import constraints

def model(observations, num_state):
    assert not torch._C._get_tracing_state()
    with poutine.mask(mask=True):
        p_transition = pyro.sample("p_transition",
                                   dist.Dirichlet((1 / num_state) * torch.ones(num_state, num_state)).to_event(1))
        p_init = pyro.sample("p_init",
                             dist.Dirichlet((1 / num_state) * torch.ones(num_state)))
        p_mu = pyro.param(name="p_mu",
                          init_tensor=torch.randn(num_state),
                          constraint=constraints.real)
        p_tau = pyro.param(name="p_tau",
                           init_tensor=torch.ones(num_state),
                           constraint=constraints.positive)
        current_state = pyro.sample("x_0",
                                    dist.Categorical(p_init),
                                    infer={"enumerate": "parallel"})
        for t in pyro.markov(range(1, len(observations))):
            current_state = pyro.sample("x_{}".format(t),
                                        dist.Categorical(Vindex(p_transition)[current_state, :]),
                                        infer={"enumerate": "parallel"})
            pyro.sample("y_{}".format(t),
                        dist.Normal(Vindex(p_mu)[current_state], Vindex(p_tau)[current_state]),
                        obs=observations[t])
The training is set up as follows:
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoDelta
from pyro.optim import Adam

device = torch.device("cuda:0")
obs = torch.tensor(obs)
obs = obs.to(device)
torch.set_default_tensor_type("torch.cuda.FloatTensor")

guide = AutoDelta(poutine.block(model, expose_fn=lambda msg: msg["name"].startswith("p_")))
elbo = Trace_ELBO(max_plate_nesting=1)
optim = Adam({"lr": 0.001})
svi = SVI(model, guide, optim, elbo)
As training proceeds, the ELBO decreases steadily, but the three state means all converge to the same value.
I have tried wrapping the for loop of my model in pyro.plate and switching pyro.param to pyro.sample and vice versa, but nothing worked.
I have not tried this model, but I think it should be possible to solve the problem by modifying the model in the following way:
def model(observations, num_state):
    assert not torch._C._get_tracing_state()
    with poutine.mask(mask=True):
        p_transition = pyro.sample("p_transition",
                                   dist.Dirichlet((1 / num_state) * torch.ones(num_state, num_state)).to_event(1))
        p_init = pyro.sample("p_init",
                             dist.Dirichlet((1 / num_state) * torch.ones(num_state)))
        p_mu = pyro.sample("p_mu",
                           dist.Normal(torch.zeros(num_state), torch.ones(num_state)).to_event(1))
        # HalfCauchy needs a strictly positive scale; torch.zeros() would be invalid here
        p_tau = pyro.sample("p_tau",
                            dist.HalfCauchy(torch.ones(num_state)).to_event(1))
        current_state = pyro.sample("x_0",
                                    dist.Categorical(p_init),
                                    infer={"enumerate": "parallel"})
        for t in pyro.markov(range(1, len(observations))):
            current_state = pyro.sample("x_{}".format(t),
                                        dist.Categorical(Vindex(p_transition)[current_state, :]),
                                        infer={"enumerate": "parallel"})
            pyro.sample("y_{}".format(t),
                        dist.Normal(Vindex(p_mu)[current_state], Vindex(p_tau)[current_state]),
                        obs=observations[t])
The model would then be trained using MCMC:
from pyro.infer import MCMC, NUTS

# MCMC
hmc_kernel = NUTS(model, target_accept_prob=0.9, max_tree_depth=7)
mcmc = MCMC(hmc_kernel, num_samples=1000, warmup_steps=100, num_chains=1)
mcmc.run(obs, num_state=3)  # the model signature also takes num_state
The results could then be analysed using:
mcmc.get_samples()

Training a random forest (ranger) using caret with a custom F4 metric in R runs, but after the full run an error appears: undefined columns selected

library(MLmetrics)
library(caret)
library(doSNOW)
library(ranger)
The data is the "bank additional" full dataset (linked in the original question); the following code is then used to generate data1:
library(VIM)
library(dplyr)  # for mutate_at()
data1 <- hotdeck(data, variable = c('job','marital','education','default','housing','loan'), domain_var = "y", imp_var = FALSE)
# converting the categorical variables to factors, as they should be
library(magrittr)
data1 %<>%
  mutate_at(colnames(data1)[grepl('factor|logical|character', sapply(data1, class))], factor)
Now, splitting
library(caret)
# splitting data into train/test 70/30
set.seed(1234)
trainIndex <- createDataPartition(data1$y, p = 0.7, times = 1, list = FALSE)
train <- data1[trainIndex, -11]
test <- data1[-trainIndex, -11]
levels(train$y)
train$y = as.factor(train$y)
# train$y = factor(train$y, levels = c("yes","no"))
# train$y = relevel(train$y, ref = "yes")
Here I got an idea of how to create an F1 metric from Training Model in Caret Using F1 Metric, and using the F-beta score formula I created f1_val. Now I can't understand what lev, obs and pred indicate: in my train dataset only the column y shows up as data$obs, but there is no data$pred. Is the following error due to this, and how can I rectify it? (A sketch of a likely fix follows the error output below.)
f1 <- function(data, lev = NULL, model = NULL) {
  precision <- precision(data$obs, data$pred)
  recall <- sensitivity(data$obs, data$pred)
  f1_val <- (17 * precision * recall) / (16 * precision + recall)
  names(f1_val) <- c("F1")
  f1_val
}
tgrid <- expand.grid(
  .mtry = 1:5,
  .splitrule = "gini",
  .min.node.size = seq(1, 500, 75)
)
model_caret <- train(train$y ~ ., data = train,
                     method = "ranger",
                     trControl = trainControl(method = "cv",
                                              number = 2,
                                              verboseIter = TRUE,
                                              classProbs = TRUE,
                                              summaryFunction = f1),
                     tuneGrid = tgrid,
                     num.trees = 500,
                     importance = "impurity",
                     metric = "F1")
After running for 3-4 minutes, we get the following:
Aggregating results
Selecting tuning parameters
Fitting mtry = 5, splitrule = gini, min.node.size = 1 on full training set
but then this error:
Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) :
undefined columns selected
Also, when running model_caret we get:
Error: object 'model_caret' not found
Kindly help. Thanks in advance
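For context, caret's trainControl(summaryFunction = ...) passes each resample's results as a data frame that caret itself builds: data$obs holds the observed classes, data$pred the predicted classes, and lev the class levels, so pred is never a column of your training data. Separately, a likely cause of the "undefined columns selected" error is the formula train$y ~ .; the formula should name the column inside data. A sketch of the corrected call, with everything else as in the question (note also that caret::precision() and caret::sensitivity() expect predictions as the first argument, i.e. precision(data$pred, data$obs)):
# The formula must reference the column name "y" inside `data`; using the
# vector train$y makes caret look for columns it cannot find later on.
model_caret <- train(y ~ ., data = train,
                     method = "ranger",
                     trControl = trainControl(method = "cv",
                                              number = 2,
                                              verboseIter = TRUE,
                                              classProbs = TRUE,
                                              summaryFunction = f1),
                     tuneGrid = tgrid,
                     num.trees = 500,
                     importance = "impurity",
                     metric = "F1")
The follow-up "object 'model_caret' not found" would then simply be a consequence of train() having errored, so nothing was assigned.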

How to construct the FSelectInstanceSingleCrit in mlr3FSelect?

I copied the code from the mlr3book:
library(mlr3verse)
task = tsk("pima")
print(task)
learner = lrn("classif.rpart")
hout = rsmp("holdout")
measure = msr("classif.ce")
evals20 = trm("evals", n_evals = 20)
instance = FSelectInstanceSingleCrit$new(
  task = task,
  learner = learner,
  resampling = hout,
  measure = measure,
  terminator = evals20
)
But I always got this error:
Error in initialize(...) : unused argument (store_x_domain = FALSE)
Is there anything wrong with this code? Could someone give some suggestions? Thank you.
Update your packages with update.packages(); you are using an old version of mlr3fselect.
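A minimal sketch of the update, assuming the error is purely a version mismatch between mlr3fselect and the mlr3book code:
# FSelectInstanceSingleCrit's constructor arguments changed between
# mlr3fselect releases, so bring the package (or the whole stack) up to date.
update.packages(ask = FALSE)
install.packages("mlr3fselect")  # or reinstall just the affected package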

Error on tuning parameters using classif.svm in mlr3

I'm using mlr3 to build a machine learning workflow with an SVM classifier. When I try to tune the parameter:
library(mlr3)
library(mlr3learners)
library(paradox)
library(mlr3tuning)
task = tsk("pima")
learner = lrn("classif.svm")
learner$param_set
tune_ps = ParamSet$new(list(
  ParamDbl$new("cost", lower = 0.001, upper = 0.1)
))
tune_ps
hout = rsmp("holdout")
measure = msr("classif.ce")
evals20 = term("evals", n_evals = 20)
instance = TuningInstance$new(
  task = task,
  learner = learner,
  resampling = hout,
  measures = measure,
  param_set = tune_ps,
  terminator = evals20
)
tuner = tnr("grid_search", resolution = 10)
result<-tuner$tune(instance)
It outputs the error
Error in (function (xs) :
Assertion on 'xs' failed: Condition for 'cost' not ok: type equal C-classification; instead: type=
I can't figure out what is happening there.
We decided to solve this with a more descriptive error message, while still requiring parameters with dependencies to be set explicitly in the ParamSet rather than falling back to the ParamSet defaults.
See https://github.com/mlr-org/paradox/pull/262 and related issues/PRs for more information.
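In practice that means declaring the parent parameter yourself. A minimal sketch for the code in the question: set type explicitly on the learner so the dependency of cost on type = "C-classification" is satisfied before tuning:
# "cost" only applies when type = "C-classification" in e1071's SVM, so set
# the parent parameter explicitly instead of relying on the default.
learner = lrn("classif.svm", type = "C-classification")
tune_ps = ParamSet$new(list(
  ParamDbl$new("cost", lower = 0.001, upper = 0.1)
))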
