Creating learner in mlr3: Error in sprintf(msg, ...) : too few arguments

I want to create a learner in mlr3, using the distRforest package.
My code:
library(mlr3extralearners)
create_learner(
  pkg = ".",
  classname = 'distRforest',
  algorithm = 'regression tree',
  type = 'regr',
  key = 'distRforest',
  package = 'distRforest',
  caller = 'rpart',
  feature_types = c("logical", "integer", "numeric", "factor", "ordered"),
  predict_types = c('response'),
  properties = c("importance", "missings", "multiclass",
                 "selected_features", "twoclass", "weights"),
  references = FALSE,
  gh_name = 'CL'
)
This gives the following error: Error in sprintf(msg, ...) : too few arguments
In fact, replicating the code in the tutorial https://mlr3book.mlr-org.com/extending-learners.html throws the same error.
Any ideas? Thanks a lot.

Thanks for your interest in extending the mlr3 universe!
A couple of things: first, the example in the book works fine for me; second, your example cannot work because you are including classif properties ("multiclass" and "twoclass") for a regr learner. As I am unable to reproduce your error, it's hard for me to debug what's going wrong. It would be helpful if you could run the following:
reprex::reprex({
  create_learner(
    pkg = ".",
    classname = "Rpart",
    algorithm = "decision tree",
    type = "classif",
    key = "rpartddf",
    package = "rpart",
    caller = "rpart",
    feature_types = c("logical", "integer", "numeric", "factor", "ordered"),
    predict_types = c("response", "prob"),
    properties = c("importance", "missings", "multiclass", "selected_features", "twoclass", "weights"),
    references = TRUE,
    gh_name = "CL"
  )
}, si = TRUE)
If you're still getting an error and the output is too long to print here then head over to the GitHub and open an issue there.
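The point about classif properties on a regr learner can be checked before calling create_learner. The following is a minimal sketch, assuming the mlr3 package is installed and that mlr_reflections$learner_properties lists the properties mlr3 accepts per learner type. The variable names requested, allowed, invalid and regr_properties are illustrative only. It addresses only the property mismatch and is not claimed to resolve the sprintf error itself.

```r
library(mlr3)

# Properties passed in the question for a learner with type = "regr".
requested <- c("importance", "missings", "multiclass",
               "selected_features", "twoclass", "weights")

# Properties that mlr3 registers for regression learners.
allowed <- mlr_reflections$learner_properties$regr

# Anything listed here is not a regression property and should be dropped.
invalid <- setdiff(requested, allowed)
print(invalid)

# Keep only the properties that are valid for a regr learner.
regr_properties <- intersect(requested, allowed)
print(regr_properties)
```

The filtered vector regr_properties could then be passed as the properties argument in place of the original list.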

Rewriting ParamSet ids from mlr3::paradox()

Let's say I have the following ParamSet object:
my_ps = paradox::ps(
  minsplit = p_int(1, 64, logscale = TRUE),
  cp = p_dbl(1e-04, 1, logscale = TRUE))
Is it possible to rename minsplit to survTree.minsplit without changing anything else?
The reason is that I use some learners as part of a GraphLearner, so their parameter names changed. I would like some code that adds the learner$id in front of the parameter ids for later tuning, rather than rewriting the parameters from scratch with the new names.
I think I have a partial solution here. It is only partial, because it does not support the transformation.
Where it works:
library(paradox)
my_ps = paradox::ps(
  minsplit = p_int(1, 64),
  cp = p_dbl(1e-04, 1)
)
my_ps$set_id = "john"
my_psc = ParamSetCollection$new(list(my_ps))
print(my_psc)
#> <ParamSetCollection>
#>               id    class lower upper nlevels        default value
#> 1: john.minsplit ParamInt 1e+00    64      64 <NoDefault[3]>
#> 2:       john.cp ParamDbl 1e-04     1     Inf <NoDefault[3]>
Created on 2022-12-07 by the reprex package (v2.0.1)
Where it does not:
library(paradox)
my_ps = paradox::ps(
  minsplit = p_int(1, 64, logscale = TRUE),
  cp = p_dbl(1e-04, 1)
)
my_ps$set_id = "john"
my_psc = ParamSetCollection$new(list(my_ps))
#> Error in .__ParamSetCollection__initialize(self = self, private = private, : Building a collection out sets, where a ParamSet has a trafo is currently unsupported!
Created on 2022-12-07 by the reprex package (v2.0.1)
The underlying problem is that we have not yet worked out how to reconcile the parameter transformations of the individual ParamSets with a possible parameter transformation of the ParamSetCollection.
I fear that there is currently no neat solution for your problem.
Sorry, I cannot comment yet. This is not exactly the solution you are looking for, but I hope it fixes the problem you are having.
You can set the param_space in the learner before putting it in the graph, i.e. sticking with your original search space. When you then create the GraphLearner as usual, it will have the desired search space.
A concrete example:
library(mlr3verse)
learner = lrn("regr.rpart", cp = to_tune(0.1, 0.2))
glrn = as_learner(po("pca") %>>% po("learner", learner))
at = auto_tuner(
  "random_search",
  glrn,
  rsmp("holdout"),
  term_evals = 10
)
task = tsk("mtcars")
at$train(task)

bad argument #1 to 'for iterator' (table expected, got string)

I have data like this:
result = {
[1] = { ["identifier"] = MMK18495,["vehicles"] = {"vehN":"Caracara 4x4","vehM":"caracara2","totals":3},["id"] = 1,} ,
[2] = { ["identifier"] = MMK18495,["vehicles"] = {"vehN":"Sandking SWB","vehM":"sandking2","totals":3},["id"] = 2,} ,
[3] = { ["identifier"] = MMK18495,["vehicles"] = {"totals":5,"vehN":"Caracara 4x4","vehM":"caracara2"},["id"] = 3,} ,
}
I am trying to turn this data into a menu like this:
for i=1, #result, 1 do
    local ownedcars = result[i].vehicles
    print(dump(ownedcars))
    for _,v in pairs(ownedcars) do -- <- the error is here
        menu[#menu+1] = {
            header = " Model "..v.vehM.." Name "..v.vehN.." quantity"..v.totals,
            txt = "",
        }
    end
end
The output of ownedcars is:
{"vehN":"Caracara 4x4","vehM":"caracara2","totals":3}
But the marked line raises the error in the title.
As others have commented, the value of vehicles is not a Lua table. It is a JSON string, and pairs() only accepts a table. You will first need to decode the JSON string into a Lua table. Thankfully, others have already done this. I will refer you to this question, which has answers that solve this problem.
Once you've decoded the JSON string into a Lua table, read the fields directly from it (for example ownedcars.vehM), because each vehicles value holds a single record rather than a list of records.

XUnit/FsUnit framework throwing "Object reference not set to an instance of an object"

I have written this simple test case for my code
module CustomerTests
open Xunit
open FsUnit
open MyProject.Customer
open MyProject.Customer.Domain
module ``When upgrading customer`` =
    let customerVIP = {Id = 1; IsVip = true; Credit = 0.0M}
    let customerSTD = {Id = 2; IsVip = false; Credit = 100.0M}
    [<Fact>]
    let ``should give VIP customer more credit`` () =
        let expected = {customerVIP with Credit = customerVIP.Credit + 100.0M }
        let actual = upgradeCustomer customerVIP
        actual |> should equal expected
Very surprisingly this code fails with error
[xUnit.net 00:00:00.64] CustomerTests+When upgrading customer.should give VIP cstomer more credit [FAIL]
Failed CustomerTests+When upgrading customer.should give VIP cstomer more credit [3 ms]
Error Message:
System.NullReferenceException : Object reference not set to an instance of an object.
Stack Trace:
at CustomerTests.When upgrading customer.should give VIP cstomer more credit() in /Users/user/code/fsharp/CustomerProject/CustomerTests.fs:line 12
But line 12 is just a record being created, so it should not be possible for that line to throw "Object reference not set to an instance of an object". This is totally puzzling me.
In the dotnet fsi REPL I can execute all my methods, and there is no object reference problem in the functions that are called from my tests here.
As this SO answer explains, XUnit loads the test in a way that skips initialization of those values. One easy fix is to use a class instead of a module:
type ``When upgrading customer``() =
    let customerVIP = {Id = 1; IsVip = true; Credit = 0.0M}
    let customerSTD = {Id = 2; IsVip = false; Credit = 100.0M}
    // xUnit only discovers public methods, so the test is a member, not a let binding
    [<Fact>]
    member _.``should give VIP customer more credit`` () =
        let expected = {customerVIP with Credit = customerVIP.Credit + 100.0M }
        let actual = upgradeCustomer customerVIP
        actual |> should equal expected
That way, initialization of those values happens as it should.

Nvim-cmp is adding the same sources multiple times

I'm using nvim-cmp to get a contextual window that displays my LSP suggestions and my snippets, but when I open multiple buffers I run into an issue: the same source is added to nvim-cmp multiple times, causing the same result to be repeated in the popup.
For example, here is the result of :CmpStatus after a few minutes of work.
# ready source names
- vsnip
- buffer
- nvim_lsp:pylsp
- vsnip
- nvim_lsp:pylsp
- nvim_lsp:pylsp
Here is my nvim-cmp config:
cmp.setup({
    snippet = {
        expand = function(args)
            vim.fn["vsnip#anonymous"](args.body)
        end,
    },
    ...
    sources = {
        { name = 'vsnip' },
        { name = 'nvim_lua' },
        { name = 'nvim_lsp' },
        { name = 'buffer', keyword_length = 3 }
    },
})
Does anyone know how to address this issue? Is it a problem with my configuration?
In your cmp configuration, you can use the dup keyword for vim_item with the formatting option / format function. See help for complete-item for explanations (:help complete-item).
cmp.setup({
    formatting = {
        format = function(entry, vim_item)
            vim_item.menu = ({
                nvim_lsp = '[LSP]',
                vsnip = '[Snippet]',
                nvim_lua = '[Nvim Lua]',
                buffer = '[Buffer]',
            })[entry.source.name]
            vim_item.dup = ({
                vsnip = 0,
                nvim_lsp = 0,
                nvim_lua = 0,
                buffer = 0,
            })[entry.source.name] or 0
            return vim_item
        end
    }
})
You can see details in this feature request for nvim-cmp plugin.
I had the same problem and solved it after realizing that cmp was somehow installed twice.
Try removing, or better, first renaming the cmp packages in the plugged directory, for example:
cmp-nvim-lsp to cmp-nvim-lsp_not
cmp-nvim-buffer to cmp-nvim-buffer_not
etc.
This did the job for me.

How to load Seurat Object into WGCNA Tutorial Format

As far as I can find, there is only one tutorial about loading Seurat objects into WGCNA (https://ucdavis-bioinformatics-training.github.io/2019-single-cell-RNA-sequencing-Workshop-UCD_UCSF/scrnaseq_analysis/scRNA_Workshop-PART6.html). I am really new to programming so it's probably just my inexperience, but I am not sure how to load my Seurat object into a format that works with WGCNA's tutorials (https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/).
Here is what I have tried thus far:
This tries to replicate datExpr and datTraits from part I.1:
library(WGCNA)
library(Seurat)
#example Seurat object -----------------------------------------------
ERlist <- list(c("CPB1", "RP11-53O19.1", "TFF1", "MB", "ANKRD30B",
"LINC00173", "DSCAM-AS1", "IGHG1", "SERPINA5", "ESR1",
"ILRP2", "IGLC3", "CA12", "RP11-64B16.2", "SLC7A2",
"AFF3", "IGFBP4", "GSTM3", "ANKRD30A", "GSTT1", "GSTM1",
"AC026806.2", "C19ORF33", "STC2", "HSPB8", "RPL29P11",
"FBP1", "AGR3", "TCEAL1", "CYP4B1", "SYT1", "COX6C",
"MT1E", "SYTL2", "THSD4", "IFI6", "K1AA1467", "SLC39A6",
"ABCD3", "SERPINA3", "DEGS2", "ERLIN2", "HEBP1", "BCL2",
"TCEAL3", "PPT1", "SLC7A8", "RP11-96D1.10", "H4C8",
"PI15", "PLPP5", "PLAAT4", "GALNT6", "IL6ST", "MYC",
"BST2", "RP11-658F2.8", "MRPS30", "MAPT", "AMFR", "TCEAL4",
"MED13L", "ISG15", "NDUFC2", "TIMP3", "RP13-39P12.3", "PARD68"))
tnbclist <- list(c("FABP7", "TSPAN8", "CYP4Z1", "HOXA10", "CLDN1",
"TMSB15A", "C10ORF10", "TRPV6", "HOXA9", "ATP13A4",
"GLYATL2", "RP11-48O20.4", "DYRK3", "MUCL1", "ID4", "FGFR2",
"SHOX2", "Z83851.1", "CD82", "COL6A1", "KRT23", "GCHFR",
"PRICKLE1", "GCNT2", "KHDRBS3", "SIPA1L2", "LMO4", "TFAP2B",
"SLC43A3", "FURIN", "ELF5", "C1ORF116", "ADD3", "EFNA3",
"EFCAB4A", "LTF", "LRRC31", "ARL4C", "GPNMB", "VIM",
"SDR16C5", "RHOV", "PXDC1", "MALL", "YAP1", "A2ML1",
"RP1-257A7.5", "RP11-353N4.6", "ZBTB18", "CTD-2314B22.3", "GALNT3",
"BCL11A", "CXADR", "SSFA2", "ADM", "GUCY1A3", "GSTP1",
"ADCK3", "SLC25A37", "SFRP1", "PRNP", "DEGS1", "RP11-110G21.2",
"AL589743.1", "ATF3", "SIVA1", "TACSTD2", "HEBP2"))
genes = c(unlist(c(ERlist,tnbclist)))
mat = matrix(rnbinom(500*length(genes),mu=500,size=1),ncol=500)
rownames(mat) = genes
colnames(mat) = paste0("cell",1:500)
sobj = CreateSeuratObject(mat)
sobj = NormalizeData(sobj)
sobj$ClusterName = factor(sample(0:1,ncol(sobj),replace=TRUE))
sobj$Patient = paste0("Patient", 1:500)
sobj = AddModuleScore(object = sobj, features = tnbclist,
name = "TNBC_List",ctrl=5)
sobj = AddModuleScore(object = sobj, features = ERlist,
name = "ER_List",ctrl=5)
#WGCNA -----------------------------------------------------------------
sobjwgcna <- sobj
sobjwgcna <- FindVariableFeatures(sobjwgcna, selection.method = "vst", nfeatures = 2000,
verbose = FALSE, assay = "RNA")
options(stringsAsFactors = F)
sobjwgcnamat <- GetAssayData(sobjwgcna)
datExpr <- t(sobjwgcnamat)[,VariableFeatures(sobjwgcna)]
datTraits <- sobjwgcna@meta.data
datTraits = subset(datTraits, select = -c(nCount_RNA, nFeature_RNA))
I then copy-paste the code as written in the WGCNA I.2a tutorial (https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/FemaleLiver-02-networkConstr-auto.pdf), and that all works until I get to this line in the I.3 tutorial (https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/FemaleLiver-03-relateModsToExt.pdf):
MEList = moduleEigengenes(datExpr, colors = moduleColors)
Error in t.default(expr[, restrict1]) : argument is not a matrix
I tried converting both moduleColors and datExpr into a matrix with as.matrix(), but the error still persists.
Hopefully this makes sense, and thanks for reading!
Calling as.matrix(datExpr) right after datExpr <- t(sobjwgcnamat)[,VariableFeatures(sobjwgcna)] worked. I had been trying it right before MEList = moduleEigengenes(datExpr, colors = moduleColors),
and that didn't work. It seems simple, but I guess the order matters.
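A minimal sketch of the conversion step, assuming the Matrix package is installed (Seurat depends on it) and that GetAssayData() returns a sparse matrix, as it typically does. The toy object counts and its gene and cell names are illustrative stand-ins, not data from the question. The sketch only shows that the transposed sparse object is not a base R matrix until the result of as.matrix() is assigned back.

```r
library(Matrix)

# Toy stand-in for GetAssayData(): a sparse genes x cells matrix (illustrative values).
counts <- Matrix(c(0, 1, 2, 0, 0, 3), nrow = 2, sparse = TRUE,
                 dimnames = list(c("geneA", "geneB"),
                                 c("cell1", "cell2", "cell3")))

# Transposing keeps the sparse S4 class, which base R does not treat as a matrix.
datExpr_sparse <- t(counts)
print(is.matrix(datExpr_sparse))

# Convert and assign the result before any WGCNA call that uses datExpr.
datExpr <- as.matrix(datExpr_sparse)
print(is.matrix(datExpr))
print(dim(datExpr))
```

In the question's pipeline the equivalent would be to wrap the subsetting line itself, so that every later step sees a base matrix with cells in rows and genes in columns.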
