SPOTlight package and spotlight_deconvolution function error - spatial

I am currently running the SPOTlight package and attempting the spotlight decomposition portion of the code on my reference scRNA-seq data. The code I am running is below:
#Spotlight decomposition
set.seed(123)
spotlight_ls <- spotlight_deconvolution(
  se_sc = E14_sc,
  counts_spatial = anterior@assays$RNA@counts,
  clust_vr = "subclass",                 # Variable in sc_seu containing the cell-type annotation
  cluster_markers = cluster_markers_all, # Dataframe with the marker genes
  cl_n = 100,                            # Number of cells per cell type to use
  hvg = 3000,                            # Number of HVGs to use
  ntop = NULL,                           # How many of the marker genes to use (by default all)
  transf = "uv",                         # Unit-variance scaling per cell and spot prior to factorization and NLS
  method = "nsNMF",                      # Factorization method
  min_cont = 0                           # Remove cells contributing to a spot below this threshold
)
When I run this code, I get the following error:
Error in spotlight_deconvolution(se_sc = E14_sc, counts_spatial = anterior@assays$RNA@counts, : could not find function "spotlight_deconvolution"

I ran into the same error. The function is just called SPOTlight() now and the parameter names have been updated, but I think the underlying function is the same.
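For reference, here is a minimal sketch of the renamed API, assuming SPOTlight >= 1.0. The argument names follow the current vignette; E14_sc, anterior, and cluster_markers_all are carried over from the question, and the mgs column names (avg_log2FC, cluster, gene) are assumptions that depend on how cluster_markers_all was built:
library(SPOTlight)
set.seed(123)
res <- SPOTlight(
  x = E14_sc,                      # reference scRNA-seq object
  y = anterior@assays$RNA@counts,  # spatial counts
  groups = E14_sc$subclass,        # cell-type annotation
  mgs = cluster_markers_all,       # marker-gene data frame
  weight_id = "avg_log2FC",        # column of mgs used to weight each marker (assumed name)
  group_id = "cluster",            # column of mgs holding the cell type (assumed name)
  gene_id = "gene"                 # column of mgs holding the gene name (assumed name)
)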


options for saving xarray dataset with to_netcdf

I would like to add units, long_name, and maybe a description to a variable while using the to_netcdf command. Let me know if you know how.
Here is my code that works:
import xarray as xr

filename = path + 'file.nc'
ds = xr.Dataset(
    {'sla': (('time_counter', 'x', 'y'), SLA)},
    coords={'time_counter': time_counter,
            'nav_lon': (('x', 'y'), lon),
            'nav_lat': (('x', 'y'), lat)})
ds.to_netcdf(filename, 'w')
Supplementary information, if you want to use this:
'sla' is the name I give the variable SLA when saving
SLA has 3 dimensions; I give them the names 'time_counter', 'x', and 'y'
I defined coordinates, one of which ('time_counter') is directly a dimension of SLA, but it is also possible to have a coordinate with multiple dimensions (e.g., 'nav_lon' and 'nav_lat' have 2 dimensions).
Here is the link that explains the function: http://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html
You can set the attributes of each variable before saving the Dataset to NetCDF, for example (after creating your ds):
ds['sla'].attrs = {'units': 'something'}
After the to_netcdf() step I get (part of the ncdump -h):
double sla(time_counter, x, y) ;
...
sla:units = "something" ;
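A slightly fuller sketch of the same idea: the long_name and description attribute names below are CF-style conventions rather than anything to_netcdf() requires, and the toy array stands in for your SLA data:
import numpy as np
import xarray as xr

sla = np.zeros((2, 3, 4))  # toy stand-in for SLA
ds = xr.Dataset({'sla': (('time_counter', 'x', 'y'), sla)})
ds['sla'].attrs = {
    'units': 'm',
    'long_name': 'sea level anomaly',
    'description': 'example variable with metadata',
}
ds.attrs['source'] = 'global attributes are set the same way'
ds.to_netcdf('file.nc', mode='w')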

Naive Bayes - no samples for class label 1

I am using Accord.NET. I have successfully implemented the two decision tree algorithms, ID3 and C4.5, and now I am trying to implement the Naive Bayes algorithm. While there is a lot of sample code on the site, most of it seems to be out of date or to have various issues.
The best sample code I have found on the site so far has been here:
http://accord-framework.net/docs/html/T_Accord_MachineLearning_Bayes_NaiveBayes_1.htm
However, when I try and run that code against my data I get:
There are no samples for class label 1. Please make sure that class
labels are contiguous and there is at least one training sample for
each label.
from line 228 of this file:
https://github.com/accord-net/framework/blob/master/Sources/Accord.MachineLearning/Tools.cs
when I call
learner.Learn(inputs, outputs) in my code.
I have already run into the null bugs that Accord has when implementing the other two decision trees, and my data has been sanitized against that issue.
Does any Accord.NET expert have an idea what would trigger this error?
An excerpt from my code:
var codebook = new Codification(fulldata, AllAttributeNames);
/*
 * Get the list of all possible combinations.
 * The software blows up if it encounters a value it has not seen before.
 */
var attributList = new List<IUnivariateFittableDistribution>();
foreach (var attr in DeciAttributeNames)
{
    // By default we'll use a standard static list of values for this column
    var cntLst = codebook[attr].NumberOfSymbols; // number of distinct symbols in the column
    // no decisions can be made off of the variable if it is a constant value
    if (cntLst > 1)
    {
        KeptAttributeNames.Add(attr);
        attributList.Add(new GeneralDiscreteDistribution(cntLst));
    }
}
var data = fulldata.Copy(); // this is a datatable
/*
* Translate our training data into integer symbols using our codebook
*/
DataTable symbols = codebook.Apply(data, AllAttributeNames);
double[][] inputs = symbols.ToJagged<double>(KeptAttributeNames.ToArray());
int[] outputs = symbols.ToArray<int>(OutAttributeName);
progBar.PerformStep();
/*
* Create a new instance of the learning algorithm
* and build the algorithm
*/
var learner = new NaiveBayesLearning<IUnivariateFittableDistribution>()
{
// Tell the learner how to initialize the distributions
Distribution = (classIndex, variableIndex) => attributList[variableIndex]
};
var alg = learner.Learn(inputs, outputs);
EDIT: After further experimentation, it seems this error only occurs when I am processing a certain number of rows. If I process 60 rows or fewer I am fine, and if I process 500 rows or more I am also fine, but in between that range I get this error. Depending on the amount of data I choose, the class label in the error message can change; I have seen it range from 0 to 2.
All the data is coming from the same SQL Server datasource; the only thing I am adjusting is the SELECT TOP ### portion of the query.
You will receive this error in multi-class scenarios when you have defined a class label for which there is no sample data. With a small data set, your random sampling may by chance exclude all observations with a given label, so the labels are no longer contiguous.
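A quick way to confirm this before calling Learn() is to check which labels in 0..max are absent from the outputs array. This is a minimal sketch against the question's variables, not an Accord.NET API:
using System.Collections.Generic;
using System.Linq;

static class LabelCheck
{
    // Returns the labels in 0..max(outputs) that have no training sample.
    public static List<int> MissingLabels(int[] outputs)
    {
        var present = new HashSet<int>(outputs);
        return Enumerable.Range(0, outputs.Max() + 1)
                         .Where(label => !present.Contains(label))
                         .ToList();
    }
}
If MissingLabels(outputs) returns anything, either pull more rows or re-encode the labels so they are contiguous.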

Proper indexing in OMNeT++

In OMNeT++, would indexing in the omnetpp.ini file like this be OK:
*.Member[0].numTcpApps = 2
*.Member[1..numberOfMembers].numTcpApps = 1
The parameter numberOfMembers has been specified in the .ned file as an ordinary integer parameter. It is initialized to some value, e.g. 10.
What happens if my numberOfMembers variable is set to only 1? In that case I should have only one member (Member[0]). What happens to the second entry of the .ini file then?
One cannot use the value of a NED parameter in omnetpp.ini. However, you may achieve your goal using wildcard patterns in omnetpp.ini.
Let's assume that a network is defined in .ned as:
network ExampleNetwork
{
parameters:
int numberOfMembers;
submodules:
Member[numberOfMembers] : SomeMemberType;
// ...
}
Then in omnetpp.ini one can control the network. For example, to set numTcpApps = 2 for Member[0] only, and numTcpApps = 1 for all other submodules one should write:
*.numberOfMembers = 10
*.Member[0].numTcpApps = 2
*.Member[*].numTcpApps = 1 # i.e. Member[1], Member[2], ..., Member[9]
Take care of the order of entries in omnetpp.ini, because:
The order of entries is very important with wildcards. When a key matches several wildcard patterns, the first matching occurrence is used.
As a consequence, the following order of entries:
*.numberOfMembers = 10
*.Member[*].numTcpApps = 1 # i.e. Member[0], Member[1], Member[2], ..., Member[9]
*.Member[0].numTcpApps = 2
will set numTcpApps = 1 for all submodules. The last line is not taken into account, because Member[0] has already been matched by the Member[*] entry.

xtable with LaTeX math symbols in the table

I'm trying to write this table with checkboxes using the xtable package. It would seem that the checkboxes that I've chosen are what is throwing the error. I'm just at a loss how to fix it.
library(xtable)
## I want to create a table with the names of some people in two columns
nStu = 10
## Create fake names
names = character(nStu)
for(i in 1:nStu){
  names[i] = paste(LETTERS[i], rep(letters[i], 5), sep = '', collapse = '')
}
## put check boxes behind each of the names
squares = rep('$ \\square $',nStu)
## Build the table
rosterTab = data.frame('Name'=names,'Mostly'=squares,'Sometimes'=squares, stringsAsFactors = FALSE)
## Now chop it in half and paste the halves together. (Yes, if nStu is odd, this will have to be fixed)
lTab = nStu%/%2
aTab = rosterTab[1:lTab, ]
bTab = rosterTab[(lTab+1):nStu, ]
outTab = cbind(aTab,bTab)
## Everything before this point runs fine.
outTab.tab = xtable(outTab,label=FALSE)
align(outTab.tab) = 'llcc||lcc'
print(outTab.tab, include.rownames=FALSE, sanitize.text.function = function(x){x})
The error message that I'm getting is:
Error in as.string(y) : Cannot coerce argument to a string
This error goes away if I use:
squares = rep('aaa',nStu)
Ideally, I want to get the names from a csv file (which I can do quite easily), and will use knitr to write this into a LaTeX document. (I want to do this for a bunch of input files, so automating this task seems useful to me; a minimal input sketch follows the list below.)
Here are some other ideas that I've considered:
A $\LaTeX$-only solution. Note that there is one other difficulty (other than the squares not being in the input file): I would need to do some text manipulation on the strings in the input file.
Replacing the \square symbol with some other object that looks like a checkbox, that (through knitr) I can send to $\LaTeX$.
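For the csv input step, a minimal sketch; the file name roster.csv and its Name column are hypothetical stand-ins for the real input files:
roster = read.csv('roster.csv', stringsAsFactors = FALSE)  # hypothetical file and column
names = roster$Name
nStu = length(names)
## then build rosterTab exactly as above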

Random selection of images from a folder

I have a folder that contains 400 images. What I want is to split this folder into two folders: train_images and test_images.
The train_images folder should contain 150 images selected randomly, and all these images must be different from each other. Then test_images should also contain 150 images selected randomly, all different from each other and from the images selected for train_images.
I began by writing code that selects a random set of images from a Faces folder and puts them in the train_images folder. I need your help to achieve the behavior described above.
clear all;
close all;
clc;
Train_images = 'train_faces';
mkdir(Train_images);
ImageFiles = dir(fullfile('Faces', '*.jpg')); % filter so the '.' and '..' entries from dir() are excluded; assumes .jpg images
totalNumberOfImages = length(ImageFiles);
scrambledList = randperm(totalNumberOfImages);
numberIWantToUse = 150;
loop_counter = 1;
for index = scrambledList(1:numberIWantToUse)
    baseFileName = ImageFiles(index).name;
    str = fullfile('Faces', baseFileName); % Better than STRCAT
    face = imread(str);
    imwrite(face, fullfile(Train_images, ['hello' num2str(index) '.jpg']));
    loop_counter = loop_counter + 1;
end
Any help will be very appreciated.
Your code looks good to me. For the test set, reuse the same scrambledList and select the next 150 elements; re-running scrambledList = randperm(totalNumberOfImages); would produce a new permutation that could overlap with the training images.
You can also directly re-initialize the loop:
for index = scrambledList(numberIWantToUse+1 : 2*numberIWantToUse)
... % same thing you wrote in your training loop
end
With this approach, your test sample will be completely disjoint from the training sample.
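Putting the two pieces together, a minimal end-to-end sketch under the question's assumptions (a Faces folder of .jpg images, 150 images per set):
ImageFiles = dir(fullfile('Faces', '*.jpg')); % assumes .jpg images
n = numel(ImageFiles);
scrambled = randperm(n);        % one permutation shared by both sets
trainIdx = scrambled(1:150);
testIdx = scrambled(151:300);   % disjoint from trainIdx by construction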
Supposing that you have the Bioinformatics Toolbox, you can use crossvalind with the parameter HoldOut.
Here is an example. train and test are logical arrays, so you can use find to get the actual indexes:
ImageFiles = dir('Faces');
ImageFilesIndexes = ones(1, length(ImageFiles)); % Use a numeric array instead of the struct array
proportion = 150/400; % Testing set
[train, test] = crossvalind('HoldOut', ImageFilesIndexes, proportion);
training_files = ImageFiles(train); % 250 files: it is better to use more data to train
testing_files = ImageFiles(test);   % 150 files
% Then do whatever you like with the files
Other possibilities are dividerand (Neural Network Toolbox) and cvpartition (Statistics Toolbox).
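For completeness, a minimal sketch of the cvpartition route (Statistics Toolbox), holding out 150 of the 400 files for testing:
c = cvpartition(400, 'HoldOut', 150/400); % 400 observations, 150 held out
trainIdx = find(training(c)); % indexes of the 250 training files
testIdx = find(test(c));      % indexes of the 150 test files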
