Getting error "no method or default for coercing "patchwork" to "dgCMatrix"" in scRNA-seq analysis with Seurat, at the normalization step

I have a scRNA-seq dataset with 10 healthy controls and 17 patients, and I am doing a comparative analysis. I did the following:
Created 10 Seurat objects for the 10 healthy controls and merged them into one (healthy)
Created 17 Seurat objects for the 17 patients and merged them into one (patients)
Created a list of the two objects: data <- list(healthy, patients)
Normalized the data:
data <- lapply(data, function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})
I am getting the following error:
Error in as(object = data, Class = "dgCMatrix") : no method or default for coercing “patchwork” to “dgCMatrix”
Please help

After some trial and error, I was able to reproduce your error by running this line of code before your lapply function:
data <- list(p1 + p2, p2)
where p1 and p2 are ggplot objects (p1 + p2 produces a patchwork object).
It looks to me like the elements of your data list are not Seurat objects.
You should check for mistakes in the code you used to generate your list of Seurat objects.
I hope this helps :)
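A quick way to verify this before calling lapply() is to inspect the class of each list element. A minimal base-R sketch (the objects below are stand-ins built only to illustrate the check; your real healthy and patients come from the merge steps):

```r
# Toy stand-ins: the real 'healthy' and 'patients' are merged Seurat objects;
# plain classed lists are used here only to illustrate the check
healthy  <- structure(list(), class = "Seurat")
patients <- structure(list(), class = "Seurat")
data     <- list(healthy, patients)

# Every element should report "Seurat"; seeing "patchwork" (or "gg")
# here would mean a plot object ended up in the list by mistake
classes <- vapply(data, function(x) class(x)[1], character(1))
classes
```

If any element does not report "Seurat", retrace the code that built that element of the list.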

Related

Time series in R: How to convert raw data from int type to time type

For subject 1 in the training data, I am trying to plot nine time series corresponding to nine different features. The data is supposed to be a time series, but R is not reading it as such. The first two columns, as you can see, are not time, but the rest should be. How do I do this in R or R Markdown (I think it should be the same)?
I tried plotting it:
ggplot(train_ds, aes(x = Activity, y = TimeBodyAccelerometer-mean-X) +
  theme_minimal() +
  geom_point()
)
but I get this error:
Error in ggplot():
! `mapping` should be created with `aes()`.
✖ You've supplied a object
Backtrace:
  ggplot2::ggplot(...)
  ggplot2:::ggplot.default(...)
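Two things stand out in the snippet: the closing parenthesis of aes() is placed after geom_point(), so theme_minimal() + geom_point() ends up inside the mapping argument (hence "mapping should be created with aes()"), and the hyphenated column name needs backticks, since otherwise R parses it as subtraction. A sketch with both fixed, using a made-up two-row stand-in for train_ds:

```r
library(ggplot2)

# Hypothetical stand-in for train_ds; the real data has many more rows
train_ds <- data.frame(Activity = c("WALKING", "SITTING"))
train_ds$`TimeBodyAccelerometer-mean-X` <- c(0.27, 0.28)

# Close aes() before adding layers, and backtick the hyphenated name
p <- ggplot(train_ds, aes(x = Activity, y = `TimeBodyAccelerometer-mean-X`)) +
  theme_minimal() +
  geom_point()
p
```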

Roogle Vision Package

I have a dataset (photos) with two columns (one for the image IDs and one with the image URLs).
After I plugged in my Google Cloud Platform credentials, I ran the following code to generate keywords:
require(RoogleVision)

# add extra columns for 10 x 3 rows of data (keyword, probability score, and topicality score)
photos[, 3:32] <- NA

## Loop ##
for (i in 1:length(photos$url)) {
  te <- getGoogleVisionResponse(photos$url[i], feature = "LABEL_DETECTION", numResults = 10)
  # if not successful, return an NA matrix
  if (length(te) == 1) { te <- matrix(NA, 10, 4) }
  if (is.null(te))     { te <- matrix(NA, 10, 4) }
  te <- te[, 2:4]
  # if successful but no. of keywords < 10, put NAs in the remaining rows
  if (length(te[, 1]) < 10) {
    te[(length(te[, 1]) + 1):10, ] <- NA
  }
  # append all data
  photos[i, 3:12]  <- te[, 1]  # keywords
  photos[i, 13:22] <- te[, 2]  # probability scores
  photos[i, 23:32] <- te[, 3]  # topicality scores
  cat("<row", i, "/", length(photos[, 1]), "> ")
}
I got the following error:
Error: Assigned data `te[, 1]` must be compatible with row subscript `i`.
x 1 row must be assigned.
x Assigned data has 10 rows.
ℹ Row updates require a list value. Do you need `list()` or `as.list()`?
Run `rlang::last_error()` to see where the error occurred.
Any help will be very much appreciated!
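The wording of the error ("Row updates require a list value") is tibble-specific: a tibble refuses to spread a 10-element vector across 10 columns of a single row, while a base data.frame accepts it. If photos was read in with readr/readxl/dplyr it is a tibble, and one workaround (an assumption, since the question doesn't show how photos was created) is as.data.frame(photos) before the loop. A dependency-free sketch of the assignment pattern on a base data.frame:

```r
# Toy stand-in for photos as a base data.frame (as.data.frame(photos)
# would be the conversion if it is currently a tibble)
photos <- data.frame(id = 1:2, url = c("u1", "u2"))
photos[, 3:12] <- NA

# One fake API result: 10 keywords in column 1 (plus two filler columns)
te <- matrix(paste0("kw", 1:30), nrow = 10, ncol = 3)

# Spreading a 10-element vector across columns 3:12 of row 1 works on a
# base data.frame, but raises the reported error on a tibble
photos[1, 3:12] <- te[, 1]
photos[1, 3:12]
```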

Error in predict.NaiveBayes: "Not all variable names used in object found in newdata" (although no variables are missing)

I'm still learning to use the caret package through "The caret Package" by Max Kuhn, and I got stuck in section 16.2, Partial Least Squares Discriminant Analysis, while trying to predict with the plsBayesFit model via predict(plsBayesFit, head(testing), type = "prob"), as shown in the book.
The data used is data(Sonar) from the mlbench package, with the data being split as:
inTrain <- createDataPartition(Sonar$Class, p = 2/3, list = FALSE)
sonarTrain <- Sonar[ inTrain, -ncol(Sonar)]
sonarTest <- Sonar[-inTrain, -ncol(Sonar)]
trainClass <- Sonar[ inTrain, "Class"]
testClass <- Sonar[-inTrain, "Class"]
and then preprocessed as follows:
centerScale <- preProcess(sonarTrain)
centerScale
training <- predict(centerScale, sonarTrain)
testing <- predict(centerScale, sonarTest)
After this the model is trained using plsBayesFit <- plsda(training, trainClass, ncomp = 20, probMethod = "Bayes"), and then predictions are made with predict(plsBayesFit, head(testing), type = "prob").
When I'm trying to do this I get the following error:
Error in predict.NaiveBayes(object$probModel[[ncomp[i]]], as.data.frame(tmpPred[, : Not all variable names used in object found in newdata
I've checked both the training and testing sets for missing variables, but there aren't any. I've also tried predicting with version 2.7.1 of the pls package, which was used to render the book at the time, but that gives the same error. What's happening?
I've tried to replicate your problem using different models, as I have encountered this error as well, but I failed; caret also seems to behave differently now from when I used it.
In any case, I stumbled upon a related GitHub issue, and it seems there is a specific problem with the klaR package. So my guess is that this is simply a bug, and nothing that can be readily fixed here!
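If anyone needs a way around it in the meantime: plsda()'s default probMethod = "softmax" avoids the klaR NaiveBayes code path entirely, so the prediction step goes through. Note this is a workaround, not a fix, since it yields softmax rather than Bayes class probabilities. A sketch following the book's setup:

```r
library(caret)
library(mlbench)
data(Sonar)

set.seed(1)
inTrain    <- createDataPartition(Sonar$Class, p = 2/3, list = FALSE)
sonarTrain <- Sonar[ inTrain, -ncol(Sonar)]
sonarTest  <- Sonar[-inTrain, -ncol(Sonar)]
trainClass <- Sonar[ inTrain, "Class"]

centerScale <- preProcess(sonarTrain)
training    <- predict(centerScale, sonarTrain)
testing     <- predict(centerScale, sonarTest)

# probMethod = "softmax" (the default) sidesteps predict.NaiveBayes
plsFit <- plsda(training, trainClass, ncomp = 20, probMethod = "softmax")
probs  <- predict(plsFit, head(testing), type = "prob")
```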

How do you convert a GPX file directly into a SpatVector of lines while preserving attributes?

I'm trying to teach myself coding skills for spatial data analysis. I've been using Robert Hijmans' document, "Spatial Data in R," and so far, it's been great. To test my skills, I'm messing around with a GPX file I got from my smartwatch during a run, but I'm having issues getting my data into a SpatVector of lines (or a line, more specifically). I haven't been able to find anything online on this topic.
As you can see below with a data sample, the SpatVector "run" has point geometries even though "lines" was specified. From Hijmans' example of SpatVectors with lines, I gathered that adding columns with "id" and "part" both equal to 1 enables the data to be converted to a SpatVector with line geometries. Accordingly, in the SpatVector "run2," the geometry is lines.
My questions are: 1) Is adding the "id" and "part" columns necessary? 2) What do they actually do, i.e. why are these columns necessary? 3) Is there a way to go directly from the original data to a SpatVector of lines? In the process I used to get "run2," I lost all the attributes from the original data, and I don't want to lose them.
Thanks!
library(plotKML)
library(terra)
library(sf)
library(lubridate)
library(XML)
library(raster)
#reproducible example
GPX <- structure(list(lon = c(-83.9626053348184, -83.9625438954681,
-83.962496034801, -83.9624336734414, -83.9623791072518, -83.9622404705733,
-83.9621777739376, -83.9620685577393, -83.9620059449226, -83.9619112294167,
-83.9618398994207, -83.9617654681206, -83.9617583435029, -83.9617464412004,
-83.9617786277086, -83.9617909491062, -83.9618581719697), lat = c(42.4169608857483,
42.416949570179, 42.4169420264661, 42.4169377516955, 42.4169291183352,
42.4169017933309, 42.4168863706291, 42.4168564472347, 42.4168310500681,
42.4167814292014, 42.4167292937636, 42.4166279565543, 42.4166054092348,
42.4164886493236, 42.4163396190852, 42.4162954464555, 42.4161833804101
), ele = c("267.600006103515625", "268.20001220703125", "268.79998779296875",
"268.600006103515625", "268.600006103515625", "268.399993896484375",
"268.600006103515625", "268.79998779296875", "268.79998779296875",
"269", "269", "269.20001220703125", "269.20001220703125", "269.20001220703125",
"268.79998779296875", "268.79998779296875", "269"), time = c("2020-10-25T11:30:32.000Z",
"2020-10-25T11:30:34.000Z", "2020-10-25T11:30:36.000Z", "2020-10-25T11:30:38.000Z",
"2020-10-25T11:30:40.000Z", "2020-10-25T11:30:45.000Z", "2020-10-25T11:30:47.000Z",
"2020-10-25T11:30:51.000Z", "2020-10-25T11:30:53.000Z", "2020-10-25T11:30:57.000Z",
"2020-10-25T11:31:00.000Z", "2020-10-25T11:31:05.000Z", "2020-10-25T11:31:06.000Z",
"2020-10-25T11:31:12.000Z", "2020-10-25T11:31:19.000Z", "2020-10-25T11:31:21.000Z",
"2020-10-25T11:31:27.000Z"), extensions = c("18.011677", "18.011977",
"18.012176", "18.012678", "18.013078", "18.013277", "18.013578",
"18.013877", "17.013977", "17.014278", "17.014478", "17.014677",
"17.014676", "17.014677", "16.014477", "16.014477", "16.014576"
)), row.names = c(NA, 17L), class = "data.frame")
crdref <- "+proj=longlat +datum=WGS84"
run <- vect(GPX, type="lines", crs=crdref)
run
data <- cbind(id=1, part=1, GPX$lon, GPX$lat)
run2 <- vect(data, type="lines", crs=crdref)
run2
There is a vect method for a matrix and one for a data.frame. The data.frame method can only make points (and has no type argument, so that is ignored). I will change that into an informative error and clarify this in the manual.
So to make a line, you could do
library(terra)
g <- as.matrix(GPX[,1:2])
v <- vect(g, "lines")
To add attributes you would first need to decide what they should be: you have one line but 17 rows in GPX, which must be reduced to one row. You could just take the first row:
att <- GPX[1, -c(1:2)]
But you may prefer to take the average instead:
GPX$ele <- as.numeric(GPX$ele)
GPX$extensions <- as.numeric(GPX$extensions)
GPX$time <- as.POSIXct(GPX$time)
att <- as.data.frame(lapply(GPX[, -c(1:2)], mean))
# ele time extensions
#1 268.7412 2020-10-25 17.3078
values(v) <- att
Or in one step
v <- vect(g, "lines", atts=att)
v
#class : SpatVector
#geometry : lines
#dimensions : 1, 3 (geometries, attributes)
#extent : -83.96261, -83.96175, 42.41618, 42.41696 (xmin, xmax, ymin, ymax)
#coord. ref. :
#names : ele time extensions
#type : <num> <chr> <num>
#values : 268.7 2020-10-25 17.31
The id and part columns are not necessary if you make a single line. But you need them when you wish to create multiple lines and/or line parts (in a "multi-line").
gg <- cbind(id=rep(1:3, each=6)[-1], part=1, g)
vv <- vect(gg, "lines")
plot(vv, col=rainbow(5), lwd=8)
lines(v)
points(v, cex=2, pch=1)
And with multiple lines you would use id in aggregate to compute attributes for each line.
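For example, base R's aggregate() can collapse per-point attributes to one row per id before the attributes are attached to the multi-line SpatVector (the values below are made up for illustration):

```r
# Hypothetical per-point attributes for three lines, keyed by id
pts <- data.frame(id  = rep(1:3, each = 2),
                  ele = c(1, 3, 5, 7, 9, 11))

# One attribute row per line: mean elevation by id
att <- aggregate(ele ~ id, data = pts, FUN = mean)
att   # means per line: 2, 6, 10
```

The result has one row per id, in id order, so it lines up with the geometries of a SpatVector built from the same id column and could be attached with values(vv) <- att.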

Forecasting in R using forecast package

I'm trying to forecast hourly data for 30 days for a process.
I have used the following code:
# The packages required for projection are loaded
library("forecast")
library("zoo")
# Data preparation steps
# There is an assumption that we have all the data for all
# the 24 hours of the month of May
time_index <- seq(from = as.POSIXct("2014-05-01 07:00"),
                  to = as.POSIXct("2014-05-31 18:00"), by = "hour")
value <- round(runif(n = length(time_index), 100, 500))
# Using the zoo function, we merge data with the date and hour
# to create an extensible time series object
eventdata <- zoo(value, order.by = time_index)
# As the forecast package requires all objects to be time series objects,
# the below command is used
eventdata <- ts(value, order.by = time_index)
# For forecasting the values for the next 30 days, the below command is used
z <- hw(t, h = 30)
plot(z)
I feel the output of this code is not right.
The forecasted line looks wrong, and the dates are not projected correctly on the chart.
I'm not sure whether the fault lies in the data preparation or whether the output is as expected. Any ideas?
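Two things in the snippet look like the likely culprits: ts() has no order.by argument (that belongs to zoo()), and hw(t, h = 30) passes the base transpose function t rather than the data, with h = 30 meaning 30 hourly steps rather than 30 days. A corrected sketch using base stats::HoltWinters to keep it dependency-free; with the forecast package the analogous call would be hw(eventdata, h = 30 * 24):

```r
# Hourly data for May, with daily seasonality (frequency = 24)
time_index <- seq(from = as.POSIXct("2014-05-01 07:00", tz = "UTC"),
                  to   = as.POSIXct("2014-05-31 18:00", tz = "UTC"),
                  by   = "hour")
set.seed(42)
value <- round(runif(n = length(time_index), 100, 500))

eventdata <- ts(value, frequency = 24)  # ts() takes frequency, not order.by

fit <- HoltWinters(eventdata)           # Holt-Winters with daily seasonality
z   <- predict(fit, n.ahead = 30 * 24)  # 30 days of hourly steps = 720 values
plot(fit, z)
```

Note that a ts object carries only a numeric time base, so for real dates on the axis you would either plot against a POSIXct sequence yourself or keep the data in the zoo object.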
