Date with Gaps - Wavelet Analysis in R Using Biwavelet Package - time-series

I am performing Wavelet Analysis using biwavelet package in R. The date variable does not have continuous dates but with gaps. When I try to create the graph, I get the following error.
Error in check.datum(d) : The step size must be constant (see approx function to interpolate)
An MWE is given below:
library(foreign)
library(biwavelet)
library(xts)
library(labelled)
library(zoo)
date =c("2020-02-13", "2020-02-14", "2020-02-17", "2020-02-18", "2020-02-19", "2020-02-20", "2020-02-21", "2020-02-24", "2020-02-25", "2020-02-26", "2020-02-27", "2020-02-28", "2020-03-02", "2020-03-03", "2020-03-04", "2020-03-05", "2020-03-06", "2020-03-09", "2020-03-10", "2020-03-11", "2020-03-12", "2020-03-13")
rdate = as.Date(date)
date <- as.Date(date, format = "%Y-%m-%d")
date
class(date)
var = c(-0.077423148, -0.083293147, -0.089214072, -0.095185943, -0.101208754, -0.107282504, -0.113407195, -0.119582824, -0.125809386, -0.125806898, -0.132149309, -0.138584509, -0.145112529, -0.151733354, -0.158446968, -0.165253401, -0.172152638, -0.179144681, -0.186229542, -0.193407193, -0.200677648, -0.208040923)
data = data.frame(date, var)
View(data)
X <- as.xts(data[,-1], order.by = date)
ABC <- data.frame(date, var)
wt.t1=plot(wt(ABC), form = "%b-%d")
How can I resolve this issue?

You can interpolate missing days by following the instructions in the error message:
alldates <- seq(min(date), max(date), by = 1)
interpdata <- approx(date, var, xout = alldates)
ABC <- data.frame(date = alldates, var = interpdata$y)
wt.t1 <- plot(wt(ABC, form = "%b-%d")
However, I think the reason you are missing some days is that they are Saturday or Sunday; I only see weekdays in the dataset.
For many datasets (e.g. stock market trading, etc.) it doesn't make sense to interpolate "what would the price have been on Saturday?", because trades never occur on Saturday or Sunday. In that case, I'd suggest replacing the "date" variable with a simple increment, e.g.
date <- 1:length(date)
ABC <- data.frame(date, var)
wt.t1=plot(wt(ABC), form = "%b-%d")

Related

Training random forest (ranger) using caret with custom F4 metric in R yields but after running full ,error showing undefined columns selected

library(MLmetrics)
library(caret)
library(doSNOW)
library(ranger)
data is called as the "bank additional" full from this enter link description here and then following code to generate data1
library(VIM)
data1<-hotdeck(data,variable=c('job','marital','education','default','housing','loan'),domain_var = "y",imp_var=FALSE)
#converting the categorical variables to factors as they should be
library(magrittr)
data1%<>%
mutate_at(colnames(data1)[grepl('factor|logical|character',sapply(data1,class))],factor)
Now, splitting
library(caret)
#spliting data into train test 70/30
set.seed(1234)
trainIndex<-createDataPartition(data1$y,p=0.7,times = 1,list = F)
train<-data1[trainIndex,-11]
test<-data1[-trainIndex,-11]
levels(train$y)
train$y = as.factor(train$y)
# train$y = factor(train$y,levels = c("yes","no"))
# train$y = relevel(train$y,ref="yes")
Here, i got an idea of how to create F1 metric in Training Model in Caret Using F1 Metric
and using fbeta score formula i created f1_val; now i can't understand what lev,obs and pred are indicating . in my train dataset only column y showing data$obs , but no data$pred . So, is following error is due to this? and how to rectify this?
f1 <- function (data, lev = NULL, model = NULL) {
precision <- precision(data$obs,data$pred)
recall <- sensitivity(data$obs,data$pred)
f1_val <- (17*precision*recall)/(16*precision+recall)
names(f1_val) <- c("F1")
f1_val
}
tgrid <- expand.grid(
.mtry = 1:5,
.splitrule = "gini",
.min.node.size = seq(1,500,75)
)
model_caret <- train(train$y~., data = train,
method = "ranger",
trControl = trainControl(method="cv",
number = 2,
verboseIter = T,
classProbs = T,
summaryFunction = f1),
tuneGrid = tgrid,
num.trees = 500,
importance = "impurity",
metric = "F1")
After running for 3/4 minutes we get following :
Aggregating results
Selecting tuning parameters
Fitting mtry = 5, splitrule = gini, min.node.size = 1 on full training set
but error:
Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) :
undefined columns selected
Also when running model_caret we get,
Error: object 'model_caret' not found
Kindly help. Thanks in advance

Error with bagImpute and predict from caret package

I have the following when error when trying to use the preProcess function from the caret package. The predict function works for knn and median imputation, but gives an error for bagging. How should I edit my call to the predict function.
Reproducible example:
data = iris
set.seed(1)
data = as.data.frame(lapply(data, function(cc) cc[ sample(c(TRUE, NA), prob = c(0.8, 0.2), size = length(cc), replace = TRUE) ]))
preprocess_values = preProcess(data, method = c("bagImpute"), verbose = TRUE)
data_new = predict(preprocess_values, data)
This gives the following error:
> data_new = predict(preprocess_values, data)
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "NULL"
The preprocessing/imputation functions in caret work only for numerical variables.
As stated in the help of preProcess
x a matrix or data frame. Non-numeric predictors are allowed but will be ignored.
You most likely found a bug in the part that should ignore the non numerical variables which throws an uninformative error instead of ignoring them.
If you remove the factor variable the imputation works
library(caret)
df <- iris
set.seed(1)
df <- as.data.frame(lapply(data, function(cc) cc[ sample(c(TRUE, NA), prob = c(0.8, 0.2), size = length(cc), replace = TRUE) ]))
df <- df[,-5] #remove factor variable
preprocess_values <- preProcess(df, method = c("bagImpute"), verbose = TRUE)
data_new <- predict(preprocess_values, df)
The last line of code works but results in a bunch of warnings:
In cprob[tindx] + pred :
longer object length is not a multiple of shorter object length
These warnings are not from caret but from the internal call to ipred::bagging which is called internally by caret::preProcess. The cause for these errors are instances in the data where there are 3 NA values in a row, when they are removed
df <- df[rowSums(sapply(df, is.na)) != 3,]
preprocess_values <- preProcess(df, method = c("bagImpute"), verbose = TRUE)
data_new <- predict(preprocess_values, df)
the warnings disappear.
You should check out recipes, and specifically step_bagimpute, to overcome the above mentioned limitations.

Incomplete structured construct

i am new with f# , will be great if some 1 can help , nearly half a day gone solving this problem Thank you
module Certificate =
type T = {
Id: int
IsECert: bool
IsPrintCert: bool
CertifiedBy: string
Categories: Category.T list
}
let createPending now toZonedDateTime toBeCertifiedByName (job: Models.Job.T) (certificateType: Models.CertificateType.T) (pendingCertificate: Models.PendingCertificate.T) visualization (categories: Category.T list) =
let forCompletion = Models.PendingCertificate.getCertificateForCompletion pendingCertificate
{
Id = forCompletion.Id |> CertificateId.toInt
IsECert = Models.PendingCertificate.isECertificate pendingCertificate
IsPrintCert = Models.PendingCertificate.isPrintCertificate pendingCertificate
CertifiedBy = toBeCertifiedByName
Categories = categories}
i am getting an error in "Incomplete structured construct at or before this point"
Your formatting is all off. I will assume here that this is just a result of posting to StackOverflow, and your actual code is well indented.
The error comes from the definition of createPending: this function does not have a result. All its body consists of defining a forCompletion value, but there is nothing after it. Here's a simpler example that has the same problem:
let f x =
let y = 5
This function will produce the same error, because it also doesn't have a result. In F#, every function has to return something. The body cannot contain only definitions of helper functions or values. For example, I could fix my broken function above like this:
let f x =
let y = 5
x + y
This function first defines a helper value y, then adds it to its argument x, and returns the result.
> f 2
> 7
>
> f 0
> 5
How exactly you need to fix your function depends on what exactly you want it to mean. I can't help you here, because you haven't provided that information.

Total sum from a set (logic)

I have a logic problem for an iOS app but I don't want to solve it using brute-force.
I have a set of integers, the values are not unique:
[3,4,1,7,1,2,5,6,3,4........]
How can I get a subset from it with these 3 conditions:
I can only pick a defined amount of values.
The sum of the picked elements are equal to a value.
The selection must be random, so if there's more than one solution to the value, it will not always return the same.
Thanks in advance!
This is the subset sum problem, it is a known NP-Complete problem, and thus there is no known efficient (polynomial) solution to it.
However, if you are dealing with only relatively low integers - there is a pseudo polynomial time solution using Dynamic Programming.
The idea is to build a matrix bottom-up that follows the next recursive formulas:
D(x,i) = false x<0
D(0,i) = true
D(x,0) = false x != 0
D(x,i) = D(x,i-1) OR D(x-arr[i],i-1)
The idea is to mimic an exhaustive search - at each point you "guess" if the element is chosen or not.
To get the actual subset, you need to trace back your matrix. You iterate from D(SUM,n), (assuming the value is true) - you do the following (after the matrix is already filled up):
if D(x-arr[i-1],i-1) == true:
add arr[i] to the set
modify x <- x - arr[i-1]
modify i <- i-1
else // that means D(x,i-1) must be true
just modify i <- i-1
To get a random subset at each time, if both D(x-arr[i-1],i-1) == true AND D(x,i-1) == true choose randomly which course of action to take.
Python Code (If you don't know python read it as pseudo-code, it is very easy to follow).
arr = [1,2,4,5]
n = len(arr)
SUM = 6
#pre processing:
D = [[True] * (n+1)]
for x in range(1,SUM+1):
D.append([False]*(n+1))
#DP solution to populate D:
for x in range(1,SUM+1):
for i in range(1,n+1):
D[x][i] = D[x][i-1]
if x >= arr[i-1]:
D[x][i] = D[x][i] or D[x-arr[i-1]][i-1]
print D
#get a random solution:
if D[SUM][n] == False:
print 'no solution'
else:
sol = []
x = SUM
i = n
while x != 0:
possibleVals = []
if D[x][i-1] == True:
possibleVals.append(x)
if x >= arr[i-1] and D[x-arr[i-1]][i-1] == True:
possibleVals.append(x-arr[i-1])
#by here possibleVals contains 1/2 solutions, depending on how many choices we have.
#chose randomly one of them
from random import randint
r = possibleVals[randint(0,len(possibleVals)-1)]
#if decided to add element:
if r != x:
sol.append(x-r)
#modify i and x accordingly
x = r
i = i-1
print sol
P.S.
The above give you random choice, but NOT with uniform distribution of the permutations.
To achieve uniform distribution, you need to count the number of possible choices to build each number.
The formulas will be:
D(x,i) = 0 x<0
D(0,i) = 1
D(x,0) = 0 x != 0
D(x,i) = D(x,i-1) + D(x-arr[i],i-1)
And when generating the permutation, you do the same logic, but you decide to add the element i in probability D(x-arr[i],i-1) / D(x,i)

Stata: multiplying each variable of a set of time-series variables with the corresponding variable of another set

Being fairly new to Stata, I'm having a difficulty figuring out how to do the following:
I have time-series data on selling price (p) and quantity sold (q) for 10 products in a single datafile (i,e., 20 variables, p01-p10 and q01-q10). I am strugling with appropriate stata command that computes sales revenue (pq) time-series for each of these 10 products (i.e., pq01-pq10).
Many thanks for your help.
forval i = 1/10 {
local j : display %02.0f `i'
gen pq`j' = p`j' * q`j'
}
A standard loop over 1/10 won't get you the leading zero in 01/09. For that we need to use an appropriate format. See also
#article {pr0051,
author = "Cox, N. J.",
title = "Stata tip 85: Looping over nonintegers",
journal = "Stata Journal",
publisher = "Stata Press",
address = "College Station, TX",
volume = "10",
number = "1",
year = "2010",
pages = "160-163(4)",
url = "http://www.stata-journal.com/article.html?article=pr0051"
}
(added later) Another way to do it is
local j = string(`i', "%02.0f")
That makes it a bit more explicit that you are mapping from numbers 1,...,10 to strings "01",...,"10".

Resources