ggplotly tooltip is showing data twice - tooltip

I have 2 datasets included in one chart using ggplot. I am using ggplotly to create a tooltip but the information in the tooltips for the 2 points is showing twice. The following code is a little lengthy but will recreate the chart:
AreaName <- c("A", "B", "C", "A", "B", "C")
Timeperiod <- c("2018", "2018", "2018", "2019", "2019", "2019")
Value <- c(11.5, 39.3, 9.4, 14.2, 40.7, 19.1)
df <- data.frame(cbind(AreaName, Timeperiod, Value), stringsAsFactors = F)
df$Value <- as.numeric(df$Value)
AreaName <- c("A", "A")
Timeperiod <- c("2019", "2020")
qtr <- c("Q1-Q2", "Q1-Q2")
Value <- c(15.6, 10.2)
df2 <- data.frame(cbind(Timeperiod, qtr, AreaName, Value), stringsAsFactors = F)
df2$Value <- as.numeric(df2$Value)
ggp <- ggplotly(ggplot(data = df, aes(x=Timeperiod, y=Value, group = AreaName, colour = AreaName, text = paste("Area name: ", AreaName, "<br>Time period: ", Timeperiod, "<br>Rate: ", round(Value,1), "per 100,000"))) +
geom_line() +
geom_point() +
geom_point(data = df2, aes(shape = c(paste(AreaName, qtr, Timeperiod)),text = paste("Area name: ", AreaName, "<br>Quarter: ", qtr, "<br>Time period: ", Timeperiod, "<br>Rate: ", round(Value,1), "per 100,000"))) +
scale_shape_manual(values = c(18, 17)) +
theme(axis.text.x = element_text(vjust = 0.5), axis.title.x = element_blank()) +
labs(y = "Crude rate per 100,000 persons all ages", colour = "Area", shape = "") +
guides(shape = guide_legend(order = 2),colour = guide_legend(order = 1)) +
expand_limits(y=0), tooltip = "text")
ggpNames <- unique(df$AreaName)
legs <- paste(df2$AreaName, df2$qtr, df2$Timeperiod)
ggpNames <- c(ggpNames,legs)
for (i in 1:length(ggp$x$data)) { # this goes over all places where legend values are stored
n1 <- ggp$x$data[[i]]$name # and this is how the value is stored in plotly
n2 <- " "
for (j in 1:length(ggpNames)) {
if (grepl(x = n1, pattern = ggpNames[j])) {n2 = ggpNames[j]} # if the plotly legend name contains the original value, replace it with the original value
}
ggp$x$data[[i]]$name <- n2 # now is the time for actual replacement
if (n2 == " ") {ggp$x$data[[i]]$showlegend = FALSE} # sometimes plotly adds to the legend values that we don't want, this is how to get rid of them, too
}
ggp %>% config(displaylogo = FALSE, modeBarButtonsToRemove = list("autoScale2d", "resetScale2d","select2d", "lasso2d", "zoomIn2d", "zoomOut2d", "toggleSpikelines", "zoom2d", "pan2d"))
ggp
Does anyone have an elegant solution to this?
Thanks

Do not define text in geom_point for the second dataframe df2. Then you will get only one tooltip for those two points.
ggp <- ggplotly(ggplot(data = df, aes(x=Timeperiod, y=Value, group = AreaName, colour = AreaName, text = paste("Area name: ", AreaName, "<br>Time period: ", Timeperiod, "<br>Rate: ", round(Value,1), "per 100,000"))) +
geom_line() +
geom_point() +
geom_point(data = df2, aes(shape = c(paste(AreaName, qtr, Timeperiod)) #,
#text = paste("Area name: ", AreaName, "<br>Quarter: ", qtr, "<br>Time period: ", Timeperiod, "<br>Rate: ", round(Value,1), "per 100,000")
)) +
scale_shape_manual(values = c(18, 17)) +
theme(axis.text.x = element_text(vjust = 0.5), axis.title.x = element_blank()) +
labs(y = "Crude rate per 100,000 persons all ages", colour = "Area", shape = "") +
guides(shape = guide_legend(order = 2),colour = guide_legend(order = 1)) +
expand_limits(y=0), tooltip = "text")
ggpNames <- unique(df$AreaName)
legs <- paste(df2$AreaName, df2$qtr, df2$Timeperiod)
ggpNames <- c(ggpNames,legs)
for (i in 1:length(ggp$x$data)) { # this goes over all places where legend values are stored
n1 <- ggp$x$data[[i]]$name # and this is how the value is stored in plotly
n2 <- " "
for (j in 1:length(ggpNames)) {
if (grepl(x = n1, pattern = ggpNames[j])) {n2 = ggpNames[j]} # if the plotly legend name contains the original value, replace it with the original value
}
ggp$x$data[[i]]$name <- n2 # now is the time for actual replacement
if (n2 == " ") {ggp$x$data[[i]]$showlegend = FALSE} # sometimes plotly adds to the legend values that we don't want, this is how to get rid of them, too
}
ggp %>% config(displaylogo = FALSE, modeBarButtonsToRemove = list("autoScale2d", "resetScale2d","select2d", "lasso2d", "zoomIn2d", "zoomOut2d", "toggleSpikelines", "zoom2d", "pan2d"))
ggp

Related

plotly joins points in wrong order

I have the following df (part)
"Date"
"2022-09-01" "2022-09-02" "2022-09-05" "2022-09-06" "2022-09-07" "2022-09-08" ....
"LogClose"
8.285728 8.274933 8.274933 8.270830 8.289004 8.295593 ....
"wielkoscDD"
0.00000000 -0.01073648 -0.01073648 -0.01478755 0.00000000 0.00000000 ....
I use:
p <- plot_ly(
df,
x = ~Date,
y = ~LogClose,
name = 'zamknięcie',
type = 'scatter',
mode = 'lines',
text = ~paste("zamknięcie :", Zamkniecie),
width = obrazek_szer,
height = obrazek_wyso)
but the second draw is correct
p <- plot_ly(
df,
x = ~Date,
y = ~wielkoscDD,
name = 'drawdown',
type = 'scatter',
mode = 'lines',
text = ~paste("drawdown : ", wielkoscDD),
width = obrazek_szer,
height = obrazek_wyso)
Additionaly on my Mac it works ok - on Windows there is chaos with the 1st chart.
rgds
Grzegorz

How to fix the dataTest error in Timeseries in R?

#Training and Test Split
split <- ceiling(0.7 * length(data))
dataTrain <- ts(data[1:split], frequency = 15, start = c(2021,5))
dataTest <- ts(data[c((split+1) : nrow(data))], frequency = 15, start = c(2022,5))
actual <- unclass(dataTrain)
actualFull <- unclass(data)
dataTest <- ts(data[c((split+1) : nrow(data))], frequency = 15, start = c(2022,5))

Need help getting method = cforest to work within train() from caret using leave one out cross-validation

examples_dataset.csv
I have tried looking up so many ways to fix this issue, but no solution so far. I am trying to train conditional inference forests with caret, using the leave one out cross-validation method. I have about 20 (larger) datasets to run this method on, hence the functions to automate some.
A lot of what I have found suggests that my QuantBins are not factors, but I have checked after running prep_df() on the df and those are indeed factors. I get an error when running the conditional inference forests (cif_model()), but not with random forests (rf_model()). The output from trying to make that model is "Something is wrong; all the Accuracy metric values are missing" (pictured below).
Any help and guidance is appreciated!
## Example code
## GOAL: create train() code from caret that uses conditional inference forests to assess variable importance with categorical dependent variable using leave one out cross validation
rm(list=ls())
setwd()
ex.all <- read.csv("examples_dataset.csv", header = TRUE)
loo_ctrl <- trainControl(method = "LOOCV")
#This function works!
rf_model <- function(file.name) {
model <- train(QuantBins ~ F_Cou + B_Cou + Height + GBH + N_b + N_f + L_u + D_w + N_p
+ P_Cou, data = file.name, method = "rf", trControl = loo_ctrl, tuneLength = 10, control =
rpart.control(minbucket = 10), ntree = 50)
return(model)
}
#This does not.
cif_model <- function(file.name) {
model <- train(QuantBins ~ F_Cou + B_Cou + Height + GBH + N_b + N_f + L_u + D_w + N_p
+ P_Cou, data = file.name, method = "cforest", trControl = loo_ctrl, tuneLength = 10, control
= ctree_control(minbucket = 10), ntree = 50)
return(model)
}
##### functions used #####
prep_df <- function(file.name) {
file.name$BINARY <- ifelse(file.name$TOTAL >= 1, "yes", "no")
file.name$BINARY <- as.factor(file.name$BINARY)
file.name$L_u <- as.factor(file.name$L_u)
file.name$TOTAL <- as.numeric(file.name$TOTAL)
## Quantile distribution of breaks in Total Fruit
numbers_of_bins = 5 #this will return four groups
file.name <- file.name %>% mutate(QuantBins = cut(TOTAL, breaks = unique(quantile(TOTAL,
probs=seq.int(0,1, by=1/numbers_of_bins))), include.lowest=TRUE))
print(length(levels(file.name$QuantBins)))
temp <- levels(file.name$QuantBins)
file.name$QuantBins <- as.character(file.name$QuantBins)
for(i in 1:length(file.name$QuantBins)) {
temp1 <- strsplit(file.name$QuantBins[i], ",")
temp2 <- strsplit(temp1[[1]][1], "\\(")
temp3 <- strsplit(temp1[[1]][[2]], "\\]")
file.name$QuantBins[i] <- paste("Fruit", temp2[[1]][2], "to", temp3[[1]][1])
}
file.name$QuantBins <- as.factor(file.name$QuantBins)
file.name$QuantBins <- droplevels(file.name$QuantBins)
print(length(levels(file.name$QuantBins)))
return(file.name)
}
##### running trees #####
ex.all <- prep_df(ex.all)
ex.rf <- rf_model(ex.all)
print(ex.rf)
ex.rf
ex.rf$finalModel$importance
ex.cf <- cif_model(ex.all)
print(ex.cf)
ex.cf
ex.cf$finalModel$importance
Error using cif_model(ex.all) showing "Something is wrong; all the Accuracy metric values are missing"

Crosstalk links broken by second Leaflet addCircleMarkers call

I am working with multivariant data linking Leaflet and d3scatter plots. It works well for one variable. If I try to include a second variable in Leaflet by a second addCircleMarkers and addLayersControl then the sharedData links break, the filtering doesn't work and the brushing doesn't work. Thanks in advance.
A MWE is attached:
library("crosstalk")
library("d3scatter")
library("leaflet")
Long <- c(117.4,117.5,117.6)
Lat<- c(-33.7,-33.8,-33.9)
var1 <- c(21,22,23)
var2 <- c(31,32,33)
species <- c(8,9,10)
df1<- data.frame(Long, Lat, var1, var2, species)
sdf1 <- SharedData$new(df1)
col_1 <- c( "yellow" ,"black" ,"orange")
col_2 <- c("red" ,"green" ,"blue")
l <- leaflet(sdf1)%>%
setView(117.5, -33.8, 10) %>%
addCircleMarkers(radius = 1, color = col_1, group = "1") %>%
# addCircleMarkers(radius = 1, color = col_2, group = "2") %>%
# PROBLEM - adding the second "addCircleMarkers" enables the overlayGroups but
# it breaks the link between the plots and breaks the filter
addLayersControl(overlayGroups=c("1","2"))
m <- list(l, filter_checkbox("unique_id_for_species", "Animal Species", sdf1, ~species))
n <- list(d3scatter(sdf1, ~var2, ~var1, color = ~species, x_lim = c(30,40), y_lim = c(20,25), width="70%", height=200),
d3scatter(sdf1, ~var1, ~var2, color = ~species, y_lim = c(30,40), x_lim = c(20,25), width="70%", height=200))
bscols(m, n)

Parsing a TeX-like language with lpeg

I am struggling to get my head around LPEG. I have managed to produce one grammar which does what I want, but I have been beating my head against this one and not getting far. The idea is to parse a document which is a simplified form of TeX. I want to split a document into:
Environments, which are \begin{cmd} and \end{cmd} pairs.
Commands which can either take an argument like so: \foo{bar} or can be bare: \foo.
Both environments and commands can have parameters like so: \command[color=green,background=blue]{content}.
Other stuff.
I also would like to keep track of line number information for error handling purposes. Here's what I have so far:
lpeg = require("lpeg")
lpeg.locale(lpeg)
-- Assume a lot of "X = lpeg.X" here.
-- Line number handling from http://lua-users.org/lists/lua-l/2011-05/msg00607.html
-- with additional print statements to check they are working.
local newline = P"\r"^-1 * "\n" / function (a) print("New"); end
local incrementline = Cg( Cb"linenum" )/ function ( a ) print("NL"); return a + 1 end , "linenum"
local setup = Cg ( Cc ( 1) , "linenum" )
nl = newline * incrementline
space = nl + lpeg.space
-- Taken from "Name-value lists" in http://www.inf.puc-rio.br/~roberto/lpeg/
local identifier = (R("AZ") + R("az") + P("_") + R("09"))^1
local sep = lpeg.S(",;") * space^0
local value = (1-lpeg.S(",;]"))^1
local pair = lpeg.Cg(C(identifier) * space ^0 * "=" * space ^0 * C(value)) * sep^-1
local list = lpeg.Cf(lpeg.Ct("") * pair^0, rawset)
local parameters = (P("[") * list * P("]")) ^-1
-- And the rest is mine
anything = C( (space^1 + (1-lpeg.S("\\{}")) )^1) * Cb("linenum") / function (a,b) return { text = a, line = b } end
begin_environment = P("\\begin") * Ct(parameters) * P("{") * Cg(identifier, "environment") * Cb("environment") * P("}") / function (a,b) return { params = a[1], environment = b } end
end_environment = P("\\end{") * Cg(identifier) * P("}")
texlike = lpeg.P{
"document";
document = setup * V("stuff") * -1,
stuff = Cg(V"environment" + anything + V"bracketed_stuff" + V"command_with" + V"command_without")^0,
bracketed_stuff = P"{" * V"stuff" * P"}" / function (a) return a end,
command_with =((P("\\") * Cg(identifier) * Ct(parameters) * Ct(V"bracketed_stuff"))-P("\\end{")) / function (i,p,n) return { command = i, parameters = p, nodes = n } end,
command_without = (( P("\\") * Cg(identifier) * Ct(parameters) )-P("\\end{")) / function (i,p) return { command = i, parameters = p } end,
environment = Cg(begin_environment * Ct(V("stuff")) * end_environment) / function (b,stuff, e) return { b = b, stuff = stuff, e = e} end
}
It almost works!
> texlike:match("\\foo[one=two]thing\\bar")
{
command = "foo",
parameters = {
{
one = "two",
},
},
}
{
line = 1,
text = "thing",
}
{
command = "bar",
parameters = {
},
}
But! First, I can't get the line number handling part to work at all. The function within incrementline is never fired.
I also can't quite work out how nested capture information is passed to handling functions (which is why I have scattered Cg, C and Ct semirandomly over the grammar). This means that only one item is returned from within a command_with:
> texlike:match("\\foo{text \\command moretext}")
{
command = "foo",
nodes = {
{
line = 1,
text = "text ",
},
},
parameters = {
},
}
I would also love to be able to check that the environment start and ends match up but when I tried to do so, my back references from "begin" were not in scope by the time I got to "end". I don't know where to go from here.
Late answer but hopefully it'll offer some insight if you're still looking for a solution or wondering what the problem was.
There are a couple of issues with your grammar, some of which can be tricky to spot.
Your line increment here looks incorrect:
local incrementline = Cg( Cb"linenum" ) /
function ( a ) print("NL"); return a + 1 end,
"linenum"
It looks like you meant to create a named capture group and not an anonymous group. The backcapture linenum is essentially being used like a variable. The problem is because this is inside an anonymous capture, linenum will not update properly -- function(a) will always receive 1 when called. You need to move the closing ) to the end so "linenum" is included:
local incrementline = Cg( Cb"linenum" /
function ( a ) print("NL"); return a + 1 end,
"linenum")
Relevant LPeg documentation for Cg capture.
The second problem is with your anything non-terminal rule:
anything = C( (space^1 + (1-lpeg.S("\\{}")) )^1) * Cb("linenum") ...
There are several things to be careful here. First, a named Cg capture (from incrementline rule once it's fixed) doesn't produce anything unless it's in a table or you backref it. The second major thing is that it has an adhoc scope like a variable. More precisely, its scope ends once you close it in an outer capture -- like what you're doing here:
C( (space^1 + (...) )^1)
Which means by the time you reference its backcapture with * Cb("linenum"), that's already too late -- the linenum you really want already closed its scope.
I always found LPeg's re syntax a bit easier to grok so I've rewritten the grammar with that instead:
local grammar_cb =
{
fold = pairfold,
resetlinenum = resetlinenum,
incrementlinenum = incrementlinenum, getlinenum = getlinenum,
error = error
}
local texlike_grammar = re.compile(
[[
document <- '' -> resetlinenum {| docpiece* |} !.
docpiece <- {| envcmd |} / {| cmd |} / multiline
beginslash <- cmdslash 'begin'
endslash <- cmdslash 'end'
envcmd <- beginslash paramblock? {:beginenv: envblock :} (!endslash docpiece)*
endslash openbrace {:endenv: =beginenv :} closebrace / &beginslash {} -> error .
envblock <- openbrace key closebrace
cmd <- cmdslash {:command: identifier :} (paramblock? cmdblock)?
cmdblock <- openbrace {:nodes: {| docpiece* |} :} closebrace
paramblock <- opensq ( {:parameters: {| parampairs |} -> fold :} / whitesp) closesq
parampairs <- parampair (sep parampair)*
parampair <- key assign value
key <- whitesp { identifier }
value <- whitesp { [^],;%s]+ }
multiline <- (nl? text)+
text <- {| {:text: (!cmd !closebrace !%nl [_%w%p%s])+ :} {:line: '' -> getlinenum :} |}
identifier <- [_%w]+
cmdslash <- whitesp '\'
assign <- whitesp '='
sep <- whitesp ','
openbrace <- whitesp '{'
closebrace <- whitesp '}'
opensq <- whitesp '['
closesq <- whitesp ']'
nl <- {%nl+} -> incrementlinenum
whitesp <- (nl / %s)*
]], grammar_cb)
The callback functions are straight-forwardly defined as:
local function pairfold(...)
local t, kv = {}, ...
if #kv % 2 == 1 then return ... end
for i = #kv, 2, -2 do
t[ kv[i - 1] ] = kv[i]
end
return t
end
local incrementlinenum, getlinenum, resetlinenum do
local line = 1
function incrementlinenum(nl)
assert(not nl:match "%S")
line = line + #nl
end
function getlinenum() return line end
function resetlinenum() line = 1 end
end
Testing the grammar with a non-trivial tex-like str with multiple lines:
local test1 = [[\foo{text \bar[color = red, background = black]{
moretext \baz{
even
more text} }
this time skipping multiple
lines even, such wow!}]]
Produces the follow AST in lua-table format:
{
command = "foo",
nodes = {
{
text = "text",
line = 1
},
{
parameters = {
color = "red",
background = "black"
},
command = "bar",
nodes = {
{
text = " moretext",
line = 2
},
{
command = "baz",
nodes = {
{
text = "even ",
line = 3
},
{
text = "more text",
line = 4
}
}
}
}
},
{
text = "this time skipping multiple",
line = 7
},
{
text = "lines even, such wow!",
line = 9
}
}
}
And a second test for begin/end environments:
local test2 = [[\begin[p1
=apple,
p2=blue]{scope} scope foobar
\end{scope} global foobar]]
Which seems to give approximately what you're looking for:
{
{
{
text = " scope foobar",
line = 3
},
parameters = {
p1 = "apple",
p2 = "blue"
},
beginenv = "scope",
endenv = "scope"
},
{
text = " global foobar",
line = 4
}
}

Resources