With the following code I produce the LaTeX table shown in the image below. As you might notice, a few things are wrong with the output:
The title is missing
The p-values are in the wrong place
The footnote is misaligned
Any help is greatly appreciated!
library(tidyverse)
library(modelsummary)
library(gt)
data <- as.data.frame(ChickWeight)
mod_control <- lm(weight ~ Time , data = data)
mod_treat <- lm(weight ~ Time + Diet, data = data)
mod_one_list <- list(mod_control, mod_treat)
# coefmap
cm <- c("(Intercept)"="Konstant",
"Time" = "Tid",
"Num.Obs." = "n")
# gof_map
gm <- list(list(raw = "nobs", clean = "N", fmt = 0))
# title
tit <- "En beskrivning här"
# produce table
modelsummary(mod_one_list,
             output = "gt",
             stars = TRUE,
             title = tit,
             coef_map = cm,
             gof_map = gm,
             vcov = "HC1") %>%
  tab_spanner(label = "(1)", columns = 2) %>%
  tab_spanner(label = "(2)", columns = 3) %>%
  tab_footnote("För standardfel använder vi HC1",
               locations = cells_body(rows = 1, columns = 2)) %>%
  as_latex() %>%
  cat()
This is an issue with the gt package. When a table has both footnotes
and source notes (modelsummary uses source notes to report significance
stars), gt puts each type of note in its own minipage. This
breaks alignment in LaTeX.
You can see this by inspecting the code of this minimal example:
library(gt)
dat <- mtcars[1:4, 1:4]
gt(dat) |>
tab_source_note(source_note = "source note") |>
tab_footnote("footnote", locations = cells_body(rows = 1, columns = 2)) |>
as_latex() |>
cat()
## \captionsetup[table]{labelformat=empty,skip=1pt}
## \begin{longtable}{rrrr}
## \toprule
## mpg & cyl & disp & hp \\
## \midrule
## 21.0 & 6\textsuperscript{1} & 160 & 110 \\
## 21.0 & 6 & 160 & 110 \\
## 22.8 & 4 & 108 & 93 \\
## 21.4 & 6 & 258 & 110 \\
## \bottomrule
## \end{longtable}
## \vspace{-5mm}
## \begin{minipage}{\linewidth}
## \textsuperscript{1}footnote \\
## \end{minipage}
## \begin{minipage}{\linewidth}
## source note\\
## \end{minipage}
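As a rough illustration (not something gt or modelsummary provides), the two trailing minipages in the output above could be merged by post-processing the LaTeX text. The Python sketch below assumes the as_latex() output was saved as plain text; merge_adjacent_minipages is a hypothetical helper, and whether a single minipage fully restores the alignment in a given document is untested.

```python
import re

# Tail of the LaTeX printed by as_latex() in the example above
# (copied without the "## " console prefix).
GT_TAIL = r"""\end{longtable}
\vspace{-5mm}
\begin{minipage}{\linewidth}
\textsuperscript{1}footnote \\
\end{minipage}
\begin{minipage}{\linewidth}
source note\\
\end{minipage}
"""


def merge_adjacent_minipages(latex: str) -> str:
    """Join back-to-back full-width minipages into one (hypothetical helper)."""
    # Drop every "\end{minipage}" that is immediately followed by a new
    # "\begin{minipage}{\linewidth}", so the notes share one environment.
    seam = re.compile(
        r"\\end\{minipage\}\s*\\begin\{minipage\}\{\\linewidth\}\s*"
    )
    return seam.sub("", latex)


merged = merge_adjacent_minipages(GT_TAIL)
print(merged)
```

After the merge, the footnote and the source note sit inside one minipage, so LaTeX lays them out as a single block.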
I am not sure if the gt maintainers would consider this a “bug”, but
it might be worth it to report it on their repository anyway:
https://github.com/rstudio/gt/issues
For what it’s worth, I think that the default LaTeX output with
modelsummary(model, output="latex") generally works better, because it
uses kableExtra, which seems to prioritize LaTeX a bit more.
I'm trying to create a leaflet map which shows different movement paths for different months of the year. I.e. I've got a dataset showing multiple journeys per month and I want to display the movement paths separately for each month using the addTimeslider feature of the leaflet.extras2 package.
To do so I have been trying to adapt the code posted by SymbolixAU I found here: leaflet add multiple polylines
This code uses sf functions including st_linestring to create an object that can be supplied to a addPolylines leaflet function to show all movement paths at once.
I'm pretty sure for my purposes (showing data separately for each month) I have to use st_multilinestring, which takes a list of matrices containing the coordinates for multiple polylines per row (with one row per month) rather than a single polyline per row.
Once I have that, I think I could supply that object to the addTimeslider function of leaflet.extras2 to achieve what I need. I'm fairly sure of this because when I passed the sf object created with sf_linestring to addTimeslider, I was able to use the time slider on the map to view individual movement paths one at a time.
However, I have been trying for hours and haven't been successful. Would be hugely grateful for any pointers, please and thank you.
Some example data:
#load packages
library(dplyr)
library(leaflet)
library(leaflet.extras2)
library(sf)
library(data.table)
# create the example dataset
data <- structure(list(arrival_month = structure(c(3L, 3L, 4L, 4L, 4L,
5L, 5L, 6L, 6L, 6L), .Label = c("January", "February", "March",
"April", "May", "June", "July", "August", "September", "October",
"November", "December"), class = c("ordered", "factor")), start_lat = c(33.40693,
33.64672, 33.57127, 33.42848, 33.54936, 33.53418, 33.60399, 33.49554,
33.5056, 33.61696), start_long = c(-112.0298, -111.9255, -112.049,
-112.0998, -112.0912, -112.0911, -111.9273, -111.9687, -112.0563,
-111.9866), finish_lat...4 = c(33.40687, 33.64776, 33.57125,
33.42853, 33.54893, 33.53488, 33.60401, 33.49647, 33.5056, 33.61654
), finish_lat...5 = c(-112.0343, -111.9303, -112.0481, -112.0993,
-112.0912, -112.0911, -111.931, -111.9711, -112.0541, -111.986
)), row.names = c(NA, -10L), class = c("data.table", "data.frame"
))
My attempt at the code:
# Convert the data into a list of matrices for each month
mnths <- c("May","March","April","June")
mat_list <- list()
for (i in mnths) {
month <- as.matrix(data %>% filter(arrival_month == i) %>% select(-1))
mat_list[[i]] <- month
}
# convert to an sf object
data_DT <- setDT(data)
sf <- data_DT[
, {
geometry <- sf::st_multilinestring(x = mat_list)
geometry <- sf::st_sfc(geometry)
geometry <- sf::st_sf(geometry = geometry)
}
, by = arrival_month
]
sf <- sf::st_as_sf(sf)
This yields the following result:
It's not correct because each row contains the coordinates for all the months, rather than just for the month in the respective row. I'm at a loss as to where to go from here - any help would be hugely appreciated.
Thanks
I would do it slightly differently today, making use of {sfheaders} to build the linestrings.
library(sfheaders)
library(sf)
library(data.table)
setDT( data )
data[, line_id := .I ] ## Assuming each row is a line
## create a long-form of the data
dt_line <- rbindlist(
list(
data[, .(arrival_month, line_id, lon = start_long, lat = start_lat, sequence = 1)]
, data[, .(arrival_month, line_id, lon = finish_lat...5, lat = finish_lat...4, sequence = 2)] ## I think 'finish_lat...5' is actually the 'long'
)
)
setorder(dt_line, line_id, sequence)
sf <- sfheaders::sf_multilinestring(
obj = dt_line
, x = "lon"
, y = "lat"
, multilinestring_id = "arrival_month"
, linestring_id = "line_id"
, keep = T
)
sf::st_crs( sf ) <- 4326 ## Assuming the coordinates are WGS 84 lon/lat (EPSG:4326)
# Simple feature collection with 4 features and 2 fields
# Geometry type: MULTILINESTRING
# Dimension: XY
# Bounding box: xmin: -112.0998 ymin: 33.40687 xmax: -111.9255 ymax: 33.64776
# Geodetic CRS: WGS 84
#   arrival_month sequence geometry
# 1 3 1 MULTILINESTRING ((-112.0298...
# 2 4 1 MULTILINESTRING ((-112.049 ...
# 3 5 1 MULTILINESTRING ((-112.0911...
# 4 6 1 MULTILINESTRING ((-111.9687...
Note that arrival_month has been recoded to its underlying integer factor codes (3 = March, and so on).
I managed to adapt the code provided by SymbolixAU to suit my purposes, by using sf_multilinestring instead of sf_linestring:
## Generate data
data <- structure(list(arrival_month = structure(c(3L, 3L, 4L, 4L, 4L,
5L, 5L, 6L, 6L, 6L), .Label = c("January", "February", "March",
"April", "May", "June", "July", "August", "September", "October",
"November", "December"), class = c("ordered", "factor")), start_lat = c(33.40693,
33.64672, 33.57127, 33.42848, 33.54936, 33.53418, 33.60399, 33.49554,
33.5056, 33.61696), start_long = c(-112.0298, -111.9255, -112.049,
-112.0998, -112.0912, -112.0911, -111.9273, -111.9687, -112.0563,
-111.9866), finish_lat...4 = c(33.40687, 33.64776, 33.57125,
33.42853, 33.54893, 33.53488, 33.60401, 33.49647, 33.5056, 33.61654
), finish_lat...5 = c(-112.0343, -111.9303, -112.0481, -112.0993,
-112.0912, -112.0911, -111.931, -111.9711, -112.0541, -111.986
)), row.names = c(NA, -10L), class = c("data.table", "data.frame"
))
## Convert to data.table first (restores the internal self-reference), then add an id column
setDT(data)
data[, line_id := .I ]
## create a long-form of the data
dt_line <- rbindlist(
list(
data[, .(arrival_month, line_id, lon = start_long, lat = start_lat, sequence = 1)]
, data[, .(arrival_month, line_id, lon = finish_lat...5, lat = finish_lat...4, sequence = 2)] ## I think 'finish_lat...5' is actually the 'long'
)
)
setorder(dt_line, line_id, sequence)
## Create multistring sf object
sf <- sfheaders::sf_multilinestring(
obj = dt_line
, x = "lon"
, y = "lat"
, linestring_id = "line_id"
, multilinestring_id = "arrival_month"
, keep = TRUE
)
sf::st_as_sf(sf)
## Convert arrival_month back to factor
sf$arrival_month <- as.factor(sf$arrival_month)
Might not be the most elegant but this does the job.
The key was to use sf_multilinestring and include/specify both linestring_id and multilinestring_id to distinguish between separate sets of polylines.
This is the result:
Now when this sf object is passed to addTimeslider, it behaves as desired.
Credit to SymbolixAU for most of the code and bringing my attention to the sfheaders package
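To picture what the two id columns do, here is a small language-neutral sketch in Python using plain data structures instead of sf. It takes the long-form rows for the first three journeys in the example data (two in March, one in April) and groups points into lines by line_id, then lines into one entry per month. The function name build_multilinestrings is made up for illustration and is not part of sfheaders.

```python
from collections import defaultdict

# Long-form rows: (arrival_month, line_id, lon, lat, sequence).
# Values are the first three journeys from the example data.
ROWS = [
    (3, 1, -112.0298, 33.40693, 1),
    (3, 1, -112.0343, 33.40687, 2),
    (3, 2, -111.9255, 33.64672, 1),
    (3, 2, -111.9303, 33.64776, 2),
    (4, 3, -112.0490, 33.57127, 1),
    (4, 3, -112.0481, 33.57125, 2),
]


def build_multilinestrings(rows):
    """Group points into lines (by line_id) and lines into months."""
    grouped = defaultdict(lambda: defaultdict(list))
    # Sort by line and point order so each line's points stay in sequence.
    for month, line_id, lon, lat, _seq in sorted(rows, key=lambda r: (r[1], r[4])):
        grouped[month][line_id].append((lon, lat))
    # One list of lines per month, analogous to one MULTILINESTRING per row.
    return {month: list(lines.values()) for month, lines in grouped.items()}


multilines = build_multilinestrings(ROWS)
for month, lines in multilines.items():
    print(month, lines)
```

Each month ends up with its own list of lines, which is the shape the time slider needs: one feature per month, containing only that month's journeys.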
I'm trying to learn some F# and Deedle by analyzing my electricity costs.
Suppose I have two frames, one containing my electricity usage:
let consumptionsByYear =
[ (2019, "Total", 500); (2019, "Day", 200); (2019, "Night", 300);
(2020, "Total", 600); (2020, "Day", 250); (2020, "Night", 350) ]
|> Frame.ofValues
Total Day Night
2019 -> 500 200 300
2020 -> 600 250 350
The other contains two plans with different pricing structures (either a flat fee or a fee that varies with the time of day):
let prices =
[ ("Plan A", "Base fee", 50); ("Plan A", "Fixed price", 3); ("Plan A", "Day price", 0); ("Plan A", "Night price", 0);
("Plan B", "Base fee", 40); ("Plan B", "Fixed price", 0); ("Plan B", "Day price", 5); ("Plan B", "Night price", 2) ]
|> Frame.ofValues
Base fee Fixed price Day price Night price
Plan A -> 50 3 0 0
Plan B -> 40 0 5 2
Previously I have solved this in SQL using a cross join and in Excel using nested joins. To copy those, I found Frame.mapRows, but constructing the expected output seems very tedious using it:
let costs =
    consumptionsByYear
    |> Frame.mapRows (fun _year cols ->
        [ "Total price" =>
            (prices?``Base fee``
             + (prices?``Fixed price`` |> Series.mapValues ((*) (cols.GetAs<float>("Total"))))
             + (prices?``Day price`` |> Series.mapValues ((*) (cols.GetAs<float>("Day"))))
             + (prices?``Night price`` |> Series.mapValues ((*) (cols.GetAs<float>("Night"))))) ]
        |> Frame.ofColumns)
    |> Frame.unnest
Total price
2019 Plan A -> 1550
Plan B -> 1640
2020 Plan A -> 1850
Plan B -> 1990
Is there a better way or even small improvements?
I'm not a Deedle expert, but I think this is basically:
A dot product of two matrices: consumptionsByYear and the periodic day/night prices,
Followed by the addition of the constant base prices.
In other words:
  consumptionsByYear             periodicPrices                    basePrices
 ---------------------     --------------------------     -----------------------------
 |          Day Night |    |          Plan A  Plan B |    |             Plan A  Plan B |
 | 2019 ->  200   300 |  * | Day   ->      3       5 |  + | Base fee ->     50      40 |
 | 2020 ->  250   350 |    | Night ->      3       2 |    -----------------------------
 ---------------------     --------------------------
With that approach in mind, here's how I would do it:
open Deedle
open Deedle.Math

let consumptionsByYear =
    [ (2019, "Day", 200); (2019, "Night", 300)
      (2020, "Day", 250); (2020, "Night", 350) ]
    |> Frame.ofValues

let basePrices =
    [ ("Plan A", "Base fee", 50)
      ("Plan B", "Base fee", 40) ]
    |> Frame.ofValues
    |> Frame.transpose

let periodicPrices =
    [ ("Plan A", "Day", 3); ("Plan A", "Night", 3)
      ("Plan B", "Day", 5); ("Plan B", "Night", 2) ]
    |> Frame.ofValues
    |> Frame.transpose

// repeat the base prices for each year
let basePricesExpanded =
    let row = basePrices.Rows.["Base fee"]
    consumptionsByYear
    |> Frame.mapRowValues (fun _ -> row)
    |> Frame.ofRows

let result =
    Matrix.dot(consumptionsByYear, periodicPrices) + basePricesExpanded
result.Print()
Output is:
Plan A Plan B
2019 -> 1550 1640
2020 -> 1850 1990
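The arithmetic behind these numbers can be cross-checked without Deedle. The following is a small Python sketch of the same "dot product plus base fee" idea using plain dictionaries; the function name yearly_costs is my own and only serves the illustration.

```python
# Yearly consumption split into day and night, as in consumptionsByYear.
consumption = {
    2019: {"Day": 200, "Night": 300},
    2020: {"Day": 250, "Night": 350},
}

# Per-period prices and base fees for each plan, as in the frames above.
periodic_prices = {
    "Plan A": {"Day": 3, "Night": 3},
    "Plan B": {"Day": 5, "Night": 2},
}
base_fee = {"Plan A": 50, "Plan B": 40}


def yearly_costs(consumption, periodic_prices, base_fee):
    """Return {year: {plan: cost}} = usage . prices + base fee."""
    costs = {}
    for year, usage in consumption.items():
        costs[year] = {
            # Dot product of one consumption row with one price column,
            # followed by the constant base fee for that plan.
            plan: sum(usage[period] * prices[period] for period in usage)
            + base_fee[plan]
            for plan, prices in periodic_prices.items()
        }
    return costs


costs = yearly_costs(consumption, periodic_prices, base_fee)
for year, row in costs.items():
    print(year, row)
# 2019 {'Plan A': 1550, 'Plan B': 1640}
# 2020 {'Plan A': 1850, 'Plan B': 1990}
```

For example, Plan B in 2019 is 200 * 5 + 300 * 2 + 40 = 1640, matching the Deedle output.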
A few changes I made for simplicity:
consumptionsByYear
I mapped the years from integers to strings in order to make the matrices compatible.
I removed the Total column, since it can be derived from the other two.
prices
I broke this into two separate frames: one for the periodic prices and another for the base prices, and then transposed them to enable matrix multiplication.
I changed Day price to Day and Night price to Night to make the matrices compatible.
I got rid of the Fixed price column, since it can be represented in the Day and Night columns.
Update: As of Deedle 2.4.2, it is no longer necessary to map the years to strings. I've modified my solution accordingly.
I am working with multivariate data, linking Leaflet and d3scatter plots. It works well for one variable. If I try to include a second variable in Leaflet with a second addCircleMarkers call and addLayersControl, the SharedData links break: neither the filtering nor the brushing works. Thanks in advance.
An MWE follows:
library("crosstalk")
library("d3scatter")
library("leaflet")
Long <- c(117.4,117.5,117.6)
Lat<- c(-33.7,-33.8,-33.9)
var1 <- c(21,22,23)
var2 <- c(31,32,33)
species <- c(8,9,10)
df1<- data.frame(Long, Lat, var1, var2, species)
sdf1 <- SharedData$new(df1)
col_1 <- c( "yellow" ,"black" ,"orange")
col_2 <- c("red" ,"green" ,"blue")
l <- leaflet(sdf1)%>%
setView(117.5, -33.8, 10) %>%
addCircleMarkers(radius = 1, color = col_1, group = "1") %>%
# addCircleMarkers(radius = 1, color = col_2, group = "2") %>%
# PROBLEM - adding the second "addCircleMarkers" enables the overlayGroups but
# it breaks the link between the plots and breaks the filter
addLayersControl(overlayGroups=c("1","2"))
m <- list(l, filter_checkbox("unique_id_for_species", "Animal Species", sdf1, ~species))
n <- list(d3scatter(sdf1, ~var2, ~var1, color = ~species, x_lim = c(30,40), y_lim = c(20,25), width="70%", height=200),
d3scatter(sdf1, ~var1, ~var2, color = ~species, y_lim = c(30,40), x_lim = c(20,25), width="70%", height=200))
bscols(m, n)
Consider the following toy data:
input strL Country Population Median_Age Sex_Ratio GDP Trade year
"United States of America" 3999 55 1.01 5000 13.1 2012
"United States of America" 6789 43 1.03 7689 7.6 2013
"United States of America" 9654 39 1.00 7689 4.04 2014
"Afghanistan" 544 24 0.76 457 -0.73 2012
"Afghanistan" 720 19 0.90 465 -0.76 2013
"Afghanistan" 941 17 0.92 498 -0.81 2014
"China" 7546 44 1.01 2000 10.2 2012
"China" 10000 40 0.96 3400 14.3 2013
"China" 12000 38 0.90 5900 16.1 2014
"Canada" 7546 44 1.01 2000 1.2 2012
"Canada" 10000 40 0.96 3400 3.1 2013
"Canada" 12000 38 0.90 5900 8.5 2014
end
I run different regressions (using three different independent variables):
*reg1
local var "GDP Trade"
foreach ii of local var{
qui reg `ii' Population i.year
est table, b p
outreg2 Population using table, drop(i.year*) bdec(3) sdec(3) nocons tex(nopretty) append
}
*reg2
local var "GDP Trade"
foreach ii of local var{
qui reg `ii' Median_Age i.year
est table, b p
outreg2 Population using table2, drop(i.year*) bdec(3) sdec(3) nocons tex(nopretty) append
}
*reg3
local var "GDP Trade"
foreach ii of local var{
qui reg `ii' Sex_Ratio i.year
est table, b p
outreg2 Population using table3, drop(i.year*) bdec(3) sdec(3) nocons tex(nopretty) append
}
I use the append option to append different dependent variables that are to be regressed on the same set of independent variables. Hence, I obtain three different tables.
I wish to "merge" these tables when I compile in LaTeX, so that they appear as a single table, with three different panels, one below the other.
Table1
Table2
Table3
I can use the tex(frag) option of the community-contributed command outreg2, but that will not give me the desired outcome.
Here is a simple way of doing this, using the community-contributed command esttab:
clear
input strL Country Population Median_Age Sex_Ratio GDP Trade year
"United States of America" 3999 55 1.01 5000 13.1 2012
"United States of America" 6789 43 1.03 7689 7.6 2013
"United States of America" 9654 39 1.00 7689 4.04 2014
"Afghanistan" 544 24 0.76 457 -0.73 2012
"Afghanistan" 720 19 0.90 465 -0.76 2013
"Afghanistan" 941 17 0.92 498 -0.81 2014
"China" 7546 44 1.01 2000 10.2 2012
"China" 10000 40 0.96 3400 14.3 2013
"China" 12000 38 0.90 5900 16.1 2014
"Canada" 7546 44 1.01 2000 1.2 2012
"Canada" 10000 40 0.96 3400 3.1 2013
"Canada" 12000 38 0.90 5900 8.5 2014
end
local var "GDP Trade"
foreach ii of local var{
regress `ii' Population i.year
matrix I = e(b)
matrix A = nullmat(A) \ I[1,1]
local namesA `namesA' Population_`ii'
}
matrix rownames A = `namesA'
local var "GDP Trade"
foreach ii of local var{
regress `ii' Median_Age i.year
matrix I = e(b)
matrix B = nullmat(B) \ I[1,1]
local namesB `namesB' Median_Age_`ii'
}
matrix rownames B = `namesB'
local var "GDP Trade"
foreach ii of local var{
regress `ii' Sex_Ratio i.year
matrix I = e(b)
matrix C = nullmat(C) \ I[1,1]
local namesC `namesC' Sex_Ratio_`ii'
}
matrix rownames C = `namesC'
matrix D = A \ B \ C
Results:
esttab matrix(D), refcat(Population_GDP "Panel 1" ///
Median_Age_GDP "Panel 2" ///
Sex_Ratio_GDP "Panel 3", nolabel) ///
gaps noobs nomtitles ///
varwidth(20) ///
title(Table 1. Results)
Table 1. Results
---------------------------------
c1
---------------------------------
Panel 1
Population_GDP .3741343
Population_Trade .0009904
Panel 2
Median_Age_GDP 202.1038
Median_Age_Trade .429315
Panel 3
Sex_Ratio_GDP 18165.85
Sex_Ratio_Trade 27.965
---------------------------------
Using the tex option:
\begin{table}[htbp]\centering
\caption{Table 1. Results}
\begin{tabular}{l*{1}{c}}
\hline\hline
& c1\\
\hline
Panel 1 & \\
[1em]
Population\_GDP & .3741343\\
[1em]
Population\_Trade & .0009904\\
[1em]
Panel 2 & \\
[1em]
Median\_Age\_GDP & 202.1038\\
[1em]
Median\_Age\_Trade & .429315\\
[1em]
Panel 3 & \\
[1em]
Sex\_Ratio\_GDP & 18165.85\\
[1em]
Sex\_Ratio\_Trade & 27.965\\
\hline\hline
\end{tabular}
\end{table}
EDIT:
This preserves the original format:
local var "GDP Trade"
foreach ii of local var{
regress `ii' Population i.year
matrix I = e(b)
matrix A = (nullmat(A) , I[1,1])
local namesA `namesA' `ii'
}
matrix rownames A = Population
matrix colnames A = `namesA'
local var "GDP Trade"
foreach ii of local var{
regress `ii' Median_Age i.year
matrix I = e(b)
matrix B = nullmat(B) , I[1,1]
local namesB `namesB' `ii'
}
matrix rownames B = "Median Age"
matrix colnames B = `namesB'
local var "GDP Trade"
foreach ii of local var{
regress `ii' Sex_Ratio i.year
matrix I = e(b)
matrix C = nullmat(C) , I[1,1]
local namesC `namesC' `ii'
}
matrix rownames C = "Sex Ratio"
matrix colnames C = `namesC'
matrix D = A \ B \ C
Table 1. Results
--------------------------------------
GDP Trade
--------------------------------------
Population .3741343 .0009904
Median Age 202.1038 .429315
Sex Ratio 18165.85 27.965
--------------------------------------
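The difference between the two versions is only in how the matrices are assembled: the EDIT builds each of A, B and C as a row (one column per dependent variable) and then stacks the rows. A minimal Python sketch of that layout, using the coefficients reported above and hypothetical names (build_row, ESTIMATES), might look like this:

```python
DEPVARS = ["GDP", "Trade"]
REGRESSORS = ["Population", "Median Age", "Sex Ratio"]

# Coefficients copied from the table above,
# keyed by (regressor, dependent variable).
ESTIMATES = {
    ("Population", "GDP"): 0.3741343, ("Population", "Trade"): 0.0009904,
    ("Median Age", "GDP"): 202.1038, ("Median Age", "Trade"): 0.429315,
    ("Sex Ratio", "GDP"): 18165.85, ("Sex Ratio", "Trade"): 27.965,
}


def build_row(regressor):
    """Append one estimate per dependent variable (column-wise join)."""
    row = []
    for dep in DEPVARS:
        row.append(ESTIMATES[(regressor, dep)])
    return row


# Stack the three row vectors on top of each other (row-wise join).
D = [build_row(regressor) for regressor in REGRESSORS]

print(f"{'':<12}" + "".join(f"{dep:>12}" for dep in DEPVARS))
for name, row in zip(REGRESSORS, D):
    print(f"{name:<12}" + "".join(f"{value:>12}" for value in row))
```

Here the inner loop plays the role of the comma join inside each foreach block, and the list of rows plays the role of the final stacking step.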