Table printing a list of lists in Common Lisp

I wish to print this data in a table with the columns aligned. I tried with format but the columns were not aligned. Does anyone know how to do it? Thank you.
(("tiscali" 10000 2.31 0.84 -14700.0 "none")
("atlantia" 50 22.65 22.68 1.5 "none")
("bper-banca" 1000 1.59 2.01 423.0 "none")
("alerion-cleanpower" 30 44.14 36.45 -230.7 "none")
("tesmec" 10000 0.12 0.14 150.0 "none")
("cover-50" 120 8.95 9.6 78.0 "none")
("ovs" 1000 1.71 1.93 217.0 "none")
("credito-emiliano" 200 5.7 6.26 112.0 "none"))
I tried to align the columns with the ~T directive, without success. Is there a piece of code that prints table data nicely?

Let's break this down.
First, let's give your data a nice name:
(defparameter *data*
'(("tiscali" 10000 2.31 0.84 -14700.0 "none")
("atlantia" 50 22.65 22.68 1.5 "none")
("bper-banca" 1000 1.59 2.01 423.0 "none")
("alerion-cleanpower" 30 44.14 36.45 -230.7 "none")
("tesmec" 10000 0.12 0.14 150.0 "none")
("cover-50" 120 8.95 9.6 78.0 "none")
("ovs" 1000 1.71 1.93 217.0 "none")
("credito-emiliano" 200 5.7 6.26 112.0 "none")))
Now, come up with a way to print each line using format and destructuring-bind. Widths of various fields are hard-coded in.
(defun print-line (line)
(destructuring-bind (a b c d e f) line
(format T "~20a ~5d ~6,2f ~6,2f ~10,2f ~4a~%" a b c d e f)))
Once you know you can print a line, you just need to do that for each line.
(mapcar 'print-line *data*)
Result:
tiscali              10000   2.31   0.84  -14700.00 none
atlantia                50  22.65  22.68       1.50 none
bper-banca            1000   1.59   2.01     423.00 none
alerion-cleanpower      30  44.14  36.45    -230.70 none
tesmec               10000   0.12   0.14     150.00 none
cover-50               120   8.95   9.60      78.00 none
ovs                   1000   1.71   1.93     217.00 none
credito-emiliano       200   5.70   6.26     112.00 none
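For readers more at home outside Lisp, the same fixed-width idea can be sketched in Python. This is only an illustration, not part of the answer above: the function name format_line is hypothetical, the widths are copied from the Lisp format string, and only three of the rows are shown.

```python
# Hypothetical Python counterpart of print-line.
# The widths mirror "~20a ~5d ~6,2f ~6,2f ~10,2f ~4a" from the Lisp version.
DATA = [
    ("tiscali", 10000, 2.31, 0.84, -14700.0, "none"),
    ("alerion-cleanpower", 30, 44.14, 36.45, -230.7, "none"),
    ("cover-50", 120, 8.95, 9.6, 78.0, "none"),
]


def format_line(row):
    """Format one row with hard-coded column widths."""
    a, b, c, d, e, f = row
    # Strings are left-justified, numbers right-justified within their width.
    return f"{a:<20} {b:>5d} {c:>6.2f} {d:>6.2f} {e:>10.2f} {f:<4}"


for row in DATA:
    print(format_line(row))
```

As in the Lisp version, a value wider than its field simply pushes the rest of the line to the right, so the widths have to be chosen with the data in mind.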

I have something like this in my personal code, that I reproduced here in a simplified way:
(defpackage :tabular (:use :cl))
(in-package :tabular)
I have a function that turns any object into a list of values (a row). Here each object is already a list of values, so it is already in the correct shape.
(defgeneric columnize (object)
(:documentation "Representation of object as a list of fields")
(:method ((o list)) o))
I also define a transpose function. It stops at the shortest list, so it also works with lists of different lengths:
(defun transpose (lists)
(when (notany #'null lists)
(cons
(mapcar #'first lists)
(transpose (mapcar #'cdr lists)))))
Here is your data, as defined by Chris:
(defparameter *data*
'(("tiscali" 10000 2.31 0.84 -14700.0 "none")
("atlantia" 50 22.65 22.68 1.5 "none")
("bper-banca" 1000 1.59 2.01 423.0 "none")
("alerion-cleanpower" 30 44.14 36.45 -230.7 "none")
("tesmec" 10000 0.12 0.14 150.0 "none")
("cover-50" 120 8.95 9.6 78.0 "none")
("ovs" 1000 1.71 1.93 217.0 "none")
("credito-emiliano" 200 5.7 6.26 112.0 "none")))
And finally, a function that prints a list of objects in a tabular way.
Basically, I convert every object to a list of values, convert those values to strings, and compute their lengths. This gives a matrix of lengths, which I transpose to get one list of lengths per column; the maximum of each list is the width of that column.
In practice, I also let the generic function return indicators such as how to justify each field (left/right), etc.
(defun tabulate (stream objects)
(loop
for n from 0
for o in objects
for row = (mapcar #'princ-to-string (columnize o))
collect row into rows
collect (mapcar #'length row) into row-widths
finally
(flet ((build-format-arguments (max-width row)
(when (> max-width 0)
(list max-width #\space row))))
(loop
with number-width = (ceiling (log n 10))
with col-widths = (transpose row-widths)
with max-col-widths = (mapcar (lambda (s) (reduce #'max s)) col-widths)
for index from 0
for row in rows
for entries = (mapcan #'build-format-arguments max-col-widths row)
do (format stream
"~v,'0d. ~{~v,,,va~^ ~}~%"
number-width index entries)))))
For example:
(fresh-line)
(tabulate *standard-output* *data*)
Gives:
0. tiscali            10000 2.31  0.84  -14700.0 none
1. atlantia           50    22.65 22.68 1.5      none
2. bper-banca         1000  1.59  2.01  423.0    none
3. alerion-cleanpower 30    44.14 36.45 -230.7   none
4. tesmec             10000 0.12  0.14  150.0    none
5. cover-50           120   8.95  9.6   78.0     none
6. ovs                1000  1.71  1.93  217.0    none
7. credito-emiliano   200   5.7   6.26  112.0    none
As you can see, some adjustments could still be made, such as formatting floating-point values so that they align on the decimal point, but this is already quite useful.
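The width computation described above (stringify every cell, transpose, take the maximum per column) can also be sketched in Python. This is a rough equivalent under stated assumptions rather than a port: the name tabulate is reused only for illustration, every column is left-justified, and the index width is taken from the number of digits of the largest index, which is not exactly how the Lisp code derives it.

```python
DATA = [
    ("tiscali", 10000, 2.31, 0.84, -14700.0, "none"),
    ("atlantia", 50, 22.65, 22.68, 1.5, "none"),
    ("bper-banca", 1000, 1.59, 2.01, 423.0, "none"),
    ("alerion-cleanpower", 30, 44.14, 36.45, -230.7, "none"),
    ("tesmec", 10000, 0.12, 0.14, 150.0, "none"),
    ("cover-50", 120, 8.95, 9.6, 78.0, "none"),
    ("ovs", 1000, 1.71, 1.93, 217.0, "none"),
    ("credito-emiliano", 200, 5.7, 6.26, 112.0, "none"),
]


def tabulate(rows):
    """Return the table as a list of lines, one per row, prefixed by an index."""
    cells = [[str(value) for value in row] for row in rows]
    # zip(*cells) transposes rows into columns; like the Lisp transpose,
    # it stops at the shortest row.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    # Assumption: pad the index to the digit count of the largest index.
    index_width = len(str(max(len(cells) - 1, 0)))
    lines = []
    for index, row in enumerate(cells):
        body = " ".join(cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(f"{index:0{index_width}d}. {body}")
    return lines


print("\n".join(tabulate(DATA)))
```

Returning the lines instead of printing them directly keeps the function easy to check; the caller decides which stream they go to.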

Related

Arrange downloaded data into more useful way in google sheets

We currently have fixed report data that we can only manipulate after download. Simplified, it looks like this:
raw report data extracted to google sheets
a b c
1 Start Date Time Adhering to Schedule (Hours) Time Not Adhering to Schedule (Hours)
2 Employee: A Supervisor: X
3 5/4/2022 7.65 1.35
4 5/5/2022 8.12 0.88
5 5/6/2022 6.95 2.05
6 5/9/2022 8.7 0.3
7 5/10/2022 7.45 1.55
8 5/11/2022 8.63 0.37
9 5/12/2022 8.08 0.92
10 5/13/2022 6.13 0.13
11 Totals: 61.71 7.55
12 Employee: B Supervisor: X
13 5/1/2022 3.8 0.27
14 5/2/2022 6.72 2.28
15 5/3/2022 6.1 2.9
16 5/4/2022 8.43 0.57
17 5/5/2022 5.85 0.53
18 5/10/2022 6.13 2.87
19 5/11/2022 0 1.5
20 5/12/2022 2 1.5
21 5/13/2022 1.75 1.75
22 Totals: 40.78 14.17
I would like some help in constructing a new sheet via formulas so that it rearranges the raw data as follows:
desired output
a b c d e
1 EMPLOYEE SUPERVISOR Start Date Time Adhering to Schedule (Hours) Time Not Adhering to Schedule (Hours)
2 A X 04/05/22 7.65 1.35
3 A X 05/05/22 8.12 0.88
4 A X 06/05/22 6.95 2.05
5 A X 09/05/22 8.70 0.30
6 A X 10/05/22 7.45 1.55
7 A X 11/05/22 8.63 0.37
8 A X 12/05/22 8.08 0.92
9 A X 13/05/22 6.13 0.13
10 B X 01/05/22 3.80 0.27
11 B X 02/05/22 6.72 2.28
12 B X 03/05/22 6.10 2.90
13 B X 04/05/22 8.43 0.57
14 B X 05/05/22 5.85 0.53
15 B X 10/05/22 6.13 2.87
16 B X 11/05/22 0.00 1.50
17 B X 12/05/22 2.00 1.50
18 B X 13/05/22 1.75 1.75
It probably needs some combination of QUERY(), ARRAYFORMULA(), TRANSPOSE() and/or INDEX(), but I can't quite figure it out. I need some help getting started on the right track. The dates and the data for each employee are dynamic, so the formula producing the desired result needs to adjust to that as well.
thanks!
Edit: adding a sample sheet for reference: https://docs.google.com/spreadsheets/d/1m_FCGcnXvnEiMZ8X4K1eEsMljORWV4V1Yq_81vFnx4Y/edit?usp=sharing
Global solution
in E1
={ArrayFormula(if(A1:A="Totals:",,{
substitute(lookup(row(A1:A),row(A1:A)/if(ISNUMBER(A1:A),0,1),A1:A),"Employee: ",""),
substitute(lookup(row(A1:A),row(A1:A)/if(ISNUMBER(A1:A),0,1),C1:C),"Supervisor: ","")
})),Arrayformula(if(ISNUMBER(A1:A),{A1:A,B1:B,C1:C},))}
In 3 steps (3 arrayformulas),
try in H1
=arrayformula(if(left(A1:A,6)="Totals",,if(left(A1:A,8)="Employee",{B1:B,D1:D,E1:E,E1:E,E1:E},{E1:E,E1:E,A1:A,B1:B,C1:C})))
then, back in F1 to complete all rows with employee and supervisor
=ArrayFormula({lookup(row(H:H),row(H:H)/if(H:H<>"",1,0),H:H),lookup(row(I:I),row(I:I)/if(I:I<>"",1,0),I:I)})
finally, if you want to reduce the presentation, in M1
=query(F:L,"select F,G,J,K,L where J is not null",0)
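The logic these formulas implement (carry the most recent employee and supervisor down to every dated row, and drop the header and totals rows) may be easier to see written out procedurally. Below is a minimal Python sketch, not a replacement for the formulas. It assumes the layout implied by the first formula, with "Employee: A" in column A and "Supervisor: X" in column C; the sample is shortened, its totals are recomputed for the shortened data, and the name flatten is hypothetical.

```python
# Shortened sample of the raw report. Assumed layout: the employee label sits
# in column A and the supervisor label in column C.
RAW = [
    ["Start Date", "Time Adhering to Schedule (Hours)", "Time Not Adhering to Schedule (Hours)"],
    ["Employee: A", "", "Supervisor: X"],
    ["5/4/2022", 7.65, 1.35],
    ["5/5/2022", 8.12, 0.88],
    ["Totals:", 15.77, 2.23],
    ["Employee: B", "", "Supervisor: X"],
    ["5/1/2022", 3.8, 0.27],
    ["Totals:", 3.8, 0.27],
]


def flatten(rows):
    """Attach the current employee and supervisor to every dated row."""
    employee = supervisor = None
    out = []
    for a, b, c in rows[1:]:  # skip the header row
        if a.startswith("Employee:"):
            # Remember the labels; they apply to the rows that follow.
            employee = a.split(":", 1)[1].strip()
            supervisor = c.split(":", 1)[1].strip()
        elif a.startswith("Totals"):
            continue  # totals rows are dropped from the output
        else:
            out.append([employee, supervisor, a, b, c])
    return out


for row in flatten(RAW):
    print(row)
```

In the sheet, the lookup(row(...), row(...)/if(...), ...) construction plays the role of the two remembered variables: it finds the last label row at or above the current row.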

How to manipulate multiple nested arrays in Dyalog APL?

I have been given matrices filled with alphanumerical values excluding lower case letters like so:
XX11X1X
XX88X8X
Y000YYY
ZZZZ789
ABABABC
and have been tasked with counting the repetitions in each row and then tallying up a score depending on the ranking of the character being repeated. I used {⍺ (≢⍵)}⌸¨ ↓ m to help me. For the example above I would get something like this:
X 4 X 4 Y 4 Z 4 A 3
1 3 8 3 0 3 7 1 B 3
8 1 C 1
9 1
This is great, but now I need to write a function that multiplies the numbers with each letter. I can access the first matrix with ⊃ but then I am completely lost on how to access the other ones. I can simply write ⊃w[2] and ⊃w[3] and so forth, but I need a way to change every matrix at the same time in one function. For this example, the ranking array is as follows: ZYXWVUTSRQPONMLKJIHGFEDCBA9876543210, so for the first array XX11X1X
which corresponds to:
X 4
1 3
X is 3rd in the array, so it corresponds to a 3, and 1 is 35th, so it's a 35. The final scoring would be something like (3×10⁴)+(35×10³). My biggest problem is not necessarily the scoring part but being able to access each matrix individually in one function. So for this nested array:
X 4 X 4 Y 4 Z 4 A 3
1 3 8 3 0 3 7 1 B 3
8 1 C 1
9 1
if I do arr[1] it gives me the scalar
X 4
1 3
and ⍴ arr[1] gives me nothing, confirming it, so I can do ⊃arr[1] to get the matrix itself and have access to each column individually. This is where I'm stuck. I'm trying to write a function that does the math for each matrix and then saves those results to an array. I can easily do the math for the first matrix but I can't do it for all of them. I might have made a mistake by using {⍺ (≢⍵)}⌸¨ ↓ m to get those matrices. Thanks.
Using your example arrangement:
⎕ ← arranged ← ⌽ ⎕D , ⎕A
ZYXWVUTSRQPONMLKJIHGFEDCBA9876543210
So now, we can get the index values:
1 ⌷ m
XX11X1X
∪ 1 ⌷ m
X1
arranged ⍳ ∪ 1 ⌷ m
3 35
While you could compute the intermediary step first, it is much simpler to include most of the final formula in Key's operand:
{ ( arranged ⍳ ⍺ ) × 10 * ≢⍵ }⌸¨ ↓m
┌───────────┬───────────┬───────────┬─────────────────┬───────────────┐
│30000 35000│30000 28000│20000 36000│10000 290 280 270│26000 25000 240│
└───────────┴───────────┴───────────┴─────────────────┴───────────────┘
Now we just need to sum each:
+/¨ { ( arranged ⍳ ⍺ ) × 10 * ≢⍵ }⌸¨ ↓m
65000 58000 56000 10840 51240
In fact, we can combine the summation with the application of Key to avoid a double loop:
{ +/ { ( arranged ⍳ ⍺ ) × 10 * ≢⍵ }⌸ ⍵}¨ ↓m
65000 58000 56000 10840 51240
For completeness, here is a way to use the intermediary result. Let's start by working on just the first matrix (you can get the second one with 2⊃ instead of ⊃ ― for details, see Problems when trying to use arrays in APL. What have I missed?):
⊃{⍺ (≢⍵)}⌸¨ ↓m
X 4
1 3
We can insert a function between the left column elements and the right column elements with reduction:
{⍺ 'foo' ⍵}/ ⊃{⍺ (≢⍵)}⌸¨ ↓m
┌─────────┬─────────┐
│┌─┬───┬─┐│┌─┬───┬─┐│
││X│foo│4│││1│foo│3││
│└─┴───┴─┘│└─┴───┴─┘│
└─────────┴─────────┘
So now we simply have to modify the placeholder function with one that looks up the left argument in the arranged items, and multiplies by ten to the power of the right argument:
{ ( arranged ⍳ ⍺ ) × 10 * ⍵ }/ ⊃{⍺ (≢⍵)}⌸¨ ↓m
30000 35000
Instead of applying this to only the first matrix, we apply it to each matrix:
{ ( arranged ⍳ ⍺ ) × 10 * ⍵ }/¨ {⍺ (≢⍵)}⌸¨ ↓m
┌───────────┬───────────┬───────────┬─────────────────┬───────────────┐
│30000 35000│30000 28000│20000 36000│10000 290 280 270│26000 25000 240│
└───────────┴───────────┴───────────┴─────────────────┴───────────────┘
Now we just need to sum each:
+/¨ { ( arranged ⍳ ⍺ ) × 10 * ⍵ }/¨ {⍺ (≢⍵)}⌸¨ ↓m
65000 58000 56000 10840 51240
However, this is a much more circuitous approach, and is only provided here for reference.
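As a cross-check of the arithmetic, here is the same scoring written as a short Python sketch. It is not APL and not part of the answer; the names ARRANGED and score are hypothetical. Counter stands in for Key: it groups equal characters and counts them, keeping first-appearance order.

```python
from collections import Counter
import string

# Ranking string: Z is rank 1, A is rank 26, then 9 down to 0 (rank 36).
ARRANGED = (string.digits + string.ascii_uppercase)[::-1]

M = ["XX11X1X", "XX88X8X", "Y000YYY", "ZZZZ789", "ABABABC"]


def score(row):
    """Sum rank × 10^count over the distinct characters of one row."""
    counts = Counter(row)
    # index() is 0-based, so add 1 to match APL's 1-based ⍳.
    return sum((ARRANGED.index(ch) + 1) * 10 ** n for ch, n in counts.items())


print([score(row) for row in M])  # [65000, 58000, 56000, 10840, 51240]
```

The list comprehension over M corresponds to applying the function to each row with ¨, which is the step the question was stuck on.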

Sampling the data in InfluxDB on the query level

I have a really big dataset for which I'd like to fetch a diagnostic sample. Till now I've been fetching all the data and sampling on my own machine, but currently it causes both influxdb and my app to run out of memory.
Is there a way to maintain the entire dataset on the DB level and downsample in a query?
Let's say that I'm interested in 1% of the entire measurement data. What would that query look like?
e.g. I want to get 1% of all values of a measurement.
Example case:
Measurement X
time val1 val2
---- ---- ----
0 A1 0.5
1 A2 0.7
2 A1 1.0
3 A3 1.5
4 A4 0.7
5 A3 0.5
6 A7 1.0
7 A1 0.5
8 A10 0.7
9 A2 0.1
Magic Query - 10%
time val1 val2
---- ---- ----
5 A3 0.5
Magic Query - 20%
time val1 val2
---- ---- ----
9 A2 0.1
4 A4 0.7

Google Sheets Countif with Arrayformula

I'm doing some dynamic Monte Carlo simulation in Google Sheets, using the COUNTIF formula for the simulation. Something is not working the way I thought it would, but I cannot put my finger on it. I have two columns that I'm comparing, and I need to count the instances where the value in one column is bigger than the value in the other column. If I do this explicitly by propagating the if comparison formula, I obtain the correct result. However, if I do it with
=countif( A4:A, ">" & B4:B )
I do not obtain the correct result. My example is at this sheet, the number in cell C4 is the malfunctioning COUNTIF, which equals 2 in the example, and the number in cell E4 is 5, which is the correct count by propagating the comparison in column F and adding the correct comparisons in E4.
p1 p2 n
0.5 0.51 10
Monte Carlo
0.50 0.60 2 5 0
0.90 0.50 1
0.60 0.30 1
0.50 0.60 0
0.40 0.30 1
0.40 0.50 0
0.60 0.70 0
0.60 0.30 1
0.70 0.50 1
0.10 0.30 0
There are two scenarios with countif:
(1) As a non-array formula, =countif( A4:A, ">" & B4:B ) would give you the same result as =countif( A4:A, ">" & B4 ) i.e. it would count only values of A greater than .60, giving the answer 2.
(2) As an array formula, =sum(countif( A4:A, ">" & B4:B )) would give you a separate result for each value of B (2+5+9+2...) giving the answer 56.
If you wanted to use countif, you would need to do something like this:
=ArrayFormula(countif(A4:A-B4:B,">"&0))
try:
=INDEX(SUM(IF(A4:A>B4:B, 1)))
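The three counts discussed above (2, 56 and 5) can be reproduced outside the sheet. The following Python sketch uses the ten sample rows from the question; countif_greater is a hypothetical helper that only mimics COUNTIF with a ">" criterion, and the variable names are made up for illustration.

```python
# Columns A and B of the sample, rows 4 to 13.
A = [0.5, 0.9, 0.6, 0.5, 0.4, 0.4, 0.6, 0.6, 0.7, 0.1]
B = [0.6, 0.5, 0.3, 0.6, 0.3, 0.5, 0.7, 0.3, 0.5, 0.3]


def countif_greater(values, threshold):
    """Count the values strictly greater than one threshold."""
    return sum(1 for v in values if v > threshold)


# Scenario (1): only the first criterion (B4) is used.
first_only = countif_greater(A, B[0])

# Scenario (2): one COUNTIF per value of B, all added up.
all_pairs = sum(countif_greater(A, b) for b in B)

# What was actually wanted: compare each A with the B in the same row.
row_wise = sum(1 for a, b in zip(A, B) if a > b)

print(first_only, all_pairs, row_wise)  # 2 56 5
```

The difference is that COUNTIF compares the whole range against one criterion at a time, while the intended count pairs the values up row by row.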

GLMM glmer and glmmADMB - comparison error

I am trying to compare if there are differences in the number of obtained seeds in five different populations with different applied treatments, and having maternal plant and paternal plant as random effects. First I tried to fit a glmer model.
dat <-dat [,c(12,7,6,13,8,11)]
dat$parents<-factor(paste(dat$mother,dat$father,sep="_"))
compareTreat <- function(d)
{
d$treatment <-factor(d$treatment)
print (tapply(d$pop,list(d$pop,d$treatment),length))
print(summary(fit<-glmer(seed_no~treatment+(1|pop/mother)+
(1|pop/father),data=d,family="poisson")))
}
Then, I compared two treatments in two populations (pop 64 and pop 121, in that case). The other populations do not have these particular treatments, so I get NA values for those.
compareTreat(subset(dat,treatment%in%c("IE 5x","IE 7x")&pop%in%c(64,121)))
This is the output:
IE 5x IE 7x
10 NA NA
45 NA NA
64 31 27
121 33 28
144 NA NA
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: poisson ( log )
Formula: seed_no ~ treatment + (1 | pop/mother) + (1 | pop/father)
Data: d
AIC BIC logLik deviance df.resid
592.5 609.2 -290.2 580.5 113
Scaled residuals:
Min 1Q Median 3Q Max
-1.8950 -0.8038 -0.2178 0.4440 1.7991
Random effects:
Groups Name Variance Std.Dev.
father.pop (Intercept) 3.566e-01 5.971e-01
mother.pop (Intercept) 9.456e-01 9.724e-01
pop (Intercept) 1.083e-10 1.041e-05
pop.1 (Intercept) 1.017e-10 1.008e-05
Number of obs: 119, groups: father:pop, 81; mother:pop, 24; pop, 2
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.74664 0.24916 2.997 0.00273 **
treatmentIE 7x -0.05789 0.17894 -0.324 0.74629
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
tretmntIE7x -0.364
It seems there are no differences between treatments. But as there are many zeros in the data, a zero-inflated model would be worth trying. I tried with glmmADMB, and I wrote the script like this:
compareTreat<-function(d)
{
d$treatment<-factor(d$treatment)
print(tapply(d$pop,list(d$pop,d$treatment), length))
print(summary(fit_zip<-glmmadmb(seed_no~treatment + (1|pop/mother)+
(1|pop/father),data=d,family="poisson", zeroInflation=TRUE)))
}
Then I compared again the treatments. Here I have not changed the code.
compareTreat(subset(dat,treatment%in%c("IE 5x","IE 7x")&pop%in%c(64,121)))
But in that case, the output is
IE 5x IE 7x
10 NA NA
45 NA NA
64 31 27
121 33 28
144 NA NA
Error in pop:father : NA/NaN argument
In addition: Warning messages:
1: In pop:father :
numerical expression has 119 elements: only the first used
2: In pop:father :
numerical expression has 119 elements: only the first used
3: In eval(parse(text = x), data) : NAs introduced by coercion
Called from: eval(parse(text = x), data)
I tried to change everything I came up with, but I still don't know where the problem is.
If I remove the (1|pop/father) term from the glmmadmb script, the model runs, but that does not feel correct. I wonder whether the mistake is in the loop prior to the glmmadmb call, although it worked OK in the glmer model, or in the comparison itself after the model. I also tried removing NAs with na.omit in case that was an issue, but it did not make a difference. Why does the script stop instead of continuing to run?
I am a student beginner with RStudio, my version is 3.4.2, called Short Summer. If someone with experience could point me in the right direction I would be very grateful!
H.

Resources