Dynamic QUERY range - google-sheets

I have a spreadsheet, and in one of the tabs I have a table with data computed from other tabs. It is a small table with 11 columns. Row 1 is the header row, column A is the list of items, and columns B to J are the types. The data consists of numbers only.
Because the data is computed, from time to time some of the columns B to J can be entirely zero. I want to create a subset of this table with QUERY, constructing a dynamic range that includes only the columns that have at least one value greater than zero.
I'm aware that a range can be created as an array like {A:A\B:B\D:D}, but in my case I don't know in advance which columns will have values greater than zero, and I don't want to include columns that are entirely zero.
I have created an expression that concatenates this array literal as text in a cell; however, I can't use it in the QUERY formula, with either the INDEX or the TEXT function. The table looks like this:
Items TypeA TypeB TypeC TypeD
Bronze 0 0 0 0
Silver 0 0 1 0
Gold 0 0 1 0
Titanium 1 0 0 0
For this snapshot of the table, I want the QUERY range to be {A:A\B:B\D:D}. However, since the data is computed, the table can look like this after two hours or the next day:
Items TypeA TypeB TypeC TypeD
Bronze 1 0 0 1
Silver 0 0 1 0
Gold 0 1 1 0
Titanium 1 0 0 0
So, for this snapshot of the table, I want the QUERY range to be {A:A\B:B\C:C\D:D\E:E}.
Is this doable? And how can I construct a dynamic QUERY range?
Thanks, everyone.

You can remove columns from a range based on a criterion using the FILTER function.
Unfiltered
Items TypeA TypeB TypeC TypeD TypeE TypeF TypeG
Bronze 1 0 0 1 0 0 1
Silver 1 1 0 1 0 0 1
Gold 1 0 0 1 0 0 1
Titan 1 0 0 1 1 0 1
1 4 1 0 4 1 0 4
Filtered to remove columns with total of 0
Items TypeA TypeB TypeD TypeE TypeG
Bronze 1 0 1 0 1
Silver 1 1 1 0 1
Gold 1 0 1 0 1
Titan 1 0 1 1 1
The 'trick' is to sum each column's data (row 6 in the example above) and then test for > 0.
The filter expression is:
=FILTER(A1:H5,A6:H6 >0)
By way of explanation:
A1:H5 is the range to be filtered;
A6:H6 >0 selects all columns that have a value > 0 in row 6 (the row of column sums).
I placed a 1 in A6 to make sure column A is included.
You can now do queries on the range returned by the above expression.
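The same column-selection logic can be sketched outside the spreadsheet. The following is only an illustration in Python, not Google Sheets syntax; the function name `filter_nonzero_columns` is hypothetical, and the sketch assumes the first row is a header and the first column holds labels that are always kept (the role played by the 1 in A6 above).

```python
def filter_nonzero_columns(table):
    """Keep column 0 (labels) plus every column whose data rows sum to > 0."""
    header, *rows = table
    keep = [0] + [
        j for j in range(1, len(header))
        if sum(row[j] for row in rows) > 0
    ]
    # Rebuild every row (header included) from the kept column indexes.
    return [[row[j] for j in keep] for row in table]


table = [
    ["Items", "TypeA", "TypeB", "TypeC", "TypeD", "TypeE", "TypeF", "TypeG"],
    ["Bronze", 1, 0, 0, 1, 0, 0, 1],
    ["Silver", 1, 1, 0, 1, 0, 0, 1],
    ["Gold", 1, 0, 0, 1, 0, 0, 1],
    ["Titan", 1, 0, 0, 1, 1, 0, 1],
]

filtered = filter_nonzero_columns(table)
for row in filtered:
    print(row)
```

With the unfiltered table from this answer, the TypeC and TypeF columns are dropped, matching the filtered table shown above.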


Predict next integer in sequence using ML.NET

Given a lengthy sequence of integers, each either 0 or 1, I would like to be able to predict the next likely integer.
Example dataset:
1 1 1 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 1 0 0 1 0 1 1 0 1 0 1 0 1 0 1 0 0 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 1 0 1 1 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0
A quick look at the above perhaps shows some obvious patterns which may be recognised by an ML model.
I do have other features available in the dataset but I don't think they correlate to the integer result so the prediction should be based purely on the statistical relevance of the supplied integer dataset.
I'm unsure how to approach this using ML.NET. I have successfully trained classification models previously, but those predictions were all made based on multiple features. In this case, if I just supply a 0 or 1, there's no relevant historical sequence to aid the prediction.
How do I train an ML.NET model to return a prediction based on a range of previous data?
Working theory: the above dataset has 100 integers. I could create a class which has 100 properties (Integer0..Integer99) and painstakingly map each field and submit that but it seems really clunky.
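One way to picture the working theory without hand-writing 100 properties is a sliding window: each training row holds the previous N values as features and the value that follows as the label. The sketch below is plain Python rather than ML.NET code, the name `sliding_windows` and the window size of 3 are assumptions chosen for illustration, and it only shows how the rows could be shaped, not how a model would be trained.

```python
def sliding_windows(sequence, window):
    """Return (features, label) pairs: `window` previous values -> next value."""
    return [
        (sequence[i:i + window], sequence[i + window])
        for i in range(len(sequence) - window)
    ]


# First ten values of the example dataset.
seq = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
pairs = sliding_windows(seq, 3)

for features, label in pairs:
    print(features, "->", label)
```

Each pair could then be mapped onto a fixed-length feature vector, which avoids one named property per position.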

How to count 2 columns with a range

A B C
Val 1 2
Val 2 1
Val 3 1
Item 1 Val 1 1
Item 2 Val 2 1
Item 3 Val 3 0
Item 4 Val 1 0
Consider the above sheet. In the first 3 rows I am counting how many times the corresponding Val # shows up in the sheet. I have done that with: =COUNTIF($B$5:$B, A1). However, I can't figure out how to make it count only if the value matches and column C doesn't have a 1 next to it in the same row. Is this possible?
Try COUNTIFS:
=COUNTIFS(B$5:B, A1, C$5:C, "<>"&1)
Make sure column C is formatted as Number.
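The two-condition count can be read as "column B equals the value AND column C is not 1". A minimal Python sketch of that logic, using the four item rows from the question (the function name `countifs_not_one` is hypothetical, and this is an illustration rather than spreadsheet syntax):

```python
def countifs_not_one(rows, value):
    """Count rows whose B cell equals `value` and whose C cell is not 1."""
    return sum(1 for b, c in rows if b == value and c != 1)


# (column B, column C) pairs for Item 1 to Item 4.
rows = [("Val 1", 1), ("Val 2", 1), ("Val 3", 0), ("Val 1", 0)]

counts = {v: countifs_not_one(rows, v) for v in ("Val 1", "Val 2", "Val 3")}
print(counts)
```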

Transform string variable into 0-1 columns

As a complete beginner in SPSS, I would like to ask for help with transforming table A into table B. I have to recode the values of the "brand" variable into columns and make 0-1 variables.
#table A#
nr brand
1 GREEN CARE PROFESSIONAL
1 GREEN CARE PROFESSIONAL
1 GREEN CARE PROFESSIONAL
2 HENKEL
3 HENKEL
3 HENKEL
3 HENKEL
3 VIZIR
4 BIEDRONKA
4 BOBINI
4 BOBINI
4 BOBINI
4 BOBINI
4 BOBINI
4 HENKEL
5 VIZIR
6 HENKEL
#table B#
nr GREEN HENKEL VIZIR BIEDR BOBINI
1 1 0 0 0 0
1 1 0 0 0 0
1 1 0 0 0 0
2 0 1 0 0 0
3 0 1 0 0 0
3 0 1 0 0 0
3 0 1 0 0 0
3 0 0 1 0 0
4 0 0 0 1 0
4 0 0 0 0 1
4 0 0 0 0 1
4 0 0 0 0 1
4 0 0 0 0 1
4 0 0 0 0 1
4 0 1 0 0 0
5 0 0 1 0 0
6 0 1 0 0 0
I can do it in this particular case in this simple way:
compute HENKEL=0.
...
do if BRAND='GREEN_CARE' .
compute GREEN_CARE=1.
else if ....
but the solution has to work with other variables and a different number of values, etc. I tried to get it working all day and gave up.
Do you have any idea how to do this in an easy way?
Thanks!
The following syntax does the job on the sample data you provided.
First, let's recreate the sample data to demonstrate on:
Data list list/nr (f1) brand (a30).
begin data
1 "GREEN CARE PROFESSIONAL"
1 "GREEN CARE PROFESSIONAL"
1 "GREEN CARE PROFESSIONAL"
2 "HENKEL"
3 "HENKEL"
3 "HENKEL"
3 "HENKEL"
3 "VIZIR"
4 "BIEDRONKA"
4 "BOBINI"
4 "BOBINI"
4 "BOBINI"
4 "BOBINI"
4 "BOBINI"
4 "HENKEL"
5 "VIZIR"
6 "HENKEL"
end data.
dataset name originalDataset.
Now for the restructure.
sort cases by nr brand.
* creating an index to enumerate cases for each combination of `nr` and `brand`.
* This is necessary for the `casestovars` command to work later.
compute ind=1.
if $casenum>1 and lag(nr)=nr and lag(brand)=brand ind=lag(ind)+1.
exe.
* variable names can't have spaces in them, so changing the category names accordingly.
compute brand=replace(rtrim(brand)," ","_").
sort cases by nr ind brand.
compute exist=1.
casestovars /id=nr ind /index= brand/autofix=no.
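For readers who want to check the target layout independently of SPSS, the 0-1 recoding in table B can be sketched in Python. This is an illustration only: `one_hot` is a hypothetical helper, the columns come out in alphabetical order rather than the order shown in table B, and it produces one output row per input case rather than the restructured file that casestovars creates.

```python
def one_hot(values):
    """Return the sorted category names and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# A few brand values taken from table A.
brands = [
    "GREEN CARE PROFESSIONAL",
    "HENKEL",
    "HENKEL",
    "VIZIR",
    "BIEDRONKA",
    "BOBINI",
]

columns, rows = one_hot(brands)
print(columns)
for row in rows:
    print(row)
```

Because the categories are derived from the data, the same function works for another variable with a different number of values.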

column =char(1), and also =char(0)

I have a table that includes a column foo.
show create table shows `foo` bit(1) DEFAULT b'0', so the column should contain binary strings: the 0 and 1 bytes.
select ascii(foo),
ord(foo),
foo=char(1),
foo=char(0),
char(1)=char(0)
from table_name
group by 1,2,3,4,5
yields
ascii(foo) ord(foo) foo=char(1) foo=char(0) char(1)=char(0)
0 0 1 1 0
1 1 0 0 0
I'd expect it to yield
ascii(foo) ord(foo) foo=char(1) foo=char(0) char(1)=char(0)
0 0 0 1 0
1 1 1 0 0
Can someone please explain what's going on?
Nor is this restricted to the select clause. It happens in the where clause also: select distinct ascii(foo) from table_name where foo=char(0) and select distinct ascii(foo) from table_name where foo=char(1) both return only 0.
select @@version
5.7.21-20-57-log

How do I find out the longest run of a number?

This seemed like a trivial question to me, but I cannot get it done correctly. Part of my dataset looks like this
1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0
and contains two “runs” of 1 (not sure if that’s the correct word), one with a length of 3, the other with a length of 5.
How can I use Google Docs or similar spreadsheet applications to find the longest of those runs?
In Excel you can use a single formula to get the maximum number of consecutive 1s, i.e.
=MAX(FREQUENCY(IF(A2:A100=1,ROW(A2:A100)),IF(A2:A100<>1,ROW(A2:A100))))
confirmed with CTRL+SHIFT+ENTER
In Google Sheets you can use the same formula but wrap in arrayformula rather than use CSE, i.e.
=arrayformula(MAX(FREQUENCY(IF(A2:A100=1,ROW(A2:A100)),IF(A2:A100<>1,ROW(A2:A100)))))
This assumes the data is in A2:A100 with no blanks.
EDIT: whuber's suggestion is just too simple for me to not update this response. One can just use a simple IF statement checking whether the current row is equal to 1. If it is, it increments a counter (the prior row's value + 1); if it is not, it resets the counter to 0.
You just need to initialize B1 to 1 or 0 (matching A1). Because cell references update automatically when a formula is filled down, you only need to write it once and then fill in the rest.
So you would start out with:
A B
1 1
1 =IF(A2=1, B1+1, 0)
1
0
0
1
1
1
1
0
0
0
Then fill down:
A B
1 1
1 =IF(A2=1, B1+1, 0)
1 =IF(A3=1, B2+1, 0)
0 =IF(A4=1, B3+1, 0)
0 =IF(A5=1, B4+1, 0)
1 =IF(A6=1, B5+1, 0)
1 =IF(A7=1, B6+1, 0)
1 =IF(A8=1, B7+1, 0)
1 =IF(A9=1, B8+1, 0)
0 =IF(A10=1, B9+1, 0)
0 =IF(A11=1, B10+1, 0)
0 =IF(A12=1, B11+1, 0)
And here is the result in column B:
A B
1 1
1 2
1 3
0 0
0 0
1 1
1 2
1 3
1 4
0 0
0 0
0 0
The longest run is then the maximum of column B, e.g. =MAX(B1:B12). Hopefully the logic is extendable to Google Docs.
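The running-counter idea (add 1 while the value is 1, reset to 0 otherwise, then take the largest counter seen) can also be written as a short loop. A minimal Python sketch, with the hypothetical name `longest_run`, applied to the data from the question and to the twelve-row example above:

```python
def longest_run(values, target=1):
    """Length of the longest consecutive run of `target` in `values`."""
    best = current = 0
    for v in values:
        # Same rule as =IF(A2=1, B1+1, 0): extend the run or reset it.
        current = current + 1 if v == target else 0
        best = max(best, current)
    return best


question_data = [1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
example_data = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0]

print(longest_run(question_data))
print(longest_run(example_data))
```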
