Repeat N1:Nx rows Y1:Yx times? - google-sheets

I'm trying to create a Google sheet for an address label mail merge to direct people to their nearest outlet.
For 105 people, it might be Store 2 at 300 Block St; for another 60, it might be Store 8 at 55 Front Ave.
The goal is to have Google Sheets output a table with 105 rows of "Store 2; 300 Block Street", 60 rows of "Store 8; 55 Front Ave", etc.
I've tried using
transpose(split(rept("<cell with address>"&",", "<number of rows>"), ","))
but that's super laborious and error-prone to type out if I have 30 locations to repeat the process for.
Any ideas?
EDIT:
I managed to solve the problem soon after posting this but have left it up to see if there was a better way. The key to getting it working was using JOIN. Here is what I ended up using:
=arrayformula(transpose(split(join(",",rept(F2:F&",",H2:H)),",")))

You could create a lookup table (addresses in column A, repeat counts in column B) and feed it to this formula:
=TRANSPOSE(SPLIT(JOIN(",", ARRAYFORMULA(REPT(SPLIT(
INDIRECT("A1:A"&COUNTA(A1:A)), ",")&",",
INDIRECT("B1:B"&COUNTA(B1:B))))), ","))
Or standalone, with the values written inline:
=TRANSPOSE(SPLIT(JOIN(",", ARRAYFORMULA(REPT(SPLIT(
{"300 Block Street"; "55 Front Ave"; "102 King Street"}, ",")&",",
{10; 6; 2}))), ","))
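For comparison, the same "repeat each row N times" logic can be sketched outside Sheets. This is only an illustrative Python version, not part of either answer; the function name repeat_rows is made up here, and the sample values are the ones from the standalone formula. Unlike the REPT/SPLIT trick, it does not depend on a delimiter, so an address that itself contains a comma would not be split apart.

```python
def repeat_rows(labels, counts):
    """Repeat each label counts[i] times, keeping the original order."""
    out = []
    for label, n in zip(labels, counts):
        out.extend([label] * n)
    return out


# Sample values taken from the standalone formula above.
rows = repeat_rows(
    ["300 Block Street", "55 Front Ave", "102 King Street"],
    [10, 6, 2],
)
```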

Related

How to check for overlapping dates

I am looking for a solution in either Google Sheets or Apps Script to check for overlapping dates for the same account. There will be multiple accounts and the dates won't be in any particular order. An example is below. I am trying to produce the right-hand column, "Check", with some formula or automation. Any suggestions would be greatly appreciated.
Start Date    End Date      Account No.   Check
2023-01-01    2023-01-02    123           ERROR
2023-01-02    2023-01-05    123           ERROR
2023-02-25    2023-02-27    456           OK
2023-01-11    2023-01-12    456           OK
2023-01-01    2023-01-15    789           ERROR
2023-01-04    2023-01-07    789           ERROR
2023-01-01    2023-01-10    012           OK
2023-01-15    2023-01-20    012           OK
I also found some similar past questions, but they don't have the "for the same account" component and/or require some sort of chronological order, which my sheet will not have.
How to calculate the overlap between some Google Sheet time frames?
How to check if any of the time ranges overlap with each other in Google Sheets
Another approach (to be entered in D2):
=arrayformula(lambda(last_row,
lambda(acc_no,start_date,end_date,
if(isnumber(match(acc_no,unique(query(query(split(flatten(acc_no&"|"&split(map(start_date,end_date,lambda(start_date,end_date,join("|",sequence(1,end_date-(start_date-1),start_date)))),"|")),"|"),"select Col1,count(Col2) where Col2 is not null group by Col1,Col2",0),"select Col1 where Col2>1",1)),0)),"ERROR","OK"))(
C2:index(C2:C,last_row),A2:index(A2:A,last_row),B2:index(B2:B,last_row)))(
counta(A2:A)))
Briefly, we are creating a sequence of dateserial numbers between the start & end dates for each row, doing some string manipulation to turn it into a table of account number against each date, then QUERYing it to get each account number which has dateserials with count>1 (i.e. overlaps), using UNIQUE to get the distinct list of those account numbers, then finally matching this list against the original list of account numbers to give the ERROR/OK output.
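The steps described above can be sketched in Python to make the logic easier to follow. This is only an illustration of the idea (expand every range into individual days, count days per account, flag accounts with any day counted more than once), not a translation of the formula; the function name flag_overlaps_by_day is made up, and the sample rows are the ones from the question.

```python
from collections import Counter
from datetime import date, timedelta


def flag_overlaps_by_day(rows):
    """rows: list of (start, end, account) with inclusive date ranges."""
    day_counts = Counter()
    for start, end, account in rows:
        # Expand the range into one entry per calendar day.
        for offset in range((end - start).days + 1):
            day_counts[(account, start + timedelta(days=offset))] += 1
    # An account is "bad" if any of its days is covered more than once.
    bad_accounts = {acct for (acct, _day), n in day_counts.items() if n > 1}
    return ["ERROR" if acct in bad_accounts else "OK" for _s, _e, acct in rows]


sample = [
    (date(2023, 1, 1), date(2023, 1, 2), "123"),
    (date(2023, 1, 2), date(2023, 1, 5), "123"),
    (date(2023, 2, 25), date(2023, 2, 27), "456"),
    (date(2023, 1, 11), date(2023, 1, 12), "456"),
    (date(2023, 1, 1), date(2023, 1, 15), "789"),
    (date(2023, 1, 4), date(2023, 1, 7), "789"),
    (date(2023, 1, 1), date(2023, 1, 10), "012"),
    (date(2023, 1, 15), date(2023, 1, 20), "012"),
]
flags = flag_overlaps_by_day(sample)
```

Note that this flags at account level: every row of an account gets ERROR as soon as any two of its ranges share a day.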
(1) Here is one way, considering each case which could result in an overlap separately:
=ArrayFormula(if(A2:A="",,
if((countifs(A2:A,"<="&A2:A,B2:B,">="&A2:A,C2:C,C2:C,row(A2:A),"<>"&row(A2:A))
+countifs(A2:A,"<="&B2:B,B2:B,">="&B2:B,C2:C,C2:C,row(A2:A),"<>"&row(A2:A))
+countifs(A2:A,">="&A2:A,B2:B,"<="&B2:B,C2:C,C2:C,row(A2:A),"<>"&row(A2:A))
)>0,"ERROR","OK")
)
)
(2) Here is the method using the Overlap formula
min(end1,end2)-max(start1,start2)+1
which results in
=ArrayFormula(if(byrow(A2:index(C:C,counta(A:A)),lambda(r,sum(text(if(index(r,2)<B2:B,index(r,2),B2:B)-if(index(r,1)>A2:A,index(r,1),A2:A)+1,"0;\0;\0")*(C2:C=index(r,3))*(row(A2:A)<>row(r)))))>0,"ERROR","OK"))
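As a rough illustration of the overlap formula, here is a small Python sketch that compares every row against every other row of the same account. The helper names overlap_days and flag_overlaps_pairwise are invented for this example, and the sample rows are a subset of the question's data; it is a plain pairwise check, not a port of the formula above.

```python
from datetime import date


def overlap_days(start1, end1, start2, end2):
    """min(end1, end2) - max(start1, start2) + 1; positive means overlap."""
    return (min(end1, end2) - max(start1, start2)).days + 1


def flag_overlaps_pairwise(rows):
    """rows: list of (start, end, account) with inclusive date ranges."""
    flags = []
    for i, (s1, e1, a1) in enumerate(rows):
        clash = any(
            i != j and a1 == a2 and overlap_days(s1, e1, s2, e2) > 0
            for j, (s2, e2, a2) in enumerate(rows)
        )
        flags.append("ERROR" if clash else "OK")
    return flags


sample = [
    (date(2023, 1, 1), date(2023, 1, 2), "123"),
    (date(2023, 1, 2), date(2023, 1, 5), "123"),
    (date(2023, 1, 1), date(2023, 1, 10), "012"),
    (date(2023, 1, 15), date(2023, 1, 20), "012"),
]
flags = flag_overlaps_pairwise(sample)
```

The first two ranges share 2023-01-02, so the overlap is 1 day; the last two are 4 days apart, so the result is negative and they count as non-overlapping.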
(3) Most efficient is to use the original method of comparing previous and next dates, but then you need to sort and sort back like this:
=lambda(data,sort(map(sequence(rows(data)),lambda(c,if(if(c=1,0,(index(data,c-1,2)>=index(data,c,1))*(index(data,c-1,3)=index(data,c,3)))+if(c=rows(data),0,(index(data,c+1,1)<=index(data,c,2))*(index(data,c+1,3)=index(data,c,3)))>0,"ERROR","OK"))),index(data,0,4),1))(SORT(filter({A2:C,row(A2:A)},A2:A<>""),3,1,1,1))
However, this only checks for local overlaps, not global ones. You can see what I mean if you change the dataset slightly:
Clearly the first and third pair of dates have an overlap but G4 contains "OK". This is because each pair of dates is only checked against the adjacent pairs of dates. This also applies to the original reference cited by OP - here's an example where it would give a similar result:
The formula posted by @The God of Biscuits gives the correct (global) result :-)
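The local-versus-global difference can be reproduced with a small Python sketch. The data below is hypothetical (day numbers instead of dates, a single account "A") and is not the dataset from the missing screenshots; the function names are made up for this example. The first function only looks at the previous and next row after sorting by account and start, as method (3) does; the second compares every pair.

```python
def flag_adjacent_only(rows):
    """rows: (start, end, account). Compare each row only with its sorted neighbours."""
    order = sorted(range(len(rows)), key=lambda i: (rows[i][2], rows[i][0]))
    flags = ["OK"] * len(rows)
    for pos, i in enumerate(order):
        start, end, account = rows[i]
        if pos > 0:
            _ps, prev_end, prev_acct = rows[order[pos - 1]]
            if prev_acct == account and prev_end >= start:
                flags[i] = "ERROR"
        if pos + 1 < len(order):
            next_start, _ne, next_acct = rows[order[pos + 1]]
            if next_acct == account and next_start <= end:
                flags[i] = "ERROR"
    return flags


def flag_all_pairs(rows):
    """Compare every row with every other row of the same account."""
    return [
        "ERROR"
        if any(
            i != j and a1 == a2 and min(e1, e2) >= max(s1, s2)
            for j, (s2, e2, a2) in enumerate(rows)
        )
        else "OK"
        for i, (s1, e1, a1) in enumerate(rows)
    ]


# Hypothetical data: the long first range swallows both later ones.
rows = [(1, 10, "A"), (2, 3, "A"), (5, 6, "A")]
local_flags = flag_adjacent_only(rows)
global_flags = flag_all_pairs(rows)
```

Here the third range (5 to 6) lies inside the first (1 to 10), but its only sorted neighbour is (2 to 3), so the adjacent-only check reports OK for it while the pairwise check reports ERROR.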

Explode each row into multiple rows by splitting a column of a given computed range

I was recently tasked with 'exploding' each row in a given range with respect to the split value of one of the columns, i.e. going from
Name      Interests          Age
John      swimming, movies   31
Mary      basketball         26
Richard   football, music    21
to:
Name      Interest     Age
John      swimming     31
John      movies       31
Mary      basketball   26
Richard   football     21
Richard   music        21
It's a little similar to a Cartesian product, only one of the terms needs to be computed on the basis of the value in the Interests column. I eventually solved it using an Apps Script function, but I'm wondering if it could be easily solved using a regular formula.
Note that the input range in my case was a product of another formula (a QUERY(...), to be exact), so not necessarily contiguous or addressable within the spreadsheet.
Any ideas?
try:
=INDEX(QUERY(SPLIT(FLATTEN(A1:A&"×"&SPLIT(B1:B, ", ", )&"×"&C1:C), "×"),
"where Col3 is not null"))
You can use the custom "UNPIVOT" function found on this sheet. File>Make a Copy to grab the script. Also here on github.
=ARRAYFORMULA(UNPIVOT(A2:A,"V",SPLIT(B2:B,", ",0),"B",C2:C,"V"))
You would then QUERY() the output to eliminate the rows where there was nothing in the second column.
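To make the intended transformation concrete, here is a minimal Python sketch of the "explode" step. It is only an illustration of the target behaviour, not the Apps Script function mentioned in the question; the name explode and the default separator ", " are assumptions based on the sample data.

```python
def explode(rows, sep=", "):
    """Turn each (name, interests, age) row into one row per interest."""
    out = []
    for name, interests, age in rows:
        for interest in interests.split(sep):
            out.append((name, interest, age))
    return out


# Sample rows from the question.
people = [
    ("John", "swimming, movies", 31),
    ("Mary", "basketball", 26),
    ("Richard", "football, music", 21),
]
exploded = explode(people)
```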

Google Sheets formula - Sorting a row of numerical data

Link to example spreadsheet
I'm trying to find out how often the lead of a football match changes based on the time goals are scored.
In A1, the user inputs the times of the home team's goals, separated by commas. B1 has the same but for the away team.
I use the following code (in A2) to split those scores into separate cells and put them in order from least to greatest:
=IF(A1="","", TRANSPOSE(SORT(TRANSPOSE(SPLIT(A1,",")),1,TRUE)))
I've reserved A2:J2 for these goals.
In K2, I have the same idea, but for the away team:
=IF(B1="","", TRANSPOSE(SORT(TRANSPOSE(SPLIT(B1,",")),1,TRUE)))
I have K2:T2 reserved for these results.
I would now like to convert all of the results in A2:T2 from the entered minute data to 1H, 2H, 1A or 2A (in the order the goals were scored). 1H means a goal for the home team in the 1st half (<=45). 2H = home, 2nd half. 1A = away, 1st half. 2A = away, 2nd half.
The output I'd like to have is shown in U2:AK2.
Any help is appreciated.
try:
=TRANSPOSE(INDEX(SORT(SPLIT(FLATTEN({
IF(SPLIT(A1, ",")<=45, 1, 2)&"H×"&SPLIT(A1, ","),
IF(SPLIT(B1, ",")<=45, 1, 2)&"A×"&SPLIT(B1, ",")}), "×"), 2, 1),, 1))
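The logic behind the formula (tag each goal with half and side, then order by minute) can be sketched in Python as follows. The function name goal_sequence and the sample minutes are made up for illustration; they are not taken from the linked spreadsheet.

```python
def goal_sequence(home_cell, away_cell):
    """Return labels such as '1H' or '2A' in the order the goals were scored."""

    def parse(cell, side):
        # Skip empty pieces so a blank cell yields no goals.
        return [(int(t), side) for t in cell.split(",") if t.strip()]

    goals = sorted(parse(home_cell, "H") + parse(away_cell, "A"), key=lambda g: g[0])
    return [("1" if minute <= 45 else "2") + side for minute, side in goals]


# Hypothetical input: home goals in A1, away goals in B1.
labels = goal_sequence("12,67", "30,80,44")
```

For this input the goals fall at minutes 12 (home), 30 (away), 44 (away), 67 (home) and 80 (away).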

Google Sheet: formula to loop through a range

It's not hard to do this with a custom function, but I'm wondering if there is a way to do it using a formula, because data won't automatically update when using a custom function.
So I have a course list sheet, each with a price. And I'm using google form to let users choose what courses they will take. Users are allowed to take multiple courses, so how many they will take is unknown.
Now in the response sheet, I have data like
Order ID   User ID   Courses                     Total
1001       38        courseA, courseC            What formula to put here?
1002       44        courseB, courseC, courseD   What formula to put here?
1003       55        courseE                     What formula to put here?
and the course sheet is like
course   Price
A        23
B        33
C        44
D        23
E        55
I want to output the total for each order and am looking at using FILTER to do this. Firstly I can get a range of unknown length for the chosen courses
=SPLIT(courses, ",") // having named the Courses column as "courses"
Now I need to filter this range against the course sheet, but I'm not quite sure how to do it, or even if it is possible. Any hint is appreciated.
try:
=ARRAYFORMULA(IF(A2:A="",,MMULT(IFERROR(
VLOOKUP(SPLIT(C2:C, ", "), {F1&F2:F, G2:G}, 2, 0))*1,
ROW(INDIRECT("1:"&COLUMNS(SPLIT(C2:C, ", "))))^0)))
demo spreadsheet
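As a plain illustration of what the formula computes per order (split the course list, look up each price, add them up), here is a short Python sketch. The names PRICES and order_total are invented for this example; the prices are the ones from the course sheet in the question. It assumes, as the formula's F1&F2:F does, that the response text is the column header "course" followed by the course letter, and it treats unknown courses as 0.

```python
# Prices from the course sheet in the question.
PRICES = {"A": 23, "B": 33, "C": 44, "D": 23, "E": 55}


def order_total(courses_cell, prices=PRICES, prefix="course"):
    """Sum the prices of a comma-separated course list such as 'courseA, courseC'."""
    total = 0
    for item in courses_cell.split(","):
        key = item.strip()
        if key.startswith(prefix):
            key = key[len(prefix):]
        # Unknown or blank entries contribute nothing.
        total += prices.get(key, 0)
    return total


totals = [order_total(c) for c in ("courseA, courseC", "courseB, courseC, courseD", "courseE")]
```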
As I need time to digest @player0's answer, I am doing this in a more intuitive way.
I create 2 sheets to store intermediate values.
The first one is named "chosen_courses"
Order ID   User ID
1001       =IFERROR(ARRAYFORMULA(TRIM(SPLIT(index(courses,Row(),1),","))),"")
1002       =IFERROR(ARRAYFORMULA(TRIM(SPLIT(index(courses,Row(),1),","))),"")
1003       =IFERROR(ARRAYFORMULA(TRIM(SPLIT(index(courses,Row(),1),","))),"")
In this sheet every row is a horizontal list of the chosen courses, and I created another sheet
total                                 course price
=IF(isblank(order_id),"",SUM(B2:2))   =IFERROR(VLOOKUP('chosen_courses'!B2,{course_Names,course_price},2,false),"")
=IF(isblank(order_id),"",SUM(C2:2))   =IFERROR(VLOOKUP('chosen_courses'!B2,{course_Names,course_price},2,false),"")
=IF(isblank(order_id),"",SUM(D2:2))   =IFERROR(VLOOKUP('chosen_courses'!B2,{course_Names,course_price},2,false),"")
course_Names, order_id and course_price are named ranges.
This works well, at least for now.
But there is a problem:
I have 20 courses, so the 2nd sheet has 21 columns. I copied the formulas down 1000 rows, because that is as far as Ctrl+Shift+↓ and Ctrl+D reach. Now, sometimes when I open the sheet, a progress bar appears while the formulas in this sheet are calculated, which can take around 2 minutes even though I have only about 5 test orders in the sheet. I am afraid this will get worse when I have more data or when the sheet is opened on older computers.
Is it because I use some resource consuming functions? Can it be improved?

Duplicates varying slightly in string values with additional temporal aspect

I use emergency tweets from the Netherlands for a project. There is sometimes more than one tweet about a single event, varying slightly in timestamp and in the text of the tweet itself. I want to delete those "duplicates".
So, in my database I have rows which are quite alike but not exactly the same, like
"2014-01-11 10:01:17";"HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers: ) , Van Ostadestraat 332 AMSTERDAM [ ] "
"2014-01-11 09:59:06";"HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers:1) , Van Ostadestraat 332 AMSTERDAM ] "
The problem is that I have to take the temporal aspect into account and can't just rely on the string, because the same text can occur multiple times.
Ideally, I would delete all rows within a temporal buffer of 10 minutes after the first tweet when the text similarity is over a threshold of 0.75.
For the string comparison I tried similarity(text,text), see
http://www.postgresql.org/docs/9.1/static/pgtrgm.html
For the time aggregation I used:
(extract(minute FROM timestamp_column)::int / 10)
in addition to the regular YYYY-MM-DD-HH24 time aggregation
Any help is appreciated.
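One thing to note about the fixed 10-minute buckets: the two example rows above (09:59:06 and 10:01:17) fall into different buckets even though they are about two minutes apart, so a sliding window anchored on the first tweet may fit better. A minimal Python sketch of that idea follows. It is only an illustration under stated assumptions: difflib.SequenceMatcher is used as a stand-in for pg_trgm's similarity(), so the scores will not match PostgreSQL's; the function name drop_near_duplicates is made up; and the third sample row (same text 30 minutes later) is hypothetical.

```python
from datetime import datetime, timedelta
from difflib import SequenceMatcher


def drop_near_duplicates(rows, window=timedelta(minutes=10), threshold=0.75):
    """rows: list of (timestamp, text). Keep a row unless a kept row from the
    preceding `window` is more similar than `threshold`."""
    kept = []
    for ts, text in sorted(rows, key=lambda r: r[0]):
        is_dup = any(
            ts - kept_ts <= window
            and SequenceMatcher(None, kept_text, text).ratio() > threshold
            for kept_ts, kept_text in kept
        )
        if not is_dup:
            kept.append((ts, text))
    return kept


tweets = [
    (datetime(2014, 1, 11, 10, 1, 17),
     "HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers: ) , Van Ostadestraat 332 AMSTERDAM [ ] "),
    (datetime(2014, 1, 11, 9, 59, 6),
     "HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers:1) , Van Ostadestraat 332 AMSTERDAM ] "),
    # Hypothetical: the same text again, well outside the 10-minute window.
    (datetime(2014, 1, 11, 10, 31, 17),
     "HV 1 METINGEN (+Inc,net: 1+) (KLEIN OGS) (slachtoffers: ) , Van Ostadestraat 332 AMSTERDAM [ ] "),
]
kept = drop_near_duplicates(tweets)
```

With this sample, the 09:59:06 tweet is kept, the 10:01:17 tweet is dropped as a near-duplicate, and the 10:31:17 repeat is kept because it falls outside the window.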
