Google Sheets - Latest date of each summary value - Conditional Formatting

I have a table as follows, which has unique IDs for each person and some dates:
NAME | DATE | Info
John | 01/11/2022 | Praesent accumsan.
John | 29/11/2022 | Phasellus fermentum.
John | 30/11/2022 | Curabitur molestie.
Peter | 09/05/2019 | Cras mollis est.
Peter | 06/05/2019 | Nulla eu metus.
Peter | 06/05/2019 | Proin commodo.
Peter | 20/09/2022 | Nunc rhoncus dui.
Peter | 22/09/2022 | Aliquam accumsan.
Beth | 11/08/2021 | Integer sollicitudin.
Beth | 13/09/2021 | Integer eget dolor.
Beth | 13/09/2021 | Cras vitae massa non.
Sarah | 02/12/2021 | Cras interdum nibh.
Sarah | 13/04/2022 | Mauris cursus augue.
Sarah | 13/04/2022 | Sed varius lacus.
Sarah | 14/04/2022 | Aliquam lacinia.
Sarah | 18/05/2022 | Fusce scelerisque.
Sarah | 19/05/2022 | Suspendisse viverra.
Sarah | 02/06/2022 | Ut nec dui molestie.
Sarah | 07/06/2022 | Maecenas ac neque nec.
Sarah | 19/10/2022 | Mauris sodales tellus.
Sarah | 19/10/2022 | Pellentesque auctor.
Sarah | 20/10/2022 | Morbi fringilla felis.
Sarah | 21/10/2022 | Praesent fringilla.
Mathew | 18/01/2021 | Fusce sagittis dui.
Mathew | 18/01/2021 | Nunc at erat eget.
Mathew | 19/01/2021 | Sed nec mauris eu.
Mathew | 19/01/2021 | Aenean a arcu nec.
Mathew | 03/02/2021 | Nunc mollis turpis.
I want to get the latest date for each ID and tag it somehow. I thought about doing it with conditional formatting. The table is currently in Google Sheets.
For example, John's would be 30/11/2022, Peter's 22/09/2022, Beth's would be both of the 13/09/2021 rows, Sarah's would be 21/10/2022, and Mathew's would be 03/02/2021.
This simplified version has only 5 IDs (that I converted to names) and some dates and info, but the real one has hundreds of IDs and hundreds of dates for each one.
This table will keep self populating with newer info all the time, but for focus purposes only the last input on each ID is important.
I tried MAXIFS and other approaches with no success; even a tag in a new column would really help.
I mean, MAXIFS did show the latest date in a new column, but I wasn't able to pinpoint the row it belonged to for each ID.
Any help would be appreciated :)
Thanks, Rafael

You will definitely need a helper column if you have a lot of data. I tried using a fairly simple match/countifs formula on 50K of data and it took about 20 minutes to update.
However, there is a solution available here. The answers to this question describe fast ways of numbering subgroups. The fast solutions use sorting to find the breakpoints between the groups, then SCAN with a LAMBDA to increment within each group, and finally a reverse sort to link back to the original data. This can be adapted to the current problem by sorting the data in ascending order of group, then in descending order of date within each group, and numbering each group from the most recent date to the oldest. The numbering does not increment where there are two or more duplicate dates within a group.
The original formula was
=LAMBDA(a,INDEX(if(a="",, LAMBDA(srt, SORT( SCAN(1,SEQUENCE(ROWS(a)),
LAMBDA(ini,v,if(v=1,1,if(INDEX(srt,v,1)<>INDEX(srt,v-1,1),1,ini+1)))), index(srt,,2),1) )
(SORT({a,SEQUENCE(ROWS(a))})))))(A2:A)
The modified formula is
=LAMBDA(a,INDEX(if(index(a,0,1)="",, LAMBDA(srt, SORT(SCAN(1,SEQUENCE(ROWS(a)),
LAMBDA(ini,v,if(v=1,1,if(INDEX(srt,v,1)<>INDEX(srt,v-1,1),1,if(index(srt,v,2)<>index(srt,v-1,2),ini+1,ini))))), index(srt,,3),1) )
(SORT({a,SEQUENCE(ROWS(a))},1,1,2,0)))))(A2:B)
If this is placed in E2 (say), the conditional formatting formula is simply
=$E2=1
This updates in 2-3 seconds with 50K of data.
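The sort, scan, and reverse-sort steps described above can be sketched outside of Sheets. The following Python snippet is only an illustration of the logic, not a replacement for the formula; the function name and the sample rows are assumptions made for the example.

```python
from datetime import date

def rank_latest_within_group(rows):
    """rows: list of (name, date) tuples. Returns one rank per row, in the
    original row order. Rank 1 marks the latest date of each name, and
    duplicate dates within a name share the same rank."""
    # Sort row indices by name ascending, then by date descending.
    order = sorted(range(len(rows)),
                   key=lambda i: (rows[i][0], -rows[i][1].toordinal()))
    ranks = [0] * len(rows)
    rank = 0
    for pos, i in enumerate(order):
        prev = order[pos - 1] if pos > 0 else None
        if prev is None or rows[i][0] != rows[prev][0]:
            rank = 1          # first row of a new group
        elif rows[i][1] != rows[prev][1]:
            rank += 1         # date changed within the group
        # Writing to index i puts the rank back at the original position,
        # which plays the role of the reverse sort.
        ranks[i] = rank
    return ranks

rows = [
    ("John", date(2022, 11, 1)),
    ("John", date(2022, 11, 29)),
    ("John", date(2022, 11, 30)),
    ("Beth", date(2021, 8, 11)),
    ("Beth", date(2021, 9, 13)),
    ("Beth", date(2021, 9, 13)),
]
ranks = rank_latest_within_group(rows)
print(ranks)
```

Rows whose rank is 1 are the ones the conditional formatting rule would highlight.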

You can have a formula like this:
=MATCH($A2&$B2,UNIQUE($A2:$A)&BYROW(UNIQUE($A2:$A),LAMBDA(each,MAX(FILTER($B$2:$B,$A$2:$A = each)))),0)
But, if you have a really big table, you could create an auxiliary column with the formula in order to make it faster:
={UNIQUE(A2:A),BYROW(UNIQUE(A2:A),LAMBDA(each,MAX(FILTER(B2:B,A2:A = each))))}
And then you can set a conditional formatting like this:
=($A2<>0)*MATCH($A2&$B2,ARRAYFORMULA($E:$E&$F:$F),0)
Or you could even combine those two columns into one.
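The idea behind this answer (find the latest date per name, then check each row against it) can be sketched in Python. This is a rough model of the logic under assumed sample data, and `flag_latest` is a hypothetical name, not a Sheets function.

```python
from datetime import date

def flag_latest(rows):
    """rows: list of (name, date). Returns True for every row whose date
    equals the latest date recorded for that name."""
    latest = {}
    for name, d in rows:
        # Keep the maximum date seen so far for each name.
        if name not in latest or d > latest[name]:
            latest[name] = d
    return [d == latest[name] for name, d in rows]

rows = [
    ("Peter", date(2019, 5, 9)),
    ("Peter", date(2022, 9, 20)),
    ("Peter", date(2022, 9, 22)),
    ("Beth", date(2021, 9, 13)),
    ("Beth", date(2021, 9, 13)),
]
flags = flag_latest(rows)
print(flags)
```

Both of Beth's rows are flagged because they share the same latest date, which matches the behaviour asked for in the question.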

Related

Automatically add rows in google sheets when importing data from other sheets

I am trying to import data from several sheets into one, but they need to come between data in other cells.
So I have:
Fixed text data 1
=FILTER('Car Parks'!A:AC, NOT(ISBLANK('Car Parks'!A:A)))
=FILTER('Chapter 8'!A:AC, NOT(ISBLANK('Chapter 8'!A:A)))
=FILTER('Production'!A:AC, NOT(ISBLANK('Production'!A:A)))
=FILTER('CSAS'!A:AC, NOT(ISBLANK('CSAS'!A:A)))
Fixed text data 2
However, each of the FILTER functions will return a #REF! error, as it cannot overwrite the other FILTER functions or the fixed text data.
Each FILTER function works correctly as long as not too many rows are required.
Is there a straightforward way to allow each of these FILTER functions to add rows until it is complete, before the next FILTER function or the fixed text data?
Context:
Used to generate a quote document.
Each Filter function imports shift timings for different sectors on a job
Fixed text data 1 is the initial data such as client details
Fixed text data 2 is the terms and conditions of the quote
You could append the filters and fixed text 2 to keep it dynamic. Try:
={IFNA(FILTER({'Car Parks'!A:AC;'Chapter 8'!A:AC;Production!A:AC;CSAS!A:AC},{'Car Parks'!A:A;'Chapter 8'!A:A;Production!A:A;CSAS!A:A}<>""));"TERMS AND CONDITIONS";"Lorem ipsum dolor sit amet. Rem laudantium reiciendis eos error quia aut autem molestiae aut temporibus magnam!"}
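The stacking that this formula performs can be pictured with a small Python sketch: the rows of every sheet are concatenated, rows whose first cell is blank are dropped, and the fixed text is appended at the end. The function name and the sample rows are assumptions for illustration only.

```python
def stack_sections(sheets, footer_rows):
    """Concatenate the rows of all sheets, keep only rows whose first cell
    is not blank, then append the fixed footer rows."""
    body = [row for sheet in sheets for row in sheet if row and row[0] != ""]
    return body + footer_rows

# Hypothetical shift timings for two of the sheets.
car_parks = [["Mon", "08:00"], ["", ""], ["Tue", "09:00"]]
chapter_8 = [["Wed", "07:00"], ["", ""]]

quote = stack_sections([car_parks, chapter_8], [["TERMS AND CONDITIONS"]])
print(quote)
```

Because everything is produced as one array, the output grows and shrinks as a single block instead of several blocks competing for the same cells.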

CONCATENATE or JOIN multiple columns from VLOOKUP into single string

I have a worksheet with 2 tabs - Customers, Data. All tabs have a list of customers. The list on Data is a subset of all Customers. I need to pull available address information for Customers from Data.
I need the Address1-3 columns in Data to be joined using <br> and placed in the Address column in Customers. The situation seems similar to this other SO thread joining results into a single string, however those values are all vertical in different rows and the difference here is the values are horizontal in different columns.
Not working:
=TEXTJOIN("<br>",1,VLOOKUP(A2,Data!A:D,{2,3,4},FALSE))
=TEXTJOIN("<br>",1,QUERY("Data!A:H","SELECT B,C,D WHERE "&A2&"="&Data!A:A))
=TEXTJOIN("<br>",1,FILTER(Data!A:D,A2))
Google Sheets example ready for copy/fiddle.
Example Data - the names have been changed to protect the innocent
Account | Address1 | Address2 | Address3 | City | State | Zip | Country
Facebook | Lorem | Ipsum | Dolor | Menlo Park | CA | 94025 | United States
Amazon | Sit | Amet | Consectetur | Seattle | WA | 98109 | United States
Apple | Adipiscing | Elit | Ut | Cupertino | CA | 95014 | United States
Microsoft | Ultricies | Velit | Eu | Redmond | WA | 98052 | United States
Google | Interdum | Bibendum | Proin | Mountain View | CA | 94043 | United States
Example Customers - the names have been changed to protect the innocent
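Account | Address | City | State | Zip | Country
Facebook | | | | |
Walmart | | | | |
Amazon | | | | |
Home Depot | | | | |
Apple | | | | |
CVS | | | | |
Microsoft | | | | |
BMW | | | | |
Google | | | | |
Toyota | | | | |
...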
Expected Output
Account | Address | City | State | Zip | Country
Facebook | Lorem<br>Ipsum<br>Dolor | Menlo Park | CA | 94025 | United States
Walmart | | | | |
Amazon | Sit<br>Amet<br>Consectetur | Seattle | WA | 98109 | United States
Home Depot | | | | |
Apple | Adipiscing<br>Elit<br>Ut | Cupertino | CA | 95014 | United States
CVS | | | | |
Microsoft | Ultricies<br>Velit<br>Eu | Redmond | WA | 98052 | United States
BMW | | | | |
Google | Interdum<br>Bibendum<br>Proin | Mountain View | CA | 94043 | United States
Toyota | | | | |
Your formula is fine, you just have to wrap it in an ArrayFormula():
=ArrayFormula(IFNA(TEXTJOIN("<br>",1,VLOOKUP(A2,Data!A:H,{2,3,4},FALSE))))
You need to FILTER() and then JOIN. Try:
=IFERROR(MAP(A2:A14,LAMBDA(x,JOIN("<br>",FILTER(Data!B2:D,Data!A2:A=x)))),"")
Try this:
=ARRAYFORMULA(JOIN("<br>",QUERY({Data!A1:H6},"SELECT Col2,Col3,Col4 WHERE Col1 = '"&A2&"'",0)))
Edit: Harun24hr has the better answer; this one will not autofill down.
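All three answers follow the same pattern: look up the account, take its address columns, and join them with the separator. A minimal Python sketch of that pattern is shown below, using a few rows from the example data; `joined_address` is a hypothetical helper, and skipping empty parts is an assumption that mirrors TEXTJOIN with its ignore-empty argument set to true.

```python
# Address1-3 per account, taken from the example Data tab.
data = {
    "Facebook": ["Lorem", "Ipsum", "Dolor"],
    "Amazon": ["Sit", "Amet", "Consectetur"],
}
customers = ["Facebook", "Walmart", "Amazon"]

def joined_address(account, data, sep="<br>"):
    """Return the address parts joined by sep, or an empty string when the
    account has no entry in data."""
    parts = data.get(account)
    if parts is None:
        return ""
    # Skip empty parts so no dangling separators are produced.
    return sep.join(p for p in parts if p != "")

addresses = [joined_address(c, data) for c in customers]
print(addresses)
```

Customers that are missing from the Data tab, such as Walmart here, simply get an empty address.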

Gsheet - Arrayformula function to include 2 conditions (AND operator)

The goal is to create an arrayformula that looks over two separate columns and returns a SUM if it matches a certain string.
Here's an example table:
Feature | Status | Description
API | Completed | Lorem ipsum
Database | In review | lorem ipsum
Server | Backlog | lorem ipsum
Load Balancer | Completed | lorem ipsum
DB | QA | lorem ipsum
LB | Completed | lorem ipsum
Data base | Backlog | lorem ipsum
The first thing I wanted to pull was the total number of Database entries, regardless of the spelling. That part works.
For that I used:
=ArrayFormula(Sum(CountIfs(A2:A8, {"db","data b*","database"})))
On that note: I know that's not scalable to keep adding different string variations, it's a one-off-scenario.
What I'd like to return is "For all Database entries, return the SUM where status = Completed". Which would be 0 in this scenario.
I tried adding another arrayformula into the above but I'm not sure how to reference only those items found in the previous formula? If that makes sense?
To visualise the confusing explanation:
=ArrayFormula(Sum(CountIfs(A2:A8, {"db","data b*","database"}) AND "WHERE STATUS IS COMPLETE"))
Could someone point me in the right direction? I'm happy to read through any documentation (I only started looking at Excel formulas today for the first time).
Try:
=SUMPRODUCT(B:B="completed", REGEXMATCH(A:A, "(?i)database|db|data b"))
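To see why this returns 0 for the example table, the two conditions can be modelled in Python: a case-insensitive regex on the Feature column and a case-insensitive equality check on the Status column, counting only rows where both hold. This is a sketch of the logic with a hypothetical helper name, not a description of how Sheets evaluates the formula internally.

```python
import re

# (Feature, Status) pairs from the example table.
rows = [
    ("API", "Completed"),
    ("Database", "In review"),
    ("Server", "Backlog"),
    ("Load Balancer", "Completed"),
    ("DB", "QA"),
    ("LB", "Completed"),
    ("Data base", "Backlog"),
]

# Same alternatives as the regex in the formula, matched case-insensitively.
PATTERN = re.compile(r"database|db|data b", re.IGNORECASE)

def count_matching(rows, status):
    """Count rows whose feature matches PATTERN and whose status equals
    the given status, ignoring case."""
    return sum(
        1
        for feature, s in rows
        if PATTERN.search(feature) and s.lower() == status.lower()
    )

completed = count_matching(rows, "completed")
backlog = count_matching(rows, "backlog")
print(completed, backlog)
```

None of the three database rows is Completed, so the first count is 0, while exactly one of them ("Data base") is in Backlog.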

How do I get REGEXMATCH to look for one term across a range of cells in Google Sheets?

In Google Sheets I have the following formula:
=IF(REGEXMATCH(B1;"offers");"spring";0)
If the cell B1 contains the text "offers", the output will be "spring"; otherwise the output will be "0". This works fine, but now I want the formula to look at B1 and C1, and if either of them contains "offers" the output should be "spring".
Example Output with formula in column D:
B | C | D
test offers test | lorem ipsum | spring
lorem ipsum | test offers test | spring
lorem ipsum | lorem ipsum | 0
I tried the obvious using
=IF(REGEXMATCH(B1:C1;"offers");"spring";0)
but it gives back a #VALUE!
In the second step I want to use this formula in a nested if function like here:
=IF(REGEXMATCH(B1;"offers");"spring";IF(REGEXMATCH(B1;"shop");"summer";0))
The solution seems to be:
=if(or(arrayformula(regexmatch(B1:C1; "(^| )offers( |$)"))); "spring"; 0)
As modified from user6655984's answer in this post. Note that I altered the regex so that the pattern you are looking to match must be preceded by the start of the line or a space, and followed by a space or the end of the line. This ensures it does not match in the middle of a larger word, and it handles the term being at the start or end of the main string.
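The "either cell" check the question asks for, combined with this boundary-aware pattern, can be tried out in Python. The sketch below uses the same pattern text; the `season` function is a hypothetical name, and the rows come from the example table in the question.

```python
import re

# "offers" must be preceded by line start or a space, and followed by
# a space or the end of the line.
PATTERN = re.compile(r"(^| )offers( |$)")

def season(cells):
    """Return "spring" if any cell contains the standalone word, else 0."""
    return "spring" if any(PATTERN.search(c) for c in cells) else 0

results = [
    season(["test offers test", "lorem ipsum"]),
    season(["lorem ipsum", "test offers test"]),
    season(["lorem ipsum", "lorem ipsum"]),
]
print(results)
```

A cell such as "coffers" does not match, because the term is then part of a larger word.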
Use:
=IF(REGEXMATCH(B1&C1; "offers"); "spring"; 0)
As an array formula:
=INDEX(IF(REGEXMATCH(B1:B&C1:C; "offers"); "spring"; 0))

Filter rows based on field text in Google Sheets

In Google Sheets, I'm trying to get the sum of all values in column B for which column A equals to 'Lorem'. The result should be 15.
A B
1. Lorem 5
2. Lorem 5
3. Ipsum 100
4. Lorem 5
I tried the following formula, but got the error: Formula parse error.
=SUM(FILTER(B1:B4,A1:A4='Lorem'))
Here is the Google Sheet for reference.
Use double quotes around the string.
=SUM(FILTER(B1:B4,A1:A4="Lorem"))
Alternatives would be:
=SUMIF(A1:A4, "Lorem", B1:B4)
or
=SUMPRODUCT(A1:A4="Lorem", B1:B4)
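All three formulas compute the same conditional sum. As a quick sanity check of the expected result, the same calculation on the example rows in Python (an illustration only) looks like this:

```python
# (column A, column B) pairs from the example.
rows = [("Lorem", 5), ("Lorem", 5), ("Ipsum", 100), ("Lorem", 5)]

# Sum column B only where column A equals "Lorem".
total = sum(value for label, value in rows if label == "Lorem")
print(total)
```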
