Google Sheets/Regex - How to pull all numbers that start with #

I have a google sheet cell that reads:
Ticket No. #3223
Ticket No. #2334
Ticket No. #4005
Is there a way to pull all numbers starting with #, but not include the #?
Results Example:
3223
2334
4005
Thank you for any assistance

Yes you can just use this:
=REGEXREPLACE(A1,"(\D+)(\d+)","$2"&char(10))
The parentheses create capture groups. The pattern matches a run of non-digits (\D+, the first group) followed by a run of digits (\d+, the second group), and each match is replaced with just the second group, $2. The CHAR(10) appended to the replacement is what gives you a new line after each number.
If you actually want them in separate cells, you can use ; as the delimiter instead of CHAR(10) and then use SPLIT and TRANSPOSE to stack them:
=TRANSPOSE(SPLIT(REGEXREPLACE(A1,"(\D+)(\d+)","$2;"),";"))
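To see what the two formulas above do step by step, here is a minimal sketch in Python. It assumes Python's re module as a stand-in for the RE2 engine used by Google Sheets, and the variable names (cell, stacked, numbers) are made up for illustration.

```python
import re

# The cell content from the question, with line breaks between tickets.
cell = "Ticket No. #3223\nTicket No. #2334\nTicket No. #4005"

# Same idea as REGEXREPLACE(A1, "(\D+)(\d+)", "$2" & CHAR(10)):
# each "non-digits then digits" match is replaced by the digits plus a newline.
stacked = re.sub(r"(\D+)(\d+)", r"\2\n", cell)

# Same idea as TRANSPOSE(SPLIT(..., ";")): use ";" as the delimiter, then split.
# Empty pieces are dropped here to mimic SPLIT's default behaviour.
numbers = [part for part in re.sub(r"(\D+)(\d+)", r"\2;", cell).split(";") if part]

print(repr(stacked))
print(numbers)
```

Note that the first variant leaves a trailing line break after the last number, because every match gets a newline appended.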

Related

How to check for overlapping dates

I am looking for a solution in either Google Sheets or Apps Script to check for overlapping dates for the same account. There will be multiple accounts, and the dates won't be in any particular order. Here is an example below. I am trying to produce the right-hand column, "Check", with some formula or automation. Any suggestions would be greatly appreciated.
Start Date | End Date   | Account No. | Check
2023-01-01 | 2023-01-02 | 123         | ERROR
2023-01-02 | 2023-01-05 | 123         | ERROR
2023-02-25 | 2023-02-27 | 456         | OK
2023-01-11 | 2023-01-12 | 456         | OK
2023-01-01 | 2023-01-15 | 789         | ERROR
2023-01-04 | 2023-01-07 | 789         | ERROR
2023-01-01 | 2023-01-10 | 012         | OK
2023-01-15 | 2023-01-20 | 012         | OK
I also found some similar past questions, but they don't have the "for the same account" component and/or require some sort of chronological order, which my sheet will not have.
How to calculate the overlap between some Google Sheet time frames?
How to check if any of the time ranges overlap with each other in Google Sheets
Another approach (to be entered in D2):
=arrayformula(lambda(last_row,
lambda(acc_no,start_date,end_date,
if(isnumber(match(acc_no,unique(query(query(split(flatten(acc_no&"|"&split(map(start_date,end_date,lambda(start_date,end_date,join("|",sequence(1,end_date-(start_date-1),start_date)))),"|")),"|"),"select Col1,count(Col2) where Col2 is not null group by Col1,Col2",0),"select Col1 where Col2>1",1)),0)),"ERROR","OK"))(
C2:index(C2:C,last_row),A2:index(A2:A,last_row),B2:index(B2:B,last_row)))(
counta(A2:A)))
Briefly, we are creating a sequence of dateserial numbers between the start & end dates for each row, doing some string manipulation to turn it into a table of account number against each date, then QUERYing it to get each account number which has dateserials with count>1 (i.e. overlaps), using UNIQUE to get the distinct list of those account numbers, then finally matching this list against the original list of account numbers to give the ERROR/OK output.
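The steps described above can be sketched outside of Sheets as follows. This is only an illustration of the idea in Python, not a translation of the formula; the function name flag_overlaps and the row layout (start, end, account) are assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def flag_overlaps(rows):
    """rows: list of (start, end, account). Returns 'ERROR' or 'OK' per row."""
    # Expand every range into its individual days and count (account, day) pairs.
    day_counts = Counter()
    for start, end, account in rows:
        for offset in range((end - start).days + 1):
            day_counts[(account, start + timedelta(days=offset))] += 1

    # An account has an overlap when any of its days is covered more than once.
    bad_accounts = {account for (account, _day), n in day_counts.items() if n > 1}

    # Match the list of overlapping accounts back against the original rows.
    return ["ERROR" if account in bad_accounts else "OK" for _s, _e, account in rows]


rows = [
    (date(2023, 1, 1), date(2023, 1, 2), "123"),
    (date(2023, 1, 2), date(2023, 1, 5), "123"),
    (date(2023, 2, 25), date(2023, 2, 27), "456"),
    (date(2023, 1, 11), date(2023, 1, 12), "456"),
    (date(2023, 1, 1), date(2023, 1, 15), "789"),
    (date(2023, 1, 4), date(2023, 1, 7), "789"),
    (date(2023, 1, 1), date(2023, 1, 10), "012"),
    (date(2023, 1, 15), date(2023, 1, 20), "012"),
]
print(flag_overlaps(rows))
```

As sketched here, every row of an account that has any overlap is flagged, because the final step matches on the account number rather than on the individual row.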
(1) Here is one way, considering each case which could result in an overlap separately:
=ArrayFormula(if(A2:A="",,
if((countifs(A2:A,"<="&A2:A,B2:B,">="&A2:A,C2:C,C2:C,row(A2:A),"<>"&row(A2:A))
+countifs(A2:A,"<="&B2:B,B2:B,">="&B2:B,C2:C,C2:C,row(A2:A),"<>"&row(A2:A))
+countifs(A2:A,">="&A2:A,B2:B,"<="&B2:B,C2:C,C2:C,row(A2:A),"<>"&row(A2:A))
)>0,"ERROR","OK")
)
)
(2) Here is the method using the Overlap formula
min(end1,end2)-max(start1,start2)+1
which results in
=ArrayFormula(if(byrow(A2:index(C:C,counta(A:A)),lambda(r,sum(text(if(index(r,2)<B2:B,index(r,2),B2:B)-if(index(r,1)>A2:A,index(r,1),A2:A)+1,"0;\0;\0")*(C2:C=index(r,3))*(row(A2:A)<>row(r)))))>0,"ERROR","OK"))
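The overlap formula min(end1,end2)-max(start1,start2)+1 is easier to follow in isolation. The sketch below, in Python, clamps non-positive results to 0 (which is what the "0;\0;\0" number format appears to be doing in the formula) and then sums the overlap of each row against every other row of the same account. The names overlap_days and check_rows are hypothetical.

```python
from datetime import date


def overlap_days(start1, end1, start2, end2):
    """Inclusive overlap length in days; 0 when the ranges do not meet."""
    return max(0, (min(end1, end2) - max(start1, start2)).days + 1)


def check_rows(rows):
    """rows: list of (start, end, account). A row is ERROR when it shares
    at least one day with another row of the same account."""
    results = []
    for i, (start, end, account) in enumerate(rows):
        total = sum(
            overlap_days(start, end, other_start, other_end)
            for j, (other_start, other_end, other_account) in enumerate(rows)
            if j != i and other_account == account
        )
        results.append("ERROR" if total > 0 else "OK")
    return results


# Two ranges that share exactly one day (2 January).
print(overlap_days(date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 2), date(2023, 1, 5)))
```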
(3) Most efficient is to use the original method of comparing previous and next dates, but then you need to sort and sort back like this:
=lambda(data,sort(map(sequence(rows(data)),lambda(c,if(if(c=1,0,(index(data,c-1,2)>=index(data,c,1))*(index(data,c-1,3)=index(data,c,3)))+if(c=rows(data),0,(index(data,c+1,1)<=index(data,c,2))*(index(data,c+1,3)=index(data,c,3)))>0,"ERROR","OK"))),index(data,0,4),1))(SORT(filter({A2:C,row(A2:A)},A2:A<>""),3,1,1,1))
However, this only checks for local overlaps, not global ones. You can see what I mean if you change the dataset slightly:
Clearly the first and third pair of dates have an overlap but G4 contains "OK". This is because each pair of dates is only checked against the adjacent pairs of dates. This also applies to the original reference cited by OP - here's an example where it would give a similar result:
The formula posted by @The God of Biscuits gives the correct (global) result :-)
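The difference between a local (adjacent-rows) check and a global check can be reproduced with a small Python sketch. The dataset below is invented for illustration and is not the one from the original screenshots; adjacent_check and global_check are hypothetical names. The adjacent check sorts by account and start date and compares each row only with its neighbours, as in method (3).

```python
from datetime import date


def adjacent_check(rows):
    """Compare each row only with the previous and next row after sorting."""
    order = sorted(range(len(rows)), key=lambda i: (rows[i][2], rows[i][0]))
    result = [None] * len(rows)
    for pos, i in enumerate(order):
        start, end, account = rows[i]
        hit = False
        if pos > 0:
            _p_start, p_end, p_account = rows[order[pos - 1]]
            hit = hit or (p_account == account and p_end >= start)
        if pos < len(order) - 1:
            n_start, _n_end, n_account = rows[order[pos + 1]]
            hit = hit or (n_account == account and n_start <= end)
        result[i] = "ERROR" if hit else "OK"
    return result


def global_check(rows):
    """Compare each row with every other row of the same account."""
    result = []
    for i, (start, end, account) in enumerate(rows):
        hit = any(
            j != i and a2 == account and s2 <= end and start <= e2
            for j, (s2, e2, a2) in enumerate(rows)
        )
        result.append("ERROR" if hit else "OK")
    return result


# Hypothetical data: the third range sits inside the first one,
# but its sorted neighbour (the second range) does not touch it.
rows = [
    (date(2023, 1, 1), date(2023, 1, 10), "A"),
    (date(2023, 1, 2), date(2023, 1, 3), "A"),
    (date(2023, 1, 5), date(2023, 1, 6), "A"),
]
print(adjacent_check(rows))
print(global_check(rows))
```

With this data the adjacent check reports the third row as OK even though it lies entirely within the first range, while the global check flags all three rows.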

How to remove a piece of text from a cell?

I'm trying to remove a piece of text (Performance) from a column in a Google spreadsheet that contains (XX Performance), where XX is a number like 89. I'm using:
=REGEXREPLACE(D:D, " Performance "," - ")
But no love...
Try this Example Sheet
=ArrayFormula(IF(D2:D="",, REGEXEXTRACT(D2:D, "[0-9]+")))
You can use the expression \D+:
\D matches any character that's not a digit (equivalent to [^0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed
The formula will be like:
=REGEXREPLACE(D:D, "\D+","")
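For comparison, here is a minimal Python sketch of the two approaches in this thread: removing every non-digit run, and extracting the first run of digits. The sample value is an assumption based on the question's description ("XX Performance" with XX = 89), and Python's re module stands in for the regex engine used by Sheets.

```python
import re

value = "89 Performance"

# Like REGEXREPLACE(D:D, "\D+", ""): delete every run of non-digit characters.
digits_only = re.sub(r"\D+", "", value)

# Like REGEXEXTRACT(D2:D, "[0-9]+"): take the first run of digits.
first_number = re.search(r"[0-9]+", value).group(0)

print(digits_only, first_number)
```

The two give the same result when a cell holds a single number; if a cell contained several numbers, the replace approach would glue their digits together, while the extract approach would return only the first.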
UPDATE
I did put it in another column otherwise it creates a circular dependency. The data is imported via API from another app.
Then you will need to create another sheet or use a hidden column to hold that information, and then apply the regex in the column where you want the final result.

Increment Number By Value With Condition - Array Formula

I'm confused by the following condition. I want an ARRAYFORMULA, or maybe a custom formula, that increments a number according to the values in other columns. Put simply:
if the group doesn't change and the sub-group is different, increment the number by 1
else if the group doesn't change and the sub-group doesn't change (same), hold the previous value
else if the group changes, regardless of the sub-group value, reset the number back to 1
For illustration:
** Note: Number is the result that I want; in the example I filled it in manually.
Group   | Sub-Group | Animal    | Number
Land    | poisonus  | snake     | 1
land    | friendly  | dog       | 2
land    | friendly  | cat       | 2
land    | scary     | lion      | 3
aquatic | friendly  | nemo fish | 1
aquatic | predator  | shark     | 2
UPDATE (dummy file link) :
https://docs.google.com/spreadsheets/d/1DAPf-DvWz50_DJ0IqAoSHbfEnfg_mN1lNXHcCjkj27M/edit#gid=0
try:
=INDEX(IF(A4:A="",,VLOOKUP(A4:A&B4:B, {UNIQUE(A4:A&B4:B), COUNTIFS(
REGEXEXTRACT(UNIQUE(A4:A&"×"&B4:B), "(.*)×"),
REGEXEXTRACT(UNIQUE(A4:A&"×"&B4:B), "(.*)×"),
SEQUENCE(COUNTA(UNIQUE(A4:A&"×"&B4:B))), "<="&
SEQUENCE(COUNTA(UNIQUE(A4:A&"×"&B4:B))))}, 2, 0)))
I have entered my solution in cell D1 of the sheet "Erik Help." As I said in the comments to your original post, this is a more complex solution than I can generally offer here on the free, volunteer-run forums. I did choose to develop and share the formula with you, but I will need to leave it to you (and any other future site visitors who may be interested) to study the formula for understanding how it works. Explaining the formula would take longer than writing it.
Here is the formula:
=ArrayFormula({"Number"; IF(A2:A="",,VLOOKUP(LOWER(A2:A&B2:B),QUERY({UNIQUE(FILTER({A2:B,A2:A&B2:B},A2:A<>"")),COUNTIFS(QUERY(UNIQUE(FILTER({A2:B,A2:A&B2:B},A2:A<>"")),"Select Col1"),QUERY(UNIQUE(FILTER({A2:B,A2:A&B2:B},A2:A<>"")),"Select Col1"),SEQUENCE(COUNTA(QUERY(UNIQUE(FILTER({A2:B,A2:A&B2:B},A2:A<>"")),"Select Col1"))),"<="&SEQUENCE(COUNTA(QUERY(UNIQUE(FILTER({A2:B,A2:A&B2:B},A2:A<>"")),"Select Col1"))))},"Select Col3, Col4"),2,FALSE))})
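Since neither formula is explained, the rule from the question can be sketched procedurally in Python to make the expected output easier to verify. This is not a translation of either formula; the function name number_rows is hypothetical, and the sketch assumes that rows of the same group and sub-group are adjacent and that comparison is case-insensitive (the sample data mixes "Land" and "land").

```python
def number_rows(rows):
    """rows: list of (group, sub_group). Returns the running number per row."""
    numbers = []
    prev_group = prev_sub = None
    current = 0
    for group, sub_group in rows:
        # Assumption: "Land" and "land" are the same group.
        g, s = group.lower(), sub_group.lower()
        if g != prev_group:
            current = 1        # group changed: reset to 1
        elif s != prev_sub:
            current += 1       # same group, new sub-group: increment
        # same group and same sub-group: hold the previous value
        numbers.append(current)
        prev_group, prev_sub = g, s
    return numbers


rows = [
    ("Land", "poisonus"),
    ("land", "friendly"),
    ("land", "friendly"),
    ("land", "scary"),
    ("aquatic", "friendly"),
    ("aquatic", "predator"),
]
print(number_rows(rows))
```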

Remove everything before and after a word that begins with "#" in Google Sheets?

I have an IFTTT setup that writes to a Google Sheet. The content of the cell, directly from the source, is a sentence. Right now I'm manually cleaning up the cells as they come in, but since this is a recurring sheet that generates content, it's been time-consuming.
The content will always have "#" with the word after it. Examples:
Here is an #example lol words here
#AnotherExample or this
Is there a formula to remove all the content before and after the # word, so the result would be:
example
AnotherExample
I kept trying the =REGEXREPLACE formula but I can't seem to make it work for my use case. Any help is appreciated!
Something like:
=REGEXEXTRACT(A1,"#(\S*)")
# - Match a literal "#".
(\S*) - 0+ non-whitespace characters captured in a group.
REGEXEXTRACT() will then extract this capture group. You could also use #(\w*) to capture 0+ word characters if your input can be something like "test1 #test2, test3".
Here is an array variant:
=INDEX(IF(A1:A="","",REGEXEXTRACT(A1:A,"#(\w*)")),)
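The difference between \S* and \w* is easiest to see side by side. Below is a minimal Python sketch using the sample sentences from the question; Python's re module is assumed as a stand-in for the engine used by REGEXEXTRACT, and the helper name extract_tag is made up.

```python
import re


def extract_tag(text, pattern):
    """Return the first capture group of pattern in text, or '' if no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else ""


# \S* keeps everything up to the next whitespace character.
print(extract_tag("Here is an #example lol words here", r"#(\S*)"))
print(extract_tag("#AnotherExample or this", r"#(\S*)"))

# With trailing punctuation, \S* keeps the comma while \w* stops before it.
print(extract_tag("test1 #test2, test3", r"#(\S*)"))
print(extract_tag("test1 #test2, test3", r"#(\w*)"))
```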

QUERY() to get rows below a certain cell

I'm downloading a CSV file from an external API. It returns a table with the following structure:
foo
bar 1 4
baz 2 3
Is there any way to make a QUERY (or some other function?) to get the first two rows below the foo cell?
There are several other occurrences of bar and baz rows, that's why I only want the ones below the foo cell. Doable?
Here is an alternative that uses REGEXEXTRACT and CONCATENATE, with the REPT function to repeat the capture group for each of the values, assuming the URL is in A1:
=regexextract(concatenate(IMPORTDATA(A1)),"(Basic)"&rept("(\d\.\d{1,3})",6))
=regexextract(concatenate(IMPORTDATA(A1)),"(Diluted)"&rept("(\d\.\d{1,3})",6))
comes out looking like this:
If you want to be doubly sure, add "Earnings per share" in front of "Basic":
=regexextract(concatenate(IMPORTDATA(A1)),"Earnings per share(Basic)"&rept("(\d\.\d{1,3})",6))
=regexextract(concatenate(IMPORTDATA(A1)),"Earnings per shareBasic.*(Diluted)"&rept("(\d\.\d{1,3})",6))
To have the output all on one line, one after the other:
=regexextract(concatenate(IMPORTDATA(A1)),"Earnings per share(Basic)"&rept("(\d\.\d{1,3})",6)&"(Diluted)"&rept("(\d\.\d{1,3})",6))
If you want to see the raw data, remove the regex part and leave just the CONCATENATE and the IMPORTDATA. The regex part skips the beginning portion and then specifies which pieces to capture using the parentheses. These are called capture groups. Anything outside of them is matched but not returned.
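The trick of building a pattern by repeating a capture group, as REPT does above, can be sketched in Python. The input string below is invented to resemble cells concatenated without separators; it is not the real imported data, and extract_after_label is a hypothetical helper. Python's re module is used as a stand-in, so edge cases may behave differently from the RE2 engine in Sheets.

```python
import re


def extract_after_label(text, label, count):
    """Capture the label and the `count` decimal values that follow it."""
    # "(Basic)" & REPT("(\d\.\d{1,3})", count) built as a Python string.
    pattern = "(" + re.escape(label) + ")" + r"(\d\.\d{1,3})" * count
    match = re.search(pattern, text)
    return list(match.groups()) if match else None


# Hypothetical raw text: values run together because the cells were concatenated.
raw = "Earnings per shareBasic1.101.202.30Diluted1.051.152.25"

print(extract_after_label(raw, "Basic", 3))
print(extract_after_label(raw, "Diluted", 3))
```

Because the values are glued together, the pattern relies on each following group needing a digit and a dot to find where one value ends and the next begins.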
try this formula:
=QUERY(FILTER(A1:C13,row(A1:C13)>MATCH("foo",A1:A13,0)),"select * limit 2")
example workbook
To use imported data instead of range, use:
=QUERY(FILTER(Data,row(Data)>MATCH("foo",query(Data,"select Col1"),0)),"select * limit 2")
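The FILTER plus MATCH approach amounts to "find the row holding the marker, then take the next N rows". A minimal Python sketch of that logic follows; the table contents and the function name rows_below are assumptions for illustration, with extra bar rows added to show that only the ones directly below foo are returned.

```python
def rows_below(table, marker, count):
    """Return the `count` rows that directly follow the first row whose
    first cell equals `marker`; an empty list if the marker is absent."""
    for index, row in enumerate(table):
        if row and row[0] == marker:
            return table[index + 1 : index + 1 + count]
    return []


# Hypothetical table with other "bar" rows before and after the "foo" block.
table = [
    ["bar", 9, 9],
    ["foo", None, None],
    ["bar", 1, 4],
    ["baz", 2, 3],
    ["bar", 5, 6],
]
print(rows_below(table, "foo", 2))
```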
Try:
=query(A2:C,"select * where A='bar' or A='baz' limit 2")

Resources