Google Sheets Regexextract between specific points in string - google-sheets

CARD TRANSACTION 12NOV21 V DEBIT 007905AED 111.04 Card Ending with 8906 Majid Al Futtaim A89517641 ATM
Hi,
I have the above bank transaction string, and I am looking to extract the description which lies between 8906 and the unique transaction code immediately beginning with A i.e. desired output "Majid Al Futtaim".
I tried:
=regexextract(B2,"8906(.+)A")
But the output I get includes the surrounding spaces plus "Majid Al Futtaim A89517641", whereas I want only "Majid Al Futtaim".
Thanks in advance for your help.

Given the sole string you've shared, this should work:
=REGEXEXTRACT(B2,"8906\s+(.+)\s+\D*\d")
I've allowed for some variation in the pattern within this formula, but I can't account for cases I haven't seen.

I would use this regex pattern:
Card Ending with \d+\s+([A-Za-z]+(?: [A-Za-z]+\b)*)
Your updated sheets code:
=regexextract(B2, "Card Ending with \d+\s+([A-Za-z]+(?: [A-Za-z]+\b)*)")
Here is a regex demo showing that the pattern is working.

It should be:
=TRIM(REGEXEXTRACT(B2, "8906(.+?)A\d"))
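To see why the original attempt over-captures, here is a minimal Python sketch. It uses Python's re module as a stand-in for the RE2 engine behind REGEXEXTRACT, which is an assumption; the two should behave alike for patterns this simple. A greedy (.+) runs on to the last "A" in the string, which is the one in "ATM", while a lazy (.+?) that requires "A" followed by a digit stops at the transaction code. The lazy pattern shown is only one possible variant.

```python
import re

# The sample transaction string from the question.
text = ("CARD TRANSACTION 12NOV21 V DEBIT 007905AED 111.04 "
        "Card Ending with 8906 Majid Al Futtaim A89517641 ATM")

# Greedy: (.+) backtracks only as far as the LAST "A", the one in "ATM".
greedy = re.search(r"8906(.+)A", text).group(1)

# Lazy: (.+?) grows one character at a time and stops at the first
# whitespace that is followed by "A" and a digit.
lazy = re.search(r"8906\s+(.+?)\s+A\d", text).group(1)

print(repr(greedy.strip()))
print(repr(lazy))
```

If real descriptions can themselves contain an "A" followed by a digit, this variant would stop too early, so it should be checked against more sample rows.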

Related

Google Sheet: How to search & extract specific value from a long and multiple line of texts in single cell

Hello and good day everyone!
I need your help and advice. I have a table with data as shown below, and I'd like to use a formula to extract a specific value from a single cell, using a label as an indicator of which line of text to extract.
Given the sample table as below,
Column A:
This is example of the long texts value with multiple line
this text is very long also included value as below,
Company: Apple Inc
Contractor name: John Wick
the value above, is per line.. and this text continue.. continue text..
example text again..

Column B: This is where i'd like to display the Company name extracted from Column A
Column C: This is where i'd like to display Contractor name extracted from Column A
Example of what i want to achieve,
Column A:
This is example of the long texts value with multiple line
this text is very long also included value as below,
Company: Apple Inc
Contractor name: John Wick
the value above, is per line.. and this text continue.. continue text..
example text again..

Column B: Apple Inc
Column C: John Wick
I've tried with
LEFT()
MID()
=LEFT(A2,SEARCH("Company",A2)-1)
=REGEXREPLACE(A2,"(.*)Company(.*)","$2")
with no success.
May I request your advice and help on this, please?
Thanks in advance.
In your situation, how about the following sample formula?
Sample formula:
Retrieve "Company name".
=TRIM(REGEXEXTRACT(A1,"Company:(.+)"))
Retrieve Contractor name
=TRIM(REGEXEXTRACT(A1,"Contractor name:(.+)"))
Testing:
When these formulas are used on the sample data above, they return "Apple Inc" and "John Wick".
Note:
For example, if the base data is in cells A1:A3, you can also use the following formulas.
=ARRAYFORMULA(TRIM(REGEXEXTRACT(A1:A3,"Company:(.+)")))
=ARRAYFORMULA(TRIM(REGEXEXTRACT(A1:A3,"Contractor name:(.+)")))
Reference:
REGEXEXTRACT
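The formulas above depend on "." not matching a line break, so (.+) stops at the end of the labelled line. Below is a minimal Python sketch of the same idea. The line breaks in the sample cell are an assumption reconstructed from the question, and extract_field is a hypothetical helper name, not a Sheets function.

```python
import re

# Sample cell text; the positions of the line breaks are assumed.
cell = (
    "This is example of the long texts value with multiple line\n"
    "this text is very long also included value as below,\n"
    "Company: Apple Inc\n"
    "Contractor name: John Wick\n"
    "the value above, is per line.. and this text continue..\n"
    "example text again.."
)

def extract_field(label, text):
    """Return the rest of the line after 'label:', trimmed, or None."""
    # "." does not match "\n" by default, so (.+) ends at the line break.
    match = re.search(re.escape(label) + r":(.+)", text)
    return match.group(1).strip() if match else None

print(extract_field("Company", cell))
print(extract_field("Contractor name", cell))
```

If the cell held everything on a single line instead, (.+) would run to the end of the text, and a tighter pattern would be needed.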

Remove $ Sign from importXML formula in Google Sheets

I have an issue with the IMPORTXML function and then changing the currency in my portfolio tracker.
IMPORTXML (C3 = IMPORTXML(B3,"//div[@class='priceValue___11gHJ']")) takes the price of a cryptocurrency from coinmarketcap (B3 = https://coinmarketcap.com/currencies/ethereum/). This all works fine (I would prefer to have the prices from coingecko, but cannot figure out the IMPORTXML function for that website; if anyone has valuable input on this too, that would be great).
However, the imported price in C3 has a dollar sign before the actual numbers, which messes up the GOOGLEFINANCE formulas in columns D (D3 = C3*GoogleFinance("CURRENCY:USDEUR")) and E (E3 = C3*GoogleFinance("CURRENCY:USDGBP")). Screenshot of the error attached.
Does anyone know how to fix this?
Much appreciated!
Rob
On cell C3 you could use:
=VALUE(SUBSTITUTE(IMPORTXML(B3,"//div[@class='priceValue___11gHJ']"),"$",""))
References:
SUBSTITUTE
VALUE
If I understand correctly, the problem is that the imported price in C3 has a dollar sign in front of the actual numbers, which breaks further processing by formulas.
To solve the problem, you can wrap the formula in C3 in a MID() formula, which returns a segment of a string. If we specify a starting position of 2, the dollar sign at position 1 is skipped. A pound or euro sign would be skipped as well, as would any currency sign that is one character long.
10 is the segment length, with some margin. Note that if the end of the string is reached before that many characters are read, MID returns the characters from the starting position to the end of the string.
C3:
=MID(IMPORTXML(B3,"//div[@class='priceValue___11gHJ']"),2,10)
MID()
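The two approaches above differ in what they remove: SUBSTITUTE deletes the "$" wherever it appears, while MID drops whatever character is in position 1. A rough Python sketch of both follows. The sample price strings are made up for illustration, and substitute_value and mid are hypothetical helpers that only mimic the Sheets functions.

```python
def substitute_value(price):
    """Mimic VALUE(SUBSTITUTE(price, "$", "")): remove "$", then convert."""
    return float(price.replace("$", ""))

def mid(text, start, length):
    """Mimic MID(text, start, length) with a 1-based start position."""
    # Slicing past the end is safe: like MID, it returns what is left.
    return text[start - 1:start - 1 + length]

print(substitute_value("$3456.78"))   # numeric result
print(mid("$3456.78", 2, 10))         # text result, leading symbol dropped
print(mid("€3456.78", 2, 10))         # works for any one-character symbol
```

Note that the MID-style result is still text here; in this sketch only the SUBSTITUTE variant converts to a number.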
Here's a little workaround to import the prices from coingecko and, of course, remove the $ symbol:
=SUBSTITUTE(IMPORTXML("https://www.coingecko.com/en/coins/ethereum"; "//div[contains(@data-controller,'coins-information')]//span[contains(@data-coin-symbol,'eth')]");"$";"")
But, as you can imagine, if they change the html structure of the site, this will stop working.
Have fun!

Google Sheets: "Bob Smith" --> "bsmith" formula?

I'm trying to pull data from another Google Sheet to feed another Google Sheet. I need to pull from a full name field, which will have something like, "Bob Smith" and then I need to have it rewrite into the new Google Sheet as "bsmith".
Basically, "Get first letter of the first string, then concatenate the entire second string, and then make all lowercase."
So far I've gotten =LEFT(A28,1) working to grab the first letter of a string, but then not sure how to grab the second word and then concatenate.
To get the 2nd word you need to FIND() the first space then read from that position + 1 to the end of the string using MID(). & is used for concatenation.
=lower(left(A28,1) & mid(A28, find(" ", A28) + 1, len(A28)))
Try this for a Google sheet specific solution:
=LOWER(REGEXREPLACE(A2,"^(\w).*?(\w+$)","$1$2"))
It uses a regular expression, which is more flexible and easier to adapt to variations than LEFT and/or MID.
Shorter:
=lower(left(A28)&index(split(A28," "),2))
(Assumes only ever two words.)
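The three formulas agree on a two-word name but diverge once a middle name appears. The Python sketch below mirrors each one so the difference is easy to see. The function names are hypothetical, and "Mary Ann Smith" is a made-up test input.

```python
import re

def via_find_mid(name):
    # Like LEFT + MID/FIND: first letter plus everything after the first space.
    return (name[0] + name[name.find(" ") + 1:]).lower()

def via_regex(name):
    # Like the REGEXREPLACE answer: first letter plus the last word.
    return re.sub(r"^(\w).*?(\w+$)", r"\1\2", name).lower()

def via_split(name):
    # Like LEFT + INDEX(SPLIT(...), 2): first letter plus the second word.
    return (name[0] + name.split(" ")[1]).lower()

for name in ("Bob Smith", "Mary Ann Smith"):
    print(name, via_find_mid(name), via_regex(name), via_split(name))
```

Which behaviour is right depends on how middle names should be treated, which the question does not specify.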

Add data to row if it meets criteria, else ignore

I have raw data in my spreadsheet that comes from a Google Form that looks like the following:
   (Cost)   (Source)    (Frivolous)  (Medium)  (Comments)
   A        B           C            D         E
1  15.94    McDonalds   Yes          Credit    was hungry
2  98.32    School      No           Check     Paid for textbooks
3  843.00   Hospital    No           Check     Surgery
4  0        asdff       Yes          N/A       Ignore this one woops
5
6  23.99    Dentist     No           Credit    Check up
I want this data to always be copied to a different sheet, but ONLY the data that matches a condition. That condition in this case is if Frivolous is No, meaning I only want on this separate page to track valid important spending.
My second page I want them to look like the following:
   (Cost)   (Source)    (Frivolous)  (Medium)  (Comments)
   A        B           C            D         E
1  98.32    School      No           Check     Paid for textbooks
2  843.00   Hospital    No           Check     Surgery
3  23.99    Dentist     No           Credit    Check up
Notice how empty entries are ignored and also entries with Yes under Frivolous are ignored as well.
How would I achieve this? I have absolutely no idea how that would work since I've only been able to achieve this through filter which will not work for this.
I would like to say a few words in defense of Google Spreadsheets and show some great functions that will work, but they are not supported by [excel].
Query
First you may use simple query:
=QUERY(sheet1!A:E,"select * where C = 'No'")
This single short formula will give the desired result, there's no need to fill right and down.
Filter
Actually you may use filter too. This function seems to work too:
=FILTER(sheet1!A:E,sheet1!C:C="No")
Please read more about these functions:
Filter
Query and full Query Language Reference
You'll find many exciting things that could be done in Google spreadsheets.
Actually, I was having some trouble with [google-sheets] ArrayFormula function so I used an old-school formula with SMALL and INDEX function in its array form. In A2,
=iferror(index(Sheet13!A$1:A$99, small(index(row($1:$99)+(Sheet13!$C$1:$C$99<>"no")*1E+99, 0, 0), row(1:1))), "")
Fill both right and down.
So you were in fact correct that this could be solved in [excel] with a solution identical to the one for [google-spreadsheet]. However, there are superior methods in newer [excel] (2010+) using the AGGREGATE function, which [google-spreadsheet] does not support, and I'm sure that [google-sheets] has more elegant functions that I am not recalling right at this moment.
Look to Sheet13 and Sheet14 here for the working sample.
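All of the formulas above apply the same rule: keep a row only when its Frivolous column is "No". This also drops the blank row, because an empty cell is not "No". Here is a minimal Python sketch of that rule using the sample data from the question; the list-of-lists layout is just an illustrative stand-in for the sheet.

```python
# Each row is [Cost, Source, Frivolous, Medium, Comments], as in the question.
rows = [
    ["15.94", "McDonalds", "Yes", "Credit", "was hungry"],
    ["98.32", "School", "No", "Check", "Paid for textbooks"],
    ["843.00", "Hospital", "No", "Check", "Surgery"],
    ["0", "asdff", "Yes", "N/A", "Ignore this one woops"],
    ["", "", "", "", ""],  # the empty form row
    ["23.99", "Dentist", "No", "Credit", "Check up"],
]

FRIVOLOUS = 2  # zero-based index of column C

# Keep only rows where column C is exactly "No".
important = [row for row in rows if row[FRIVOLOUS] == "No"]

for row in important:
    print(row)
```

The comparison here is an exact, case-sensitive match; whether "no" or "NO" should also count depends on how the form records its answers.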

Countif with len in Google Spreadsheet

I have a column XXX like this :
XXX
A
Aruin
Avolyn
B
Batracia
Buna
...
I would like to count a cell only if the string in the cell has a length > 1.
How to do that?
I'm trying :
COUNTIF(XXX1:XXX30, LEN(...) > 1)
But what should I write instead of ... ?
Thank you in advance.
For ranges that contain strings, I have used a formula like below, which counts any value that starts with one character (the ?) followed by 0 or more characters (the *). I haven't tested on ranges that contain numbers.
=COUNTIF(range,"=?*")
To do this in one cell, without needing to create a separate column or use arrayformula{}, you can use sumproduct.
=SUMPRODUCT(LEN(XXX1:XXX30)>1)
If you have an array of True/False values then you can use -- to force them to be converted to numeric values like this:
=SUMPRODUCT(--(LEN(XXX1:XXX30)>1))
Credit to @greg, who posted this in the comments. I think it is arguably the best answer and should be displayed as such. SUMPRODUCT is a powerful function that can often be used to get around shortcomings in COUNTIF-type formulae.
Create another list using an =ARRAYFORMULA(len(XXX1:XXX30)>1) and then do a COUNTIF based on that new list: =countif(XXY1:XXY30,true()).
A simple formula that works for my needs is =ROWS(FILTER(range,LEN(range)>X))
The Google Sheets criteria syntax seems inconsistent, because the expression that works fine with FILTER() gives an erroneous zero result with COUNTIF().
Here's a demo worksheet
Another approach is to use the QUERY function.
This way you can write a simple SQL like statement to achieve this.
For example:
=QUERY(XXX1:XXX30,"SELECT COUNT(X) WHERE X MATCHES '.{1,}'")
To explain the MATCHES criteria:
It is a regex that matches every cell that contains 1 or more characters.
The . operator matches any character.
The {1,} specifies that you only want to match cells that have at least 1 character in them.
Here is a link to another SO question that describes this method.
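Note that "1 or more characters" and "length > 1" are not the same condition. The Python sketch below counts the sample column both ways. It assumes the regex is tested against the whole cell value (hence re.fullmatch), and the .{2,} variant is an illustration of the stricter condition, not something taken from the answers.

```python
import re

# The sample column from the question (without the "..." placeholder).
values = ["A", "Aruin", "Avolyn", "B", "Batracia", "Buna"]

# Direct translation of "count a cell only if its length is > 1".
count_len_gt_1 = sum(len(v) > 1 for v in values)

# '.{1,}' accepts any non-empty value, including the single letters.
count_one_or_more = sum(bool(re.fullmatch(r".{1,}", v)) for v in values)

# '.{2,}' requires at least two characters, matching the length test.
count_two_or_more = sum(bool(re.fullmatch(r".{2,}", v)) for v in values)

print(count_len_gt_1, count_one_or_more, count_two_or_more)
```

On this data the one-or-more pattern counts every row, so a two-or-more pattern is the closer match to what the question asks for.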
