How can I extract the text within quotes in google sheets? - google-sheets

How can I get just the text within the quotes?
Below shows each line as a cell
a:3:{i:0;s:5:"hello";i:1;s:5:"sdfsf";i:2;s:6:"orange";}
a:4:{i:0;s:5:"hello";i:1;s:3:"How";i:2;s:3:"Are";i:3;s:3:"You";}
a:6:{i:0;s:5:"apple";i:1;s:6:"papaya";i:2;s:6:"Orange";i:3;s:4:"Pear";i:4;s:6:"Banana";i:5;s:9:"Starfruit";}
a:2:{i:0;s:5:"apple";i:1;s:0:"";}
Result that I would like is:
hello,sdfsf,orange
hello,How,Are,You
apple,papaya,Orange,Pear,Banana,Starfruit
apple,

You can use re:
import re
data = """a:3:{i:0;s:5:"hello";i:1;s:5:"sdfsf";i:2;s:6:"orange";}
a:4:{i:0;s:5:"hello";i:1;s:3:"How";i:2;s:3:"Are";i:3;s:3:"You";}
a:6:{i:0;s:5:"apple";i:1;s:6:"papaya";i:2;s:6:"Orange";i:3;s:4:"Pear";i:4;s:6:"Banana";i:5;s:9:"Starfruit";}
a:2:{i:0;s:5:"apple";i:1;s:0:"";}"""
quoted = re.compile('"[^"]*"')
for row in data.split("\n"):
print(",".join(value for value in quoted.findall(row)).replace('"', ""))
This prints:
hello,sdfsf,orange
hello,How,Are,You
apple,papaya,Orange,Pear,Banana,Starfruit
apple,

In case you want to do it using formula, you can use something like this:
=ArrayFormula(TEXTJOIN(",", 1, FILTER(TRANSPOSE(SPLIT(A1, CHAR(34))), IFNA(REGEXEXTRACT(TRANSPOSE(SPLIT(A1, CHAR(34))), "^[a-zA-Z ]*$"), FALSE())<>FALSE())))
Assuming your data is in separate rows.

try:
=ARRAYFORMULA(REGEXREPLACE(TRIM(FLATTEN(QUERY(TRANSPOSE(IFERROR(IF(REGEXMATCH(
SPLIT(REGEXREPLACE(REGEXREPLACE(A1:A, "(:"")", "♂♀"), "("";)", ",♂"), "♂"), "♀"),
SPLIT(REGEXREPLACE(REGEXREPLACE(A1:A, "(:"")", "♂♀"), "("";)", ",♂"), "♂"), ))),,9^9))),
", ♀,|,$|♀", ))

There are a lot of good answers here, but if you want a formula in one cell that gives you the trailing , after apple on the last line (as per your requested output), then you could use:
=arrayformula(regexreplace(flatten(split(textjoin(",",1,if(A1:A<>"",split(regexreplace(regexreplace(A1:A,"""""",char(6655))&char(9999),".\:.|[\:\{\}\;""]","|"),"|"),)),char(9999))),"^,|,$|"&char(6655)&"",))
Assumes your data is in Col A from row 1.
If not:
=arrayformula(regexreplace(flatten(split(textjoin(",",1,if(A1:A<>"",split(regexreplace(A1:A&char(9999),".\:.|[\:\{\}\;""]","|"),"|"),)),char(9999))),"^,|,$",))

Let's say your raw data were in A2:A. You could place the following in cell B2 of an otherwise empty range B2:B ...
=ArrayFormula( IF( A2:A="",, REGEXREPLACE( TRIM( TRANSPOSE( QUERY( TRANSPOSE( IF( REGEXMATCH( SPLIT( A2:A, CHAR(34)), ":|;|\{|\}"),, SPLIT( A2:A, CHAR(34))&",")),, COLUMNS( SPLIT( A2:A, CHAR(34)))))), "\s|[\s,]+$", "")))
This one formula will produce all results for the column. No dragging involved. (Interposed spacing in the formula as shown above is only for the sake of readability here on the site; none of the spaces are actually necessary to the functionality of the formula, and you may remove them if you like.)
If you'd like to pad comma-separated list entries with a space, remove this portion from the formula: \s|.
As to how it works, this is such a custom requirement that I don't think it will be of use to any future site visitors. So I encourage you to dig into it, take it apart and see what the parts do separately and cumulatively. And if you get stuck (if understand it is even important to you at all), feel free to ask any specific question you may have.

Related

mapping vlookup for each element of splitted list in google sheets

This question is also answered here:
Get a vlookup of a cell after split in Google sheet
but not marked as corrected answer, and cannot make it work.
Goal : I want to apply a vlookup function to a split function, so that I can search for corresponding values (found by the vlookup) for each token obtained from a string.
Consider this sheets:
// Sheet 'veggies'
A
apple, pine, tree
pine
// Sheet 'themes':
A
B
C
apple
8
theme1,theme2
tree
3
theme2
pine
1
theme1,theme3
I want to:
split cells of column A of 'veggies' by commas, so to have tokens
vlookup for the C column in 'themes' sheet, by using the index of tokens, for all of them
As approach I tried to first retrieve the frequences of tokens in column B, sheet 'themes', and cannot understand what my formula is doing:
=ARRAYFORMULA( VLOOKUP( split(A2;",");'themes'!A$2:D;2;FALSE))
This formula only get the frequency from column be for the first token, while for others will only report N/A saying could not find a value, but it is clearly there.
Any help?
Am I on the right track ?
P.s. if one would like to offer use of query , like in the other SO answer, please help me to break down what it does.
ARRAYFORMULA( VLOOKUP( split(A2;",");'themes'!A$2:D;2;FALSE))
Your formula works. But when splitting by comma , there's a extra space left over in all the elements from the second element. So, when
apple, pine, tree is splitted, it becomes apple, pine, tree(note the extra space prefix). To fix, you can simply add a space to the split as well:
=ARRAYFORMULA( VLOOKUP( split(A2;", ");'themes'!A$2:D;2;FALSE))
this should work if you want the results in one cell.
=ARRAYFORMULA(TEXTJOIN(", ";TRUE;IFERROR(VLOOKUP(SPLIT(A2;", ";0);themes!A:C,2,0))))
Use this formual
=ArrayFormula(LAMBDA(v,
IF(v="",,{v,SPLIT(VLOOKUP(v,themes!A2:C,3,0), ", ",1)}))
(FLATTEN(IF(A2:A="",,SPLIT(A2:A, ", ", 1)))))

Horizontally Concatenate Array of Columns with delimiter and ignore blank columns in google sheets [duplicate]

This question already has an answer here:
Concatenate non empty cells in each row with arrayformula in google sheets
(1 answer)
Closed 6 months ago.
The shared sheet shows multiple column rows which can be individually concatenated horizontally with a comma & space between using TEXTJOIN(", ", TRUE, A2:D2) and blank spaces are ignored. But textjoin cannot be used in Arrayformula as far as I know and I would like ot find a suitable replacement that can also be combined as a string along with other strings of information.
I want to be able to use this as an independent formula string that might be added to other strings of information. For example, "Favorite colors: "& textjoin(", ",1,A2:D2)&"Favorite foods:"&textjoin(", ",1,E2:G2)&"...
Possible solutions
May be a variant of one of the following:
Modifying this so it could be used w/ an array formula JOIN("~", SPLIT(JOIN(CHAR(60000), B3:E3), CHAR(60000)))
Modifying this formula works with join also JOIN(", ",FILTER(H2:H,H2:H<>""))
Using a combination of IF(a2:A<>"" along with a regex replacement at the end (see my answer below) but this could be very long formula compared to textjoin if there are many columns)
An ideal solution would be concise and look closest to something this:
arrayformula(TEXTJOIN(", ", TRUE, A2:A,B2:B,C2:C)
Shared sheet is here
use:
=INDEX(REGEXREPLACE(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(A2:D="",,A2:D&",")),,9^9))), ",$", ))
Using a series of IF statements, adding a delimiter and then removing any trailing delimiters can be accomplished using: Arrayformula(regexreplace(if(A2:A100<>"",A2:A100&", ","")&if(B2:B100<>"",B2:B100&", ","")&if(C2:C100<>"",C2:C100&", ","")&if(D2:D100<>"",D2:D100&", ",""),", $",""))
Use a query smush, like this:
=transpose(query(transpose(A2:D), "", 9^9))
The formula will separate values with spaces. To separate with commas and remove unwanted white space, use trim() and substitute() or regexreplace(), like this:
=arrayformula( substitute( trim( transpose( query( transpose(A2:D), "", 9^9 ) ) ), " ", ", " ) )

How to remove spaces in Google Sheets query results

The following is combining the results of several cells (F:T) into one cell down several rows. How can I remove any blank spaces that result from combining the cells?
=transpose(
query(
transpose(F:T),,9^9
)
)
I know I can use the TRIM() function, but I'm not sure how to include it in the Query.
use:
=INDEX(SUBSTITUTE(TRANSPOSE(QUERY(TRANSPOSE(F:T),,9^9)), " ", ))
You already know the answer, just use trim, you will notice an array error, to resolve this you just wrap the entire thing in an arrayformula:
=arrayformula(trim(transpose( query( transpose(F:I),,9^9 ) )))
Be careful with query if the majority of your data in a particular column will ever be numbers it will delete the text.

ArrayFormula with GoogleFinance dynamic date

First of all, i'm not a powerful sheets user :)
I'm trying to use GOOGLEFINANCE to calculate amounts in multiple currencies.
I use this formula:
=IF($A2;
IF(
$C2:C;
$C2:C;
IF(
$D2:D;
$D2:D*INDEX(GoogleFinance("CURRENCY:USDUAH";"close";$A2);2;2);
$E2:E*INDEX(GoogleFinance("CURRENCY:EURUAH";"close";$A2);2;2)
));
0)
A-column contains dates,
C,D,E - amounts in 3 different currencies.
IFs are just to prioritize columns :)
The formula works well but i need to "extend" it each time i add row - to increment
$A2 -> $A3 to get rate for specified date.
I try to use ArrayFormula but it turns out it keeps reference to $A2 so i get same rate irrelevant from date specified in A-cells.
I have created sample sheet to illustrate:
https://docs.google.com/spreadsheets/d/1K2TbGIWl7JacYKiWgwwmJfelxJ-7fa9F9obp5XswW18/edit?usp=sharing
I have allowed editing by anyone, so if you decide to edit - please don't remove anything :) also you can drop your username in sticky row(above your proposed solution)
Is there a way to apply ArrayFormula to this to make it work?
Maybe you can provide more readable solution to nested IFs.
try:
=ARRAYFORMULA(IF(A2:A<>"";
IF(C2:C<>""; C2:C;
IF(D2:D<>""; VLOOKUP(TO_TEXT(A2:A);
TO_TEXT(QUERY(GOOGLEFINANCE("CURRENCY:USDUAH";
"close"; MIN(A:A); MAX(A:A)+1);
"offset 1 format Col1'dd.mm.yy'"; 0)); 2; 0)*1;
VLOOKUP(TO_TEXT(A2:A);
TO_TEXT(QUERY(GOOGLEFINANCE("CURRENCY:EURUAH";
"close"; MIN(A:A); MAX(A:A)+1);
"offset 1 format Col1'dd.mm.yy'"; 0)); 2; 0)*1)); ))
There is a new simpler and more flexible method now since the introduction of LAMBDA and its helper functions in Google Sheets in August 2022.
Assuming dates in A2:A, and amounts in UAH, USD, EUR in C2:C, D2:D, E2:E respectively, then the following formula will work, e.g. in cell F2:
=MAP(A2:A;C2:C;D2:D;E2:E;
LAMBDA(date;uah;usd;eur;
IFS(
uah;uah;
usd;usd*INDEX(GOOGLEFINANCE("currency:usduah";"price";date);2;2);
eur;eur*INDEX(GOOGLEFINANCE("currency:euruah";"price";date);2;2);
ISBLANK(date);)))
The trick here is that MAP(LAMBDA) calculates the specified formula for each row of the input array separately (effect similar to manually expanding the formula over the whole range), whereas ARRAYFORMULA passes the whole array as an argument to the formula (GOOGLEFINANCE is special and doesn't work intuitively with such input).
This general method with MAP(LAMBDA) can now be used to pass any arguments to GOOGLEFINANCE in a way one would otherwise expect to do with ARRAYFORMULA.
Try This One:
=arrayformula(
IF(query(arrayformula(if(A2:A="",False,True)),
"Select * where Col1=True"),
IF( $C2:C,
$C2:C,
IF( $D2:D,
$D2:D*INDEX(GoogleFinance("CURRENCY:USDUAH","close",$A2),2,2),
$E2:E*INDEX(GoogleFinance("CURRENCY:EURUAH","close",$A2),2,2))),0))

Combine Text in ArrayFormula

I have a table using Google Sheets. It has three columns that will always have a null value or a specific value for that column. Each line will have one, two, or three values; it will never have three null values on one line. In the fourth column, I want an ArrayFormula that will combine those values and separate the values with a comma if there is more than one.
Here is a photo of what I am trying to accomplish.
I've tried several ideas so far and this formula is the closest I've gotten so far but it's still not quite working correctly; I think it is treating each column as an array before joining rather than doing the function line by line. I'm using the LEN function rather than A2="" or ISBLANK(A2) because columns A-C are ArrayFormulas as well. I realize this probably isn't the most efficient formula to use but I think it covers every possibility. I'm definitely open to other ideas as well.
={"Focus";
ArayFormula(
IFS(
$A$2:$A="", "",
(LEN(A2:A)>0 & LEN(B2:B)>0 & LEN(C2:C)>0), TEXTJOIN(", ", TRUE, A2:A, B2:B, C2:C),
(LEN(A2:A)>0 & LEN(B2:B)>0 & LEN(C2:C)=0), TEXTJOIN(", ", TRUE, A2:A, B2:B),
(LEN(A2:A)>0 & LEN(B2:B)=0 & LEN(C2:C)>0), TEXTJOIN(", ", TRUE, A2:A, C2:C),
(LEN(A2:A)=0 & LEN(B2:B)>0 & LEN(C2:C)>0), TEXTJOIN(", ", TRUE, B2:B, C2:C),
(LEN(A2:A)>0 & LEN(B2:B)=0 & LEN(C2:C)=0), A2:A,
(LEN(A2:A)=0 & LEN(B2:B)>0 & LEN(C2:C)=0), B2:B,
(LEN(A2:A)=0 & LEN(B2:B)=0 & LEN(C2:C)>0), C2:C
)
)
}
Is it possible to achieve this with Google Sheets?
Sample File
Please try:
=ARRAYFORMULA(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(A2:C,ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0)))),,2^99)))," ",", "))
Notes:
The formula will work incorrectly if some names have space inside: like "Aston Martin"
So if you have spaces, please try this:
=ARRAYFORMULA(SUBSTITUTE(
SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(SUBSTITUTE(A2:C," ",char(9)),ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0)))),,2^99)))," ",", "),
CHAR(9)," "))
EDIT
Noticed the shorter variant (without *COLUMN(A2:C)^0) will work:
=ARRAYFORMULA(SUBSTITUTE(
SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(SUBSTITUTE(A2:C," ",char(9)),ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C),0)))),,2^99)))," ",", "),
CHAR(9)," "))
Notes:
I used an old trick to join strings with an array-formula. See sample file
Explanations
If you like to understand any tiered formula, the best way is to split it by parts:
Part 1. Filter the data
FILTER(any_columns,ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0))). this is my way to limit the data range.
The range is open, means it starts from the second row (A2) and
ends in any row.
I want to get the limited array in this step to reduce work that the formula should do. This is done with a condition, if.
ROW(A2:C) must be less or equal to the max row of data.
MAX(IF(LEN(A2:C), some_rows) gives the max row.
If(len.. part checks if a cell has some text inside it.
Note some_rows part:
MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0)))),,2^99))).
ROW(A2:C) must be multiplied by columns, because filter formula
takes only one row into its condition. That is why I multiply by
COLUMN(A2:C)^0 which is columns with 1s. Edit. Now noticed,
that the formula works fine without *COLUMN(A2:C)^0, so it's an
overkill.
Part 2. Join the text
query formula has 3 arguments: data, query_text, and a number_of_header_rows.
data is made with a filter.
query_text is empty, which gives us equivalent to select all
("select *").
And the number of rows of a header is some big number (2^99).
This is a trick: when a query has more headers then one row,
it will join them with space.
After a union is made, transpose function will convert the result back to the column.
Part 3. Substitute and trim
The function trim deletes extra spaces.
Then we replace spaces with the delimiter: ", ". That is why the
formula needs to be modified if spaces are in strings. Correct
result: "Ford, Aston Martin". Incorrect: "Ford, Aston, Martin". But
if we previously replace spaces with some char (char(9) is Tab),
then we do not replace it in this step.

Resources