Replace comma chars within importdata function - google-sheets

I'm importing a .csv file with the IMPORTDATA function. The separator is ; and the decimal character is , so Google Sheets automatically splits the text into columns at every comma. I guess this is the expected behavior of IMPORTDATA, but as a result my file is not parsed correctly.
I've tried using the SUBSTITUTE function to replace , with . but I guess the text-to-columns split is applied inside IMPORTDATA, before SUBSTITUTE ever sees the data.
=ARRAYFORMULA(SPLIT(SUBSTITUTE(IMPORTDATA("https://drive.google.com/uc?export=download&id=1hosZrfgrKnJJgXkgmPZSKdFoYV_AxKJS"), ",", "."), ";"))
Is there any way to import a CSV with ; as a separator and , as a decimal symbol using a single formula?
I've seen solutions using multiple sheets but I'd like to keep it simple.

=ARRAYFORMULA(SPLIT(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IMPORTDATA(
"https://drive.google.com/uc?export=download&id=1hosZrfgrKnJJgXkgmPZSKdFoYV_AxKJS")), ,
999^99))), " ", "."), ";"))
To also handle values that themselves contain spaces:
=ARRAYFORMULA(SUBSTITUTE(SPLIT(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(SUBSTITUTE(
IMPORTDATA("https://drive.google.com/uc?export=download&id=1hosZrfgrKnJJgXkgmPZSKdFoYV_AxKJS"),
" ", "♠")), , 999^99))), " ", "."), ";"), "♠", " "))

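The reasoning behind the two formulas above can be traced outside Sheets. The Python sketch below is only an illustration, assuming IMPORTDATA has already split each line on commas; the function name rebuild_row and the sample row are hypothetical and not part of the original answer.

```python
def rebuild_row(cells, protect_spaces=False):
    """Mimic the formula: re-join comma-split cells, then split on ';'."""
    placeholder = "♠"  # same stand-in character the second formula uses
    if protect_spaces:
        # Hide genuine spaces so they are not mistaken for cell boundaries.
        cells = [c.replace(" ", placeholder) for c in cells]
    # The QUERY(..., , 999^99) step joins the cells of a row with single spaces.
    joined = " ".join(cells).strip()
    # Each remaining space marks a decimal comma that IMPORTDATA consumed.
    joined = joined.replace(" ", ".")
    fields = joined.split(";")
    if protect_spaces:
        # Restore the genuine spaces after splitting.
        fields = [f.replace(placeholder, " ") for f in fields]
    return fields


# Hypothetical line "1,5;2,25;New York" after IMPORTDATA split it on commas:
row = ["1", "5;2", "25;New York"]
print(rebuild_row(row))                       # ['1.5', '2.25', 'New.York']
print(rebuild_row(row, protect_spaces=True))  # ['1.5', '2.25', 'New York']
```

The first call shows why the second formula exists: without the placeholder, a space inside a value is indistinguishable from a cell boundary and gets turned into a dot.
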
Related

Google Sheets: Remove comma between specific cells in an TEXTJOIN

I'm using an ARRAYFORMULA / TEXTJOIN formula in Google Sheets to pull selected data together to make a single line of code arranged in a specific way for my project.
The resulting array needs to both INCLUDE commas in the first half, as well as EXCLUDE commas towards the end of the same formula.
example of commas needing removed
I'm currently using ", " as the delimiter in my TEXTJOIN, which works for placing a , between each cell; however, I also need the last few cells (in this case: I9, O5, O6, O7, O8) to not have any commas between them.
Is there a way to do this?
Thank you in advance!
Here is a demo of what I'm working on:
https://docs.google.com/spreadsheets/d/1gTQiNKy4c376FuIWQQAomlJ6J1utCOjuq6JzRplTSu4/edit?usp=sharing
Option 01
=TEXTJOIN(", ",1,
TEXTJOIN(", ",1,C5:F8),
TEXTJOIN(", ",1,C3,I5))&", "&
TEXTJOIN(", ",1,I6:L9)&" "&
TEXTJOIN(" ",1,O5:O8)
Option 02
Use this formula to replace the last set of commas
=REGEXREPLACE(B12,
REGEXEXTRACT(B12&"", " --ar.+?(,.+)"),
REGEXREPLACE(REGEXEXTRACT(B12&"", " --ar.+?(,.+)"), ",", ""))
Try this simpler formula (based on your formula)
=INDEX(concatenate("signal code: ",
TEXTJOIN(", ",1,C5:C8,C3,I5:I8) & " "
& TEXTJOIN(" ", 1,I9, O5, O6, O7, O8)))
I got the answer from Reddit:
If your formula is:
=ARRAYFORMULA(concatenate("signal code: ",TEXTJOIN(", ",TRUE,
$C$5:$F$5,$C$6:$F$6,$C$7:$F$7,$C$8:$F$8,$C3,$I5,$I6:L6,
$I7:L7,$I8:L8,$I9,$O5, O6, O7, O8)))
Then just change the separator (in this example, to a space) for those last few cells:
=ARRAYFORMULA(concatenate("signal code:",TEXTJOIN(",",TRUE,
$C$5:$F$5,$C$6:$F$6,$C$7:$F$7,$C$8:$F$8,$C3,$I5,$I6:L6,$I7:L7,
$I8:L8)& " " & TEXTJOIN(" ", TRUE, $I9, $O5, O6, O7, O8)))
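The idea shared by these answers (one TEXTJOIN per delimiter, glued together with &) can be sketched in Python. This is only an illustration under assumed inputs; the function name build_signal_code and the sample values are hypothetical.

```python
def build_signal_code(comma_items, space_items, prefix="signal code: "):
    """Join the first group with ', ' and the trailing group with ' '."""
    # Skip blanks, as TEXTJOIN does when its second argument is TRUE.
    head = ", ".join(str(v) for v in comma_items if v != "")
    tail = " ".join(str(v) for v in space_items if v != "")
    return prefix + head + " " + tail


print(build_signal_code(["a", "", "b", "c"], ["--ar", "16:9"]))
# signal code: a, b, c --ar 16:9
```

Each group keeps its own delimiter, so nothing has to be removed afterwards with a regular expression.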

Horizontally Concatenate Array of Columns with delimiter and ignore blank columns in google sheets [duplicate]

This question already has an answer here:
Concatenate non empty cells in each row with arrayformula in google sheets
(1 answer)
Closed 6 months ago.
The shared sheet shows rows with multiple columns, each of which can be concatenated horizontally with a comma and a space between values using TEXTJOIN(", ", TRUE, A2:D2), and blank cells are ignored. But as far as I know, TEXTJOIN cannot be applied row by row inside ARRAYFORMULA, and I would like to find a suitable replacement that can also be combined as a string along with other strings of information.
I want to be able to use this as an independent formula string that might be added to other strings of information. For example, "Favorite colors: "& textjoin(", ",1,A2:D2)&"Favorite foods:"&textjoin(", ",1,E2:G2)&"...
Possible solutions
May be a variant of one of the following:
Modifying this so it could be used w/ an array formula JOIN("~", SPLIT(JOIN(CHAR(60000), B3:E3), CHAR(60000)))
Modifying this formula works with join also JOIN(", ",FILTER(H2:H,H2:H<>""))
Using a combination of IF(A2:A<>"", ...) along with a regex replacement at the end (see my answer below), but this could be a very long formula compared to TEXTJOIN if there are many columns
An ideal solution would be concise and look something like this:
arrayformula(TEXTJOIN(", ", TRUE, A2:A,B2:B,C2:C))
Shared sheet is here
use:
=INDEX(REGEXREPLACE(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(A2:D="",,A2:D&",")),,9^9))), ",$", ))
Using a series of IF statements, adding a delimiter and then removing any trailing delimiters can be accomplished using: Arrayformula(regexreplace(if(A2:A100<>"",A2:A100&", ","")&if(B2:B100<>"",B2:B100&", ","")&if(C2:C100<>"",C2:C100&", ","")&if(D2:D100<>"",D2:D100&", ",""),", $",""))
Use a query smush, like this:
=transpose(query(transpose(A2:D), "", 9^9))
The formula will separate values with spaces. To separate with commas and remove unwanted white space, use trim() and substitute() or regexreplace(), like this:
=arrayformula( substitute( trim( transpose( query( transpose(A2:D), "", 9^9 ) ) ), " ", ", " ) )
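To make the intended per-row result concrete, here is a small Python sketch. It is an illustration only, with hypothetical function names and sample rows. It also shows one caveat of the substitute() variant: because every space becomes ", ", a value that itself contains a space is broken apart.

```python
def join_non_blank(rows, delimiter=", "):
    """Return one string per row, joining only the non-blank cells."""
    return [delimiter.join(cell for cell in row if cell != "") for row in rows]


def smush_then_substitute(rows):
    """Mimic trim() on the query smush, followed by substitute(" ", ", ")."""
    # " ".join(...).split() collapses repeated spaces the way TRIM does.
    return [" ".join(" ".join(row).split()).replace(" ", ", ") for row in rows]


rows = [["red", "", "blue", ""], ["", "dark green", "", "pink"]]
print(join_non_blank(rows))         # ['red, blue', 'dark green, pink']
print(smush_then_substitute(rows))  # ['red, blue', 'dark, green, pink']
```

Appending the delimiter to each non-blank cell before the smush, as the REGEXREPLACE formula above does, avoids that problem.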

Variable for joining a header value AND/OR a delimiter to ordered values in Google Sheets (1B)

In this shared Google Sheet there are values in a range, concatenated in a specified order. How can I maintain the specified order while specifying 2 variables:
A) when to combine the "header value" and "cell value" into a header/value pair and
B) when to add a delimiter
A header/value pair are displayed with the following equation:
=ARRAYFORMULA(TRIM(SUBSTITUTE(TRIM(FLATTEN(QUERY(REGEXREPLACE(SORT(TRANSPOSE(
IF((A2:D4<>"")*(REGEXMATCH(A1:D1, "^\d+_")), "♥"&A1:D1&" "&A2:D4, )),
FLATTEN(A1:D1), 1), "\d+_", ),,9^9))), "♥", CHAR(10)))
All header values with a preceding number are displayed in order in the above image.
Goals
However, as in the following image, there are times when it would be useful:
not to include the header value
to add one or more variable delimiters (i.e. CHAR(10), " , ") after the cell value string
Possible approach:
In the above image, a number plus an underscore, e.g. 1_Name (B1), can represent a parameter to include the header/value pair, and a number plus a tilde, e.g. 4~Math (A1), can represent a parameter to exclude the header value.
Zero or more delimiter variables (more than one joined by &) can be provided after a pipe to designate the delimiters to include after the resulting string (value or header/value pair).
Since the above equation already adds a CHAR(10) delimiter, I believe it would make sense to specify values after a pipe to represent a delimiter variable. As in the image below, this parameter would be inserted into the formula in that space.
For example, in B1 the value after the pipe, CHAR(10) & CHAR(10), would add two line breaks. In A1, the value after the pipe, ", ", would simply add a comma without a line break.
The shared sheet is here.
Anything in your row 1 after the pipe | can be accessed only as plain text, not as formula input. It is therefore better to write commas without double quotes and, instead of CHAR(10) & CHAR(10), to use a unique symbol twice, like ♦♦:
see the image for recommended changes highlighted by yellow:
=INDEX(SUBSTITUTE(REGEXREPLACE(SUBSTITUTE(TRIM(FLATTEN(QUERY(REGEXREPLACE(SORT(TRANSPOSE(
TRIM(IF((A2:D4<>"")*((REGEXMATCH(A1:D1, "^\d+_"))+(REGEXMATCH(A1:D1, "^\d+~"))),
IFNA(REGEXEXTRACT(A1:D1, "\d+_(.*:)"))&" "&A2:D4&REGEXEXTRACT(A1:D1, "\|(.*)"), ))),
FLATTEN(A1:D1), 1), "\d+_", ),,9^9))), "♦ ", "♦"), ",$", ), "♦", CHAR(10)))

Converting full names to 'Surname, First name' format on Google Spreadsheets

I have a Google Spreadsheet that records some author names like this:
A
A. Dagliati
A. Zambelli
A.H.M. ter Hofstede
Agnes Bates Koschmider
Ágnes Vathy-Fogarassy
Ahmed B. Najjar
Ala Norani
I want column B to receive some formula such that B will display the last name, a comma, and the first/middle name, like this:
A B
A. Dagliati Dagliati, A.
A. Zambelli Zambelli, A.
A.H.M. ter Hofstede Hofstede, A.H.M. ter
Agnes Bates Koschmider Koschmider, Agnes Bates
Ágnes Vathy-Fogarassy Vathy-Fogarassy, Ágnes
Ahmed B. Najjar Najjar, Ahmed B.
Ala Norani Norani, Ala
How can I do that?
Try this formula in row 2 of your sheet, with the cells below it in that column left empty.
=ArrayFormula(IF(LEN(A2:A),REGEXEXTRACT(A2:A,".+\s(.+)") &", " & LEFT(A2:A,LEN(A2:A)-LEN(REGEXEXTRACT(A2:A,".+\s(.+)") )),""))
=CONCAT(RIGHT(A1,LEN(A1)-FIND("#",SUBSTITUTE(A1," ","#",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))),1)), CONCAT(", ", LEFT(A1, FIND("#",SUBSTITUTE(A1," ","#",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))),1))))
Basically, we cut the text at the last " " (space), take what follows it, append a comma, and then append the text that precedes the last space.
Your best bet is probably to use regular expression replacement. It's not pretty, but it's the easiest way to search for the delimiter and perform the replacement using capture groups. I am still working on how to properly support cases where there is a single name like "A". For now it assumes it is the surname. Here is the formula:
=REGEXREPLACE(A1,"(.*?)([^ ]+)$", "$2, $1")
This function will search for any character non-greedily (.*? see the docs for RE2 here) which will allow the second capture group to find all the characters from the end of the string to the first delimiter which is a space in this case. Since we are using capture groups in the regular expression we can reference them in the replacement string using the $1 and $2 placeholders.
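For anyone who wants to test the pattern outside Sheets, here is a hedged Python sketch. It uses Python's re module rather than RE2, and \2 instead of $2 in the replacement; both function names are hypothetical. It also shows that the pattern above keeps the space before the surname inside the first group, so the result ends with a trailing space. A greedy variant anchored on the last space is one possible alternative (my assumption, not part of the original answer), and it leaves single-word names unchanged.

```python
import re


def surname_first_original(name):
    """The pattern from the formula above, kept for comparison."""
    return re.sub(r"(.*?)([^ ]+)$", r"\2, \1", name)


def surname_first(name):
    """Variant: move the last space-separated word to the front."""
    # Greedy (.*) takes everything up to the last space; ([^ ]+)$ is the surname.
    return re.sub(r"^(.*) ([^ ]+)$", r"\2, \1", name)


print(repr(surname_first_original("A.H.M. ter Hofstede")))  # 'Hofstede, A.H.M. ter '
print(repr(surname_first("A.H.M. ter Hofstede")))           # 'Hofstede, A.H.M. ter'
print(repr(surname_first_original("A")))                    # 'A, '
print(repr(surname_first("A")))                             # 'A'
```

In Sheets, wrapping the original formula in TRIM() should remove the trailing space as well.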
The output is as desired:
You could use a combination of:
SPLIT()
COLUMNS()
INDEX()
ARRAY_CONSTRAIN()
JOIN()
=JOIN(
", ",
INDEX(SPLIT(A2, " "), 1, COLUMNS(SPLIT(A2, " "))),
JOIN(" ", ARRAY_CONSTRAIN(SPLIT(A2, " "), 1, COLUMNS(SPLIT(A2, " "))-1))
)
Get the last name
Split up the name by space SPLIT(A2, " ")
Get the number of names COLUMNS(SPLIT(A2, " "))
Select on the last name INDEX(SPLIT(A2, " "), 1, COLUMNS(SPLIT(A2, " ")))
Get the other names
Split up the name by space again SPLIT(A2, " ")
Get all names except the last name ARRAY_CONSTRAIN(SPLIT(A2, " "), 1, COLUMNS(SPLIT(A2, " "))-1)
Join them back together JOIN(" ", ARRAY_CONSTRAIN(SPLIT(A2, " "), 1, COLUMNS(SPLIT(A2, " "))-1))
Join in desired order
JOIN(", ", LAST_NAME_FORMULA, OTHER_NAMES_FORMULA)
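The steps above map almost one to one onto list operations. The Python sketch below mirrors them for illustration; the function name is hypothetical, and the handling of empty and single-word names is an assumption, since the formula itself does not address those cases.

```python
def last_name_first(full_name):
    """Split on spaces, take the last part, then join in the desired order."""
    parts = full_name.split()            # SPLIT(A2, " ")
    if not parts:
        return ""                        # empty cell: nothing to do (assumed)
    last_name = parts[-1]                # INDEX(..., 1, COLUMNS(...))
    other_names = " ".join(parts[:-1])   # JOIN(" ", ARRAY_CONSTRAIN(..., COLUMNS(...)-1))
    if not other_names:
        return last_name                 # single-word name stays as is (assumed)
    return ", ".join([last_name, other_names])


for name in ["A. Dagliati", "Ahmed B. Najjar", "Ala Norani"]:
    print(last_name_first(name))
# Dagliati, A.
# Najjar, Ahmed B.
# Norani, Ala
```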

Arrayformula generating duplicates in Google Sheets

I seem to have an arrayformula which is generating duplicates for me.
I'll show you the code in column D and then explain the problem:
=arrayformula(
if (C1225:C$1684="A",A1225:A$1684,
if (C1225:C$1684="B",B1225:B$1684,
if (C1225:C$1684="AB",{A1225:A$1684, B1225:B$1684},""))))
If A or B, it should take the content of A or B and put it in D. So I want cell E to be blank UNLESS the content in C was "AB" - only then do I want two cells populated with the data from A and B.
At the moment it's putting out A twice.
For No One|No Reply|ab|For No One|No Reply
Across The Universe (Let It Be Naked...)|The End|a|Across The Universe (Let It Be Naked...)|Across The Universe (Let It Be Naked...)
All You Need Is Love|Twist And Shout|b|Twist And Shout|Twist And Shout
So the first row is OK, but the second two are generating unwanted duplicates.
The problem you are experiencing is because the ARRAYFORMULA is making all responses an Array of 2 items. Since the end result has this size, all results must have the same size.
Try this, changing the ranges to match your needs as well as the 10:
=arrayformula(
if (C2:C$11="A",{A2:A$11, transpose(split(rept(", ", 10), ",", TRUE))},
if (C2:C$11="B",{B2:B$11, transpose(split(rept(", ", 10), ",", TRUE))},
if (C2:C$11="AB",{A2:A$11, B2:B$11},))))
The rept(", ", 10) portion creates a text string which is ", " repeated 10 times, or ", , , , , , , , , , "
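Split() then splits this at each comma, removing the commas, so the result is a series of spaces in this case. The TRUE is the split_by_each argument, which tells Split to divide the text around each character of the delimiter; with a single-character delimiter it makes no difference, so it becomes: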
split(", , , , , , , , , , ", ",", TRUE)
I then use transpose () to change this into rows instead of columns. This needs to be the same number of rows as the other items in the array I am creating using the brackets so I basically get:
{A2:A$11, [Make_Blank_Entries_For_Each_Row]}
So if I did my math correctly, you should use:
=arrayformula(
if (C1225:C$1684="A",{A1225:A$1684, transpose(split(rept(", ", 460), ",", TRUE))},
if (C1225:C$1684="B",{B1225:B$1684, transpose(split(rept(", ", 460), ",", TRUE))},
if (C1225:C$1684="AB",{A1225:A$1684, B1225:B$1684},))))
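The underlying rule (every branch must return the same number of columns, so the one-value cases are padded) can be illustrated with a short Python sketch. It is not a Sheets formula; the function name is hypothetical, and it pads with an empty string where the formula above pads with a space.

```python
def expand_rows(rows):
    """Return a two-column result for each (a, b, flag) row."""
    result = []
    for a, b, flag in rows:
        flag = flag.upper()  # the sample data uses lowercase flags
        if flag == "A":
            result.append([a, ""])   # pad so the row is still two columns wide
        elif flag == "B":
            result.append([b, ""])
        elif flag == "AB":
            result.append([a, b])
        else:
            result.append(["", ""])
    return result


sample = [
    ("For No One", "No Reply", "ab"),
    ("Across The Universe (Let It Be Naked...)", "The End", "a"),
    ("All You Need Is Love", "Twist And Shout", "b"),
]
for row in expand_rows(sample):
    print(row)
# ['For No One', 'No Reply']
# ['Across The Universe (Let It Be Naked...)', '']
# ['Twist And Shout', '']
```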
