Data cleaning and splitting text manually into columns - google-sheets

In Google Sheets, how can I extract columns from this data?
"aboutYouAnswers" is the first category.
Each category holds a set of questions that people answer on a website, so the numbers ("0", "1", etc.) are question numbers.
The text inside the quotes after each number is the answer. I would like to split these answers into individual columns so I can compare them side by side.
{"aboutYouAnswers":{"0":"eating pizza","1":"lowering the communities IQ","2":"Avax Apes + Origins","3":"ate 100 pizzas in an hour"}}

Option 1
=ArrayFormula(IF(G3:G="",,
BYROW(G3:G, LAMBDA(rg,
LAMBDA(l, TEXTJOIN(CHAR(10), 1, {
CONCAT(FILTER(l, ISODD(MATCH(l,l,0))=TRUE)&" ",
FILTER(l, ISEVEN(MATCH(l,l,0))=TRUE))}))
((REGEXREPLACE(FLATTEN(SPLIT(REGEXEXTRACT(rg, "\:{(.+)\}}"), ",:")), """","")))))))
Option 2
=ArrayFormula(LAMBDA(s, SPLIT(QUERY(FLATTEN(IF(s="",,SPLIT(s, ""&CHAR(10)&""))), " Select Col1 "),"♦"))
(QUERY({
IF(G3:G="",,
BYROW(G3:G, LAMBDA(rg,
LAMBDA(l, TEXTJOIN(CHAR(10), 1, {
CONCAT(FILTER(l, ISODD(MATCH(l,l,0))=TRUE)&"♦",
FILTER(l, ISEVEN(MATCH(l,l,0))=TRUE))}))
((REGEXREPLACE(FLATTEN(SPLIT(REGEXEXTRACT(rg, "\:{(.+)\}}"), ",:")), """",""))))))}, "Select Col1 Where Col1 <>'' ")))
Functions used:
ARRAYFORMULA - LAMBDA - SPLIT - QUERY - FLATTEN - IF - CHAR - BYROW - TEXTJOIN - CONCAT - FILTER - ISODD - MATCH - ISEVEN - REGEXREPLACE - REGEXEXTRACT
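
For comparison, the cell content is valid JSON, so the same extraction can be checked outside Sheets. The following is only a minimal Python sketch of the intended result, not a replacement for the formulas above; the function name `answers_to_row` is hypothetical, and it assumes every cell has the shape shown in the question. It can also be handy when an answer itself contains a comma or colon, which the SPLIT on ",:" would break apart.

```python
import json

def answers_to_row(cell_text, category="aboutYouAnswers"):
    """Return the answers in one cell as a list ordered by question number."""
    answers = json.loads(cell_text)[category]
    # Keys are question numbers stored as strings ("0", "1", ...), so sort numerically.
    return [answers[key] for key in sorted(answers, key=int)]

# Sample cell copied from the question.
sample = ('{"aboutYouAnswers":{"0":"eating pizza","1":"lowering the communities IQ",'
          '"2":"Avax Apes + Origins","3":"ate 100 pizzas in an hour"}}')

row = answers_to_row(sample)
print(row)  # one list item per target column
```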

Related

Selecting a few columns in Google Sheets' QUERY function

I am trying to select a few columns in Google Sheets' QUERY function, but I get errors when I combine it with other formulas inside the function.
Here is my formula. My goal is to combine data (columns) from different sheets that will ultimately feed into a pivot table.
=QUERY({TeamData!C:C,TeamBonusData!F:F;IndividualData!M:O,IndividualBonusData!P:R},) - this does not work
=QUERY({TeamData!C:C,TeamData!F:F},) - this works.
Follow the advice Tedinoz gave and ensure that those references have a matching number of rows and columns. It may help if you visually group ranges so that they stack side-by-side or on top of each other, like this:
=lambda(
teams, teamsBonus, individuals, individualsBonus,
lambda(
numTeams, numIndividuals,
{
array_constrain(teams, numTeams, 2), array_constrain(teamsBonus, numTeams, 2);
array_constrain(individuals, numIndividuals, 2), array_constrain(individualsBonus, numIndividuals, 2)
}
)(
min(rows(teams), rows(teamsBonus)),
min(rows(individuals), rows(individualsBonus))
)
)(
TeamData!C2:D, TeamBonusData!F2:G, IndividualData!M2:O, IndividualBonusData!P2:R
)
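
The rule behind this answer is that blocks placed side by side in an array literal need the same number of rows, and blocks stacked on top of each other need the same number of columns. The Python sketch below illustrates that trimming logic under stated assumptions: the helper names and the sample data are hypothetical, and `constrain` only imitates what array_constrain does in the formula.

```python
def constrain(block, num_rows, num_cols):
    """Imitate array_constrain: keep the first num_rows rows and num_cols columns."""
    return [row[:num_cols] for row in block[:num_rows]]

def side_by_side(left, right):
    """Join two blocks horizontally; they must have the same number of rows."""
    if len(left) != len(right):
        raise ValueError("side-by-side blocks need matching row counts")
    return [l + r for l, r in zip(left, right)]

def stack(top, bottom):
    """Join two blocks vertically; every row must have the same number of columns."""
    if len({len(row) for row in top + bottom}) > 1:
        raise ValueError("stacked blocks need matching column counts")
    return top + bottom

# Hypothetical sample data standing in for the four sheet ranges.
teams = [["t1", "a"], ["t2", "b"], ["t3", "c"]]
teams_bonus = [[10, 1], [20, 2]]
individuals = [["i1", "x", "extra"]]
individuals_bonus = [[5, 6, 7], [8, 9, 10]]

num_teams = min(len(teams), len(teams_bonus))
num_individuals = min(len(individuals), len(individuals_bonus))

combined = stack(
    side_by_side(constrain(teams, num_teams, 2), constrain(teams_bonus, num_teams, 2)),
    side_by_side(constrain(individuals, num_individuals, 2),
                 constrain(individuals_bonus, num_individuals, 2)),
)
print(combined)
```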

Splitting data separated by a | in a cell to new rows

I have a Spreadsheet (see link at the bottom of the page) that has 1 row and 3 columns.
I want to take the data contained and split it out, resulting in a row by row breakdown.
Is anyone aware of how this could be done using a formula? It would save me a bunch of time doing it manually!
DemoSheet - This shows what the input and the desired outputs are
EDIT:
The Input sheet shows the data as I have it, using metasyntactic variables as examples (real data will vary, but will always follow the same formatting).
For every email address in the email column, I need to do the following:
Get the list of managers and members and output it as per the Desired Output 1 sheet. So for each entry in ColA, there should be a row for each item in B and C, as if they were concatenated, split by " | " and transposed vertically.
Repeat the above process but only for managers (as per the Desired Output 2 sheet).
Is this what you need?
Output1:
=arrayformula({"Email","Members";
query(
array_constrain(
{
flatten(split(rept("|"&Input!A2:A,len(regexreplace(Input!B2:B&" | "&Input!C2:C,"[^\|]",))+1),"|")),
trim(flatten(split(Input!B2:B&"|"&Input!C2:C,"|")))
},
max(if(Input!B2:B<>"",len(regexreplace(Input!B2:B&" | "&Input!C2:C,"[^\|]",))+1,))*counta(Input!B2:B),2),
"where Col1 is not null",0)
})
Output2:
=arrayformula({"Email","Manager";
query(
array_constrain(
{
flatten(split(rept("|"&Input!A2:A,len(regexreplace(Input!B2:B,"[^\|]",))+1),"|")),
trim(flatten(split(Input!B2:B,"|")))
},
max(if(Input!B2:B<>"",len(regexreplace(Input!B2:B,"[^\|]",))+1,))*counta(Input!B2:B),2),
"where Col1 is not null",0)
})
You could optionally wrap unique() within the arrayformula if it is likely that you'll get duplicates in the dataset.
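
To make the expected result easier to verify, here is a rough Python sketch of what both outputs compute. It assumes the Input columns are email, managers and members in that order, as the formulas suggest; the function name `explode` and the sample rows are hypothetical.

```python
def explode(rows, include_members=True):
    """Return one (email, name) pair per name found in the pipe-separated cells."""
    result = []
    for email, managers, members in rows:
        joined = managers
        if include_members and members:
            joined = managers + " | " + members
        for name in joined.split("|"):
            name = name.strip()  # same role as trim() in the formula
            if name:
                result.append((email, name))
    return result

# Hypothetical input rows using metasyntactic names.
rows = [
    ("a@example.com", "foo | bar", "baz | qux"),
    ("b@example.com", "quux", "corge | grault"),
]

output1 = explode(rows)                         # managers and members
output2 = explode(rows, include_members=False)  # managers only
print(output1)
print(output2)
```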

Building a portfolio tracker chart based on a list of transactions: a formula/approach is incorrect

I'm trying to build a chart, which tracks month-to-month portfolio assets:
The initial dataset is a basic transaction list (A:G). Next, I'm trying to define a dataset (I:L), where column I contains the first day of each month and columns J:L hold the actual balances on those dates.
The formula in J2 is:
=INDEX(array_constrain(filter(SORT($A$2:$G;1;FALSE); $B$2:$B = J$1; $A$2:$A < $I2); 1; 7); 1; 7)
Unfortunately, this does not work as expected. The formula is also quite complicated, so I would like something simpler.
Any help and links are highly appreciated.
P.S. If you are not doing this as a personal challenge, I suggest you look at this portfolio tracker example: https://investmentmoats.com/stock-market-commentary/portfolio-management/introducing-our-free-stock-portfolio-tracker-spreadsheet/
Here is what I came up with:
On I2 you can add:
=SORT(UNIQUE($A$2:A), 1, true)
This basically gets the unique dates and sorts them automatically.
On J1:
=TRANSPOSE(SORT(UNIQUE($B$2:B), 1, true))
Which is the same but using column B and transposing so it's horizontal.
And now the big one, on J2:
=
ARRAYFORMULA(
IF(
(I2:I<>"")*(J1:1<>"");
IFERROR(
VLOOKUP(
I2:I&"␟"&J1:1;
QUERY(
{ARRAYFORMULA(A2:A&"␟"&B2:B),G2:G};
"select Col1, sum(Col2) where Col2 is not null group by Col1 order by Col1"
);
2;
false
);
0
);
""
)
)
The basic idea is to use QUERY to sum the values and then use VLOOKUP to find which sum goes in each cell. Because VLOOKUP cannot look up two columns at the same time, I join both columns into a single key, using a unit separator ␟ between the two values. Any other character that never appears in the data would work, but this one exists for exactly this purpose. Add a few conditionals to handle errors and empty values, and you have the result.
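
The composite-key idea can be sketched in Python as follows. This is an illustration under assumptions, not the formula itself: the function name `monthly_grid` is hypothetical, each transaction is reduced to a (date, asset, amount) triple standing in for columns A, B and G, and a missing date/asset pair is treated as 0.

```python
from collections import defaultdict

SEP = "␟"  # separator between the two key parts, as in the formula

def monthly_grid(transactions):
    """Sum amounts per date/asset key, then look each grid cell up by its key."""
    sums = defaultdict(float)
    for date, asset, amount in transactions:
        sums[f"{date}{SEP}{asset}"] += amount  # plays the role of QUERY ... group by

    dates = sorted({date for date, _, _ in transactions})     # like SORT(UNIQUE(A2:A))
    assets = sorted({asset for _, asset, _ in transactions})  # like SORT(UNIQUE(B2:B))

    # Exact-match lookup per cell; missing keys fall back to 0.
    grid = [[sums.get(f"{d}{SEP}{a}", 0) for a in assets] for d in dates]
    return dates, assets, grid

# Hypothetical sample transactions.
transactions = [
    ("2023-01-01", "BTC", 1.0),
    ("2023-01-01", "BTC", 0.5),
    ("2023-01-01", "ETH", 2.0),
    ("2023-02-01", "ETH", 3.0),
]

dates, assets, grid = monthly_grid(transactions)
print(dates, assets, grid)
```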
Reference
UNIQUE (Docs Editors Help)
SORT (Docs Editors Help)
TRANSPOSE (Docs Editors Help)
QUERY (Docs Editors Help)
VLOOKUP (Docs Editors Help)
ARRAYFORMULA (Docs Editors Help)

Google sheets - QUERY, IMPORTRANGE and append text to results

I have a Google Sheet (Sheet A) that contains master data. I am importing this data into another Google Sheet (Sheet B) using the IMPORTRANGE function along with QUERY:
=SORT(QUERY(IMPORTRANGE("url for sheet A","Crown DB!A2:E"),"SELECT Col1 WHERE not(Col5='SS')"))
Suppose the following is an example of the output I get after running the above formula:
item1
item2
item3
item4
I want to append text to these returned values so that I can obtain two new values for each returned value:
item1 - var1
item1 - var2
item2 - var1
item2 - var2
item3 - var1
item3 - var2
item4 - var1
item4 - var2
If it were a single variant, I could just append "- Var1" to the above formula:
=SORT(QUERY(IMPORTRANGE("url for sheetA","Crown DB!A2:E"),"SELECT Col1 WHERE not(Col5='SS')")) & "- Var1"
How can I modify the formula to append multiple variants (>=2) to each item returned by IMPORTRANGE? The number of variants is the same for each item.
The simplest approach is to append the values inside an array formula and flatten the result, then apply SORT.
Formula:
=ARRAYFORMULA(SORT(FLATTEN(
QUERY({A2:E},"SELECT Col1 WHERE not(Col5='SS') and not(Col1='')")
& {" - Val1", " - Val2"})))
Note:
I used {A2:E} to make the full formula easier to show while testing. Replace {A2:E} with the IMPORTRANGE call in your case.
not(Col1='') is important: it skips rows where Col1 is blank.
Appending a 1 x m row array to an n x 1 column inside ARRAYFORMULA produces an n x m array. FLATTEN then combines it into a single column, and SORT orders the result.
Final formula should be:
=ARRAYFORMULA(SORT(FLATTEN(QUERY(
IMPORTRANGE("url for sheetA","Crown DB!A2:E"),
"SELECT Col1 WHERE not(Col5='SS') and not(Col1='')"
) & {" - Val1", " - Val2"})))
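
The item-by-variant expansion described in the note can be mirrored in a few lines of Python. This is only a sketch of the logic, assuming the imported column arrives as a list of strings; the function name `with_variants` and the sample items are hypothetical.

```python
def with_variants(items, suffixes):
    """Pair every non-blank item with every suffix, then sort the flat list."""
    return sorted(
        item + suffix
        for item in items if item != ""  # same purpose as not(Col1='')
        for suffix in suffixes
    )

# Hypothetical imported column, including one blank row.
items = ["item2", "item1", "", "item3"]
suffixes = [" - Val1", " - Val2"]

expanded = with_variants(items, suffixes)
print(expanded)
```

Adding a third entry to `suffixes` gives three rows per item, just as adding a third string to the {" - Val1", " - Val2"} array does in the formula.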

Arrayformula concatenating strings - will it slow down my sheet?

Will an array formula like this one slow down my sheet? I am using it to concatenate 3 strings.
ARRAYFORMULA(M3:M & " - " & O3:O & " - " & V3:V)
I will have about 4 similar array formulas in each of about 10 tabs, with 5000 rows each.
There is no heavy calculation in it, so the answer is no; you should be fine. At that scale (the number of rows and tabs you have), the functions that cost the most performance are IMPORTRANGE, QUERY, VLOOKUP and MMULT.
