Google Sheets: Split data and delete first part of each new cell - google-sheets

I'm feeding data from a SaaS into a Google Sheet, and I need to reformat it a bit before I can work with it.
Most columns are ok, but one column has multiple parameters in one. Each cell looks like (data anonymized):
affiliate_fees: None
affiliate_percent: 0.X
amount_refunded: 0
author_fees: 0
author_id: xxxx
author_percent: 0.5
coupon_id: xxxx
created_at: 2016-xxxxx
currency: USD
custom_gateway?: None
earnings_usd: None
meta: {u'url': None, u'class': u'transaction', u'image_url': None, u'description': None, u'name': u'xxxx'}
net_charge: xxx
net_charge_usd: xxx
paypal_payment_id: PAY-XXXXXXX
purchased_at: 2016-xxxx
refundable: True
sale_id: xxxx
status: None
stripe_charge_token: None
stripe_invoice_id: None
total_fedora_fee: None
total_processor_fee: None
user_id: xxxx
vat_fees: None
I've already found out how to SPLIT the data into different columns - I'm doing it via =SPLIT(CC2,CHAR(10))
Now what I'd like to do, ideally in the same operation, is to remove the part before the first colon :
So the goal is to end up with only the values (the part after the :) spread into different columns. I can manually enter the column names. For example:
--------------------------------------------------
| affiliate_fees | affiliate_percent |
--------------------------------------------------
| None | 0.X |
--------------------------------------------------
| ... | ... |
--------------------------------------------------
Any hints? Thanks for your time!
Note: I don't really need the meta: line, it can be discarded. I just left it in there because it might (or might not?) make things extra tricky

Alternative 1
A few months ago Google Sheets introduced "Split text to columns" as a menu command. See Separate cell text into columns for further details.
Once you separate the text, you could use copy & paste > transpose
Alternative 2
A single formula alternative is to use
=ArrayFormula(transpose(REGEXEXTRACT(A1:A25,{"(.*[\w\?])+\:","\: (.*)+"})))
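Because of the transpose, this returns an array of 2 rows by 25 columns, with the parameter names in the first row and the values in the second, so you will not have to add the column headers manually.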
Alternative 3
If you still want to use SPLIT, you could use ": " as the separator and FALSE as the third argument so that the two characters are treated as a single separator, but this will also split the meta: ... line into several columns.
Assume that your data start at A1, then the formula to use is:
=SPLIT(A1,": ",FALSE)
To include all the rows with data, you will have to fill down this formula. Then do copy & paste > transpose.

In this spreadsheet I used this formula in cell E2
=ArrayFormula({regexreplace(split(A3, char(10)), "\:(.+)",""); regexreplace(split(A3, char(10)), "(.+)\: ","")})
This creates a row of headers with the row of values beneath it, starting in row 2. If you don't want the headers, just use
=ArrayFormula(regexreplace(split(A3, char(10)), "(.+)\: ",""))
See if that works for you.
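For anyone who wants to check the logic outside the spreadsheet, here is a minimal Python sketch of the same idea: split the cell on line breaks, then keep only what follows the first ": " on each line. The function name split_cell and the skip parameter are made up for this illustration; they are not part of any Sheets feature.

```python
def split_cell(cell, skip=("meta",)):
    """Split a multi-line 'key: value' cell into parallel header/value lists.

    Hypothetical helper for illustration; `skip` lists keys to discard.
    """
    headers, values = [], []
    for line in cell.splitlines():
        # partition() splits on the FIRST ": " only, so a value that itself
        # contains ": " (such as the meta line) stays in one piece.
        key, sep, value = line.partition(": ")
        if not sep or key in skip:
            continue  # ignore lines without a separator and unwanted keys
        headers.append(key)
        values.append(value)
    return headers, values


cell = "affiliate_fees: None\nauthor_percent: 0.5\nmeta: {u'url': None}\ncurrency: USD"
headers, values = split_cell(cell)
print(headers)
print(values)
```

Splitting on the first separator only is what keeps the meta: line from being spread over several columns, which is the problem with the plain SPLIT approach mentioned above.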

Related

Can I filter out pivot table results that only have one row for a value in column A?

I created a pivot table in Google Sheets, and it returns results that look like:
first | second | CountOf3
--------------------------
thing | value | 23
| newVal | 3
| cool | 34
that | value | 234
otherThing | cool | 4
| newVal | 345
And I want to filter out results with just one resulting row for the item in the first column.
So in this example, that would be the row: that | value | 234.
I would like the filter to remove that row, and leave the remaining rows. This is a pivot table in a 2nd sheet that updates when Sheet1 changes.
I have been trying all day, and have not been able to come up with a solution. I was hoping there would be some sort of filter, or spreadsheet formula to do this. I've tried multiple combinations of filters, but nothing seems to work - I'm starting to wonder if this is even possible.
It isn't pretty, but a brute-force way is to have a check column beside your pivot table, with a formula starting on the first data row, i.e. beside "thing | value | 23".
It flags each row where the subsequent cell in column D is not blank. Then use a query (or filter) to list only the output rows you want. Note that you would hide the columns or rows with the actual (unfiltered) pivot output.
This is the simplest version, to see the logic:
=AND(LEN(D3),LEN(D4))
which returns TRUE for pivot table rows whose first-column item has only one row.
A more elegant version uses an array formula, adds the header label, and uses "Skip" as the flag for the rows to filter out.
={"Better Check";ARRAYFORMULA(IF(LEN(D3:D998)*LEN(D4:D999)*LEN(E3:E998),"Skip",))}
Note that this formula allows for a pivot table result reaching effectively to the bottom of the sheet, but it does have a finite range, due to the constraint of checking two rows at once. It could be enhanced by using COUNTA on the third data column to measure the exact length of the pivot table results and control the range dynamically, like this:
={"Better Check";
ARRAYFORMULA( IF( LEN(INDIRECT("D3:D" & (COUNTA(F$3:F)+ROW(F$2)))) *
LEN(INDIRECT("D4:D" & (COUNTA(F$3:F)+1+ROW(F$2)))),
"Skip",))}
Let us know if this helps at all.
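To make the check-column logic easier to follow, here is a small Python sketch of the simplest version (the AND of two LEN calls). It assumes the first pivot column is available as a list of strings with blanks as empty strings; the function name flag_single_row_groups is invented for this example.

```python
def flag_single_row_groups(first_col):
    """Return True for each row whose own cell and the next cell are both non-blank.

    Mirrors =AND(LEN(D3),LEN(D4)) filled down the first pivot column.
    """
    flags = []
    for i, cell in enumerate(first_col):
        # Treat the position after the last row as blank.
        nxt = first_col[i + 1] if i + 1 < len(first_col) else ""
        flags.append(bool(cell) and bool(nxt))
    return flags


# First column of the example pivot table; "" marks a blank cell.
first_col = ["thing", "", "", "that", "otherThing", ""]
rows = [
    ("thing", "value", 23), ("", "newVal", 3), ("", "cool", 34),
    ("that", "value", 234),
    ("otherThing", "cool", 4), ("", "newVal", 345),
]
flags = flag_single_row_groups(first_col)
kept = [row for row, skip in zip(rows, flags) if not skip]
print(kept)
```

One thing the sketch makes visible: a single-row item in the very last position is not flagged, because nothing non-blank follows it, so that case would need its own handling.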

GAS - Concatenate Unique List + Sumifs

I am really struggling with this to come up with an easy way to do this in Google Sheets.
1. I need a unique list with a filter.
Fairly straightforward:
UNIQUE(FILTER(UniqueRange, FilterRange=Criteria))
2. I need to SUMIFS that list. If I do it one row at a time, it's fairly straightforward: I point one of my criteria at the result of #1 and copy it downward:
SUMIFS(SumRange, Criteria1Range, Criteria1, Criteria2Range, Criteria2)
What I am struggling with is that I do not know how far that unique list will go, so I do not know how far down to copy #2's formula. That would be no big deal if I had unlimited rows, but I need the results of the above to all show up in a single cell, with a character between the results, because I am trying to make all this fit in a "calendar" for a dashboard. Can this even be done?
Sample Data:
Apple | 2
Orange | 3
Red | 1
Green | 4
Orange | 5
Red | 2
The simple result I have now, from letting the unique list grow as needed in the left column and copying formula #2 downward in the right-hand column:
Apple | 2
Orange | 8
Red | 3
Green | 4
My question again is, is there a way to have the below result all show up in a single cell and toss in a hyphen between the results?
Apple - 2
Orange - 8
Red - 3
Green - 4
EDIT:
Thank you all for the help.
#theMayer
You pointed me in the right direction and ended up solving my issue. Thank you!
#I'-'I
Helper columns will just not work for my needs.
What I ended up doing is modifying the solution a little: I put " _ " between the data instead of a hyphen, because the data itself had hyphens in it and that was confusing. As for the number formatting, I'll just leave it out for now. Here is my final formula, with an additional date filter in the select query:
=ARRAYFORMULA(TRIM(CONCATENATE(QUERY({CHAR(10)&$G$3:$G,$F$3:$F,$A$3:$A},"select Col1,' _ ',sum(Col2) where Col3 = date '" & text(C3,"yyyy-MM-dd") & "' group by Col1 label sum(Col2) ''"))))
=ARRAYFORMULA(TRIM(CONCATENATE(QUERY({CHAR(10)&A1:A6,--("-"&B1:B6)},"select Col1,sum(Col2) group by Col1 label sum(Col2) ''"))))
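As a rough cross-check of what the single-cell result should contain, here is a Python sketch of the group, sum, and join steps using the sample data from the question. The function name summarize and its sep parameter are hypothetical. The sketch keeps items in order of first appearance, as in the desired output shown in the question; that ordering is an assumption of the sketch and not necessarily the order QUERY returns.

```python
def summarize(rows, sep=" - "):
    """Sum amounts per name and join the results into one multi-line string."""
    totals = {}
    for name, amount in rows:
        # dicts keep insertion order, so names stay in first-appearance order
        totals[name] = totals.get(name, 0) + amount
    return "\n".join(f"{name}{sep}{total}" for name, total in totals.items())


data = [("Apple", 2), ("Orange", 3), ("Red", 1),
        ("Green", 4), ("Orange", 5), ("Red", 2)]
result = summarize(data)
print(result)
```

Passing sep=" _ " gives the underscore-separated variant described in the edit above.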

Dealing with multiple strings and corresponding integers in a single cell

I’ve got a question about something I’m trying to achieve in Google spreadsheets. I’ve got the following table:
| NAME | COUNTRY |
--------------------------------------------------
| Alpha | GER*1, SWE*3 |
--------------------------------------------------
| Beta | GER*5, SWE*1 |
--------------------------------------------------
| Gamma | SWE*5, GER*3 |
--------------------------------------------------
| Delta | SWE*2, GER*1 |
--------------------------------------------------
Now I’d like to be able to calculate how many SWE there are in line Gamma. I’d need spreadsheets to return an integer value, in my example the correct one would be 5.
However, I’d also like to be able to calculate the total number of GER in line Alpha to Delta, also returned as an integer value, in this case the result should be 1+5+3+1 = 10.
As you can see I’m dealing with both strings and integers in a single cell. If it would be possible to for example create an array of all GER*(INTEGER), then delete the GER* and have an array of only the integers that might help me?
I’ve searched here, googled around and fiddled with spreadsheets to try and come to a solution but either I’m too daft or it’s not as trivial as I thought it’d be. Any help would be much appreciated.
To calculate the total number of GER in lines Alpha to Delta (assuming your data extends from row 2 to row 5), try:
=sum(ArrayFormula(iferror(regexextract(B2:B5, "GER\*(\d+)")+0,0)))
To calculate how many SWE there are in line Gamma, try:
=sum(ArrayFormula(iferror(regexextract(filter(B2:B5, A2:A5="Gamma"), "SWE\*(\d+)")+0,0)))
Change the ranges to suit your actual data range and see if that works.
You could use one of the Text functions, like FIND together with MID. REGEXEXTRACT is even more powerful but harder to use. When you have the substring, use VALUE to get the numeric value from it.
For the summary, the easiest is probably to create an extra column, extract the GER value into it as above, and SUM that column.
Here's the function reference for Google Spreadsheets: https://support.google.com/docs/table/25273?hl=en
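If it helps to see the extraction step in isolation, here is a short Python sketch using the same kind of pattern as the REGEXEXTRACT answer (country code, a literal asterisk, then digits). The helper name count_for and the dictionary layout are assumptions made for the example; a cell without the code counts as 0, like the IFERROR fallback.

```python
import re


def count_for(cell, code):
    """Return the integer after 'CODE*' in the cell, or 0 if the code is absent."""
    match = re.search(re.escape(code) + r"\*(\d+)", cell)
    return int(match.group(1)) if match else 0


# The table from the question, keyed by the NAME column.
table = {
    "Alpha": "GER*1, SWE*3",
    "Beta": "GER*5, SWE*1",
    "Gamma": "SWE*5, GER*3",
    "Delta": "SWE*2, GER*1",
}

total_ger = sum(count_for(cell, "GER") for cell in table.values())
swe_gamma = count_for(table["Gamma"], "SWE")
print(total_ger, swe_gamma)
```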

Display a range of filtered values as a comma-delimited list in one cell on a Google sheet?

Two sheets, one called Core Data, one called Schedule. The Schedule sheet needs to take information about deadlines from Core Data and display it concatenated in deadline-order. (Simple example with numbers and letters instead of dates and tasks given below.)
Here's what I have so far in 'Schedule' (cell B2 specifically in this case):
=JOIN(", ", FILTER('Core Data'!A2:A, 'Core Data'!B2:B=A2))
It's saying no matches are found so I assume this is a problem with the filter component of the formula. However, I've checked the help pages and can't see a problem with the condition I've created.
The formula should:
Get all the values in the given range (cells A2 downward on a 'Core Data' sheet),
Filter them so that only those with certain values are selected. (The information from 'Core Data' should only be selected if the date in the same row on column B matches the date in the cell in the A column on the Schedule sheet.)
Join all these values together and list them as a comma-delimited list.
Example (without dates, for ease):
Core Data sheet:
A | B
-----
a | 5
b | 7
c | 5
d | 3
Schedule sheet (or what it should look like):
A | B
---------
3 | d
5 | a, c
7 | b
Any idea what is going wrong with my formula or if there is an easier way to solve this problem?
The error message I was getting in the cell is:
Error: No matches are found in FILTER evaluation.
It turns out that the cell I was trying this formula on simply had no matches from the filter (no dates corresponded) but instead of returning empty it threw an error. This sounds simple but it's an annoying quirk for me that the cell didn't end up empty which made me assume the formula was at fault.
While the example in the question works you can quickly break it by adding an extra row to the 'Schedule' table with "8" as the value in the A column and the formula in B:
A | B
---------
3 | d
5 | a, c
7 | b
8 | N/A
The "8" throws an error since it isn't found in the 'Core Data'.
Conversely, on my original spreadsheet, when I tried the formula in a cell that did correspond to a noted deadline, it worked.
I found the solution here is to add an IFERROR function to the formula to deal with this.
So a formula that works for this is:
=JOIN(", ", IFERROR(FILTER('Core Data'!A:A, 'Core Data'!B:B=A5)))
Note that this leaves out the second IFERROR argument that Google's own help page suggests. I tried putting in an empty array ({}) at first, but this threw a different error. It seems that if you leave the argument out, JOIN knows it has nothing to work with and the cell ends up with a nice blank value.
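The behaviour I was after can be sketched in Python like this: filter the tasks by deadline, join the matches, and fall back to an empty string when nothing matches. The function name join_matches and the list-of-tuples layout are assumptions for the illustration, using the simplified example data from above.

```python
def join_matches(core_rows, key, sep=", "):
    """Join the tasks whose deadline equals `key`; return "" when none match."""
    matches = [task for task, deadline in core_rows if deadline == key]
    # Joining an empty list gives "", which plays the role of the blank cell
    # that IFERROR produces in the formula.
    return sep.join(matches)


# 'Core Data' sheet: (column A, column B)
core_data = [("a", 5), ("b", 7), ("c", 5), ("d", 3)]

schedule = {key: join_matches(core_data, key) for key in (3, 5, 7, 8)}
print(schedule)
```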

Is there a multiply-and-add formula in Google's spreadsheet?

What I want is to easily multiply a number by another number for each column and add them up at the end in Google Sheets. For example:
User | Points 1 | Points 2 | Points 3 | Total
| 5 | 1 | 4 |
-----+----------+----------+----------+------
Jane | 2 | 3 | 0 | 13 (2*5 + 3*1 + 0*4)
John | 1 | 11 | 4 | 32 (1*5 + 11*1 + 4*4)
So it's easy enough to make this formula for the total:
= B3*$B$2 + C3*$C$2 + D3*$D$2
The problem is I frequently need to insert additional columns or even remove some columns. So then I have to mess with all the formulas. It's a pain... we have many spreadsheets with these formulas. I wish there was a formula like SUM(B3:D3) where I could just specify a range. Is there anything like MULTIPLY_AND_SUM(B2:D2, B3:D3) that would do this? Then I could insert columns in the middle and the range would still work.
There is a built in function in Google Sheets that does exactly what you are looking for: SUMPRODUCT.
In your example the formula would be:
=sumproduct(B$2:D$2,B3:D3)
See the SUMPRODUCT documentation for more information about this function.
You can accomplish that without requiring a special-purpose function.
In E3, try this (and copy it to the rest of your rows):
=sum(arrayformula(B3:D3*B$2:D$2))
You can read about ARRAYFORMULA in the Google Sheets function reference.
As long as you introduce new columns between B and D, this formula will automatically adjust. If you add new columns outside of that range, you'll need to edit (and cut & paste).
On its own, arrayformula(B3:D3*B$2:D$2) operates over each value in B3:D3 in turn, multiplying it by the corresponding value in B$2:D$2. (Note the use of absolute references to 'lock down' to row 2.) The result in this case is three values, [10,3,0], arranged horizontally in three columns because that matches the dimensions of the ranges.
The enveloping sum() function adds up the values of the array produced by arrayformula, which is 13 in this case.
As you copy that formula to other rows, the relative range references get updated for the new row.
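As a quick sanity check of the arithmetic, here is a Python sketch of the multiply-and-add that both formulas perform, using the numbers from the question. The function name weighted_total is invented for this example.

```python
def weighted_total(weights, points):
    """Multiply each point value by its column weight and add up the products."""
    if len(weights) != len(points):
        raise ValueError("weights and points must have the same length")
    return sum(w * p for w, p in zip(weights, points))


weights = [5, 1, 4]        # row 2 of the sheet
jane = [2, 3, 0]           # row 3
john = [1, 11, 4]          # row 4

jane_total = weighted_total(weights, jane)
john_total = weighted_total(weights, john)
print(jane_total, john_total)
```

Inserting a column between B and D in the sheet corresponds to adding one entry to both lists here; the calculation itself does not change.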
