Google Sheet to SPLIT and STRIP - google-sheets

I am splitting the content of a Google Sheets cell (C2) using =SPLIT(C2,CHAR(10)). The cell has the contents as below:
Work Order number: 1157
Item: 0.16/50/100
Shift knots: 7700
Shift weight: 6.300
Downtime: 0
That will SPLIT those values into 4 cells in C4, D4, E4, F4, and then I strip out the leading characters like below:
Item: from C4 using =RIGHT(C4, LEN(C4)-6),
Shift knots: from D4 using =RIGHT(D4, LEN(D4)-13),
Shift weight: from E4 using =RIGHT(E4, LEN(E4)-14) and
Downtime: from F4 using =RIGHT(F4, LEN(F4)-10)
Is there a better way to get values like 0.16/50/100, 7700, 6.300, 0 split up across cells from one cell? At the moment I am doing this as a two-step process.

Try
=ArrayFormula(regexreplace(split(C2, char(10)), ".*:\s",))
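and see if that helps. The regex .*:\s matches everything up to and including the colon and the whitespace after it, and the empty third argument replaces that match with nothing.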
EDIT: If the data is 'horizontal', you could use the same approach but split the cell using the comma as the delimiter.
=ArrayFormula(regexreplace(split(C5, ", ", 0), ".*:\s",))
References:
ARRAYFORMULA
REGEXREPLACE
SPLIT
CHAR
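For readers who think in code, the split-then-strip logic of the formula can be sketched in Python. This only illustrates what the regex does and is not part of the Sheets solution; the function name split_and_strip and the sample string are assumptions made up for the example.

```python
import re

# Sample cell content from the question; "\n" plays the role of CHAR(10).
cell = (
    "Work Order number: 1157\n"
    "Item: 0.16/50/100\n"
    "Shift knots: 7700\n"
    "Shift weight: 6.300\n"
    "Downtime: 0"
)

def split_and_strip(text, delimiter="\n"):
    """Split on the delimiter, then drop everything up to the colon and the whitespace after it."""
    return [re.sub(r".*:\s", "", part) for part in text.split(delimiter)]

values = split_and_strip(cell)
print(values)
```

The same helper covers the 'horizontal' case by passing ", " as the delimiter.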

Related

Need help explaining this formula provided to me

I recently posted on here to get help with a formula; here is the link: https://stackoverflow.com/questions/75068029/vlook-up-style-forumla-but-range-is-2-cells A user called rockinfreakshow was really awesome and provided a great solution for me. I'm not very experienced and don't understand the formula at all, but I'd love to be able to add more attributes to it. Is anyone able to help break it down for me?
I haven't tried anything here; it's totally out of my realm of understanding.
=MAKEARRAY(COUNTA(B2:B),COUNTA(D1:O1),LAMBDA(r,c,IF(REGEXMATCH(LAMBDA(ax,bx,IFS(REGEXMATCH(ax,"Mixed")*REGEXMATCH(INDEX(C2:C,r),"Blend")*REGEXMATCH(INDEX(C2:C,r),"Filter"),"BLEND-"&bx&"|FILTER-"&bx,REGEXMATCH(ax,"Mixed")*NOT(REGEXMATCH(INDEX(C2:C,r),"Blend"))*REGEXMATCH(INDEX(C2:C,r),"Filter"),"ESP-"&bx&"|FILTER-"&bx,REGEXMATCH(ax,"Mixed")*NOT(REGEXMATCH(INDEX(C2:C,r),"Filter")),"BLEND-"&bx&"|ESP-"&bx,LEN(ax),SUBSTITUTE(ax&"-"&bx,"Espresso","ESP")))(regexextract(INDEX(B2:B,r),"([^\s]*?) Subscription"),IFNA(SWITCH(REGEXEXTRACT(INDEX(C2:C,r),"Small|Medium|Large"),"Small",250,"Medium",450,"Large",900),SWITCH(REGEXEXTRACT(INDEX(B2:B,r),"Medium|Large"),"Medium",225,"Large",450))),"(?i)"&INDEX(D1:O1,,c)),1,)))
see the WHY LAMBDA? part of this answer to understand the LAMBDA
the formula contains 2x LAMBDA and there are a total of 4 placeholders which translates to:
r - COUNTA(B2:B)
c - COUNTA(D1:O1)
ax - REGEXEXTRACT(INDEX(B2:B, r), "([^\s]*?) Subscription")
bx - IFNA(SWITCH(REGEXEXTRACT(INDEX(C2:C, r), "Small|Medium|Large"),
"Small", 250, "Medium", 450, "Large", 900),
SWITCH(REGEXEXTRACT(INDEX(B2:B, r), "Medium|Large"),
"Medium", 225, "Large", 450))
r is the row index; it runs from 1 to the number of items in B column (COUNTA(B2:B))
c is the column index; it runs from 1 to the number of items in row 1 of range D1:O1 (COUNTA(D1:O1))
ax extracts the word from B column that precedes the word Subscription
bx is a bit complex but essentially it extracts from C column word Small or Medium or Large and replaces it with 250, 450 or 900 respectively. then if C column does not contain one of those 3 words it checks for Medium or Large within B column and assigns 225 or 450 respectively
what we are left with is the core of the formula:
IFS( REGEXMATCH(ax, "Mixed")*
REGEXMATCH(INDEX(C2:C, r), "Blend")*
REGEXMATCH(INDEX(C2:C, r), "Filter"), "BLEND-"&bx&"|FILTER-"&bx,
___________________________________________________________________________
REGEXMATCH(ax, "Mixed")*
NOT(REGEXMATCH(INDEX(C2:C, r), "Blend"))*
REGEXMATCH(INDEX(C2:C, r), "Filter"), "ESP-"&bx&"|FILTER-"&bx,
___________________________________________________________________________
REGEXMATCH(ax, "Mixed")*
NOT(REGEXMATCH(INDEX(C2:C, r), "Filter")), "BLEND-"&bx&"|ESP-"&bx,
___________________________________________________________________________
LEN(ax), SUBSTITUTE(ax&"-"&bx, "Espresso", "ESP"))
for better visualization, the IFS formula contains only 4 elements. each of these 4 elements acts as a switch - if there is a match x we get output y. for example let's dissect the first element...
REGEXMATCH(ax, "Mixed")*
REGEXMATCH(INDEX(C2:C, r), "Blend")*
REGEXMATCH(INDEX(C2:C, r), "Filter"), "BLEND-"&bx&"|FILTER-"&bx
there are 3x REGEXMATCHes multiplied by each other. whenever there is such multiplication in array formulae it translates as AND logic gate (if there would be + it would mean OR logic gate) eg.:
1 * 1 = 1
1 * 0 = 0
0 * 1 = 0
0 * 0 = 0
REGEXMATCH outputs TRUE or FALSE so if we get 3x TRUE the whole argument is considered as TRUE (because 1 * 1 * 1 = 1) so we proceed to output our first switch
therefore if B column contains Mixed and C column contains Blend and C column contains Filter then we output Blend-000|Filter-000 where 000 stands for a specific number determined from bx placeholder/formula and also you can notice the | (which btw stands for OR logic within the regex) but in this case, it's just a unique symbol to join stuff for REGEXMATCH. which REGEXMATCH is this for you may ask? ...this one:
so the output of IFS formula is the input for most outer REGEXMATCH and we check if the IFS output matches something within D1:O1 range. IF yes then output 1 otherwise output nothing. shortened:
IF(REGEXMATCH(IFS(...), "(?i)"&INDEX(D1:O1,,c)), 1, )
(?i) in regex means "case insensitive". it is there just for safety reasons because regex is by default case sensitive.
and we reached the MAKEARRAY formula that creates an array of numbers across the whole range with height r and width c where output is the result of IF, i.e. either 1 or an empty cell
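The whole chain described above (ax, bx, the IFS core, the outer REGEXMATCH and MAKEARRAY) can be mirrored in a short Python sketch. It is an illustration under assumptions, not a translation the answer provides: the helper names, the two sample rows and the header list are all invented, and where the sheet formula would return an error this sketch returns an empty cell instead.

```python
import re

SIZES_C = {"Small": 250, "Medium": 450, "Large": 900}   # looked up in column C
SIZES_B = {"Medium": 225, "Large": 450}                 # fallback from column B

def has(text, word):
    return re.search(word, text) is not None

def size(b, c):
    """bx: size from column C if present, else the fallback from column B."""
    m = re.search(r"Small|Medium|Large", c)
    if m:
        return SIZES_C[m.group()]
    m = re.search(r"Medium|Large", b)
    return SIZES_B[m.group()] if m else None

def code(b, c):
    """The IFS core: build the text that is later matched against each header."""
    m = re.search(r"([^\s]*?) Subscription", b)
    ax, bx = (m.group(1) if m else ""), size(b, c)
    if not ax or bx is None:
        return ""  # assumption: the sheet formula would produce an error here
    mixed, blend, filt = has(ax, "Mixed"), has(c, "Blend"), has(c, "Filter")
    if mixed and blend and filt:        # three TRUEs multiplied = AND
        return f"BLEND-{bx}|FILTER-{bx}"
    if mixed and not blend and filt:
        return f"ESP-{bx}|FILTER-{bx}"
    if mixed and not filt:
        return f"BLEND-{bx}|ESP-{bx}"
    return f"{ax}-{bx}".replace("Espresso", "ESP")

def make_array(rows, headers):
    """MAKEARRAY: one cell per (row, header); 1 on a case-insensitive match, else empty."""
    return [[1 if re.search("(?i)" + h, code(b, c)) else "" for h in headers]
            for b, c in rows]

# Hypothetical sample data: (column B, column C) pairs and the D1:O1 headers.
rows = [("Mixed Subscription", "Blend, Filter, Small"),
        ("Espresso Subscription Medium", "")]
headers = ["Blend-250", "ESP-250", "Filter-250", "ESP-225"]
grid = make_array(rows, headers)
print(grid)
```

Note that the header is the regex and the IFS output is the text being searched, which is why the | in the IFS output acts only as a separator.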

Check if a list of dates falls between any one list of dates

I have a spreadsheet like this:
    A         B         C         D
01  11/10/21  25/09/21  10/10/21
02  29/11/21
03  17/01/22  17/12/21  30/01/22
04  07/03/22
05  25/04/22  09/04/22  25/04/22
06  13/06/22  25/06/22  17/07/22
07  01/08/22
08  19/09/22  24/09/22  09/10/22
09  07/11/22
10  26/12/22  16/12/22  31/01/23
11  13/02/23
12  03/04/23
Basically, the dates in the A column are my data.
The dates in B and C represent intervals. So, B1 and C1 mean "from 25/09/21 to 10/10/21".
I can easily have this in D1, to tell me if the date in A1 falls between B1 and C1:
D1 => =AND(A1 > B1, A1 < C1)
But I need it to tell me if that date falls in ANY one of those intervals. So, I can write:
D1 => =OR(AND(A1>B1, A1<C1), AND(A1>B2, A1<C2), ..., AND(A1>B12, A1<C12))
OK, it's ugly, but it does the job. I really did think I could get away with this.
But...
Then I need to repeat the process for ALL of them (A1, A2, A3, ...), comparing EACH one with EACH range on the right. Like this:
D1 -> =OR(AND(A1>B1, A1<C1), AND(A1>B2, A1<C2), ..., AND(A1>B12, A1<C12))
D2 -> =OR(AND(A2>B1, A2<C1), AND(A2>B2, A2<C2), ..., AND(A2>B12, A2<C12))
D3 -> =OR(AND(A3>B1, A3<C1), AND(A3>B2, A3<C2), ..., AND(A3>B12, A3<C12))
And it NEEDS to be written like this (ugh) since smart cut&pasting will mess up the lot.
My current solution is totally terrible.
I assign this to the first one:
=OR(
AND(A1>$C$1 ,A1<$D$1 ),
AND(A1>$C$2 ,A1<$D$2 ),
AND(A1>$C$3 ,A1<$D$3 ),
AND(A1>$C$4 ,A1<$D$4 ),
AND(A1>$C$5 ,A1<$D$5 ),
AND(A1>$C$6 ,A1<$D$6 ),
AND(A1>$C$7 ,A1<$D$7 ),
AND(A1>$C$8 ,A1<$D$8 ),
AND(A1>$C$9 ,A1<$D$9 ),
AND(A1>$C$10,A1<$D$10),
AND(A1>$C$11,A1<$D$11),
AND(A1>$C$12,A1<$D$12),
AND(A1>$C$13,A1<$D$13),
AND(A1>$C$14,A1<$D$14),
AND(A1>$C$15,A1<$D$15)
)
(I came up with this as I wrote this question)
And then paste it again to all of the others. That way, the smart paste will make sure A1 becomes A2 in the second row, and so on. However, it just feels. So. Ugly.
Is there a better way to do this?
Bonus question: how do I make the date in A1 RED if D1 is "TRUE"?
Thanks in advance.
In D2 add formula:
=ArrayFormula(IF(LEN(A2:A),(A2:A>B2:B)*(A2:A<C2:C)>0,))
Bonus:
Add conditional formatting rule for range A2:A:
=IF(LEN(A2),(A2>$B$2:B)*(A2<$C$2:C)>0,)
Try this formula in cell D1 and drag down:
=ArrayFormula(IF(SUM((A1>$B$1:$B$12)*(A1<$C$1:$C$12))>0;TRUE;FALSE))
For the question related to conditional formatting, select the range A1:A12 and apply this custom formula as a rule:
=D1=TRUE
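The test both answers aim for, "is this date strictly inside at least one interval", is easy to state in code. The Python sketch below uses the intervals and a few of the dates from the question; the helper names d and in_any_interval are assumptions introduced for illustration.

```python
from datetime import date

def d(text):
    """Parse a dd/mm/yy string as used in the question."""
    day, month, year = (int(part) for part in text.split("/"))
    return date(2000 + year, month, day)

# Columns B and C of the question: only rows that define an interval.
intervals = [
    (d("25/09/21"), d("10/10/21")),
    (d("17/12/21"), d("30/01/22")),
    (d("09/04/22"), d("25/04/22")),
    (d("25/06/22"), d("17/07/22")),
    (d("24/09/22"), d("09/10/22")),
    (d("16/12/22"), d("31/01/23")),
]

def in_any_interval(day, intervals):
    """True when day lies strictly between the start and end of at least one interval."""
    return any(start < day < end for start, end in intervals)

dates = [d(t) for t in ("11/10/21", "17/01/22", "25/04/22", "26/12/22")]
flags = [in_any_interval(day, intervals) for day in dates]
print(flags)
```

Because the comparison is strict, as in the question's A1>B1 and A1<C1, a date equal to an interval boundary (25/04/22 here) does not count as inside.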

Getting number values out of product name in google sheets

I have a problem. We have coded item names which contain certain values that I need to do calculations with.
For example, from ASG-120U9624M I need to extract only 120, 96 and 24, as they are parameters required for calculations. The 96 could also be 220 (2-3 digits), and the 24 could only be 12 or 24. I know that you can get values after certain symbols (e.g. -, U), but can you detect that the value ends before the 12/24? If the 96 value could only be 2 digits it would be easy, but as it stands this is beyond my knowledge. Need some help.
B1:
=ARRAYFORMULA(IFNA(REGEXEXTRACT(A1:A, "-(\d+)U")))
C1:
=ARRAYFORMULA(IFNA(REGEXEXTRACT(A1:A, "U(\d+)..M")))
D1:
=ARRAYFORMULA(IFNA(REGEXEXTRACT(A1:A, ".+(\d{2})M")))
Try this:
=ARRAYFORMULA(IFNA(IF(IFERROR(LEN(REGEXEXTRACT(A1:A, ".*U(\d{4})M")), 5) = 4, REGEXEXTRACT(A1:A, "^ASG-(\d{3})U(\d{2})(\d{2})M$"), REGEXEXTRACT(A1:A, "^ASG-(\d{3})U(\d{3})(\d{2})M$"))))
IFERROR(LEN(REGEXEXTRACT(A1:A, ".*U(\d{4})M")), 5) = 4 - Determine the number of digits from U-M
REGEXEXTRACT(A1:A, "^ASG-(\d{3})U(\d{2})(\d{2})M$") - use this regex if number of digits is 4.
REGEXEXTRACT(A1:A, "^ASG-(\d{3})U(\d{3})(\d{2})M$") - use this regex if number of digits is 5.
Let's say your raw data runs A2:A. Place the following in B2:
=ArrayFormula(IF(A2:A="",,REGEXEXTRACT(A2:A,"(\d+)\D(\d+)(12|24)")))
This one formula will extract all three columns of numbers.
The regex captures three groups, each contained in parentheses. It reads: "Any number of digits followed by one non-digit followed by any number of digits up to a 12 or 24."
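The way the pattern backtracks to leave a trailing 12 or 24 for the third group can be checked quickly in Python, whose re module treats this particular pattern the same way. The function name extract_parameters is an assumption for this sketch, and the second product code in the usage note is a made-up example of the 3-digit case the question mentions.

```python
import re

# Same pattern as in the formula: digits, one non-digit, digits, then 12 or 24.
PATTERN = re.compile(r"(\d+)\D(\d+)(12|24)")

def extract_parameters(name):
    """Return the three numbers as integers, or None when the name does not match."""
    match = PATTERN.search(name)
    return tuple(int(group) for group in match.groups()) if match else None

print(extract_parameters("ASG-120U9624M"))
```

The second group first grabs all of 9624, then gives back digits one at a time until what remains is 12 or 24, so a hypothetical ASG-120U22012M would split as 120, 220 and 12.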

Can I define a local value (or variable) in a Google Spreadsheet formula?

Sometimes I come up with long spreadsheet formulas, such as this one to create "data bars" using Unicode characters (addresses are relative to G3):
= rept("█"; floor(10 * F3 / max(F$1:F$999)))
& mid(" ▏▎▍▌▋▊▉█";
1 + round(8 * ( 10 * F3 / max(F$1:F$999)
- floor(10 * F3 / max(F$1:F$999))));
1)
It would be nice to have some kind of let() to define local variables:
= let('x', 10 * F3 / max(F$1:F$999),
rept("█"; floor(x))
& mid(" ▏▎▍▌▋▊▉█"; 1 + round(8 * (x - floor(x))); 1))
Does such a thing exist?
If not, are there any clever hacks to achieve the same result inside the formula? (without using another cell)
Edit: this is not a good example, because the sparkline() function already does this kind of bar chart (thanks Harold!) but the question still stands: how to clean up complex formulas and avoid repetition, apart from using additional spreadsheet cells?
Can the spreadsheet formula SPARKLINE be a solution for you?
=SPARKLINE(10,{"charttype","bar";"max",20})
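To make clear what the question's data-bar formula computes, and which sub-expression it repeats, here is the same calculation as a Python sketch where the shared value x is bound once. The function name data_bar and its width parameter are assumptions for illustration; the sketch also assumes a non-negative value and a positive maximum.

```python
import math

BLOCKS = " ▏▎▍▌▋▊▉█"  # index 0 is a space, index 8 a full block

def data_bar(value, maximum, width=10):
    """Build a text bar of up to `width` full blocks plus one partial block."""
    x = width * value / maximum                  # the repeated sub-expression, computed once
    whole = math.floor(x)
    # Round half up, as the spreadsheet ROUND does for positive numbers.
    eighths = math.floor(8 * (x - whole) + 0.5)
    return "█" * whole + BLOCKS[eighths]

print(data_bar(2.5, 10))
```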

Fill missing data by interpolation in Google Spreadsheet

I have Google Spreadsheet with following data
   A           B       D
1  Date        Weight  Computation
2  2015/12/09          =B2*2
3  2015/12/10  65      =B3*2
4  2015/12/11          =B4*2
5  2015/12/12          =B5*2
6  2015/12/14  62      =B6*2
7  2015/12/15          =B7*2
8  2015/12/16  61      =B8*2
9  2015/12/17          =B9*2
I want to graph the weight w.r.t. date, and/or use it with other columns that compute other quantities off the weight. However you will notice that there are some missing entries. What I want is another column which has data which is based on the Weight column with missing values interpolated and filled in. E.g.:
   A           B       C        D
1  Date        Weight  WeightI  Computation
2  2015/12/09          65       =C2*2  # use first known value
3  2015/12/10  65      65       =C3*2
4  2015/12/11          64       =C4*2  # =(62-65)/3*(1)+65
5  2015/12/12          63       =C5*2  # =(62-65)/3*(2)+65
6  2015/12/14  62      62       =C6*2
7  2015/12/15          61.5     =C7*2  # =(61-62)/2*(1)+62
8  2015/12/16  61      61       =C8*2
9  2015/12/17          61       =C9*2  # use the last known value
In column C are values filled in using linear interpolation when I have to find missing data between two known points.
I believe this is a really simple and common use case, so I am sure it's a trivial thing to do, but I am unable to find a solution using built-in functions. I don't have much experience with spreadsheets either. I have spent hours experimenting with =INDEX, =MATCH, =VLOOKUP, =LINEST, =TREND etc., but I am not able to come up with something from the examples. The only solution that I could use was to create a custom function using Google Apps Script. Though my solution works, it seems to execute very slowly. My spreadsheet is also huge.
Any pointers, solutions?
You might want to use forecast for which it may be more convenient first to separate out the dates you have readings from those you don't (and rearrange later). So with just three readings say:
A B
1 10/12/2015 65
2 14/12/2015 62
3 16/12/2015 61
and the dates for which values are required on the left below:
6 09/12/2015 65.6
7 11/12/2015 64.3
8 12/12/2015 63.6
9 15/12/2015 61.5
10 17/12/2015 60.2
The formula giving rise to 65.6 in B6 (and copied down from there to suit) is:
=forecast(A6,$B$1:$B$3,$A$1:$A$3)
This is not calculated in quite the way you show but may be considered slightly more accurate, in particular by extrapolating the missing end values, rather than just repeating their nearest available value.
Having calculated the values you would probably want to reassemble the data in date order. So I suggest copy B6:B10 and Edit, Paste special, Paste values only over the top and then sort to suit.
The chart below compares the results above (blue) with those in your OP (green) and marks the given data points:
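FORECAST fits a single least-squares line through all known points rather than joining neighbouring points, which is why its numbers differ from the question's. A minimal Python sketch of that calculation, using the three readings above with the day of the month as x, is shown here; the function name forecast simply mirrors the spreadsheet function and the variable names are assumptions.

```python
def forecast(x, known_y, known_x):
    """Predict y at x from the least-squares line through the known points."""
    n = len(known_x)
    mean_x = sum(known_x) / n
    mean_y = sum(known_y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(known_x, known_y))
    sxx = sum((xi - mean_x) ** 2 for xi in known_x)
    slope = sxy / sxx
    return mean_y + slope * (x - mean_x)

known_days = [10, 14, 16]        # day of December 2015 for each reading
known_weights = [65, 62, 61]
estimates = {day: forecast(day, known_weights, known_days) for day in (9, 11, 12, 15, 17)}
for day, weight in estimates.items():
    print(day, weight)
```

Since the line is fitted to all readings at once, it does not necessarily pass through any of them, and the end values are extrapolated instead of repeated.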
Found a solution that satisfies most of my requirements:
=FILTER() to first remove blank lines where data is not available (thanks for a tip from "pnuts").
=MATCH() to look up two consecutive rows from the filtered table. In my case I was able to use this function because column A is sorted and has no repetitions.
The line formula to interpolate values between those two rows.
So the output becomes:
    A           B       C           D        E
1   Date        Weight  FDdate      FWeight  IWeight
2   2015/05/09          2015/05/10  65.00    #N/A
3   2015/05/10  65.00   2015/05/13  62.00    65.00
4   2015/05/11          2015/05/15  61.00    64.00
5   2015/05/12                               63.00
6   2015/05/13  62.00                        62.00
7   2015/05/14                               61.50
8   2015/05/15  61.00                        61.00
9   2015/05/16                               61.00
10  2015/05/17                               61.00
Where cells C2 and D2 have the following range formula (minor note: the following formulas could of course be combined if columns A and B are adjacent):
C2 =FILTER($A$2:$A$10, NOT(ISBLANK($B$2:$B$10)))
D2 =FILTER($B$2:$B$10, NOT(ISBLANK($B$2:$B$10)))
Cells E2 through E10 contain the following line interpolation formula: [y = y1 + (y2 - y1) / (x2 - x1) * (x - x1)]:
E2 =(INDEX($D:$D, MATCH($A2, $C:$C, 1), 1))
+(INDEX($D:$D, MATCH($A2, $C:$C, 1) + 1, 1)
- INDEX($D:$D, MATCH($A2, $C:$C, 1), 1))
/(INDEX($C:$C, MATCH($A2, $C:$C, 1) + 1, 1)
- INDEX($C:$C, MATCH($A2, $C:$C, 1), 1))
*(INDEX($C:$C, MATCH($A2, $C:$C, 1), 1) - $A2) * -1
This solution does not work when the first cell B2 does not have a value; in that case the formula results in #N/A. All this would have been much more efficient if we had something like =INTERPOLATE_LINE( A2, $A$2:$A$10, $B$2:$B$10 ) in Google Sheets, but unfortunately this does not exist. Please correct me if I have missed it in my reading of the supported functions in Google Sheets.
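The FILTER + MATCH + line-formula idea amounts to a sorted lookup followed by y = y1 + (y2 - y1) / (x2 - x1) * (x - x1). A rough Python equivalent is sketched below, with bisect standing in for MATCH(..., 1). The function name interpolate is an assumption, and clamping to the first or last reading outside the known range is a choice made here to match the question's wish, not something the formula above does.

```python
from bisect import bisect_right
from datetime import date

def interpolate(day, known_days, known_weights):
    """Piecewise-linear interpolation; known_days must be sorted and unique."""
    i = bisect_right(known_days, day) - 1    # last known day <= day, like MATCH(..., 1)
    if i < 0:
        return known_weights[0]              # assumption: clamp before the first reading
    if i == len(known_days) - 1:
        return known_weights[-1]             # assumption: clamp at or after the last reading
    x1, x2 = known_days[i], known_days[i + 1]
    y1, y2 = known_weights[i], known_weights[i + 1]
    return y1 + (y2 - y1) / (x2 - x1).days * (day - x1).days

# The filtered readings (columns C and D in the table above).
known_days = [date(2015, 5, 10), date(2015, 5, 13), date(2015, 5, 15)]
known_weights = [65.0, 62.0, 61.0]
print(interpolate(date(2015, 5, 14), known_days, known_weights))
```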
I found a solution which satisfies the requirements completely. I used a separate sheet so I could break up the calculation into pieces.
Create a new sheet. Enter the following formulas into Cells A2-F2, and then copy them down the page.
Cell A2: Copy your weight data into the first column. (In this example, the sheet name is Daily Record and the weights are recorded in column D.)
='Daily Record'!D2
Cell B2: Find the most recent recorded weight.
=INDEX(FILTER(A$2:A2,A$2:A2 <> ""),COUNT(FILTER(A$2:A2,A$2:A2 <> "")),1)
Cell C2: Count the number of days since the most recent weigh-in.
=IF(A2<>"",0,IF(ROW(C2)<3,0,C1+1))
Cell D2: Find the next recorded weight (from the current date or later.)
=IFERROR(INDEX(FILTER(A2:A,A2:A <> ""),1,1),"")
Cell E2: Count the number of days until the next weigh-in.
=IF(A2<>"",0,IF(E3="","",E3+1))
Cell F2: Calculate the interpolated weight.
=IF(A2 <> "", A2, IF(D2 = "", "", B2 + (D2-B2)*C2/(C2+E2)))
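The six helper columns correspond to one backward-looking pass, one forward-looking pass and a final blend. The Python sketch below follows that structure; the function name fill_gaps is an assumption, None stands for an empty cell, and rows before the first or after the last reading are simply left as None here. Like the sheet version, it counts rows rather than calendar days, so it assumes one row per day.

```python
def fill_gaps(weights):
    """Fill None entries by linear interpolation between the nearest readings."""
    n = len(weights)
    prev_w, since = [None] * n, [0] * n      # columns B and C
    next_w, until = [None] * n, [0] * n      # columns D and E

    for i, w in enumerate(weights):          # look backwards for the last reading
        if w is not None:
            prev_w[i], since[i] = w, 0
        elif i > 0 and prev_w[i - 1] is not None:
            prev_w[i], since[i] = prev_w[i - 1], since[i - 1] + 1

    for i in range(n - 1, -1, -1):           # look forwards for the next reading
        w = weights[i]
        if w is not None:
            next_w[i], until[i] = w, 0
        elif i < n - 1 and next_w[i + 1] is not None:
            next_w[i], until[i] = next_w[i + 1], until[i + 1] + 1

    filled = []
    for i, w in enumerate(weights):          # column F
        if w is not None:
            filled.append(w)
        elif prev_w[i] is None or next_w[i] is None:
            filled.append(None)
        else:
            step = (next_w[i] - prev_w[i]) * since[i] / (since[i] + until[i])
            filled.append(prev_w[i] + step)
    return filled

print(fill_gaps([None, 65, None, None, 62, None, 61, None]))
```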
