How to force sumif() to use string matching (instead of numerical matching)? - google-sheets

How can I force the Google Sheets sumif() formula to use string matching? I'm using a string criterion on string data, and it is undesirably using numerical matching.
Example data:
1.1 1
1.1 2
1.10 4
1.11 5
Running sumif on the above data yields 7 instead of 3, because the "1.10" row is also being matched:
SUMIF(A1:A,"1.1",B1:B)
I can achieve the desired result using query()
query(A1:B, "select sum(B) where A matches '1.1' label sum(B) ''")
but in my complex real-world use-case I do not find it as intuitive and would prefer to use sumif() if possible.
Online example:
https://docs.google.com/spreadsheets/d/1s41B3pIWixkiAh7DFvGmcJzrr6gjmQlcO3yYrvuFYzc/edit?usp=sharing

Use "'1.1"
=SUMIF(A1:A,"'1.1",B1:B)
to force text match.

Try adding a non-numerical character to both the range array and the search:
=ArrayFormula(SUMIF(A1:A&"x","1.1x",B1:B))

You could try the FILTER function. I'm also surprised that SUMIF confuses 1.1 with 1.10.
=sum(FILTER(B:B,A:A=E7))

The TO_TEXT function is useful here:
=arrayformula(sumif(to_text(A1:A),"1.1",B1:B))
It needs to be wrapped in ARRAYFORMULA so that TO_TEXT is applied to every cell in the column and all matching values are added up.
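The two matching behaviours discussed in this thread can be modelled outside Sheets. The following is a minimal Python sketch, not Sheets' actual implementation: it assumes the criterion and the cells are compared as numbers whenever both parse as numbers, which reproduces the 7 from the question, and as raw text otherwise. The function names are hypothetical.

```python
def sum_if_numeric(keys, values, criterion):
    """Sum values whose key equals the criterion after numeric coercion."""
    def coerce(text):
        # Assumption: anything that parses as a number is compared as a number.
        try:
            return float(text)
        except ValueError:
            return text

    target = coerce(criterion)
    return sum(v for k, v in zip(keys, values) if coerce(k) == target)


def sum_if_text(keys, values, criterion):
    """Sum values whose key is exactly the same string as the criterion."""
    return sum(v for k, v in zip(keys, values) if k == criterion)


# Example data from the question (column A as text, column B as numbers).
keys = ["1.1", "1.1", "1.10", "1.11"]
values = [1, 2, 4, 5]

print(sum_if_numeric(keys, values, "1.1"))  # 7: "1.10" also equals 1.1 as a number
print(sum_if_text(keys, values, "1.1"))     # 3: only the exact string "1.1" matches
```

The answers above all work by pushing the comparison into the second, text-only mode.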

Related

SUMIFS and/or QUERY inside ARRAYFORMULA

Google spreadsheet sample: https://docs.google.com/spreadsheets/d/1MdRjm5QmKY_vaah9c3GrvH6dDOBCQX_zvCubvN0akmk/edit?usp=sharing
I'm trying to get the sum of all values for each ID. The values I'm trying to add up are in the Source tab, while the calculations are done in the Output tab. My desired values depend on two things: ID and date. The ID must match and the date must fall in February. I first tried a sumif that matches only the ID, and it worked with this formula: =ARRAYFORMULA(IF(A2:A="",, SUMIF(Source!A:A,A2:A,Source!B:B)))
But when I add the second criterion and use a sumifs function, it only outputs a result for the first ID. Here is the sumifs formula I used: =ARRAYFORMULA(SUMIFS(Source!B2:B,Source!A2:A,A2:A,Source!C2:C,">="&DATE(2021,2,1),Source!C2:C,"<="&DATE(2021,2,28)))
I tried using query, as some of the answers I found online suggested, but it also outputs only the first result. Here is the query formula I used: =ARRAYFORMULA(QUERY(Source!A2:C,"select sum(B) where A = '"&Output!A2:A&"' and C >= date '"&TEXT(DATEVALUE("2/1/2021"),"yyyy-mm-dd")&"' and C <= date '"&TEXT(DATEVALUE("2/28/2021"),"yyyy-mm-dd")&"' label sum(B) '' "))
I know this is possible by making a temporary query/filter that includes only the desired dates and then using SUMIF on it. However, I need a monthly total, and making 12 of these calculated temporary filters/queries would take up a lot of space since we have a lot of data, so I want to avoid this option if possible. Is there a better fix for this situation?
Solved by Astrotia - =arrayformula(sumif(I3:I20&month(K3:K20), A2:A6&2, J3:J20))
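The accepted formula works by concatenating the ID and the month number into a single key, so that a one-criterion SUMIF can stand in for a two-criterion SUMIFS. A rough Python sketch of the same idea follows; the sample rows and the function name are made up for illustration and are not taken from the linked sheet.

```python
from datetime import date


def sum_by_id_for_month(rows, ids, month):
    """Total the values per ID for one month using a concatenated ID+month key."""
    totals = {}
    for row_id, value, day in rows:
        # Same trick as I3:I20&month(K3:K20): one text key per (ID, month) pair.
        key = f"{row_id}{day.month}"
        totals[key] = totals.get(key, 0) + value
    # Same trick as A2:A6&2: look up each ID with the month number appended.
    return [totals.get(f"{wanted_id}{month}", 0) for wanted_id in ids]


# Hypothetical source rows: (ID, value, date).
rows = [
    ("a", 10, date(2021, 2, 3)),
    ("a", 5, date(2021, 3, 1)),
    ("b", 7, date(2021, 2, 28)),
    ("a", 1, date(2021, 2, 10)),
]

print(sum_by_id_for_month(rows, ["a", "b", "c"], 2))  # [11, 7, 0]
```

Two caveats apply to the key itself: it ignores the year, and without a separator different pairs can collide (ID "1" with month 12 and ID "11" with month 2 both produce "112"), so adding a delimiter between the two parts may be safer.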

Formula that will skip one column when calculating SUM() or similar functions

I'd like to run a =SUM(A1:G1), but always skip one column, regardless if it has value or not.
In this case, it should calculate A1+C1+E1+G1.
Is there another function I could append to SUM() or other similar functions as SUM in order to skip one column?
Thank you!
With the following method you can sum any number of alternate columns without adding each cell manually with +.
Supposing your data starts in the second row, use this formula:
=SUMPRODUCT(A2:G2, MOD(COLUMN(A2:G2),2))
This is simply a SUMPRODUCT of the cell values and an array of {1,0,1,0,1...}.
Another slight variation
=SUMPRODUCT(A2:G2*ISODD(COLUMN(A2:G2)))
But if the even columns contain letters instead of numbers, this will give an error, so you can use this instead:
=SUMPRODUCT(N(+A1:G1)*ISODD(COLUMN(A1:G1)))
Compared with @AnilGoyal's answer, this works as well:
=SUMPRODUCT(A1:G1,--ISODD(COLUMN(A1:G1)))
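The SUMPRODUCT variants above all multiply each cell by a 1/0 mask derived from the parity of its column number. As a small illustration, here is the same arithmetic in Python; the function name and sample values are hypothetical, and columns are numbered from 1 as COLUMN() does.

```python
def sum_odd_columns(values, first_column=1):
    """Sum the cells that sit in odd-numbered columns of a row."""
    # MOD(COLUMN(range), 2) gives 1 for odd columns and 0 for even ones.
    mask = [(first_column + offset) % 2 for offset in range(len(values))]
    # SUMPRODUCT multiplies cell by mask and adds up the products.
    return sum(value * keep for value, keep in zip(values, mask))


row = [1, 2, 3, 4, 5, 6, 7]  # stands in for A1:G1
print(sum_odd_columns(row))  # 16, i.e. A1 + C1 + E1 + G1
```

If the range starts in an even column (for example B1:H1), the mask flips, which is why the formulas key on the actual column number rather than the position inside the range.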
You can use:
=SUM(INDEX(A1:G1,N(IF(1,{1,3,5,7}))))
Or with Excel O365:
=SUM(INDEX(A1:G1,{1,3,5,7}))
A bit more of a general solution:
=SUMPRODUCT(MOD(COLUMN(A1:G1),2)*A1:G1)
Or with Excel O365:
=SUM(MOD(COLUMN(A1:G1),2)*A1:G1)
Or even:
=SUM(INDEX(1:1,SEQUENCE(4,,1,2)))
Since you included Google-Sheets, I'll throw in an option using QUERY():
=SUM(QUERY(TRANSPOSE(1:1),"Select * skipping 2"))
Maybe a bit more verbose, but very understandable IMO.
Consider something of the format:
=SUM(A1:G1)-INDEX(A1:G1,2)
The 2 in the formula removes the 2nd item of that part of the row (so the 999 is dropped).
So the formula =SUM(BZ10:ZZ10)-INDEX(BZ10:ZZ10,2) drops CA10 from the sum, and so on. A similar formula can be constructed for columns.
google sheets:
=INDEX(MMULT(N(A1:H3), 1*ISODD(SEQUENCE(COLUMNS(A:H)))))
=INDEX(IF(ISODD(COLUMN(A:H)), TRANSPOSE(MMULT(TRANSPOSE(
IFERROR(A1:H3*ISODD(COLUMN(A:H)), 0)), 1^ROW(A1:A3))), ))

Google Sheets: Native formula to return slice of INDEX() values between two indices?

Using only native Google Sheets functions (no scripts), how can I return a slice of values between two given indices from =INDEX() or another range reference?
Example
If:
=INDEX("Apple", "Banana", "Curant", "Delicious", "Eggplant", "Fruit")
and given indices are 3 and =LEN(INDEX(...))
then desired returned output is:
{"Curant", "Delicious", "Eggplant", "Fruit"}
Note
Use of =INDEX() is preferable but other native solutions are great too.
Use-case requires that the solution be able to take dynamic input as part of a larger function, rather than being given explicit ranges as input.
Solution can be inclusive or exclusive of the given indices.
Thanks!
You can accomplish this by using the =QUERY() function:
=QUERY(INDEX(A2:A), "SELECT A LIMIT "&(D2-C2+1)&" OFFSET "&(C2-1))
Where C2 holds the initial index, and D2 the final one.
If you are applying this to the output of another function, you will have to replace the A in the SELECT clause with the Col1 identifier:
=QUERY(INDEX(A2:A), "SELECT Col1 LIMIT "&(D2-C2+1)&" OFFSET "&(C2-1))
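The LIMIT and OFFSET values in that query are just a 1-based, inclusive slice expressed as a count and a skip. A short Python sketch of the arithmetic, using the list from the question (the function name is hypothetical):

```python
def slice_between(items, start, end):
    """Return items from 1-based index `start` to `end`, both inclusive."""
    offset = start - 1       # OFFSET (C2-1): rows to skip
    limit = end - start + 1  # LIMIT (D2-C2+1): rows to return
    return items[offset:offset + limit]


fruits = ["Apple", "Banana", "Curant", "Delicious", "Eggplant", "Fruit"]
print(slice_between(fruits, 3, len(fruits)))
# ['Curant', 'Delicious', 'Eggplant', 'Fruit']
```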

How to use substitute function with query function in google spreadsheet

I am trying to use the substitute function inside a query function but cannot find the correct syntax to do that. My use case is as follows.
I have two columns, Name and Salary. Values in these columns have commas ',' in them. I want to import these two columns into a new spreadsheet, replacing the commas in the "Salary" column with an empty string while retaining the commas in the "Name" column. I also want to apply the value function to the "Salary" column after removing the commas, to get number formatting.
I tried the following code, but it replaces the commas in both columns. I want code that applies the substitute function only to a subset of columns.
Code:
=arrayformula(SUBSTITUTE(QUERY(IMPORTRANGE(Address,"Sheet1!A2:B5"),"Select *"),",",""))
Result:
Converted v/s Expected Result
Note :
I have almost 10 columns to import, and commas should be removed from 3 of them.
Based on your suggestions, I was able to achieve the objective by treating columns separately. Below is the code.
=QUERY({IMPORTRANGE(Address,"Sheet1!A3:A5"),arrayformula(VALUE(SUBSTITUTE(IMPORTRANGE(Address,"Sheet1!B3:B5"),",","")))},"Select * where Col2 is not null")
Basically, two IMPORTRANGE functions side by side for each column.
The same query on the actual data with 10 columns will look like this.
=QUERY({IMPORTRANGE(Address,"Sheet1!A3:C"),arrayformula(VALUE(SUBSTITUTE(IMPORTRANGE(Address,"Sheet1!D3:H"),",",""))),IMPORTRANGE(Address,"Sheet1!I3:J")},"Select * where Col2 is not null")
I used 3 IMPORTRANGE functions so that I can format columns D to H by removing the commas and converting them to numbers.
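The approach above amounts to splitting the table into column groups, cleaning only the numeric group, and stitching the groups back together. A rough Python sketch of the same per-column treatment follows; the sample rows and the function name are invented for illustration.

```python
def clean_rows(rows, numeric_columns):
    """Strip commas and convert to numbers only in the listed column indexes."""
    cleaned = []
    for row in rows:
        new_row = list(row)
        for index in numeric_columns:
            # Equivalent of VALUE(SUBSTITUTE(cell, ",", "")) on selected columns.
            new_row[index] = float(new_row[index].replace(",", ""))
        cleaned.append(new_row)
    return cleaned


# Hypothetical data: column 0 is Name (commas kept), column 1 is Salary.
rows = [["Doe, John", "1,200"], ["Roe, Jane", "35,000.50"]]
print(clean_rows(rows, numeric_columns=[1]))
# [['Doe, John', 1200.0], ['Roe, Jane', 35000.5]]
```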
My suggestion is to use 2 formulas and more space in your sheets.
Formula #1: get the data and replace commas:
=arrayformula(SUBSTITUTE(IMPORTRANGE(Address,"Sheet1!A2:B5"),",",""))
Formula #2: to convert text into numbers:
=arrayformula (range_of_text_to_convert * 1)
Notes:
using 2 formulas will need extra space, but will speed up formulas (importrange is slow)
the second formula uses a math operation (*1), which has the same effect as the VALUE function.
Try this. It treats the columns separately.
=arrayformula(QUERY({Sheet1!A2:A5,SUBSTITUTE(Sheet1!B2:B5,",","")},"Select *"))
Thanks to Ed Nelson, I was able to figure out this:
=arrayformula(QUERY({'Accepted Connections'!A:R,SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE('Accepted Connections'!A:R,"AIF®",""),"APA",""),"APMA®",""),"ASA",""),"C(k)P®",""),"C(k)PS",""),"CAIA",""),"CBA",""),"CBI",""),"CCIM",""),"","")},"Select *"))
That removed all the text I didn't need in specific columns.

Countif with len in Google Spreadsheet

I have a column XXX like this :
XXX
A
Aruin
Avolyn
B
Batracia
Buna
...
I would like to count a cell only if the string in the cell has a length > 1.
How to do that?
I'm trying :
COUNTIF(XXX1:XXX30, LEN(...) > 1)
But what should I write instead of ... ?
Thank you in advance.
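For ranges that contain strings, I have used a formula like the one below, which counts any value that starts with one character (the ?) followed by 0 or more characters (the *). I haven't tested it on ranges that contain numbers. Note that this counts every non-empty string; to require a length greater than 1, the pattern should need two ? wildcards ("=??*").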
=COUNTIF(range,"=?*")
To do this in one cell, without needing to create a separate column or use ARRAYFORMULA, you can use SUMPRODUCT.
=SUMPRODUCT(LEN(XXX1:XXX30)>1)
If you have an array of True/False values then you can use -- to force them to be converted to numeric values like this:
=SUMPRODUCT(--(LEN(XXX1:XXX30)>1))
Credit to @greg, who posted this in the comments. I think it is arguably the best answer and should be displayed as such. SUMPRODUCT is a powerful function that can often be used to get around shortcomings in COUNTIF-type formulae.
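The SUMPRODUCT answer counts by turning each TRUE/FALSE test into 1 or 0 and adding them up. The same idea in Python, as a sketch using the sample column from the question (the function name is hypothetical):

```python
def count_longer_than(cells, min_length=1):
    """Count the cells whose text is longer than `min_length` characters."""
    # Each comparison is True/False; int() plays the role of the -- coercion.
    return sum(int(len(cell) > min_length) for cell in cells)


column = ["A", "Aruin", "Avolyn", "B", "Batracia", "Buna"]
print(count_longer_than(column))  # 4
```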
Create another list using an =ARRAYFORMULA(len(XXX1:XXX30)>1) and then do a COUNTIF based on that new list: =countif(XXY1:XXY30,true()).
A simple formula that works for my needs is =ROWS(FILTER(range,LEN(range)>X))
The Google Sheets criteria syntax seems inconsistent, because the expression that works fine with FILTER() gives an erroneous zero result with COUNTIF().
Here's a demo worksheet
Another approach is to use the QUERY function.
This way you can write a simple SQL like statement to achieve this.
For example:
=QUERY(XXX1:XXX30,"SELECT COUNT(X) WHERE X MATCHES '.{1,}'")
To explain the MATCHES criteria:
It is a regex that matches every cell that contains 1 or more characters.
The . operator matches any character.
The {1,} quantifier specifies that you only want to match cells that have 1 or more characters in them.
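To see what the quantifier changes, the pattern can be tried in Python. This sketch assumes that MATCHES has to match the whole cell value, which is modelled here with re.fullmatch; the function name is hypothetical. With the sample column, '.{1,}' counts every non-empty cell, while '.{2,}' counts only cells longer than one character, which is what the question asks for.

```python
import re


def count_matching(cells, pattern):
    """Count the cells whose entire text matches the regular expression."""
    regex = re.compile(pattern)
    return sum(1 for cell in cells if regex.fullmatch(cell))


column = ["A", "Aruin", "Avolyn", "B", "Batracia", "Buna"]
print(count_matching(column, r".{1,}"))  # 6: every non-empty cell
print(count_matching(column, r".{2,}"))  # 4: only cells with 2+ characters
```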
Here is a link to another SO question that describes this method.

Resources