Google spreadsheets query() is replacing text with null for mostly numeric columns [duplicate] - google-sheets

This question already has answers here:
Query is ignoring string (non numeric) value
(2 answers)
Closed 5 months ago.
I have the formula =query(importrange(...);"select * where Col1>' '") in my spreadsheet.
Importrange() by itself works fine, loading all the cells from the source spreadsheet exactly as they are.
However, Col7 contains mostly numbers and a few text cells, and when query() is applied the numbers are kept as they are but the text is replaced with blanks.
I've tried adding options no_format to the end of the query, but it made no difference.
Here are the contents of Col7; the first line gets replaced with a blank:
free
41,25
34,25
34,25
48,25
41,25

By design, QUERY does not preserve mixed data types within a single column. From the documentation:
In case of mixed data types in a single column, the majority data type
determines the data type of the column for query purposes. Minority
data types are considered null values.
The workaround will depend on how you want to use your data afterwards, and what compromises you would be willing to make. For example you could fairly easily convert the entire dataset into text strings, so that everything will be retained, but then all your numbers will be text strings as well.
If you needed to retain numbers as numbers, often the best bet will be to ImportRange the entire dataset somewhere in your spreadsheet (could be on a hidden sheet), and then use an alternative to QUERY on that (namely FILTER).

You can use vlookup for the columns that need to be a mixed data type. Query the unique row identifier and then vlookup the rest. I often use the following formula:
=arrayformula(if(isblank(A:A), "", vlookup(A:A, search_range, col_index, FALSE)))

Change the format of the whole column to plain text.
The numbers will still read as numbers, and the minority text values will retain their text values. Nothing will be counted as null.

Related

The function "query" in Google spreadsheets works for all columns except one

In Google spreadsheets I use the following simple formula:
=QUERY({'pivot data source'!A:AN},"select * where Col1='2021-08' order by Col2")
This works fine so far. However, there is one comment column. It is empty for most rows. Now I added a comment there - it just won't appear in the result of the query formula.
I realized, that it works fine when the comment is a plain number. As soon as there is text, it won't show up.
As stated in the Google Help section for "query":
In case of mixed data types in a single column, the majority data type determines the data type of the column for query purposes. Minority data types are considered null values.
That means: As long as there are more values with numbers (numeric) than values with text (string), the rows with text will not show up. Even if there are as many numbers as text fields (e.g. one numeric value, one string), Google seems to define the column as numeric and strings don't show up.
To solve this problem, you can try to format the corresponding column in tab "pivot data source" as text (Format > Number > Plain text in the menu).
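The majority rule quoted above can be mimicked outside Sheets to see why the comment disappears and why converting to text helps. This is only a rough Python model, not Google's implementation: the function name query_column and the sample values are hypothetical, and the tie-breaking toward numbers is an assumption taken from the observation in this answer.

```python
from collections import Counter

def query_column(values):
    """Model of the documented rule: majority type wins, minority values become None."""
    kinds = Counter(
        "number" if isinstance(v, (int, float)) else "text"
        for v in values
        if v is not None and v != ""  # empty cells do not vote
    )
    # Assumption: a tie goes to numbers, as observed in the answer above.
    majority = "number" if kinds["number"] >= kinds["text"] else "text"

    def keep(v):
        if v is None or v == "":
            return None
        is_number = isinstance(v, (int, float))
        return v if is_number == (majority == "number") else None

    return [keep(v) for v in values]

# Hypothetical comment column: mostly empty, three numbers, one text comment.
comments = [None, 12, 7, "check invoice", 3]
mixed = query_column(comments)          # the text comment is dropped
as_text = query_column(["12", "7", "check invoice", "3"])  # everything survives
```

Once every value is text, as after Format > Number > Plain text, there is no minority type left to be nulled.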

Is there a way to specify an input is a single cell in Google Sheets?

I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of a space followed by a digit (since .* is greedy, it stops at the last such space in the cell), and the second pulls out all the digits in a string of " #hrs" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE, to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involves stacking your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
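The two patterns and the grouped sum can also be checked outside Sheets. The following is a small Python sketch using the same regular expressions; the cell contents are hypothetical, since the real values of the example sheet are not shown here, and Python's re module is assumed to behave like Sheets' regex engine for patterns this simple.

```python
import re
from collections import defaultdict

# Hypothetical cell contents in the "area #hrs" shape the formulas expect.
cells = ["Kitchen 3hrs", "Garden 2hrs", "Kitchen 4hrs", "Living room 1hrs"]

def parse(cell):
    """Return (area, hours), mirroring the two REGEXEXTRACT calls plus VALUE."""
    area = re.search(r"(.*) \d", cell).group(1)          # text before " <digit>"
    hours = int(re.search(r" (\d+)hrs", cell).group(1))  # digits before "hrs"
    return area, hours

# Equivalent of "select Col1, sum(Col2) ... group by Col1".
totals = defaultdict(int)
for cell in cells:
    area, hours = parse(cell)
    totals[area] += hours
```

Note that an area name containing spaces still comes out whole, because the first pattern only stops at a space that is followed by a digit.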
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCH and QUERY calls to do computations. Doing a SUM over a RANGE is a much more compact formula than doing a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this will depend on where you get your inputs from and whether you have any say in this.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.

Query formula not displaying results that start with (') leading apostrophe strings

I have a sheet here where I need to use query formula
It doesn't display data that start with ' symbol (strings).
How do I make them display? The red cells are empty.
You can use the following formula:
=QUERY(ARRAYFORMULA(IF(LEN(A2:A),TEXT(A2:A,0),"")))
(Following that, you can leave the cells as text or change them to numbers depending on their further use.)
Functions used:
QUERY
ArrayFormula
IF
LEN
TEXT
Query considers only one data type for each column. As it is stated in the official documentation:
In case of mixed data types in a single column, the majority data type
determines the data type of the column for query purposes. Minority
data types are considered null values.
Therefore, the solution is to change the format to Plain text for column A.
You can also convert column to text inside QUERY:
=ArrayFormula(QUERY(TO_TEXT(Sheet1!A2:A),"select *"))

import query in google sheets not importing certain numbers [duplicate]

I am working on some data where I have to import the raw data from the sheet Prepaid to the sheet Master, but I am seeing that certain number cells don't get imported, such as cell B18 in Master. If I convert the raw data cell to a number it works, but it converts 11892667013478301 to 11892667013478300, leading to a mismatch. Is there a size restriction on numbers?
Sheet is below
https://docs.google.com/spreadsheets/d/12y5h6NYArpEctQ2FD-AXJrqZcQydnEd5BjrOALMJEGI/edit?usp=sharing
From QUERY docs:
In case of mixed data types in a single column, the majority data type determines the data type of the column for query purposes. Minority data types are considered null values.
Since most values in your column end with two 0s, they fit within the limit of 15 significant digits and are treated as numbers. The values that need more than 15 significant digits are treated as string values, and since those are a minority in the column, they are considered null values.
To avoid this, you can force all values in the column to be treated as strings via TO_TEXT, and apply the QUERY to that.
=QUERY(ARRAYFORMULA(TO_TEXT(Prepaid!E:F)),"select * where Col1 is not null")
I'll delete my other answer, since yes, the issue seems to be that you are hitting the maximum number of significant digits, 15, for a number in Google Sheets. You can prove this by trying to add any small number to any of your (numeric) cells in Prepaid!F - the number doesn't increase, since it can't hold any more significant digits.
The majority of your values are 15 significant digits plus two zeroes on the end. But F18 and F28 end in 01, not 00, so they are treated as strings. Forcing them to a number "discards" the last two significant digits, making them 00.
Perhaps the easiest answer for you is to force all of columns E and F to be text strings, rather than numeric values, and then they can all be dealt with equally, such as running queries against them.
Let me know if this helps at all.
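The digit limit described in these answers can be illustrated outside Sheets. This Python sketch assumes, as the answers state, that a numeric cell keeps at most 15 significant digits; the helper name as_sheets_number is hypothetical and the rounding is only a model of what Sheets does internally.

```python
from decimal import Context

# Assumption: numeric cells keep at most 15 significant digits.
SHEETS_DIGITS = Context(prec=15)

def as_sheets_number(value):
    """Round an integer to 15 significant digits, roughly as a numeric cell would."""
    return int(SHEETS_DIGITS.create_decimal(value))

original = 11892667013478301
stored = as_sheets_number(original)  # the trailing 01 no longer fits
kept_as_text = str(original)         # a text value keeps every digit
```

Values that already end in 00 pass through unchanged, which is why only the cells ending in 01 behave differently, and why treating the whole column as text avoids the mismatch.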

Query Importrange in Google Sheets Not Importing Correctly

We are using Google Forms to collect data on our students. The same Google Form is used for all students, and one of its questions asks for the student's name.
The data that ends up being collected you can see on the tab Form Responses 1 on the Google Sheet linked here.
I am attempting to use ImportRange to create a tab for each of the students. The formula that I am using for just one of the students is...
=QUERY(IMPORTRANGE("1nJANDP1fiQunxfxEf-EjwJrnIRICv6kLhYYY9XBXtD4", "Form Responses 1!A:I"),"SELECT * WHERE Col3 = 'Adam N.'")
You can take a look at the tab called Adam N. and you'll see it is kind of working.
One thing that doesn't seem to be working is when there is a text value in columns E-I, that text value doesn't end up showing on the Adam N. tab. Any ideas how I can get both the numbers and the text values to show up?
The other thing that seems to be a problem is the fact that on the Adam N. tab, the very first row has the same headers as the Form Responses 1 tab, but it also has the very first line of data. Any way to remove that?
Importrange is not needed since you are 'importing' from within the same spreadsheet. Also, I'd recommend using the (optional) header argument in query().
Users are often tempted to mix data types within a column, and when they do, the query() function gives undesirable output. If a column is intended for numeric values, then only numeric values should reside in that column. Date columns must contain only dates, and text columns only text values.
This does not mean that numbers cannot appear in a text column as long as they are in a text format. So it is important to plan the columns in a table to make sure this rule is maintained regardless if the data table is created manually or via submissions from a Google Form.
Generally, the query() function will assume the greater number of cell types in a column to be that data type. For example, if there are 100 numbers and 20 text values in the same column then a numeric value will be assumed for that column. There is a good chance the text values will just be ignored. One way to avoid this, would be to convert everything to text.
See if this works
=ArrayFormula(QUERY(to_text('Form Responses 1'!A:I),"WHERE Col3 = 'Adam N.'", 1))
