This question already has answers here:
Query is ignoring string (non numeric) value
(2 answers)
Closed 5 months ago.
I am working on some data where I have to import the raw data from the sheet Prepaid to the Master sheet, but certain number cells don't get imported, for example cell B18 in the sheet named Master. If I convert the raw data cell to a number it works, but it converts 11892667013478301 to 11892667013478300, leading to a mismatch. Is this a size restriction on the number?
Sheet is below
https://docs.google.com/spreadsheets/d/12y5h6NYArpEctQ2FD-AXJrqZcQydnEd5BjrOALMJEGI/edit?usp=sharing
From QUERY docs:
In case of mixed data types in a single column, the majority data type determines the data type of the column for query purposes. Minority data types are considered null values.
Since most values in your column end with two 0s, they stay within the limit of 15 significant digits and are treated as numbers. The values that exceed 15 significant digits are treated as string values, and since those are a minority in the column, they are considered null values.
To avoid this, you can force all values in the column to be treated as strings via TO_TEXT, and apply the QUERY to that.
=QUERY(ARRAYFORMULA(TO_TEXT(Prepaid!E:F)),"select * where Col1 is not null")
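To make the quoted rule concrete, here is a rough Python sketch of the "majority type wins, minority becomes null" behaviour, and of why converting everything to text avoids it. This is only a model of the documented rule, not how QUERY is actually implemented; the function names, the sample values and the tie-breaking (first type seen wins) are assumptions.

```python
from collections import Counter

def kind(value):
    """Classify a cell as 'number', 'string', or None for an empty cell."""
    if value is None or value == "":
        return None
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "number"
    return "string"

def query_column(cells):
    """Keep cells of the majority type; minority-type cells become None (null)."""
    counts = Counter(k for k in map(kind, cells) if k is not None)
    if not counts:
        return list(cells)
    majority = counts.most_common(1)[0][0]  # assumption: ties go to the first type seen
    return [c if kind(c) == majority else None for c in cells]

# Hypothetical column: three numeric cells and one cell stored as text.
column = [11892667013478300, 11892667013478400, "11892667013478301", 11892667013478500]

print(query_column(column))                    # the text cell comes back as None
print(query_column([str(c) for c in column]))  # all text, so nothing is dropped
```

The second call plays the role of TO_TEXT: once every cell has the same type, there is no minority left to be nulled.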
I'll delete my other answer, since yes, the issue seems to be that you are hitting the maximum number of significant digits, 15, for a number in Google Sheets. You can prove this by trying to add any small number to any of your (numeric) cells in Prepaid!F - the number doesn't increase, since it can't display any more significant digits.
The majority of your values are 15 significant digits plus two zeroes on the end. But F18 and F28 end in 01, not 00, so they are treated as strings. Forcing them to a number "discards" the last two significant digits, making them 00.
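The loss of the last two digits can be reproduced outside Sheets. The sketch below uses Python floats rounded to 15 significant digits as a stand-in for a spreadsheet number; that Sheets behaves exactly like this internally is an assumption, and `as_sheet_number` is a hypothetical helper name.

```python
def as_sheet_number(digits: str) -> int:
    """Round a digit string to 15 significant digits, roughly as a numeric cell would."""
    return int(float(f"{float(digits):.15g}"))

print(as_sheet_number("11892667013478301"))  # 11892667013478300 (the trailing 01 is lost)
print(as_sheet_number("11892667013478300"))  # 11892667013478300 (already fits in 15 digits)

# Kept as text, all 17 digits survive.
print("11892667013478301")
```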
Perhaps the easiest answer for you is to force all of columns E and F to be text strings, rather than numeric values, and then they can all be dealt with equally, such as running queries against them.
Let me know if this helps at all.
Related
I want to iterate over an array of cells, in this case B5:B32, and keep the values that are equal to some reference text in a new array.
However, SPLIT nowadays accepts arrays as inputs. That means that if I use the array notation of "B5:B32" within ARRAYFORMULA or FILTER, it treats it as a range, rather than the array over which we iterate one cell at a time.
Is there a way to ensure that a particular range is the range over which we iterate, rather than the range given at once as an input?
What I considered was using alternative formulations of a cell, using INDEX(ROW(B5), COLUMN(B5)) but ROW and COLUMN also accept array values, so I'm out of ideas on how to proceed.
Example code:
ARRAYFORMULA(
INDEX(
SPLIT(B5:B32, " ", 1), 1
) = "Some text here"
)
Example sheet:
https://docs.google.com/spreadsheets/d/1H8vQqD5DFxIS-d_nBxpuwoRH34WfKIYGP9xKKLvCFkA/edit?usp=sharing
Note: In the example sheet, I can get to my desired answer if I create separate columns containing the results of the SPLIT formula. This way, I first do the desired SPLITS, and then take the values I need from that output by specifying the correct range.
Is there a way to do this without first creating an output and then taking a cell range as an input to FILTER or other similar functions?
For example in cell C35 I've already gotten the desired SPLIT and FILTER done in one go, but I'd still need to find a way to sum up the values of the first character of the second column. Doing this requires that I take the LEFT value of the second column, but for that I need to output the results and continue in a new cell. Is there a way to avoid this?
Ralph, I'm not sure if your sample sheet really reflects what you are trying to end up with, since, for example, I assume you are likely to want the total of the hours per area.
In any case, this formula extracts all of the areas, and the hours worked, and is then easy to do further calculations with.
=ArrayFormula({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))})
Try that in cell E13, to see the output.
The first REGEXEXTRACT pulls out all the text in front of a space followed by a digit, and the second pulls out the digits from a string of the form " #hrs" in each cell. These criteria could be modified, if necessary, depending on your actual requirements. Note that it requires the use of VALUE, to convert the hours from text to numeric values, since REGEXEXTRACT produces text (string) results.
It involved concatenating your multiple data columns into one long column of data, to make it simpler to process all the cells in the same way.
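For anyone who wants to check the two regular expressions outside Sheets, here is a small Python sketch of the same extraction, followed by one possible further calculation (total hours per area). The cell contents such as "Kitchen 3hrs" are made up, since the real sheet data is not shown here, and `total_hours` is a hypothetical helper.

```python
import re
from collections import defaultdict

def total_hours(cells):
    """Extract (area, hours) from each cell and sum the hours per area."""
    totals = defaultdict(int)
    for cell in cells:
        area = re.search(r"(.*) \d", cell)      # text in front of a space + digit
        hours = re.search(r" (\d+)hrs", cell)   # digits in " #hrs"
        if area and hours:                       # skip empty or non-matching cells
            totals[area.group(1)] += int(hours.group(1))  # int() plays the role of VALUE
    return dict(totals)

# Hypothetical cells, already concatenated into one long column.
cells = ["Kitchen 3hrs", "Garden 2hrs", "Kitchen 4hrs", "Living room 1hrs", ""]
print(total_hours(cells))  # {'Kitchen': 7, 'Garden': 2, 'Living room': 1}
```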
This next formula will give you a sum, for whatever matching room/task you type into B6, as an example.
=ArrayFormula(QUERY({REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9},"(.*) \d"),
VALUE(REGEXEXTRACT({C5:C9;D5:D9;E5:E9;F5:F9;G5:G9;H5:H9}," (\d+)hrs"))},
"select Col1, sum(Col2) where Col1='"&B6&"' group by Col1 label sum(Col2) '' ",0))
I will also answer my own question given what I know from kirkg13's answer and other sources.
Short answer: no, there isn't. If you want to do really convoluted computations with particular cell values, there are a few options and tips:
Script your own functions. You can expand INDEX to accept array inputs and thereby you can select any set of values from an array without outputting it first. Example that doesn't use REGEXMATCH and QUERY to get the SUM of hours in the question's example data set: https://docs.google.com/spreadsheets/d/1NljC-pK_Y4iYwNCWgum8B4NJioyNJKYZ86BsUX6R27Y/edit?usp=sharing.
Use QUERY. This makes your formula more convoluted quite quickly, but is still a readable and universally applicable method of selecting data, for example particular columns. In the question's initial example, QUERY can retrieve only the second column just like an adapted INDEX function would.
Format your input data more effectively. The more easily you can get numbers from your input, the less you have to obfuscate your code with REGEXMATCHES and QUERY's to do computations. Doing a SUM over a RANGE is a lot more compact of a formula than doing a VALUE of a LEFT of a QUERY of an ARRAYFORMULA of a SPLIT of a FILTER. Of course, this will depend on where you get your inputs from and if you have any say in this.
Also, depending on how many queries you will run on a given data set, it may actually be desirable to split up the formula into separate parts and output partial results to keep the code from becoming an amalgamation of 12 different queries and formulas. If the results don't need to be viewed by people, you can always choose to hide specific columns and rows.
This question already has answers here:
ultimate short custom number formatting - K, M, B, T, etc., Q, D, Googol
(3 answers)
Closed 1 year ago.
I have a spreadsheet full of data in which each number represents thousands.
For example: 1,745 represents 1,745,000
How can I convert each number (in the same cell) to represent millions so 1.745 will represent 1,745,000?
In MS Excel it can be done with special paste divide by 1000.
What's the equivalent in Google Sheets?
Thanks!
Sheets' concept of thousands and millions is the same as Excel's (not everything else is however). Sheets does not, at the moment, have a direct equivalent of Excel's Paste Special with Operation.
If you want to convert each number representing thousands (in the same cell) to represent millions in Sheets then you will have to have each be divided by 1000. If you want this without helper cells you probably will need to write a script, though you could edit each cell individually or might export to Excel, use their inbuilt code and import back into Sheets, if required there.
if the number is 1,745 you can't change it in the same cell to million equivalent
if the number is 1.745 you can change it in the same cell but only "visually" to million equivalent with custom number formatting:
note1: it's only a visual change if you keep one eye shut
note2: this applies to the sheet with US locale. if you want to reverse it use some European locale
There are a ton of examples of generating random numbers in Lua that have no duplicates, and just a standard math.random(x,y) can get a set of random whole numbers in a range....
... but I am having trouble finding a set of random numbers between a range, but allowing x amount of duplicates. For my immediate needs I can allow 1 set of duplicates, but it would be great to have code where you can set "duplicate value" to anything for future projects.
Example : I want to generate a list of 10 whole numbers between 1-10... each value can be anything between 1-10, but any one number can only be generated and added to the list twice.
Example Result: 1,1,2,4,5,5,7,7,8,9
In this example result math.random() tried to spit out 3 or more of the same number, but the code makes it go back and try again if it has already produced 2 of the same number.
Thanks in advance!
You can use a "merge trick":
1. Create an array of unique numbers with 5 elements (10 / number of duplicates), for example: 1,2,5,7,9
2. Repeat step 1.
3. Merge the arrays.
You can generalize it with the parameters minValue, maxValue, totalNumber and numberOfDuplicates, but you will need a little more code to handle cases like 10/3 and maxValue < totalNumber.
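A possible generalization of the merge trick, sketched in Python rather than Lua. The parameter names are illustrative, and the way the uneven case (such as 10/3) is handled, by giving the first few passes one extra element, is just one assumption about how to resolve it.

```python
import random

def merge_trick(min_value, max_value, total, max_duplicates, rng=None):
    """Draw `total` numbers in [min_value, max_value], each at most `max_duplicates` times."""
    rng = rng or random.Random()
    population = range(min_value, max_value + 1)
    if total > len(population) * max_duplicates:
        raise ValueError("not enough distinct values for the requested total")
    base, extra = divmod(total, max_duplicates)
    merged = []
    for i in range(max_duplicates):
        size = base + (1 if i < extra else 0)         # spread the remainder over the first passes
        merged.extend(rng.sample(population, size))   # each pass contains unique values only
    return sorted(merged)

print(merge_trick(1, 10, 10, 2))
```

Because every pass is duplicate-free, no value can appear more often than there are passes.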
1. Generate a sequential list of non-random numbers within a range, with no duplicates.
2. Add them to a table, but add each number X times, where X is the total number of duplicates allowed. So we now have a table X times as long, with each individual number listed X times.
3. Shuffle the table, or generate a list of random numbers, or both.
4. Then simply extract the numbers from the table, using the generated numbers as the numeric keys into the "duplicate" table.
You can store anything at those key values, so this works for anything, not just numbers.
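The steps above could look roughly like this in Python (a sketch, not Lua; the function and parameter names are made up). It uses the shuffle variant and then takes the first entries of the shuffled table, which is one of the options the steps mention.

```python
import random

def draw_with_duplicate_cap(min_value, max_value, count, max_duplicates, rng=None):
    """Return `count` random values where no value occurs more than `max_duplicates` times."""
    rng = rng or random.Random()
    # Each number is listed max_duplicates times in the pool.
    pool = [v for v in range(min_value, max_value + 1) for _ in range(max_duplicates)]
    if count > len(pool):
        raise ValueError("count exceeds the size of the pool")
    rng.shuffle(pool)
    return pool[:count]  # taking from a shuffled pool can never exceed the cap

print(sorted(draw_with_duplicate_cap(1, 10, 10, 2)))
```

The pool can hold any objects, not just numbers, as long as each one is repeated the allowed number of times.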
Is there a formula to randomize a column of data which keeps each item represented only once (has the same items)?
So:
APPLES
PEARS
BERRIES
Might come out as
PEARS
BERRIES
APPLES
Randbetween formulas no good here, as you might get two 'PEAR's.
There is a new "randomize range" feature available in the context menu after selecting a range:
The following approach implements the idea of pnuts, but without creating a column filled with random numbers:
=query({A2:A20, arrayformula(randbetween(0, 1e20 + row(A2:A20)))}, "select Col1 order by Col2", 0)
Here A2:A20 is the range to be permuted. The arrayformula generates a random integer for each. The query sorts the array by those random integers, but does not put the random numbers in the spreadsheet.
The entropy of randbetween is 64 bits, so collisions are extremely unlikely. And even if two random numbers happen to be equal, that will not generate repetitions; sorting by whatever column never does that. It only means the corresponding pair of entries will appear in their original order.
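The same idea, sorting by a random key that is then thrown away, can be sketched in Python. This only illustrates the technique, not how Sheets evaluates the formula; `shuffle_by_random_key` is a hypothetical name, and the index is added as a tie-breaker so that equal keys keep their original order, as described above.

```python
import random

def shuffle_by_random_key(items, rng=None):
    """Permute items by pairing each with a random key, sorting, and dropping the key."""
    rng = rng or random.Random()
    keyed = [(rng.random(), index, item) for index, item in enumerate(items)]
    keyed.sort()  # sorts by key; ties fall back to the original index
    return [item for _, _, item in keyed]

print(shuffle_by_random_key(["APPLES", "PEARS", "BERRIES"]))
```

Sorting only reorders rows, so every item appears exactly once in the result regardless of the keys drawn.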
Came across this while looking for a formula to generate a set of random unique integers and ended up devising my own, so I'm leaving it here for anyone else looking for the same:
=SORT(SEQUENCE(A$1),RANDARRAY(A$1),FALSE) where A$1 is the count of integers to generate (expressed here as a cell reference because I like to create sheets where I can input a number in a cell rather than changing the formula, but this can of course be just a number.)
This can be expanded by adding the three other fields to SEQUENCE as explained in the function's documentation, or by wrapping it in an ARRAYCONSTRAIN to limit the count of entries returned without changing the minimum or maximum values of the generated entries. Hope all this makes sense!
I adopted a similar approach to user6655984 before I found this post.
RANDARRAY seemed to be a neat call-once solution.
I had similar demands. Formula based, randomized return order, ability to have only unique records or not as the whim took me.
Right clicking to randomize range meant user interaction I didn't want and the data is dynamic.
I built in the random numbers into a query data range on the fly.
I get the flexibility of query (can easily expand the range, add returned columns, filter criteria, etc.), I don't have to show the random numbers at all, I can wrap it in UNIQUE if desired, and it re-randomizes with each recalc.
Have some data in column A2:A.
To see the inline data range.
={RANDARRAY(ROWS($A$2:$A)),$A$2:$A}
Query (inc duplicates), filter out empty.
=QUERY({RANDARRAY(ROWS($A$2:$A)),$A$2:$A},"SELECT Col2 WHERE Col2<>'' ORDER BY Col1 ",0)
Same but wrapped by unique.
=UNIQUE(QUERY({RANDARRAY(ROWS($A$2:$A)),$A$2:$A},"SELECT Col2 WHERE Col2<>'' ORDER BY Col1 ",0))
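As a rough cross-check of what the last two formulas do (drop empties, randomize the order, optionally keep only unique records), here is a Python sketch. The names are hypothetical, and keeping the first occurrence of each value is an assumption about how the de-duplication behaves.

```python
import random

def random_order(values, unique=False, rng=None):
    """Drop empty entries, shuffle the rest, and optionally remove repeats."""
    rng = rng or random.Random()
    kept = [v for v in values if v != ""]   # WHERE Col2 <> ''
    rng.shuffle(kept)                        # ORDER BY a random column
    if unique:
        kept = list(dict.fromkeys(kept))     # keeps the first occurrence of each value
    return kept

data = ["a", "b", "", "a", "c"]
print(random_order(data))
print(random_order(data, unique=True))
```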
Hope it helps someone, even if years later. :)
Matt
This question already has answers here:
Query is ignoring string (non numeric) value
(2 answers)
Closed 5 months ago.
I have =query(importrange(...);"select * where Col1>' '") formula in my spreadsheet.
Importrange() by itself works ok, loading all the cells from the source spreadsheet exactly as they are.
But Col7 contains a few text cells among mostly numbers, and when query() is applied, the numbers are kept as they are but the text is replaced with blanks.
I've tried adding options no_format at the end of the query, with no difference.
Here's the contents of Col7, first line gets replaced with blank:
free
41,25
34,25
34,25
48,25
41,25
By design, QUERY won't preserve mixed data types within a single column:
In case of mixed data types in a single column, the majority data type
determines the data type of the column for query purposes. Minority
data types are considered null values.
The workaround will depend on how you want to use your data afterwards, and what compromises you would be willing to make. For example you could fairly easily convert the entire dataset into text strings, so that everything will be retained, but then all your numbers will be text strings as well.
If you needed to retain numbers as numbers, often the best bet will be to ImportRange the entire dataset somewhere in your spreadsheet (could be on a hidden sheet), and then use an alternative to QUERY on that (namely FILTER).
You can use vlookup for the columns that need to be a mixed data type. Query the unique row identifier and then vlookup the rest. I often use the following formula:
=arrayformula(if(isblank(A:A), "", vlookup(A:A, search_range, col_index, FALSE)))
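The idea behind that formula, sketched in Python: only the key column goes through the query, and the mixed-type column is fetched separately by key, so text and numbers both survive. The row layout, field names and sample values below are invented for illustration.

```python
def lookup_mixed_column(queried_keys, source_rows, key="id", column="price"):
    """For each queried key, fetch the mixed-type value from the source rows."""
    table = {}
    for row in source_rows:
        table.setdefault(row[key], row[column])  # first match wins, like an exact-match VLOOKUP
    # Blank keys give "", mirroring if(isblank(...), "", ...).
    # A key with no match gives None here; Sheets would show #N/A instead.
    return ["" if k == "" else table.get(k) for k in queried_keys]

# Hypothetical source data with a mixed text/number column.
rows = [
    {"id": "a1", "price": "free"},
    {"id": "a2", "price": 41.25},
    {"id": "a3", "price": 34.25},
]
print(lookup_mixed_column(["a1", "", "a3"], rows))  # ['free', '', 34.25]
```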
Change the format of the whole column to plain text.
The numbers will still read as numbers, and the minority text values will retain their text values. Nothing will be counted as null.