Select every n-th row in Informix? - informix

I was wondering whether it is possible to select every n-th row in Informix, just like in MS SQL.
Something like
SELECT * FROM <TABLE> order by <COLUMN> ASC limit 1 OFFSET 4
did not work. We have to use driver version 4.10.FC9DE.
My goal is to get back only every 5th row from a table with about 350 entries. I'd be grateful for any hint on how to achieve this.

I propose this solution to select every 5th row:
First number all the rows starting from 1, then select every row whose number MOD 5 is 0:
SELECT t.*
FROM (SELECT *, SUM(1) OVER (ORDER BY <COLUMN>) AS num
FROM <TABLE> ) AS t
WHERE MOD(t.num, 5) = 0
This is probably not the most efficient way to do it.
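The number-then-filter idea can also be sketched outside the database. This is only a minimal Python illustration, not Informix code; the helper name `every_nth` and the sample `rows` list are made up for the example.

```python
def every_nth(rows, n, key=None):
    """Return the n-th, 2n-th, ... rows after sorting by key."""
    ordered = sorted(rows, key=key)
    # Number the rows starting at 1 (like the running SUM(1) in the query)
    # and keep those whose number is divisible by n.
    return [row for num, row in enumerate(ordered, start=1) if num % n == 0]


# Hypothetical data: 12 rows with ids 1..12 in shuffled order.
rows = [7, 3, 12, 1, 9, 5, 10, 2, 8, 4, 11, 6]
print(every_nth(rows, 5))  # [5, 10]
```

For a table of about 350 rows, this kind of filter would keep 70 of them.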

select skip 4 first 1 *
from <table>
order by <column> asc
You can see more at:
https://www.ibm.com/support/knowledgecenter/en/SSGU8G_14.1.0/com.ibm.sqls.doc/ids_sqs_0987.htm
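As a rough analogy only (Python list slicing, not Informix), SKIP 4 FIRST 1 skips four rows and then returns one, so it yields just the fifth row rather than every fifth row. The `rows` list below is hypothetical.

```python
rows = list(range(1, 21))  # hypothetical ordered result set: 1..20

skip, first = 4, 1
page = rows[skip:skip + first]  # analogous to SKIP 4 FIRST 1
every_fifth = rows[4::5]        # every 5th row, for comparison

print(page)         # [5]
print(every_fifth)  # [5, 10, 15, 20]
```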

Select 1 every 5 rows:
SELECT * FROM <TABLE> WHERE mod(rowid, 5) = 0;
Select 1 every 10 rows:
SELECT * FROM <TABLE> WHERE mod(rowid, 10) = 0;

Related

My QUERY function in GOOGLE SHEETS is pulling the first row of the data set but it doesn't belong

I often use query functions to pull the top 5 (or bottom 5) values of a column. My basic formulas usually look like this:
=QUERY(A2:N32, "SELECT A,N ORDER BY N DESC LIMIT 5")
but this time, it's grabbing the 1st row (N103SY, 34.7) as part of the query even though 34.7 does not come close to being within the top 5 values of all 31 possible values. The output IS correct starting with (N136SY, 62.0), so why the extra row at the top when it's not a part of the query?
N103SY 34.7
N136SY 62.0
N139SY 43.6
N127SY 43.3
N124SY 43.2
N119SY 41.0
Open doc (editable)...
https://docs.google.com/spreadsheets/d/1Oq1GvbsHdxpPM1wZ2HAXSjzYeA7dmT1Blq-raLilvbQ/edit#gid=735538815
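Pass the headers argument explicitly. When it is omitted, QUERY guesses how many header rows the range has; an empty third argument is treated as 0, so the first row is no longer taken as a header: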
=QUERY(A2:N32, "SELECT A,N ORDER BY N DESC LIMIT 5", )
or switch to SORTN:
=SORTN({A2:A32, N2:N32}, 5, 2, 0)

How can I get the top x products of each brand in Google Sheets?

I have a Google Spreadsheet that retrieves product data containing the SKU, the name of the product, and the revenue in the past x days. Using a REGEXMATCH function I retrieve the brands of the products that will be on sale in the coming weeks.
Now I want to retrieve the top 5 products of each brand (based on highest revenue), but with the QUERY function I am not able to apply the limit per brand. How can I do this? Here is an example of the dataset: https://docs.google.com/spreadsheets/d/19ysERREFus9sKuF99roj2fLQYvqNBZx-L9iNknExCfY/edit#gid=0
Use reduce() to iterate brands, and order by C desc limit 5 to get the top five products per brand, like this:
=reduce(
Dataset!A1:D1, unique(Dataset!D2:D),
lambda(
result, brand,
{
result;
query(
Dataset!A2:D,
"where D = '" & brand & "'
and A is not null
order by C desc
limit 5",
0
)
}
)
)
See your sample spreadsheet.
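The same per-brand logic (filter by brand, sort by revenue descending, keep five) can be sketched in Python. This is an illustration only: the function name `top_n_per_brand`, the tuple layout (SKU, name, revenue, brand, mirroring columns A to D) and the sample data are assumptions.

```python
from collections import defaultdict


def top_n_per_brand(products, n=5):
    """products: iterable of (sku, name, revenue, brand) tuples."""
    by_brand = defaultdict(list)
    for sku, name, revenue, brand in products:
        if sku:  # skip blank rows, like "A is not null"
            by_brand[brand].append((sku, name, revenue, brand))
    result = []
    for items in by_brand.values():  # brands in first-seen order
        items.sort(key=lambda p: p[2], reverse=True)  # order by revenue desc
        result.extend(items[:n])  # limit n
    return result


# Hypothetical data.
products = [
    ("S1", "Alpha", 10, "Acme"),
    ("S2", "Beta", 30, "Acme"),
    ("S3", "Gamma", 20, "Acme"),
    ("S4", "Delta", 5, "Bolt"),
    ("S5", "Epsilon", 50, "Bolt"),
]
for row in top_n_per_brand(products, n=2):
    print(row)
```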
You can also try this:
=SORT(filter({A2:C,IF(ISBLANK(D2:D),"NO_BRAND",D2:D)},MAP(IF(ISBLANK(D2:D),"NO_BRAND",D2:D),C2:C,LAMBDA(dx,cx,if(dx="",,RANK(cx,FILTER(C2:C,IF(ISBLANK(D2:D),"NO_BRAND",D2:D)=dx),0))))<6),4,0,3,0)

Influxdb: How to get count of number of results in a group by query

Is there any way I can get the total number of results / points / records in a GROUP BY query result?
> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m)
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z 2
2015-08-18T00:24:00Z 2
I expect the count to be 3 in this case. I could calculate the number of results from the time period and the interval (12m) here, but I would like to know whether it can be done with a query to the database.
You can apply one aggregate on top of another in a single query by using a subquery.
Using your example, wrap it in a SELECT COUNT(*) FROM (<inner query>):
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m))
name: h2o_feet
--------------
time count_count
1970-01-01T00:00:00Z 3
However, if the GROUP BY returns empty rows, they will not be counted.
For example, counting over the table below gives 2 rather than 3:
name: h2o_feet
--------------
time count
2015-08-18T00:00:00Z 2
2015-08-18T00:12:00Z
2015-08-18T00:24:00Z 2
To include empty rows in your count you will need to add fill(1) to your query like this:
> SELECT COUNT(*) FROM (SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' AND time >= '2015-08-18T00:00:00Z' AND time <= '2015-08-18T00:30:00Z' GROUP BY time(12m) fill(1))
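The difference between counting only non-empty buckets and counting all buckets can be sketched in Python. This is an analogy, not InfluxDB code: the helper `bucket_counts` is hypothetical, it assumes the start time is already aligned to the interval, and the sample points are made up (one every 6 minutes with 00:12 and 00:18 missing).

```python
from datetime import datetime, timedelta, timezone


def bucket_counts(timestamps, start, end, interval):
    """Count points per interval bucket between start and end (inclusive)."""
    counts = {}
    t = start
    while t <= end:
        counts[t] = 0  # every bucket exists, even when it stays empty
        t += interval
    for ts in timestamps:
        if start <= ts <= end:
            index = (ts - start) // interval
            counts[start + index * interval] += 1
    return counts


start = datetime(2015, 8, 18, 0, 0, tzinfo=timezone.utc)
end = start + timedelta(minutes=30)
interval = timedelta(minutes=12)

# Hypothetical points: 00:00, 00:06, 00:24, 00:30.
points = [start + timedelta(minutes=m) for m in (0, 6, 24, 30)]

counts = bucket_counts(points, start, end, interval)
non_empty = sum(1 for c in counts.values() if c > 0)
print(non_empty)    # 2, like counting without fill
print(len(counts))  # 3, like counting after filling empty buckets
```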
You will need to do some manual work. Run the query directly from the shell and count the output lines:
$ influx -execute "select * from measurement_name" -database="db_name" | wc -l
This will return 4 more than the actual number of points.
Here is an example:
luvpreet@DHARI-Inspiron-3542:~/www$ influx -execute "select * from yprices" -database="vehicles" | wc -l
5
luvpreet@DHARI-Inspiron-3542:~/www$ influx -execute "select * from yprices" -database="vehicles"
name: yprices
time price
---- -----
1493626629063286219 2
luvpreet@DHARI-Inspiron-3542:~/www$
So subtract 4 from the line count: the extra lines are the header lines and the blank line that the CLI prints around the data.

Why do sum and count give me unexpected results in InfluxDB v0.13?

I'm not able to figure out how sum and count work. I'm using InfluxDB version 0.13.
Let's say I have a measurement with lots of data. First, let me query it to get 10 rows:
> select count from X where time > 1472807400000000000 LIMIT 10
will respond with:
name: (X)
-------------------------
time count
1472807580000000000 1
1472807640000000000 1
1472807640000000000 1
1472807650000000000 3
1472807660000000000 1
1472807660000000000 6
1472807670000000000 1
1472807670000000000 3
1472807680000000000 1
1472807680000000000 1
Now I will sum this column:
> select sum(count) from X where time > 1472807400000000000 LIMIT 10
name: X
-------------------------
time sum
1472807400000000001 102
and count this column:
> select count(count) from X where time > 1472807400000000000 LIMIT 10
name: X
-------------------------
time count
1472807400000000001 44
What I was expecting:
"count - Returns the number of non-null values in a single field"
Shouldn't that be 10?
"sum - Returns the sum of the all values in a single field."
Shouldn't that be a value close to 19? (1,1,1,3,1,6,1,3,1,1)
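The LIMIT clause limits the number of results that are returned, and any function call takes precedence over it. The aggregate therefore runs over every point that matches the WHERE clause and produces a single row, so LIMIT 10 has nothing left to trim.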
So
select sum(count) from X where time > 1472807400000000000 LIMIT 10
is functionally equivalent to
select sum(count) from X where time > 1472807400000000000
Similarly
select count(count) from X where time > 1472807400000000000 LIMIT 10
is functionally equivalent to
select count(count) from X where time > 1472807400000000000
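The order of operations can be illustrated in Python. The first ten values are the ones shown in the question; the three extra values are hypothetical stand-ins for the rest of the measurement, which the question does not list.

```python
# First ten values come from the question; the rest are hypothetical.
values = [1, 1, 1, 3, 1, 6, 1, 3, 1, 1] + [2, 4, 5]

# What the query does: aggregate everything, then LIMIT the result rows.
aggregated_rows = [sum(values)]       # a single result row
limited_after = aggregated_rows[:10]  # LIMIT 10 has nothing to trim

# What the asker expected: take 10 points first, then aggregate.
limited_before = sum(values[:10])

print(limited_after)   # [30]
print(limited_before)  # 19
```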

Generate a list of all unique values of a multi-column range and give the values a rating according to how many times they appear in the last X cols

As the title says.
I have a range like this:
A B C
------ ------ ------
duck fish dog
rat duck cat
dog bear bear
What I want is to get a single-column list of all the unique values in the range, and assign them a rating (or tier) according to the number of times they have appeared in the last X columns (more columns are constantly added to the right side).
For example, let's say:
Tier 0: hasn't appeared in the last 2 columns.
Tier 1: has appeared once in the last 2 columns.
Tier 2: has appeared twice in the last 2 columns.
So the results should be:
Name Tier
------ ------
duck 1
rat 0
dog 1
fish 1
bear 2
cat 1
I was able to generate a list of unique values by using:
=ArrayFormula(UNIQUE(TRANSPOSE(SPLIT(CONCATENATE(B2:ZZ9&CHAR(9)),CHAR(9)))))
But it's the second part that I am not sure exactly how to achieve. Can this be done through Google Sheets commands or will I have to resort to scripting?
Sorry, I don't know enough to build an array formula, but I can explain how I get the result per cell and then expand it over a range.
Part 1: count the number of nonempty columns (assuming that a column is filled if it has something in the second row):
COUNTA( FILTER( Sheet1!$B$2:$Z$2 , NOT( ISBLANK( Sheet1!$B$2:$Z$2 ) ) ) )
Part 2: build a range for the last two filled columns:
OFFSET(Sheet1!$A$2, 0, COUNTA( ... )-1, 99, 2)
Part 3: use COUNTIF to count how many values of "bear" we meet there (here we can pass a cell-reference instead) :
COUNTIF(OFFSET( ... ), "bear")
I built a sample spreadsheet that gets the results, here's the link (I know external links are bad, but there's no other choice to show the reproducible example).
Sheet1 contains the data, Sheet2 contains the counts.
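The three parts above (find the last filled columns, take them as a range, count matches) can be sketched in Python using the sample range from the question. The function name `tiers` is hypothetical, and the sketch assumes every row has the same number of filled columns.

```python
def tiers(grid, last_cols=2):
    """Map each unique name to its number of appearances in the last columns."""
    width = max(len(row) for row in grid)
    # Unique names in column-major order, skipping blanks.
    names = []
    for col in range(width):
        for row in grid:
            if col < len(row) and row[col] and row[col] not in names:
                names.append(row[col])
    # Cells of the last `last_cols` columns (the OFFSET range).
    first = max(0, width - last_cols)
    recent = [cell for row in grid for cell in row[first:width]]
    # COUNTIF for each name.
    return {name: recent.count(name) for name in names}


grid = [
    ["duck", "fish", "dog"],
    ["rat", "duck", "cat"],
    ["dog", "bear", "bear"],
]
print(tiers(grid))
# {'duck': 1, 'rat': 0, 'dog': 1, 'fish': 1, 'bear': 2, 'cat': 1}
```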
I suggest using both a script and a formula.
Normalize the data
A script is the easiest way to normalize the data. It will convert your columns into single-column data:
/**
 * Converts a range of columns into one column of records.
 *
 * @param {Array<Array>} data The input range.
 * @return {Array<Array>} Row number, column number, value.
 * @customfunction
 */
function normalizeData(data) {
  var normalData = [];
  // Header row
  normalData.push(['Row', 'Column', 'Data']);
  // One output row per input cell
  for (var i = 0; i < data.length; i++) {
    var line = data[i];
    for (var j = 0; j < line.length; j++) {
      normalData.push([i + 1, j + 1, line[j]]);
    }
  }
  return normalData;
}
Test it:
Open the script editor: Tools → Script editor (or in Chrome: [Alt → T → E]).
After pasting this code into the script editor, use it as a simple formula: =normalizeData(data!A2:C4)
You will get the resulting table:
Row Column Data
1 1 duck
1 2 fish
1 3 dog
2 1 rat
2 2 duck
2 3 cat
3 1 dog
3 2 bear
3 3 bear
Then use it for further calculations. There are a couple of ways to do this. One way is to add an extra column holding the criterion. In column D, paste this formula:
=ARRAYFORMULA((B2:B>1)*1)
It checks whether the column number is greater than 1 and returns ones and zeros.
Then write a simple QUERY formula:
=QUERY({A:D},"select Col3, sum(Col4) where Col1 > 0 group by Col3")
and get the desired output.
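For comparison, the normalize-then-aggregate approach can be sketched in Python. This is an illustration under assumptions, not the Sheets implementation: `normalize` mirrors the custom function without the header row, `tier_by_value` is a hypothetical helper combining the 0/1 flag column with the grouped sum, and the result is sorted by name only to give a stable printout.

```python
from collections import defaultdict


def normalize(data):
    """Flatten a 2D range into (row, column, value) records, 1-based."""
    return [(i + 1, j + 1, value)
            for i, line in enumerate(data)
            for j, value in enumerate(line)]


def tier_by_value(records, min_column):
    """Sum a 0/1 flag per value: 1 when the record's column > min_column."""
    totals = defaultdict(int)
    for _row, column, value in records:
        totals[value] += int(column > min_column)
    return dict(sorted(totals.items()))


data = [
    ["duck", "fish", "dog"],
    ["rat", "duck", "cat"],
    ["dog", "bear", "bear"],
]
print(tier_by_value(normalize(data), min_column=1))
# {'bear': 2, 'cat': 1, 'dog': 1, 'duck': 1, 'fish': 1, 'rat': 0}
```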

Resources