InfluxDB: Select latest values in ascending order - influxdb

To select the latest 100 data points from a measurement I use the following query:
select field1, field2 from measurement
where time < now()
order by time desc limit 100
However I need the values in ascending order. Currently I'm inverting the result in my application, which is costly.
I also tried a subquery, but without success:
select field1, field2 from
(select * from measurement
where time <= now()
order by time desc limit 100) order by asc

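You can use this query as a workaround:
select * from measurement where time <= now() and time >= now() - "some time you defined" limit 100
You have to be careful when choosing the time window, though: results come back in ascending time order by default, so if the window holds more than 100 points, limit 100 returns the oldest 100 points in the window rather than the latest 100.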
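To make the caveat about the time window concrete, here is a minimal Python sketch that only simulates the behaviour on an in-memory list; it does not talk to InfluxDB. The helper names `latest_ascending` and `window_ascending` and the sample data are hypothetical, and the simulation assumes the query returns points oldest first before applying the limit.

```python
from datetime import datetime, timedelta


def latest_ascending(points, n):
    """Return the n newest points, oldest first (what the question asks for)."""
    newest_first = sorted(points, key=lambda p: p[0], reverse=True)[:n]
    return newest_first[::-1]


def window_ascending(points, now, window, n):
    """Mimic the workaround: filter by time window, ascending order, then limit n."""
    in_window = sorted(p for p in points if now - window <= p[0] <= now)
    return in_window[:n]


now = datetime(2024, 1, 1, 12, 0, 0)
# Hypothetical data: one point per minute over the last 10 minutes.
# The value stored with each point is its age in minutes.
points = [(now - timedelta(minutes=m), m) for m in range(10)]

wanted = latest_ascending(points, 3)
# A window that holds exactly 3 points gives the desired result.
tight = window_ascending(points, now, timedelta(minutes=2), 3)
# A window that holds more than 3 points returns the oldest 3 in the window.
wide = window_ascending(points, now, timedelta(minutes=9), 3)

print([value for _, value in wanted])
print([value for _, value in tight])
print([value for _, value in wide])
```

The workaround therefore only matches "latest 100" when the window contains at most 100 points, which requires knowing the write rate of the measurement.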

Related

Union all performance on serverless sql in synapse is poor

We are moving a database to a Delta Lake and have copied the tables to Delta tables in Azure Data Lake Storage Gen2.
The users access the data through a serverless SQL pool in Synapse.
We have run into a showstopper.
When running queries like these:
select top 1000
col1
,col2
,...
,col50
from tablea
select top 1000
col1
,col2
,...
,col60
from tableb
each takes about 5 seconds at most. According to the log, the data processed by each query is about 100 MB.
When running this query:
select top 1000 * from
(
select
col1
,col2
,...
,col50
from tablea
union all
select top 1000
col1
,col2
,...
,col50
from tableb
) a
it takes minutes, and the log shows a memory use of hundreds of gigabytes.
Naively, I would think that the optimizer would just return the first 1000 rows from the first table in the union all, but for some reason it processes all of the data before returning the first 1000 rows.
The only solution we can see is to create a new set of tables in the Delta Lake with the tables unioned, but this seems like a waste of space.
We have separate tables because the dimensionality is slightly different, and the measures are wholly different.
Perhaps 80% of the time, the users will only query rows from one of the tables.
Is there a way to optimize union all in serverless queries?

How to sort by two conditions in a query function?

Currently, the query in the image sorts automatically from oldest date to newest. How would I go about sorting by two conditions: oldest date to newest, and highest priority score to lowest? The date order would take precedence, and within each date the priority score would be the secondary sort key.
The dates are under variable "A" and Priority score is under variable "G" in the named range.
Try
=QUERY(backlogdata, "Select A, D, E, G where A is not null order by A asc, G desc limit "&F1, 1)
and see if that works?

How can I label the result of a mathematical equation in a Google Sheets Query and then order it by that label?

I am trying to compute the change in basis points between two columns in a query statement and then order by the results.
I have tried to label the result and Order By that label, but this doesn't work.
=QUERY('Sample Sheet'!A:I,"SELECT A,B, ((I-H)/H)*10000 LIMIT 5")
I would like the ((I-H)/H)*10000 to be that which I order by. Currently, the Limit statement brings the top 5 results. I don't want to Order By H or I because it is the change in the two numbers I want to display.
Any help would be much appreciated!
=QUERY(A:I,"SELECT A,B, ((I-H)/H)*10000 WHERE A IS NOT NULL ORDER BY ((I-H)/H)*10000 LIMIT 5",1)
ORDER BY can be used with the same expression that appears in SELECT.
You could use a double query to keep it nice and clean:
=QUERY(QUERY('Sample Sheet'!A:I,
"select A,B,((I-H)/H)*10000", 1),
"order by Col3 desc
limit 5
label Col3'equation'", 1)
but if you prefer one query then:
=QUERY('Sample Sheet'!A:I,
"select A,B,((I-H)/H)*10000
order by ((I-H)/H)*10000 desc
limit 5
label ((I-H)/H)*10000'equation'", 1)

Clean cumulative sum alongside grouped sum

I am working in PostgreSQL 9.6.6
For the sake of reproducibility, I'll use create temporary table to create a "constant" table to play with:
create temporary table test_table as
select * from
(values
('2018-01-01', 2),
('2018-01-01', 3),
('2018-02-01', 1),
('2018-02-01', 2))
as t (month, count)
A select * from test_table returns the following:
month | count
------------+-------
2018-01-01 | 2
2018-01-01 | 3
2018-02-01 | 1
2018-02-01 | 2
The desired output is the following:
month | sum | cumulative_sum
------------+-----+----------------
2018-01-01 | 5 | 5
2018-02-01 | 3 | 8
In other words, the values have been summed, grouping by month, and then the cumulative sum is displayed in another column.
The issue is that the only way I know to achieve this is somewhat convoluted. The grouped sum must be computed first (with a subselect or a with statement), and then the running tally is computed with a select statement against that result, like so:
with sums as
(select month,
sum(count) as sum
from test_table
group by 1)
select month,
sum,
sum(sum) over (order by month) as cumulative_sum
from sums
What I wish could work would be something more like...
select month,
sum(count) as sum,
sum(count) over (order by month) as cumulative_sum
from test_table
group by 1
But this returns
ERROR: column "test_table.count" must appear in the GROUP BY clause or be used in an aggregate function
LINE 3: sum(count) over (order by month) as cumulative_sum
No amount of fussing with the group by clause seems to satisfy PostgreSQL.
TL;DR: is there a way in PostgreSQL to compute both a sum over groups and the cumulative sum over groups using just a single select statement? More generally, is there a "preferred" way to accomplish this, beyond the method I use in this question?
Your hunch to use SUM as an analytic function was on the right track, but you need to take the analytic sum of the aggregate sum:
SELECT month,
SUM(count) as sum,
SUM(SUM(count)) OVER (ORDER BY month) AS cumulative_sum
FROM test_table
GROUP BY 1;
As to why this works: analytic (window) functions are evaluated after the GROUP BY clause has been applied, so the aggregate sum is already available when the rolling sum is taken.
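The nested form can be tried without a PostgreSQL server. The sketch below is an assumption-laden stand-in: it runs the same query shape through Python's built-in sqlite3 module (SQLite 3.25 or newer is needed for window functions), not through PostgreSQL 9.6. The alias `monthly_sum` and the final `ORDER BY` are additions for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL temp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (month TEXT, count INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?)",
    [
        ("2018-01-01", 2),
        ("2018-01-01", 3),
        ("2018-02-01", 1),
        ("2018-02-01", 2),
    ],
)

# The inner SUM(count) is the per-group aggregate; the outer SUM(...) OVER
# is a window function that runs over the already grouped rows.
rows = conn.execute(
    """
    SELECT month,
           SUM(count) AS monthly_sum,
           SUM(SUM(count)) OVER (ORDER BY month) AS cumulative_sum
    FROM test_table
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for row in rows:
    print(row)
```

With the sample data from the question, this yields one row per month, with 5 and 3 as the grouped sums and 5 and 8 as the running totals.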

MySql query to select rows with the next lower or higher date

Let's say I selected the rows from a table which have the biggest date value, like this:
SELECT * FROM table_name WHERE column_name IN (SELECT MIN(column_name) FROM table_name)
It works fine, it selects the rows with the "newest" date value.
What I want to know is whether there is a query in MySQL to list the rows which have the next descending date.
For example, in my first select, I've selected the rows which have the MIN date value, like this: 2012-08-27 10:15:00
What I want is another query to select the rows whose date value is the closest next value, like this: 2012-08-28 11:45:00
So there are other rows with bigger or lower date values, but I don't want to select them. Only the closest next one from where I currently am.
If you have a value for the datetime whose closest next element you want to find (let's name it comparison_date), you could try something like this:
SELECT * FROM table_name WHERE column_name IN (SELECT MIN(column_name) FROM table_name WHERE column_name > comparison_date ORDER BY column_name)
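This returns every row whose datetime equals the closest value later than comparison_date (the ORDER BY inside the subquery has no effect, since MIN already produces a single value). If you want to limit it to a single result, you should use: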
SELECT * FROM table_name WHERE column_name IN (SELECT MIN(column_name) FROM table_name WHERE column_name > comparison_date ORDER BY column_name) LIMIT 1
Alternatively, if you want the next closest datetime from current time, you can replace comparison_date with NOW() which is a MySQL function. Then, you can try something like this:
SELECT * FROM table_name WHERE column_name IN (SELECT MIN(column_name) FROM table_name WHERE column_name > NOW() ORDER BY column_name)
To find the next lower datetime (the previous one), use MAX instead of MIN, because the closest earlier value is the largest one below comparison_date:
SELECT * FROM table_name WHERE column_name IN (SELECT MAX(column_name) FROM table_name WHERE column_name < comparison_date) LIMIT 1
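The neighbour lookup can be checked on a small data set. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the table `events`, its columns, the sample rows, and the helper `neighbours` are all hypothetical. It takes MIN over later values for the next date and MAX over earlier values for the previous date, and it stores datetimes as ISO-formatted text so that string comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, happened_at TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("a", "2012-08-26 09:00:00"),
        ("b", "2012-08-27 10:15:00"),
        ("c", "2012-08-28 11:45:00"),
        ("d", "2012-08-28 11:45:00"),
        ("e", "2012-08-30 08:30:00"),
    ],
)


def neighbours(conn, comparison_date, direction):
    """Return all rows sharing the closest later or earlier datetime."""
    if direction == "next":
        aggregate, operator = "MIN", ">"   # smallest value above the reference
    elif direction == "previous":
        aggregate, operator = "MAX", "<"   # largest value below the reference
    else:
        raise ValueError("direction must be 'next' or 'previous'")
    sql = (
        "SELECT name, happened_at FROM events "
        f"WHERE happened_at = (SELECT {aggregate}(happened_at) FROM events "
        f"WHERE happened_at {operator} ?) "
        "ORDER BY name"
    )
    return conn.execute(sql, (comparison_date,)).fetchall()


next_rows = neighbours(conn, "2012-08-27 10:15:00", "next")
previous_rows = neighbours(conn, "2012-08-27 10:15:00", "previous")
no_rows = neighbours(conn, "2012-08-30 08:30:00", "next")

print(next_rows)
print(previous_rows)
print(no_rows)
```

When several rows share the closest datetime, all of them are returned; when no later value exists, the subquery yields NULL and the result is empty.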
