InfluxDB: how to query every nth value

I am trying to query every nth data point in InfluxDB. I am executing the command below to do so, but I am getting no results. I am using sample data that I created for the sake of this example.
The command I am running in Influx's CLI:
SELECT value FROM generators GROUP BY time(5s)
The result:
GROUP BY requires at least one aggregate function
I am new to InfluxDB, and I am not sure what I am doing wrong. I have read up on making a continuous query, but when I do make one, I am unable to query the data, as it returns no results. Thanks in advance to all who reply.

You can use functions like FIRST() or LAST(), depending on which point you want from each interval.
SELECT FIRST(value) FROM generators GROUP BY time(5s)
https://docs.influxdata.com/influxdb/v1.7/query_language/functions/
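If the goal is a continuously downsampled series, the same aggregate can be wrapped in a continuous query. A rough sketch for InfluxDB 1.x, assuming a database named mydb and a target measurement generators_5s (both placeholder names):
CREATE CONTINUOUS QUERY "cq_generators_5s" ON "mydb" BEGIN SELECT FIRST(value) AS value INTO "generators_5s" FROM "generators" GROUP BY time(5s) END
A plain SELECT value FROM generators_5s should then return one point per 5-second interval.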

Related

Updating a query for false positives

I work in a compliance role at a very small start-up and review a lot of information, for example bank transfers/direct deposits/ACHs, every day. A report is pulled from BigQuery, which is exported to Google Sheets.
My problem is that there are a lot of false positives (basically, "posting data" that repeats often), and I'm trying to eliminate them.
One idea was just to update the query with keywords:
WHERE postingdata LIKE 'PersonName%'
But that's tedious and time-consuming, and I feel there's a better way, perhaps 'filtering' the results and then feeding them back into the query. Any ideas, tips, or general thoughts?
In this case you can use GROUP BY in your query. Here is how to use the clause. Start from this query:
SELECT account, TypeTransaction, amount, currency
FROM `tblBankTransaction`
The query returns data in which some rows are repeated; for example, rows 1 and 7 both show the account 894526972455 as a deposit.
In this case, I would add a GROUP BY clause:
SELECT account, TypeTransaction, amount, currency
FROM `tblBankTransaction`
GROUP BY account, TypeTransaction, amount, currency
Now the account 894526972455 with a deposit returns only one row. The same account returns a second row, but that one is a transfer, a different type of transaction. Which columns you group by depends on the information you have and what you want to treat as a duplicate.
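As an aside, if no aggregate columns are needed, SELECT DISTINCT (a different technique from GROUP BY) collapses the duplicates in one step against the same table:
SELECT DISTINCT account, TypeTransaction, amount, currency
FROM `tblBankTransaction`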
Within Google Sheets you can try UNIQUE, QUERY with a GROUP BY aggregation, or SORTN with mode 2 as its third parameter.
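For example, assuming the report lands in columns A:D (the range is a placeholder):
=UNIQUE(A2:D)
or, to de-duplicate whole rows with an explicit sort on column A:
=SORTN(A2:D, 9^9, 2, 1, TRUE)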

Is it possible to retrieve only the timestamp from an InfluxDB query

Is it possible to pass the timestamp returned by an InfluxDB query to another query?
SELECT max("value")
FROM "temp" WHERE ("floor" = '1');
Output
time max
---- ---
2020-01-17T00:00:00Z 573.44
Is it possible to pass the time from the result to another query?
You cannot do this with InfluxQL; it is not possible to nest queries in a way that passes the time range of the inner query to the outer query. It's another matter if you use Flux (the new query language, still in beta).
In Flux this is possible, because you can access time as a column, which you can then use to query your other measurements as required. You can also use JOIN to do more advanced operations like cross measurement calculations etc.
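A rough Flux sketch (the bucket name and the second measurement are assumptions, and details may differ between beta versions):
maxRow = from(bucket: "mydb/autogen")
    |> range(start: -30d)
    |> filter(fn: (r) => r._measurement == "temp" and r.floor == "1")
    |> max()
    |> findRecord(fn: (key) => true, idx: 0)

from(bucket: "mydb/autogen")
    |> range(start: maxRow._time)
    |> filter(fn: (r) => r._measurement == "power")
Here findRecord() pulls out the single row produced by max(), and its _time column seeds the time range of the second query.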

Split a KV<K,V> PCollection into multiple PCollections

Hi, after performing a GroupByKey on a KV PCollection, I need to:
1) Make every element in that PCollection a separate individual PCollection.
2) Insert the records in those individual PCollections into a BigQuery Table.
Basically my intention is to create a dynamic date partition in the BigQuery table.
How can I do this?
An example would really help.
For Google Dataflow to be able to perform the massive parallelisation that makes it one of a kind (as a service on the public cloud), the job flow needs to be predefined before it is submitted to the Google Cloud console. Every time you execute the jar file that contains your pipeline code (which includes the pipeline options and the transforms), a JSON file with the description of the job is created and submitted to the Google Cloud Platform. The managed service then uses this to execute your job.
The use case mentioned in the question demands that the input PCollection be split into as many PCollections as there are unique dates. For the split, the tuple tags needed to partition the collection would have to be created dynamically, which is not possible at this time. Creating tuple tags dynamically is not allowed because it would prevent the job-description JSON file from being created up front, and it defeats the whole design/purpose with which Dataflow was built.
I can think of a couple of solutions to this problem (each with its own pros and cons):
Solution 1 (a workaround for the exact use case in the question):
Write a Dataflow transform that takes the input PCollection and, for each element in the input:
1. Checks the date of the element.
2. Appends the date to a pre-defined BigQuery table name as a decorator (in the format yyyyMMdd).
3. Makes an HTTP request to the BQ API to insert the row into the decorated table name.
You will have to take the cost perspective of this approach into consideration, because there is a single HTTP request for every element, rather than the one BQ load job that the BigQueryIO Dataflow SDK module would have performed. A rough sketch of such a transform is shown below.
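A minimal sketch of the per-element insert, assuming the Beam Java SDK together with the google-cloud-bigquery client library; the dataset, table, and date field names are all placeholders:
import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.bigquery.*;
import org.apache.beam.sdk.transforms.DoFn;

// Hypothetical per-element insert with a date decorator (Solution 1).
class InsertWithDecoratorFn extends DoFn<TableRow, Void> {
  private transient BigQuery bq;

  @Setup
  public void setup() {
    bq = BigQueryOptions.getDefaultInstance().getService();
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    TableRow row = c.element();
    // 1. Check the date of the element (assumes a "date" field like "2020-01-17").
    String decorator = ((String) row.get("date")).replace("-", ""); // yyyyMMdd
    // 2. Append the date to the pre-defined table name as a decorator.
    TableId table = TableId.of("my_dataset", "my_table$" + decorator);
    // 3. One HTTP insert per element (this is the costly part).
    bq.insertAll(InsertAllRequest.newBuilder(table).addRow(row).build());
  }
}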
Solution 2 (best practice that should be followed in these types of use cases):
1. Run the dataflow pipeline in the streaming mode instead of batch mode.
2. Define a time window with whatever duration suits the scenario in which it is used.
3. For the `PCollection` in each window, write it to a BQ table with the decorator being the date of the time window itself.
You will have to consider rearchitecting your data source to send data to Dataflow in real time, but you will get a dynamically date-partitioned BigQuery table whose contents reflect your data processing in near real time. A fragment illustrating the windowing step follows.
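A fragment illustrating step 2, again assuming the Beam Java SDK (names are placeholders):
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.joda.time.Duration;

// Hypothetical: slice the stream into daily windows, one per table decorator.
PCollection<TableRow> daily = input.apply(
    Window.<TableRow>into(FixedWindows.of(Duration.standardDays(1))));
Each resulting window then corresponds to exactly one yyyyMMdd decorator.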
References:
Google Big Query Table Decorators
Google Big Query Table insert using HTTP POST request
How job description files work
Note: please say so in the comments and I will elaborate the answer with more code snippets if needed.

Division not working with continuous query in InfluxDB

I am trying to create a continuous query in InfluxDB. The query fetches hits per second by taking (1/response time) of the value I am already getting for another series (say, series1).
Here is the query:
select (1000/value) as value from series1 group by time(1s) into api.HPS;
My problem is that the query "select (1000/value) as value from series1 group by time(1s)" works fine and provides results, but as soon as I store the result through a continuous query, it starts giving me a parse error.
Please help.
Hard to give any concrete advice without the actual parse error returned and perhaps the relevant log lines. Try providing those to the mailing list at influxdb@googlegroups.com or email them to support@influxdb.com.
There's an email on the Google Group that might be relevant, too. https://groups.google.com/d/msgid/influxdb/c99217b3-fdab-4684-b656-a5f5509ed070%40googlegroups.com
Have you tried using whitespace between the values and the operator? E.g. select (1000 / value) AS value....
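Putting the two together, the spaced form of the full continuous query would be (untested, with the same series names as in the question):
select (1000 / value) as value from series1 group by time(1s) into api.HPS;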

Calculating duration between a start and end event in InfluxDB

I have two write points for InfluxDB: one is the start and the other is the end. I just need to determine the duration between those two events and make queries around it. InfluxDB has a DIFFERENCE() function, but it doesn't work on the time meta field.
Is supplying a custom timestamp value the only way to accomplish this?
As per "Can I perform mathematical operations against timestamps?"
No:
"Currently, it is not possible to execute mathematical operators against timestamp values in InfluxDB. Most time calculations must be carried out by the client receiving the query results."
and yes, maybe:
The function ELAPSED() returns the difference between subsequent timestamps in a single field.
So it depends on the shape of your data.
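For instance, if the start and end events are written as consecutive points of one field, a sketch like this (measurement and field names assumed) returns the gap between them:
SELECT ELAPSED("value", 1s) FROM "events" WHERE time > now() - 1h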
If you write only the two entries mentioned, then you can follow the steps below:
Limit the result to two (e.g. select * from timeseries limit 2)
Extract the time from the result set
Take the difference between the two timestamps
