how to get latest timestamp data - influxdb

I am new to InfluxDB. Can you please help me with a query?
I have data like the below in InfluxDB, where rows with the same Name (A1, A2) can appear at multiple timestamps.
I need only the latest-timestamp row for each Name that appears at multiple timestamps (rows 3, 4, 5), plus the new data (A3). Is such a query available in InfluxDB?
This query gives only one record:
SELECT time, Name, value FROM "data" order by time desc limit 1

You can use InfluxDB's LAST() function, grouped by Name, to achieve this (GROUP BY requires Name to be a tag; substitute your own measurement name):
SELECT LAST("value") FROM AssetAssetType GROUP BY "Name"
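To sanity-check the result on the client side, the "latest point per Name" logic can be sketched in plain Python. The row layout (dicts with time, Name and value keys) and the sample values are assumptions for illustration, not the asker's data.

```python
def latest_per_name(rows):
    """Return the most recent row for each Name, ordered by Name."""
    latest = {}
    for row in rows:
        current = latest.get(row["Name"])
        # Keep the row with the greatest timestamp seen so far for this Name.
        if current is None or row["time"] > current["time"]:
            latest[row["Name"]] = row
    return [latest[name] for name in sorted(latest)]


# Hypothetical sample data: A1 and A2 appear twice, A3 only once.
rows = [
    {"time": 1, "Name": "A1", "value": 10},
    {"time": 1, "Name": "A2", "value": 20},
    {"time": 2, "Name": "A1", "value": 11},
    {"time": 2, "Name": "A2", "value": 21},
    {"time": 2, "Name": "A3", "value": 30},
]

for row in latest_per_name(rows):
    print(row["time"], row["Name"], row["value"])
```

With this sample input, one row per Name is kept: the later A1 and A2 rows, plus the single A3 row.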

Related

Tableau FIXED LOD vs COUNTD

I am working with a dataset containing 22,232,726 entries collected between 2008 and 2021. Because original entries can not be deleted from the database, a new entry must be created with the same ID to update an observation.
I want to remove all repeated IDs leaving only the latest entry per ID for my analysis.
I used the following Level of Detail function in Tableau to achieve this:
{FIXED [ID]: MAX([Date])} = [Date]
The function returns a total of 17,980,416 entries. However, when I run a distinct count COUNTD([ID]) before and after applying the LOD filter, I get 17,899,956 distinct IDs. Why is my LOD function returning an extra 80,460 repeated IDs to the result?
FYI, there are no Nulls in the ID nor the Date columns. So there can be repeated dates for the same ID, but I expected Tableau to keep only one of them in the results. How can I remove these extra repeated entries or fix this counting problem?
I eventually found a solution to the problem by using a Row_ID field as the criterion for selecting one of the records with an identical ID and Date. I used two LOD calcs as filters.
The first filter kept all unique IDs with the latest Date, including some repeated IDs with the same latest date.
1:{FIXED [ID]: MAX([Date])} = [Date]
The second filter took the repeated records with identical ID and Date and kept only the one with the last Row_ID.
2:{FIXED [ID],[Date]: MAX([Row_ID])}=[Row_ID]
The original dataset doesn't have a Row_ID variable, so I had to create it by using Pandas in Python by adding index and index_label parameters:
df.to_csv("my-file-name.csv", index=True, index_label='Row_ID')
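The same two-step rule (latest Date per ID, then highest Row_ID among ties) can be checked outside Tableau with a small Python sketch. The record layout and sample values below are hypothetical, and dates are assumed to be ISO-formatted strings so that they compare correctly as text.

```python
def keep_latest(records):
    """Keep one record per ID: latest Date, ties broken by highest Row_ID."""
    best = {}
    for rec in records:
        key = (rec["Date"], rec["Row_ID"])
        current = best.get(rec["ID"])
        # Tuple comparison orders by Date first, then by Row_ID.
        if current is None or key > (current["Date"], current["Row_ID"]):
            best[rec["ID"]] = rec
    return list(best.values())


def date_only_filter(records):
    """Equivalent of the first LOD filter alone: keeps every row on the latest Date."""
    latest = {}
    for rec in records:
        latest[rec["ID"]] = max(latest.get(rec["ID"], ""), rec["Date"])
    return [rec for rec in records if rec["Date"] == latest[rec["ID"]]]


# Hypothetical data: ID "x" has two rows sharing the same latest Date.
records = [
    {"Row_ID": 0, "ID": "x", "Date": "2020-01-01"},
    {"Row_ID": 1, "ID": "x", "Date": "2021-05-01"},
    {"Row_ID": 2, "ID": "x", "Date": "2021-05-01"},
    {"Row_ID": 3, "ID": "y", "Date": "2019-03-15"},
]

print(len(date_only_filter(records)))  # rows kept by the Date filter alone
print(len(keep_latest(records)))       # rows kept after the Row_ID tie-break
```

The Date-only filter keeps both tied rows for "x", which is the kind of surplus the question describes; the tie-break on Row_ID reduces the result to one row per ID.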

Match Value based on corresponding dates

I have the spreadsheet attached.
I'd like to find Client No from lookup sheet based on the date provided in the live sheet.
The same client can appear with a different client number, so I need to look up the name and date (from the live sheet) and find the corresponding client number in the lookup sheet, where the date from the live sheet falls between the two dates on the lookup sheet.
I hope this makes sense.
Any help appreciated.
Thank you
This might do what you're looking for.
=IFERROR(
QUERY(SORT(FILTER(Lookup!A$2:D,Lookup!C$2:C=B2,Lookup!A$2:A<=A2),1,0),
"SELECT * WHERE COL4 >= DATE '"&TEXT(A2,"YYYY-MM-DD")&"' LIMIT 1",0),
QUERY(SORT(FILTER(Lookup!A$2:D,Lookup!C$2:C=B2,Lookup!A$2:A<=A2),1,0),
"SELECT * LIMIT 1",0) )
I've added a tab Live-GK to your sheet, with this formula in C2. It has to be dragged down. There may be another approach where it can be done as an arrayformula, but I haven't figured that out.
Note that on my tab, I'm doing the lookups from Lookup-GK, since I could add more test data there. The above formula can be used as is, pasted into cell C2 in your Live tab.
Note that for debugging purposes, column H of my tab returns all of the columns, not just the client #, so the start and end dates can be verified.
Let me know if this helps you.
Explanation:
The inner filter selects all rows from the Lookup tab where:
i) the client name (column C in Lookup) matches the client name in column B (of Live), and,
ii) the start date (column A in Lookup) is less than or equal to the client date in Live.
These records are sorted in descending date order.
Then the query selects the first record where the end date (column D in Lookup) is greater than or equal to the client date in Live.
If the Lookup record has no end date, this gives an error (empty query result), so IFERROR runs a second query without the filtering by end date, selecting the one record with no end date but an appropriate start date.
These seemed to work with the few test records I used. If there is a duplication of client dates, the first client # is returned. See client #1 and #7 in my test data. Some more error handling might be necessary if your client records might have overlapping date ranges, as CalculusWhiz asked.
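The lookup logic explained above can be sketched in Python to make the fallback step easier to follow. The field names (client_no, name, start, end) and the sample rows are hypothetical stand-ins for the Lookup tab's columns, and returning None when nothing matches is a choice made for this sketch.

```python
from datetime import date


def find_client(lookup, name, when):
    """Mimic the formula: newest matching start date, preferring a covering end date."""
    # Inner FILTER: same client name and start date on or before the live date.
    candidates = [r for r in lookup if r["name"] == name and r["start"] <= when]
    # SORT(..., 1, 0): newest start date first.
    candidates.sort(key=lambda r: r["start"], reverse=True)
    if not candidates:
        return None
    # First QUERY: first row whose end date is on or after the live date.
    for r in candidates:
        if r["end"] is not None and r["end"] >= when:
            return r["client_no"]
    # IFERROR fallback: first row regardless of end date.
    return candidates[0]["client_no"]


# Hypothetical lookup rows: the second range is still open (no end date).
lookup = [
    {"client_no": 101, "name": "Acme", "start": date(2020, 1, 1), "end": date(2020, 12, 31)},
    {"client_no": 202, "name": "Acme", "start": date(2021, 1, 1), "end": None},
]

print(find_client(lookup, "Acme", date(2020, 6, 15)))
print(find_client(lookup, "Acme", date(2021, 3, 1)))
```

The first call falls inside the closed range; the second finds no covering end date and falls back to the open-ended row with the newest start date.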

Get the value timestamp when grouping by time in InfluxDB

I am trying to get the MAX/MIN values over an interval of time, and I would like to get the corresponding timestamp of each value.
If I run: SELECT max(value) FROM data WHERE time > 1549034249000000000 and time < 1550157449000000000 GROUP BY time(10s)
I am receiving the timestamp of the range beginning instead of the max(value) timestamp.
What alternatives are there for getting the max(value) of an interval together with its timestamp?
In SQL it is possible to execute a query like: SELECT value, DENSE_RANK() OVER (PARTITION BY time ORDER BY variableName DESC) AS Rank FROM tableName. Is it not possible to run something like that in InfluxDB?
You cannot. When you use GROUP BY time(), you always get the beginning timestamp of each interval.
The alternative is not to use GROUP BY time().
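One way to act on that advice is to select the raw points for the time range and do the bucketing on the client. The sketch below assumes points arrive as (timestamp in nanoseconds, value) pairs and that windows are aligned to multiples of the window size; the function name and sample data are hypothetical.

```python
def max_per_window(points, window_ns=10 * 10**9):
    """For each window, return (window_start, time_of_max, max_value)."""
    best = {}
    for t, value in points:
        # Align the point to the start of its window.
        start = t - t % window_ns
        # Keep the first point that reaches the highest value in the window.
        if start not in best or value > best[start][1]:
            best[start] = (t, value)
    return [(start, t, v) for start, (t, v) in sorted(best.items())]


# Hypothetical raw points, timestamps in nanoseconds.
points = [
    (0, 1.0),
    (4_000_000_000, 5.0),
    (9_000_000_000, 3.0),
    (12_000_000_000, 2.0),
]

for window_start, t, value in max_per_window(points):
    print(window_start, t, value)
```

Each output row carries both the window start and the actual timestamp of the maximum, which is the piece the grouped query does not return.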

Google Sheets - comparing dates and returning the latest row with requirements

I have a sheet that uses a query to pull data from another sheet. The data looks a bit like this:
DATE STOREID OTHERDATA
02/11/2017 Store 1 Other data 1
01/11/2017 Store 1 Other data 2
09/10/2017 Store 2 Other data 3
05/10/2017 Store 2 Other data 4
I'm looking for a way for it to return only the latest date row per store, as seen below.
DATE STOREID OTHERDATA
02/11/2017 Store 1 Other data 1
09/10/2017 Store 2 Other data 3
The query I'm currently using looks something like this:
=query(DATASHEET!A2:CF11, "select C, CC, L,CD, E, BZ, CA, CB where (BF='CUSTOMERNAME1') order by C desc, CC, L, BZ desc",0)
Is it possible to make the query look at all dates and storeIDs and return only the highest date per storeID? I can imagine doing this in another language with a loop, but my Google results tell me it's not possible with query.
If that is the case, how would you recommend I do this in the data sheet so I could have a column say either LATEST / NOT LATEST for each row and then use query with a WHERE statement?
Here's an example sheet I tried setting up in case it helps explain what I'm trying to do.
https://docs.google.com/spreadsheets/removed
Any help is appreciated as I've spent all day trying to figure it out. Let me know if anything is unclear.
Thanks!!
Try this one:
=ARRAYFORMULA(VLOOKUP(QUERY({ROW(A2:A),SORT(A2:C)}, "SELECT MAX(Col1) WHERE Col3 IS NOT NULL GROUP BY Col3 LABEL MAX(Col1)''",0),{ROW(A2:A), SORT(A2:C)},{2,3,4},0))
Try this:
=ARRAYFORMULA(VLOOKUP(UNIQUE(B1:B11),SORT({B1:C11,A1:A11},3,0),{3,1,2},0))
The data is sorted by column A in descending order, so when VLOOKUP searches for each UNIQUE store ID, the first match it finds is that store's latest row.

How to select last record in InfluxDB

I have a pretty simple measurement in InfluxDB with the default time column and two other columns, as shown below.
Select * from measurement
gives me this output:
time component_id jkey
2016-09-27T02:49:17.837587671Z 3 "timestamp"
2016-09-27T02:49:17.849447239Z 3 "init_time"
2016-09-27T02:49:17.885999439Z 3 "ae_name"
2016-09-27T02:49:17.893056849Z 3 "init_time"
How can I select the last record of this measurement, that is, the record with the maximum time value?
This can be done with last(). See the docs for more information, or take a look at this example from the docs:
SELECT LAST("water_level") FROM "h2o_feet" WHERE "location" = 'santa_monica'
This will return the "newest" entry.
