I am at a bit of a loss trying to figure out what is going on here:
I get results for this query:
SELECT value FROM "measures" WHERE time <= 1465195336002ms ORDER BY time desc
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "measures",
"columns": [
"time",
"value"
],
"values": [
[
1465195336000,
87.4
],
[
1464596862000,
86.66
],
[
1464070337000,
86.64
],
[
1463985100000,
86.77
]
]
}
]
}
]
}
All well and good, as expected.
But if I issue the following query, I get no results. Clearly this should match the same rows as above minus the first result:
SELECT value FROM "measures" WHERE time <= 1464596862000ms ORDER BY time desc
{
"results": [
{
"statement_id": 0
}
]
}
I figured it out. Although it is far from obvious, this behaviour seems to occur when more than one measurement is recorded for a given time period.
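For anyone who wants to reproduce the two queries with other cut-off times, here is a minimal Python sketch that turns a datetime into the epoch-millisecond literal used above. The helper names (to_epoch_ms, upper_bound_query) are made up for illustration, and the sketch only builds the query text; it does not explain or work around the empty result.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = moment - EPOCH
    # Integer arithmetic avoids float rounding on large timestamps.
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000


def upper_bound_query(measurement: str, field: str, moment: datetime) -> str:
    """Build the same kind of InfluxQL statement as in the question (hypothetical helper)."""
    return (
        f'SELECT {field} FROM "{measurement}" '
        f"WHERE time <= {to_epoch_ms(moment)}ms ORDER BY time desc"
    )


if __name__ == "__main__":
    cutoff = datetime(2016, 6, 6, 6, 42, 16, tzinfo=timezone.utc)
    print(upper_bound_query("measures", "value", cutoff))
```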
I have a problem with the WHERE clause in a URL query. In short, this works:
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE time='2021-05-19T11:21:11.448Z'
this doesn’t:
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE lon='23.53815'
Difference: in the first statement I use ‘time’ in the WHERE clause, and in the second one I use ‘lon’ instead:
WHERE time='2021-05-19T11:21:11.448Z' vs. WHERE lon='23.53815'
It doesn’t make sense to me why the second one doesn’t work. Any help would be much appreciated. Thanks.
P.S. Here’s the output of these two:
#1:
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "navigation.position",
"columns": [
"time",
"lat"
],
"values": [
[
"2021-05-19T11:21:11.448Z",
60.084066666666665
]
]
}
]
}
]
}
#2
{
"results": [
{
"statement_id": 0
}
]
}
It makes sense: lat (and lon) is not a string type, so comparing it with a quoted value matches nothing.
Filtering, where lat is a string type:
lat='60.084066666666665'
vs. filtering, where lat is a float type:
lat=60.084066666666665
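As a rough illustration of the quoting difference, the sketch below builds the query URL in Python and only wraps the value in single quotes when it is a Python string. The function names are hypothetical, the host, database, and measurement are taken from the question, and the backslash escaping of single quotes is an assumption about InfluxQL string literals.

```python
from urllib.parse import urlencode


def influxql_literal(value) -> str:
    """Render a Python value as an InfluxQL literal: numbers bare, strings single-quoted."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return repr(value)
    # Assumption: single quotes inside a string literal are escaped with a backslash.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_query_url(base_url: str, db: str, measurement: str,
                    select: str, field: str, value) -> str:
    """Assemble a /query URL with the statement properly URL-encoded."""
    statement = (
        f'SELECT {select} FROM "{measurement}" '
        f"WHERE {field}={influxql_literal(value)}"
    )
    return f"{base_url}/query?" + urlencode({"pretty": "true", "db": db, "q": statement})


if __name__ == "__main__":
    # Float field: pass a float, so the literal is written without quotes.
    print(build_query_url("http://localhost:8086", "boatdata",
                          "navigation.position", "time,lat", "lon", 23.53815))
```

Passing the string "23.53815" instead of the float would produce the quoted form, which is the variant that returned no rows.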
I have Druid timeseries query:
{
"queryType": "timeseries",
"dataSource": {
"type": "union",
"dataSources": [
"ds1",
"ds2"
]
},
"dimensions":["dim1"],
"aggregations": [
{
"name": "y1",
"type": "doubleMax",
"fieldName": "value1"
}
],
"granularity": {
"period": "PT10S",
"type": "period"
},
"postAggregations": [],
"intervals": "2017-06-09T13:05:46.000Z/2017-06-09T13:06:46.000Z"
}
I want the result to include the values of the dimensions as well, not just the aggregations as it does now:
{
"timestamp": "2017-06-09T13:05:40.000Z",
"result": {
"y1": 28.724306106567383
}
},
{
"timestamp": "2017-06-09T13:05:50.000Z",
"result": {
"y1": 28.724306106567383
}
},
How do I have to change the query? Thanks in advance!
If you need a dimension column in the result of a timeseries query, you are combining aggregated data with a non-aggregated column, which a timeseries query does not do; this requirement calls for a topN or groupBy query.
The groupBy query is probably the most powerful query type Druid currently supports, but it also performs poorly; you can use a topN query for your purpose instead.
Link for topN documentation and example can be found here:
http://druid.io/docs/latest/querying/topnquery.html
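A minimal sketch of that change, written in Python so the query can be built and serialized programmatically. It copies the timeseries query from the question and swaps in the topN fields (dimension, metric, threshold). The threshold of 10 is an arbitrary illustrative value, the helper name is made up, and whether the union dataSource behaves the same under topN is an assumption to verify against your Druid version.

```python
import json
from copy import deepcopy

# The timeseries query from the question, unchanged.
TIMESERIES_QUERY = {
    "queryType": "timeseries",
    "dataSource": {"type": "union", "dataSources": ["ds1", "ds2"]},
    "dimensions": ["dim1"],
    "aggregations": [{"name": "y1", "type": "doubleMax", "fieldName": "value1"}],
    "granularity": {"period": "PT10S", "type": "period"},
    "postAggregations": [],
    "intervals": "2017-06-09T13:05:46.000Z/2017-06-09T13:06:46.000Z",
}


def timeseries_to_topn(ts_query: dict, dimension: str, metric: str, threshold: int) -> dict:
    """Return a topN variant of a timeseries query without modifying the input."""
    query = deepcopy(ts_query)
    query["queryType"] = "topN"
    # topN takes a single "dimension" rather than a "dimensions" list.
    query.pop("dimensions", None)
    query["dimension"] = dimension
    query["metric"] = metric        # the aggregation used to rank dimension values
    query["threshold"] = threshold  # how many dimension values to keep per time bucket
    return query


if __name__ == "__main__":
    topn = timeseries_to_topn(TIMESERIES_QUERY, "dim1", "y1", 10)
    print(json.dumps(topn, indent=2))
```

Note that topN works on one dimension at a time; for several dimensions at once, groupBy is the query type to look at.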
Does the Timeseries() query not support dimensions?
I tried it in my project, but it does not work.
Here is the error:
TypeError: queryRep.dataSource(...).dimension is not a function
2|DSP-api | at dimensionData (/home/ec2-user/reports/dsp_reports/controllers/ReportController.js:228:22)
Let me know if anyone has a solution for this.
TY.
I am trying to create an iOS 12 Shortcut based on the Gautrain API.
I want to do a POST to the URL https://api.gautrain.co.za/transport-api/api/0/journey/create with the following payload:
{
"geometry": {
"coordinates": [
[
28.23794,
-25.74762
],
[
28.05693,
-26.10858
]
],
"type": "MultiPoint"
},
"profile": "ClosestToTime",
"maxItineraries": 3,
"timeType": "DepartAfter",
"only": {
"agencies": [
"edObkk6o-0WN3tNZBLqKPg"
]
}
}
I have entered all these details into a "Get Contents of URL" block. For the elements of the "coordinates" arrays I have used "Number".
The problem is that when I track what my phone is sending via mitmproxy, it sends all the information correctly, but the coordinates have been truncated to integers:
{
"geometry": {
"coordinates": [
[
28,
-25
],
[
28,
-26
]
],
"type": "MultiPoint"
},
"maxItineraries": 1,
"only": {
"agencies": [
"edObkk6o-0WN3tNZBLqKPg"
]
},
"profile": "ClosestToTime",
"timeType": "DepartAfter"
}
For this reason, the request is not giving the desired results.
I have a feeling this may be a bug, but is there something I am missing where I can tell Shortcuts to use the full set of digits?
I have found the problem. Since I am in South Africa, the numbers are expected to have commas instead of periods for decimals. I would have loved some feedback in the field that this wasn't a valid number instead of just silently ignoring the decimal.
The solution therefore was to change the "28.23794" in the entry box to "28,23794".
I might also link to postman-echo.com as an excellent tool for debugging these kinds of requests.
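To illustrate the effect, here is a small Python sketch of a lenient number parser that only recognises one decimal separator and silently drops whatever follows an unexpected character. This is only a guess at the kind of behaviour involved; it is not how Shortcuts is known to be implemented, and the function name is made up.

```python
import re


def lenient_parse(text: str, decimal_sep: str) -> float:
    """Parse a leading number, accepting a fraction only if it uses decimal_sep.

    Anything after the recognised part is ignored without an error, which
    mimics the silent truncation described above.
    """
    pattern = r"[+-]?\d+(?:" + re.escape(decimal_sep) + r"\d+)?"
    match = re.match(pattern, text.strip())
    if match is None:
        raise ValueError(f"not a number: {text!r}")
    # Normalise the locale separator to a period before converting.
    return float(match.group(0).replace(decimal_sep, "."))


if __name__ == "__main__":
    # With a comma locale, the period ends the number early.
    print(lenient_parse("28.23794", ","))   # fraction is dropped
    print(lenient_parse("28,23794", ","))   # fraction is kept
    print(lenient_parse("-25.74762", ","))  # fraction is dropped
```

A stricter parser would raise an error on the leftover ".23794" instead, which is the feedback that was missing in the entry box.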
I've set up a Druid cluster to ingest real-time data from Kafka.
Question
Does Druid support fetching data that's sorted by timestamp? For example, let's say I need to retrieve the latest 10 entries from a Datasource X. Can I do this by using a LimitSpec (in the Query JSON) that includes the timestamp field? Or is there another, better option supported by Druid?
Thanks in advance.
Get unaggregated rows
To get unaggregated rows, you can do a query with "queryType": "select".
Select queries are also useful when pagination is needed - they let you set a page size, and automatically return a paging identifier for use in future queries.
In this example, if we just want the top 10 rows, we can pass in "pagingSpec": { "pagingIdentifiers": {}, "threshold": 10 }.
Order by timestamp
To order these rows by "timestamp", you can pass in "descending": "true".
Looks like most Druid query types support the descending property.
Example Query:
{
"queryType": "select",
"dataSource": "my_data_source",
"granularity": "all",
"intervals": [ "2017-01-01T00:00:00.000Z/2017-12-30T00:00:00.000Z" ],
"descending": "true",
"pagingSpec": { "pagingIdentifiers": {}, "threshold": 10 }
}
Docs on "select" type queries
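If the query is assembled in code rather than written by hand, a helper along these lines may be convenient. It is only a sketch: the function name and its arguments are hypothetical, the paging key is spelled pagingIdentifiers on the assumption that this matches the Druid documentation, and descending is sent as a JSON boolean rather than the string used above.

```python
import json


def latest_rows_query(data_source: str, interval: str, limit: int) -> dict:
    """Build a select query that returns the newest `limit` rows of a datasource."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return {
        "queryType": "select",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        "descending": True,  # newest rows first
        # An empty identifier map asks for the first page.
        "pagingSpec": {"pagingIdentifiers": {}, "threshold": limit},
    }


if __name__ == "__main__":
    query = latest_rows_query(
        "my_data_source",
        "2017-01-01T00:00:00.000Z/2017-12-30T00:00:00.000Z",
        10,
    )
    print(json.dumps(query, indent=2))
```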
You can use a groupBy query to do this: group by __time through an extraction function, set the granularity to all, and use the limitSpec to sort and limit. Getting the latest 10 with a timeseries query is trickier. One way is to set the granularity to the desired one, say hour, and then set the interval to the 10 hours leading up to the most recent point in time. That is easier said than done. I would go the first way unless you have a major performance issue.
{
"queryType": "groupBy",
"dataSource": "wikiticker",
"granularity": "all",
"dimensions": [
{
"type": "extraction",
"dimension": "__time",
"outputName": "extract_time",
"extractionFn": {
"type": "timeFormat"
}
}
],
"limitSpec": {
"type": "default",
"limit": 10,
"columns": [
{
"dimension": "extract_time",
"direction": "descending"
}
]
},
"aggregations": [
{
"type": "count",
"name": "$f2"
},
{
"type": "longMax",
"name": "$f3",
"fieldName": "added"
}
],
"intervals": [
"1900-01-01T00:00:00.000/3000-01-01T00:00:00.000"
]
}
I am trying to use Cypher to query a full-text index. It returns results, but they are not ranked. Is there a way to get the match score?
start recordEmployee=node:fidx_RecordEmployee("F01:Leela* OR F01:Ph*") return recordEmployee.F01
It returns this, and I cannot find a match score:
{
"results": [
{
"columns": [
"recordEmployee.F01"
],
"data": [
{
"row": [
"Philip"
],
"graph": {
"nodes": [],
"relationships": []
}
},
{
"row": [
"Leela"
],
"graph": {
"nodes": [],
"relationships": []
}
}
],
"stats": {
"contains_updates": false,
"nodes_created": 0,
"nodes_deleted": 0,
"properties_set": 0,
"relationships_created": 0,
"relationship_deleted": 0,
"labels_added": 0,
"labels_removed": 0,
"indexes_added": 0,
"indexes_removed": 0,
"constraints_added": 0,
"constraints_removed": 0
}
}
],
"errors": []
}
It's not possible in Cypher yet, but with stored procedures in Neo4j 3.0 it will be again.
Until then if you really need the score you can use the REST endpoint.
http://neo4j.com/docs/stable/rest-api-indexes.html#rest-api-find-node-by-query
Getting the results with a predefined ordering requires adding the request parameter
?order=<ordering>
where <ordering> is one of index, relevance or score. In this case an additional field will be added to each result, named score, that holds the float value that is the score reported by the query result.
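As a sketch of what such a request could look like for the index in the question, the Python snippet below only builds the URL. The base URL http://localhost:7474 and the /db/data/index/node/<index> path are assumptions about a default legacy REST setup, and the function name is made up; check them against the linked documentation for your version.

```python
from urllib.parse import quote, urlencode


def index_query_url(base_url: str, index_name: str, lucene_query: str,
                    order: str = "score") -> str:
    """Build a legacy index query URL that asks for ordered, scored results."""
    if order not in ("index", "relevance", "score"):
        raise ValueError("order must be one of: index, relevance, score")
    # Assumed path layout of the legacy REST index endpoint.
    path = f"/db/data/index/node/{quote(index_name, safe='')}"
    return f"{base_url}{path}?" + urlencode({"query": lucene_query, "order": order})


if __name__ == "__main__":
    print(index_query_url("http://localhost:7474", "fidx_RecordEmployee",
                          "F01:Leela* OR F01:Ph*"))
```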