Return values for dimensions in a Druid timeseries query

I have a Druid timeseries query:
{
"queryType": "timeseries",
"dataSource": {
"type": "union",
"dataSources": [
"ds1",
"ds2"
]
},
"dimensions":["dim1"],
"aggregations": [
{
"name": "y1",
"type": "doubleMax",
"fieldName": "value1"
}
],
"granularity": {
"period": "PT10S",
"type": "period"
},
"postAggregations": [],
"intervals": "2017-06-09T13:05:46.000Z/2017-06-09T13:06:46.000Z"
}
And I want to return the values of the dimensions as well, not just the aggregations, like this:
{
"timestamp": "2017-06-09T13:05:40.000Z",
"result": {
"y1": 28.724306106567383
}
},
{
"timestamp": "2017-06-09T13:05:50.000Z",
"result": {
"y1": 28.724306106567383
}
},
How do I need to change the query? Thanks in advance!

If you need a dimension column in a timeseries-style query, you are combining aggregated data with a non-aggregated column, and that requirement leads you to a topN or groupBy query instead.
groupBy is probably the most powerful query type Druid currently supports, but it also has the poorest performance; for this purpose a topN query is the better choice.
The topN documentation and examples can be found here:
http://druid.io/docs/latest/querying/topnquery.html
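As a rough sketch (not part of the original answer), the timeseries query above could be rewritten as a topN query that groups on dim1 and keeps the doubleMax aggregation; the threshold of 10 and ordering by the y1 metric are assumptions you would adjust for your data:
{
  "queryType": "topN",
  "dataSource": {
    "type": "union",
    "dataSources": ["ds1", "ds2"]
  },
  "dimension": "dim1",
  "metric": "y1",
  "threshold": 10,
  "granularity": {
    "period": "PT10S",
    "type": "period"
  },
  "aggregations": [
    {
      "name": "y1",
      "type": "doubleMax",
      "fieldName": "value1"
    }
  ],
  "intervals": "2017-06-09T13:05:46.000Z/2017-06-09T13:06:46.000Z"
}
Each granularity bucket in the result then carries the dim1 value alongside y1.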

Does the timeseries query not support dimensions?
I tried it in my project but it is not working.
Here is the error:
TypeError: queryRep.dataSource(...).dimension is not a function
2|DSP-api | at dimensionData (/home/ec2-user/reports/dsp_reports/controllers/ReportController.js:228:22)
Let me know if anyone has a solution for this.
TY.

Related

InfluxDB query doesn't work with the "WHERE" clause

I have a problem with the WHERE clause in a URL query. In short, this works:
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE time='2021-05-19T11:21:11.448Z'
this doesn’t:
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE lon='23.53815'
The difference: in the first statement I use 'time' in the WHERE clause, and in the second one I use 'lon' instead:
WHERE time='2021-05-19T11:21:11.448Z' vs. WHERE lon='23.53815'
It doesn’t make sense to me why the second one doesn’t work. Any help would be much appreciated. Thanks.
P.S. Here’s an output of these two:
#1:
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "navigation.position",
"columns": [
"time",
"lat"
],
"values": [
[
"2021-05-19T11:21:11.448Z",
60.084066666666665
]
]
}
]
}
]
}
#2
{
"results": [
{
"statement_id": 0
}
]
}
It actually makes sense: lat (and lon) is not a string type but a float, so comparing it against a quoted string never matches.
Filtering as if lat were a string:
lat='60.084066666666665'
vs. filtering lat as the float it really is:
lat=60.084066666666665
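Applied to the failing query above, dropping the quotes around the lon value should make it return data (assuming the stored value is exactly 23.53815):
http://localhost:8086/query?pretty=true&db=boatdata&q=SELECT time,lat FROM "navigation.position" WHERE lon=23.53815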

Druid - Order data by timestamp column

I've set up a Druid cluster to ingest real-time data from Kafka.
Question
Does Druid support fetching data sorted by timestamp? For example, let's say I need to retrieve the latest 10 entries from a datasource X. Can I do this with a LimitSpec (in the query JSON) that includes the timestamp field, or is there a better option that Druid supports?
Thanks in advance.
Get unaggregated rows
To get unaggregated rows, you can run a query with "queryType": "select".
Select queries are also useful when pagination is needed - they let you set a page size, and automatically return a paging identifier for use in future queries.
In this example, if we just want the top 10 rows, we can pass in "pagingSpec": { "pageIdentifiers": {}, "threshold": 10 }.
Order by timestamp
To order these rows by "timestamp", you can pass in "descending": "true".
Looks like most Druid query types support the descending property.
Example Query:
{
"queryType": "select",
"dataSource": "my_data_source",
"granularity": "all",
"intervals": [ "2017-01-01T00:00:00.000Z/2017-12-30T00:00:00.000Z" ],
"descending": "true",
"pagingSpec": { "pageIdentifiers": {}, "threshold": 10 }
}
Docs on "select" type queries
You can use a groupBy query to do this: group on __time with an extraction function, set the granularity to all, and use the limitSpec to sort and limit; that will work. If you want to use a timeseries query instead, getting the latest 10 is trickier. One way is to set the granularity to the one you want, say hour, and set the interval to the 10 hours ending at the most recent point in time, but that is easier said than done. I would go with the first approach unless you run into a major performance issue.
{
"queryType": "groupBy",
"dataSource": "wikiticker",
"granularity": "all",
"dimensions": [
{
"type": "extraction",
"dimension": "__time",
"outputName": "extract_time",
"extractionFn": {
"type": "timeFormat"
}
}
],
"limitSpec": {
"type": "default",
"limit": 10,
"columns": [
{
"dimension": "extract_time",
"direction": "descending"
}
]
},
"aggregations": [
{
"type": "count",
"name": "$f2"
},
{
"type": "longMax",
"name": "$f3",
"fieldName": "added"
}
],
"intervals": [
"1900-01-01T00:00:00.000/3000-01-01T00:00:00.000"
]
}

Youtube Data v3 - How do I filter ChannelSections query by targeting

This query:
https://www.googleapis.com/youtube/v3/channelSections?part=snippet%2CcontentDetails%2Ctargeting&channelId=UC-9-kyTW8ZkZNDHQJ6FgpwQ&key={YOUR_API_KEY}
returns lots of data like this:
{
"kind": "youtube#channelSection",
"etag": "\"iDqJ1j7zKs4x3o3ZsFlBOwgWAHU/oIyqO89jk-vcfHm5Kuz3sikdUzc\"",
"id": "UC-9-kyTW8ZkZNDHQJ6FgpwQ.lc3PRFGaA4k",
"snippet": {
"type": "singlePlaylist",
"style": "horizontalRow",
"channelId": "UC-9-kyTW8ZkZNDHQJ6FgpwQ",
"position": 0
},
"contentDetails": {
"playlists": [
"PLFgquLnL59alW3xmYiWRaoz0oM3H17Lth"
]
},
"targeting": {
"regions": [
"US"
]
}
},
Is there any way to fetch only items with specific region?
Thanks for any help.
You can't filter channel sections by region directly in the query like that. What you'd have to do is get the list of ChannelSections from that channel, store it in an object, and check whether the targeted region of each channel section matches what you want (i.e. whether the value in targeting.regions[] equals XX, where XX is the region you're looking for). Then you could collect the channel sections you wanted into an array and return that. If you're concerned about response time, you'd have to set up a server that does all of that for you.

JSON-LD normalization - ignore JSON nesting

I'm working on JSON-LD serialization, and ideally I would like to have a @context which I can add to the existing GeoJSON output (together with some @ids and @types), so that both the Turtle output and the JSON-LD output will normalize to the same triples.
Data is organized as follows: each object/feature has an ID and a name, and data on one or more layers. Per layer, there is a data field, which contains a JSON object.
Example GeoJSON output:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {
"id": "admr.nl.appingedam",
"name": "Appingedam",
"layers": {
"cbs": {
"data": {
"name": "Appingedam",
"population": 1092
}
},
"admr": {
"data": {
"name": "Appingedam",
"gme_code": 4654,
"admn_level": 3
}
}
}
},
"geometry": {…}
}
]
}
Example Turtle output:
<admr.nl.appingedam>
a :Node ;
dc:title "Appingedam" ;
:createdOnLayer <layer/admr> ;
:layerData <admr.nl.appingedam/admr> ;
:layerData <admr.nl.appingedam/cbs> .
<admr.nl.appingedam/admr>
a :LayerData ;
:definedOnLayer <layer/admr> ;
<layer/admr/name> "Appingedam" ;
<layer/admr/gme_code> "4654" ;
<layer/admr/admn_level> "3" .
<admr.nl.appingedam/cbs>
a :LayerData ;
:definedOnLayer <layer/cbs> ;
<layer/cbs/name> "Appingedam" ;
<layer/cbs/population> "1092" .
The properties object does not have its own URI. Is there a way to create a JSON-LD context which takes the contents of properties into account, but otherwise 'ignores' its presence?
Answered by Gregg Kellogg on the JSON-LD mailing list:
This is something that keeps coming up: having a transparent layer that basically folds properties up a level. This was discussed during the development of JSON-LD, but ultimately it was rejected.
I don't see any prospects for doing something in the short term, but it could be revisited in a possible future WG chartered with revising the spec. Feedback like this is quite useful.
In the meantime, you can play with different JSON-LD encodings that match your RDF through tools like http://json-ld.org/playground and my own http://rdf.greggkellogg.net/distiller.
Gregg

Importing relationships with Core Data and Magical Record

I am getting JSON data from a web service and trying to store it in Core Data with MagicalRecord. I have read the great post (and only documentation?) "Importing data made easy" by Saul Mora, but I still do not really understand what I need to do to get all the data into my entities.
Here is the JSON the web service returns:
{
"ApiVersion": 4,
"AvailableFileSystemLibraries": [
{
"Id": 10,
"Name": "Movie Shares",
"Version": "0.5.4.0"
},
{
"Id": 11,
"Name": "Picture Shares",
"Version": "0.5.4.0"
},
{
"Id": 5,
"Name": "Shares",
"Version": "0.5.4.0"
},
{
"Id": 9,
"Name": "Music Shares",
"Version": "0.5.4.0"
}
],
"AvailableMovieLibraries": [
{
"Id": 3,
"Name": "Moving Pictures",
"Version": "0.5.4.0"
},
{
"Id": 7,
"Name": "MyVideo",
"Version": "0.5.4.0"
}
],
"AvailableMusicLibraries": [
{
"Id": 4,
"Name": "MyMusic",
"Version": "0.5.4.0"
}
],
"AvailablePictureLibraries": [
{
"Id": 8,
"Name": "Picture Shares",
"Version": "0.5.4.0"
}
],
"AvailableTvShowLibraries": [
{
"Id": 6,
"Name": "MP-TVSeries",
"Version": "0.5.4.0"
}
],
"DefaultFileSystemLibrary": 5,
"DefaultMovieLibrary": 3,
"DefaultMusicLibrary": 4,
"DefaultPictureLibrary": 0,
"DefaultTvShowLibrary": 6,
"ServiceVersion": "0.5.4"
}
The entities I want to store that data in look like this:
There is also a Server entity with a 1:1 relationship to ServerInfo.
What I want to do:
Store basic data (ApiVersion, ...) in ServerInfo. This I already got to work.
Store each object in AvailableXYLibraries in BackendLibrary (1:n relationship from ServerInfo).
Set type based on the XY part of AvailableXYLibraries, for example "movie" for AvailableMovieLibraries.
Set defaultLibrary to true if this library is referenced by DefaultXYLibrary.
Set providerId to servername + LibraryId as there are multiple servers that can have BackendLibraries with the same numeric ID.
Is this possible with MagicalRecord? I guess I need to implement some of the import hooks and set some user info keys, but nothing I have read really tells me which user info key to set where, or which method to implement where and how.
I hope this made sense and that you can give me some hints :) Thanks!
The structure of this data is quite a bit different from your Core Data model, so what you'll most likely have to do is iterate a bit over the dictionary. That is, there are various collections of library data, e.g. AvailableFileSystemLibraries, AvailableMovieLibraries, etc. You'll have to get the array out of each of those keys and then map your entities as I described in the article. To kick off the process, you'll have to call
[BackendLibrary importFromArray:arrayFromDownloadedDictionary];
where arrayFromDownloadedDictionary is each of the arrays in the example dictionary you've posted. Once you give the array to MagicalRecord, and provided you've set up the proper field mapping, MagicalRecord will import and create all the entities for you.
Make sure you map "Id" to BackendLibrary.id, "Name" to BackendLibrary.name, and "Version" to BackendLibrary.version.
