Get combined results from different measurements - influxdb

I have two different measurements with different fields but with same tags.
Example:
Name: measurement1
Tags: prd
-------------------
time prd id value
x.x.x AV 1234 12345
y.y.y SK 567 5678
Name: measurement2
Tags: prd
-------------------
time prd name dept
a.a.a SK Rob 000
b.b.b AV Jack 111
From the above example,
Select * from measurement1, measurement2 where prd='AV'
This query doesn't combine all field values as result.
Can I get the below result ? Is it possible to combine field values from two different measurements based on a tag value ?
time prd id value name dept
x.x.x AV 1234 12345 - -
b.b.b AV - - Jack 111
I'm new to InfluxDB. Please bear with this question if it is silly.

That's not possible. Individual measurements will be grouped by the measurement name.
It is not possible to combine fields from multiple measurements, nor perform functions across measurements.
In output data, each measurement's fields and values are returned per measurement, eg:
name: measurement1
<fields>
---------
<data>
name: measurement2
<fields>
---------
<data>

Related

downsampling: get constant value. E.g. sensor name from GROUP BY

I created a continuous query to downsample readings from temperature sensors in my influxdb to store hourly means for a longer time. There are readings of multiple sensors in one table. Upon executing the query, the sensors ip is missing.
Basic data looks like this:
> SELECT ip,tC FROM ht LIMIT 5
name: ht
time ip tC
---- -- --
1671057540000000000 192.168.0.83 21
1671057570000000000 192.168.0.83 21
1671057750000000000 192.168.0.17 21.38
The continuous query (simplified without CREATE ... END):
SELECT last(ip), mean("tC") AS "mean_temp" INTO "downsampled"."ht_downsampled" FROM "ht" GROUP BY time(1h),ip
The issue is, the value of 'ip' is only a tag, not the value in the table and subsequently is missing in the table the query inserts into:
name: ht
tags: ip=192.168.0.17
time ip mean_temp mean_hum
---- -- --------- --------
1671055200000000000 21.47 42.75
1671058800000000000 21.39428571428571 48.785714285714285
1671062400000000000 21.314999999999998 51.625
Why is last(ip) not producing any value?
Can I get the value from the 'tags' into the table?
Is there a different approach to group data with a constant value?
Could you just try query the ip instead of the last(ip) since you are grouping by the ip in the statement already?
Sample code:
SELECT ip, mean("tC") AS "mean_temp" INTO "downsampled"."ht_downsampled" FROM "ht" GROUP BY time(1h), ip

InfluxDB merge different Field Value to single printout

I want to get the one line printout by mix/merge different field value, like the following , (Mix or Merge is method what I assume)
select mix(value) from test group by “name”
name: test
tags: name=“case1”
time mix/merge
1970-01-01T00:00:00Z failed,passed,skipped
select * from test group by “name”
name: test
tags: name=“case1”
time caseAuthor caseName caseResult value
2018-07-20T03:51:42.599533888Z mike case1 pass 1
2018-07-20T03:51:42.690955475Z mike case1 failed 2
2018-07-20T03:51:42.723272883Z mike case1 skipped 3
thanks for any help
BRs
/Yijun
As for v1.6 there is no "join" or "concat" aggregation function in InfluxQL. All you can do is running
SELECT DISTINCT("caseResult")
FROM "test"
GROUP BY “name”
and then join values by "comma" separator in each group with some bash script or the programming language executing a query.
And actually, even though InfluxDB supports various data types (e.g. strings and booleans) InfluxQL support for processing data of types other then numeric is very-very poor in my opinion...

Influxdb querying values from 2 measurements and using SUM() for the total value

select SUM(value)
from /measurment1|measurment2/
where time > now() - 60m and host = 'hostname' limit 2;
Name: measurment1
time sum
---- ---
1505749307008583382 4680247
name: measurment2
time sum
---- ---
1505749307008583382 3004489
But is it possible to get value of SUM(measurment1+measurment2) , so that I see only o/p .
Not possible in influx query language. It does not support functions across measurements.
If this is something you require, you may be interested in layering another API on top of influx that do this, like Graphite via Influxgraph.
For the above, something like this.
/etc/graphite-api.yaml:
finders:
- influxgraph.InfluxDBFinder
influxdb:
db: <your database>
templates:
# Produces metric paths like 'measurement1.hostname.value'
- measurement.host.field*
Start the graphite-api/influxgraph webapp.
A query /render?from=-60min&target=sum(*.hostname.value) then produces the sum of value on tag host='hostname' for all measurements.
{measurement1,measurement2}.hostname.value can be used instead to limit it to specific measurements.
NB - Performance wise (of influx), best to have multiple values in the same measurement rather than same value field name in multiple measurements.

Query Influxdb based on tags?

I have started playing around with Influxdb v0.13 and I have some dummy values in my test db where id is a tag and value is a field:
> SELECT * FROM dummy
name: dummy
--------------
time id value
1468276508161069051 1234 12345
1468276539152853428 613 352
1468276543470535110 613 4899
1468276553853436191 1234 12
I get no results returned when I run this query:
> SELECT * FROM dummy WHERE id=1234
but I do get the following when querying with the field instead:
> SELECT * FROM dummy WHERE value=12
name: dummy
--------------
time id value
1468276553853436191 1234 12
Am I doing something wrong here? I thought the point of tags were to be queried (since they are indexed and fields are not), but they seem to break my queries.
It appears that Influx will treat every tag key and value we insert as string and this is evidently shown in their official documentation.
See: https://docs.influxdata.com/influxdb/v0.13/guides/writing_data/
When writing points, you must specify an existing database in the db
query parameter. See the HTTP section on the Write Syntax page for a
complete list of the available query parameters. The body of the POST
- we call this the Line Protocol - contains the time-series data that you wish to store. They consist of a measurement, tags, fields, and a
timestamp. InfluxDB requires a measurement name. Strictly speaking,
tags are optional but most series include tags to differentiate data
sources and to make querying both easy and efficient. Both tag keys
and tag values are strings.
Note: the text in bold.
Hence to filter by tag key value - the query must be enquoted.
Example:
SELECT * FROM dummy WHERE id='1234'

Cleaning data in SPSS with name misspellings

I have a 5M records dataset in this basic format:
FName LName UniqueID DOB
John Smith 987678 10/08/1976
John Smith 987678 10/08/1976
Mary Martin 567834 2/08/1980
John Smit 987678 10/08/1976
Mary Martin 768987 2/08/1980
The DOB is always unique, but I have cases where:
Same ID, different name spellings or Different ID, same name
I got as far as making SPSS recognize that John Smit and John Smith with the same DOB are the same people, and I used aggregate to show how many times a spelling was used near the name (John Smith, 10; John Smit 5).
Case 1:
What I would like to do is to loop through all the records for the people identified to be the same person, and get the most common spelling of the person's name and use that as their standard name.
Case 2:
If I have multiple IDs for the same person, take the lowest one and make that the standard.
I am comfortable using basic syntax to clean my data, but this is the only thing that I'm stuck on.
If UniqueID is a real unique ID of individuals in the population and you are wanting to find variations of name spellings (within groupings of these IDs) and assign the modal occurrence then something like this would work:
STRING FirstLastName (A99).
COMPUTE FirstLastName = CONCAT(FName," ", LName").
AGGREGATE OUTFILE= * MODE=ADDVARIABLES /BREAK=UniqueID FirstLastName /Count=N.
AGGREGATE OUTFILE= * MODE=ADDVARIABLES /BREAK=UniqueID /MaxCount=MAX(Count).
IF (Count<>MaxCount) FirstLastName =$SYSMIS.
AGGREGATE OUTFILE= * MODE=ADDVARIABLES OVERWRITE=YES /BREAK=UniqueID /FirstLastName=MAX(FirstLastName).
You could then also overwrite the FName and LName fields also but then more assumptions would have to be made, if for example, FName or LName can contain space characters ect.

Resources