Using InfluxDB: Is there any way to build a time-bucketed report of a field value representing a state that persists over time? Ideally in InfluxQL query language
More specifically as an example: Say a measurement contains points that report changes in the light bulb state (On / Off). They could be 0s and 1s as in the example below, or any other value. For example:
time light
---- -----
2022-03-18T00:00:00Z 1
2022-03-18T01:05:00Z 0
2022-03-18T01:55:00Z 0
2022-03-18T02:30:00Z 1
2022-03-18T04:06:00Z 0
The result should be a listing of intervals indicating if this light was on or off during each time interval (e.g. hours), or what percentage of that time it was on. For the given example, the result if grouping hourly should be:
Hour
Value
2022-03-18 00:00
1.00
2022-03-18 01:00
0.17
2022-03-18 02:00
0.50
2022-03-18 03:00
1.00
2022-03-18 04:00
0.10
Note that:
for 1am bucket, even if the light starts and ends in On state, it was On for only 10 over 60 minutes, so the value is low (10/60)
and more importantly the bucket from 3am to 4am has value "1" as the light was On since the last period, even if there was no change in this time period. This rules out usage of simple aggregation (e.g. MEAN) over a GROUP BY TIME(), as there would not be any way to know if an empty/missing bucket corresponds to an On or Off state as it only depends on the last reported value before that time bucket.
Is there a way to implement it in pure InfluxQL, without retrieving potentially big data sets (points) and iterating through them in a client?
I consider that raw data could be obtained by query:
SELECT "light" FROM "test3" WHERE $timeFilter
Where "test3" is your measurement name and $timeFilter is from... to... time period.
In this case we need to use a subquery which will fill our data, let's consider grouping (resolution) time as 1s:
SELECT last("light") as "filled_light" FROM "test3" WHERE $timeFilter GROUP BY time(1s) fill(previous)
This query gives us 1/0 value every 1s. We will use it as a subquery.
NOTE: You should be informed that this way does not consider if beginning of data period within $timeFilter has been started with light on or off. This way will not provide any data before hour with any value within $timeFilter.
In next step you should use integral() function on data you got from subquery, like this:
SELECT integral("filled_light",1h) from (SELECT last("light") as "filled_light" FROM "test3" WHERE $timeFilter GROUP BY time(1s) fill(previous)) group by time(1h)
This is how it looks on charts:
And how Result data looks in a table:
This is not a perfect way of getting it to work but I hope it resolves your problem.
I am trying to observe historical trends on customer acquisitions (new and returning) and am looking to use a formula to automate it for me.
Essentially, I am looking to determine the average amount of new customers we acquire on a specific day, specific week, and specific month. For example: what are the average customers we have acquired every Monday for the past 6 months, or what is the average number of customers we acquire the first week of every month?
Solution:
You can use the date operators in your QUERY statement to filter by month, week, or even day of week.
Examples:
every Monday for past 6 months
=query(A1:B, "select avg(B) where datediff(todate(now()),todate(A)) < 180 and dayofweek(A) = 2", 1)
first week of every month
=query(A1:B, "select month(A),avg(B) where day(A) <= 7 group by month(A) offset 1", 1)
You would need to tweak the sample queries to cover your data range and which columns do you need to average and compare.
References:
QUERY()
Query Language Reference | Scalar Functions
I already have InfluxDB working and getting monitoring data about some services (the value is just 1 for up and 0 for down). For the management I need to select the values from the database that are in a specific month and either night time or day time. For example: I want to select all the data from April 2019 (doesn't matter if it 1 or 0) between 08:00AM and 07:00PM (day time)
Here is what I've tried:
SELECT value FROM probe_success
WHERE "instance" = 'https://myservice/api' AND time >= '08:00:00' AND time < '19:00:00'
AND time >= '2019-04-01' AND time <= '2019-04-30'
But I've got an error:
{"results":[{"statement_id":0,"error":"invalid operation: time and
*influxql.StringLiteral are not compatible"}]}
Can anyone tell me what I'm doing wrong or point me in the right direction?
Thank you very much!
Ok, so after some research I've found that it is not possible to use the "OR" statement in a "WHERE" clause to specify multiple time ranges. See this. I solved this problem by sending separate requests for each day in a for loop and then concatenate all the results. It's not the best solution, I admit, but since they say it's not possible I guess this is the workaround.
I have a two-part question about storing days of the week and time in a database. I'm using Rails 4.0, Ruby 2.0.0, and Postgres.
I have certain events, and those events have a schedule. For the event "Skydiving", for example, I might have Tuesday and Wednesday and 3 pm.
Is there a way for me to store the record for Tuesday and Wednesday in one row or should I have two records?
What is the best way to store the day and time? Is there a way to store day of week and time (not datetime) or should these be separate columns? If they should be separate, how would I store the day of the week? I was thinking of storing them as integer values, 0 for Sunday, 1 for Monday, since that's how the wday method for the Time class does it.
Any suggestions would be super helpful.
Is there a way for me to store the the record for Tuesday and
Wednesday in one row or do should I have two records?
There are several ways to store multiple time ranges in a single row. #bma already provided a couple of them. That might be useful to save disk space with very simple time patterns. The clean, flexible and "normalized" approach is to store one row per time range.
What is the best way to store the day and time?
Use a timestamp (or timestamptz if multiple time zones may be involved). Pick an arbitrary "staging" week and just ignore the date part while using the day and time aspect of the timestamp. Simplest and fastest in my experience, and all date and time related sanity-checks are built-in automatically. I use a range starting with 1996-01-01 00:00 for several similar applications for two reasons:
The first 7 days of the week coincide with the day of the month (for sun = 7).
It's the most recent leap year (providing Feb. 29 for yearly patterns) at the same time.
Range type
Since you are actually dealing with time ranges (not just "day and time") I suggest to use the built-in range type tsrange (or tstzrange). A major advantage: you can use the arsenal of built-in Range Functions and Operators. Requires Postgres 9.2 or later.
For instance, you can have an exclusion constraint building on that (implemented internally by way of a fully functional GiST index that may provide additional benefit), to rule out overlapping time ranges. Consider this related answer for details:
Preventing adjacent/overlapping entries with EXCLUDE in PostgreSQL
For this particular exclusion constraint (no overlapping ranges per event), you need to include the integer column event_id in the constraint, so you need to install the additional module btree_gist. Install once per database with:
CREATE EXTENSION btree_gist; -- once per db
Or you can have one simple CHECK constraint to restrict the allowed time period using the "range is contained by" operator <#.
Could look like this:
CREATE TABLE event (event_id serial PRIMARY KEY, ...);
CREATE TABLE schedule (
event_id integer NOT NULL REFERENCES event(event_id)
ON DELETE CASCADE ON UPDATE CASCADE
, t_range tsrange
, PRIMARY KEY (event_id, t_range)
, CHECK (t_range <# '[1996-01-01 00:00, 1996-01-09 00:00)') -- restrict period
, EXCLUDE USING gist (event_id WITH =, t_range WITH &&) -- disallow overlap
);
For a weekly schedule use the first seven days, Mon-Sun, or whatever suits you. Monthly or yearly schedules in a similar fashion.
How to extract day of week, time, etc?
#CDub provided a module to deal with it on the Ruby end. I can't comment on that, but you can do everything in Postgres as well, with impeccable performance.
SELECT ts::time AS t_time -- get the time (practically no cost)
SELECT EXTRACT(DOW FROM ts) AS dow -- get day of week (very cheap)
Or in similar fashion for range types:
SELECT EXTRACT(DOW FROM lower(t_range)) AS dow_from -- day of week lower bound
, EXTRACT(DOW FROM upper(t_range)) AS dow_to -- same for upper
, lower(t_range)::time AS time_from -- start time
, upper(t_range)::time AS time_to -- end time
FROM schedule;
db<>fiddle here
Old sqliddle
ISODOW instead of DOW for EXTRACT() returns 7 instead of 0 for sundays. There is a long list of what you can extract.
This related answer demonstrates how to use range type operator to compute a total duration for time ranges (last chapter):
Calculate working hours between 2 dates in PostgreSQL
Check out the ice_cube gem (link).
It can create a schedule object for you which you can persist to your database. You need not create two separate records. For the second part, you can create schedule based on any rule and you need not worry on how that will be saved in the database. You can use the methods provided by the gem to get whatever information you want from the persisted schedule object.
Depending how complex your scheduling needs are, you might want to have a look at RFC 5545, the iCalendar scheduling data format, for ideas on how to store the data.
If you needs are pretty simple, than that is probably overkill. Postgresql has many functions to convert date and time to whatever format you need.
For a simple way to store relative dates and times, you could store the day of week as an integer as you suggested, and the time as a TIME datatype. If you can have multiple days of the week that are valid, you might want to use an ARRAY.
Eg.
ARRAY[2,3]::INTEGER[] = Tues, Wed as Day of Week
'15:00:00'::TIME = 3pm
[EDIT: Add some simple examples]
/* Custom the time and timetz range types */
CREATE TYPE timerange AS RANGE (subtype = time);
--drop table if exists schedule;
create table schedule (
event_id integer not null, /* should be an FK to "events" table */
day_of_week integer[],
time_of_day time,
time_range timerange,
recurring text CHECK (recurring IN ('DAILY','WEEKLY','MONTHLY','YEARLY'))
);
insert into schedule (event_id, day_of_week, time_of_day, time_range, recurring)
values
(1, ARRAY[1,2,3,4,5]::INTEGER[], '15:00:00'::TIME, NULL, 'WEEKLY'),
(2, ARRAY[6,0]::INTEGER[], NULL, '(08:00:00,17:00:00]'::timerange, 'WEEKLY');
select * from schedule;
event_id | day_of_week | time_of_day | time_range | recurring
----------+-------------+-------------+---------------------+-----------
1 | {1,2,3,4,5} | 15:00:00 | | WEEKLY
2 | {6,0} | | (08:00:00,17:00:00] | WEEKLY
The first entry could be read as: the event is valid at 3pm Mon - Fri, with this schedule occurring every week.
The second entry could be read as: the event is valid Saturday and Sunday between 8am and 5pm, occurring every week.
The custom range type "timerange" is used to denote the lower and upper boundaries of your time range.
The '(' means "inclusive", and the trailing ']' means "exclusive", or in other words "greater than or equal to 8am and less than 5pm".
Why not just store the datestamp then use the built in functionality for Date to get the day of the week?
2.0.0p247 :139 > Date.today
=> Sun, 10 Nov 2013
2.0.0p247 :140 > Date.today.strftime("%A")
=> "Sunday"
strftime sounds like it can do everything for you. Here are the specific docs for it.
Specifically for what you're talking about, it sounds like you'd need an Event table that has_many :schedules, where a Schedule would have a start_date timestamp...
I need to grab the records for same day of the week for the preceeding X days of the week. There must be a better way to do it than this:
Transaction.find_by_sql "select * from transactions where EXTRACT(DOW from date) = 1 and organisation_id = 4 order by date desc limit 7"
It gets me what I need but is Postgres specific and not very "Rails-y". Date is a timestamp.
Anyone got suggestions?
How many days do you want to go back?
I have written a gem called by_star that has a dynamic finder suited for finding up to a certain number of days in the past. If the number of days was always a number you could use this finder:
Transaction.as_of_3_days_ago
If it was dynamic then I would recommend using something such as future or between, depending on if you have transactions in the future (i.e. time travel):
Transaction.future(params[:start_date].to_time)
Transaction.between(params[:start_date].to_time, Time.now)
AFAIK Rails has no any methods to do this by other way. So best, and faster, solution - build DOW index on date column and use your query.