aggregate functions for boolean field - influxdb

I want a continuous query on a stream of boolean fields, to downsample them.
So I need an aggregate function to convert a series of booleans to one. In my case I would need AND().
I can't find such a function; in fact, none of the aggregate functions seem to work on boolean types:
ERR: unsupported sum iterator type: *influxql.booleanInterruptIterator
Is there another way to aggregate boolean values? As I understand it, custom aggregate functions are not supported.

I'm thinking it's easier to convert my booleans to 0 and 1; that would also make them easier to work with when graphing in Grafana.
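With 0/1 integers, the standard aggregates give the boolean logic back: min() over an interval behaves like AND (all samples true) and max() behaves like OR (at least one sample true). A sketch of the continuous query I have in mind, assuming a hypothetical database "mydb" and a measurement "machine_state" with an integer field "ok":

CREATE CONTINUOUS QUERY "cq_ok_1h" ON "mydb"
BEGIN
  SELECT min("ok") AS "ok"
  INTO "machine_state_1h"
  FROM "machine_state"
  GROUP BY time(1h)
END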

Does OData v4 support aggregation on date values?

I am looking for OData query syntax that can express something like Sum(DateDiff(minute, StartDate, EndDate)), which we do in SQL Server. Is it possible to do such things using OData v4?
I tried the aggregate function, but I was not able to use the sum operator on the duration type. Any ideas?
You can't execute a query like that directly in a standards-compliant v4 service, because the built-in aggregates all operate on single fields; there is, for instance, no support for creating an arbitrary new column to project the results into. This is mainly because such a column would be undefined: by restricting the specification to columns that are pre-defined in the resource itself, we get a strong level of certainty about the structure of the data that will be returned.
If you are the author of the API, there are three common approaches that can achieve a query similar to your request.
1. Define a Custom Data Aggregate. This is way more involved than is necessary here, but it means you could define the aggregate once and use it in many resource queries. Only research this solution if you truly need to reuse the same aggregate on multiple resources.
2. Define a Custom Function to compute the result over all or some elements in your query. Think of a Function as similar to a SQL View: it is really just a way of expressing a custom query and a custom response object that is associated with a resource. It is common to use Functions to apply complex filter conditions that still return the resource they are bound to, but you can return an entirely different structure of data if you want.
3. Exploit Open Types. This can sometimes be more effort than you expect, but it is manageable if there is only a small number of common transformations you want to apply to the resource, projecting their results as discrete properties in addition to the standard resource definition. In your case you could project DateDiff(minute, StartDate, EndDate) into its own discrete column, perhaps called Minutes or Duration. Then you could $apply a simple sum across this new field.
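To illustrate the Open Type route, a query against a hypothetical Tasks entity set exposing the projected Minutes property might look like this (Tasks, Minutes and TotalMinutes are placeholder names):

GET /odata/Tasks?$apply=aggregate(Minutes with sum as TotalMinutes)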
Exposing a custom Function is usually the least-effort approach, because you are not constrained by the shape of the result at all and the Function can be maintained in relative isolation from the main resource. As with Open Types, the useful thing about Functions is that the caller can still apply OData aggregates to their result.
If the original post is updated with some more detailed code examples, I can elaborate on the function implementation; in the meantime, I hope this information sets you on the right path.

How can I perform an actual count query on a Realm List collection?

This particular use case is lacking detail in the Realm docs, and Apple's NSPredicate reference is a nightmare for someone unfamiliar with the syntax. As a result, I've ended up with a bunch of interconnected questions.
The filter() and index() methods for Realm Lists have two variants, with one using NSPredicate while the other uses a string predicate: am I right to deduce from the GitHub page that the string predicate version is just a wrapper and uses NSPredicate syntax as well?
How can I perform a count query and actually get the number of entries that match a condition rather than a Results collection of objects that match said condition? Is this even possible? And is it even necessary?
Does using filter() to get a Results collection of objects actually tax system resources, or does the lazy nature of the references mean that getting the Results collection and then checking its .count is equivalent to an actual count query (a la SQL)?
What do I do if filter() isn't enough and I need to use Swift's map() or reduce() on a particular property in a Realm List collection? Is that even possible?
Basically, most of my problems are stemming from trying to work with properties of the objects stored in a Realm List rather than with the objects themselves, i.e. count how many objects in the List have a property set to a certain value, several times for different values, then figure out which of the counts is higher -- never actually retrieving any values to use directly.
There are three variants: NSPredicate, String (which wraps NSPredicate), and a closure. You shouldn't use the closure variant unless you really need it, because it decays the results into an array and prevents optimizations in the db query, since the engine has to hand your closure all of the results.
Realm supports the NSPredicate aggregate operators, including @count. See the documentation.
Results is indeed lazy, and the count can be optimized. For instance, if you are querying an indexed field, Realm can just look at the number of index entries and does not have to pull all of the records.
You can always decay Results into a Swift array with Array(results). In that case the values are copied eagerly, and you no longer have a lazy, auto-updating Results. You can then do anything you can do with a regular Array or Sequence, such as filter, map, reduce, index, first, etc.
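A minimal sketch of both points, assuming hypothetical Dog and Person models (the model and property names are made up for illustration):

import RealmSwift

class Dog: Object {
    @objc dynamic var name = ""
    @objc dynamic var color = ""
}

class Person: Object {
    let dogs = List<Dog>()
}

let realm = try! Realm()
let person = realm.objects(Person.self).first!

// filter() on a List returns a lazy Results; .count is resolved by the
// query engine without materializing the matching objects.
let brownCount = person.dogs.filter("color == %@", "brown").count

// NSPredicate collection aggregates work in string predicates too,
// e.g. all Persons whose list holds more than two dogs:
let bigPacks = realm.objects(Person.self).filter("dogs.@count > 2")

// Only decay to a plain Array when you need map/reduce; this copies
// eagerly and gives up laziness and auto-updating.
let colors = Array(person.dogs).map { $0.color }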

Specifying the primitive type of a property in a Cypher CREATE clause

Contrary to what's possible with the Java API, there doesn't seem to be a way to specify whether a numeric property is a byte, short, int or long:
CREATE (n:Test {value: 1}) RETURN n
always seems to create a long property. I've tried toInt(), but it is obviously understood in the mathematical sense of "integer" more than in the computer data type sense.
Is there some way I'm overlooking to actually force the type?
We have defined a model and want to insert test data using Cypher statements, but the code using the data then fails with a ClassCastException since the types don't match.
If you run your Cypher queries with the embedded API, you can provide parameters in a hash map with the correctly typed values.
For remote users it doesn't really matter, as everything goes through JSON serialization back and forth, which loses the type information anyway. So it is just "numeric".
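A sketch of the embedded, parameterized approach, assuming the Neo4j 3.x embedded API with db as a GraphDatabaseService (older versions use {value} rather than $value as the placeholder syntax):

import java.util.HashMap;
import java.util.Map;

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;

class TypedInsert {
    static void insertTestNode(GraphDatabaseService db) {
        Map<String, Object> params = new HashMap<>();
        params.put("value", 1); // boxed Integer, stored as an int property

        try (Transaction tx = db.beginTx()) {
            db.execute("CREATE (n:Test {value: $value}) RETURN n", params);
            tx.success();
        }
    }
}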
Why do you care about the numeric type?
You can also just use ((Number) n.getProperty("value")).xxxValue() (xxx = int, long, byte) on the consuming side.

Mnesia: how to use indexed operations correctly when selecting rows based on criteria involving multiple, indexed columns

Problem:
How to select records efficiently from a table where the select is based on criteria involving two indexed columns.
Example
I have a record,
-record(rec, {key, value, type, last_update, other_stuff}).
I have indexes on key (default), type and last_update columns
type is typically an atom or string
last_update is an integer (unix-style milliseconds since 1970)
I want, for example, all records whose type = Type and that have been updated since a specific timestamp.
I do the following (wrapped in a non-dirty transaction):
lookup_by_type(Type, Since) ->
    MatchHead = #rec{type = Type, last_update = '$1', _ = '_'},
    Guard = {'>', '$1', Since},
    Result = '$_',
    case mnesia:select(rec, [{MatchHead, [Guard], [Result]}]) of
        [] -> {error, not_found};
        Rslts -> {ok, Rslts}
    end.
Question
Is the lookup_by_type function even using the underlying indexes?
Is there a better way to utilize indexes in this case?
Is there an entirely different approach I should be taking?
Thank you all
One approach which will probably help you is to look at QLC queries. These are more SQL-like/declarative, and IIRC they will utilize indexes by themselves where possible.
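As a sketch, the QLC equivalent of your function, reusing the #rec{} record from the question:

-include_lib("stdlib/include/qlc.hrl").

lookup_by_type_qlc(Type, Since) ->
    F = fun() ->
            Q = qlc:q([R || R <- mnesia:table(rec),
                            R#rec.type =:= Type,
                            R#rec.last_update > Since]),
            qlc:e(Q)
        end,
    mnesia:transaction(F).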
But the main problem is that indexes in mnesia are hashes, and thus do not support range queries. So currently you can only index efficiently on the type field, not on the last_update field.
One way around that is to make the table an ordered_set and promote last_update into the primary key. The original key can then itself be indexed if you need fast access to it. One storage possibility is something like: {{last_update, key}, key, type, ...}. You can then answer such queries quickly because last_update is orderable.
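A sketch of that layout, using a hypothetical rec_by_time table keyed on the {LastUpdate, Key} pair; on an ordered_set, mnesia traverses keys in order, so the scan can be confined to the matching range:

-record(rec_by_time, {ts_key, key, type, other_stuff}). %% ts_key = {LastUpdate, Key}

create_table() ->
    mnesia:create_table(rec_by_time,
                        [{type, ordered_set},
                         {attributes, record_info(fields, rec_by_time)}]).

%% Run inside mnesia:transaction/1 (or a dirty context).
updated_since(Since) ->
    MatchHead = #rec_by_time{ts_key = {'$1', '_'}, _ = '_'},
    Guard = {'>', '$1', Since},
    mnesia:select(rec_by_time, [{MatchHead, [Guard], ['$_']}]).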
Another way around it is to store last_update separately. Keep a table of {last_update, key} as an ordered set, and use it to limit the number of records to scan in the larger table in a query.
Remember that mnesia is best used as a small in-memory database, so scans are not necessarily a problem: the data is in memory and thus pretty fast to traverse. Its main strength, though, is the ability to do quick key/value lookups, in a dirty way if need be.

thinking sphinx options

I'm confused about when we should use the indexes method versus the has method in Thinking Sphinx; I have only a vague, abstract idea of the difference. I was told that for date ranges we use the has method, but I could find no concrete explanation of this anywhere.
The indexes method is for fields - and fields are the textual/string data that contain words you expect people to search for.
The has method is for attributes - which are mostly integers, floats, timestamps and boolean values, which are used by developers for sorting, filtering and grouping. If you want to filter for records given a date range, then an attribute is the best tool for the job.
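A sketch of how the two sit together in an index definition, assuming Thinking Sphinx v3 and a hypothetical Article model with title/content text columns and author_id/published_at attributes:

ThinkingSphinx::Index.define :article, :with => :active_record do
  indexes title, content        # fields: full-text searchable strings
  has author_id, published_at   # attributes: for filtering/sorting/grouping
end

A date-range search then filters on the attribute:

Article.search 'rails', :with => {:published_at => 1.week.ago..Time.zone.now}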
