I have a master event (SeriesMaster) and need to assign a batch of occurrences that do not fall into any recurrence pattern; each of them just has its own date. As far as I understand, to create an event with multiple instances I must define a recurrence pattern.
Is there a way to accomplish it?
Related
I am looking for OData query syntax that can express SUM(DATEDIFF(minute, StartDate, EndDate)), which we do in SQL Server. Is it possible to do such things using OData v4?
I tried the aggregate function but was not able to use the sum operator on the duration type. Any ideas?
You can't execute a query like that directly in a standards-compliant v4 service, because the built-in aggregates all operate on single fields. For instance, there is no support for creating a new arbitrary column to project the results into, mainly because such a column would be undefined. By restricting the specification to columns that are pre-defined in the resource itself, the service can give a strong guarantee about the structure of the data that will be returned.
If you are the author of the API, there are three common approaches that can achieve a query similar to your request.
Define a Custom Data Aggregate. This is more involved than is usually necessary, but it means you could define the aggregate once and use it in many resource queries.
Only research this solution if you truly need to reuse the same aggregate across multiple resources.
Define a Custom Function to compute the result over all or some elements in your query.
Think of a Function as similar to a SQL view: it is really just a way of expressing a custom query and a custom response object that is associated with a resource.
It is common to use Functions to apply complex filter conditions that still return the resource they are bound to, but you can return an entirely different structure of data if you want.
Exploit Open Types. This can sometimes be more effort than you expect, but it is manageable if there are only a small number of common transformations you want to apply to the resource, projecting their results as discrete properties in addition to the standard resource definition.
In your case you could project DATEDIFF(minute, StartDate, EndDate) into its own discrete property, perhaps called Minutes or Duration. Then you could $apply a simple sum across this new field.
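Once such a projected Minutes property exists on the resource, the aggregate itself is standard OData v4 $apply syntax. The Tasks resource and property names here are illustrative, not from the original post:

```
GET /Tasks?$apply=aggregate(Minutes with sum as TotalMinutes)
```

The response is then a single aggregated row containing TotalMinutes, rather than the full entity set.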
Exposing a custom Function is usually the least-effort approach: you are not constrained by the shape of the result at all, and it can be maintained in relative isolation from the main resource. As with Open Types, the useful thing about Functions is that the caller can still apply OData aggregates to the result of the Function.
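As a rough illustration of what the caller sees (the function name, namespace, and binding are assumptions, not from the original post), a collection-bound Function that computes the total on the server might be invoked like this:

```
GET /Tasks/Namespace.TotalMinutes()
```

The server-side implementation is then free to run the SUM(DATEDIFF(...)) directly against the database.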
If the original post is updated with some more detailed code examples, I can elaborate on the function implementation; in the meantime, I hope this information sets you on the right path.
I am wondering what is the best or most idiomatic way to expose all events in a window to a custom function. The following example is constructed following the stock price style examples used in the Esper online documentation.
Suppose we have the following Esper query:
select avg(price), custom_function(price) from OrderEvent#unique(symbol)
The avg(price) part returns an average of the most recent price for each symbol. Suppose we want custom_function to work in a similar manner, but it needs complex logic: it would want to iterate over every value in the window each time the result is needed (e.g. outlier-detection methods might need such an algorithm).
To be clear, I require the algorithm to look something like:
custom_function(window):
    for each event in window:
        update calculation
and there is no clever way to update the calculation as events enter or leave the window.
A custom aggregation could achieve this by pushing and popping events to a set, but this becomes problematic when primitive types are used. It also feels wasteful: presumably Esper already has the collection of events in the window, so we would prefer not to duplicate it.
Esper docs mention many ways to customize things, see this Solution Pattern, for example. Also mentioned is that the 'pull API' can iterate all events in a window.
What approaches are considered best to solve this type of problem?
For access to all events at the same time use window(*) or prevwindow(*).
select MyLibrary.computeSomething(window(*)) from ...
The computeSomething is a public static method of class MyLibrary (or define a UDF).
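A minimal sketch of such a helper, assuming window(*) hands the method the window's underlying event objects as an array. OrderEvent here is a self-contained stand-in for the real event class, and the mean calculation is just a placeholder for whatever full-scan logic (e.g. outlier detection) you need:

```java
public class MyLibrary {
    // Minimal stand-in for the real OrderEvent class used in the EPL.
    public static class OrderEvent {
        public final String symbol;
        public final double price;
        public OrderEvent(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    // Called from EPL as MyLibrary.computeSomething(window(*)).
    // Iterates the full window on every call, as the question requires.
    public static double computeSomething(OrderEvent[] window) {
        if (window == null || window.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (OrderEvent e : window) {
            sum += e.price;  // replace with the real per-event update
        }
        return sum / window.length;  // placeholder result: the mean price
    }
}
```

Because the method is plain static Java, it can be unit-tested without an Esper runtime.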
For access to individual events one at a time, you could use an enumeration method. The aggregate method takes an initial value and an accumulator lambda. There is also an extension API for adding your own enumeration methods.
select window(*).aggregate(..., (value, eventitem) => ...) from ...
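Filled in for the running price example (the initial value and lambda body are illustrative), the aggregate enumeration method might look like this:

```
select window(*).aggregate(0.0, (result, item) => result + item.price) as totalPrice
from OrderEvent#unique(symbol)
```

Here 0.0 is the initial value, and the lambda is invoked once per event in the window, accumulating into result.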
See the Esper documentation for window(*), for the aggregate enumeration method, and for the enumeration-method extension API.
I am just getting into Graph databases and need advice.
For this example, I have a 'Person' node and a 'Project' node with two relationships between the two. The two relationships are:
A scheduled date, this is the projected finished date
A verified date, this is the actual finished date
Both are from the Person to the Project.
I am specifically referring to using a relationship property to hold the date value of the event. Are there any downsides to this, or is there a better way to model this in a graph?
A simple mock up is below:
It is easier to hold dates as Unix epoch timestamps (stored as long integers) rather than as Julian dates, since Neo4j has no built-in date/time format.
Timestamps can be used to perform calculations on the dates, such as finding how many days behind schedule a project is relative to the current date.
The timestamp() function in Cypher returns the current Unix time, in milliseconds, within Neo4j.
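For instance, with millisecond epoch timestamps a schedule check could look like the following (the node, relationship, and property names match the example statements in this answer and are otherwise assumptions):

```cypher
// How many whole days past the scheduled completion are we right now?
MATCH (p:Person)-[r:Works_On]->(pro:Project {name:'Jabberwakie'})
RETURN (timestamp() - r.scheduledcompletion) / (1000 * 60 * 60 * 24) AS daysBehind
```

Cypher performs integer division on longs here, so the result is a whole number of days.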
Each relationship in Neo4j takes up 34 bytes internally, excluding the actual content of the relationship. It might be more efficient to hold both the scheduled completion and the verified completion as properties on a single relationship rather than storing them as two relationships.
A relationship does not need to have both the scheduled date and the verified date at the same time (one of the advantages of a schema-free model). You can add the verified date later using the SET keyword.
Just to give you an example.
Use the following Cypher statement to create the data. The date is a millisecond epoch timestamp, to match the unit that timestamp() returns:
CREATE (p:Person {name:'Bill'})-[r:Works_On {scheduledcompletion: 1461801600000}]->(pro:Project {name:'Jabberwakie'})
Use the following Cypher statement to set the verified date to the current time:
MATCH (p:Person {name:'Bill'})-[r:Works_On]->(pro:Project {name:'Jabberwakie'}) SET r.verifiedcompletion = timestamp()
Use the following Cypher statement to perform a calculation, in this case returning a boolean indicating whether the project finished behind schedule (i.e. the verified completion came after the scheduled completion):
MATCH (p:Person {name:'Bill'})-[r:Works_On]->(pro:Project {name:'Jabberwakie'}) RETURN r.verifiedcompletion > r.scheduledcompletion AS behindschedule
Also think about storing the projected and actual finish dates on the Project node itself when they apply to the whole project and are the same for all persons related to it.
This avoids duplicating data and makes querying projects by these properties faster, as you won't have to look at relationships. If your model is designed to have different dates for different persons, it can still make sense to store combined, project-wide information on the Project node in addition to the per-person dates on the relationships: it keeps the model clearer and makes some queries faster to execute.
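A minimal sketch of that node-level variant, reusing the property names from the earlier statements (they are otherwise assumptions):

```cypher
// Project-wide dates live on the Project node itself
MATCH (pro:Project {name:'Jabberwakie'})
SET pro.scheduledcompletion = 1461801600000,
    pro.verifiedcompletion  = timestamp()
```

Queries that only need project-level dates can then match the Project node directly, without expanding any relationships.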
We have items in our app that form a tree-like structure. You might have a pattern like the following:
(c:card)-[:child]->(subcard:card)-[:child]->(subsubcard:card) ... etc
Every time an operation is performed on a card (at any level), we'd like to record it. Here are some possible events:
The title of a card was updated by Bob
A comment was added by Kate mentioning Joe
The status of a card changed from pending to approved
The linked list approach seems popular, but given the sorts of queries we'd like to perform, I'm not sure it works best for us.
Here are the main queries we will be running:
All of the activity associated with a particular card AND child cards, sorted by time of the event (basically we'd like to merge all of these activity feeds together)
All of the activity associated with a particular person sorted by time
On top of that we'd like to add filters like the following:
Filter by person involved
Filter by time period
It is also important to note that cards may be re-arranged very frequently. In other words, the parents may change.
Any ideas on how to best model something like this? Thanks!
I have a couple of suggestions, but I would suggest benchmarking them.
The linked list approach might be good if you could use the Java APIs (perhaps via an unmanaged extension for Neo4j). If the newest event in the list were the one attached to the card (the list being essentially ordered by event date down the line), then when filtering by time you could terminate early once you found an event earlier than the specified time.
Attaching the events directly to the card has the potential to lead you into problems with supernodes/dense nodes, though it would be the simplest to query in Cypher. The problem is that Cypher will look at all of the events before filtering. You could perhaps improve query performance by placing the date/time of the event on the relationships to the node ((:Card)-[:HAS_EVENT]->(:Event) or (:Event)-[:PERFORMED_BY]->(:Person)), in addition to placing it on the event node itself. Then your queries can filter on the relationships without needing to traverse to the nodes.
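A sketch of filtering on such a relationship property (the occurredAt property name and the literal timestamp are assumptions for illustration):

```cypher
// Filter by the time stored on the relationship, so Cypher can discard
// old events without visiting the Event nodes themselves
MATCH (c:Card {uuid: 'id_here'})-[r:HAS_EVENT]->(event:Event)
WHERE r.occurredAt > 1461801600000
RETURN event
```

The trade-off is duplicating the timestamp on both the relationship and the node, which must be kept in sync on writes.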
Regardless, it would probably be helpful to break up the query like so:
MATCH (c:Card {uuid: 'id_here'})-[:child*0..]->(child:Card)
WITH child
MATCH (child)-[:HAS_EVENT]->(event:Event)
I think that would mean that the MATCH is going to have fewer permutations of paths that it will need to evaluate.
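Extending that shape into a full feed query for requirement 1 (merged, time-sorted activity for a card and its children) might look like the following; the event's timestamp property name is an assumption:

```cypher
// Collect the card and all of its descendants, then merge their events
MATCH (c:Card {uuid: 'id_here'})-[:child*0..]->(child:Card)
WITH child
MATCH (child)-[:HAS_EVENT]->(event:Event)
RETURN event
ORDER BY event.timestamp DESC
```

Person and time-period filters would slot in as additional MATCH/WHERE clauses before the RETURN.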
Others are welcome to supplement my dubious advice as I've never really dealt with supernodes personally, just read about them ;)
There are several possible ways I can think of to store and then query temporal data in Neo4j. Looking at an example of being able to search for recurring events and any exceptions, I can see two possibilities:
One easy option would be to create a node for each occurrence of the event. Whilst it would be easy to construct a Cypher query to find all events on a day, in a range, etc., this could create a lot of unnecessary nodes. It would, however, make it very easy to change individual events' times, locations, etc., because there is already a node holding the basic information.
The second option is to store the recurrence temporal pattern as a property of the event node. This would greatly reduce the number of nodes in the graph. When searching for events on a specific date or within a range, all nodes meeting the start/end date (plus any other) criteria could be returned to the client. It then boils down to iterating through the results to pluck out the subset whose temporal pattern gives a date within the search range, then comparing that to any exceptions and merging (or ignoring) the results as necessary (this could probably be partially achieved as part of the query that pulls the initial result set).
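A minimal sketch of this second option (the property names, millisecond epoch dates, and the recurrence encoding are all assumptions, not from the original post):

```cypher
// Recurrence pattern stored directly on the event node
CREATE (e:Event {title: 'Standup',
                 startDate: 1409529600000,   // first occurrence
                 endDate:   1441065600000,   // recurrence stops here
                 recurrence: 'WEEKLY',
                 interval: 1})
```

The query layer would first range-match on startDate/endDate, then expand the recurrence client-side to find concrete occurrences.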
Whilst the second option is the one I would currently choose, it seems quite inefficient in that it processes the data twice, albeit over a smaller subset the second time. Even a Neo4j plugin would probably result in two passes through the data, although the processing would then be done on the database server rather than on the requesting client.
What I would like to know is whether it is possible to use Cypher or Neo4j to do this processing as part of the initial query?
Whilst I'm not 100% sure I understand your requirement, I'd have a look at this blog post; perhaps you'll find a bit of inspiration there: http://graphaware.com/neo4j/2014/08/20/graphaware-neo4j-timetree.html