Best practices for events and date ranges - neo4j

I’m trying to find the best practice for storing and then querying an event like this: a user has purchased 3 items on separate dates.
Over that period, two events were held. The events were added retrospectively, well after the user purchased the items, so at the time of purchase the event was not known. I’m trying to see how many items that user purchased during each event. How should I do that?
One solution, though it sounds weird to me: when inserting an event, scan and add a relationship to all vertices that match.

Managing date-time types isn't exactly an easy task in Neo4j, even in version 3.2.
You have two options:
Hard way: convert all dates to Unix timestamp format (seconds or milliseconds since 1970) in order to calculate date ranges.
Easy, convenient way: use APOC, a set of procedures and functions available as a plugin for Neo4j. Installation can be a bit tricky, but it's worth it: it has a good number of date-time functions.
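To illustrate the "hard way", here is a minimal Ruby sketch of the timestamp idea: every date becomes an integer number of milliseconds since 1970, and "purchased during an event" becomes a plain integer range check. The purchase dates, event names, and the `to_epoch_ms` helper are made up for illustration. In the graph, the same integers would presumably be stored as properties and compared in a query.

```ruby
require "time"

# Convert an ISO 8601 string to milliseconds since the Unix epoch.
def to_epoch_ms(iso8601)
  (Time.iso8601(iso8601).to_r * 1000).to_i
end

# Hypothetical purchase timestamps for one user.
purchases = [
  "2017-01-10T12:00:00Z",
  "2017-02-05T09:30:00Z",
  "2017-03-20T18:45:00Z"
].map { |s| to_epoch_ms(s) }

# Hypothetical events, added after the purchases, as [start, end] pairs.
events = {
  "winter_sale" => ["2017-01-01T00:00:00Z", "2017-02-10T00:00:00Z"],
  "spring_fair" => ["2017-03-15T00:00:00Z", "2017-03-31T00:00:00Z"]
}

# Count the purchases whose timestamp falls inside each event window.
counts = events.transform_values do |window|
  start_ms, end_ms = window.map { |s| to_epoch_ms(s) }
  purchases.count { |p| (start_ms..end_ms).cover?(p) }
end
```

Because the comparison happens at query time, nothing has to be rewritten when an event is added later.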

Related

Formula to sort rows of bank statement

I export my bank transactions to a PDF, which I then paste into a Google spreadsheet.
The problem is that I may need to sort the transactions on my spreadsheet, and after reordering by date the amounts and balance may "shift" when there are several transactions on the same day.
It's not a big problem for me, but my accountant is all lost.
I would like to find a way to identify the order of the transactions on the same date by comparing the amounts/balance to the final balance of the previous date.
I managed to create a formula using a MATCH that identifies the first transaction of a specific date, but if I were to make it work for 10-20 potential transactions on the same date, it would get stupidly long and complex. I may eventually do that, but first I'd like to know if there is an easier solution.
I can add as many columns as I want, and I don't mind using scripts.
What I cannot do is create a column that would recalculate the balance according to the order the transactions are in. That would be the easiest solution, but if my accountant were to compare with what is on the real bank account, he would find discrepancies and be just as lost.
Thank you!
As @gries said:
Since your PDF already contains the transactions in the order you want, you can assign each of them an incremental ID.
That way, you can restore the initial order by sorting on the transaction ID instead of on the date, which can repeat.
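The incremental ID idea can be sketched in a few lines of Ruby. The rows and the `Transaction` struct are made up for illustration; in the spreadsheet, the equivalent would be an extra column numbered 1, 2, 3, ... filled in right after pasting.

```ruby
require "date"

Transaction = Struct.new(:id, :date, :amount, :balance)

# Rows in the order they appear in the PDF (made-up data).
rows = [
  ["2020-01-02", -50, 950],
  ["2020-01-02", 200, 1150],
  ["2020-01-02", -100, 1050],
  ["2020-01-03", -25, 1025]
]

# Assign an incremental ID that records the original position.
transactions = rows.each_with_index.map do |(date, amount, balance), i|
  Transaction.new(i + 1, Date.parse(date), amount, balance)
end

# Any later reordering (here: by amount) loses the same-day order...
reordered = transactions.sort_by(&:amount)

# ...but sorting by ID, or by date and then ID, brings it back.
restored = reordered.sort_by(&:id)
by_date  = reordered.sort_by { |t| [t.date, t.id] }
```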

Storing Date Components Instead of a Date

My app lets people log the movies they see (for example). Each logged movie usually (but not always) has a date and sometimes has a time. It's not unusual to have one but not the other. Occasionally the date is only a year ("I watched Dumbo sometime in 1984"), but it could realistically be any combination of day/month/year/time.
I am used to modeling dates as date objects in my app and my backend. But is it a viable approach to store each component separately? When I need to reference an actual date from the components (e.g. for sorting the log) this will be built client-side, or perhaps be stored as a derived property sortDate and updated whenever any of the components change.
My reservation is that the information the user is saving is truly a 'moment in time' and I will have to take care of some things myself - for example what time zone are my components stored relative to? This would be captured automatically as part of a real Date object.
The alternative seems to be assuming some sort of 'default' for missing components (e.g. year 0000 if no year, time 00:00 if no time). But those defaults have meaning and I won't be able to distinguish them from 'not provided'.
What are the limitations and/or pitfalls of this approach? Does anyone have experience modeling their dates this way?
If it's of any consequence, my app is for iOS written in Swift and uses a Parse Server backend.
I've successfully used question marks to represent ambiguous and unknown timestamp parts in legal systems. Try to keep in mind that you're really not modeling dates here ('1984' isn't a date); you're modeling facts about dates.
So, if one of your users saw a movie some time in 1984, you might record the value '1984-??-?? ??:??:??' in a text column in a database. Values like this sort sensibly.
See also this answer on dba. Comments on that answer are also good to read.
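The sorting behaviour of such values is easy to check. In this small Ruby sketch (the entries are made up), plain string comparison puts known values before unknown ones within the same prefix, because '?' compares greater than any digit in ASCII.

```ruby
# Hypothetical log entries; '?' marks an unknown component.
entries = [
  "1984-??-?? ??:??:??",
  "1984-06-15 ??:??:??",
  "1983-12-31 20:00:00",
  "1984-06-15 19:30:00",
  "1985-01-01 ??:??:??"
]

# Plain lexicographic sort, no date parsing involved.
sorted = entries.sort

# Prefix matching finds everything known to be in 1984.
in_1984 = sorted.select { |e| e.start_with?("1984-") }
```

Whether "unknown sorts last within its year" is the order you want is a design choice; a different placeholder character would change it.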

better design for fact table where each row has a Start & End Date

My fact table contains details for clients who attend a course.
To ensure I can get a list of clients registered on any particular day, I have not related the date dimension to the fact table.
Instead I created a measure that does basic between logic (where startDate <= selectedDate && endDate >= selectedDate).
This allows me to find all clients registered on one single selected day.
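For readers who want the between logic spelled out, here is a minimal Ruby sketch of it. The `Registration` struct and the sample rows are made up, and the real measure lives in the reporting tool rather than in Ruby. The last line evaluates the same predicate one month earlier, which is only one way to think about a same-day-last-month count.

```ruby
require "date"

Registration = Struct.new(:client, :start_date, :end_date)

# Count clients whose registration spans the selected day.
def registered_on(registrations, selected_date)
  registrations.count do |r|
    r.start_date <= selected_date && r.end_date >= selected_date
  end
end

# Made-up fact rows: one per client registration.
registrations = [
  Registration.new("A", Date.new(2020, 1, 10), Date.new(2020, 3, 20)),
  Registration.new("B", Date.new(2020, 2, 1),  Date.new(2020, 2, 28)),
  Registration.new("C", Date.new(2020, 3, 1),  Date.new(2020, 4, 15))
]

selected         = Date.new(2020, 4, 1)
selected_count   = registered_on(registrations, selected)
last_month_count = registered_on(registrations, selected.prev_month)
```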
There are a few drawbacks to this, however:
-I have to ensure the report user only selects a single day, i.e. they cannot select a date range.
-I can't easily do counts for SamePeriodLastMonth or Year.
Is there a better design I should consider that will still allow me to see counts of registered clients on any given day, along with allowing me to use SamePeriodLastMonth/Year functionality?
Would you mind uploading the structure of your fact and dim tables?
Just a thought bubble: if you would like to measure counts for a program over calendar years, I believe you would definitely need to create a Date dimension. Also depending on your reporting needs you might want to consider whether you need an Accumulating Snapshot Fact table.
Please find further details on this:
http://www.kimballgroup.com/2012/05/design-tip-145-time-stamping-accumulating-snapshot-fact-tables/
Cheers
Nithin

Storing large amount of boolean values in Rails

I need to store a fairly large number of boolean values in a database used by a Rails application: 60 boolean values in a single record per day. What is the best way to do this in Rails?
Queries that I will need to program or execute:
* CRUD
* summing up how many true values there are for each day
* possibly (but not necessarily) other reports, like how often true is recorded in each field
UPDATE: This is to store events that may or may not occur in 5-minute intervals between 9am and 1pm. If an event occurs, I need to set it to true; if not, false. Measurements are done manually, and users will report this information using checkboxes on the website. There might be small updates, but most of the time it's just a one-time entry followed by queries as listed above.
UPDATE 2: The 60 values per day are per user, and there will be between 1000 and 2000 users. If there isn't a library that helps with this, I will go for the simplest approach and deal with performance issues later if they come up. Every day a user reports events by checking the desired checkboxes on the website, so there is normally a single data entry moment per day (or a few, if not done on a daily basis).
This is dependent on a lot of different things. Do you need callbacks to run? Do you need AR objects instantiated? What is the frequency of these updates? Is it done frequently but not many at a time or rarely but a bunch at once? Could you represent these booleans as a mask instead? We definitely need more context.
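As a rough illustration of the mask idea, here is a plain Ruby sketch (no Rails involved) that packs the 60 booleans into a single Integer, one bit per slot. The helper names are hypothetical. Since 60 bits fit in a signed 64-bit integer, the value could presumably live in one bigint column, though that is an assumption about the schema.

```ruby
SLOTS = 60 # one bit per reported interval, as in the question

# Pack an array of booleans into one Integer (bit i represents slot i).
def pack(flags)
  flags.each_with_index.reduce(0) do |mask, (flag, i)|
    flag ? mask | (1 << i) : mask
  end
end

# Unpack the Integer back into an array of booleans.
def unpack(mask, size = SLOTS)
  Array.new(size) { |i| mask[i] == 1 }
end

# Number of true values recorded for the day.
def count_true(mask)
  mask.to_s(2).count("1")
end

# Example day: slots 0, 5 and 59 were checked.
flags = Array.new(SLOTS, false)
[0, 5, 59].each { |i| flags[i] = true }
mask = pack(flags)
```

The trade-off is that per-field reports ("how often is slot 17 true?") need bit operations instead of a simple column filter.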
Why do these need to be in a single record? Can't you use a 'days' table to tie them all together, then use a day_id column in your 'events' table?
Specify in the Day model that it 'has_many :events' and specify in the Event model file that it 'belongs_to :day'. Then you can find all the events for a day with just the id for the day.
For the third day record, you'd do this:
this_day = Day.find 3
Then you can use 'this_day.events' to get all the events for that day.
You'll need to decide what to use to identify each day, so that you query for a day's events using something you understand. The id column I used above to find it probably won't work.
You could use the timestamp of the first moment of each day to do that, for example. Or you could rely on the 'created_at' column of the table being between the start and end of a day.
And you'll want to be sure to think about what time zone you are using and how this will be stored in the database.
And if your data will be stored close to midnight, daylight saving time could also be an issue. I find it best to use GMT to avoid that issue.
Good luck.

Efficiently retrieving ice_cube schedules for a given time period

I'm looking into using Ice Cube https://github.com/seejohnrun/ice_cube for recurring events.
My question is, if I then need to get any events that fall within a given time period (say, on a day or within a week), is there any better way than to loop through them all like this:
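items = Records.all # newer Rails versions use .all instead of find(:all)
todays_items = []
items.each do |item|
  schedule = item.schedule
  # Date.today is the current date; Date.new with no arguments is not
  if schedule.occurs_on?(Date.today)
    # today is a recurrence, so add the item to the array
    todays_items << item
  end
end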
This seems horribly inefficient but I'm not sure how else to go about it.
That's one approach - but what people do more often is end up denormalizing their schedules into a format that is conveniently queryable.
You may have a collection called something like ScheduleOccurrences - that you build each week / and then query that instead.
It's unfortunate it has to work this way, but sticking to the iCal way of managing schedules has led IceCube to need to format its data in certain ways (specifically, ways that can line up with the requirements of the iCal RFC).
I've been doing some thinking recently about what a library would look like that shook away some of those restrictions, for greater flexibility like this, but it's definitely still a bit off.
Hope this helps
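The denormalization described in this answer might look roughly like the following plain Ruby sketch. `WeeklyRule`, `Occurrence`, and `build_occurrences` are hypothetical stand-ins, not ice_cube's API; in a real app the rule would be an ice_cube schedule and the occurrences would be rows in a table.

```ruby
require "date"

# Hypothetical stand-in for a recurring rule: weekly on the given weekdays.
WeeklyRule = Struct.new(:event_id, :wdays) do
  # All dates in [from, to] on which the rule occurs.
  def dates_between(from, to)
    (from..to).select { |d| wdays.include?(d.wday) }
  end
end

Occurrence = Struct.new(:event_id, :date)

# Build the denormalized rows for one window (for example, once a week).
def build_occurrences(rules, from, to)
  rules.flat_map do |rule|
    rule.dates_between(from, to).map { |d| Occurrence.new(rule.event_id, d) }
  end
end

rules = [WeeklyRule.new(1, [1, 3]), WeeklyRule.new(2, [5])]
week_start = Date.new(2021, 3, 1) # a Monday
table = build_occurrences(rules, week_start, week_start + 6)

# Querying is now a plain filter on the date, with no schedule expansion.
on_wednesday = table.select { |o| o.date == Date.new(2021, 3, 3) }.map(&:event_id)
```

The cost is that the occurrence rows must be rebuilt whenever a schedule changes or the window moves forward.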
I faced a similar problem and here was my approach:
Create a column on the Event table to store the next occurrence date, and write a method that stores that value after_save (the method is available through ice_cube). Perhaps index the column too, for faster querying.
Then you can query the database for occurrences happening in the timeframe you need. See below:
Event.where(next_occurrence: Date.today.all_day)
Store EventOccurrences on a separate table.
Update the next_occurrence column for the rows returned to you by your query, or something similar. This works for me because I'm running a daily job, so the update to next_occurrence runs regularly. But you may need to tweak it a bit.
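Here is a rough plain Ruby sketch of that cycle (cache the next occurrence, query on it, then advance it). The `Event` struct and its weekly rule are hypothetical stand-ins; in the real app the next date would come from the ice_cube schedule and the cached value would be a database column.

```ruby
require "date"

# Hypothetical event: recurs weekly on the given weekdays and caches its next date.
Event = Struct.new(:name, :wdays, :next_occurrence) do
  # Recompute the cached date: the first matching day on or after `from`.
  def refresh_next_occurrence(from)
    self.next_occurrence = (from..(from + 6)).find { |d| wdays.include?(d.wday) }
  end
end

today = Date.new(2021, 3, 1) # a Monday
events = [Event.new("standup", [1, 3]), Event.new("review", [5])]
events.each { |e| e.refresh_next_occurrence(today) }

# The "query": events whose cached next occurrence is today.
due_today = events.select { |e| e.next_occurrence == today }

# Daily job: advance the cached date for the rows just returned.
due_today.each { |e| e.refresh_next_occurrence(today + 1) }
```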
