better design for fact table where each row has a Start & End Date - data-warehouse

My fact table contains details for clients who attend a course.
To ensure I can get a list of clients registered on any particular day, I have not related the date dimension to the fact table.
Instead I created a measure that does basic between logic (where startDate <= selectedDate && endDate >= selectedDate).
This allows me to find all clients registered on a single selected day.
There are a few drawbacks to this, however:
- I have to ensure the report user only selects a single day, i.e. they cannot select a date range.
- I can't easily do counts for SamePeriodLastMonth or Year.
Is there a better design I should consider that will still allow me to see counts of registered clients on any given day, along with allowing me to use SamePeriodLastMonth/Year functionality?
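For reference, here is the between logic written out as a minimal plain-Ruby sketch. The `Registration` struct, the sample rows, and the method names are made up for illustration and are not the actual fact table or measure. It shows that the same test can be evaluated once per day of a range, and against the same day one month earlier, which is roughly what the two drawbacks are asking for.

```ruby
require 'date'

# Hypothetical stand-in for a fact row: one client registration with a date span.
Registration = Struct.new(:client, :start_date, :end_date)

# A row counts for a day when start_date <= day <= end_date.
def registered_on(rows, day)
  rows.count { |r| r.start_date <= day && r.end_date >= day }
end

# Evaluate the same test once per day of a range, giving a count per day.
def daily_counts(rows, range)
  range.map { |day| [day, registered_on(rows, day)] }.to_h
end

rows = [
  Registration.new('A', Date.new(2024, 1, 10), Date.new(2024, 3, 5)),
  Registration.new('B', Date.new(2024, 2, 20), Date.new(2024, 4, 1)),
  Registration.new('C', Date.new(2024, 3, 1),  Date.new(2024, 3, 2))
]

selected    = Date.new(2024, 3, 1)
today_count = registered_on(rows, selected)       # count on the selected day
last_month  = registered_on(rows, selected << 1)  # same day, one month earlier
per_day     = daily_counts(rows, selected..(selected + 2))
```

Storing one row per client per day (a daily snapshot) would give the same per-day counts while letting the date dimension relate directly to the fact table, at the cost of more rows.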

Would you mind uploading the structure of your fact and dim tables?
Just a thought bubble: if you would like to measure counts for a program over calendar years, I believe you would definitely need to create a Date dimension. Also depending on your reporting needs you might want to consider whether you need an Accumulating Snapshot Fact table.
Please find further details on this:
http://www.kimballgroup.com/2012/05/design-tip-145-time-stamping-accumulating-snapshot-fact-tables/
Cheers
Nithin

Related

Update several Tbl Fields based on Start/End Dates

Good day to everyone! Hope all is well!
I am looking to run an update query or a group of queries that looks at my Date_Start and Date_End to determine if the Units (quantity of the respective record) fall in my defined current quarter 1/2/3/4 from another table (this table is a master table I’m using to provide the dates that I need to consider for defining the quarters).
I’ve been able to create queries that do this and then join them together to basically display the units by quarter based on their respective start/end dates. The problem I am running into is that these queries take a decent amount of time to populate, which will drastically affect other processes down the line.
Thus we get to my desire. I am trying to no avail to create an update query that will update the quarter fields in my table based off of the queries I built to determine if the records start/end date fall in the respective quarter. I figure that running this update when records change will be an ok run time vs when I’m running reports or running an email script for the reports.
I have tried pulling in the table and query, joining them as equal on ID (the query pulls in the table's IDs), selecting my field “CQ1” from the table, and setting the update value to either the respective field from the table or the one from the query (which is the same as the field in the table).
All I get are the current values of the field in the data sheet view and an error of “Operation must use an updateable query.”
I have even tried placing a zero to see if that would do it with no luck. I have verified that all the fields are the same data type.
What am I doing wrong? Thanks!
Apologies to everyone... I think my conscious brain was trying to overcomplicate the process, and while talking to a buddy about my issue I distractedly created a new update query that worked. I believe it all came down to forgetting to put a criterion of Is Not Null on my quarter field. Thanks to anyone who has read this and is responding while I’m typing this, and to those of you formulating a response.
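For anyone landing here with the same problem, the test behind each quarter field is a plain date-range overlap. Below is a minimal Ruby sketch of that check only. The quarter boundaries and the `cq1`..`cq4` keys are assumptions standing in for the master table, and this is not the Access update query itself.

```ruby
require 'date'

# Hypothetical quarter boundaries, standing in for the master table of quarter dates.
QUARTERS = {
  cq1: Date.new(2024, 1, 1)..Date.new(2024, 3, 31),
  cq2: Date.new(2024, 4, 1)..Date.new(2024, 6, 30),
  cq3: Date.new(2024, 7, 1)..Date.new(2024, 9, 30),
  cq4: Date.new(2024, 10, 1)..Date.new(2024, 12, 31)
}.freeze

# Two date ranges overlap when each one starts on or before the other ends.
def overlaps?(date_start, date_end, quarter)
  date_start <= quarter.end && date_end >= quarter.begin
end

# One true/false flag per quarter for a record's start/end dates.
def quarter_flags(date_start, date_end)
  QUARTERS.transform_values { |quarter| overlaps?(date_start, date_end, quarter) }
end

flags = quarter_flags(Date.new(2024, 2, 15), Date.new(2024, 5, 1))
```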

Formula to sort rows of bank statement

I export my bank transactions to a PDF, that I then paste to a google spreadsheet.
The problem is that I may need to sort the transactions on my spreadsheet, and after reordering by date the amounts and balances may "shift" when there are several transactions on the same day.
It's not a big problem to me, but my accountant is all lost.
I would like to find a way to identify the order of the transactions on the same date, by comparing the amounts/balances to the final balance of the previous date.
I managed to create a formula using a MATCH that would identify the first transaction of a specific date, but if I were to make it work for 10-20 potential transactions on the same date, it would get stupidly long and complex. I may eventually do that, but first I'd like to know if there is an easier solution.
I can add as many columns as I want, and I don't mind using scripts.
What I cannot do is create a column that would recalculate the balance according to the order the transactions are in. That would be the easiest solution, but if my accountant were to compare with what is on the real bank account, he would find discrepancies and be just as lost.
Thank you!
As @gries said:
Since your PDF contains the transactions already ordered the way you want, you can assign each of them an incremental ID.
That way, you can restore the initial order by sorting on the transaction ID instead of the date, which can be repeated.
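The idea can be sketched in a few lines of Ruby. This is only an illustration of the technique: the `Txn` struct and the sample rows are invented, and in the spreadsheet the ID would simply be an extra column filled in right after pasting.

```ruby
require 'date'

# Hypothetical row shape; seq is the incremental ID assigned at paste time.
Txn = Struct.new(:seq, :date, :amount, :balance)

# Rows as pasted from the statement, already in the bank's order.
pasted = [
  [Date.new(2024, 5, 2), -40.0, 960.0],
  [Date.new(2024, 5, 2), 100.0, 1060.0],
  [Date.new(2024, 5, 3), -10.0, 1050.0]
]

# Number the rows once, before any sorting happens.
txns = pasted.each_with_index.map do |(date, amount, balance), index|
  Txn.new(index + 1, date, amount, balance)
end

# Any later reordering (here: by amount) loses the original order...
reordered = txns.sort_by(&:amount)

# ...but sorting by date and then by the ID brings it back exactly.
restored = reordered.sort_by { |t| [t.date, t.seq] }
```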

Modelling recurrent items (expenses) as records with Rails

I am writing what could be defined as an accountancy/invoicing app using Rails 5. I am in need of implementing a section that predicts the company's cashflow in the future. So far I've got the following:
Actual bank movements and balances (in the past), imported from the bank
Future invoices (income) which are expected to be paid on a certain date
Future one-time expenses which are expected to be paid on a certain date
Using these three sets of data, I can calculate, for any given date in the future, the sum of: the last known bank balance, plus all the future invoices values coming IN, minus all the future expenses going OUT, so I get, theoretically, the expected balance of the company for any given date.
My doubt arises when it comes to recurrent expenses (or potentially incomes). Given that all of the items I mentioned before (bank movements, invoices and expenses) are actual ActiveRecord records stored in my database, I'm not sure about how to treat the recurrent expenses, for example:
Let's imagine I want to enter a known future recurrent paycheck of a certain employee, which is $2000 every first day of the month.
1- Should I generate at some point the next X entries and treat them as normal future expenses (each with its own ID, date and amount)?
2- The other option I've thought of is having some kind of "declaration" of the nature of the recurrent expense, as in "it's $2000 on day 1 of every month until -forever-", similar to a cron job. But if I were to take this approach, I'd like to have an ActiveRecord-like interface, so that I can do something like:
cashflow = []
last_movement = BankMovement.last
value = last_movement.balance
(last_movement.date..(last_movement.date + 12.months)).each do |day|
  value += Invoice.pending.expected_on(day).sum(:gross_amount)
  value -= Expense.pending.expected_on(day).sum(:gross_amount)
  value -= RecurringExpense.expected_on(day).sum(:gross_amount)
  cashflow.push( { date: day, balance: value } )
end
This feels almost right, but I'm not sure how to link the actual expense, when it comes, with the recurrent/calculated one. How can I then change the date if the expense gets paid the day after it was supposed to be? I need to have an actual record of each one of those, at least once they are "consolidated".
I'm not really sure if I was clear enough with my trouble here, so, should anyone want and have some spare time to help me out, please feel free to ask for any extra relevant info, I'd really appreciate some help, especially if we can find a way of doing this "the Rails way"!
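To make the two options concrete, here is a minimal plain-Ruby sketch that combines them: a rule object holds the "declaration", and it can materialize ordinary expense rows for a window of dates. Everything in it (`RecurringRule`, `materialize`, the hash keys) is hypothetical and not an existing Rails or gem API. In a real app the rows would presumably be saved Expense records pointing back at the rule.

```ruby
require 'date'

# Hypothetical rule: a fixed amount on a fixed day of every month, optionally until an end date.
# Note: a day_of_month above 28 would skip the months that are too short.
class RecurringRule
  attr_reader :label, :amount, :day_of_month, :starts_on, :ends_on

  def initialize(label:, amount:, day_of_month:, starts_on:, ends_on: nil)
    @label = label
    @amount = amount
    @day_of_month = day_of_month
    @starts_on = starts_on
    @ends_on = ends_on
  end

  def due_on?(day)
    return false if day < starts_on
    return false if ends_on && day > ends_on
    day.day == day_of_month
  end

  # Dates within the range on which the rule fires.
  def occurrences(range)
    range.select { |day| due_on?(day) }
  end

  # Build one plain expense row per occurrence; each row can then be edited on its own.
  def materialize(range)
    occurrences(range).map do |day|
      { rule: label, expected_on: day, amount: amount, paid_on: nil }
    end
  end
end

paycheck = RecurringRule.new(label: 'paycheck', amount: 2000, day_of_month: 1,
                             starts_on: Date.new(2024, 1, 1))
expenses = paycheck.materialize(Date.new(2024, 1, 1)..Date.new(2024, 3, 31))

# The February paycheck was actually paid a day late: only that row changes.
expenses[1][:paid_on] = Date.new(2024, 2, 2)
```

Materializing only a rolling window (say, the next 12 months) keeps the "forever" rule open-ended while still giving every consolidated expense its own record.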

Storing large amount of boolean values in Rails

I need to store quite a large number of boolean values in the database used by a Rails application: 60 boolean values in a single record per day. What is the best way to do this in Rails?
Queries that I will need to program or execute:
* CRUD
* summing up how many true values are for each day
* possibly (but not necessarily) other reports, like how often true is recorded in each field
UPDATE: This is to store events that may or may not occur in 5 minute intervals between 9am and 1pm. If an event occurs, I need to set it to true; if not, false. Measurements are done manually and users will report this information using checkboxes on the website. There might be small updates, but most of the time it's just a one-time entry followed by queries as listed above.
UPDATE 2: 60 values per day is per user, and there will be between 1000 and 2000 users. If there isn't some library that helps with that, I will go for the simplest approach and deal with it later if I run into performance issues. Every day the user reports events by checking the desired checkboxes on the website, so there is normally a single data entry moment per day (or a few if not done on a daily basis).
This is dependent on a lot of different things. Do you need callbacks to run? Do you need AR objects instantiated? What is the frequency of these updates? Is it done frequently but not many at a time or rarely but a bunch at once? Could you represent these booleans as a mask instead? We definitely need more context.
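To show what the mask idea would look like, here is a minimal plain-Ruby sketch, assuming the 60 flags are numbered 0..59 and packed into one integer (60 bits fit in a signed 64-bit integer column). The helper names are made up for illustration.

```ruby
SLOTS = 60 # number of flags per record, taken from the question

# Return a new mask with the given slot switched on or off.
def set_slot(mask, index, value)
  raise ArgumentError, 'slot out of range' unless (0...SLOTS).cover?(index)
  value ? mask | (1 << index) : mask & ~(1 << index)
end

# Integer#[] reads a single bit (0 or 1).
def slot_set?(mask, index)
  mask[index] == 1
end

# Number of true values in the record.
def true_count(mask)
  mask.to_s(2).count('1')
end

mask = 0
mask = set_slot(mask, 0, true)
mask = set_slot(mask, 59, true)
mask = set_slot(mask, 7, true)
mask = set_slot(mask, 7, false) # a small correction after the first entry
```

The trade-off is that the daily sum is easy in Ruby, but per-field reports are harder to express in SQL than they would be with 60 boolean columns or one row per event.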
Why do these need to be in a single record? Can't you use a 'days' table to tie them all together, then use a day_id column in your 'events' table?
Specify in the Day model that it 'has_many :events' and specify in the Event model file that it 'belongs_to :day'. Then you can find all the events for a day with just the id for the day.
For the third day record, you'd do this:
this_day = Day.find 3
Then you can use 'this_day.events' to get all the events for that day.
You'll need to decide what you wish to use to identify each day so you query for a day's events using something that you understand. The id column I used above to find it probably won't work.
You could use the timestamp of the first moment of each day to do that, for example. Or you could rely upon the 'created_at' column of the table being between the start and end of a day.
And you'll want to be sure to think about what time zone you are using and how this will be stored in the database.
And if your data will be stored close to midnight, daylight saving time could also be an issue. I find it best to use GMT to avoid that issue.
Good luck.

How would you build this daily class schedule?

What I want to do is very simple but I'm trying to find the best or most elegant way to do this. The Rails application I'm building now will have a schedule of daily classes. For each class the fields relevant to this question are:
Day of the week
Starting time
Ending time
A single entry could be something such as:
day of week: Wednesday
starting time: 10:00 am
ending time: Noon
Also I must mention that it's a bi-lingual Rails 2.2 app and I'm using the native i18n Rails feature. I actually have several questions.
Regarding the day of the week, should I create an extra table with list of days, or is there a built-in way to create that list on the fly? Keep in mind these days of the week will have to be rendered in English or Spanish in the schedule view depending on the locale variable.
While querying the schedule I will need to group and order the results by weekday, from Monday to Sunday, and of course order the classes within each day by starting time.
Regarding the starting time and ending time of each class would you use datetime fields or integer fields? If the latter how would you implement this exactly?
Looking forward to reading the different suggestions you guys will come up with.
I would just store the day of the week as an integer. 0 => Monday ... 6 => Sunday (or any way you want. ie. 0 => Sunday). Then store the start time and end time as Time.
That would make grouping really easy. All you would have to do is sort by the day of the week and the start time.
You can display this in multiple ways, but here is what I would do.
Have functions like: @sunday_classes = DailyClass.find_sunday_classes that return all the classes for Sunday sorted by start time. Then repeat for each day.
def self.find_sunday_classes
  # 6 is Sunday under the 0 => Monday ... 6 => Sunday mapping above
  find_all_by_day_of_week(6, :order => 'start_time')
end
Note: find_by probably should have id at the end but that's just preference in how you want to name the column.
If you want the full week, then call all seven from the controller and loop through them in the view. You could even create detail pages for each day.
Translation is the only tricky part. You can create a helper function that takes an integer and returns the text for the appropriate day of the week based on the locale.
That's very basic. Nothing complicated.
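A minimal sketch of such a helper in plain Ruby, assuming the 0 => Monday ... 6 => Sunday mapping. The `DAY_NAMES` table and `day_name` method are made up here; in a real Rails app the names would more likely live in the locale files rather than in a constant.

```ruby
# Hypothetical lookup table, indexed 0 => Monday ... 6 => Sunday.
DAY_NAMES = {
  en: %w[Monday Tuesday Wednesday Thursday Friday Saturday Sunday],
  es: %w[lunes martes miércoles jueves viernes sábado domingo]
}.freeze

# Return the day name for an integer day of the week in the given locale.
# fetch raises if the locale or the index is unknown, instead of returning nil.
def day_name(day_of_week, locale = :en)
  DAY_NAMES.fetch(locale).fetch(day_of_week)
end

wednesday_en = day_name(2)
wednesday_es = day_name(2, :es)
```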
If your data is a Time then I would store that as a Time - otherwise you will always have to convert it out of the database when you do date and time related operations on it. The day is redundant data, as it will be part of the time object.
This should mean that you don't need to store a list of days.
If t is a time then
t.strftime('%A')
will always give you the day as a string in English. This could then be translated by i18n as required.
So you only need to store starting time and ending time, or starting time and duration. Both should be equivalent. I would be tempted to store ending time myself, in case you need to do data manipulations on ending times, which therefore won't have to be calculated.
I think most of the rest of what you describe should also fall out of storing time data as instances of Time.
Ordering by week day and time will just be a matter of ordering by your time column. i.e.
DailyClass.find(:all, :conditions => ['whatever'], :order => :starting_time)
Grouping by day is a little more tricky. However this is an excellent post on how to group by week. Grouping by day will be analogous.
If you are dealing with non-trivial volumes of data, it may be better to do it in the database, with a find_by_sql and that may depend on your database's time and date functionality, but again storing the data as a Time will also help you here. For example in Postgresql (which I use), getting the week of a class is
date_trunc('week', starting_time)
which you can use in a Group By clause, or as a value to use in some loop logic in rails.
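For small volumes, the grouping can also be done in Ruby after loading the records. Here is a minimal sketch with a made-up `ScheduledClass` struct in place of the model, assuming each class carries a full Time in starting_time:

```ruby
# Hypothetical stand-in for the model; only the fields needed for grouping.
ScheduledClass = Struct.new(:name, :starting_time)

classes = [
  ScheduledClass.new('Pilates', Time.utc(2024, 1, 3, 12, 0)), # a Wednesday
  ScheduledClass.new('Yoga',    Time.utc(2024, 1, 3, 10, 0)), # a Wednesday
  ScheduledClass.new('Spin',    Time.utc(2024, 1, 1, 9, 0))   # a Monday
]

# Sort by time first; group_by keeps that order within and across groups.
by_day = classes
  .sort_by(&:starting_time)
  .group_by { |c| c.starting_time.strftime('%A') }

by_day.each do |day, day_classes|
  puts "#{day}: #{day_classes.map(&:name).join(', ')}"
end
```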
Re days-of-week, if you need to have e.g. classes that meet 09:00-10:00 on MWF, then you could either use a separate table for days a class meets (keyed by both class ID and DOW) or be evil (i.e. non-normalized) and keep the equivalent of an array of DOW in each class. The classic argument is this:
The separate table can be indexed in a way to support either class-oriented or DOW-oriented selects, but takes a bit more glue to put the entire picture together for a class.
The array-of-DOW is simpler to visualize for beginning programmers and slightly simpler to code about, but means that reasoning about DOW requires looking at all classes.
If this is only for your personal class schedule, do what gets you the value you're looking for, and live with the consequences; if you're trying to build a real system for multiple users, I'd go with a separate table. All those normalization rules are there for a reason.
As far as (human-readable) DOW names, that's a presentation-layer issue, and shouldn't be in the core concept of DOW. (Suppose you decided to move to Montreal, and needed French? That should be another "face" and not a change to the core implementation.)
As for starting/ending times, again the issue is your requirements. If all classes begin and end at hour (x:00) boundaries, you could certainly use 0..23 as the hours of the day. But then your life would be miserable as soon as you had to accommodate that 45-minute seminar. As the old commercial said, "Pay me now or pay me later."
One approach would be to define your own ClassTime concept and partition all reasoning about times to that class. It could start with a simplistic representation (integral hours 0..23, or integral minutes after midnight 0..1439) and then "grow" as needed.
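As a rough illustration of that last idea, here is a minimal ClassTime sketch using minutes after midnight as the internal representation. The method set is an assumption about what a schedule would need, not a prescribed design.

```ruby
# Time of day stored as integral minutes after midnight (0..1439).
class ClassTime
  include Comparable

  attr_reader :minutes

  def initialize(hour, minute = 0)
    unless (0..23).cover?(hour) && (0..59).cover?(minute)
      raise ArgumentError, 'invalid time of day'
    end
    @minutes = hour * 60 + minute
  end

  # Comparable builds <, >, ==, between? and sorting from this.
  def <=>(other)
    minutes <=> other.minutes
  end

  # Difference between two times of day, in minutes.
  def -(other)
    minutes - other.minutes
  end

  def to_s
    format('%02d:%02d', minutes / 60, minutes % 60)
  end
end

starts  = ClassTime.new(10)
ends    = ClassTime.new(10, 45) # the 45-minute seminar
seminar = ends - starts
```

Because all reasoning about times goes through this one class, the internal representation can change later without touching the code that uses it.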

Resources