I'm reviewing an old data warehouse and have encountered an unusual 1:1 relationship between a factless fact table (Fact_Contact) and Dim_Incident.
In general, Fact_Contact is used for recording cases/tickets/enquiries. Some of the customers are anonymous; therefore, there are uniqueCustRef & CustomerRef "facts" that are used for distinct count.
A 1:1 relationship between a fact table and a dimension table does not feel right. Is it a recommended solution? Currently, there is no documentation explaining why it was designed this way.
Thank you.
You might be right; this does not look right.
The Fact_Contact table should not have the IncidentId.
I do not know the requirements but, thinking about it logically, I would suggest the following:
IncidentType - what is the type of incident that is logged
FirstIncidentId - the first incident corresponding to the customer/IncidentType
FirstIncidentDate - Date of the above incident
LastIncidentId - the last incident corresponding to the customer/IncidentType; when there is only one incident, FirstIncidentId and LastIncidentId are the same
LastIncidentDate - the date of the above incident
IncidentCount - the number of incidents for the customer/IncidentType combination
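As a rough illustration of the columns listed above, here is a minimal Python sketch that rolls incident rows up to one row per customer/IncidentType. The sample data, the tuple layout and the `summarise` helper are assumptions made for the example, not part of the original warehouse.

```python
from collections import defaultdict

# Hypothetical incident rows: (IncidentId, CustomerRef, IncidentType, ISO date).
incidents = [
    (1, "C1", "Complaint", "2020-01-05"),
    (2, "C1", "Complaint", "2020-03-10"),
    (3, "C1", "Enquiry", "2020-02-01"),
    (4, "C2", "Complaint", "2020-04-20"),
]


def summarise(rows):
    """Return one summary row per customer/IncidentType combination."""
    groups = defaultdict(list)
    for incident_id, customer, incident_type, date in rows:
        # Store (date, id) so min()/max() pick the earliest/latest incident.
        groups[(customer, incident_type)].append((date, incident_id))

    summary = []
    for (customer, incident_type), items in sorted(groups.items()):
        first_date, first_id = min(items)
        last_date, last_id = max(items)
        summary.append({
            "CustomerRef": customer,
            "IncidentType": incident_type,
            "FirstIncidentId": first_id,
            "FirstIncidentDate": first_date,
            "LastIncidentId": last_id,
            "LastIncidentDate": last_date,
            "IncidentCount": len(items),
        })
    return summary


for row in summarise(incidents):
    print(row)
```

With a single incident in a group, FirstIncidentId and LastIncidentId come out identical, as described above.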
Hope this helps
I hope you can be helpful in answering one question in regards to role-playing dimensions.
When using views for a role-playing dimension, does it matter which view is referred to later in the analysis? In particular, when sorting on the role-playing dimension, can this be done no matter which view is used?
Hope the question is clear enough. If not, let me know and I will elaborate.
Thanks in advance.
Do you mean you have created a view similar to "SELECT * FROM DIM" for each role the Dim plays? If that's all you've done, then you could use any of these views in a subsequent SQL statement that joins the DIM to a FACT table. Obviously, though, if you use the "wrong" view it's going to be very confusing for anyone trying to read your SQL (or for you, trying to understand what you've written in three months' time!).
For example, if you have a fact table with keys OrderDate and ShipDate that both reference your DateDim then you could create vwOrderDate and vwShipDate. You could then join FACT.OrderDate to vwShipDate and FACT.ShipDate to vwOrderDate and it will make no difference to the actual resultset your query produces (apart from, possibly, column names).
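The point can be checked with a small sketch using Python's built-in sqlite3 module. The table layout, the `CalendarDate` column and the sample rows are assumptions for illustration; only the names DateDim, vwOrderDate, vwShipDate, OrderDate and ShipDate come from the example above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DateDim (DateKey INTEGER PRIMARY KEY, CalendarDate TEXT);
CREATE TABLE Fact (OrderId INTEGER, OrderDate INTEGER, ShipDate INTEGER);

-- One plain "SELECT *" view per role the date dimension plays.
CREATE VIEW vwOrderDate AS SELECT * FROM DateDim;
CREATE VIEW vwShipDate  AS SELECT * FROM DateDim;

INSERT INTO DateDim VALUES (1, '2024-01-01'), (2, '2024-01-03'), (3, '2024-01-07');
INSERT INTO Fact VALUES (100, 1, 2), (101, 2, 3);
""")

QUERY = """
SELECT f.OrderId, o.CalendarDate AS Ordered, s.CalendarDate AS Shipped
FROM Fact f
JOIN {order_view} o ON f.OrderDate = o.DateKey
JOIN {ship_view}  s ON f.ShipDate  = s.DateKey
ORDER BY o.CalendarDate
"""

# Join each fact key to the view named after its role...
intended = con.execute(
    QUERY.format(order_view="vwOrderDate", ship_view="vwShipDate")
).fetchall()
# ...and then deliberately to the "wrong" views.
swapped = con.execute(
    QUERY.format(order_view="vwShipDate", ship_view="vwOrderDate")
).fetchall()

print(intended)
print(intended == swapped)
```

Both queries return the same rows in the same order, so sorting behaves identically whichever view is used; the only cost of the swapped version is readability.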
However, unless the applicable attributes are very different for different roles, I really wouldn't bother creating views for role-playing Dims as it's an unnecessary overhead and just going to cause confusion to anyone you've given access to at this level of the DB (who presumably have pretty strong SQL skills to be given this level of access?).
If you are trying to make life easier for end-users, then either create these types of "views" in the models of the BI tool(s) they are using, and not directly in the DB, or, if they are being given access to the DB, create view(s) across the fact(s) and all their joined dimensions.
I have an Employee dimension in which I use SCDs and surrogate keys to track changes over time.
Employee's business system key: EmployeeID
Employee Surrogate key: EmployeeSCDKey
I would like to have Manager information tracked over time as well. The managers are employees like everyone else and as such, I was thinking about having a ManagerSCDKey column in my Employee dimension like so:
Example:
This is the problem I am facing though. The arrow shows the boundary from one transform to the next. In the event that a Manager changes jobs (or some other type 2 SCD field) and a new surrogate key is created for them, that change won't be recognized until the next time the dimension is transformed.
By this I mean that the row in red won't appear until the second transformation, so any fact rows associated with Joe for this time will have outdated manager information.
I guess it boils down to this:
Is there a way to make this pattern work? (dimension with a key into itself?)
Or is there a better practice way to accomplish the same task? I would prefer to not maintain a manager dimension that is extremely similar to the employee dimension, but if that's best practice then so be it.
Here's a good discussion of some alternatives; I'm sure you'll find something that matches what you need: http://www.informationweek.com/software/information-management/kimball-university-five-alternatives-for-better-employee-dimension-modeling/d/d-id/1082326?page_number=1
I'd likely opt for some kind of 'reports to' bridge table, perhaps holding natural keys rather than surrogate keys, depending on how you want it to behave (and to solve your type 2 SCD problem). You wouldn't need a separately created manager dimension; you would only have the employee dimension joined to the bridge table twice.
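A minimal sketch of what such a bridge could look like, using Python's sqlite3 module. Everything here is an assumption for illustration: the `ReportsTo` table, the `ValidFrom`/`ValidTo` columns, the manager "Ann" and the `manager_as_of` helper are hypothetical, and this is only one of several ways to model it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Type 2 employee dimension: one row per version of an employee.
CREATE TABLE DimEmployee (
    EmployeeSCDKey INTEGER PRIMARY KEY,
    EmployeeID     TEXT,
    Name           TEXT,
    JobTitle       TEXT,
    ValidFrom      TEXT,
    ValidTo        TEXT
);
-- Hypothetical bridge keyed on natural keys, so a new manager version
-- does not force a new row for every employee who reports to them.
CREATE TABLE ReportsTo (
    EmployeeID TEXT,
    ManagerID  TEXT,
    ValidFrom  TEXT,
    ValidTo    TEXT
);
INSERT INTO DimEmployee VALUES
    (1, 'E1', 'Joe', 'Analyst',   '2020-01-01', '9999-12-31'),
    (2, 'M1', 'Ann', 'Team Lead', '2020-01-01', '2020-06-30'),
    (3, 'M1', 'Ann', 'Director',  '2020-07-01', '9999-12-31');
INSERT INTO ReportsTo VALUES ('E1', 'M1', '2020-01-01', '9999-12-31');
""")


def manager_as_of(employee_id, as_of):
    """Return the manager's dimension row that was current on `as_of`."""
    return con.execute(
        """
        SELECT m.EmployeeSCDKey, m.Name, m.JobTitle
        FROM ReportsTo r
        JOIN DimEmployee m
          ON m.EmployeeID = r.ManagerID
         AND :as_of BETWEEN m.ValidFrom AND m.ValidTo
        WHERE r.EmployeeID = :emp
          AND :as_of BETWEEN r.ValidFrom AND r.ValidTo
        """,
        {"emp": employee_id, "as_of": as_of},
    ).fetchone()


print(manager_as_of("E1", "2020-03-15"))
print(manager_as_of("E1", "2020-08-01"))
```

Because the bridge stores the manager's natural key, Joe's row never has to change when his manager gets a new surrogate key; the correct manager version is resolved by date at query time.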
Hope you all are doing good!
My requirement is -
I have a decision table with hundreds of records. Users enter some data in a front-end application (say .NET or BPM), and based on this input I want to fire a dynamic, automated query (using the values entered by the user) against the decision table. That query should return, say, 15 or more records, and I then want to keep firing different queries on the result set until I have filtered it down to just one record.
Could someone please help me with how this can be done? I appreciate all your help.
Thanks,
Rao
Sorry brother, this is not possible, since a decision table is nothing but a RuleSet with if-then rules, unlike a database table. You cannot use a DT to store your data.
A DT is not for data storage but for taking a decision.
Intro
I am trying to decide how best to set up my database schema for a (Rails) model. I have a model related to money which indicates whether the value is an income (positive cash value) or an expense (negative cash value).
I would like separate column(s) to indicate whether it is an income or an expense, rather than relying on whether the value stored is positive or negative.
Question:
How would you store these values, and why?
Have a single column, say Income, and store 1 if it's an income, 0 if it's an expense, null if not known.
Have two columns, Income and Expense, setting their values to 1 or 0 as appropriate.
Something else?
I figure the question is similar to storing a person's gender in a database (ignoring aliens/transgender/etc) hence my title.
My thoughts so far
Lookup might be easier with a single column, but there is a risk of mistaking 0 (false, expense) for null (unknown).
Having separate columns might be more difficult to maintain (what happens if we end up with a 1 in both columns?).
Maybe it's not that big a deal which way I go, but it would be great to have any concerns/thoughts raised before I get too far down the line and have to change my code-base because I missed something that should have been obvious!
Thanks,
Philip
How would you store these values, and why?
I would store them as a single column. Despite your desire to separate the data into multiple columns, anyone who understands accounting or bookkeeping will know that the dollar value of a transaction is one thing, not two separate things based on whether it's income or expense (or asset, liability, equity and so forth).
As someone who's actually written fully balanced double-entry accounting applications and less formal budgeting applications, I suggest you rethink your decision. It will make future work on this endeavour a lot easier.
I'm sorry, that's probably not what you want to hear and may well result in negative rep for me, but I can't, in all honesty, let this go without telling you what a mistake it will be.
Your "thoughts so far" are an indication of the problems already appearing.
1/ "Having separate columns might be more difficult to maintain (what happens if we end up with a 1 in both columns?)" - well, this shouldn't happen. Data is supposed to be internally consistent with the data model. You would be best advised to prevent it with an insert/update trigger or, say, a single column that doesn't allow it to happen :-)
2/ "Lookup might be easier with a single column, but there is a risk of mistaking 0 (false, expense) for null (unknown)." - no mistake possible if the sign is stored with the magnitude of the value. And the whole idea of not knowing whether an item is expense or income is abhorrent to accountants. That knowledge exists when the transaction is created, it's not something that is nebulous until some point after a transaction happens.
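To make the single-column idea concrete, here is a small sketch using Python's sqlite3 module. The `transactions` table, the `amount_cents` column (integer cents, chosen here to avoid floating-point rounding) and the sample rows are assumptions for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- One signed column: positive = income, negative = expense.
-- NOT NULL means the direction can never be "unknown".
CREATE TABLE transactions (
    id           INTEGER PRIMARY KEY,
    description  TEXT    NOT NULL,
    amount_cents INTEGER NOT NULL
);
INSERT INTO transactions (description, amount_cents) VALUES
    ('Salary',    250000),
    ('Rent',      -90000),
    ('Groceries', -12050);
""")

# The income/expense label is derived from the sign when needed,
# so it can never disagree with the stored value.
rows = con.execute("""
    SELECT description,
           CASE WHEN amount_cents > 0 THEN 'income' ELSE 'expense' END AS kind,
           amount_cents
    FROM transactions
    ORDER BY id
""").fetchall()

# A balance is a plain SUM; no flag has to be consulted.
balance = con.execute("SELECT SUM(amount_cents) FROM transactions").fetchone()[0]

print(rows)
print(balance)
```

If a separate label is still wanted for reporting, it can live in a view built on the CASE expression rather than in a stored column.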
Sometimes I use a character. For example, I have a column gender in my database that stores m or f.
And I usually choose to have just one column.
I would typically implement a flag as an nchar(1) and use some meaningful abbreviations. I think that's the easiest thing to work with. You could use 'I' for income and 'E' for expense, for example.
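For illustration, a flag column of this kind might be sketched as follows with Python's sqlite3 module. The `entries` table and its columns are hypothetical, and the CHECK constraint is an addition of this sketch, not something the approach above requires. SQLite has no nchar type, so TEXT stands in for nchar(1).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE entries (
        id     INTEGER PRIMARY KEY,
        amount INTEGER NOT NULL,
        -- 'I' = income, 'E' = expense; anything else is rejected.
        kind   TEXT    NOT NULL CHECK (kind IN ('I', 'E'))
    )
""")
con.execute("INSERT INTO entries (amount, kind) VALUES (250000, 'I')")
con.execute("INSERT INTO entries (amount, kind) VALUES (90000, 'E')")

# An unknown abbreviation violates the CHECK constraint.
try:
    con.execute("INSERT INTO entries (amount, kind) VALUES (100, 'X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

kinds = [row[0] for row in con.execute("SELECT kind FROM entries ORDER BY id")]
print(kinds, rejected)
```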
That said, I don't think that's a good way to do this system.
I would probably put incomes and expenses in separate tables, since they appear to be different sorts of things. The only advantages I can think of for putting them in the same table are lost once the meanings are differentiated by flags rather than positive and negative values.
I made a fact constellation schema with 2 fact tables and 16 dimension tables, 4 of which are common dimensions. One of the dimension tables needs to be normalized because data from the data source can have a variable number of rows. Can I still call it a fact constellation schema if a dimension table has a branch?
I hope you understand what I am trying to say.
Cheers.
I know it's been a while; I'm just writing in case anyone else needs information about this topic. Normally a fact constellation model is made up of star models, in which no artificial or natural hierarchy should be present. But, according to your needs, you can add normalized (hierarchical) dimension tables. In that case your fact constellation is made up of snowflakes instead of stars.
You may still call it a Constellation Schema with Sliced Dimension Table.
This term appears in an Oracle data warehousing book that I read around 7 years ago.
Regards,
Jit