Track multiple statuses in a Transaction Fact Table - data-warehouse

I have to track the status of my business process for analysis purposes. I have seen a post where it is mentioned that we can keep the status in the Transaction Fact Table against time / transaction type / service center, and use an accumulating snapshot fact table to study process lag. I am wondering: if a few transactions have multiple statuses in a single day, should I store all of the statuses in the Transaction Fact Table? Here I am assuming that my ETL runs at the end of the business day.
Secondly, should I keep all my key dimension keys in the Transaction Fact Table (the keys in this case are Transaction Type, Department id, Service_type, Service_id and Submission Channel), or should I divide them across multiple fact tables?
Third, if I need to report which department is meeting its SLA, what would be the best approach: calculate and keep track of Within SLA / Not Within SLA in the Transaction Fact Table, or compute this value at run time?
Thanks in advance for your help and assistance.

For status tracking you should have:
A transaction fact table where only the events show up (but which does not by itself provide event tracing)
An accumulating snapshot table where each process's statuses are tracked/updated as they happen.
As for the keys, you should keep as much detail as possible. No need to delete keys if they may hold valuable information in the future.
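A minimal sketch of those two tables, assuming a daily end-of-day ETL; the table names and milestone names are invented for illustration, and the dimension keys and the SLA flag come straight from the question:

-- transaction fact table: one row per status event, even if several occur on the same business day
CREATE TABLE fact_transaction_status (
    date_key               INT NOT NULL,          -- end-of-day ETL load date
    transaction_id         INT NOT NULL,
    transaction_type_key   INT NOT NULL,
    department_key         INT NOT NULL,
    service_type_key       INT NOT NULL,
    service_key            INT NOT NULL,
    submission_channel_key INT NOT NULL,
    status                 VARCHAR(20) NOT NULL
);

-- accumulating snapshot: one row per transaction, milestone dates updated as each status is reached
CREATE TABLE fact_transaction_accumulating (
    transaction_id       INT PRIMARY KEY,
    submitted_date_key   INT NULL,
    in_progress_date_key INT NULL,
    completed_date_key   INT NULL,
    within_sla_flag      CHAR(1) NULL             -- the precomputed SLA flag from the question; it could also be derived at query time
);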

Related

Delphi: freeze and activate order

I am trying to make a small POS system, using an FDMemTable to hold the items for the customer.
If a customer forgets his money or goes to change a product, the line moves on and a new customer shows up asking to start a sale. How can I save or freeze the current state of the FDMemTable holding the previous customer's products, then start a new order and finish that sale? And when the first customer comes back, how can I resume the previous sales process and reactivate the previous FDMemTable with its items?
Add one more column to the FDMemTable. That column could take 3 values: 'Finalized', 'Opened' and 'Paused', depending on the "status" of the sale. You can then navigate between the records.
So if I understand you correctly, you want to be able to handle multiple customers or orders at the same time.
One solution would be to assign each customer a unique number and then add that number as an additional field to each record in your database. Assigning unique values to online customers isn't particularly difficult; in fact, you already have such information in the form of a session ID. But doing something like this for customers in a physical store is not feasible.
But there is another, similar solution. Most tax authorities require shops to issue a receipt for every purchase made, and each of those receipts needs a unique number. So you could reserve a new receipt number when a customer first gets to your POS and you start adding their items to the system. You could then keep another table with all reserved receipt numbers and their current status, so you know which receipts are finished, which are still pending completion and which were cancelled.
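A rough sketch of such a reservation table, assuming a SQL backing store and invented table, column and status names:

-- one row per reserved receipt number
CREATE TABLE receipt (
    receipt_number INT PRIMARY KEY,          -- reserved when the customer first reaches the POS
    status         VARCHAR(10) NOT NULL      -- 'pending', 'finished' or 'cancelled'
                   CHECK (status IN ('pending', 'finished', 'cancelled')),
    opened_at      TIMESTAMP NOT NULL,
    closed_at      TIMESTAMP NULL
);

-- line items attached to the reserved receipt
CREATE TABLE receipt_item (
    receipt_number INT NOT NULL REFERENCES receipt (receipt_number),
    product_id     INT NOT NULL,
    quantity       INT NOT NULL,
    unit_price     DECIMAL(10, 2) NOT NULL
);

Switching between customers then just means changing which pending receipt_number the POS screen loads into the FDMemTable.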

FactLoanVolume - One or Many Fact Tables

I am designing a Fact table to report on loan volume. The grain is one row per loan transaction. A loan has a few major milestones that we report on; in order of sequence, these are Lock Volume, Loan Funding Volume and Loan Sale Volume.
I have Lock Date, Loan Funding Date and Loan Sale Date as FKs in the Fact table (there are other dimensions in addition to these) pointing to role-playing dimensions off my DimDate table.
My question is: should I create separate Fact Tables to report volume for each major milestone, or should I keep all of this in one Fact Table and use a "far in the future" date (e.g. 12/31/2099) for a milestone on a loan that has not yet been met?
I have read the Kimball books but I didn't find a definitive answer (if one even exists).
Thanks
You may profit from an immutable design by setting the granularity at a finer level: the milestone.
This gives you columns
transaction_id
milestone_type
milestone_date
in your fact table. The current milestone of a transaction is the milestone from the last (most recent) record.
One advantage is that you may add new milestone types in the future, but the main gain is that you never update your fact table - you use inserts only.
You may safely roll back a bad ETL load simply by deleting the records, which is much more complicated when you use updates.
You may also implement more complicated state diagrams, e.g. when a milestone is revoked and the transaction falls back to the previous state.
Whether you use one fact table or more depends on whether your milestones are homogeneous or not. If the milestones have distinct attributes, you may get a cleaner design using dedicated fact tables, but the queries get more complicated.
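A sketch of that insert-only design, with assumed table and milestone names; the "current milestone" query simply picks the most recent row per transaction:

CREATE TABLE fact_loan_milestone (
    transaction_id INT NOT NULL,
    milestone_type VARCHAR(20) NOT NULL,    -- e.g. 'LOCK', 'FUNDING', 'SALE'
    milestone_date DATE NOT NULL
);

-- current milestone of each transaction = its most recent record
SELECT f.transaction_id, f.milestone_type, f.milestone_date
FROM fact_loan_milestone f
JOIN (
    SELECT transaction_id, MAX(milestone_date) AS max_date
    FROM fact_loan_milestone
    GROUP BY transaction_id
) latest
  ON latest.transaction_id = f.transaction_id
 AND latest.max_date = f.milestone_date;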
You should rather have only one Fact Table.
The following question and its discussion answer the general question of "One or multiple fact tables?" pretty well, but maybe not your specific problem of how to deal with the dates.

Firestore: Is it possible to have duplicate auto-generated IDs across different subcollections?

I have a collection of Shop; every shop has a subcollection of Item.
The Item document has a property isAvailable, which is a boolean.
Then, I need to put items in the user's shopping cart.
It's important to observe the item's isAvailable value, so I can inform users in real time that an item is no longer available and auto-remove it from all shopping carts.
So I decided to put an array of user IDs in the Item object and to create a duplicated list of all objects at the root level of the db to simulate an observable shopping cart (I thought it was a good way to structure the data for this purpose; if you have better ideas, just tell me).
My problem is: since I duplicate all the subcollections into a single collection and reuse the same document IDs, there may be duplicates in the final big collection, right?
In short, auto-generated IDs are statistically unique, with a good enough probability that you can rely on it in practice. See here.
Also, in Firestore the time-based component has been removed, so the IDs are no longer chronological, unlike in the Realtime Database.
Regarding your data structure, I wouldn't recommend duplicating: one of the benefits of Firestore is avoiding that, versus the Realtime Database, where in some cases you would need to.
Also, avoid arrays as much as you can and use objects (maps) instead, as you can query them.
As I understand it, you just want to make sure the items are available. I suggest you do a check when a user wants to proceed to checkout, or any time the page is refreshed; this way you ensure no unavailable product is purchased. That's it.
If you still have a problem, perhaps share a snapshot of your data rather than explaining it, something like
ShopsCollection
- itemDocument
  - isAvailable: true

Transaction lifecycle tracking in data warehouse

How do you store facts whose rows are related to each other? And how do you configure the measure? For example, I have a data warehouse that tracks the lifecycle of an order, which changes states - ordered, to shipped, to refunded. A state like 'refunded' is not always present. In my model I am employing the transaction store model, so every time the order changes state, another row is added to the fact table. So, for an order that was placed in April and refunded in May, there will be two rows - one with a state of 'ordered' and another with a state of 'refunded'. If the user wanted to see all the orders placed/ordered in April, and then see how many of those orders got refunded, how would he do that? Is this an MDX query that will be run at runtime? Is this a calculated measure I can store in the cube? How would I do that? My thought process is that it should be a fact that the user can use in a PivotTable, but I'm not sure.
One way to model this would be to create a factless fact table to model events. Your ORDERS fact table models the transaction amount, customer information etc, while the factless fact table (perhaps called ORDER_STATUS) models any events that occur in relation to a specific order.
With this model, it's easy to count or sum transactions by their order status, simply by checking for the existence of records in the factless fact table.
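For example, under this model a query like the following (column names and the year are assumed, not from the original post) answers "orders placed in April, and how many of those were refunded":

SELECT COUNT(DISTINCT o.order_id) AS orders_placed_in_april,
       COUNT(DISTINCT s.order_id) AS of_which_refunded
FROM   orders o
LEFT JOIN order_status s
       ON  s.order_id = o.order_id
       AND s.status   = 'refunded'          -- the refund event row in the factless fact table
WHERE  o.order_date >= DATE '2024-04-01'
  AND  o.order_date <  DATE '2024-05-01';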

Generating sequential numbers in a multi-user SaaS application

How do people generate auto-incrementing integers for a particular user in a typical SaaS application?
For example, the invoice numbers for all the invoices of a particular user should be auto-incrementing and start from 1. The Rails id field can't be used in this case, as it's shared amongst all the users.
Off the top of my head, I could count all the invoices a user has and then add 1, but does anyone know of a better solution?
A typical solution for any relational database would be a table like
user_invoice_numbers (user_id int primary key clustered, last_id int)
and a stored procedure or a SQL query (run as a single transaction) like
update user_invoice_numbers set last_id = last_id + 1 where user_id = #user_id
select last_id from user_invoice_numbers where user_id = #user_id
It will work for users (if each user has only a few simultaneously running transactions) but will not work for companies (for example, when you need companies_invoice_numbers), because transactions from different users inside the same company may block each other and this table will become a performance bottleneck.
The most important functional requirement you should check is whether your system is allowed to have gaps in invoice numbering or not. When you use a standard auto_increment, you allow gaps, because in most databases I know, when you roll back a transaction, the incremented number is not rolled back. With this in mind, you can improve performance using one of the following guidelines:
1) Exclude the procedure that you use for getting new numbers from long-running transactions. Let's suppose that the insert-into-invoice procedure is a long-running transaction with complex server-side logic. In this case you first acquire a new id, and then, in a separate transaction, insert the new invoice (see the sketch after this list). If that second transaction is rolled back, the auto-number will not decrease, but user_invoice_numbers will not be locked for a long time, so many simultaneous users can insert invoices at the same time.
2) Do not use a traditional transactional database to store the last id for each user. When you need to maintain a simple list of keys and values, there are lots of small but fast database engines that can do that work for you (see any list of key/value databases; memcached is probably the most popular). In the past, I have seen projects where simple key/value storage was implemented using the Windows Registry or even the file system: there was a directory where each file name was the key and each file contained the last id. This rough solution was still better than using a SQL table, because locks were acquired and released very quickly and were not involved in the transaction scope.
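A rough T-SQL-style sketch of guideline 1, assuming @user_id is a stored-procedure parameter and the invoices table/column names are invented; the reservation runs in its own short transaction so the lock on user_invoice_numbers is released immediately:

DECLARE @new_invoice_no INT;

BEGIN TRANSACTION;              -- short transaction: reserve the next number
    UPDATE user_invoice_numbers SET last_id = last_id + 1 WHERE user_id = @user_id;
    SELECT @new_invoice_no = last_id FROM user_invoice_numbers WHERE user_id = @user_id;
COMMIT;                         -- lock on user_invoice_numbers released here

BEGIN TRANSACTION;              -- separate, possibly long-running transaction
    INSERT INTO invoices (user_id, invoice_no, created_at)
    VALUES (@user_id, @new_invoice_no, GETDATE());
COMMIT;                         -- if this one rolls back, the reserved number is simply skipped (a gap)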
Well, if my proposed optimization seems overcomplicated for your project, forget about it for now, until you actually run into performance issues. In most projects the simple method with an additional table will work pretty fast.
You could introduce another table associated with your "users" table that tracks the most recent invoice number for a user. However, reading this value will result in a database query, so you might as well just get a count of the user's invoices and add one, as you suggested. Either way, it's a database hit.
If the invoice numbers are independent for each user/customer, then it seems like having a "lastInvoice" field in some persistent store (e.g. a DB record) associated with the user is pretty unavoidable. However, this could lead to some contention for the "latest" number.
Does it really matter if we send a user invoices 1, 2, 3 and 5, and never send them invoice 4? It helps if you can relax the requirement a bit.
If the requirement is actually "every invoice number must be unique", then we can look at all the normal id-generating tricks, and these can be quite efficient.
Ensuring that the numbers are sequential adds to the complexity; does it add to the business benefit?
I've just uploaded a gem that should resolve your need (a few years late is better than never!) :)
https://github.com/alisyed/sequenceid/
Not sure if this is the best solution, but you could store the last Invoice ID on the User and then use that to determine the next ID when creating a new Invoice for that User. But this simple solution may have integrity problems, so you will need to be careful.
Do you really want to generate the invoice IDs in an incremental format? Would this not open security holes (if a user can guess how invoice numbers are generated, they can change the number in a request, which may lead to information disclosure)?
I would ideally generate the numbers randomly (and keep track of used numbers). This prevents collisions as well (the chances of collision are reduced, as the numbers are allocated randomly over a range).
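One way to "keep track of used numbers", sketched in SQL with invented names, is to let a uniqueness constraint reject a duplicate random pick and have the application retry with a new number:

CREATE TABLE used_invoice_numbers (
    user_id        INT NOT NULL,
    invoice_number INT NOT NULL,
    PRIMARY KEY (user_id, invoice_number)    -- a duplicate random number fails the insert, so the caller retries
);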
