How to convert pivot column structure to dimension table - data-warehouse

I want to convert a table structure with pivoted columns into a dimension table and a fact table.
How can I create a medication dimension table from data with the structure below, with the model enforcing a star schema?

For some reason I cannot see the picture you uploaded to show the data structure. I am assuming that you want to create a relationship between a medicine dimension and a patient. If so, you need a bridge table between your medicine dimension and your fact table. The bridge table should have one row per combination of patient ID and medicine ID. That means if a patient is taking 4 medicines, there are 4 rows in the bridge table with the same patient ID but different medicine IDs. The patient ID should exist in the fact table for a given day, so that when the fact table is joined with the bridge table it shows all the medicines that patient is taking.
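A minimal sketch of the bridge-table idea, using Python's built-in sqlite3 module. Since the original data structure is not visible, every table name, column name and sample row here (dim_medicine, bridge_patient_medicine, fact_patient_day) is an assumption made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one row per medicine, one bridge row per
-- (patient, medicine) pair, one fact row per patient per day.
CREATE TABLE dim_medicine (medicine_id INTEGER PRIMARY KEY, medicine_name TEXT);
CREATE TABLE bridge_patient_medicine (
    patient_id  INTEGER,
    medicine_id INTEGER REFERENCES dim_medicine(medicine_id),
    PRIMARY KEY (patient_id, medicine_id)
);
CREATE TABLE fact_patient_day (patient_id INTEGER, visit_date TEXT);

INSERT INTO dim_medicine VALUES (1, 'Aspirin'), (2, 'Metformin'), (3, 'Insulin');
INSERT INTO bridge_patient_medicine VALUES (100, 1), (100, 2), (100, 3), (200, 1);
INSERT INTO fact_patient_day VALUES (100, '2024-01-01'), (200, '2024-01-01');
""")

def medicines_for(patient_id):
    """Join fact -> bridge -> dimension to list a patient's medicines."""
    rows = conn.execute(
        """
        SELECT m.medicine_name
        FROM fact_patient_day f
        JOIN bridge_patient_medicine b ON b.patient_id = f.patient_id
        JOIN dim_medicine m ON m.medicine_id = b.medicine_id
        WHERE f.patient_id = ?
        ORDER BY m.medicine_name
        """,
        (patient_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Note that joining through the bridge repeats the fact row once per medicine, so any measures on the fact table would need care when aggregated.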

Related

How to count cases with the same ID but different variables in SPSS

I have a data set which has 4420 attendances to a medical department from 1120 people. Each person has a unique ID number, and the other columns are demographics and primary care provider. I want to filter the data so I can work out how many times each person attends the department and then analyse the data by demographics, e.g. primary care provider or age. The data shows whether each attendance is primary or duplicate, but I can't figure out how to work out attendances per person.
If what you want to do is to count the number of times each person has visited (assuming each one is represented by a single row in the data), use the AGGREGATE command breaking on the ID variable to add the number of instances to the file as a new variable. In the menus, Data>Aggregate, move the ID variable into the box for Break Variable(s), check the box for Number of cases under Aggregated Variables, change the default N_BREAK to another name if you want, and click OK. That will add a new variable to the data with the number of instances for each unique ID.
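For readers outside SPSS, the same idea (count rows per ID and attach the count to every row) can be sketched in plain Python. This is not SPSS syntax; the field names and sample rows are invented for illustration, and n_break simply mirrors the default N_BREAK name mentioned above.

```python
from collections import Counter

# Hypothetical attendance rows: one dict per attendance.
attendances = [
    {"id": 1, "provider": "A"},
    {"id": 1, "provider": "A"},
    {"id": 2, "provider": "B"},
    {"id": 1, "provider": "A"},
]

def add_visit_counts(rows, key="id"):
    """Return copies of the rows with the per-ID row count added."""
    counts = Counter(row[key] for row in rows)
    return [{**row, "n_break": counts[row[key]]} for row in rows]

result = add_visit_counts(attendances)
```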

Split fact table because of one missing foreign key?

Imagine that we have two different messages:
CarDataLog
CarStatusLog
CarDataLog contains data about a car and is directly related to both the car and the corresponding Person.
CarStatusLog contains data about the same car (the one whose CarDataLog included a customer), but this time the data is a status, for example a field like "CleaningState": "NotCleaned" or "Cleaned".
Both log messages contain a Car_ID. Should we create one fact table with foreign keys to Car and Person, at the risk that person_id is sometimes null because it is not given? Or would a better approach be to create two fact tables, at the risk of having the grain spread out?
The use case would be: get data for a specific car, including the states it had and the Person first name.
I am new to data warehousing and I hope someone can assist me with this issue.
A standard practice in data warehousing is to make a dummy row for dimension tables that is used to match "UNKNOWN" data. This prevents NULLS in the foreign keys in the fact table.
Depending on your use case, you may have multiple types of "UNKNOWN" data. For example, you could use a key of -1 for "UNKNOWN" and -2 for "NOT APPLICABLE" dimensional data.
See also: https://www.kimballgroup.com/2010/10/design-tip-128-selecting-default-values-for-nulls/
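A minimal sketch of the "unknown member" approach, using Python's sqlite3 module. The table and column names (dim_person, fact_car_log, person_key) and the load_fact helper are assumptions for illustration; the keys -1 and -2 follow the example above.

```python
import sqlite3

UNKNOWN_KEY = -1          # person not supplied in the source data
NOT_APPLICABLE_KEY = -2   # a person does not apply (seeded, unused below)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_person (person_key INTEGER PRIMARY KEY, first_name TEXT NOT NULL);
CREATE TABLE fact_car_log (
    car_id     INTEGER NOT NULL,
    person_key INTEGER NOT NULL REFERENCES dim_person(person_key)
);
-- Seed the dummy rows before loading any facts.
INSERT INTO dim_person VALUES (-1, 'UNKNOWN'), (-2, 'NOT APPLICABLE'), (1, 'Alice');
""")

def load_fact(car_id, person_key):
    """Insert a fact row, substituting the UNKNOWN key for a missing person."""
    key = UNKNOWN_KEY if person_key is None else person_key
    conn.execute("INSERT INTO fact_car_log VALUES (?, ?)", (car_id, key))

load_fact(10, 1)
load_fact(11, None)  # no person given in the log

rows = conn.execute("""
    SELECT f.car_id, p.first_name
    FROM fact_car_log f
    JOIN dim_person p ON p.person_key = f.person_key
    ORDER BY f.car_id
""").fetchall()
```

Because the foreign key is never NULL, a plain inner join keeps every fact row, and the missing person shows up as UNKNOWN instead of silently dropping out.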
You need dimensions such as Car_dim, Person_dim, Status_dim (holding CleaningState values such as "NotCleaned" or "Cleaned"), and Date_dim. Person_dim can have an "Unknown" row to use when you get a null person name.
Dimension and fact tables have a parent/child relationship, which means you have to load data into the dimensions first (a dimension is the parent) and then load the fact (child) table.
Load the dimension IDs from the dimensions above into your fact table based on the data you get. Make sure both logs have date fields so you can join them on Car_id where the dates in both logs match for that Car_id.
If a Car_id exists in CarDataLog but not in CarStatusLog, you need to create an "Unknown Status" row in Status_dim so you can use it in the fact table. Good luck!
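The matching step described above can be sketched in plain Python. The field names (car_id, date, person, cleaning_state) and the sample log entries are assumptions; a real load would look up surrogate keys in the dimensions rather than carry the text values.

```python
# Hypothetical parsed log messages.
data_log = [
    {"car_id": 1, "date": "2024-01-01", "person": "Alice"},
    {"car_id": 2, "date": "2024-01-01", "person": None},
]
status_log = [
    {"car_id": 1, "date": "2024-01-01", "cleaning_state": "Cleaned"},
]

def build_fact_rows(data_log, status_log):
    """Match the two logs on (car_id, date), defaulting missing values."""
    status_by_key = {
        (s["car_id"], s["date"]): s["cleaning_state"] for s in status_log
    }
    facts = []
    for d in data_log:
        facts.append({
            "car_id": d["car_id"],
            "date": d["date"],
            # Fall back to the dummy dimension members when data is missing.
            "person": d["person"] or "Unknown",
            "status": status_by_key.get((d["car_id"], d["date"]), "Unknown Status"),
        })
    return facts

facts = build_fact_rows(data_log, status_log)
```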

How to retrieve all of one table and all joined records in another with the other table's columns in ActiveRecord

I would like to retrieve all of one table and all joined records in another.
I would like to have all columns from both tables.
This is extremely simple in SQL,
e.g.
SELECT *
FROM students
JOIN teachers
ON students.id = teachers.student_id
How can I do the same in Rails?
I've tried variations on
Student.includes(:teacher)
and
Student.joins(:teacher).includes(:teacher)
The join is working, but I cannot access columns from the Teacher table.
Note that the end goal is simply to create an instance variable in the controller so that I can access both student and teacher data in the view.
Student.includes(:teacher) returns an ActiveRecord::Relation, which means each object you take from this collection is a Student instance.
Unlike the SQL query, which returns data from both tables, in Rails you only get the columns of the students table, because the result represents the Student model; the associated record in the teachers table is reached through the association.
You can access the teacher's data like this:
students = Student.includes(:teacher)
students.last.teacher.name
In the above, no new query is fired against the database when you call the teacher association on an object, because includes has already loaded it.

How to store data in fact table with multiple products in an order in data warehouse

I am trying to design a dimensional model for a data warehouse for one of my projects (Sales Order). I'm new to this concept.
So far, I understand that the product, customer and date can be stored in dimension tables and the order info will be in the fact table.
Date_dimension table structure will be
date_dim_id, date, week_number, month_number
Product_dimension table structure will be
product_dim_id, product_name, desc, sku
Order_fact table structure will be
order_id, product_dim_id(fk), date_dim_id(fk), order_quantity, order_total_price, etc
If an order is placed with 2 or more products, will there be repeated entries in the order_fact table for the same order_id and date_dim_id?
Please help with this; I'm confused here. I know that in a relational database, the order table will have one entry per order, and the relation between product and order will be maintained in a separate table having order_id and product_id as foreign keys.
Thanks in advance.
This is a classic case where you should (probably) have two fact tables:
FactOrderHeader and FactOrderDetail.
FactOrderHeader will have a record for each order, storing information regarding the value of the order and any order-level discounts, though in some cases those could be expressed as an OrderDetail record.
FactOrderDetail will have a record for each order line, storing information regarding the product, product cost, product sale price, number of items, item discount, etc.
You may need a DimOrderHeader as well, if there are non-fact pieces of information that you want to store, for example the date the order was taken, delivered, or paid.
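A minimal sketch of the header/detail split, using Python's sqlite3 module. The column names are adapted from the question (order_id, product_dim_id, date_dim_id, order_quantity, order_total_price); line_price and the sample values are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Grain: one row per order.
CREATE TABLE fact_order_header (
    order_id          INTEGER PRIMARY KEY,
    date_dim_id       INTEGER,
    order_total_price REAL
);
-- Grain: one row per order line (order + product).
CREATE TABLE fact_order_detail (
    order_id       INTEGER,
    product_dim_id INTEGER,
    date_dim_id    INTEGER,
    order_quantity INTEGER,
    line_price     REAL,
    PRIMARY KEY (order_id, product_dim_id)
);

-- One order containing two products: two detail rows, same order_id.
INSERT INTO fact_order_detail VALUES
    (1, 10, 20240101, 2, 20.0),
    (1, 11, 20240101, 1, 5.5);

-- The header row carries the order-level total.
INSERT INTO fact_order_header
    SELECT order_id, date_dim_id, SUM(line_price)
    FROM fact_order_detail
    GROUP BY order_id, date_dim_id;
""")

header = conn.execute(
    "SELECT order_id, order_total_price FROM fact_order_header"
).fetchall()
line_count = conn.execute(
    "SELECT COUNT(*) FROM fact_order_detail WHERE order_id = 1"
).fetchone()[0]
```

So yes, the same order_id and date_dim_id repeat in the detail table, once per product; that repetition is the order-line grain, while order-level measures live in the header table.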

Creating relationship in Core Data to perform deletion

I am new to Core Data and have just started learning its concepts.
I have a Core Data database with three entities: Student, Department, and an entity for mapping students to departments. Let's call it StudentDepartment.
Student will have all student details, with a primary key studentID.
Department will have department details, with a primary key departmentID.
StudentDepartment will have studentID and departmentID as foreign keys.
Multiple students can be enrolled in a department, and the same student can be enrolled in multiple departments.
How do I create this schema in Core Data?
If I delete a studentID in the Student table, the corresponding rows should be deleted in the StudentDepartment table. Similarly, if I delete a departmentID in the Department table, the corresponding rows should be deleted in StudentDepartment. How do I set up this relationship using Core Data?
Please provide me with an xcmodel.
Core Data isn't a database; it's an object store that happens to (sometimes) be implemented on top of a relational database.
The practical result of that is that you really don't need to explicitly create a separate table for relationship mapping. Instead you create your two entities and then create a relationship between the two. From your description, it sounds like you want a many-to-many relationship between the two. At an implementation level, core data will magically create the needed relationship table.
Additionally, you can establish a delete rule for each side of the relationship that dictates what to do when an item is deleted. In this case, you'll want to set the delete rule for both sides to Nullify, which will break the relationship when either end is deleted.
