I'm creating a dimensional model about a "calls recording system", for a VoIP service.
I'll give just a small example to illustrate my question.
Suppose I have a fact that represents a single call, a dimension called Client, and another one called Provider. (Pretend that there are other dimensions too, such as Date.)
(Dimension)Client ---> (Fact)Call <--- (Dimension)Provider
With this, I'll be able to see how many calls a client made, how many calls were sent through a provider, and answer other similar questions.
And let's suppose that each client is associated with one provider, and one provider can have many clients.
So, here comes the question. How can I create a query like: which clients does each provider have?
It seems to be a query that involves only the two dimensions. I can't involve the fact in it, because if a client has never used the service, he won't be in the calls fact table, and he won't appear in this "Clients per Provider" query.
I was thinking that one way to do this would be to create a role-playing dimension: a view of the Client dimension, attached directly to the Provider dimension, just for queries like this. It would be something like this:
(Dimension)Client ---> (Fact)Call <--- (Dimension)Provider <--- (Dimension)View Client
Of course, with this approach the user must be very careful not to use this View Client dimension with the fact table, because it would duplicate fact rows.
So, is this one of the situations where I need to use the famous factless fact tables?
What's the right way to do this?
Thanks!
Role-playing dimensions should be used when you are "recycling" a dimension to be used multiple times in the same fact table (e.g. Date of Call, Date of Service, etc.).
It doesn't sound like that's what you're looking for. Instead, if the relationship is truly one to many, then I would just add the provider ID directly on the client dimension (no need for a view or anything), with the recognition that this relationship has nothing to do with the facts.
Essentially, think of the "provider" as just an attribute snowflaked off of client, when it comes to this sort of query.
However, it sounds like you might want to be sure that you don't have a many to many relationship between Clients and Providers (a client can use multiple providers, and a provider can have multiple clients). A many-to-many relationship is modeled dimensionally as a fact table. Your fact table could be a snapshot of the current point in time, with or without history. Just two columns are needed, Client and Provider. If you wanted to keep a record of the client/provider relationship by some timeframe, you'd just add a date stamp.
Note that a factless fact table will work to model the one-to-many relationship as well (and if the model changes on the back end, your ETL is already done).
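To make the first suggestion concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_client, dim_provider, fact_call, provider_key and so on) and the sample rows are assumptions made up for illustration, not part of the original model. The point is only that "clients per provider" joins the two dimensions and never touches the fact table, so clients with no calls still show up.

```python
import sqlite3

# Hypothetical schema: the provider key is stored directly on the client
# dimension, independent of the call fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_provider (
    provider_key INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE dim_client (
    client_key INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    provider_key INTEGER REFERENCES dim_provider(provider_key)
);
CREATE TABLE fact_call (
    call_id INTEGER PRIMARY KEY,
    client_key INTEGER REFERENCES dim_client(client_key),
    provider_key INTEGER REFERENCES dim_provider(provider_key),
    duration_seconds INTEGER
);
INSERT INTO dim_provider VALUES (1, 'ProviderA'), (2, 'ProviderB');
INSERT INTO dim_client VALUES (1, 'Alice', 1), (2, 'Bob', 1), (3, 'Carol', 2);
-- Only Alice has made calls; Bob and Carol have never used the service.
INSERT INTO fact_call VALUES (1, 1, 1, 60), (2, 1, 1, 120);
""")


def clients_per_provider(conn):
    """Group client names by provider using only the two dimensions."""
    rows = conn.execute("""
        SELECT p.name, c.name
        FROM dim_provider AS p
        JOIN dim_client AS c ON c.provider_key = p.provider_key
        ORDER BY p.name, c.name
    """).fetchall()
    grouped = {}
    for provider_name, client_name in rows:
        grouped.setdefault(provider_name, []).append(client_name)
    return grouped


clients_by_provider = clients_per_provider(conn)
print(clients_by_provider)
```

If the relationship later turns out to be many-to-many, the same query could be pointed at a two-column Client/Provider factless fact table instead of the provider_key column.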
When it comes to a good RESTful setup, what is the best practice for providing results that belong to the requester, as opposed to results for a user who wants data owned by another user?
I have read that a resource should have at most 2 base URLs, so how should I handle, say:
Get all items for authenticated user
Get a single item for Authenticated user
Get all items for a particular user
Get a single item for a particular user
Although your question is a bit unclear, it seems to me you might mix up "Resources" as in HTTP resources, and Model objects or database rows.
The two do not necessarily have a 1-to-1 relationship, or even a 1-to-2 relationship as you seem to imply. You can expose a database row in multiple "forms" as resources; there is no limit on how many times you can aggregate, transform or publish the same information, as long as those are all semantically different things.
So, back to your problem. You can publish resources pertaining to the authenticated user and, independently, resources for users in general, which might also include the current user. For example, with a URI structure like this:
/currentuser
/user/1
/user/2 <- might be the same as /currentuser
/user/3
...
There also could be a list of users recently logged in:
/recentuser/444
/recentuser/445 <- might be again /currentuser
...
That would be a third reference on the same user, but it is ok, because all of those have a different meaning, might even have different representations to offer (one might offer more information than others).
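As a rough, framework-free sketch of that idea in Python: the USERS store, the resolve function and the email field below are all hypothetical names invented for illustration. Two different URIs end up backed by the same stored record, but each one returns its own representation.

```python
# Hypothetical in-memory user store standing in for database rows.
USERS = {
    1: {"id": 1, "name": "ann", "email": "ann@example.com"},
    2: {"id": 2, "name": "ben", "email": "ben@example.com"},
    3: {"id": 3, "name": "cy", "email": "cy@example.com"},
}


def resolve(path, authenticated_user_id):
    """Map a request path to a representation, or None if nothing matches."""
    if path == "/currentuser":
        # The owner sees the fuller representation of their own record.
        user = USERS[authenticated_user_id]
        return {"id": user["id"], "name": user["name"], "email": user["email"]}
    if path.startswith("/user/"):
        raw_id = path[len("/user/"):]
        if not raw_id.isdigit():
            return None
        user = USERS.get(int(raw_id))
        if user is None:
            return None
        # Anyone asking for /user/<id> gets the reduced, public form.
        return {"id": user["id"], "name": user["name"]}
    return None


current = resolve("/currentuser", authenticated_user_id=2)
public = resolve("/user/2", authenticated_user_id=2)
print(current)
print(public)
```

Both calls read the same underlying row, yet they are distinct resources with distinct meanings, which is why the number of URIs per row is not something you need to cap.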
First of all, I want to apologize for my terminology. I am not entirely sure what to call what I am looking for, so I can’t google for answers. But here is my problem.
I am working on a Rails application that stores information about different websites and provides various services for them. I will call these services ‘Products.’ One website can be subscribed to several products, and a product can be served to various websites. So here is a very simple association scheme for these relationships:
At least, it would have been simple, but the problem is that the Settings model (shown in red on this diagram) is different for each product: for one product, it will have one number of fields and data types, for another it will have a different number of fields with different data types. On the other hand, the Faq and Description are the same, so if I redraw the diagram as follows:
I will get another problem: too much repetition (shown in blue on the diagram). Ideally, I want some kind of modification of the first diagram, where the Product model will choose different Settings models depending on a parameter that I pass to it:
So that a request website.products.find(1).settings will return the model Settings1, while a request website.products.find(2).settings will return a completely different model, Settings2.
Is this achievable in Rails? If not, how would you organize such data?
I am trying to learn RavenDB by replacing my RDBMS in a project that I've already worked on so that I'm using it in a real situation. I've come to a standstill while trying to create the database, and I'd love to know the best way to model this in a document database. Every possibility I come up with either ends up looking like a relational database or ends up repeating vast amounts of information. Repeating the information in the database isn't a big deal, but keeping it all up to date when changes occur would be.
I'm hoping that I'm stuck in SQL mode and I'm just completely unable to see an obvious answer.
Here are the basic objects I need to record data for:
-Event
-Person
-Organization
-Cabin
Simple Requirements:
-A person can be a part of multiple organizations.
-An organization can have many members (people).
-A person can attend multiple events.
-An event has many people that attend.
-Some details about a cabin may change depending on the event (e.g. Accommodations).
Complex Requirements:
-I need to be able to reserve cabins for an event so that a single cabin is not used by two events at once. (with RDBMS I would just create an "EventCabins" table).
-I need to be able to record which people are attending an event. People attending an event will have information associated with them that is not part of Person or Event.
-I need to be able to record which organizations are attending an event. Organizations attending will have information associated with them that is not part of Organization or Event.
-I need to be able to record which People are assigned to which cabins in a particular event.
-I need to be able to record which People are attending a particular event as a part of an organization (it's not required to attend as a part of an organization). Even though a person can belong to more than one organization, he/she can only attend as a part of one of those organizations for a particular event. He/she might attend as a part of a different organization for another event.
-In the program, the user will be looking at only one event at a time. In that event, the user can look at attenders grouped by cabin or grouped by organization.
It seems obvious that I will need separate collections for Events, People, Organizations, and Cabins. Fulfilling the complex requirements above is where I hit the wall.
Do I put Attenders inside the Event collection? If so, then what do I do with Cabins and Organizations?
Do I create a separate collection for Attenders? If so, then there will be 4 different related collections that I will need to store Ids for and query at various times (Organizations, Cabins, Events, People). This seems opposite of the document database approach.
Thanks!
It seems to me that you should just use a relational database for this project.
If you want to use RavenDB, I would suggest using completely separate collections for all of these objects, while keeping references to the other documents. Then you could query the database using the .Include functionality. The best way would be to create map/reduce indexes for each of the cases you need, such as an index that returns an Event object filled in with all of the invited people.
I'm in the middle of a fictional scenario project where I have allowed multiple users of a company to log in, create records, and so on, all connecting to the one database. They can all record absence records, attendance records, and so on.
What I want to do, however, is use this same schema but expand it to allow several companies to have their own databases using the same schema. So each company will have its own data, but all companies use the same data model. In other words, all companies can create absence records, but each only has access to the absence records it created itself.
How can I achieve this?
All I need is two or three files for this, I'm not going commercial with it in case you guys think I'm cutting corners at someone else's expense!
Something as simple as an if-else that decides which file to use would be very useful to me, so if such a line of code exists please let me know.
I think you are doing it wrong (unless you have a really good reason to have a database for each company), because it seems like you are repeating your data model over and over while introducing unnecessary complexity to your code.
Try to keep all the companies in one database, in shared tables, with the rows separated by company_id.
For example, the data structure would be as follows:
companies table
id
name
users table
id
user_name
company_id
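As a rough sketch of how that layout keeps each company's data apart, here is a small Python example using the built-in sqlite3 module. The absence_records table, its columns, and the sample rows are assumptions added for illustration; only companies, users and company_id come from the structure above.

```python
import sqlite3

# One shared database; every tenant-owned row carries a company_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL,
    company_id INTEGER NOT NULL REFERENCES companies(id)
);
-- Hypothetical table for the absence records mentioned in the question.
CREATE TABLE absence_records (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    company_id INTEGER NOT NULL REFERENCES companies(id),
    absent_on TEXT NOT NULL
);
INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO users VALUES (1, 'alice', 1), (2, 'bob', 2);
INSERT INTO absence_records VALUES
    (1, 1, 1, '2024-01-05'),
    (2, 2, 2, '2024-01-06'),
    (3, 1, 1, '2024-02-10');
""")


def absences_for_company(conn, company_id):
    """Return (user_name, absent_on) rows visible to one company only."""
    return conn.execute(
        """
        SELECT u.user_name, a.absent_on
        FROM absence_records AS a
        JOIN users AS u ON u.id = a.user_id
        WHERE a.company_id = ?
        ORDER BY a.absent_on
        """,
        (company_id,),
    ).fetchall()


acme_absences = absences_for_company(conn, 1)
globex_absences = absences_for_company(conn, 2)
print(acme_absences)
print(globex_absences)
```

The filter on company_id would normally be taken from the logged-in user's own company_id, so that every query is scoped without any per-company branching in the code.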
However if you really want to connect to multiple databases, check this SO question.
I'm learning Rails by building a simple site where users can create articles and comment on those articles. I have a view which lists a user's most recent articles and comments. Now I'd like to add user 'profiles' where users can enter information like their location, age and a short biography. I'm wondering if this profile should be a separate model/resource (I already have quite a lot of fields in my user model because I'm using Authlogic and most of its optional fields).
What are the pros and cons of using a separate resource?
I'd recommend keeping profile columns in the User model for clarity and simplicity. If you find that you're only using certain fields, only select the columns you need using :select.
If you later find that you need a separate table for some reason (e.g. one user can have multiple profiles) it shouldn't be a lot of work to split them out.
I've made the mistake of having two tables and it didn't buy me anything but additional complexity.
Pros: It simplifies each model
Cons: Managing 2 at once is slightly harder
It basically comes down to how big the user and profile are. If the user is 5 fields, and the profile 3, there is no point. But if the user is 12 fields, and the profile 20, then you definitely should.
I think you'd be best served putting in a separate model. Think about how the models correspond to database tables, and then how you read those for the various use cases your app supports.
If a user only dips in to his actual profile once in a while but the User model is accessed frequently, you should definitely make it a separate object with a one-to-one relationship. If the profile data is needed every time the User data is needed, you might want to stick them in the same table.
Maybe the location is needed every time you display the user (say on a comment they left), but the biography should be a different model? You'll have to figure out the right breakdown, but the general rule is to structure things so you don't have to pull data that isn't being used right away.
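To illustrate the trade-off at the table level, here is a minimal sketch in Python with the built-in sqlite3 module rather than Rails itself. The table layout, column names and sample data are assumptions for illustration: the frequently needed location stays on users, while the rarely read fields live in a one-to-one profiles table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    login TEXT NOT NULL,
    location TEXT              -- shown next to every comment, so kept here
);
-- One-to-one: the primary key doubles as the foreign key to users.
CREATE TABLE profiles (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),
    age INTEGER,
    biography TEXT
);
INSERT INTO users VALUES (1, 'alice', 'Lisbon'), (2, 'bob', 'Oslo');
-- bob never filled out a profile, so he simply has no profiles row.
INSERT INTO profiles VALUES (1, 30, 'Writes about databases.');
""")


def user_summary(conn, user_id):
    """Cheap lookup for the common case: no profile data is read."""
    return conn.execute(
        "SELECT login, location FROM users WHERE id = ?", (user_id,)
    ).fetchone()


def user_with_profile(conn, user_id):
    """Full lookup for the profile page; the LEFT JOIN tolerates a missing profile."""
    return conn.execute(
        """
        SELECT u.login, u.location, p.age, p.biography
        FROM users AS u
        LEFT JOIN profiles AS p ON p.user_id = u.id
        WHERE u.id = ?
        """,
        (user_id,),
    ).fetchone()


print(user_summary(conn, 1))
print(user_with_profile(conn, 1))
print(user_with_profile(conn, 2))
```

Whether the extra join is worth it depends on how often the profile is actually read alongside the user, which is the judgment call described above.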
A user "owns" various resources on your site, such as comments, etc. If you separate the profile from the user then it's just one more resource. The user is static, while the profile will change from time to time.
Separating it out would also allow you to easily maintain a profile history.
I would keep it separate. Not all your users would want to fill out a profile, so those would be empty fields sitting in your user table. It also means you can change the profile fields without changing any of the logic of your user model.
Depends on the width of the existing user table. Databases typically have a limit on the number of bytes a record can contain. If you are close to the limit (or over it, which is usually possible if you have lots of fields with null values), I would add a table with a one-to-one relationship, for better performance and less likelihood of a record that suddenly can't be inserted because there is too much data for the row size. If you are nowhere near the limit, then add to the existing table.