I'm really struggling with this problem, would love some additional thoughts. Here's the basic context:
Users can both list items to be lent out, and make requests for items to borrow
Requests are posted by users who want to borrow something. Each request may contain several items
Items are predefined (i.e., POST form is a checkbox) and listed by users who want to lend them out, and are in turn borrowed by other users who have submitted a request
Workflow:
John is a user and submits a request for a tent from 6/5 to 6/8
The controller looks for all users (besides John) who own tents that are available from 6/5 to 6/8
This list of users are contacted to see who wants to provide a tent for John
Whoever responds affirmative to John first has their item in the table automatically updated to be no longer available from 6/5 to 6/8
John and the other user are connected to make the exchange happen
So far my thinking:
Users table has_many :items and has_many :requests
Requests table belongs_to :user
Items table belongs_to :user
Additional complexities that my brain can't seem to process:
One request can contain multiple items, and I've been advised against storing the items as a serialized array in one cell, so I'm not sure how to relate request and item. In the tables above, should items also belongs_to requests? If yes, that seems to imply that a user has to make a request for a specific item, whereas I want the user to be able to search for "tent" and see a list of all the users who have tents that are available
Requests contain start_date and end_date attributes, that somehow need to be compared to when the item is available. Right now I'm thinking in the items table, there needs to be a column that stores the dates when the item will be in use (i.e., not available). But then this data will be an array again. For example, a tent might be requested (and the user responds OK) from 6/5 to 6/8 and then again from 6/10 to 6/15, and then again from 7/8 to 7/9. So do I need a fourth table??
Items will be a predefined list, e.g., tent, sleeping bag, sleeping pad. In this way, I'm wondering do I actually need a has_and_belongs_to_many relationship with users, since a tent could belong to many users, and a user could have many tents.
Sorry if this sounds like a ramble... I've been sitting here for 4 hours with many sheets of paper and scribbles and this is not getting any clearer...
You need to understand the basic way of recording information relationally, i.e., in tables.
Finding sufficient tables
Just have a base table for every statement you need to describe a business situation:
User(user_id,name,...)
// User [user_id] is named [name]
Contacted(contact_id,item,offer_id)
// user [contact_id] was contacted re item [item] offered by user [offer_id]
...etc...
The parameters of the statement are the columns of the table.
If you want to talk about the parts of something that you think of as having multiple parts (heterogeneous or homogeneous), that just means that some statements will involve a thing and its parts:
table request(request_id,start_date,end_date,...)
// [request_id] goes from [start_date] to [end_date] and ...
table requested(request_id,item_id,person_id,...)
// person [person_id] requested item [item_id] in request [request_id]
What's in a table
A base table's value is the rows that make its statement true. (Every query subexpression also has a statement, and its value is the rows that make its statement true.)
Don't confuse table statements with business rules. Business rules state truths. But a table statement is a statement that some tuple makes true (and goes in the table) or false (is left out of the table). All the true and false statements from the tables tell you everything you need to know about the business. The business rules will never contradict them. (Since they're always true.)
Rearranging to better tables
A key is a set of columns such that every other column is a function of them, and such that no proper subset of them has that property. A table can have more than one key.
To make a database easier to update and query you should break up certain statements that are other statements joined by AND. Break up until each statement consists of a statement only about key columns ANDed with statements of this form:
[my_column]=my_function([key_k_column_1],[key_k_column_2],...)
where key_k_column_1,... are columns of the same key key_k.
(Such a table is "in fifth normal form" and the topic is "normalization".)
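Applied to the lending question, "one table per statement" suggests recording each loan period as its own row rather than as an array column on items. Here is a minimal plain-Ruby sketch of that idea. The Item and Booking row types, their column names, and the available_items helper are all made up for illustration, and the Structs only stand in for real tables:

```ruby
require "date"

# Hypothetical row types, one per table statement.
# Item: user [user_id] owns item [item_id], which is a [kind]
Item = Struct.new(:item_id, :user_id, :kind)
# Booking: item [item_id] is lent out from [start_date] to [end_date]
Booking = Struct.new(:item_id, :start_date, :end_date)

# Two inclusive date ranges overlap when each one starts
# on or before the day the other one ends.
def overlaps?(booking, start_date, end_date)
  booking.start_date <= end_date && start_date <= booking.end_date
end

# Items of the requested kind, not owned by the requester,
# that have no booking overlapping the requested dates.
def available_items(items, bookings, kind:, requester_id:, start_date:, end_date:)
  items.select do |item|
    item.kind == kind &&
      item.user_id != requester_id &&
      bookings.none? { |b| b.item_id == item.item_id && overlaps?(b, start_date, end_date) }
  end
end
```

Each accepted request just adds one Booking row, so a tent lent out three separate times has three rows and no array column is needed.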
I'm building a Ruby on Rails App for a business and will be utilizing an ActiveRecord database. My question really has to do with Database Architecture and really the best way I should organize all the different tables and models within my app. So the App I'm building is going to have a database of orders for an ECommerce Business that sells products through 2 different channels, a subscription service where they pick the products and sell it for a fixed monthly fee and a traditional ECommerce channel, where customers pay for their products directly. So essentially while all of these would be classified as the Order model, there are two types of Orders: Subscription Order and Regular Order.
So initially I thought I would classify all this activity in my Orders Table and include a field 'Type' that would indicate whether it is a subscription order or a regular order. My issue is that there are a bunch of fields that I would need that would be specific to each type. For instance, transaction_id, batch_id and sub_id are all fields that would only be present if that order type was a subscription, and conversely would be absent if the order type was regular.
My question is, would it be in my best interest to just create two separate tables, one for subscription orders and one for regular orders? Or is there a way that fields could only appear conditional on what the Type field is? I would hate to see so many Nil values, for instance, if the order type was a regular order.
Sorry this question isn't as technical as it is just pertaining to best practice and organization.
Thanks,
Sunny
What you've described is a pattern called Single Table Inheritance — aka, having one table store data for different types of objects with different behavior.
Generally, people will tell you not to do it, since it leads to a lot of empty fields in your database which will hurt performance long term. It also just looks gross.
You should probably store the data in separate tables instead. If you want to get fancy, you can try to implement Class Table Inheritance, in which there are separate but connected tables for each of the child classes. This isn't supported natively by ActiveRecord. This gem and this gem might be able to help you, but I've never used either, so I can't give you a firm recommendation.
I would keep all of my orders in one table. You could create a second table for "subscription order information" that would only contain the columns transaction_id, batch_id and sub_id as well as a primary key to link it back to the main orders table. You would still want to include an order type column in the main database though to make it a little easier when debugging.
Assuming you're using Postgres, I might lean towards an Hstore for that.
Some reading:
http://www.devmynd.com/blog/2013-3-single-table-inheritance-hstore-lovely-combination
https://github.com/devmynd/hstore_accessor
Make an integer column called order_type.
In the model do:
SUBSCRIPTION = 0
ONLINE = 1
...
It'll query better than strings, and whenever you want to refer to one you write Order::SUBSCRIPTION.
Make two or more other tables, each with a foreign key equal to the ID of the corresponding row in orders.
Now you can keep all shared data in the orders table, for easy querying, and all unique data in the other tables so you don't have bloated models.
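A rough plain-Ruby sketch of that split, with no ActiveRecord involved. The SubscriptionDetail name and the subscription_detail_for helper are made up for illustration; only the constants and the three subscription columns come from the question and answer above:

```ruby
# Shared data lives on the order; order_type is an integer code.
class Order
  SUBSCRIPTION = 0
  ONLINE = 1

  attr_reader :id, :order_type

  def initialize(id:, order_type:)
    @id = id
    @order_type = order_type
  end

  def subscription?
    order_type == SUBSCRIPTION
  end
end

# Subscription-only columns, keyed by the id of the owning order
# (order_id plays the role of the foreign key).
SubscriptionDetail = Struct.new(:order_id, :transaction_id, :batch_id, :sub_id)

# Finds the detail row for an order; regular orders simply have none,
# so no nil-filled columns are needed on the orders side.
def subscription_detail_for(order, details)
  return nil unless order.subscription?

  details.find { |detail| detail.order_id == order.id }
end
```

In a real Rails app the lookup would presumably be an association on the model rather than a helper like this, but the shape of the data is the same.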
I'm trying to avoid client-side cookies, because a user who switches browsers would be shown the tour of the pages in my app again.
The user should only see the tour the first time they visit a page. So I'm thinking of a table like this:
table: users x pages_that_viewed
user_id seen_page seen_profile seen_another_page...
12 2012-12-12
13 2012-12-12 2012-12-12
Then I will have to join the table users with this one every time...
Another solution would be adding these columns directly to the users table.
Depending on how many pages you have and how many you are planning to add, each additional page will require a database schema modification. Generally speaking, that's not a good idea. You can create one record per user per page instead, with an index on user_id, page_id. If you want to know when they accessed it, a third field would be a date; otherwise a boolean will do. Another alternative is to use a bitmap. That will not work if you need dates, but it only takes 1 record per page to keep track of all user visits.
A bitmap field would store something like 001000100011110010, where each digit represents a user_id and records a visit to that page. For example, user_id 12 would be the 12th digit, either 0 or 1. On a visit you would update 1 field and set the Nth digit to 1. Bitmaps are generally very fast and support some additional operations like unions and intersections.
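To make the bitmap idea concrete, here is a small sketch using a plain Ruby Integer as the bitmap (Ruby integers are arbitrary precision, so the bitmap can grow with the highest user_id). The helper names are made up, and how the value is actually stored in the database column is left out:

```ruby
# One Integer per page; bit N is 1 once user_id N has seen that page.

# Returns a new bitmap with the bit for user_id set.
def mark_seen(bitmap, user_id)
  bitmap | (1 << user_id)
end

# True when the bit for user_id is set.
def seen?(bitmap, user_id)
  bitmap[user_id] == 1
end

# Ids of all users whose bit is set, lowest first.
def user_ids(bitmap)
  (0...bitmap.bit_length).select { |id| bitmap[id] == 1 }
end
```

With two page bitmaps a and b, a & b gives the users who have seen both pages and a | b gives the users who have seen either one.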
I think a column would be more cost-effective, with far fewer queries...
Try a boolean tour column; then, when someone logs in to your app, just fetch the column and store it in the session:
session[:tour] ||= @user.tour
After reading your last comment I came up with this idea: make the relation between users and pages a many-to-many association. You would have a junction table pages_users with columns user_id and page_id, and at each render you check whether the current user has the current page. I think that is the best way to go.
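A minimal sketch of the one-row-per-user-per-page check, using an in-memory Hash as a stand-in for the pages_users junction table. The PageViews class and its method names are made up for illustration; the Hash key plays the role of a unique index on (user_id, page_id):

```ruby
require "date"

# Hypothetical stand-in for a pages_users junction table.
class PageViews
  def initialize
    @rows = {} # [user_id, page_id] => date the page was first seen
  end

  # Records the first visit only. Returns true when this is the first
  # time the user has seen the page, i.e. when the tour should be shown.
  def first_visit?(user_id, page_id, today = Date.today)
    key = [user_id, page_id]
    return false if @rows.key?(key)

    @rows[key] = today
    true
  end

  # Date of the first visit, or nil if the user has not seen the page.
  def seen_on(user_id, page_id)
    @rows[[user_id, page_id]]
  end
end
```

Adding a new page to the tour needs no schema change here, only new rows.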
I have two tables:
"sites" has_many "users"
"users" belongs_to "sites"
Is it better to add a column called users_count to the sites table and increment it by one whenever a user is added to a site? Or is doing a conditional count on the users table the best way?
"Better" is a subjective term.
However, I'll be adamant about this. There should not be two sources of the same information in a database, simply because they may get out of step.
The definitive way to discover how many users belong to a site is to use count to count them.
Third normal form requires that every non-key attribute depends on the key, the whole key, and nothing but the key (so help me, Codd).
If you add a user count to sites, that does not depend solely on the sites key value, it also depends on information in other tables.
You can revert from third normal form for performance if you understand the implications and mitigate the possibility of inconsistent data (such as with triggers) but the vast majority of cases should remain 3NF.
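As a tiny illustration of "count them" versus a stored counter, here is a plain-Ruby sketch in which the count is always derived from the user rows. The Site and User Structs and both helper names are made up for illustration and only stand in for the two tables:

```ruby
Site = Struct.new(:id, :name)
User = Struct.new(:id, :site_id)

# The count is computed from the users rows themselves,
# so it can never be out of step with them.
def users_count(site, users)
  users.count { |user| user.site_id == site.id }
end

# Counts for every site in one pass, keyed by site_id.
def users_count_by_site(users)
  users.each_with_object(Hash.new(0)) { |user, counts| counts[user.site_id] += 1 }
end
```

A stored users_count would need every insert, delete, and site change on users to update it as well, which is where the two sources can drift apart.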
I've got Posts documents that belong to Users, and Users have an :approved attribute. How can I query my Posts using MongoDB such that I only get those where the User has :approved => true?
I could write a loop that creates a new array, but that seems inefficient.
MongoDB does not have any notion of joins.
You've stated in the comments that Posts and Users are separate collections, but your query clearly involves data from both collections, which would imply a join.
I could write a loop that creates a new array, but that seems inefficient.
A join operation in SQL is basically a loop that happens on the server. With no join support on the server side, you'll have to make your own.
Note that many of the libraries (like Morphia) actually have some of this functionality built-in. You are using Mongoid which may have some of this support, but you'll have to do some hunting.
The easiest way to think about it would be to query for unique user ids of users who are approved and then query for post documents where the poster's user_id is in that set.
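The two steps look roughly like this. This is a plain-Ruby sketch over arrays of hashes standing in for the two collections, not Mongoid code; the field names and the helper name are assumptions. With a real driver each step would be its own query, the second one typically an "in this set of ids" condition:

```ruby
require "set"

# Step 1: collect the ids of approved users.
# Step 2: keep the posts whose user_id is in that set.
def posts_by_approved_users(users, posts)
  approved_ids = users.select { |user| user[:approved] }
                      .map { |user| user[:_id] }
                      .to_set
  posts.select { |post| approved_ids.include?(post[:user_id]) }
end
```

This is the same loop a server-side join would do, just moved into the application.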
As Rubish said, you could de-normalize by adding an approved field to the post document. When a user's approval status is toggled (they become approved or unapproved) do an update on the posts collection where, for all of that user's posts, you toggle the denormalized approval field.
Using the denormalized method lets you do one query instead of two (simplifying the logic for the most common case) and isn't too much of a pain to maintain.
Let me know if that makes sense.
I'm programming a website that allows users to post classified ads with detailed fields for different types of items they are selling. However, I have a question about the best database schema.
The site features many categories (eg. Cars, Computers, Cameras) and each category of ads have their own distinct fields. For example, Cars have attributes such as number of doors, make, model, and horsepower while Computers have attributes such as CPU, RAM, Motherboard Model, etc.
Now since they are all listings, I was thinking of a polymorphic approach, creating a parent LISTINGS table and a different child table for each of the different categories (COMPUTERS, CARS, CAMERAS). Each child table will have a listing_id that will link back to the LISTINGS TABLE. So when a listing is fetched, it would fetch a row from LISTINGS joined by the linked row in the associated child table.
LISTINGS
-listing_id
-user_id
-email_address
-date_created
-description
CARS
-car_id
-listing_id
-make
-model
-num_doors
-horsepower
COMPUTERS
-computer_id
-listing_id
-cpu
-ram
-motherboard_model
Now, is this schema a good design pattern or are there better ways to do this?
I considered single inheritance but quickly brushed off the thought because the table will get too large too quickly, but then another dilemma came to mind - if the user does a global search on all the listings, then that means I will have to query each child table separately. What happens if I have over 100 different categories, wouldn't it be inefficient?
I also thought of another approach where there is a master table (meta table) that defines the fields in each category and a field table that stores the field values of each listing, but would that go against database normalization?
How would sites like Kijiji do it?
Your database design is fine. No reason to change what you've got. I've seen the search done a few ways. One is to have your search stored procedure join all the tables you need to search across and index the columns to be searched. The second way I've seen it done which worked pretty well was to have a table that is only used for search which gets a copy of whatever fields that need to be searched. Then you would put triggers on those fields and update the search table.
They both have drawbacks but I preferred the first to the second.
EDIT
You need the following tables.
Categories
- Id
- Description
CategoriesListingsXref
- CategoryId
- ListingId
With this cross reference model you can join all your listings for a given category during search. Then add a little dynamic sql (because it's easier to understand) and build up your query to include the field(s) you want to search against and call execute on your query.
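A rough sketch of that kind of dynamic query building, written in Ruby rather than as a stored procedure. The SEARCHABLE whitelist and the build_search helper are made up for illustration; the table and column names are taken from the schemas in this thread. Column names are only ever taken from the whitelist and values are passed as bound parameters, so user input never ends up inside the SQL text:

```ruby
# Hypothetical whitelist of LISTINGS columns that may be searched.
SEARCHABLE = %w[description email_address].freeze

# Returns [sql, params]: a parameterized query for one category,
# plus one LIKE condition per requested filter.
def build_search(category_id, filters)
  clauses = ["x.CategoryId = ?"]
  params = [category_id]

  filters.each do |column, value|
    unless SEARCHABLE.include?(column.to_s)
      raise ArgumentError, "unsearchable column: #{column}"
    end

    clauses << "l.#{column} LIKE ?"
    params << "%#{value}%"
  end

  sql = "SELECT l.* FROM LISTINGS l " \
        "JOIN CategoriesListingsXref x ON x.ListingId = l.listing_id " \
        "WHERE " + clauses.join(" AND ")
  [sql, params]
end
```

The resulting pair would then be handed to whatever database driver you use for execution.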
That's it.
EDIT 2
This seems to be a bigger discussion than we can fit in these comment boxes. But anything we would discuss can be understood by reading the following post.
http://www.sommarskog.se/dyn-search-2008.html
It is really complete and shows you more than one way of doing it, with pros and cons.
Good luck.
I think the design you have chosen will be good for the scenario you just described. Though I'm not sure if the sub class tables should have their own ID. Since a CAR is a Listing, it makes sense that the values are from the same "domain".
In the typical classified ads site, the data for an ad is written once and then is basically read-only. You can exploit this and store the data in a second set of tables that are more optimized for searching in just the way you want the users to search. Also, the search problem only really exists for a "general" search. Once the user picks a certain type of ad, you can switch to the sub class tables in order to do more advanced search (RAM > 4gb, cpu = overpowered).