Database design for reverse-auction platform - ruby-on-rails

The idea is for a reverse auction platform where users post their auction for certain services and providers bid on it with their offers.
Should I be splitting my tables? For example the auction can be for a new service or to replace an existing service so there are questions that are specific to each selection.
Should I move those columns into a separate table for that option?
Here is a diagram of what I've come up with so far:
Database Diagram image
Am I on the right track here?
What data type should I use for columns where there will be an list of options to choose from in the auction form? For example, cash_back will give the user a range of choices as:
Donate to Charity
Deposit to my account
Credit Voucher
Is the norm to use a string for this column with the respective strings or do I create a new table for the options and use the option_id as a foreign key in this table?

I think it is worth discussing here Rails philosophy and database design generally.
As I often say you can have a database for your application, or you can have an application for your database. In the latter, database design is important. In the former, it usually follows application design.
What this means is that you probably, assuming this is a Rails app, don't want to design your database at all. What you want to do is design your application object model and let Rails design your database. You won't get a great db design that way, but it will be good enough.
The tradeoff is that when you go this way, you often end up with the database as effectively owned by the application and it may not be safe to have other apps add or modify the data in the database. Moreover it may be harder to come up with really good reports, but where you put in most of your time will be better optimized (your main app).
TL;DR: If you want a database for your app and using Rails or Django, then stop thinking about database design, but realize that while this optimizes some pathways today, it makes many other things harder down the road.

Related

Design application architecture for catalog/orders system

I have to write a system that allows clients to make orders of products. There is an existing application that handles creating catalog (adding, editing products etc.) It is used by producents to maintain data about their products. Now I am wondering how should I add order functionality (I have access to source code of catalog application and catalog database). Order functionality can be shortly describe as:
customer can login make new orders, see old ones
producer can login and see new orders made by clients, archive orders
midleman should see all orders grouped by producer so he can decide whether or not to ship them
I am thinking about few ways:
Create new application with new database, that contains only orders.
products in catalog database and order database will be connected by [producent_id, product_id] pair
in orders application I would connect to catalog database to show information about products
Disadvantages of that approach, that I can think of are
not being able to do statistics queries like "count all orders of items from category that are made of wood" etc.
Advantages
clear seperation from existing application
Add functionality to existing application and use the same database
producer has another option in menu "Orders"
add new functionality for clients and middleman
Disadvantages of that approach, that I can think of are
not being able to for example move orders database/aplication to another server
Advantages
able to run cross orders/catalog queries
I really can't make my mind about this. I can see upsides and downsides for both ways, but can't decide which is better. Do you have any suggestions, maybe third way?
P.S technology will be ruby on rails
Personally I would go for the 'add functionality to existing application'.
I don't suppose you want to sell the Order application separate because it totally depends on how your Catalog/Product app is build.
The development will be easier, not only database access but also the reuse of the existing infrastructure will improve your development speed.
I suppose the only reason you want to move the orders application to another server is because of performance problems. I would say that you should never optimize for a situation your not sure it will ever happen.
It's much harder to do cross database queries which are not supported then to optimize the application when you notice some real, measurable, performance issues.
So in my opinion development speed out wages supposed performance problems.

Multi-tenant rails application: what are the pros and cons of different techniques?

I originally wrote my Ruby on Rails application for one client. Now, I am changing it so that it can be used for different clients. My end-goal is that some user (not me) can click a button and create a new project. Then all the necessary changes (new schema, new tables, handling of code) are generated without anyone needing me to edit a database.yml file or add new schema definitions. I am currently using the SCOPED access. So I have a project model and other associated models have a project_id column.
I have looked at other posts regarding multi-tenant applications in Rails. A lot of people seem to suggest creating a different schema for each new client in Postgres. For me, however, it is not much useful for a new client to have a different schema in terms of data model. Each client will have the same tables, rows, columns, etc.
My vision for each client is that my production database first has a table of different projects/clients. And each one of those tables links to a set of tables that are pretty much the same with different data. In other terms a table of tables. Or in other terms, the first table will map to a different set of data for each client that has the same structure.
Is the way I explained my vision at all similar to the way that Postgres implements different "schemas"? Does it look like nested tables? Or does Postgres have to query all the information in the database anyway? I do not currently use Postgres, but I would be willing to learn if it fits the design. If you know of database software that works with Rails that fits my needs, please do let me know.
Right now, I am using scopes to accomplish multi-tenant applications, but it does not feel scalable or clean. It does however make it very easy for a non-technical user to create a new project provided I give them fillable information. Do you know if it is possible with the multi-schema Postgres defintion to have it work automatically after a user clicks a button? And I would prefer that this be handled by Rails and not by an external script if possible? (please do advise either way)
Most importantly, do you recommend any plugins or that I should adopt a different framework for this task? I have found Rails to be limited in some cases of abstraction as above and this is the first time I have ran into a Rails-scaling issue.
Any advice related to multi-tenant applications or my situation is welcome. Any questions for clarification or additional advice are welcome as well.
Thanks,
--Dave
MSDN has a good introduction to multi-tenant data architecture.
At one end of the spectrum, you have one database per tenant ("shared nothing"). "Shared nothing" makes disaster recovery pretty simple, and has the highest degree of isolation between tenants. But it also has the highest average cost per tenant, and it supports the fewest tenants per server.
At the other end of the spectrum, you store a tenant id number in every row of every shared table ("shared everything"). "Shared everything" makes disaster recovery hard--for a single tenant, you'd have to restore just some rows in every shared table--and it has the lowest degree of isolation. (Badly formed queries can expose private data.) But it has the lowest cost per tenant, and it supports the highest number of tenants per server.
My vision for each client is that my production database first has a
table of different projects/clients. And each one of those tables
links to a set of tables that are pretty much the same with different
data. In other terms a table of tables. Or in other terms, the first
table will map to a different set of data for each client that has the
same structure.
This sounds like you're talking about one schema per tenant. Pay close attention to permissions (SQL GRANT and REVOKE statements. And ALTER DEFAULT PRIVILEGES.)
There are two railscasts on multitenancy that using scopes and subdomains and another to help with handling multiple schemas.
There is also the multitenant gem which could help with your scopes and apartment gem for handling multiple schemas.
Here is also a good presentation on multitenancy-with-rails.
Dont forget about using default scopes, while creating named scops the way you are now works it does feel like it could be done better. I came across this guide by Samuel Kadolph regarding this issue a few months ago and it looks like it could work well for your situation and have the benefit of keeping your application free of some PgSQL only features.
Basically the way he describes setting the application up involves adding the concepts of tennants to your application and then using this to scope the data at query time using the database.

Modeling data for complex applications: inheritance? modules?

I'm developing a prototype for a Rails application that has a somewhat complex model. I'm having a hard time figuring out how the model should be designed.
Here's the basic idea:
Say the app is for a small coop, and should allow users to file "applications" (as in a apply for a loan, accounts, etc) online.
Technically, all of them are "applications" and my client would like a dashboard presenting all pending evaluations for a given group (ie: the auto loans group). So, I'm thinking of having an application model. I could subclass ApplicationModel for loans, accounts, etc., but I'm thinking Rails will use a Loans table, an Accounts table, and so, and so for each type of model. There are different kinds of loans too: Mortgage, Auto, Personal, etc. All require different data. A manager for the whole Loans division, should be able to see all pending loan approvals from his dashboard (all classes of loans).
If I were to do something like Loans.find() would I get all Loan subtypes with it? Or only what's stored in the loans table?
Is inheritance the way to go? Should I be looking at something else? Is rails even adequate for this kind of application?
I saw the legacy application built on Java that they were using (but hated with a passion). The previous developer had an ST_APPLICATIONS table which store all applications. In that table he stores all the application data in a blob containing a bunch of XML. Would such an approach be recommended with Rails?

Any thoughts on Multi-tenant versus Multi-database apps in Rails

Our app currently spawns a new database for each client. We're starting to wonder whether we should consider refactoring this to a multi-tenant system.
What benefits / trade-offs should we be considering? What are the best practices for implementing a multi-tenant app in Rails?
I've been researching the same thing and just found this presentation to offer an interesting solution: Using Postgre's schemas (a bit like namespaces) to separate data at the DB level while keeping all tenants in the same DB and staying (mostly) transparent to rails.
Writing Multi-Tenant Applications in Rails - Guy Naor
Multi-tenant systems will introduce a whole range of issues for you. My quick thoughts are below
All SQL must be examined and
refactored to include a ClientId
value.
All Indexes must be examined to
determine if the ClientId needs to be
included
An error in a SQL statement by a
developer/sysadmin in production will
affect all of your customers.
A database corruption/problem will
affect all of your customers
You have some data privacy issues
whereby poor code/implementation could
allow customerA to see data belonging
to CustomerB
A customer using your system in a
heavy/agressive manner may affect
other customers perception of performance
Tailoring static data to an individual customers preference becomes more complex.
I'm sure there are a number of other issues but these were my initial thoughts.
It really depends upon what you're doing.
We are making a MIS program for the print industry that tracks inventory, employees, customers, equipment, and does some serious calculations to estimate costs of performing jobs based on a lot of input variables.
We are anticipating very large databases for each customer, and we currently have 170 tables. Adding another column to almost every table just to store the client_id hurts my brain.
We are currently in the beta stage of our program, and here are some things that we have encountered:
Migrations: A Rails assumption is that you will only have 1 database. You can adapt it for multiple databases, and migrations is one of them. You need a custom rake task to apply migrations to all existing databases. Be prepared to do a lot of trouble shooting because a migration may succeed on one DB, but fail on another.
Spawning Databases: How do you create a new db? From a SQL file, copying an existing db, or running all migrations? How do you keep you schema consistent between your table creation system, and your live databases?
Connecting to the appropriate database: We use a cookie to store a unique value that maps to the correct DB. We use a before filter in an Authorized controller that inheirits from ActionController that gets the db from that unique value and uses the establish_connection method on a Subclass of ActiveRecord::Base. This allows us to have some models pull from a common db and others from the client's specific db.
If you have specific questions about any of these, I can help.
I don't have any experience with this personally, but during the lightning talks at the 2009 Ruby Hoedown, Andrew Coleman presented a plugin he designed and uses for multi-tenant databases in rails w/ subdomains. You can check out the lightning talk slides and here's the acts_as_restricted_subdomain repository.
Why would you? Do you have heavy aggregation between users or are you spawning too many DBs? Have you considered using SQLite files per tenant instead of shared DB servers (since multitenant apps often are low-profile and don't need that much concurrency)?

Considering a new Rails app with existing data (not a db, actual data) -- what is the best way to proceed?

I have been tasked with developing a new retail e-commerce storefront for my current job, and I am considering tackling it with RoR to A) Build a "real" project with my limited Rails knowledge, and B) Give management quick turnaround and feedback (they are wanting to get this done ASAP and their deadlines are rather unrealistic - I'm talking a couple of weeks to go from nothing to working model so they can start to market it with SEO/SEM and, I kid you not, "video blogging" because my boss heard that's the future).
We do have a database structure in place but it's absolutely terrible and was thrown together without rhyme nor reason, so I'm going to largely ignore it and create a new database from scratch; however, I have existing data that I need to load into the application (like I said, it's an e-commerce app and we have the product data). I need to massage this data into a usable format because our supplier provides it to us with cryptic, abbreviated column names and it's highly denormalized, especially in the categories (I've posted a question regarding it before - basically the categories table has six fields, one for each category/subcategory, with some of them being blank if that category doesn't apply).
There are two main issues that are giving me second thoughts:
As I said the data needs to be put into a "proper" database schema; I can't just load it as-is. I have some thoughts as to a good data model for it, but my analysis is not completed yet. There would end up being a large amount of joins tables to link various things together (e.g. products_categories, products_attributes, products_prices) etc and these tables would link products not via an ID but by their SKU (see below).
Everything already has an ID that's generated for it, but anything new I add needs to have one autogenerated; I doubt this will be a problem with any mature RDBMS, but I know Rails likes to generate IDs itself. Also, almost all of the product-related tables are linked by SKUs (and in the data provided by the supplier are actually a composite key consisting of the prefix and stock number, which combined make up the full SKU), not by IDs and I'm not sure if this will be a performance issue (of course, I could always manually create indexes on these columns to speed it up). It does mean that I'll need to break away from the Rails conventions, however.
In short, I think that Rails might be a good choice as far as time-to-market and ease-of-development, but having to work with the existing data content might turn into a pain because the application will need to be developed around that, instead of the "traditional" Rails app, and that factor is giving me major doubts about using Rails. There are also some other issues (having to set up a Linux server, and the fact that the area I live in has very few Rails developers so if I left the company I'd basically be holding them hostage as far as updates/modifications). I'm really unsure as to the best path to proceed.
I would develop the app as if you didn't have the data. Use the ORM and make your database the best it can be, but of course keep in mind what data you have to populate it with (eg: don't make crazy new constraints for things that will leave you going through old data record by record).
When you're done and tested, write an import script that pulls your real data onto your new database.
It's not that different from the conventional design/development model... Apart from you can do your data-input in a semi-automated fashion.
I was in the same situation not too long ago — a crappy PHP app that held ten years worth of all company data.
What I did was simply create a Migration model and added methods to import each resource.
class Migration
def migration_all
self.jobs
end
def self.jobs
...
end
end
The cool thing about this is that you can arrange which order resources are imported as one will likely reference another. I also added methods that directly modified the db schema. One nice trick if you have to keep an existing primary key is to create a field named 'legacy_id', copy over your existing primary key, and when you're done, simply remove the 'id' field, rename the 'legacy_id' field to 'id', then add the primary_key constraint on the new 'id' field.
Don't use the SKU as the unique key for each product - use the standard Rails incremented id.
SKU could change as it may be misentered, etc and that would make it a nightmare to change all of the references from other tables. Put your current id in a sku column, index it and update the references in your other tables to the Rails ids.
You'd be able to do Product.find_by_sku(params[:sku]) in your controllers, set up a /products/:sku route, etc. I don't see what you'd gain (other than a headache) by using your non generated ids as the database primary keys.
I'd also suggest running your old data through your app's validations to make sure you are not loading up a bunch of inconsistencies and erros. It will help your app run smoothly and highlight existing data errors at a point where you can fix them.
Don't assume the existing data is valid just because it is already there.

Resources