Rails model.create set id - ruby-on-rails

I wonder if it's possible to run Model.create() so that, instead of taking the next free id integer, it takes the lowest free integer.
For example, assume we have records for id=10..20 and no records for id=0..9. I want to create instances of Model with ids starting from 0 (a normal Model.create() would create instances starting from 21).
Preferably I want this to happen automatically. I don't want to set the id explicitly.

DB
You'll be best off doing this at the database level (look at altering the auto-increment number).
Although I think you can do this in Rails, I would highly recommend using the DB functionality to make it happen. You can do this in phpMyAdmin (for MySQL) by changing the table's auto-increment value.
If you set the Auto-Increment to the number you wish to start at, new rows saved into the DB will be numbered from that value onward. I think using any Rails-based method will just overcomplicate things unnecessarily.

I'd discourage it.
Those ids serve solely as unique identifiers for rows in a table, and it's the database's job to assign one. You can verify that the model doesn't require an id to be saved:
m = Model.new
# populate m with data
m.name = "Name"
# look at what m contains
m
# and save it
m.save
# now inspect it again and see it got its unique id
m
While it might be possible to modify ids, it's not good practice to give ids any extra meaning. When each new record gets an id that has never been used before, it's easier to debug DB structure errors that might occur during development. Say, for example, some associated objects suddenly show up in a new user's account. Weird enough, right? That can happen and, in the worst case, can show up in production and result in a severe security breach.
Keeping ids unique at all times eliminates this bug's effect. That seems much more important if the associated objects store confidential information and you care about keeping them safe. Encryption concerns aside.
So, to be sure in every situation, developers have adopted a practice of not giving id any other role other than uniquely identifying a row in a table. If you want it to do something else, consider making another field for that purpose.

Related

Rails - best database design for existing model

So I have inherited some Rails 4 code and database models which I need to add to.
The model, called mpb_item has a table called mpb_items.
Inside this items table there are columns such as:
role1_start_date, role2_start_date, role3_start_date, role4_start_date
Not ideal but this is what it is. They should have been in a separate roles table I guess.
I need to add functionality to Suspend any one of these roles (or all of them).
I guess I can either:
1) To the existing table, add a new boolean column for each existing role column, e.g. role1_suspended, role2_suspended etc.
2) Create a table called mpb_suspensions, with 2 columns: mpb_item_id and role_name. Since the roles don't have ids themselves, the role_name column will store 'role1', or 'role2' etc depending on which role was suspended.
In my View, I need to have the ability to "suspend" each job, or all of them. I'm not sure how the model code would look to do this and which approach would be best.
If I had to build this, given the minimal info you provided I'd start by picking option 2 that you described. (a RoleSuspension is its own table) Benefits include:
fewer new columns total, and doesn't make any table indefinitely large
no redundant columns (role1, role2, etc.)
no need to add more columns if the # of possible roles changes (ie. it's more normalized)
no nil values (approach 1)
you can easily query for the overall state of RoleSuspensions (whereas with approach 1 you'd need to count the non-nil values in role1_suspended, role2_suspended, etc. then sum them up), and easier to index
you can attach logic to a RoleSuspension; it's a separate animal from its parent MbpItem. If it's just a bunch of boolean columns, any complex logic would need to be mushed into the MbpItem model, and would likely be much messier to maintain.
Following option 2's logic, suspending a role would involve creating a new record like this:
@mbp_item.role_suspensions.create!(role_number: 2)
and checking for suspension status would involve something like this:
if @mbp_item.role_suspensions.any?
  # this item has at least one suspended role
end
# or....
RoleSuspension.where(role_number: 3).each do |s|
  puts "Item #{s.mbp_item} is suspended for role #{s.role_number}."
end
Database performance will be a larger consideration with this approach, depending on the answers to the following questions:
Where and how often do you need to check for role suspensions?
When you check for role suspensions, how do you approach asking the question? In other words is some global task asking "What role suspensions exist, and for what mbp_items?" or do you check for suspensions on a given mbp_item when you render that object?
If you'll need to check for role suspensions very frequently, perhaps you should add a boolean column to mbp_items called has_suspensions, which will be a partial cache in that it will indicate whether any suspensions exist for this MbpItem (and must be maintained in after_create and after_destroy hooks).
On the other hand, if you know that suspension info never will need to have its own logic and never will need to be queried directly, you could add a single column to MbpItem, role_suspensions, containing a serialized array of integers for the suspended roles. This would be a much less invasive approach in terms of database structure, and probably much simpler even than your option 1 in that it would allow you to suspend and de-suspend any role number with less code and less metaprogramming, but if you ever need to add any fancy logic to the suspension or desuspension process (ie. if a RoleSuspension deserves status as its own object), you'd regret this approach.

Generic flags for a model in RoR

I am making a Ruby on Rails app and am realizing that my User class could potentially end up with a lot of generic boolean / integer attributes. For example, suppose I have a promotion each quarter, and I only want a person to be able to use the promotion once. Then I'd have to make a new column each quarter has_used_promotion_N to track that promotion.
Alternatively, I'm thinking of creating a new column called "Generic Flags" which is just a comma separated value of flags set on the account. For example:
"has_used_promotion_1, has_used_promotion_2, limit_on_feature_a=20" etc. could be set for some particular user
(or maybe I'll store it as JSON)
In any case, I'm thinking of giving myself some sort of NoSQL-like functionality in my DB.
Is this really bad design for some reason? Has anyone else done this before? Anything I'm completely missing about RoR?
In my opinion Promotion should be a separate model with a many to many relationship with User. When you have a promotion you would create a Promotion instance and when a person uses that promotion you add that person to promotion.users relationship.
This is much better than your idea because you can now query those relationships. Want a list of all users that used the first quarter promotion? No problem. You can do that with your solution, but you have to resort to some hackiness (is that a word?) to do it, and you'd have to parse the generic flag string for EVERY user on EVERY query. Not ideal to say the least.
If there's an arbitrarily-sized collection of associations then it should be a real relation, modeled using the existing DB and facilities. Promotions sounds like that, and it seems like it would be something you'd be modeling in your DB already; no real reason to keep a duplicate value hierarchy.
For actually-generic flags, you could have a named-flag table and again use a real association.
You could also just serialize a flag object to a text column. Doing so impedes your ability to do trivial searches on a flag/flag value, however. This may not matter for a wad of flags associated with a single user that you don't care about unless they're logged in, but tread lightly--it depends on your usecase.
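To make that trade-off concrete, here is a minimal plain-Ruby sketch of the serialized-flags idea, using only the standard json library. The FlagStore module, the rows hash and the flag names are made up for illustration; in a Rails app the serialized string would sit in a text column on the users table.

```ruby
require "json"

# Hypothetical helper: packs a user's flags into a JSON string (what the
# text column would hold) and unpacks them again.
module FlagStore
  def self.dump(flags)
    JSON.generate(flags)
  end

  def self.load(text)
    text.nil? || text.empty? ? {} : JSON.parse(text)
  end
end

# Simulated rows: user id => contents of the serialized flags column.
rows = {
  1 => FlagStore.dump({ "has_used_promotion_1" => true, "limit_on_feature_a" => 20 }),
  2 => FlagStore.dump({ "has_used_promotion_2" => true }),
  3 => nil
}

# Reading one user's flags is cheap.
limit_for_user_1 = FlagStore.load(rows[1])["limit_on_feature_a"]

# Finding everyone who used promotion 1 means parsing every row,
# which is the search cost described above.
promo_1_users = rows.select { |_id, text| FlagStore.load(text)["has_used_promotion_1"] }.keys
```

Per-user reads stay simple, while any cross-user question has to deserialize the whole table, which is why a real association is usually the better fit for things like promotions.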

How we design Dynamo db with keep relation of two entity

Hi, I am new to DynamoDB. As far as I know it's a non-relational DB, i.e. we can't join tables. My question is how to design the table structure. Please clarify using the following example.
I have the following tables:
1) users - user_id, username, password, email, phone number, role
2) roles - id, name [i.e. admin, supervisor, etc.]
a) Is there any provision to set auto-increment for the user_id field?
b) Is it correct to set user_id as the primary key?
c) Is this the correct way to store a user's role in DynamoDB, i.e. a roles table containing id and title, with the role id stored in the users table?
e) Is it possible to retrieve data from both tables for each user? I am using Rails 3 and the aws-sdk gem.
Any reply would be very helpful to me as a new DynamoDB user.
Typically with NoSQL-style databases you would provide the unique identifier yourself, rather than having an auto-increment PK field do that for you. This usually means that a GUID would be the key for each User record.
As far as the user roles, there are many ways to accomplish this and each has benefits and problems:
One simple way would be to add a "Role" attribute to the Users table and have one entry per role for that user. Then you could grab the User and you would have all the roles in one query. DynamoDB allows attributes to have multiple values, so one attribute can have one value per role.
If you need to be able to query users in a particular role (ie. "Give me all the Users who are Supervisors") then you will be doing a table scan in DynamoDB, which can be an expensive operation. But, if your number of users is reasonably small, and if the need to do this kind of lookup is infrequent, this still may be acceptable for your application.
If you really need to do this expensive type of lookup often, then you will need to create a new table something like "RolesWithUsers" having one record per Role, with the userIds of the users in the role record. For most applications I'd advise against doing something like this, because now you have two tables representing one fact: what role does a particular user have. So, delete or update needs to be done in two places each time. Not impossible to do, but it takes more vigilance and testing to be sure your application doesn't get wrong data. The other disadvantage of this approach is that you need two queries to get the information, which may be more expensive than the table scan, again, depending on the quantity of records.
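The first approach above (an application-generated key plus a multi-valued role attribute) can be sketched in plain Ruby. This is only an illustration of the data shape: build_user_item and the attribute names are hypothetical, no aws-sdk calls are made, and the final select merely imitates what a full table scan has to do.

```ruby
require "securerandom"
require "set"

# Hypothetical item builder: the application, not the database, picks the key.
def build_user_item(username, roles)
  {
    "user_id"  => SecureRandom.uuid, # random (version 4) UUID used as the GUID
    "username" => username,
    "roles"    => Set.new(roles)     # stands in for a multi-valued attribute
  }
end

alice = build_user_item("alice", ["admin", "supervisor"])
bob   = build_user_item("bob", ["supervisor"])
carol = build_user_item("carol", ["admin"])
table = [alice, bob, carol]

# Fetching one user gives all of that user's roles in a single read.
alice_roles = alice["roles"].to_a

# "Give me all the supervisors" has to look at every item, like a table scan.
supervisors = table.select { |item| item["roles"].include?("supervisor") }
                   .map { |item| item["username"] }
```

With only a handful of items the scan is harmless; the cost grows with the number of users, which is the concern raised above.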
Another option that makes sense for this specific use case would be to use SimpleDB. It has better querying capability (all attributes are indexed by default), and a single table with roles as a multi-valued attribute is going to be a much better solution there than in DynamoDB.
Hope this helps!
We have a similar situation and we simply use two DBs, a relational and a NoSQL (Dynamo). For a "User" object, everything that is tied to other things, such as roles, projects, skills, etc, that goes in relational, and everything about the user (attributes, etc) goes in Dynamo. If we need to add new attributes to the user, that is fine, since NoSQL doesn't care about those attributes. The rule of thumb is if we only need something on that object page (that is, we don't need to associate with other objects), then we put in Dynamo. Otherwise, it goes in relational.
Using a table scan on the NoSQL DB is not really an option after you cross even a small threshold (up to that point, you can just use an in memory DB anyway).

Can we use the ids again for whom the record has been deleted?

I was doing a RoR tutorial in which we could add, update and delete user details in the application, and an id is automatically assigned to each user. But once we delete a user's details, looking up that id displays "record not found".
Can we use that id again?
From your comments it looks like you're trying to save on using high-value IDs by re-using lower value IDs after they've been freed. In general this is not considered a good idea.
The likelihood that you will run out of IDs at the top end is minimal (zero if you keep making your ID column accept larger integers). However, reassigning IDs has the potential to open you up to problems. If, for instance, you wanted to delete a user but keep content they had created (e.g. blog posts), then reassigning the ID would mean that the new ID owner becomes the owner of those old posts.
It feels wasteful but the best thing to do is just leave old, vacant IDs vacant and eat up new ones.
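A small plain-Ruby simulation shows the ownership problem. The hashes below are hypothetical stand-ins for a users table and a posts table, and the user names are invented.

```ruby
# Stand-ins for two tables: users keyed by id, posts holding a user_id.
users = { 1 => "alice", 2 => "bob" }
posts = [{ title: "Alice's old post", user_id: 1 }]
last_id = 2 # highest id ever handed out

users.delete(1) # alice is deleted, her post is kept

# Variant A: reuse the freed id. The newcomer silently inherits the post.
users[1] = "mallory"
owner_when_reused = users[posts.first[:user_id]]

# Variant B: undo that and take a fresh id instead.
users.delete(1)
last_id += 1
users[last_id] = "mallory"
owner_when_fresh = users[posts.first[:user_id]]
```

With a reused id the old post now appears to belong to mallory; with a fresh id the lookup returns nil, so the post is visibly orphaned rather than misattributed.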
You can use something like this (Rails 3 syntax)
@user = User.find_by_id(params[:id]) || User.where("id > ?", params[:id]).first || User.where("id < ?", params[:id]).last
I do not recommend re-using unique ID's.
First, ask yourself why do you assign unique ID's to users:
To uniquely identify users
To create unique URLs to a user's resources
To link other objects (orders, blog posts, game data) to that specific user
So, if you've deleted a user - what happens?
In almost every app I've written a user always leaves traces. Either from URLs that were once exposed to the internet (think indexed by google) or data from that user that's kept as records (orders, etc).
Reusing a user ID would cause problems, and therefore work to refactor the application to cope with those problems. In 99% of these cases the easiest solution is to just keep generating new, unique IDs for those users.
There are also situations where you need to keep data from a deleted user around (financial systems and webshops are good examples). This keeps the unique ID alive after the user is deleted, so you can't reuse it.
TL;DR: Reusing unique IDs is possible, but poses problems. Easiest solution around those problems is generating new unique IDs.
As an added note, unique IDs don't have to be auto incremented integers.

Generating sequential numbers in multi-user saas application

How do people generate auto_incrementing integers for a particular user in a typical saas application?
For example, the invoice numbers for all the invoices for a particular user should be auto_incrementing and start from 1. The rails id field can't be used in this case, as it's shared amongst all the users.
Off the top of my head, I could count all the invoices a user has, and then add 1, but does anyone know of any better solution?
A typical solution for any relational database could be a table like
user_invoice_numbers (user_id int primary key clustered, last_id int)
and a stored procedure or a SQL query like
update user_invoice_numbers set last_id = last_id + 1 where user_id = @user_id
select last_id from user_invoice_numbers where user_id = @user_id
It will work for users (if each user has a few simultaneously running transactions) but will not work for companies (for example when you need companies_invoice_numbers) because transactions from different users inside the same company may block each other and there will be a performance bottleneck in this table.
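The counter-table idea can be imitated in plain Ruby to show why the increment and the read must happen as one step. This is an in-memory sketch only: InvoiceNumberAllocator is a hypothetical class, the hash stands in for the user_invoice_numbers table, and a single Mutex plays the role of the database lock (coarser than a per-row lock, which is the contention described above).

```ruby
# In-memory stand-in for the user_invoice_numbers table.
class InvoiceNumberAllocator
  def initialize
    @last_ids = Hash.new(0) # user_id => last invoice number handed out
    @lock = Mutex.new       # one lock for all users, for simplicity
  end

  # Increment and read inside one critical section, like running the
  # UPDATE and the SELECT within a single transaction.
  def next_number(user_id)
    @lock.synchronize { @last_ids[user_id] += 1 }
  end
end

allocator = InvoiceNumberAllocator.new

# Four concurrent "sessions" each create 25 invoices for user 42.
threads = 4.times.map do
  Thread.new { 25.times.map { allocator.next_number(42) } }
end
numbers_for_user_42 = threads.flat_map(&:value)

# A different user starts from 1 independently.
first_for_user_7 = allocator.next_number(7)
```

Every caller has to pass through the same lock, so numbers never repeat, but all writers for the same counter queue up behind each other.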
The most important functional requirement you should check is whether your system is allowed to have gaps in invoice numbering or not. When you use standard auto_increment, you allow gaps, because in most databases I know, when you roll back a transaction, the incremented number is not rolled back. With this in mind, you can improve performance using one of the following guidelines:
1) Exclude the procedure that you use for getting new numbers from long-running transactions. Let's suppose that the insert-into-invoice procedure is a long-running transaction with complex server-side logic. In this case you first acquire a new id, and then, in a separate transaction, insert the new invoice. If the latter transaction is rolled back, the auto-number will not decrease. But user_invoice_numbers will not be locked for a long time, so a lot of simultaneous users can insert invoices at the same time.
2) Do not use a traditional transactional database to store the last id for each user. When you need to maintain a simple list of keys and values, there are a lot of small but fast database engines that can do that work for you. Probably memcached is the most popular. In the past, I saw projects where simple key/value storage was implemented using the Windows Registry or even a file system. There was a directory where each file name was the key and inside each file was the last id. And this rough solution was still better than using a SQL table, because locks were issued and released very quickly and were not involved in the transaction scope.
Well, if my proposed optimization seems overcomplicated for your project, forget about it for now, until you actually run into performance issues. In most projects the simple method with an additional table will work pretty fast.
You could introduce another table associated with your "users" table that tracks the most recent invoice number for a user. However, reading this value will result in a database query, so you might as well just get a count of the user's invoices and add one, as you suggested. Either way, it's a database hit.
If the invoice numbers are independent for each user/customer then it seems like having "lastInvoice" field in some persistent store (eg. DB record) associated with the user is pretty unavoidable. However this could lead to some contention for the "latest" number.
Does it really matter if we send a user invoices 1, 2, 3 and 5, and never send them invoice 4? Perhaps you can relax the requirement a bit.
If the requirement is actually "every invoice number must be unique" then we can look at all the normal id generating tricks, and these can be quite efficient.
Ensuring that the numbers are sequential adds to the complexity; does it add to the business benefit?
I've just uploaded a gem that should resolve your need (a few years late is better than never!) :)
https://github.com/alisyed/sequenceid/
Not sure if this is the best solution, but you could store the last Invoice ID on the User and then use that to determine the next ID when creating a new Invoice for that User. This simple solution may have integrity problems, though, so you will need to be careful.
Do you really want to generate the invoice IDs in an incremental format? Would this not open security holes (if a user can guess how invoice numbers are generated, they can change the number in the request, which may lead to information disclosure)?
I would ideally generate the numbers randomly (and keep track of used numbers). Tracking the used numbers prevents collisions, and the chance of a collision is already small when the numbers are allocated randomly over a large range.
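As a rough illustration of random allocation with tracking, here is a plain-Ruby sketch. RandomInvoiceNumbers is a hypothetical class and the in-memory Set is an assumption made for the example; in a real application the used numbers would have to live in persistent storage.

```ruby
require "securerandom"
require "set"

# Hypothetical generator: draws random numbers from 1..range_size and
# retries whenever a number has already been issued.
class RandomInvoiceNumbers
  def initialize(range_size)
    @range_size = range_size
    @used = Set.new
  end

  def next_number
    raise "number space exhausted" if @used.size >= @range_size

    loop do
      candidate = SecureRandom.random_number(@range_size) + 1
      # Set#add? returns nil when the value is already present.
      return candidate if @used.add?(candidate)
    end
  end
end

generator = RandomInvoiceNumbers.new(1_000_000)
issued = Array.new(1000) { generator.next_number }
```

Retries stay rare as long as the range is much larger than the number of invoices issued; as the range fills up, the loop needs more attempts per number.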
