I have a best-practice question about default values, uniqueness constraints, and API design. For this exercise, I am creating a pokedex API with the following database schema in PostgreSQL.
create_table :pokemon, id: :uuid do |t|
  t.string :name, null: false, index: { unique: true }
  t.string :national_index, null: false, index: { unique: true }
  t.text :description, default: 'unknown', null: false
  t.string :hp, default: '0', null: false
  t.string :attack, default: '0', null: false
  t.string :defense, default: '0', null: false
  t.string :special_attack, default: '0', null: false
  t.string :special_defense, default: '0', null: false
  t.string :speed, default: '0', null: false
  t.string :height, default: '0', null: false
  t.string :weight, default: '0', null: false
  t.boolean :male, default: false, null: false
  t.boolean :female, default: false, null: false
  t.string :category, default: 'unknown', null: false
  t.timestamps
end
As you can see from this Rails migration, I have presence constraints on all my fields, plus unique constraints on a pokemon's name and national index. A pokemon's battle statistics default to '0' when they cannot be determined, and its description and category default to 'unknown'.
Here is my situation and I think it can be best explained with a user use case.
Let's say that a pokemon researcher encounters a new undiscovered pokemon and uploads it to the pokedex database through the API I am creating. I imagine that in this case the researcher will have to give this pokemon a name (which can later be changed), and the rest of the pokemon's fields will all be filled with default values that can also be changed later, once more research has been conducted. However, the researcher can't assign it a national index number because it is an "unknown" pokemon. Only an officially recognized pokemon can have a national index number, and no two pokemon, official or undiscovered, can have a conflicting number. In other words, it's not officially recognized. So my question is: what can I use to add a default value to a field that I'm defining as present and unique? I would love to give it 'N/A', 'unknown', or some other such value, but I can't, because those values aren't unique. The next undiscovered pokemon can't also have a value of 'N/A' for its national index number. Is there a best practice for a situation like this?
You could argue that it would be best to just leave those fields empty, but this excellent book I'm reading about API design has a relevant section under default values for strings:
Similarly to number and Boolean fields, many serialization formats don't necessarily permit a distinct null value (null) from a zero value (""). As a result, it can be difficult to determine the difference between a user specifying that a string should be the empty string and a user specifically asking the API to 'do what's best' for the field given the rest of the context.
Luckily though, there are quite a few options available. In many cases, an empty string is simply not a valid value for a field. As a result, the empty string can indeed be used as a flag indicating that a default value should be injected and saved. In other cases, the string value might have a specific set of appropriate values, with the empty string being one of the choices. In this scenario, it's perfectly reasonable to allow a choice of 'default' to act as a flag indicating that a default value should be stored instead.
The book seems to be advocating for a default value, but is that even possible in the situation I described? Basically, I want my API to be predictable and reliable, and I imagine the scenario I described happens quite often.
You can set default values at the application level for more complex scenarios:

class Pokemon < ActiveRecord::Base
  before_validation :set_national_index, on: :create

  private

  def set_national_index
    self.national_index = if official?
      # ...
    else
      "unknown_#{Time.zone.now.to_i}_#{rand(99999)}"
    end
  end
end
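If you would rather not store sentinel values at all, one alternative worth weighing (a sketch only, assuming PostgreSQL; the migration class name and version are illustrative) is to relax the NOT NULL constraint on national_index and rely on a unique index. PostgreSQL treats NULLs as distinct from one another, so any number of undiscovered pokemon can leave the field unset while officially assigned numbers stay unique.

```ruby
# Hypothetical migration: allow NULL national_index values while keeping
# assigned numbers unique. Multiple NULLs never collide under a unique
# index in PostgreSQL.
class RelaxNationalIndexOnPokemon < ActiveRecord::Migration[6.1]
  def change
    change_column_null :pokemon, :national_index, true
    # No-op if a unique index already exists (Rails 6.1+):
    add_index :pokemon, :national_index, unique: true, if_not_exists: true
  end
end
```

The API layer can then render a missing index as "unknown" at serialization time, keeping the sentinel out of the database entirely.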
Related
I'm trying to seed some data using an external trivia API.
Here's what I have in my seeds.rb file where HTTParty is a gem that parses JSON into a ruby hash:
response = HTTParty.get("THE-API-SITE")
response.each do |trivia|
  trivia_hash = {
    category: trivia["category"],
    answer: trivia["correctAnswer"],
    incorrect: trivia["incorrectAnswers"],
    question: trivia["question"],
  }
  Trivia.find_or_create_by(trivia_hash)
end
And here's my schema:
create_table "trivia", force: :cascade do |t|
  t.string "category"
  t.string "answer"
  t.string "incorrect"
  t.string "question"
  t.datetime "created_at", precision: 6, null: false
  t.datetime "updated_at", precision: 6, null: false
  t.string "ids"
end
Everything works except for the "incorrect" key which should have a value of an array of strings of incorrect answers, but when I seed my data, I get a string of the array of incorrect answers.
What I want:
incorrect: ["Deep Purple", "Feeder", "Uriah Heep"],
What I'm getting:
incorrect: "["Deep Purple", "Feeder", "Uriah Heep"]",
I'm not sure how to get what I want or if it's even possible the way I'm going about it.
Since your incorrect column is of type string, the data stored in the DB will also be a string value.
If your DB supports an array data type, you can use that; otherwise, I would suggest using this in your model:
serialize :incorrect, Array
Alternatively, to use a native array type (on PostgreSQL), change the incorrect field in your migration from:
t.string "incorrect"
to:
t.text :incorrect, array: true, default: []
I also suggest using symbols instead of strings for column names, and t.timestamps, which generates created_at and updated_at for you.
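The quoted-array symptom itself comes from plain Ruby: when an Array is written into a string column, the value gets coerced to a single string (effectively Array#to_s). A minimal illustration, no Rails required:

```ruby
require 'json'

# The API returns JSON, so HTTParty hands back a real Ruby array...
incorrect = JSON.parse('["Deep Purple", "Feeder", "Uriah Heep"]')
incorrect.class  # => Array

# ...but a t.string column can only hold text, so the array is
# coerced to one string on the way into the database:
stored = incorrect.to_s
stored  # => '["Deep Purple", "Feeder", "Uriah Heep"]'
```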
I'm struggling to wrap my mind around an ActiveRecord query.
I'm trying to search my database for GolfRetailer objects with IDs 1..100 that have something (not nil) in their :website field and that don't have true in their duplicate_domain field.
Here's the query I expected to work:
GolfRetailer.where.not(website: nil, duplicate_domain: true).where(id: 1..100)
I also tried this variant of essentially the same query: GolfRetailer.where.not(website: nil).where(id: 1..100, duplicate_domain: !true)
But both return an empty array, despite there definitely being records that meet those requirements.
When I run GolfRetailer.where.not(website: nil).where(id: 1..100) I get an array, and when I run GolfRetailer.where.not(website: nil, duplicate_domain: nil).where(id: 1..100) I also get an array, but with all records that do have the true duplicate_domain flag, which isn't what I'm looking for.
I'd rather not search for records that have duplicate_domain: nil as that's not always correct (I may not have processed their domain yet).
For clarity, here is the Schema for the Model.
create_table "golf_retailers", force: :cascade do |t|
  t.string "name"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.string "place_id"
  t.string "website"
  t.string "formatted_address"
  t.string "google_places_name"
  t.string "email"
  t.boolean "duplicate_domain"
  t.index ["duplicate_domain"], name: "index_golf_retailers_on_duplicate_domain"
end
What am I missing to make this query work?
This is happening because in SQL, when you do a != TRUE, any NULL values will not be included in the result. NULL represents an unknown value, and comparisons against an unknown value evaluate to NULL rather than TRUE, so those rows are excluded.
One way to get around this is to use IS DISTINCT FROM:
GolfRetailer
  .where(id: 1..100)
  .where.not(website: nil)
  .where("duplicate_domain IS DISTINCT FROM ?", true)
As others have mentioned, you should also ask yourself whether it is ever really unknown to you if a GolfRetailer has a duplicate_domain.
If all GolfRetailers with a duplicate_domain of NULL actually don't have one (false), then you should consider preventing NULL values for that column entirely.
You can do this by adding a NOT NULL constraint to the column with a database migration.
Before adding the NOT NULL constraint, you will first need to make sure all of the data in the column has non-null values:
def change
  GolfRetailer.in_batches.update_all(duplicate_domain: false)
  change_column_null :golf_retailers, :duplicate_domain, false
end
If your application is under load, you should also be careful about the potential performance impact a migration like this might have - notably if you add a NOT NULL constraint with a default value.
Consider using something like the Strong Migrations gem to help find DB migrations that might cause downtime before production.
I am trying to add a date of birth to each patient in my database, but I'm having issues setting the dob attribute. When I run, for example, Patient.first(dob: "01/25/1990"), I get an error reading "no implicit conversion of Integer into Hash". Any ideas on how I would do this?
create_table "patients", force: :cascade do |t|
  t.string "first_name"
  t.string "last_name"
  t.integer "age"
  t.date "dob"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
end
When I seeded my database, my dob field was nil
I have also tried Patient.first(dob: Date.parse('31-12-2010')) and still get the same error.
You have two problems:
first doesn't take query arguments.
Your date format is ambiguous.
The first finder method looks like:
first(limit = nil)
Find the first record (or first N records if a parameter is supplied). If no order is defined it will order by primary key.
You want to use find_by as your .where(...).first shortcut.
And to avoid ambiguity with your dates, you should use Date objects or ISO-8601 formatted strings (i.e. YYYY-MM-DD) inside the application and leave the supposedly "human friendly" formats for the very edges.
You want to say:
Patient.find_by(dob: '1990-01-25')
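The ambiguity is easy to see in plain Ruby, no Rails needed: Date.parse guesses day-first for slash and dash dates, while Date.iso8601 is unambiguous.

```ruby
require 'date'

# Ruby's heuristics read this as day-month-year, which happens to work:
Date.parse('31-12-2010')    # => 2010-12-31

# The US-style string from the question is read as day 1 of month 25,
# which is not a valid date:
begin
  Date.parse('01/25/1990')
rescue ArgumentError
  # invalid date
end

# ISO-8601 strings are parsed the same way everywhere:
Date.iso8601('1990-01-25')  # => 1990-01-25
```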
So, I'm using Rails 4, and I have an enum column on my "Sales_Opportunity" object called pipeline_status - this enables me to move it through a sales pipeline (e.g. New Lead, Qualified Lead, Closed deal etc). This all works fine. I'm able to find the number of sales_opportunities that a company has by status through using the following:
<%= @company.sales_opportunities.where(pipeline_status: 3).count %>
This all works fine. What I want to do is to find all sales_opportunities that have the pipeline_status of "closed_won" (enum value of 4 in my app) and sum the value of each won deal (so I can represent the total value of the customer based on the deals that are won in the system). A Sales_Opportunity in my model has a sale_value field, so I tried:
<%= @company.sales_opportunities.where(pipeline_status: 4).each.sale_value.sum %>
which returns the following error:
undefined method `sale_value' for #<Enumerator:0x007f9b87a9d128>
This is probably a trivial error, but I can't for the life of me figure out what's going on. Is the where statement returning the Enumerator, or the sales_opportunity objects? Any help would be gratefully appreciated.
If it helps here are the fields in my sales_opportunities table:
create_table "sales_opportunities", force: true do |t|
  t.datetime "close_date"
  t.integer "user_id"
  t.datetime "created_at"
  t.datetime "updated_at"
  t.integer "pipeline_status", default: 0
  t.string "opportunity_name"
  t.integer "company_id"
  t.decimal "sale_value", precision: 15, scale: 2, default: 0.0
end
A Sales_opportunity belongs_to a Company Object and a User Object, if that makes any difference.
Use the aggregate function sum:
<%= @company.sales_opportunities.where(pipeline_status: 4).sum(:sale_value) %>
Another possibility is to use:
<%= @company.sales_opportunities.where(pipeline_status: 4).pluck(:sale_value).reduce(0, :+) %>
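The practical difference between the two: sum(:sale_value) does the addition in a single SQL query, while pluck(:sale_value).reduce(0, :+) loads every value and adds them in Ruby. The reduce step itself is plain Ruby; since sale_value is a decimal column, pluck would return BigDecimals (the values below are made up for illustration):

```ruby
require 'bigdecimal'

# What pluck(:sale_value) might hand back for three closed-won deals:
values = [BigDecimal('1000.00'), BigDecimal('249.99'), BigDecimal('0.01')]

# reduce(0, :+) folds the list together, starting from 0:
total = values.reduce(0, :+)
total.to_s('F')  # => "1250.0"
```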
I have an application that has tens of thousands of snapshot records. A very small number of these 'snapshots' (say 1 in 1000) will have one or more 'positions' through a :has_many association.
How can I efficiently discover if a snapshot has a position without firing an active record query for each snapshot? My current solution is to add a boolean field to snapshots - if a snapshot has a position, 'has_position' is set to true. This seems a little messy since it means I have to modify the associated snapshot every time I create a position. Is there a cleaner way to handle this scenario?
create_table "snapshots", :force => true do |t|
  t.datetime "created_at"
  t.datetime "updated_at"
  t.boolean "has_position"
end

create_table "positions", :force => true do |t|
  t.integer "snapshot_id"
  t.datetime "created_at"
  t.datetime "updated_at"
end
If you generate the migration for positions with a reference to snapshots, the generated migration file will have
add_index :positions, :snapshot_id
appended to the end of it.
With an index on snapshot_id, the DB needs only an O(log n) index lookup to figure out whether a snapshot has at least one associated position. Not as good as constant time with the boolean, but with mere tens of thousands of records it shouldn't take noticeably longer (unless you're doing this very, very frequently).
Additionally, a simple has_position boolean might be harder than you think to maintain. You can set it to true on creation of an associated position, but you can't simply set it to false on deletion, because another position might still exist, and without an index checking for one means a table scan.
If for some reason using an index is undesirable (or you really need constant time lookup), then I'd recommend using a :counter_cache column.
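To flesh out the counter cache suggestion (a sketch only; the class names follow the schema above, and :counter_cache expects a positions_count column by default):

```ruby
# Hypothetical migration adding the cache column:
class AddPositionsCountToSnapshots < ActiveRecord::Migration
  def change
    add_column :snapshots, :positions_count, :integer, default: 0, null: false
  end
end

# Rails keeps positions_count on the parent snapshot up to date
# automatically whenever a position is created or destroyed:
class Position < ActiveRecord::Base
  belongs_to :snapshot, counter_cache: true
end

class Snapshot < ActiveRecord::Base
  has_many :positions
end

# Constant-time check, with no per-snapshot query:
#   snapshot.positions_count > 0
#   Snapshot.where("positions_count > 0")
```

Note that existing rows need their counts backfilled once (e.g. with Snapshot.reset_counters) before the column can be trusted.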