I'm struggling to wrap my mind around an ActiveRecord query.
I'm trying to search my database for GolfRetailer objects with IDs 1..100 that have something (not nil) in their :website field and that don't have true in their duplicate_domain field.
Here's the query I expected to work:
GolfRetailer.where.not(website: nil, duplicate_domain: true).where(id: 1..100)
I also tried this variant of essentially the same query: GolfRetailer.where.not(website: nil).where(id: 1..100, duplicate_domain: !true)
But both return an empty array, despite there definitely being records that meet those requirements.
When I run GolfRetailer.where.not(website: nil).where(id: 1..100) I get an array, and when I run GolfRetailer.where.not(website: nil, duplicate_domain: nil).where(id: 1..100) I also get an array, but with all records that do have the true duplicate_domain flag, which isn't what I'm looking for.
I'd rather not search for records that have duplicate_domain: nil as that's not always correct (I may not have processed their domain yet).
For clarity, here is the Schema for the Model.
create_table "golf_retailers", force: :cascade do |t|
  t.string "name"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.string "place_id"
  t.string "website"
  t.string "formatted_address"
  t.string "google_places_name"
  t.string "email"
  t.boolean "duplicate_domain"
  t.index ["duplicate_domain"], name: "index_golf_retailers_on_duplicate_domain"
end
What am I missing to make this query work?
This is happening because in SQL a comparison such as duplicate_domain != TRUE evaluates to NULL (unknown) rather than TRUE when the column is NULL, so rows with NULL values are not included in the result. NULL represents an unknown value, and the DB cannot meaningfully compare against an unknown value, so those rows are excluded.
One way to get around this is to use IS DISTINCT FROM (available in PostgreSQL, for example), which treats NULL as an ordinary comparable value:
GolfRetailer
  .where(id: 1..100)
  .where.not(website: nil)
  .where("duplicate_domain IS DISTINCT FROM ?", true)
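To see why the two predicates behave differently, here is a minimal sketch that models SQL's three-valued logic in plain Ruby, with nil standing in for NULL. The helper names sql_not_equal and sql_is_distinct_from and the sample rows are made up for illustration; this is not how ActiveRecord or the database implements the comparison.

```ruby
# Plain-Ruby model of SQL comparison semantics; nil stands in for NULL.
# Helper names and sample rows are hypothetical, for illustration only.

# `a != b` in SQL: any comparison involving NULL yields NULL (unknown).
def sql_not_equal(a, b)
  return nil if a.nil? || b.nil?
  a != b
end

# `a IS DISTINCT FROM b`: NULL is treated as an ordinary comparable value,
# so the result is always true or false, never NULL.
def sql_is_distinct_from(a, b)
  a != b
end

rows = [
  { id: 1, duplicate_domain: true },
  { id: 2, duplicate_domain: false },
  { id: 3, duplicate_domain: nil }
]

# A WHERE clause keeps a row only when the predicate is exactly TRUE.
kept_by_not_equal = rows.select { |r| sql_not_equal(r[:duplicate_domain], true) == true }
kept_by_distinct  = rows.select { |r| sql_is_distinct_from(r[:duplicate_domain], true) == true }

p kept_by_not_equal.map { |r| r[:id] }  # => [2]
p kept_by_distinct.map { |r| r[:id] }   # => [2, 3]
```

The row with a NULL flag (id 3) only survives the second filter, which matches the behaviour described in the question.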
As others have mentioned, you should also ask yourself if it's really the case that it's ever unknown to you if a GolfRetailer has a duplicate_domain.
If all GolfRetailers with a duplicate_domain of NULL actually mean they don't have one (false), then you should consider preventing a NULL value for that column entirely.
You can do this by adding a NOT NULL constraint on the column with a change_column_null database migration.
In order to add the NOT NULL constraint you will first need to make sure all of the data in the column has non-null values.
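def change
  # Backfill only the NULL rows so existing true flags are preserved
  GolfRetailer.where(duplicate_domain: nil).in_batches.update_all(duplicate_domain: false)
  # The third argument (false) means NULL is no longer allowed
  change_column_null :golf_retailers, :duplicate_domain, false
end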
If your application is under load, you should also be careful about the potential performance impact any migration like this might have, notably if you add a NOT NULL constraint with a default value.
Consider using something like the Strong Migrations gem to help find DB migrations that might cause downtime before production.
Related
I have a best practice question when it comes to default values, uniqueness constraints and API design. For this exercise, I am creating a pokedex api with the following database schema in postgresql.
create_table :pokemon, id: :uuid do |t|
  t.string :name, null: false, unique: true
  t.string :national_index, null: false, unique: true
  t.text :description, default: 'unknown', null: false
  t.string :hp, default: '0', null: false
  t.string :attack, default: '0', null: false
  t.string :defense, default: '0', null: false
  t.string :special_attack, default: '0', null: false
  t.string :special_defense, default: '0', null: false
  t.string :speed, default: '0', null: false
  t.string :height, default: '0', null: false
  t.string :weight, default: '0', null: false
  t.boolean :male, default: false, null: false
  t.boolean :female, default: false, null: false
  t.string :category, default: 'unknown', null: false
  t.timestamps
end
As you can see from this rails migration, I have presence constraints on all my fields with the addition of a unique constraint for the name of the pokemon and its national index. Additionally, you can see in the database migration, a pokemon's battle statistics are represented with a default value of 0 in the event that one cannot determine a pokemon's battle statistic and its description and category fall under 'unknown' as its default value.
Here is my situation and I think it can be best explained with a user use case.
Let's say that a pokemon researcher encounters a new undiscovered pokemon and uploads this pokemon to the pokedex database through the api that I am creating. I imagine in this case the pokemon researcher will have to give this pokemon a name (which can later be changed), and the rest of this pokemon's fields will all be filled with default values that can also be changed later once more research has been conducted on the pokemon. However, the pokemon researcher can't assign it a national index number because it is an "unknown" pokemon; in other words, it's not officially recognized. Only an officially recognized pokemon can have a national index number, and no two pokemon, official or undiscovered, can have a conflicting number. So my question is: what can I use to add a default value to a field that I'm defining as present and unique? I would love to give it an N/A, unknown, or other type of value, but I'm unable to because they aren't unique. The next undiscovered pokemon can't also have a value of 'N/A' for its national index number. Is there a best practice for a situation like this?
You can argue that perhaps it would be best to just leave those fields empty but from this excellent book I'm reading about API design, I specifically refer to this section under default values for strings.
Similarly to number and Boolean fields, many serialization formats don't necessarily permit a distinct null value (null) from a zero value (""). As a result, it can be difficult to determine the difference between a user specifying that a string should be the empty string rather than a user specifically asking the API to 'do what's best' for the field given the rest of the context.
Luckily though, there are quite a few options available. In many cases, an empty string is simply not a valid value for a field. As a result, the empty string can indeed be used as a flag indicating that a default value should be injected and saved. In other cases, the string value might have a specific set of appropriate values, with the empty string being one of the choices. In this scenario, it's perfectly reasonable to allow a choice of 'default' to act as a flag indicating that a default value should be stored instead.
The book seems to be advocating for a default value, but is it even possible in the situation I described? Basically, I want my api to be predictable and reliable, and I imagine the scenario I described to be something that happens quite often.
You can set default values at the application level for more complex scenarios:
class Pokemon < ActiveRecord::Base
  before_validation :set_national_index, on: :create

  private

  def set_national_index
    self.national_index = if official?
      # ...
    else
      "unknown_#{Time.zone.now.to_i}_#{rand(99999)}"
    end
  end
end
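One caveat with a timestamp-plus-rand placeholder is that two records created in the same second can, in principle, draw the same random number. A minimal variation, assuming you only need an opaque unique placeholder, is to use SecureRandom from the standard library. The module name PlaceholderIndex and its methods are hypothetical, and the "unknown_" prefix simply mirrors the snippet above.

```ruby
require "securerandom"

# Hypothetical helper: builds a placeholder national index for
# pokemon that are not yet officially recognized.
module PlaceholderIndex
  PREFIX = "unknown_"

  # SecureRandom.uuid returns a random UUID string, so collisions are
  # extremely unlikely. A database-level unique index would still be
  # the real guarantee of uniqueness.
  def self.generate
    "#{PREFIX}#{SecureRandom.uuid}"
  end

  # Lets API code tell placeholders apart from real index numbers.
  def self.placeholder?(value)
    value.to_s.start_with?(PREFIX)
  end
end

puts PlaceholderIndex.generate
```

In the callback above, the else branch could then call PlaceholderIndex.generate, and API responses could use placeholder? to present the value as "unknown" to clients.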
I am trying to add a date of birth to each patient in my database and am having issues adding a date to the dob attribute on each patient. When I run, for example, Patient.first(dob: "01/25/1990"), I get an error reading no implicit conversion of Integer into Hash. Any ideas on how I would do so?
create_table "patients", force: :cascade do |t|
  t.string "first_name"
  t.string "last_name"
  t.integer "age"
  t.date "dob"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
end
When I seeded my database, my dob field was nil.
I have also tried Patient.first(dob: Date.parse('31-12-2010')) and still get the same error.
You have two problems:
first doesn't take query arguments.
Your date format is ambiguous.
The first finder method looks like:
first(limit = nil)
Find the first record (or first N records if a parameter is supplied). If no order is defined it will order by primary key.
You want to use find_by as your .where(...).first shortcut.
And to avoid ambiguity with your dates, you should use Date objects or ISO-8601 formatted strings (i.e. YYYY-MM-DD) inside the application and leave the supposedly "human friendly" formats for the very edges.
You want to say:
Patient.find_by(dob: '1990-01-25')
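If the date arrives as a US-style MM/DD/YYYY string, one way to remove the ambiguity at the edge is to parse it with an explicit format and pass a Date (or its ISO-8601 string) onward. This is a minimal sketch using Ruby's standard date library; the helper name parse_us_date is hypothetical, and the calls shown in the trailing comments assume the Patient model from the question.

```ruby
require "date"

# Hypothetical helper: converts a US-style "MM/DD/YYYY" string into a Date.
# Date.strptime raises an error if the string does not match the format.
def parse_us_date(string)
  Date.strptime(string, "%m/%d/%Y")
end

dob = parse_us_date("01/25/1990")
puts dob.iso8601  # prints "1990-01-25"

# Assuming the Patient model from the question, the Date can then be used
# for lookups or assignments, for example:
#   Patient.find_by(dob: dob)
#   Patient.first.update(dob: dob)
```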
I am obtaining this error on my Ruby on Rails app,
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I've read through Stack Overflow and can't find an answer that works for me. Here are the specific parts of the code:
ActiveRecord::StatementInvalid in Store#show_item
Showing /media/store_test/app/views/store/show_item.html.erb where line #24 raised:
PG::UndefinedFunction: ERROR: operator does not exist: character varying = integer
LINE 1: ...CT "show_item".* FROM "current_item" WHERE (user_id = 1)
^
The background to this error is that I have two kinds of users, store users and employee users. Both are users, but employee users have a "flag" on them so they can see all items in the store. Store users do not have this flag, so this web page should show only the items they have "wishlisted". When I create a table to populate this, I get the above error.
This code works when I'm an employee user and populates my table as required, but does not work when I'm a store user.
Question: How can I fix this error without heavily modifying my code?
EDIT: SCHEMA
create_table "current_item", force: :cascade do |t|
  t.string "name", default: "", null: false
  t.string "description"
  t.integer "cost"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.string "user_id"
end
Thanks to Marcin Kologziej:
The user_id column was defined with the wrong type. Instead of string, it should have been integer, as the error indicates. I therefore created a new migration using the code:
change_column :current_item, :user_id, :integer, using: 'user_id::integer'
And then performing:
rake db:migrate
And everything works perfectly now.
My model is really simple:
create_table "stack_items", force: true do |t|
  t.integer "stack_id"
  t.integer "service_id"
  t.text "description"
end
I need to remove duplicate StackItem records that have the same stack_id and service_id. However if one of the dupes has anything in the description field, I have to keep that one, and delete the other duplicate.
StackItem.group(:stack_id, :service_id).order("count_id desc").where("COUNT(*) > 1")
So far I've tried to just grab the duplicates but it's saying I cannot count within a where statement.
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: aggregate functions are not allowed in WHERE
How can I achieve this using Rails 4 and ActiveRecord? My database is Postgresql.
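The PostgreSQL error arises because a condition on an aggregate belongs in HAVING rather than WHERE; ActiveRecord exposes this through its having query method. The keep-or-delete rule itself can be sketched in plain Ruby. The Struct below is only a stand-in for the StackItem model, and duplicates_to_delete is a hypothetical helper; when no duplicate in a group has a description, this sketch assumes the first one is kept.

```ruby
# Stand-in for the StackItem model, for illustration only.
StackItemRow = Struct.new(:id, :stack_id, :service_id, :description)

# Hypothetical helper: returns the rows that should be deleted.
# Within each (stack_id, service_id) group, one row is kept: the first
# one with a non-blank description, or else simply the first row.
def duplicates_to_delete(items)
  items.group_by { |item| [item.stack_id, item.service_id] }.flat_map do |_key, group|
    next [] if group.size < 2 # not a duplicate

    keeper = group.find { |item| !item.description.to_s.strip.empty? } || group.first
    group.reject { |item| item.equal?(keeper) }
  end
end

items = [
  StackItemRow.new(1, 1, 1, nil),
  StackItemRow.new(2, 1, 1, "keep me"),
  StackItemRow.new(3, 2, 1, nil),
  StackItemRow.new(4, 3, 3, nil),
  StackItemRow.new(5, 3, 3, "")
]

p duplicates_to_delete(items).map(&:id)  # => [1, 5]
```

With real records, the resulting ids could then be removed in one statement, for instance with StackItem.where(id: ids).delete_all.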
I have an application that has tens of thousands of snapshot records. A very small number of these 'snapshots' (say 1 in 1000) will have one or more 'positions' through a :has_many association.
How can I efficiently discover if a snapshot has a position without firing an active record query for each snapshot? My current solution is to add a boolean field to snapshots - if a snapshot has a position, 'has_position' is set to true. This seems a little messy since it means I have to modify the associated snapshot every time I create a position. Is there a cleaner way to handle this scenario?
create_table "snapshots", :force => true do |t|
  t.datetime "created_at"
  t.datetime "updated_at"
  t.boolean "has_position"
end
create_table "positions", :force => true do |t|
  t.integer "snapshot_id"
  t.datetime "created_at"
  t.datetime "updated_at"
end
If you generate the migration for positions with a reference to snapshots, the migration file will be generated with
add_index :positions, :snapshot_id
appended to the end of it.
With an index on snapshot_id, the DB needs only a single indexed lookup, roughly O(log n) time, to figure out whether or not a snapshot has at least one associated position. That is not as good as constant time with the boolean, but with mere tens of thousands of records it shouldn't take noticeably longer (unless you're doing this very, very frequently).
Additionally, a simple has_position boolean might be harder than you think to maintain without an index. You can set it to true on creation of an associated position, but you can't set it to false on the deletion because there might exist another one, and you'd have to do a table scan again.
If for some reason using an index is undesirable (or you really need constant time lookup), then I'd recommend using a :counter_cache column.
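For reference, Rails enables a counter cache by declaring belongs_to :snapshot, counter_cache: true on Position, which by convention expects an integer positions_count column on snapshots. The plain-Ruby sketch below only models the bookkeeping idea, to show why a count can be safely decremented on deletion where a boolean cannot; the class SnapshotTally and its methods are hypothetical and not part of Rails.

```ruby
# Hypothetical model of counter-cache bookkeeping; not Rails code.
class SnapshotTally
  attr_reader :positions_count

  def initialize
    @positions_count = 0
  end

  # Called whenever an associated position is created.
  def position_created
    @positions_count += 1
  end

  # Called whenever an associated position is destroyed.
  # Unlike a boolean flag, no extra lookup is needed to know
  # whether other positions still exist.
  def position_destroyed
    @positions_count -= 1 if @positions_count > 0
  end

  # Constant-time check, equivalent to the has_position boolean.
  def has_position?
    @positions_count > 0
  end
end

tally = SnapshotTally.new
tally.position_created
tally.position_created
tally.position_destroyed
puts tally.has_position?  # prints "true"
```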