I have a lot of data that I need to query out of a database. Heroku is timing out when I do the following, because of the 30 second limit:
account.records.all.each do |record|
  record.contacts.all.each do |contact|
    contact.address.all.each do |address|
      # ...write to file etc.
    end
  end
end
I've read that an SQL view will help with performance compared to querying every record in an .each loop; however, I need to apply a where clause to this set of data. Currently, if I use the 'ExportAllRecord' view like so: ExportAllRecord.where("account_id = 3"), it executes the following:
ExportAllRecord Load (5.0ms) SELECT "export_all_records".* FROM "export_all_records" WHERE (account_id = 3)
whereas I actually need the where clause to be applied inside the view.
How can I parameterise the SQL View?
I'm using ActiveRecord.
Thanks.
ActiveRecord doesn't care whether it queries a normal database table or a database view.
Assuming your database view is named export_all_records, then just create a new model:
# in app/models/export_all_record.rb
class ExportAllRecord < ActiveRecord::Base
  default_scope { readonly }
end
Use this model like a normal ActiveRecord model:
id = 3 # or perhaps params[:id]
ExportAllRecord.where(account_id: id)
#=> returns all records from the view with the given account_id
# (find_by would return only the first matching record)
You can add more conditions if you need to:
ExportAllRecord.
  where(account_id: id).
  where(column1: true, column2: 'foobar').
  order(:column3)
My BookingGroup has_many Booking. Booking contains a category column where the data can be "adult", "child_infant" or "child_normal".
Now I want to count the total of all %child% categories and display it in my index view table.
I wasn't sure whether this could be done in one line or whether I have to use a scope; this is where I got stuck.
BookingGroup model
def search_by_category
bookings.visible.map(&:category).inject(:+)
end
Assuming category is a string column, you should be able to count it like that :
bookings.visible.where("category LIKE ?", "child%").count
bookings.visible.where(category: ["child_infant", "child_normal"]).count
We can use LIKE just as in SQL with active record
In your BookingGroup model
def search_by_category
bookings.visible.where('category LIKE ?', '%child%').size
end
But if you do this for many booking_groups, your code will have an N+1 query problem. You can avoid it by computing the counts in a single grouped query in your controller:
@booking_groups = BookingGroup.joins(:bookings).select('booking_groups.*', 'count(*) as total_bookings').where('bookings.category LIKE ?', '%child%').group(:id)
Then you can call:
@booking_groups.first.total_bookings
I have an ActiveRecord relation of a user's previous "votes"...
@previous_votes = current_user.votes
I need to filter these down to votes only on the current "challenge", so Ruby's select method seemed like the best way to do that...
@previous_votes = current_user.votes.select { |v| v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id }
But I also need to update the attributes of these records, and the select method turns my relation into an array which can't be updated or saved!
@previous_votes.update_all :ignore => false
# ...
# undefined method `update_all' for #<Array:0x007fed7949a0c0>
How can I filter down my relation like the select method is doing, but not lose the ability to update/save the items with ActiveRecord?
Poking around Google, it seems like named_scopes appear in all the answers for similar questions, but I can't figure out if they can specifically accomplish what I'm after.
The problem is that select is not an SQL method. It fetches all records and filters them on the Ruby side. Here is a simplified example:
votes = Vote.scoped
votes.select{ |v| v.active? }
# SQL: select * from votes
# Ruby: all.select{ |v| v.active? }
Since update_all is an SQL method you can't use it on a Ruby array. You can stick to performing all operations in Ruby or move some (all) of them into SQL.
votes = Vote.scoped.select { |v| v.active? } # now a plain Array of the active votes
# N SQL operations (N = number of votes)
votes.each { |vote| vote.update_attribute :ignore, false }
# or in 1 SQL operation
Vote.where(id: votes.map(&:id)).update_all(ignore: false)
If you don't actually use fetched votes it would be faster to perform the whole select & update on SQL side:
Vote.where(active: true).update_all(ignore: false)
While the previous examples work fine with your select, this one requires you to rewrite it in terms of SQL. If you have set up all relationships in Rails models you can do it roughly like this:
entry = Entry.find(params[:entry_id])
current_user.votes.joins(:challenges).merge(entry.challenge.votes)
# requires following associations:
# Challenge.has_many :votes
# User.has_many :votes
# Vote.has_many :challenges
And Rails will construct the appropriate SQL for you. But you can always fall back to writing the SQL by hand if something doesn't work.
Use collection_select instead of select. collection_select is specifically built on top of select to return ActiveRecord objects and not an array of strings like you get with select.
@previous_votes = current_user.votes.collection_select { |v| v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id }
This should return @previous_votes as an array of objects
EDIT: Updating this post with another suggested way to return those AR objects in an array
@previous_votes = current_user.votes.collect {|v| records.detect { v.entry.challenge_id == Entry.find(params[:entry_id]).challenge_id}}
A nice approach to this is to use scopes. In your case, you can set up the scope as follows:
class Vote < ActiveRecord::Base
  # Braces are needed here: a do...end block would bind to `scope`, not to `lambda`.
  scope :for_challenge, lambda { |challenge_id|
    joins(:entry).where("entries.challenge_id = ?", challenge_id)
  }
end
Then your code for getting current votes will look like:
challenge_id = Entry.find(params[:entry_id]).challenge_id
@previous_votes = current_user.votes.for_challenge(challenge_id)
I believe you can do something like:
@entry = Entry.find(params[:entry_id])
@previous_votes = Vote.joins(:entry).where(entries: { id: @entry.id, challenge_id: @entry.challenge_id })
I'm trying to do a simple query of a serialized column, how do you do this?
serialize :mycode, Array
1.9.3p125 :026 > MyModel.find(104).mycode
MyModel Load (0.6ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`id` = 104 LIMIT 1
=> [43565, 43402]
1.9.3p125 :027 > MyModel.find_all_by_mycode("[43402]")
MyModel Load (0.7ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`mycode` = '[43402]'
=> []
1.9.3p125 :028 > MyModel.find_all_by_mycode(43402)
MyModel Load (1.2ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`mycode` = 43402
=> []
1.9.3p125 :029 > MyModel.find_all_by_mycode([43565, 43402])
MyModel Load (1.1ms) SELECT `mymodels`.* FROM `mymodels` WHERE `mymodels`.`mycode` IN (43565, 43402)
=> []
It's just a trick to not slow your application. You have to use .to_yaml.
exact result:
MyModel.where("mycode = ?", [43565, 43402].to_yaml)
#=> [#<MyModel id:...]
Tested only for MySQL.
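To see why the exact match works, and how fragile it is, it helps to look at the string that actually gets compared. A minimal sketch in plain Ruby, assuming the column uses the YAML coder (the long-standing default for serialize); no database is involved and the helper name serialized is made up for illustration:

```ruby
require 'yaml'

# What a YAML-coded `serialize` column would hold for a given array
# (hypothetical helper, assumes the YAML coder).
def serialized(values)
  values.to_yaml
end

stored = serialized([43565, 43402])
puts stored.inspect                          # "---\n- 43565\n- 43402\n"

# The SQL comparison is plain string equality, so order and completeness matter.
puts serialized([43565, 43402]) == stored    # true
puts serialized([43402, 43565]) == stored    # false
puts serialized([43402]) == stored           # false
```

So this approach only finds rows whose array is identical, element for element and in the same order, to the one you pass in.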
Basically, you can't. The downside of #serialize is that you're bypassing your database's native abstractions. You're pretty much limited to loading and saving the data.
That said, one very good way to slow your application to a crawl could be:
MyModel.all.select { |m| m.mycode.include? 43402 }
Moral of the story: don't use #serialize for any data you need to query on.
A serialized array is stored in the database in a particular format. For example:
[1, 2, 3, 4]
is stored as
1\n 2\n 3\n etc.
Hence the query would be:
MyModel.where("mycode like ?", "% 2\n%")
Note the space between % and 2.
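The role of that leading space can be checked without a database. This is a rough sketch that assumes the YAML coder and uses a substring test as a stand-in for SQL LIKE '%...%'; the helper name like_match? is hypothetical:

```ruby
require 'yaml'

# Rough stand-in for SQL `LIKE '%<fragment>%'`: a plain substring test.
def like_match?(stored_text, fragment)
  stored_text.include?(fragment)
end

with_two    = [1, 2, 3].to_yaml   # "---\n- 1\n- 2\n- 3\n"
without_two = [12, 22].to_yaml    # "---\n- 12\n- 22\n"

puts like_match?(with_two, " 2\n")      # true:  matches the "- 2\n" entry
puts like_match?(without_two, " 2\n")   # false: the space rules out 12 and 22
puts like_match?(without_two, "2\n")    # true:  without the space, 12 and 22 match too
```

The space and the newline together act as delimiters around the value, which is what keeps 2 from matching 12 or 22.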
Noodl's answer is right, but not entirely correct.
It really depends on the database/ORM adapter you are using. For instance, PostgreSQL can now store and search hashes/JSON (check out hstore), and I remember reading that the ActiveRecord adapter for PostgreSQL now handles it properly. And if you are using Mongoid or something like that, then you are using unstructured data (i.e. JSON) at the database level everywhere.
However, if you are using a database that can't really handle hashes, like the MySQL/ActiveRecord combination, then the only reason to use a serialized field is for data that you create or write in some background process and display on demand. The only two uses I have found in my experience are reports (like a stat field on a Product model, where I need to store some averages and medians for a product) and user options (like their preferred template color, which I really don't need to query on). User information such as a mailing list subscription, however, needs to be searchable for email blasts.
PostgreSQL hstore ActiveRecord Example:
MyModel.where("mycode @> 'KEY=>\"#{VALUE}\"'")
UPDATE
As of 2017 both MariaDB and MySQL support JSON field types.
You can query the serialized column with a sql LIKE statement.
MyModel.where("mycode LIKE ?", "%43402%")
This is quicker than using include?; however, you cannot use an array as the parameter.
Good news! If you're using PostgreSQL with hstore (which is super easy with Rails 4), you can now totally search serialized data. This is a handy guide, and here's the syntax documentation from PG.
In my case I have a dictionary stored as a hash in an hstore column called amenities. I want to check for a couple queried amenities that have a value of 1 in the hash, I just do
House.where("amenities @> 'wifi => 1' AND amenities @> 'pool => 1'")
Hooray for improvements!
There's a blog post from 2009 from FriendFeed that describes how to use serialized data within MySQL.
What you can do is create tables that function as indexes for any data that you want to search.
Create a model that contains the searchable values/fields
In your example, the models would look something like this:
class MyModel < ApplicationRecord
# id, name, other fields...
serialize :mycode, Array
end
class Item < ApplicationRecord
# id, value...
belongs_to :my_model
end
Creating an "index" table for searchable fields
When you save MyModel, you can do something like this to create the index:
Item.where(my_model: self).destroy_all
self.mycode.each do |mycode_item|
  Item.create(my_model: self, value: mycode_item)
end
Querying and Searching
Then when you want to query and search just do:
Item.where(value: [43565, 43402]).all.map(&:my_model)
Item.where(value: 43402).all.map(&:my_model)
You can add a method to MyModel to make that simpler:
def self.find_by_mycode(value_or_values)
  Item.where(value: value_or_values).all.map(&:my_model)
end
MyModel.find_by_mycode([43565, 43402])
MyModel.find_by_mycode(43402)
To speed things up, you will want to create a SQL index for that table.
Using the following comments in this post
https://stackoverflow.com/a/14555151/936494
https://stackoverflow.com/a/15287674/936494
I was successfully able to query a serialized Hash in my model
class Model < ApplicationRecord
serialize :column_name, Hash
end
When column_name holds a Hash like
{ my_data: [ { data_type: 'MyType', data_id: 113 } ] }
we can query it in following manner
Model.where("column_name = ?", hash.to_yaml)
That generates a SQL query like
Model Load (0.3ms) SELECT "models".* FROM "models" WHERE (column_name = '---
:my_data:
- :data_type: MyType
:data_id: 113
')
If anybody wants to run the generated query in a SQL terminal, it should work, but take care that the value is in the exact format stored in the DB. There is also another easy way, which I found under "PostgreSQL newline character": use an escape string containing the newline characters
select * from table_name where column_name = E'---\n:my_data:\n- :data_type: MyType\n :data_id: 113\n'
The most important part in above query is E.
Note: The database on which I executed above is PostgreSQL.
To search serialized list you need to prefix and postfix the data with unique characters.
Example:
Rather than something like:
2345,12345,1234567, which would cause issues if you tried to search for 2345, you do something like <2345>,<12345>,<1234567> and search for <2345> instead (the search query gets transformed). Of course, the choice of prefix/postfix characters depends on the valid data that will be stored. You might instead use something like ||| if you expect < and potentially | to appear in the data. Of course, that increases the space the field uses and could cause performance issues.
Using a trigrams index or something would avoid potential performance issues.
You can serialize it like data.map { |d| "<#{d}>" }.join(',') and deserialize it via data.delete('<>').split(','). A serializer class would do the job quite well to load/extract the data.
The way you do this is by setting the database field to text and using Rails' serialize model method with a custom lib class. The lib class needs to implement two methods:
def self.dump(obj) # (returns string to be saved to database)
def self.load(text) # (returns object)
Example with duration. Extracted from the article so link rot wouldn't get it, please visit the article for more information. The example uses a single value, but it's fairly straightforward to serialize a list of values and deserialize the list using the methods mentioned above.
class Duration
  # Used for `serialize` method in ActiveRecord
  class << self
    def load(duration)
      self.new(duration || 0)
    end

    def dump(obj)
      unless obj.is_a?(self)
        raise ::ActiveRecord::SerializationTypeMismatch,
          "Attribute was supposed to be a #{self}, but was a #{obj.class}. -- #{obj.inspect}"
      end
      obj.length
    end
  end

  attr_accessor :minutes, :seconds

  def initialize(duration)
    @minutes = duration / 60
    @seconds = duration % 60
  end

  def length
    (minutes.to_i * 60) + seconds.to_i
  end
end
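For the list case, the same load/dump pair could look roughly like the following. This is only a sketch: the class name DelimitedList is made up, it runs as plain Ruby without Rails, and it assumes the stored values never contain < or > themselves.

```ruby
# Hypothetical coder: stores a list as "<a>,<b>,<c>" so that each value can be
# searched for unambiguously, e.g. with LIKE '%<a>%'.
class DelimitedList
  # Returns the string to be saved to the database.
  def self.dump(list)
    Array(list).map { |item| "<#{item}>" }.join(',')
  end

  # Returns the list of values; they come back as strings.
  def self.load(text)
    text.to_s.scan(/<([^<>]*)>/).flatten
  end
end

stored = DelimitedList.dump([12345, 1234567])
puts stored                       # <12345>,<1234567>
puts stored.include?('2345')      # true  (the false positive a bare list would give)
puts stored.include?('<2345>')    # false (the delimiters prevent it)
p DelimitedList.load(stored)      # ["12345", "1234567"]
```

A class like this would be handed to serialize in the model; the exact call (a second positional argument or a coder: keyword) depends on the Rails version. Values load back as strings, so convert them with to_i if the application expects integers.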
If you have a serialized JSON column and you want to run a LIKE query on it, do it like this:
YourModel.where("hashcolumn like ?", "%#{search}%")
How would i do a query like this.
i have
@model = Model.near([latitude, longitude], 6.8)
Now i want to filter another model, which is associated with the one above.
(help me with getting the right way to do this)
model2 = Model2.where("model_id == :one_of_the_models_filtered_above", {:one_of_the_models_filtered_above => only_from_the_models_filtered_above})
the model.rb would be like this
has_many :model2s
the model2.rb
belongs_to :model
Right now it is like this (after @model = Model.near([latitude, longitude], 6.8)):
model2s = []
models.each do |model|
  model.model2s.each do |model2|
    model2s.push(model2)
  end
end
I want to accomplish the same thing, but with an active record query instead
i think i found something, why does this fail
Model2.where("model.distance_from([:latitude,:longitude]) < :dist", {:latitude => latitude, :longitude => longitude, :dist => 6.8})
this query throws this error
SQLite3::SQLException: near "(": syntax error: SELECT "tags".* FROM "tags" WHERE (model.distance_from([43.45101666666667,-80.49773333333333]) < 6.8)
, why
Use includes. It will eager-load the associated models (only two SQL queries instead of N+1).
@models = Model.near( [latitude, longitude], 6.8 ).includes( :model2s )
So when you call @models.first.model2s, the associated model2s will already be loaded (see the RoR guides for more info).
If you want to get an array of all model2s belonging to your collection of models, you can do :
@models.collect( &:model2s )
# add .flatten at the end of the chain if you want a one level deep array
# add .uniq at the end of the chain if you don't want duplicates
collect (also called map) will gather in an array the result of any block passed to each of the caller's elements (this does exactly the same as your code, see Enumerable's doc for more info). The & before the symbol converts it into a Proc passed to each element of the collection, so this is the same as writing
@models.collect {|model| model.model2s }
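The equivalence, and the effect of flatten and uniq, can be tried in plain Ruby. In this sketch the Structs Place and Tag are hypothetical stand-ins for the two ActiveRecord models:

```ruby
# Plain Structs stand in for the ActiveRecord models (hypothetical names).
Tag   = Struct.new(:label)
Place = Struct.new(:name, :tags)

shared = Tag.new('park')
places = [
  Place.new('A', [shared, Tag.new('cafe')]),
  Place.new('B', [shared])
]

nested   = places.collect(&:tags)                 # one inner array per place
explicit = places.collect { |place| place.tags }  # same thing, written out
flat     = places.collect(&:tags).flatten.uniq    # one level deep, no duplicates

puts nested == explicit   # true
p flat.map(&:label)       # ["park", "cafe"]
```

Enumerable#flat_map does the collect and the one-level flatten in a single step, if you prefer that spelling.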
One more thing: @mu is right, it seems SQLite does not know about your distance_from stored procedure. As I suspect this is a GIS-related question, you may want to ask about this particular issue on gis.stackexchange.com.
How do I define a model attribute as an expression of another attribute?
Example:
class Home < ActiveRecord::Base
  attr_accessible :address, :phone_number
end
Now I want to be able to return an attribute like :area_code, which would be an sql expression like "substr(phone_number, 1,3)".
I also want to be able to use the expression / attribute in a group by query for a report.
This seems to perform the query, but does not return an object with named attributes, so how do I use it in a view?
Rails Console:
@ac = Home.group("substr(phone_number, 1,3)").count
=> #<OrderedHash {"307"=>3, "515"=>1}>
I also expected this to work, but not sure what kind of object it is returning:
@test = Home.select("substr(phone_number, 1,3) as area_code, count(*) as c").group("substr(phone_number, 1,3)")
=> [#<Home>, #<Home>]
To expand on the last example. Here it is with Active Record logging turned on:
>Home.select("substr(phone_number, 1,3) as area_code, count(*) as c").group("substr(phone_number, 1,3)")
Output:
Home Load (0.3ms) SELECT substr(phone_number, 1,3) as area_code, count(*) as c FROM "homes" GROUP BY substr(phone_number, 1,3)
=> [#<Home>, #<Home>]
So it is executing the query I want, but giving me an unexpected data object. Shouldn't I get something like this?
[ #<area_code: "307", c: 3>, #<area_code: "515", c: 1> ]
You cannot access substr(...) because it is not an attribute of the initialized record object.
See : http://guides.rubyonrails.org/active_record_querying.html "selecting specific fields"
You can work around this as follows:
@test = Home.select("substr(phone_number, 1,3) as phone_number").group(:phone_number)
... but some might find it a bit hackish. Moreover, when you use select, the records will be read-only, so be careful.
If you need the count, just add .count at the end of the chain, but you will get a hash, as you already had. But isn't that all you need? What is your purpose?
You can also use an area_code column that will be filled using callbacks on create and update, so you can index this column ; your query will run fast on read, though it will be slower on insertion.
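As for using that hash in a view: it iterates as key/value pairs, which is usually all a report table needs. Below is a plain Ruby sketch; the helper area_code_counts is hypothetical and merely imitates the grouped count on an in-memory list of made-up phone numbers.

```ruby
# Ruby-side imitation of GROUP BY substr(phone_number, 1, 3) with COUNT(*)
# (hypothetical helper; the real query runs in the database).
def area_code_counts(phone_numbers)
  phone_numbers.group_by { |number| number[0, 3] }
               .map { |code, numbers| [code, numbers.size] }
               .to_h
end

counts = area_code_counts(%w[3075550100 3075550101 3075550102 5155550100])

# Each pair is an area code and its total, like the hash returned by .count.
counts.each do |area_code, total|
  puts "#{area_code}: #{total}"
end
# 307: 3
# 515: 1
```

In an ERB template the same each loop would emit one table row per area code.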