I have a Rails application where there is a request to create elements on a table and another one, waiting, that reads if such element was created or not.
Right now, I check the live Data and the waiting process never sees any new record for the table.
Any ideas how to force a reconnection or anything that will update the model reference?
The basic idea:
One user, through a POST, is creating a record (that i can see directly on the database).
Another piece of code, that was running before the requests, is waiting for that record, but does not find it.
I'd appreciate any insights.
The waiting request is probably using the ActiveRecord cache. Rails caches queries for the duration of a request, so if you run the exact same query multiple times in the same request, it will only hit the database the first time.
You can simulate this in the console:
Entity.cache { Entity.first; Entity.first }
In the logs, you'll see something like this (notice the second request says CACHE):
[2018-12-06T10:51:02.436 DEBUG (1476) #] Entity Load (4.9ms) SELECT "entities".* FROM "entities" LIMIT 1
[2018-12-06T10:51:02.450 DEBUG (1476) #] CACHE (0.0ms) SELECT "entities".* FROM "entities" LIMIT 1
To bypass the cache, you can use:
Entity.uncached { Entity.where(key: #key)&.last }
Using uncached will disable the cache inside the block, even if the enclosing scope is running inside a cached block.
It seems that, for a running Rails application, if a running piece of code is looking for anything that has been updated by a new request, after the first began its execution, the updated data would not show, using active_record models.
The solution: Just run a raw query.
sql = "SELECT * FROM entities WHERE key = '"+key+"' ORDER BY ID DESC LIMIT 1;"
records_array = ActiveRecord::Base.connection.execute(sql).values
I know it's a huge oversimplification and that there is probably an underlying problem, but it did solved it.
Related
I'm getting into a project in Ruby on Rails currently, which I don't really have any experience in. Now, we're running into some performance problems that I think are related to executing too many queries.
We have a Service model that we set in our controller as follows:
#services = Service.includes(:points).with_points
The with_points scope is defined in the service as such:
scope :with_points, -> { joins(:points).where.not(points: []).distinct }
(I think the where clause is not needed here, but that's probably not relevant to the question.)
Then in the view, we're using the fetched services, with linked points, as follows:
<% #services.each do |s| %>
<div class="col-xs-12 col-sm-6 col-lg-4 serviceDiv" data-rating="<%= s.service_ratings %>" >
<% if s.service_ratings == "A" %>
<% grade = "rating-a" %>
<!-- etc. -->
Now, as far as I've seen when researching around, this is a relatively normal pattern when trying to list all rows from a table. However, when I look at the logs, it looks like a separate query is executed for every service?
# This query is what I'd expect:
SQL (1.6ms) SELECT DISTINCT "services"."id" # etc, etc
# But then we also get one of these for every service
Service Exists (1.5ms) SELECT 1 AS one FROM "services" WHERE "services"."name" = $1 AND ("services"."id" != $2) LIMIT $3
# And quite a few of these for every service:
CACHE Service Exists (0.0ms) SELECT 1 AS one FROM "services" WHERE "services"."name" = $1 AND ("services"."id" != $2) LIMIT $3
Now, my hunch is that those "Service Exists" lines are bad news. What are they, and where are they coming from? Is there anything else that's relevant that I'm missing here?
After #MarekLipka's pointers in the comments on this question, I managed to find the source of the problem. Not sure how many people will run into this setup in the future, but I'll share it just in case.
The clue was in our accessing s.service_ratings which was not, in fact, a column in the database. ActiveRecord Query Trace pointed to the source of all those queries being a reference to s.service_ratings in the view, which was a red flag.
service_ratings was actually a method (assuming I'm using the correct terminology here) on the Service model that, apart from returning a value based on a calculation on several of the model's properties, also called self.update_attributes to actually store that value in the database. This meant that every time we retrieved this model's data in order to display it, we also ran another query to save that value to the database - often redundantly.
In other words, the solution for us now is to either run the calculation at some other point of time and store it in the database once, or recalculate it every time we need it and not store it at all.
We are working on a data visualization problem right now. Our customer wants us to show the last 6 months data for a honeybee hive on a graph.
Clearly it's gonna be a huge dataset. Adding indexes we overcame the database slowness problem in loading data though we still have problem in visualizing data on a graph.
Here is the related code:
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
data = []
messages.each do |message|
record = []
record << message.occurance_time.to_s(:dygraph_format)
record << weight_according_to_metric(message.weight, us_metric_enabled)
record << temperature_according_to_metric(message.temperature, us_metric_enabled)
record << (message.humidity.nil? ? nil : message.humidity.to_f)
data << record
end
return data
end
The problem is that messages.each is very slow and takes more than 30 seconds. Is there any solution to overcome this?
Project Specification:
Rails Version: 4.1.9
Graph Library: Dygraph
Database: Postgres
There are two ways to attack a performance problem like this.
Find and correct the performance bottle neck
Break it into smaller pieces
Finding Performance issues
First, get a dataset large enough to reproduce the problem setup on your dev system. Then look at the logs so you can see how long the transaction is taking. You should be looking for a line like this:
Completed 200 OK in 432.1ms (Views: 367.7ms | ActiveRecord: 61.4ms)
Rerun the task a couple times since caching can cause variations. Write down your different times. Then remove everything in the loop and run it with just the loop. Do the numbers go back to looking reasonable? If that is the case then you know the problem is the work you are doing inside the loop. Next, add each line in the loop back on its own (or one at a time if they depend on each other). Figure out which line causes those numbers to jump the most.
This is the point where you should try to performance tune your code. Check for queries that could be smarter. Make sure you aren't querying the same data over and over. If you have a function in a model that computes something and you call it multiple times to get the same answer then use this to only compute once:
def something
return #savedvalue if #savedvalue
#savedvalue = really complex calculation
end
The goal is to find the worse offender so you can make changes that have the biggest impact. However, if you are working with a LOT of data this may only get you so far. It may be impossible to performance tune enough for all the data. In that case there is option 2.
Break it into smaller pieces
Write a second rails action who's only job is to render a single record on a graph. It will do the inner part of your loop but only on the message who's id was passed to it.
Call your original function to setup the view and pass the list of messages to the view. In the view loop through the list of messages to setup jquery ajax code to call the above action once for each message. Have this run in on document ready.
Then, the page will load with an empty graph... but as soon as it is up the individual processed records will be fed to it and appear one at a time on the page. It will still take just ask long (or even a little longer because of overhead) to complete the graph... but it will no longer time out. Each ajax call will be its own quick hit to the server instead of one big long hit.
I just used this very technique to load a rather long report on a site I work on. Ideally we'd like to fix any underlying performance issues... but what we really wanted was to have a report working right away and then fix the performance issues as we had time.
Ok you said every person sees the same set of data, which is great, means we can cache without worrying about who's logged in, first here's your method, with tiny improvements
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
messages.inject([]) do |records, message|
records << [].tap do |record|
record << message.occurance_time.to_s(:dygraph_format)
record << weight_according_to_metric(message.weight, us_metric_enabled)
record << temperature_according_to_metric(message.temperature, us_metric_enabled)
record << (message.humidity.nil? ? nil : message.humidity.to_f)
end
end
end
Then create a caching function, that runs this method and caches it
# some class constants
CACHE_KEY = 'some_cache_key'
EXPIRY_TIME = 15.minutes
# the methods
def self.write_single_hive_messages_to_cache(messages, us_metric_enabled)
Rails.cache.write CACHE_KEY,
self.class.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled),
expires_in: EXPIRY_TIME
end
And a simple cache reading method
self.read_single_hive_messages_from_cache
Rails.cache.read CACHE_KEY
end
Then create a rake task that just fetches these messages and call the caching method, and rails will write the cache.
Create a cron job that calls this rake task, set the cron job to 5 minutes or so, the expiry time is longer just in case for some reason the cron job didn't run, the data will still be available for the next run.
This way your processing is run in the background, every 5 ( or whatever time you choose ) minutes, the page load should happen normally with no delay at all, since the array data will be loaded from the pre-calculated cache.
In case the cron stops working, the data will expire in the 15 minutes I've set, and then the read cache method will return nil, you could avoid this and set the data to never expire, but then the data will become stale and the old data will keep getting returned.
Another way to handle this is to tell the cache reading method how to generate the cache it self, so if it finds the cache empty it generates one and caches it itself before returning the data, the method would look like this
def self.read_single_hive_messages_from_cache(messages, us_metric_enabled)
Rails.cache.fetch CACHE_KEY, expires_in: EXPIRY_TIME do
self.class.write_single_hive_messages_to_cache(messages, us_metric_enabled)
end
end
But then make sure that messages is an ActiveRecord::Relation and not a processed array, because you don't want to query for 1+ million records and then find the cache already ready, if it's an ActiveRecord::Relation it will not touch the database until the array is started ( inside the caching block ), if the cache exists it will be returned before you enter the block and thus the data won't get fetched, saving you that huge query.
I know the answer got long, if you need more help tell me.
I'm working on a project where i want to do a mysql query from time to time. The query is too long, and actually it's done when the user does a request.
I'm afraid if many users does the request, the application will be too slow to respond. So, I want to do the query and load it with the query response from time to time, and then, on a request, the action from the controller will use this variable, instead of doing the query again and again.
How can I do that using Whenever?
on the schedule.rb
every 5.minutes do
runner "variable = Model.method"
end
and on the controller
def some_action
"the variable should be loaded here"
end
I agree with Damien Roche, you need to cache the results of the query. But, I don't think the example he gives is the best answer for you because you don't want for a user to wait for the query when it isn't cached, at the times when the cache is expired, even if this is a rare occurrence.
So you need to combine the periodic query with whenever, like you suggested, with a caching mechanism to store your query result, and retrieve it from the cache in your controller. since the runner is a different process, you will have to use a cache that is available to both the runner and your app. I recommend you look into Redis. it should be very simple to get it to work so that the runner runs the query and when it finishes writes a result set to the Redis cahce. The controller will then read the result set from the cache.
New to Ruby and Rails so forgive me if my terminology is off a bit.
I am working on optimizing some inherited code and watching the logs I am seeing queries repeat themselves due to lines like this in a .rabl file:
node(:program) { |user| (!user.programs.first.nil?) ? user.programs.first.name : '' }
user and program are both active record objects
Moving to the rails console, I can replicate the problem, but I can also get the expected behavior, which is only one query:
>u = User.find(1234)
User Load (0.3ms) SELECT `users`.* FROM `users` WHERE [...]
> (!u.programs.first.nil?) ? u.programs.first.name : ''
Program Load (0.2ms) SELECT `programs`.* FROM `programs` [...]
Program Load (0.3ms) SELECT `programs`.* FROM `programs` [...]
=> "Output"
Note that repeating the ternary statement in the console will always give me 2 queries.
I can get the expected behavior like so:
> newu = User.find(12345)
User Load (3.8ms) SELECT `users`.* FROM `users` WHERE [...]
> newu.programs
Program Load (0.6ms) SELECT `programs`.* FROM `programs` [...]
> (!newu.programs.first.nil?) ? newu.programs.first.name : ''
=> "Output"
Repeating the ternary statement now won't requery at all.
So the question is: why does calling newu.programs change the behavior? Shouldn't calling u.programs.first.nil? also act to load all the program records in the same way?
With an association, first is not sugar for [0].
If the association is loaded, then it just returns the first element of the array. If it is not loaded, it makes a database query to load just that one element. It can't stick that in the association cache (at least not without being smarter), so the next query to first does the query again (this will use the query cache if turned on)
What Rails is assuming is that if the association is big, and you are only using one element of it then it would be silly to load the whole thing. This can be a little annoying when this isn't the case and you are just using the one item, but you're using it repeatedly.
To avoid this you can either assign the item to a local variable so that you do genuinely only call first once, or do
newu.programs[0]
which will load the whole association (once only) and return the first element.
Rails does the same thing with include?. Instead of loading the whole collection, it will run a query that tests whether a specific item is in the collection (unless the collection is loaded)
I'm trying to figure out where a whole pile of extra queries are being generated by my rails app. I need some ideas on how to tackle it. Or, if someone can give me some hints, I'd be grateful.
I get these:
SQL (1.0ms) SELECT name
FROM sqlite_master
WHERE type = 'table' AND NOT name = 'sqlite_sequence'
SQL (0.8ms) SELECT name
FROM sqlite_master
WHERE type = 'table' AND NOT name = 'sqlite_sequence'
SQL (0.8ms) SELECT name
FROM sqlite_master
WHERE type = 'table' AND NOT name = 'sqlite_sequence'
repeated over and over on every request to the DB (as much as 70 times for a single request)
I tried installing a plugin that traced the source of the queries, but it really didn't help at all. I'm using the hobofields gem, dunno if that is what's doing it but I'm somewhat wedded to it at the moment
Any tips on hunting down the source of these extra queries?
Have a look at ActiveRecord gem in connection_adapters/sqlite_adapter.rb on line 173 (I'm using ActiveRecord 3.0.7) you have a function called tables that generates the exact query you posted:
SELECT name
FROM sqlite_master
WHERE type = 'table' AND NOT name = 'sqlite_sequence'
In development mode this function gets called for every table from your database, on every request.
In production this function gets called only once, when the server starts, and the result is cached.
In development for each request rails looks into your database to see what columns each table has so it can generate methods on your models like "find_by_name". Since in production is unlikely that your database changes, rails does this lookup only on server start.
It's very difficult to tell this without look into the Code.
but i am sure you write your query in a certain loop
for i in 0..70
SqliteMaster.find(:all, :conditions=>["type=? and not name=?", 'table', 'sqlite_sequesnce'])
end
So my advice is that to check all the methods which gets called after requesting certain method and look whether the query called in a loop.
I've just seen this appearing in my logs when I do a search with the metasearch gem but ONLY in development mode.
I believe that it is caused by the plugin acts-as-taggable-on.
It will check if the table or the cache column exists.