Need to output 1095 fields in Rails app - ruby-on-rails

I'm writing an application that requires 3 inputs a day for every day of the year. (3*365=1095). I'm struggling with a way to output each field in an efficient manner.
The inputs are not all-or-nothing (you could fill in 10 days worth of input, hit save, and come back later to fill in more)
I attempted to do this by building all 1095 objects in the controller and then outputting the inputs in the view, but obviously this is really slow and probably memory intensive.
Any suggestions? I'm leaning toward writing the entire form client-side and then filling in the existing elements using AJAX.
EDIT
The model is called Timing and has these attributes:
month, day, time1, time2, time3
so there are 365 models to be saved.

Sounds like you've got a nested resource. You have a resource called timing which contains a resource called, what, day?
#routes
resources :timing do
  resources :day
end
So assuming that when timing is created, you have all 365 days created as well (sounds like a pretty expensive operation). Displaying the fields isn't that tricky. You could just do
#controller
def show
  @timings = Timing.all
end
#view
(Date.today.beginning_of_year..Date.today.end_of_year).each do |day|
  t = @timings.find { |timing| timing.date == day } #or some other method of deciding that the current day has a timing
  unless t.nil?
    form_for t #etc
  else
    form_for Timing.new #etc
  end
end
Then perhaps you could make each form submit via UJS and call it a day. This sounds like a pretty complicated design, though, and I'm not sure what your problem area is.
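The per-day lookup above scans the whole collection once for every day of the year. A minimal sketch of the same idea with a hash index, written in plain Ruby so it runs outside Rails; the `Timing` struct and the `rows_for_year` helper are hypothetical stand-ins, assuming the `month` and `day` attributes from the question:

```ruby
require "date"

# Hypothetical stand-in for the Timing model described in the question.
Timing = Struct.new(:month, :day, :time1, :time2, :time3)

# Returns one Timing per day of the given year: the saved record when one
# exists, otherwise a blank one for the form to render.
def rows_for_year(year, existing)
  by_date = existing.each_with_object({}) { |t, h| h[[t.month, t.day]] = t }
  (Date.new(year, 1, 1)..Date.new(year, 12, 31)).map do |date|
    by_date.fetch([date.month, date.day]) { Timing.new(date.month, date.day) }
  end
end

saved = [Timing.new(1, 2, "08:00", "12:00", "17:00")]
rows = rows_for_year(2011, saved)
puts rows.size
```

Building the index once keeps the view loop linear in the number of days rather than days times records.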

If I understand your question correctly, you want a way to dynamically show time inputs, 3 of them per day, on a form.
If the above is correct, what I would suggest is that you do the nested resource as @DVG has detailed, and load the current day only. If you need to load multiple days, you can easily request that through UJS (Ajax) and load it on the same page.
What you probably want to do, in order not to melt down the server, is auto-save the time inputs or auto-save each day's time inputs when the grouping loses focus.

@DVG's answer probably works fine, but it keeps all of the work on the server.
What I ended up going with was my initial thought: get all of the existing timings like this:
def edit
  @timings = Timing.where(:store_id => params[:store_id])
end
Then in the view, I wrote two JavaScript functions. The first writes all 365 rows with all 3 columns. Once the fields were all output in JavaScript, a second function took the existing records and inserted them into the form:
<script type="text/javascript">
function updateForm(){
  timings = <%= @timings.to_json %>;
  ... fill out the fields ...
}
</script>
It works nice and fast, and best of all, no AJAX calls. Of course one caveat is that this fails if the user has Javascript disabled, but that's not an issue for me.
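The server side of this approach amounts to serializing the saved records in a shape the script can index directly. A minimal sketch in plain Ruby, assuming hypothetical names (`SavedTiming`, `timings_payload`) and a "month-day" key format that is not part of the original answer:

```ruby
require "json"

# Hypothetical stand-in for the saved Timing records.
SavedTiming = Struct.new(:month, :day, :time1, :time2, :time3)

# Keys each record by "month-day" so the client can look up a row directly
# instead of searching an array.
def timings_payload(timings)
  timings.each_with_object({}) do |t, out|
    out["#{t.month}-#{t.day}"] = [t.time1, t.time2, t.time3]
  end
end

payload = timings_payload([SavedTiming.new(3, 14, "08:00", "12:30", nil)])
puts JSON.generate(payload)
```

Only the days that have data are sent, so the payload stays small when most of the year is still empty.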

Related

Request timeout in Rails

We are working on a data visualization problem right now. Our customer wants us to show the last 6 months of data for a honeybee hive on a graph.
Clearly it's gonna be a huge dataset. By adding indexes we overcame the database slowness in loading the data, but we still have a problem preparing the data for the graph.
Here is the related code:
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
  data = []
  messages.each do |message|
    record = []
    record << message.occurance_time.to_s(:dygraph_format)
    record << weight_according_to_metric(message.weight, us_metric_enabled)
    record << temperature_according_to_metric(message.temperature, us_metric_enabled)
    record << (message.humidity.nil? ? nil : message.humidity.to_f)
    data << record
  end
  return data
end
The problem is that messages.each is very slow and takes more than 30 seconds. Is there any solution to overcome this?
Project Specification:
Rails Version: 4.1.9
Graph Library: Dygraph
Database: Postgres
There are two ways to attack a performance problem like this:
1) Find and correct the performance bottleneck
2) Break it into smaller pieces
Finding Performance issues
First, get a dataset large enough to reproduce the problem set up on your dev system. Then look at the logs so you can see how long the request is taking. You should be looking for a line like this:
Completed 200 OK in 432.1ms (Views: 367.7ms | ActiveRecord: 61.4ms)
Rerun the task a couple times since caching can cause variations. Write down your different times. Then remove everything in the loop and run it with just the loop. Do the numbers go back to looking reasonable? If that is the case then you know the problem is the work you are doing inside the loop. Next, add each line in the loop back on its own (or one at a time if they depend on each other). Figure out which line causes those numbers to jump the most.
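One way to make that comparison less manual is to time each variant with Ruby's standard `Benchmark` module. A rough sketch, assuming a hypothetical `messages` array of hashes and a placeholder for the per-record work:

```ruby
require "benchmark"

# Placeholder data; in the real app this would be the loaded messages.
messages = Array.new(10_000) { |i| { weight: i * 0.1, humidity: nil } }

# Time the bare loop first, then the loop with one piece of work added back.
empty_loop = Benchmark.realtime { messages.each { |m| m } }
with_work = Benchmark.realtime do
  messages.each { |m| m[:weight].round(2) }
end

puts format("empty loop: %.4fs, with work: %.4fs", empty_loop, with_work)
```

Repeating the measurement a few times, as suggested above, helps separate real cost from caching noise.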
This is the point where you should try to performance tune your code. Check for queries that could be smarter. Make sure you aren't querying the same data over and over. If you have a function in a model that computes something and you call it multiple times to get the same answer then use this to only compute once:
def something
  return @savedvalue if @savedvalue
  @savedvalue = really_complex_calculation
end
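Note that a truthiness guard like this recomputes every time the cached result is `nil` or `false`. A small sketch of a variant that handles that case, using a hypothetical `Report` class and a call counter that exists only to show the calculation runs once:

```ruby
class Report
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Memoizes even when the computed result is nil or false.
  def something
    return @savedvalue if defined?(@savedvalue)
    @savedvalue = expensive_calculation
  end

  private

  # Stand-in for the really complex calculation; returns nil on purpose.
  def expensive_calculation
    @calls += 1
    nil
  end
end

report = Report.new
3.times { report.something }
puts report.calls
```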
The goal is to find the worst offender so you can make the changes that have the biggest impact. However, if you are working with a LOT of data this may only get you so far. It may be impossible to performance tune enough for all the data. In that case there is option 2.
Break it into smaller pieces
Write a second Rails action whose only job is to render a single record on a graph. It will do the inner part of your loop but only on the message whose id was passed to it.
Call your original function to set up the view and pass the list of messages to the view. In the view, loop through the list of messages to set up jQuery AJAX code that calls the above action once for each message. Have this run on document ready.
Then the page will load with an empty graph, but as soon as it is up the individually processed records will be fed to it and appear one at a time on the page. It will still take just as long (or even a little longer because of overhead) to complete the graph, but it will no longer time out. Each AJAX call will be its own quick hit to the server instead of one big long hit.
I just used this very technique to load a rather long report on a site I work on. Ideally we'd like to fix any underlying performance issues... but what we really wanted was to have a report working right away and then fix the performance issues as we had time.
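Issuing one request per message can mean a very large number of requests. A variation (not what the answer above describes, just a sketch) is to have each request cover a slice of records. The slicing itself in plain Ruby, with a hypothetical `batch_ranges` helper and an arbitrary batch size:

```ruby
# Splits a list of record ids into fixed-size batches; each batch would be
# fetched by one AJAX request instead of one request per record.
def batch_ranges(ids, batch_size)
  ids.each_slice(batch_size).map do |slice|
    { first_id: slice.first, last_id: slice.last, count: slice.size }
  end
end

batches = batch_ranges((1..2_500).to_a, 1_000)
batches.each { |b| puts b.inspect }
```

The batch size trades the number of round trips against how long each one runs, so it needs tuning against the timeout.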
OK, you said every person sees the same set of data, which is great: it means we can cache without worrying about who's logged in. First, here's your method with tiny improvements
def self.prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
  messages.inject([]) do |records, message|
    records << [].tap do |record|
      record << message.occurance_time.to_s(:dygraph_format)
      record << weight_according_to_metric(message.weight, us_metric_enabled)
      record << temperature_according_to_metric(message.temperature, us_metric_enabled)
      record << (message.humidity.nil? ? nil : message.humidity.to_f)
    end
  end
end
Then create a caching function, that runs this method and caches it
# some class constants
CACHE_KEY = 'some_cache_key'
EXPIRY_TIME = 15.minutes
# the methods
def self.write_single_hive_messages_to_cache(messages, us_metric_enabled)
  Rails.cache.write CACHE_KEY,
    prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled),
    expires_in: EXPIRY_TIME
end
And a simple cache reading method
def self.read_single_hive_messages_from_cache
  Rails.cache.read CACHE_KEY
end
Then create a rake task that fetches these messages and calls the caching method, so Rails writes the cache.
Create a cron job that calls this rake task every 5 minutes or so. The expiry time is longer so that if the cron job misses a run for some reason, the data is still available until the next one.
This way the processing runs in the background every 5 minutes (or whatever interval you choose), and the page load should happen with no delay at all, since the array is loaded from the pre-calculated cache.
If the cron job stops working, the data will expire after the 15 minutes I've set, and the cache reading method will then return nil. You could avoid this by setting the data to never expire, but then stale data would keep getting returned.
Another way to handle this is to tell the cache reading method how to generate the cache itself: if it finds the cache empty, it generates the data and caches it before returning it. The method would look like this
def self.read_single_hive_messages_from_cache(messages, us_metric_enabled)
  Rails.cache.fetch CACHE_KEY, expires_in: EXPIRY_TIME do
    prepare_single_hive_messages_for_datatable_dygraph(messages, us_metric_enabled)
  end
end
But then make sure that messages is an ActiveRecord::Relation and not an already-loaded array. You don't want to query for 1+ million records and then find the cache already populated. An ActiveRecord::Relation does not touch the database until it is enumerated (inside the caching block); if the cache exists, the value is returned before the block is entered, so the data is never fetched, saving you that huge query.
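The "block only runs on a miss" behaviour is the key point here. A tiny in-memory sketch of those fetch semantics in plain Ruby; `TinyCache` is a hypothetical illustration and not the `Rails.cache` implementation, and the `now:` parameter exists only so expiry can be shown without sleeping:

```ruby
# Tiny in-memory stand-in for a cache store, only to illustrate fetch
# semantics; it is not how Rails.cache is implemented.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the cached value if present and not expired; otherwise runs the
  # block, stores its result and returns it.
  def fetch(key, expires_in:, now: Time.now)
    entry = @store[key]
    return entry[:value] if entry && entry[:expires_at] > now

    value = yield
    @store[key] = { value: value, expires_at: now + expires_in }
    value
  end
end

cache = TinyCache.new
runs = 0
load_rows = lambda do
  runs += 1 # counts how often the expensive work actually happens
  [["2015-01-01", 12.5]]
end

start = Time.now
first  = cache.fetch("hive", expires_in: 900, now: start) { load_rows.call }
second = cache.fetch("hive", expires_in: 900, now: start + 60) { load_rows.call }
third  = cache.fetch("hive", expires_in: 900, now: start + 1_000) { load_rows.call }
puts runs
```

The second call is served from the cache without running the block; the third falls after the 900-second expiry and recomputes.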
I know the answer got long, if you need more help tell me.

struggling with ember.js concepts - lots of data

I am looking at using Ember.js for a new, Rails-backed, app (using Active Model Serializers). I am struggling to get my head around the framework, so maybe this is a bit of a newbie question.
My data structure is like this (simplified):
Event Days --
Events --
* Participants
* Location
Inside of an 'Event Day' there can be thousands of events (and inside an event dozens of participants and all of their data).
It seems 'wrong' that when I want to get a listing of event days I load some JSON that has not only the EventDays but also all the Events (and then all the data from everything inside there)... basically, it loads the whole tree!
I thought I could solve this problem by using custom Serializers for the actions, but, at some point I need to get the data and Ember seems to never call the server again.
So, if I load EventDays and simply have no event data inside it Ember never calls the server to update the EventDay object when I click through to a 'show' method.
I don't know if I am being clear here. I am hoping someone who is a little ahead of me in this can understand what I am driving at!
Really I think it boils down to 2 questions:
1) How to properly filter out information on requests so that only the local objects are filled in (i.e. on a call to an index method I need a list of event days without children, but on a call to a show method I need a single event day filled with the next level down)
2) How to get Ember to 'reload' an object at the appropriate time to fill out the appropriate content
Maybe I am looking at this wrong - missing the point of something like Ember - and if so I welcome pointers to appropriate tutorials but I can't find anything (even on the Ember site) that explains how to do anything other than load the whole tree at once. With Gigs of data, this seems slow, a definite browser-killer and just plain wrong.
I appreciate my StackOverflow brethren helping me learn!
edit
As I was immediately down voted for some reason I will add code:
Client Side:
App.EventDay = DS.Model.extend({
  day: DS.attr('date'),
  events: DS.hasMany("Event", {async: true})
});
Server Side:
class EventDaySerializer < ActiveModel::Serializer
  attributes :id, :day
  has_many :events, embed: :ids, key: :events
end
edit 2
after kertap's suggestion I added the async attribute and updated the serialiser code above.
The json is here:
{"event_day":
{"id":2,
"day":"2013-12-05",
"events":[1,2,3,4,5,6,7,8]
}
}
It is worth noting that if I do not use the key: :events parameter in the serialiser things come back as "event_ids": [1,2,3,4] which, you would think is right, but causes Ember to not see the events.
Also worth noting is that if I do this:
HorseFeeder.ApplicationSerializer = DS.ActiveModelSerializer.extend({});
Then nothing works at all! I get Error while loading route: Error: Assertion Failed: The response from a findAll must be an Array, not undefined
I really don't think it should be this difficult to get the basic wiring of Ember and Rails to work...
You can tell ember data that a relationship is asynchronous.
App.EventDay = DS.Model.extend({
  day: DS.attr('date'),
  events: DS.hasMany("Event", {async: true})
});
If you do this and get all EventDays you will get the list of EventDays from the server side. The server side should respond with the ids of the events that are contained in an event day or you can provide a url that is a link to all events. Ember won't load the events until you need them.
Then, when you go to the show route for an event day and your template asks for that day's events, Ember Data will go off and fetch the data for those events.
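On the server side, the payload for that arrangement carries only the ids of the events. A sketch of the shape in plain Ruby, using a hypothetical `event_day_payload` helper rather than Active Model Serializers, and mirroring the JSON shown in the question:

```ruby
require "json"

# Builds the index-style payload: the event day plus the ids of its events,
# not the events themselves.
def event_day_payload(id, day, event_ids)
  { event_day: { id: id, day: day, events: event_ids } }
end

json = JSON.generate(event_day_payload(2, "2013-12-05", [1, 2, 3]))
puts json
```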

Rails 3 and Memcached - Intelligent caching without expiration

I am implementing caching into my Rails project via Memcached and particularly trying to cache side column blocks (most recent photos, blogs, etc), and currently I have them expiring the cache every 15 minutes or so. Which works, but if I can do it more up-to-date like whenever new content is added, updated or whatnot, that would be better.
I was watching the episode of the Scaling Rails screencasts on Memcached http://content.newrelic.com/railslab/videos/08-ScalingRails-Memcached-fixed.mp4, and at 8:27 in the video, Gregg Pollack talks about intelligent caching in Memcached in a way where intelligent keys (in this example, the updated_at timestamp) are used to replace previously cached items without having to expire the cache. So whenever the timestamp is updated, the cache would refresh as it seeks a new timestamp, I would presume.
I am using my "Recent Photos" sideblock for this example, and this is how it's set up...
_side-column.html.erb:
<div id="photos">
  <p class="header">Photos</p>
  <%= render :partial => 'shared/photos', :collection => @recent_photos %>
</div>
_photos.html.erb
<% cache(photos) do %>
  <div class="row">
    <%= image_tag photos.thumbnail.url(:thumb) %>
    <h3><%= link_to photos.title, photos %></h3>
    <p><%= photos.photos_count %> Photos</p>
  </div>
<% end %>
On the first run, Memcached caches the block as views/photos/1-20110308040600 and will reload that cached fragment when the page is refreshed, so far so good. Then I add an additional photo to that particular row in the backend and reload, but the photo count is not updated. The log shows that it's still loading from views/photos/1-20110308040600 and not grabbing an updated timestamp. Everything I'm doing appears to be the same as what the video is doing, what am I doing wrong above?
In addition, there is a part two to this question. As you see in the partial above, the @recent_photos query is called for the collection (out of a module in my lib folder). However, I noticed that even when the block is cached, this SELECT query is still being called. I attempted to wrap the entire partial in a block at first as <% cache(@recent_photos) do %>, but obviously this doesn't work, especially as there is no real timestamp on the whole collection, just on its individual items. How can I prevent this query from being made if the results are cached already?
UPDATE
In reference to the second question, I found that unless Rails.cache.exist? may just be my ticket, but what's tricky is the wildcard nature of using the timestamp...
UPDATE 2
Disregard my first question entirely, I figured out exactly why the cache wasn't refreshing. That's because the updated_at field wasn't being updated. Reason for that is that I was adding/deleting an item that is a nested resource in a parent, and I probably need to implement a "touch" on that in order to update the updated_at field in the parent.
But my second question still stands: the main @recent_photos query is still being called even if the fragment is cached. Is there a way, using Rails.cache.exist?, to target a cache that is named something like /views/photos/1-2011random ?
One of the major flaws with Rails caching is that you cannot reliably separate the controller and the view for cached components. The only solution I've found is to embed the query in the cached block directly, but preferably through a helper method.
For instance, you probably have something like this:
class PhotosController < ApplicationController
  def index
    # ...
    @recent_photos = Photos.where(...).all
    # ...
  end
end
The first instinct would be to only run that query if it will be required by the view, such as by testing for the presence of the cached content. Unfortunately there is a small chance that the content will expire in the interval between you testing for it being cached and actually rendering the page, something that will lead to a template rendering error when the nil-valued @recent_photos is used.
Here's a simpler approach:
<%= render :partial => 'shared/photos', :collection => recent_photos %>
Instead of using an instance variable, use a helper method. Define your helper method as you would've the load inside the controller:
module PhotosHelper
  def recent_photos
    @recent_photos ||= Photos.where(...).all
  end
end
In this case the value is saved so that multiple calls to the same helper method only triggers the query once. This may not be necessary in your application and can be omitted. All the method is obligated to do is return a list of "recent photos", after all.
A lot of this mess could be eliminated if Rails supported sub-controllers with their own associated views, which is a variation on the pattern employed here.
As I've been working further with caching since asking this question, I think I'm starting to understand exactly the value of this kind of caching technique.
For example, I have an article, and the page needs a variety of things that involve querying other tables, so maybe I need five to seven different queries per article. However, caching the article in this way reduces all those queries to one.
I am assuming that with this technique there always needs to be at least one query, as there needs to be some way to tell whether the timestamp has been updated or not.
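The timestamp-in-the-key idea discussed in this thread can be sketched in plain Ruby. The `Photo` struct below is a hypothetical stand-in for the model, and its `touch` only mimics the effect of touching the parent record; the key format is copied from the log line in the question:

```ruby
# Hypothetical stand-in for a model with an updated_at column.
Photo = Struct.new(:id, :updated_at) do
  # Builds a key in the same shape as the one seen in the question's log.
  def cache_key
    "photos/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S')}"
  end

  # Mimics what touching the record does: bump the timestamp.
  def touch(now)
    self.updated_at = now
  end
end

photo = Photo.new(1, Time.utc(2011, 3, 8, 4, 6, 0))
old_key = photo.cache_key
photo.touch(Time.utc(2011, 3, 8, 5, 0, 0))
new_key = photo.cache_key
puts old_key, new_key
```

Because the key changes whenever the timestamp does, the old fragment is simply never read again and no explicit expiry is needed, but the record still has to be loaded to compute the key.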

Dynamic nested form elements based on inputting a starting number

Like many others, I'm new to Rails and have a question. I'm working on a sample tracker for a small analytical lab. I would like users to submit batches consisting of many samples. I want the front page to be a simple gateway into batch submission. My general plan is:
Front page asks for number of samples in batch. User enters number and hits submit.
A form is generated where the user can enter batch information (Sampling date, experiment name, batch model stuff). Under the batch fields there should be as many fields for individual sample IDs as the user indicated in the first step.
User fills all this out and the batch and its samples are created upon submission.
My feeling is that the homepage should pass some sort of parameter to the batches controller, which then iteratively builds the samples while the model has a method to iteratively build form elements for the view. Is this thinking correct? How could I pass a parameter that isn't directly related to any models or controllers? I couldn't find any similar questions, but if anyone can link me to a solution for a similar problem or a Railscast or something I'd be very grateful!
There's no need to back a form with a model. For your view, you'll just want something like this example (in Haml):
- form_tag new_batch_path, :method => "get" do
  = label_tag(:sample_count, "Number of samples:")
  = text_field_tag(:sample_count, 3)
  = submit_tag("Get Started!")
And then in your controller, and the new_batch view, you can just reference params[:sample_count]
- (params[:sample_count] || 5).to_i.times do |n| ...
Because this isn't tied to a model (and nothing's being saved anyway) you can't use model validations to check the value. If you do want to verify, you'll do the verification in the batches controller - either as a before_filter, or just inline:
@sample_count = params[:sample_count].to_i
unless (1..10).include? @sample_count
  flash[:error] = "A batch must contain between 1 and 10 samples."
  redirect_to root_url
end
Note that nil.to_i, "".to_i and rubbish like "ajsdgsd".to_i all equal 0, so unless you want people to be able to specify 0 samples, this code is fairly robust.
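The `to_i` behaviour is easy to check outside Rails. A small sketch with a hypothetical `parse_sample_count` helper that wraps the same range check:

```ruby
# Parses a raw form parameter and returns the count, or nil when it falls
# outside the allowed range. The 1..10 range mirrors the check above.
def parse_sample_count(raw, allowed = 1..10)
  count = raw.to_i
  allowed.include?(count) ? count : nil
end

[nil, "", "ajsdgsd", "0", "7", "11", "3 samples"].each do |raw|
  puts "#{raw.inspect} -> #{parse_sample_count(raw).inspect}"
end
```

One thing this makes visible is that a string with leading digits, such as "3 samples", is accepted as 3, which may or may not be what you want.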
Have a look at these Railscasts series:
Nested Model Form: Part 1, Part 2
Complex Forms: Part 1, Part 2, Part 3
The "Nested Model Form" screencasts are newer, so I'd go with these ones first.

Need alternative to filters/observers for Ruby on Rails project

Rails has a nice set of filters (before_validation, before_create, after_save, etc) as well as support for observers, but I'm faced with a situation in which relying on a filter or observer is far too computationally expensive. I need an alternative.
The problem: I'm logging web server hits to a large number of pages. What I need is a trigger that will perform an action (say, send an email) when a given page has been viewed more than X times. Due to the huge number of pages and hits, using a filter or observer will result in a lot of wasted time because, 99% of the time, the condition it tests will be false. The email does not have to be sent out right away (i.e. a 5-10 minute delay is acceptable).
What I am instead considering is implementing some kind of process that sweeps the database every 5 minutes or so and checks to see which pages have been hit more than X times, recording that state in a new DB table, then sending out a corresponding email. It's not exactly elegant, but it will work.
Does anyone else have a better idea?
Rake tasks are nice! But you will end up writing more custom code for each background job you add. Check out the Delayed Job plugin http://blog.leetsoft.com/2008/2/17/delayed-job-dj
DJ is an asynchronous priority queue that relies on one simple database table. According to the DJ website you can create a job using Delayed::Job.enqueue() method shown below.
class NewsletterJob < Struct.new(:text, :emails)
  def perform
    emails.each { |e| NewsletterMailer.deliver_text_to_email(text, e) }
  end
end
Delayed::Job.enqueue( NewsletterJob.new("blah blah", Customers.find(:all).collect(&:email)) )
I was once part of a team that wrote a custom ad server, which has the same requirements: monitor the number of hits per document, and do something once they reach a certain threshold. This server was going to be powering an existing very large site with a lot of traffic, and scalability was a real concern. My company hired two Doubleclick consultants to pick their brains.
Their opinion was: The fastest way to persist any information is to write it in a custom Apache log directive. So we built a site where every time someone would hit a document (ad, page, all the same), the server that handled the request would write a SQL statement to the log: "INSERT INTO impressions (timestamp, page, ip, etc) VALUES (x, 'path/to/doc', y, etc);" -- all output dynamically with data from the webserver. Every 5 minutes, we would gather these files from the web servers, and then dump them all in the master database one at a time. Then, at our leisure, we could parse that data to do anything we well pleased with it.
Depending on your exact requirements and deployment setup, you could do something similar. The computational requirement to check if you're past a certain threshold is still probably even smaller (guessing here) than executing the SQL to increment a value or insert a row. You could get rid of both bits of overhead by logging hits (special format or not), and then periodically gather them, parse them, input them to the database, and do whatever you want with them.
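The periodic "gather and parse" step could look something like the following in plain Ruby. The "timestamp path" line format and the `pages_over_threshold` helper are assumptions for illustration, not the format the ad server used:

```ruby
# Counts hits per page from log lines and returns the pages whose count
# exceeds the threshold. The "timestamp path" line format is an assumption.
def pages_over_threshold(lines, threshold)
  counts = Hash.new(0)
  lines.each do |line|
    _timestamp, path = line.split(" ", 2)
    counts[path.strip] += 1 if path
  end
  counts.select { |_path, hits| hits > threshold }.keys
end

log = [
  "2009-01-01T10:00:00 /docs/a",
  "2009-01-01T10:00:01 /docs/b",
  "2009-01-01T10:00:02 /docs/a",
  "2009-01-01T10:00:03 /docs/a"
]
hot = pages_over_threshold(log, 2)
puts hot.inspect
```

Doing the counting in a batch like this keeps the per-request cost down to a log write.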
When saving your Hit model, update a redundant column in your Page model that stores a running total of hits. This costs you 2 extra queries, so each hit may take twice as long to process, but you can then decide whether to send the email with a simple if.
Your original solution isn't bad either.
class ApplicationController < ActionController::Base
  before_filter :increment_fancy_counter
  private
  def increment_fancy_counter
    # somehow increment the counter here
  end
end
# lib/tasks/fancy_counter.rake
namespace :fancy_counter do
  task :process do
    # somehow process the counter here
  end
end
Have a cron job run rake fancy_counter:process however often you want it to run.
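Whichever place does the counting, the check should fire only once per page rather than on every hit past the limit. A minimal in-memory sketch of that rule, with a hypothetical `HitCounter` class standing in for the running-total column:

```ruby
# Keeps a running total per page and reports a page only at the moment its
# count first exceeds the threshold, so the notification fires once.
class HitCounter
  def initialize(threshold)
    @threshold = threshold
    @totals = Hash.new(0)
  end

  # Returns true only on the hit that crosses the threshold.
  def record(page)
    @totals[page] += 1
    @totals[page] == @threshold + 1
  end
end

counter = HitCounter.new(2)
fired = %w[/a /a /a /a /b].map { |page| counter.record(page) }
puts fired.inspect
```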
