Save/update multiple rows in Rails

I'm currently working on saving a user's social media posts in my app. The basic idea is to check whether the post exists: if it does, update its data; if not, create a new row. Right now I'm looping through all of the posts I receive from the social platform, so potentially I'm looping through 3,000 of them and adding them to the database one by one.
Is there a way that I could rewrite this to save all the items at once, which hopefully would speed up the save method?
post_data.each do |post_data_details|
  post_instance = Post::Tumblr.
    where(platform_id: platform_id).
    where("data ->> 'id' = ?", post_data_details["id"].to_s).
    first_or_initialize
  # Merge the incoming attributes into the stored JSON data.
  post_instance.data = (post_instance.data || {}).merge(post_data_details.to_hash)
  post_instance.refreshed_at = date
  post_instance.save!
end

It is good practice to run such long-running jobs via Sidekiq or another background-job solution.
You can also use a single ActiveRecord transaction:
http://api.rubyonrails.org/classes/ActiveRecord/Transactions/ClassMethods.html
But keep in mind that if one of the records is invalid, the whole transaction will be rolled back.
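For example, a minimal sketch wrapping the loop from the question in one transaction (same models and variables as above):

# Either every post saves, or none do.
Post::Tumblr.transaction do
  post_data.each do |post_data_details|
    post_instance = Post::Tumblr.
      where(platform_id: platform_id).
      where("data ->> 'id' = ?", post_data_details["id"].to_s).
      first_or_initialize
    post_instance.data = (post_instance.data || {}).merge(post_data_details.to_hash)
    post_instance.refreshed_at = date
    post_instance.save!
  end
end

This doesn't reduce the number of queries, but it batches all writes into a single commit, which is usually noticeably faster than committing 3,000 rows one at a time.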

Related

Rails console update

Hello, I was trying to update the data in the table using the Rails console. I ran this command:
Box.where("code = 'learning'").update(duration: 10)
The data only changes temporarily. When I run
Box.where("code = 'learning'")
the previous data is displayed. Could anyone tell me what the issue is?
Thank you in advance.
#update updates a single record.
user = User.find_by(name: 'David')
user.update(name: 'Dave')
It returns true/false depending on whether the record was actually updated. You can see the validation errors by inspecting the errors object:
user.errors.full_messages
In non-user-interaction situations, like seed files and the console, it can be helpful to use the bang methods such as #update!, #save! and #create!, which raise an exception if the record is invalid.
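For example, with a hypothetical presence validation on name:

user = User.find_by(name: 'David')
user.update(name: '')   # => false; user.errors.full_messages explains why
user.update!(name: '')  # raises ActiveRecord::RecordInvalid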
If you want to update multiple records at once you need to use #update_all:
Box.where("code = 'learning'")
.update_all(duration: 10)
This creates a single SQL UPDATE statement and is by far the most performant option (note that it bypasses validations and callbacks).
You can also iterate through the records:
Box.where("code = 'learning'").find_each do |box|
box.update(duration: 10)
end
This is sometimes necessary if the value you are updating must be calculated in the application. But it is much slower, since it issues one UPDATE query per record on top of the initial SELECT.
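For instance, a hypothetical case where the new value depends on each record (the lessons association is invented for illustration):

Box.where("code = 'learning'").find_each do |box|
  # The duration depends on this box's own data, so it can't be
  # computed in a single SQL statement.
  box.update(duration: box.lessons.count * 5)
end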

Keeping Rails data updated from Websocket without saving to database

I have websocket price data streaming into my Rails API app, which I want to keep updated so that any API requests get an up-to-date response. It would be too expensive to save each update to the database. How can I do this? In Ember I can modify the model and it persists; that doesn't seem to happen in Rails.
Channel controller:
def receive(message)
  # ActionCable.server.broadcast('channel', message)
  platform = Platform.find(params[:id])
  market = platform.markets.find_by market_name: message["market_name"]
  market.attributes = {
    price: message["price"]
    # etc......
  }
  # market.save [this is too expensive every time]
end
Am I going about this the right way? It also seems inefficient to use find every time I want to update, which could be multiple times per second. In Ember I created a record-ID lookup array so I could quickly match the market_name; I don't see how to do this in Rails.
Persistence to some store is the only way other threads can respond with the latest value.
Instead of 3 queries (2 SELECTs and 1 UPDATE), you can do it with just 1 UPDATE:
Market.where(platform_id: params[:id], market_name: message["market_name"]).
  update_all(price: message["price"])
With a proper index, you might see sub-millisecond performance for each update.
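For instance, a hypothetical migration adding an index to match that WHERE clause (table and column names assumed from the question, and the migration version tag is arbitrary):

class AddMarketLookupIndex < ActiveRecord::Migration[6.0]
  def change
    # Supports the platform_id + market_name lookup used by update_all above.
    add_index :markets, [:platform_id, :market_name]
  end
end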
Depending on your business needs: if you are getting tons of updates for a market every second (making all prior ones stale and useless), you can choose to ignore some of them and not fire an update at all.

Create or update multiple records like first_or_create.increment!(counter)

I have a request in my rails app that needs to create or update a nested record every time a request is made.
Model.find(1).events.first_or_create.increment!(:counter)
This works fine, but today I need to do the same for a lot of objects:
Model.where(created_at: start_time..end_time).map { |m|
  m.events.first_or_create.increment!(:counter)
}
I could call update_all('counter = counter + 1'), but I can't be sure that every model has an associated event, hence the first_or_create.
Is there a way to improve this call? Bear in mind that some requests select more than 5,000 records.
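One possible improvement, sketched under the assumption that Event belongs_to the model via a model_id column and that counter defaults to 0: create the missing events first, then fire the single update_all from the question.

model_ids = Model.where(created_at: start_time..end_time).ids

# Create an event for any model that doesn't have one yet.
ids_with_events = Event.where(model_id: model_ids).distinct.pluck(:model_id)
(model_ids - ids_with_events).each { |id| Event.create!(model_id: id, counter: 0) }

# One UPDATE statement increments every counter at once.
Event.where(model_id: model_ids).update_all('counter = counter + 1')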

Count current users on the page

I'm trying to count the current viewers of a particular page. I need this count to be stored in the DB. The main trouble is cleaning up after a user leaves the page.
Users are anonymous. Every active user sends an AJAX request every 5 seconds.
What's the best algorithm to do that? Any suggestions?
UPD: I'm trying to reduce the number of queries to the DB, so I think I don't really need to store that count in the DB as long as I can access it some other way from the code.
Don't even think about storing this in the database; your app would be slowed down incredibly.
Use the cache for this kind of operation instead.
To count the number of people, I'd say:
assign a random ID to each anonymous user and store it in their session
send that ID within your AJAX call
store an array of hashes in the cache, e.g. [{ user_id: ..., latest_ping: ... }, ...] (create a cache variable for each page)
delete the elements of the array that appear to be too old
you have your solution: the number of users = the number of elements in the array (see the sketch below)
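A minimal sketch of that approach with Rails.cache, using a hash keyed by user ID instead of an array of hashes; the method name, cache key, and 15-second staleness window (three missed 5-second pings) are assumptions:

def register_ping(page_id, user_id)
  key = "viewers/#{page_id}"
  viewers = Rails.cache.read(key) || {}
  viewers[user_id] = Time.now                            # latest ping for this user
  viewers = viewers.reject { |_id, at| at < 15.seconds.ago } # drop users who stopped pinging
  Rails.cache.write(key, viewers)
  viewers.size                                           # current number of viewers
end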
If you store the users in the database somehow, you could store a last_seen_at field in the users table, and update that with Time.now for every AJAX request that user sends.
To display how many users you currently have, you can just perform a query such as:
@user_count = User.where("last_seen_at > ?", 5.seconds.ago).count
If you want to clean up old users, I suggest that you run some kind of cron job, or use the whenever gem, or something like that, to periodically delete all users that haven't been seen for some time.
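The cleanup itself can be a one-liner run periodically; the one-hour threshold here is arbitrary:

User.where("last_seen_at < ?", 1.hour.ago).destroy_all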
I would suggest you create a model that contains a unique key (cookie-id or something) that you save or update with every AJAX heartbeat request.
You then have a session controller that could look like this:
def create
  active_user = ActiveUser.where(:cookie => params[:id]).first || ActiveUser.new
  active_user.cookie = params[:id]
  active_user.timestamp = Time.now
  active_user.save
end
Your number of active users is then simply SELECT COUNT(*) FROM active_users WHERE timestamp > NOW() - INTERVAL '5 seconds', or something like that.
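In ActiveRecord terms, that count would be something like:

ActiveUser.where("timestamp > ?", 5.seconds.ago).count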
Martin Frost is on the right track. There's the #touch method to update last_seen_at: user.touch(:last_seen_at)
But it would be even more efficient to just update the user without having to fetch the model from the database:
> User.update_all({:last_seen_at => Time.now}, {:id => params[:user_id]})
SQL (3.1ms) UPDATE "users" SET "last_seen_at" = '2011-11-17 12:37:46.863660' WHERE "users"."id" = 27
=> 1

validates_uniqueness_of failing on heroku?

In my User model, I have:
validates_uniqueness_of :fb_uid (I'm using facebook connect).
However, at times, I'm getting duplicate rows upon user sign up. This is Very Bad.
The creation time of the two records is within 100 ms. I haven't been able to determine whether it happens in two separate requests or not (Heroku logging sucks, only goes back so far, and it's only happened twice).
Two things:
Sometimes the request takes some time, because I query FB API for name info, friends, and picture.
I'm using bigint to store fb_uid (backend is postgres).
I haven't been able to replicate in dev.
Any ideas would be extremely appreciated.
The sign-in function:
def self.create_from_cookie(fb_cookie, remote_ip = nil)
  return nil unless fb_cookie
  return nil unless fb_hash = authenticate_cookie(fb_cookie)
  uid = fb_hash["uid"].join.to_i

  # Make user and set data
  fb_user = FacebookUser.new
  fb_user.fb_uid = uid
  fb_user.fb_authorized = true
  fb_user.email_confirmed = true
  fb_user.creation_ip = remote_ip

  fb_name_data, fb_friends_data, fb_photo_data, fb_photo_ext = fb_user.query_data(fb_hash)
  return nil unless fb_name_data
  fb_user.set_name(fb_name_data)
  fb_user.set_photo(fb_photo_data, fb_photo_ext)

  # Save user and friends to the db
  return nil unless fb_user.save
  fb_user.set_friends(fb_friends_data)
  return fb_user
end
I'm not terribly familiar with Facebook Connect, but is it possible to get two of the same uid if two separate users from two separate accounts post a request in very quick succession, before either request has completed? (Otherwise known as a race condition.) validates_uniqueness_of can still suffer from this sort of race condition; details can be found here:
http://apidock.com/rails/ActiveModel/Validations/ClassMethods/validates_uniqueness_of
Because this check is performed outside the database there is still a chance that duplicate values will be inserted in two parallel transactions. To guarantee against this you should create a unique index on the field. See add_index for more information.
You can really make sure this will never happen by adding a database constraint. Add this to a database migration and then run it:
add_index :users, :fb_uid, :unique => true
Now a user would get an error instead of being able to complete the request, which is usually preferable to generating illegal data in your database which you have to debug and clean out manually.
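A sketch of turning that error into a graceful fallback inside create_from_cookie, assuming the unique index above is in place:

begin
  fb_user.save!
rescue ActiveRecord::RecordNotUnique
  # Another request inserted the same fb_uid first; load that row instead.
  fb_user = FacebookUser.where(fb_uid: uid).first
end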
From Ruby on Rails v3.0.5 Module ActiveRecord::Validations::ClassMethods
http://s831.us/dK6mFQ
Concurrency and integrity
Using this [validates_uniqueness_of] validation method in conjunction with ActiveRecord::Base#save does not guarantee the absence of duplicate record insertions, because uniqueness checks on the application level are inherently prone to race conditions. For example, suppose that two users try to post a Comment at the same time, and a Comment's title must be unique. At the database-level, the actions performed by these users could be interleaved in the following manner: ...
It seems like there is some sort of race condition in your code. To check this, I would first change the code so that the Facebook values are extracted first, and only then create the new Facebook user object.
Then I would strongly suggest writing a test to check whether your function is executed only once; it seems it is being executed twice.
On top of this, there appears to be a race condition while waiting for the Facebook results.
