Rails: huge data import to three connected tables - ruby-on-rails

I'm looking for a good way to solve the performance issues in my Rails application. I have three tables with one-to-one-to-many connections between them. If I fill in 130 items of the first table together with all the data for the tables underneath, it results in about 1000 queries and takes about 10 seconds (SQLite DB).
I found the accepts_nested_attributes_for statement, which lets you enter data for multiple tables in one line of code. My question is whether this is a good option from a performance point of view. Does anybody have any experience with it?
Thanks
Markus

accepts_nested_attributes_for lets ActiveRecord write to an association directly through the parent model.
Example:
You have models like:
class User < ActiveRecord::Base
has_many :cars
accepts_nested_attributes_for :cars
end
class Car < ActiveRecord::Base
belongs_to :user
end
and a hash like:
params[:user] = {}
params[:user][:name] = "Mike"
params[:user][:cars_attributes] = [{ :brand => "Nissan" }]
User.create(params[:user])
This will create a new user and a new car.
Without accepts_nested_attributes_for:
@user = User.create(:name => params[:user][:name])
@car = Car.create(:brand => "Nissan")
@user.cars << @car
This method is usually used together with fields_for in HTML forms, so you can easily handle the creation of an object and its associations.
In your case, I imagine your models look like this (going by your XML):
class Card < ActiveRecord::Base
has_one :front_side, :class_name => "Side"
has_one :back_side, :class_name => "Side"
end
class Side < ActiveRecord::Base
belongs_to :card
has_many :card_side_entries
end
class CardSideEntry < ActiveRecord::Base
belongs_to :side
end
I don't know where your XML comes from (is your data extracted from it?), but I imagine you could use accepts_nested_attributes_for so that each card hash generates its associations.
But I'm not sure I understand the whole problem, or whether this is the best solution.
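As a rough sketch of what "each card hash" could look like, the snippet below turns XML shaped like the asker's file into one nested hash per card. It uses REXML from Ruby's standard distribution and the "<association>_attributes" key convention of accepts_nested_attributes_for. The sample XML, the text attribute on each entry, and the helper names side_attributes and card_hashes are assumptions made for illustration. It also assumes that Card and Side declare accepts_nested_attributes_for for their associations, which the models above do not show.

```ruby
require "rexml/document"

# Sample input modelled on the XML in the question. The "text" attribute is
# a made-up placeholder, since the question does not show entry contents.
CARDS_XML = <<~DOC
  <Cards>
    <Card>
      <FrontSide>
        <CardSideEntries>
          <CardSideEntrie text="hello"/>
          <CardSideEntrie text="world"/>
        </CardSideEntries>
      </FrontSide>
      <BackSide>
        <CardSideEntries>
          <CardSideEntrie text="hallo"/>
        </CardSideEntries>
      </BackSide>
    </Card>
  </Cards>
DOC

# Turn one <FrontSide> or <BackSide> element into a nested-attributes hash.
def side_attributes(side_element)
  entries = side_element.elements.to_a("CardSideEntries/CardSideEntrie")
  { card_side_entries_attributes: entries.map { |e| { text: e.attributes["text"] } } }
end

# Build one hash per <Card>, using the "<association>_attributes" keys that
# accepts_nested_attributes_for expects.
def card_hashes(xml)
  doc = REXML::Document.new(xml)
  doc.elements.to_a("Cards/Card").map do |card|
    {
      front_side_attributes: side_attributes(card.elements["FrontSide"]),
      back_side_attributes:  side_attributes(card.elements["BackSide"])
    }
  end
end

cards = card_hashes(CARDS_XML)
```

Under those assumptions, each hash in cards could be handed to Card.create!, so a card, its two sides and their entries are saved through a single call. Whether that reduces the number of INSERT statements is a separate question: nested attributes mainly simplify the code, and each row is still written individually.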

here it is:
Table: cards
front_side_id
back_side_id
Table: card_sides
Table: card_side_entries
card_side_id
I now have an XML file like this:
<Cards>
<Card>
<FrontSide>
<CardSideEntries>
<CardSideEntrie/>
...
</CardSideEntries>
</FrontSide>
<BackSide>
<CardSideEntries>
<CardSideEntrie/>
...
</CardSideEntries>
</BackSide>
</Card>
...
</Cards>
In my solution I parse the whole XML file line by line, and because I sometimes need a card_id, I have to save a certain table entry twice... Does anybody know something about accepts_nested_attributes_for?
Thanks,
Markus

Related

Includes method (n+1 issue) doesn't work with a push method but does with += when assigning to an array

Consider the simple relation between Employee and Company models(many to many):
Company model:
has_many :employees, through: :company_employees
has_many :company_employees
Employee model:
has_many :companies, through: :company_employees
has_many :company_employees
CompanyEmployee model(join table):
belongs_to :employee
belongs_to :company
also an Owner model:
has_many :companies
So in my system I have an owner, who may have several companies, and an Employee, who may work for multiple companies.
Now, in my employees controller I want to fetch all of the employees working for an owner:
def owners_linked
@company_employees = []
owner.companies.each do |company|
@company_employees.push(company.company_employees.includes(:company, :employee)) # with += instead of push, it works
end
respond_to do |format|
format.js {render "employees_list"}
end
end
I need access to the Employee instances (personal data), the company_employees table (information about the position in the company) and Company (company-related data).
To resolve the n+1 problem and speed things up, I use the includes method.
The problem is this line in my controller action:
@company_employees.push company.company_employees.includes(:company, :employee)
When I use push, it doesn't work: the view raises an error saying that the employee method is not defined.
On the other hand, when I change push to +=, it works perfectly fine.
Can anyone help me understand why?
I know that += is inefficient, so I'd rather not stick with it.
This doesn't have anything to do with your use of includes.
When you use += you end up with an array of CompanyEmployee objects. However when you use push you are no longer concatenating arrays but creating an array of collections. You are then calling employee on the collection rather than an element of the collection which is why you get an error.
Personally I would write this as
@company_employees = owner.companies.flat_map do |company|
company.company_employees.includes(...)
end
Although I would do so for reasons of succinctness rather than performance. Any performance difference between += and other ways of concatenating arrays is minuscule compared to the time it takes to fetch data from the database.
This doesn't entirely solve your n+1 problem though, since the data for each company is loaded separately. I would do
@company_employees = owner.companies.includes(company_employees: [:company, :employee]).flat_map(&:company_employees)
Which doesn't do as many queries.
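To make the push versus += difference described in this answer concrete, here is a small sketch that uses plain arrays as stand-ins for the per-company query results. The variable names and string values are made up for illustration. Real ActiveRecord relations are not arrays, but they are collections in the same sense, so the nesting behaves the same way.

```ruby
# Stand-ins for the rows returned per company (plain arrays instead of
# ActiveRecord relations; names and values are hypothetical).
acme_rows   = ["acme: alice", "acme: bob"]
globex_rows = ["globex: carol"]
results     = [acme_rows, globex_rows]

# push adds each collection as ONE element, giving an array of collections.
pushed = []
results.each { |rows| pushed.push(rows) }

# += concatenates, so the elements of each collection are added individually.
concatenated = []
results.each { |rows| concatenated += rows }

# flat_map gives the same flat result as += in a single expression.
flattened = results.flat_map { |rows| rows }
```

Here pushed holds two nested arrays, while concatenated and flattened each hold three strings. Calling a row-level method such as employee on an element of pushed would therefore hit a collection, not a row, which matches the error described in the question.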
I usually attack this by coming in from the other side. I believe this will get what you are looking for:
@company_workers = Employee.where(company_id: owner.companies.pluck(:id))
where(company_id: ...) can take an array and automatically turns it into an IN (...) condition in SQL.
So the SQL will end up being something like:
SELECT * FROM employees WHERE company_id IN (1, 2, 3, 4)
where 1, 2, 3, 4 are the owner's company IDs.
Note that this assumes employees has a company_id column; with the join table from the question, the same condition can be applied to CompanyEmployee instead.

Deleting a record in a has_and_belongs_to_many table - Rails

I've been looking, and can't find a good answer for how to delete records in a HABTM table. I assume a lot of people have this same requirement.
Simply, I have Students, Classes, and Classes_Students
I want a student to be able to drop a class, or delete the HABTM record that has signed that student up for that class.
There must be a simple answer to this. Does anyone know what it is?
The reason .destroy or .delete does not work in this situation is that the middle table has no primary key. However, the parent objects have a really useful method called {other_obj}_ids. Called on an object from one side of the association, it returns the ids of the associated objects on the other side. This information is, of course, populated from the middle table.
With that in mind, we have two model classes (Student and Classes). Active Record can generally figure out the middle table by itself if you are not doing anything fancy, although has_many :through is the recommended approach.
class Student < ActiveRecord::Base
has_and_belongs_to_many :classes
end
class Classes < ActiveRecord::Base
has_and_belongs_to_many :students
end
What we can now do in terms of the middle table with this setup...
student = Student.find(1)
student.classes # list of class objects this student currently has
student.class_ids # array of ids of the classes this student currently has
# how to remove a course from the middle table programmatically
course = Classes.find_by(:name => 'Math 101')
# if this is actually a real course...
unless course.nil?
# check to see if the student actually has the course...
if student.class_ids.include?(course.id)
# update the list of ids in the array. This triggers a database update
student.class_ids = student.class_ids - [course.id]
end
end
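The id bookkeeping above relies on Ruby's Array#-, which returns a new array without the given elements and leaves the original untouched. A minimal sketch with made-up ids, using a plain array in place of student.class_ids:

```ruby
class_ids = [3, 7, 12]  # hypothetical ids of the classes a student is enrolled in
course_id = 7           # hypothetical id of the class being dropped

# Array#- builds a new array without the given ids; class_ids is not modified.
remaining = class_ids - [course_id]

# Subtracting an id that is not in the list returns an equal array.
unchanged = class_ids - [99]
```

Since subtracting a missing id changes nothing, the include? check in the answer reads as a safeguard rather than something the subtraction itself needs.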
I know this is a little late to answer this, but I just went through this exact situation tonight and wanted to share the solution here.
Now, if you want this deleted through a form, since you can now see how it is handled programmatically, simply make sure the form input is nested such that it has something to the effect of:
What kind of trouble are you having? Do you have the appropriate :dependent=>:destroy and :inverse_of=>[foo] on your relations?
Let's say a class had a course title. You can do:
student.classes.find_by_course_title("Science").delete
So the proper answer here is to do something like this in your view:
<%= link_to 'Remove', cycle_cycles_group_path(@cycle, cycle), method: :delete %><br />
cycle is the block variable of the block that the code above sits in.
@cycle is an instance variable from the join model's controller.
cycle_cycles_group_path is the route helper for the join table "cycles_groups", nested under the model "Cycle" in the routes.rb file:
resources :cycles do
resources :cycles_groups do
end
end
and the join model controller looks like this:
def destroy
@cycles_group = CyclesGroup.find(params[:id])
@cycle = @cycles_group.cycle
@cycles_group.destroy
puts "cycle: #{@cycle}"
respond_to do |format|
format.html {redirect_to cycle_path(@cycle), notice: 'Training Week was successfully removed!'}
end
end

Correct way to create or update with multiple belongs_to in Rails

New to Rails and Ruby and trying to do things correctly.
Here are my models. Everything works fine, but I want to do things the "right" way so to speak.
I have an import process that takes a CSV and tries to either create a new record or update an existing one.
So the process is 1.) parse csv row 2.) find or create record 3.) save record
I have this working perfectly, but the code seems like it could be improved. If ParcelType wasn't involved it would be fine: since I'm creating/retrieving a parcel FROM the Manufacturer, that foreign key is pre-populated for me. But the ParcelType isn't. Is there any way to have both Type and Manufacturer pre-populated, since I'm using them both in the search?
A CSV row can have multiple manufacturers (which results in two almost identical rows, just with a different mfr_id), so that's what the .each is about:
manufacturer_id.split(";").each do |mfr_string|
mfr = Manufacturer.find_by_name(mfr_string)
# If it's a mfr we don't care about, don't put it in the db
next if mfr.nil?
# A unique parcel is defined by its manufacturer, its type, its model number, and its reference_number
parcel = mfr.parcels.of_type('FR').find_or_initialize_by_model_number_and_reference_number(attributes[:model_number], attributes[:reference_number])
parcel.assign_attributes(attributes)
# this line in particular is a bummer. If it finds a parcel and I'm updating, this line is superfluous; it is only necessary for a new parcel
parcel.parcel_type = ParcelType.find_by_code('FR')
parcel.save!
end
class Parcel < ActiveRecord::Base
belongs_to :parcel_type
belongs_to :manufacturer
def self.of_type(type)
joins(:parcel_type).where(:parcel_types => {:code => type.upcase}).readonly(false) unless type.nil?
end
end
class Manufacturer < ActiveRecord::Base
has_many :parcels
end
class ParcelType < ActiveRecord::Base
has_many :parcels
end
It sounds like the new_record? method is what you're looking for.
new_record?() public
Returns true if this object hasn’t been saved yet — that is, a record
for the object doesn’t exist yet; otherwise, returns false.
The following will only execute if the parcel object is indeed a new record:
parcel.parcel_type = ParcelType.find_by_code('FR') if parcel.new_record?
What about 'find_or_create'?
I have wanted to use this for a long time; check these links.
Usage:
http://rubyquicktips.com/post/344181578/find-or-create-an-object-in-one-command
Several attributes:
Rails find_or_create by more than one attribute?
Extra:
How can I pass multiple attributes to find_or_create_by in Rails 3?

Ruby on rails activerecord joins - select fields from multiple tables

models:
#StatusMessage model
class StatusMessage < ActiveRecord::Base
belongs_to :user
default_scope :order => "created_at DESC"
end
#User Model
class User < ActiveRecord::Base
has_many :status_messages
end
In the controller I want to join these two tables and get fields from both of them. For example, I want the email field from User and the status field from StatusMessage. When I use:
@status = User.joins(:status_messages)
or
@status = User.includes(:status_messages)
it gives me only the users table data.
How can I implement this requirement?
You need to use includes here. It preloads the data, so no further SQL query is run when you call @user.status_messages.
And yes you can't really see the effect: you need to check your logs.
First of all, I don't think that what you want to do is possible (or reasonable). The reason is that the relation between User and StatusMessage is 1:n, which means that each user can have 0 to n status messages. How should this multitude of attributes be included in your user model?
I think that the joins method in ActiveRecord has a different meaning: it is part of the query interface. See the question LEFT OUTER joins in Rails 3.
There are similar questions on the net; here is the closest match I have found:
Ruby on Rails: How to join two tables: this (translated to your example) gives the user a primary_status_message, which is then materialized in the query for the user. It is held in a single attribute, though, so to access the attributes of the status message you have to write something like @user.primary_status_message.status
When you use @status = User.includes(:status_messages), Rails eagerly loads the data of all the tables involved.
My point is that User.includes(:status_messages) loads the status_messages data as well, but only shows the users table data. If you want the status messages of the first user, you have to call @status.first.status_messages.

Write join table data - has_many :through

This should be a simple question, but I can't find a good solution online.
I have three tables/models: User, Alliance and Alliance_Membership. The latter is a join table describing the Alliance has_many :users, through: :alliance_memberships relationship.
Everything works OK, but Alliance_Membership now has an extra field called 'rank'. The question is: how do I set that when creating my new object? Currently, I do something like:
@alliance.users << current_user
This is really convenient, since it populates my Alliance_Membership table automatically. But how can I set the Alliance_Membership.rank field as well?
You'll need to create the membership yourself to set the 'rank' attribute. Something like this:
@alliance.alliance_memberships.create!(
:user => current_user,
:rank => 'whatever')

Resources