Ruby: If column exists use it, if not add to other column - ruby-on-rails

I'm parsing a CSV and trying to distinguish between columns in Model and "virtual" columns that'll be added to a JSONB :data column. So far I've got this:
rows = SmarterCSV.process(csv.path)
rows.each do |row|
row.select! { |x| Model.attribute_method?(x) } # this ignores non-matches
Model.create(row)
end
That removes columns from the CSV row that don't match up with Model. Instead, I want to add the data from all those into a column in Model called :data. How can I do that?
Edit
Something like this before the select! maybe?
row[:data] = row.select { |x| !Model.attribute_method?(x) }

There are a number of ways you could do this. One particularly straightforward way is with Hash#slice! from Rails' ActiveSupport extensions. It keeps only the given keys in the receiver and returns a Hash containing the key/value pairs it removed:
rows = SmarterCSV.process(csv.path)
attrs = Model.attribute_names.map(&:to_sym)
rows.each do |row|
row[:data] = row.slice!(*attrs)
Model.create(row)
end
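If ActiveSupport is not loaded, the same split can be sketched in plain Ruby with Hash#partition. The attrs list and the sample row below are made-up stand-ins for Model.attribute_names and a SmarterCSV row, so treat this as an illustration rather than a drop-in replacement:

```ruby
# Hypothetical stand-in for Model.attribute_names.map(&:to_sym)
attrs = %i[name email data]

# Hypothetical CSV row with two columns the model does not have
row = { name: "Ann", email: "ann@example.com", shoe_size: 9, pet: "cat" }

# partition returns two arrays of [key, value] pairs: matches and non-matches
known, extra = row.partition { |key, _value| attrs.include?(key) }.map(&:to_h)

# Nest the unmatched columns under :data, as in the slice! version
record_attrs = known.merge(data: extra)
```

record_attrs could then be passed to Model.create in place of row.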
P.S. This could probably be filed under "Stupid Ruby Tricks," but if you're using Ruby 2.0+ you can take advantage of the double-splat (**) for this compact construction:
rows.each do |row|
Model.create(data: row.slice!(*attrs), **row)
end
P.P.S. If your CSVs are big and you find yourself having performance concerns (calling create a few thousand times—and the subsequent database INSERTs—ain't cheap), I recommend checking out the activerecord-import gem. It's designed for exactly this sort of thing. With it you'd do something like this:
rows = SmarterCSV.process(csv.path)
attrs = Model.attribute_names.map(&:to_sym)
models = rows.map do |row|
row[:data] = row.slice!(*attrs)
Model.new(row)
end
Model.import(models)
There are other, faster options as well in the activerecord-import docs.

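You can try has_attribute? here. Use reject rather than keep_if, since keep_if modifies row in place and returns row itself:
row[:data] = row.reject { |key, _value| Model.has_attribute?(key) }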

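Have you tried:
extras = row.reject { |k, _v| Model.attribute_method?(k) }
row.select! { |k, _v| Model.attribute_method?(k) }
row[:data] = extras
Model.create(row)
This collects the key-value pairs that don't match a Model attribute, removes them from the row hash, and adds them back to the row under a :data key. (Assigning the result of delete_if directly would not work, because delete_if returns the hash it was called on.)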

Related

Is it possible to create a hash based on columns of two connected models with ActiveRecord/SQL?

Here are two models, Question and Answer.
Question has many answers and has a column question_text with unique values.
Answer belongs to question and has a column answer_text.
I want to create a hash from the columns of these tables, where each question.question_text is a key and the matching answers.answer_text values form an array.
I'm trying something like this:
Answer.joins(:question).where(questions: {question_text: ['value1', 'value2']}).group('questions.question_text').select('questions.question_text, array_agg(answers.answer_text)').as_json
But it doesn't work the way I would like, because it returns an array of hashes:
[{"question_text"=>"value1", "answers"=>["some text", "some text", "some text", "some text", "some text"], "id"=>nil}, {"question_text"=>"value2", "answers"=>["text", "text"], "id"=>nil}]
I would prefer a single hash in the following format:
{question.question_text: [question.answers.answer_text], question.question_text: [question.answers.answer_text]}
You can declare an empty hash and then loop through the questions and assign key/value pairs in the hash:
answer_hash = {}
Question.includes(:answers).each do |question|
# map over the preloaded answers; pluck would run a new query per question
answer_hash[question.question_text] = question.answers.map(&:answer_text)
end
The answer_hash contains the desired result.
array_agg is a PostgreSQL-specific feature, so a query that uses it is not database-agnostic, and database independence is one of the main purposes of an ORM like ActiveRecord.
I can't say whether what you want can be achieved more simply with a query alone, but a simple solution I can suggest is:
arel = Answer.joins(:question).where(questions: {question_text: ['value1', 'value2']})
arr_of_arr = arel.pluck('questions.question_text', 'answers.answer_text')
data_hash = {}
arr_of_arr.each do |arr|
question_text = arr[0]
answer_text = arr[1]
data_hash[question_text] ||= []
data_hash[question_text] << answer_text
end
data_hash
Hope this turns out to be useful.
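The grouping step can also be written without the explicit ||= initialisation. This is a minimal sketch in which the pairs array is a made-up stand-in for what pluck('questions.question_text', 'answers.answer_text') returns:

```ruby
# Hypothetical [question_text, answer_text] pairs, as pluck would return them
pairs = [
  ["value1", "some text"],
  ["value1", "more text"],
  ["value2", "text"]
]

# The default block creates an empty array the first time a question is seen
data_hash = pairs.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |(question_text, answer_text), hash|
  hash[question_text] << answer_text
end
```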

Reducing Rails Queries (N+1?)

So in my past application, I was somewhat familiar with using .includes in Rails, but for some reason I'm having a bit of a difficult time in my current scenario.
Here's what I'm working with:
# If non-existent, create. Otherwise, update.
existing_data = Page.all
updated_data = {}
new_records = []
@latest_page_data.each do |key, value|
existing_record = existing_data.find_by(symbol: key)
if existing_record != nil
updated_data[existing_record.id] = value
else
new_records << Page.new(value)
end
end
if !new_records.empty?
Page.import new_records
end
if !updated_data.empty?
Page.update(updated_data.keys, updated_data.values)
end
end
The problem that I'm having is that the .find_by portion of the code results in a query on every single iteration of @latest_page_data. I assumed that existing_data would hold all of the data it needs in memory, but obviously it doesn't work that way.
So next, I tried something like this:
# If non-existent, create. Otherwise, update.
existing_data = Page.includes(:id, :symbol)
updated_data = {}
new_records = []
@latest_currency_data.each do |key, value|
existing_record = existing_data.find_by(symbol: key)
but then rails throws an error, stating:
ActiveRecord::AssociationNotFoundError (Association named 'id' was not
found on Page; perhaps you misspelled it?):
so I can't use this example to find the id and symbol attributes.
I tried to take out :id in the Page.includes method, but I need to be able to get to the ID attribute in order to update the respective record later down in the code.
I've also seen some other posts pertaining to this topic, but I think the problem I may be running into is that I'm not dealing with associations (and I believe that's what .includes is for?). If this is the case, is there any other way that I can reduce all of the queries that I'm submitting here?
The includes method is used to preload associated models. I think what you are looking for is select. Modifying your code to use select:
existing_data = Page.select(:id, :symbol).load
updated_data = {}
new_records = []
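@latest_currency_data.each do |key, value|
# find with a block searches the already loaded records; find_by would query the database again
existing_record = existing_data.find { |page| page.symbol == key }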
if existing_record
updated_data[existing_record.id] = value
else
new_records << Page.new(value)
end
end
The drawback of using select rather than pluck is that Rails instantiates a model object for each row, so it is slower than pluck. Benchmark: pluck vs select
Rather than trying to figure out a way to do it in Rails (since I'm not familiar with the 100% correct/accurate Rails way), I just decided to use .pluck and convert it into a hash to get the data that I'm looking for:
existing_data = Page.pluck(:id, :symbol)
existing_data = Hash[*existing_data.flatten]
updated_data = {}
new_records = []
@latest_currency_data.each do |key, value|
if existing_data.value?(key)
id = existing_data.key(key) # the id whose symbol equals key exactly
updated_data[id] = value
else
new_records << Page.new(value)
end
end
If anyone has a better way, it'd be gladly appreciated. Thanks!
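One possible simplification is to key the lookup hash by symbol instead of id, so each iteration is a single hash lookup. In this sketch the pairs array stands in for Page.pluck(:symbol, :id) and latest_data stands in for @latest_currency_data; both are invented sample values:

```ruby
# Hypothetical result of Page.pluck(:symbol, :id)
pairs = [["BTC", 1], ["ETH", 2]]
id_by_symbol = pairs.to_h

# Hypothetical incoming data keyed by symbol
latest_data = {
  "BTC" => { price: 100 },
  "XRP" => { price: 3 }
}

updated_data = {}
new_attributes = []

latest_data.each do |symbol, value|
  if (id = id_by_symbol[symbol])
    updated_data[id] = value   # existing record: queue an update by id
  else
    new_attributes << value    # unknown symbol: queue a new record
  end
end
```

In the real code, new_attributes would be turned into Page.new objects for import and updated_data passed to Page.update as before.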

Can Rails sort a CSV file?

I'm exporting a CSV from many different sources which makes it very hard to sort before putting it into the CSV.
csv = CSV.generate col_sep: '#' do |csv|
... adding a few columns here
end
Now, it would be awesome if I was able to sort this CSV by the 2nd column. Is that in any way possible?
If you're trying to sort before writing, it depends on your data structure, and I'd need to see more of your code. When reading a CSV, you can convert each row to a hash and even sort by header name:
rows = []
CSV.foreach('mycsvfile.csv', headers: true) do |row|
rows << row.to_h
end
rows.sort_by{ |row| row['last_name'] }
Edit to use sort_by, thanks to max williams.
Here is how you would sort by column number:
rows = []
CSV.foreach('mycsvfile.csv', headers: true) do |row|
# collect each row as an array of values only
rows << row.to_h.values
end
# sort in place by the 2nd column
rows.sort_by! { |row| row[1] }
rows.each do |row|
# do stuff with your now sorted rows
end
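For the writing side, one option is to collect the rows in an array first and sort that array before handing it to CSV.generate. A minimal sketch, with made-up rows standing in for the data gathered from the different sources:

```ruby
require "csv"

# Hypothetical rows collected from several sources, in arbitrary order
rows = [
  ["3", "carol"],
  ["1", "alice"],
  ["2", "bob"]
]

# Sort by the 2nd column (index 1) before any CSV is generated
sorted_rows = rows.sort_by { |row| row[1] }

output = CSV.generate(col_sep: "#") do |csv|
  sorted_rows.each { |row| csv << row }
end
```

Values are compared as strings here; if the sort column holds numbers, converting inside the block (for example with to_i) is probably what you want.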

Possible to use ActiveRecord methods on objects loaded from db?

I would like to apply the "find_all_by..." method to records that have already been retrieved by User.all. Is this possible? At this point I am getting an "undefined method `find_all_by_type'" error:
rows = User.all
rows.each do |r|
result = rows.find_all_by_type(r.type)
end
Once the records are loaded, you can use any Enumerable method on the collection. What you're looking for here is select:
rows = User.all
rows.each do |r|
result = rows.select {|row| row.type == r.type}
end
Although I do wonder what you're actually trying to do here. If this is pseudocode or a simplified example, then you can probably apply my code above. You may be better off with this though:
rows = User.all.group_by(&:type)
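To show what group_by returns, here is a small sketch that uses a Struct as a stand-in for loaded User records (UserRow and its sample values are invented for the example):

```ruby
# Hypothetical stand-in for User records that have already been loaded
UserRow = Struct.new(:name, :type)

rows = [
  UserRow.new("ann", "admin"),
  UserRow.new("bob", "member"),
  UserRow.new("cat", "admin")
]

# One pass over the collection; each key maps to the records of that type
by_type = rows.group_by(&:type)

admin_names = by_type["admin"].map(&:name)
```

Each lookup such as by_type["admin"] then replaces a select over the whole collection.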

Verifying if an object is in an array of objects in Rails

I'm doing this:
@snippets = Snippet.find :all, :conditions => { :user_id => session[:user_id] }
@snippets.each do |snippet|
snippet.tags.each do |tag|
@tags.push tag
end
end
But if a snippet has the same tag twice, it'll push the object twice.
I want to do something like if @tags.in_object(tag)[...]
Would it be possible? Thanks!
I think there are two ways to get a faster result.
1) Add a condition to your find statement (DISTINCT in MySQL). This will return only unique results. Databases are generally much better at this than application code.
2) Instead of testing with include? each time, why not call uniq after you populate your array?
Here is some example code:
ar = []
data = []
# get some random sample data
100.times do
data << ((rand*10).to_i)
end
# populate your result array
# 3 ways to do it.
# 1) you can modify your original array with
data.uniq!
# 2) you can populate another array with your unique data
# this doesn't modify your original array
ar.concat(data.uniq)
# 3) you can run a loop if you want to do some sort of additional processing
data.each do |i|
i = i.to_s + "some text" # do whatever you need here
ar << i
end
Depending on the situation you may use any of these.
But running include? on each item in the loop is not the fastest approach, IMHO.
Good luck
Another way would be to simply concat the @tags and snippet.tags arrays and then strip out the duplicates.
@snippets.each do |snippet|
@tags.concat(snippet.tags)
end
@tags.uniq!
I'm assuming @tags is an Array instance.
Array#include? tests if an object is already included in an array. This uses the == operator, which in ActiveRecord tests for the same instance or another instance of the same type having the same id.
Alternatively, you may be able to use a Set instead of an Array. This will guarantee that no duplicates get added, but is unordered.
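A rough sketch of the Set approach follows. Tag here is a Struct standing in for the ActiveRecord model (Struct instances with equal members are treated as equal by Set), and the nested arrays are invented sample data playing the role of each snippet's tags:

```ruby
require "set"

# Hypothetical stand-in for the Tag model
Tag = Struct.new(:id, :name)

# Hypothetical tags per snippet; the "ruby" tag appears in both
tags_per_snippet = [
  [Tag.new(1, "ruby"), Tag.new(2, "rails")],
  [Tag.new(1, "ruby")]
]

tags = Set.new
tags_per_snippet.each do |snippet_tags|
  tags.merge(snippet_tags) # adding an element that is already present is a no-op
end

tag_names = tags.map(&:name).sort
```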
You can probably add a group to the query:
Snippet.find :all, :conditions => { :user_id => session[:user_id] }, :group => "tag.name"
Group will depend on how your tag data works, of course.
Or use uniq:
@tags = (@tags + snippet.tags).uniq
