I am using Nokogiri to grab data from a webpage. So far I can save to one column in the model:
def update_fixtures # rake task method
  Fixture.destroy_all
  get_fixtures.each { |match| Fixture.create(home_team: match) }
end

def get_fixtures # get all home teams
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  home_team = doc.css(".team-home.teams").map { |h| h.text.strip }
end
What I am wondering is the most efficient way to save to 2, 3 or 4 columns at the same time.
So as an example, I have another column called away_team, and I would grab that data in the same way as the home team:
away_team = doc.css(".team-away.teams").map { |a| a.text.strip }
Is it advisable to put this within the get_fixtures method, and then add it to update_fixtures with something like:
def update_fixtures # rake task method
  Fixture.destroy_all
  get_fixtures.each { |match| Fixture.create(home_team: match, away_team: match) }
end
After trying this, the same data gets saved to both the home and away columns. Reading it back I can see why (I think it's because match only holds the home_team data?). How can I pass the attributes of the away team along with the home team?
This is all very new to me, so any help provided is appreciated.
This isn't the right approach: home_team and away_team are both assigned from the same match variable, so you get the same data in both columns.
Do the following:
UPDATE:
Your model:
attr_accessible :home_team, :away_team
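def update_fixtures # rake task method
  Fixture.destroy_all
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  home_teams = doc.css(".team-home.teams").map { |h| h.text.strip }
  away_teams = doc.css(".team-away.teams").map { |a| a.text.strip }
  # Pair each home team with the away team at the same position,
  # then create one Fixture per pair.
  home_teams.zip(away_teams).each do |home, away|
    Fixture.create(home_team: home, away_team: away)
  end
end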
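To see how two parallel arrays can be combined into one record per match, here is a minimal plain-Ruby sketch. The team names are made-up sample data, and hashes stand in for Fixture records so it runs without Rails or Nokogiri.

```ruby
# Hypothetical sample data standing in for the two scraped arrays.
home_teams = ["Arsenal", "Chelsea", "Everton"]
away_teams = ["Fulham", "Stoke", "Wigan"]

# zip pairs elements by position: first home with first away, and so on.
fixtures = home_teams.zip(away_teams).map do |home, away|
  { home_team: home, away_team: away }
end

fixtures.each { |f| puts "#{f[:home_team]} v #{f[:away_team]}" }
```

This only works if both selectors return the teams in the same order and in equal numbers, which is worth checking on the real page.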
Related
I have two Model level functions that populate two arrays, my_favorite_brands and my_favorites. They are defined as follows:
def self.my_favorite_brands
brand_list = Array.new
Favorite.my_favorites.each do |favorite|
brand_list.push(favorite.style.shoe.brand)
end
brand_list.uniq
end
And here is my_favorites:
def self.my_favorites
Favorite.where(:user => current_user)
end
I want to print out each Brand in my_favorite_brands and, while doing so, print out all of its associated Favorites from my_favorites. The relation between the two models Brand and Favorite is the following: Brand has many Shoes, which have many Styles. Favorite belongs to Style and also belongs to User. Here is some probably non-functional pseudo-ish code (in that it doesn't really work) that emulates what I want to do.
# Controller stuff
@fav_brands = Brand.my_favorite_brands
@fav_favorites = Favorite.my_favorites
# in the view
favorites_by_brand = Array.new
@fav_brands.each do |brand|
  favorites_by_brand = @fav_favorites.map do |favorite|
    unless favorite.style.shoe.brand == brand
      @fav_favorites.delete("favorite")
    end
  end
  favorites_by_brand.each do |favorite|
    puts favorite.style
  end
end
I am trying to create a complete list of favorites where favorite.style.shoe.brand is equal to the current brand I am iterating over, so that I can display the styles by Brand.
I figured out a way to accomplish what I want, although it's probably not ideal.
styles_of_favorites = @fav_favorites.map { |favorite| favorite.style }
@fav_brands.each do |brand|
  brand.shoes.each do |shoe|
    shoe.style.each do |style|
      if styles_of_favorites.include?(style)
        puts style
      end
    end
  end
end
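An alternative worth considering is Enumerable#group_by, which builds the brand-to-favorites mapping in a single pass. The sketch below uses plain Structs as hypothetical stand-ins for the Brand, Shoe, Style and Favorite models, and the style names are invented for illustration.

```ruby
# Hypothetical stand-ins for the ActiveRecord models in the question.
Brand    = Struct.new(:name)
Shoe     = Struct.new(:brand)
Style    = Struct.new(:name, :shoe)
Favorite = Struct.new(:style)

nike   = Brand.new("Nike")
adidas = Brand.new("Adidas")
favorites = [
  Favorite.new(Style.new("Runner Red", Shoe.new(nike))),
  Favorite.new(Style.new("Court Black", Shoe.new(adidas))),
  Favorite.new(Style.new("Runner Blue", Shoe.new(nike)))
]

# Group the favorites under the brand reached through style -> shoe -> brand.
favorites_by_brand = favorites.group_by { |fav| fav.style.shoe.brand }

favorites_by_brand.each do |brand, favs|
  puts brand.name
  favs.each { |fav| puts "  #{fav.style.name}" }
end
```

With real models, each favorite.style.shoe.brand call could trigger extra queries, so eager loading the associations would probably be needed as well.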
I have a List model below; it has a has_and_belongs_to_many association with recipients. The purpose of the method make_recipient_lists is to save a parsed CSV of numbers (the first parameter) in this format: [[num1],[num2],[num3]...].
add_recipient works by finding an existing recipient and adding it to the list, or creating a new recipient.
This whole process works well for small amounts: 20k numbers take 28 minutes. However, the time grows much faster than the input does: 70k numbers took 14 hours. That is probably because every number is checked for duplicates against the cached current_recipients.
Question is, is there any way to make this faster? I am probably approaching this problem wrong. Thanks!
class List < ActiveRecord::Base
  # other methods above

  def make_recipient_lists(numbers, options)
    rejected_numbers = []
    account = self.user.account
    # caching recipients
    current_recipients = self.recipients
    numbers.each do |num|
      add_recipient(num[0], current_recipients)
    end
  end

  def add_recipient(num, current_recipients)
    account = self.user.account
    recipient = current_recipients.where(number: num, account_id: account.id).first
    recipient ||= current_recipients.create!(number: num, account_id: account.id)
    recipient
  end
end
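You could do something like this. I have not tested it, but you get the idea.
def make_recipient_lists(numbers, options)
  rejected_numbers = []
  account = self.user.account
  # numbers arrives as [[num1], [num2], ...], so flatten it first
  numbers = numbers.flatten.uniq
  # one query to find which numbers are already on this list
  existing_numbers = self.recipients.where(number: numbers, account_id: account.id).map(&:number)
  new_records = (numbers - existing_numbers).map { |n| { number: n, account_id: account.id } }
  # create through the association so the join rows are written too
  self.recipients.create(new_records)
end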
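The core of this approach is working out the new numbers in memory instead of querying once per number. A minimal plain-Ruby sketch of that step follows; the rows and the existing set are made-up sample data, and a Set is used here as an assumption to keep each membership check cheap.

```ruby
require 'set'

# Hypothetical data: parsed CSV rows and the numbers already on the list.
rows     = [["111"], ["222"], ["333"], ["222"]]
existing = Set.new(["111"])

# Flatten [[num], [num], ...] into a flat list and drop repeats in the file.
numbers = rows.flatten.uniq

# Keep only the numbers that are not on the list yet.
new_numbers = numbers.reject { |n| existing.include?(n) }

puts new_numbers.inspect
```

Only new_numbers then needs to be written to the database, so the number of queries no longer grows with the size of the file.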
I think you should use the Rails Active Record query interface. You can use the find_or_create_by method for this, which does the lookup and the creation in one call. It may make your queries faster; change your method like this and check the time difference:
def make_recipient_lists(numbers, options)
  rejected_numbers = []
  account = self.user.account
  numbers.each do |num|
    # each num is a one-element array, so take its first entry
    self.recipients.find_or_create_by(number: num[0], account_id: account.id)
  end
end
Hope it will help. Thanks.
I have a Rails site that logs simple actions such as when people upvote and downvote information. For every new action, an EventLog is created.
What if the user changes his or her mind? I have an after_create callback that looks for complementary actions and deletes both if it finds a recent pair. For clarity, I mean that if a person upvotes something and soon cancels, both event_logs are deleted. What follows is my callback.
# Find duplicate events by searching nearly all the fields in the EventLog table
@duplicates = EventLog.where("user_id = ? AND event = ? AND project_id = ? AND ...", ...).order("created_at DESC")
if @duplicates.size > 1
  @duplicates.limit(2).destroy_all
end
The above code doesn't quite work because if any of the fields happen to be nil, the query returns [].
How can I write this code so it can handle null values, and/or is there a better way of doing this altogether?
If I understood this correctly, some of the fields can be nil, and you want to find activity logs that have the same user_id and either the same project_id or a nil project_id.
So I guess this query should work for you:
ActivityLog.where(user_id: <some_id>, activity: <complementary_id>, project_id: [<some_project_id>, nil], ....)
This way you would get the complementary event logs where user_id is the same and project_id may or may not be present.
class ActivityLog
QUERY_HASH = Proc.new{ {user_id: self.user_id,
activity: complementary_id(self.id),
and so on....
} }
How about:
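# event_log.rb
# Attributes that must match for two logs to count as duplicates
def duplicate_attr_map
  [:user_id, :project_id]
end

def duplicates
  # skip attributes that are blank on this record
  attribs = duplicate_attr_map.reject { |attr| send(attr).blank? }
  query = attribs.map { |attr| "#{attr} = ?" }.join(' AND ')
  values = attribs.map { |attr| send(attr) }
  EventLog.where(query, *values).order("created_at DESC")
end

def delete_duplicates(n)
  # destroy_all loads the limited records first, as in the question's callback
  duplicates.limit(n).destroy_all if duplicates.size > 1
end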
# usage:
# EventLog.find(1).delete_duplicates(2)
not tested, could be improved
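In SQL, a comparison such as project_id = NULL never matches, which is why a string query with "= ?" returns nothing when a field is nil. One way around it is to emit IS NULL for nil values. The helper below, build_conditions, is a hypothetical name for a plain-Ruby sketch of that idea, and its sample attributes are made up.

```ruby
# Build a [sql, values] pair; nil values become "IS NULL" instead of "= ?".
def build_conditions(attrs)
  clauses = []
  values = []
  attrs.each do |column, value|
    if value.nil?
      clauses << "#{column} IS NULL"
    else
      clauses << "#{column} = ?"
      values << value
    end
  end
  [clauses.join(" AND "), values]
end

sql, values = build_conditions(user_id: 7, event: "upvote", project_id: nil)
puts sql
puts values.inspect
```

The pair could then be passed on as EventLog.where(sql, *values). Active Record's hash form, for example where(project_id: nil), produces IS NULL on its own, so passing a hash of attributes may be simpler than building the string by hand.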
Slowly getting there with what I am trying to achieve. I am grabbing data via a screen scrape and want to save it to my model, which has two columns, home_team and away_team. So far I can grab the data:
FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"
def get_fixtures # Get me all Home and away Teams
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  home_team = doc.css(".team-home.teams").map { |h| h.text.strip }
  away_team = doc.css(".team-away.teams").map { |a| a.text.strip }
  #team_clean = Hash[:home_team => home_team, :away_team => away_team]
  #team_clean = Hash[:team_clean => [Hash[:home_team => home_team, :away_team => away_team]]]
end
I have commented out two ways of getting the data into a hash: one is a plain hash and the other is a hash nested within a hash. I am not sure which one I need (if any?).
So if I want to save the data from home_team, I run a rake task to do this:
def update_fixtures # rake task method
  Fixture.destroy_all
  get_fixtures.each { |home| Fixture.create(:home_team => home) }
end
What I want to achieve is to save home_team and away_team at the same time. Do I need to access the data within the hash, and if so, how? I'm a bit lost here, as this is the first time I am attempting this.
Any help appreciated.
Try this,
FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"
def get_fixtures # Get me all Home and away Teams
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  matches = doc.css('tr.preview')
  matches.each do |match|
    home_team = match.css('.team-home').text.strip
    away_team = match.css('.team-away').text.strip
    Fixture.create!(home_team: home_team, away_team: away_team)
  end
end
This will loop through the matches and create a new Fixture with away and home teams for each match.
Edit:
Added .text.strip
Edit 2:
This should get you the dates too,
FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"
def get_fixtures # Get me all Home and away Teams
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  days = doc.css('#fixtures-data h2').each do |h2_tag|
    date = Date.parse(h2_tag.text.strip)
    matches = h2_tag.xpath('following-sibling::*[1]').css('tr.preview')
    matches.each do |match|
      home_team = match.css('.team-home').text.strip
      away_team = match.css('.team-away').text.strip
      Fixture.create!(home_team: home_team, away_team: away_team, date: date)
    end
  end
end
It's a bit more complicated than the previous code because it uses XPath to select the HTML element that follows each h2 tag containing the date.
It loops through all the h2 tags in div#fixtures-data, then grabs the table directly after each h2.
I am trying to learn how to get data via a screen scrape and then save it to a model. So far I can grab the data. I know this because if I do:
puts home_team
I get all the home teams returned
get_match.rb #grabbing the data
require 'open-uri'
require 'nokogiri'
module MatchGrabber::GetMatch
  FIXTURE_URL = "http://www.bbc.co.uk/sport/football/premier-league/fixtures"

  def get_fixtures
    doc = Nokogiri::HTML(open(FIXTURE_URL))
    home_team = doc.css(".team-home.teams").text
  end
end
Then I want to update my model:
match_fixtures.rb
module MatchFixtures
  class MatchFixtures
    include MatchGrabber::GetMatch

    def perform
      update_fixtures
    end

    private

    def update_fixtures
      Fixture.destroy_all
      fixtures = get_fixtures
    end

    def update_db(matches)
      matches.each do |match|
        fixture = Fixture.new(
          home_team: match.first
        )
        fixture.save
      end
    end
  end
end
So the next step is where I am getting stuck. First of all I need to put the home_team results into an array?
Second part is I am passing matches through my update_db method but that's not correct, what do I pass through here, the results of the home_team from my update_fixtures method or the method itself?
To run the task I do:
namespace :grab do
task :fixtures => :environment do
MatchFixtures::MatchFixtures.new.perform
end
end
But nothing is saved, but that is to be expected.
Steep learning curve here and would appreciate a push in the right direction.
Calling css(".team-home.teams").text does not return the matching DOM elements as an array, but as a single string.
In order to obtain an array of elements, refactor get_fixtures into something like this:
def get_teams
  doc = Nokogiri::HTML(open(FIXTURE_URL))
  doc.css(".team-home.teams").map { |el| el.text.strip }
end
This will return an array containing the text of each element matching your selector, with leading and trailing whitespace and newline characters stripped. At this point you can loop over the returned array and pass each team as an argument to your model's create method:
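The difference between the two calls can be sketched without Nokogiri. In this plain-Ruby stand-in, node_texts is a made-up array that plays the role of the raw text of each matched element; joining it approximates what .text on the whole result set gives, while mapping gives one clean string per team.

```ruby
# Hypothetical raw text of three matched elements, whitespace included.
node_texts = ["\n  Arsenal \n", "\n  Chelsea \n", "\n  Everton \n"]

# Roughly what calling .text on the whole result set gives: one long string.
joined = node_texts.join

# What mapping over the elements gives: one stripped string per team.
teams = node_texts.map(&:strip)

puts joined.class
puts teams.inspect
```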
get_teams.each { |team| Fixture.create(home_team: team) }
You could just pass the array directly to the update method:
def update_fixtures
  Fixture.destroy_all
  update_db(get_fixtures)
end

def update_db(matches)
  # assumes get_fixtures returns an array of team name strings
  matches.each { |match| Fixture.create(home_team: match) }
end
Or do away with the method altogether:
def update_fixtures
  Fixture.destroy_all
  get_fixtures.each { |match| Fixture.create(home_team: match) }
end