How to optimize adding new stock locations in Spree?

When I have a lot of products (3,000 products and 22,000 variants), adding a new stock location takes hours because Spree creates a stock item for every variant.
During this time the variants table is locked and the whole system is unusable. Is there a workaround for this, or has it been fixed in a newer version of Spree?
I am using Spree 2.0.3.

I faced the same problem: with >400K variants it was impossible to add a new stock location through Spree. So I wrote a Ruby script that writes an INSERT statement for all variants to a SQL file. The stock location must first be created without propagate_all_variants, so that Spree does not try to create the stock items itself.
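Creating the location itself can look like this (a sketch; it assumes your Spree version exposes the propagate_all_variants flag on StockLocation, which later 2.x releases do):

# Create the location but skip the per-variant stock item callback
location = Spree::StockLocation.create!(
  name: 'New warehouse',          # hypothetical name
  propagate_all_variants: false
)
puts location.id  # use this id as stock_location_id in the script below

Then the script that writes the SQL file: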
# lib/create_stock_items.rb
# Set this to the id of the stock location created above
stock_location_id = 1
begin
  file = File.open("stock_items.sql", "w")
rescue IOError => e
  puts e
  exit
end
file.write("INSERT INTO spree_stock_items (stock_location_id, variant_id, backorderable) VALUES \n")
variants = Spree::Variant.all.pluck(:id)
length = variants.count
variants.each_with_index do |variant, index|
  # Terminate the statement with ';' on the last row, ',' otherwise
  if index + 1 == length
    file.write("(#{stock_location_id}, #{variant}, false); \n")
  else
    file.write("(#{stock_location_id}, #{variant}, false), \n")
  end
end
file.close
Then run bundle exec rails runner lib/create_stock_items.rb -e production. This creates a stock_items.sql file in the Rails root path; finally, load that SQL directly into the DB (rails dbconsole).
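Loading the file from dbconsole looks roughly like this (the MySQL syntax is shown as an assumption; adjust for your database):

bundle exec rails dbconsole production
mysql> source stock_items.sql;

With PostgreSQL, the psql equivalent is \i stock_items.sql.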
I know it's a bit of a hack, but it was a very fast solution for me.


Rails console running out of memory

I need to download a blob from Azure (from a batch), perform some actions on the data, and re-upload to Azure.
I'm running a script to do that in the Rails console, and to my surprise, after running successfully for 10-20 iterations, it starts to crash on every single iteration, at the @client.get_blob line.
Is there any part of my process that could be using up memory in the console, or some sort of scratch disk used by the console? For example, the File.binwrite call?
@client = Azure::Storage::Blob::BlobService.create(
  storage_account_name: ENV.fetch('AZURE_STORAGE_ACCOUNT'),
  storage_access_key: ENV.fetch('AZURE_STORAGE_KEY'))
incorrect_files.each_with_index do |name, index|
  puts "starting #{index + start}: #{name}"
  # Get file data
  data = @client.get_blob('uploadedfiles', name)[1]
  # Create a temporary file
  Tempfile.create('tmpfile') do |temp|
    # Fix the file data and save to temp file
    File.binwrite(temp, fix(data))
    # Upload new file
    Uploader.upload(temp)
  end
end
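One generic mitigation to try (a sketch, not a verified fix for the Azure client): drop the reference to each downloaded blob once it has been uploaded and nudge the garbage collector periodically, so retained strings can't pile up:

incorrect_files.each_with_index do |name, index|
  puts "starting #{index + start}: #{name}"
  data = @client.get_blob('uploadedfiles', name)[1]
  Tempfile.create('tmpfile') do |temp|
    File.binwrite(temp, fix(data))
    Uploader.upload(temp)
  end
  data = nil                        # release the downloaded bytes
  GC.start if (index % 50).zero?    # arbitrary interval; tune as needed
end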

Ruby Rails Screen Scrape different results in Rails Console

I'm confused by a difference between Nokogiri commands run from the Rails console and the same commands run in a Rails helper.
In Rails Console, I am able to capture the data I want with these commands:
endpoint = "https://basketball-reference.com/leagues/BAA_1947_totals.html"
browser = Watir::Browser.new(:chrome)
browser.goto(endpoint)
@doc_season = Nokogiri::HTML.parse(URI.open("https://basketball-reference.com/leagues/BAA_1947_totals.html"))
player_season_table = @doc_season.css("tbody")
rows = player_season_table.css("tr")
rows.search('.thead').each(&:remove) #THIS WORKED
rows[0].at_css("td").try(:text) # Gets single player name
rows[0].at_css("a").attributes["href"].try(:value) # Gets that player page URL
However, here is my Rails helper, which is meant to fold those commands into methods:
module ScraperHelper
  def target_scrape(url)
    browser = Watir::Browser.new(:chrome)
    browser.goto(url)
    doc = Nokogiri::HTML.parse(browser.html)
  end

  def league_year_prefix(year, league = 'NBA')
    # aba_seasons = 1968..1976
    baa_seasons = 1947..1949
    baa_seasons.include?(year) ? "BAA_#{year}" : "#{league}_#{year}"
  end

  def players_total_of_season(year, league = 'NBA')
    # always the latter year of the season, first year is 1947 no quotes
    # ABA is 1968 to 1976
    league_year = league_year_prefix(year, league)
    @doc_season = target_scrape("http://basketball-reference.com/leagues/#{league_year}_totals.html")
  end

  def gather_players_from_season
    player_season_table = @doc_season.css("tbody")
    rows = player_season_table.css("tr")
    rows.search('.thead').each(&:remove)
    puts rows[0].at_css("td").try(:text)
    puts rows[0].at_css("a").attributes["href"].try(:value)
  end
end
In that module I try to emulate the Rails console commands, broken into methods. To test it out (since I don't have any other functionality or views built yet), I open the Rails console, include this helper, and run the methods.
But I get wildly different results.
In the gather_players_from_season method, I can see that
player_season_table = @doc_season.css("tbody")
is no longer grabbing the same data it grabbed when run line by line. It also doesn't like the attributes method here:
puts rows[0].at_css("a").attributes["href"].try(:value)
So my first thought is a difference in gems, maybe? Watir is launching the headless browser. Nokogiri isn't raising errors as near as I can tell.
Your first thought of comparing the gem versions is a good idea, but I am noticing a difference between the two pieces of code:
In the Rails console, the HTML is fetched and parsed with URI.open: Nokogiri::HTML.parse(URI.open(url))
In the ScraperHelper code, there is no URI.open; the parser is fed Watir's rendered page instead: Nokogiri::HTML.parse(browser.html)
That difference can return different documents and make the rest of the ScraperHelper produce unexpected results.
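One way to make the two match (a sketch; it assumes the table is built by JavaScript, which would explain why the Watir-rendered HTML and the raw URI.open response differ):

# In the console, fetch the page the same way the helper does,
# through Watir's rendered HTML (URL taken from the question):
browser = Watir::Browser.new(:chrome)
browser.goto("https://basketball-reference.com/leagues/BAA_1947_totals.html")
@doc_season = Nokogiri::HTML.parse(browser.html)

# ...or change target_scrape to fetch the raw document the way the
# console experiment did:
require 'open-uri'
def target_scrape(url)
  Nokogiri::HTML.parse(URI.open(url))
end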

Rails: Pathname issue with WSL

Hey, we are 3 students and we all use the same rails db:seed. Our project is properly git pulled and coordinated, but...
One uses Linux: rails db:seed works for him.
One uses a Mac: rails db:seed works for him too.
Me, I use WSL, and it doesn't work!
I've tried both Windows and WSL paths.
Thanks if anyone can guide me!
db/seeds.rb
# This file should contain all the record creation needed to seed the database with its default values.
# The data can then be loaded with the bin/rails db:seed command (or created alongside the database with db:setup).
#
# Examples:
#
# movies = Movie.create([{ name: 'Star Wars' }, { name: 'Lord of the Rings' }])
# Character.create(name: 'Luke', movie: movies.first)
require 'faker'

RealEstate.destroy_all
User.destroy_all
Category.destroy_all

Category.create(title: "House")
Category.create(title: "Flat")

10.times do
  u = User.create(email: Faker::Internet.email, password: Faker::Internet.password)
end

30.times do
  re = RealEstate.create(
    title: Faker::Space.galaxy,
    description: Faker::Lorem.paragraph_by_chars(number: 256),
    address: Faker::Address.full_address,
    location: Faker::Address.city,
    price: Faker::Number.number(digits: 8),
    user: User.all.sample,
    category: Category.all.sample
  )
  re.images.attach(io: File.open(ENV['SAMPLE_IMAGES']), filename: 'sample_image')
end

puts "%" * 50
puts " Base de données remplie !"
puts "%" * 50
Resolved!
I actually went to the folder where the file was stored and simply typed pwd:
So, in my case the pathname was '/home/pedrofromperu/next/images/indian.jpg'.
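With that absolute path, the seed's attach call resolves; for example (a sketch using the path above and the SAMPLE_IMAGES variable the seed already reads):

export SAMPLE_IMAGES='/home/pedrofromperu/next/images/indian.jpg'
bin/rails db:seed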
This issue you are experiencing happens to lots of people who switch between Unix-like and Windows operating systems. Paths in Windows are written with '\' instead of '/'. This is particularly confusing when using WSL and PowerShell in Windows Terminal: you have to keep track of which shell environment you are using. Congrats on using 'print working directory', pwd, to solve the problem.
One thing you could do if you change environments a lot (and this is just one approach) is to use the OS gem like so:
require 'os'
OS.windows? # returns true or false.
You could then either provide different paths, or get fancy and replace the characters in your string.
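For example, picking a per-platform path for the seed's sample image (a sketch; both paths are hypothetical):

require 'os'

sample_image = if OS.windows?
  'C:/Users/pedro/next/images/indian.jpg'      # hypothetical Windows path
else
  '/home/pedrofromperu/next/images/indian.jpg'
end
ENV['SAMPLE_IMAGES'] ||= sample_image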

Rails+resque background job import not adding anything to the database

I have an issue with importing a lot of records from a user-provided Excel file into a database. The logic for this is working fine, and I'm using activerecord-import to cut down on the number of database calls. However, when a file is too large, the processing takes too long and Heroku returns a timeout. Solution: Resque, and moving the processing to a background job.
So far, so good. I’ve needed to add CarrierWave to upload the files to S3 because I can’t just hold the file in memory for the background job. The upload portion is also working fine, I created a model for them and am passing the IDs through to the queued job to retrieve the file later as I understand I can’t pass a whole ActiveRecord object through to the job.
I’ve installed Resque and Redis locally, and everything seems to be setup correctly in that regard. I can see the jobs I’m creating being queued and then run without failing. The job seems to run fine, but no records are added to the database. If I run the code from my job line by line in the console, the records are added to the database as I would expect. But when the queued jobs I’m creating run, nothing happens.
I can’t quite work out where the problem might be.
Here’s my upload controller’s create action:
def create
  @upload = Upload.new(upload_params)
  if @upload.save
    Resque.enqueue(ExcelImportJob, @upload.id)
    flash[:info] = 'File uploaded. Data will be processed and added to the database.'
    redirect_to root_path
  else
    flash[:warning] = 'Upload failed. Please try again.'
    render :new
  end
end
This is a simplified version of the job with fewer sheet columns for clarity:
class ExcelImportJob < ApplicationJob
  @queue = :default

  def perform(upload_id)
    file = Upload.find(upload_id).file.file.file
    data = parse_excel(file)
    if header_matches? data
      # Create a database entry for each row, ignoring the first header row,
      # using activerecord-import
      sales = []
      data.drop(1).each_with_index do |row, index|
        sales << Sale.new(row)
        if index % 2500 == 0
          Sale.import sales
          sales = []
        end
      end
      Sale.import sales
    end
  end

  def parse_excel(upload)
    # Open the uploaded excel document
    doc = Creek::Book.new upload
    # Map rows to the hash keys from the database
    doc.sheets.first.rows.map do |row|
      { date: row.values[0],
        title: row.values[1],
        author: row.values[2],
        isbn: row.values[3],
        release_date: row.values[5],
        units_sold: row.values[6],
        units_refunded: row.values[7],
        net_units_sold: row.values[8],
        payment_amount: row.values[9],
        payment_amount_currency: row.values[10] }
    end
  end

  # Returns true if header matches the expected format
  def header_matches?(data)
    data.first == { date: 'Date',
                    title: 'Title',
                    author: 'Author',
                    isbn: 'ISBN',
                    release_date: 'Release Date',
                    units_sold: 'Units Sold',
                    units_refunded: 'Units Refunded',
                    net_units_sold: 'Net Units Sold',
                    payment_amount: 'Payment Amount',
                    payment_amount_currency: 'Payment Amount Currency' }
  end
end
I can probably have some improved logic anyway as right now I’m holding the whole file in memory, but that isn’t the issue I’m having – even with a small file that has only 500 or so rows, the job doesn’t add anything to the database.
Like I said my code worked fine when I wasn’t using a background job, and still works if I run it in the console. But for some reason the job is doing nothing.
This is my first time using Resque so I don’t know if I’m missing something obvious? I did create a worker and as I said it does seem to run the job. Here’s the output from Resque’s verbose formatter:
*** resque-1.27.4: Waiting for default
*** Checking default
*** Found job on default
*** resque-1.27.4: Processing default since 1508342426 [ExcelImportJob]
*** got: (Job{default} | ExcelImportJob | [15])
*** Running before_fork hooks with [(Job{default} | ExcelImportJob | [15])]
*** resque-1.27.4: Forked 63706 at 1508342426
*** Running after_fork hooks with [(Job{default} | ExcelImportJob | [15])]
*** done: (Job{default} | ExcelImportJob | [15])
In the Resque dashboard the jobs aren’t logged as failed. They get executed and I can see an increment in the ‘processed’ jobs on the stats page. But as I say the DB remains untouched. What’s going on? How can I debug the job more clearly? Is there a way to get into it with Pry?
It looks like my problem was with Resque.enqueue(ExcelImportJob, @upload.id).
I changed my code to ExcelImportJob.perform_later(@upload.id) and now my code actually runs!
I also added a resque.rake task to lib/tasks as described here: http://bica.co/2015/01/20/active-job-resque/.
That link also shows how to use rails runner to trigger the job without running the full Rails server, which is useful for debugging.
Strangely, I didn't quite manage to get the job to print anything to STDOUT as suggested by @hoffm, but at least it led me down a good avenue of inquiry.
I still don't fully understand why calling Resque.enqueue added my jobs to the queue and indeed seemed to run them, yet the code was never executed; if someone has a better grasp and an explanation, that would be much appreciated.
TL;DR: calling perform_later rather than Resque.enqueue fixed the problem, but I don't know why.
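A plausible explanation (my assumption from how the two libraries enqueue jobs, not something verified in this thread): a plain Resque job exposes a class-level self.perform that the worker calls directly, whereas an ActiveJob subclass defines an instance-level perform and has to be enqueued with perform_later so the adapter wraps it for the worker. Enqueuing the ActiveJob class directly with Resque.enqueue makes the worker call perform at the class level, which never dispatches to the instance-level perform. A minimal sketch of both shapes:

# Plain Resque job: the worker calls the class method self.perform
class LegacyExcelImportJob
  @queue = :default

  def self.perform(upload_id)
    # ... import logic ...
  end
end
Resque.enqueue(LegacyExcelImportJob, upload_id)

# ActiveJob job: instance-level perform, enqueued via perform_later,
# which wraps it so the Resque worker knows how to invoke it
class ExcelImportJob < ApplicationJob
  queue_as :default

  def perform(upload_id)
    # ... import logic ...
  end
end
ExcelImportJob.perform_later(upload_id)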

Precompile / preload cached fragments in Rails

Using Rails 3.1.1 and Heroku.
I have 1,000 products in my app. They are all served by a very slow controller action, which is effectively mitigated by fragment caching. Although the data doesn't change very often, the cache still needs to expire (which I do by sweeping) periodically, in my case once a week.
Now, after sweeping the cached views, I don't want my users to be the ones creating the new fragments by accessing the products one after another (it takes about 6-8 seconds on the first load, 2-3 seconds for the cached load). I assume I can do that myself with some sort of script that loads each product page one by one, making the server generate those fragments.
I can imagine this being handled in three ways:
1. Run a script on my local machine that accesses each URL with some sort of GET request. Downside: not very pretty, and it will affect visitor stats in a way I would not prefer.
2. Run the same kind of script on the server after the sweeper, loading each product. How can I do that, in that case?
3. Use a smart Rails command to do this automatically. Is there such an elegant command?
I made this script and it works. The product.slug is there because I have friendly_id installed; it produces URLs such as www.mydomain.com/productabc-123/, which are then fetched and parsed by Nokogiri (the Nokogiri gem is needed for this solution).
PLEASE NOTE THAT I SWITCHED FROM FRAGMENT CACHING TO ACTION CACHING IN THIS SOLUTION (as opposed to the question, where I am using fragment caching). The important difference for this is the cache check Rails.cache.exist?('views/www.mydomain.com/' + product.slug); for fragment caching, the fragment name should go there instead.
require 'nokogiri'
require 'open-uri'

Product.all.each do |product|
  url = 'http://www.mydomain.com/' + product.slug
  begin
    if Rails.cache.exist?('views/www.mydomain.com/' + product.slug)
      puts url + " is already in cache"
    else
      doc = Nokogiri::HTML(open(url))
      puts "Reads " + url
      # Verifies that the caching worked. Only for troubleshooting
      if Rails.cache.exist?('views/www.mydomain.com/' + product.slug)
        puts "--->" + url + " is NOW in the cache"
      else
        puts "--->" + url + " is still not in the cache!"
      end
      sleep 1
    end
  rescue Timeout::Error
    # Rescue the more specific error first, then back off and retry
    puts 'Timeout rescue of ' + url
    puts 'Sleep for 5 sec'
    sleep 5
    retry
  rescue
    puts 'Normal rescue of ' + url
  end
end
Create a script that runs as a rake task, or better yet a worker, that curls each page. There is no need to include a gem when you can just call curl:
`curl -A "CacheRefresher" #{ENV['HOSTNAME']}/api/v1/#{klass.name.underscore.pluralize}/#{id} >/dev/null 2>&1`
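Wrapped in a rake task, it might look like this (a sketch; the task name, the slug-based product route from the earlier answer, and the HOSTNAME variable are assumptions):

# lib/tasks/cache_warm.rake
namespace :cache do
  desc 'Warm the cache by curling every product page'
  task warm_products: :environment do
    Product.find_each do |product|
      `curl -A "CacheRefresher" #{ENV['HOSTNAME']}/#{product.slug} >/dev/null 2>&1`
    end
  end
end

Run it with bundle exec rake cache:warm_products, e.g. from the sweeper or a scheduled job.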
