Mechanize in Module, NameError 'agent' - ruby-on-rails

Looking for advice on how to fix this error and refactor this code to improve it.
require 'mechanize'
require 'pry'
require 'pp'
module Mymodule
class WebBot
agent = Mechanize.new { |agent|
agent.user_agent_alias = 'Windows Chrome'
}
def form(response)
require "addressable/uri"
require "addressable/template"
template = Addressable::Template.new("http://www.domain.com/{?query*}")
url = template.expand({"query" => response}).to_s
page = agent.get(url)
end
def get_products
products = []
page.search("datatable").search('tr').each do |row|
begin
product = row.search('td')[1].text
rescue => e
p e.message
end
products << product
end
products
end
end
end
Calling the module:
response = {size: "SM", color: "BLUE"}
t = Mymodule::WebBot.new
t.form(response)
t.get_products
Error:
NameError: undefined local variable or method `agent'

Ruby decides a variable's scope from its name. agent is a local variable in the class body, so it is not visible inside the methods. To make it visible to other methods you could make it a class variable by naming it @@agent, which is shared among all the objects of WebBot. The preferred way, though, is to make it an instance variable by naming it @agent, so that every object of WebBot has its own @agent. Set it in initialize, which is invoked when you create a new object with new:
class WebBot
def initialize
@agent = Mechanize.new do |a|
a.user_agent_alias = 'Windows Chrome'
end
end
.....
The same error will occur with page. You defined it in form as a local variable, so it is discarded when form finishes executing. You should make it an instance variable. Fortunately, you don't have to put it in initialize; you can define it in form, and the object will have its own @page after form is invoked. Do this in form:
def form(response)
require "addressable/uri"
require "addressable/template"
template = Addressable::Template.new("http://www.domain.com/{?query*}")
url = template.expand({"query" => response}).to_s
@page = @agent.get(url)
end
And remember to change every occurrence of page and agent to @page and @agent. In your get_products, for example:
def get_products
products = []
@page.search("datatable").search('tr').each do |row|
.....
These changes will resolve the name errors. Refactoring is another story btw.
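To see the scoping difference in isolation, here is a minimal sketch that runs without the mechanize gem. FakeAgent, BrokenBot and WorkingBot are hypothetical names made up for this illustration; FakeAgent only stands in for Mechanize.

```ruby
# Hypothetical stand-in for Mechanize so the sketch runs without the gem.
class FakeAgent
  def get(url)
    "page for #{url}"
  end
end

class BrokenBot
  agent = FakeAgent.new # local to the class body; methods cannot see it

  def fetch(url)
    agent.get(url) # raises NameError when called
  end
end

class WorkingBot
  attr_reader :page

  def initialize
    @agent = FakeAgent.new # instance variable, one per object
  end

  def fetch(url)
    @page = @agent.get(url) # @page stays available to later method calls
  end
end

begin
  BrokenBot.new.fetch("http://www.domain.com/")
rescue NameError => e
  puts "BrokenBot: #{e.class}"
end

bot = WorkingBot.new
bot.fetch("http://www.domain.com/")
puts bot.page
```

The same pattern should carry over to the real WebBot once FakeAgent is swapped back for Mechanize.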

Related

Rails + Sidekiq not recognizing class

I have a CsvImport service object in my app/services and I'm trying to call one of the class methods from within a Worker.
class InventoryUploadWorker
include Sidekiq::Worker
def perform(file_path, company_id)
CsvImport.csv_import(file_path, Company.find(company_id))
end
end
But it seems that the worker doesn't know what the class is. I've tried require 'csv_import', to no avail.
Here's where it breaks:
WARN: ArgumentError: undefined class/module CsvImport
The method being called in
csv_import.rb
class CsvImport
require "benchmark"
require 'csv'
def self.csv_import(filename, company)
time = Benchmark.measure do
File.open(filename) do |file|
headers = file.first
file.lazy.each_slice(150) do |lines|
Part.transaction do
inventory = []
insert_to_parts_db = []
rows = CSV.parse(lines.join, write_headers: true, headers: headers)
rows.map do |row|
part_match = Part.find_by(part_num: row['part_num'])
new_part = build_new_part(row['part_num'], row['description']) unless part_match
quantity = row['quantity'].to_i
row.delete('quantity')
row["condition"] = match_condition(row)
quantity.times do
part = InventoryPart.new(
part_num: row["part_num"],
description: row["description"],
condition: row["condition"],
serial_num: row["serial_num"],
company_id: company.id,
part_id: part_match ? part_match.id : new_part.id
)
inventory << part
end
end
InventoryPart.import inventory
end
end
end
end
puts time
end
Your requires are inside the class. Put them outside the class so they're required right away when the file is loaded, not when the class is loaded.
Instead of
class CsvImport
require "benchmark"
require 'csv'
...
Do this
require "benchmark"
require 'csv'
class CsvImport
...
Try adding this to application.rb:
config.autoload_paths += Dir["#{config.root}/app/services"]
More details here: autoload-paths

Gem to wrap API can't make API key setter work in all classes

I have a Ruby gem which wraps an API. I have two classes, Client and Season, plus a Configuration module. But changes to the API key and endpoint made via Client are not visible in the Season class.
My ApiWrapper module looks like this:
require "api_wrapper/version"
require 'api_wrapper/configuration'
require_relative "api_wrapper/client"
require_relative "api_wrapper/season"
module ApiWrapper
extend Configuration
end
My Configuration module looks like this:
module ApiWrapper
module Configuration
VALID_CONNECTION_KEYS = [:endpoint, :user_agent, :method].freeze
VALID_OPTIONS_KEYS = [:api_key, :format].freeze
VALID_CONFIG_KEYS = VALID_CONNECTION_KEYS + VALID_OPTIONS_KEYS
DEFAULT_ENDPOINT = 'http://defaulturl.com'
DEFAULT_METHOD = :get
DEFAULT_API_KEY = nil
DEFAULT_FORMAT = :json
attr_accessor *VALID_CONFIG_KEYS
def self.extended(base)
base.reset
end
def reset
self.endpoint = DEFAULT_ENDPOINT
self.method = DEFAULT_METHOD
self.user_agent = DEFAULT_USER_AGENT
self.api_key = DEFAULT_API_KEY
self.format = DEFAULT_FORMAT
end
def configure
yield self
end
def options
Hash[ * VALID_CONFIG_KEYS.map { |key| [key, send(key)] }.flatten ]
end
end # Configuration
end
My Client class looks like this:
module ApiWrapper
class Client
attr_accessor *Configuration::VALID_CONFIG_KEYS
def initialize(options={})
merged_options = ApiWrapper.options.merge(options)
Configuration::VALID_CONFIG_KEYS.each do |key|
send("#{key}=", merged_options[key])
end
end
end # Client
end
My Season class looks like this:
require 'faraday'
require 'json'
API_URL = "/seasons"
module ApiWrapper
class Season
attr_accessor *Configuration::VALID_CONFIG_KEYS
attr_reader :id
def initialize(attributes)
@id = attributes["_links"]["self"]["href"]
...
end
def self.all
puts ApiWrapper.api_key
puts ApiWrapper.endpoint
conn = Faraday.new
response = Faraday.get("#{ApiWrapper.endpoint}#{API_URL}/") do |request|
request.headers['X-Auth-Token'] = "ApiWrapper.api_key"
end
seasons = JSON.parse(response.body)
seasons.map { |attributes| new(attributes) }
end
end
end
This is the test I am running:
def test_it_gives_back_a_seasons
VCR.use_cassette("season") do
@config = {
:api_key => 'ak',
:endpoint => 'http://ep.com',
}
client = ApiWrapper::Client.new(@config)
result = ApiWrapper::Season.all
# Make sure we got all season data
assert_equal 12, result.length
#Make sure that the JSON was parsed
assert result.kind_of?(Array)
assert result.first.kind_of?(ApiWrapper::Season)
end
end
Because I set the api_key via the client to "ak" and the endpoint to "http://ep.com" I would expect puts in the Season class's self.all method to print out "ak" and "http://ep.com", but instead I get the defaults set in the Configuration section.
What am I doing wrong?
The api_key accessors you have on Client and on ApiWrapper are independent. You initialize a Client with the key you want, but then Season references ApiWrapper directly. You've declared api_key, etc. accessors in three places: ApiWrapper::Configuration, ApiWrapper (by extending Configuration) and Client. You should probably figure out what your use cases are and reduce that down to being in just one place to avoid confusion.
If you're going to have many clients with different API keys as you make different requests, you should inject the client into Season and use it instead of ApiWrapper. That might look like this:
def self.all(client)
puts client.api_key
puts client.endpoint
conn = Faraday.new
response = Faraday.get("#{client.endpoint}#{API_URL}/") do |request|
request.headers['X-Auth-Token'] = client.api_key
end
seasons = JSON.parse(response.body)
seasons.map { |attributes| new(attributes) }
end
Note that I also replaced the "ApiWrapper.api_key" string with the client.api_key - you don't want that to be a string anyway.
Having to pass client into every request you make is going to get old, so then you might want to pull out something like a SeasonQuery class to hold onto it.
If you're only ever going to have one api_key and endpoint for the duration of your execution, you don't really need the Client as you've set it up so far. Just set ApiWrapper.api_key directly and continue using it in Season.
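As a rough sketch of the SeasonQuery idea mentioned above: the query object takes the client once and reuses it for every request. Everything here is an assumption for illustration. Client is reduced to a Struct, and the HTTP call is an injected callable (where the real gem would wrap Faraday.get) so the sketch runs offline.

```ruby
require 'json'

# Hypothetical client: just holds the settings a request needs.
Client = Struct.new(:api_key, :endpoint)

# Hypothetical query object that keeps the client so callers pass it once.
class SeasonQuery
  API_URL = "/seasons"

  # http is any callable taking (url, headers) and returning a JSON string.
  # It is injected here only so the example needs no network access.
  def initialize(client, http)
    @client = client
    @http = http
  end

  def all
    headers = { "X-Auth-Token" => @client.api_key }
    body = @http.call("#{@client.endpoint}#{API_URL}/", headers)
    JSON.parse(body)
  end
end

# Fake transport that records what it was asked for.
requests = []
fake_http = lambda do |url, headers|
  requests << [url, headers]
  '[{"name": "2015"}]'
end

client = Client.new("ak", "http://ep.com")
seasons = SeasonQuery.new(client, fake_http).all
puts requests.first.inspect
puts seasons.inspect
```

The point of the design is that the key and endpoint used for the request come from the client that was passed in, not from the module-level defaults.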

Capybara + remote form request

I have a form that I'm testing using Capybara. This form's URL goes to my Braintree sandbox, although I suspect the problem would happen for any remote URL. When Capybara clicks the submit button for the form, the request is routed to the dummy application rather than the remote service.
Here's an example app that reproduces this issue: https://github.com/radar/capybara_remote. Run bundle exec ruby test/form_test.rb and the test will pass, which is not what I'd typically expect.
Why does this happen and is this behaviour that I can rely on always happening?
Mario Visic points out this description in the Capybara documentation:
Furthermore, you cannot use the RackTest driver to test a remote application, or to access remote URLs (e.g., redirects to external sites, external APIs, or OAuth services) that your application might interact with.
But I wanted to know why, so I source dived. Here are my findings:
lib/capybara/node/actions.rb
def click_button(locator)
find(:button, locator).click
end
I don't care about the find here because that's working. It's the click that's more interesting. That method is defined like this:
lib/capybara/node/element.rb
def click
wait_until { base.click }
end
I don't know what base is, but I see the method is defined twice more in lib/capybara/rack_test/node.rb and lib/capybara/selenium/node.rb. The tests are using Rack::Test and not Selenium, so it's probably the former:
lib/capybara/rack_test/node.rb
def click
if tag_name == 'a'
method = self["data-method"] if driver.options[:respect_data_method]
method ||= :get
driver.follow(method, self[:href].to_s)
elsif (tag_name == 'input' and %w(submit image).include?(type)) or
((tag_name == 'button') and type.nil? or type == "submit")
Capybara::RackTest::Form.new(driver, form).submit(self)
end
end
The tag_name is probably not a link -- because it's a button we're clicking -- so it falls to the elsif. It's definitely an input tag with type == "submit", so then let's see what Capybara::RackTest::Form does:
lib/capybara/rack_test/form.rb
def submit(button)
driver.submit(method, native['action'].to_s, params(button))
end
Ok then. driver is probably the Rack::Test driver for Capybara. What's that doing?
lib/capybara/rack_test/driver.rb
def submit(method, path, attributes)
browser.submit(method, path, attributes)
end
What is this mysterious browser? It's defined in the same file thankfully:
def browser
@browser ||= Capybara::RackTest::Browser.new(self)
end
Let's look at what this class's submit method does.
lib/capybara/rack_test/browser.rb
def submit(method, path, attributes)
path = request_path if not path or path.empty?
process_and_follow_redirects(method, path, attributes, {'HTTP_REFERER' => current_url})
end
process_and_follow_redirects does what it says on the box:
def process_and_follow_redirects(method, path, attributes = {}, env = {})
process(method, path, attributes, env)
5.times do
process(:get, last_response["Location"], {}, env) if last_response.redirect?
end
raise Capybara::InfiniteRedirectError, "redirected more than 5 times, check for infinite redirects." if last_response.redirect?
end
So does process:
def process(method, path, attributes = {}, env = {})
new_uri = URI.parse(path)
method.downcase! unless method.is_a? Symbol
if new_uri.host
@current_host = "#{new_uri.scheme}://#{new_uri.host}"
@current_host << ":#{new_uri.port}" if new_uri.port != new_uri.default_port
end
if new_uri.relative?
if path.start_with?('?')
path = request_path + path
elsif not path.start_with?('/')
path = request_path.sub(%r(/[^/]*$), '/') + path
end
path = current_host + path
end
reset_cache!
send(method, path, attributes, env.merge(options[:headers] || {}))
end
Time to break out the debugger and see what method is here. I stuck a binding.pry before the final line in that method, and a require 'pry' in the test. It turns out method is :post and, for interest's sake, new_uri is a URI object with our remote form's URL.
Where's this post method coming from? method(:post).source_location tells me:
["/Users/ryan/.rbenv/versions/1.9.3-p374/lib/ruby/1.9.1/forwardable.rb", 199]
That doesn't seem right... Does Capybara have a def post somewhere?
capybara (master)★ack "def post"
lib/capybara/rack_test/driver.rb
76: def post(*args, &block); browser.post(*args, &block); end
Cool. We know that browser is a Capybara::RackTest::Browser object. The class beginning gives the next hint:
class Capybara::RackTest::Browser
include ::Rack::Test::Methods
I know that Rack::Test::Methods comes with a post method. Time to dive into that gem.
lib/rack/test.rb
def post(uri, params = {}, env = {}, &block)
env = env_for(uri, env.merge(:method => "POST", :params => params))
process_request(uri, env, &block)
end
Ignoring env_for for the time being, what does process_request do?
lib/rack/test.rb
def process_request(uri, env)
uri = URI.parse(uri)
uri.host ||= @default_host
@rack_mock_session.request(uri, env)
if retry_with_digest_auth?(env)
auth_env = env.merge({
"HTTP_AUTHORIZATION" => digest_auth_header,
"rack-test.digest_auth_retry" => true
})
auth_env.delete('rack.request')
process_request(uri.path, auth_env)
else
yield last_response if block_given?
last_response
end
end
Hey, @rack_mock_session looks interesting. Where's that defined?
rack-test (master)★ack "@rack_mock_session ="
lib/rack/test.rb
40: @rack_mock_session = mock_session
42: @rack_mock_session = MockSession.new(mock_session)
In two places, very close to each other. What's on and around these lines?
def initialize(mock_session)
@headers = {}
if mock_session.is_a?(MockSession)
@rack_mock_session = mock_session
else
@rack_mock_session = MockSession.new(mock_session)
end
@default_host = @rack_mock_session.default_host
end
Ok then, so it ensures it is a MockSession object. What's MockSession and how is its request method defined?
def request(uri, env)
env["HTTP_COOKIE"] ||= cookie_jar.for(uri)
@last_request = Rack::Request.new(env)
status, headers, body = @app.call(@last_request.env)
headers["Referer"] = env["HTTP_REFERER"] || ""
@last_response = MockResponse.new(status, headers, body, env["rack.errors"].flush)
body.close if body.respond_to?(:close)
cookie_jar.merge(last_response.headers["Set-Cookie"], uri)
@after_request.each { |hook| hook.call }
if @last_response.respond_to?(:finish)
@last_response.finish
else
@last_response
end
end
I'm going to go right ahead here and assume @app is the Rack application stack. By calling the call method, the request is routed directly to this stack, rather than going out to the world.
I conclude that this behaviour looks like it's intentional and that I can indeed rely on it being that way.
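The end of that chain can be boiled down to a few lines. The sketch below is not Capybara's or rack-test's code; TinyMockSession is a hypothetical, heavily simplified stand-in for Rack::MockSession, and the remote URL is made up. It only illustrates that the URL is parsed and then handed to the local app's call method, with no socket ever opened.

```ruby
require 'uri'

# A minimal Rack-style app: any object responding to call(env).
app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["handled locally: #{env['PATH_INFO']}"]]
end

# Hypothetical simplified stand-in for Rack::MockSession.
class TinyMockSession
  def initialize(app)
    @app = app
  end

  # The URI's host only ends up in the env; it is never used to connect anywhere.
  def request(uri, env = {})
    uri = URI.parse(uri)
    env = env.merge("PATH_INFO" => uri.path, "SERVER_NAME" => uri.host)
    @app.call(env)
  end
end

session = TinyMockSession.new(app)
status, _headers, body = session.request("https://remote.example.com/transactions")
puts status
puts body.first
```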

Extending hash constant in another file

I've got a gem in which one of the classes looks something like this:
class Test
TESTING = {
:sth1 => 'foo',
:sth2 => 'bar'
}
# p Test.new.show
# should print 'cat'
def show
p TESTING[:sth3]
end
end
I extended it in another file:
# in other file
class Test
TESTING = {
:sth3 => 'cat'
}
end
But I need to use :sth3 in the first file, as the first part of the code shows.
Thanks in advance.
You didn't extend it, you replaced the hash with a new one. Here's how to fix it:
# in the other file
Test::TESTING[:sth3] = 'cat'
I recommend using methods with lazy initialization, so that you can arrange the assignments in any order:
class Test
def self.testing
@testing ||= {}
end
testing[:sth1] = 'foo'
testing[:sth2] = 'bar'
end
# in the other file
Test.testing[:sth3] = 'cat'
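Here is a small runnable sketch of the lazy-initialization approach, with a show method reading through the accessor. The class is renamed to Registry (a hypothetical name) only to keep the example self-contained; the keys and values are the ones from the question.

```ruby
# Hypothetical class name (Registry) used to keep the sketch self-contained.
class Registry
  # The hash is created on first access and reused afterwards.
  def self.testing
    @testing ||= {}
  end

  testing[:sth1] = 'foo'
  testing[:sth2] = 'bar'

  def show
    self.class.testing[:sth3]
  end
end

# Simulates the other file: adds a key without replacing the hash.
Registry.testing[:sth3] = 'cat'

puts Registry.new.show
puts Registry.testing.keys.inspect
```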

em-mongo examples?

Looking to use em-mongo for a text analyzer script which loads text from db, analyzes it, flags keywords and updates the db.
Would love to see some examples of em-mongo in action. The only one I could find was in the em-mongo repo on GitHub.
require 'em-mongo'
EM.run do
db = EM::Mongo::Connection.new.db('db')
collection = db.collection('test')
EM.next_tick do
doc = {"hello" => "world"}
id = collection.insert(doc)
collection.find('_id' => id) do |res|
puts res.inspect
EM.stop
end
collection.remove(doc)
end
end
You don't need the next_tick method; em-mongo does that for you. Define callbacks that are executed when the db actions are done. Here is a skeleton:
class NonBlockingFetcher
include MongoConfig
def initialize
configure
@connection = EM::Mongo::Connection.new(@server, @port)
@collection = init_collection(@connection)
end
def fetch(value)
mongo_cursor = @collection.find({KEY => value.to_s})
response = mongo_cursor.defer_as_a
response.callback do |documents|
# foo
# get one document
doc = documents.first
end
response.errback do |err|
# foo
end
end
end
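If the callback/errback flow in the skeleton is unfamiliar, the following sketch shows the general shape without EventMachine or a database. TinyDeferrable is a hypothetical, much simplified stand-in written for this illustration; it is not em-mongo's or EventMachine's implementation, and the document is made up.

```ruby
# Hypothetical simplified stand-in for a deferrable response object.
class TinyDeferrable
  def initialize
    @callbacks = []
    @errbacks = []
  end

  # Register a block to run when the operation succeeds.
  def callback(&block)
    @callbacks << block
    self
  end

  # Register a block to run when the operation fails.
  def errback(&block)
    @errbacks << block
    self
  end

  def succeed(*args)
    @callbacks.each { |cb| cb.call(*args) }
  end

  def fail(*args)
    @errbacks.each { |eb| eb.call(*args) }
  end
end

results = []
response = TinyDeferrable.new
response.callback { |documents| results << documents.first }
response.errback { |err| results << "error: #{err}" }

# Later, when the "database" answers, the registered callback fires.
response.succeed([{ "hello" => "world" }])
puts results.inspect
```

The calling code registers what should happen and returns immediately; the handlers run only once the result (or the error) arrives.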
