Rails scraping errors - ruby-on-rails

First, the script raises a timeout error. I have set Watir.default_timeout = 900 and also tried changing the timeout in the WEBrick config file, but it still doesn't work.
The error is:
Net::ReadTimeout
Second, for the next error I even tried changing the port number, but it still doesn't work:
Errno::ECONNREFUSED (Connection refused - connect(2) for "127.0.0.1" port 70 55):
I want to list every seller's name and price, but the script lists only 2 sellers. I want them all. Here is my code:
require 'selenium-webdriver'
require 'phantomjs'
require 'watir'
browser = Watir::Browser.new: chrome
browser.window.maximize
browser.goto "url"
browser.div(: class => 'sellCont').uls.each do |list |
puts list.lis.first.text# For dealer name
puts list.li(: class => 'price')# For price
end
browser.close

Do you get all the required data when you first load the given URL, without clicking any link or button in the browser?

You said the example matches your code one-to-one. That is strange, because it does not look like valid Ruby. This code works in my console (irb):
require 'selenium-webdriver'
require 'watir-webdriver'
browser = Watir::Browser.new :chrome
browser.window.maximize
browser.goto "https://paytm.com/shop/p/gionee-e7-mini-black-MOBGIONEE-E7-MIHAPP44414CBBDB36C?psearch=organic%7Cundefined%7Cgionee%20e7%7Cgrid"
browser.div(:class => 'sellCont').uls.each do |list|
puts list.lis.first.text # For dealer name
puts list.li(:class => 'price').text # For price
end
browser.close
Note that the given example does not use phantomjs (you may need it elsewhere), so I removed that require. I also require watir-webdriver rather than watir, simply because I copied the line from one of my projects.
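Neither post explains the Net::ReadTimeout or Errno::ECONNREFUSED errors. If they turn out to be intermittent, one option (not suggested in the thread) is to retry the failing call a bounded number of times. A minimal sketch in plain Ruby; with_retries, attempts and wait are hypothetical names, and the block would wrap a call such as browser.goto:

```ruby
require 'net/http' # defines Net::ReadTimeout

# Hypothetical helper, not part of Watir or Selenium: run the block and
# retry it when the connection times out or is refused.
# The block receives the current attempt number (1-based).
def with_retries(attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Net::ReadTimeout, Errno::ECONNREFUSED
    raise if tries >= attempts # give up and re-raise the last error
    sleep wait
    retry
  end
end
```

In the script above this might be used as with_retries(attempts: 3, wait: 2) { browser.goto "url" }. An error that occurs on every attempt is still raised after the last try, so this only helps with transient failures.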

Related

Getting " wrong argument type Fixnum (expected String) (TypeError)" while running the Cucumber tests whenever it encounters the Capybara function

I am trying to use Capybara in my feature tests, but I keep getting the above error. However, my tests work when no Capybara functions are involved.
Here are the settings in my env.rb:
Capybara.server_host = 45454
#Capybara.server_host = host
Capybara.app_host = 'http://localhost:45454'
Capybara.default_driver = :poltergeist
The PATH variable is also set for PhantomJS.
The following is the step definitions file where I am facing the issue:
Given(/^I navigate to home page$/) do
visit '/'
end
And /^I take screenshot$/ do
page.save_screenshot
end
Following is the feature file
Scenario: To validate the page shows up
Given I navigate to home page
And I take screenshot
Here is the output:
Scenario: To validate the page shows up # features/home.feature:8
Given I navigate to home page # features/step_definitions/home_steps.rb:8
wrong argument type Fixnum (expected String) (TypeError)
./features/step_definitions/home_steps.rb:9:in `/^I navigate to home page$/'
features/home.feature:9:in `Given I navigate to home page'
And I take screenshot # features/step_definitions/home_steps.rb:12
wrong argument type Fixnum (expected String) (TypeError)
Failing Scenarios:
cucumber features/home.feature:8 # Scenario: To validate the page shows up
1 scenario (1 failed)
2 steps (1 failed, 1 skipped)
0m0.649s
Capybara.server_host needs to be the hostname or IP address of an interface that Capybara can bind the application under test (AUT) to, not a number.
You're probably trying to set the port, which would be
Capybara.server_port = 45454
and then judging by your setting of app_host (which probably isn't necessary) you also want to be setting
Capybara.server_host = 'localhost'
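The distinction between the two settings can be illustrated in plain Ruby. This is only a sketch: build_app_host is a hypothetical helper, not part of Capybara, and it simply checks that the host is a string and the port is an integer before building the URL that app_host would point at:

```ruby
# Hypothetical helper, not a Capybara API: the host is a name or IP
# address given as a String, the port is an Integer.
def build_app_host(host:, port:)
  raise TypeError, "host must be a String, got #{host.class}" unless host.is_a?(String)
  raise TypeError, "port must be an Integer, got #{port.class}" unless port.is_a?(Integer)

  "http://#{host}:#{port}"
end

puts build_app_host(host: 'localhost', port: 45454)
```

Passing host: 45454, as in the original env.rb, raises a TypeError here, which mirrors the kind of mistake described in the answer.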

Ruby Watir Gem, Timing Out on Form Input

I'm practicing web scraping using the Watir, Mechanize and Nokogiri gems.
I'm running into an issue with my Watir script. My plan is to get a list of prices from flights via http://tripadvisor.com/. When I run the script, the Chrome browser opens as it should, the script proceeds to fill out the first parts of the form, origin and destination and then it halts. Here is the error message I'm getting:
This code has slept for the duration of the default timeout waiting for an Element to be present. If the test is still passing, consider using Element#exists? instead of rescuing UnknownObjectException
/home/jaffejoe/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/watir-6.2.0/lib/watir/elements/element.rb:515:in `rescue in wait_for_present': element located, but timed out after 30 seconds, waiting for true condition on #<Watir::Input: located: true; {:name=>"rt_leaveday", :tag_name=>"input"}> (Watir::Exception::UnknownObjectException)
from /home/jaffejoe/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/watir-6.2.0/lib/watir/elements/element.rb:505:in `wait_for_present'
from /home/jaffejoe/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/watir-6.2.0/lib/watir/elements/element.rb:522:in `wait_for_enabled'
from /home/jaffejoe/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/watir-6.2.0/lib/watir/elements/element.rb:534:in `wait_for_writable'
from /home/jaffejoe/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/watir-6.2.0/lib/watir/elements/element.rb:639:in `element_call'
from /home/jaffejoe/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/watir-6.2.0/lib/watir/elements/element.rb:303:in `send_keys'
from watir_test.rb:8:in `<main>'
Here is my code:
require 'watir'
browser = Watir::Browser.new
browser.goto('https://tripadvisor.com/CheapFlightsHome')
browser.input(name: 'orig').send_keys('Boston, MA - Logan International Airport (BOS)')
browser.input(name: 'dest').send_keys('Milan, Italy - All Airports (MIL)')
browser.input(name: 'rt_leaveday').send_keys('1')
browser.input(name: 'rt_leavemonth').send_keys('06/2017')
browser.input(name: 'retday').send_keys('30')
browser.input(name: 'leavemonth').send_keys('06/2017')
browser.input(value: 'Search Flights').click
puts browser.url
browser.quit
Watir can't set a value on rt_leaveday or rt_leavemonth because they are hidden inputs. But you can execute a script to click on the date selector:
require 'watir'
browser = Watir::Browser.new
browser.goto('https://tripadvisor.com/CheapFlightsHome')
browser.text_field(name: 'orig').set('Boston, MA - Logan International Airport (BOS)')
browser.text_field(name: 'dest').set('Milan, Italy - All Airports (MIL)')
browser.execute_script('document.querySelector(".in_date").click()')
browser.execute_script('document.querySelector(".day_28").click()')
browser.execute_script('document.querySelector(".out_date").click()')
browser.execute_script('document.querySelector(".day_2").click()')
browser.span(id: "CHECK_FARES_BUTTON").fire_event :click
puts browser.url
browser.quit
=> https://www.tripadvisor.com/CheapFlightsSearchResults-g187849-a_airport0.BOS-a_airport1.MIL-a_cos.0-a_date0.20170328-a_date1.20170402-a_nearby0.no-a_nearby1.no-a_nonstop.no-a_pax0.a-a_travelers.1-Milan_Lombardy.html

Ruby Watir Gem, getting unexpected results

Two weeks ago I posted about my Watir script timing out on me. I was able to get a solution, but I realized too late that the results I was getting were different from those of the person who helped me. Here is the original post: Ruby Watir Gem, Timing Out on Form Input
require 'watir'
browser = Watir::Browser.new
browser.goto('https://tripadvisor.com/CheapFlightsHome')
browser.text_field(name: 'orig').set('Boston, MA - Logan International Airport (BOS)')
browser.text_field(name: 'dest').set('Milan, Italy - All Airports (MIL)')
browser.execute_script('document.querySelector(".in_date").click()')
browser.execute_script('document.querySelector(".day_28").click()')
browser.execute_script('document.querySelector(".out_date").click()')
browser.execute_script('document.querySelector(".day_2").click()')
browser.span(id: "CHECK_FARES_BUTTON").fire_event :click
puts browser.url
browser.quit
The person who wrote that code got this as a result:
https://www.tripadvisor.com/CheapFlightsSearchResults-g187849-a_airport0.BOS-a_airport1.MIL-a_cos.0-a_date0.20170328-a_date1.20170402-a_nearby0.no-a_nearby1.no-a_nonstop.no-a_pax0.a-a_travelers.1-Milan_Lombardy.html
I have the same code in my script and for some reason I'm only getting:
https://www.tripadvisor.com/CheapFlightsHome
It seems as though the button click isn't happening for me, not sure. I tried both chrome and firefox.
First of all, your click actually opens two more windows, and at the moment you click, the button is not reliably receiving the click. Please use this code; it will work for you, and you will be left with your expected window:
require 'watir'
caps = Selenium::WebDriver::Remote::Capabilities.firefox(marionette: false)
driver=Selenium::WebDriver.for :firefox, desired_capabilities: caps, profile: "default"
b=Watir::Browser.new driver
b.goto('https://tripadvisor.com/CheapFlightsHome')
b.text_field(name: 'orig').set('Boston, MA - Logan International Airport (BOS)')
b.text_field(name: 'dest').set('Milan, Italy - All Airports (MIL)')
b.execute_script('document.querySelector(".in_date").click()')
b.execute_script('document.querySelector(".day_28").click()')
b.execute_script('document.querySelector(".out_date").click()')
b.execute_script('document.querySelector(".day_2").click()')
begin
b.element(xpath: ".//*[@id='CHECK_FARES_BUTTON']").click
end until b.windows.count>1
b.windows[0].close
b.windows[1].close
puts b.url
b.quit
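The begin ... end until loop above has no upper bound, so it would spin forever if the new window never opened. A minimal sketch of a bounded version in plain Ruby; repeat_until is a hypothetical helper, and the two lambdas stand in for the Watir calls (the click and the window-count check):

```ruby
require 'timeout' # defines Timeout::Error

# Hypothetical helper: call `action` repeatedly until `done` returns true,
# giving up with Timeout::Error once `timeout` seconds have passed.
def repeat_until(action:, done:, timeout: 10, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    action.call
    return true if done.call
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise Timeout::Error, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# With Watir this might be called as follows (assumption, not run here):
#   repeat_until(action: -> { b.span(id: 'CHECK_FARES_BUTTON').click },
#                done:   -> { b.windows.count > 1 })
```

The action always runs at least once, matching the behavior of the original loop.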
Firstly, I changed .fire_event :click to .click.
Then there was an error that appeared.
Please enter a valid airport code or city.
Secondly, I tried this:
browser.span(id: "CHECK_FARES_BUTTON").click
browser.span(id: "CHECK_FARES_BUTTON").click
puts browser.url
And it redirected me to:
https://www.tripadvisor.com/CheapFlightsSearchResults-g187849-a_airport0.BOS-a_airport1.MIL-a_cos.0-a_date0.20170401-a_date1.20170402-a_nearby0.no-a_nearby1.no-a_nonstop.no-a_pax0.a-a_travelers.1-Milan_Lombardy.html
To be honest I have no idea why it does not register the input during the first click...
I am using chrome.
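The result URLs quoted in this thread encode the search in hyphen-separated a_key.value segments, which makes it possible to check from a script whether the click went through and which dates were actually picked. A minimal sketch using only the standard library; search_submitted? and search_params are hypothetical helpers, and the segment format is inferred from the URLs above rather than from any documented TripAdvisor scheme:

```ruby
require 'uri'

# True when the browser has left the form and reached a results page.
def search_submitted?(url)
  URI.parse(url).path.include?('CheapFlightsSearchResults')
end

# Collect the "a_key.value" path segments into a Hash of strings.
def search_params(url)
  URI.parse(url).path.split('-')
     .select { |segment| segment.start_with?('a_') }
     .to_h { |segment| segment.delete_prefix('a_').split('.', 2) }
end

url = 'https://www.tripadvisor.com/CheapFlightsSearchResults-g187849-a_airport0.BOS-a_airport1.MIL-a_cos.0-a_date0.20170401-a_date1.20170402-a_nearby0.no-a_nearby1.no-a_nonstop.no-a_pax0.a-a_travelers.1-Milan_Lombardy.html'
params = search_params(url)
puts search_submitted?(url)
puts params.values_at('airport0', 'airport1', 'date0', 'date1').join(' ')
```

Called with browser.url at the end of the script, search_submitted? would return false for https://www.tripadvisor.com/CheapFlightsHome, which is the symptom described in the question.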

Geocoding API not responding fast enough

I am using the Geocoder gem but lately it does not seem to work.
I get this error:
Geocoding API not responding fast enough (use Geocoder.configure(:timeout => ...) to set limit).
My application_controller.rb is:
before_filter :lookup_ip_location
private
def lookup_ip_location
if Rails.env.development?
prr = Geocoder.search(request.remote_ip).first
p "this is #{prr.inspect}"
else
request.location
end
end
This is development.rb:
# test geocoder gem locally
class ActionDispatch::Request
def remote_ip
"71.212.123.5" # ipd home (Denver,CO or Renton,WA)
# "208.87.35.103" # websiteuk.com -- Nassau, Bahamas
# "50.78.167.161" # HOL Seattle, WA
end
end
I am loading an IP address from development.rb to check whether Geocoder works locally, but it does not. I am getting the above error.
Also, when printing prr I get nil.
I also added a geocoder.rb initializer to raise the timeout to 15 seconds but even after 15 seconds of the browser loading the page I'm still getting the same message.
Is it broken? Should I use another gem? If so, do you have any suggestions?
Interesting. I tried your exact methods, and was running into the same problems. I also tried bumping the timeout up to 60 seconds, and same error.
Then I noticed Geocoder uses freegeoip. So I went to see what that was all about. Lo and behold, freegeoip.net is down. Suspicious.
So I checked the Geocoder documentation for any different ip address lookup services they offer. Sure enough, under "Ip Address Services", there are multiple offers. I tried the first one that does not require an API key, which was :ipinfo_io.
[18] pry(main)> Geocoder.configure(ip_lookup: :ipinfo_io)
=> {:timeout=>30,
:lookup=>:google,
:ip_lookup=>:ipinfo_io,
:language=>:en,
:http_headers=>{},
:use_https=>false,
:http_proxy=>nil,
:https_proxy=>nil,
:api_key=>nil,
:cache=>nil,
:cache_prefix=>"geocoder:",
:basic_auth=>{},
:logger=>:kernel,
:kernel_logger_level=>2,
:always_raise=>[],
:units=>:mi,
:distances=>:linear}
[19] pry(main)> Geocoder.search("144.138.175.101")
=> [#<Geocoder::Result::IpinfoIo:0x007fce5da5fe28 @cache_hit=nil, @data={"ip"=>"144.138.175.101", "city"=>"", "region"=>"", "country"=>"AU", "loc"=>"-27.0000,133.0000"}>]
And it works! But the response doesn't have much info. I would recommend looking at the other ip lookup services that Geocoder uses. Find one that is reliable and has enough response info for your needs. Seems that freegeoip is free, but can also be unreliable. Cheers.
EDIT: Found some related information about freegeoip.net here. If you really wish to use freegeoip, looks like you can run your own instance. Hope this helps!
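Since a single IP lookup service can go down, as freegeoip.net did here, one option is to try several services in order and keep the first non-empty answer. A minimal sketch in plain Ruby; first_successful_lookup is a hypothetical helper, and the lambdas are stand-ins for real lookups (in an application each one would presumably wrap a Geocoder call configured for a different service):

```ruby
# Hypothetical helper: try each lookup in order and return
# [service_name, results] for the first one that answers with data.
# Services that raise or return nothing are skipped.
def first_successful_lookup(ip, lookups)
  lookups.each do |name, lookup|
    begin
      results = lookup.call(ip)
      return [name, results] unless results.nil? || results.empty?
    rescue StandardError => e
      warn "#{name} lookup failed: #{e.message}"
    end
  end
  nil
end

# Stand-in lookups for illustration only; no network calls are made.
lookups = {
  freegeoip: ->(_ip) { raise IOError, 'service down' },
  ipinfo_io: ->(ip) { [{ 'ip' => ip, 'country' => 'AU' }] }
}

name, results = first_successful_lookup('144.138.175.101', lookups)
puts "#{name}: #{results.first['country']}"
```

The helper returns nil when every service fails, so the caller can decide whether a missing location is acceptable.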

Cucumber tests suddenly stop

I have a feature like this:
Feature: Searching chats
In order to find chats
As an user
I want to find different chats by username or ad name
Background:
Given System prepares for chats
And There is a few machines with names:
| machine_1 |
| machine_2 |
| machine_3 |
And There is a few services with names:
| service_1 |
| service_2 |
| service_3 |
And I have chats with ads owners
Scenario: Searching chats
When I am logged in as a "user"
And I go to chats page # <- stops here
Then I should see search results when I fill form with:
| input | results |
| ma | machine_1, machine_2, machine_3 |
| se | service_1, service_2, service_3 |
When I start cucumber feature (or scenario), it suddenly stops at step "And I go to chats page" without any error message. Result looks like:
[alex@MacBookPro ~/my_project | master]$ cucumber features/chat/search.feature
Using the default profile...
@javascript
Feature: Searching chats
In order to find chats
As an user
I want to find different chats by username or ad name
Background: # features/chat/search.feature:8
Given System prepares for chats # features/step_definitions/chats.steps.rb:11
And There is a few machines with names: # features/step_definitions/machine.steps.rb:10
| machine_1 |
| machine_2 |
| machine_3 |
And There is a few services with names: # features/step_definitions/service.steps.rb:144
| service_1 |
| service_2 |
| service_3 |
And I have chats with ads owners # features/step_definitions/chats.steps.rb:5
Scenario: Searching chats # features/chat/search.feature:20
When I am logged in as a "user" # features/step_definitions/user.steps.rb:68
And I go to chats page # features/step_definitions/chats.steps.rb:17
[alex@MacBookPro ~/my_project | master]$
These are my failing steps:
When /^I go to chats page$/ do
visit root_path
within('.global-menu') do
click_on username(@current_user)
click_on I18n.t('views.menu.profile.links.dashboard')
end
click_on I18n.t('views.menu.profile.links.chats')
end
Then(/^I should see search results when I fill form with:$/) do |table|
table.hashes.each do |search_data|
### searching ###
@page.query.set search_data['input']
# for AJAX search
sleep 1
### show results ###
search_data['results'].split(', ').each do |res|
page.should have_content res.mb_chars.upcase
end
within('#chats') do
page.all('.chat').length.should == search_data['results'].split(', ').size
end
end
end
I'm using capybara-webkit with cucumber. That's my env.rb:
require 'rubygems'
require 'spork'
require 'capybara'
require 'capybara/rspec'
require 'selenium-webdriver'
require 'site_prism'
require 'capybara-screenshot/cucumber'
# require 'cucumber/rails'
# 1) Tag your scenario (or feature) with @allow-rescue
#
# 2) Set the value below to true. Beware that doing this globally is not
# recommended as it will mask a lot of errors for you!
#
# ActionController::Base.allow_rescue = false
#############################################################################
ENV['SKIP_RAILS_ADMIN_INITIALIZER'] = 'false' # This fixes weird errors with cucumber + rails_admin (http://makandracards.com/makandra/9597-rake-spec-+-rails_admin-weirdly-failing-specs).
#############################################################################
Before do
DatabaseCleaner.strategy = :truncation
DatabaseCleaner.clean
FactoryGirl.create(:setting)
ContactType.generate_contact_types
ContactType.generate_ims
end
Spork.prefork do
require 'cucumber/rails'
require 'email_spec' # add this line if you use spork
require 'email_spec/cucumber'
Capybara.default_selector = :css
end
Spork.each_run do
ActionController::Base.allow_rescue = false
begin
DatabaseCleaner.strategy = :truncation
rescue NameError
raise "You need to add database_cleaner to your Gemfile (in the :test group) if you wish to use it."
end
Capybara.register_driver :webkit do |app|
Capybara::Webkit::Driver.new(app, :stderr => nil)
end
Capybara.javascript_driver = :webkit
Cucumber::Rails::Database.javascript_strategy = :truncation
end
The problem appeared when I updated my project to Rails 4. Any ideas?
A lot of the developers I managed found capybara-webkit to be really problematic and inconsistent.
poltergeist/PhantomJS has a lot of advantages over it. Generally it:
is more deterministic, in that problem scenarios are more likely to fail consistently than to be flaky
is less machine-dependent; our suite now behaves pretty much the same in all of our test environments
gives better error messages
fails when there are Javascript errors, even if the test would otherwise pass
doesn’t hang, and
is easier to install.
Here's a good post from Dave Schweisguth about his presentation at the February Automated Testing SF meetup, where he discussed the testing setup and environment at his company (Fandor), the issues they hit, how they troubleshot them, and a quick comparison of drivers. It might help you track down your problem.
Ok, I don't have an answer, but I have more evidence that leads to a workaround.
This applies to rspec, but I assume it should be the same for Cucumber as well:
rspec spec/ --format progress --out rspec.output.txt
It looks like the pointer to STDOUT is getting mashed somehow. By specifying an output file and tailing it, you should see the full output.
I tried all the different formatters and no matter what, if they output to STDOUT, the output gets lost somewhere along the way.
I've had a similar issue, for unknown reasons, when using the Selenium web driver. When I switched to Poltergeist (PhantomJS), it started to work.
Also, I noticed that you are requiring the Selenium driver but then using Webkit.
After changing the driver, try running everything without Spork.
Use the thin web server instead of the default one and put the following code in features/support/env.rb:
Capybara.server do |app, port|
require 'rack/handler/thin'
Rack::Handler::Thin.run(app, :Port => port)
end
Read more about this solution using the thin server at the following links:
Solution of the same problem using the thin web server, and the same solution from another author.
