Chrome headless download pdf using capybara and selenium - ruby-on-rails

I'm using headless Chrome with Selenium (3.14.0) and Capybara (3.8.0) in my Ruby on Rails (5.2.1) project, and I have a test that works in non-headless Chrome but not in headless Chrome. I'm using the '--headless' flag on Google Chrome stable version 69.
I've set up headless Chrome with the following, and this works for all tests that don't download files.
download_path = "#{Rails.root}/tmp/downloads"
Capybara.register_driver(:headless_chrome) do |app|
  caps = Selenium::WebDriver::Remote::Capabilities.chrome(
    chromeOptions: {
      prefs: {
        'download.default_directory' => download_path,
        "download.extensions_to_open" => "applications/pdf",
        'download.directory_upgrade' => true,
        'download.prompt_for_download' => false,
        'plugins.plugins_disabled' => ["Chrome PDF Viewer"]
      },
      binary: "/opt/google/chrome/google-chrome",
      args: %w[headless disable-gpu window-size=1920,1080]
    }
  )
  Capybara::Selenium::Driver.new(
    app,
    browser: :chrome,
    desired_capabilities: caps
  )
end
I've read that I should send a command to the Selenium Chrome driver to allow downloads, but I cannot figure out how to do that with my setup. Here is what I'm trying to get working with my setup (this snippet is not from my code base):
@driver = Selenium::WebDriver.for :chrome, options: options
bridge = @driver.send(:bridge)
path = '/session/:session_id/chromium/send_command'
path[':session_id'] = bridge.session_id
bridge.http.call(:post, path, cmd: 'Page.setDownloadBehavior',
                              params: {
                                behavior: 'allow',
                                downloadPath: download_path
                              })
How do I access the Selenium bridge in my setup so that I can send this HTTP call?

You don't need to send that manually anymore; it was added to Selenium as Selenium::WebDriver::Chrome::Driver#download_path=. You can set it in your driver registration via the Capybara::Selenium::Driver instance:
...
Capybara::Selenium::Driver.new(
  app,
  browser: :chrome,
  desired_capabilities: caps
).tap { |d| d.browser.download_path = <your download path> }
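Once the download path is set, a test usually still has to wait for the file to land on disk before asserting on it. The following is a minimal sketch of such a wait, using only the standard library. The helper name wait_for_download and its parameters are hypothetical and not part of Capybara or Selenium; it assumes Chrome marks unfinished downloads with a .crdownload suffix.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not a Capybara/Selenium API): polls +dir+ until a file
# matching +pattern+ exists and no partial Chrome downloads (*.crdownload)
# remain. Returns the file path, or nil once +timeout+ seconds have passed.
def wait_for_download(dir, pattern: '*.pdf', timeout: 10, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    partial  = Dir.glob(File.join(dir, '*.crdownload'))
    finished = Dir.glob(File.join(dir, pattern))
    return finished.min if partial.empty? && finished.any?
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline

    sleep interval
  end
end

# Usage with a temporary directory standing in for tmp/downloads.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, 'report.pdf'))
  puts wait_for_download(dir, timeout: 1)
end
```

In a spec this would be called with the same directory that was assigned to download_path, right after the click that triggers the download.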

Related

Webdrivers::NetworkError - Mac64 M1 - ChromeDriver

My Capybara Selenium WebDriver setup is failing when it tries to connect to ChromeDriver. It appears that a version was released without an M1 build at the ChromeDriver download site: https://chromedriver.storage.googleapis.com/index.html?path=106.0.5249.61/
Error:
Webdrivers::NetworkError:
Net::HTTPServerException: 404 "Not Found" with https://chromedriver.storage.googleapis.com/106.0.5249.61/chromedriver_mac64_m1.zip
CODE:
Capybara.register_driver :headless_chrome do |app|
  # `options` must be initialized before arguments can be added to it
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument("--disable-gpu")
  options.add_argument("--headless")
  options.add_argument("--no-sandbox")
  options.add_argument("--window-size=1920,1080")
  driver = Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
  ### Allow file downloads in Google Chrome when headless
  ### https://bugs.chromium.org/p/chromium/issues/detail?id=696481#c89
  bridge = driver.browser.send(:bridge)
  path = "/session/:session_id/chromium/send_command"
  path[":session_id"] = bridge.session_id
  bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                params: {
                                  behavior: "allow",
                                  downloadPath: "/tmp/downloads",
                                })
  ###
  driver
end
When the application calls driver.browser, I get the error above because the file it is looking for does not exist.
Can I pin a specific ChromeDriver version, or tell it which platform to look for, when initializing the driver?
The fix is posted here: https://github.com/titusfortner/webdrivers/pull/239. This is a known issue in "webdrivers".
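To illustrate why the request ends in a 404, here is a minimal sketch of the file-name choice involved. It assumes, in line with the linked pull request, that the Apple Silicon archive was renamed from chromedriver_mac64_m1.zip to chromedriver_mac_arm64.zip starting with 106.0.5249.61. The method names are hypothetical and this is not the gem's actual code. On the version question, the webdrivers README describes a Webdrivers::Chromedriver.required_version= setting for pinning a driver version, which may serve as a stopgap until the gem is updated.

```ruby
require 'rubygems' # Gem::Version ships with Ruby

# Hypothetical helper: picks the Apple Silicon archive name for a given
# ChromeDriver version. Assumption: the rename took effect at 106.0.5249.61.
def m1_archive_name(version)
  renamed_from = Gem::Version.new('106.0.5249.61')
  suffix = Gem::Version.new(version) >= renamed_from ? 'mac_arm64' : 'mac64_m1'
  "chromedriver_#{suffix}.zip"
end

# Builds the download URL in the same shape as the one in the error message.
def m1_download_url(version)
  "https://chromedriver.storage.googleapis.com/#{version}/#{m1_archive_name(version)}"
end

puts m1_download_url('106.0.5249.61')
puts m1_download_url('105.0.5195.52')
```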

Selenium WebDriver Chrome timeout and invalid URL

I am using Selenium Webdriver Chrome initialized in my Rails app as follows
host=XXX
port=XXX
Capybara.register_driver :selenium_chrome do |app|
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument("--proxy-server=#{host}:#{port}")
  options.add_argument("--headless") # Remove this option to run on Chrome browser
  Capybara::Selenium::Driver.new(app,
    browser: :chrome,
    options: options
  )
end
However, it always times out when I run a spec, raising Net::ReadTimeout on the visit url command.
URI.parse(current_url) returns #<URI::Generic data:,>, which looks incorrect and is probably the reason for the timeout. I looked into the selenium-webdriver gem and added debugging to see how the request is made for this command, but for some reason execution does not stop at the pry when the command is get_current_url.
Why does current_url look incorrect, and why would it not stop for the get_current_url command?
EDIT: the url is obtained from here and returns the following locally
[6] pry(Selenium::WebDriver::Remote::Bridge)> Remote::OSS::Bridge.new(capabilities, bridge.session_id, **opts).url
=> "data:,"
Adding a pry to the url method doesn't stop execution, so I'm wondering how it obtains the value.
Ruby: ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin20]
Selenium-Webdriver: 3.142.7
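A note on the value itself: data:, is the blank placeholder page that ChromeDriver loads when it launches Chrome, so seeing it as current_url suggests that the visit never completed (which fits the Net::ReadTimeout) rather than that the URL was parsed incorrectly. The following is a minimal sketch of a check for that state, using only the standard library; the helper name is hypothetical.

```ruby
require 'uri'

# Hypothetical helper: true when the browser is still on ChromeDriver's
# initial blank page, i.e. no navigation has completed yet.
def chromedriver_blank_page?(current_url)
  current_url.to_s.strip == 'data:,'
end

uri = URI.parse('data:,')
puts uri.scheme # prints "data"

puts chromedriver_blank_page?('data:,')                 # prints "true"
puts chromedriver_blank_page?('http://localhost:3000/') # prints "false"
```

If the check is true after a visit call, the place to look is the navigation itself (for example the proxy settings passed to Chrome), not the URL parsing.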

How to run Selenium Webdriver correctly on Heroku with a Rails app

I’m implementing a very basic scraper in my app with the watir gem. It runs perfectly fine locally, but when I run it on Heroku it raises this error: Webdrivers::BrowserNotFound: Failed to find Chrome binary.
I added the google-chrome and chromedriver buildpacks to my app to tell Selenium where to find Chrome on Heroku, but it still does not work. Moreover, when I print the options, the binary seems to be correctly set:
#<Selenium::WebDriver::Chrome::Options:0x0000558bdf7ecc30 @args=#<Set: {"--user-data-dir=/app/tmp/chrome", "--no-sandbox", "--window-size=1200x600", "--headless", "--disable-gpu"}>, @binary="/app/.apt/usr/bin/google-chrome-stable", @prefs={}, @extensions=[], @options={}, @emulation={}, @encoded_extensions=[]>
These are my app's buildpack URLs:
1. heroku/ruby
2. heroku/google-chrome
3. heroku/chromedriver
This is my code :
def new_browser(downloads: false)
  options = Selenium::WebDriver::Chrome::Options.new
  chrome_dir = File.join Dir.pwd, %w(tmp chrome)
  FileUtils.mkdir_p chrome_dir
  user_data_dir = "--user-data-dir=#{chrome_dir}"
  options.add_argument user_data_dir
  if chrome_bin = ENV["GOOGLE_CHROME_SHIM"]
    options.add_argument "--no-sandbox"
    options.binary = chrome_bin
  end
  options.add_argument "--window-size=1200x600"
  options.add_argument "--headless"
  options.add_argument "--disable-gpu"
  browser = Watir::Browser.new :chrome, options: options
  if downloads
    downloads_dir = File.join Dir.pwd, %w(tmp downloads)
    FileUtils.mkdir_p downloads_dir
    bridge = browser.driver.send :bridge
    path = "/session/#{bridge.session_id}/chromium/send_command"
    params = { behavior: "allow", downloadPath: downloads_dir }
    bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                  params: params)
  end
  browser
end
Any idea how to fix this? I checked many similar issues on different websites but did not find anything.
I also worked on the same thing for the last two days and, as you said, tried a lot of different things. I finally made it work.
The problem is that Heroku installs the binary in a different path. In the source code of the webdrivers gem I found that it looks in the default system location for each platform (Linux, macOS, Windows), which is why it works locally, or in the path defined by the WD_CHROME_PATH environment variable. To set the path on Heroku we must set this environment variable:
"WD_CHROME_PATH": "/app/.apt/usr/bin/google-chrome"
It must be google-chrome, not google-chrome-stable as shown in some examples.
That is, just run this from the terminal:
heroku config:set WD_CHROME_PATH=/app/.apt/usr/bin/google-chrome
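Before relying on the variable, it can help to confirm from inside the dyno that it actually points at an executable file. The following is a minimal sketch using only the standard library. The helper name chrome_binary_from_env is hypothetical, and the list of variables checked (WD_CHROME_PATH, then GOOGLE_CHROME_SHIM) is an assumption based on the names mentioned in this thread.

```ruby
# Hypothetical helper: returns the first path from the given environment
# variables that points at an executable regular file, or nil if none does.
def chrome_binary_from_env(env = ENV, keys = %w[WD_CHROME_PATH GOOGLE_CHROME_SHIM])
  keys.each do |key|
    path = env[key]
    next if path.nil? || path.empty?
    return path if File.file?(path) && File.executable?(path)
  end
  nil
end

path = chrome_binary_from_env
puts(path ? "Chrome binary: #{path}" : 'No usable Chrome path found in the environment')
```

Running this in a heroku run bash session (for example through a Rails console) shows whether the configured path exists on the dyno at all.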
No solutions worked for me (Heroku-18 stack, with 'https://github.com/heroku/heroku-buildpack-google-chrome.git' and 'https://github.com/heroku/heroku-buildpack-chromedriver' buildpacks).
I tried all kinds of solutions but finally found a fail proof way to debug it yourself.
It involves a couple of resources:
https://www.simon-neutert.de/2018/watir-chrome-heroku/
and
https://github.com/jormon/minimal-chrome-on-heroku/blob/master/runner.thor in particular.
Check where your actual binary and drivers are on Heroku:
$ heroku run bash
~ $ which chromedriver
/app/.chromedriver/bin/chromedriver
~ $ which google-chrome
/app/.apt/usr/bin/google-chrome
The shims that the buildpacks set up for me didn't work. In fact, even if you set the values above on Heroku to something different, the buildpacks reset them, so you lose the new shim (see here: https://github.com/heroku/heroku-buildpack-google-chrome/blob/master/bin/compile ) so I made new shims:
$ heroku config:set GOOGLE_CHROME_REAL=/app/.apt/usr/bin/google-chrome
$ heroku config:set CHROME_DRIVER_REAL=/app/.chromedriver/bin/chromedriver
Then, I modified the browser initializer (from: https://github.com/jormon/minimal-chrome-on-heroku/blob/master/runner.thor ):
def new_browser(downloads: false)
  require 'watir'
  require 'webdrivers'
  options = Selenium::WebDriver::Chrome::Options.new
  # make a directory for chrome if it doesn't already exist
  chrome_dir = File.join Dir.pwd, %w(tmp chrome)
  FileUtils.mkdir_p chrome_dir
  user_data_dir = "--user-data-dir=#{chrome_dir}"
  # add the option for user-data-dir
  options.add_argument user_data_dir
  # let Selenium know where to look for chrome if we have a hint from
  # heroku. chromedriver-helper & chrome seem to work out of the box on osx,
  # but not on heroku.
  if chrome_bin = ENV["GOOGLE_CHROME_REAL"]
    Selenium::WebDriver::Chrome.path = chrome_bin
  end
  if chrome_driver = ENV["CHROME_DRIVER_REAL"]
    Selenium::WebDriver::Chrome.driver_path = chrome_driver
  end
  # headless!
  options.add_argument "--window-size=1200x600"
  options.add_argument "--headless"
  options.add_argument "--disable-gpu"
  # make the browser
  browser = Watir::Browser.new :chrome, options: options
  # setup downloading options
  if downloads
    # make download storage directory
    downloads_dir = File.join Dir.pwd, %w(tmp downloads)
    FileUtils.mkdir_p downloads_dir
    # tell the bridge to use downloads
    bridge = browser.driver.send :bridge
    path = "/session/#{bridge.session_id}/chromium/send_command"
    params = { behavior: "allow", downloadPath: downloads_dir }
    bridge.http.call(:post, path, cmd: "Page.setDownloadBehavior",
                                  params: params)
  end
  browser
end
Hope this helps others.
I have tried to solve this for a while with different approaches but none of them worked. Then I checked the webdrivers source code and found that you need to set the "WD_CHROME_PATH" env variable for it to work. Just attaching my full setup here. This cost me a few hours to debug and fix.
spec_helper.rb
require 'webdrivers'
require 'capybara/rspec'
# Heroku buildpacks install the Chrome binary in a non-standard location, exposed via GOOGLE_CHROME_SHIM
chrome_bin = ENV.fetch('GOOGLE_CHROME_SHIM', nil)
options = {}
options[:args] = ['headless', 'disable-gpu', 'window-size=1280,1024']
options[:binary] = chrome_bin if chrome_bin
Capybara.register_driver :headless_chrome do |app|
  Capybara::Selenium::Driver.new(app,
    browser: :chrome,
    options: Selenium::WebDriver::Chrome::Options.new(options)
  )
end
Capybara.javascript_driver = :headless_chrome
Gemfile
group :test do
  gem 'capybara'
  gem 'timecop'
  gem 'selenium-webdriver'
  gem 'webdrivers'
end
app.json
{
  "name": "evocal",
  "repository": "https://github.com/zeitdev/evocal",
  "environments": {
    "test": {
      "addons": [
        "heroku-postgresql:in-dyno"
      ],
      "scripts": {
        "test-setup": "bundle exec rake db:seed",
        "test": "bundle exec rspec"
      },
      "buildpacks": [
        { "url": "heroku/ruby" },
        { "url": "https://github.com/heroku/heroku-buildpack-google-chrome" },
        { "url": "https://github.com/heroku/heroku-buildpack-chromedriver" },
        { "url": "heroku/nodejs" }
      ],
      "env": {
        "WD_CHROME_PATH": "/app/.apt/opt/google/chrome/chrome"
      }
    }
  }
}
I don't yet fully understand how Selenium, WebDriver and the gem interact with each other. Some people also wrote that you can leave out one of the buildpacks. But this works, at least for now :-D.

Specify which Chrome to use for ChromeDriver in Rails Test

In my Rails app I have the following which makes my system tests use ChromeDriver to launch Chrome and perform my tests:
class ApplicationSystemTestCase < ActionDispatch::SystemTestCase
  driven_by :selenium, using: :chrome, screen_size: [800, 600]
end
However, the Chrome installation I have at ~/Applications/Google Chrome.app is old and can't be upgraded due to IT restrictions.
Instead, we create a folder at:
~/Users/cameron/Applications (local)/Google Chrome.app, where we can update the app as we please because no restrictions apply there.
However, ChromeDriver tries to use the version of Chrome in the main ~/Applications folder instead of my own. How can I tell the driver to use the one in my local applications folder so that the correct version of Chrome runs?
The old version is causing this error: Selenium::WebDriver::Error::SessionNotCreatedError: session not created exception: Chrome version must be >= 60.0.3112.0
Try with the binary option (untested):
driven_by :selenium, using: :chrome, screen_size: [800, 600], options: {
  :binary => 'Path to the Chrome executable'
}
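On macOS, Google Chrome.app is a bundle directory rather than an executable, so the path given to a binary option would normally point inside the bundle. The following is a minimal sketch of deriving that path. It assumes the standard macOS bundle layout (Contents/MacOS/Google Chrome); the helper name and the example bundle location are hypothetical.

```ruby
# Hypothetical helper: maps a macOS .app bundle to the executable inside it,
# assuming the standard Contents/MacOS layout.
def chrome_executable_in(app_bundle)
  File.join(File.expand_path(app_bundle), 'Contents', 'MacOS', 'Google Chrome')
end

# Example bundle location (an assumption; adjust to the real install path).
bundle = '/Users/cameron/Applications (local)/Google Chrome.app'
puts chrome_executable_in(bundle)
```

The resulting string is what would replace the 'Path to the Chrome executable' placeholder, assuming the option is honored by the driver version in use.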
I had success with this:
driven_by :selenium, using: :chrome, screen_size: [1400, 1400], options: { driver_path: 'path/to/chrome' }
I thought I did, but turns out that didn't work either.

"Refused to connect" using ChromeDriver, Capybara & Docker Compose

I'm trying to move from PhantomJS to headless Chrome and have run into a bit of a snag. For local testing, I'm using Docker Compose to get all dependent services up and running. To provision Google Chrome, I'm using an image that bundles it together with ChromeDriver and serves it on port 4444. I then link it to my app container as follows in this simplified docker-compose.yml file:
web:
  image: web/chrome-headless
  command: [js-specs]
  stdin_open: true
  tty: true
  environment:
    - RACK_ENV=test
    - RAILS_ENV=test
  links:
    - "chromedriver:chromedriver"
chromedriver:
  image: robcherry/docker-chromedriver:latest
  ports:
    - "4444"
  cap_add:
    - SYS_ADMIN
  environment:
    CHROMEDRIVER_WHITELISTED_IPS: ""
Then, I have a spec/spec_helper.rb file that bootstraps the testing environment and associated tooling. I define the :headless_chrome driver and point it to ChromeDriver's local binding; http://chromedriver:4444. I'm pretty sure the following is correct:
Capybara.javascript_driver = :headless_chrome
Capybara.register_driver :chrome do |app|
  Capybara::Selenium::Driver.new(app, browser: :chrome)
end
Capybara.register_driver :headless_chrome do |app|
  capabilities = Selenium::WebDriver::Remote::Capabilities.chrome(
    chromeOptions: { args: %w[headless disable-gpu window-size=1440,900] },
  )
  Capybara::Selenium::Driver.new app,
    browser: :chrome,
    url: "http://chromedriver:4444/",
    desired_capabilities: capabilities
end
We also use VCR, but I've configured it to ignore any connections to the port used by ChromeDriver:
VCR.configure do |c|
  c.cassette_library_dir = 'spec/vcr_cassettes'
  c.default_cassette_options = { record: :new_episodes }
  c.ignore_localhost = true
  c.allow_http_connections_when_no_cassette = false
  c.configure_rspec_metadata!
  c.ignore_hosts 'codeclimate.com'
  c.hook_into :webmock, :excon
  c.ignore_request do |request|
    URI(request.uri).port == 4444
  end
end
I start the services with Docker Compose, which triggers the test runner. The command is pretty much this:
$ bundle exec rspec --format progress --profile --tag 'broken' --tag 'js' --tag '~quarantined'
After a bit of waiting, I encounter the first failed test:
1) Beta parents code redemption: redeeming a code on the dashboard when the parent has reached the code redemption limit does not display an error message for cart codes
Failure/Error: fill_in "code", with: "BOOK-CODE"
Capybara::ElementNotFound:
Unable to find field "code"
# ./spec/features/beta_parents_code_redemption_spec.rb:104:in `block (4 levels) in <top (required)>'
All specs have the same error. So, I shell into the container to run the tests myself manually and capture the HTML it's testing against. I save it locally and open it up in my browser to be welcomed by the following Chrome error page. It would seem ChromeDriver isn't evaluating the spec's HTML because it can't reach it, so it attempts to run the tests against this error page.
Given the above information, what am I doing wrong here? I appreciate any and all help as moving away from PhantomJS would solve so many headaches for us.
Thank you so much in advance. Please, let me know if you need extra information.
The issue you're having is that Capybara, by default, starts the application under test (AUT) bound to 127.0.0.1 and then tells the driver to have the browser send requests to that same address. In your case, however, 127.0.0.1 isn't where the app is running from the browser's perspective, since the app is in a different container than the browser. To fix that, set Capybara.server_host to the external interface of the "web" container (which is reachable from the "chromedriver" container). That causes Capybara to bind the AUT to that interface and tell the driver to have the browser make requests to it.
In your case that probably means you can specify 'web':
Capybara.server_host = 'web'
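If the 'web' hostname is not usable as a bind address in a given setup, one alternative is to look up the container's own non-loopback IPv4 address and use that instead. The following is a minimal sketch with the standard library only. The helper names are hypothetical, and whether the address is reachable from the "chromedriver" container depends on the Docker network, which is an assumption here.

```ruby
require 'socket'

# Hypothetical helper: first non-loopback IPv4 address of this machine or
# container, or nil when only loopback interfaces are available.
def external_ipv4
  addr = Socket.ip_address_list.find { |a| a.ipv4? && !a.ipv4_loopback? }
  addr&.ip_address
end

# Hypothetical helper: the base URL the remote browser would have to request.
def app_url(host, port)
  "http://#{host}:#{port}"
end

host = external_ipv4 || 'web'
puts app_url(host, 3001) # the port is an arbitrary example value
# In spec_helper.rb this value would then be assigned along the lines of:
#   Capybara.server_host = host
```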

Resources