Locally-run tests pass, but Jenkins tests fail; why, and how can I fix this? - jenkins

I'm running a fairly large suite of python-based tests with a much larger number of steps on an Ubuntu Linux VM. When I run them via any number of methods manually (via the console) they all run and pass just fine.
After I ported them to a Jenkins server, four out of the thirty fail. I tried the usually recommended fix - increasing the wait time for keywords to work to 1s before every single click - so I'm fairly certain it isn't a timing issue. The site loads a lot faster than that on Windows, which I know is slower than Jenkins on Linux.
After Googling around a little for an answer, I found that apparently no one has come up with an accepted answer, either on this site or other Q/A sites.
Here's the error messages I'm getting from Jenkins.
ElementNotVisibleException: Message: element not visible
(Session info: chrome=61.0.3163.79
(Driver info: chromedriver=2.26.436382 (70eb799289ce4c2208441fc057053a5b07ceabac),platform=Linux 4.10.0-33-generic x86_64)
WebDriverException: Message: unknown error: Cannot read property 'innerHTML' of undefined
(Session info: chrome=61.0.3163.79
(Driver info: chromedriver=2.26.436382 (70eb799289ce4c2208441fc057053a5b07ceabac),platform=Linux 4.10.0-33-generic x86_64)
The other two are both element not visible exceptions identical to the first, both of which happen on a Click Button keyword that is not the first Click Button keyword in the test suite. The first one happens on a Click Element keyword that has worked perfectly since I wrote it, and the last one happens on tried-and-true JavaScript call to get the text of an element.
Why would something work locally on two different operating systems and then fail on Jenkins?

Why would something work locally on two different operating systems and then fail on Jenkins?
The most common might be that the jenkins system is running slower, and your tests aren't being hyper-vigilant about waiting for pages to finish loading before trying to interact with it. Jenkins boxes often can be under a heavy load, and if both the client and server are running on the same box, either or both could be contributing to the problem.
Another reason could be that you're running different versions of the browsers and/or selenium drivers on the jenkins box.
Another reason could be that the resolution of the (virtual?) displays is different, causing elements to be shifted to a different position.
The browsers on the jenkins box could have different profiles, resulting in a different set of plugins or antivirus software running. These can contribute to the speed at which a page renders, or could cause unwanted popups that cover portions of the screen.

Related

Why is my Dockerized Browserless-Chrome hanging when a Mediawiki Selenium test causes a new page to be loaded?

Background
I have a makefile which fetches and spins up a Dockerized Mediawiki instance on my local machine, from scratch, with a single make command:
https://gitlab.wikimedia.org/mhurd/mediawiki-docker-make
This works fairly well. Please try it!
Goal
I'd also like to be able to run Mediawiki's Selenium tests (as-is) in Docker containers and watch them execute "live", with the makefile, again, making it easy to kick this process off.
Approach
This work-in-progress branch adds a container for running Mediawiki Selenium tests, and a browserless-chrome container so you can watch and debug the tests "live", as they run:
https://gitlab.wikimedia.org/mhurd/mediawiki-docker-make/-/tree/selenium
As you can see from the top of this branch's readme, after running make to spin everything up, running make runseleniumtests kicks the tests off. When doing so a browser window is automatically opened which lets you see the Selenium tests which are running in the browserless-chrome container. This works... to a point.
Problem
Unfortunately, after a few tests are seen to be running, I'm seeing this error:
Protocol error (Input.dispatchMouseEvent): Target closed.

I suspect the way the browserless-chrome container manages its Chrome sessions may be to blame, as it appears the error is happening when a test causes a new page to be loaded, but I'm not sure.
Any ideas appreciated, and please try it yourself - the whole point of the makefile is to make spinning this up from scratch super simple. It only takes a couple commands and you should be able to see the tests running and the resulting error. Thanks!
Misc
Running curl -s http://127.0.0.1:3000/sessions | python3 -m json.tool on the host machine after attempting to run the tests shows there are multiple browserless-chrome sessions, but I'm unsure how to make it behave more like a non-dockerized setup - which has no problem with tests which cause page loads.
In email communication with Browserless support, they mention seeing similar issues but haven't been able to track down why:
It seems to be that the root cause may be due to "click" events aren't
working properly when your viewing the live debugger. Can you confirm
that if you don't open the live viewer, it allows the test to progress
and eventually throws new errors regarding selectors not being found? (Edit: I did)
I've also run into this live debugger behavior before and in my
experience I stopped watching the live debugger, and since I couldn't
see which selector it wasn't finding I exported a screenshot and in
that particular case, it ended up being an issue with my viewport,
since the viewport in the remote session was smaller, the styles
rendered differently then locally on my machine, when I set the
viewport size, the selectors were found - but then again, that was my
particular issue, might not be yours.
But yes, for some odd reason when you view live sessions, you can run
into this odd issue which we haven't been able to understand.

Request for working configurations of openresty + mobdebug + EmmyLua remote debugging in PHPStorm

I'm out of my wits. I'm trying to set up remote debugging of lua code in dockerized openresty. I use PHPStorm with EmmyLua extension, and the mobdebug library on the Lua end. I have been reading and hearing reports of this working for people, but for me stopping on a breakpoint (or immediately after mobdebug.start()) works about 15% of the time (evidence that I am not completely misconfiguring the thing), including exactly 0% of those places in my code that I actually want to debug.
I will not be debugging this issue. I intend to work around it by using an exact setup that is known to work, so I need someone for whom it does work to tell me what their setup is:
OS version
openresty version
mobdebug version
any custom patches or hacks you might have applied to get the debugging working
luasocket version (probably relevant)
PHPStorm version
EmmyLua version
docker and docker-compose version, if applicable
whatever you may suspect to be relevant
I am willing to completely raze my development environment and rebuild it exactly to the working spec, just to have working Lua debugging.
EDIT: for those interested, here are my detailed symptoms:
I can't stop at actual breakpoints, ever (i.e. after I initially stop after mobdebug.start() and then "Resume program" and a line with a breakpoint is hit, but it doesn't stop there)
I can stop after mobdebug.start() in code executed from init_by_lua_block, i.e. once per server start / config reload
I can't stop after mobdebug.start() in any code executed during handling a request, i.e. ssl_certificate_by_lua_block, rewrite_by_lua_block etc. This is probably understandable because coroutines are involved
All my attempts at enabling coroutine debugging in request handling code either error out or have no effect:
mobdebug.coro() in init_worker_by_lua_block() errors out with "API disabled in current context" somewhere in mobdebug.lua
mobdebug.on() in the function I want to debug either has no effect, or errors out with "attempt to yield across C-call boundary"; I haven't discerned the pattern yet.
stopping on a breakpoint (or immediately after mobdebug.start()) works about 15% of the time
Stopping after mobdebug.start() should work under all circumstances, except when there is a connection already established to the same debugger controller, so the fact that it doesn't usually points to the system that tries to establish multiple debugging sessions to the same controller/IDE (or no connection can be established at all).
Similarly, there are several reasons why breakpoints may not be working, but if they work in a file as part of a specific setup, then I'd expect them to always work in that case. Some of the reasons why breakpoints may not be working are listed in the documentation: https://studio.zerobrane.com/doc-faq#why-breakpoints-are-not-triggered.
mobdebug provides a command line-based controller, so for troubleshooting purposes it may be easier to use that instead of a more complex setup.

Can I see what happens in PhantomJS?

Sometimes, when running automated tests in PhantomJS using Cucumber for Rails 4, it would be really, really useful to actually sit in front of my screen, look in a window, and see exactly what the browser is doing.
There are times when your code is right and your test is right, but testing fails intermittently nonetheless. It's often because of a script, or an animation, or some CSS that gets in the way. But seeing a screenshot, drilling into a DOM inspector, or using the debugger is not enough to catch those edge cases.
Is there any way to have a window looking at what PhantomJS is doing in the background? It could be something in X Window, or running in a VNC Server, etc. Anything visual would greatly help with debugging, especially in with those finicky details.
I found a little program called PhantomVNC, but it's not telling me much on how it works. It looks like something just feeding a series of screenshots through VNC.
I tried PhantomJS and Capybara-WebKit, but neither of those headless browsers other a "head" option. Selenium-WebDriver seems complicated to set up and only seems to work with a full browser like Firefox, and that may cause more problems than it solves.
If you have any ideas, please let me know. Thank you in advance.
Not sure specifically about Cuc/Rails - but I specifically have a second chrome/webdriver Virtual Machine for this. I use PhantomJS specifically for the headless automatic acceptance testing, but when there is an issue with the tests and I'm trying to diagnose it can be helpful to view the browser live.
In other words, spin up a Ubuntu desktop VM, install chrome, install selenium server and the chrome webdriver. Log in to the VM, open chrome, make sure selenium is running, then configure your test suite to connect to the selenium webdriver service (usually port :4444) of this instance. Run your tests and you should be able to watch the chrome instance on the VM.

What could cause my MVC apps to be incredibly slow to start?

Extremely often, in all kinds of various MVC3/4 apps I debug in VS2012 on my home machine, after pressing F5 to start debugging and open the configured start page in Chrome, it can take several - up to ten - minutes before becoming active.
I have no long startup procedures that load caches or generate code etc. and the same app will start instantly on my office machine. Quite often it will do so on my home machine as well, but this slow starting seems to come about after some hours of debugging, and possibly certain operations. Restarting VS doesn't seem to help, neither does killing IIS Express.
We were faced with an identical scenario recently where attaching the application to the debugger resulted in each page load taking about 10 minutes each, but running without debugging or in the QA environment worked fine.
The problem turned out to be that log4net was configured to use a network path for storing log files, a path that was unavailable from our local setup. This resulted in multiple attempts at accessing a remote path (once for each class being set up with Spring .Net) that didn't exist (and hence log4net threw an exception in each case).
But this should impact you out of the box, and shouldn't increase with time..

Is there a reason that Selenium tests would act differently on Ubuntu vs Mac OSX?

I am working on a project with a partner and we both are using different Operating Systems. All of our files, including the Gemfileare under version control. Previously, I had written tests in Rspec with Capybara and selenium-webdriver. These tests opened Firefox and performed simple actions, passed, and closed the browser.
A few unrelated changes to the layout concerning flash[:success] and now, with both of us updated to the HEAD, the tests pass on my machine, but fail on his.
His browser is not loading the modal box from the before block click_link "Edit" therefore when the test tries to fill in a form, the fields are still hidden and not accessible.
I attempted to wait for the box, thinking it may be filling in the fields too soon :
wait = Selenium::WebDriver::Wait.new(:timeout => 10)
wait.until { page.driver.browser.find_element(:css, "div#edit_news_container").displayed? == true }
Instead I get a timeout, and the tests still fail.
However, they work beautifully on my machine... what's going on here?
In light of the layout changes, it's likely you're having a browser issue rather than an operating system issue. Are the two of you using the same version of Firefox? If not, changing his version to yours may allow him to load the modal again.

Resources