Capybara: locating iframes

I'm trying to program a bot to scrape and apply for jobs on indeed.com. My question is: how can I locate the id of an iframe so that I can run commands within it?
unless page.has_css?('p.expired')
  click_link('Apply Now')
  page.driver.within_frame(1) do
    page.driver.within_frame(0) do
      complete_step_one
      complete_additional_steps
    end
  end
end
The frame pops up when you click Apply Now; it asks for name, number, email, and cover letter.
Sample Link: https://www.indeed.com/cmp/SMCI/jobs/Project-Engineer-3ebc5b1b2bf00349?sjdu=QwrRXKrqZ3CNX5W-O9jEvWIuBcfYv3mrYLqkE6HctuqxVGTbpbhVXnHKq24JPICkWaBy43U6kq7z7H-uMado3w&tk=1citcpf1rbi5485q&vjs=3
Any help is appreciated.

within_frame takes the frame element, the name or id of the frame, or the index of the frame on the page; in the latest version of Capybara you can also pass nothing if there's only one frame on the page.
From the link you pasted, it appears the main document contains one iframe that itself contains one iframe, so that could be
page.within_frame(0) do
  page.within_frame(0) do
    ... # do whatever in the inner frame
  end
end
or in the latest version of Capybara
page.within_frame do
  page.within_frame do
    ... # do whatever in the inner frame
  end
end
Another way to do it would be something like
# select the outer iframe by id
page.within_frame('indeedapply-modal-preload-iframe') do
  # use the CSS attribute "starts with" selector to find the inner iframe
  inner_frame = page.find('iframe[src^="https://apply.indeed.com/indeedapply/resumeapply?"]')
  # pass the found iframe element
  within_frame(inner_frame) do
    ... # do whatever you want in the inner frame
  end
end
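When frames nest more than two deep, the blocks above get unwieldy. Here is a minimal sketch of a wrapper, assuming only that the session responds to within_frame as described above. The name within_nested_frames is hypothetical and not part of Capybara.

```ruby
# Hypothetical helper, not part of Capybara: enters each frame in order
# (outermost first) and yields once inside the innermost frame.
# `session` is anything that responds to within_frame(locator) { ... },
# such as a Capybara session. Each locator is meant to be a frame id,
# name, or index, because an inner frame element cannot be found before
# its outer frame has been entered.
def within_nested_frames(session, *locators, &block)
  return block.call if locators.empty?

  outer, *rest = locators
  session.within_frame(outer) do
    within_nested_frames(session, *rest, &block)
  end
end
```

With the frames from the question, this might be called as within_nested_frames(page, 'indeedapply-modal-preload-iframe', 0) { complete_step_one }.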

Related

Watir - How do I collect all links where the span contains aria_label "Multimedia"

I have written a ruby code where the browser object finds all links and then I store them one by one in an array if they match a specific regex.
@browser.links.collect(&:href).each do |link|
  matches = regex.match(link)
  array_of_multimedia << matches[:multimedia_id] if matches
end
I am trying to create a filter where I only iterate over those links where the span inside the second child div contains the aria-label as Multimedia.
Attached is a screenshot of the HTML structure.
I tried a few approaches, like finding all spans and then going bottom up to the parent's parent of the span, but it's not giving me the href.
@browser.spans(aria_label: "Multimedia").each do |span|
  span.parent.parent.a.hreflang  # Didn't work
  span.parent.parent.a.link.href # Didn't work
  span.parent.parent.href.text   # Didn't work
  element.tag_name               # This shows "a" which is correct though
end
I also tried a top down approach by doing
@browser.links.collect(&:href).each do |link|
  link_element = @browser.link(href: link)
  link_element.children.following_sibling(aria_label: "Multimedia").present? # Didn't work
end
So far, no luck in getting the actual hrefs. Will appreciate any help!
Because the span is inside the link tag, it's going to be easier to go bottom up.
Do as much as you can with the Watir locators rather than with multiple loops.
The parent method takes arguments:
@browser.spans(aria_label: 'Multimedia').map { |span| span.parent(tag_name: 'a').href }
As for what you tried:
# parent.parent is the link, so calling `#a` is looking for a link nested inside the link
span.parent.parent.a.hreflang
span.parent.parent.a.link.href
# href should give you a String, you shouldn't need to call #text method on it
span.parent.parent.href.text
# element isn't defined here, but try just element.href
element.tag_name
Also note that Element#href method is essentially a wrapper for Element#attribute_value('href').
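Putting the one-liner together with the regex filter from the question, a hedged sketch might look like this. The pattern MULTIMEDIA_LINK and the method name multimedia_ids are assumptions for illustration: the question does not show its real regex, only that it has a named group multimedia_id.

```ruby
# Hypothetical pattern: the question's real regex is not shown, so this
# one is invented purely for illustration.
MULTIMEDIA_LINK = %r{/multimedia/(?<multimedia_id>\d+)}

# `browser` is expected to behave like a Watir browser: spans(...) returns
# an enumerable of spans, and span.parent(tag_name: 'a') returns the
# enclosing link, as in the answer above.
def multimedia_ids(browser, pattern = MULTIMEDIA_LINK)
  browser.spans(aria_label: 'Multimedia').filter_map do |span|
    href = span.parent(tag_name: 'a').href
    match = pattern.match(href)
    match[:multimedia_id] if match
  end
end
```

This keeps the filtering to a single pass over the matching spans instead of iterating over every link on the page. filter_map needs Ruby 2.7 or later.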

Why is capybara saying node is obsolete and how to solve?

I have a spec/Capybara test which searches for an element and then attempts to run a JS script to scroll the element into view. However, Capybara claims the node is obsolete by the time it attempts to run the JS.
The lines at issue are consecutive. Here they are:
element = page.find(selector, visible: false)
Capybara.current_session.driver.browser.execute_script(script, element.native)
I have done a fair bit of debugging already. When placing a debugger between the find and execute_script lines, calling element indeed returns an obsolete node Obsolete #<Capybara::Node::Element>.
Running page.find(selector, visible: false) within the debugger does not return an obsolete node but rather the normal active node you would expect #<Capybara::Node::Element tag="div" path="/HTML/BODY[1]/DIV[6]/DIV[2]/DIV[1]/DIV[1]/DIV[54]">
Furthermore, removing the two lines and running them manually in the debugger sees capybara correctly find the DOM element, run the JS correctly, and the spec passes
The relevant code:
def scroll_to(selector, align = true)
  if align
    script = <<-JS
      arguments[0].scrollIntoView(true);
    JS
  else
    script = <<-JS
      arguments[0].scrollIntoView(false);
    JS
  end
  element = page.find(selector, visible: false)
  Capybara.current_session.driver.browser.execute_script(script, element.native)
end
scroll_to(".xdsoft_time[data-hour='13'][data-minute='15']")
Without knowing what's happening on your page it's impossible to say exactly why you're getting the 'obsolete node' error, but that error indicates the node that was originally found is no longer in the page. This can happen if you visit a new page, if the part of the page containing that node is replaced by JS, etc.
Passing visible: false and then trying to scroll that element into the page doesn't make sense though since if the element isn't visible then you'll never be able to scroll it into view (visible means drawn on the page, it does not mean 'in the viewport').
Other issues with your code:
You should not be calling the driver-specific execute_script; just use the Capybara session's execute_script instead (generally, if you're using Capybara.current_session.driver.browser, you're doing something wrong).
page.execute_script(script, element)
Capybara already provides a scroll_to, so use it instead of writing your own.
page.scroll_to(page.find(selector)) # Defaults to scrolling to the top
If you need control over the alignment of the element, just pass the :align option.
page.scroll_to(page.find(selector), align: :center) # :top, :bottom, :center

React Component not rendered properly with Turbolinks in Rails 5.1

I have a very simple Rails app with a react component that just displays "Hello" in an existing div element in a particular page (let's say the show page).
When I load the related page using its URL, it works. I see Hello on the page.
However, when I'm first on another page (let's say the index page) and then go to the show page using Turbolinks, the component is not rendered unless I go back and forth again (going back to the index page and returning to the show page).
From then on, every time I go back and forth, the view is rendered two more times. Not just twice, but two more times each round (i.e. 2 times, then 4, then 6, etc.).
I know this because I log a message to the console at the same time as I set the content of the div.
In fact, I guess that going back to the index page still runs the component code, just without the display, since the div element is not on the index page. But why cumulatively?
The problems I want to solve are:
To get the code run on the first request of the show page
To block the code from running in other pages (including the index page)
To get the code run once on subsequent requests of the show page
Here are the exact steps and the code I used (I'll try to be as concise as possible).
I have a Rails 5.1 app with react installed with:
rails new myapp --webpack=react
I then create a simple Item scaffold to get some pages to play with:
rails generate scaffold Item name
I just add the following div element in the Show page (app/views/items/show.html.erb):
<div id=hello></div>
Webpacker had already generated a Hello component (hello_react.jsx), which I modified as follows in order to use the above div element. I changed the original 'DOMContentLoaded' event:
document.addEventListener('turbolinks:load', () => {
  console.log("DOM loaded..");
  var element = document.getElementById("hello");
  if(element) {
    ReactDOM.render(<Hello name="React" />, element)
  }
})
I then added the following webpack script tag at the bottom of the previous view (app/views/items/show.html.erb):
<%= javascript_pack_tag("hello_react") %>
I then run the Rails server and the webpack-dev-server using foreman start (installed by adding gem 'foreman' to the Gemfile). Here is the content of the Procfile I used:
web: bin/rails server -b 0.0.0.0 -p 3000
webpack: bin/webpack-dev-server --port 8080 --hot
And here are the steps to follow to reproduce the described behavior:
Load the index page using the URL http://localhost:3000/items
Click New Item to add a new item. Rails redirects to the item's show page at the URL localhost:3000/items/1. Here we can see the Hello React! message. It works well!
Reload the index page using the URL http://localhost:3000/items. The item is displayed as expected.
Reload the show page using the URL http://localhost:3000/items/1. The Hello message is displayed as expected with one console message.
Reload the index page using the URL http://localhost:3000/items
Click the Show link (this should go through Turbolinks). Neither the message nor the console message is shown.
Click the Back link (via Turbolinks) to go to the index page.
Click the Show link again (via Turbolinks). This time the message is displayed correctly. The console message, however, is shown twice.
From there, each time I go back to the index page and return to the show page, two more messages are displayed in the console.
Note: Instead of using (and replacing) a particular div element, if I keep the original hello_react file, which appends a div element, this behavior is even more noticeable.
Edit: Also, if I change the link_to links to include data: {turbolinks: false}, it works well, just as if we had loaded the pages using the URLs in the browser address bar.
I don't know what I'm doing wrong..
Any ideas?
Edit: I put the code in the following repo if interested to try this:
https://github.com/sanjibukai/react-turbolinks-test
This is quite a complex issue, and I am afraid I don't think it has a straightforward answer. I will explain as best I can!
To get the code run on the first request of the show page
Your turbolinks:load event handler is not running because your code is run after the turbolinks:load event is triggered. Here is the flow:
User navigates to show page
turbolinks:load triggered
Script in body evaluated
So the turbolinks:load event handler won't be called (and therefore your React component won't be rendered) until the next page load.
To (partly) solve this you could remove the turbolinks:load event listener, and call render directly:
ReactDOM.render(
  <Hello name="React" />,
  document.body.appendChild(document.createElement('div'))
)
Alternatively you could use <%= content_for … %>/<%= yield %> to insert the script tag in the head. e.g. in your application.html.erb layout
…
<head>
…
<%= yield :javascript_pack %>
…
</head>
…
then in your show.html.erb:
<%= content_for :javascript_pack, javascript_pack_tag('hello_react') %>
In both cases, it is worth noting that any HTML you add to the page with JavaScript in a turbolinks:load block should be removed on turbolinks:before-cache to prevent duplication issues when revisiting pages. In your case, you might do something like:
var div = document.createElement('div')
ReactDOM.render(
  <Hello name="React" />,
  document.body.appendChild(div)
)
document.addEventListener('turbolinks:before-cache', function () {
  ReactDOM.unmountComponentAtNode(div)
})
Even with all this, you may still encounter duplication issues when revisiting pages. I believe this is to do with the way in which previews are rendered, but I have not been able to fix it without disabling previews.
To get the code run once on subsequent requests of the show page
To block the code from running in other pages (including the index page)
As I have mentioned above, including page-specific scripts dynamically can create difficulties when using Turbolinks. Event listeners in a Turbolinks app behave very differently from those in an app without Turbolinks, where each page gets a new document and the event listeners are therefore removed automatically. Unless you manually remove the event listener (e.g. on turbolinks:before-cache), every visit to that page will add yet another listener. What's more, if Turbolinks has cached that page, a turbolinks:load event will fire twice: once for the cached version, and once for the fresh copy. This is probably why you were seeing it rendered 2, 4, 6 times.
With this in mind, my best advice is to avoid adding page-specific scripts to run page-specific code. Instead, include all your scripts in your application.js manifest file, and use the elements on your page to determine whether a component gets mounted. Your example does something like this in the comments:
document.addEventListener('turbolinks:load', () => {
  var element = document.getElementById("hello");
  if(element) {
    ReactDOM.render(<Hello name="React" />, element)
  }
})
If this is included in your application.js, then any page with a #hello element will get the component.
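The listener build-up described earlier in this answer is easier to see in isolation. The following is a toy model written in Ruby, not Turbolinks code: FakeDocument is a hypothetical stand-in for a document object that survives across visits, and each simulated visit registers one more load handler, as a script tag in the page body would.

```ruby
# Toy model only: FakeDocument is a hypothetical stand-in for a browser
# document that is never replaced between page visits.
class FakeDocument
  def initialize
    @listeners = Hash.new { |hash, event| hash[event] = [] }
  end

  def add_event_listener(event, &handler)
    @listeners[event] << handler
  end

  def clear_listeners(event)
    @listeners[event].clear
  end

  # Calls every handler registered for the event and returns how many ran.
  def dispatch(event)
    @listeners[event].map(&:call).size
  end
end

# Simulates `visits` visits to a page whose script registers a load
# handler, then fires one load event and reports how many handlers ran.
# With cleanup: true, old handlers are dropped before each new one is
# registered, standing in for manual removal between visits.
def handlers_run_after(visits, cleanup: false)
  document = FakeDocument.new
  visits.times do
    document.clear_listeners('load') if cleanup
    document.add_event_listener('load') { :rendered }
  end
  document.dispatch('load')
end

p handlers_run_after(3)                # => 3
p handlers_run_after(3, cleanup: true) # => 1
```

The model deliberately ignores the order in which the load event and the body script run, and the extra event fired for cached previews. It only shows why handlers pile up when nothing removes them.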
Hope that helps!
I was struggling with a similar problem (the link_to helper was changing the URL but the React content was not loaded; I had to refresh the page manually to load it properly). After some googling I found a simple workaround on this page.
<%= link_to "Foo", new_rabbit_path(@rabbit), data: { turbolinks: false } %>
Since this causes a full page refresh when the link is clicked, now my react pages are loaded properly. Maybe you will find it useful in your project as well :)
Following what you said, I tested some code.
First, I simply pulled the ReactDOM.render call out of the listener, as you suggested in your first snippet.
This is a big step forward, since the message is no longer displayed elsewhere (such as on the index page) but only on the show page, as wanted.
But something interesting happens on the show page. The message no longer accumulates as appended div elements, which is good. In fact, it's displayed just once, as wanted. But the console message is displayed twice!?
I guess something related to the caching mechanism is going on here, but since the message is supposed to be appended, why isn't it displayed twice like the console message?
Putting this issue aside, this seems to work, and I wonder why it's necessary in the first place to put the React rendering after the page is loaded (without Turbolinks there was the DOMContentLoaded event listener).
I guess this has to do with unexpected rendering by JavaScript code executed when some DOM elements are yet to be loaded.
Then I tried your alternative approach using <%= content_for … %>/<%= yield %>.
As you expected, this gives mixed results and some weird behavior.
When I load the index page via the URL and then go to the show page using Turbolinks, it works!
The div message and the console message are each shown once.
Then if I go back (using Turbolinks), the div message is gone and I get the ".. unmounted.." console message, as wanted.
But from then on, whenever I go back to the show page, neither the div nor the console message is displayed at all.
The only message displayed is the ".. unmounted.." console message whenever I go back to the index page.
Worse, if I load the show page using the URL, the div message is not displayed anymore!? The console message is displayed, but I get an error regarding the div element (Cannot read property 'appendChild' of null).
I'll admit I have no idea what's happening here.
Lastly, I tried your final piece of advice and simply put the last code snippet in the HTML head.
Since this is JSX code, I don't know how to handle it within the Rails asset pipeline / file structure, so I put my javascript_pack_tag in the HTML head.
And indeed, this works well.
This time the code is executed everywhere, so it makes sense to use a page-specific element (as previously intended in the commented code).
The downside is that the code could get messy unless I put all page-specific code inside if statements that test for the presence of the page-specific element.
However, since Rails/Webpack has a good code structure, it should be easy to put page-specific code into page-specific jsx files.
Nevertheless, the benefit is that this time all the page-specific parts are rendered at the same time as the whole page, thus avoiding the display glitch that occurs otherwise.
I didn't raise this issue in the first place, but I would indeed like to know how to get page-specific content rendered at the same time as the whole page.
I don't know if this is possible when combining Turbolinks with React (or any other framework).
But I'll leave that question for later.
Thank you for your contribution, Dom.

How to test jQuery TokenInput field using Selenium

I'm unable to test a TokenInput field in a form using Selenium. When we type something, it gives a list of options to select from, but those options aren't part of the DOM. The text fills the field but doesn't select the item.
The code which I have written is:
Given admin user is on schedule interview page
And he select "obie[1]" and "DHH[1]" from the candidate name(s) auto sugget field
**step definition**
Given /^he select "([^"]*)" and "([^"]*)" from the candidate name\(s\) auto sugget field$/ do |arg1, arg2|
  within(:css, "#interview_template_candidate_names_input") do
    fill_in('tmp', :with => arg1) # tmp is the name of the token input field
    find("li:contains('obie[1]')").click
    save_and_open_page
  end
end
I finally succeeded in making this work. Here's the gist: https://gist.github.com/1229684
The list is part of the DOM (div.token-input-dropdown); it's added as the last child of the body element, which is probably why you didn't see it.
If you understand what the tokeninput plugin is doing, you can get a better idea of what you need to do. For each tokeninput you create, the plugin:
creates a ul.token-input-list (immediately before input#your_input_id)
creates a ul.token-input-list input#token-input-your_input_id
hides the input#your_input_id
creates a div.token-input-dropdown
So the most challenging part is finding the correct ul.token-input-list, because you have to find it based on its position relative to the original input, and the Selenium WebDriver doesn't let you navigate the DOM.
After that, you just fill in the input#token-input-your_input_id and "click" on the div.token-input-dropdown li option that matches what you're looking for.
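The last step can be sketched as a small helper. This is a minimal sketch, not the code from the gist: select_token is a hypothetical name, and it assumes `page` responds to fill_in and find the way a Capybara session does and that the plugin uses the element names listed above.

```ruby
# Hypothetical helper, not part of Capybara or the tokeninput plugin.
# `page` is anything that responds to fill_in and find like a Capybara
# session; `input_id` is the id of the original (now hidden) input.
def select_token(page, input_id, text)
  # Type into the visible text box the plugin creates:
  # input#token-input-<input_id>.
  page.fill_in("token-input-#{input_id}", with: text)
  # The dropdown is appended to the body element, so it is looked up on
  # the whole page rather than inside a `within` block scoped to the form.
  page.find('div.token-input-dropdown li', text: text).click
end
```

Called from a step definition, this would sit outside any within block for the form, since a scoped search would not reach a dropdown attached to the body.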

RAILS: Tracking content with /#foo in the address bar

Just like in Gmail, I want to create a div which when loaded with ajax would output a #foo in the address bar to track what content would be loaded.
If you go to https://mail.google.com/mail/?shva=1#sent while signed in, Gmail will take you straight to your sent box.
I want to do the same. For example. I have a div that loads a list of recipes. Once a recipe on the list has been clicked content gets loaded from db in the same div and the address bar would say http://site.com/#recipe-permalink. If this link gets passed to a friend and the friend goes to http://site.com/#recipe-permalink the div would load appropriate content with that recipe.
Also, is there a way to control more than one div? For example, if the URL is http://site.com/#recipe-permalink#blue, the app would load the recipe in one div and the appropriate content for #blue (whatever it may be) in another div.
Is there a way to make Cells or Apotomo have this functionality?
Are there any SEO concerns with doing this as well? Would the crawlers be able to pick up content through #foo links?
This is probably not a full answer to your question, but I believe this episode of Railscasts would be interesting to you.
http://railscasts.com/episodes/246-ajax-history-state
