Playwright: page.goto hangs on some URLs and setting timeouts doesn't help

The URL in the code below (and some others that I keep finding) exists, but the HTTPS server doesn't answer, so checking for response.code >= 400 doesn't work. Setting page timeouts doesn't work either. The problem is that I have to run through a list of URLs, and when the script hits one of these it hangs.
url = 'https://mediafina.xyz'
page.set_default_timeout = 120000
page.set_default_navigation_timeout = 120000
page.goto(url)
If I run it with the browser window visible
browser = pw.chromium.launch(
    ...
    headless=False,
    ...
)
the browser shows this after a minute or so:
Unable to access this site
mediafina.xyz has unexpectedly terminated the connection.
But I don't know how to catch this event, or whether it is a Playwright event at all. Google wasn't helpful.
What am I missing or doing wrong?

I did the following:
def response_handler(response):
    if not response.ok:
        page.close()

page.on("response", response_handler)
I registered the handler BEFORE calling page.goto(url), and now Playwright doesn't hang anymore.
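One possible reason the timeouts in the question had no effect: in Playwright's Python API, set_default_timeout and set_default_navigation_timeout are methods, so writing `page.set_default_timeout = 120000` only overwrites the attribute and configures nothing. Below is a minimal sketch, assuming the sync API, of looping over a URL list without stopping at a dead host. The helper name check_urls, the 30-second budget and the second URL are illustrative assumptions, and catching a broad Exception is a simplification (Playwright raises its own Error and TimeoutError types).

```python
def check_urls(page, urls, timeout_ms=30_000):
    """Visit each URL; record the HTTP status or the error text instead of stopping."""
    # These are method calls; assigning a number to them configures nothing.
    page.set_default_timeout(timeout_ms)
    page.set_default_navigation_timeout(timeout_ms)

    results = {}
    for url in urls:
        try:
            response = page.goto(url, timeout=timeout_ms)
            # goto() may return None (e.g. for about:blank), so guard the access.
            results[url] = response.status if response is not None else None
        except Exception as exc:  # simplification: Playwright raises Error / TimeoutError
            results[url] = f"failed: {exc}"
    return results


def main():
    # Imported here so check_urls can be exercised without a browser installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as pw:
        browser = pw.chromium.launch(headless=True)
        page = browser.new_page()
        # The second URL is only a placeholder for "a site that answers".
        print(check_urls(page, ["https://mediafina.xyz", "https://example.com"]))
        browser.close()
```

Calling main() runs the loop against a real browser. A host that drops the connection should then show up as a "failed: ..." entry rather than ending the run.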

Related

How to catch the redirect with a webapp using playwright

When you go to this link, the page will run some javascript and then automatically redirect to a pdf. I have a hard time getting that final url from Playwright.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://scnv.io/760y", wait_until="networkidle")
    print(page.url)
    page.close()
Is there a way to get that final url?
There are multiple ways to do it. One way is using page.expect_response:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    # Catch any responses with '.pdf' at the end of the url
    with page.expect_response('**/*.pdf') as response:
        page.goto("https://scnv.io/760y")
    print(response.value.url)
    page.close()
Output
https://qcg-media.s3.amazonaws.com/media/uploads/72778/2022/06/20220622_663043_221.pdf
Check out this section of the documentation that details handling network traffic in playwright.
Also note that I did not include wait_until='networkidle' because that was not appropriate for this use case. For that event to trigger, the network must remain idle for at least 500 ms, which does not happen in the case of this website when it's making the request to the pdf. Therefore, if you were to include that, then the code will be inconsistent at best in catching the request we wanted the url of.
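As one more of the "multiple ways", expect_response also accepts a predicate instead of a glob. A predicate can help when the PDF URL carries a query string or an upper-case extension, which the '**/*.pdf' pattern may not match. This is only a sketch: the helper names is_pdf and final_pdf_url and the 30-second timeout are assumptions, not part of the original answer.

```python
def is_pdf(response):
    """True when the response URL ends in .pdf, ignoring case and any query string."""
    return response.url.split("?", 1)[0].lower().endswith(".pdf")


def final_pdf_url(page, start_url, timeout_ms=30_000):
    """Open start_url and return the URL of the first response that looks like a PDF."""
    # expect_response must wrap the action that triggers the request.
    with page.expect_response(is_pdf, timeout=timeout_ms) as response_info:
        page.goto(start_url)
    # .value is only available after the with-block has exited.
    return response_info.value.url
```

With a real page object from sync_playwright, final_pdf_url(page, "https://scnv.io/760y") would be the equivalent of the snippet above.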

(iOS Cordova) Accessing local files from remote - WKWebview

Having the following variables:
remote web presented in cordova
local files that the remote web asks for with the following code:
var url = "https://cdvfile/localhost/" + localFolder + "/www/cordova.js";
var element = document.createElement('script');
element.id = "cordova";
element.type = "text/javascript";
element.onerror = function () {
    //error
}
element.onload = function () {
    //success - code to be executed upon success
}
element.src = url;
document.body.appendChild(element);
This fails in WKWebView with the obvious error
[Error] Failed to load resource: A server with the specified hostname
could not be found. (cordova.js, line 0)
As you know, WKURLSchemeHandler doesn't intercept http/https requests. An alternative is to use the dangerous [NSURLProtocol wk_registerScheme:@"https"]; private API trick. It works, but it somehow breaks the request that loads the page containing the code above (some cookies are not added, plus other weird behavior).
I do have another alternative to inject via [userContentController addUserScript:script] but this requires modifying the remote web part in order to execute the code that follows the success of the script injection request.
I know it was previously possible to do all this with cdvfile:// in UIWebView, but I am looking for a way to do all this WITHOUT modifying the remote (meaning the URL has to stay as you see it above). I've racked my brains over this for a few months now but can't come up with a solution. Please don't ask why I'm doing this or say that this is stupid; I have no choice, it's what I have to do and it doesn't depend on me.
Please send help, thoughts, prayers etc
Thanks

IE 11 + SignalR not working

Strange behavior is happening when using signalR with IE 11. Scenario:
We have some dispatcher type functionality where the dispatcher does some actions, and the other user can see updates live (querying). The parameters that are sent come through fine and cause updates on the IE client side without having to open the developer console.
BUT the one method that does not work (performUpdate - to get the query results - this is a server > client call, not client > server > client) - never gets called. IT ONLY GETS CALLED WHEN THE DEVELOPER CONSOLE IS OPEN.
Here's what I've tried:
Why JavaScript only works after opening developer tools in IE once?
SignalR : Under IE9, messages can't be received by client until I hit F12 !!!!
SignalR client doesn't work inside AngularJs controller
Some code snippets
Dispatcher side
On dropdown change, we get the currently selected values and send updates across the wire. (This works fine).
$('#Selector').on('change', function(){
    var variable = $('#SomeField').val();
    ...
    liveBatchHub.server.updateParameters(variable, ....);
});
Server Side
When the dispatcher searches, we have some server-side code that sends out a notification that a search has been run and tells the client to pull the results.
public void Update(string userId, Guid bId)
{
    var context = GlobalHost.ConnectionManager.GetHubContext<LiveBatchViewHub>();
    context.Clients.User(userId).performUpdate(bId);
}
Client side (viewer of live updates)
This never gets called unless developer tools is open
liveBatchHub.client.performUpdate = function (id) {
    //perform update here
    update(id);
};
Edit
A little more information which might be useful (I am not sure why it makes a difference) but this ONLY seems to happen when I am doing server > client calls. When the dispatcher is changing the search parameters, the update is client > server > client or dispatcher-client > server > viewer-client, which seems to work. After they click search, a service in the search pipeline calls the performUpdate server side (server > viewer-client). Not sure if this matters?
Edit 2 & Final Solution
Eyes bloodshot, I realize I left out one key part of this question: we are also using Angular on this page. Guess I've been staring at it too long and left this out, sorry. I awarded JDupont the answer because he was on the right track: caching. But it was not jQuery's ajax caching, it was Angular's $http.
Just so no one else has to spend days and nights banging their head against the desk, the final solution was to disable caching on ajax calls made through Angular's $http.
Taken from here:
myModule.config(['$httpProvider', function($httpProvider) {
    //initialize get if not there
    if (!$httpProvider.defaults.headers.get) {
        $httpProvider.defaults.headers.get = {};
    }
    // Answer edited to include suggestions from comments
    // because previous version of code introduced browser-related errors
    //disable IE ajax request caching
    $httpProvider.defaults.headers.get['If-Modified-Since'] = 'Mon, 26 Jul 1997 05:00:00 GMT';
    // extra
    $httpProvider.defaults.headers.get['Cache-Control'] = 'no-cache';
    $httpProvider.defaults.headers.get['Pragma'] = 'no-cache';
}]);
I have experienced similar behavior in IE in the past. I may know of a solution to your problem.
IE caches some ajax requests by default. You may want to try turning this off globally. Check this out: How to prevent IE from caching Ajax with jQuery
Basically you would globally switch this off like this:
$.ajaxSetup({ cache: false });
or for a specific ajax request like this:
$.ajax({
    cache: false,
    //other options...
});
I had a similar issue with my GET requests caching. My update function would only fire off once unless dev tools was open. When it was open, no caching would occur.
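For context on why this helps: with cache: false, jQuery appends a `_=<timestamp>` parameter to GET URLs, so each request looks new to the browser cache. The sketch below reproduces that idea in Python purely as an illustration. The function name cache_busted and the example URL are assumptions, and this is not jQuery's actual implementation.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url, now_ms=None):
    """Return url with a `_=<timestamp>` query parameter, replacing any existing one."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    parts = urlsplit(url)
    # Keep every existing parameter except a stale cache-buster.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "_"]
    query.append(("_", str(now_ms)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(cache_busted("https://example.test/api/results?id=7", now_ms=1234))
# prints https://example.test/api/results?id=7&_=1234
```

Because the URL differs on every call, a cached response for an earlier URL can't be reused.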
If your code works properly in other browsers, the problem may come from the transport method SignalR is using. The transports are WebSocket, Server Sent Events, Forever Frame and Long Polling, chosen based on browser support.
Forever Frame is for Internet Explorer only. See the Introduction to SignalR to find out which transport is used in which case (note that not every transport is available in every browser; for example, IE doesn't support Server Sent Events).
You can find out which transport is in use inside a Hub by looking at the request's query string, which can be useful for logging:
Context.QueryString["transport"];
I think the issue likely comes from IE using Forever Frame, since it sometimes causes SignalR to crash on Ajax calls. You can try removing Forever Frame from the transports and forcing SignalR to use the remaining ones the browser supports with the following client-side code:
$.connection.hub.start({ transport: ['webSockets', 'serverSentEvents', 'longPolling'] });
I've laid out some facts about SignalR and given you some logging/tracing tools to help solve your problem. For more help, please add more details :)
Update:
Since your problem seems very strange and I don't have much visibility into your code, here are some suggestions based on my experience that I hope are useful:
Set up Browser Link in a suitable IDE
Check the request/response data in the Network tab while the call is in progress
Make sure you haven't used reserved names on the server or client side (perhaps by renaming methods and variables)
Also, I think you need to call liveBatchHub.server.update(variable, ....); instead of liveBatchHub.server.updateParameters(variable, ....); on the dispatcher side, since the name after server. should be the server method's name.

rails responding to cross domain request developing locally, spotify app development

So I'm messing around with developing a Spotify app, trying to get it to talk to my local Rails application API. I can't get anything other than a req.status of 0 when I try it.
I think it may be a problem with the Spotify manifest.json file not allowing :3000 on the URL set in the required permissions, but their documentation also says the following:
https://developer.spotify.com/technologies/apps/tutorial/
If you need to talk to an outside web API you're welcome to, as long as you abide by the rules set in the Integration Guidelines. Please note that when talking with a web API, the requests will come from the origin sp://$APPNAME (so sp://tutorial for our example) - make sure the service you are talking to accepts requests from such an origin.
So I'm not sure whether Rails is set to disallow this sort of thing or whether it's an issue with putting the port into the required permissions, but my request
var req = new XMLHttpRequest();
req.open("GET", "http://127.0.0.1:3000/api/spotify/track/1.json", true);
console.log(req);
req.onreadystatechange = function() {
    console.log(req.status);
    console.log(req.readyState);
    if (req.readyState == 4) {
        if (req.status == 200) {
            console.log("Search complete!");
            console.log(req.responseText);
        }
    }
};
req.send();
Always returns status 0, whereas their example:
var req = new XMLHttpRequest();
req.open("GET", "http://ws.audioscrobbler.com/2.0/?method=geo.getevents&location=" + city + "&api_key=YOUR_KEY_HERE", true);
req.onreadystatechange = function() {
    console.log(req.status);
    if (req.readyState == 4) {
        console.log(req);
        if (req.status == 200) {
            console.log("Search complete!");
            console.log(req.responseText);
        }
    }
};
req.send();
Will return a 403 response at least. It's as if the request isn't being made at all.
Anyone have any idea what might be going on?
Much appreciated!
When talking to external services from a Spotify App, even if they're running on your local machine, you need to make sure that two things are in place correctly:
The URL (or at least the host) is in the RequiredPermissions section of your manifest. Port doesn't matter. http://127.0.0.1 should be fine for your case.
The server is allowing the sp://your-app-id origin for requests, as noted in the documentation you pasted in your question. This is done by setting the Access-Control-Allow-Origin header in your service's HTTP response. People often set it to Access-Control-Allow-Origin: * to allow anything to make requests to their service.
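To make the second point concrete, here is a minimal sketch of a local service that sends the Access-Control-Allow-Origin header on every response, written with Python's standard library rather than Rails. The handler name CorsHandler, the make_server helper and the JSON payload are illustrative assumptions. A real service would probably restrict the origin instead of using *.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class CorsHandler(BaseHTTPRequestHandler):
    def _send_cors_headers(self):
        # "*" lets any origin (including sp://your-app-id) read the response.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Access-Control-Allow-Methods", "GET, OPTIONS")

    def do_OPTIONS(self):
        # Answer preflight requests with the same CORS headers and no body.
        self.send_response(204)
        self._send_cors_headers()
        self.end_headers()

    def do_GET(self):
        body = b'{"track": 1}'  # placeholder payload
        self.send_response(200)
        self._send_cors_headers()
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=0):
    """Bind to 127.0.0.1; port 0 asks the OS for any free port."""
    return HTTPServer(("127.0.0.1", port), CorsHandler)
```

Running make_server(3000).serve_forever() would serve on the port used in the question. Checking the response headers in the browser's network panel (or with curl -i) is a quick way to confirm the header is actually being sent.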
Thanks for the help, I got it figured out. I think it was multiple things, with one main "I'm an idiot" moment for not trying that earlier.
First off, I had to run Rails on port 80. Obviously, if I'm accessing my site at 127.0.0.1:3000, that's not going to work when the Spotify app is requesting 127.0.0.1; that address only loads directly in the browser if you run on port 80. That is done via
rvmsudo rails server -p 80
You need rvmsudo because binding to port 80 requires elevated permissions.
Next I had to set Access-Control-Allow-Origin as noted above. That can be done in Rails 3 by adding a before filter to your application controller as follows.
class ApplicationController < ActionController::Base
  logger.info "I SEE REQUEST"
  before_filter :cor

  def cor
    headers["Access-Control-Allow-Origin"] = "*"
    headers["Access-Control-Allow-Methods"] = %w{GET POST PUT DELETE}.join(",")
    headers["Access-Control-Allow-Headers"] = %w{Origin Accept Content-Type X-Requested-With X-CSRF-Token}.join(",")
    head(:ok) if request.request_method == "OPTIONS"
  end
end
Finally, and most importantly (sigh), you can't just right-click and reload your Spotify app when you make changes to your manifest file. Exit Spotify completely and restart it!

CI 2.0.3 session heisenbug: session is lost after some time (20 minutes), only on server redirect, nothing suspicious in the logs

I can't seem to make any progress with this one. My CI session settings are these:
$config['sess_cookie_name'] = 'ci_session';
$config['sess_expiration'] = 0;
$config['sess_expire_on_close'] = FALSE;
$config['sess_encrypt_cookie'] = FALSE;
$config['sess_use_database'] = TRUE;
$config['sess_table_name'] = 'ci_sessions';
$config['sess_match_ip'] = FALSE;
$config['sess_match_useragent'] = FALSE;
$config['sess_time_to_update'] = 7200;
$config['cookie_prefix'] = "";
$config['cookie_domain'] = "";
$config['cookie_path'] = "/";
$config['cookie_secure'] = FALSE;
The session library is loaded on autoload. I've commented out the sess_update function to prevent an AJAX bug that I found out about by reading the CI forum.
The ci_sessions table in the database has collation utf8_general_ci (there was a bug that lost the session after every redirect() call and it was linked to the fact that the collation was latin1_swedish_ci by default).
It always breaks after a user of my admin section tries to add a long article and clicks the save button. The save action looks like this:
function save($id = 0){
    if($this->my_model->save_article($id)){
        $this->session->set_flashdata('message', 'success!');
        redirect('admin/article_listing');
    }else{
        $this->session->set_flashdata('message', 'errors encountered');
        redirect('admin/article_add');
    }
}
If you spend more than 20 minutes on the form and then click save, the article is added, but on redirect the user is logged out.
I've also enabled logging and sometimes when the error occurs i get the message The session cookie data did not match what was expected. This could be a possible hacking attempt. but only half of the time. The other half I get nothing: a message that I've placed at the end of the Session constructor is displayed and nothing else. In all the cases if I look at the cookie stored in my browser, after the error the cookie's first part doesn't match the hash.
Also, although I know Codeigniter doesn't use native sessions, I've set session.gc_maxlifetime to 86400.
Another thing to mention is that I'm unable to reproduce the error on my computer but on all the other computers I've tested this bug appears by the same pattern as mentioned above.
If you have any ideas on what to do next, I'd greatly appreciate them. Changing to a new version or using a native session class (the old one was for CI 1.7, will it still work?) are also options I'm willing to consider.
Edit : I've run a diff between the Session class in CI 2.0.3 and the latest CI Session class and they're the same.
Here's how I solved it: according to the standards, a browser shouldn't automatically follow a redirect after a POST request without confirmation from the user. CI's redirect() method sends a 302 redirect by default. The logical alternative would be a 307 redirect, which solved my problem but has the caveat of showing a confirmation dialog about the redirect. Other options are a 301 (moved permanently) redirect or, the solution I chose, a JavaScript redirect.

Resources