How does a Google URL with # work?

How does a URL like https://www.google.co.in/#q=harry+potter work?
As I understand it, anything after the # is not sent to the server.
Yet if we paste the above URL into a browser, we get the search page for Harry Potter.
As I understand it, when one requests the above URL, a request is sent to the server, and since the search term "Harry Potter" comes after the '#', it is not sent along. So the server has no way to determine what to search for. How does this work, then? Does the browser do anything special?

Your understanding is correct: the server does not see your search term.
Client-side JavaScript runs on page load and inspects the URL. It then issues an XHR request with the search term included in a way that is visible to the server (https://www.google.co.in/search?q=harry+potter&...).
Reload the page with JavaScript disabled and you will get the regular page, without a pre-filled search box or results.
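To make the split concrete, here is a minimal Python sketch (the page itself would do this in JavaScript). It separates what goes on the wire from what stays in the browser, then rebuilds a server-visible URL from the fragment. The function names are hypothetical, and the /search path is taken from the example URL above; Google's actual script is not known to me.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

def split_for_server(url):
    """Return (request_target, fragment): the part sent in the HTTP request
    line and the part that never leaves the browser."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target, parts.fragment

def search_url_from_fragment(url):
    """Mimic what a page script could do: read q from the fragment and
    build a URL whose query string the server can see."""
    parts = urlsplit(url)
    params = parse_qs(parts.fragment)      # the fragment is parsed like a query string
    query = urlencode({"q": params["q"][0]})
    return f"{parts.scheme}://{parts.netloc}/search?{query}"

url = "https://www.google.co.in/#q=harry+potter"
target, fragment = split_for_server(url)
follow_up = search_url_from_fragment(url)
print(target)     # what the first request asks for
print(fragment)   # what only the browser knows
print(follow_up)  # what a follow-up XHR could request
```

The first request only asks for "/", which is why the server returns the plain page until the script has run.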

POST Request is Displaying as GET Request During Replay In Jmeter

I have a JMeter script in which, during replay, a POST request shows up as a GET request and its parameters are not sent to the server. Because of this, correlation fails at this request.
One of the parameters in the request is ViewState, which has a very long value. Is this large parameter value causing the issue? How should I proceed?
Most probably you're sending a malformed request, so instead of getting a proper response to the POST you're being redirected somewhere (most probably to the login page).
Use the View Results Tree listener in HTML or Browser mode to see which page you are actually hitting.
As for the ViewState, "so many characters" is not a problem; the problem is that these are not random characters. ViewState is used for client-side state management, and if you fail to provide the proper value you won't be able to proceed, so you need to design your test as follows:
1. Open the first page
   - Extract the ViewState using a suitable Post-Processor
2. Open the second page
   - here you need to pass the ViewState from step 1 along with the other parameters
More information: ASP.NET Login Testing with JMeter
Also, don't forget to add an HTTP Cookie Manager to your Test Plan.
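Outside JMeter, the same extract-then-replay flow can be sketched in a few lines of Python. This is only an illustration under assumptions: the HTML snippet, the username field, and the function names are made up, and the regular expression assumes the name attribute appears before value in the hidden input. In JMeter itself a Post-Processor does the extraction.

```python
import re
from html import unescape
from urllib.parse import urlencode

# Assumes name="__VIEWSTATE" precedes value="..." inside the <input> tag.
VIEWSTATE_RE = re.compile(
    r'<input[^>]*\bname="__VIEWSTATE"[^>]*\bvalue="([^"]*)"', re.IGNORECASE
)

def extract_viewstate(html):
    """Pull the hidden __VIEWSTATE value out of the first page's HTML."""
    match = VIEWSTATE_RE.search(html)
    if match is None:
        raise ValueError("no __VIEWSTATE field found")
    return unescape(match.group(1))

def build_post_body(viewstate, fields):
    """URL-encode the fresh ViewState together with the other form fields."""
    return urlencode({"__VIEWSTATE": viewstate, **fields})

# Hypothetical response body of the first page.
first_page = ('<form><input type="hidden" name="__VIEWSTATE" '
              'id="__VIEWSTATE" value="/wEPDwUK+abc=" /></form>')
viewstate = extract_viewstate(first_page)
body = build_post_body(viewstate, {"username": "demo"})
print(body)
```

Note that the value is percent-encoded before it is sent: characters such as + and = are common in ViewState, and a + sent unencoded in a form body is decoded as a space.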
As far as I can tell, the request may be getting redirected. This usually happens when the server expects a unique request. If you recorded the request, you may be replaying old headers that carry stale cookie information. Check your headers and then reconstruct the request.
Make sure you are not using old cookies anywhere: remove the Cookie entry from every HTTP Header Manager.

Check if a URL can be loaded in an iframe

Snip.ly nicely checks whether an entered web address can be used in an iframe.
I'd like to replicate that in Ruby. Looking through their code, they send an Ajax request to their server, and that's where they do the validation.
Even after extensive googling, I couldn't find anything that would help me accomplish this.
My use case: we let users add news listings to their page, which are shown in iframes, and we would like to indicate whether the entered URL can be used in an iframe.
You can figure out some cases by checking the X-Frame-Options header. But as you mentioned in the comments, it does not work all the time.
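A header check of that kind might look like the sketch below. It is written in Python rather than Ruby, and the function name is hypothetical. It looks at X-Frame-Options and, as an addition not mentioned above, at the frame-ancestors directive of Content-Security-Policy. It only reports the clear-cut cases; a frame-ancestors host list is not evaluated, and pages can also break out of frames with scripts, which no header check will detect.

```python
def framing_clearly_blocked(headers):
    """Return True when the response headers clearly forbid embedding
    the page in a third-party iframe. Not an exhaustive check."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # X-Frame-Options: DENY and SAMEORIGIN both exclude a foreign embedder.
    xfo = lowered.get("x-frame-options", "").strip().lower()
    if xfo in ("deny", "sameorigin"):
        return True

    # CSP frame-ancestors: only the 'none' / 'self' cases are decided here.
    csp = lowered.get("content-security-policy", "")
    for directive in csp.split(";"):
        tokens = directive.split()
        if tokens and tokens[0].lower() == "frame-ancestors":
            sources = [token.lower() for token in tokens[1:]]
            return all(source in ("'none'", "'self'") for source in sources)

    return False

print(framing_clearly_blocked({"X-Frame-Options": "DENY"}))
print(framing_clearly_blocked({"Content-Type": "text/html"}))
```

A False result means "no obvious prohibition found", not a guarantee that the page will render in the iframe.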
In my experience, it's best to side-step the problem altogether.
If you reverse-proxy the request through your Rails server, you can display pretty much anything in your iframe.
Here is an example of the process. I'm assuming that your server is your-server.com and the user wants to list the page user.com/list. It would work like this:
1. Set an iframe's src to https://your-server.com/proxy?url=https://user.com/list
2. Intercept the request and extract the URL: https://user.com/list
3. Perform an HTTP request on https://user.com/list to fetch the content
4. Return it to the browser as if it came from your own server
This approach works pretty much all the time, but it then has other limitations:
- you should reverse proxy any asset on that page that has a relative url; otherwise the css/images may be broken
- you must handle ajax requests on that page
You can fix these as well, by transforming the html before step 4.
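As a rough idea of what that HTML transformation could involve, here is a small Python sketch (not Ruby) that resolves every src/href against the page URL and routes it back through the proxy endpoint from the example. The regular expression is a deliberately crude stand-in for a real HTML parser, the function names are hypothetical, and the target URL is appended unencoded only to mirror the example URL format.

```python
import re
from urllib.parse import urljoin

# Proxy endpoint taken from the example; adjust to your own route.
PROXY_PREFIX = "https://your-server.com/proxy?url="

# Crude match for double-quoted src/href attributes (assumption: simple markup).
ATTR_RE = re.compile(r'\b(src|href)="([^"]*)"')

def proxied(page_url, link):
    """Resolve a possibly relative link against the page URL, then proxy it."""
    return PROXY_PREFIX + urljoin(page_url, link)

def rewrite_links(page_url, html):
    """Rewrite every src/href so the browser fetches assets via the proxy."""
    return ATTR_RE.sub(
        lambda m: f'{m.group(1)}="{proxied(page_url, m.group(2))}"', html
    )

html = '<link href="/style.css"><img src="img/logo.png">'
rewritten = rewrite_links("https://user.com/list", html)
print(rewritten)
```

URLs built at runtime by the page's own scripts (the Ajax case) are not covered by a static rewrite like this.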
You could use https://github.com/waterlink/rack-reverse-proxy for steps 2 and 3, instead of implementing your own reverse proxy.
You could set it up using the following code in config/application.rb:
config.middleware.insert(0, Rack::ReverseProxy) do
  reverse_proxy_options timeout: 10 # avoids waiting for pages that take forever to load
  reverse_proxy(/proxy\?url=(.*)/, '$1') # reverse proxy on the url parameter
end

Detecting main URL with IdHTTPProxyServer

I want to make an application that redirects websites.
It has a table of "domains" and "redirect domains".
When a request matches a domain, it is redirected to the corresponding redirect domain.
If nothing matches, it is redirected to a default page.
So I created a Delphi application with IdHTTPProxyServer.
I have configured it to work even with HTTPS, using "ssleay32.dll" and "libeay32.dll".
Everything works great.
It uses the "IdHTTPProxyServerHTTPBeforeCommand" event to redirect, like this:
with AContext.Connection.IOHandler do
begin
  WriteLn('HTTP/1.0 302 Moved Temporarily');
  WriteLn('Location: ' + RedirectURL);
  WriteLn('Connection: close');
  WriteLn;
end;
But how do I distinguish an event triggered by the main URL (the one the user typed in the address bar) from events for other URLs?
The "IdHTTPProxyServerHTTPBeforeCommand" event is called many times while a page loads, for stat counters, Facebook like buttons, etc. I don't want to redirect all of those to the default page.
If this is not possible with IdHTTPProxyServer, are there any other options in Delphi or another language (one that can generate a native executable; C++ preferred)?
Thank you
From the perspective of a proxy (or the target HTTP server, for that matter), there is no difference whatsoever between a user-typed URL and other URLs. Every HTTP request is self-contained and independent of every other HTTP request. They have to be processed as-is on a per-request basis.
If you want to ignore dependent URLs (images, scripts, etc), you will have to know ahead of time what the initial URL is, parse the data that is retrieved from that URL, keep track of any URLs the data refers to, and then ignore those URLs if you see them being requested later. However, there is nothing in the HTTP protocol to tell you what the initial URL is. There is a Referer request header that may help at times, as it is filled in when a browser is requesting dependent resource files, but it is also filled in when the user navigates around from one page to another, so you can't rely on the Referer by itself. You will have to implement your own discovery logic to figure out the initial URL based on more analysis of the URLs being requested by a given client over time.
Only the client really knows what it is requesting and why; a proxy is just a gateway to reach it. So there is only so much smart filtering you can do in a proxy without knowing what the client is actually doing.
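One possible starting point for such discovery logic is to look at request headers that browsers usually set differently for page loads and for sub-resources. The Python sketch below (not Delphi) is a heuristic under assumptions that go beyond the answer: it trusts the Sec-Fetch-Dest header when present and otherwise checks whether Accept leads with text/html. Both are hints that a client may omit or forge, and a proxy only sees them for HTTPS if it decrypts the traffic. The function names, the table, and the URLs are hypothetical.

```python
def looks_like_navigation(headers):
    """Guess whether a request is a top-level page load (heuristic only)."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Fetch Metadata: browsers that send it use "document" for page navigations.
    if "sec-fetch-dest" in lowered:
        return lowered["sec-fetch-dest"].strip().lower() == "document"

    # Fallback: page loads usually list text/html first in Accept.
    first = lowered.get("accept", "").split(",")[0].strip().lower()
    return first == "text/html"

def redirect_target(host, headers, table, default_url):
    """Return the redirect URL for a page load, or None to pass the
    request through untouched (images, scripts, counters, ...)."""
    if not looks_like_navigation(headers):
        return None
    return table.get(host.lower(), default_url)

table = {"example.com": "http://new.example/"}
print(redirect_target("example.com", {"Accept": "text/html,application/xhtml+xml"},
                      table, "http://default.example/"))
print(redirect_target("cdn.example.com", {"Accept": "image/webp,*/*"},
                      table, "http://default.example/"))
```

Requests that do not look like navigations are left alone, which avoids redirecting every like button and stat counter to the default page.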

Query after '#' in https://www.google.co.in/#q=better+flight+search

A URL follows this scheme:
scheme://domain:port/path?query_string#fragment_id
but a search for the string
better flight search
results in the following URL:
https://www.google.co.in/#q=better+flight+search
According to the URL scheme, # is followed by the fragment. Correct me if I am wrong, but fragments are not sent to the server, so how does Google show search results?
As you realized, the fragment portion of the URL is not sent to the server in an HTTP request. Instead, it is used locally by the browser to mark places in the document. Some client side frameworks take advantage of this fact and use the fragment as a secondary query string.
So, for instance, in your example with Google, doing a search on a Google page causes the page to navigate to a fragment like #q=better+flight+search. The browser sees the change and notifies the page's JavaScript that the URL was changed. Since the URL minus the fragment is the same, the browser doesn't perform a request to the server. In this case, the page's JavaScript sees the fragment change and uses that to perform an Ajax query to get search results. Doing this allows Google to give you search results without reloading the page, which is a huge win for both server and client (server, because it doesn't have to deal with the overhead of serving another page; client, because load times are decreased dramatically).
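The "URL minus the fragment is the same" test is easy to illustrate. The Python sketch below uses the standard urldefrag helper to compare two URLs the way described above; the function name same_document is hypothetical, and real browsers apply further normalization that is ignored here.

```python
from urllib.parse import urldefrag

def same_document(current_url, next_url):
    """True when the two URLs differ at most in their fragment, i.e. when
    moving between them needs no new request to the server."""
    return urldefrag(current_url).url == urldefrag(next_url).url

before = "https://www.google.co.in/"
after = "https://www.google.co.in/#q=better+flight+search"

print(same_document(before, after))   # fragment-only change
print(same_document(before, "https://www.google.co.in/search?q=better+flight+search"))
print(urldefrag(after).fragment)      # the part left for the page's script
```

Only the second comparison involves a different path and query, so only that navigation would reach the server.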
For the related #! syntax, see this question.

HttpWebRequest simulating the request from firebug always failed

I have run into a strange problem. I am trying to automate visiting a web site using WebRequest and WebClient. I inspected all the POST request header key-value pairs and the posted data string in Firebug (the request Headers and Post tabs). Then I simulated the request with WebRequest, supplying all the header parameters and the posted data. However, when I call GetResponse() on this request instance, I always get back an error page saying that some sessionID is missing.
I did take care to put the session cookie returned earlier (in the first step, opening the logon page) into the request's Cookie header. I can get the correct response when simulating the request for the logon page (the first page), but I cannot get past the authentication page. My post data looks like userid=John&password=123456789&domain=highmark. The same authentication request succeeds every time when the browser makes it.
Am I missing something in the request that Firebug may not show? If so, can you recommend tools that can examine the entire request sent by the browser?
I have solved this issue. The problem was that I had set AllowAutoRedirect=true on the HttpWebRequest instance. As a result, when I got the first response from the server, the HttpWebRequest automatically went on to make another request to the different URL given in the response's Location header.
The shortcoming of the HttpWebRequest class is that, when it follows a redirect, it does not include the cookies from the response's Set-Cookie header in the next request, so the server denies that page request and may redirect again to yet another page.
With AllowAutoRedirect=true, httpWebRequest.GetResponse() returns only the last page in the chain, so I got a totally different response from the one I expected.
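The manual alternative, turning off automatic redirects and replaying cookies yourself on each hop, can be sketched as follows. This is Python rather than C#, and it is only an outline: fetch is a hypothetical callable standing in for the real HTTP call (it must not follow redirects itself), and cookie scoping by domain, path, and expiry is ignored.

```python
from http.cookies import SimpleCookie
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)

def follow_redirects(fetch, url, max_hops=5):
    """Follow redirects by hand, sending back every cookie set on the way.

    `fetch(url, cookie_header)` is a hypothetical callable returning
    (status, headers), where headers is a list of (name, value) pairs.
    """
    jar = {}
    for _ in range(max_hops + 1):
        cookie_header = "; ".join(f"{k}={v}" for k, v in jar.items())
        status, headers = fetch(url, cookie_header)

        location = None
        for name, value in headers:
            if name.lower() == "set-cookie":
                parsed = SimpleCookie()
                parsed.load(value)            # keeps name=value, drops attributes
                for key, morsel in parsed.items():
                    jar[key] = morsel.value
            elif name.lower() == "location":
                location = value

        if status not in REDIRECT_CODES or location is None:
            return status, url, jar
        url = urljoin(url, location)          # Location may be relative

    raise RuntimeError("too many redirects")
```

Because each hop is visible, you can also inspect every intermediate response instead of seeing only the last page in the chain.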
I also want to thank an excellent HTTP traffic inspection tool that helped me solve this: IEInspector HTTP Analyzer (http://www.ieinspector.com/httpanalyzer/). Its great feature is that it can examine not only the HTTP traffic from the browser but also the requests made by your own process's HttpWebRequest. It can also display the raw stream of those requests and responses in text format. Although it is commercial software, you can try it for 15 days. I am quite happy with the well-formed detail it gives me, and I would like to buy it as well.
