Given 2 URLs, is it possible to know if the resources are on the same web server? - url

I am accessing 2 URLs. The domain name/server part is the same. The resource part is different.
The URLs are like the following:
https://aa.bb.com/dir1/dir2
https://aa.bb.com/dir3
When I access the first URL, I get redirected to the second URL. Could the second URL be hosted on a different web server than the first, or must both resources be on the same web server?

If by web server you mean physical computer, absolutely they could be on different servers. Google and Akamai, among others, have large collections of machines serving the same domain names. It helps with speed, since you are likely to receive pages from a server near you.
In general, it does not appear to be possible to reliably tell whether you are talking to the exact same server before and after a redirect. First, it is difficult to test for IP addresses from a Web page (see, e.g., this question and this one). Second, even if the IP addresses are the same before and after the redirect, they may be on different machines. For example, TCP anycast can change which server you are talking to without changing the IP address. Also, network address translation and load-balancing may change which server you are talking to behind a firewall, which you would probably have no way of finding out unless the server provided some ID of its own.
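As a rough illustration of the IP check (and its limits), here is a minimal Python sketch; the aa.bb.com URL is the placeholder from the question, and per the caveats above, matching addresses still prove nothing about the physical machine.

# Compare the IPs a hostname resolves to before and after following a
# redirect. A match is only a hint: anycast, NAT, and load balancing
# can hide different machines behind one IP address.
import socket
import urllib.request
from urllib.parse import urlparse

def resolved_ips(url):
    host = urlparse(url).hostname
    return {info[4][0] for info in socket.getaddrinfo(host, 443)}

start_url = "https://aa.bb.com/dir1/dir2"        # placeholder from the question
with urllib.request.urlopen(start_url) as resp:  # urlopen follows redirects
    final_url = resp.geturl()

print("before:", resolved_ips(start_url))
print("after: ", resolved_ips(final_url))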

Related

Was this site hacked? URL redirects when "www" removed.

I'm trying to figure out whether a website I use was hacked.
When I access the site via www.site-name.com, I'm taken to the website.
However, when I access the site without the "www," i.e. site-name.com, I'm taken to a different website.
Why is this happening? I did a little research and my only guess is that someone changed the site's .htaccess file, but that seems unlikely, as the different website has no relation to the official site.
Can someone help me understand what's going on here?
One IP address can host multiple websites with different hostnames using Virtual Name Hosting.
The HTTP server will look at the Host header in the request to determine what site to use for a given request.
This lets you have one IP address serving example.com and example.net.
Typically, the first Virtual Name Host will be the default, so if you were to ask for example.org the server would not recognise it and give you example.com instead.
In this case, it appears that the server has a Virtual Name Host configured for www.site-name.com but not for site-name.com so requests for site-name.com get the default site for the server.
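As a rough sketch of that mechanism (the hostnames are placeholders, and a real server like Apache does this in its configuration rather than in code), a name-based virtual host dispatcher might look like this in Python:

# A toy name-based virtual host dispatcher: one listener, several sites,
# chosen by the Host header; unknown hosts fall through to the default.
from wsgiref.simple_server import make_server

SITES = {
    "www.site-name.com": b"the official site",
    "example.net": b"some other site",
}
DEFAULT_SITE = b"the server's default site"

def app(environ, start_response):
    host = environ.get("HTTP_HOST", "").split(":")[0]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [SITES.get(host, DEFAULT_SITE)]

make_server("", 8080, app).serve_forever()

A request for site-name.com, which is missing from SITES here, would get DEFAULT_SITE, which is exactly the symptom described in the question.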

How does the URL I type in lead to the eventual content I see in my browser?

I'm trying to figure out how these all work together, and there are bits and pieces of information all over the internet.
Here's what I (think) I know:
1) When you enter a URL into your browser, the domain name gets looked up in a domain name server (DNS), and you are sent back an IP address.
2) Your computer then follows this IP address to a server somewhere.
3) On the server there are nameservers, which direct you to the specific content you want within the server. -> This step is unclear to me.
4) With this information, your request is received and the server relays site content back to you.
Is this correct? What do I have wrong? I've done a lot of searching over the past week, and the thing I think I'm missing is the big picture explanation of how all these details tie together.
Smaller questions:
a) How does the nameserver know which site I want directions to?
b) How can a site like GoDaddy own URLs? Why do I have to pay them yearly fees, and why can't I buy a URL outright?
I'm looking for a cohesive explanation of how all this stuff works together. Thanks!
How does content get loaded when I put a URL in a browser?
There are some very good docs available on this topic; each step has its own logic and algorithms attached to it. Here I am giving you a walkthrough.
Step 1: DNS lookup: The domain name gets converted into an IP address. In this process, the domain name from the URL is used to find the IP address of the associated server machine by looking up records on multiple servers called name servers.
Step 2: Service request: Once the IP address is known, a service request, depending on the protocol, is created in the form of packets and sent to the server machine at that IP address. In the case of a browser it will normally be an HTTP request; in other cases it can be something else.
Step 3: Request handling: Depending on the service request and underlying protocol, the request is handled by a software program that normally lives on the server machine whose address was discovered in the previous step. As per the logic programmed into the server program, it returns an appropriate response; in the case of HTTP this is called an HTTP response.
Step 4: Response handling: In this step the requesting program, in your case a browser, receives the response from the previous step and renders and displays it as defined by the protocol; in the case of HTTP, an HTTP body is extracted and rendered, which is written in HTML.
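To make the walkthrough concrete, here is a minimal Python sketch of the four steps over a raw socket; example.com stands in for any site, and a real browser adds caching, TLS, and much more.

# Step 1: DNS lookup; Step 2: connect and send the service request;
# Step 3 happens on the server; Step 4: read and handle the response.
import socket

host = "example.com"
ip = socket.gethostbyname(host)                # Step 1: name -> IP address

sock = socket.create_connection((ip, 80))      # Step 2: reach the server
request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
sock.sendall(request.encode())                 # the HTTP request goes out

response = b""                                 # Step 4: response handling
while chunk := sock.recv(4096):
    response += chunk
print(response.decode(errors="replace").split("\r\n")[0])  # e.g. HTTP/1.1 200 OK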
How does the nameserver know which site I want directions to
A URL has a very well-defined format, from which a browser finds the hostname/domain name, which is used in turn to find the associated IP address; name servers run different algorithms to find the correct server machine's IP.
Find more about DNS resolution here.
How can a site like GoDaddy own urls? Why do I have to pay them yearly fees, and why can't I buy a url outright?
Domain names are resources that need management and regulation, which is done by ICANN. ICANN oversees registries, from which registrars (like GoDaddy) get domains and book them for you; the cost you pay is split between ICANN and the registrar.
The registrar does a lot of work for you, e.g. setting up name servers, providing hosting, etc.
Technically you can create your own domain name, but of course it won't be free, because you would need to run a name server and have other servers replicate its records; that way you could have whatever name you want (it has to be unique). A simple way to try this locally is by editing your hosts file (in Linux it is located at /etc/hosts and in Windows at C:\Windows\System32\drivers\etc\hosts), with a line such as 203.0.113.5 myname.example. But that is no good on the internet, since it won't be accepted by other servers.
(A precise and detailed description of this process would probably take too much space and time to write; I am sure you can google it somewhere.) So, although very simplified, you have a pretty good picture of what is going on, but some clarifications are needed (again, I will be somewhat imprecise):
Step 2: Your computer does follow the IP address received in step 1, but the request sent to that IP address usually contains one important piece of information called the 'Host header', which is the actual name as you typed it in your browser.
Step 3: There is no nameserver involved here; the software (or hardware) is usually called a 'web server' (for example Apache, IIS, nginx, etc.). One web server can serve one or many different sites. In case it serves more than one, the web server will use the 'Host header' to direct you to the specific content you want.
ICANN 'owns' the domain names, and the registration process involves technical and administrative effort, so you pay registrars to handle that.

SEO Destroyed By URL Forwarding - Can't figure out another way

We design and host websites for our clients/sales force. We have our own domain: http://www.firstheartland.com
Our agents fill out a series of forms on our website that are loaded into a database. That database then drives the rendering of each agent's website.
/repwebsites/repSite.cfm?link=&rep=rick.higgins
/repwebsites/repSite.cfm?link=&rep=troy.thompson
/repwebsites/repSite.cfm?link=&rep=david.kover
The database application reads which "rep" the site is for and the appropriate page to display from the query string. The page then outputs the content and the appropriate CSS to style the page and give it its own individual branding.
We have told the users to use domain name forwarding to get their visitors to their spot on our server. However, everyone seems to be getting indexed under our domain instead of their own. We could in theory assign a new IP to each of them; the cost is not the issue.
The issue is how we would possibly accomplish this.
With all of that said, them being indexed under our domain would still be OK as long as they would actually show up high in the ranking for their search term.
For instance, an agent owns TroyLThompson.com. If I search "Troy L Thompson", it does not show up in my search; only "troy thompson first heartland" works (they show up third).
Apart from scrapping the whole system, I don't know what to do. I'm very open to ideas.
I'm sure you can get this to work as most hosting companies will host hundreds of websites on a single server (i.e. multiple domains on one IP).
I think you need your clients to update the nameservers for their domains (i.e. DNS) to return the IP address of your hosting server. Then you need to configure your server to return the right website based on the domain that was originally requested.
That requires your "database driven website" to look in the HTTP request and check which domain was originally requested, so it can handle the request accordingly (see the sketch after this answer).
- If you are using Apache, see how to configure Apache to host multiple domains on one IP address.
- If you are using Microsoft IIS, maybe Host-Header Routing is what you need.
You will likely need code changes on your "database driven website" to cope with these changes.
I'm not sure that having a dedicated IP address per domain will help much, as then you have to find a way to host all those IP addresses from a single web server. However, if your web server architecture already supports a shared database and multiple servers, then that approach might work well for you, especially if you expect the load from some domains to be so heavy that you need a dedicated web server for them.
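Here is a hedged sketch of that idea in Python; the domain-to-rep mapping is invented for illustration (only TroyLThompson.com appears in the question), and your actual ColdFusion code would do the equivalent lookup. Once an agent's domain points at your server, the Host header tells you which rep site to render.

# Pick the rep from the Host header instead of a ?rep= query string.
# troylthompson.com is from the question; the other domains are guesses.
DOMAIN_TO_REP = {
    "troylthompson.com": "troy.thompson",
    "rickhiggins.com": "rick.higgins",
    "davidkover.com": "david.kover",
}

def rep_for_request(environ):
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    if host.startswith("www."):
        host = host[4:]
    return DOMAIN_TO_REP.get(host)  # None -> serve your default site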
Google does not include URLs in its index that return a 301 status code. The reason is pretty obvious on second thought: the redirect tells Google "Whatever was here before has moved there, please update your references". One solution I can see is setting up Apache virtual hosts on your server for each external domain, and having each rep configure their domain's DNS A record to point to the IP address of your server.
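If you want to verify what a domain's forwarding actually returns, here is a small Python check that fetches the front page without following redirects; a 301 here is what keeps the rep's own domain out of the index.

# See whether the forwarding answers with a 301 and where it points.
import urllib.request, urllib.error

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise instead of following

opener = urllib.request.build_opener(NoRedirect)
try:
    resp = opener.open("http://troylthompson.com/")  # domain from the question
    print(resp.status)                               # no redirect happened
except urllib.error.HTTPError as err:
    print(err.code, err.headers.get("Location"))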

Rails/Passenger/Apache: Simple one-off URL redirect to catch stale DNS after server move

One of my Rails apps (using Passenger and Apache) is changing server hosts. I've got the app running on both servers (the new one in testing) and the DNS TTL set to 5 minutes. I've been told by a colleague (and have experienced something like this myself) that DNS resolvers sometimes ignore the TTL and may have the old IP cached for some time after I update DNS to the new server.
So, after I've thrown the switch on DNS, what I'd like to do is hack the old server to issue a forced redirect to the IP address of the new server for all visitors. Obviously I can do a number of redirects (301, 302) in either Apache or the app itself. I'd like to avoid the app method, since I don't want to check in and deploy code just for this one instance, so I was thinking a basic HTTP URL redirect would work. But there are SEO implications should Google visit the old site, etc.
How best to achieve the re-direct whilst maintaining search engine niceness?
I guess the question is - where would you redirect to? If you are redirecting to the domain name, the browser (or bot) would just get the same old IP address and end up in a redirect loop.
If you redirect to an IP address... well, that's not going to look very user-friendly in someone's browser.
Personally, I wouldn't do anything. There may be some short period where bots get errors trying to access your site, but it should all work itself out in a couple of days without any "SEO damage".
One solution might be to use mod_proxy, instead of a rewrite, to proxy traffic to the new host. This way you shouldn't see any "SEO damage".
I used rinetd to redirect the traffic from the old server to the new one at the IP level. No web server or virtual host config is needed. It runs very smoothly and is absolutely transparent to any client.
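For reference, the rinetd approach amounts to plain TCP forwarding, something like the following Python sketch; the new server's address is a placeholder, and real rinetd is configured in /etc/rinetd.conf rather than in code.

# Relay every TCP connection on port 80 to the new server, byte for
# byte, the way rinetd does. Binding port 80 requires root.
import socket, threading

NEW_SERVER = ("203.0.113.10", 80)  # placeholder address of the new host

def pipe(src, dst):
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    finally:
        dst.close()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 80))
listener.listen()
while True:
    client, _ = listener.accept()
    upstream = socket.create_connection(NEW_SERVER)
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()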

Best way to redirect image requests to a different webserver?

I am trying to reduce the load on my webservers by adding an "Image server" (a dedicated server for handling image requests), and redirecting all requests for .gif,.jpg,.png etc., to it.
My question is, what is the best way to handle the redirection?
At the firewall level? (can I do this using iptables?)
At the load balancer level? (can ldirectord handle this?)
At the apache level - using rewrite rules?
Thanks for any suggestions on the best way to do this.
--Update--
One thing I would add is that these are domains that are hosted for 3rd parties, so I can't expect all the developers to modify their code and point their images to another server.
The further up the chain you can do it, the better.
Ideally, do it at the DNS level by using a different domain for your images (e.g. imgs.example.com).
If you can afford it, get someone else to do it by using a CDN (Content delivery network).
-Update-
There are also two features of Apache's mod_rewrite that you might want to look at. They are both described well at http://httpd.apache.org/docs/1.3/misc/rewriteguide.html.
The first is under the heading "Dynamic Mirror" in the above document; it uses the mod_rewrite proxy flag [P]. This lets your server silently fetch files from another domain and return them.
The second is to just redirect the request to the new domain. This second option puts less strain on your server, but requests still need to come in, and it slows down the final rendering of the page, as each request needs to make an essentially redundant round trip to your server first.
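To show what the second option boils down to, here is a hedged Python sketch; imgs.example.com is the placeholder domain from above, and in practice you would express this as a mod_rewrite rule rather than a standalone server.

# Answer image requests with a redirect to the image host; everything
# else would be handled by the normal site (404 here for brevity).
from http.server import BaseHTTPRequestHandler, HTTPServer

IMAGE_HOST = "http://imgs.example.com"
IMAGE_EXTS = (".gif", ".jpg", ".png")

class ImageRedirect(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.lower().endswith(IMAGE_EXTS):
            self.send_response(302)
            self.send_header("Location", IMAGE_HOST + self.path)
        else:
            self.send_response(404)
        self.end_headers()

HTTPServer(("", 8080), ImageRedirect).serve_forever()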
I agree with rikh. If you want images to be served from a different web server, then serve them from a different web server. For example:
<IMG src="images/Brett.jpg">
becomes
<IMG src="http://brettnesbitt.akamia-technologies.com/images/Brett.jpg">
Any kind of load balancer will still feed the image from the web-server's pipe, which is what you're trying to avoid.
I, of course, know what you really want. What you really want is for any request like:
GET images/Brett.jpg HTTP/1.1
to automatically get converted into:
HTTP/1.1 307 Temporary Redirect
Location: http://brettnesbitt.akamia-technologies.com/images/Brett.jpg
This way you don't have to do any work, except copy the images to the other web server.
That, I really don't know how to do.
Your use of the phrase "NAT" implies that the firewall/router receives HTTP requests, and that you want it to forward the request to a different internal server if the HTTP request was for image files.
This then raises the question of what you're actually trying to save. No matter which internal web server services the HTTP request, the data is still going to have to flow through the firewall/router's pipe.
The reason I bring it up is that the common scenario when someone wants to serve images from a different server is that they want to split high-bandwidth, mostly static, low-CPU-cost content from their actual logic.
Using NAT merely to rewrite the packet and send it to a different server does nothing to address that common issue.
The other reason might be because images are not static content on your system, and a request to
GET images/Brett.jpg HTTP/1.1
actually builds an image on the fly, with a high CPU cost, or only using data available (i.e. a SQL Server database) to ServerB.
If this is the case, then I would still use a different server name on the image request:
GET http://www.brettsoft.com/default.aspx HTTP/1.1
GET http://imageserver.brettsoft.com/images/Brett.jpg HTTP/1.1
I understand what you're hoping for, with network packet inspection overriding the NAT rule and sending the request to another server; I've never seen anything that can do that.
It sounds more "proxy-ish", where a web proxy does this (pfSense and m0n0wall, for instance, can't do it).
That leads to a kind of solution we used once: a custom web server that analyzes the request, makes the appropriate request to some internal server, and writes the response bytes back to the client (a sketch follows below).
That pain-in-the-ass solution was insisted upon by a "security consultant" who apparently believes in security through obscurity.
I know IIS cannot do such things for you itself; I don't know about other web server products.
I just asked around, and apparently if you wanted to write a custom kernel module for your Linux-based router, you could have it inspect packets and take appropriate action. Such a module might exist, and there are apparently plenty of other open-source modules to use as a starting point.
But I'd rather shoot myself in the head.
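For what it's worth, the custom web server sketched in that war story is essentially a tiny reverse proxy. A bare-bones Python version, with internal.example as a placeholder for the internal server, might look like this:

# Receive a request, fetch the same path from an internal server, and
# write the bytes back to the client.
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

INTERNAL = "http://internal.example"  # placeholder internal server

class Relay(BaseHTTPRequestHandler):
    def do_GET(self):
        with urllib.request.urlopen(INTERNAL + self.path) as upstream:
            body = upstream.read()
            self.send_response(upstream.status)
            self.send_header("Content-Type",
                             upstream.headers.get("Content-Type", "application/octet-stream"))
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

HTTPServer(("", 8080), Relay).serve_forever()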
