SEO Destroyed By URL Forwarding - Can't figure out another way

We design and host websites for our clients/sales force. We have our own domain: http://www.firstheartland.com
Our agents fill out a series of forms on our website that are loaded into a database. The database then renders the website as a database driven website.
/repwebsites/repSite.cfm?link=&rep=rick.higgins
/repwebsites/repSite.cfm?link=&rep=troy.thompson
/repwebsites/repSite.cfm?link=&rep=david.kover
The database application reads which "rep" the site is for and the appropriate page to display from the query string. The page then outputs the content and the appropriate CSS to style the page and give it its own individual branding.
We have told our agents to use domain name forwarding to send visitors to their spot on our server. However, every site seems to be getting indexed under our domain instead of its own. We could in theory assign a new IP address to each of them; the cost is not the issue.
The issue is how we would possibly accomplish this.
With all of that said, them being indexed under our domain would still be OK as long as they would actually show up high in the ranking for their search term.
For instance, an agent owns TroyLThompson.com. If I search for "Troy L Thompson", the site does not show up in my results. Only "troy thompson first heartland" works (the site shows up third).
Apart from scrapping the whole system, I don't know what to do. I'm very open to ideas.

I'm sure you can get this to work as most hosting companies will host hundreds of websites on a single server (i.e. multiple domains on one IP).
I think you need your clients to update the nameservers for their domains (i.e. DNS) to return the IP address of your hosting server. Then you need to configure your server to return the right website based on the domain that was originally requested.
That requires your "database driven website" to look in the HTTP request and check which domain was originally requested, then it can handle the request accordingly.
- If you are using Apache, see how to configure Apache to host multiple domains on one IP address.
- If you are using Microsoft IIS, maybe Host-Header Routing is what you need.
You will likely need code changes on your "database driven website" to cope with these changes.
I'm not sure that having a dedicated IP address per domain will help much, as then you have to find a way to host all those IP addresses from a single web server. However, if your web server architecture already supports a shared database and multiple servers, then that approach might work well for you, especially if you expect the load from some domains to be so heavy that you need a dedicated web server for them.
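As a rough illustration of the "check which domain was originally requested" step, here is a minimal Python sketch. It is not the asker's ColdFusion code; the DOMAIN_TO_REP table and both function names are hypothetical, and in practice the mapping would presumably come from the existing database.

```python
# Hypothetical table: an agent's own domain -> the "rep" key the site already uses.
DOMAIN_TO_REP = {
    "troylthompson.com": "troy.thompson",
}


def normalize_host(host_header):
    """Lower-case the Host header and drop any port and a leading 'www.'."""
    host = host_header.strip().lower().split(":", 1)[0]
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def rep_for_host(host_header):
    """Return the rep key for the requested domain, or None if it is unknown."""
    return DOMAIN_TO_REP.get(normalize_host(host_header))
```

With a lookup like this, the page could be rendered directly under the agent's own domain instead of being reached through a forward to the shared domain.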

Google does not index URLs that return a 301 status code. The reason is fairly obvious on reflection: the redirect tells Google "Whatever was here before has moved there, please update your references". One solution I can see is setting up Apache virtual hosts on your server for each external domain, and having each rep configure their domain's DNS A record to point to the IP address of your server.

Related

Was this site hacked? URL redirects when "www" removed.

I'm trying to figure out whether a website I use was hacked.
When I access the site via www.site-name.com, I'm taken to the website.
However, when I access the site without the "www," i.e. site-name.com, I'm taken to a different website.
Why is this happening? I did a little research and my only guess is that someone changed the site's .htaccess file, but that seems unlikely, as the different website has no relation to the official site.
Can someone help me understand what's going on here?
One IP address can host multiple websites with different hostnames using Virtual Name Hosting.
The HTTP server will look at the Host header in the request to determine what site to use for a given request.
This lets you have one IP address serving example.com and example.net.
Typically, the first Virtual Name Host will be the default, so if you were to ask for example.org the server would not recognise it and give you example.com instead.
In this case, it appears that the server has a Virtual Name Host configured for www.site-name.com but not for site-name.com so requests for site-name.com get the default site for the server.
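The fallback behaviour described in this answer can be sketched in a few lines of Python. This is only a model of the idea, not how any particular web server is implemented; select_site and the VHOSTS list are made-up names for illustration.

```python
def select_site(host_header, vhosts):
    """Pick a site by Host header; fall back to the first (default) entry.

    vhosts is an ordered list of (hostname, site) pairs.
    """
    requested = host_header.strip().lower().split(":", 1)[0]
    for hostname, site in vhosts:
        if hostname == requested:
            return site
    # No name matched, so the first configured host acts as the default.
    return vhosts[0][1]


# Hypothetical configuration: only the "www" name is set up for the official site.
VHOSTS = [
    ("example.com", "some other site (server default)"),
    ("www.site-name.com", "official site"),
]
```

Here select_site("www.site-name.com", VHOSTS) returns the official site, while select_site("site-name.com", VHOSTS) falls through to the default, which matches the symptom in the question.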

How does the URL I type in lead to the eventual content I see in my browser?

I'm trying to figure out how these all work together, and there are bits and pieces of information all over the internet.
Here's what I (think) I know:
1) When you enter a url into your browser that gets looked up in a domain name server (DNS), and you are sent an IP address.
2) Your computer then follows this IP address to a server somewhere.
3) On the server there are nameservers, which direct you to the specific content you want within the server. -> This step is unclear to me.
4) With this information, your request is received and the server relays site content back to you.
Is this correct? What do I have wrong? I've done a lot of searching over the past week, and the thing I think I'm missing is the big picture explanation of how all these details tie together.
Smaller questions:
a) How does the nameserver know which site I want directions to?
b) How can a site like GoDaddy own urls? Why do I have to pay them yearly fees, and why can't I buy a url outright?
I'm looking for a cohesive explanation of how all this stuff works together. Thanks!
How does content get loaded when I put a URL in a browser?
There is plenty of good documentation on this topic, and each step has its own logic and algorithms, so here is just a walkthrough.
Step 1: DNS lookup: The domain name is converted into an IP address. The domain name from the URL is used to find the IP address of the associated server machine by looking up records on multiple servers called name servers.
Step 2: Service request: Once the IP address is known, a service request appropriate to the protocol is created in the form of packets and sent to the server machine at that IP address. For a browser this is normally an HTTP request; in other cases it can be something else.
Step 3: Request handling: Depending on the service request and the underlying protocol, the request is handled by a software program that normally lives on the server machine whose address was discovered in step 1. Following its programmed logic, the server program returns an appropriate response; in the case of HTTP this is called an HTTP response.
Step 4: Response handling: The requesting program, in your case a browser, receives the response from the previous step and renders and displays it as the protocol defines. For HTTP, the response body, typically written in HTML, is extracted and rendered.
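To make the first two steps more concrete, here is a small Python sketch of what a client derives from a URL before anything is sent. The function name build_get_request is an assumption for illustration, and the sketch deliberately stops short of doing the DNS lookup or opening a connection (in Python those would be socket.getaddrinfo and a socket connect).

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (hostname, port, raw HTTP/1.1 request bytes) for a URL."""
    parts = urlsplit(url)
    host = parts.hostname  # this is the name that DNS resolves to an IP
    default_port = 443 if parts.scheme == "https" else 80
    port = parts.port or default_port
    # The Host header carries the name the user typed, not the IP address.
    host_header = host if port == default_port else f"{host}:{port}"
    path = parts.path or "/"
    if parts.query:
        path = f"{path}?{parts.query}"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")
```

The hostname is what goes to DNS in step 1, and the request bytes are what would be sent to the resulting IP address in step 2.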
How does the nameserver know which site I want directions to?
A URL has a well-defined format, from which the browser extracts the hostname (domain name), which in turn is used to find the associated IP address. The name servers run their own resolution algorithms to find the correct server machine's IP address.
Find more about DNS resolution here.
How can a site like GoDaddy own urls? Why do I have to pay them yearly fees, and why can't I buy a url outright?
Domain names are resources that need management and regulation, which is done by ICANN. There are organizations called registries, from which registrars (like GoDaddy) obtain domains and register them for you; the cost you pay is split between ICANN and the registrar.
The registrar does a lot of work for you, e.g. setting up name servers, providing hosting, etc.
Technically you can create your own domain name, but it won't be free, of course, because you would need to set up a name server and replicate it to other servers; that way you can have whatever name you want (it has to be unique). A simple way to do this locally is to edit your hosts file, located at /etc/hosts on Linux and at C:\Windows\System32\drivers\etc\hosts on Windows, but that is no good on the internet, since other machines won't know about it.
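For a feel of what a hosts file contains, here is a minimal Python sketch that parses text in the hosts-file format into a name-to-IP mapping. The sample entries (mysite.test and the 192.0.2.10 address) are invented for illustration, and the "first entry wins" choice is this sketch's own simplification.

```python
def parse_hosts(text):
    """Map each hostname found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # In this sketch the first entry for a name wins.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Hypothetical hosts-file content with a private, made-up name.
SAMPLE = """
127.0.0.1   localhost
# a name that only this machine will know about
192.0.2.10  mysite.test www.mysite.test
"""
```

parse_hosts(SAMPLE) maps both mysite.test and www.mysite.test to 192.0.2.10, but only on the machine that holds the file.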
(Precise and detailed description of this process would probably take too much space and time to write, I am sure you can google it somewhere). So, although very simplified, you have pretty good picture of what is going on, but some clarifications are needed (again, I will be somewhat imprecise) :
Step 2: Your computer does follow the IP address received in step 1, but the request sent to that IP address usually contains one important piece of information called the 'Host header', which is the actual name as you typed it in your browser.
Step 3: There is no nameserver involved here; the software (or hardware) is usually called a 'webserver' (for example Apache, IIS, nginx, etc.). One webserver can serve one or many different sites. If there is more than one, the webserver uses the 'Host header' to direct you to the specific content you want.
ICANN 'owns' the domain names, and the registration process involves technical and administrative effort, so you pay registrars to handle that.

Given 2 URLs, is it possible to know if the resources are on the same web server?

I am accessing 2 URLs. The domain name/server part is the same. The resource part is different.
The URLs are like the following:
https://aa.bb.com/dir1/dir2
https://aa.bb.com/dir3
When I access the first URL, I get redirected to the second URL. Is it possible that the second URL be hosted on a different web server than the first or both resources would be on the same web server?
If by web server you mean physical computer, absolutely they could be on different servers. Google and Akamai, among others, have large collections of machines serving the same domain names. It helps with speed, since you are likely to receive pages from a server near you.
In general, it does not appear to be possible to reliably tell whether you are talking to the exact same server before and after a redirect. First, it is difficult to test for IP addresses from a Web page (see, e.g., this question and this one). Second, even if the IP addresses are the same before and after the redirect, they may be on different machines. For example, TCP anycast can change which server you are talking to without changing the IP address. Also, network address translation and load-balancing may change which server you are talking to behind a firewall, which you would probably have no way of finding out unless the server provided some ID of its own.

Can a dedicated IP address with a website on it be found and crawled by search engines?

I have a VPS. I have placed a Drupal installation on that IP address. There is no URL registered for my website. The site on the IP address is for personal reference.
Can my IP address get indexed and found on search engines if there is no traditional URL for it? Will it get crawled?
I have no A-records pointing to it from other domain names I have on another VPS platform either. As far as I know, I am the only one that knows this IP address by heart or even goes there to add or refer to content.
There are three ways I know of for a search engine to learn about the existence of a website.
You submit the domain to them directly.
Someone else links to the domain.
The search engine watches all domain registrations (Google can do this easily because they run a DNS themselves), and tries the standard prefixes (e.g. www).
There does not seem to be an automatic approach for discovering IP addresses with content unless someone links to it.
If it's purely for personal reference and you want to be sure no one else can access it, then you should implement security anyway. Don't just rely on no one knowing the IP.
Can my IP address get indexed and found on search engines if there is no traditional URL for it?
Yes, if you can reach it externally, then so can the search engines. If you don't want it to be indexed, add a "robots.txt" that requests for the site not to be indexed. Bear in mind that crawlers do not have to respect this, but the major ones do.
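As a quick way to see what a "do not index" robots.txt means to a well-behaved crawler, here is a small Python sketch using the standard library's urllib.robotparser. The helper name may_crawl and the 192.0.2.10 address are assumptions for illustration; nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from the whole site.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

This only models crawlers that choose to honour robots.txt, which is why it is no substitute for real access control.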
As for how the search engines discover IP addresses that are not indexed elsewhere, that is probably part of their "secret sauce" that we will never know about. Perhaps your IP has been used before, and it has previously been indexed in that context; if so, a search engine that has a poke around may be expecting that old site but will happily index your new one.
Or, maybe other IP addresses in the same netblock are in active use, and the search engines give yours "a quick try" to see if it responds on ports 80 (http) or 443 (https). If it does, it gets added to their indexes (or do-not-crawl lists, if your robots.txt requests it).
If you specifically do not want search engines to see your content, you could make the default home page blank, and put your Drupal installation in a sub-directory. The search engines will then have nothing to index apart from a blank home page.

Account based lookup in ASP.NET

I'm looking at using ASP.NET for a new SaaS service, but for the life of me I can't seem to figure out how to do account lookups based on subdomains like most SaaS applications (e.g. 37Signals) do.
For example, if I offer yourname.mysite.com, then how would I use ASP.NET (MVC specifically) to extract the subdomain so I can load the right template (displaying your company's name and the like)? Can it be done with regular routing?
This seems to be a common thing in SaaS so there has to be an easy way to do it in ASP.NET; I know there are plugins that do it for other frameworks like Ruby on Rails.
This works for me:
// Requires: using System; using System.Text.RegularExpressions;
public string GetSubDomain()
{
    string SubDomain = "";
    // Only parse real DNS host names; skip hosts given as IP addresses.
    if (Request.Url.HostNameType == UriHostNameType.Dns)
        // Keep everything before the last two dot-separated labels.
        SubDomain = Regex.Replace(Request.Url.Host, "((.*)(\\..*){2})|(.*)", "$2");
    // Fall back to "www" when there is no subdomain (e.g. mysite.com).
    if (SubDomain.Length == 0)
        SubDomain = "www";
    return SubDomain;
}
I'm assuming that you would like to handle multiple accounts within the same web application rather than building separate sites using the tools in IIS. In our work, we started out creating a new web site for each subdomain but found that this approach doesn't scale well, especially when you release an update and then have to modify dozens of sites! Thus, based on several years' worth of experience doing exactly what you propose, I recommend this approach rather than the server-oriented techniques suggested in the other answers.
The code above just makes sure that this is a fully formed URL (rather, say, than an IP address) and returns the subdomain. It has worked well for us in a fairly high-volume environment.
You should be able to pick this up from the ServerVariables collection, but first you need to configure IIS and DNS to work correctly. Note that 37Signals probably uses Apache or another open-source Unix web server; on Apache this is referred to as virtual hosting.
To do this with IIS you would need to create a new DNS entry (create a CNAME yourname.mysite.com to application.mysite.com) for each domain that points to your application in IIS (application.mysite.com).
You then create a host header entry in the IIS application (application.mysite.com) that will accept the header yourname.mysite.com. Users will actually hit application.mysite.com, but the address they see is the custom subdomain. You then access the ServerVariables collection to get the value and decide how to customize the site.
Note: there are several alternative implementations you could follow depending on requirements.
Handle the host header processing at a hardware load balancer (more likely 37Signals do this, than rely on the web server), and create a custom HTTP header to pass to the web application.
Create a new web application and host header for each individual application. This is probably an inefficient implementation for a large number of users, but could offer better isolation and security for some people.
You need to configure your DNS to support wildcard subdomains. It can be done by adding an A record pointing to your IP address, like this:
* A 1.2.3.4
Once it's done, whatever you type before your domain will be sent to your root domain, where you can get the subdomain by splitting the HTTP_HOST server variable, like the user buggs said:
string user = HttpContext.Request.ServerVariables["HTTP_HOST"].Split('.')[0];
//use the user variable to query the database for specific data
PS. If you are using shared hosting, you're probably going to have to buy a unique IP add-on from your host, since it's mandatory for the wildcard domains to work. If you're using dedicated hosting, you already have your own IP.
The way I have done it is with HttpContext.Request.ServerVariables["HTTP_HOST"].Split('.').
Let me know if you need more help.

Resources