Today I heard from a colleague that search bots can index pages with sequential ids.
Does this really happen?
As an example, compare these two URLs:
http://sample.com/myProduct?id=765
and
http://sample.com/myProduct?id=35d6eb6c-97f6-4cde-997c-ade657c285d3
So, if a search bot can figure out that the product id in my URL is sequential, could it index other products up and down the sequence?
Have you ever heard of anything like that?
Whoever told you that is mistaken. Search engines only index pages they know exist, so they won't keep changing the ID in those URLs just to see whether they find anything. If you want those other pages to be indexed, use an HTML sitemap or an XML sitemap to tell the search engines where those pages are. Linking to them from other product pages is also a good idea.
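As a rough illustration of the XML sitemap suggestion, here is a minimal sketch that builds one from a list of product URLs. The product ids and the `buildSitemap` and `xmlEscape` helpers are made up for this example; only the `urlset`/`url`/`loc` structure comes from the sitemap protocol.

```javascript
// Escape characters that are not allowed verbatim inside XML text.
function xmlEscape(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a minimal XML sitemap from a list of absolute URLs.
function buildSitemap(urls) {
  const entries = urls
    .map((u) => `  <url><loc>${xmlEscape(u)}</loc></url>`)
    .join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</urlset>',
  ].join('\n');
}

// Hypothetical product ids; in practice they would come from your database.
const productIds = [765, 766, 767];
const sitemap = buildSitemap(
  productIds.map((id) => `http://sample.com/myProduct?id=${id}`)
);
console.log(sitemap);
```

The generated file lists every product explicitly, so the crawler never has to guess at ids.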
I have a client's website, built by someone else in PrestaShop, which has a search input. After searching for an item it displays a list of matching products, each linking to its page with a URL that looks like this:
www.website.com/category/full-product-name.html?search_query=search_phrase&results=2
Where a regular url of the product page looks like this:
www.website.com/category/full-product-name.html
The problem is that Google now indexes the duplicated URLs as separate pages.
I've never worked with PrestaShop before, but I looked into the template files and found what I assume is the file responsible for generating the content. The line responsible for the link looks like this:
<a class="product_img_link" href="{$product.link|escape:'html':'UTF-8'}" title="{$product.name|escape:'html':'UTF-8'}" itemprop="url">
Since I don't know much about PrestaShop, I don't want to change things blindly. How could I change it so that the links from the search results have the same structure as the normal product page URLs?
Well, I don't know what the point is of allowing search engines to index search pages, but that is where the problem lies. For whatever reason the developers decided to include the query string in search result links.
You can create an override of the search controller (or, even better, a custom search module) and throw that line out, and you should have normal product links.
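To make the intended result concrete, here is a small sketch of the URL transformation itself. It is not PrestaShop code: `stripSearchParams` is a hypothetical helper, and the `http://` scheme is added only because the `URL` constructor needs an absolute URL.

```javascript
// Remove the search-related query parameters from a product link,
// leaving any other parameters untouched.
function stripSearchParams(href) {
  const url = new URL(href);
  url.searchParams.delete('search_query');
  url.searchParams.delete('results');
  return url.toString();
}

const searchLink =
  'http://www.website.com/category/full-product-name.html?search_query=search_phrase&results=2';
console.log(stripSearchParams(searchLink));
```

Whatever mechanism the override uses, the link it emits should end up identical to the regular product URL.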
I'm working on a new advert website and want to implement some good SEO URLs.
I got category URLs like:
/category
/category/sub-category
This seems ok. What about detail pages?
Option 1:
/announcements-and-notices/announcements-various/15880/suscipit-dis-molestie-malesuada-vestibulum-ut.html
Option 2:
/adverts/15880/suscipit-dis-molestie-malesuada-vestibulum-ut.html
In reality my website has pretty long URLs because of the multiple areas you can shop in. So it would become:
/en/area-name/announcements-and-notices/announcements-various/15880/suscipit-dis-molestie-malesuada-vestibulum-ut.html
/en/area-name/adverts/15880/suscipit-dis-molestie-malesuada-vestibulum-ut.html
Which detail page URL would be better? The first option seems better if the product has no long or descriptive title. The second seems better because it's the most relevant and the shortest, especially with long category names.
I would like to hear your thoughts!
EDIT:
I found these two Google docs:
http://www.google.com/webmasters/docs/search-engine-optimization-starter-guide.pdf
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=76329
I think I will be going for /adverts. Anyone disagree?
I have seen many SEO analysts miss one point about optimizing a web page: a page will only be optimized for some keywords, not for all of them. The length of your URL is not what matters. First analyze whether the content of the page is rich enough to justify a URL containing those keywords. If the answer is yes for every keyword, then the longer URL will help your ranking.
I think you can even set your pages up in a way to use only the slug and skip the id, such as:
/adverts/suscipit-dis-molestie-malesuada-vestibulum-ut
or even just:
/suscipit-dis-molestie-malesuada-vestibulum-ut
and route it straight to the adverts controller and the advert that has this slug assigned to it (the one with id 15880).
This way you'll have nice, clean URLs. Just assign and keep a unique slug for each advert and handle it using .htaccess, or dynamically inside the code of your site, if the system allows it.
Cheers.
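A minimal sketch of the slug-only idea, assuming an in-memory map in place of a real database. `slugify`, `registerAdvert`, and `resolveAdvert` are hypothetical names, and the numeric suffix for duplicate titles is just one possible way to keep slugs unique.

```javascript
// Turn a title into a slug: lowercase ASCII letters and digits, hyphen-separated.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Hypothetical store mapping each unique slug to its advert id.
const advertsBySlug = new Map();

// Assign a unique slug to an advert, appending -2, -3, ... on collisions.
function registerAdvert(id, title) {
  const base = slugify(title);
  let candidate = base;
  let n = 2;
  while (advertsBySlug.has(candidate)) {
    candidate = `${base}-${n}`;
    n += 1;
  }
  advertsBySlug.set(candidate, id);
  return candidate;
}

// Resolve a request path such as /adverts/<slug> back to the advert id.
function resolveAdvert(path) {
  const slug = path.replace(/^\/adverts\//, '').replace(/\/$/, '');
  return advertsBySlug.get(slug);
}

const slug = registerAdvert(15880, 'Suscipit dis molestie malesuada vestibulum ut');
console.log(slug, resolveAdvert(`/adverts/${slug}`));
```

The slug has to be stored with the advert and kept stable; regenerating it from an edited title would change the URL.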
I am using ColdFusion 9.0.1.
I have a new web site that uses Bikes.cfm and Makers.cfm as template pages. I need to be able to pass BikeID and MakerID to both of these pages, along with other variables. I don't want to use the actual page name in the URL, such as this:
MyDomain.com/Bikes.cfm?BikeID=1234&MakerID=1234
I want my URL to look more like this:
MyDomain.com/?BikeID=1234&MakerID=1234
I need to NOT specify the page name in the URL.
I want these two URLs to access different data:
MyDomain.com/?BikeID=1234&MakerID=1234 // goes to bike page
MyDomain.com/?MakerID=1234&BikeID=1234 // goes to maker page
So, if BikeID appears in the URL before MakerID, go to the Bikes.cfm page. If MakerID appears before BikeID, go to the Makers.cfm page.
Is there an easy, existing method to arrange the URL keys so that they point to the appropriate page?
Should I just parse the URL as a list, determine the first ID, and go to the appropriate page? Is there a better way?
Any thoughts or hints or ideas would be appreciated.
UPDATE -- It certainly appears that using the order of parameters in a URL is a bad idea for the following reasons:
1) many programs append variables to the URL
2) some programs may reorder the variables
3) GoogleBot may not consider order relevant and will most likely not index the site correctly.
Thanks to everyone who provided advice in a positive manner that my approach was probably a bad idea and would not produce the results I wanted. Thanks to everyone who suggested alternate means to produce the results I wanted.
If any of you positive people would like to post your comment or advice as an answer, I'd be happy to accept it.
Despite my grave misgivings about the whole idea, here's how I would do it if I were forced to do so:
index.cfm:
<!--- Route on the name of the first parameter in the query string --->
<cfswitch expression="#ListFirst(cgi.query_string, '=')#">
    <cfcase value="BikeID">
        <cfinclude template="Bikes.cfm">
    </cfcase>
    <cfcase value="MakerID">
        <cfinclude template="Makers.cfm">
    </cfcase>
    <cfdefaultcase>
        <cfinclude template="Welcome.cfm">
    </cfdefaultcase>
</cfswitch>
I'm building a site that has items, with each item having a page, for example:
website.com/book/123
website.com/film/456
website.com/game/789
Each item can have multiple sub (and sub-sub, sub-sub-sub) pages, for example a book could have a blurb, a film could have a gallery and a game could also have a gallery.
My question is, does any sort of standard or best practice exist around structuring the URLs for pages associated with an item? For example:
website.com/film/456/gallery
Where the sub page comes after the item, or:
website.com/film/gallery/456/
where the item is the very last part of the URL.
Does anyone have any information on which approach is best and why, or whether any web standard exists? It seems an obvious thing but I'm struggling to decide. I can think of pros and cons for each approach -- although I'm leaning towards the former option because it means the following user path would match the URL:
load website.com -> click "films" (website.com/films)-> click "a film" (website.com/film/123) -> click gallery (website.com/film/123/gallery)
but something about it seems... off, inconsistent maybe.
You are correct that the former URL is "better" and is more widely deployed. I don't think you would find this documented in any standard; it is rather more of a convention. Most articles and books covering REST do it that way.
The reason for this is, as you say, that the path components in the URL match the structure of resources and sub-resources. In particular, all of the following should be valid URLs:
website.com/
website.com/books
website.com/books/123
Note that it is books/123, not book/123 like you have. I have seen the singular but IMHO the plural is better.
For the URL /books
a GET gets all books, but you can restrict the books with query parameters, e.g. /books?author=alice
a POST adds a new book (with a server-generated id).
For the URL /books/123
a GET gets that particular book
a PUT replaces the book with that id (or adds a book with that client-generated id)
Now if a book has blurbs and the blurbs are unique only to a particular book then you will add the following URLs:
website.com/books/123/blurbs
website.com/books/123/blurbs/72
You can do the same for films and galleries, provided each gallery belongs to a single film. But if a gallery can span multiple films, then you would make /galleries a top-level URL. Navigating from a film to a gallery would still be fine; you just wouldn't have a structured URL. You would instead get all galleries containing pictures from film 456 via a GET to
website.com/galleries?film=456
The general rule is that if you have an ownership relation for the subresources you can use structured urls, but if there is a looser relationship among top-level items, query parameters are fine. Don't fall into the common misconception that RESTful URLs don't have query parameters; they do. :)
Now finally, to directly answer your question: website.com/films/galleries/456 is not a good URL IMHO because website.com/films/galleries/ is not very useful. In fact I think it is rather ugly. What would it mean? All galleries? If so, it should be website.com/galleries.
Again I don't think this is standardized anywhere, but it feels very common and conventional.
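To illustrate the convention described above, here is a small sketch of a route table that maps owned sub-resources to nested paths and the looser film-to-gallery relationship to a query parameter. The route names and the `matchRoute` helper are invented for the example and are not tied to any framework.

```javascript
// Nested paths for owned sub-resources, a top-level collection for galleries.
const routes = [
  { pattern: /^\/books$/, name: 'books.list' },
  { pattern: /^\/books\/(\d+)$/, name: 'books.item' },
  { pattern: /^\/books\/(\d+)\/blurbs$/, name: 'blurbs.list' },
  { pattern: /^\/books\/(\d+)\/blurbs\/(\d+)$/, name: 'blurbs.item' },
  { pattern: /^\/galleries$/, name: 'galleries.list' },
];

// Match a path (plus optional query string) against the route table.
function matchRoute(rawUrl) {
  const url = new URL(rawUrl, 'http://website.com');
  for (const route of routes) {
    const m = route.pattern.exec(url.pathname);
    if (m) {
      return {
        name: route.name,
        ids: m.slice(1).map(Number),
        query: Object.fromEntries(url.searchParams),
      };
    }
  }
  return null; // no such resource
}

console.log(matchRoute('/books/123/blurbs/72'));
console.log(matchRoute('/galleries?film=456'));
```

Every prefix of a nested path is itself a valid route here, which is the property that /films/galleries/456 lacks.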
I've seen some websites highlight the search engine keywords you used to reach the page (such as the keywords you typed into Google).
How does it know what keywords you typed in the search engine? Does it examine the referrer HTTP header or something? Any available scripts that can do this? It might be server-side or JavaScript, I'm not sure.
This can be done either server-side or client-side. The search keywords are determined by looking at the HTTP Referer (sic) header. In JavaScript you can look at document.referrer.
Once you have the referrer, you check to see if it's a search engine results page you know about, and then parse out the search terms.
For example, Google's search results have URLs that look like this:
http://www.google.com/search?hl=en&q=programming+questions
The q query parameter is the search query, so you'd want to pull that out and un-URL-escape it, resulting in:
programming questions
Then you can search for the terms on your page and highlight them as necessary. If you're doing this server-side, you'd modify the HTML before sending it to the client. If you're doing it client-side, you'd manipulate the DOM.
There are existing libraries that can do this for you, like this one.
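As a rough sketch of the parsing step, the function below pulls the `q` parameter out of a Google results URL. The name `searchTermsFromReferrer` and the decision to recognize only google.com hosts are assumptions made for the example; in a browser you would pass it `document.referrer`.

```javascript
// Return the search query from a referrer URL, or null if the referrer
// is missing, malformed, or not a recognized search results page.
function searchTermsFromReferrer(referrer) {
  let url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return null; // empty or malformed referrer
  }
  // Assumption: only google.com and its subdomains are recognized.
  const host = url.hostname;
  if (host !== 'google.com' && !host.endsWith('.google.com')) {
    return null;
  }
  // searchParams.get() already decodes "+" and percent-escapes.
  return url.searchParams.get('q');
}

console.log(
  searchTermsFromReferrer('http://www.google.com/search?hl=en&q=programming+questions')
);
```

Returning null for anything unrecognized lets the caller skip highlighting entirely in the common case.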
Realizing this is probably too late to make any difference...
Please, I beg you -- find out how to accomplish this and then never do it. As a web user, I find it intensely annoying (and distracting) when I come across a site that does this automatically. Most of the time it just ends up highlighting every other word on the page. If I need assistance finding a certain word within a page, my browser has a much more appropriate "find" function built right in, which I can use or not use at will, rather than having to reload the whole page to get it to go away when I don't want it (which is the vast majority of the time).
Basically, you...
Examine document.referrer.
Have a map from each search engine's domain to the GET param that contains the search terms.
var searchEnginesToGetParam = {
'google.com' : 'q',
'bing.com' : 'q'
}
Extract the appropriate GET param, and decodeURIComponent() it.
Parse the text nodes where you want to highlight the terms (see Replacing text with JavaScript).
You're done!
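The final highlighting step can be sketched on a plain string, which keeps the example independent of the DOM. `highlightTerms` and its helpers are hypothetical names, and wrapping matches in `<mark>` is just one choice of markup; in a browser the same logic would be applied to each text node.

```javascript
// Escape regex metacharacters so search terms are matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Escape text for safe insertion into HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wrap every case-insensitive occurrence of each search term in <mark>.
function highlightTerms(text, query) {
  const terms = query.split(/\s+/).filter(Boolean).map(escapeRegExp);
  if (terms.length === 0) {
    return escapeHtml(text);
  }
  const re = new RegExp(`(${terms.join('|')})`, 'gi');
  // split() with a capturing group puts the matches at odd indices.
  return text
    .split(re)
    .map((part, i) =>
      i % 2 === 1 ? `<mark>${escapeHtml(part)}</mark>` : escapeHtml(part)
    )
    .join('');
}

console.log(highlightTerms('Programming questions & answers', 'programming questions'));
```

Because this matches substrings, short or common terms will light up much of the page, which is the annoyance described in the earlier answer.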