How to redirect old URLs that have no slug to new ones that have a slug - url

Friends, this is a complex problem for me. I have researched this many times and have finally come to you (hoping I will get a solution). We had product URLs like:
/product_info.php/products_id/75
For SEO, I wanted keyword-rich URLs, so we added a slug in the products.php file and changed the URL to:
/product_info.php/products_id/75/product-title
But that is still not an ideal URL. I wanted it to be:
domainname.com/products/product-title/75
The changes I made in the .htaccess file are as follows:
RewriteRule ^products/([A-Za-z0-9-]+)/([0-9]{2})/?$ product_info.php?products_id=$2=$1 [L]
RedirectMatch 301 ^/product_info.php/products_id/([0-9]{2})/([A-Za-z0-9-]+)$ http://www.livevaastu.com/products/$2/$1
Now the problem is that our old URLs (which have no slugs) are indexed by Google, and I have no idea how to redirect those old ones to the new ones. There are also many product pages, so I can't redirect them one by one. You guys are geniuses. Can you help me somehow (without laughing at me)? I'm not a developer.

You can't produce product_info.php?products_id=$2=$1 from your old URLs of /product_info.php/products_id/75 because they don't have the product title ("slug").
For one thing, product_info.php?products_id=$2=$1 doesn't make any sense. Is that a typo? What are the key/value pairs in that query string? This should look something like products_id=$2&product_title=$1 (in your rule, $1 is the slug and $2 is the ID), where each derived "value" from the mod_rewrite match gets assigned to a known "key", something you can use in $_GET or $_REQUEST to find the value.
Edit to help with what I think you are trying to achieve here, based on discussion:
If you want your old URLs to lead to the new "pretty" URLs, you will need to use PHP to do this. As mentioned, there simply is not adequate information in the URL to invent the product names. But you could pretty easily have something at the top of each page (i.e. in a header file) which looked to see if the "title" $_GET parameter is present or not (once you clean up the double-equal sign and replace it with proper key/value pairs). This might look something like:
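<?php
// Old-style request: the rewrite rule did not supply a product title.
if ( !isset( $_GET['product_title'] ) ) {
    // Code here to look up $product_title from the $product_id, presumably in a DB.
    // Both variables must be set before the redirect below is sent.
    header( "HTTP/1.1 301 Moved Permanently" );
    header( "Location: /products/$product_title/$product_id" );
    exit();
}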
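The same decision logic can be sketched in Python so it can be tried without a web server. This is only an illustration: the SLUGS dictionary stands in for the database lookup, and the function and parameter names are hypothetical, not part of the original setup.

```python
# Hypothetical stand-in for the product table: id -> slug.
SLUGS = {75: "product-title"}


def redirect_for(params):
    """Return (status, location) for a slug-less request, or None if no redirect applies."""
    if "product_title" in params:
        return None  # already a new-style request, nothing to redirect
    try:
        product_id = int(params["products_id"])
    except (KeyError, ValueError):
        return None  # no usable ID in the request
    slug = SLUGS.get(product_id)
    if slug is None:
        return None  # unknown product; let the page handle it (for example with a 404)
    return 301, f"/products/{slug}/{product_id}"
```

Because new-style requests always carry product_title, they are never redirected again, so the redirect cannot loop.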

Related

301 redirect rule for urls with dynamic paths

I'm on the learning curve for 301 redirects and have done lots of research, including looking at answers on this forum. I haven't found the answer to my specific query, which requires removing elements from the middle of the url request.
Namely, I am building a new site with dynamic links (WordPress, but the question applies to any CMS).
I need to redirect from links (also dynamic) structured as:
sitename.com/issue/february-2016/post/dynamic-post-name
(february-2016 is an example - could be 'march-2014' or any of a range of terms)
to:
sitename.com/post/dynamic-post-name
Another way to say this: Any request url with /article/ needs to grab that last string (which I think would be the wildcard?) and redirect it as: sitename.com/post/$
Is this possible?
Update: With more research, I found a possible answer that worked in a testing tool, although I've not tested it live on my site.
Does this look correct?
RewriteRule ^([^/]+)/([^/]+)/article/([^.]+)$ article/$3 [QSA,L]
RewriteRule ^article/.*/(.*)$ post/$1 [QSA,L,R=301]
Something like this should work.
The characters captured by the parentheses (.*) are available as $1.
Feel free to change article and post to fit your needs.
In this case, it will redirect
http://example.com/article/february-2016/post/dynamic-post-name
to
http://example.com/post/dynamic-post-name
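To check what the pattern captures before putting it live, the same regular expression can be tried in Python. This is a rough sketch, assuming the path is matched without its leading slash (as in a per-directory .htaccess context); the function name is made up for the example.

```python
import re

# Python equivalent of the pattern in: RewriteRule ^article/.*/(.*)$ post/$1
PATTERN = re.compile(r"^article/.*/(.*)$")


def rewrite(path):
    """Return the redirect target for a matching path, or None when the rule does not apply."""
    match = PATTERN.match(path)
    if match is None:
        return None
    return "post/" + match.group(1)
```

Because the first .* is greedy, only the final path segment ends up in the capture group, however many segments sit between article/ and the post name.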

How to prepare for future URL rewriting to a directory type structure?

So I wish to implement URL rewrite once my site is done but I wish to have it in this format.
site.com/city/example-deal
Currently once a city is chosen it links to a page in the following format:
site.com/city.php?city=atlanta
Then once on that page, a deal is selected from there and it links to the next page:
site.com/deal.php?deal=123
With that in mind, could I rewrite it as such with my current linking structure:
site.com/atlanta/example-deal or do I have to link the page as such:
site.com/city.php?city=atlanta/deal.php?deal=123 in order to get the final URL rewrite structure I'm looking for.
Hopefully I explained this right and thanks for the help!
What you need to do is make deal.php read the city from the query string.
You should also slugify your deal as well so that you can derive the deal id from the deal slug.
Here's an example of a slugify function in php:
http://sourcecookbook.com/en/recipes/8/function-to-slugify-strings-in-php
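For comparison, a minimal sketch of a slugify function in Python. It is not the linked recipe, just one reasonable way to do it: it assumes ASCII-only slugs are wanted and simply drops characters that cannot be reduced to ASCII.

```python
import re
import unicodedata


def slugify(text):
    """Lowercase text, strip accents, and join the remaining words with hyphens."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Replace every run of non-alphanumeric characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")
```

Two different titles can produce the same slug, so the stored slug column still needs a uniqueness check.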
A rewrite rule along these lines then passes both values to deal.php:
RewriteRule ^([^./]+)/deals/(.*)$ deal.php?city=$1&deal_slug=$2 [QSA]
Also your deal table in MySQL should be modified to store the slug. With that your deal.php can be modified so that:
// get deal slug from query string
//select from deal table where deal slug = submitted deal slug
// continue with normal code.
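Those three steps can be sketched as follows, using Python's sqlite3 module in place of MySQL. The table layout, column names, and sample row are assumptions made for the example.

```python
import sqlite3

# In-memory table standing in for the deal table; the schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deal (id INTEGER PRIMARY KEY, city TEXT, slug TEXT UNIQUE, title TEXT)"
)
conn.execute("INSERT INTO deal VALUES (123, 'atlanta', 'example-deal', 'Example Deal')")


def find_deal(city, deal_slug):
    """Return (id, title) of the deal matching city and slug, or None if there is none."""
    # Parameter placeholders keep request input out of the SQL text.
    return conn.execute(
        "SELECT id, title FROM deal WHERE city = ? AND slug = ?",
        (city, deal_slug),
    ).fetchone()
```

The UNIQUE constraint on slug also creates an index, so the lookup stays cheap as the table grows.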

dynamic seo title for news articles

I have a news section where the pages resolve to urls like
newsArticle.php?id=210
What I would like to do is use the title from the database to create seo friendly titles like
newsArticle/joe-goes-to-town
Any ideas how I can achieve this?
Thanks,
R.
I suggest you actually include the ID in the URL, before the title part, and ignore the title itself when routing. So your URL might become
/news/210/joe-goes-to-town
That's exactly what Stack Overflow does, and it works well. It means that the title can change without links breaking.
Obviously the exact details will depend on what platform you're using - you haven't specified - but the basic steps will be:
When generating a link, take the article title and convert it into something URL-friendly; you probably want to remove all punctuation, and you should consider accented characters etc. Bear in mind that the title won't need to be unique, because you've got the ID as well
When handling a request to anything starting with /news, take the next part of the path, parse it as an integer and load the appropriate article.
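A minimal sketch of both steps in Python, assuming the /news/ID/title layout shown above. The function names are hypothetical, and the slug is expected to be URL-friendly already.

```python
import re

# Matches /news/<id>, optionally followed by /<anything>; the title part is ignored.
NEWS_PATH = re.compile(r"^/news/(\d+)(?:/.*)?$")


def news_url(article_id, slug):
    """Build the public link from the ID and an already URL-friendly title."""
    return f"/news/{article_id}/{slug}"


def article_id_from_path(path):
    """Return the article ID from a /news/... path, or None if the path does not match."""
    match = NEWS_PATH.match(path)
    return int(match.group(1)) if match else None
```

Since only the ID is used for routing, a link keeps working after the article is renamed: /news/210/joe-goes-to-town and /news/210/some-new-title both resolve to article 210.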
Assuming you are using PHP and can alter your source code (this is quite mandatory to get the article's title), I'd do the following:
First, you'll need to have a function (or maybe a method in an object-oriented architecture) to generate the URLs for you in your code. You'd supply the function with the article object or the article ID and it returns the friendly URL with the ID and the friendly title.
Basically function url(Article $article) => URL.
You will also need some URL rewriting rules to remove the PHP script from the URL. For Apache, refer to the mod_rewrite documentation for details (RewriteEngine, RewriteRule, RewriteCond).

Redirect 301 with hash part (anchor) #

One of our websites has a URL like this: example.oursite.com. We decided to move the site to a URL like www.oursite.com/example. To do this, we wrote a rewrite rule on our Apache server that redirects to the new URL with a 301 status code.
Many websites link to us with URLs of the form example.oursite.com/#id=23. The problem is that the redirect erases the hash part of the URL in IE. As far as I know, the hash part is never sent to the server.
I wanted to implement the redirect in JavaScript to keep the hash part, but then search engines will not be aware that our URL changed (no 301 code is returned).
I want search engines to be notified of our new URL (301) because we need to transfer the page rank to our new URL.
Is there a way to redirect with a 301 code and keep the hash part (#id=23) of the URL?
Search engines do in fact care about hash tags; they frequently use them to highlight specific content on a page.
To the question, however, anchor locations are unfortunately not sent to the server as part of the HTTP request. If you want to redirect a user, you will need to do this in Javascript on the client side.
Good article: http://web.archive.org/web/20090508005814/http://www.mikeduncan.com/named-anchors-are-not-sent/
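A small Python illustration of the split, using the standard urllib.parse module. The helper names are made up for the example; the point is only that the fragment is a separate URL component and is not part of what goes on the request line.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the path (plus query) that a browser puts on the request line for this URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def fragment(url):
    """Return the part after '#', which stays in the browser."""
    return urlsplit(url).fragment
```

For http://exemple.oursite.com/#video_id=233 the request target is just /, so a server-side rule has nothing to match the video_id against.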
Seeing as the server will never see the # (ruling out 301 Redirects) and Google has deprecated their AJAX Crawling scheme, it seems that a front-end solution is the only way!
How I did it:
(function() {
    // Map each old hash-based URL to its new path.
    var redirects = [
        ['#!/about', '/about'],
        ['#!/contact', '/contact'],
        ['#!/page-x', '/pageX']
    ];
    for (var i = 0; i < redirects.length; i++) {
        if (window.location.hash === redirects[i][0]) {
            // replace() keeps the old hash URL out of the browser history.
            window.location.replace(redirects[i][1]);
        }
    }
})();
I'm assuming that, because Google's crawlers do execute JavaScript, the new pages will be indexed properly.
I've put it in a <script> tag directly underneath the <title> tag, so that it gets executed before any other JS/CSS. Note that this script should only be required for your index file.
I am fairly certain that the hash/page anchor/bookmark part of a URL is not indexed by search engines, and therefore has no effect on your page ranking. Doing a google search for "inurl:#" returns zero documents, so that backs up my assumption. Links from external sites will be indexed without the hash.
You are right in that the hash part isn't sent to the server, so as far as I am aware, there isn't a good way to be able to create a redirection url with the hash in it.
Because of this, it's up to the browser to correctly manage the hash during a redirect. Firefox 3.5 appears to do this successfully. If you append a hash to a URL that has a known redirect, you will see the URL change in the address bar to the new location, but the hash stays on there successfully.
Edit: In response to the comment below, if there isn't a hash sign in the external URL for the part you need, then it is entirely possible to rewrite the URL. An Apache rewrite rule would take care of it:
RewriteCond %{HTTP_HOST} ^exemple\.oursite\.com$ [NC]
RewriteRule ^/?(.*)$ http://www.oursite.com/exemple/$1 [L,R=301]
If you're not using Apache, then you'll have to look into the server docs for something similar.
Google has a special syntax for AJAX applications that is based on hash URLs: http://code.google.com/web/ajaxcrawling/docs/getting-started.html
You could create a page on the old address that catches all requests and redirects to the new site with the correct address and code.
I did something like that, but it was in ASP.NET, which I guess is not the language you use. Anyway, there should be a way to do this in any language.
When returning status 301, your server is supposed to return a 'Location:' header which points to the new location. In practice, the way this is implemented varies; some servers provide the full URL (netloc and path), some just provide the new path and expect the browser to look for that path on the original netloc. It sounds like your rewrite rule is stripping the path.
An easy way to see what the returned Location header is, in the Python shell (Python 3; in Python 2 the module is named httplib):
>>> import http.client
>>> conn = http.client.HTTPConnection('exemple.oursite.com')
>>> conn.request('HEAD', '/')
>>> res = conn.getresponse()
>>> print(res.getheader('location'))
I'm afraid I don't know enough about mod_rewrite to tell you how to do the rewrite rule correctly, but this should give you an idea of what your server is actually telling clients to do.
The search bots don't care about hash tags. And if you are using them for some kind of Flash or AJAX calls, you have more serious problems than your 301 redirects not working, because unless you have the content in an alternate form, the search engines are not indexing your site and you are definitely suffering as far as SEO goes.
I registered my account so I can't edit.
zombat : I'm sorry I made a mistake in my comment. The link to our video is exemple.oursite.com/#video_id=233. In this case, my rewrite rule in Apache doesn't work.
Nick Berardi: We changed the way our links work. We don't use # anymore, only for backward compatibility

How to avoid conflict when not using ID in URLs

I often see (rewritten) URLs without an ID in them, like on some WordPress installations. What is the best way to achieve this?
Example: site.com/product/some-product-name/
Maybe to keep an array of page names and IDs in cache, to avoid DB query on every page request?
How to avoid conflicts, and what are other issues on using urls without IDs?
Using an ID presents the same conundrum, really--you're just checking for a different value in your database. The "some-product-name" part of your URL above is also something unique. Some people call them slugs (Wordpress, also permalinks). So instead of querying the database for a row that has the particular ID, you're querying the database for a row that has a particular slug. You don't need to know the ID to retrieve the record.
As long as product names are unique it shouldn't be an issue. It won't take any longer (at least not significantly) to look up a product by unique name than by numeric ID, as long as the column is indexed.
Wordpress has a field in the wp_posts table for the slug. When you create the post, it creates a slug from the post title (if that's how you have it configured), replacing spaces with dashes (or I think you can set it to underscores). It also takes out the apostrophes, commas, or whatnot. I believe it also limits the overall length of the slug, too.
So, in short, it isn't dynamically decoding the URL into the post's title--there's a field in the table that matches the URL version of the post name directly.
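On the conflict question, one common way to keep slugs unique when two titles collide is to append a numeric suffix at save time. A minimal sketch in Python, where the existing set stands in for a query against a uniquely indexed slug column and the function name is hypothetical:

```python
def unique_slug(base, existing):
    """Return base if unused, otherwise base-2, base-3, ... until a free slug is found."""
    if base not in existing:
        return base
    suffix = 2
    while f"{base}-{suffix}" in existing:
        suffix += 1
    return f"{base}-{suffix}"
```

The check runs once when the record is created, so page requests still need only a single lookup by slug.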
As you may or may not know, the URLs are being re-written with Apache's mod_rewrite module. As mentioned here, Wordpress is, in the background, assigning a slug after sanitizing the title or post name.
But, to answer your question, what you're describing is Wordpress' "Pretty Permalinks" feature and you can learn more about it in the Wordpress codex. Newer versions of Wordpress do the re-writing internally (no .htaccess editing, wp_rewrite instead), which is why you'll see the same ruleset for any permalink structure.
Though, if you do some digging you can find the old rewrite rules. For example:
RewriteRule ^([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/?$ /index.php?year=$1&monthnum=$2&day=$3 [QSA,L]
Will take a URL like /2008/01/01/ and direct it to /index.php?year=2008&monthnum=01&day=01 (and load a date category).
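The mapping can be reproduced in Python to see which part of the path lands in which query parameter. This is a sketch of the pattern only, not of how WordPress or Apache implement it; the function name is made up, and the leading slash is stripped first because the pattern itself does not include it.

```python
import re

# Python equivalent of the pattern in the rule above.
DATE_RULE = re.compile(r"^([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/?$")


def rewrite_date_path(path):
    """Return the internal index.php URL for a /YYYY/MM/DD/ path, or None if it does not match."""
    match = DATE_RULE.match(path.lstrip("/"))
    if match is None:
        return None
    year, month, day = match.groups()
    return f"/index.php?year={year}&monthnum={month}&day={day}"
```

A path such as /product-name/ does not match this pattern at all, which is why a slug has to be resolved through the database instead.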
But, as mentioned, a page like product-name exists only because Wordpress already sanitized the post title and stored it as a field in the database.

Resources