url masking mod_rewrite - url

I am slowly but surely learning PHP, and all has gone well until now.
I am looking to do a URL rewrite. My database is fairly deep, and a typical URL looks like:
players.php?position=1&teamid=4&playerid=129
Basically, I want to serve /Defender/Arsenal/Thomas-Vermaelen/ instead, where each segment is the name associated with the corresponding ID in the database. This one script generates lots of different pages, and I want to work out how to use the names in the URL instead of the ID numbers.
I'm 99% sure this can be done, as I have been looking in detail at the Joomla CMS, and I wondered if anyone could help shed some light on this, please?
Thanks in advance
Richard :)

I think the easiest approach would be to simply map the requested URI /Defender/Arsenal/Thomas-Vermaelen/ onto /players.php?position=Defender&teamid=Arsenal&playerid=Thomas-Vermaelen:
RewriteRule ^/?([A-Za-z]+)/([\w-]+)/([\w-]+)/$ /players.php?position=$1&teamid=$2&playerid=$3 [L]
The optional leading slash (/?) lets the pattern match both in the server config and in a .htaccess file, where the leading slash is stripped before matching. Then in your PHP script you can check whether the parameter value is numeric or alphabetic, and fetch the numeric ID in the latter case.
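The numeric-or-name check described above can be sketched as follows. It is written in Python for brevity, and the same logic translates directly to PHP. The PLAYER_IDS dictionary and the resolve_player_id name are hypothetical stand-ins for a real database lookup.

```python
import re

# Hypothetical in-memory stand-in for a database table mapping names to IDs.
PLAYER_IDS = {"thomas-vermaelen": 129}


def resolve_player_id(value, name_to_id=PLAYER_IDS):
    """Return a numeric ID for either '129' or 'Thomas-Vermaelen'."""
    if re.fullmatch(r"[0-9]+", value):
        # Old-style numeric parameter: use it directly.
        return int(value)
    # Name-style parameter: normalise the case and look the ID up.
    # Returns None when the name is unknown.
    return name_to_id.get(value.lower())


print(resolve_player_id("129"))
print(resolve_player_id("Thomas-Vermaelen"))
```

An unknown name returns None, which the calling script would typically turn into a 404 response.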

Related

Advanced URLs and URL rewriting

I was visiting the site asos.com the other day. If you search 'tshirt' on their site, the resulting URL is 'http://www.asos.com/search/tshirt?q=tshirt'. Does anyone know which technique they use to make it seem that they generate a live page called 'tshirt', which basically takes any extension?
Also, if you select a product, the URL becomes something like 'http://www.asos.com/ralph_lauren/polo/product.aspx'. I know they don't have a file and folder for every brand and item, so how is it possible for the browser to follow this URL?
I'm not looking for any code, just a hint on what to google for more information.
Hope this doesn't sound too ignorant!
Many Regards,
Andreas
In most cases, this sort of functionality (often called clean URLs, user-friendly URLs, or spider-friendly URLs) is achieved through server-side rewrites, which point all requests of a specific known structure to a single backend script for processing.
Now, the specific URLs you mention are not, in my opinion, the best examples of clean URLs. I will, however, give you an example of how such a clean URL might be achieved using Apache mod_rewrite (since Apache is so popular).
Take for example a URL like http://somedomain.com/product/ralph_lauren/polo
You might be able to do something like this in mod_rewrite:
RewriteEngine On
RewriteRule ^/?product/([^/]+)/([^/]+)/?$ /product.php?cat=$1&subcat=$2 [L]
This would silently (to the end user) rewrite the incoming request for any URL of the structure /product/*/* to a script called /product.php, passing the second and third parts of the URL as the cat and subcat parameters to be evaluated by the script.
I'm not sure I understand what you are asking, but in the example you cited it's using a query string which is everything after the '?' in the URL.
On the backend server it uses the variables passed in the query string to determine what to return back to you.
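As a small illustration of the path and query-string split described above, here is a sketch using Python's standard urllib.parse module. The split_request helper is a hypothetical name, and nothing here is claimed about how asos.com actually processes requests.

```python
from urllib.parse import urlsplit, parse_qs


def split_request(url):
    """Split a URL into its path and a dict of query-string parameters."""
    parts = urlsplit(url)
    # parse_qs maps each parameter name to a list of values,
    # because a name may appear more than once in a query string.
    return parts.path, parse_qs(parts.query)


path, params = split_request("http://www.asos.com/search/tshirt?q=tshirt")
print(path)
print(params)
```

The backend can route on either part: the path ('/search/tshirt') or the q parameter after the '?'.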

Friendly URL Structure with folders

I am currently building the URL structure for an insurance website, and I was hoping someone could give me some advice. I am trying to make it as SEO friendly as possible!
As an example, for 'travel insurance' the URL would be:
http://www.example.com/travel_insurance/travel.html
and then for the quote form:
http://www.example.com/travel_insurance/quote/get_a_quote.html
Would that be an ideal structure?
Any suggestions/advice is appreciated!
As an SEO specialist, I'd say it is best practice to keep your URLs as short and descriptive as possible.
If you want the page to rank, add the keywords you wish to rank for to the URL.
If you do not want the page to rank, it won't matter much.
URLs with too many characters are not shown in full in search engine result snippets, which can cause a loss of potential click-throughs to your web page.
It is also best practice to use a dash (-) rather than an underscore (_) in your URLs.
I would go with:
http://www.website.com/travel-insurance.html
http://www.website.com/travel-insurance/get-a-quote.html
Fewer folders, quicker path, easier to understand.
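A minimal sketch of generating dash-separated URL segments from page names, in line with the advice above. The slugify function is a hypothetical helper and handles ASCII letters and digits only.

```python
import re


def slugify(title):
    """Lowercase a title and join its words with dashes."""
    slug = title.lower()
    # Replace every run of characters that are not a-z or 0-9
    # (spaces, underscores, punctuation) with a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Drop any dash left over at either end.
    return slug.strip("-")


print(slugify("Travel Insurance"))
print(slugify("get_a_quote"))
```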

Any short URL service that you can POST variables on?

I work for a small SMS marketing company, where we're sending out text messages that each contain a unique code for the user (as a variable). My URL is rather long, and I want to attach a unique variable to each one.
For example, the full URL might be:
http://www.mybigwebsiteurlishuge.com/more/more/?code={variable}
but I want it to be something like:
http://bit.ly/2398h?code={variable}
Does anybody know any services that can do this? Otherwise I need to purchase a small domain name just for this.
Thanks so much!
Most shortening services, including bit.ly, have APIs that you can use to shorten your URLs. You will have to use their API to get the shortened URL.
I kept on looking and still couldn't find anything suitable, so I got a new 3-character domain name and also made a redirecting script that changes the miniaturized variable names to the full ones. This works just as well, really.
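A redirecting script of this kind might look roughly like the sketch below. The short parameter name c, the FULL_NAMES mapping, and the abc.example domain are assumptions made for illustration. The long base URL is the one from the question.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

LONG_BASE = "http://www.mybigwebsiteurlishuge.com/more/more/"

# Hypothetical mapping from miniaturized parameter names to the full ones.
FULL_NAMES = {"c": "code"}


def expand(short_url):
    """Build the long redirect target for a short URL."""
    query = urlsplit(short_url).query
    # Rename each short parameter; unknown names pass through unchanged.
    pairs = [(FULL_NAMES.get(name, name), value)
             for name, value in parse_qsl(query)]
    return LONG_BASE + "?" + urlencode(pairs)


print(expand("http://abc.example/?c=AB12"))
```

The script on the short domain would then answer with a redirect to the returned URL.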

Best way to format pretty URLs for numeric IDs

Alright, so let's say I'm writing a forum application, and I want pretty URLs. However, all my tables use numeric IDs, so I'm not sure the best way to format the URLs for those resources. Let's pretend I'm trying to get a topic with ID 123456 and title This is a forum post. I've seen it done a couple ways:
www.example.com/topic/123456
www.example.com/topic/this-is-a-forum-post
www.example.com/topic/123456/this-is-a-forum-post
Which one would you say is, taking all things into consideration (including SEO), the optimal URL?
Sorry if this question is too vague, but it seems programming-related and it's not incredibly open-ended, as I just want to hear the pros and cons of each method.
I would go with option 3, and make the slug (the last bit) optional
Because?
The ID will always be unique... 2 people may make a thread with the name 'good news' for example
The search bots can access the slug for some SEO goodness
The slug should be optional ... Using just the ID should still give you access to the site. Perhaps if the slug isn't there you could forward to the slug'd version, if you're concerned about duplicate content. You could always use the canonical meta tag to tell Google to index the slugged version.
Another benefit of the optional slug is if someone copies and pastes the URL into a document, there is a chance it could have characters at the end chopped off (because URLs generally don't have spaces, so they don't break to new lines). Having the slug optional means there is more of a chance people will find your page.
I believe this is what Stack Overflow does, and notice that they are doing rather well in the search engines.
Update
From the comments, be sure to 301 redirect any missing slug version to the correct slug.
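To illustrate why the optional slug is robust, here is a sketch of a parser for option 3 in which only the numeric ID is significant. The /topic/ prefix comes from the question. The parse_topic_path name and the pattern itself are assumptions.

```python
import re

# The ID is required; the slug and a trailing slash are optional.
TOPIC_PATH = re.compile(r"^/topic/(\d+)(?:/([^/]*))?/?$")


def parse_topic_path(path):
    """Return (topic_id, slug) for a topic URL path, or None if it is not one."""
    match = TOPIC_PATH.match(path)
    if match is None:
        return None
    topic_id = int(match.group(1))
    # An absent or empty slug is reported as None.
    slug = match.group(2) or None
    return topic_id, slug


print(parse_topic_path("/topic/123456"))
print(parse_topic_path("/topic/123456/this-is-a-forum-post"))
print(parse_topic_path("/topic/123456/this-is-a-for"))  # slug chopped off when pasted
```

All three paths resolve to the same topic ID, so a truncated or missing slug still finds the page.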
URL 1 is definitely suboptimal. URL 2 is attractive but you run the risk of confusion if tags collide, especially if they differ only in punctuation. So I'd say URL 3 is the clear winner.
Also note that just because you display URL 3 is no reason not to accept all 3, with the other two redirecting. If URL 2 is ambiguous, it should redirect to a disambiguation page.
I would think that the 2nd URL would be the best for SEO since it is meaningful and has less depth. It's nicer for people as well since you can look at the URL and know what the content is about.
URL 1 doesn't include the title, so you'll lose the additional SEO value of having those keywords in the URL.
URL 2 won't work well, because it doesn't have a unique numerical ID, so what are you going to do if someone else tries to post a topic titled "This is a forum post"? Then you start getting into the weird thing Digg does, where it has to give the second one the URL "http://www.example.com/topic/this-is-a-forum-post_2", and so on. It makes it harder to take the URL they tried to load and figure out exactly which topic they were trying to get to.
URL 3 has the best of both worlds; this would be my style of choice.
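The collision handling that a title-only URL forces on you can be sketched as follows. The unique_slug name and the set of taken slugs are hypothetical. The _2 suffix style follows the Digg example above.

```python
def unique_slug(base, taken):
    """Return base, or base with the first free numeric suffix (_2, _3, ...)."""
    if base not in taken:
        return base
    n = 2
    # Keep counting until an unused suffixed slug is found.
    while f"{base}_{n}" in taken:
        n += 1
    return f"{base}_{n}"


taken = {"this-is-a-forum-post"}
print(unique_slug("this-is-a-forum-post", taken))
```

With a numeric ID in the URL, none of this bookkeeping is needed, because the ID already disambiguates identical titles.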
Stack Overflow seems to be using pattern 3, with the title being ignored completely (just the ID is used).
That makes for a nice semantic URL, is easy to implement, and still works if the title changes later.
Of course, the title could be completely fake:
Best way to format pretty URLs for numeric IDs
I'll go for the first one. It really doesn't matter now, since there are URL shorteners, which will only proliferate and become the norm in the future. Remember, the longer your URL, the fewer SEO points you'll get.
And you can't control the way people name their forum topics. So really, I'd just choose the first one for simplicity and because it is the norm.
For SEO/traffic, definitely no.2 without a doubt. Get those meaningless numbers out of the URL every single time.
www.example.com/topic/this-is-a-forum-post
Pick up the "this-is-a-forum-post" part from the URL and map it back to the ID number in your database via a query. Then do an internal URL rewrite to the real page, something like /topic.php?ID=324342
I would go with option 2, as search engines can understand it better.
Stack Overflow uses the third way, which is probably the reason Stack Overflow URLs are not well optimized for SEO. I am not sure about the above answer.
But in my experience with Google, I could quite often see solutions from other forums, whereas Stack Overflow solutions were almost invisible.
Best way to format pretty URLs for numeric IDs
Best way to format pretty URLs for numeric IDs
If both URLs are one and the same, the search engine simply goes with option 2, which is less optimized.
I'm not convinced longer URL's are SEO trouble. The depth seems to be a bigger issue, and not by counting slashes, but by steps it takes to get from an indexed page with rank to the content page. I recently created a dummy test page titled /content/roofing/how-much-does-a-shingle-roof-cost.html and threw it on the server just to test pathways and make sure my directories were working correctly. I'm not even sure how google discovered the page but it did and it started getting traffic, so I had to give it content and make it part of the family. The dummy content was a copy of our about page so it wasn't empty, but I was surprised an unpromoted page would get traction, and think the URL had something to do with that.
Which brings up a slight alternative to the above 3 choices for a URL. What if you went with number 3 but added .html to the end? I generally do this with dynamic URL's but I have no concrete evidence that it's helpful. According to Google they brag that they can index dynamic URL's just fine and so there's no need to do URL rewrites at all. Google doesn't mind a bit if the other engines aren't as good at that. Several sites I trust add the html at the end (blogger for example) and it can't hurt, so I still do it.
I would suggest the first one, since the topic title can be changed for clarity by the admins, and then the URL would be inconsistent.
www.example.com/topic/123456
It also allows one to just edit the last bit of the URL (the numbers) and jump to another topic. That is not likely to happen, but it is still a usable feature.

Can an "SEO Friendly" url contain a unique ID?

I'd like to start using "SEO Friendly Urls" but the notion of generating and looking up large, unique text "ids" seems to be a significant performance challenge relative to simply looking up by an integer. Now, I know this isn't as "human friendly", but if I switched from
http://mysite.com/products/details?id=1000
to
http://mysite.com/products/spacelysprokets/sproket/id
I could still use the ID alone to quickly lookup the details, but the URL itself contains keywords that will display in that detail. Is that friendly enough for Google? I hope so as it seems a much easier process than generating something at the end that is both unique and meaningful.
Thanks!
James
Be careful with allowing a page to render using the same method as Stack Overflow.
http://stackoverflow.com/questions/820493/random-text-can-cause-problems
Black hats can use this to cause a duplicate content penalty for long-tail competitors (trust me).
Here are two things you can do to protect yourself from this.
HTTP 301 redirect any inbound URL that matches your ID but doesn't match the text to the URL with the correct text.
Example:
http://stackoverflow.com/questions/820493/random-text-can-cause-problems
301 ->
http://stackoverflow.com/questions/820493/can-an-seo-friendly-url-contain-a-unique-id
Use canonical URLs.
<link rel="canonical"
href="http://stackoverflow.com/questions/820493/can-an-seo-friendly-url-contain-a-unique-id"
/>
https://stackoverflow.com/questions/820493/can-an-seo-friendly-url-contain-a-unique-id
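The 301 protection described above can be sketched as a small decision function. The CANONICAL_SLUGS dictionary stands in for a database lookup, and the respond name and the (status, location) return shape are assumptions for illustration, not Stack Overflow's actual implementation.

```python
import re

# Hypothetical store mapping each ID to its one correct slug.
CANONICAL_SLUGS = {820493: "can-an-seo-friendly-url-contain-a-unique-id"}

QUESTION_PATH = re.compile(r"^/questions/(\d+)(?:/([^/]*))?/?$")


def respond(path, slugs=CANONICAL_SLUGS):
    """Return (status, canonical_path) for a requested question path."""
    match = QUESTION_PATH.match(path)
    if match is None or int(match.group(1)) not in slugs:
        return 404, None
    question_id = int(match.group(1))
    canonical = f"/questions/{question_id}/{slugs[question_id]}"
    if path.rstrip("/") != canonical:
        # Wrong or missing text: send a permanent redirect to the real URL.
        return 301, canonical
    # Correct URL: serve the page; canonical can also feed the <link> tag.
    return 200, canonical


print(respond("/questions/820493/random-text-can-cause-problems"))
```

Because every variant of the text redirects to a single URL, arbitrary text after the ID can no longer produce duplicate pages.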
I'd say you're fine.
Have a look at the URLs that StackOverflow uses. They have a unique id, then they have the SEO-friendly stuff. You can omit the SEO-friendly stuff and the URL still works.
You are making a devil's bargain here: you are trading away business goals for technology goals.
If you were to ask, "From a purely business and SEO perspective, is it better to include unique IDs in the URL or not?", the answer would clearly be to not use them.
The question then becomes, if you do use them, how much does it hurt you in the search engines? The answer is that it definitely has some negative impact. How much is yet to be determined.
In terms of "user friendly", no, they are definitely not user friendly.
In terms of Google, they state "Whenever possible, shorten URLs by trimming unnecessary parameters." See their URL structure document.
I'm not aware of any problems caused by adding an ID to a URL. In fact it can be extremely useful, as it allows the human/search engine friendly part of the URL to be changed without causing a broken link to a page that a search engine has already indexed. Using SO as an example, here's a link to your question:
https://stackoverflow.com/questions/820493/you-can-put-any-text-you-want-here
Nothing wrong with that. An increasing number of services have started to use a hybrid solution as Paul Tomblin already pointed out. In addition to SO, Tumblr uses this pattern too (maybe it was the first).
Furthermore, in certain services—like Google News—the URL must contain a unique numeric ID.
Getting rid of the parameterized URL will definitely help. From my experience, including the ID does not hurt or help, as long as there are no '?key=value' pairs in the url.
I have two seemingly contradictory points to make here:-
Nobody looks at URLs! Experience has "trained" browser users to treat the contents of the address box as invisible; they know the contents will be any two of 'unreadable', 'meaningless' and 'confusing', hence they just ignore it completely.
Using a string which can be easily converted to an integer may offer a slight performance advantage over using a longer string which is slightly harder (hash() vs. to_int()) to convert into an integer. However, in the context of the average web application any performance difference would be negligible.
My advice would be to stick with what you're comfortable with.
Use something like mod_rewrite to parse URLs before they reach your script. So you could convert a slug URL like http://oorl.com/99942/My-Friendly-Text-For-Search-Engines/ into http://oorl.com/lookup.php?id=99942. This will also let you change the slug and keywords used to optimize certain links without damaging functionality.
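The conversion described above can be emulated in a few lines, shown here in Python rather than as an actual mod_rewrite rule. The pattern and the rewrite function name are assumptions. The /lookup.php?id= target is the one from the answer.

```python
import re

# A numeric ID, then any single text segment, with an optional trailing slash.
SLUG_URL = re.compile(r"^/(\d+)/[^/]*/?$")


def rewrite(path):
    """Map /<id>/<any-text>/ to /lookup.php?id=<id>; leave other paths alone."""
    match = SLUG_URL.match(path)
    if match is None:
        return path
    # Only the ID survives; the text segment is ignored entirely.
    return f"/lookup.php?id={match.group(1)}"


print(rewrite("/99942/My-Friendly-Text-For-Search-Engines/"))
print(rewrite("/99942/Completely-Different-Keywords/"))
```

Both paths map to the same lookup, which is why the text can be changed later without breaking existing links.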
Duplicate content causes more harm than a friendly URL does good, so be careful about accepting fake text alongside the ID; your competitors could misuse this.
Yes, and in fact it's more SEO friendly to include a number in your url as it implies to google that you are consistently updating your content.
I am fairly sure that it makes it much more difficult to get indexed in Google News if you don't have an incrementing number attached in some way to your URLs.

Resources