How to generate complex url like stackoverflow? - url

I'm using playframework, and I hope to generate complex urls like stackoverflow. For example, I want to generate a question's url:
http://aaa.com/questions/123456/How-to-generator-a-complex-url
Note the last part, it's the title of the question.
But I don't know how to do it.
UPDATED
In the playframework, we can define routes in conf/routes file, and what I do is:
GET /questions/{<\d+>id} Questions.show
In this way, when we call @{Questions.show(id)} in views, it will generate:
http://aaa.com/questions/123456
But I can't work out how to make the generated URL include the title part.

With the Play framework it's easy to generate such a URL. In your routes file, add this:
GET /questions/{id}/{title} YourController.yourMethod
See the routing documentation on the Play framework site for more info.
In your HTML page:
<a href="@{YourController.yourMethod(id, title.slugify())}">
The slugify method from JavaExtensions cleans reserved characters out of your title (see the docs).

This is what a server-side URL rewriter does. In the case of SO, it doesn't matter whether you type {...}/questions/4698625/how-to-generate-complex-url-like-stackoverflow or {...}/questions/4698625: both lead to the same content. So the postfix is there only to make the URL more readable.
To see more details about url rewriting, see this post.
UPD:
To generate such a postfix:
take the title of the content,
collapse runs of whitespace into a single space,
replace each space with a dash (-),
remove every character that is not a letter, a digit, or a dash.
It is best to perform these operations with regular expressions.
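The steps above can be sketched in a few lines of Python. This is only an illustration, not what Stack Overflow or Play actually runs: the function name slugify is borrowed from the Play answer, and lowercasing and the ASCII-only character set are assumptions added here.

```python
import re


def slugify(title: str) -> str:
    """Turn a title into a dash-separated URL postfix (ASCII-only sketch)."""
    slug = title.strip().lower()            # lowercasing is an assumption
    slug = re.sub(r"\s+", "-", slug)        # whitespace runs become one dash
    slug = re.sub(r"[^a-z0-9-]", "", slug)  # drop everything but letters, digits, dashes
    slug = re.sub(r"-{2,}", "-", slug)      # squeeze dashes left by removed symbols
    return slug.strip("-")


print(slugify("How to generate complex url like stackoverflow?"))
```

Accented or non-Latin titles would need extra handling (for example transliteration) before the second substitution, otherwise those characters are simply dropped.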

Related

Regex to normalize topic links in Discourse forum

I am using Discourse forum software. In its current state, Discourse presents links to topics in two ways, with and without a post number at the end.
Example:
forum.domain.com/t/some-topic/23
forum.domain.com/t/some-topic/23/5
The first one is what I want; the second one I don't want displayed in the forum at all.
I've written a post about it on the Discourse forum but didn't get an answer about which regex to put in the permalink normalization input field in the admin section.
I was told that there is an option to do it using permalink normalization like so (It's an example shown in the admin under the Regex input text, I didn't write it):
permalink normalizations
Apply the following regex before matching permalinks,
for example: /(topic.*)\?.*/\1 will strip query strings from topic routes.
Format is regex+string use \1 etc. to access captures
I don't know what regex I should use to remove the numerical value of the post number from links. I need it only for topic links.
This is the routes.rb routing library and this is the permalink.rb library (I think the permalink library should give a better clue about how to achieve this). I have no idea how to approach this, because it seems I need some knowledge of Discourse routing to make it work. For example, I don't understand why (topic.*) is part of the regex or what it means, so their example doesn't help me find a solution.
In the admin I have an input field in which I need to put the normalization regex.
I need help with the Regex. I need the regex to work with all topics.
Things I've tried that didn't work out:
/(\/\d+)\/\d+$/\1
/(t/[^/]+/\d+).*/\1
/(\/\d+)\/[0-9]+$/\1
/(\/\d+)\/[0-9]+/\1
/(\/\d+)\/\d+$/\1/
/(forum.domain.com(\/\w+)*\/\d+)\/\d+(?=\s|$)/\1
Note: The Permalink Normalization input field treats the character | as a separator to separate between several Regex expressions.
I think this may be the expression you are looking for to put inside the settings field:
/(t\/.*\/\d+)(\/\d+)/\1
You can see it working on Rubular.
However, the code that generates the url is not using the normalization code, so the expression is being ignored.
You could try normalizing the permalink there:
def last_post_url
  url = "#{Discourse.base_uri}/t/#{slug}/#{id}/#{posts_count}"
  url = Permalink.normalize_url url
  url
end
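To check the suggested expression outside Discourse, the same pattern and replacement can be tried in Python, whose regex syntax agrees with Ruby's for this pattern. This is a standalone sketch: strip_post_number is a hypothetical helper name, and the sample URLs are the ones from the question.

```python
import re

# Same pattern as the suggested setting value /(t\/.*\/\d+)(\/\d+)/\1:
# group 1 is the topic path up to the topic id, group 2 the post number.
POST_NUMBER = re.compile(r"(t/.*/\d+)(/\d+)")


def strip_post_number(url: str) -> str:
    """Drop a trailing /<post-number> from a topic URL, if there is one."""
    return POST_NUMBER.sub(r"\1", url, count=1)


print(strip_post_number("forum.domain.com/t/some-topic/23/5"))
print(strip_post_number("forum.domain.com/t/some-topic/23"))
```

A URL that already lacks the post number does not match the pattern and is returned unchanged.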
I didn't fully understand your question, but if I got it right, you are saying that you want links with /some-number at the end but don't want links with /some-number/some-number at the end. If that is the case, the regex is:
forum\.domain\.com\/t\/[^0-9\/]+\/\d{1,9}$
You can replace 'forum' with your forum name and 'domain' with your domain name.
This will remove trailing "/<digits>" after another "/<digits>":
/(forum.domain.com(\/\w+)*\/\d+)\/\d+(?=\s|$)/\1

Can friendly-id gem work with capital letters in url e.g. /users/joe-blogs and /users/Joe-Blogs both work

I like the friendly_id gem, but one problem I'm seeing is that when I type in a URL with a capital letter in it, such as /users/Joe-Blogs, it can't find the page. It's a little trivial, but most sites can handle something like this and will render the page whether or not the URL has a capital letter. Does anyone know a fix for this?
Edit: to clarify, this is for when users enter a URL manually and put capitals in it just because it's a name, like author/Joe-Blogs. I've seen other sites handle this, but Rails seems to just give a 404.
friendly_id uses parameterize to create the slugs.
I think the best way to solve your problem is to parameterize the param before using it in the find.
# controller
User.find(params[:id].parameterize)
Or parameterize the url where the link originated from.
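The idea of normalizing the incoming id before the lookup can be sketched in Python as follows. It is a rough approximation of what Rails' parameterize does, not a port of it; normalize_slug, find_user and the in-memory USERS_BY_SLUG table are hypothetical names used only for illustration.

```python
import re
from typing import Optional


def normalize_slug(raw: str) -> str:
    """Lowercase the value and turn runs of other characters into single dashes."""
    return re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")


# Hypothetical stand-in for the users table, keyed by the stored slug.
USERS_BY_SLUG = {"joe-blogs": {"name": "Joe Blogs"}}


def find_user(raw_id: str) -> Optional[dict]:
    """Look a user up by slug regardless of how the visitor capitalized it."""
    return USERS_BY_SLUG.get(normalize_slug(raw_id))


print(find_user("Joe-Blogs"))
```

Because the stored slugs are already lowercase, normalizing only the incoming value is enough for /users/Joe-Blogs and /users/joe-blogs to resolve to the same record.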
As an addition to Vic's answer, you'll want to look at url normalization:
The following normalizations are described in RFC 3986 to result in equivalent URLs:
Converting the scheme and host to lower case.
The scheme and host components of the URL are case-insensitive. Most normalizers will convert them to lowercase.
Example: HTTP://www.Example.com/ → http://www.example.com/
In short - it's against convention to use capitalization in your urls.
You may also wish to look at URI normalize (note that it lowercases only the scheme and host, not the path); more importantly, you should work to remove the capitalization from your URLs:
URI.parse(params[:id]).normalize

Attaching parameters to the URL of a Rails route

This is a silly question, but oddly enough I Googled it without success. I am sure I had seen it before in the Rails guides, but now I can't find it.
I want to attach parameters to my URL.
My initial URL is this: "http://localhost:3000/pharmacy/patients"
Now I attach one parameter with string concatenation in JavaScript and it becomes this:
"http://localhost:3000/pharmacy/patients?provider=234"
So far so good.
Now I want to attach a second parameter named thera_class, whose values are strings with spaces in them, like "Nasal Congestion".
If I also concatenate that second parameter, what would the URL look like?
The way it would look is:
http://localhost:3000/pharmacy/patients?provider=234&thera_class=Nasal Congestion
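Strictly speaking, a literal space is not allowed in a URL, so it should be percent-encoded as %20 (in JavaScript, encodeURIComponent does this for a parameter value):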
http://localhost:3000/pharmacy/patients?provider=234&thera_class=Nasal%20Congestion
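The encoding step can also be left to a library instead of being done by hand. The sketch below uses Python's standard urllib.parse to illustrate the idea; patients_url is a hypothetical helper, and the base URL and parameter names are taken from the question.

```python
from urllib.parse import quote, urlencode


def patients_url(base: str, provider: int, thera_class: str) -> str:
    """Append both parameters, percent-encoding their values."""
    # quote_via=quote encodes a space as %20; the default (quote_plus) would use "+".
    query = urlencode(
        {"provider": provider, "thera_class": thera_class},
        quote_via=quote,
    )
    return f"{base}?{query}"


print(patients_url("http://localhost:3000/pharmacy/patients", 234, "Nasal Congestion"))
```

Building the query string this way also takes care of characters such as & or = inside a value, which plain concatenation would silently break.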

ASP.NET MVC Colon in URL

I've seen that IIS has a problem with letting colons into URLs. I also saw the suggestions others offered here.
With the site I'm working on, I want to be able to pass titles of movies, books, etc., into my URL, colon included, like this:
mysite.com/Movie/Bob:The Return
This would be consumed by my MovieController, for example, as a string and used further down the line.
I realize that a colon is not ideal. Does anyone have any other suggestions? As poor as it currently is, I'm doing a find-and-replace from all colons (:) to another character, then a backwards replace when I want to consume it on the Controller end.
I resolved this issue by adding this to my web.config:
<httpRuntime requestPathInvalidCharacters=""/>
This must be within the system.web section.
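The default is (with < , > and & written as XML entities, as required inside an attribute value in web.config):
<httpRuntime requestPathInvalidCharacters="&lt;,&gt;,*,%,&amp;,:,\,?"/>
So to make an exception only for the colon, it would become:
<httpRuntime requestPathInvalidCharacters="&lt;,&gt;,*,%,&amp;,\,?"/>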
Read more at: http://msdn.microsoft.com/en-us/library/system.web.configuration.httpruntimesection.requestpathinvalidcharacters.aspx
From what I understand, the colon is acceptable as an unencoded character in a URL. I don't know why they added it to the default of requestPathInvalidCharacters.
Consider URL encoding and decoding your movie titles.
You'd end up with foo.com/bar/Bob%3AThe%20Return
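The encode/decode round trip looks like this in Python's standard library. It is shown only to illustrate the percent-encoding itself, not the ASP.NET API, and the variable names are arbitrary.

```python
from urllib.parse import quote, unquote

title = "Bob:The Return"

# safe="" makes quote() encode every reserved character, including ":" and "/".
encoded = quote(title, safe="")
decoded = unquote(encoded)

print(encoded)
print(decoded == title)
```

The colon is code point 0x3A, which is why it appears as %3A; the space (0x20) becomes %20.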
As an alternative, consider using an HTML helper to remove URL-unfriendly characters from URLs (the method is URLFriendly()). The SEO difference between a colon and a placeholder (e.g. a dash) would likely be negligible.
One of the biggest worries with your approach is that the movie name isn't always going to be unique (e.g. "The Italian Job"). Also, what about other illegal characters (e.g. brackets)?
It might be a good idea to use an id number in the url to locate the movie in your database. You could still include a url friendly copy of movie name in your url, but you wouldn't need to worry about getting back to the original title with all the illegal characters in it.
A good example is the url to this page. You can see that removing the title of the page still works:
ASP.NET MVC Colon in URL
ASP.NET MVC Colon in URL
The colon is a reserved character in a URI according to RFC 3986, and ASP.NET rejects it in the request path by default, so don't rely on it. You need to either URL-encode it or use another character. And here's a nice blog post you might take a look at.
The simplest way is to use System.Web.HttpUtility.UrlEncode() when building the URL and System.Web.HttpUtility.UrlDecode() when interpreting the results coming back. You would also have problems with the space character if you don't encode the value first.

dynamic seo title for news articles

I have a news section where the pages resolve to urls like
newsArticle.php?id=210
What I would like to do is use the title from the database to create seo friendly titles like
newsArticle/joe-goes-to-town
Any ideas how I can achieve this?
Thanks,
R.
I suggest you actually include the ID in the URL, before the title part, and ignore the title itself when routing. So your URL might become
/news/210/joe-goes-to-town
That's exactly what Stack Overflow does, and it works well. It means that the title can change without links breaking.
Obviously the exact details will depend on what platform you're using - you haven't specified - but the basic steps will be:
When generating a link, take the article title and convert it into something URL-friendly; you probably want to remove all punctuation, and you should consider accented characters etc. Bear in mind that the title won't need to be unique, because you've got the ID as well
When handling a request to anything starting with /news, take the next part of the path, parse it as an integer and load the appropriate article.
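The request-handling step can be sketched as below, in Python purely for illustration since the platform was not specified. It assumes the /news/<id>/<title> layout suggested above; article_id_from_path is a hypothetical helper name.

```python
import re
from typing import Optional

# /news/<id> optionally followed by /<title>; the title is matched but never used.
NEWS_ROUTE = re.compile(r"^/news/(\d+)(?:/[^/]*)?/?$")


def article_id_from_path(path: str) -> Optional[int]:
    """Return the article id from a /news/... path, or None if it does not match."""
    match = NEWS_ROUTE.match(path)
    return int(match.group(1)) if match else None


print(article_id_from_path("/news/210/joe-goes-to-town"))
print(article_id_from_path("/news/210/some-older-title"))
```

Because only the id is read, an old link keeps working after the article title changes.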
Assuming you are using PHP and can alter your source code (you will need to, in order to get at the article's title), I'd do the following:
First, you'll need to have a function (or maybe a method in an object-oriented architecture) to generate the URLs for you in your code. You'd supply the function with the article object or the article ID and it returns the friendly URL with the ID and the friendly title.
Basically function url(Article $article) => URL.
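The shape of such a function can be sketched as follows; it is written in Python rather than PHP only to show the idea. The Article class, the article_url name and the /news/<id>/<title> layout are assumptions made for this example, not part of the original code base.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    id: int
    title: str


def article_url(article: Article) -> str:
    """Build a friendly URL containing the id and a cleaned-up title."""
    # Lowercase, then turn every run of non-alphanumeric characters into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", article.title.lower()).strip("-")
    # Fall back to the bare id when the title leaves nothing usable.
    return f"/news/{article.id}/{slug}" if slug else f"/news/{article.id}"


print(article_url(Article(210, "Joe goes to town!")))
```

Keeping URL generation in one function means the format can later change in a single place.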
You will also need some URL rewriting rules to remove the PHP script from the URL. For Apache, refer to the mod_rewrite documentation for details (RewriteEngine, RewriteRule, RewriteCond).

Resources