Any problems with using a period in URLs to delimiter data? - url

I have some easy to read URLs for finding data that belongs to a collection of record IDs that are using a comma as a delimiter.
Example:
http://www.example.com/find:1%2C2%2C3%2C4%2C5
I want to know if I change the delimiter from a comma to a period. Since periods are not a special character in a URL. That means it won't have to be encoded.
Example:
http://www.example.com/find:1.2.3.4.5
Are there any browsers (Firefox, Chrome, IE, etc) that will have a problem with that URL?
There are some related questions here on SO, but none that specific say it's a good or bad practice.

To me, that looks like a resource with an odd query string format.
If I understand correctly this would be equal to something like:
http://www.example.com/find?id=1&id=2&id=3&id=4&id=5
Since your filter is acting like a multi-select (IDs instead of search fields), that would be my guess at a standard equivalent.
Browsers should not have any issues with it, as long as the application's route mechanism handles it properly. And as long as you are not building that query-like thing with an HTML form (in which case you would need JS or some rewrites, ew!).
May I ask why not use a more standard URL and querystring? Perhaps something that includes element class (/reports/search?name=...), just to know what is being queried by find. Just curious, I knows sometimes standards don't apply.

Related

What is si=a or si=fc, etc in my soundcloud link? [duplicate]

https://www.airbnb.com/help?audience=host?audience=guest?audience=host?audience=host?audience=host
The URL above was created occasionally by me.
A normal URL to me has one question mark while all parameters are distinct. So in my opinion, this URL is abnormal.
What seems weird to me is that it still works and my browser has no complaint about it.
Would anyone explain it to me?
The first ? indicates the query component. The query component is terminated by the first following #, or the end of the URL.
So, this is the query component of your URL:
audience=host?audience=guest?audience=host?audience=host?audience=host
Within the query component, it’s perfectly fine to use ? characters, they don’t have any special meaning there (list of all allowed characters in the query).
While parameters in the query typically are in the name=value format, separated by &, this is just a convention (it’s what the encoding type application/x-www-form-urlencoded in HTML forms produces). Site authors can use whatever format they want.

#! as opposed to just # in a permalink

I'm designing a permalink system and I just noticed that Twitter and Hipmunk both prefix their permalinks with #!. I was wondering why this is, and if the exclamation point in particular is there for a reason. Wouldn't #/ work just as well, since they're no doubt using a framework that lets them redirect queries to certain templates with a regex URL parser?
http://www.hipmunk.com/#!BOS.SEA,Dec15.Jan02
http://twitter.com/#!/dozba
My only guess is it's because browsers use # to link to an anchor element. Is this why the exclamation point is appended?
This is done to make an "AJAX" page crawlable [by google] for indexing -- It does not affect the other well-defined semantics of the fragment identifier at all!
See Making AJAX Applications Crawlable: Getting Started
Briefly, the solution works as follows: the crawler finds a pretty AJAX URL (that is, a URL containing a #! hash fragment). It then requests the content for this URL from your server in a slightly modified form. Your web server returns the content in the form of an HTML snapshot, which is then processed by the crawler. The search results will show the original URL.
I am sure other search-engines are also following this lead/protocol.
Happy coding.
Also, It is actually perfectly valid, at least per HTML5, to have an element with an ID of "!foo" so the
reasoning in the post is invalid. See the article "The id attribute just got more classy":
HTML5 gets rid of the additional restrictions on the id attribute. The only requirements left — apart from being unique in the document — are that the value must contain at least one character (can’t be empty), and that it can’t contain any space characters.
My guess is that both pages use this in their JavaScript to differ between # (a link to an anchor) and their custom #! which loads some additional content using Ajax.
In that case pretty much everything else would work after the # sign.

ColdFusion - What's the best URL naming convention to use?

I am using ColdFusion 9.
I am creating a brand new site that uses three templates. The first template is the home page, where users are prompted to select a brand or a specific model. The second template is where the user can view all of the models of the selected brand. The third template shows all of the specific information on a specific model.
A long time ago... I would make the URLs like this:
.com/Index.cfm // home page
.com/Brands.cfm?BrandID=123 // specific brand page
.com/Models.cfm?ModelID=123 // specific model page
Now, for SEO purposes and for easy reading, I might want my URLs to look like this:
.com/? // home page
.com/?Brand=Worthington
.com/?Model=Worthington&Model=TX193A
Or, I might want my URLs to look like this:
.com/? // home
.com/?Worthington // specific brand
.com/?Worthington/TX193A // specific model
My question is, are there really any SEO benefits or easy reading or security benefits to either naming convention?
Is there a best URL naming convention to use?
Is there a real benefit to having a URL like this?
http://stackoverflow.com/questions/7113295/sql-should-i-use-a-junction-table-or-not
Use URLs that make sense for your users. If you use sensible URLs which humans understand, it'll work with search engines too.
i.e. Don't do SEO, do HO. Human Optimisation. Optimise your pages for the users of your page and in doing so you'll make Google (and others) happy.
Do NOT stuff keywords into URLs unless it helps the people your site is for.
To decide what your URL should look like, you need to understand what the parts of a URL are for.
So, given this URL: http://domain.com/whatever/you/like/here?q=search_terms#page-frament.
It breaks down like this:
http
what protocol is used to deliver the page
:
divides protocol from rest of url
//domain.com
indicates what server to load
/whatever/you/like/here
Between the domain and the ? should indicate which page to load.
?
divides query string from rest of url
q=search_terms
Between the ? and the # can be used for a dynamic search query or setting.
#
divides page fragment from rest of the url
page-frament
Between the # and the end of line indicates which part of the page to focus on.
If your system setup lets you, a system like this is probably the most human friendly:
domain.com
domain.com/Worthington
domain.com/Worthington/TX193A
However, sometimes a unique ID is needed to ensure there is no ambiguity (with SO, there might be multiple questions with the same title, thus why ID is included, whilst the question is included because it's easier for humans that way).
Since all models must belong to a brand, you don't need both ID numbers though, so you can use something like this:
domain.com
domain.com/123/Worthington
domain.com/456/Worthington/TX193A
(where 123 is the brand number, and 456 is the model number)
You only need extra things (like /questions/ or /index.cfm or /brand.cfm or whatever) if you are unable to disambiguate different pages without them.
Remember: this part of the URL identifies the page - it needs to be possible to identify a single page with a single URL - to put it another way, every page should have a unique URL, and every unique URL should be a different page. (Excluding the query string and page fragment parts.)
Again, using the SO example - there are more than just questions here, there are users and tags and so on too. so they couldn't just do stackoverflow.com/7275745/question-title because it's not clearly distinct from stackoverflow.com/651924/evik-james - which they solve by inserting /questions and /users into each of those to make it obvious what each one is.
Ultimately, the best URL system to use depends on what pages your site has and who the people using your site are - you need to consider these and come up with a suitable solution. Simpler URLs are better, but too much simplicity may cause confusion.
Hopefully this all makes sense?
Here is an answer based on what I know about SEO and what we have implemented:
The first thing that get searched and considered is your domain name, and thus picking something related to your domain name is very important
URL with query string has lower priority than the one that doesn't. The reason is that query string is associated with dynamic content that could change over time. The search engine might also deprioritize those with query string fearing that it might be used for SPAM and diluting the result of SEO itself
As for using the URL such as
http://stackoverflow.com/questions/7113295/sql-should-i-use-a-junction-table-or-not
As the search engine looks at both the domain and the path, having the question in the path will help the Search Engine and elevate the question as a more relevant page when someone typing part of the question in the search engine.
I am not an SEO expert, but the company I work for has a dedicated dept to managing the SEO of our site. They much prefer the params to be in the URI, rather than in the query string, and I'm sure they prefer this for a reason (not simply to make the web team's job slightly trickier... all though there could be an element of that ;-)
That said, the bulk of what they concern themselves with is the content within and composition of the page. The domain name and URL are insignificant compared to having good, relevant content in a well defined structure.

How do you structure a restful route with several GET constraints?

Suppose you are working on an API, and you want nice URLs. For example, you want to provide the ability to query articles based on author, perhaps with sorting.
Standard:
GET http://example.com/articles.php?author=5&sort=desc
I imagine a RESTful way of doing this might be:
GET http://example.com/articles/all/author/5/sort/desc
Am I correct? Or have I got this REST thing all wrong?
I'm afraid your question really misses the point of REST. From a purely theoretical perspective there is absolutely no advantage or disadvantage to either of those urls from a REST perspective. In practice, those urls may behave differently with different caches, and certainly server frameworks are going to parse them differently. Despite what you hear from the framework developers, there is no such thing as a RESTful URL.
From the perspective of REST those two URLs are simply identifiers that can be dereferenced. If you want to start building REST apis that will benefit from the characteristics described in the dissertation, you need to start thinking in terms of content that is returned when you dereference the URL and how that content is linked together using URLs embedded in the content.
I realize this does not help you much in trying to resolve what you consider to be your problem. What I can tell you is that one of the major intents of REST is to allow your URLs to be completely under the control of the server and can change without impacting your client applications. Therefore, my recommendation is to pick whatever url structure works most easily with the framework you are using to serve the resource representations. Certainly do not look to the REST dissertation to tell you what is the right and wrong way of formatting your URLs and anyone who tells you that your URLs are not RESTful is confused. Probably what they are telling you is the server framework, they are used to using for creating RESTful interfaces, requires URLs to be structured this way.
It's not what your URI looks like that matters, it is what you do with it that matters.
Using a query string is not more or less RESTful than using path components. The URI Generic Syntax (RFC 3986, January 2005) defines that they're just as important in identifying the resource. So yes, as others point out, it's not important to REST. (Note that in the obsoleted-by-RFC-3986 RFC 2396, the query string was not defined to be identifying the resource, but rather a string of information to be interpreted by the resource.)
However, URI design is important, because as an owner of a URI namespace (i.e. the holder of the domain name where the URIs will live) you want the URIs to be long lived. As wise men have stated earlier: Cool URIs don't change!
The choice of using query strings vs path components depends on how your resources are identified, and how they will be identified in years to come. If there's a hierarchy that stands out, then it might be that this should be reflected in the URI, at least if that hierarchy is relatively permanent, and that things don't move around all the time.
It's also important to note that the actual URIs are only meaningful to two parties:
Servers, who need to forge and parse URIs
Human beings who might see a URI in passing might learn things from the URI.
By contrast, client applications are usually not allowed to do URI introspection. So your choice of query strings vs path components boils down to what you think you can live with ten (or 100) years from now.
You are mostly right. The thing with REST api's is to focus on the nouns.
What does the noun all do in this case? Wouldn't you expect your API to always return all articles, unless you filter it?
I would make sort a query string parameters, further, I would make any and all filtering query string parameters. If you look at how Stack is implemented when you click on the "Newest" questions link, you get a query string to filter the questions.
So perhaps something like:
GET http://example.com/aritcles/authors/5?sort=desc
But also think about what happens with each URL:
GET http://example.com/aritcles/ might return all current articles
GET http://example.com/aritcles/authors/ What does this url do? does it return all authors of all articles, or does it return all the articles for all authors (which is essentially the same functionality of the URL above.)
GET http://example.com/aritcles/authors/5/ might return all articles by author 5, or does it return author 5's information?
I would maybe change it to:
http://example.com/aritcles returns all articles
http://example.com/aritcles/5 returns all articles from author 5
http://example.com/authors returns all authors
http://example.com/authors/5 returns information for author 5
Alan is mostly right but his URLs are misleading. I believe the correct routes / urls should reflect the following behavior:
[GET] http://domain.com/articles #=> returns all articles (index action)
[GET] http://domain.com/articles/5 #=> returns article ID 5 (show action)
[GET] http://domain.com/authors/#=> returns all authors (index action)
[GET] http://domain.com/authors/5 #=> returns author ID 5 (show action)
[GET] http://domain.com/authors/5/articles OR http://domain.com/articles/authors/5 #=> depending on the hierarchy of your routes (both belong to the index action)
Best regards,
DBA

What to use for space in REST URI?

What should I use:
/findby/name/{first}_{last}
/findby/name/{first}-{last}
/findby/name/{first};{last}
/findby/name/first/{first}/last/{last}
etc.
The URI represents a Person resource with 1 name, but I need to logically separate the first from the last to identify each. I kind of like the last example because I can do:
/findby/name/first/{first}
/findby/name/last/{last}
/findby/name/first/{first}/last/{last}
You could always just accept spaces :-) (querystring escaped as %20)
But my preference is to just use dashes (-) ... looks nicer in the URL. unless you have a need to be able to essentially query in which case the last example is better as you noted
Why not use + for space?
I am at a loss: dashes, minuses, underscores, %20... why not just use +? This is how spaces are normally encoded in query parameters. Yes, you can use %20 too but why, looks ugly.
I'd do
/personNamed/Joe+Blow
I like using "_" because it is the most similar character to space that keeps the URL readable.
However, the URLs you provided don't seem really RESTful. A URL should represent a resource, but in your case it represents a search query. So I would do something like this:
/people/{first}_{last}
/people/{first}_{last}_(2) - in case there are duplicate names
It this case you have to store the slug ({first}_{last}, {first}_{last}_(2)) for each user record. Another option to prepend the ID, so you don't have to bother with slugs:
/people/{id}-{first}_{last}
And for search you can use non-RESTful URLs:
/people/search?last={last}&first={first}
These would display a list of search results while the URLs above the page for a particular person.
I don't think there is any use of making the search URLs RESTful, users will most likely want to share links to a certain person's page and not search result pages. As for the search engines, avoid having the same content for multiple URLs, and you should even deny indexing of your search result pages in robots.txt
For searching:
/people/search?first={first}&last={last}
/people/search?first=george&last=washington
For resource paths:
/people/{id}-{first}-{last}
/people/35-george-washington
If you are using Ruby on Rails v3 in standard configuration, here's how you can do it.
# set up the /people/{param} piece
# config/routes.rb
My::Application.routes.draw do
resources :people
end
# set up that {param} should be {id}-{first}-{last}
# app/models/person.rb
class Person < ActiveRecord::Base
def to_param
"#{id}-#{to_slug(first_name)}-#{to_slug(last_name)}"
end
end
Note that your suggestion, /findby/name/first/{first}/last/{last}, is not restful. It does not name resources and it does not name them succinctly.
The most sophisticated choice should always and first of all consider two constraints:
As you'll never know how skilled the developer or the device being implemented on is regarding handling of urlencoding, i will always try to limit myself to the table of safe characters, as found in the excellent rant (Please) Stop Using Unsafe Characters in URLs
Also - we want to consider the client consuming the API. Can we have the whole structure easily represented and accessible in the client side programming language? What special characters would this requirement leave us with? I.e. a $ will be fine in javascript variable names and thus directly accessible in the parsed result, but a PHP client will still have to use a more complex (and potentially more confusing) notation $userResult->{'$mostVisited'}->someProperty... that a shot in your own foot! So for those two (and a couple of other programming environments) underscore seems the only valid option.
Otherwise i mostly agree with #yfeldblum`s response - i'd distinct between a search endpoint vs. the actual unique resource lookup. Feels more REST to me, but more importantly, the two have a significant cost difference on your api server - this way you can easier distinct and i.e. charge a higher costs or rate limit the search endpoint - should you ever need it.
To be Pragmatic, as opposed to a "RESTafarian" the mentioned approach /people/35-george-washington could (and should imho) basically respond to just the id, so if you want a named, urlsafe-for-dummies-link, list the reference as /people/35_george_washington. Other ideas could be /people/35/#GeorgeWashington (so breaking tons of RFCs) or /people/35_GeorgeWashington - the API wouldn't care.

Resources