What is a composite URI? - url

I have read the term "composite URI", but I'm not familiar with what it means. Is there such a thing?
I have a CompositeUri class in an application I'm building and I would like to know if I should choose a different name.

Composite generally means something made up on a few things. For example, composite keys in databases are made up from multiple fields, which together guarentee uniqueness.
A composite URL could be anything that fits that description, but would maybe be like the SO urls which include what your viewing (question), it's ID and some friendlier string.

In ShrinkWrap, a composite URI is one that refers to an aggregate of multiple CSS or javascript files e.g.
http://example.com/css/hello.css+world.css
That composite URI denotes the contents of http://example.com/css/world.css appended to the contents of http://example.com/css/hello.css.

Related

Any problems with using a period in URLs to delimiter data?

I have some easy to read URLs for finding data that belongs to a collection of record IDs that are using a comma as a delimiter.
Example:
http://www.example.com/find:1%2C2%2C3%2C4%2C5
I want to know if I change the delimiter from a comma to a period. Since periods are not a special character in a URL. That means it won't have to be encoded.
Example:
http://www.example.com/find:1.2.3.4.5
Are there any browsers (Firefox, Chrome, IE, etc) that will have a problem with that URL?
There are some related questions here on SO, but none that specific say it's a good or bad practice.
To me, that looks like a resource with an odd query string format.
If I understand correctly this would be equal to something like:
http://www.example.com/find?id=1&id=2&id=3&id=4&id=5
Since your filter is acting like a multi-select (IDs instead of search fields), that would be my guess at a standard equivalent.
Browsers should not have any issues with it, as long as the application's route mechanism handles it properly. And as long as you are not building that query-like thing with an HTML form (in which case you would need JS or some rewrites, ew!).
May I ask why not use a more standard URL and querystring? Perhaps something that includes element class (/reports/search?name=...), just to know what is being queried by find. Just curious, I knows sometimes standards don't apply.

Resolve prefix programmatically, Jena

I have to parse through xml which contain URI links to dbpedia.org. I have to extract rdf triples from those URI based on a given Ontology using Jena library. How do I resolve the Prefix programmatically in Java based on the ontology given.
The given ontology says that triples can be extracted by querying dbpedia.org. For all such triples the corresponding dbpedia resource is available to start writing the query. But the problem is how do I write the query with only its resource available. I have the properties to query. But I don't have the PREFIX for those properties
Although this may not answer the question directly, I had a whole load of URIs, some prefixed some not and I wanted them all unprefixed (i.e. written out in full with their prefixes resolved). Searching Google the most useful thing I came across was this question (first) and the JavaDoc so I thought I'd add my experience to this question to help anyone else who might be searching for the same thing.
Jena's PrefixMap interface (which Model implements) has expandPrefix and qnameFor methods. The expandPrefix method is the one which helped me (qnameFor does the reverse i.e. it applies a prefix from a PrefixMap to a string and returns null if no such mapping can be found).
Hence for any resource, to ensure that you have a fully expanded URI you can do
myRes.getModel().expandPrefix(myRes.getURI());
Hope this helps someone.
Your question is not very clear, so if this answer doesn't address your issue please edit your question to say more clearly what your problem is. However, based on what you've asked, once you've read an RDF file into a Jena model, in XML or any other encoding, the prefixes used are available through the methods in the interface com.hp.hpl.jena.shared.PrefixMapping (see javadoc). A Model object implements that interface. To automatically expand prefix "foo", use the method getNsPrefixURI().
Edit
OK, given your revised question, there's a number of things you can do to turn a simple property name into a property URI that you can use in a SPARQL query:
use the prefix.cc service to look at possible expansions of prefixes and prefix names (e.g. if you are given dbpedia:elevation, you can look it up on prefix.cc (i.e: http://prefix.cc/dbpedia:elevation) to see that one of the possible expansions is http://dbpedia.org/ontology/elevation
issue a SPARQL describe query on the resource URI to see which properties are returned in the RDF description, then match those to the un-prefixed property names you've been given
ask your data provider to give you full property names, or otherwise provide the prefix expansions, in order to save you from having to reverse engineer which properties he or she meant in the first place.
Personally I'd advocate the third option if that's at all possible.

What to use for space in REST URI?

What should I use:
/findby/name/{first}_{last}
/findby/name/{first}-{last}
/findby/name/{first};{last}
/findby/name/first/{first}/last/{last}
etc.
The URI represents a Person resource with 1 name, but I need to logically separate the first from the last to identify each. I kind of like the last example because I can do:
/findby/name/first/{first}
/findby/name/last/{last}
/findby/name/first/{first}/last/{last}
You could always just accept spaces :-) (querystring escaped as %20)
But my preference is to just use dashes (-) ... looks nicer in the URL. unless you have a need to be able to essentially query in which case the last example is better as you noted
Why not use + for space?
I am at a loss: dashes, minuses, underscores, %20... why not just use +? This is how spaces are normally encoded in query parameters. Yes, you can use %20 too but why, looks ugly.
I'd do
/personNamed/Joe+Blow
I like using "_" because it is the most similar character to space that keeps the URL readable.
However, the URLs you provided don't seem really RESTful. A URL should represent a resource, but in your case it represents a search query. So I would do something like this:
/people/{first}_{last}
/people/{first}_{last}_(2) - in case there are duplicate names
It this case you have to store the slug ({first}_{last}, {first}_{last}_(2)) for each user record. Another option to prepend the ID, so you don't have to bother with slugs:
/people/{id}-{first}_{last}
And for search you can use non-RESTful URLs:
/people/search?last={last}&first={first}
These would display a list of search results while the URLs above the page for a particular person.
I don't think there is any use of making the search URLs RESTful, users will most likely want to share links to a certain person's page and not search result pages. As for the search engines, avoid having the same content for multiple URLs, and you should even deny indexing of your search result pages in robots.txt
For searching:
/people/search?first={first}&last={last}
/people/search?first=george&last=washington
For resource paths:
/people/{id}-{first}-{last}
/people/35-george-washington
If you are using Ruby on Rails v3 in standard configuration, here's how you can do it.
# set up the /people/{param} piece
# config/routes.rb
My::Application.routes.draw do
resources :people
end
# set up that {param} should be {id}-{first}-{last}
# app/models/person.rb
class Person < ActiveRecord::Base
def to_param
"#{id}-#{to_slug(first_name)}-#{to_slug(last_name)}"
end
end
Note that your suggestion, /findby/name/first/{first}/last/{last}, is not restful. It does not name resources and it does not name them succinctly.
The most sophisticated choice should always and first of all consider two constraints:
As you'll never know how skilled the developer or the device being implemented on is regarding handling of urlencoding, i will always try to limit myself to the table of safe characters, as found in the excellent rant (Please) Stop Using Unsafe Characters in URLs
Also - we want to consider the client consuming the API. Can we have the whole structure easily represented and accessible in the client side programming language? What special characters would this requirement leave us with? I.e. a $ will be fine in javascript variable names and thus directly accessible in the parsed result, but a PHP client will still have to use a more complex (and potentially more confusing) notation $userResult->{'$mostVisited'}->someProperty... that a shot in your own foot! So for those two (and a couple of other programming environments) underscore seems the only valid option.
Otherwise i mostly agree with #yfeldblum`s response - i'd distinct between a search endpoint vs. the actual unique resource lookup. Feels more REST to me, but more importantly, the two have a significant cost difference on your api server - this way you can easier distinct and i.e. charge a higher costs or rate limit the search endpoint - should you ever need it.
To be Pragmatic, as opposed to a "RESTafarian" the mentioned approach /people/35-george-washington could (and should imho) basically respond to just the id, so if you want a named, urlsafe-for-dummies-link, list the reference as /people/35_george_washington. Other ideas could be /people/35/#GeorgeWashington (so breaking tons of RFCs) or /people/35_GeorgeWashington - the API wouldn't care.

Is this RESTful?

I have a Rails app that needs to expose values from a database as a web service - since I'm using Rails 2.x, I'm going with REST (or at least try). Assuming my resource is Bananas, for which I want to expose several sub-characteristics, consider this:
- /banana -> give a summary of the first 10 bananas, in full (all characteristics)
- /banana/?name=<name> -> give all characteristics for banana named <name>
- /banana/?number=<number> -> give all characteristics for banana number <number>
- /banana/?name=<name>/peel -> give peel data for banana named <name>
- /banana/?number=<number>/length -> give length data for banana number <number>
I don't want to search for ID, only name or number. And I have about 7 sub-characteristics to expose. Is this RESTful?
Thanks for any feedback!
What Wahnfrieden is talking about is something called Hypermedia as the Engine of Application State (HATEOAS) - a central constraint of REST as defined by Fielding.
In a nutshell, REST application clients never construct URIs themselves. Instead, they follow URIs provided by the application. So, URI templates such as the ones you're asking about are irrelevent at best. You can make them conform to a system if you'd like, but REST says nothing about how your URIs need to look. You could, if you wanted to, arrange it so that every resource in your system was available from http://example.com/{hash}.
Publishing URI templates, such as the ones you're talking about in your question, introduces tight coupling between your application and clients - something REST is trying to prevent.
The problem with understanding hypermedia-driven applications is that almost nobody implements or documents their "RESTful" systems this way.
It might help to think about the interaction between a human and server via a browser. The human only knows about content and links that the server provides through the browser. This is how a RESTful system should be built. If your resources aren't exposing links, they're probably not RESTful.
The advantage is that if you want to change your URI system, for example, to expose the Banana "Peel" attribute through a query parameter instead of a nested URL, you can do it anytime you'd like and no client code needs to be changed because they're not constructing links for themselves.
For an example of a system that embraces the hypertext-driven constraint in REST, check out the Sun Cloud API.
I would use these:
/banana
/banana/blah
/banana/123
/banana/blah/peel (and /banana/123/peel)
/banana/blah/length (and /banana/123/length)
First, common practice for ReSTful URIs is /object_name/id/verb, with some of those absent (but in that order). Of course, this is neither required nor expected.
If all your names aren't made of digits, you don't have to explicitly have name in /banana/name/blah. In fact, if anything, it would be better to have id as identifier: /banana/id/123/peel. Hope this helps.
Parameters should only be used for form submission.
Also, URI naming schemas is totally unrelated to REST. The point of REST is to make related resources discoverable via hypertext, not out-of-band conventions, and only from a limit number of entry points. So your /bananas/ entry point might provide the summary info for 10 bananas, but it must also provide the URI for each of those bananas' details resources, as well as the URI to get the summary for the next 10 bananas. Anything else is just RPC.
It is good practice in REST to not use query parameters because query parameters donĀ“t belong to a URL and in REST all resources should be addressable through a URL.
In your example /banana/?name=name should be /banana/name because you are referring a concrete resource.
Even I think /banana/?number=number/length is not good REST style, because you are selecting an attribute through a URL when you should retrieve the whole state with /banana/name . A difference could be /customers/1024/address to get the Customer 1024 address record.
HTH.
A more opt form for the route in url having query string is the plural form, as it is possible that multiple items are returned in the result. In this case, bananas, like bananas?color=yellow, sounds more appropriate.
On the other hand, the singular form banana, like banana/123, is good when fetching a specific resource's representation when its identifier is known and query string is not required.

How to avoid conflict when not using ID in URLs

I see often (rewritten) URLs without ID in it, like on some wordpress installations. What is the best way of achieve this?
Example: site.com/product/some-product-name/
Maybe to keep an array of page names and IDs in cache, to avoid DB query on every page request?
How to avoid conflicts, and what are other issues on using urls without IDs?
Using an ID presents the same conundrum, really--you're just checking for a different value in your database. The "some-product-name" part of your URL above is also something unique. Some people call them slugs (Wordpress, also permalinks). So instead of querying the database for a row that has the particular ID, you're querying the database for a row that has a particular slug. You don't need to know the ID to retrieve the record.
As long as product names are unique it shouldn't be an issue. It won't take any longer (at least not significant) to look up a product by unique name than numeric ID as long as the column is indexed.
Wordpress has a field in the wp_posts table for the slug. When you create the post, it creates a slug from the post title (if that's how you have it configured), replacing spaces with dashes (or I think you can set it to underscores). It also takes out the apostrophes, commas, or whatnot. I believe it also limits the overall length of the slug, too.
So, in short, it isn't dynamically decoding the URL into the post's title--there's a field in the table that matches the URL version of the post name directly.
As you may or may not know, the URLs are being re-written with Apache's mod_rewrite module. As mentioned here, Wordpress is, in the background, assigning a slug after sanitizing the title or post name.
But, to answer your question, what you're describing is Wordpress' "Pretty Permalinks" feature and you can learn more about it in the Wordpress codex. Newer versions of Wordpress do the re-writing internally (no .htaccess editin, wp_rewrite instead). Which is why you'll see the same ruleset for any permalink structure.
Though, if you do some digging you can find the old rewrite rules. For example:
RewriteRule ^([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/?$ /index.php?year=$1&monthnum=$2&day=$3 [QSA,L]
Will take a URL like /2008/01/01/ and direct it to /index.php?year=2008&monthnum=01&day=01 (and load a date category).
But, as mentioned, a page like product-name exists only because Wordpress already sanitized the post title and stored it as a field in the database.

Resources