Understanding the abstract jargon in REST - ruby-on-rails

Not sure whether this is the right place to ask these basic questions about REST...
The more I try to understand REST and want to implement it, the more confused I get. I think that's probably because of the abstract words people use when they come to teach you what REST is. For instance,
REST from Wikipedia:
Representational state transfer (REST) is an abstraction of the
architecture of the World Wide Web; more precisely, REST is an
architectural style consisting of a coordinated set of architectural
constraints applied to components, connectors, and data elements,
within a distributed hypermedia system.
And from this link
Representations of resources should be uniform
Use hypermedia—not URL schemes or resource name mapping—to represent relationships
Use a single entry to an API, hyperlink from there (hyperlink? why not hypermedia??)
Include a “self” link in your representations
REST doesn’t just mean spitting out ActiveRecord models as JSON (so what should you spit out??)
So, what is meant by all this abstract jargon below?
Representations,
resources,
URL schemes,
hypermedia (what the difference of it from hypertext or hyperlink???),
resource name mapping,
ActiveRecord models
Can you explain them with concrete examples?

Representation is the name given to the complete set of bytes that are sent over the wire when you make a request to a resource using a link.
A resource is any concept that is important to your application and that you wish to expose to a client application. A resource is identified by a URL.
Technically, a URL scheme is the first few bytes of a URL that appear before the colon, e.g. http, urn, ftp, file, etc.
However, I suspect that in the context where you are seeing it, it means a convention for organizing the path segments and query string parameters of a URL and assigning a significance to those pieces of the URL.
From the perspective of REST, a server can create URL conventions, but clients should know nothing about those conventions.
Hypermedia is a classification of media types that support hyperlinks. Hypertext is an older term that is almost synonymous with hypermedia. In theory hypermedia does not need to be a text-based format, but so far I'm not aware of any hypermedia formats that are not hypertext.
Hyperlink, link and web link are synonymous.
Unfortunately these terms are abstract by definition. There are many different ways to implement these concepts.
ActiveRecord is an implementation concept that is often used to contain the data that will be used to build a representation of a resource. However, if you limit yourself to only being able to create resources from ActiveRecord instances, you will likely struggle to implement an effective REST API.
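To tie those definitions to something concrete, here is one hypothetical exchange (the resource, fields and URLs are invented for illustration). The URL identifies the resource, the bytes in the response body are one representation of it, and the _links entries are the hypermedia, including the "self" link:

    GET /orders/42 HTTP/1.1
    Accept: application/hal+json

    HTTP/1.1 200 OK
    Content-Type: application/hal+json

    {
      "status": "open",
      "total": "19.99",
      "_links": {
        "self":    { "href": "https://api.example.com/orders/42" },
        "payment": { "href": "https://api.example.com/orders/42/payment" }
      }
    }

The same resource could also be served as XML or HTML; those would simply be different representations of it, negotiated through the Accept header.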
Representations of resources should be uniform - This makes no sense to me. There is a REST constraint called "Uniform Interface", but that constraint refers to using a consistent interface to enable building layered applications: when a client wants to talk to a server, you can slip a proxy, load balancer or cache in between, and the client and server don't know the difference because the interface is consistent.
Use hypermedia - Hypermedia helps you to decouple your client and server so that either can evolve independently. If this is not important to you (i.e. you can always deploy your client JavaScript at the same time as the website) then you are not going to get a whole lot of value out of it.
Use a single entry to an API - See previous point. If you need independent evolvability then explore some more.
Include a “self” link in your representations - And again...
REST doesn’t just mean spitting out ActiveRecord models as JSON - REST is an architectural style for the boundary of your application. Exposing ActiveRecords as resources ties your client to implementation details of your server. Also, HTTP is limited to just a few methods (GET, PUT, POST, DELETE, OPTIONS, HEAD). To compensate for this limited vocabulary you will need to create resources that have nothing to do with records in a database.
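As a sketch of that last point in Rails terms (all names here are invented): a "checkout" can be exposed as a resource you POST to, even though no checkouts table or ActiveRecord model backs it.

    # app/controllers/checkouts_controller.rb
    # Sketch only: "checkout" is a resource in its own right, created via
    # POST /orders/:order_id/checkout, with no "checkouts" table behind it.
    class CheckoutsController < ApplicationController
      def create
        order = Order.find(params[:order_id])
        order.submit!  # hypothetical domain method, not a column update
        render json: {
          status: order.status,
          _links: { self: { href: order_url(order) } }
        }, status: :created
      end
    end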

Many are quick to point out that "REST," as typically used, isn't really REST, as originally defined.
There's endless (and pointless) discussion and debate around this topic, but from a practical standpoint I prefer to think about "resources," and how best to expose them.
The concept of REST may be instructional in this regard, even if common implementations diverge from the source.
In the end, we're programmers and have to build things that meet today's objectives, knowing they won't be tomorrow's.

Related

Zanzibar doubts about Tuple + Check API (authzed/spicedb)

We currently have a home-grown authz system in production that uses the OPA/Rego policy engine as its core for decision making (close to what Netflix has done). We've been looking at the Zanzibar ReBAC model to replace our OPA/policy-based decision engine, and AuthZed got our attention. Looking further at AuthZed, we like the idea of defining a schema of "resource + subject" types and their relations (like an OOP model). We like the simplicity of using a social graph between resource and subject to answer questions. But the more we dig in and think about real usage patterns, the more questions and gaps in clarity we run into. I've put those thoughts down below; hope it's not confusing...
[Doubts/Questions]
[tuple-data] Resource data/metadata must be continuously added into the authz system in the form of tuple data.
e.g. doc{org,owner} must be added as a tuple to populate the relation in the decision graph. Assume I'm a CMS system: am I expected to insert (or update) tuples in the authz engine for every single doc created in my CMS system, for its whole lifetime?
Resource-owning applications are kept on the hook (responsible) for continuous keep-it-current updates.
What about old/stale relation data (tuples)? The authz engine doesn't know whether they are stale or not - is it the app's burden to tidy them up?
[check-api] An authz check is answered by a graph-walking mechanism - a [resource --to--> subject] traversal path.
There is no dynamic element in the decision making - nothing like a Rego rule script deciding based on a JSON payload.
How do we make dynamic decisions based on a JSON payload?
You're correct about the application being responsible for the authorization data it "owns". If you intend to have a unique role/relationship for each document in your system, then you do need to write/delete those relationships as the referenced resources (or the roles on them, more likely) change, but if you are using an RBAC-like design for your schema, you'd have to apply these role changes anyway; you'd just apply them to SpiceDB, instead of to your database. Likewise, if you have a relationship between say, a document and its parent organization, you do have to write/delete those as well, but that should only occur when the document is created or deleted.
In practice, unless you intend to keep the relationships in both your database and in SpiceDB (which some users do), you'll generally only have to write them to one or the other. If you do intend to apply them to both, you can either just perform the updates to both at the same time, or use an outbox-like pattern to synchronize behind the scenes.
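A minimal sketch of that outbox-like pattern in Rails terms (the outbox table, models and client call are hypothetical, not SpiceDB's actual Ruby API):

    # 1. In the same transaction that creates the document, queue the
    #    relationship write so it cannot be lost.
    Document.transaction do
      doc = Document.create!(title: "Q3 report", org_id: org.id)
      AuthzOutbox.create!(
        resource: "document:#{doc.id}",
        relation: "parent_org",
        subject:  "organization:#{org.id}"
      )
    end

    # 2. A background worker drains the outbox into SpiceDB, retrying
    #    until each write succeeds, then marks the entry as applied.
    AuthzOutbox.pending.find_each do |entry|
      spicedb.write_relationship(entry.resource, entry.relation, entry.subject)
      entry.update!(applied: true)
    end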
Having to be proactive in your applications about storing data in a centralized system is necessary for data consistency. The alternative is federated systems that reach into other services. Federated systems come with the trade-off of being eventually consistent and can also suffer from priority inversion. I covered the centralized vs. federated trade-offs in a bit of depth, along with other design aspects of authorization systems, in my presentation on the cloud-native authorization landscape.
Caveats are a new feature in SpiceDB that enable dynamic policy to be enforced on the relationship graph. Caveats are defined using Google's Common Expression Language (CEL), which is a language used for policy in other cloud-native projects like Kubernetes. You can also use caveats to create relationships that eventually expire, if you want to take some of the book-keeping out of your app code.
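For illustration, a caveat in SpiceDB's schema language looks roughly like this (the names are invented; the shape is adapted from the documented caveat syntax):

    caveat within_quota(used int, limit int) {
      used < limit
    }

    definition user {}

    definition document {
      relation viewer: user | user with within_quota
      permission view = viewer
    }

At check time the caller supplies the caveat context (e.g. used and limit as a JSON object), and the CEL expression decides whether the relationship counts - which is how a dynamic, payload-driven decision gets folded into the graph walk.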

Is there a listing of known whois query output formats?

TL;DR: I need a source for as many different output formats from a whois query as possible.
Background:
I am looking for a single reference that can provide as many (if not all) unique whois query output formats as possible.
I don't believe this exists but hope to be proven wrong.
This appears to be an age-old problem
This stackoverflow post from 2015 references the challenge of handling the "~40 formats" that the author was aware of.
The author never detailed any of these formats.
The RFC for whois is... depressing
The IETF ran an analysis in 2015 that examined the components of whois per each RIR at the time
In my own research I see that registries like JPNIC do not appear to comply with the APNIC standards
I am aware of existing tools that do a bang-up job parsing whois (python-whois, for example); however, I'd like to hedge my bets against outliers with odd formats. I'm also open to possible approaches for gathering this information, though that would likely be too broad to fit this question.
Hoping there is a simple "go here and download this" answer. Hoping...
"TL;DR: I need a source for as many different output formats from a whois query as possible."
There isn't, unless you use some kind of provider that does this for you, with whatever caveats.
Or, more precisely, there isn't anything public, maintained and exhaustive. You can find various libraries that try to do this, in various languages, but none is complete, as this is basically an impossible task, especially if you want to include all TLDs, such as ccTLDs (you are not framing your constraint space in a very detailed way, nor in fact saying whether you are asking about domain name data in whois or IP address/ASN data).
Some providers of course try to do that, offering you an abstract, uniform API. But why would anyone share their internal secret sauce, that is, their list of parsers and so on? There is no business incentive to do that.
As for open-source library authors (I was one at some point), it is just tedious and absolutely not rewarding to keep updating a library forever with all the new formats and tweaks per registry (battle-scar example: one registrar in the past changed its output format on each query! One query gave you somefield: somevalue while the next time it was somefield:somevalue or somefield somevalue, etc. And of course that is only a simple example).
RFC 3912 specified just the transport part, not the content, hence a lot of variation appeared. Specifically in the ccTLD world, each registry is king in its kingdom and is free to implement whatever it wants the way it wants. The protocol also has some serious limitations (e.g. internationalization: what is the "charset" used for the underlying data?) that were circumvented in different ways (like passing "options" in your query... of course none of them standardized in any way).
At the very least, the gTLD whois format is specified here:
https://www.icann.org/resources/pages/approved-with-specs-2013-09-17-en#whois
Note however that due to GDPR there have been changes (see https://www.icann.org/resources/pages/gtld-registration-data-specs-en/#temp-spec) and there will be further changes in the future.
However, you should be strongly encouraged to look at RDAP instead of whois.
RDAP is now a requirement in all gTLDs, for registries and registrars alike. As it is JSON, it immediately solves the problem of format.
Its core specifications are:
RFC 7480 HTTP Usage in the Registration Data Access Protocol (RDAP)
RFC 7481 Security Services for the Registration Data Access Protocol (RDAP)
RFC 7482 Registration Data Access Protocol (RDAP) Query Format
RFC 7483 JSON Responses for the Registration Data Access Protocol (RDAP)
RFC 7484 Finding the Authoritative Registration Data (RDAP) Service
You can find various libraries doing RDAP for you (see below for links), but at its core it is JSON over HTTPS, so you can emulate simple cases with any kind of HTTP client library.
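For example, a minimal sketch in Ruby (error handling omitted; it uses the public rdap.org redirector, which bootstraps the query to the authoritative server per RFC 7484):

    # Minimal RDAP lookup sketch: rdap.org redirects to the authoritative
    # RDAP server for the queried domain, which answers in JSON (RFC 7483).
    require 'open-uri'
    require 'json'

    data = JSON.parse(URI.open('https://rdap.org/domain/example.com').read)
    puts data['ldhName']                        # domain name in LDH form
    Array(data['events']).each do |event|
      puts "#{event['eventAction']}: #{event['eventDate']}"
    end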
Work is underway to fix some missing or insufficiently precise details in RFCs 7482 and 7483.
You also need to take into account the ICANN specifications (again, only for gTLDs of course):
https://www.icann.org/en/system/files/files/rdap-technical-implementation-guide-15feb19-en.pdf
https://www.icann.org/en/system/files/files/rdap-response-profile-15feb19-en.pdf
Note that, right now, even though it is an ICANN requirement, you will find a lot of missing or broken gTLD registry and registrar RDAP servers. You will also find a lot of "deviations" in replies from what would be expected per the specification.
I gave full details in various other questions here, so maybe have a look:
https://stackoverflow.com/a/61877920/6368697
https://stackoverflow.com/a/48066735/6368697
https://webmasters.stackexchange.com/a/115605/75842
https://security.stackexchange.com/a/213854/137710
https://serverfault.com/a/999095/396475
PS: a philosophical point on "Hoping there is a simple "go here and download this" answer. Hoping...": a lot of people have hoped for that in the past (see the remark at the beginning). Let us imagine you go forward and build this magnificent resource with all the exhaustive details. Would you be inclined to just share it with anyone, for free? The answer is probably no, for obvious reasons. The same happened to others who went down this path before you, hence the current situation: various providers offering you more or less this service (you would need to find details on which formats are parsed, the rate limits, the prices, etc.), but nothing freely available to share.
Now you can just dream/hope that every registry and registrar switches to RDAP AND implements it properly. Then the problem of format is solved once and for all. However, those two requirements ("every" + "properly") are not small, and may not happen "soon". This is especially true for ccTLDs, where registries are in no way mandated by any external force (except market pressure?) to implement RDAP at all.

How to implement OData federation for Application integration

I have to integrate various legacy applications with some newly introduced parts that are silos of information, built at different times with varying architectures. At times these applications may need to get data from another system, if it exists, and display it to the user within their own screens based on the business needs.
I was looking to see if it's possible to implement a generic federation engine that abstracts the aggregation of data from various other OData endpoints and provides a single version of truth.
A simplistic example could be as below.
I am not really looking to do an ETL here, as that may introduce data-related side effects in terms of staleness etc.
Can someone share some ideas as to how this can be achieved, or point me to any article on the net that demonstrates such a concept.
Regards
Kiran
Officially, the answer is to use either the reflection provider or a custom provider.
Support for multiple data sources (odata)
Allow me to expose entities from multiple sources
To decide between the two approaches, take a look at this article.
If you decide that you need to build a custom provider, the referenced article also contains links to a series of other articles that will help you through the learning process.
Your project seems non-trivial, so in addition I recommend looking at other resources like the WCF Data Services Toolkit to help you along.
By the way, from an architecture standpoint, I believe your idea is sound. Yes, you may have some domain logic behind OData endpoints, but I've always believed this logic should be thin as OData is primarily used as part of data access layers, much like SQL (as opposed to service layers which encapsulate more behavior in the traditional sense). Even if that thin logic requires your aggregator to get a little smart, it's likely that you'll always be able to get away with it using a custom provider.
That being said, if the aggregator itself encapsulates a lot of behavior (as opposed to simply aggregating and re-exposing raw data), you should consider using another protocol that is less data-oriented (but keep using the OData backends in that service). Since domain logic is normally heavily specific, there's very rarely a one-size-fits-all type of protocol, so you'd naturally have to design it yourself.
However, if the aggregated data is exposed mostly as-is or with essentially structural changes (little to no behavior besides assembling the raw data), I think using OData again for that central component is very appropriate.
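To sketch what such a thin central component could look like in Ruby (the endpoints and fields are invented; this assumes OData JSON responses that return results in a value array):

    # Sketch of a read-time aggregator: pull the same logical entity
    # from two OData silos and merge, rather than copying via ETL.
    require 'net/http'
    require 'json'

    def odata_fetch(url)
      JSON.parse(Net::HTTP.get(URI(url))).fetch('value', [])
    end

    crm     = odata_fetch('https://crm.example.com/odata/Customers?$filter=Id%20eq%2042')
    billing = odata_fetch('https://billing.example.com/odata/Accounts?$filter=CustomerId%20eq%2042')

    # One assembled "version of truth", read fresh at request time.
    merged = (crm.first || {}).merge('billing' => billing.first)
    puts JSON.pretty_generate(merged)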
Obviously, and as you can see in the comments to your question, not everybody would agree with all of this -- so as always, take it with a grain of salt.

Comparison of OData and Semantic Web/Linked Data

I'm trying to get my head around two very different approaches to data sharing: OData and Semantic Web/Linked Data. Is there a good comparison of the two?
As I understand it, OData combines syndication/CRUD (AtomPub), serialisation formats (XML, JSON), a data model, a query language, and some semantics/conventions governing use of those existing technologies. It's primarily intended for exposing data from one system so that others can consume it.
Linked Data is a data model, a rigorous commitment to URIs, an (optional?) serialisation format (RDF/XML), but (correct me if I'm wrong) doesn't say anything about transport, CRUD, etc. It seems intended to allow inferencing across lots of little chunks of data drawn from a wide variety of sources. (Not something of major importance to us right now - we would be synchronising large slabs of data between a small number of sources, and wanting to preserve provenance information).
I'm interested in technologies for sharing data between certain data management platforms, some of which I work on directly. OData seems more appealing as it's very straightforward to explain to developers: implement this API, follow that Atom standard, serialise the data like this. We're already doing something very similar for one platform: sharing XML-serialised data on an Atom feed, with URL parameters used to filter.
By contrast, my past experiences working with RDF have given me an impression of brittle, opaque (massive slabs of RDF/XML), inaccessible (using SPARQL vs SQL) technology - but perhaps I'm confusing the experience of working with a triplestore like Jena with simply exposing an existing database via a linked data API.
Any pointers, comments etc on the differences and similarities between these two approaches in terms of scope, technologies, ease, future potential etc would be great.
I think discussing this in depth is not really what Stackoverflow is meant for, but just to give you some pointers to interesting discussions about differences and overlap:
Oh - it is data on the Web
Microsoft, OData and RDF
One of the key differences seems to be that OData has no means to link data from different sources to each other. Essentially, you're still stuck in a silo.
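For contrast, in Linked Data any dataset can point directly at another dataset's URIs. A tiny Turtle sketch (all URIs invented):

    # Dataset A, a library catalogue, describes a book:
    <http://catalogue.example.org/book/moby-dick>
        a <http://schema.org/Book> ;
        <http://schema.org/name> "Moby-Dick" .

    # Dataset B, a separate reviews site, links straight to A's URI:
    <http://reviews.example.net/review/99>
        <http://schema.org/itemReviewed> <http://catalogue.example.org/book/moby-dick> .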
It might also be interesting to check out various attempts to convert data between the two approaches. See, among others, http://answers.semanticweb.com/questions/1298/has-anyone-written-a-mapping-from-odata-to-rdf .
OData may be easier, but it's not better, by any means. SPARQL and RDF (forget RDF/XML; better to look at Turtle) satisfy everything in OData while also providing many more cutting-edge features such as:
Federation Extensions
Linked Data
Reasoning and Inference (for the more brave)
Equally, the software supporting the standards is actually quite sophisticated. Most people interested in OData generally come from a Microsoft background, so take a look at dotNetRDF
Here's a comparison matrix:
http://uoccou.wordpress.com/2011/02/17/linked-data-odata-gdata-datarss-comparison-matrix/
Unfortunately the table formatting is pretty horrible, but the content is useful.

What is a concise way of understanding RESTful and its implications?

Update: hooray! So it is a journey of practice and understanding. ;) Now I no longer feel so dumb.
I have read many articles on REST and coded up several Rails apps that make use of RESTful resources. However, I never really felt like I fully understood what it is, and what the difference is between RESTful and non-RESTful. I also have a hard time explaining to people why/when they should use it.
If someone has found a very clear explanation of REST and the circumstances in which to use it (when/why/where, and when not to), it would benefit the world if you could put it up, thanks! =)
REST is usually learned like this:
You hear about REST being HTTP used the way it was meant to be used, and from that you shun SOAP Web Services' envelopes, since most of what is needed by the many SOAP standards is handled by HTTP in a simple, no-nonsense way. You also quickly learn that you need to use the right method for the right operation.
Later, perhaps years later, you hear that REST is more than that. REST is in fact also the concept of linking between resources. This often takes a while to grasp the full meaning of, but when you learn this, you start introducing hyperlinks into your responses so that clients can navigate your system without being coupled to how the server wants to name its resources (i.e. the URIs).
Even later, you learn that you still haven't understood REST! And this is because you find out that media types are important. You start making media types called application/vnd.example.foo+json and put hyperlinks in them, since that's already your understanding of REST.
Years pass, and you re-read Fielding's thesis for the umpteenth time, to see if there's anything you missed, and it suddenly dawns upon you what the HATEOAS constraint really is: it's about the client not having any notion of how the server's resources are structured, but discovering those relationships at runtime. It also means that the screen in front of the user is driven completely by what is passed over the wire, so in fact, if a server passes an image/jpeg then that's what you're supposed to show the user, not an error message saying "AtomProcessor can't handle image/jpeg".
I'm just coming to terms with #4 and I'm hoping the ladder isn't much longer! It's taken me seven years.
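To make the later rungs of that ladder concrete, here is a toy Ruby sketch of a hypermedia-driven client (the entry URL and the _links format are invented): the only URI it is told is the entry point, and it discovers everything else at runtime.

    # Hypothetical hypermedia-aware client: every URI except the entry
    # point is discovered from "_links" in the responses, not hard-coded.
    require 'open-uri'
    require 'json'

    def link(url, rel)
      JSON.parse(URI.open(url).read).fetch('_links').fetch(rel).fetch('href')
    end

    orders_url = link('https://api.example.com/', 'orders')
    # If the server later moves /orders to /purchase-orders, this client
    # keeps working: it only ever knew the "orders" link relation.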
This article does a good job of classifying the differences among several HTTP application styles, from WS-* to RESTian purity. What I like about this post is that it reminds you that most of what we call REST is really only partly in line with Roy Fielding's original definition.
InfoQ has a whole section addressing more of the "what is REST" angle as well.
In terms of REST vs. SOAP, this question seems to have a number of good responses, particularly the selected answer.
I would imagine YMMV, but I found it very easy to start understanding the details of REST after I realised that REST is essentially a continuation of the static WWW concepts into the web-application design space. I have written a (rather longish) post on the same subject: Why REST?
Scalability is an obvious benefit of REST (stateless, caching).
But also - and this is probably the main benefit of hypertext - REST is ideal for when you have lots of clients to your service. Following REST and the hypertext constraint drastically reduces the coupling between all those clients and your server, which means you have more freedom when evolving/developing your service over time - you are not tied down by the risk of breaking would-be-coupled clients.
On a practical note, if you're working with rails - then restfulie is a great little framework for tackling hypertext on the client and server. Server side is a rails extension, and client is a DSL for handling state changes. Interesting stuff, check it out here: http://restfulie.caelum.com.br/ - I highly recommend the tutorial/demo vids they have up on vimeo :)
Content-Type: text/x-flamebait
I've been asking the same question lately, and my supposition is that half the problem with explaining why full-on REST is a good thing when defining an interface for machine-consumed data is that much of the time it isn't. OK, you'd need a really good reason to ignore the commonsense bits (URLs define resources, HTTP verbs define actions, etc.) - I'm in no way suggesting we go back to the abomination that was SOAP. But doing HATEOAS in a way that is both Fielding-approved (no non-standard media types) and machine-friendly seems to offer diminishing returns: it's all very well using a standard media type to describe the valid transitions (if such a media type exists), but where the application is at all complicated your consumer's agent still needs to know which are the right transitions to make to achieve the desired goal (a ticket purchase, or whatever), and it can't do that unless your consumer (a human) tells it. And if he's required to build into his program the out-of-band knowledge that the path with linkrels create_order => add_line => add_payment_info => confirm is the correct one, and reset_order is not the right path, then I don't see that it's so much more grievous a sin to make him teach his XML parser what to do with application/x-vnd.yourname.order.
I mean, obviously yes, it's less work all round if there's a suitable standard format with libraries and whatnot that can be reused, but in the (probably more common) case that there isn't, your options according to Fielding-REST are (a) create a standard, or (b) augment the client by downloading code to it. If you're merely looking to get the job done and not to change the world, option (c) "just make something up" probably looks quite tempting, and I for one wouldn't blame you for taking it.
