RESTful URL structure for displaying local data - url

I am developing a web app which displays sales from local stores around the United States. The sales and stores listed vary by location. Is there a RESTful URL scheme for describing this information while avoiding duplicate content?
Specifically, the app needs to list local stores, and list items sold at a particular store. Zip (postal) codes seem a convenient way to refer to location, so consider this scheme:
/stores/zip - list stores near zip, with links to particular stores
/store/name/lat+long - list items at a particular store
There is a problem. The page at /store/name/lat+long needs to link back to the list of stores, but which zip code should it choose? Say it chooses the zip code closest to the lat+long coordinate. A user might arrive at a particular store page from a link on /stores/zipA yet the store page could refer them back to a slightly different list, /stores/zipB.
We could solve that problem by carrying the zip code information forward. So the list at /stores/zip, could link to /store/name/lat+long/zip. This is not logical, however, because all information needed to identify a store is provided by the lat+long coordinate; the zip code is redundant. In fact the same page content would have duplicate URLs.
Another solution would be to save the last-viewed zip code as a cookie, but that's not RESTful. Is there a solution?

Add that information as an optional query parameter.
/store/name/lat+long?search=zip
The /store/name/lat+long path represents the resource uniquely, while the optional query parameter carries the extra information you need for a breadcrumb back to the user's original search.
If you have links that come from somewhere other than a search for that zip code, then you could just leave the query parameter off. When the query parameter is missing, default to linking back to the closest zip code, or leaving the breadcrumb link off entirely.
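A minimal sketch of the optional query parameter idea, following the /store/name/lat+long scheme from the question. The function names store_url and breadcrumb_url, the parameter name search, and the sample store are illustrative assumptions, not part of any framework.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def store_url(name, lat, lon, search_zip=None):
    """Build the store URL; the zip is only an optional hint for the breadcrumb."""
    path = f"/store/{name}/{lat}+{lon}"
    if search_zip is None:
        return path
    return f"{path}?{urlencode({'search': search_zip})}"


def breadcrumb_url(request_url, nearest_zip=None):
    """Return the link back to the store list, or None to omit the breadcrumb."""
    query = parse_qs(urlsplit(request_url).query)
    zips = query.get("search")
    if zips:
        # The user arrived from a search on this zip, so link back to it.
        return f"/stores/{zips[0]}"
    if nearest_zip is not None:
        # No hint in the URL: fall back to the closest zip code.
        return f"/stores/{nearest_zip}"
    return None


url = store_url("acme", "40.7", "-74.0", search_zip="10001")
print(url)                  # store page URL carrying the search hint
print(breadcrumb_url(url))  # link back to the originating list
```

The path alone still identifies the store, so a link without the query parameter resolves to the same resource.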
Another option would be to just let the browser history do this for you, by using JavaScript to navigate the user to the previous page in their history:
<a href="javascript:history.back()">Back to search</a>

Requesting input on conceptual ideas for disguising browser history

I am working with a Domestic Violence support organisation to build a website and have been asked to provide a "Quick Exit" function.
The purpose is to enable the user to exit the site quickly without closing the browser. I have seen such buttons on similar sites and the normal scenario is that they simply cause a Google search page to be shown. (easy but doesn't hide history)
I am looking for ideas to improve on this function to hide/disguise the history stored in the browser as this is currently a fairly significant flaw with the Quick Exit buttons I've seen to date.
I had a concept but I am looking for input on either fleshing out my concept, or other alternative directions to consider.
My concept was to have two domains: let's call them dv-site.com and decoy-site.com. The former is the source of domestic violence support information and the latter holds some unrelated content. It could be anything; let's just say weather information for the sake of the conversation.
If a user navigates directly to dv-site.com the server redirects to decoy-site.com but also attaches some session specific, or perhaps single use query string or similar.
decoy-site.com validates the query string and, if it is valid, loads dv-site.com within an iframe or something similar, so from the user's perspective they are just looking at dv-site.com, though the domain recorded in history is decoy-site.com.
Links within the iframe loaded site would similarly be redirected with the same or a new query string.
If a user were to click on the browser history and go directly to decoy-site.com, it would not be able to validate the query string and would just load the decoy site like a normal site, i.e. just showing the weather information that exists on that site.
Domestic violence is a serious systemic issue and I would love some input from anyone who has more technical knowledge than I do on fleshing out this concept.
Other aspects I am unsure how to tackle:
ensuring that dv-site.com can get crawled and ranked by search engines, even though users are all redirected, as it is imperative that it appears in search results so it can be found
the technical aspects of a redirect that does not appear in history.
I'm unsure if it's possible to do this without all content and engagement being attributed to decoy-site.com.
For the redirect, I believe that HTTP redirects do not get stored in history. You can use a 302 redirect for that. HTTP has a set-cookie header that lets you record a cookie - coupled with the headers here, you can give the decoy site access without recording it in history. Then, delete the cookie.
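As a rough sketch of the single-use query string from the question (not the cookie variant), the redirect and the decoy's validation could look like the code below. The class SingleUseTokens, the functions, the parameter name t, and the in-memory set are all assumptions for illustration; a real deployment would need a store shared by both sites and a short expiry.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs


class SingleUseTokens:
    """In-memory stand-in for a token store shared by both sites (assumption)."""

    def __init__(self):
        self._pending = set()

    def issue(self):
        token = secrets.token_urlsafe(16)
        self._pending.add(token)
        return token

    def redeem(self, token):
        """Return True only the first time a valid token is presented."""
        if token in self._pending:
            self._pending.discard(token)
            return True
        return False


def redirect_to_decoy(tokens, decoy_origin="https://decoy-site.com"):
    """Status and headers dv-site.com would send: a 302 carrying a fresh token."""
    query = urlencode({"t": tokens.issue()})
    return 302, [("Location", f"{decoy_origin}/?{query}")]


def decoy_page(tokens, url):
    """Decide what decoy-site.com shows for the requested URL."""
    supplied = parse_qs(urlsplit(url).query).get("t", [""])[0]
    return "dv-content" if tokens.redeem(supplied) else "weather"


tokens = SingleUseTokens()
status, headers = redirect_to_decoy(tokens)
location = dict(headers)["Location"]
print(decoy_page(tokens, location))  # first visit with the fresh token
print(decoy_page(tokens, location))  # same URL replayed from history
```

Because the token is consumed on first use, reopening the same URL from the browser history falls through to the ordinary decoy content.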
As far as pagerank goes, you could add a line to robots.txt as described here (the last point) to force the bot to scrape using a query parameter. Then in the backend, return the dv site only if that parameter is passed, otherwise redirect. If the googlebot removes query params when publishing, it will work out. Otherwise, it might fail.
Best of luck.

Using DELETE or PUT to delete a foreign key in a database?

We are currently creating a web app that mimics how flickr works as a project.
In this app, we have galleries and photos.
In each gallery, we store the photo ids in an array.
gallery.photos = [ photoId1, photoId2, photoId3 ]
We want to be able to delete a photo from a gallery. However, the photo stays in the database and can still be accessed from the user's profile, just not from the gallery.
So if we do DELETE url/gallery/photos/photoId3 and then GET url/gallery/photos/photoId3, it would return a 404 error.
There's an argument currently happening on whether we should use DELETE or PUT.
Some say DELETE, as we are deleting things and the photo isn't accessible from that URL.
Others say PUT, as we are just editing the list of photo ids.
So, my question is: is there a common convention when it comes to this problem?
is there a common convention when it comes to this problem ?
The important thing to recognize is that DELETE and PUT belong to the "transfer of documents over a network" domain, not to the domain of your database.
We use the standard methods in standard ways so that general purpose components can understand what the messages mean and do intelligent things.
If the semantics of the request are "the target uri should answer GET requests with a 404", then DELETE is appropriate. The underlying details of how your implementation achieves this are irrelevant.
Note that the definition of DELETE is pretty explicit about the limits of the semantics:
If the target resource has one or more current representations, they might or might not be destroyed by the origin server, and the associated storage might or might not be reclaimed, depending entirely on the nature of the resource and its implementation by the origin server (which are beyond the scope of this specification).
Where things get really messy: HTTP gives us agreement on the meaning of the semantics of messages, but does not restrict implementations. In particular, it is perfectly valid that a DELETE of one resource would also change the representation of some other resource, or vice versa.
Imagine, if you will, that the gallery is a web page (/gallery/1/photos/webpage.html), with links to images, including a link to /gallery/1/photos/3.
DELETE /gallery/1/photos/3
That removes the association between the URI and the image data, but it doesn't (necessarily) change the web page, so you get a broken link.
PUT /gallery/1/photos/webpage.html
That takes the link out of the web page, but of course the image can still be accessed directly via its URI.
(Note: if your profile was using the same URI for the image as your gallery, then this is more likely to be the model you want to use. We take the link out of this web page, but not out of profile.html. DELETE would produce 404's for ALL web pages that link to the picture).
If you want both - when the DELETE happens, the web page should be updated automatically, or vice versa - you can do that within your implementation (side effects are allowed). BUT... general purpose components will not necessarily know that both things have happened. For example, a general purpose cache isn't going to know that the webpage changed when you DELETE the image. And it won't know that you removed the image when you edit the web page.
Which is to say, we don't have a standard way to include in the HTTP response metadata that describes the other resources that have been changed by a request.
You can, of course, include that information in the response so that a bespoke component can do intelligent things.
DELETE /gallery/1/photos/photoId3
200 OK
Content-Type: text/plain
Deleted: /gallery/1/photos/photoId3
Changed: /gallery/1/photos/webpage.html
GET /gallery/galleryId/photos/photoId3 would return an Error 404 as photo isn't in that gallery, however GET /photos/photoId3 would still return the photo assuming you have the correct permissions
The good news: general purpose components don't know that there is any relationship between /gallery/galleryId/photos/photoId3 and /photos/photoId3. Again, the fact that they share information under the covers is an implementation detail hidden behind the HTTP facade.
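To make the "implementation detail behind the HTTP facade" point concrete, here is a toy in-memory sketch. The class PhotoService, its methods, and the choice of 204 as the success status are assumptions for illustration; it only imitates status codes and is not a real HTTP server.

```python
class PhotoService:
    """Toy resource model: galleries hold photo ids, photos live separately."""

    def __init__(self):
        self.photos = {}     # photo id -> image data
        self.galleries = {}  # gallery id -> list of photo ids

    @staticmethod
    def _gallery_photo(path):
        """Return (gallery_id, photo_id) for /gallery/<g>/photos/<p>, else None."""
        parts = path.strip("/").split("/")
        if len(parts) == 4 and parts[0] == "gallery" and parts[2] == "photos":
            return parts[1], parts[3]
        return None

    def get(self, path):
        parts = path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "photos":
            return 200 if parts[1] in self.photos else 404
        target = self._gallery_photo(path)
        if target and target[1] in self.galleries.get(target[0], []):
            return 200
        return 404

    def delete(self, path):
        target = self._gallery_photo(path)
        if target and target[1] in self.galleries.get(target[0], []):
            # Only the association is removed; the photo itself is kept.
            self.galleries[target[0]].remove(target[1])
            return 204
        return 404


svc = PhotoService()
svc.photos["photoId3"] = b"image bytes"
svc.galleries["1"] = ["photoId1", "photoId3"]
print(svc.delete("/gallery/1/photos/photoId3"))  # association removed
print(svc.get("/gallery/1/photos/photoId3"))     # gallery URI no longer resolves
print(svc.get("/photos/photoId3"))               # photo still reachable elsewhere
```

From the client's point of view the gallery URI simply stops answering GET, which is all that DELETE promises; whether the stored image survives is invisible to it.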

Firestore billing for reading a document with subcollections

I'm making an app where it stores how many minutes a user has studied with my app. My Firestore database starts with a "users" collection, and each user has their own document that is named by their userID generated in Auth.
My question is if I read their userID document, which has many documents in its sub collections, does that count as one read or does it also count the number of documents in the sub collections as well?
Thank You in advance.
The answer here from Torewin is mostly correct, but it is missing one important detail. It says:
if you retrieve a document; anywhere, it counts as a read
This is not entirely true. Cached document reads are not billed as reads. This is one important feature of the Firestore client SDKs that helps lower billing costs. If you get a single document using the source option "cache" (the options are "cache", "server", or "default"), the document is served from the local cache and the read is not billed. The cache is also used for query results when the app is offline.
The same is true for query results. If a document comes from cache for some reason, there is no billing for that read.
I am uncertain what Torewin means by this in comments: "They recommend you make multiple reads instead of 1 big one because you will save money that way". All reads are the same "size" in terms of billing, considering only the cost of the read itself. The size of the document matters only for the cost of internet egress usage, for which there is documentation on pricing.
It's worth noting that documents can't "contain" other documents. Documents are contained in collections or subcollections. These collections just have a "path" that describes where they live. A subcollection can exist without a "parent" document. When a document doesn't exist, but a collection is organized under it, the document ID is shown in italics in the console. When you delete a document using the client API, none of its subcollections are deleted. Deletes are said to be "shallow" in this respect.
If you are asking whether accessing a single document (in this case your generated userID document) from Firestore counts as 1 read, I would imagine the answer is yes.
Any query or read from Firestore only pulls the documents at the reference you are pointing to. For example, if you grab the 3rd document under users -> userID -> 3rd document, only that 3rd document is returned, and none of the other documents in that collection or in any other collection under the userID.
Does that answer your question or are you asking something completely different?
For reference: https://firebase.google.com/docs/firestore/pricing#operations
Edit: Each individual document that is pulled by the query will be charged. For example, if you pull the parent collection (with 6 documents in it), you will be charged for all 6 documents. The idea is to only grab the documents you need, or to use a cursor, which lets you resume a long-running query. For example, if you only want the document pertaining to user data on a specific date (if your data is set up like that), you'd retrieve only that specific document and not all of the documents in the collection for the other days.
A simple way of thinking about it is: if you retrieve a document; anywhere, it counts as a read.
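The counting rule from the answers above can be illustrated with a toy model. This is plain Python, not the Firestore SDK; the paths, the sessions subcollection, and the function names are made up for the example.

```python
# Hypothetical data: one user document with three documents in a subcollection.
DOCS = {
    "users/alice": {"name": "Alice"},
    "users/alice/sessions/s1": {"minutes": 30},
    "users/alice/sessions/s2": {"minutes": 45},
    "users/alice/sessions/s3": {"minutes": 20},
}


def read_document(db, path):
    """Fetching one document returns only that document, never its subcollections."""
    return [path] if path in db else []


def read_collection(db, collection_path):
    """Fetching a collection returns its direct child documents only."""
    prefix = collection_path + "/"
    return [p for p in db if p.startswith(prefix) and "/" not in p[len(prefix):]]


def billed_reads(paths, cached=()):
    """One billed read per returned document that was not served from cache."""
    return sum(1 for p in paths if p not in cached)


print(billed_reads(read_document(DOCS, "users/alice")))             # the user document alone
print(billed_reads(read_collection(DOCS, "users/alice/sessions")))  # every session document
```

Reading the user document does not touch the session documents at all; they are only counted when the subcollection itself is queried.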

How to get the specific <Ad ID, Campaign ID> that was clicked on from the landing page?

I've been searching for a solution to this, which I thought would be trivial, and seems pretty much impossible.
Here's the situation: I set up an AdWords campaign, ad groups and ads. I point them to www.mysite.com
Once visitors arrive at my site through one of my ads, I want to know exactly which ad they clicked on (and which campaign, as apparently the ad ID isn't globally unique). Is this possible?
I first tried by enabling Destination URL auto-tagging, but seems like the gclid parameter is pretty much useless.
Then I looked at the UTMZ cookie, but it seems like at most (correct me if this isn't the case), you get the campaign number (is this even the ID in AdWords?) and the keywords searched or the ad's keywords, one of those. Not anything I can uniquely identify the ad by, right?
Finally, I looked at ValueTrack, although again correct me if I'm wrong, but this would mean manually changing the destination URL of each of my ads in AdWords, right? Even doing this, I'm not sure I can get something that lets me uniquely identify the clicked ad. Is {creative} what I want? It's described in the docs as the "unique ID of the creative", does that mean this includes the Campaign.Id and the AdGroupAd.Id?
Thanks!
There is a way to do what you want using tracking templates.
Navigating to auto-tracking and tracking template settings:
Log in to AdWords, and click "Campaigns".
Click "Shared Library" in the bottom left corner.
Under "Shared Library", click "URL options".
You'll now get these options:
These options are set for the entire account. I think it is possible to override the tracking template for individual campaigns, ad groups and ads. Here is what they mean:
Auto-tagging
Auto-tagging means that when a user clicks on an ad, they will go to a URL with the gclid parameter appended, for example http://yourwebsite.com/?gclid=example. This value is useful for some things, such as offline conversions, so your website should save it.
Tracking template
Tracking template means that when a user clicks on an ad, they will be directed to this URL. Interestingly, it does not have to be your website, as long as the URL redirects to your website. For instance, you could set it up to look like this:
http://trackingcompany.com/?url={lpurl}&campaignid={campaignid}
{lpurl} and {campaignid} are placeholders which AdWords recognises and knows how to handle. So, for example, if a user clicks on an ad, they could go to:
http://trackingcompany.com/?url=http%3A%2F%2Fyourwebsite.com&campaignid=543987
trackingcompany.com must now redirect the user to http://yourwebsite.com; otherwise, it is in violation of AdWords policy and your ads could be rejected.
Now, here's the clever bit that I didn't realise because all of this is badly documented: you don't have to use a third-party tracking company to get access to things like campaign id. You can just reuse your own website! Just set your tracking URL to something like this:
{lpurl}?campaignid={campaignid}
You see that? {lpurl} will get replaced with the landing page, which is your website! So the user in our example would go to this URL upon clicking an ad:
http://yourwebsite.com?campaignid=543987
It's not clear to me whether yourwebsite.com must now redirect to the landing page URL without those parameters, or not.
I can't find documentation on these placeholders anywhere, but these are the ones that I've found work:
{lpurl} landing page URL
{campaignid} campaign ID
{adgroupid} ad group ID
{creative} creative or ad ID
{keyword} keyword
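A small sketch of both sides of this, assuming the placeholders behave as described above. AdWords performs the substitution itself; expand_template only imitates it so the result can be inspected, and ad_identity is a hypothetical helper for the landing page. The ad group and creative numbers are made up.

```python
from urllib.parse import urlsplit, parse_qs


def expand_template(template, values):
    """Imitate the placeholder substitution that AdWords does on a click."""
    url = template
    for name, value in values.items():
        url = url.replace("{" + name + "}", str(value))
    return url


def ad_identity(landing_url):
    """On the landing page: pull the ad identifiers out of the query string."""
    query = parse_qs(urlsplit(landing_url).query)
    keys = ("campaignid", "adgroupid", "creative")
    return {key: query[key][0] for key in keys if key in query}


template = "{lpurl}?campaignid={campaignid}&adgroupid={adgroupid}&creative={creative}"
clicked = expand_template(template, {
    "lpurl": "http://yourwebsite.com",
    "campaignid": 543987,
    "adgroupid": 1111,   # made-up value
    "creative": 2222,    # made-up value
})
print(clicked)
print(ad_identity(clicked))
```

Together, campaignid, adgroupid and creative are what the landing page would log to tell one clicked ad from another.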
Auto-tagging and tracking template together
If you enable both auto-tagging and a tracking template, then AdWords would behave as it normally does with a tracking template, appending a gclid query parameter.
Addendum: ignoring these new query parameters in Google Analytics:
If you use Google Analytics, you probably want to ignore these query parameters, merging hits that have these parameters with hits that don't. You can do that by setting the "Exclude URL Query Parameters" option to the names of the parameters you added, for example aw_campaignid,aw_adgroupid,aw_creative,aw_keyword if your tracking template uses those names. You can't apply this retroactively, so do this before making any AdWords changes.
As far as I know there is no ValueTrack parameter for campaign or ad group ID. You could just append something to the end of each ad's destination URL based on the campaign and ad group, but that is a bit of a chore.
If you link your Google Analytics & AdWords accounts and use auto-tagging in AdWords you can get the information you want in GA through the AdWords report (shows campaign, ad group, keyword etc). GA is able to use the gclid to retrieve data from AdWords, and I think you can then use the GA API to get the campaign data back out if you want it.
You could:
turn off auto-tagging
pull the entire account into an excel file
insert a new column for each desired output variable (Campaign, ad id [like Headline?])
trim, lower, and find/remove spaces from the target columns (so something like: campaignname, compressedheadline)
then concatenate that column with your destination URLs and a UTM string like this:
?utm_source=google&utm_medium=ppc&utm_content=compressedheadline&utm_campaign=campaignname
use this function and replace with the appropriate columns
=concatenate([dest url column],"?utm_source=google&utm_medium=ppc&utm_content=",[compressedheadline column],"&utm_campaign=",[campaignname column])
if the functions for the parts between the quotes break the formula, paste them into their own cells and then reference the cells in the concatenate function.
Drag this formula down the entire account,
Copy / Paste Special / Paste Values of the new Destination URLs over the old Destination URLs.
Remove unnecessary columns that have been created between Campaign, Ad Group, Headline, Description Line 1, Description Line 2, Display URL and your new Destination URL.
Then highlight just the Campaign, Ad Group, Headline, Description Line 1, Description Line 2, Display URL and your new Destination URL columns, and paste them into the AdWords Editor under "add/update multiple ads".
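The same string building can be sketched outside a spreadsheet. This mirrors the trim, lower, remove-spaces and concatenate steps above; the function names and sample values are illustrative, and it assumes the compressed names contain only URL-safe characters.

```python
import re


def compress(text):
    """Trim, lowercase, and remove all whitespace, as in the spreadsheet steps."""
    return re.sub(r"\s+", "", text.strip().lower())


def tagged_url(dest_url, campaign, headline):
    """Append the UTM string built from the campaign name and ad headline."""
    return (
        f"{dest_url}?utm_source=google&utm_medium=ppc"
        f"&utm_content={compress(headline)}"
        f"&utm_campaign={compress(campaign)}"
    )


print(tagged_url("http://www.mysite.com", "Spring Sale", " Buy Shoes Now "))
```

Two ads whose headlines differ only in case or spacing would end up with the same utm_content under this scheme, just as they would with the spreadsheet formula.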
You can get this data from the CLICK_PERFORMANCE_REPORT. The only downside is that this report can only be run for 1 day at a time, so if you needed a month's worth of data you would have to run about 30 reports.
The ad ID is the "CreativeId". You can also get the campaign ID and ad group ID from this report. There is 1 row for each click (GCLID), and these are unique.
See this link for more information on which fields are available:
https://developers.google.com/adwords/api/docs/appendix/reports#click

What is the right way to handle permalinks when the page depends on the session?

Here's the situation: I've got an application where you begin at a screen showing a list of countries. You choose a country, and this becomes the ambient country that the application uses until you change it. This ambient country is stored in the Session so the application doesn't have to pass around a CountryId in every single url. But I also want to support permalinks to country specific content, so I guess there needs to be a "Get Permalink" button, which creates a permalink that does contain the CountryId, because it obviously has to work independent of the current session.
Here's the question: My understanding is that because selecting a country changes the session state, one should only do it via POST. But then if the user comes in via GET with a permalink containing, e.g., CountryId=123, what should happen? Should the page update the Session with country 123? In this case, it would be breaking the rule that you can change the session ONLY via POST. But if it doesn't change the session, then all the code that relies on the session won't work, and I'd have to code redundant ways to generate the page.
OR, should the page have some sort of mechanism for saying "use the session value, but override with any query string value if there is one (and don't modify the session at all)"?
OR, am I misunderstanding the POST rule entirely?
The real issue here is the fact that you are using a Session. You cannot provide permalinks from it, because the data that you have stored in the session might have expired by the time the user follows the link later. So you must persist this data into a more durable data store when someone asks you to generate a permalink: persist all the search criteria that were used to perform the search, and obtain a unique id that will allow you to fetch them later. Then give the user the following permalink: /controller/search/id, where the id is the unique identifier that will allow you to fetch the criteria from your data store, perform the search, and reconstruct the page as it was.
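A minimal sketch of that approach, with an in-memory dictionary standing in for the durable data store. The class PermalinkStore and its methods are hypothetical names; only the /controller/search/id shape comes from the answer.

```python
import uuid


class PermalinkStore:
    """Persists search criteria under a unique id so they outlive the session."""

    def __init__(self):
        self._criteria = {}  # stand-in for a database table (assumption)

    def create(self, criteria):
        """Save a copy of the criteria and return the permalink for it."""
        link_id = uuid.uuid4().hex
        self._criteria[link_id] = dict(criteria)
        return f"/controller/search/{link_id}"

    def resolve(self, permalink):
        """Return the saved criteria for a permalink, or None if unknown."""
        link_id = permalink.rstrip("/").rsplit("/", 1)[-1]
        return self._criteria.get(link_id)


store = PermalinkStore()
session = {"CountryId": 123}
link = store.create(session)
session.clear()             # the session expires later
print(store.resolve(link))  # criteria are still available via the permalink
```

The page can then be rendered from the resolved criteria on a plain GET, without writing anything to the session.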
