I'm building a Python Flask web application that performs an analysis based on the inputs a user enters. I want to allow the user to send this analysis to their Facebook friends as a link; therefore, I need the analysis page to have a unique URL for every instance.
My approach up to this point has been to build the analysis page URL like so:
website.com/results/<Facebook ID>/<time in seconds since the epoch>
I'm using time.time() for the last parameter and the analysis data (paired with the FB ID and time) is stored in a database. An example URL might look something like this:
website.com/results/1619598720181063/1508036889
Does this approach seem feasible, are there best practices for generating a persistent unique URL, or is there a much better approach I'm overlooking?
Note I'm using Facebook's Send Dialog to share the link.
You might consider letting the database handle it for you by using an auto-generated unique key for the data. Keys are guaranteed to be unique, and your URI could be even simpler: results/123
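A minimal sketch of that idea, assuming SQLite as the store. The `results` table and the helper names `create_store`, `save_analysis` and `result_url` are hypothetical, chosen only for illustration; they are not part of Flask.

```python
import json
import sqlite3


def create_store(path=":memory:"):
    """Open a database containing a hypothetical `results` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "fb_id TEXT NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def save_analysis(conn, fb_id, analysis):
    """Store one analysis and return the key the database generated for it."""
    cur = conn.execute(
        "INSERT INTO results (fb_id, payload) VALUES (?, ?)",
        (fb_id, json.dumps(analysis)),
    )
    conn.commit()
    return cur.lastrowid


def result_url(result_id, base="https://website.com"):
    """Build the shareable URL from the generated key."""
    return f"{base}/results/{result_id}"


conn = create_store()
first = save_analysis(conn, "1619598720181063", {"score": 1})
second = save_analysis(conn, "1619598720181063", {"score": 2})
print(result_url(first))
print(result_url(second))
```

Sequential keys are easy to enumerate. If that matters for your data, one option is to store a random token (for example `uuid.uuid4().hex`) in a unique column and put that in the URL instead.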
When looking at how websites such as Facebook store profile images, the URLs seem to use randomly generated values. For example, the profile picture on Google's Facebook page has the following URL:
https://scontent-lhr3-1.xx.fbcdn.net/hprofile-xft1/v/t1.0-1/p160x160/11990418_442606765926870_215300303224956260_n.png?oh=28cb5dd4717b7174eed44ca5279a2e37&oe=579938A8
However, why not just organise it like so:
https://scontent-lhr3-1.xx.fbcdn.net/{{ profile_id }}/50x50.png
Clearly this would be much easier in terms of storage and simplicity. Am I missing something? Thanks.
Companies like Facebook have fairly intense CDNs. The URLs may look randomly generated, but they aren't: each individual route is deliberate and programmed to be handled in that manner.
They aren't after simplicity of storage the way you would be if you were just using FTP to connect to a basic marketing website's server. While you may put all your images in an /images folder, Facebook is much too complex for this: dozens of different types of applications access hundreds, if not thousands, of CDNs and servers worldwide.
If you ever build a web app, such as a Ruby on Rails app, and you work with a service such as AWS (Amazon Web Services), you'll also encounter what seem like nonsensical URLs. But it's all part of the fast delivery network provided within the architecture. Every time you "push" your app up to the server, new URLs are generated automatically for each unique resource: CSS files, JavaScript files, image files and so on are all dynamically created. You don't have to type in each of these unique URLs individually each time you publish the app; the code simply knows where to look for them as part of the publishing process.
Example: you tell the web app to look for
//= require jquery
and it returns http://example.com/assets/jquery-eb3e278249152b5b5d5170b73d9dbf52.js?body=1 in your header.
It doesn't matter that the URL is more complex than it seems it should be; the application recognizes it, and that's all that matters.
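To make the idea concrete, here is a rough sketch of how such a name can be derived, assuming the long suffix is a digest of the file's contents. The function name `fingerprinted_name` and the choice of SHA-256 are assumptions for illustration, not a description of what any particular framework does internally.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(logical_name, content):
    """Return '<stem>-<digest><suffix>', where the digest depends on content."""
    digest = hashlib.sha256(content).hexdigest()
    path = PurePosixPath(logical_name)
    return f"{path.stem}-{digest}{path.suffix}"


old = fingerprinted_name("jquery.js", b"console.log('v1');")
new = fingerprinted_name("jquery.js", b"console.log('v2');")
print(old)
print(new)
```

Because the name changes whenever the content changes, the application only has to remember the mapping from the logical name to the current fingerprinted one.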
Simply put, I think it boils down to two main reasons: security and caching.
Security - Adding these long, unpredictable hashes prevents others from guessing photo URLs and makes it pretty hard to download photos you aren't supposed to see.
Consider what would happen if I could easily guess your profile photo URL and download it, even when you explicitly chose to share it only with friends.
Cache - by adding "random" query params to each photo, you make sure each photo instance gets its own URL. Thus you can store the photo in browser's cache for a long time, knowing that whenever you replace it with a new one, the new photo will have a fresh URL and the browser won't keep showing you the old photo.
If you were to keep the same URL for each user's profile photo (e.g. https://scontent-lhr3-1.xx.fbcdn.net/{{ profile_id }}/50x50.png), and then upload a new photo, either one of these can happen:
If you stored the photo in browser's cache for a long time, the browser will keep showing you the cached version (as long as URL is the same, and cache hasn't expired, there's no need to re-download the image).
If, instead, you only keep the image in cache for a short period of time, you end up hitting your server much more than is actually needed, increasing the load and hurting performance.
I hope this clarifies it.
With your route scheme, how would you prevent strangers from accessing the pictures of a private account? The hash also prevents bots from downloading all the pictures.
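One common way to get URLs that strangers and bots cannot guess is to sign them on the server. The sketch below shows that general technique using only the standard library. It is not a description of how Facebook builds its URLs, and `SECRET_KEY`, the `expires` and `sig` parameter names, and the function names are all hypothetical.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side secret


def sign_path(path, expires_at):
    """Return a hex signature over the path and its expiry time."""
    message = f"{path}|{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def signed_url(path, ttl_seconds=3600, now=None):
    """Append an expiry and a signature so the URL cannot be guessed."""
    now = int(time.time()) if now is None else now
    expires_at = now + ttl_seconds
    return f"{path}?expires={expires_at}&sig={sign_path(path, expires_at)}"


def is_valid(path, expires_at, sig, now=None):
    """Accept only a matching signature that has not yet expired."""
    now = int(time.time()) if now is None else now
    expected = sign_path(path, expires_at)
    return hmac.compare_digest(expected, sig) and now <= expires_at


url = signed_url("/photos/42/50x50.png", ttl_seconds=60, now=1000)
print(url)
```

Changing the path or the expiry invalidates the signature, so knowing one valid URL does not help anyone construct another.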
I get your pain :-) Rather than dwell on how this problem arises, let me describe a solution. It's normal for code that deals with hashed or even Base64-encoded values to look like a mess, but once each value has an identifier that explains it, not much of the mess remains!
I used to work at a company where we collated Facebook posts: using the Graph API, we fetched each post's Insights object and extracted information from it for easy passing around within the UI and for sending back to our Redis cache store. Once we had defined a data structure in TaffyDB for how an object would be organized, everything made sense, thanks to its ability to query the useful parts out of a long, junk-looking stream of minified JavaScript.
Refer: http://www.taffydb.com/
The extra values in the URL are useful to:
Track access. This is like when a newspaper appends "&homepage" vs. "&email" to an article URL, so their system knows how a reader found the page.
Avoid abuse and control access. Imagine that a user uploaded a small, popular pornographic image as a profile image. They could then hijack the CDN as a free web host for their porn site. The code in the URL, however, is used internally by the CDN to limit the number of views.
I am working with the Eventbrite API and am trying to figure out the best approach for a particular need. Right now, I have an event that people will be registering for. At the final step of the registration process, I need to ask them some questions that are specific to my event. Sadly, these questions are data-driven from my website, so I am unable to use the packaged surveys with Eventbrite.
In a perfect world, I would use the basic flow detailed in the Website Workflow of the EB documentation, ending upon the "3rd Party Next Steps" step (redirect method).
http://developer.eventbrite.com/doc/workflows/
Upon landing on that page, I would like to be able to access the order data that we just created in order to update my database and to send emails to each person who purchased a seat. This email would contain the information needed to kick off the survey portion of my registration process.
Is this possible in the current API? Does the redirect post any data back to the 3rd party site? I saw a few SO posts that gave a few keywords that could be included in the redirect URL (is there a comprehensive list?). If so, is there a way to use that data to look up order information for that order only?
Right now, my only other alternative is to set up a polling service that would pull EB API data, check for new values, and then kick off the process on intervals. This would be pretty noisy for all parties involved, create delay for my attendees, and I would like to avoid it if possible. Thoughts?
Thanks!
Here is the full set of parameters that we support after an attendee places an order:
http://yoursite.com/?eid=$event_id&attid=$attendee_id&oid=$order_id
It's possible that order_id and attendee_id will not be numeric values, in which case they are returned as "unknown." You'll always have the event_id, though.
If you want to get order-specific data after redirecting an attendee to your site, you can use the event_list_attendees method, along with the modified_after parameter. You'll still have to look through the result set for the new order_id, but the result set will be much smaller and easier to navigate. You can get more information here: http://developer.eventbrite.com/doc/events/event_list_attendees/
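On the receiving side, reading those parameters could look something like the sketch below. It assumes the `eid`, `attid` and `oid` query parameter names from the URL template above and treats the literal string "unknown" as a missing value. The function name `order_params` is hypothetical, and the follow-up call to event_list_attendees is deliberately left out.

```python
from urllib.parse import parse_qs, urlparse


def order_params(redirect_url):
    """Extract eid/attid/oid from the landing URL; 'unknown' becomes None."""
    query = parse_qs(urlparse(redirect_url).query)
    result = {}
    for key in ("eid", "attid", "oid"):
        value = query.get(key, [None])[0]
        result[key] = None if value in (None, "unknown") else value
    return result


params = order_params("http://yoursite.com/?eid=111&attid=unknown&oid=333")
print(params)
```

If `oid` comes back as None, the landing page would need to fall back to searching the recent attendees for the event.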
You can pass the order_id in your redirect URL in order to solve this.
When you define a redirect URL, Eventbrite will automatically swap in the order_id value in place of the string "$order_id".
http://your3rdpartywebsite.com/welcome_back/?order_id=$order_id
or:
http://your3rdpartywebsite.com/welcome_back/$order_id/
When the user completes their transaction, they will be redirected to your external site, as shown here: http://developer.eventbrite.com/doc/workflows/
When your post-transaction landing page is loaded, grab the order_id from the request URL, and call the event_list_attendees API method to find the order information in the response.
Here's the situation: I've got an application where you begin at a screen showing a list of countries. You choose a country, and this becomes the ambient country that the application uses until you change it. This ambient country is stored in the Session so the application doesn't have to pass around a CountryId in every single url. But I also want to support permalinks to country specific content, so I guess there needs to be a "Get Permalink" button, which creates a permalink that does contain the CountryId, because it obviously has to work independent of the current session.
Here's the question: My understanding is that because selecting a country changes the session state, one should only do it via POST. But then if the user comes in via GET with a permalink containing, e.g., CountryId=123, what should happen? Should the page update the Session with country 123? In this case, it would be breaking the rule that you can change the session ONLY via POST. But if it doesn't change the session, then all the code that relies on the session won't work, and I'd have to code redundant ways to generate the page.
OR, should the page have some sort of mechanism for saying "use the session value, but override with any query string value if there is one (and don't modify the session at all)"?
OR, am I misunderstanding the POST rule entirely?
The real issue here is the fact that you are using a Session. You cannot provide permalinks because the data that you have stored in the session might have expired by the time the user follows the link later. So you must somehow persist this data into a more durable data store when someone asks you to generate a permalink. When a user asks for a permalink, you persist all the search criteria that were used to perform the search into your data store and obtain a unique id that will allow you to fetch them later. Then give the user the following permalink: /controller/search/id, where the id is the unique identifier that will allow you to fetch the criteria from your data store, perform the search and reconstruct the page as it was.
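A language-agnostic sketch of that flow, written in Python for brevity. The in-memory dictionary stands in for the durable data store, and the names `create_permalink` and `load_criteria` are hypothetical.

```python
import json
import uuid

# Stand-in for a durable data store; a real application would use a database.
_saved_searches = {}


def create_permalink(criteria):
    """Persist the criteria and return a permalink containing their id."""
    permalink_id = uuid.uuid4().hex
    _saved_searches[permalink_id] = json.dumps(criteria, sort_keys=True)
    return f"/controller/search/{permalink_id}"


def load_criteria(permalink):
    """Fetch the criteria for a permalink; returns None if the id is unknown."""
    permalink_id = permalink.rstrip("/").rsplit("/", 1)[-1]
    stored = _saved_searches.get(permalink_id)
    return None if stored is None else json.loads(stored)


link = create_permalink({"CountryId": 123, "query": "hotels"})
print(link)
print(load_criteria(link))
```

The handler behind the permalink reads the criteria from the store rather than from the session, so the link keeps working after the session has expired.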
What I would like to do is have my admin user be able to see - in real time (via some AJAX/jQuery niceness) - what my users are doing.
How do I go about doing that?
I assume it has something to do with session activity - and I have started saving the session to the db, rather than the cookie.
But generally speaking, how do I take that info and parse it in real time?
I looked at my session table and aside from the ids (id and session_id), I see a 'data' field. That data field stores a hash - which I can't make any sense of (looks like an md5 hash).
How would I use that to see that User A just clicked on Link B, and right after that User B clicked on link A, etc. ?
Is there a gem - aside from rackamole - that might be able to help me?
You might want to check out Mixpanel. They are easy to set up and have some of what you are asking for.
The session data only contains the values stored in the session[]-hash from the user. It doesn't store which action/controller was called, so you don't know which "link was clicked".
Get the activity of your users:
Besides rackamole you have two options IMHO.
Use a before_filter in your ApplicationController to store the relevant info you are interested in. (Name of controller, action or URI, additional parameters and id of the logged in user for example).
Use an AJAX-call at the bottom of each page which posts back the info you are interested in (URI, id of logged in user, etc.) to your server. This allows faster response times from the server, as the info is stored after the page has already been delivered. Plus, you don't have to use a Rails-request to store it. The AJAX-request could also be calling a simple PHP-script writing the data to disk. This is much faster.
Storing this activity:
Store this data/info either in the database or in a logfile. The database will give you more flexibility, like showing all actions from one user, or all visitors for one page, etc. The logfile solution will give you better performance.
Realtime vs. Oldschool:
As for pulling out your collected data in realtime, you have to build your own solution. To do this elegantly (without querying your server once a second to look if new data has arrived) you'll need another server process. Search for AJAX Push for more info.
Depending on your application I'd ask myself if realtime notifications for this are really necessary (because of all the hassles of setting this up).
To monitor the activity on your site, it should be enough to have a page listing the latest actions and manually refresh it (or refresh it automatically every ten seconds).
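The database variant of storing and listing activity can be sketched as follows. This is written in Python with SQLite purely for illustration (in a Rails app the recording step would live in the before_filter), and the `activity` table and function names are hypothetical.

```python
import sqlite3
import time


def open_log(path=":memory:"):
    """Open a database containing a hypothetical `activity` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user_id INTEGER NOT NULL, "
        "uri TEXT NOT NULL, "
        "at REAL NOT NULL)"
    )
    return conn


def record(conn, user_id, uri, at=None):
    """Store one request: who made it, which URI, and when."""
    when = time.time() if at is None else at
    conn.execute(
        "INSERT INTO activity (user_id, uri, at) VALUES (?, ?, ?)",
        (user_id, uri, when),
    )
    conn.commit()


def latest(conn, limit=10):
    """Return the most recent actions, newest first, for the admin page."""
    rows = conn.execute(
        "SELECT user_id, uri FROM activity ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


log = open_log()
record(log, 1, "/link-b")  # User A clicked on link B
record(log, 2, "/link-a")  # User B clicked on link A
print(latest(log))
```

An admin page that calls `latest` on each refresh gives the "latest actions" view described above without any push infrastructure.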
Maybe you can test https://github.com/raid5/acts_as_scribe#readme
It works with Rails 3 too.
I'm designing (and developing) web software that will allow the general public to sign up for a service, become a customer, and exchange fairly sensitive data.
I'm working through the documentation and the tutorials, and of course the RESTful pattern adopted by the default routing in ASP.NET MVC is to do URLs like this: /customer/edit/3487.
I guess I am a little squeamish about displaying such technical details as customer ID in the URL bar.
What do the smart kids do these days? Does RESTful have to mean "put your record IDs on display"?
Edit: In an ASP.NET WebForm I would have stored this in the session, I think. But I'm finding that this is discouraged in ASP.NET MVC.
Edit:
I do not intend to rely on security through obscurity.
That still doesn't mean it's a good idea to give the users any ideas, or any information about the underlying data. Let's say I have an app that's publishing information about the different businesses in a Chamber of Commerce, to be arbitrary. Once you are logged in, you have an administrative right to click on every business in the directory and see them all - but the application is supposed to spoon-feed them to you as search results or the like. Just because the user technically is allowed to access all records, this doesn't mean it should be trivial for you to write a screen scraper that downloads all of my content in a few minutes. As well, the user can just look at customer IDs and make a guess about how many customers I might have. There are lots of good reasons not to display this.
As long as proper authentication and authorization are being done on the server side, displaying ids is not an issue.
Otherwise, just try to encrypt the particular id or username in the URL; this way it will be more difficult for attackers.
You don't have to put the Id in the Url, you just need to use a unique value or unique combination of values to find the data you want to display.
I'd think that the actual business's name would be good and would also look good in the Url. So you would have something like this:
/Business/View/theouteredge/
Or if the business name is not unique you could use a combination of business name and zip/postal code.
/Business/View/theouteredge/78665/
You would have to write a new route to handle this.
routes.MapRoute(
"Business",
"Business/{action}/{name}/{zip}/",
// Default keys must match the {name} and {zip} placeholders in the URL pattern.
new { controller = "Business", action = "Index", name = "", zip = "" }
);
All of these actions would need to be secured with the [Authorize] attribute, either on each action or on the controller itself.
If you also decorate your actions with [Authorize], then a visitor who is not logged in and tries the id of another user will automatically be challenged for a login.
It's 6 of one and 1/2 dozen of the other as to whether you use an ID or a Name. Eventually they both resolve to a record.
The important thing is to only allow authorised persons to view the data by allowing them to log in.
I've got a site which has sensitive data, but you can only see that info if you are its holder. I do that by decorating my actions, checking rights, etc.
I think that putting an ID in a url is fine -- as long as it is a Surrogate Key. The key has no value, except to identify a record. Just make sure that the requester is authorized before you send sensitive data back to the client.
Update:
I can see how having a number as part of your URL is undesirable. After all, a URL for a web app is part of the user interface, and exposing such internal details can take away from the UI's elegance. However, you are faced with limited options.
Somehow, you have to identify the resource that you want to get. The crux of REST (IMO) is that a request to a server for a particular resource must be described entirely by the request. The key for the item you want has to be encoded into the HTTP GET somehow. Your options are: put it into the URL somehow, or add it to a cookie. However, adding a key to a cookie is frowned upon.
If you look at this site you will see the question id in the url. If you view your profile you will see your username. So you would probably want to use usernames instead of an id.
If you're really concerned about it you can use a Guid, which isn't very user friendly but would be very hard to guess. :)
If you use some other way than customer id simply because you're concerned about security, then that means you're using security through obscurity, which is a bad idea. Proper authorization would require something like you either 1) have to be logged in with that customer id, or 2) be logged in as an admin, to have that request succeed.
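The rule in that last sentence is small enough to sketch directly. The example below is in Python rather than C#, and the `User` type and `can_access` function are hypothetical names used only to illustrate the check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    customer_id: int
    is_admin: bool = False


def can_access(user: Optional[User], requested_customer_id: int) -> bool:
    """Allow the request only for the record's owner or for an admin."""
    if user is None:  # not logged in at all
        return False
    return user.is_admin or user.customer_id == requested_customer_id


owner = User(customer_id=3487)
stranger = User(customer_id=1001)
admin = User(customer_id=1, is_admin=True)

print(can_access(owner, 3487))
print(can_access(stranger, 3487))
print(can_access(admin, 3487))
print(can_access(None, 3487))
```

With a check like this on every request, it no longer matters whether the identifier in the URL is a number, a name or a Guid: guessing it gains the requester nothing.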