When looking at how websites such as Facebook stores profile images, the URLs seem to use randomly generated value. For example, Google's Facebook page's profile picture page has the following URL:
https://scontent-lhr3-1.xx.fbcdn.net/hprofile-xft1/v/t1.0-1/p160x160/11990418_442606765926870_215300303224956260_n.png?oh=28cb5dd4717b7174eed44ca5279a2e37&oe=579938A8
However why not just organise it like so:
https://scontent-lhr3-1.xx.fbcdn.net/{{ profile_id }}/50x50.png
Clearly this would be much easier in terms of storage and simplicity. Am I missing something? Thanks.
Companies like Facebook have fairly intense CDNs. They may look like randomly generated urls but they aren't, each individual route is on purpose and programed to be handled in that manner.
They aren't after simplicity of storage like you would be if you were just using a FTP to connect to a basic marketing website server. While you may put all your images in a /images folder, Facebook is much too complex for this. Dozens of different types of applications accessing hundreds if not thousands of CDNs and servers world wide.
If you ever build a web app, such as a Ruby on Rails app, and you work with a services such as AWS (Amazon Web Services) you'll also encounter what seems like nonsensical urls. But it's all part of the fast delivery network provided within the architecture. Every time you "push" your app up to the server new urls are generated for each unique resource automatically, css files, JavaScript files, image files, etc all dynamically created. You don't have to type in each of these unique urls individually each time you publish the app, the code simply knows where to look for those as a part of the publishing process.
Example: you tell the web app to look for
//= require jquery
and it returns you http://example.com/assets/jquery-eb3e278249152b5b5d5170b73d9dbf52.js?body=1 in your header.
It doesn't matter that the url is more complex than it should be, the application recognizes it, and that's all that matters.
Simply put, I think it can boil down to two main reasons: Security and Cache:
Security - Adding these long unpredictable hashes prevent others from guessing photo URLs and makes it pretty hard to download photos you aren't supposed to.
Consider what would happen if I could easily guess your profile photo URL and download it, even when you explicitly chose to share it only with friends.
Cache - by adding "random" query params to each photo, you make sure each photo instance gets its own URL. Thus you can store the photo in browser's cache for a long time, knowing that whenever you replace it with a new one, the new photo will have a fresh URL and the browser won't keep showing you the old photo.
If you were to keep the same URL for each user's profile photo (e.g. https://scontent-lhr3-1.xx.fbcdn.net/{{ profile_id }}/50x50.png), and then upload a new photo, either one of these can happen:
If you stored the photo in browser's cache for a long time, the browser will keep showing you the cached version (as long as URL is the same, and cache hasn't expired, there's no need to re-download the image).
If, instead, you only keep the image in cache for short period of time, you end up hitting your server much more then actually needed, increasing the load and hurting performance.
I hope this clarifies it.
With your route scheme, how would you avoid strangers to access the pictures of a private account? The hash also prevent bots to downloads all the pictures.
I get your pain :-) I might not stay with describing how this problem could appear more, but rather let me speak of a solution. Well it is normal that in general code while dealing with hashed value or even base64ed value it seems likes mess to deal with, but with an identifier to explain along, it does not remain much!
I use to work in a company where we use to collate Facebook post, using Graph API get its Insights Object and extract information from it for easy passing around within UI and sending back to our Redis cache store; and once we defined a data-structure in TaffyDB how an object organization is going to look like, everything just made sense with its ability to query the useful finite from long junk looking stream of minified Javascript stream
Refer: http://www.taffydb.com/
The extra values in the URL are useful to:
Track access. This is like when a newspaper appends "&homepage" vs. "&email" to an article URL, so their system knows how a reader found the page.
Avoid abuse and control access. Imagine that a user loaded a small, popular pornographic image into a profile image. They could then hijack the CDN to be a free web host for their porn site. But that code is used internally by the CDN to limit the number of views.
In my application, I have a textarea input where users can type a note.
When they click Save, there is an AJAX call to Web Api that saves the note to the database.
I would like for users to be able to attach multiple files to this note (Gmail style) before saving the Note. It would be nice if the upload could start as soon as attached, before saving the note.
What is the best strategy for this?
P.S. I can't use jQuery fineuploader plugin or anything like that because I need to give the files unique names on the server before uploading them to Azure.
Is what I'm trying to do possible, or do I have to make the whole 'Note' a normal form post instead of an API call?
Thanks!
This approach is file-based, but you can apply the same logic to Azure Blob Storage containers if you wish.
What I normally do is give the user a unique GUID when they GET the AddNote page. I create a folder called:
C:\TemporaryUploads\UNIQUE-USER-GUID\
Then any files the user uploads at this stage get assigned to this folder:
C:\TemporaryUploads\UNIQUE-USER-GUID\file1.txt
C:\TemporaryUploads\UNIQUE-USER-GUID\file2.txt
C:\TemporaryUploads\UNIQUE-USER-GUID\file3.txt
When the user does a POST and I have confirmed that all validation has passed, I simply copy the files to the completed folder, with the newly generated note ID:
C:\NodeUploads\Note-100001\file1.txt
Then delete the C:\TemporaryUploads\UNIQUE-USER-GUID folder
Cleaning Up
Now. That's all well and good for users who actually go ahead and save a note, but what about the ones who uploaded a file and closed the browser? There are two options at this stage:
Have a background service clean up these files on a scheduled basis. Daily, weekly, etc. This should be a job for Azure's Web Jobs
Clean up the old files via the web app each time a new note is saved. Not a great approach as you're doing File IO when there are potentially no files to delete
Building on RGraham's answer, here's another approach you could take:
Create a blob container for storing note attachments. Let's call it note-attachments.
When the user comes to the screen of creating a note, assign a GUID to the note.
When user uploads the file, you just prefix the file name with this note id. So if a user uploads a file say file1.txt, it gets saved into blob storage as note-attachments/{note id}/file1.txt.
Depending on your requirement, once you save the note, you may move this blob to another blob container or keep it here only. Since the blob has note id in its name, searching for attachments for a note is easy.
For uploading files, I would recommend doing it directly from the browser to blob storage making use of AJAX, CORS and Shared Access Signature. This way you will avoid data going through your servers. You may find these blog posts useful:
Revisiting Windows Azure Shared Access Signature
Windows Azure Storage and Cross-Origin Resource Sharing (CORS) – Lets Have Some Fun
I am working on an ASP.NET MVC application where a user can manage its own profile. He can change for example his own photo.
Since a photo is considered as static content, IIS will lock this file, as I understand, and cache it to optimise performance.
The problem happens when the user try to change the image. What I am doing is :
Record the new image.
Start serving the new one. The old file will never be served.
Now I need to remove the old image. But I having access denied exception.
How to tell IIS to unlock this old photo so that I can delete it.
One can imagine setting up a loop that tries to delete the photo and if cannot it will wait and retry... but I have no idea how much time that can take.
Do you have any better solution to tell IIS to unlock file that will never be used ?
I'm currently building into my app a method to allow users to provide images for their profile.
As part of the upload process I'll be creating a couple of different versions of the file for use in various different places in the system.
I am not going to store the images in a database as that just doesn't make sense to me. They will be served up by IIS and I also want to structure it in such a way that it will make migrating it to a CDN in the future a lot easier.
My current plan is to assign each user a GUID or similar unique value that will be stored in the database as part of their profile information, so something like:
ProfileId 1 - UserCode 628B3AF30B0F48EA8D61778084FC73C3
When a user uploads their profile image (or other data) I'll use the code to create a directory on the web server, for instance:
"~/userimages/628B3AF30B0F48EA8D61778084FC73C3/"
Then I'll store a few images in here like:
"~/userimages/628B3AF30B0F48EA8D61778084FC73C3/profilephoto160x160.jpg"
"~/userimages/628B3AF30B0F48EA8D61778084FC73C3/profilephoto25x25.jpg"
This should then make the process of recreating valid urls to the correct images both not hard coded anywhere nasty (like a profile image url in the database) and should make it a predictable process, something like:
var profileImgurl = BaseImageFolder + profile.UserCode + "profilephoto160x160.jpg"
I have no experience of CDN and was wondering if I'm creating any problems I'll struggle to solve later down the line when/if it gets migrated?
Does anyone see any nasty problems with this approach that I haven't thought of?
As far as I can tell people like google have a similar approach to this.
I have a gallery in my rails app that needs to only allow certain images to be shown to specific, logged in users. I am using Paperclip for image processing now, but it saves all images in a public folder available to anyone.
Note that I don't have to use Paperclip if there is a better way, and I already have the login system in place. I just need a way to place the images in a non-public location, but still be able to serve them as needed.
Is it possible to only allow these images to be served to authenticated users?
Here you can find how to change the path of the uploaded pictures. If you have done this. You need to create a controller which serves these static files.
For Example: Paperclip sample app part 2: downloading files through a controller
Is it possible to only allow these
images to be served to authenticated
users?
Yes, you just need to check if the user is logged in in the controller action responsible for the image.