Rails 7: Fuzzy File Matching for public files/images? - ruby-on-rails

Scenario: I'm uploading images to Rails ad hoc. I have the files linked from the database using an external ID # field (the images need to be attributed to the original artist, and this is the easiest way to do it). The files will be uploaded to public.
Problem: Some of the images are .jpg, others .png (there might be others later). I can't put the files under assets/images since this requires the asset pipeline to be refreshed each time. ActionController::Base.helpers.resolve_asset_path(@external.id) and image_path(@external.id) work great when the file is in the pipeline, since I don't have to specify the exact file extension.
The only way I was able to do it was by concatenating/mapping, such as:
image_path("/#{@external.id}.jpg") || image_path("/#{@external.id}.png")
Not ideal, and I'm not sure if there's a better way. Later, I'll probably look into another storage solution like a BLOB field or S3 bucket, but I wanted to know if there was a simple way to do this until then.
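One possible stopgap, sketched here under assumptions rather than taken from any answer: look on disk for a file whose basename is the ID and let a glob supply the extension. The helper name public_image_path and the ALLOWED_EXTENSIONS list are hypothetical. Note that image_path always returns a string, so chaining calls with || never reaches the second one.

```ruby
# Hypothetical helper: returns a URL path such as "/123.png" for a file in
# public_dir whose basename equals the ID, or nil when no such file exists.
ALLOWED_EXTENSIONS = %w[jpg jpeg png gif webp].freeze

def public_image_path(id, public_dir)
  # Keep only safe characters so the ID cannot inject glob or path syntax.
  safe_id = id.to_s.gsub(/[^A-Za-z0-9_-]/, "")
  return nil if safe_id.empty?

  pattern = "#{safe_id}.{#{ALLOWED_EXTENSIONS.join(',')}}"
  # `base:` makes the glob relative to public_dir; `min` picks a
  # deterministic match if several extensions exist for the same ID.
  match = Dir.glob(pattern, base: public_dir.to_s).min
  match && "/#{match}"
end
```

In a Rails app this could be called with the record's external ID and Rails.public_path, and the result passed to image_tag. It touches the filesystem on every call, so caching the result may be worthwhile.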

Related

iOS file upload - original filename

I have a simple HTML file upload snippet that works under iOS as well. However my problem is that the filename of the uploaded file will always be 'image.jpeg'. Is there a way to get the original filename - i.e. 'IMG_0001.jpg' instead? The major issue is that if I have 2 files selected they both have the name of 'image.jpeg' as opposed to their unique names.
Safari on iOS will always make the name of the uploaded file image.jpeg, presumably for security/privacy purposes. You need to generate your own name for the files, which is a good idea in general for uploaded files: you never want to trust the client too much.
If you are targeting more than just Safari on iOS, you will still need to handle this case because it is reasonable that people might upload multiple files with the same name, but originally located in different directories.
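A minimal sketch of generating your own names, assuming Ruby on the server. The helper stored_filename and the SAFE_EXTENSIONS list are hypothetical; the client-supplied basename is discarded and only a whitelisted extension is kept.

```ruby
require "securerandom"

# Hypothetical helper: builds a server-side filename that ignores the
# client-supplied basename and keeps only a whitelisted extension.
SAFE_EXTENSIONS = %w[.jpg .jpeg .png .gif].freeze

def stored_filename(original_name)
  ext = File.extname(original_name.to_s).downcase
  # Unknown extensions fall back to a neutral one.
  ext = ".bin" unless SAFE_EXTENSIONS.include?(ext)
  "#{SecureRandom.uuid}#{ext}"
end
```

Two uploads that both arrive as image.jpeg then get distinct stored names, and the original name can still be kept in a database column for display.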

Rails File Upload - Scan files; and separate folders for each user

Does Paperclip scan files for errors, malicious software, or viruses before they are uploaded to the database? If not, what are the viable solutions?
And, is it better to first create a separate folder for each user before they upload files and store in their respective folders? What are the merits and demerits of it? Is it possible to specify this with Paperclip?
Thanks
Re viruses etc, this might be useful - Rails / Heroku - How to anti-virus scan uploaded file?
Re storing each user's files in a separate folder: the conventional way would be to store every FILE in a separate folder, and then link the files to the user via the database (e.g. a user_id field on the file records). As far as merits and demerits go, besides it not being conventional, one thing to bear in mind is that if a user's files are stored in a single folder and they upload two files with the same name, the second will overwrite the first (unless of course you put them in separate folders within the user's folder). This could be a good thing or a bad thing depending on your requirements.
BTW - a slightly pedantic note: files aren't uploaded to the database (at least not normally) - they are uploaded to a filesystem, and a corresponding record is created in the database. The files don't go into the database (as I say, usually: it is possible to store files as blobs in the DB, but it's not good practice and not usual).

Dynamically generate file package from S3 assets

I have a service set up where, when the user registers, they are able to download a file to their device. The file is dynamically generated from some local information in our database, such as custom field information (username, email, web url, etc.), and from account-specific assets stored on S3 (avatar, icons, background art).
I'm not sure of the best way to handle these S3 files as part of the generation process.
Using Ruby's Tempfile class generates a file with a unique filename that doesn't match what we are expecting. Using Ruby's File class generates the files we want, but it also litters the filesystem with a bunch of files, and I worry it won't handle concurrent requests for the same assets properly. We're also using Heroku, and they tend to frown on that from what I read.
What's a best practice/recommended way to handle dynamically generating files based on a mix of local and remote assets and then presenting it to the user?
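One approach that may address both concerns, offered as a sketch and not as an established answer: give each request its own scratch directory with Dir.mktmpdir, so files can keep their expected names without colliding across concurrent requests, and the directory is removed when the block ends. The helper with_staged_assets is hypothetical, and it assumes the S3 objects have already been fetched into memory.

```ruby
require "tmpdir"

# Hypothetical helper: writes each asset under its expected filename inside a
# unique scratch directory, yields that directory, then removes it.
# `assets` maps expected filenames to already-fetched file contents.
def with_staged_assets(assets)
  Dir.mktmpdir("package-") do |dir|
    assets.each do |name, bytes|
      # File.basename strips any directory part from the expected name.
      File.binwrite(File.join(dir, File.basename(name)), bytes)
    end
    yield dir
  end
end
```

The package would be built inside the block (for example by zipping the directory) and its bytes returned, since the directory no longer exists afterwards.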

Where is the best place to save images from user uploads

I have a website that shows galleries. Users can upload their own content from the web (by entering a URL) or by uploading a picture from their computer.
I am storing the URL in the database, which works fine for the first use case, but I need to figure out where to store the actual images if a user does an upload from their computer.
Is there any recommendation here or best practice on where I should store these?
Should I save them in the appdata or content folders? Should they not be stored with the website at all because it's user content?
You should NOT store user uploads anywhere they can be directly accessed by a known URL within your site structure. This is a security risk, as users could upload .htm and .js files. Even a file with the correct extension can contain malicious code that can be executed in the context of your site by an authenticated user, allowing server-side or client-side attacks.
See for example http://www.acunetix.com/websitesecurity/upload-forms-threat.htm and What security issues appear when users can upload their own files? which mention some of the issues you need to be aware of before you allow users to upload files and then present them for download within your site.
Don't put the files within your normal web site directory structure
Don't use the original file name the user gave you. You can add a content disposition header with the original file name so they can download it again as the same file name but the path and file name on the server shouldn't be something the user can influence.
Don't trust image files - resize them and offer only the resized version for subsequent download
Don't trust mime types or file extensions, open the file and manipulate it to make sure it's what it claims to be.
Limit the upload size and time.
Depending on the resources you have to implement something like this, it is extremely beneficial to store all this stuff in Amazon S3.
Once you get the upload you simply push it over to Amazon and pop the URL in your database as you're doing with the other images. As mentioned above it would probably be wise to open up the image and resize it before sending it over. This both checks it is actually an image and makes sure you don't accidentally present a full camera resolution image to an end user.
Doing this now will make it much, much easier if you ever have to migrate/failover your site and don't want to sync gigabytes of image assets.
One way is to store the image in a database table with a varbinary field.
Another way would be to store the image in the App_Data folder, and create a subfolder for each user (~/App_Data/[userid]/myImage.png).
For both approaches you'd need to create a separate action method that makes it possible to access the images.
When accepting image uploads you need to verify the content of the file before storing it. The file extension cannot be trusted.
Checking the file's magic number (the signature bytes at the start of the file) is an easy way to verify the content.
See the stackoverflow post and see the list of magic numbers
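As an illustration of the magic number check (a sketch in Ruby, not the code from the linked post), the first bytes of the file can be compared against known signatures. The helper sniff_image_type and the short signature table are hypothetical and cover only three formats.

```ruby
# Leading signature bytes for a few common image formats.
MAGIC_NUMBERS = {
  jpeg: "\xFF\xD8\xFF".b,
  png:  "\x89PNG\r\n\x1A\n".b,
  gif:  "GIF8".b
}.freeze

# Hypothetical helper: returns :jpeg, :png, or :gif based on the file's first
# bytes, or nil when the header matches none of the known signatures.
def sniff_image_type(path)
  header = File.binread(path, 8)
  return nil if header.nil? # nothing to read (empty file)

  MAGIC_NUMBERS.each do |type, magic|
    return type if header.start_with?(magic)
  end
  nil
end
```

A matching signature only shows that the file starts like an image. It does not prove the rest of the file is harmless, so re-encoding or resizing the image is still worthwhile.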
One way of saving the file is to store it as binary data in the database; another is to use the App_Data folder.
The storage option is based on your requirement. See this post also
Set the upload limit with the maxRequestLength attribute in Web.config, like this. The size is specified in KB (51200 KB is 50 MB) and executionTimeout is in seconds:
<httpRuntime maxRequestLength="51200" executionTimeout="3600" />
You can save your trusted data in a folder alongside the htdocs/www folder (that is, outside the web root) so that no user can access it directly. If you are using Apache, you can also add .htaccess authentication to your trusted data (for .htaccess you should keep your .htpasswd file alongside the htdocs/www folder as well, not inside it).

Hiding files in public folder with random names paperclip

I have a problem. I would like to store files outside the public folder using Paperclip (to make them private). That would be very simple: just configure the :path option. But to retrieve those files (many of them images) I would need a controller method (i.e. get_file), which makes it very slow to display a list of files with the "thumb" images. I was thinking of using a random name to store the files in public, some cryptic name generated with SHA1 or something. How hard would it be for someone to access a file?
As long as you make sure that the directory is never listed and the name is really random (does not depend on the real name) and long enough (16 alphanumeric characters should be ok) this is a feasible and common method to do it.
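A short sketch of a name that meets those conditions, assuming Ruby 2.5 or later. The helper obscured_filename is hypothetical. The token comes from a cryptographically secure generator and is not derived from the real name; hashing the original name with SHA1 would not meet the "really random" condition.

```ruby
require "securerandom"

# Hypothetical helper: a random public filename that does not depend on the
# original name. 16 alphanumeric characters give roughly 95 bits of entropy.
def obscured_filename(extension)
  "#{SecureRandom.alphanumeric(16)}#{extension}"
end
```

The generated name would be stored on the record so the view can link to the file directly without going through a controller action.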
