Remove unused files from ActiveStorage+DirectUpload - ruby-on-rails

Consider the following example:
I have a form that includes a multiple files input;
The input file uses ActiveStorage and DirectUpload to upload files automatically as soon as they are included;
After adding some files they are uploaded automatically;
I never click the submit button so those files are never used nor accessible anywhere;
Does Rails support some built-in mechanism for removing these files or is something we have to implement ourselves?
Seems rather trivial to perform a DoS by continuously uploading files until something breaks.
Update 1
Forgot to mention that the example I'm following uses a 3rd party library (Dropzone in this case) and follow the example from the official documentation.
According to the documentation after a file upload we inject a hidden input field with the id of the uploaded blob.

I think the answer of Chiperific is good, since DirectUpload is executed in the submit there is little time for the requests to fail.
I mention requests because as far as i understand it, the process is like this:
The user selects a file from his computer and fills the rest form.
DirectUpload uploads the file to the storage.
The backend receives the body and updates the attachment and either creates or updates a model.
So, what happens if the file upload is successful but model validation is not? you would end up with a file in the storage without his corresponding model or with dirty one.
More information here: https://github.com/rails/rails/issues/31985
The answer then is no, rails does not have a mechanism of removing this files automatically. I guess you could check if the model creation/update was successful and remove the file manually if not.

I think your premise is incorrect.
The input file uses ActiveStorage and DirectUpload to upload files automatically as soon as they are included;
According to the docs:
Active Storage, with its included JavaScript library, supports uploading directly from the client to the cloud.
and
That's it! Uploads begin upon form submission.
So the point of Direct Storage seems to be to bypass some Rails ActiveStorage things and go straight to the service. BUT, it still doesn't happen until the form is submitted.
The example on the non-edge docs shows the user clicking "Submit" before the files are actually uploaded.

Related

Rails upload excel file, validate and save

I am working on a rails engine that uploads a excel file, validates it and if there is no error than it will save it to database.
Now when ever a user mounts the engine and than go to the route provided by engine. He will have a form to upload the excel file. There are two buttons on page, i.e, upload and validate.
Once a user choose the file and when he click on upload i want that file only gets uploaded and don't get saved in db. Once i get the message the file is uploaded successfully, than i will validate the file. If it is a valid excel file with valid data than it will be saved into db. Now i am not getting how to go about it. I have seen this Railscasts video on uploading csv and excel file but here he is performing validation and save operation with import action but i want validation and save operation when user clicks on validate action. This Questions seems similar to my problem but i am not getting how do i access that uploaded file. I don't want that file to be saved in database. I mean when a user click on upload button that file gets only uploaded not saved. Than i will validate that file and save it's content to db.
This may seem very easy and simple questions for some experts but i am very new to rails and i am not sure how to go about it.
Someone please help me with a sample code, so that i can understand the workflow. Also note that both upload and validate actions are on same page. So when a file gets uploaded it needs to be stored somewhere temporarily, this is the first problem i am facing. I can do all the task if someone can tell me workflow with a sample code about uploading excel file. I am only having problem here that as both upload and validate action are on same page, so after upload request it needs to be on that page so that i can validate that file.
Any help would be appreciated, I am very beginner at rails and really confused here.
Two options:
Write code to upload the file and save to DB with a validated column set to false. Then the 'validate' button will locate the unvalidated file, validate it and set validated to true. You could have a periodic job deleting unvalidated files of a certain age. If you do this, use a helper gem like Paperclip.
Forego file upload frameworks and just manually save uploaded files to Tempfile.new 'spreadsheet'. This guide takes you through how to do that. Save that filename to session and use it to validate at a later point. When you're finally ready to persist to DB, again, consider using a helper gem.

Rails s3_direct_upload without file field

My website generates a file in javascript (audio recording) and I then want it to be uploaded to Amazon S3.
I first managed to get the uploading part working by sending the generated file to my server, where it is uploaded. However I would like now to upload the file directly to S3, without going through my server.
So I started to use the s3_direct_upload gem, which works great when using a file_field. However my file is generated by the javascript and :
- The value of a file field has to be set by the user for security reasons
- I do not want the user to have to interact with the upload
I tried to play with the S3Uploader class and to directly add data, without any success for now, it seems that I do not use the correct method.
Does anyone has any idea on how to achieve S3 direct upload without a file field ?
Thanks
Never mind, I found out that the S3Uploader class used by the s3_direct_upload gem has the same methods as the jQuery-File-Upload from which it is derived.
So one can call $("#s3_uploader").fileupload('send', {files: [f]});
And the f File will be uploaded to S3 directly

where is the best place to save images from users upload

I have a website that shows galleries. Users can upload their own content from the web (by entering a URL) or by uploading a picture from their computer.
I am storing the URL in the database which works fine for the first use case but I need to figure out where to store the actual images if a user does a upload from their computer.
Is there any recommendation here or best practice on where I should store these?
Should I save them in the appdata or content folders? Should they not be stored with the website at all because it's user content?
You should NOT store the user uploads anywhere they can be directly accessed by a known URL within your site structure. This is a security risk as users could upload .htm file and .js files. Even a file with the correct extension can contain malicious code that can be executed in the context of your site by an authenticated user allowing server-side or client-side attacks.
See for example http://www.acunetix.com/websitesecurity/upload-forms-threat.htm and What security issues appear when users can upload their own files? which mention some of the issues you need to be aware of before you allow users to upload files and then present them for download within your site.
Don't put the files within your normal web site directory structure
Don't use the original file name the user gave you. You can add a content disposition header with the original file name so they can download it again as the same file name but the path and file name on the server shouldn't be something the user can influence.
Don't trust image files - resize them and offer only the resized version for subsequent download
Don't trust mime types or file extensions, open the file and manipulate it to make sure it's what it claims to be.
Limit the upload size and time.
Depending on the resources you have to implement something like this, it is extremely beneficial to store all this stuff in Amazon S3.
Once you get the upload you simply push it over to Amazon and pop the URL in your database as you're doing with the other images. As mentioned above it would probably be wise to open up the image and resize it before sending it over. This both checks it is actually an image and makes sure you don't accidentally present a full camera resolution image to an end user.
Doing this now will make it much, much easier if you ever have to migrate/failover your site and don't want to sync gigabytes of image assets.
One way is to store the image in a database table with a varbinary field.
Another way would be to store the image in the App_Data folder, and create a subfolder for each user (~/App_Data/[userid]/myImage.png).
For both approaches you'd need to create a separate action method that makes it possible to access the images.
While uploading images you need to verify the content of the file before uploading it. The file extension method is not trustable.
Use magic number method to verify the file content which will be an easy way.
See the stackoverflow post and see the list of magic numbers
One way of saving the file is converting it to binary format and save in our database and next method is using App_Data folder.
The storage option is based on your requirement. See this post also
Set upload limit by setting maxRequestLength property to Web.Config like this, where the size of file is specified in KB
<httpRuntime maxRequestLength="51200" executionTimeout="3600" />
You can save your trusted data just in parallel of htdocs/www folder so that any user can not access that folder. Also you can add .htaccess authentication on your trusted data (for .htaccess you should kept your .htpasswd file in parallel of htdocs/www folder) if you are using apache.

Alternative to X-sendfile in Apache for sending file given a URL?

I'm writing a Rails application that serves files stored on a remote server to the end user.
In my case the files are stored on S3 but the user requests the file via the Rails-application (hiding the actual URL). If the file was on my servers local file-system, I could use the Apache header X-Sendfile to free up the Ruby process for other requests while Apache took over the task of sending the file to the client. But in my case - where the file is not on the local file-system, but on S3 - it seems that I'm forced to download it temporarily inside Rails before sending it to the client.
Isn't there a way for Apache to serve a "remote" file to the client that is not actually on the server it self. I don't mind if Apache has to download the file for this to work, as long as I don't have to tie up the Ruby process while it's going on.
Any suggestions?
Thomas, I have similar requirements/issues and I think I can answer your problem. First (and I'm not 100% sure you care for this part), hiding the S3 url is quite easy as Amazon allows you to point CNAMES to your bucket and use a custom URL instead of the amazon URL. To do that, you need to point your DNS to the correct amazon URL. When I set mine up it was similar to this: files.domain.com points to files.domain.com.s3.amazonaws.com. Then you need to create the bucket with the name of your custom URL (files.domain.com in this example). How to call that URL will be different depending on which gem you use, but a word of warning was that the attachment_fu plugin I was using was incorrectly sending me to files.domain.com/files.domain.com/name_of_file.... I couldn't find the setting to fix it, so a simple .sub method for the S3 portion of the plugin fixed it.
On to your other questions, to execute some rails code (like recording the hit in the db) before downloading you can simply do this:
def download
file = File.find(...
# code to record 'hit' to database
redirect_to 3Object.url_for(file.filename,
bucket,
:expires_in => 3.hours)
end
That code will still cause the file to be served by S3, but and still give you the ability to run some ruby. (Of course the above code won't work as is, you will need to point it to the correct file and bucket and my amazon keys are saved in a config file. The above is also using the syntax for the AWS::S3 gem - http://amazon.rubyforge.org/).
Second, the Content-Disposition: attachment issue is a bit more tricky. Hopefully, your situation is a bit more simple than mine and the following solution can work. Assuming the object 'file' (in this example) is the correct S3 object, you can set the disposition to attachment by
file.content_disposition = "attachment"
file.save
The above code can be executed after the file exists on the S3 server (unlike some other headers and permissions), which is nice and it can also be added when you upload the file (syntax depends on your plugin). I'm still trying to find a way to tell S3 to send it as an attachment and only when requested (not every time), and if you find that, please let me know your solution. I need to be able to sometimes download it and other times save embed images (for example) into HTML. I'm not using the above mentioned redirect but fortunately it seems that if you embed (such as a HTML image tag) a file with the content-disposition/attachment header, and the browser still displays the image normally (but I haven't throughly tested that across enough browsers to send it in the wild).
Hope that helps! Good luck.

Posting files to Rails XML API

I have a Rails application setup to receive file attachments using Paperclip.
Now I need to allow a .net/C# cell phone application to post files along with the XML in the same way (or some other way if necessary: they could encode the image as base64 and send - they tried that initially - including the binary data in the tag that would normally be a file field in the web application, but it did not work.
I have found nothing in the way of documentation and wondering if anybody has experience or advice.
Surprising that there is apparently no documentation for doing this anywhere to be found. I ended up stumbling across a document on the Basecamp website describing how their file attachment process works for API users and used it as a guideline.
http://developer.37signals.com/basecamp/
with help from this article about posting files:
http://www.codevil.com/index.php/2009/05/23/posting-and-getting-files-in-rubyrails/
I modified my initial setup so that, rather than passing the tag in the XML, they first post a file and receive an file ID in response.
Then they post the XML with that reference and their .
Then I use before_validation and after_save callbacks to set the file with Paperclip, and remove the tmp file after the save.

Resources