Carrierwave uploader consuming memory - ruby-on-rails

I am using the CarrierWave uploader v0.10.0 in my Rails project (on RHEL) to upload a 310 MB zip file to the server. While the file is being transferred, I can see the server's available memory decreasing. After the transfer finishes and the call returns to the controller to save the uploaded file, I see roughly three times the zip file's size (around 930 MB) deducted from the available file system space. I am not able to figure out why this is happening.
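For context on where that 3x can come from: by default CarrierWave copies the Rack/Rails multipart tempfile into its cache directory and then copies it again into the store directory, so up to three full-size copies of the zip can exist on disk at once. Below is a minimal sketch of an uploader configured to move files instead of copying them; `ZipUploader` and the paths are hypothetical, and it assumes file storage with the cache and store on the same filesystem:

```ruby
# app/uploaders/zip_uploader.rb -- hypothetical uploader; names and paths are illustrative
class ZipUploader < CarrierWave::Uploader::Base
  storage :file

  # Keep the cache on the same filesystem as the store so a "move" is a cheap rename.
  def cache_dir
    '/data/uploads/tmp'
  end

  def store_dir
    "/data/uploads/#{model.class.to_s.underscore}/#{model.id}"
  end

  # Move the file through cache -> store instead of copying it,
  # so only one full-size copy of the zip lives on disk at a time.
  def move_to_cache
    true
  end

  def move_to_store
    true
  end
end
```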

Related

How to upload a file larger than 2 GB using fsharp.data

I followed this document for handling multipart form data.
I can upload fine with file sizes below 2 GB. When the size is greater than 2 GB, the application sends nothing.
Does anyone have experience uploading large files with fsharp.data?

Upload files of more than 2 GB from EC2 to S3

I'm working on a Rails project and am using aws-sdk 1.66.0. I want to upload a folder to S3, but since I can't actually do that directly, I'm grabbing the zip file and unzipping it on my EC2 server. Once the files are unzipped, I recursively upload them to S3. This article describes something similar, and I have followed it.
Everything works great for files of 250 MB and 350 MB. However, when trying to upload a much larger file of around 2 GB, I am facing an error. I see the file being uploaded to EC2 successfully, but it does not get uploaded to S3. Sometimes the logs indicate the file is being uploaded to S3, but it never actually arrives.
At other times, there are no logs at all.
I'm not sure what I'm missing or what extra steps are needed for large file uploads.
I'd be grateful for any help. Please let me know if any details are required. Thanks in advance.
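Not an authoritative fix, but with aws-sdk 1.x a multi-GB object is usually sent via multipart upload, which is generally more reliable than one huge PUT. A rough sketch against the v1 API; the bucket, key, and path below are placeholders:

```ruby
require 'aws-sdk'   # aws-sdk ~> 1.66
require 'pathname'

s3     = AWS::S3.new
object = s3.buckets['my-bucket'].objects['unzipped/big_file.dat'] # placeholders

# S3Object#write switches to a multipart upload once the file exceeds
# :multipart_threshold, retrying individual parts instead of the whole file.
object.write(
  Pathname.new('/mnt/uploads/big_file.dat'),
  multipart_threshold: 100 * 1024 * 1024,  # go multipart above ~100 MB
  content_type:        'application/octet-stream'
)
```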

How to resume a large file upload (which uses chunking)

I am looking at using this file upload process in my ASP.NET MVC website: http://www.codeproject.com/Articles/830704/Gigabit-File-uploads-Over-HTTP?fid=1870992
It's working great so far. I can see the file being uploaded in chunks on the server. Essentially, they are placed in my upload directory, in a subdirectory named after the file. In that directory I see the chunks, such as "LargeFile.00000001.csv.tmp", "LargeFile.00000002.csv.tmp", etc.
If the upload is canceled, I can still see the chunks on the server. Question: how can I "resume" the upload later on?
I could look at the folder name and chunk file names and work out where I left off.
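The article's code is ASP.NET, but the resume logic itself is framework-agnostic: list the chunk files already on the server, find the first missing index, and tell the client to restart from there. A rough sketch of that check (written in Ruby purely to illustrate the idea), assuming the chunk-naming scheme shown above:

```ruby
# Given the upload directory and original file name (e.g. "LargeFile.csv"),
# work out which chunk the client should re-send first.
def next_chunk_index(upload_dir, file_name)
  base    = File.basename(file_name, File.extname(file_name))    # "LargeFile"
  chunks  = Dir.glob(File.join(upload_dir, file_name, "#{base}.*.tmp"))
  indices = chunks.map { |c| c[/\.(\d+)\./, 1].to_i }.sort

  # Resume at the first gap; the article numbers chunks starting from 1.
  expected = 1
  indices.each do |i|
    break unless i == expected
    expected += 1
  end
  expected
end
```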

Flow for downloading and unzipping files for Heroku and S3?

I'm working with Apple's iTunes EPF data files. Every day I'll need to download, unzip, and then process 1-3 GB of data delivered as .tbz files.
I've got a Rails app, hosted on Heroku, with most asset storage being taken care of on S3.
But what I'm having trouble with is the flow for getting the EPF files from Apple.
There are 3 files I'll be downloading, each a .tbz file ranging in size from ~20 MB up to 1 GB.
Heroku doesn't have a way to reliably store files, so I assume I need to get the files directly to S3? Then would I somehow unzip them there?
That's where I'm hitting a snag. I know how to get the files from Apple and onto S3, but decompressing them is where I'm lost.
And since the data files can be pretty large, minimizing the transfer to and from S3 is critical to keeping costs down.
Is there a service that would let me download the Apple files to its servers, decompress them, and then upload the necessary files to S3?
Heroku's file system is ephemeral, but you can still write to /tmp as temporary scratch space: download, unzip, do whatever processing you need, re-package (if needed), and then upload to S3. Because of automatic (or manual) dyno restarts, just make sure your service knows how to resume gracefully if interrupted.
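For illustration, here is a minimal sketch of that /tmp-based flow. The URL, bucket, and credential handling are placeholders (the real EPF feed requires Apple credentials), and it shells out to the system `tar` for the .tbz extraction:

```ruby
require 'aws-sdk'    # aws-sdk v1.x
require 'open-uri'
require 'pathname'
require 'tmpdir'

EPF_URL = 'https://example.com/epf/full.tbz'   # placeholder; real EPF URLs need Apple credentials
BUCKET  = 'my-epf-bucket'                      # placeholder

Dir.mktmpdir('epf', '/tmp') do |work_dir|
  archive = File.join(work_dir, File.basename(URI(EPF_URL).path))

  # Stream the download to disk rather than loading 1-3 GB into memory.
  URI.open(EPF_URL) do |remote|
    IO.copy_stream(remote, archive)
  end

  # A .tbz is a bzip2-compressed tar; let the system tar handle it.
  system('tar', 'xjf', archive, '-C', work_dir) or raise 'extraction failed'
  File.delete(archive)

  # Push the extracted files to S3; multipart upload kicks in for the big ones.
  s3 = AWS::S3.new
  Dir.glob(File.join(work_dir, '**', '*')).each do |path|
    next unless File.file?(path)
    key = path.sub("#{work_dir}/", '')
    s3.buckets[BUCKET].objects[key].write(Pathname.new(path))
  end
end
# /tmp contents vanish on dyno restart, which is fine for scratch space.
```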

How can we upload a large file in chunks in Rails?

I am trying to upload a zip file of 350 MB - 500 MB to the server. It gives an "ENOSPC" error.
Is it possible to upload the file in chunks and receive it on the server as one file?
or
Can I use a custom location for tmpfs, so that it is independent of the system tmp? In my case /tmp is only 128 MB.
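On the second option: Rack's multipart parser buffers the upload through Ruby's Tempfile, which resolves its directory via Dir.tmpdir, and that in turn honours the TMPDIR environment variable. So you can point uploads at a bigger partition without touching Rack itself. A small sketch, with the path being an example mount:

```ruby
# config/boot.rb (or export TMPDIR in the app's environment before boot)
# Rack buffers multipart uploads through Tempfile, which writes to Dir.tmpdir.
# Pointing TMPDIR at a partition with enough room avoids ENOSPC on a 128 MB /tmp.
ENV['TMPDIR'] ||= '/mnt/bigdisk/tmp'   # example path; pick a mount with space

require 'fileutils'
FileUtils.mkdir_p(ENV['TMPDIR'])
```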
Why not use a web server upload feature like nginx-upload or apache-upload?
Not sure what it is called in Apache, but I guess Apache has one too.
If you are using Nginx, there is also nginx-upload-progress, which can be helpful if you want to track the progress of the upload.
Hope this helps.
