How to store and retrieve data (files, images, etc.) in Docker?

I am new to Docker. I recently hosted a Docker image (ASP.NET Core published contents with the ASP.NET Core runtime) on Heroku, and it is working fine. My application uses LiteDB, a serverless database.
Every time I deploy new changes to Heroku (a new Docker image containing the changes), the old LiteDB data file is removed.
What I want is to deploy only the new Docker image and have it use the old LiteDB data file that was already in the Heroku container.
Is there any way to store data (files, images, etc.) in Docker and retrieve it whenever I need it? For example, in the case above, to copy my LiteDB data file to my local computer.
If I am going about this the wrong way, please show me the correct way to do it.
Thanks.

This is not something you can do on Heroku (VOLUME is unsupported).
Your only options are to store the data file somewhere else, such as Amazon S3, or to use a server-side database, such as PostgreSQL.
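As a rough illustration of the "store the data file somewhere else" option, the sketch below restores the data file from an S3 bucket before the app opens it and uploads it again afterwards. It is written in Python only to show the pattern; the question's app is ASP.NET Core, where the AWS SDK for .NET would play the same role. The function names, bucket, key, and path are hypothetical. The `s3_client` argument is assumed to be an object with boto3-style `download_file(Bucket, Key, Filename)` and `upload_file(Filename, Bucket, Key)` methods, for example `boto3.client("s3")`.

```python
import os


def restore_data_file(s3_client, bucket, key, local_path):
    """Download the saved data file so the app can open it at startup.

    Raises whatever the client raises if the object does not exist yet
    (for example on the very first deploy).
    """
    folder = os.path.dirname(os.path.abspath(local_path))
    os.makedirs(folder, exist_ok=True)
    s3_client.download_file(bucket, key, local_path)


def backup_data_file(s3_client, bucket, key, local_path):
    """Upload the current data file; return False if there is nothing to upload."""
    if not os.path.exists(local_path):
        return False
    s3_client.upload_file(local_path, bucket, key)
    return True
```

Anything written between two backups is still lost if the container is replaced in the meantime, so for data that changes often the server-side database option is probably the safer choice.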

Related

Apache Marmotta Importer from Docker

I installed Apache Marmotta with Docker using docker pull apache/marmotta on an AWS server. I am able to see Core Services (http://34.229.180.217:8080/marmotta/core/admin/import) via the Import interface in my browser. However, I am not able to import RDF files through the interface.
The files (RDF and TTL) are on both my local machine and on the server. The files are very large (over 2 GB each) and so I'd like to use KiWi Loader to bring them into Marmotta so I can run SPARQL queries against them.
Is there a parameter I can adjust in Marmotta to allow for larger file imports? Otherwise, is it possible to use the KiWi Loader through the Docker installation? Any suggestions would be great.
You can import using the local directory. Just copy your RDF/TTL files to $MARMOTTA_HOME/import. You can define your context through a file-like directory structure. For example, if you want to store your data in http://34.229.180.217:8080/marmotta/foo, just store your file in $MARMOTTA_HOME/import/foo; in that case you are using the default context. However, if you want to store it in another context, create a folder whose name is the URL-encoded context. For more details on the options that Apache Marmotta provides for importing files, check the documentation.
In my experience, I have had a lot of problems uploading big files. I think this is mostly because Apache Marmotta commits the data only after everything is in memory, which is how the KiWi implementation works. I don't know if you can upload in chunks, and using the importer.batchsize property hasn't helped much for me.
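The import-directory layout described in this answer can be sketched as a small helper that copies a file into the right sub-folder. This is only an illustration of the answer's description, not Marmotta's own tooling: the function names are hypothetical, and the exact encoding Marmotta expects for a full context URI is an assumption here (standard percent-encoding of the whole URI).

```python
import shutil
from pathlib import Path
from urllib.parse import quote


def import_target_dir(marmotta_home, context):
    """Return the import sub-directory for a context.

    A plain name such as "foo" is used as-is. A full URI is percent-encoded
    so that it becomes a single folder name (assumption, see the lead-in).
    """
    import_root = Path(marmotta_home) / "import"
    if "://" in context:
        return import_root / quote(context, safe="")
    return import_root / context


def stage_for_import(rdf_file, marmotta_home, context):
    """Copy an RDF/TTL file into the import directory for `context`."""
    target_dir = import_target_dir(marmotta_home, context)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(rdf_file, target_dir))
```

For example, `stage_for_import("data.ttl", marmotta_home, "foo")` would place the file in the `import/foo` folder under the given home directory. With a Docker installation the import directory lives inside the container, so the copy would have to target a path that the container can see.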

Move Images From Parse To S3 AWS

I need help moving the images I have from Parse to S3 on AWS. I have viewed numerous supposed guides and GitHub projects, but everything stops short of giving you all the information. One even says you need a GCS bucket set up, but gives no details on how to set one up. Someone please help me with this. I have the S3 File Adapter all set up in my index.js for the app, but none of the images are there; they are still hosted on Parse.
If you are referring to old images that were hosted with parse.com and that you want to move across to your own environment, then it can be done with the utility tool.
Get all files across all classes in a Parse database. Print file URLs
to console OR transfer to S3, GCS, or filesystem. Rename files so that
Parse Server no longer detects that they are hosted by Parse. Update
MongoDB with new file names.
https://github.com/parse-server-modules/parse-files-utils
Going forward, if you have set up your S3 bucket correctly, all new images from your app will be stored there.
https://github.com/ParsePlatform/parse-server/wiki/Configuring-File-Adapters

Creating a dashboard using csv files

I am trying to create a dashboard using CSV files, Highcharts.js, and HTML5. In a local development environment I can render the charts using CSVs both on my file system and hosted on the web. The current goal is to deploy the dashboard live on Heroku.
The CSVs will be updated manually - for now - once per day in a consistent format as required by Highcharts. The web application should be able to render the charts with these new, "standardized" CSVs whenever the dashboard page is requested. My question is: where do I host these CSVs? Do I use S3? Do I keep them on my local file system and manually push the updates to heroku daily? If the CSVs are hosted on another machine, is there a way for my application (and only my application) to access them securely?
Thanks!
Use the carrierwave_direct gem to upload the file directly from the client to an Amazon S3 bucket.
https://github.com/dwilkie/carrierwave_direct
You basically give the trusted, logged-in client a temporary key to upload the file, and nothing else, and then the client returns information about the uploaded file to your web app. Make sure you have set the upload to be private to prevent any third parties from trying to find the CSV by brute force. You will then need to create a background worker to do the actual work on the CSV file. The gem has some good docs on how to do this.
https://github.com/dwilkie/carrierwave_direct#processing-and-referencing-files-in-a-background-process
In short, in the background process you download the file temporarily to Heroku, parse it, get the data you need, and then discard the copy on Heroku and, if you want, the copy on S3 as well. This way you get around Heroku's lack of permanent file storage, and the issue of dynos being tied up by direct uploads, because there is nothing like NGINX for file uploads on Heroku.
Also make sure that the file size does not exceed the available memory of your worker dyno, otherwise it will crash. Since you don't seem to need to worry about concurrency, I would suggest https://github.com/resque/resque.
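The download, parse, and discard cycle described above can be sketched roughly as follows. The answer's own stack is Ruby (carrierwave_direct and Resque); this Python version only illustrates the pattern. `process_remote_csv`, the `download` callable, and the `max_bytes` limit are hypothetical names introduced for the example, with `download(path)` standing in for whatever fetches the S3 object to a local path.

```python
import csv
import os
import tempfile


def process_remote_csv(download, max_bytes):
    """Fetch a CSV into a temporary file, parse it, then delete the copy.

    `download(path)` is a hypothetical callable that writes the remote
    file (for example an S3 object) to `path`.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        download(tmp_path)
        size = os.path.getsize(tmp_path)
        # Refuse files that would not fit comfortably in the worker's memory.
        if size > max_bytes:
            raise ValueError(f"CSV is {size} bytes, limit is {max_bytes}")
        with open(tmp_path, newline="") as handle:
            return list(csv.DictReader(handle))
    finally:
        # Always remove the temporary copy, even if parsing failed.
        os.remove(tmp_path)
```

The size check happens after the download here for simplicity; checking the object's size before fetching it would avoid the wasted transfer.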

Uploaded files disappear during new push?

I have paperclip working just fine where I can upload files to my site, but whenever I make updates and push a new version of the site all of the files I uploaded via paperclip seem to disappear (All the information that was entered into the database remains though).
I assume the problem is that I haven't pulled the files from the live version of the site, but whenever I do a git pull it tells me everything is up to date. Is there any way for me to download the files I've uploaded? (I would prefer not to use Amazon S3 to store the files for now.)
The files you have uploaded are stored in the public folder. The public folder is not deployed with your code, so your files appear to have disappeared.
If you use Amazon S3, the images will be stored on S3, which will provide a dynamic URL to access them. Then you will be able to access the images properly.
You can also save images to Dropbox. In the following application, images are stored on Dropbox and the app runs on Heroku. You can use it as a reference:
https://github.com/aman199002/Album-App # open source app to store albums(at dropbox).
http://album-app.herokuapp.com #Url of the application running on heroku.
When you deploy your application to Heroku, your pushed code is compiled into what is called a slug, which is essentially an archive of everything needed to run your application. When you restart or scale your application, the original slug is copied to be run. However, the slug is read-only, so uploaded files exist only on the dyno that received them. If you have multiple dynos, the files are not shared across them, they do not persist when your application is restarted or you push new code, and there is no way to retrieve them.
The only way to persist files on Heroku is to use an external cloud storage solution such as Amazon S3 or Rackspace Files. Fortunately, it's very simple to have Paperclip use S3 as its storage mechanism; there's a tutorial at https://devcenter.heroku.com/articles/paperclip-s3

Paperclip + Rails with load balanced machines

How do I get Paperclip image uploads to work on a Rails app running on 8 machines (load-balanced)?
A user can upload an image on the app. The image is stored on one of the machines. The user later requests the image, but it's not found, because it's being requested from another machine.
What's the workaround for this type of problem? I can't use AWS or any cloud service; images have to be stored in-house.
Thanks.
One solution is to use NFS to mount a shared folder that will be the root of your public/system folder, or whatever you called the folder containing your Paperclip images.
There are a few things to consider to make everything work, though:
Use a dedicated server that will only contain assets. This way your hard drive(s) are dedicated to serving your Paperclip images.
NFS can be expensive. Use it only to write files from your app servers to your asset server. You'll have to configure your load balancer, reverse proxy, or web server to retrieve all images from the asset server directly, without asking an application server to do it over NFS.
A RAID system is recommended on your asset server, of course.
A second asset server with the same specs is recommended. You can make it act as a backup server and regularly rsync your Paperclip images to it. If the master asset server ever goes down, you'll be able to switch to this one.
When mounting the shared NFS folder, use the soft option, and mount via a high-speed local network connection, for example: mount -o soft 10.0.0.1:/export/shared_image_folder . If you don't specify the soft option and the asset server goes down, your Ruby instances will keep waiting for the server to come back up. Everything will be stuck, and the website will look down. I learned this one the hard way.
These are general guidelines for using NFS. I'm using it on a fairly big production website with hundreds of thousands of images, and it works fine for me.
If you don't want to use a file share like NFS, you could store the images in your database. Here is a gem that provides a :database storage type for Paperclip:
https://github.com/softace/paperclip_database
