In-memory etcd storage

I'd like to run etcd on k8s with in-memory storage. We are using etcd to store only temporary data, and there's no problem losing it if the pod(s) restart.
We are using the bitnami/etcd Helm chart for the etcd deployment. Is it possible to configure it so that /var/lib/etcd is deployed as tmpfs and not a k8s PVC?
Our VMware storage is incredibly slow, and it's causing problems with fdatasync within the etcd cluster. So, at least until we have fast SSD storage, in-memory storage would be a solution.

I think you can turn off persistence and use an emptyDir:
https://github.com/bitnami/charts/blob/master/bitnami/etcd/README.md
set persistence.enabled: false
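A minimal sketch of the values override (assuming the chart's persistence.* keys haven't changed; with persistence disabled the chart should fall back to an emptyDir for the etcd data directory):

# values.yaml (sketch)
persistence:
  enabled: false

or on the command line, something like helm install my-etcd bitnami/etcd --set persistence.enabled=false (the release name my-etcd is just an example).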

Related

Share volume in docker swarm for many nodes

I'm facing a big challenge: trying to run my app on 2 VPS nodes in Docker Swarm. Containers that use volumes need those volumes to be shared between the nodes.
My options so far are:
Use the GlusterFS plugin and mount the volume on every node using NFS. NFS introduces a single point of failure, so when something goes wrong my data is gone (that doesn't look good, but maybe I'm wrong).
Use Azure Storage and store the data as blobs (Azure Data Lake Storage Gen2). But my main problem is: how can I connect to Azure Storage using docker-compose.yaml? I would have to declare the volume in every service that uses it and declare the volume in the volumes section, and I have no idea how to do that.
The Docker documentation about this is gone. It should be here: https://docs.docker.com/docker-for-azure/persistent-data-volumes/.
Another option is to use https://hub.docker.com/r/docker4x/cloudstor/tags?page=1&ordering=last_updated, but the last update was 2 years ago, so it's probably not supported anymore.
Do I have any other options, and which way of sharing a volume between nodes is the best solution?
There are a number of ways of dealing with creating persistent volumes in Docker Swarm, none of them particularly satisfactory:
First up, a simple way is to use NFS, GlusterFS, iSCSI, or VMware to multi-mount the same SAN storage volume onto each Docker Swarm node. Services then just bind-mount volumes from a path such as /mnt/volumes/my-sql-workload.
On the one hand it's really simple; on the other hand there is literally no access control, and you can easily accidentally point services at each other's data.
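A minimal sketch of what a service looks like with that approach, assuming the shared storage is already mounted at /mnt/volumes on every node (the image and paths here are just examples):

services:
  db:
    image: mysql:8
    volumes:
      - /mnt/volumes/my-sql-workload:/var/lib/mysql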
Next, commercial Docker volume plugins for SANs. If you are lucky and possess a Pure Storage, NetApp, or other such SAN array, some of them still offer Docker volume plugins - Trident, for example, if you have a NetApp.
Third, if you are in the cloud, the legacy Swarm offerings on Azure and AWS included a built-in "cloudstor" volume driver, but you need to dig really deep to find it in their legacy offerings.
Fourth, there are a number of open-source or free volume plugins that will mount volumes from NFS, GlusterFS, or other sources, but most are abandoned or very quiet. The most active I know of is marcelo-ochoa/docker-volume-plugins.
I wasn't particularly happy with how those plugins mounted pre-existing volumes but made operations like docker volume create hard, so I made my own - but really, Swarm Cluster Volume Support with CSI Plugins is hopefully going to drop in 2021¹, which should be a solid answer to all the problems above.
¹ It's now 2022 and the next version of Docker has not yet gone live with CSI support. Still we wait.
In my opinion, a good solution could be to create a GlusterFS cluster, configure a single volume, and mount it on every Docker Swarm node (e.g. at /mnt/swarm-storage).
Then, for every Container that needs persistent storage, bind-mount a subdirectory of the GlusterFS volume inside the container.
Example:
services:
  my-container:
    ...
    volumes:
      - type: bind
        source: /mnt/swarm-storage/my-container
        target: /a/path/inside/the/container
This way, every node shares the same storage, so a given container can be scheduled on any cluster node without distinction.
You don't need any Docker plugin for a particular storage driver, because the distributed storage is transparent to the Swarm cluster.
Lastly, GlusterFS is a distributed filesystem designed to have no single point of failure, and you can cluster it across as many nodes as you like (contrary to NFS).

Persisting data in a docker swarm with glusterfs

I have a docker swarm with a lot of containers, but in particular:
mysql
mongodb
fluentd
elasticsearch
My problem is that when a node fails, the manager discards the current container and creates a new one on another node. So every time, I lose the data persisted in that particular container, even when using Docker volumes.
So I would like to create four distributed GlusterFS volumes across my cluster and mount them as Docker volumes in my containers.
Is this a correct way to solve my problem?
If it is, what type of filesystem should I use for my GlusterFS volumes?
Are there performance problems with this approach?
GlusterFS would not be the correct way to resolve this for all of your containers since Gluster does not support "structured data", as stated in the GlusterFS Install Guide:
Gluster does not support so called “structured data”, meaning live, SQL databases. Of course, using Gluster to backup and restore the database would be fine - Gluster is traditionally better when using file sizes of at least 16KB (with a sweet spot around 128KB or so).
One solution to this would be master-slave replication for the data in your databases.
MySQL and MongoDB both support this (as described here and here), as do most common DBMSs.
Master-slave replication basically means that, for 2 or more copies of your database, one will be the master and the rest will be slaves. All write operations happen on the master, and all read operations happen on the slaves. Any data written to the master is replicated across the slaves by the master.
Some DBMSs also provide a way to check if the master goes down and elect a new master if this happens, but I don't think all DBMSs do this.
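A rough sketch of that idea in a stack file. Everything here is illustrative: the node hostnames, the password, and the volume names are assumptions, and the actual replication wiring (creating a replication user and pointing the slave at the master) still has to be done inside MySQL itself:

services:
  mysql-master:
    image: mysql:8
    command: ["--server-id=1", "--log-bin=mysql-bin"]
    environment:
      MYSQL_ROOT_PASSWORD: change-me          # placeholder
    volumes:
      - master_data:/var/lib/mysql            # node-local volume on node1
    deploy:
      placement:
        constraints: ["node.hostname == node1"]
  mysql-slave:
    image: mysql:8
    command: ["--server-id=2", "--relay-log=mysql-relay"]
    environment:
      MYSQL_ROOT_PASSWORD: change-me          # placeholder
    volumes:
      - slave_data:/var/lib/mysql             # node-local volume on node2
    deploy:
      placement:
        constraints: ["node.hostname == node2"]
volumes:
  master_data:
  slave_data:

Each copy keeps its data on a node-local volume, and MySQL's own replication (rather than shared storage) keeps the copies in sync.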
You could alternatively set up a Galera Cluster, but as far as I'm aware this only supports MySQL.
I would have thought you could use GlusterFS for Fluentd and Elasticsearch, but I'm not familiar with either of those so I couldn't say for certain. I imagine it would depend on how they store any data they collect (if they collect any at all).
You might want to take a look at flocker (a volume data manager) which has integration for several container cluster managers, including Docker Swarm.
You will have to create a volume using the flocker driver for each application, as shown in the tutorial:
...
volumes:
  mysql:
    driver: "flocker"
    driver_opts:
      size: "10GiB"
      profile: "bronze"
...

Persistent storage solution for Docker on AWS EC2

I want to deploy a node-red server on my AWS EC2 cluster. I got the docker image up and running without problems. Node-red stores the user flows in a folder named /data. Now when the container is destroyed, the data is lost. I have read about several solutions where you can mount a local folder into a volume. What is a good way to deal with persistent data on AWS EC2?
My initial thoughts are to use an S3 volume or to mount a volume in the task definition.
It is possible to use a volume driver plugin with docker that supports mapping EBS volumes.
Flocker was one of the first volume managers, it supports EBS and has evolved to support a lot of different back ends.
Cloudstor is Docker's volume plugin (it comes with Docker for AWS/Azure).
Blocker is an EBS-only volume driver.
S3 doesn't work well for all file system operations as you can't update a section of an object, so updating 1 byte of a file means you have to write the entire object again. It's also not immediately consistent so a write then read might give you odd/old results.
An EBS volume can only be attached to one instance, which means that you can only run your Docker containers on one EC2 instance. Assuming that you would like to scale your solution in the future with many containers running in an ECS cluster, you need to look into EFS. It's a shared file system from AWS. The only issue is the performance degradation of EFS compared to EBS.
The easiest way (and the most common approach) is to run your container with the -v /path/to/host_folder:/path/to/container_folder option, so the container refers to the host folder and the information survives restarts and re-creation. Here is the detailed information about the Docker volume system.
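The compose equivalent of that -v flag, sketched for the node-red /data folder (the host path is a placeholder):

services:
  node-red:
    image: nodered/node-red
    volumes:
      - /home/ec2-user/node-red-data:/data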
I would use AWS EFS. It is like a NAS in that you can have it mounted to multiple instances at the same time.
If you are using ECS as your Docker host, the following guide may be helpful: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_efs.html
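If you go the EFS route with plain Docker or docker-compose (ECS task definitions have their own EFS volume configuration), one way to wire it up is an NFS-backed named volume using the built-in local driver. This is only a sketch; the file system ID, region, and mount options below are placeholders:

services:
  node-red:
    image: nodered/node-red
    volumes:
      - node_red_data:/data
volumes:
  node_red_data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=fs-12345678.efs.us-east-1.amazonaws.com,nfsvers=4.1,rw"
      device: ":/"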

What are proven options for using ZFS remotely as a volume backend for docker?

So with the introduction of volumes we no longer use data-only containers! Nice. But right now I have this nice home-grown ZFS appliance and I want to use it as a backend for my docker volumes (of course, Docker is running on other hosts).
I can export ZFS over NFS relatively easily; what are proven (i.e. battle-tested) options for using NFS as a volume backend for docker?
A google search shows me the following possibilities:
using Flocker, I could use the flocker-agent-thingie on the zfs appliance. However with Flocker being scrapped, I am concerned...
using the local volume backend and simply mounting the nfs export on the docker host -> does not scale, but might do the job (see the sketch after this question)
using a specialized volume plugin like https://github.com/ContainX/docker-volume-netshare to utilize nfs
something alike from Rancher: https://github.com/rancher/convoy
or going big and use Kubernetes and NFS as persistent storage https://kubernetes.io/docs/concepts/storage/volumes/#nfs
I have pretty extensive single-host Docker knowledge - which option is a stable one? Performance is not that important to me (the use case is a dockerized OwnCloud/NextCloud stack, so throughput is limited by the internet connection).
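For the second option above (the local volume backend pointed at the NFS export), a minimal compose sketch could look like this; the appliance hostname and export path are placeholders:

services:
  nextcloud:
    image: nextcloud
    volumes:
      - nextcloud_data:/var/www/html
volumes:
  nextcloud_data:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=zfs-appliance.example.lan,rw,nfsvers=4"
      device: ":/tank/nextcloud"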

Kubernetes NFS server is taking 100% cpu

When I create the RC for the NFS server as given in the Kubernetes NFS tutorial,
it uses 100% of the CPU of an n1-standard-1 node on GCE.
The pod logs show nothing wrong:
> kubectl logs nfs-server-*****
Serving /exports
NFS started
Is it normal for NFS to consume so much CPU?
There is an issue in the NFS image you were using.
