Longhorn Volume a lot bigger than mounted drive - influxdb

I have a small InfluxDB database running inside my K3S cluster.
As Storage Class I use Longhorn.
I know it's not optimal to run a database in Kubernetes, but this is only for some metric logging for Telegraf.
The problem is that inside the pod the mounted volume contains only about 200 MB, while Longhorn reports an actual size of 2.5 GB. The volume is only one day old. At this rate, my disk storage will be full soon.
Why is this? And is this something I can fix?

I suspect the reason for this is snapshots.
Longhorn volumes have different size "properties":
- The volume's size: this is what you define in your manifest. The actual filesystem contents can't exceed it.
- The amount of storage currently used on the volume head: this is essentially how full the volume is. Run df -h inside an attached pod or use a tool like df-pv to check usage (see the sketch after this list); this is what matters when your volume is getting full.
- Snapshot size: how big a snapshot is, each one building incrementally on top of the previous one. This can be viewed in the snapshots section of the Longhorn UI.
- Actual size: how much space the volume is really using on your host machine. This can be larger than the volume's "defined" size for a number of reasons, the most common of which is snapshots.
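For example, you can check the head usage from inside the workload pod with df, or cluster-wide with the df-pv kubectl plugin if you have it installed. The namespace, pod name and mount path below are placeholders, not values from the question:

```
# Filesystem usage as seen from inside the pod (placeholder names and paths).
kubectl exec -n monitoring influxdb-0 -- df -h /var/lib/influxdb

# Or, with the df-pv krew plugin installed, usage for every PVC at once:
kubectl df-pv
```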
Longhorn keeps a history of previous changes to a volume as snapshots. You can either create them manually from the UI or create a RecurringJob that does that for you automatically.
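A minimal sketch of such a RecurringJob, assuming a reasonably recent Longhorn release; the name, schedule, group and retention count are made up for illustration, so check the field names against the CRD version you actually run:

```
# Hedged sketch: snapshot matching volumes daily and keep only the newest 3.
kubectl apply -f - <<'EOF'
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-snapshot          # illustrative name
  namespace: longhorn-system
spec:
  cron: "0 3 * * *"             # once a day at 03:00
  task: snapshot                # take snapshots (not backups)
  groups:
    - default                   # applies to volumes in the "default" group
  retain: 3                     # keep only the 3 most recent snapshots
  concurrency: 1
EOF
```

The retain field is what keeps the snapshot chain from growing without bound.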
Having many snapshots is problematic when a lot of data is (re-)written to a volume. Imagine the following scenario:
1. Write a 1 GB file to the volume.
2. Take a snapshot (this snapshot is now 1 GB big).
3. Delete the file (the volume head only records the "file deleted" information; the previous snapshot's size is unaffected).
4. Write a new 1 GB file. The volume head is now 1 GB (the new file) plus the information from step 3, BUT your previous snapshot still holds another gigabyte. That way, your actual size is already twice the space currently used inside the volume.
There's also an ongoing discussion about reclaiming space automatically.
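Until then, the usual manual route is to delete old snapshots in the Longhorn UI and, on Longhorn versions that support filesystem trim, run fstrim against the mount so blocks freed inside the volume head are handed back to the host. This is a hedged sketch; the pod name, namespace and mount path are placeholders, and it assumes fstrim exists in the container image, that the container has the privileges to issue discards, and that your Longhorn version allows trimming:

```
# Hedged sketch: trim the mounted filesystem so freed blocks are reclaimed.
# Requires filesystem-trim support in your Longhorn version; pod name,
# namespace and mount path are placeholders.
kubectl exec -n monitoring influxdb-0 -- fstrim -v /var/lib/influxdb
```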

Related

Increasing the storage space of a docker container on Windows to 2-3TB

I'm working on a Windows computer with 5 TB of available space, building an application to process large amounts of data that uses Docker containers to create replicable environments. Most of the data processing is done in parallel using many smaller Docker containers, but the final tool/container requires all the data to come together in one place. The output area is mounted to a volume, but most of the data is just copied into the container. This will be multiple TBs of storage space. RAM luckily isn't an issue in this case.
Willing to try any suggestions and make what changes I can.
Is this possible?
I've tried increasing Docker's disk space using .wslconfig, but this doesn't help.
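Purely as an illustration of the mounted-versus-copied distinction the question describes: bind-mounting both the inputs and the output area from the host keeps the multi-TB data out of the container's writable layer entirely. Whether that fits this workflow is an open question, and the image name and paths are hypothetical:

```
# Hedged sketch (hypothetical image and paths): mount the large inputs and the
# output area from the host so the container's writable layer stays small.
docker run --rm \
  -v "D:/data/inputs:/data/inputs:ro" \
  -v "D:/data/output:/data/output" \
  final-tool:latest
```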

Artifactory Docker Registry Free /var partition space

Due to Docker images being pushed to the JFrog Docker registry in an uncontrolled way, my /var partition is currently FULL.
Since I am able to SSH into the machine, I wanted to know whether I can directly delete the images at the /var location, as I am not able to start the Artifactory service due to insufficient space.
The Docker images are stored as checksum binary files in the filestore. You will have no way of knowing which checksum belongs to which image, and since images often share the same layers, deleting even a single one can corrupt several images.
For the short term, I recommend moving (not deleting) a few binary files to allow you to start your registry back up. You can also delete the backup directory, since backup is on by default, you may not actually want or need it, and it occupies a lot of space. Once that is done, start Artifactory and delete enough images to clear space OR, preferably, expand the filestore size OR, better yet, move it to a different partition so you don't mix the app/OS with the application data. In any case, once you have more free space, move the binary files back to their original location.
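A rough sketch of that "move, don't delete" step; the filestore path and spare mount point are assumptions (the default location varies between Artifactory versions), and it assumes the default checksum-sharded layout with two-character subdirectories, so that moving whole shard directories keeps the layout intact and they can be put back exactly where they were:

```
# Hedged sketch; verify your real filestore location first.
FILESTORE=/var/opt/jfrog/artifactory/data/filestore   # assumed default path
SPARE=/mnt/spare/filestore-overflow                   # assumed spare partition
mkdir -p "$SPARE"

# Temporarily move a couple of shard directories wholesale (do NOT delete).
mv "$FILESTORE/00" "$FILESTORE/01" "$SPARE/"

# ... start Artifactory, clean up images or grow/relocate the filestore ...

# Once space is available again, restore the binaries to their original place.
mv "$SPARE/00" "$SPARE/01" "$FILESTORE/"
```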

How do I limit container disk usage without evicting?

I'm trying to use Kubernetes on GKE (or EKS) to create Docker containers dynamically for each user and give users shell access to these containers. I want to be able to set a maximum limit on the disk space a container can use (or at least on one of the folders within each container), but I want to implement this in such a way that the pod isn't evicted if the size is exceeded. Instead, a user would ideally get an error when trying to write more data to disk than the specified limit (e.g. "Disk quota exceeded").
I'd rather not have to use a Kubernetes volume that's backed by a gcePersistentDisk or an EBS volume to minimize the costs. Is there a way to achieve this with Kubernetes?
Assuming you're using an emptyDir volume on Kubernetes, which is temporary disk space attached to your pod, you can set a size limit for it.
See the answer at https://stackoverflow.com/a/45565438/54929; this question is likely a duplicate.
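For reference, a minimal sketch of that sizeLimit field; the pod name, image and limit value are placeholders, and it's worth testing how your cluster behaves once the limit is hit, since enforcement details differ between setups:

```
# Hedged sketch (placeholder names, image and limit): an emptyDir with a cap.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: user-shell
spec:
  containers:
    - name: shell
      image: ubuntu:22.04
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: scratch
          mountPath: /home/user
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi      # cap the scratch space at roughly 1 GiB
EOF
```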

What is the difference between Volume and Partition?

What is the difference between partition and volume?
Kindly give an analogy if possible since I am unable to understand the difference between them.
Partitions -
Storage media (DVDs, USB sticks, HDDs, SSDs) can all be divided into partitions, and these partitions are identified by a partition table.
The partition table is where the partition information is stored; the information stored here is basically where each partition starts and where it finishes on the disk platter.
Volumes -
A Volume is a logical abstraction from physical storage.
Large disks can be partitioned into multiple logical volumes.
Volumes are divided up into fixed-size blocks, or clusters of blocks.
We don't see the partition, as this is handled by the file system controller, but we do see volumes, because they are logical and are presented by a GUI with a hierarchical structure and a human interface. When we request to see a file, the request runs through a specific order to retrieve that information from within the volume on the partition:
1. The application creates the file I/O request.
2. The file system creates a block I/O request.
3. The block I/O driver accesses the disk.
Hope this helps. If any part needs clearing up, let me know and I'll do my best to clarify it further.
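To make the distinction concrete with a Linux-flavoured illustration (device names, sizes and mount points are hypothetical): one physical disk carries partitions described by the partition table, and a volume manager such as LVM then exposes logical volumes on top of them, which is the layer users actually see mounted:

```
# Hypothetical device names and sizes, for illustration only.
lsblk /dev/sda
#   sda    500G  disk
#   ├─sda1 100G  part   <- partitions, recorded in the partition table
#   └─sda2 400G  part

# LVM builds logical volumes on top of a partition: the "logical abstraction
# from physical storage" described above.
pvcreate /dev/sda2                     # register the partition with LVM
vgcreate data_vg /dev/sda2             # group it into a volume group
lvcreate -L 200G -n data_lv data_vg    # carve out a 200 GB logical volume
mkfs.ext4 /dev/data_vg/data_lv         # put a filesystem on the volume
mount /dev/data_vg/data_lv /mnt/data   # this mounted volume is what users see
```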

Docker images across multiple disks

I'm getting going with Docker, and I've found that I can put the main image repository on a different disk (symlink /var/lib/docker to some other location).
However, now I'd like to see if there is a way to split that across multiple disks.
Specifically, I have an old SSD that is blazingly fast to read from, but doesn't have too many writes left until it kicks the can. It would be awesome if I could store the immutable images on here, then have my writeable images on some other location that can handle the writes.
Is this something that is possible? How do you split up the repository?
Maybe you could do this using the AUFS driver and some trickery such as moving layers to the SSD after initially creating them and pointing symlinks at them - I'm not sure, I never had a proper look at how that storage driver worked.
With devicemapper thinp, btrfs, and OverlayFS this isn't possible, AFAICT:
The Docker dm-thinp and btrfs drivers both build layers one on top of the other using block-device snapshot mechanisms. Your best bet here would be to include the SSD in the storage pool and rely on some ability to migrate the r/o snapshots to a specific block device that is part of the pool. I doubt this exists, though.
The OverlayFS driver stacks layers by hard-linking files in independent directory structures. Hard-links only work within a filesystem.
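Relatedly, if relocating all of /var/lib/docker to one other disk is enough (the symlink approach mentioned in the question), the daemon's data-root option is the supported way to do it; it still keeps read-only and writable layers on the same filesystem, and the target path below is an example:

```
# Hedged sketch: move Docker's storage wholesale to another disk via data-root.
# This does not split image layers from container layers; example paths only.
sudo systemctl stop docker
sudo rsync -a /var/lib/docker/ /mnt/ssd/docker/
# Merge rather than overwrite if /etc/docker/daemon.json already has content.
echo '{ "data-root": "/mnt/ssd/docker" }' | sudo tee /etc/docker/daemon.json
sudo systemctl start docker
```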
