InfluxDB retention policy is set but not working

I set a retention policy on an InfluxDB database, but it does not seem to be working: old data is not actually deleted.
I expect data older than 1 month to be dropped.

I figured it out. The old data is still in the old retention policy, "autogen". I set "autogen" to 30d instead of "0s" (which means infinite), and it worked. Basically, when you create a new retention policy on an existing database, new data is stored in that retention policy from then on (provided it is the default policy). The old data stays in the old retention policy.
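As a rough sketch of the fix described above, the following helper only builds the InfluxQL statement that shortens the duration of an existing policy; it does not talk to a server. The database name "mydb" is a placeholder, not something taken from the question, and the statement assumes InfluxDB 1.x InfluxQL.

```python
def alter_retention_statement(database: str, policy: str, duration: str) -> str:
    """Build an InfluxQL statement that changes the duration of an existing policy."""
    return f'ALTER RETENTION POLICY "{policy}" ON "{database}" DURATION {duration}'


# "mydb" is a hypothetical database name; "autogen" and 30d come from the answer.
statement = alter_retention_statement("mydb", "autogen", "30d")
print(statement)
```

The resulting statement could then be run in the influx shell. Running SHOW RETENTION POLICIES ON "mydb" afterwards is one way to check which duration each policy has.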

Related

CDC log retention for Informix

Current situation:
We use IBM Data Replication (11.4) to replicate data from an Informix database to a SQL Server database.
We now have an instance with 45 different subscriptions. On the Informix side, we have 30 different log files.
The problem:
When we want to "refresh" all subscriptions at once, we run into trouble because some logs are no longer available: they have been overwritten.
The strange part is that these logs were not 100 percent full, but only approximately 0.5% full.
I don't know exactly when a new log is created.
Is there any way to change the setting that controls when a new log file is created, for example so that a new log file is created only when the current one is 100% full? Or do you have another solution to this problem?
We have found the problem:
The parameter "log_api_switch_log_num_pages" has to be defined. It controls log switching after a refresh.
See details here:
http://www-01.ibm.com/support/docview.wss?uid=swg21997830

How to release the deleted shard storage in InfluxDB?

I changed the retention policy of a database to keep data for only one day, and after that I dropped all shards created by the autogen RP, but InfluxDB still uses a large amount of storage in the /var/lib/influxdb/data/<DB NAME>/_series folder, and it keeps growing.
How can I release the storage related to the deleted points?

DB folder using a lot of space, causing a space issue

I have a Grafana Windows server where we have integrated Hyper-V snapshot information as well as CPU and memory usage of the HVs, etc. I can see the folder below on our Grafana Windows server:
C:\InfluxDB\data\telegraf\autogen
Under this autogen folder, I can see multiple subfolders with .tsm files. A new file is created every 7 days, and each folder is around 4 to 5 GB. There are many files in this autogen folder, from 2nd Feb 2017 to 14 Mar 2018, using around 225 GB of space.
What you see:
autogen is the default Retention Policy (RP), created automatically by InfluxDB, and it has an infinite data retention duration. All datapoints in InfluxDB are logically stored in shards. Physically, shard data is compressed and stored in .tsm files. Shards are grouped into shard groups. Each shard group covers a specific time range, defined by the so-called shard duration, and stores the datapoints belonging to that time interval. By default, for an RP with a retention duration > 6 months, the shard group duration is set to 7 days.
For more info, see the docs on the storage engine.
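To make the default above concrete, here is a minimal sketch of how the default shard group duration depends on the retention duration. Only the "> 6 months gives 7 days" tier is stated in this answer. The two shorter tiers follow the InfluxDB 1.x documentation as I understand it, and treating 6 months as 180 days is an assumption, so the exact boundaries may differ from the real implementation.

```python
from datetime import timedelta

# Assumption: "6 months" is approximated as 180 days.
SIX_MONTHS = timedelta(days=180)


def default_shard_group_duration(retention: timedelta) -> timedelta:
    """Approximate the default shard group duration for a retention duration.

    timedelta(0) stands for an infinite retention duration ("0s" in InfluxQL).
    """
    if retention == timedelta(0) or retention > SIX_MONTHS:
        return timedelta(days=7)   # long or infinite retention: weekly shard groups
    if retention >= timedelta(days=2):
        return timedelta(days=1)   # medium retention: daily shard groups
    return timedelta(hours=1)      # short retention: hourly shard groups


# autogen has an infinite duration, which matches the weekly files in the question.
print(default_shard_group_duration(timedelta(0)))
```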
Regarding your questions:
"Is there anyway we can shrink the size of autogen file?"
Probably not. The only thing you can do is rely on InfluxDB's internal compression. According to the documentation, compression may improve if you increase the shard duration.
Note, however, that because InfluxDB drops whole shards rather than separate datapoints, increasing the shard duration means your data is kept until the whole shard falls outside the current retention duration, and only then is it dropped. If you have an infinite retention duration, though, this doesn't matter. This leads us to the second question.
"Is it possible to delete the old file under autogen folder?"
If you can afford to lose old data, or can't afford much storage space, InfluxDB lets you specify a data Retention Policy (RP), as already mentioned above. Basically, all your measurements are associated with a specific RP, and the data is deleted as soon as its retention duration ends. So if you specify an RP of 1 year, InfluxDB will automatically delete all datapoints older than now() - 1 year. An RP is the standard (and pretty obvious) way of dealing with storage issues. A logical continuation of the RP idea is to group and aggregate your data over longer discrete time intervals as it ages (downsampling). In InfluxDB this can be achieved with Continuous Queries (CQ). You can read more about data retention and downsampling in the docs.
In conclusion, storage limitations are inevitable, and properly configured retention policies are the way to go.
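The point about whole shards being dropped can be illustrated with a simplified model. This is a sketch, not InfluxDB's actual enforcement code: it assumes a shard group is removed once its end time is at or before now minus the retention duration, and the dates are made up for the example.

```python
from datetime import datetime, timedelta


def expired_shard_groups(shard_groups, retention, now):
    """Return the (start, end) shard groups whose whole range is past retention."""
    cutoff = now - retention
    return [(start, end) for start, end in shard_groups if end <= cutoff]


# Hypothetical weekly shard groups and a 30-day retention duration.
groups = [
    (datetime(2018, 2, 1), datetime(2018, 2, 8)),
    (datetime(2018, 2, 8), datetime(2018, 2, 15)),
    (datetime(2018, 2, 15), datetime(2018, 2, 22)),
]
now = datetime(2018, 3, 14)
expired = expired_shard_groups(groups, timedelta(days=30), now)
print(expired)
```

With these inputs the cutoff is 12 Feb 2018, so only the first group is dropped. The second group still holds points from 8 to 12 Feb that are older than 30 days, and they stay on disk until that whole group ages out.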

Batch write with different retention policies in InfluxDB

I am trying to write batch data to InfluxDB with different retention policies, but I could not find a way to do it other than grouping the points by retention policy and sending each group in a separate batch.
It is currently not possible to write data with different retention policies in the same batch.
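Given that limitation, the grouping workaround from the question might look like the sketch below. It only splits the points into one batch per retention policy and does not send anything. The policy names and line-protocol points are invented for illustration, and the (retention_policy, line) pair format is an assumption of this example rather than any client library's API.

```python
from collections import defaultdict


def group_by_retention_policy(points):
    """Split (retention_policy, line) pairs into one batch per retention policy."""
    batches = defaultdict(list)
    for retention_policy, line in points:
        batches[retention_policy].append(line)
    return dict(batches)


# Hypothetical points in line protocol, tagged with the policy they belong to.
points = [
    ("one_day", "cpu,host=a usage=0.5 1520985600000000000"),
    ("one_year", "cpu_mean,host=a usage=0.4 1520985600000000000"),
    ("one_day", "mem,host=a used=1024i 1520985600000000000"),
]

batches = group_by_retention_policy(points)
for retention_policy, lines in batches.items():
    body = "\n".join(lines)  # one request body per retention policy
    print(retention_policy, len(lines))
```

Each body would then go out as its own write request. With the InfluxDB 1.x HTTP API, the target policy is chosen through the rp query parameter of the /write endpoint.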

Automatically clear old data

Is it possible to clear old data in InfluxDB automatically? Say, some configuration option to keep records for 1 month only? My server stores quite a lot of statistics, so to avoid running out of free storage I would like to have such a feature.
Yes, it's simple: just add a shard with a retention of 7 days.
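For the one-month case in the question, and assuming InfluxDB 1.x, where retention is configured through retention policies, a new default policy could be created with a statement like the one this sketch builds. The database and policy names are placeholders, and the helper only formats the InfluxQL string.

```python
def create_retention_statement(database, policy, duration, replication=1, default=True):
    """Build an InfluxQL CREATE RETENTION POLICY statement."""
    statement = (
        f'CREATE RETENTION POLICY "{policy}" ON "{database}" '
        f"DURATION {duration} REPLICATION {replication}"
    )
    if default:
        # DEFAULT makes writes that name no policy land in this one.
        statement += " DEFAULT"
    return statement


# "stats" and "one_month" are hypothetical names.
create_statement = create_retention_statement("stats", "one_month", "30d")
print(create_statement)
```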
