In fluentd, does drop_oldest_chunk reset retry_wait?

In fluentd, regarding retry_limit and disable_retry_limit, the docs (http://docs.fluentd.org/v0.12/articles/output-plugin-overview) say:
If the limit is reached, buffered data is discarded and the retry interval is reset to its initial value (retry_wait).
In my setup I have the following configuration for output:
buffer_queue_limit 200
buffer_chunk_limit 1m
flush_interval 3s
buffer_queue_full_action drop_oldest_chunk
max_retry_wait 1h
disable_retry_limit true
So we will keep retrying to flush the buffer, with a max_retry_wait of 1 hour, until the buffer queue is full, in which case it will drop the oldest chunk and move on to the next one.
With disable_retry_limit set to true, this means we drop the oldest chunk only when the buffer queue is full (buffer_queue_full_action drop_oldest_chunk).
My question is: when the buffer queue drops the oldest chunk, is retry_wait (default 1s, incrementing with each try) reset to its initial value for the next chunk in the queue due to be flushed (giving the same behaviour as when retry_limit is reached)?
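For context, a minimal sketch of how these options might sit in a v0.12 match block; the forward output type, match pattern, and destination are placeholders rather than my actual values:
<match app.**>
  # placeholder output plugin; any buffered output behaves the same way
  @type forward
  buffer_type memory
  buffer_queue_limit 200
  buffer_chunk_limit 1m
  flush_interval 3s
  buffer_queue_full_action drop_oldest_chunk
  # retry_wait defaults to 1s; shown here for clarity
  retry_wait 1s
  max_retry_wait 1h
  disable_retry_limit true
  <server>
    # placeholder destination
    host 192.0.2.10
    port 24224
  </server>
</match>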

Tested on a local machine: Fluentd does not reset retry_wait to its initial value when a chunk is dropped.

Related

Prometheus blackbox probe helpful metrics

I have around 1000 targets that are probed using HTTP.
job="http_2xx", env="prod", instance="x.x.x.x"
job="http_2xx", env="test", instance="y.y.y.y"
job="http_2xx", env="dev", instance="z.z.z.z"
I want to know for the targets:
Rate of failure by env in last 10 minutes.
Increase in rate of failure by env in last 10 minutes.
Curious what the following does:
sum(increase(probe_success{job="http_2xx"}[10m]))
rate(probe_success{job="http_2xx", env="prod"}[5m]) * 100
The closest I have got is the following, to find the percentage operational by env over 10 minutes:
avg(avg_over_time(probe_success{job="http_2xx", env="prod"}[10m]) * 100)
Rate of failure by env in the last 10 minutes: the easiest way to do it is:
sum(rate(probe_success{job="http_2xx"}[10m]) * 100) by (env)
This will return the percentage of successful probes, which you can reverse by appending * (-1) + 100, as in the sketch below.
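For example, a sketch of the reversed failure-percentage query per env over the last 10 minutes, assuming (as above) that the query result is read as a success percentage:
(sum(rate(probe_success{job="http_2xx"}[10m]) * 100) by (env)) * (-1) + 100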
Calculating the rate over 10m and the increase of that rate over 10m seems redundant; adding an increase function to the above query didn't work for me. You can replace the rate function with increase if you want to.
Your first query was pretty close: it will calculate the increase of successful probes over a 10m period. You can make it show the increase of failed probes by adding == 0 and summing by the "env" label:
sum(increase(probe_success{job="http_2xx"} == 0 [10m])) by (env)
Your second query will return the percentage of successful requests over 5m for the prod environment.

Why is "ie.waitforcomplete 10000" not working?

According to the syntax of the ie.waitforcomplete command, it takes its input in milliseconds, but when I run this code it does not wait for 10 seconds as it should.
ie.open google.com nowait true
window ✱internet✱
ie.seturl g1ant.com
ie.waitforcomplete 1000
ie.seturl duckduckgo.com
As the manual says, ie.waitforcomplete "suspends script execution until a webpage is loaded". The first argument is a timeout, which "specifies time in milliseconds for G1ANT.Robot to wait for the command to be executed". It simply means that when you have:
ie.waitforcomplete 1000
It will wait a maximum of 1000 milliseconds (1 second) for the webpage to load; if it does not load in time, an exception will occur.
And the following means it will wait a maximum of 10 seconds:
ie.waitforcomplete 10000

What is the reason for Redis memory loss?

Nobody flushes the DB; we only do HGET after start-up and HSET all the data.
But after some time (about 1 day), the memory is lost and no data exists...
# Memory
used_memory:817064
used_memory_human:797.91K
used_memory_rss:52539392
used_memory_peak:33308069304
used_memory_peak_human:31.02G
used_memory_lua:36864
mem_fragmentation_ratio:64.30
mem_allocator:jemalloc-3.6.0
After I restart the server and hset all again , the used_memory recover back.
# Memory
used_memory:33291293520
used_memory_human:31.00G
used_memory_rss:33526530048
used_memory_peak:33291293520
used_memory_peak_human:31.00G
used_memory_lua:36864
mem_fragmentation_ratio:1.01
mem_allocator:jemalloc-3.6.0
But it can never last longer than 1 day... The HSET process needs at least 4 hours, and Redis takes up over half of the memory, so BGSAVE is useless...
What is the reason for the memory loss? And how do I back up the data?

How to gracefully kill an unresponsive tcl script?

Let's say I have a Tcl script which should normally execute in less than a minute. How could I make sure that the script NEVER takes more than 'x' minutes/seconds to execute, and if it does, that the script is just stopped?
For example, if the script has taken more than 100 seconds, then I should be able to automatically switch control to a clean up function which would gracefully end the script so that I have all the data from the script run so far but I also ensure that it doesn't take too long or get stuck infinitely.
I'm not sure if this can be done in tcl - any help or pointers would be welcome.
You could use interp limit if you run the code in a child interpreter.
Note that this will throw an uncatchable error; if you want to do some cleanup, you have to remove the limit in the parent interp.
set interp [interp create]
# Initialize the interp
interp eval $interp {
    source somestuff.tcl
}
# Add the limit. From now on you have 60 seconds, or an error will be thrown
interp limit $interp time -seconds [clock seconds] -milliseconds 60000
set errorcode [catch {interp eval $interp {DoExpensiveStuff}} res opts]
# Remove the limit so you can clean up the mess if needed
interp limit $interp time -seconds {}
if {$errorcode} {
    # Do some cleanup here
}
# Delete the interp, or reuse it?
interp delete $interp
# And what shall be done with the error? Throw it.
return -options $opts $res
Resource limits are the best bet with Tcl, but they are not bullet-proof. Tcl cannot (and will not) abort C procedures, and there are some ways to make the Tcl core do a lot of hard work.
There must be a loop that you're worried might take more than 100 seconds, yes? Save clock seconds (current time) before you enter the loop, and check the time again at the end of each iteration to see if more than 100 seconds have elapsed.
If for some reason that's not possible, you can try devising something using after: kick off a timer to a callback that sets (or unsets) some global variable that your executing code is aware of, so that on detection it can attempt to exit.
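A rough sketch of the first approach, where the work list, loop body, and save proc are placeholders:
# Guard a long-running loop with an elapsed-time check
set start [clock seconds]
set deadline 100                 ;# give up after 100 seconds

foreach item $workItems {
    processItem $item            ;# placeholder for the real per-iteration work
    if {[clock seconds] - $start > $deadline} {
        saveResultsSoFar         ;# placeholder proc: persist the data gathered so far
        break
    }
}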

Set the memory high and then set_time_limit to 0

In PHP, if I set the memory to 100M via ini_set and then call set_time_limit(0), does that mean that my PHP memory allocation is 100M forever (until I restart my Apache)?
No, it's reset back to the original value at the end of script execution.
From the manual:
string ini_set ( string $varname , string $newvalue )
Sets the value of the given configuration option. The configuration
option will keep this new value during the script's execution, and
will be restored at the script's ending.
and set_time_limit(0); is treated the same.
Example:
// 1. Script starts
echo ini_get('memory_limit');//128M
// 2. We set a new limit; the script will now have 100M
ini_set('memory_limit','100M');
echo ini_get('memory_limit'); //100M
die;
// 3. Script ends; now it's set back to 128M
With set_time_limit(0), it just tells the script not to time out. Though, say you were to use set_time_limit(0) within a loop: on each iteration its internal counter is reset to 0 over and over.
So if you were to use set_time_limit(1) within a loop, then as long as each iteration of the loop did not last longer than 1 second, it would still not time out, as set_time_limit(n) resets the internal timeout counter to 0 on each iteration.
Example of it not timing out after 1 second:
for ($i = 0; $i <= 10; $i++) {
    set_time_limit(1);
    usleep(999998); // 2 microseconds short of a second
    echo $i;
}
