Can't fluentd send output in the order in which the logs occurred?

fluentd-daemonset -> fluentd-forward -> mongodb
We are currently collecting logs using the structure above: Kubernetes container logs are collected and stored in MongoDB.
I want the records saved to MongoDB in the order of the Kubernetes container log timestamps. However, there are cases where container logs that occurred a long time ago end up being inserted into MongoDB much later.
Is this unavoidable?
Is there any way to solve this?
Is this problem caused by pos_file or buffer?
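For what it's worth, once events pass through fluentd's buffering and retry logic, MongoDB insertion order and the original log time can diverge, so one workaround is to sort on the event time stored in each record when reading. A minimal sketch, assuming the records live in a hypothetical logs collection in a fluentd database and carry the event time in a field named time (the actual names depend on your output plugin configuration):
# Read the logs ordered by the event time recorded in each document,
# not by MongoDB insertion order. Database, collection and field names
# here are assumptions -- adjust them to your fluentd output settings.
mongosh "mongodb://localhost:27017/fluentd" --eval 'db.logs.find().sort({ time: 1 }).limit(5).toArray()'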

Related

OS error messages when running multiple q/kdb+ servers on the same database in separate Docker containers

I've been getting unpredictable error messages like
./2016.12.19/ccbar/padjm. OS reports: Operation not permitted
I'm running multiple dockerized Linux q sessions in separate containers, all opened on the same database directory containing partitioned tables, and I've found that such errors go away when I restart the other q sessions and their containers.
Could Docker be causing some sort of locking of files? And would there be any way to release the lock? Or perhaps there is some other cause to this behavior? Am I not supposed to run multiple q sessions on the same database directory with partitioned tables? And would there be any methods to get rid of this error without restarting the q sessions?
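For context, a minimal sketch of the kind of setup being described, with a hypothetical image name kdb-image and a host database path /data/hdb shared by both containers:
# Two dockerized q sessions opened on the same partitioned database
# directory on the host (image name, paths and ports are placeholders).
docker run -d --name q1 -v /data/hdb:/hdb kdb-image q /hdb -p 5001
docker run -d --name q2 -v /data/hdb:/hdb kdb-image q /hdb -p 5002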

Couchdb 3.1.0 cluster - database failed to load after restarting one node

Here is the situation: on a CouchDB cluster made of two nodes, each node is a CouchDB Docker instance on a server (ip1 and ip2). I had to reboot one server and restart Docker; after that, both my CouchDB instances display "This database failed to load." for each database.
I can connect with Fauxton and see the full list of databases, but that's all. On "Verify Couchdb Installation" in Fauxton I get several errors (only 'Create database' is a green check).
The Docker logs for the container give me this error:
"internal_server_error : No DB shards could be opened"
I tried to recover the database locally by copying the .couch and shards/ files to a local instance of CouchDB, but the same problem occurs.
How can I retrieve the data?
PS: I checked the connectivity between my two nodes with erl; no problem there. It looks like Docker messed up some CouchDB config file on the restart.
metadata and cloning a node
Each database has metadata indicating which nodes store its shards; this is built at creation time from the cluster options, so copying the database files alone does not actually move or mirror the database onto a new node. (If you set the metadata correctly, the shards are copied by CouchDB itself, so copying the files manually is only done to speed up the process.)
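As a rough illustration, that shard metadata can be inspected over HTTP on CouchDB 2.x/3.x; the database name mydb and the admin credentials below are placeholders:
# Which nodes hold the shard replicas of a database
curl -s http://admin:password@127.0.0.1:5984/mydb/_shards
# The full shard-map metadata document the answer refers to
curl -s http://admin:password@127.0.0.1:5984/_node/_local/_dbs/mydb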
replica count
A 2-node cluster usually does not make sense. As with filesystem RAID, you can stripe for maximum performance and half the reliability, or you can create a mirror; but unless individual node state has perfect consistency detection, you cannot automatically decide which of two nodes is incorrect, whereas deciding which of 3 nodes is incorrect is easy enough to do automatically. Consequently, most clusters are 3 or more nodes, and each shard has 3 replicas spread over any 3 nodes.
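For example, on a cluster of 3 or more nodes the replica and shard counts can be set when a database is created (mydb and the credentials are placeholders; n is the replica count, q the shard count):
# Create a database with 3 replicas per shard
curl -s -X PUT 'http://admin:password@127.0.0.1:5984/mydb?n=3&q=8'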
Alright, just in case someone makes the same mistake:
When you have a 2-node cluster, couchdb#ip1 and couchdb#ip2, and you created the cluster from couchdb#ip1:
1) If the node couchdb#ip2 stops, the cluster setup is messed up (couchdb#ip1 will no longer work); on restart it appears that the node will not connect correctly, and the databases will appear but will not be available.
2) On the other hand, stopping and starting couchdb#ip1 does not cause any problem.
The solution in case 1 is to recreate the cluster with 2 fresh CouchDB instances (couchdb#ip1 and couchdb#ip2), then copy the databases onto one CouchDB instance, and all the databases will be back!
Can anyone explain in detail why this happened? It also means that this cluster configuration is absolutely not reliable (if couchdb#ip2 is down then nothing works); I guess it would not be the same with a 3-node cluster?
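As a rough sketch of what that "copy the databases" step might look like, assuming the official couchdb Docker image (data under /opt/couchdb/data) and placeholder container names couchdb-old and couchdb-new:
# Save the data out of the broken container, then copy the shard files into
# the fresh instance created after the cluster was rebuilt, and restart it.
docker cp couchdb-old:/opt/couchdb/data ./backup
docker cp ./backup/shards/. couchdb-new:/opt/couchdb/data/shards/
docker restart couchdb-new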

org.apache.druid.java.util.common.ISE: No default server found

I'm setting up Druid for the first time and ran into the following issues while trying to start Druid using docker-compose:
2020-04-10T14:40:01,837 ERROR [qtp1667348377-84] org.apache.druid.server.router.QueryHostFinder - Catastrophic failure! No servers found at all! Failing request!: {class=org.apache.druid.server.router.QueryHostFinder}
2020-04-10T14:40:01,837 WARN [qtp1667348377-84] org.eclipse.jetty.server.HttpChannel - /druid/v2/sql
org.apache.druid.java.util.common.ISE: No default server found!
at org.apache.druid.server.router.QueryHostFinder.pickDefaultServer(QueryHostFinder.java:119) ~[druid-server-0.17.1.jar:0.17.1]
Here is the command I'm using:
vm:~$ sudo docker-compose -f distribution/docker/docker-compose.yml up
I've cloned the repo from https://github.com/apache/druid.
However, if I download the Druid distribution apache-druid-0.17.1 and use ./bin/start-micro-quickstart, all services start successfully and run fine; I'm able to access the web console and load the data into segments.
But when I try to start Druid in cluster mode or using docker-compose, I get 404 errors in the console, and connection-refused and out-of-memory errors in the logs. I've increased -XX:MaxDirectMemorySize=6g in druid.sh but no luck.
Please help me resolve these errors when starting Druid using docker-compose.
Do a docker ps to check whether the router and broker containers are running.
If not, increase the memory allocated to your Docker engine.
Druid needs at least 4 GB of memory.
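A quick way to check, with the container name being a guess based on the compose project (yours may differ):
# Are the router/broker containers up, and were any of them killed?
docker ps --format 'table {{.Names}}\t{{.Status}}'
docker logs --tail 50 docker_router_1   # hypothetical container name
docker stats --no-stream                # memory usage vs. limits at a glance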

My Logstash container can't handle the traffic, but the host CPU/MEM is idle

I'm using Logstash to parse and aggregate the nginx connection log (around 3000 events/sec). I believe the aggregation is the bottleneck: my Logstash processes the log at a slower rate than the log is written. So there is a growing gap between the timestamp at which each event is written to the file and the @timestamp which Logstash appends to the event. Eventually, Logstash is going to drop some events.
But I checked the usage of the host machine: the average CPU usage is only about 50%, and memory is not an issue either. Is there anything else that could be causing this bottleneck for Logstash?
Also, I didn't see anything wrong in Logstash's own log. The reason I know it drops events is what I see in Kibana.
Thank you in advance!
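For reference, one way to narrow down where the events pile up is the Logstash monitoring API (port 9600 by default), which reports per-pipeline and per-plugin event counts and durations:
# The plugin whose duration_in_millis dominates is usually the bottleneck.
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'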

What's the best practice for Docker logging?

I'm using Docker with my web service.
When I deploy using Docker, I lose some log files (nginx access log, service log, system log, etc.).
That's because the Docker deployment brings containers down and back up.
So I thought about this problem.
The logging server and the service server (for the API) must be separated!
I'm considering these methods:
First, using Logstash (in ELK), attached to all my log files.
Second, using a batch system that moves the log files to another server every midnight.
Isn't that okay?
I'd appreciate a better answer.
Thanks.
There are many approaches to logging that admins commonly use for containers:
1) Mount the log directory to the host, so even if the container goes down and up, the logs are persisted on the host (see the sketch after this list).
2) An ELK server, using Logstash/Filebeat to push logs to the Elasticsearch server with the file-tailing option, so new log content is pushed to the server as it appears.
3) For application logs, e.g. in Maven-based projects, there are many plugins that push logs to a server.
4) A batch system, which is not recommended because if the container dies before midnight, the logs are lost.
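A minimal sketch of option 1, with a hypothetical image name and paths: the container writes its logs into a directory that is bind-mounted from the host, so the files survive the container being recreated.
# Bind-mount the host directory /var/log/myservice over the container's
# nginx log directory; log files persist across container down/up.
docker run -d --name web \
  -v /var/log/myservice:/var/log/nginx \
  myrepo/webservice:latest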
