My goal is to use Prometheus to monitor several microservices that can replicate or migrate.
I've seen a video from 2017 that says the only solution was through DNS A records, but that does not solve cross-Docker-network pulling, which makes migration between different networks impossible, right?
I would like to know if there is already a solution for this (I didn't find one). If not, is "file_sd_config" the best way to go? With it I can change the targets file and have Prometheus re-read the updated file periodically.
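Something like the following is what I have in mind; the paths, ports and labels below are just illustrative examples. In prometheus.yml:

scrape_configs:
  - job_name: 'migrating-services'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.yml
        refresh_interval: 30s

and a target file such as /etc/prometheus/targets/my-service.yml that gets rewritten whenever a service replicates or migrates:

- targets:
    - '10.0.1.15:8080'
    - '10.0.2.7:8080'
  labels:
    service: 'my-service'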
Thanks!
We've created some kind of Python monitoring app that performs a health check of our system once every 10 minutes and sends text alarms to our engineers (via Jabber/Slack) if something goes wrong.
Are there any best practices we can introduce to be sure monitoring works even if the server it's hosted on is down? Any good books/online materials covering this stability topic? The first idea was to use Docker Swarm and multiple servers (just because I know it exists and it seems to solve the problem), but maybe there are way better solutions I'm not aware of.
I would say the best practice would be to build your SRE stack out of off-the-shelf rather than home-grown components:
Prometheus, Alertmanager and so on.
Then you want your actual alerting infrastructure to be cloud-hosted, PagerDuty for example.
And use something like Pingdom as an external check that your crucial infrastructure is operating.
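To sketch how those pieces fit together, a minimal Alertmanager configuration that routes everything to PagerDuty might look like this (the integration key is a placeholder):

route:
  receiver: 'pagerduty'
receivers:
  - name: 'pagerduty'
    pagerduty_configs:
      # Placeholder: use the key from your PagerDuty integration.
      - service_key: '<your-pagerduty-integration-key>'

Prometheus evaluates the alert rules, Alertmanager handles grouping and routing, and PagerDuty takes care of on-call escalation from there.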
For our system we mark important messages with delivery mode = 2, and we send them on durable exchanges and queues. The problem is that RabbitMQ is hosted in a Docker container, and if that container goes down, the messages that have been persisted are lost when the container restarts.
I want to know if there is a way to change the location where messages are persisted to a mounted volume instead of the container-backed disk, and if so, how. I also currently can't figure out where the messages are actually being persisted, so finding the config for that is definitely a start; I'm just not sure where it's set, as I can't find anything related to mnesia, which seems to be the default for some people. Whether this location change happens at runtime or not is unimportant to me.
Also, please keep in mind that all of this is very new to me, so I'm not well versed in how this system functions in all its glory; simple explanations will help a good deal more than unnecessarily complex solutions. Let me know if I can provide any other helpful info.
It's right here in the RabbitMQ documentation.
Create the /etc/rabbitmq/rabbitmq-env.conf file with the following contents to change the persistent data location:
MNESIA_DIR=/path/to/mounted/volume
Note that the RABBITMQ_ prefix is not necessary for variables defined in rabbitmq-env.conf.
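If you would rather wire this up from the Docker side, a sketch with Docker Compose could look like this, assuming the official rabbitmq image (paths are illustrative):

services:
  rabbitmq:
    image: rabbitmq:3
    environment:
      # Unlike in rabbitmq-env.conf, the RABBITMQ_ prefix is
      # required when set as an environment variable.
      RABBITMQ_MNESIA_DIR: /data/mnesia
    volumes:
      # Persist messages on the host instead of the container disk.
      - ./rabbitmq-data:/data/mnesia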
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
I have 3 MySQL Docker containers running and I am trying to replicate the data from one container to another.
If I create a database in one mysql-container, the other mysql-containers should get updated with the changes I made.
I tried but did not find a way to do that.
How can I make this happen?
You need to set up standard MySQL replication between the containers.
You cannot use something like a shared volume to share MySQL data at the file level, because the MySQL instances will mess things up and corrupt the data for sure.
You can set up replication in a master-slave or master-master scheme (according to your needs).
Refer to the docs for detailed information.
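As a rough sketch of what the container side could look like with Compose (image tag, names and passwords are placeholders, and the replication itself still has to be configured on the MySQL side, e.g. with CHANGE MASTER TO on the slave):

services:
  mysql-master:
    image: mysql:5.7
    # Each server needs a unique server-id; the master must
    # also have binary logging enabled.
    command: --server-id=1 --log-bin=mysql-bin
    environment:
      MYSQL_ROOT_PASSWORD: example
  mysql-slave:
    image: mysql:5.7
    command: --server-id=2
    environment:
      MYSQL_ROOT_PASSWORD: example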
I think @Robert is absolutely right in that you can probably start with a simple master-slave or master-master scheme.
Since you mentioned Docker, I think the AutoPilot Pattern is a good repository to start with. They have a solution for MySQL (replication, backups, failover, ...). Here is where you can find it:
https://github.com/autopilotpattern/mysql
Can we share a common/single named volume across multiple hosts in Docker Engine swarm mode? What's the easiest way to do it?
If you have an NFS server set up, you can use an NFS folder as a volume from Docker Compose like this:
volumes:
  grafana:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.xxx.xx,rw
      device: ":/PathOnServer"
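A service can then mount that named volume as usual; for example (the mount path here is Grafana's default data directory):

services:
  grafana:
    image: grafana/grafana
    volumes:
      - grafana:/var/lib/grafana

Since the volume is backed by the NFS export rather than any single host's disk, every node that runs the task sees the same data.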
In the grand scheme of things
The other answers are definitely correct. If you feel like you're still missing something or are coming to the conclusion that things might never really improve in this space, then you might want to reconsider the use of the typical POSIX-like hierarchical filesystem abstraction. Not all applications really need it (I might go as far as to say that few do). Maybe yours doesn't either.
In defense of filesystems
Remote/distributed filesystems are still very common in many circles, but the people using them usually know them very well and know how to set them up and leverage them properly (and they might be very good systems too, though often not with existing Docker volume drivers). Sometimes it's also partly because they're simply forced to (codebases that can't or shouldn't be rewritten to support other storage backends). Using, configuring or even writing arbitrary Docker volume drivers would be only a secondary concern for them.
Alternatives
If you have the option however, then evaluate other persistence solutions for your applications. Many implementations won't use POSIX filesystem interfaces but network interfaces instead, which pose no particular infrastructure-level difficulties in clusters such as Docker Swarm.
Solutions managed by third parties (e.g. cloud providers)
Should you succeed in removing all dependencies on filesystems for persistent and shared data (it's still fine for transient local state), then you might claim to have fully "stateless" applications. Of course there is almost always state persisted somewhere still, but the idea is that you don't handle it yourself. Many cloud providers (if that's where you're hosting things) will offer fully managed solutions for handling persistent state such that you don't have to care about it at all. If you're going this route, do consider managed services that use APIs compatible with implementations that you can use locally for testing (for example by running a Docker container based on an image for that implementation that is provided by a third party or that you can maintain yourself).
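For example, if your application talks to an S3-compatible object store in production, you could run a local stand-in container for testing; MinIO here is just one illustration of the idea:

services:
  minio:
    # Local, S3-compatible stand-in for a managed object store.
    image: minio/minio
    command: server /data
    ports:
      - "9000:9000"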
DIY solutions
If you do want to manage persistent state yourself within a Docker Swarm cluster, then the filesystem abstraction is often inevitable (and you'd probably have more difficulties targeting block devices directly anyway). You'll want to play with node and service constraints to ensure the requirements of whatever you use to persist data are fulfilled. For certain things like a central DBMS server it could be easy ("always run the task on that specific node only"), for others it could be way more involved.
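For that simple case, a placement constraint in a stack file is enough to pin a DBMS to the node that holds its data (the hostname and volume name are illustrative):

services:
  db:
    image: mysql:5.7
    deploy:
      placement:
        constraints:
          # Always schedule the task on the node that has the data.
          - node.hostname == storage-node-1
    volumes:
      - db-data:/var/lib/mysql
volumes:
  db-data: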
The task of setting up, scaling and monitoring such a setup is definitely not trivial, which is why many application developers are happy to let somebody else (e.g. cloud providers) do it. It's still a very cool space to explore, though given that you had to ask this question, it's likely not something you should focus on if you're on a deadline.
Conclusion
As always, use the right abstraction for the job, and pause to think about what your strengths are and where to spend your resources.
Out of the box, Docker does not support this by itself. You must use additional components: either a Docker plugin, which would provide you with a new type of driver for your volumes, or a sync tool running directly on your filesystem, which will sync the data for you.
From my point of view, the easiest solution is rsync, or more accurately lsyncd, the daemon version of rsync. But I have never tried it with Docker volumes, so I can't tell whether it handles them fine.
Another solution is offered by Infinit.sh. It basically does the same thing as lsyncd: a one-way sync. So if your Docker containers are RW on their volumes, it won't match your expectations. I tried this solution, and it works pretty well for RO operations, though not in production; it's still an alpha version. Infinit is also on the way to providing a Docker volume driver, but it's not released yet, so I didn't even try it. Too risky.
Other solutions I found but was unable to install (and therefore to try) are Flocker and GlusterFS. Both are designed to create FS volumes based on several disks from several machines. But none of their repositories were working these past weeks.
Sorry for giving you only weak solutions, but I'm facing the same problem and haven't found a perfect one yet.
Cheers,
Olivier
We are looking into using Docker plus either Mesos/Marathon or Kubernetes for hosting a cluster. However, the one issue that we haven't really seen any answers for is how to allow clustered services to connect to each other correctly. All of the ones that I have seen need to know about at least one other node before they can join the cluster. Some need to know about every node. However, in Kubernetes and Mesos, there's no way to know what those IP addresses are ahead of time.
So, are there any best practices for this? If it helps, some technologies we're looking into deploying as containers are ElasticSearch, ActiveMQ, and MongoDB. There may be others.
However, the one issue that we haven't really seen any answers for is how to allow clustered services to connect to each other correctly.
I think you're talking about HA/replicated/sharded apps here.
At the moment, in Kubernetes, you can accomplish this by making an API call listing all the "endpoints" of the service; that will tell you where your peers are running.
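For illustration, querying something like GET /api/v1/namespaces/default/endpoints/elasticsearch (or kubectl get endpoints elasticsearch) returns an Endpoints object whose subsets list the pod addresses; schematically (the service name, IPs and port are made up):

subsets:
  - addresses:
      - ip: 10.244.1.5
      - ip: 10.244.2.9
    ports:
      - port: 9300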
We'd eventually like to support the use case you describe in a more first-class manner.
I filed https://github.com/GoogleCloudPlatform/kubernetes/issues/3419 to maybe get something more standardized started here.
I also wanted to set up an ElasticSearch cluster using Mesos/Marathon. As the existing "solutions" were either merely undocumented or not working/outdated, I set up my own container.
If you like, have a look at https://github.com/tobilg/docker-elasticsearch-marathon
If you have a running Marathon installation (I use v0.8.1), then setting up an ElasticSearch cluster should be a matter of a few minutes.
UPDATE:
The container now uses Elasticsearch v1.5.2 and is able to run on the latest Marathon v0.8.2.
As for Kubernetes, it currently does require the kube-controller-manager to be started with the --machines argument given a list of minion IPs or hostnames.
I don't see any easy way to handle this correctly in Kubernetes right now. Yes, you could make a call to the API that returns the list of endpoints, but you must watch for changes and take action when the endpoints change...
I would prefer to use Mesos/Marathon, which is well prepared for this scenario. You should implement a custom framework for Mesos. There is already a framework for ElasticSearch prepared: http://mesos.apache.org/documentation/latest/mesos-frameworks/