I have just started exploring the health check feature in Docker. All the tutorials online show the same type of health check example (like link1 and link2). They all use this same command:
HEALTHCHECK CMD curl --fail http://localhost:3000/ || exit 1
I have some Python code which I have converted into a Docker image, and its container is running fine. I have a service in the container which runs fine, but I want to put a health check on this service. It is started/stopped using:
service <myservice> start
service <myservice> stop
This service is responsible for sending data to a server. I need to put a health check on it but don't know how to do it. I have searched for this and didn't find any examples. Can anyone please point me to the right link or explain it?
Thanks
The health check command is not something magical, but rather something you can automate to get a better status on your service.
Some questions you should ask yourself before setting the healthcheck:
How would I normally verify that the service is running OK, assuming I run it outside a container and check its status by hand rather than through an automated process?
If the service has no open ports it can be interrogated on, does it instead write its success/failure status to a file on disk?
If the service has open ports but communicates over a custom protocol, do I have any tools that I use to interrogate those ports?
Let's take the curl command you listed: it implies that the healthcheck is monitoring an HTTP service listening on port 3000. With --fail, curl exits with a non-zero status when the server returns an HTTP error (status code 400 or above), and the || exit 1 turns any such failure into exit code 1, which Docker treats as unhealthy. That's a pretty straightforward way to demonstrate health check usage.
Assuming you write success or failure to a file every 30 seconds from your service, your healthcheck would be a script that exits abnormally when it encounters the failure text.
Assuming that your service has an open port but communicates via some custom protocol like protocol buffers, all you have to do is call it with a script that encodes a payload with protobuf and then checks the output received.
And so on...
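The file-based case can be sketched in Python, since the service in the question is Python anyway. This is only a minimal sketch under assumptions: the status file path, the "success" text and the 90-second staleness limit are hypothetical and need to match whatever your service really writes.

```python
"""Health check sketch: report whether the service's status file looks healthy."""
import os
import time

# Hypothetical values: adjust them to whatever your service actually writes.
STATUS_FILE = "/tmp/myservice.status"
MAX_AGE_SECONDS = 90  # a status older than this is treated as stale


def is_healthy(path=STATUS_FILE, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the file exists, is fresh, and contains the text 'success'."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
        with open(path, encoding="utf-8") as fh:
            status = fh.read().strip().lower()
    except OSError:
        return False  # a missing or unreadable file counts as unhealthy
    return age <= max_age and status == "success"
```

To use it as the health check command, the script would end with sys.exit(0 if is_healthy() else 1) and be wired in with something like HEALTHCHECK --interval=30s CMD python3 /healthcheck.py (the script location is again an assumption). Docker only looks at the exit code: 0 means healthy, 1 means unhealthy.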
I have a Java application packaged as a JAR. This JAR application runs fine on my machine, meaning it can send and receive HTTP requests to and from an API endpoint (let's call this endpoint example.com/api/).
I then built a Docker image of this JAR application and tried to run the image as a container from Docker Desktop. But then I got this error.
(screenshot: the error I got)
It seems like my application can't reach the URL from inside the Docker container. I tried to set the proxy in Settings -> Resources -> Proxies -> Manual proxy configuration and entered my company proxy, since I'm inside my company network. But it still doesn't work.
I tried to google this problem but almost nothing shows up (anything that does show up has little correlation with my problem). Does anyone know what the problem might be? What should I do?
First check whether your container is able to communicate with the endpoint: ping or curl it from the container shell. If you use a proxy, set the environment variables in the container:
export http_proxy=http://server-ip:port
export https_proxy=https://server-ip:port
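If curl is not available in the image but Python is, the same check can be sketched as below. This is a rough diagnostic under assumptions, not part of the application: proxy_settings and can_reach are hypothetical helper names, and the sketch only shows what Python's standard library sees, which may differ from what a Java process picks up.

```python
import urllib.error
import urllib.request


def proxy_settings():
    """Return the proxy mapping urllib detects, e.g. from http_proxy/https_proxy."""
    return urllib.request.getproxies()


def can_reach(url, timeout=5):
    """Return True if a request to url gets any HTTP response via the detected proxies."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, even if with an error status
    except OSError:
        return False  # DNS failure, refused connection, timeout, unreachable proxy, ...
```

Running print(proxy_settings()) inside the container shows whether the exported variables are actually visible there, and can_reach with your endpoint URL tells you whether anything answers at all.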
I think I have understood how Fabric mainly works and how consensus is reached. What I am still missing in the documentation is what happens inside a Fabric Docker container so that it can take part in the communication process.
So, communication starting from a client (e.g. an app) takes place using gRPC messages between peers and the orderer.
But what happens inside of the containers?
I imagine it as a process that only receives gRPC messages and answers them using functions in the background of a peer/orderer, handing its response on for further processing by another unit, such as the client that collects the responses of multiple peers for a smart contract.
But what really happens inside a container? I mean, a container spawns when the Docker image is loaded and launched via the YAML config file. But what is started inside it? Is only a single peer binary started (e.g. with the command "peer node start"), that is, only the compiled Go binary "peer"? What is listening? What is responding there? I discovered only one exposed port for every container. This seems to me to be the gate for gRPC (because it often has a port ID like **51).
The same questions go for the orderer, the chaincode and the CLI. How do they talk to each other, or is gRPC the only way of communication and processing (apart from the discovery service and gossip)? How is this started inside the containers: using only the YAML files for launching, or is there further internal configuration or a startup script in the image files? (I cannot look inside the images; I can only log in to running containers at runtime.)
When your client sends a request to one of the peers, the peer instance checks whether the requested chaincode (CC) is installed on it. If the CC is not installed: obviously you'll get an error.
If the CC is installed: the peer checks whether a dedicated container is already started for the given CC and corresponding version. If the container is started, the peer sends the transaction request to that CC instance and returns the response to your client after signing it. Signing guarantees that the response was really sent by that peer.
If the container is not started:
The peer builds a Docker image and starts that instance (Docker container). The new image is based on one of the Hyperledger images; e.g. if your CC is written in Go, then hyperledger/baseos, which is a very basic Linux OS, will be used. This new image contains the CC binary and metadata as well.
The peer instance uses the underlying (your) machine's Docker daemon to do all of this. That's the reason why we need to pass /var/run:/host/var/run in the volume mapping and CORE_VM_ENDPOINT=unix:///host/var/run/docker.sock in the environment variables.
Once the CC container starts, it connects to its parent peer node at the address defined by the CORE_PEER_CHAINCODEADDRESS attribute. The peer tells the child (probably during image creation) to use this address, so it obeys. The peer node defines its own listen address with the CORE_PEER_CHAINCODELISTENADDRESS attribute.
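The decision flow described so far can be summarised as a toy model. This is purely illustrative: the Peer class and handle_proposal method are hypothetical names and not Fabric's real API, and the Docker build and signing steps are reduced to comments.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Toy model of the chaincode lookup flow; not Fabric's real API."""
    installed: set = field(default_factory=set)  # (chaincode name, version) pairs
    running: set = field(default_factory=set)    # chaincodes with a live container

    def handle_proposal(self, name, version):
        key = (name, version)
        if key not in self.installed:
            raise LookupError(f"chaincode {name}:{version} is not installed on this peer")
        if key not in self.running:
            # Stand-in for: build an image via the host's Docker daemon, start the
            # container, and wait for it to connect back to the peer.
            self.running.add(key)
        # Stand-in for: forward the request to the chaincode and sign the response.
        return f"signed response from {name}:{version}"
```

For example, Peer(installed={("mycc", "1.0")}).handle_proposal("mycc", "1.0") starts the (simulated) container on the first call and reuses it on later calls, while a request for a chaincode that is not installed raises an error.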
About your last question: communication between nodes, and also with clients, uses gRPC. If TLS is enabled, the communication is secure. The entry point for orderers to learn about peers, and for peers to learn about other organizations' peers, is the set of anchor peers defined during channel creation. The discovery service runs in peer nodes, so they can hold a close to real-time view of the network layout. The discovery service also provides peers' identities; that's how clients can detect other organizations' peers when the endorsement policy requires endorsements from multiple organizations (e.g. if the policy looks like AND(Org1MSP.member, Org2MSP.member)).
I run a very simple Micro Integrator service that has only one proxy service and a single sequence. In this sequence the incoming XML message is transferred to the Amazon SQS service.
If I run this in Integration Studio on the built-in instance, I have no problems. However, when I package it into a CAR and feed it to the Docker instance, it boots up and instantly gets bombarded with requests. That is to say, the following logs take over and the container can no longer be stopped manually:
[2020-04-15 12:45:44,585] INFO {org.apache.synapse.transport.passthru.SourceHandler} - Writer null when calling informWriterError
[2020-04-15 12:45:46,589] ERROR {org.apache.synapse.transport.passthru.SourceHandler} - HttpException occurred
org.apache.http.ProtocolException: Invalid request line: ÇÃ^ú§ß¡ðO©%åË*29xÙVÀ$À(=À&À*kjÀ
    at org.apache.http.impl.nio.codecs.AbstractMessageParser.parse(AbstractMessageParser.java:208)
    at org.apache.synapse.transport.http.conn.LoggingNHttpServerConnection$LoggingNHttpMessageParser.parse(LoggingNHttpServerConnection.java:407)
    at org.apache.synapse.transport.http.conn.LoggingNHttpServerConnection$LoggingNHttpMessageParser.parse(LoggingNHttpServerConnection.java:381)
    at org.apache.http.impl.nio.DefaultNHttpServerConnection.consumeInput(DefaultNHttpServerConnection.java:265)
    at org.apache.synapse.transport.http.conn.LoggingNHttpServerConnection.consumeInput(LoggingNHttpServerConnection.java:114)
    at org.apache.synapse.transport.passthru.ServerIODispatch.onInputReady(ServerIODispatch.java:82)
    at org.apache.synapse.transport.passthru.ServerIODispatch.onInputReady(ServerIODispatch.java:39)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:113)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:159)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:338)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:586)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.ParseException: Invalid request line: ÇÃ^þvHÅFmÉ
(#ë¸'º¯æ¦V
I made sure no outside connections were possible. I also found older threads from someone describing this problem, but their solution (changing something in the keystore) did not work.
I also made sure to include the SQS certificate in the container.
I have no connections set up to the container, so that is out of the equation as well.
What am I missing here?
I have no idea why, but I have identified the culprit to be none other than Portainer. When I shut down Portainer, the stream of requests stops.
According to Wireshark, the requests are all made to:
GET http://172.17.0.1:9000/api/endpoints/< containerID >/docker/< someId >/logs
It seems that, because the WSO2 container I'm trying to run is an ESB that uses endpoints and returns 400 status codes for non-existent endpoints, Portainer retries until it succeeds. This is just my observation, so I could be wrong.
I have confirmed my findings by uploading my container to AWS where the problem did not exist.
I'm trying to deploy GridGain Web Console 2020.03.01 on RHEL7 x86_64 with Docker, following the documentation here.
However, I get a 404 Not Found error when accessing the http://localhost:3000/swagger-ui.html page, which is used as the healthcheck. The backend logs show no errors. The last version I'm able to get containers running with is 2019.12.02 (which in fact refuses to show a connected cluster, but that's another issue). Starting with 2020.01.00, all backend healthchecks fail. That looks suspicious considering that the 2020.01.00 release notes include updates of io.springfox and swagger-ui-dist.
Besides that, the 2020.03.01 release notes say that the Console's default port has changed to 8008, but the server still starts on 3000.
Has anyone had any luck deploying the dockerized Web Console?
The Web Console consists of a backend and a frontend. The backend is started on port 3000, which is printed in the log, while the frontend is indeed started on port 8008, and that is most probably the one you want to use.
The docker-compose.yml given on the documentation site maps the container's port 8008 to the host's port 80; feel free to replace it with any port you want.
Regarding the healthcheck, the /health endpoint is now changed to this
Swagger was removed in 2020.01.00 due to security concerns (the same GG-26726 issue mentioned in the release notes). You are right to be suspicious; I'll ask the right people to update the release notes and the docs. Sorry about the confusion, and thanks for pointing the issue out. Swagger was supposed to be an internal feature for the Web Console (WC) developer team only.
As you pointed out, starting with 2020.01.00 the Swagger-based health check won't work. Internally, the WC team uses dockerize to wait for the backend to start; here's an example from our E2E test suite compose file:
entrypoint: dockerize -wait http://backend:3000/health -timeout 2m -wait-retry-interval 5s node ./index.js --target=${TARGET:-on-premise}
This might work for you too, with some adaptation. You will most likely have to remove the "healthcheck" sections from docker-compose.yml as well, or modify them if the "http://backend:3000/health" URL can indeed serve as a direct replacement for the old "http://localhost:3000/swagger-ui.html" URL, which I am not sure about.
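If dockerize is not available in your image, the waiting behaviour can be approximated with a few lines of Python. This is only a sketch of the idea, assuming the /health URL answers with a 2xx status once the backend is up (which, as said, I have not confirmed); wait_for_http is a hypothetical helper, not part of Web Console or dockerize.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=120.0, retry_interval=5.0):
    """Poll url until it answers with a 2xx status; return False if timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=retry_interval) as resp:
                if 200 <= resp.status < 300:
                    return True
        except OSError:
            pass  # not ready yet: connection refused, HTTP error status, timeout, ...
        if time.monotonic() + retry_interval > deadline:
            return False  # another attempt would not fit before the deadline
        time.sleep(retry_interval)
```

Calling wait_for_http("http://backend:3000/health") uses the same 2 minute timeout and 5 second retry interval as the dockerize command above; the caller decides what to do when it returns False.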
I have been using Kafka Connect in my work setup for a while now and it works perfectly fine.
Recently I thought of dabbling with a few connectors of my own in my Docker-based Kafka cluster with just one broker (ubuntu:18.04 with Kafka installed) and a separate node acting as a client for deploying connector apps.
Here is the problem:
Once my broker is up and running, I log in to the client node (with no broker running, just the vanilla Kafka installation) and set up the classpath to point to my connector libraries. I also set the KAFKA_LOG4J_OPTS environment variable to point to the location of the log file to generate, with debug mode enabled.
So every time I start the Kafka worker using the command:
nohup /opt//bin/connect-distributed /opt//config/connect-distributed.properties > /dev/null 2>&1 &
the connector starts running, but I don't see the log file getting generated.
I have tried several changes but nothing works out.
QUESTIONS:
Does this mean that connect-distributed.sh doesn't generate the log file after reading the KAFKA_LOG4J_OPTS variable? And if it does, could someone explain how?
NOTE:
(I have already debugged the connect-distributed.sh script and tried the options with and without daemon mode. By default, if KAFKA_LOG4J_OPTS is not provided, it uses the connect-log4j.properties file in the config directory, but even then no log file is generated.)
OBSERVATION:
Only when I start ZooKeeper/the broker on the client node is the provided KAFKA_LOG4J_OPTS value picked up and logs start being generated, but nothing related to the Kafka connector. I have already verified the connectivity between the client and the broker using kafkacat.
The interesting part is:
I follow the same process at my workplace, and there logs are generated every time the worker (connect-distributed.sh) is started, but I haven't been able to replicate that behavior in my own setup. I have no clue what I am missing here.
Could someone provide some reasoning? This is really driving me mad.