NiFi Cluster Docker Load Balancing configuration

I would like to configure load balancing in the docker-compose.yml file for a NiFi cluster deployed in Docker containers.
Current docker-compose parameters for LB are as follows (for each of three NiFi nodes):
# load balancing
- NIFI_CLUSTER_LOAD_BALANCE_PORT=6342
- NIFI_CLUSTER_LOAD_BALANCE_HOST=node.name
- NIFI_CLUSTER_LOAD_BALANCE_CONNECTIONS_PER_NODE=4
- NIFI_CLUSTER_LOAD_BALANCE_MAX_THREADS=8
However, when I enable load balancing on a queue, I can select all the parameters and get no errors, but load balancing does not work: everything is processed on the primary node. (GetSFTP runs on the primary node only, but I then want to process the data on all three nodes.) The NiFi cluster is also configured to work with SSL.
Thanks in advance!

I had to publish the load-balance port in my compose file. I also had to specify a hostname in each node's compose file.
Here is my compose file for basic clustering:
version: "3.3"
services:
  nifi_service:
    container_name: "nifi_service"
    image: "apache/nifi:1.11.4"
    hostname: "APPTHLP7"
    environment:
      - TZ=Europe/Istanbul
      - NIFI_CLUSTER_IS_NODE=true
      - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8088
      - NIFI_ZK_CONNECT_STRING=172.16.2.238:2181,172.16.2.240:2181,172.16.2.241:2181
    ports:
      - "8080:8080"
      - "8088:8088"
      - "6342:6342"
    volumes:
      - /home/my/nifi-conf:/opt/nifi/nifi-current/conf
    networks:
      - my_network
    restart: unless-stopped
networks:
  my_network:
    external: true
Please note that you have to configure a load balance strategy on the downstream connection in your flow.
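Combining the question's load-balancing variables with the published port could look roughly like the sketch below for a single node. It assumes the image version in use actually maps the NIFI_CLUSTER_LOAD_BALANCE_* variables onto the nifi.cluster.load.balance.* entries in nifi.properties (worth verifying in the generated file), and nifi-node-1 is a hypothetical hostname.

```yaml
# Sketch for one node; repeat per node with its own hostname.
# "nifi-node-1" is a hypothetical name and must be resolvable from the other nodes.
version: "3.3"
services:
  nifi_service:
    image: "apache/nifi:1.11.4"
    hostname: "nifi-node-1"
    environment:
      - NIFI_CLUSTER_IS_NODE=true
      - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8088
      # Assumption: the image translates these into nifi.cluster.load.balance.*
      - NIFI_CLUSTER_LOAD_BALANCE_HOST=nifi-node-1
      - NIFI_CLUSTER_LOAD_BALANCE_PORT=6342
    ports:
      - "8088:8088"
      # The load-balance port has to be reachable from the other nodes.
      - "6342:6342"
```

The load-balance host should match a name the other nodes can reach, which is why the hostname and NIFI_CLUSTER_LOAD_BALANCE_HOST are kept identical here.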

Related

How to get redis address from docker compose?

I'm trying to pass the Redis URL to a Docker container, but so far I couldn't get it to work. I did a little research, and none of the answers worked for me.
version: '3.2'
services:
  redis:
    image: 'bitnami/redis:latest'
    container_name: redis
    hostname: redis
    expose:
      - 6379
    links:
      - api
  api:
    image: tufanmeric/api:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - proxy
    environment:
      - REDIS_URL=redis
    depends_on:
      - redis
    deploy:
      mode: global
      labels:
        - 'traefik.port=3002'
        - 'traefik.frontend.rule=PathPrefix:/'
        - 'traefik.frontend.rule=Host:api.example.com'
        - 'traefik.docker.network=proxy'
networks:
  proxy:
Error: Redis connection to redis failed - connect ENOENT redis

You can only communicate between containers on the same Docker network. Docker Compose creates a default network for you, and absent any specific declaration your redis container is on that network. But you also declare a separate proxy network, and only attach the api container to that other network.
The single simplest solution to this is to delete all of the networks: blocks everywhere and just use the default network Docker Compose creates for you. You may need to format the REDIS_URL variable as an actual URL, maybe like redis://redis:6379.
If you have a non-technical requirement to have separate networks, add - default to the networks listing for the api container.
You have a number of other settings in your docker-compose.yml that aren't especially useful. expose: does almost nothing at all, and is usually also provided in a Dockerfile. links: is an outdated way to make cross-container calls, and as you've declared it, it would enable calls from Redis to your API server rather than the other way around. hostname: has no effect outside the container itself and is usually totally unnecessary. container_name: does have some visible effects, but usually the container name Docker Compose picks is just fine.
This would leave you with:
version: '3.2'
services:
  redis:
    image: 'bitnami/redis:latest'
  api:
    image: tufanmeric/api:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
    deploy:
      mode: global
      labels:
        - 'traefik.port=3002'
        - 'traefik.frontend.rule=PathPrefix:/'
        - 'traefik.frontend.rule=Host:api.example.com'
        - 'traefik.docker.network=default'
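If the separate proxy network has to stay, the alternative of attaching the api container to both networks could look like the following sketch. It is trimmed to the networking-related keys; the service names and images are the ones from the question.

```yaml
version: '3.2'
services:
  redis:
    image: 'bitnami/redis:latest'
    # No networks: entry, so redis joins only the Compose default network.
  api:
    image: tufanmeric/api:latest
    networks:
      - default   # shared with redis
      - proxy     # kept for whatever else sits on the proxy network
    environment:
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
networks:
  proxy:
```

Because api is on both networks, the name redis resolves from inside the api container while redis itself stays off the proxy network.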

Kafka connect and HDFS in docker

I am using the Kafka Connect HDFS sink and Hadoop (for HDFS) in a docker-compose setup.
Hadoop (namenode and datanode) seems to be working correctly.
But I get an error from the Kafka Connect sink:
ERROR Recovery failed at state RECOVERY_PARTITION_PAUSED
(io.confluent.connect.hdfs.TopicPartitionWriter:277)
org.apache.kafka.connect.errors.DataException:
Error creating writer for log file hdfs://namenode:8020/logs/MyTopic/0/log
For information:
Hadoop services in my docker-compose.yml:
  namenode:
    image: uhopper/hadoop-namenode:2.8.1
    hostname: namenode
    container_name: namenode
    ports:
      - "50070:50070"
    networks:
      default:
      fides-webapp:
        aliases:
          - "hadoop"
    volumes:
      - namenode:/hadoop/dfs/name
    env_file:
      - ./hadoop.env
    environment:
      - CLUSTER_NAME=hadoop-cluster
  datanode1:
    image: uhopper/hadoop-datanode:2.8.1
    hostname: datanode1
    container_name: datanode1
    networks:
      default:
      fides-webapp:
        aliases:
          - "hadoop"
    volumes:
      - datanode1:/hadoop/dfs/data
    env_file:
      - ./hadoop.env
And my kafka-connect file:
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=MyTopic
hdfs.url=hdfs://namenode:8020
flush.size=3
EDIT:
I added an environment variable so that Kafka Connect is aware of the cluster name (CLUSTER_NAME, added to the Kafka Connect service in the docker-compose file).
The error is now different (so this seems to have solved one problem):
INFO Starting commit and rotation for topic partition scoring-topic-0 with start offsets {partition=0=0} and end offsets {partition=0=2}
(io.confluent.connect.hdfs.TopicPartitionWriter:368)
ERROR Exception on topic partition MyTopic-0: (io.confluent.connect.hdfs.TopicPartitionWriter:403)
org.apache.kafka.connect.errors.DataException: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File /topics/+tmp/MyTopic/partition=0/bc4cf075-ccfa-4338-9672-5462cc6c3404_tmp.avro
could only be replicated to 0 nodes instead of minReplication (=1).
There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
EDIT2:
The hadoop.env file is:
CORE_CONF_fs_defaultFS=hdfs://namenode:8020
# Configure default BlockSize and Replication for local
# data. Keep it small for experimentation.
HDFS_CONF_dfs_blocksize=1m
YARN_CONF_yarn_log___aggregation___enable=true
YARN_CONF_yarn_resourcemanager_recovery_enabled=true
YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
YARN_CONF_yarn_log_server_url=http://historyserver:8188/applicationhistory/logs/
YARN_CONF_yarn_timeline___service_enabled=true
YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager
YARN_CONF_yarn_timeline___service_hostname=historyserver

Finally, as noted by @cricket_007, I needed to configure hadoop.conf.dir.
The directory should contain hdfs-site.xml.
Since each service is dockerized, I needed to create a named volume in order to share the configuration files between the kafka-connect service and the namenode service.
To do this, I added the following to my docker-compose.yml:
volumes:
  hadoopconf:
Then for the namenode service I added:
volumes:
  - hadoopconf:/etc/hadoop
And for the kafka-connect service:
volumes:
  - hadoopconf:/usr/local/hadoop-conf
Finally, I set hadoop.conf.dir in my HDFS sink properties file to /usr/local/hadoop-conf.
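Put together, the shared-volume setup could look like this sketch. The kafka-connect service name and its image are assumptions (the original compose entry for that service is not shown), and the depends_on ordering is a guess at keeping the volume populated from the namenode image first.

```yaml
services:
  namenode:
    image: uhopper/hadoop-namenode:2.8.1
    volumes:
      - namenode:/hadoop/dfs/name
      # An empty named volume is filled with the image's /etc/hadoop content.
      - hadoopconf:/etc/hadoop
  kafka-connect:                          # hypothetical service name
    image: confluentinc/cp-kafka-connect  # assumed image, not from the question
    environment:
      - CLUSTER_NAME=hadoop-cluster
    depends_on:
      - namenode   # assumption: create namenode first so it populates the volume
    volumes:
      # Same volume, mounted where hadoop.conf.dir points.
      - hadoopconf:/usr/local/hadoop-conf
volumes:
  namenode:
  hadoopconf:
```

The sink properties file then carries hadoop.conf.dir=/usr/local/hadoop-conf next to the existing hdfs.url setting.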

Customized port is not functional in Docker-compose

I have a dumb question about using docker-compose.
The current scenario is that I'm trying to use my reverse_proxy to talk to my frontend_server. Inside the reverse_proxy, requests are redirected to frontend_server as follows:
Suppose I receive http://${REV_IP}:${REV_PORT}. It should redirect me to http://${FE_IP}:${FE_PORT}, but it redirects me to port 15000 instead.
(PROXIED_FRONTEND is http://${FE_IP}:${FE_PORT}; this environment variable is used for the redirection.)
Here is the relevant snippet of my docker-compose.yml:
version: '3'
services:
  reverse_proxy:
    image: "${ARTIFACTORY}/template-reverse-proxy:${BRANCH}-${REV_TAG}"
    networks:
      nucleus-network:
        ipv4_address: ${REV_IP}
    ports:
      - "${REV_PORT}:15999"
    environment:
      - KEYFILE_REVPROXY=${REV_KEY}
      - CERTFILE_REVPROXY=${REV_CERT}
      - PUBLIC_URL=${PUBLIC_URL}
      - PUBLIC_API_URL=${PUBLIC_API_URL}
      - PROXIED_FRONTEND=${PROXIED_FRONTEND}
      - PROXIED_PDF=${PROXIED_PDF}
    depends_on:
      - frontend_server
  frontend_server:
    image: "${ARTIFACTORY}/fe_server:${BRANCH}-${PDF_TAG}"
    ports:
      - "${FE_PORT}:15000"
    networks:
      nucleus-network:
        ipv4_address: ${FE_IP}
    environment:
      - FILEPATH_FE_SERVER=${FILEPATH_FE_SERVER}
    volumes:
      - "/home/lluo/dist_share:/app/dist"
    depends_on:
      - frontend_static
networks:
  nucleus-network:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: ${SUB_NET}

There is no need to publish the frontend_server port to the outside world (via ports:) unless you want to access it directly (e.g. for debugging).
Since you use docker-compose, the services share an internal Docker network on which the containers can see each other.
The only thing you need to do is set up your reverse proxy so that the proxied backend points to http://frontend_server:15000, and you're good to go. Docker's internal DNS will resolve the frontend_server service name to the appropriate container IP address.
For reference and more info, see this question and the links provided there: https://serverfault.com/questions/800689/how-to-use-haproxy-in-load-balancing-and-as-a-reverse-proxy-with-docker
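A trimmed sketch of that suggestion, reusing the images and variables from the question and leaving out the custom network so that both services land on the Compose default network (that simplification is an assumption, not part of the original setup):

```yaml
version: '3'
services:
  reverse_proxy:
    image: "${ARTIFACTORY}/template-reverse-proxy:${BRANCH}-${REV_TAG}"
    ports:
      - "${REV_PORT}:15999"
    environment:
      # Service name plus container port, not the published host port.
      - PROXIED_FRONTEND=http://frontend_server:15000
    depends_on:
      - frontend_server
  frontend_server:
    image: "${ARTIFACTORY}/fe_server:${BRANCH}-${PDF_TAG}"
    # No ports: entry; the proxy reaches it over the internal network.
```

Only the reverse proxy is published to the host here; frontend_server stays reachable solely by its service name from other containers.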

port forwarding in docker-compose

I'm trying to split a legacy system, which combines HBase and a PHP module, into two separate containers with the following docker-compose file:
version: '2'
services:
  php:
    image: my-legacy-php
    volumes:
      - ~/workspace/php:/workspace/php
    ports:
      - "80:80"
    links:
      - hbase
  hbase:
    image: dajobe/hbase
    hostname: hbase-docker
    ports:
      - "43590-44000:43590-44000"
      - "8085:8085"
      - "2181:2181"
      - "8080:8080"
      - "16010:16010"
      - "9095:9095"
      - "9090:9091"
      - "16020:16020"
      - "16030:16030"
      - "60000:60000"
    volumes:
      - ~/workspace/hbase-docker/data:/data
I'm using a public hbase-docker image that uses port 9090 for Thrift, while my legacy PHP module expects to connect via port 9091. I've tried to 'map' or 'forward' the port within the docker-compose.yml file with "9090:9091", without luck. I also tried the expose attribute of docker-compose, but it doesn't take two ports (only one, which is exposed to the other containers). How do I make that happen?
I want the listening port 9090 of the hbase container to appear as 9091 from inside the php container.

One possible solution is to build your own image with dajobe/hbase as the base image, modifying the HBase configuration and the ports exposed via EXPOSE to match your requirements, and then use that image in your compose file.
But this would require you to build and maintain the image yourself.

The solution is to put both services on the same docker network.
Specifically, add this to your docker-compose.yml:
networks:
  app_net:
    driver: bridge
Then, in each service's config be sure to include:
networks:
  - app_net
Finally (and you've already done this), be sure that the correct port mapping is included in the config for hbase:
ports:
  - "9090:9091"

rationale behind docker compose "links" order

I have a Redis - Elasticsearch - Logstash - Kibana stack in docker which I am orchestrating using docker compose.
Redis will receive the logs from a remote location, will forward them to Logstash, and then the customary Elasticsearch, Kibana.
In the docker-compose.yml, I am confused about the order of "links"
Elasticsearch links to no one while logstash links to both redis and elasticsearch
elasticsearch:
redis:
logstash:
  links:
    - elasticsearch
    - redis
kibana:
  links:
    - elasticsearch
Is this order correct? What is the rationale behind choosing the "link" direction?
Why don't we say, elasticsearch is linked to logstash?

Instead of using the legacy container linking method, you could use Docker user-defined networks. Basically, you define a network for your services and then indicate in the docker-compose file that you want the containers to run on that network. If your containers all run on the same network, they can access each other via their container names (DNS records are added automatically).
1) : Create User Defined Network
docker network create pocnet
2) : Update docker-compose file
You want to add your containers to the network you just created. Your docker-compose file would look something along the lines of this :
version: '2'
services:
  elasticsearch:
    image: elasticsearch
    container_name: elasticsearch
    ports:
      - "{your:ports}"
    networks:
      - pocnet
  redis:
    image: redis
    container_name: redis
    ports:
      - "{your:ports}"
    networks:
      - pocnet
  logstash:
    image: logstash
    container_name: logstash
    ports:
      - "{your:ports}"
    networks:
      - pocnet
  kibana:
    image: kibana
    container_name: kibana
    ports:
      - "5601:5601"
    networks:
      - pocnet
networks:
  pocnet:
    external: true
3) : Start Services
docker-compose up
note : you might want to open a new shell window to run step 4.
4) : Test
Go into the Kibana container and see if you can ping the elasticsearch container.
your__Machine:/ docker exec -it kibana bash
kibana#123456:/# ping elasticsearch

First of all, links in Docker are unidirectional.
More info on links:
There are legacy links, and links in user-defined networks.
The legacy link provided four major functionalities on the default bridge network:
name resolution
name alias for the linked container using --link=CONTAINER-NAME:ALIAS
secured container connectivity (in isolation via --icc=false)
environment variable injection
Compared with the four functionalities above, non-default user-defined networks provide the following without any additional configuration:
automatic name resolution using DNS
automatic secured isolated environment for the containers in a network
ability to dynamically attach and detach to multiple networks
support for the --link option to provide a name alias for the linked container
In your case, automatic DNS will help you on a user-defined network. First create a new network:
docker network create ELK -d bridge
With this approach you don't need to link containers on the same user-defined network. You just have to put your ELK stack and Redis containers in the ELK network and remove the links directives from the compose file.
Your order looks fine to me. If you have any problem regarding the order, or with waiting for services to come up in dependent containers, you can use something like the following:
version: "2"
services:
  web:
    build: .
    ports:
      - "80:8000"
    depends_on:
      - "db"
    entrypoint: ./wait-for-it.sh db:5432
  db:
    image: postgres
This will make the web container wait until it can connect to the db.
You can get wait-for-it script from here.
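As far as I can tell, wait-for-it.sh only runs a follow-up command when one is given after "--", so a variant that waits and then starts the application might look like the sketch below. The "python app.py" command is a placeholder for whatever the web image normally runs, and the script is assumed to be present in the image's working directory.

```yaml
version: "2"
services:
  web:
    build: .
    ports:
      - "80:8000"
    depends_on:
      - "db"
    # Wait until db:5432 accepts connections, then start the app.
    # "python app.py" is a hypothetical start command.
    command: ["./wait-for-it.sh", "db:5432", "--", "python", "app.py"]
  db:
    image: postgres
```

Using command: instead of entrypoint: leaves the image's own entrypoint untouched, which may or may not be what a given image needs.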
