--wal-method=stream is not working for PostgreSQL 10.3

I am just learning streaming replication with Postgres DB,
trying to achieve something like what is described here: https://cloud.google.com/community/tutorials/setting-up-postgres-hot-standby
I have done all the configuration as described for both the master and the standby server, but in my case I am using Postgres version 10.3.
While running the pg_basebackup command with the option --wal-method=stream I get the error below:
[vagrant@postgres1 bin]$ ./pg_basebackup -h /tmp/ -p 5432 -D /home/vagrant/postgres/data2 -P -v --wal-method=stream
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2D000028 on timeline 1
pg_basebackup: starting background WAL receiver
33824/33824 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2D0000F8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: could not send copy-end packet: no COPY in progress
pg_basebackup: child process exited with error 1
pg_basebackup: removing data directory "/home/vagrant/postgres/data2"
However, executing pg_basebackup with the --wal-method=fetch option completed successfully:
[vagrant@postgres1 bin]$ ./pg_basebackup -h /tmp/ -p 5432 -D /home/vagrant/postgres/data2 -P -v --wal-method=fetch
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2F000028 on timeline 1
50209/50209 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2F0000F8
pg_basebackup: base backup completed
But what I want is real-time replication from the primary to the secondary, which is what the --wal-method=stream option provides.
Please let me know if anyone has any input on this.
Thanks in advance :)

"stream" is the default value for wal-method. Just skip this part.
Run the below command without using --wal-method=stream
./pg_basebackup -h /tmp/ -p 5432 -D /home/vagrant/postgres/data2 -P -v
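Not part of the original answer, but a possibly useful addition: since the goal is a hot standby, pg_basebackup in version 10 also accepts -R (--write-recovery-conf), which writes a minimal recovery.conf pointing back at the primary and saves a manual step. A sketch reusing the same paths as above:
# same backup as before, plus -R to generate recovery.conf in the data directory
./pg_basebackup -h /tmp/ -p 5432 -D /home/vagrant/postgres/data2 -P -v -R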

Related

Cannot run nodetool commands or cqlsh against Scylla in Docker

I am new to Scylla and I am following the instructions to try it in a container as per this page: https://hub.docker.com/r/scylladb/scylla/.
The following command ran fine.
docker run --name some-scylla --hostname some-scylla -d scylladb/scylla
I see the container is running.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e6c4e19ff1bd scylladb/scylla "/docker-entrypoint.…" 14 seconds ago Up 13 seconds 22/tcp, 7000-7001/tcp, 9042/tcp, 9160/tcp, 9180/tcp, 10000/tcp some-scylla
However, I'm unable to use nodetool or cqlsh. I get the following output.
$ docker exec -it some-scylla nodetool status
Using /etc/scylla/scylla.yaml as the config file
nodetool: Unable to connect to Scylla API server: java.net.ConnectException: Connection refused (Connection refused)
See 'nodetool help' or 'nodetool help <command>'.
and
$ docker exec -it some-scylla cqlsh
Connection error: ('Unable to connect to any servers', {'172.17.0.2': error(111, "Tried connecting to [('172.17.0.2', 9042)]. Last error: Connection refused")})
Any ideas?
Update
Looking at docker logs some-scylla I see some errors in the logs; the last one is as follows.
2021-10-03 07:51:04,771 INFO spawned: 'scylla' with pid 167
Scylla version 4.4.4-0.20210801.69daa9fd0 with build-id eb11cddd30e88ef39c32c847e70181b5cf786355 starting ...
command used: "/usr/bin/scylla --log-to-syslog 0 --log-to-stdout 1 --default-log-level info --network-stack posix --developer-mode=1 --overprovisioned --listen-address 172.17.0.2 --rpc-address 172.17.0.2 --seed-provider-parameters seeds=172.17.0.2 --blocked-reactor-notify-ms 999999999"
parsed command line options: [log-to-syslog: 0, log-to-stdout: 1, default-log-level: info, network-stack: posix, developer-mode: 1, overprovisioned, listen-address: 172.17.0.2, rpc-address: 172.17.0.2, seed-provider-parameters: seeds=172.17.0.2, blocked-reactor-notify-ms: 999999999]
ERROR 2021-10-03 07:51:05,203 [shard 6] seastar - Could not setup Async I/O: Resource temporarily unavailable. The most common cause is not enough request capacity in /proc/sys/fs/aio-max-nr. Try increasing that number or reducing the amount of logical CPUs available for your application
2021-10-03 07:51:05,316 INFO exited: scylla (exit status 1; not expected)
2021-10-03 07:51:06,318 INFO gave up: scylla entered FATAL state, too many start retries too quickly
Update 2
The reason for the error was described on the Docker Hub page linked above. I had to start the container with the number of CPUs limited via --smp 1, as follows.
docker run --name some-scylla --hostname some-scylla -d scylladb/scylla --smp 1
According to the above page:
This command will start a Scylla single-node cluster in developer mode
(see --developer-mode 1) limited by a single CPU core (see --smp).
Production grade configuration requires tuning a few kernel parameters
such that limiting number of available cores (with --smp 1) is the
simplest way to go.
Multiple cores requires setting a proper value to the
/proc/sys/fs/aio-max-nr. On many non production systems it will be
equal to 65K. ...
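With the container restarted this way, re-running the commands from the beginning of the question is a quick way to confirm the node is healthy:
docker exec -it some-scylla nodetool status
docker exec -it some-scylla cqlsh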
As you have found out, in order to be able to use additional CPU cores you'll need to increase the fs.aio-max-nr kernel parameter.
You may run as root:
# sysctl -w fs.aio-max-nr=65535
which should be enough for most systems. Should you still see an error preventing Scylla from using all of your CPU cores, increase the value further.
Do note that the above setting is not persistent. Edit /etc/sysctl.conf to make it persistent across reboots.
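For example, a minimal sketch of making it persistent (the file name under /etc/sysctl.d/ is arbitrary):
# persist the setting and load it immediately
echo "fs.aio-max-nr = 65535" | sudo tee /etc/sysctl.d/99-aio-max-nr.conf
sudo sysctl -p /etc/sysctl.d/99-aio-max-nr.conf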

Redis-server closed when using redis-benchmark

I am using redis-benchmark on Ubuntu in a Docker container.
I am experimenting with memory and CPU performance by limiting the container's resources through Docker.
For example:
docker run --cpus="2" -m=4M -it [imageID]
My image is based on ubuntu:latest with redis-server installed.
When I get into the container I start redis-server in the background and run redis-benchmark:
10000 changes in 60 seconds. Saving...
Background saving started by pid 17
DB saved on disk
RDB: 0MB of memory used by copy-on-write
Background saving terminated with success
Error: Server closed the connection
[1]+ Killed redis-server
An error shows up.
My redis-benchmark command, which uses a keyspace of 100000 random keys:
redis-benchmark -q -r 100000 -c 1
If this is caused by the memory limit, I would expect this command:
redis-benchmark -c 1 -r 100000 -n 400
to shut down the redis-server as well.
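One way to test the memory theory (a hedged diagnostic, not from the original post) is to check whether the kernel OOM killer terminated redis-server, either via the host kernel log or by asking Docker:
# on the host: look for OOM kill messages
dmesg | grep -i -E 'oom|killed process'
# ask Docker whether the container hit its memory limit
docker inspect [containerID] --format '{{.State.OOMKilled}}'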

Issue accessing vespa outside docker container

I installed Docker on Mac and am trying to run Vespa in Docker following the steps in this link:
https://docs.vespa.ai/documentation/vespa-quick-start.html
I didn't have any issues through step 4: I see the vespa container running after step 2, and step 3 returned a 200 OK response.
But step 5 failed to return a 200 OK response. Below is the command I ran in my terminal:
curl -s --head http://localhost:8080/ApplicationStatus
I keep getting
curl: (52) Empty reply from server
whenever I run it without the -s option.
So I looked at the listening ports inside my vespa container and see nothing for 8080, but I do see 19071 (used in step 3):
➜ ~ docker exec vespa bash -c 'netstat -vatn| grep 8080'
➜ ~ docker exec vespa bash -c 'netstat -vatn| grep 19071'
tcp 0 0 0.0.0.0:19071 0.0.0.0:* LISTEN
Below doc has info related to vespa ports
https://docs.vespa.ai/documentation/reference/files-processes-and-ports.html
I'm assuming port 8080 should be active after docker run (step 2 of the quick-start link) and accessible from outside the container, since port mapping is done.
But I don't see port 8080 active inside the container in the first place.
Am I missing something? Do I need to perform any additional steps beyond those in the quick start? FYI, I installed Jenkins inside Docker and was able to access it from outside the container via port mapping, but I'm not sure why that's not working with Vespa. I have been trying for quite some time with no progress. Please advise if I'm missing something here.
You have too little memory for your Docker container; the guide says "Minimum 6GB memory dedicated to Docker (the default is 2GB on Macs)." See https://docs.vespa.ai/documentation/vespa-quick-start.html
The deadlock-detector warnings and the failure to get configuration from the configuration server (which was likely OOM-killed) indicate that you are too low on memory.
My guess is that your jdisc container had not finished initializing or did not initialize properly. Did you try to check the log?
docker exec vespa bash -c '/opt/vespa/bin/vespa-logfmt /opt/vespa/logs/vespa/vespa.log'
This should tell you if something went wrong. When it is ready to receive requests you will see something like this:
[2018-12-10 06:30:37.854] INFO : container Container.org.eclipse.jetty.server.AbstractConnector Started SearchServer#79afa369{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
[2018-12-10 06:30:37.857] INFO : container Container.org.eclipse.jetty.server.Server Started #10280ms
[2018-12-10 06:30:37.857] INFO : container Container.com.yahoo.container.jdisc.ConfiguredApplication Switching to the latest deployed set of configurations and components. Application switch number: 0
[2018-12-10 06:30:37.859] INFO : container Container.com.yahoo.container.jdisc.ConfiguredApplication Initializing new set of configurations and components. Application switch number: 1
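If low memory is the suspect, one quick check (illustrative, not from the quick-start guide) is to see how much memory Docker actually grants the container:
# show live memory usage and the memory limit for the vespa container
docker stats --no-stream vespa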

docker - driver "devicemapper" failed to remove root filesystem after process in container killed

I am using Docker version 17.06.0-ce on Red Hat with devicemapper storage. I am launching a container running a long-running service. The master process inside the container sometimes dies for whatever reason, and I get the following error message:
/bin/bash: line 1: 40 Killed python -u scripts/server.py start go
I would like the container to exit and be restarted by Docker. However, the container never exits. If I try to remove it manually I get the following error:
Error response from daemon: driver "devicemapper" failed to remove root filesystem.
After googling, I tried a bunch of things:
docker rm -f <container>
rm -f <pth to mount>
umount <pth to mount>
All result in "device is busy". The only remedy right now is to reboot the host system, which is obviously not a long-term solution.
Any ideas?
I had the same problem, and the solution was a real surprise.
Here is the error on docker rm:
$ docker rm 08d51aad0e74
Error response from daemon: driver "devicemapper" failed to remove root filesystem for 08d51aad0e74060f54bba36268386fe991eff74570e7ee29b7c4d74047d809aa: remove /var/lib/docker/devicemapper/mnt/670cdbd30a3627ae4801044d32a423284b540c5057002dd010186c69b6cc7eea: device or resource busy
Then I did the following (basically go through all processes and look for docker in mountinfo):
$ grep docker /proc/*/mountinfo | grep 958722d105f8586978361409c9d70aff17c0af3a1970cb3c2fb7908fe5a310ac
/proc/20416/mountinfo:629 574 253:15 / /var/lib/docker/devicemapper/mnt/958722d105f8586978361409c9d70aff17c0af3a1970cb3c2fb7908fe5a310ac rw,relatime shared:288 - xfs /dev/mapper/docker-253:5-786536-958722d105f8586978361409c9d70aff17c0af3a1970cb3c2fb7908fe5a310ac rw,nouuid,attr2,inode64,logbsize=64k,sunit=128,swidth=128,noquota
This got me the PID of the offending process keeping the mount busy: 20416 (the item after /proc/).
So I ran ps -p on it and, to my surprise, found:
[devops#dp01app5030 SeGrid]$ ps -p 20416
PID TTY TIME CMD
20416 ? 00:00:19 ntpd
A true WTF moment. So I paired the problem with Google and found this: https://github.com/docker/for-linux/issues/124
It turns out I had to restart the ntp daemon, and that fixed the issue!
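For the record, the manual search can be generalized into a one-liner that prints the PIDs of every process still holding a given devicemapper mount (a sketch; <id> stands for the hash from the error message and is the only placeholder):
# grep -l prints the matching files; sed extracts the PID from each path
grep -l <id> /proc/*/mountinfo | sed 's|/proc/\([0-9]*\)/mountinfo|\1|'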

How to debug "WSREP: SST failed: 1 (Operation not permitted)" with a MariaDB Galera cluster in Docker?

Requirement: CentOS-based Docker container providing a MariaDB 10.x Galera cluster
Host Environment: OS X El Capitan 10.11.6, Docker 1.12.5 (14777)
Docker Container OS: CentOS Linux release 7.3.1611 (Core)
DB: 10.1.20-MariaDB
I found a promising Docker image, but the documentation seems to be obsolete and the commands to start the cluster do not work. At the time of writing, the image uses wsrep_sst_method = rsync, so I figured that the following commands should work (replace /Users/Me/somedb with an empty directory on your host):
docker pull dayreiner/centos7-mariadb-10.1-galera
docker run -d --name db1 -h db1host -p 3306:3306 -e CLUSTER_NAME=joe -e CLUSTER=BOOTSTRAP -e MYSQL_ROOT_PASSWORD='pwd' -v /Users/Me/somedb:/var/lib/mysql dayreiner/centos7-mariadb-10.1-galera:latest
docker run -d --name db2 -h db2host -p 3307:3306 --link db1 -e CLUSTER_NAME=joe -e CLUSTER=db1host,db2host -e MYSQL_ROOT_PASSWORD='pwd' -v /Users/Me/somedb:/var/lib/mysql dayreiner/centos7-mariadb-10.1-galera:latest
The first container (db1) comes up and seems OK, but the second command, which tries to add db2 as a second node to the Galera cluster, results in the following error (docker logs db2):
2017-01-10 15:26:10 139742710823680 [Note] WSREP: New cluster view: global state: :-1, view# 0: Primary, number of nodes: 1, my index: 0, protocol version 3
2017-01-10 15:26:10 139742711142656 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2017-01-10 15:26:10 139742711142656 [ERROR] Aborting
I could not figure out what is wrong here and would appreciate ideas on how to analyze this further. Is this a problem with rsync, Galera, or even Docker?
That's my image on Docker Hub.
I had not tested the cluster (until now) on a single host, only on multiple hosts. You're right though: running two nodes on a single host seems to abort the second node on start.
This looks to be caused by the default bridge network not behaving nicely. Possibly some issue with handling the ports for state transfer. Not really sure why.
If you modify your commands to first create a custom network for your clustered containers to use on the backend, and then run the cluster members using that network, that seems to work when running two nodes on a single host:
# docker network create mariadb
# docker run -d --network=mariadb -p 3307:3306 --name db1 -e CLUSTER_NAME=test -e CLUSTER=BOOTSTRAP -e MYSQL_ROOT_PASSWORD=test -v /opt/test/db1:/var/lib/mysql dayreiner/centos7-mariadb-10.1-galera:latest
# docker run -d --network=mariadb -p 3308:3306 --name db2 -e CLUSTER_NAME=test -e CLUSTER=db1,db2 -e MYSQL_ROOT_PASSWORD=test -v /opt/test/db2:/var/lib/mysql dayreiner/centos7-mariadb-10.1-galera:latest
No errors this time on the second node:
# docker logs db2 -f
...snip
2017-01-12 20:33:08 139726185019648 [Note] WSREP: Signalling provider to continue.
2017-01-12 20:33:08 139726185019648 [Note] WSREP: SST received: 42eaa277-d906-11e6-b98a-3e6b9531c1b7:0
2017-01-12 20:33:08 139725604124416 [Note] WSREP: 1.0 (f170852fe1b6): State transfer from 0.0 (951fdda2454b) complete.
2017-01-12 20:33:08 139725604124416 [Note] WSREP: Shifting JOINER -> JOINED (TO: 0)
2017-01-12 20:33:08 139725604124416 [Note] WSREP: Member 1.0 (f170852fe1b6) synced with group.
2017-01-12 20:33:08 139725604124416 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2017-01-12 20:33:08 139726105180928 [Note] WSREP: Synchronized with group, ready for connections
2017-01-12 20:33:08 139726105180928 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-01-12 20:33:08 139726185019648 [Note] mysqld: ready for connections.
Version: '10.1.20-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
Try that and see how it goes. Also, if you run it using docker-compose it works without any problems, most likely because Compose creates a dedicated container network by default. You can see an example compose file in this gist.
Just make sure to use a different directory for each MariaDB instance, and after your cluster has started, stop db1 and relaunch it as a regular cluster member, as shown below (otherwise the next time db1 is started it will bootstrap a new cluster again).
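For example, the relaunch of db1 as a regular member could look like this (a sketch reusing the exact flags from above, with CLUSTER changed from BOOTSTRAP to the node list):
docker stop db1 && docker rm db1
docker run -d --network=mariadb -p 3307:3306 --name db1 -e CLUSTER_NAME=test -e CLUSTER=db1,db2 -e MYSQL_ROOT_PASSWORD=test -v /opt/test/db1:/var/lib/mysql dayreiner/centos7-mariadb-10.1-galera:latest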
It works after upgrading the Docker image to MariaDB 10.2.3 (from 10.1.20).
I am not 100% sure whether I have a truly valid cluster now, but at least show status like "wsrep_cluster_size"; produces the following output and the DB is usable:
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
Note: I also omitted the -v option and placed the DB files inside the Docker container instead of on an external volume. I don't think this makes a difference for the cluster, but I did not verify 10.2.3 with -v. However, I tried 10.1.20 with both variations (external volume with -v and container-internal files) and neither worked.
