rabbitmq docker wss not working even though https works - docker

Update
I initially started the docker container using this command:
sudo docker run -d -it --hostname some-rabbitmq --name rabbitmq -p 5672:5672 -p 15672:15672 -p 15674:15674 --restart=unless-stopped -v /opt/rabbitmq-test:/etc/rabbitmq/ rabbitmq:3.9-management
However, I needed to add TLS without losing the data that was already saved on it, so I added more ports: I stopped the Docker service and container, then updated the config.v2.json and hostconfig.json files to expose the ports I wanted, including 15673.
I'm using the official rabbitmq Docker image as my RabbitMQ server. I've configured TLS, but for some reason the wss:// connection fails completely. This is my rabbitmq.conf file:
loopback_users.guest = false
management.cors.allow_origins.1 = *
management.tcp.port = 80
# ===========================================================================
# ====================== TLS for webstomp plugin ============================
# ===========================================================================
# source of tls settings for webstomp plugin: https://www.rabbitmq.com/web-stomp.html#tls-versions
web_stomp.ssl.port = 15673
web_stomp.ssl.backlog = 1024
web_stomp.ssl.cacertfile = /etc/letsencrypt/live/{example.com}/chain.pem
web_stomp.ssl.certfile = /etc/letsencrypt/live/{example.com}/cert.pem
web_stomp.ssl.keyfile = /etc/letsencrypt/live/{example.com}/privkey.pem
# web_stomp.ssl.password = changeme
web_stomp.ssl.honor_cipher_order = true
web_stomp.ssl.honor_ecc_order = true
web_stomp.ssl.client_renegotiation = false
web_stomp.ssl.secure_renegotiate = true
web_stomp.ssl.versions.1 = tlsv1.2
web_stomp.ssl.versions.2 = tlsv1.1
web_stomp.ssl.ciphers.1 = ECDHE-ECDSA-AES256-GCM-SHA384
web_stomp.ssl.ciphers.2 = ECDHE-RSA-AES256-GCM-SHA384
web_stomp.ssl.ciphers.3 = ECDHE-ECDSA-AES256-SHA384
web_stomp.ssl.ciphers.4 = ECDHE-RSA-AES256-SHA384
web_stomp.ssl.ciphers.5 = ECDH-ECDSA-AES256-GCM-SHA384
web_stomp.ssl.ciphers.6 = ECDH-RSA-AES256-GCM-SHA384
web_stomp.ssl.ciphers.7 = ECDH-ECDSA-AES256-SHA384
web_stomp.ssl.ciphers.8 = ECDH-RSA-AES256-SHA384
web_stomp.ssl.ciphers.9 = DHE-RSA-AES256-GCM-SHA384
# ==========================================================================
# ====================== TLS for management ui =============================
# ==========================================================================
# source of tls settings for management ui: https://www.rabbitmq.com/management.html#single-listener-https
management.ssl.port = 443
# management.ssl.cacertfile = /opt/rabbitmq-test/tls-ca/letsencrypt.pem
management.ssl.cacertfile = /etc/letsencrypt/live/{example.com}/chain.pem
management.ssl.certfile = /etc/letsencrypt/live/{example.com}/cert.pem
management.ssl.keyfile = /etc/letsencrypt/live/{example.com}/privkey.pem
## This key must only be used if private key is password protected
# management.ssl.password = bunnies
# For RabbitMQ 3.7.10 and later versions
management.ssl.honor_cipher_order = true
management.ssl.honor_ecc_order = true
management.ssl.client_renegotiation = false
management.ssl.secure_renegotiate = true
management.ssl.versions.1 = tlsv1.2
management.ssl.versions.2 = tlsv1.1
management.ssl.ciphers.1 = ECDHE-ECDSA-AES256-GCM-SHA384
management.ssl.ciphers.2 = ECDHE-RSA-AES256-GCM-SHA384
management.ssl.ciphers.3 = ECDHE-ECDSA-AES256-SHA384
management.ssl.ciphers.4 = ECDHE-RSA-AES256-SHA384
management.ssl.ciphers.5 = ECDH-ECDSA-AES256-GCM-SHA384
management.ssl.ciphers.6 = ECDH-RSA-AES256-GCM-SHA384
management.ssl.ciphers.7 = ECDH-ECDSA-AES256-SHA384
management.ssl.ciphers.8 = ECDH-RSA-AES256-SHA384
management.ssl.ciphers.9 = DHE-RSA-AES256-GCM-SHA384
## Usually RabbitMQ nodes do not perform peer verification of HTTP API clients
## but it can be enabled if needed. Clients then will have to be configured with
## a certificate and private key pair.
##
## See https://www.rabbitmq.com/ssl.html#peer-verification for details.
# management.ssl.verify = verify_peer
# management.ssl.fail_if_no_peer_cert = true
# ==========================================================================
# ========================= TLS for core server ============================
# ==========================================================================
# source of tls settings: https://www.rabbitmq.com/ssl.html#testssl-sh
listeners.ssl.default = 5671
# ssl_options.cacertfile = /opt/rabbitmq-test/tls-ca/letsencrypt.pem
ssl_options.cacertfile = /etc/letsencrypt/live/{example.com}/chain.pem
ssl_options.certfile = /etc/letsencrypt/live/{example.com}/cert.pem
ssl_options.keyfile = /etc/letsencrypt/live/{example.com}/privkey.pem
ssl_options.versions.1 = tlsv1.2
ssl_options.verify = verify_peer
ssl_options.fail_if_no_peer_cert = false
ssl_options.honor_cipher_order = true
ssl_options.honor_ecc_order = true
# These are highly recommended for TLSv1.2 but cannot be used
# with TLSv1.3. If TLSv1.3 is enabled, these lines MUST be removed.
ssl_options.client_renegotiation = false
ssl_options.secure_renegotiate = true
ssl_options.ciphers.1 = ECDHE-ECDSA-AES256-GCM-SHA384
ssl_options.ciphers.2 = ECDHE-RSA-AES256-GCM-SHA384
ssl_options.ciphers.3 = ECDH-ECDSA-AES256-GCM-SHA384
ssl_options.ciphers.4 = ECDH-RSA-AES256-GCM-SHA384
ssl_options.ciphers.5 = DHE-RSA-AES256-GCM-SHA384
ssl_options.ciphers.6 = DHE-DSS-AES256-GCM-SHA384
ssl_options.ciphers.7 = ECDHE-ECDSA-AES128-GCM-SHA256
ssl_options.ciphers.8 = ECDHE-RSA-AES128-GCM-SHA256
ssl_options.ciphers.9 = ECDH-ECDSA-AES128-GCM-SHA256
ssl_options.ciphers.10 = ECDH-RSA-AES128-GCM-SHA256
ssl_options.ciphers.11 = DHE-RSA-AES128-GCM-SHA256
ssl_options.ciphers.12 = DHE-DSS-AES128-GCM-SHA256
If you didn't read all that: the web_stomp plugin is configured to accept wss:// connections on port 15673, and regular ws:// on 15674. Note that I'm not using AMQP, and therefore amqps is not configured.
With this configuration I'm able to connect to ws://example.com:15674/ws using a webstomp javascript library using this code:
const stompClient = new StompJs.Client({
brokerURL: url,
connectHeaders: {
login: userId,
passcode: password,
},
reconnectDelay: 20000,
heartbeatIncoming: 10000,
heartbeatOutgoing: 10000,
});
// ... set my onConnect and onError listeners
stompClient.activate();
I'm also able to access both http://example.com and https://example.com from the browser. But strangely, I cannot connect to wss://example.com:15673/ws using the same JavaScript code above.
When I run sudo docker logs rabbitmq (rabbitmq is the name of the running container), I see entries like this:
2022-02-08 16:06:45.314017+00:00 [info] <0.11704.33> accepting AMQP connection <0.11704.33> (162.142.125.221:54938 -> 172.17.0.2:5672)
2022-02-08 16:06:45.314245+00:00 [erro] <0.11704.33> closing AMQP connection <0.11704.33> (162.142.125.221:54938 -> 172.17.0.2:5672):
2022-02-08 16:06:45.314245+00:00 [erro] <0.11704.33> amqp1_0_plugin_not_enabled
2022-02-08 16:40:01.274168+00:00 [noti] <0.12580.33> TLS server: In state hello at tls_handshake.erl:346 generated SERVER ALERT: Fatal - Insufficient Security
2022-02-08 16:40:01.274168+00:00 [noti] <0.12580.33> - no_suitable_ciphers
2022-02-08 16:40:02.790021+00:00 [noti] <0.12597.33> TLS server: In state hello at tls_handshake.erl:346 generated SERVER ALERT: Fatal - Insufficient Security
2022-02-08 16:40:02.790021+00:00 [noti] <0.12597.33> - no_suitable_ciphers
2022-02-08 16:40:03.423216+00:00 [noti] <0.12604.33> TLS server: In state hello at tls_handshake.erl:346 generated SERVER ALERT: Fatal - Insufficient Security
2022-02-08 16:40:03.423216+00:00 [noti] <0.12604.33> - no_suitable_ciphers
2022-02-08 16:40:06.467195+00:00 [noti] <0.12610.33> TLS server: In state hello at tls_handshake.erl:346 generated SERVER ALERT: Fatal - Insufficient Security
2022-02-08 16:40:06.467195+00:00 [noti] <0.12610.33> - no_suitable_ciphers
2022-02-08 18:12:36.615274+00:00 [noti] <0.16116.33> TLS server: In state hello at tls_record.erl:564 generated SERVER ALERT: Fatal - Unexpected Message
2022-02-08 18:12:36.615274+00:00 [noti] <0.16116.33> - {unsupported_record_type,65}
2022-02-08 18:28:00.774093+00:00 [noti] <0.16535.33> TLS server: In state hello at tls_record.erl:564 generated SERVER ALERT: Fatal - Unexpected Message
2022-02-08 18:28:00.774093+00:00 [noti] <0.16535.33> - {unsupported_record_type,71}
None of these entries correspond to my actual connection attempts. I don't know which connections are triggering these logs, because my own attempts do not show up in the logs at all.
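One thing the no_suitable_ciphers alerts can indicate is that some of the cipher names in rabbitmq.conf aren't recognized by the TLS library in use. A quick first pass is to ask the local OpenSSL build (via Python's ssl module) which names it accepts; this is only a sketch and checks the host's OpenSSL, which may differ from the one Erlang links against:

```python
import ssl

def cipher_supported(name: str) -> bool:
    """Return True if the local OpenSSL build recognizes this cipher string."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        ctx.set_ciphers(name)  # raises SSLError for unknown/unusable names
        return True
    except ssl.SSLError:
        return False

# A few names copied from the config above; the static-ECDH (ECDH-*) suites
# were removed in OpenSSL 1.1.0 and will typically report as unsupported.
configured = [
    "ECDHE-ECDSA-AES256-GCM-SHA384",
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDH-ECDSA-AES256-GCM-SHA384",
]

for name in configured:
    print(f"{name}: {'ok' if cipher_supported(name) else 'NOT supported here'}")
```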
RabbitMQ's website has a page for troubleshooting TLS, but when I go into the Docker container (using docker exec -it rabbitmq bash) and try to use the rabbitmq-diagnostics command, I get:
Error: unable to perform an operation on node 'rabbit@name-rabbitmq-test.localdomain'. Please see diagnostics information and suggestions below.
I've also tried the openssl s_client command mentioned on the troubleshooting page, but what I get is Error: verify error:num=2:unable to get issuer certificate. I've tried both chain.pem and fullchain.pem (the CA files provided by Let's Encrypt) without any luck; the error is the same.
I'm not a RabbitMQ expert, so I'm not really sure how to proceed. I've tried to include additional info about what I've tried, but please remember that the main problem is that I cannot connect using wss:// even though I can connect with ws:// and https://. Any ideas as to why this is happening, or what could help me resolve the problem? Thanks.

Many thanks to @jhmckimm for the idea. The problem was that port 15673 of the Docker container was not exposed by the host. To fix this I had to add that port to the ExposedPorts in the /var/lib/docker/containers/[hash_of_the_container]/config.v2.json file of the Docker system. See here for more information.
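For reference, the relevant fragment of config.v2.json looked roughly like this after the edit. This is a sketch, not the full file: surrounding keys are omitted, the Docker daemon must be stopped before editing, and hostconfig.json needs a matching PortBindings entry for the new port.

```json
{
  "Config": {
    "ExposedPorts": {
      "5672/tcp": {},
      "15672/tcp": {},
      "15673/tcp": {},
      "15674/tcp": {}
    }
  }
}
```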

Related

Docker Redis TLS authentication failure with .netcore app

I am trying to use Redis with TLS from a .NET Core application, and I get an authentication error.
The Setup:
Docker:
I created a redis docker container using redis:6.2.0
docker-compose.yaml:
.
.
redis:
image: redis:6.2.0
command: redis-server /usr/local/etc/redis/redis.conf --appendonly yes
container_name: "cxm-redis"
ports:
- "6379:6379"
volumes:
- cxm-redis-data:/data
- C:/SaaS/certs/redis.conf:/usr/local/etc/redis/redis.conf
- C:/SaaS/certs/tests/tls/redis.crt:/usr/local/etc/redis/redis.crt
- C:/SaaS/certs/tests/tls/redis.key:/usr/local/etc/redis/redis.key
- C:/SaaS/certs/tests/tls/ca.crt:/usr/local/etc/redis/ca.crt
Up to here all looks good (as far as I can tell); I managed to authenticate using the following command:
redis-cli --tls --cert ../usr/local/etc/redis/redis.crt --key /usr/local/etc/redis/redis.key --cacert /usr/local/etc/redis/ca.crt
and I can successfully ping and request keys.
I created the certificates with openssl, and for redis.conf I am using the redis.conf example from Redis.
The important bits:
### TLS
tls-port 6379
tls-cert-file /usr/local/etc/redis/redis.crt
tls-key-file /usr/local/etc/redis/redis.key
tls-ca-cert-file /usr/local/etc/redis/ca.crt
netcore:
For my .netcore application I am using the StackExchange library and for the TLS connection I followed the instructions here, like so
var options = new ConfigurationOptions
{
EndPoints = { "redis-test:6379" },
Password = "not-the-actual-password",
Ssl = true
};
options.CertificateSelection += delegate {
return new X509Certificate2("./redis_certificate.p12");
};
_db = ConnectionMultiplexer.Connect(options).GetDatabase();
the redis_certificate.p12 was generated using openssl with this command line
openssl pkcs12 -export -out sample_certificate.p12 -inkey redis.key -in redis.crt
The Issue:
When I make a request to redis from my app I get the following error:
It was not possible to connect to the redis server(s). There was an authentication failure; check that passwords (or client certificates) are configured correctly. AuthenticationFailure on redis-test:6379/Interactive, Initializing/NotStarted
in my apps logs, and I get the following in my redis logs:
Error accepting a client connection: error:1408F10B:SSL routines:ssl3_get_record:wrong version number
Error accepting a client connection: error:14094418:SSL routines:ssl3_read_bytes:tlsv1 alert unknown ca,
Error accepting a client connection: (null)
Are there any apparent mistakes in my setup that I am failing to see? This is my first time trying this, and maybe I am assuming too much or heading down the wrong path.
Trying to resolve this, I found several questions with a similar issue, but implementing their fixes did not resolve mine.
a few of the things I tried
sending different ssl protocols from my .netcore app
sending the pfx/p12 certificate in different ways
several different redis configurations
Edit: I can provide as much code as needed!
For anyone facing the same issue: it seems the server was using a CA that does not chain to a trusted root for the server certificates. The solution I found was to use the CertificateValidation callback of the StackExchange.Redis library with the following code:
private static bool CheckServerCertificate(object sender, X509Certificate certificate,
X509Chain chain, SslPolicyErrors sslPolicyErrors)
{
if ((sslPolicyErrors & SslPolicyErrors.RemoteCertificateChainErrors) == SslPolicyErrors.RemoteCertificateChainErrors)
{
// check that the untrusted ca is in the chain
var ca = new X509Certificate2(_redisSettings.CertificatePath);
var caFound = chain.ChainElements
.Cast<X509ChainElement>()
.Any(x => x.Certificate.Thumbprint == ca.Thumbprint);
return caFound;
}
return false;
}
An important part of the code is the condition:
if ((sslPolicyErrors & SslPolicyErrors.RemoteCertificateChainErrors) == SslPolicyErrors.RemoteCertificateChainErrors)
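The reason for the bitwise test (rather than a plain equality) is that SslPolicyErrors is a flags enum: several errors can be set at once, and the check must detect chain errors even when they are combined with others. The same idea in a short Python sketch; the enum here merely mirrors the .NET flag values for illustration:

```python
from enum import Flag

class SslPolicyErrors(Flag):
    # Illustrative mirror of .NET's System.Net.Security.SslPolicyErrors
    NONE = 0
    REMOTE_CERTIFICATE_NOT_AVAILABLE = 1
    REMOTE_CERTIFICATE_NAME_MISMATCH = 2
    REMOTE_CERTIFICATE_CHAIN_ERRORS = 4

def has_chain_errors(errors: SslPolicyErrors) -> bool:
    """Detect chain errors even when other flags are also set."""
    chain = SslPolicyErrors.REMOTE_CERTIFICATE_CHAIN_ERRORS
    return (errors & chain) == chain

combined = (SslPolicyErrors.REMOTE_CERTIFICATE_NAME_MISMATCH
            | SslPolicyErrors.REMOTE_CERTIFICATE_CHAIN_ERRORS)
print(has_chain_errors(combined))                                          # True
print(has_chain_errors(SslPolicyErrors.REMOTE_CERTIFICATE_NAME_MISMATCH))  # False
```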

msmtp TLS timeout

I've looked through the list of possible solutions, but I don't see this problem; here it is.
I had been using smtp for years in my crontab entry to provide status updates via email. It quit this week, and I was unable to fix it. Then I saw that the package had become orphaned, and the suggestion was to move to msmtp, so I downloaded and installed it on my Ubuntu 18.10 system.
I'm trying to send email to my myaccount@gmail.com account.
It appears that I'm communicating properly with the gmail SMTP server, as the debug output below shows. But it always gets a TLS timeout.
I also don't understand why I have multiple EHLO entries. My system does not have a DNS domain name, so I'm not sure what to put here; localhost seems to be working OK. Also, my Thunderbird mailer is working correctly with gmail.
Here's the debug output:
echo "Hello there" | msmtp --debug myaccount@gmail.com >/tmp/msmtpOut.txt
ignoring system configuration file /etc/msmtprc: No such file or directory
loaded user configuration file /home/myhome/.msmtprc
falling back to default account
using account default from /home/myhome/.msmtprc
host = smtp.gmail.com
port = 587
proxy host = (not set)
proxy port = 0
timeout = off
protocol = smtp
domain = localhost
auth = choose
user = myaccount
password = *
passwordeval = (not set)
ntlmdomain = (not set)
tls = on
tls_starttls = on
tls_trust_file = /etc/ssl/certs/ca-certificates.crt
tls_crl_file = (not set)
tls_fingerprint = (not set)
tls_key_file = (not set)
tls_cert_file = (not set)
tls_certcheck = on
tls_min_dh_prime_bits = (not set)
tls_priorities = (not set)
auto_from = off
maildomain = (not set)
from = myaccount@gmail.com
add_missing_from_header = on
dsn_notify = (not set)
dsn_return = (not set)
logfile = (not set)
syslog = (not set)
aliases = (not set)
reading recipients from the command line
<-- 220 smtp.gmail.com ESMTP 4sm116524ywc.22 - gsmtp
--> EHLO localhost
<-- 250-smtp.gmail.com at your service, [71.56.87.81]
<-- 250-SIZE 35882577
<-- 250-8BITMIME
<-- 250-STARTTLS
<-- 250-ENHANCEDSTATUSCODES
<-- 250-PIPELINING
<-- 250-CHUNKING
<-- 250 SMTPUTF8
--> STARTTLS
<-- 220 2.0.0 Ready to start TLS
TLS certificate information:
Owner:
Common Name: smtp.gmail.com
Organization: Google LLC
Locality: Mountain View
State or Province: California
Country: US
Issuer:
Common Name: Google Internet Authority G3
Organization: Google Trust Services
Country: US
Validity:
Activation time: Tue 21 May 2019 04:48:45 PM EDT
Expiration time: Tue 13 Aug 2019 04:32:00 PM EDT
Fingerprints:
SHA256: C7:78:B6:D6:4E:3E:2B:2F:08:6D:A4:84:E6:1D:87:8E:A1:DF:54:D2:AB:79:AC:A6:BB:50:E5:5D:EC:B4:20:4C
SHA1 (deprecated): 39:C5:E5:40:64:37:17:25:17:7F:E8:BA:20:F4:70:F4:FE:22:70:22
--> EHLO localhost
msmtp: cannot read from TLS connection: the operation timed out
msmtp: could not send mail (account default from /home/myhome/.msmtprc)
Build msmtp using --with-tls=openssl to solve the problem.
As regards the EHLO command being sent twice, RFC 3207 states:
The server MUST discard any knowledge
obtained from the client, such as the argument to the EHLO command,
which was not obtained from the TLS negotiation itself. The client
MUST discard any knowledge obtained from the server, such as the list
of SMTP service extensions, which was not obtained from the TLS
negotiation itself. The client SHOULD send an EHLO command as the
first command after a successful TLS negotiation.
So that is the normal behaviour.

Ruby SSL handshake not receiving Server Hello back - using proxy Net::HTTP

I am connecting to an external API using Ruby SSL two way authentication.
My latest script is here:
namespace :rnif_message do
# With Proxy
task send_test_index: :environment do
our_cert = File.read(File.join(Rails.root, 'ssl', 'invoice', 'test', 'cert20190116_ourcert.der'))
their_test_cert = File.read(File.join(Rails.root, 'ssl', 'invoice', 'test', 'testcert2016_theircert.der'))
cert_store = OpenSSL::X509::Store.new
# Contains their intermediate CA files
cert_store.add_path File.join(Rails.root, 'ssl', 'invoice', 'test', 'ca')
cert_store.add_cert OpenSSL::X509::Certificate.new(their_test_cert)
uri = URI("https://xml.digital.com/wm.rn/receive")
proxy_host = "us-static-02.qg.com"
proxy_port = "port"
proxy_user = "user"
proxy_pass = "pass"
proxy_request = Net::HTTP.new(uri.hostname, '443', proxy_host, proxy_port, proxy_user, proxy_pass)
proxy_request.verify_mode = OpenSSL::SSL::VERIFY_PEER
proxy_request.use_ssl = true
proxy_request.ssl_version = :TLSv1_2
proxy_request.ciphers = ["AES256-SHA:AES128-SHA:DES-CBC3-SHA"]
proxy_request.cert = OpenSSL::X509::Certificate.new(our_cert)
proxy_request.cert_store = cert_store
post_request = Net::HTTP::Post.new(uri)
response = proxy_request.request(post_request)
end
The response (since I updated the ciphers) is now:
OpenSSL::SSL::SSLError: SSL_connect SYSCALL returned=5 errno=0 state=unknown state
Instead of the older from my two previous questions
OpenSSL::SSL::SSLError: SSL_connect SYSCALL returned=5 errno=0 state=SSLv3 read server hello A
# /Users/me/projects/proj/lib/tasks/rnif_message_builder.rake:217:in `block (2 levels) in <top (required)>'
Here is my latest wireshark
In the initial configuration of my certificate and IP on THEIR server, I may have given them the wrong IP address, so I may be getting blocked by their firewall. Is there a way to test this using openssl s_client?
So far I've been trying:
openssl s_client -showcerts -connect xml.digitaloilfield.com:https
But I am not very familiar with using openssl s_client
Any help on troubleshooting this would be greatly appreciated!
Update
Thank you very much for your help so far. I am experimenting with the commands you sent me and trying to see what info I can get from them to help me with this. Currently, after they changed my IP address and allowed me through the firewall, I am getting this:
EOFError: end of file reached /Users/me/projects/xtiri/xtiri.com/lib/tasks/rnif_message_builder.rake:219:in `block (2 levels) in <top (required)>'
This will usually connect to nearly all servers. It uses TLS 1.2 and SNI. That should establish the TCP connection and start the TLS handshake. The handshake may fail later, but that's a different problem.
$ openssl s_client -connect xml.digitaloilfield.com:443 -tls1_2 \
-servername xml.digitaloilfield.com -debug
<hang>
connect: Connection timed out
connect:errno=110
However, while s_client is hanging, jump over to another terminal and issue:
$ sudo netstat -a | grep openssl
$
Netstat does not show you the SYN_SEND state, so use tcptrack:
$ sudo tcptrack -i eth0
# <next, use s_client>
172.16.2.4:43302 208.38.22.37:443 SYN_SENT 15s 0 B/s
You are in TCP's wait timer. The other side did not perform the three-way handshake with you; in fact, they did not even acknowledge your SYN. There could be a few reasons for it, but ...
Given the target, it looks like you encountered a firewall. Rather than REJECTing connections, it is DROPping them. This is sometimes called "Stealth Mode"; it makes it appear there's no server running on the machine. That's consistent with OpenSSL's connect: Connection timed out message.
The problem could be with the proxy. You really want to run the tests from there, but you probably won't be able to. It could be you are using the ciphers, protocols and ports as specified by the remote site; but the proxy is doing its own thing. Also see Jarmock's SSL Interception Proxies and Transitive Trust.
Here are a couple of references:
How can I monitor network connections for an app
Is it better to set -j REJECT or -j DROP in iptables?
TCP 3-way handshake on the Wireshark Wiki
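The REJECT-vs-DROP distinction above is also easy to observe from code: a REJECTed connection fails fast with "connection refused" (the peer sends an RST), while a DROPped one hangs until the timeout fires. A minimal Python sketch (the function name and labels are illustrative):

```python
import socket

def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP endpoint: 'open', 'refused' (REJECT), or 'filtered' (DROP)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"   # peer sent RST: a REJECT rule, or no listener
    except socket.timeout:
        return "filtered"  # SYN silently dropped: typical firewall DROP
```

In this case, probe("xml.digitaloilfield.com", 443) returning "filtered" would match the connect: Connection timed out seen from s_client.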

Where does stomp_interface come from?

In order to enable https communications between OpsCenter and DSE nodes, I have to set stomp_interface to opscenter.mydomain.com in /var/lib/datastax-agent/conf/address.yaml on each node. (After the fix, I no longer have to do this.)
Whenever I do a configure job from OpsCenter, it changes this stomp_interface value back to nn.nn.nn.nn. (After the fix, it still does this, but it doesn't break the agent HTTP communications anymore.)
Where does this parameter come from? Can I set it on the OpsCenter node in the /etc/opscenter/clusters/cluster_name.conf file?
Is it part of the [agents] section?
What is the parameter name and value that I should be adding?
opscenterd is now (the fix was to add the incoming_interface line):
# opscenterd.conf
[webserver]
port = 8888
interface = 0.0.0.0
ssl_keyfile = /var/lib/opscenter/ssl/opscenter.key
ssl_certfile = /var/lib/opscenter/ssl/opscenter.pem
ssl_port = 8443
[authentication]
enabled = True
[stat_reporter]
[agents]
use_ssl = true
incoming_interface = opscenter.mydomain.com
address.yaml before fix:
use_ssl: 1
stomp_interface: 1.2.3.4 (the opscenter external IP; opscenter.mydomain.com also works)
stomp_port: 61620
local_interface: 2.3.4.5 (the external IP for this cluster node)
agent_rpc_interface: 0.0.0.0
agent_rpc_broadcast_address: 2.3.4.5
poll_period: 60
disk_usage_update_period: 60
rollup_rate: 200
rollup_rate_unit: second
jmx_host: 127.0.0.1
jmx_port: 7199
jmx_user: someuser
jmx_pass: somepassword
status_reporting_interval: 20
ec2_metadata_api_host: 169.254.169.254
metrics_enabled: true
jmx_metrics_threadpool_size: 5
hosts: ["2.3.4.5", "3.4.5.6", "4.5.6.7", "5.6.7.8"]
cassandra_port: 9042
thrift_port: 9160
cassandra_user: someuser
cassandra_pass: somepassword
runs_sudo: true
cassandra_install_location: /usr/share/dse
cassandra-conf: /etc/dse/cassandra/cassandra.yaml
cassandra_binary_location: /usr/bin
cassandra_conf_location: /etc/dse/cassandra
dse_env_location: /etc/dse
dse_binary_location: /usr/bin
dse_conf_location: /etc/dse
spark_conf_location: /etc/dse/spark
monitored_cassandra_user: someuser
monitored_cassandra_pass: somepassword
tcp_response_timeout: 120000
pong_timeout_ms: 120000
cluster_name.conf (I updated the seed_hosts to match those in the address.yaml hosts config, in order to satisfy a Best Practices alert that they should all be the same):
[destinations]
active =
[kerberos]
default_service =
opscenterd_client_principal =
opscenterd_keytab_location =
agent_keytab_location =
agent_client_principal =
[agents]
ssl_keystore_password =
ssl_keystore =
[jmx]
password = somepassword
port = 7199
username = someuser
[cassandra]
ssl_truststore_password =
cql_port = 9042
seed_hosts = 2.3.4.5, 3.4.5.6, 4.5.6.7, 5.6.7.8
username = someuser
password = somepassword
ssl_keystore_password =
ssl_keystore =
ssl_truststore =
Based on your comment for further information, I figured it out.
I added the incoming_interface = opscenter.mydomain.com to the [agents] section of the opscenterd.conf. (That wasn't present before markc's comment.)
I restarted service opscenterd.
Next, I was able to go back to OpsCenter LifeCycle Manager and do a fresh Install and Configure on the cluster, and all of the job steps completed successfully.
(Note: Don't change the rack names on nodes from what they were before, and select autoBootStrap = true on the Configure / Install requests.)
The datastax-agents are fully Up and Active. After the Configure and Install, the address.yaml files contained the public IP address of the OpsCenter node as the stomp_interface. (I changed one stomp_interface manually to be opscenter.mydomain.com, and that also works.)
I will also edit the question and post the requested information.
Thanks markc!

OAuth2 Cygnus and Cosmos Sink doesn't seem to work

Since the last update, I haven't been able to upload my data to Cosmos using Cygnus. I am aware that we now need to use an OAuth2 token to do it, so I made the request for the token:
curl -k -X POST "https://cosmos.lab.fiware.org:13000/cosmos-auth/v1/token" -H "Content-Type: application/x-www-form-urlencoded" -d "grant_type=password&username=guillaume.jourdain@4planet.eu&password=XXXXX"
I get a token, but then I try to check the token:
curl -X GET "http://cosmos.lab.fiware.org:14000/webhdfs/v1/guillaume.jourdain/hostabee?op=liststatus&user.name=guillaume.jourdain@4planet.eu" -H "X-Auth-Token: TheToken"
and even this:
curl -X GET "http://cosmos.lab.fiware.org:14000/webhdfs/v1/guillaume.jourdain/hostabee?op=liststatus&user.name=guillaume.jourdain" -H "X-Auth-Token: TheToken"
And every time, for each of these commands and for every token I tried, I get this:
User token not authorized
Next I tried to put the OAuth parameter in my Cygnus conf file, and this occurred every time:
2015-07-17 16:17:17,797 (lifecycleSupervisor-1-1) [INFO - es.tid.fiware.orionconnectors.cosmosinjector.hdfs.HttpFSBackend.createDir(HttpFSBackend.java:71)] HttpFS response: HTTP/1.1 401 Unauthorized
2015-07-17 16:17:17,798 (lifecycleSupervisor-1-1) [ERROR - es.tid.fiware.orionconnectors.cosmosinjector.OrionHDFSSink.start(OrionHDFSSink.java:108)] The directory could not be created in HDFS. HttpFS response: 401 Unauthorized
So for the moment I'm stuck. Do you have any information that could help me resolve this problem?
EDIT:
Here's my Cygnus configuration file; maybe the problem is located here:
APACHE_FLUME_HOME/conf/cygnus.conf
orionagent.sources = http-source
orionagent.sinks = hdfs-sink
orionagent.channels = notifications
# Flume source, must not be changed
orionagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# channel name where to write the notification events
orionagent.sources.http-source.channels = notifications
# listening port the Flume source will use for receiving incoming notifications
orionagent.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
orionagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.OrionRestHandler
# regular expression for the orion version the notifications will have in their headers
orionagent.sources.http-source.handler.orion_version = 0\.23\.*
# URL target
orionagent.sources.http-source.handler.notification_target = /notify
# channel name from where to read notification events
orionagent.sinks.hdfs-sink.channel = notifications
# Flume sink that will process and persist in HDFS the notification events, must not be changed
orionagent.sinks.hdfs-sink.type = com.telefonica.iot.cygnus.sinks.OrionHDFSSink
# IP address of the Cosmos deployment where the notification events will be persisted
orionagent.sinks.hdfs-sink.cosmos_host = 130.206.80.46
# port of the Cosmos service listening for persistence operations; 14000 for httpfs, 50070 for webhdfs and free choice for inifinty
orionagent.sinks.hdfs-sink.cosmos_port = 14000
# username allowed to write in HDFS (/user/myusername)
orionagent.sinks.hdfs-sink.cosmos_username = guillaume.jourdain
# dataset where to persist the data (/user/myusername/mydataset)
orionagent.sinks.hdfs-sink.cosmos_password = XXXXX
orionagent.sinks.hdfs-sink.cosmos_dataset = hostABee
orionagent.sinks.hdfs-sink.attr_persistence = column
orionagent.sinks.hdfs-sink.hive_host = 130.206.80.46
orionagent.sinks.hdfs-sink.hive_port = 10000
orionagent.sinks.hdfs-sink.oauth2_token = TheTOKEN
# HDFS backend type (webhdfs, httpfs or infinity)
orionagent.sinks.hdfs-sink.hdfs_api = webhdfs
# channel name
orionagent.channels.notifications.type = memory
# capacity of the channel
orionagent.channels.notifications.capacity = 1000
# amount of bytes that can be sent per transaction
orionagent.channels.notifications.transactionCapacity = 100
Now I get this error (and others); the sink and the handler classes don't seem to be found:
2015-07-27 14:27:10,562 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:40)] Creating instance of sink: hdfs-sink, type: com.telefonica.iot.cygnus.sinks.OrionHDFSSink
2015-07-27 14:27:10,562 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:142)] Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load sink type: com.telefonica.iot.cygnus.sinks.OrionHDFSSink, class: com.telefonica.iot.cygnus.sinks.OrionHDFSSink
at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:69)
at org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:415)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:103)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.ClassNotFoundException: com.telefonica.iot.cygnus.sinks.OrionHDFSSink
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:67)
... 12 more
Thank you for reading.
Regarding the WebHDFS command for listing an HDFS folder:
curl -X GET "http://cosmos.lab.fiware.org:14000/webhdfs/v1/guillaume.jourdain/hostabee?op=liststatus&user.name=guillaume.jourdain@4planet.eu" -H "X-Auth-Token: TheToken"
The user.name should be user.name=guillaume.jourdain (without the @4planet.eu part).
Regarding Cygnus, have you upgraded to 0.8.2? It is the only Cygnus version supporting OAuth2. I guess you did not upgrade because of the es.tid.fiware.orionconnectors.cosmosinjector.OrionHDFSSink logs (those packages are previous to 0.8.0). You have all the details for upgrading here.
