Connectivity problems migrating from 1.5.3 to 2.0 - orleans

I am in the process of updating an Orleans 1.5.3 application to 2.0, but the updated version fails to start properly: every silo in the cluster reports that it cannot ping the other live nodes. I am able to deploy the 1.5.3 version to the same cloud service and virtual network with no problems, so I do not believe this is a network issue.
07-02-2018 22:09:51 EventId="101021" Message="Exception getting a sending socket to endpoint S172.0.0.4:11111:268290539" Exception="System.TimeoutException: Connection to 172.0.0.4:11111 could not be established in 00:00:05
at Orleans.Runtime.SocketManager.Connect(Socket s, IPEndPoint endPoint, TimeSpan connectionTimeout) in D:\build\agent\_work\18\s\src\Orleans.Core\Messaging\SocketManager.cs:line 206
at Orleans.Runtime.SocketManager.SendingSocketCreator(IPEndPoint target) in D:\build\agent\_work\18\s\src\Orleans.Core\Messaging\SocketManager.cs:line 108
at Orleans.Runtime.LRU`2.Get(TKey key) in D:\build\agent\_work\18\s\src\Orleans.Core\Utils\LRU.cs:line 160
at Orleans.Runtime.Messaging.SiloMessageSender.GetSendingSocket(Message msg, Socket& socket, SiloAddress& targetSilo, String& error) in D:\build\agent\_work\18\s\src\Orleans.Runtime\Messaging\SiloMessageSender.cs:line 85"
07-02-2018 22:09:33 EventId="100661" Message="-Failed to get ping responses from all 2 silos that are currently listed as Active in the Membership table. Newly joining silos validate connectivity with all pre-existing silos that are listed as Active in the table and have written I Am Alive in the table in the last 00:10:00 period, before they are allowed to join the cluster. Active silos are: [[SiloAddress=S172.0.0.4:11111:268290539 SiloName=ModelEngine.Silo.GraphBuilder_IN_0 Status=Active HostName=RD0003FFA5B97C ProxyPort=30000 RoleName=Orleans.Runtime UpdateZone=0 FaultZone=0 StartTime = 2018-07-03 05:08:59.428 GMT IAmAliveTime = 2018-07-03 05:09:09.537 GMT ], [SiloAddress=S172.0.0.5:11111:268290516 SiloName=ModelEngine.Silo.GraphBuilder_IN_1 Status=Active HostName=RD0003FFA5ADDE ProxyPort=30000 RoleName=Orleans.Runtime UpdateZone=0 FaultZone=0 StartTime = 2018-07-03 05:08:36.901 GMT IAmAliveTime = 2018-07-03 05:08:47.168 GMT ]]" Exception=""
Looking at the membership table, everything looks consistent between the 1.5.3 and the 2.0 nodes with the exception of the RoleName column. For the 2.0 nodes, this column always contains "Orleans.Runtime" rather than the role name. I am not sure if this is related, but it is the only difference I can find.
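For context, in 2.0 the silo endpoints and membership provider are configured through SiloHostBuilder rather than the 1.5-style OrleansConfiguration.xml. Below is a minimal sketch of the kind of configuration in play here, worth comparing against the actual setup; the cluster/service IDs, the storage connection string, and the use of Azure Table clustering are illustrative assumptions, not taken from the deployment:

var silo = new SiloHostBuilder()
    .Configure<ClusterOptions>(options =>
    {
        options.ClusterId = "graphbuilder-cluster";   // placeholder values
        options.ServiceId = "ModelEngine";
    })
    // Membership table in Azure Table Storage (Microsoft.Orleans.Clustering.AzureStorage)
    // dataConnectionString: your Azure storage connection string (placeholder)
    .UseAzureStorageClustering(options => options.ConnectionString = dataConnectionString)
    // Silo-to-silo port 11111 and client gateway port 30000, matching the membership entries above
    .ConfigureEndpoints(siloPort: 11111, gatewayPort: 30000)
    .Build();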

Related

Spring Cloud Stream Kafka Consumer application doesn't allow adding Supplier

I am working on a Spring Cloud Stream Kafka application. I have added only consumers to consume messages from topics and deliver them to a third party using the FIX protocol.
It is working fine up to this point, but now the third party sends back responses and I would like to produce them to a new topic. When I added a Supplier to my existing code, it started behaving weirdly: the bootstrap.servers config changes from the remoteHost broker to localhost and it starts giving the error below:
[AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.
This error is expected when trying to connect to localhost, as there is no Kafka setup there.
Below is my application.yml file:
spring.cloud.stream.function.definition: amerData;emeaData;ackResponse  # added new ackResponse here
spring.cloud.stream.kafka.streams:
  binder:
    brokers: remoteHost:9092
    configuration:
      schema.registry.url: remoteHost:8081
      default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
      default.value.serde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
  bindings:
    ackResponse-out-0:  # new addition
      producer.configuration:
        key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
        value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
spring.cloud.stream.bindings:
  amerData-in-0:
    destination: topic1
  emeaData-in-0:
    destination: topic2
  ackResponse-out-0:  # new addition
    destination: topic3
I tried the possible options for the Supplier: Supplier<String> ackResponse() or Supplier<Message<String>> ackResponse().
The only variant that does not change remoteHost to localhost is Supplier<KStream<String,String>> ackResponse(); with that, bootstrap.servers shows the configured remote broker, but this isn't correct and I can't write the received response (mostly a string or JSON) to a Kafka topic this way.
I did configure my consumers as Consumer<KStream<String, AVROPOJO1>> amerData() and Consumer<KStream<String, AVROPOJO2>> emeaData() as needed, and they work fine.
Am I missing or messing up something? Can't we have both a producer and consumers in the same Spring Cloud Stream application? Using StreamBridge also couldn't solve this. Could someone help?
If you are adding a Supplier bean as you have done, it becomes a regular producer that uses the MessageChannel-based Kafka binder. You need to add the regular Kafka binder to your project (spring-cloud-stream-binder-kafka). The bindings for that should be under spring.cloud.stream.kafka.bindings. I see that you have them defined above under spring.cloud.stream.kafka.streams.bindings. I wonder if that is the issue?
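For illustration, a sketch of what that might look like once the regular (message-channel) Kafka binder owns ackResponse-out-0; the property paths follow the two binders' conventions, but treat the exact values and the explicit binder selection as assumptions to adapt:

spring.cloud.stream.bindings:
  ackResponse-out-0:
    destination: topic3
    binder: kafka            # pick the regular Kafka binder explicitly, since the kstream binder is also on the classpath
spring.cloud.stream.kafka:   # regular Kafka binder section (not ...kafka.streams)
  binder:
    brokers: remoteHost:9092
  bindings:
    ackResponse-out-0:
      producer:
        configuration:
          schema.registry.url: remoteHost:8081
          key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
          value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer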

ServiceUnavailable error in cypher statement from guides.neo4j.com/wiki

I recently created a Neo4j sandbox instance (blank) so I could test https://guides.neo4j.com/wiki.
Using https://guides.neo4j.com/wiki, I had success running the first Cypher statement to populate the graph with Wikipedia categories. However, the second Cypher statement produces an error after running for a few seconds.
Here's the statement:
UNWIND range(0,4) as level
CALL apoc.cypher.doIt("
MATCH (c:Category { pagesFetched: false, level: $level })
CALL apoc.load.json('https://en.wikipedia.org/w/api.php?format=json&action=query&list=categorymembers&cmtype=page&cmtitle=Category:' + apoc.text.urlencode(c.catName) + '&cmprop=ids%7Ctitle&cmlimit=500')
YIELD value as results
UNWIND results.query.categorymembers AS page
MERGE (p:Page {pageId: page.pageid})
ON CREATE SET p.pageTitle = page.title, p.pageUrl = 'http://en.wikipedia.org/wiki/' + apoc.text.urlencode(replace(page.title, ' ', '_'))
WITH p,c
MERGE (p)-[:IN_CATEGORY]->(c)
WITH DISTINCT c
SET c.pagesFetched = true", { level: level }) YIELD value
RETURN value
And here's the error message:
Error ServiceUnavailable
WebSocket connection failure. Due to security constraints in your web browser, the reason for the failure is not available to this Neo4j Driver. Please use your browsers development console to determine the root cause of the failure. Common reasons include the database being unavailable, using the wrong connection URL or temporary network problems. If you have enabled encryption, ensure your browser is configured to trust the certificate Neo4j is configured to use. WebSocket `readyState` is: 3
The js console shows error message:
neo4j-web.min.js:20 WebSocket is already in CLOSING or CLOSED state.
I also posted this on the neo4j Slack channel and the neo4j Google group - any help is appreciated.
As an addendum to this post (5/25/2018):
I installed Neo4j Community Edition version 3.4.0 on an AWS EC2 (Linux) instance and did not get the ServiceUnavailable error described above. The error occurred on my MacBook Pro running macOS 10.13.4.
Thanks for your interest.
Colin Goldberg

Get "the underlying connection was closed an unexpected error occurred on a send" error when calling CompleteSale method

I've been having a problem calling the API's CompleteSale method via the eBay_Service .NET SDK (v967) for two weeks (since 02/10).
When the ERP tries to send some updated information about an order, it receives this exception:
the underlying connection was closed an unexpected error occurred on a send
so I don't get a response from the API.
There is more than one strange thing:
- some batch jobs running in the background use the same .dll, and they work fine;
- after rebooting the server, the first call to CompleteSale works fine;
- after registering the .dll again via the regsvr32 command, it worked fine for one day;
- all operators that use the ERP are connected to the server via remote desktop, and all of them notice the problem; if I connect from my company's office instead, everything works fine.
I've tried to increase the timeout to 360 sec (from 60 sec) and nothing changed.
The ERP is developed in Progress (OpenEdge ABL), so I can't fix it by setting "KeepAlive" to "false", setting the security protocol explicitly (Tls 1.1 | Tls 1.2), or making other interventions on the .NET side. I was wrong: it can be done from the source code of the SDK.
I've checked the security protocols on the API's servers and discovered that "SSL3" is no longer supported, whereas the default value for ServicePointManager.SecurityProtocol in .NET 2.0 is SSL3.
I've solved the problem by adding this hotfix to the "eBayXmlAPIInterfaceService" class in the SDK's source code:
// 768 = Tls 1.1, 3072 = Tls 1.2
ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072 |
                                       (SecurityProtocolType)768;
HttpWebRequest http = (HttpWebRequest) WebRequest.Create(this.Url);
http.Method = "POST";
http.ContentType = "text/xml";
http.ContentLength = data.Length;
http.KeepAlive = false;
Microsoft has probably released a hotfix to correct this problem, but the server hasn't been updated since 2015.
Moreover, I've replicated the .NET code in the ABL application, in a program that we use to make GET/POST requests:
DEF VAR w-tsl10 AS System.Net.SecurityProtocolType
w-tsl10 = CAST(System.Enum:ToObject(PROGRESS.Util.TypeHelper:GetType("System.Net.SecurityProtocolType":U), 192), System.Net.SecurityProtocolType).
System.Net.ServicePointManager:SecurityProtocol = w-tsl10.
Link to the security protocol verifier: https://www.ssllabs.com/ssltest/index.html
I was working with the Betfair API and ran into this issue. After some research I found this:
System.Net.ServicePointManager.Expect100Continue = true;
System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls12;

Ruby Constantly consume external API and share output across processes

I'm writing a trading bot in Ruby, and I need to constantly do some calculations based on the exchange's order book depth data across several daemons (daemons gem).
The problem is that right now I'm fetching data via the exchange's API separately in each daemon, so I ran into the API call limit (40 requests/second). That's why I'm trying to use Ruby's DRb to share order book data across several processes (daemons), to avoid sending unnecessary API calls.
However, I'm not sure how to constantly consume the API on the server side and provide the latest data to the client processes. In the sample code below, the client will only get the data that was current at the moment I started the server.
server_daemon.rb
require 'drb'
exchange = Exchange.new api_key: ENV['APIKEY'], secret_key: ENV['SECRET']
shared_orderbook = exchange.orderbook limit: 50
DRb.start_service('druby://127.0.0.1:61676', shared_orderbook)
puts 'Listening for connection… '
DRb.thread.join
client_daemon.rb
require 'drb'
DRb.start_service
puts shared_data = DRbObject.new_with_uri('druby://127.0.0.1:61676')
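One way to keep the shared data current (a minimal sketch, not a drop-in solution: it reuses the hypothetical Exchange client from above and assumes a polling interval that stays inside the 40 requests/second limit) is to serve a small wrapper object over DRb and let a background thread on the server refresh it, so each client call returns the latest snapshot:

require 'drb'

# Served over DRb: a background thread keeps the snapshot fresh,
# clients call #latest to read the most recent order book.
class OrderbookCache
  def initialize(exchange, interval: 1)
    @exchange = exchange
    @mutex    = Mutex.new
    @snapshot = nil
    @updater  = Thread.new do
      loop do
        data = @exchange.orderbook limit: 50   # one API call shared by all clients
        @mutex.synchronize { @snapshot = data }
        sleep interval
      end
    end
  end

  def latest
    @mutex.synchronize { @snapshot }
  end
end

exchange = Exchange.new api_key: ENV['APIKEY'], secret_key: ENV['SECRET']
DRb.start_service('druby://127.0.0.1:61676', OrderbookCache.new(exchange))
DRb.thread.join

The client side would then call DRbObject.new_with_uri('druby://127.0.0.1:61676').latest each time it needs fresh data, instead of holding a reference to a single, stale order book object.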

Failed to Connect with Websphere MQ SSL Channel through JNDI

My JMS client connects to WMQ through JNDI. The initial context factory used is com.ibm.mq.jms.context.WMQInitialContextFactory.
Currently, on the WMQ side, there is a queue manager called TestMgr. Under this queue manager I created two channels: one is PLAIN.CHL, which does not specify an SSL CipherSpec; the other is SSL.CHL, which is configured with the SSL CipherSpec RC4_MD5_US and SSL authentication set to Optional.
I have created a key store for the queue manager using IBM Key Management tool. The path of key db is [wmq_home]\qmgrs\TestMgr\ssl\key.
For channel PLAIN.CHL, I defined a queue connection factory like:
DEF QCF(PlainQCF) QMANAGER(TestMgr) CHANNEL(PLAIN.CHL) HOST(192.168.66.23) PORT(1414) TRANSPORT(client)
And under the SSL channel SSL.CHL, I defined a queue connection factory like:
DEF QCF(SSLQCF) QMANAGER(TestMgr) CHANNEL(SSL.CHL) HOST(192.168.66.23) PORT(1414) TRANSPORT(client) SSLCIPHERSUITE(SSL_RSA_WITH_RC4_128_MD5)
Now I can only create a connection using PlainQCF; looking up the SSL queue connection factory fails. My code looks like:
Hashtable environment = new Hashtable();
environment.put(Context.INITIAL_CONTEXT_FACTORY, "com.ibm.mq.jms.context.WMQInitialContextFactory");
environment.put(Context.PROVIDER_URL, "192.168.66.23:1414/SSL.CHL");
Context ctx = new InitialContext( environment );
QueueConnectionFactory qcf = (QueueConnectionFactory) ctx.lookup("SSLQCF");
qcf.createConnection();
....
Am I missing some context properties when looking up the SSL factory? I then found that the code hangs on the line new InitialContext( environment ) for a long time, almost 5 minutes, and I get a CC=2;RC=2009;AMQ9208... error.
Any suggestion would be appreciated. Is it true that an SSL channel can't be connected through JNDI?
@T.Rob, thanks very much for your reply. But we still want to use WMQInitialContextFactory, so I'm afraid I still need to find a solution for this.
I defined the connection factory only once. The displayed info for the SSL queue connection factory looks like:
InitCtx> DISPLAY QCF(SSLQCF)
ASYNCEXCEPTION(ALL)
CCSID(819)
CHANNEL(SSL.CHL)
CLIENTRECONNECTOPTIONS(ASDEF)
CLIENTRECONNECTTIMEOUT(1800)
COMPHDR(NONE )
COMPMSG(NONE )
CONNECTIONNAMELIST(192.168.66.23(1414))
CONNOPT(STANDARD)
FAILIFQUIESCE(YES)
HOSTNAME(192.168.66.23)
LOCALADDRESS()
MAPNAMESTYLE(STANDARD)
MSGBATCHSZ(10)
MSGRETENTION(YES)
POLLINGINT(5000)
PORT(1414)
PROVIDERVERSION(UNSPECIFIED)
QMANAGER(TestMgr)
RESCANINT(5000)
SENDCHECKCOUNT(0)
SHARECONVALLOWED(YES)
SSLCIPHERSUITE(SSL_RSA_WITH_RC4_128_MD5)
SSLFIPSREQUIRED(NO)
SSLRESETCOUNT(0)
SYNCPOINTALLGETS(NO)
TARGCLIENTMATCHING(YES)
TEMPMODEL(SYSTEM.DEFAULT.MODEL.QUEUE)
TEMPQPREFIX()
TRANSPORT(CLIENT)
USECONNPOOLING(YES)
VERSION(7)
WILDCARDFORMAT(TOPIC_ONLY)
The JNDI provider should be fine, because I can look up the plain connection factory successfully. Also, for my client app, I extracted the cert from the key store created for the MQ server and imported it into the trust store (cacerts) of my JRE with the alias name ibmwebspheremqtestmgr.
You are correct: with the 2009 error there are some log entries:
=================================================================
4/20/2012 20:24:27 - Process(13768.3) User(MUSR_MQADMIN) Program(amqzmur0.exe)
Host(xxxx_host of my MQ) Installation(mqenv)
VRMF(7.1.0.0) QMgr(TestMgr)
AMQ6287: WebSphere MQ V7.1.0.0 (p000-L111019).
EXPLANATION:
WebSphere MQ system information:
Host Info :- Windows Server 2003, Build 3790: SP2 (MQ Windows 32-bit)
Installation :- C:\IBM\WebSphereMQ (mqenv)
Version :- 7.1.0.0 (p000-L111019)
ACTION:
None.
-------------------------------------------------------------------------------
4/20/2012 20:24:27 - Process(7348.116) User(MUSR_MQADMIN) Program(amqrmppa.exe)
Host(xxxx_host of my MQ) Installation(mqenv)
VRMF(7.1.0.0) QMgr(TestMgr)
AMQ9639: Remote channel 'SSL.CHL' did not specify a CipherSpec.
EXPLANATION:
Remote channel 'SSL.CHL' did not specify a CipherSpec when the local channel
expected one to be specified.
The remote host is 'xxx_host of my app (192.168.66.25)'.
The channel did not start.
ACTION:
Change the remote channel 'SSL.CHL' on host 'xxx_host of my app (192.168.66.25)' to
specify a CipherSpec so that both ends of the channel have matching
CipherSpecs.
----- amqcccxa.c : 3817 -------------------------------------------------------
4/20/2012 20:24:27 - Process(7348.116) User(MUSR_MQADMIN) Program(amqrmppa.exe)
Host(my app host) Installation(mqenv)
VRMF(7.1.0.0) QMgr(TestMgr)
AMQ9999: Channel 'SSL.CHL' to host 'xxx_host of my app (192.168.66.25)' ended
abnormally.
====================================================================
I'm also confused by the error log. My app is staged on a machine different from my MQ host, but the log says to change the remote channel 'SSL.CHL' on host 'xxx_host of my app (192.168.66.25)' to specify a CipherSpec so that both ends of the channel have matching CipherSpecs. How can I change the channel CipherSpec on my app host?
Updates on MQEnvironment, replying to the comments:
The value of MQEnvironment.sslCipherSuite is null, so it throws a NullPointerException when I put it in the env hashtable. But I tried environment.put(MQC.SSL_CIPHER_SUITE_PROPERTY, "SSL_RSA_WITH_RC4_128_MD5") instead, and it still failed with the 2009 error.
For the JMSAdmin tool, I had changed the config to use WMQInitialContextFactory. The configuration (JMSAdmin.config) looks like:
INITIAL_CONTEXT_FACTORY=com.ibm.mq.jms.context.WMQInitialContextFactory
PROVIDER_URL=192.168.66.23:1414/SYSTEM.DEF.SVRCONN
The rest of the configuration is left at the defaults.
Kindly note that here I use the default channel SYSTEM.DEF.SVRCONN so that I can log on to the admin console. If I change the channel to the SSL one, SSL.CHL, I can't log on to the admin console either. The error here is just like the one in my client app.
Another clarification: in my client, the following code can connect to the queue manager (TestMgr) successfully through channel SSL.CHL.
MQConnectionFactory factory = new MQConnectionFactory();
factory.setTransportType(JMSC.MQJMS_TP_CLIENT_MQ_TCPIP);
factory.setQueueManager("TestMgr");
factory.setSSLCipherSuite("SSL_RSA_WITH_RC4_128_MD5");
factory.setPort(1414);
factory.setHostName("192.168.66.23");
factory.setChannel("SSL.CHL");
MQConnection connection = (MQConnection) factory.createConnection();
And now the problem is just as you said: the initial context fails to connect to the queue manager through the SSL channel. The option you provided (use the plain channel for the initial context and the SSL channel for the connection factory) works too. But I still want to know how to get the initial context working with the SSL channel. Thanks very much for your patience. Your updates will be appreciated.
I never really liked com.ibm.mq.jms.context.WMQInitialContextFactory very much. It stores the managed objects on a queue. So in order to lookup the connectionFactory, which tells JMS how to connect to the QMgr, it is first necessary to connect to the QMgr to make the JNDI call. Therefore, before you can debug the SSL connection, you need to know whether the underlying JNDI provider is working.
If you want to skip the MQ-based JNDI provider and just use the filesystem, see the updated version of Bobby Woolf's article here. If you want to continue with com.ibm.mq.jms.context.WMQInitialContextFactory, read on but be prepared to provide more configuration info.
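(As an aside, if you do go the filesystem route, only the context settings change; the lookup code stays the same. A minimal sketch, assuming a .bindings file generated by JMSAdmin into C:/JNDI-Directory; the path and object name are placeholders:)

import java.util.Hashtable;
import javax.jms.QueueConnectionFactory;
import javax.naming.Context;
import javax.naming.InitialContext;

public class FsContextLookup {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<String, String>();
        // Filesystem JNDI provider: no MQ connection is needed for the lookup itself
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
        env.put(Context.PROVIDER_URL, "file:/C:/JNDI-Directory");  // folder holding the .bindings file
        Context ctx = new InitialContext(env);
        QueueConnectionFactory qcf = (QueueConnectionFactory) ctx.lookup("SSLQCF");
        qcf.createConnection().close();  // SSL settings come from the QCF definition, not from JNDI
    }
}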
When you run the JMSAdmin tool, do you display the objects after creating them? For example, here is one of my JMSAdmin.bat scripts:
# Connection Factory for Client mode
# Delete the Connection Factory if it exists
DELETE CF(JMSDEMOCF)
# Define the Connection Factory
DEFINE CF(JMSDEMOCF) +
SYNCPOINTALLGETS(YES) +
SSLCIPHERSUITE(NULL_SHA) +
TRAN(client) +
HOST(127.0.0.1) CHAN(SSL.SVRCONN) PORT(1414) +
QMGR( )
# Display the resulting definition
DISPLAY CF(JMSDEMOCF)
This deletes the object (because JMSAdmin doesn't have a define with replace option) then defines the object, then displays it. Do you in fact see both objects defined? Can you connect and interactively display them both? Can you update your question with the contents displayed?
If so, then what does the JNDI provider configuration look like with each sample program? The 2009 indicates that there is at least a connection to the QMgr being made, so it is important to determine whether the thing that is suffering the broken connection is your app or the JNDI provider. Diagnosing that requires the config info you are using for the JNDI provider and whether it is the same in the working and failing cases. If not, how do they differ?
Once you know whether it's the app or the JNDI provider that is causing the problem (or switch to another JNDI provider that doesn't require an MQ connection such as the filesystem initial context) then it will be possible to determine the next steps.
The article linked above has samples of code and managed object scripts that use a filesystem JNDI provider. You may notice my scripts pasted in above use the same QMgr name. That's because I wrote that part of the article. When I want to switch to SSL using those same samples, I just update the connectionFactory to point to the SSL channel and it works.
Here are the other bits from the sample that I've modified:
java -Djavax.net.debug=ssl ^
-Djavax.net.ssl.trustStore=key2.jks ^
-Djavax.net.ssl.keyStore=key2.jks ^
-Djavax.net.ssl.keyStorePassword=???????? ^
-Djavax.net.ssl.trustStorePassword=???????? ^
-cp "%CLASSPATH%" ^
com.ibm.examples.JMSDemo -pub -topic JMSDEMOPubTopic %*
Note: The ^ is Windows version of line continuation.
Then if there are problems, I follow the debugging scenario I described in this SO answer. Note that the app will require a truststore, even if you have SSLCAUTH(OPTIONAL) on your channel. This is because the app must always validate the QMgr's certificate, even if the app does not present its own certificate. In my case I was using SSLCAUTH(REQUIRED) so my app needed both a keystore and a truststore. Your question mentions that the QMgr has a keystore but does not say what you did for the application.
Finally, a 2009 will usually generate an entry in the QMgr error logs. If you continue to get the problem, please update your question with those log entries.
UPDATE:
Responding to the comments: the JMSAdmin tool is part of the WMQ package. However, WMQ comes with jars for the filesystem context and the LDAP context. The WMQInitialContextFactory is optional and is delivered as SupportPac ME01. When using WMQInitialContextFactory with the JMSAdmin tool (or the JMSAdmin GUI, or WMQ Explorer), it is necessary to configure the PROVIDER_URL with the host, port and channel. For example:
PROVIDER_URL: <Hostname>:<port>/<SVRCONN Channel Name>
192.168.66.23:1414/SSL.SVRCONN
So after reviewing your post again, I realized that you did provide the config info for WMQInitialContextFactory. I was looking for a JMSAdmin.config file, but you have it in the environment hash table. And that is where the problem is. You are attempting to use the SSL channel for both the WMQInitialContextFactory and the connection factory. This is what is causing the lookup to fail. The WMQInitialContextFactory first makes a Java connection to the QMgr in order to look in the queue and obtain the administered objects such as the QCF. In order to do that, it needs to know the ciphersuite that the channel is set up for so it can negotiate the handshake. Right now, the *only* place that ciphersuite is recorded is in the QCF definition.
Try adding the following line:
environment.put(MQEnvironment.sslCipherSuite, "SSL_RSA_WITH_RC4_128_MD5");
As per this Infocenter page, that should tell the context factory classes what ciphersuite to use. Of course, they also need to know where the trust store is (and possibly the keystore, if the channel has SSLCAUTH(REQUIRED) set), so you still need to get those values into the environment. You can use the command-line variables or try loading them into the environment using code. You'll need both -Djavax.net.ssl.trustStore=key2.jks and -Djavax.net.ssl.trustStorePassword=????????.
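Putting that together in code, a minimal sketch of the suggestion above (reusing the snippet from the question, with the same imports assumed; the trust store path and password are placeholders):

// Tell JSSE where to find the trust store (and a key store too, if SSLCAUTH(REQUIRED) is set)
System.setProperty("javax.net.ssl.trustStore", "key2.jks");
System.setProperty("javax.net.ssl.trustStorePassword", "????????");

Hashtable environment = new Hashtable();
environment.put(Context.INITIAL_CONTEXT_FACTORY, "com.ibm.mq.jms.context.WMQInitialContextFactory");
environment.put(Context.PROVIDER_URL, "192.168.66.23:1414/SSL.CHL");
// Give the context factory the ciphersuite that SSL.CHL expects for the handshake
environment.put(MQEnvironment.sslCipherSuite, "SSL_RSA_WITH_RC4_128_MD5");
Context ctx = new InitialContext(environment);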
The other option is to continue to use the plaintext channel for the WMQInitialContextFactory and the SSL channel for the application. If the plaintext channel has an MCAUSER for a non-privileged user ID, it can be restricted to only connect to the QMgr and access the queue that contains the administered objects. With those restrictions, anyone will be able to read the administered objects using that channel but not the application queues or administrative queues.
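In code, that split looks roughly like this (a sketch reusing the definitions from the question, with the same javax.jms / javax.naming imports assumed: the plaintext channel is used only for the JNDI lookup, while the QCF that is looked up still points at the SSL channel and ciphersuite for the application connection):

Hashtable environment = new Hashtable();
environment.put(Context.INITIAL_CONTEXT_FACTORY, "com.ibm.mq.jms.context.WMQInitialContextFactory");
// Plaintext channel, ideally locked down with a low-privilege MCAUSER, used only to read the administered objects
environment.put(Context.PROVIDER_URL, "192.168.66.23:1414/PLAIN.CHL");
Context ctx = new InitialContext(environment);

// SSLQCF carries CHANNEL(SSL.CHL) and SSLCIPHERSUITE(SSL_RSA_WITH_RC4_128_MD5),
// so the application connection it creates is the SSL one
QueueConnectionFactory qcf = (QueueConnectionFactory) ctx.lookup("SSLQCF");
QueueConnection connection = qcf.createConnection();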
