Erlang version 18.0 and ejabberd nodename conflict

I wanted to install the latest ejabberd from https://github.com/processone/ejabberd. For this, Erlang/OTP 18 is required, which I have also installed manually from https://github.com/erlang/otp. Then I need to start the ejabberd server with the command ejabberdctl start, but it fails with an error.
My Mnesia node name is akash@akash-Latitude-3450 and the ejabberd node name is akash@localhost. Because of this, the server does not start. How do I resolve this conflict?
Log ->
2016-01-07 18:38:20.410 [critical] <0.39.0>@ejabberd_app:db_init:125 Node name mismatch: I'm [ejabberd@localhost], the database is owned by ['ejabberd@akash-Latitude-3450']
2016-01-07 18:38:20.410 [critical] <0.39.0>@ejabberd_app:db_init:127 Either set ERLANG_NODE in ejabberdctl.cfg or change node name in Mnesia
2016-01-07 18:38:20.410 [error] <0.38.0> CRASH REPORT Process <0.38.0> with 0 neighbours exited with reason: node_name_mismatch in ejabberd_app:db_init/0 line 129 in application_master:init/4 line 134

You have two options:
Name your Erlang node with the name that matches your Mnesia database when you start ejabberd. As the error message suggests, it can be changed via the ERLANG_NODE variable in ejabberdctl.cfg.
Back up the Mnesia database by starting the node under the old name, do a fresh install, and restore your data with the node started under the new name (a rough sketch of both options follows below).
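A minimal sketch of both options, assuming ejabberdctl.cfg lives in your install's etc/ejabberd directory and using the node names reported in the log above; paths and names will differ on your setup:

# Option 1: make the running node match the database owner, in ejabberdctl.cfg
ERLANG_NODE=ejabberd@akash-Latitude-3450

# Option 2: back up under the old name, then restore under the new name
ejabberdctl start                                # still started under the old node name
ejabberdctl backup /tmp/ejabberd-mnesia.backup
ejabberdctl stop
# reinstall (or clear the Mnesia spool directory), set the new ERLANG_NODE, then:
ejabberdctl start
ejabberdctl restore /tmp/ejabberd-mnesia.backup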

Related

Neo4J Docker Container Constantly Restarts

I am presently running Docker Desktop on Windows, using Neo4j 4.0.4 or 4.0.7. I have recently started seeing the container restart every few seconds.
These are the logs from the Docker console for the container:
Starting Neo4j.
2022-04-28 17:36:45.021+0000 INFO Note that since you did not explicitly set the port in dbms.connector.bolt.advertised_address Neo4j automatically set it to 7687 to match dbms.connector.bolt.listen_address. This behavior may change in the future and we recommend you to explicitly set it.
2022-04-28 17:36:45.028+0000 INFO ======== Neo4j 4.0.4 ========
2022-04-28 17:36:45.031+0000 INFO Starting...
2022-04-28 17:36:52.667+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService#44fff386' was successfully initialized, but failed to start. Please see the attached cause exception "Mismatching store id. Store StoreId: StoreId{creationTime=1651155662620, randomId=155159839854920640, storeVersion=3471765337752883975, upgradeTime=1651155662620, upgradeTxId=1}. Transaction log StoreId: StoreId{creationTime=1650887045799, randomId=2291666590955424440, storeVersion=3471765337752883975, upgradeTime=1650887045799, upgradeTxId=1}". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService#44fff386' was successfully initialized, but failed to start. Please see the attached cause exception "Mismatching store id. Store StoreId: StoreId{creationTime=1651155662620, randomId=155159839854920640, storeVersion=3471765337752883975, upgradeTime=1651155662620, upgradeTxId=1}. Transaction log StoreId: StoreId{creationTime=1650887045799, randomId=2291666590955424440, storeVersion=3471765337752883975, upgradeTime=1650887045799, upgradeTxId=1}".
org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService#44fff386' was successfully initialized, but failed to start. Please see the attached cause exception "Mismatching store id. Store StoreId: StoreId{creationTime=1651155662620, randomId=155159839854920640, storeVersion=3471765337752883975, upgradeTime=1651155662620, upgradeTxId=1}. Transaction log StoreId: StoreId{creationTime=1650887045799, randomId=2291666590955424440, storeVersion=3471765337752883975, upgradeTime=1650887045799, upgradeTxId=1}".
at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:45)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:164)
at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:114)
at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:89)
at com.neo4j.server.enterprise.EnterpriseEntryPoint.main(EnterpriseEntryPoint.java:25)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService#44fff386' was successfully initialized, but failed to start. Please see the attached cause exception "Mismatching store id. Store StoreId: StoreId{creationTime=1651155662620, randomId=155159839854920640, storeVersion=3471765337752883975, upgradeTime=1651155662620, upgradeTxId=1}. Transaction log StoreId: StoreId{creationTime=1650887045799, randomId=2291666590955424440, storeVersion=3471765337752883975, upgradeTime=1650887045799, upgradeTxId=1}".
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:465)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:157)
... 3 more
Caused by: java.lang.RuntimeException: Error starting database server at /data/databases
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.startDatabaseServer(DatabaseManagementServiceFactory.java:167)
at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:145)
at com.neo4j.server.database.EnterpriseGraphFactory.newDatabaseManagementService(EnterpriseGraphFactory.java:38)
at org.neo4j.server.database.LifecycleManagingDatabaseService.start(LifecycleManagingDatabaseService.java:88)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)
I am wondering what I should be looking at to resolve this, or whether there is a known solution?
Cheers

Q: Apache Geode - Unable to reconnect a node after OS patching "15 seconds have elapsed while waiting for replies"

I have a cluster consisting of 4 nodes in total, 3 servers and 1 management node, which was working properly.
At the beginning of the month we planned to patch the OS, and we started with the first server node using this procedure:
Stop service
OS patching
Server restart
Start service
The service on the first patched node, named "serverA", fails to restart with this error:
Log entries from the cluster join:
serverA:
| INFO | region-dm-12 | ache.geode.internal.tcp.Connection | --> Connection: shared=true ordered=false failed to connect to peer 10.237.110.195( Server serverB:9993):1024 because: java.net.ConnectException: Connection timed out (Connection timed out)
| WARN | region-dm-12 | ache.geode.internal.tcp.Connection | --> Connection: Attempting reconnect to peer 10.237.110.195( Server serverB:9993):1024
ServerMgmt:
| WARN | pool-3-thread-1 | tributed.internal.ReplyProcessor21 | --> 15 seconds have elapsed while waiting for replies: <CreateRegionProcessor$CreateRegionReplyProcessor 44180 waiting for 1 replies from [10.237.110.194( Server serverA:632):1024]> on 10.237.110.225( Management:6033):1024 whose current membership list is: [[10.237.110.196( Server serverC:16805):1024, 10.237.110.225( Management:6033):1024, 10.237.110.195( Server serverB:9993):1024, 10.237.110.194( Server serverA:632):1024]]
The connection between the systems was verified with tcpdump; UDP on port 1024 is working fine.
We have tried redeploying the service and have made numerous attempts, but we always get the same error during startup.
Any suggestions? Thank you.
Marco.
I think that to see this error message, serverA was probably able to send UDP messages to serverB but is failing to create a TCP connection. It's hard to say why, though - a firewall issue, some TCP configuration issue, ...?
Check whether serverB has anything interesting in its logs. Since you are using tcpdump, you should be watching for that TCP connection to serverB:9993, since it looks like that is what failed.
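For example, something along these lines could watch for the suspect connection attempt while serverA restarts (the interface choice is an assumption; the peer address and port come from the log above):

sudo tcpdump -i any -nn 'host 10.237.110.195 and tcp port 9993'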
There is no firewall between the systems. We have analyzed the network connection again during startup of node A, and we can see that communication can be established between all systems. But what we detected is that on port 2323, which is configured for the locator, the node sends packets to nodes B and C but only receives packets back from node C, not from node B. For us this is another sign that node B has an issue. Is there a way to check our assumption on node B?
A node ip .194
B node ip .195
C node ip .196
Management ip .225
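For the follow-up question about verifying node B, a possible set of checks to run directly on node B (.195); the ports come from the discussion above, and the interface choice is an assumption:

sudo ss -tulpn | grep -E '2323|9993'                 # is anything listening on the locator/membership ports?
sudo tcpdump -i any -nn 'port 2323 or port 9993'     # does node B see and answer traffic from node A (.194)?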

Apache NiFi is not starting up

I am trying to start Apache NiFi version 1.2.0 on a Windows 8 machine. It used to start properly, but after I restarted the system NiFi does not start at all. When I check the status I keep getting "Apache NiFi not running".
Below are the logs from the nifi.bootstrap.log file:
2017-07-05 15:41:57,105 WARN [NiFi Bootstrap Command Listener] org.apache.nifi.bootstrap.RunNiFi Failed to set permissions so that only the owner can read pid file E:\softwares\nifi-1.2.0\bin\..\run\nifi.pid; this may allows others to have access to the key needed to communicate with NiFi. Permissions should be changed so that only the owner can read this file
2017-07-05 15:41:57,142 WARN [NiFi Bootstrap Command Listener] org.apache.nifi.bootstrap.RunNiFi Failed to set permissions so that only the owner can read status file E:\softwares\nifi-1.2.0\bin\..\run\nifi.status; this may allows others to have access to the key needed to communicate with NiFi. Permissions should be changed so that only the owner can read this file
2017-07-05 15:41:57,168 INFO [NiFi Bootstrap Command Listener] org.apache.nifi.bootstrap.RunNiFi Apache NiFi now running and listening for Bootstrap requests on port 50765
2017-07-05 15:43:12,077 ERROR [NiFi logging handler] org.apache.nifi.StdErr Failed to start web server: Unable to start Flow Controller.
2017-07-05 15:43:12,078 ERROR [NiFi logging handler] org.apache.nifi.StdErr Shutting down...
2017-07-05 15:43:14,501 INFO [main] org.apache.nifi.bootstrap.RunNiFi NiFi never started. Will not restart NiFi
Stack trace from nifi.app.log:
2017-07-05 15:43:12,077 WARN [main] org.apache.nifi.web.server.JettyServer Failed to start web server... shutting down.
org.apache.nifi.web.NiFiCoreException: Unable to start Flow Controller.
at org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:88)
at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:876)
at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:532)
at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:839)
at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:344)
at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1480)
at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1442)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:799)
at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:540)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:290)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
at org.eclipse.jetty.server.Server.start(Server.java:452)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
at org.eclipse.jetty.server.Server.doStart(Server.java:419)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:695)
at org.apache.nifi.NiFi.<init>(NiFi.java:160)
at org.apache.nifi.NiFi.main(NiFi.java:267)
Caused by: java.io.IOException: Expected to read a Sentinel Byte of '1' but got a value of '0' instead
at org.apache.nifi.repository.schema.SchemaRecordReader.readRecord(SchemaRecordReader.java:65)
at org.apache.nifi.controller.repository.SchemaRepositoryRecordSerde.deserializeRecord(SchemaRepositoryRecordSerde.java:115)
at org.apache.nifi.controller.repository.SchemaRepositoryRecordSerde.deserializeEdit(SchemaRepositoryRecordSerde.java:109)
at org.apache.nifi.controller.repository.SchemaRepositoryRecordSerde.deserializeEdit(SchemaRepositoryRecordSerde.java:46)
at org.wali.MinimalLockingWriteAheadLog$Partition.recoverNextTransaction(MinimalLockingWriteAheadLog.java:1096)
at org.wali.MinimalLockingWriteAheadLog.recoverFromEdits(MinimalLockingWriteAheadLog.java:459)
at org.wali.MinimalLockingWriteAheadLog.recoverRecords(MinimalLockingWriteAheadLog.java:301)
at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.loadFlowFiles(WriteAheadFlowFileRepository.java:381)
at org.apache.nifi.controller.FlowController.initializeFlow(FlowController.java:712)
at org.apache.nifi.controller.StandardFlowService.initializeController(StandardFlowService.java:953)
at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:534)
at org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextInitialized(ApplicationStartupContextListener.java:72)
... 28 common frames omitted
Thanks in advance
After Googling this error, "Caused by: java.io.IOException: Expected to read a Sentinel Byte of '1' but got a value of '0' instead", I found that it indicates a partial write to the repositories.
Here are a couple of things you can check or try to bring your dataflow back online:
Check that your disks are not full.
Did you launch NiFi as the same user? Did you run it with administrator privileges?
You can back up/move your repositories and try to start NiFi with empty repositories; you will still have your dataflows, but any file that was being processed when you shut down will be gone (a sketch of this follows below).
Could you please try that?
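A minimal sketch of moving the repositories aside on Windows, assuming the install directory from the logs (E:\softwares\nifi-1.2.0) and the default repository locations from nifi.properties; adjust the paths if your configuration points elsewhere:

rem run from a command prompt, with NiFi stopped
cd /d E:\softwares\nifi-1.2.0
move flowfile_repository flowfile_repository.bak
move content_repository content_repository.bak
move provenance_repository provenance_repository.bak
bin\run-nifi.bat

If NiFi then starts cleanly, the old repositories can be deleted once you are sure nothing important is missing.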
I think the issue is an incompatible Java version; use Java 8.
If you haven't set JAVA_HOME, set it in the environment variables with a path like "C:/program files/jdk1.8".
There is a Jira for the issue when NiFi is run with Java 9, and it is not resolved yet:
https://issues.apache.org/jira/browse/NIFI-4419
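To set JAVA_HOME as suggested above, a hedged example on Windows (the path is the one from the answer; point it at your actual JDK 8 install directory):

rem persists the variable for the current user; open a new command prompt afterwards
setx JAVA_HOME "C:\Program Files\jdk1.8"
rem or, for the current session only:
set JAVA_HOME=C:\Program Files\jdk1.8

Then start NiFi again from the new prompt so the bootstrap picks up the variable.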

ejabberd incompatibility with p1_yaml library

I have installed Erlang/OTP 17.1 from source and I have installed ejabberd from GitHub. Both installed successfully. But when I start the ejabberd server, I get the following error in error.log:
2016-01-08 16:13:57.700 [error] <0.100.0> unable to load p1_yaml NIF: {error,{bad_lib,"Library version (2.8) not compatible (with 2.6)."}}
2016-01-08 16:13:57.700 [error] <0.99.0> CRASH REPORT Process <0.99.0> with 0 neighbours exited with reason: {{bad_lib,"Library version (2.8) not compatible (with 2.6)."},{p1_yaml_app,start,[normal,[]]}} in application_master:init/4 line 133
2016-01-08 16:13:57.769 [critical] <0.38.0>@ejabberd:exit_or_halt:133 failed to start application 'p1_yaml': {error,
{{bad_lib,
"Library version (2.8) not compatible (with 2.6)."},
{p1_yaml_app,start,[normal,[]]}}}
You likely have a p1_yaml.so file around that was built with a different version of the Erlang runtime than the one you are using.
You should rebuild ejabberd (or just p1_yaml) with the correct Erlang version.
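A rough sketch of that rebuild, assuming a rebar-based ejabberd source checkout where make fetches and builds the bundled dependencies (including p1_yaml), and that the Erlang/OTP you want to use is first in PATH:

which erl && erl -version      # confirm which Erlang runtime ejabberd will actually use
cd ejabberd                    # your source checkout
rm -rf deps                    # force the bundled deps and their NIFs to be rebuilt
./configure
make
make install

The key point is that the p1_yaml NIF must be compiled by the same Erlang/OTP release that runs ejabberd.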

Random failure of creating a New Cassandra Cluster using OpsCenter

OpsCenter version: 5.1.0
DSE version: 4.6.0
Creating a brand new cluster using OpsCenter directly gives us the following error. It randomly works with the same settings, but 95% of the time it fails with the same error. OpsCenter is running on its own box but shares the same security groups as the cluster instances. For good measure, I have opened up all TCP ports to all IPs. The following is the stack trace of the error from opscenterd.log:
2015-03-19 10:06:12+0000 [] INFO: Starting provisioning process
2015-03-19 10:06:12+0000 [] INFO: Starting installation phase of cluster provisioning
2015-03-19 10:06:13+0000 [] WARN: HTTP request http://10.x.x.x:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2015-03-19 10:06:13+0000 [] INFO: Beginning install of OpsCenter agent to 54.x.x.x
2015-03-19 10:06:26+0000 [] WARN: HTTP request http://10.x.x.x:61621/alive? failed: Connection was refused by other side: 111: Connection refused.
2015-03-19 10:06:31+0000 [] INFO: Agent for ip 10.x.x.x is version None
2015-03-19 10:06:31+0000 [] INFO: Agent for ip 10.x.x.x is version u'5.1.0'
2015-03-19 10:07:23+0000 [] INFO: Successfully installed agent and dse on node 10.x.x.x
2015-03-19 10:07:23+0000 [] INFO: Beginning "stop" phase of cluster provisioning
2015-03-19 10:07:25+0000 [] WARN: Marking request '10.x.x.x: /ops/stop' (f6708fa2-b45f-42b4-b992-90a82b460ac7) as failed: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] ERROR: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] WARN: Marking request 'stop stage' (0b6fcb6b-96ba-404e-a484-b4b6b167b309) as failed: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] ERROR: Stop stage failed: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] WARN: Marking request 'provision' (daf1c15d-92e3-40b0-83ca-34d548ea835b) as failed: Stop stage failed: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] ERROR:
2015-03-19 10:07:25+0000 [] ERROR: Cluster provisioning failed: Exception: Stop stage failed: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] ERROR: Failed to provision cluster: Cluster provisioning failed: Exception: Stop stage failed: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:25+0000 [] WARN: Marking request 28c021fd-d21a-4fed-bb5c-a4fe17d362e0 as failed: Cluster provisioning failed: Exception: Stop stage failed: Failed to stop node 10.x.x.x: /usr/sbin/service dse stop failed
exit status: 1
stdout:
log_daemon_msg is a shell function
Cassandra 2.0 and later require Java 7 or later.
2015-03-19 10:07:41+0000 [] WARN: Unable to find a matching cluster for node with IP [u'fe80:0:0:0:2000:aff:feeb:31c7%2', u'10.x.x.x', u'0:0:0:0:0:0:0:1%1', u'127.0.0.1']; the message was [u'5.1.0', u'/1947480708/conf']. This usually indicates that an OpsCenter agent is still running on an old node that was decommissioned or is part of a cluster that OpsCenter is no longer monitoring.
Appreciate any help!
Thanks in advance
Harsha
OpsCenter developer here. I make the OpsCenter provisioning features go zoom (or splat occasionally, as you've seen). It is with sadness and shame that I must tell you that you're hitting a bug.
The Datastax AMI version 2.4 used by OpsCenter provisioning (https://github.com/riptano/ComboAMI/tree/2.4) does quite a bit of work at boot time via startup scripts. One of those tasks is to set up some gpg repository keys used to validate packages. Intermittently that process can fail, breaking package installs and leading to the series of errors that you're seeing. This failure is intermittent and has greatly increased in frequency recently. If you check /home/ubuntu/datastax-ami/ami.log you should see the gpg key failures that begin the rest of the failure chain.
Unfortunately, this error is pretty far down the technology stack and is difficult to work around manually. If you just need to provision a single cluster, you can retry until you get a good run. Otherwise your best bet is to manually launch the instances and use local provisioning to deploy DSE/DSC to their private IP addresses:
Launch instances using ami-ada2b6c4 (assuming you're in us-east-1)
Make sure to add the instances to the OpsCenterSecurity group.
Make sure you have the private half of the keypair you use (you'll need it during local provisioning)
On the instance data page, hit the advanced pulldown and add the following userdata as text "--raidonly --java7"
Do a local-provisioning run against the private IPs
Not a super-simple workaround. I wish your experience with OpsCenter this time around was more awesome. The good news is I'm on this bug and it will be fixed in an upcoming point release.
Edit: No longer necessary to manually remove /etc/security/limits.d/cassandra.conf
If it's just complaining about Java, then install Java 7; DataStax preferably wants the Oracle JDK and JRE. You might already have Java 7 and another version on your nodes, but Java 7 is not the default version. To change this, do:
sudo update-java-alternatives -s java-7-oracle
which is a command you can script to run over SSH so you don't have to log in to each node.
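For instance, a small loop along these lines (the user name and host list are placeholders; substitute your actual node addresses and login, and add -t if sudo on the nodes needs a TTY):

for host in node1 node2 node3; do
  ssh ubuntu@"$host" 'sudo update-java-alternatives -s java-7-oracle'
done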
