Errors in opscenterd.log after starting repair service - datastax-enterprise

After upgrading OpsCenter from 5.x to 6.0.8, then 6.1.0, then 6.1.1, I see lots of errors and warnings in opscenterd.log, like the following. I'm using DSE 4.8.10. I've turned on the Repair Service, which seems to be working as expected, but I see WARN entries in the log. Are these anything to be concerned about?
2017-06-27 01:00:00,356 [local] ERROR: The best practice rule 'Tombstone count' has failed. (MainThread)
2017-06-27 01:00:00,358 [local] ERROR: The best practice rule 'Wide partitions' has failed. (MainThread)
2017-06-27 01:00:00,451 [local] ERROR: The best practice rule 'Secondary indexes cardinality' has failed. (MainThread)
2017-06-27 13:10:11,672 [opscenterd] WARN: Unknown request 54688d7f-7c5f-4bcb-bc4d-07b7a0a79c3c (running {'started': 1498569009, 'details': u'Repair session f260e7b0-5b39-11e7-87cf-612516369059 for range (1042910172352712044,1065269862139026652] finished', 'details-type': None}) (MainThread)
2017-06-27 13:12:40,885 [opscenterd] WARN: Unknown request 341c9bc9-1c00-4771-aa64-27206ad4152a (running {'started': 1498569160, 'details': u'Repair session 4c3d5c50-5b3a-11e7-87cf-612516369059 for range (-1555782662812296764,-1538702344225528661] finished', 'details-type': None}) (MainThread)

OpsCenter is warning about repair sessions on the Cassandra nodes that it doesn't recognize. They could be sessions that are hung up; there are a number of tickets related to that in Cassandra 2.1.x and older:
https://issues.apache.org/jira/browse/CASSANDRA-10012
https://issues.apache.org/jira/browse/CASSANDRA-10992
https://issues.apache.org/jira/browse/CASSANDRA-10961
https://issues.apache.org/jira/browse/CASSANDRA-12901
You could try to clean up these sessions by invoking forceTerminateAllRepairSessions via JMX on the StorageService MBean. Alternatively, a rolling restart of the cluster should do the trick.
Regardless, OpsCenter's Repair Service will operate normally.
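To see how many of these warnings there are and which repair sessions and token ranges they refer to, the WARN lines can be summarized with a short script. This is only a sketch: the regular expression is inferred from the two sample lines in the question, and the function name parse_unknown_requests is hypothetical, not part of OpsCenter.

```python
import re

# Pattern inferred from the sample WARN lines in the question; other
# OpsCenter versions may format these messages differently.
WARN_RE = re.compile(
    r"WARN: Unknown request (?P<request>[0-9a-f-]{36}) .*?"
    r"Repair session (?P<session>[0-9a-f-]{36}) "
    r"for range \((?P<start>-?\d+),(?P<end>-?\d+)\] (?P<state>\w+)"
)


def parse_unknown_requests(lines):
    """Yield one dict per 'Unknown request' warning found in lines."""
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            yield match.groupdict()


sample = [
    "2017-06-27 13:10:11,672 [opscenterd] WARN: Unknown request "
    "54688d7f-7c5f-4bcb-bc4d-07b7a0a79c3c (running {'started': 1498569009, "
    "'details': u'Repair session f260e7b0-5b39-11e7-87cf-612516369059 for range "
    "(1042910172352712044,1065269862139026652] finished', 'details-type': None}) "
    "(MainThread)",
    "2017-06-27 01:00:00,356 [local] ERROR: The best practice rule "
    "'Tombstone count' has failed. (MainThread)",
]

for entry in parse_unknown_requests(sample):
    print(entry["session"], entry["start"], entry["end"], entry["state"])
```

To run it against a real log, pass an open file handle for opscenterd.log instead of the sample list.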

Related

Issue connecting to Neo4j Aura with Python 'neo4j' driver

I attempted to connect to a Neo4j Aura database using Python, but it failed with "Unable to retrieve routing information".
from neo4j import GraphDatabase
from neo4j.debug import watch

uri = "neo4j+s://<id>.databases.neo4j.io"
driver = GraphDatabase.driver(uri, auth=("neo4j", "<password>"))

def workload(tx):
    return tx.run("RETURN 1 as n").data()

with watch("neo4j"):  # enable logging
    with driver.session() as session:
        session.write_transaction(workload)
driver.close()
Running the above Python script returned the following log:
Attempting to update routing table from IPv4Address(('<id>.databases.neo4j.io', 7687))
[#0000] C: <RESOLVE> <id>.databases.neo4j.io:7687
[#0000] C: <OPEN> xx.xxx.xxx.xxx:7687
[#C000] C: <SECURE> <id>.databases.neo4j.io
[#0000] C: <CONNECTION FAILED> BoltSecurityError: [SSLCertVerificationError] Connection Failed. Please ensure that your database is listening on the correct host and port and that you have enabled encryption if required. Note that the default encryption setting has changed in Neo4j 4.0. See the docs for more information. Failed to establish encrypted connection. (code 1: Operation not permitted)
Failed to fetch routing info 35.xxx.xxx.xxx:7687
[#0000] C: <ROUTING> Deactivating address IPv4Address(('<id>.databases.neo4j.io', 7687))
[#0000] C: <ROUTING> table={None: RoutingTable(database=None routers={}, readers={}, writers={}, last_updated_time=0.235748575, ttl=0)}
Attempting to update routing table from
Unable to retrieve routing information
Transaction failed and will be retried in 1.1281720312998946s (Unable to retrieve routing information)
I looked into the Neo4j documentation and searched elsewhere, but could not find a resolution.
Version:
Python 3.7.4
neo4j 4.4.2
I would very much appreciate your input if you have experienced the same issue and found a way to resolve it.

tdb2.tdbcompact command line tool returns Failed to get a lock: file

I'm running apache-jena-fuseki-3.13.1 and just found tdb2.tdbcompact in the bin directory. I need to run tdb2.tdbcompact nightly to prevent my Jena Fuseki from running out of disk space, but I get an error message (Failed to get a lock: file) when running it:
miettinj@ramen:~/jena> ./apache-jena-3.13.1/bin/tdb2.tdbcompact --loc=./apache-jena-fuseki-3.13.1/run/databases/test_TDB2
org.apache.jena.dboe.DBOpEnvException: Failed to get a lock: file='/srv/work/miettinj/jena/apache-jena-fuseki-3.13.1/run/databases/test_TDB2/tdb.lock': held by process 6136
ps -x|grep 6136
6136 ? Sl 30:48 /usr/lib64/jvm/java/bin/java -Xmx1200M -cp /srv/work/miettinj/jena/apache-jena-fuseki-3.13.1/fuseki-server.jar
"held by process 6136"
Another process is using the database. Compaction has to happen from the process using the database.
Apache Jena Fuseki 3.17.0 added an endpoint so that the administrator can ask for compaction on a running Fuseki server.
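As a sketch of what a nightly job could send on Fuseki 3.17.0 or later: the request below assumes the admin operation is exposed as POST /$/compact/{name}, that the server listens on http://localhost:3030, and that the dataset is named test_TDB2. Check all three against your Fuseki version and configuration before relying on it. The helper name build_compact_request is made up for illustration.

```python
import urllib.parse
import urllib.request


def build_compact_request(base_url, dataset):
    """Build (but do not send) a POST request asking Fuseki to compact a dataset."""
    # Assumed endpoint shape: <base>/$/compact/<dataset name>
    url = f"{base_url.rstrip('/')}/$/compact/{urllib.parse.quote(dataset)}"
    return urllib.request.Request(url, data=b"", method="POST")


request = build_compact_request("http://localhost:3030", "test_TDB2")
print(request.get_method(), request.full_url)

# To actually trigger compaction against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```

Because the request goes to the running server, it avoids the lock conflict that the standalone tdb2.tdbcompact command runs into.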

Getting error while creating DSE Graph - "Host did not respond in a timely fashion"

We are using DataStax Enterprise version 5.0.1 and are facing an issue while creating a graph from the Gremlin Console.
Here are the details of the error that I am getting:
adminuser@dc0vm1:~$ dse gremlin-console
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.tinkergraph
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured 13.82.30.252/13.82.30.252:8182
gremlin> :> 1+1
Host did not respond in a timely fashion - check the server status and submit again.
gremlin> :> system.graph('food').create()
Host did not respond in a timely fashion - check the server status and submit again.
I changed the hosts setting in the remote.yaml file from [localhost] to
hosts: [13.82.30.252].
I ran the nodetool command to check if the server is running properly:
adminuser@dc0vm1:~$ nodetool status
Datacenter: dc0
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 13.82.25.134 168.92 KB 64 ? d7a98eed-9b15-42ee-bc5c-f406e98fd6fc FD2
UN 13.82.25.152 189.17 KB 64 ? 7ffa11ea-8607-4bdb-903b-2ee3baeacae8 FD0
UN 13.82.30.252 150.6 KB 64 ? a57f6cd8-5466-480e-b919-329c36fbfd28 FD1
The cassandra.yaml has the following entries related to the host:
broadcast_rpc_address: 13.82.30.252
rpc_address: 0.0.0.0
Could you please let me know what configuration I am missing here?
I figured out that the DSE Graph service is not enabled by default, so you need to edit the file /etc/default/dse to enable it -
sudo vim /etc/default/dse
Make sure that the following parameter is set to 1 –
# Enable the DSE Graph service on this node
GRAPH_ENABLED=1
Restart the DSE service -
sudo service dse stop
sudo service dse start
Now Gremlin Console is able to connect and create the Graph.
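Before restarting, the setting can be checked with a few lines of Python. This is a minimal sketch that assumes the file uses plain KEY=value lines as shown above; the function name graph_enabled is hypothetical and the sample text stands in for the real /etc/default/dse.

```python
def graph_enabled(config_text):
    """Return True if the last uncommented GRAPH_ENABLED line is set to 1."""
    enabled = False
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not assignments
        key, _, value = line.partition("=")
        if key.strip() == "GRAPH_ENABLED":
            # A later assignment overrides an earlier one, as in a shell file.
            enabled = value.strip().strip("\"'") == "1"
    return enabled


sample = """# Enable the DSE Graph service on this node
GRAPH_ENABLED=1
"""
print(graph_enabled(sample))

# On a real node, read the file named in the answer instead:
# with open("/etc/default/dse") as handle:
#     print(graph_enabled(handle.read()))
```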

solr is not up when we install alfresco 5.1 and solr in different tomcat instances

I have installed Alfresco and Solr in different Tomcat instances using the docs URL below. Alfresco Share was running fine, but when I run the Solr instance I see the error below.
http://docs.alfresco.com/5.1/tasks/solr4-install-config.html
I generated secure keys for Solr communication:
http://docs.alfresco.com/5.1/tasks/generate-keys-solr4.html
2016-07-18 13:25:30,037 ERROR [solr.tracker.AbstractTracker] [SolrTrackerScheduler_Worker-14] Tracking failed
org.alfresco.error.AlfrescoRuntimeException: 06180034 api/solr/aclchangesets return status:403
at org.alfresco.solr.client.SOLRAPIClient.getAclChangeSets(SOLRAPIClient.java:162)
at org.alfresco.solr.tracker.AclTracker.checkRepoAndIndexConsistency(AclTracker.java:335)
at org.alfresco.solr.tracker.AclTracker.trackRepository(AclTracker.java:313)
at org.alfresco.solr.tracker.AclTracker.doTrack(AclTracker.java:104)
at org.alfresco.solr.tracker.AbstractTracker.track(AbstractTracker.java:185)
at org.alfresco.solr.tracker.TrackerJob.execute(TrackerJob.java:47)
at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)
2016-07-18 13:25:30,029 ERROR [solr.tracker.AbstractTracker] [SolrTrackerScheduler_Worker-6] Tracking failed
org.alfresco.error.AlfrescoRuntimeException: 06180033 api/solr/aclchangesets return status:403
at org.alfresco.solr.client.SOLRAPIClient.getAclChangeSets(SOLRAPIClient.java:162)
at org.alfresco.solr.tracker.AclTracker.checkRepoAndIndexConsistency(AclTracker.java:335)
at org.alfresco.solr.tracker.AclTracker.trackRepository(AclTracker.java:313)
at org.alfresco.solr.tracker.AclTracker.doTrack(AclTracker.java:104)
at org.alfresco.solr.tracker.AbstractTracker.track(AbstractTracker.java:185)
at org.alfresco.solr.tracker.TrackerJob.execute(TrackerJob.java:47)
at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)
2016-07-18 13:25:30,056 ERROR [solr.tracker.AbstractTracker] [SolrTrackerScheduler_Worker-11] Tracking failed
org.alfresco.error.AlfrescoRuntimeException: 06180036 GetModelsDiff return status is 403
at org.alfresco.solr.client.SOLRAPIClient.getModelsDiff(SOLRAPIClient.java:1157)
at org.alfresco.solr.tracker.ModelTracker.trackModelsImpl(ModelTracker.java:249)
at org.alfresco.solr.tracker.ModelTracker.trackModels(ModelTracker.java:207)
at org.alfresco.solr.tracker.ModelTracker.doTrack(ModelTracker.java:16
at org.alfresco.solr.tracker.AbstractTracker.track(AbstractTracker.java:185)
at org.alfresco.solr.tracker.TrackerJob.execute(TrackerJob.java:47)
at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)
I had a similar problem which has been solved by changing
clientAuth="false"
to
clientAuth="want"
in Tomcat 7 server.xml
(credit goes to https://issues.alfresco.com/jira/browse/ACE-4551?focusedCommentId=424920&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-424920)
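For anyone who wants to script that change, here is a rough sketch using Python's standard XML library. It assumes the SSL connector can be recognized by SSLEnabled="true"; the sample XML is a made-up, simplified fragment rather than a real Alfresco server.xml, and the function name set_client_auth is hypothetical. Note that ElementTree drops XML comments when parsing this way, so editing the file by hand may be preferable for a heavily commented server.xml.

```python
import xml.etree.ElementTree as ET


def set_client_auth(server_xml, value="want"):
    """Return server.xml text with clientAuth set on every SSL-enabled Connector."""
    root = ET.fromstring(server_xml)
    for connector in root.iter("Connector"):
        # Assumption: the SSL connector is the one marked SSLEnabled="true".
        if connector.get("SSLEnabled", "").lower() == "true":
            connector.set("clientAuth", value)
    return ET.tostring(root, encoding="unicode")


# Simplified, illustrative fragment; a real server.xml has more attributes.
sample = """<Server>
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1" />
    <Connector port="8443" SSLEnabled="true" scheme="https" clientAuth="false" />
  </Service>
</Server>"""

print(set_client_auth(sample))
```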

Neo4j server failed to start

Tried starting neo4j service and got a message like
WARNING: Detected a limit of 1024 for maximum open files, while a minimum value of 40000 is recommended.
WARNING: Problems with the operation of the server may occur. Please refer to the Neo4j manual regarding lifting this limitation.
Starting Neo4j Server...WARNING: not changing user
process [17348]... waiting for server to be ready... BAD.
Neo4j Server may have failed to start, please check the logs.
The log says :
Opened [/home/ub/graph_db/neo4j-community-1.7.M01/data/graph.db/nioneo_logical.log.1] clean empty log, version=224, lastTxId=654769
2013-03-14 11:26:28.111+0000: TM opening log: /home/ub/graph_db/neo4j-community-1.7.M01/data/graph.db/tm_tx_log.1
2013-03-14 11:26:28.159+0000: Failed to load index provider lucene Target file[lucene.log.v318] already exists
org.neo4j.graphdb.NotFoundException: Target file[lucene.log.v318] already exists
at org.neo4j.kernel.impl.util.FileUtils.renameFile(FileUtils.java:165)
at org.neo4j.kernel.DefaultFileSystemAbstraction.renameFile(DefaultFileSystemAbstraction.java:78)
at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.renameLogFileToRightVersion(XaLogicalLog.java:700)
at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.renameIfExists(XaLogicalLog.java:219)
at org.neo4j.kernel.impl.transaction.xaframework.XaLogicalLog.open(XaLogicalLog.java:171)
at org.neo4j.kernel.impl.transaction.xaframework.XaContainer.openLogicalLog(XaContainer.java:64)
at org.neo4j.index.impl.lucene.LuceneDataSource.<init>(LuceneDataSource.java:229)
at org.neo4j.index.lucene.LuceneIndexProvider.load(LuceneIndexProvider.java:71)
at org.neo4j.kernel.AbstractGraphDatabase$DefaultKernelExtensionLoader.loadIndexImplementations(AbstractGraphDatabase.java:986)
at org.neo4j.kernel.AbstractGraphDatabase$DefaultKernelExtensionLoader.init(AbstractGraphDatabase.java:958)
at org.neo4j.kernel.LifeSupport$LifecycleInstance.init(LifeSupport.java:362)
at org.neo4j.kernel.LifeSupport.init(LifeSupport.java:76)
at org.neo4j.kernel.LifeSupport.start(LifeSupport.java:110)
at org.neo4j.kernel.AbstractGraphDatabase.run(AbstractGraphDatabase.java:178)
at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:69)
at org.neo4j.server.NeoServerBootstrapper$1.createDatabase(NeoServerBootstrapper.java:65)
at org.neo4j.server.database.Database.createDatabase(Database.java:80)
at org.neo4j.server.database.Database.<init>(Database.java:63)
at org.neo4j.server.NeoServerWithEmbeddedWebServer.startDatabase(NeoServerWithEmbeddedWebServer.java:186)
at org.neo4j.server.NeoServerWithEmbeddedWebServer.start(NeoServerWithEmbeddedWebServer.java:97)
at org.neo4j.server.Bootstrapper.start(Bootstrapper.java:87)
at org.neo4j.server.Bootstrapper.main(Bootstrapper.java:52)
2013-03-14 11:26:28.160+0000: TM shutting down
2013-03-14 11:26:28.382+0000: Closed log /home/biju/graph_db/neo4j-community-1.7.M01/data/graph.db/nioneo_logical.log
2013-03-14 11:26:28.945+0000: NeoStore closed
2013-03-14 11:26:28.946+0000: --- SHUTDOWN diagnostics START ---
2013-03-14 11:26:28.947+0000: --- SHUTDOWN diagnostics END ---
This started happening after I installed Elasticsearch on my machine. There was one issue with starting Elasticsearch (a JAVA_HOME issue), which has been sorted out.
I had this problem when I was installing Neo4j for the first time on my Linux laptop. I solved it by adding these two lines at the end of the /etc/security/limits.conf file:
user hard nofile 100000
user soft nofile 40000
where user is the login name of the user who starts Neo4j.
The 100000 and 40000 are somewhat arbitrary; they were OK for me. If you still get the error, try increasing them.
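To confirm that the new limits are in effect, log in again as the user that starts Neo4j and check the limits that the process actually sees. A minimal sketch using Python's standard resource module (Unix only); the 40000 threshold is the minimum named in the startup warning, and the function name nofile_limits is just for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = nofile_limits()
recommended = 40000  # minimum named in the Neo4j startup warning

print(f"soft={soft} hard={hard}")
if soft != resource.RLIM_INFINITY and soft < recommended:
    print(f"soft limit is below the recommended {recommended}")
```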
If you've got a db with that problem, upgrading won't make it go away. 1.8.2 will prevent this from happening though. You're running community I see so keeping old logs around isn't all that necessary. Try deleting the existing lucene.log.v318 file, or move it away at least and see what happens the next startup.

Resources