I'm running DSE 4.5.1 on a 3-node cluster in AWS with RF=3. One of the nodes logs this error in system.log; see the (long) stack trace below. I would like to understand what the error means or implies. Is this a cause for concern? Lastly, how do I resolve the error?
[cqlsh 4.1.1 | Cassandra 2.0.8.39 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
ERROR [Native-Transport-Requests:34142] 2015-05-12 14:15:33,029 CopyFieldMutation.java (line 166) Cannot get comparator 1 in org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type). This might due to a mismatch between the schema and the data read
java.lang.RuntimeException: Cannot get comparator 1 in org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type). This might due to a mismatch between the schema and the data read
at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:133)
at org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:100)
at com.datastax.bdp.cassandra.cql3.CopyFieldMutation.buildKeyValueIterator(CopyFieldMutation.java:275)
at com.datastax.bdp.cassandra.cql3.CopyFieldMutation.addPrimaryKeyFieldsToMutation(CopyFieldMutation.java:260)
at com.datastax.bdp.cassandra.cql3.CopyFieldMutation.createMutation(CopyFieldMutation.java:110)
at com.datastax.bdp.search.solr.triggers.SolrAugmentationTrigger.augment(SolrAugmentationTrigger.java:76)
at org.apache.cassandra.triggers.TriggerExecutor.executeInternal(TriggerExecutor.java:190)
at org.apache.cassandra.triggers.TriggerExecutor.execute(TriggerExecutor.java:94)
at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:532)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:546)
at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:530)
at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
at com.datastax.bdp.cassandra.cql3.DseQueryHandler.statementExecution(DseQueryHandler.java:207)
at com.datastax.bdp.cassandra.cql3.DseQueryHandler.process(DseQueryHandler.java:86)
at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:306)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:285)
at com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:45)
at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:124)
... 23 more
WARN [Native-Transport-Requests:34142] 2015-05-12 14:15:33,029 SolrAugmentationTrigger.java (line 107) Error generating additional mutations for Solr copy/dynamic fields. Update will be applied without them
org.apache.cassandra.exceptions.InvalidRequestException: Cannot get comparator 1 in org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type). This might due to a mismatch between the schema and the data read
at com.datastax.bdp.cassandra.cql3.CopyFieldMutation.createMutation(CopyFieldMutation.java:167)
at com.datastax.bdp.search.solr.triggers.SolrAugmentationTrigger.augment(SolrAugmentationTrigger.java:76)
at org.apache.cassandra.triggers.TriggerExecutor.executeInternal(TriggerExecutor.java:190)
at org.apache.cassandra.triggers.TriggerExecutor.execute(TriggerExecutor.java:94)
at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:532)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:546)
at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:530)
at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)
at com.datastax.bdp.cassandra.cql3.DseQueryHandler.statementExecution(DseQueryHandler.java:207)
at com.datastax.bdp.cassandra.cql3.DseQueryHandler.process(DseQueryHandler.java:86)
at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119)
at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43)
at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Related
WebLogic is not coming up. It is giving the following stack trace. Can anyone help in solving this?
<Jun 20, 2018 1:04:27,029 PM UTC> <Critical> <WebLogicServer> <BEA-000386> <Server subsystem failed. Reason: A MultiException has 4 exceptions. They are:
1. java.lang.ExceptionInInitializerError
2. java.lang.IllegalStateException: Unable to perform operation: post construct on weblogic.rjvm.RJVMService
3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.protocol.ProtocolRegistrationService errors were found
4. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.protocol.ProtocolRegistrationService
A MultiException has 4 exceptions. They are:
1. java.lang.ExceptionInInitializerError
2. java.lang.IllegalStateException: Unable to perform operation: post construct on weblogic.rjvm.RJVMService
3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.protocol.ProtocolRegistrationService errors were found
4. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.protocol.ProtocolRegistrationService
at org.jvnet.hk2.internal.Collector.throwIfErrors(Collector.java:89)
at org.jvnet.hk2.internal.ClazzCreator.resolveAllDependencies(ClazzCreator.java:250)
at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:358)
at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:487)
at org.glassfish.hk2.runlevel.internal.AsyncRunLevelContext.findOrCreate(AsyncRunLevelContext.java:305)
Truncated. see log file for complete stacktrace
Caused By: java.lang.ExceptionInInitializerError
at weblogic.utils.net.AddressUtils.getIPForLocalHost(AddressUtils.java:163)
at weblogic.rjvm.JVMID.setLocalID(JVMID.java:278)
at weblogic.rjvm.RJVMService.setJVMID(RJVMService.java:72)
at weblogic.rjvm.RJVMService.start(RJVMService.java:54)
at weblogic.server.AbstractServerService.postConstruct(AbstractServerService.java:76)
Truncated. see log file for complete stacktrace
Caused By: java.lang.NullPointerException
at weblogic.utils.net.AddressUtils$AddressMaker.getAllAddresses(AddressUtils.java:62)
at weblogic.utils.net.AddressUtils$AddressMaker.<clinit>(AddressUtils.java:45)
at weblogic.utils.net.AddressUtils.getIPForLocalHost(AddressUtils.java:163)
at weblogic.rjvm.JVMID.setLocalID(JVMID.java:278)
at weblogic.rjvm.RJVMService.setJVMID(RJVMService.java:72)
Truncated. see log file for complete stacktrace
>
The WebLogic Server encountered a critical failure
Reason: Assertion violated
Stopping Derby server...
Derby server stopped.
Actually, there was an interface resolution problem inside the Docker container that was causing this.
Make sure of the following points for resolution (both can be verified from inside the container with the sketch below):
1) cat /etc/hosts should show an entry corresponding to localhost
2) the docker0 interface should be in the up state
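To verify both points, a minimal JDK-only sketch like the following can help. It is not WebLogic code; it merely mirrors the kind of local-address lookup that weblogic.utils.net.AddressUtils performs at startup, which is where the NullPointerException above originates:

import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.Collections;

public class LocalAddressCheck {
    public static void main(String[] args) throws Exception {
        // Throws UnknownHostException when /etc/hosts lacks a usable
        // localhost/hostname entry (the first point above).
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("local host resolves to " + local.getHostAddress());

        // Lists all interfaces and their state; docker0 (on the host) or
        // eth0 (inside the container) should report up=true (the second point).
        for (NetworkInterface nif : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            System.out.println(nif.getName() + " up=" + nif.isUp());
        }
    }
}

If either check fails where WebLogic runs, fix the hosts file or bring the interface up before restarting the server.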
The Neo4j service is failing with the following error log; please help.
We are trying to configure Neo4j on one of our dev servers to do some POCs, but after installation the Neo4j service fails with the following error.
2018-03-21 10:34:17.935+0000 ERROR [o.n.k.i.DatabaseHealth] Database panic: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details. Failed to apply transaction: null
org.neo4j.kernel.api.exceptions.TransactionApplyKernelException: Failed to apply transaction: null
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.apply(RecordStorageEngine.java:334)
at org.neo4j.kernel.recovery.DefaultRecoverySPI$RecoveryVisitor.visit(DefaultRecoverySPI.java:137)
at org.neo4j.kernel.recovery.DefaultRecoverySPI$RecoveryVisitor.visit(DefaultRecoverySPI.java:118)
at org.neo4j.kernel.recovery.Recovery.init(Recovery.java:128)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:406)
at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:62)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:98)
at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:521)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:100)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.initFacade(GraphDatabaseFacadeFactory.java:207)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:126)
at org.neo4j.server.CommunityNeoServer.lambda$static$0(CommunityNeoServer.java:58)
at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:88)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:445)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:107)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:211)
at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:111)
at org.neo4j.server.BlockingBootstrapper.start(BlockingBootstrapper.java:41)
at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:79)
at org.neo4j.server.CommunityEntryPoint.start(CommunityEntryPoint.java:42)
Caused by: java.io.IOException: Failed to flush label updates
at org.neo4j.kernel.impl.transaction.command.IndexBatchTransactionApplier.applyPendingLabelAndIndexUpdates(IndexBatchTransactionApplier.java:116)
at org.neo4j.kernel.impl.transaction.command.IndexBatchTransactionApplier.close(IndexBatchTransactionApplier.java:124)
at org.neo4j.kernel.impl.api.BatchTransactionApplierFacade.close(BatchTransactionApplierFacade.java:70)
at org.neo4j.kernel.impl.storageengine.impl.recordstorage.RecordStorageEngine.apply(RecordStorageEngine.java:331)
... 23 more
Caused by: java.util.concurrent.ExecutionException: org.neo4j.kernel.impl.store.UnderlyingStorageException: org.neo4j.index.internal.gbptree.TreeInconsistencyException: GSPP WRITE failure
Pointer state A: CRASH
Pointer state B: CRASH
Generations: A < B | GB+Tree[file:D:\NEO_HOME\data\databases\graph.db\neostore.labelscanstore.db, layout:LabelScanLayout[version:0.1, identifier:21483684112629824, keySize:10, valueSize:8], generation:7/9]
at org.neo4j.concurrent.WorkSync.checkFailure(WorkSync.java:182)
at org.neo4j.concurrent.WorkSync.access$100(WorkSync.java:49)
at org.neo4j.concurrent.WorkSync$1.await(WorkSync.java:132)
at org.neo4j.kernel.impl.transaction.command.IndexBatchTransactionApplier.applyPendingLabelAndIndexUpdates(IndexBatchTransactionApplier.java:112)
... 26 more
Thanks in advance.
So I have just set up an HA environment where I have a master server and instances of an application using embedded Neo4j talking to that cluster. Everything seems to work if the state of both databases is the same.
However, if I delete all data from my slave instance and have it join the cluster, I expect the data from the cluster to propagate into the slave instance. Instead I get errors from what appears to be Neo4j Spatial. I have Neo4j Spatial in my application, and the server plugin installed on the master server side.
An example of the stack trace I get:
2015-10-19 15:10:27.096+0000 ERROR [org.neo4j]: Exception when stopping org.neo4j.kernel.lifecycle.Lifecycle$Delegate@ae93556 org.neo4j.gis.spatial.indexprovider.SpatialIndexImplementation.stop()V
java.lang.AbstractMethodError: org.neo4j.gis.spatial.indexprovider.SpatialIndexImplementation.stop()V
at org.neo4j.kernel.lifecycle.Lifecycles$1.stop(Lifecycles.java:55)
at org.neo4j.kernel.lifecycle.Lifecycle$Delegate.stop(Lifecycle.java:75)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.stop(LifeSupport.java:527)
at org.neo4j.kernel.lifecycle.LifeSupport.stop(LifeSupport.java:155)
at org.neo4j.kernel.lifecycle.LifeSupport.shutdown(LifeSupport.java:185)
at org.neo4j.kernel.NeoStoreDataSource.stop(NeoStoreDataSource.java:1160)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.stop(LifeSupport.java:527)
at org.neo4j.kernel.lifecycle.LifeSupport.stop(LifeSupport.java:155)
at org.neo4j.kernel.impl.transaction.state.DataSourceManager.stop(DataSourceManager.java:137)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.stopServicesAndHandleBranchedStore(SwitchToSlave.java:521)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.checkDataConsistency(SwitchToSlave.java:357)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.executeConsistencyChecks(SwitchToSlave.java:316)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:219)
at org.neo4j.kernel.ha.cluster.HighAvailabilityModeSwitcher$2.run(HighAvailabilityModeSwitcher.java:328)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:99)
2015-10-19 15:10:27.102+0000 ERROR [org.neo4j]: Lifecycle exception Failed to transition component 'org.neo4j.kernel.lifecycle.Lifecycle$Delegate@ae93556' from STOPPED to SHUTTING_DOWN. Please see attached cause exception
org.neo4j.kernel.lifecycle.LifecycleException: Failed to transition component 'org.neo4j.kernel.lifecycle.Lifecycle$Delegate@ae93556' from STOPPED to SHUTTING_DOWN. Please see attached cause exception
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.shutdown(LifeSupport.java:559)
at org.neo4j.kernel.lifecycle.LifeSupport.shutdown(LifeSupport.java:200)
at org.neo4j.kernel.NeoStoreDataSource.stop(NeoStoreDataSource.java:1160)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.stop(LifeSupport.java:527)
at org.neo4j.kernel.lifecycle.LifeSupport.stop(LifeSupport.java:155)
at org.neo4j.kernel.impl.transaction.state.DataSourceManager.stop(DataSourceManager.java:137)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.stopServicesAndHandleBranchedStore(SwitchToSlave.java:521)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.checkDataConsistency(SwitchToSlave.java:357)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.executeConsistencyChecks(SwitchToSlave.java:316)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:219)
at org.neo4j.kernel.ha.cluster.HighAvailabilityModeSwitcher$2.run(HighAvailabilityModeSwitcher.java:328)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:99)
Caused by: java.lang.AbstractMethodError: org.neo4j.gis.spatial.indexprovider.SpatialIndexImplementation.shutdown()V
at org.neo4j.kernel.lifecycle.Lifecycles$1.shutdown(Lifecycles.java:64)
at org.neo4j.kernel.lifecycle.Lifecycle$Delegate.shutdown(Lifecycle.java:81)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.shutdown(LifeSupport.java:555)
... 18 more
2015-10-19 15:10:27.103+0000 ERROR [org.neo4j]: Chained lifecycle exception Component 'org.neo4j.kernel.lifecycle.Lifecycle$Delegate@ae93556' failed to stop. Please see attached cause exception.
org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.lifecycle.Lifecycle$Delegate@ae93556' failed to stop. Please see attached cause exception.
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.stop(LifeSupport.java:532)
at org.neo4j.kernel.lifecycle.LifeSupport.stop(LifeSupport.java:155)
at org.neo4j.kernel.lifecycle.LifeSupport.shutdown(LifeSupport.java:185)
at org.neo4j.kernel.NeoStoreDataSource.stop(NeoStoreDataSource.java:1160)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.stop(LifeSupport.java:527)
at org.neo4j.kernel.lifecycle.LifeSupport.stop(LifeSupport.java:155)
at org.neo4j.kernel.impl.transaction.state.DataSourceManager.stop(DataSourceManager.java:137)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.stopServicesAndHandleBranchedStore(SwitchToSlave.java:521)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.checkDataConsistency(SwitchToSlave.java:357)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.executeConsistencyChecks(SwitchToSlave.java:316)
at org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:219)
at org.neo4j.kernel.ha.cluster.HighAvailabilityModeSwitcher$2.run(HighAvailabilityModeSwitcher.java:328)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:99)
Caused by: java.lang.AbstractMethodError: org.neo4j.gis.spatial.indexprovider.SpatialIndexImplementation.stop()V
at org.neo4j.kernel.lifecycle.Lifecycles$1.stop(Lifecycles.java:55)
at org.neo4j.kernel.lifecycle.Lifecycle$Delegate.stop(Lifecycle.java:75)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.stop(LifeSupport.java:527)
... 19 more
Does Neo4j Spatial support replication across instances? Or, more specifically, restoring the spatial index to a new, empty instance that joins the cluster for the first time?
Updating Neo4j Spatial to version 0.15-neo4j-2.2.6 fixes this issue, so you need to be running Neo4j 2.2.6 in order to have spatial indexes replicate properly.
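For context on the error itself: java.lang.AbstractMethodError means a class was compiled against an older version of an interface and is being invoked through a method the interface only gained later. A simplified illustration follows; the names mirror the stack trace, but this is a sketch, not the actual Neo4j source:

// Sketch of the lifecycle contract Neo4j drives index providers through
// (the real interface is org.neo4j.kernel.lifecycle.Lifecycle; the method
// names here are taken from the stack trace above).
interface Lifecycle {
    void init() throws Throwable;
    void start() throws Throwable;
    void stop() throws Throwable;      // assume added in a later version
    void shutdown() throws Throwable;  // assume added in a later version
}

// A plugin class compiled against an older contract that did not yet declare
// stop()/shutdown() still loads fine, because the JVM only verifies a method
// at the point it is actually invoked. The first call to provider.stop()
// then fails with AbstractMethodError: ...SpatialIndexImplementation.stop()V,
// exactly as in the log above. Rebuilding the plugin against the matching
// kernel (Spatial 0.15-neo4j-2.2.6 for Neo4j 2.2.6) supplies the methods.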
We are trying to add a new Solr node to our cluster:
DC Cassandra:
    Cassandra node 1
DC Solr:
    Solr node 1 <-- new node (actually, a replacement for an old node)
    Solr node 2
    Solr node 3
    Solr node 4
    Solr node 5
During the bootstrap process:
The stream from node 3 to node 1 failed with an exception:
ERROR [STREAM-OUT-/IP_OF_NODE1] 2014-04-01 01:14:40,887 CassandraDaemon.java (line 196) Exception in thread Thread[STREAM-OUT-/IP_OF_NODE1,5,main]
java.lang.NullPointerException
at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249)
at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375)
at java.lang.Thread.run(Thread.java:744)
The stream from node 4 to node 1 never started. The last relevant line in node 4's system.log is:
Received streaming plan for Bootstrap.
It should have been followed by:
Prepare completed. Receiving 0 files(0 bytes), sending x files(y bytes)
It seems that the bootstrap process is now stalled because the data file sizes are not changing anymore. How can I force those streams to be retried?
EDIT:
I restarted all nodes today in an attempt to force the new node to retry the bootstrap process. Unfortunately, it encountered some stream failures again. This time, the exception on node 1 is as follows:
WARN [STREAM-IN-/IP_OF_NODE3] 2014-04-06 20:48:17,963 StreamSession.java (line 532) [Stream #c84effb0-bda9-11e3-a07d-89325af2f6bf] Retrying for following error
java.lang.RuntimeException: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-tmp-jb-1209-Data.db (Too many open files)
at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:75)
at org.apache.cassandra.io.compress.CompressedSequentialWriter.<init>(CompressedSequentialWriter.java:71)
at org.apache.cassandra.io.compress.CompressedSequentialWriter.open(CompressedSequentialWriter.java:42)
at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:107)
at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:60)
at org.apache.cassandra.streaming.StreamReader.createWriter(StreamReader.java:111)
at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:65)
at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:47)
at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:37)
at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:283)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-tmp-jb-1209-Data.db (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:71)
ERROR [STREAM-IN-/78.46.63.218] 2014-04-06 20:48:17,964 StreamSession.java (line 418) [Stream #c84effb0-bda9-11e3-a07d-89325af2f6bf] Streaming error occurred
java.lang.IllegalArgumentException: Unknown type 0
at org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:89)
at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:283)
at java.lang.Thread.run(Thread.java:724)
There are tons of similar errors in the log, e.g.:
ERROR [CompactionExecutor:129] 2014-04-06 20:50:06,401 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:129,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-jb-51-Data.db (Too many open files)
at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:154)
at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:137)
at org.apache.cassandra.db.Keyspace.indexRow(Keyspace.java:400)
at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:833)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-jb-51-Data.db (Too many open files)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:47)
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
at org.apache.cassandra.db.columniterator.SimpleSliceReader.<init>(SimpleSliceReader.java:57)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1550)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(SliceQueryPager.java:77)
at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:84)
at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(SliceQueryPager.java:33)
at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:148)
... 10 more
Caused by: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-jb-51-Data.db (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
... 28 more
This appears to be very similar to a Cassandra bug/issue:
https://issues.apache.org/jira/browse/CASSANDRA-6965
I'll follow up on that.
Meanwhile, you could run nodetool rebuild/repair on that new node.
EDIT: Another Cassandra issue that appears to be related:
CASSANDRA-6984 - "NullPointerException in Streaming During Repair"
https://issues.apache.org/jira/browse/CASSANDRA-6984
That issue is labeled as a Blocker, so it should get some prompt attention. I've inquired as to whether there is a workaround.
Stay tuned.
(Too many open files)
Looks like you need to increase your ulimit, specifically the open-file (nofile) limit for the user running Cassandra.
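Before (or after) raising the limit, it helps to confirm how close the process is to its file-descriptor ceiling. One JVM-side way, assuming a HotSpot JVM on Unix, is the com.sun.management extension of the OperatingSystemMXBean. Note the sketch below reports on the JVM that runs it; to inspect Cassandra itself you would read the same MBean over Cassandra's JMX port:

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdCheck {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            // Current vs. maximum file descriptors for this JVM process.
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount());
            System.out.println("max fds:  " + unix.getMaxFileDescriptorCount());
        } else {
            System.out.println("UnixOperatingSystemMXBean not available on this JVM");
        }
    }
}

If the open count sits near the max during streaming or compaction, raise the nofile limit for the Cassandra user (for example in /etc/security/limits.conf) and restart the node.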
I have some code I inherited, and there is little documentation. The system keeps failing with various errors. It seems to me it is not finding the jar files, and I am not even sure where it is looking. Here is the error below. Can anyone offer any advice?
- Creating instance of source Twitter, type uk.co.senym.flume.TweetDataSource
13 Dec 2013 15:29:55,923 ERROR [conf-file-poller-0](org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:142) - Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load source type: uk.co.senym.flume.TweetDataSource, class: uk.co.senym.flume.TweetDataSource
at org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:67)
at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:40)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:327)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.ClassNotFoundException: uk.co.senym.flume.TweetDataSource
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.flume.source.DefaultSourceFactory.getClass(DefaultSourceFactory.java:65)
... 11 more
If you're using a recent version of Flume, then you should use the plugins.d directory.
I'll assume for the moment you are using a Bigtop-derived distribution such as Cloudera CDH4. Then you want to take a look at /etc/flume-ng/conf/flume-env.sh to see if they were customizing the Flume classpath to point to the jar file for your custom Twitter source.
That is the old way, and it kinda sucks. A better way is to put your stuff into plugins.d as documented here: http://archive.cloudera.com/cdh4/cdh/4/flume-ng/FlumeUserGuide.html#installing-third-party-plugins
I believe the default plugins.d directory on CDH4 is /var/lib/flume-ng/plugins.d
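For reference, a custom source like the one named in the error is just a class Flume loads by name. A minimal, illustrative skeleton follows; the package and class name are taken from the error above, but the body is hypothetical, not your inherited code. Its jar would go under plugins.d/<plugin-name>/lib/ per the guide linked above:

package uk.co.senym.flume;

import java.nio.charset.StandardCharsets;

import org.apache.flume.Context;
import org.apache.flume.EventDrivenSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.AbstractSource;

public class TweetDataSource extends AbstractSource
        implements Configurable, EventDrivenSource {

    @Override
    public void configure(Context context) {
        // Read properties from the agent config here,
        // e.g. String key = context.getString("consumerKey"); (hypothetical key)
    }

    @Override
    public void start() {
        super.start();
        // A real source would subscribe to the Twitter stream and push each
        // message into the channel; this single event is a placeholder.
        getChannelProcessor().processEvent(
                EventBuilder.withBody("hello", StandardCharsets.UTF_8));
    }

    @Override
    public void stop() {
        super.stop();
    }
}

If Flume still throws ClassNotFoundException after the jar is in place, double-check that the fully qualified name in the agent config matches the package and class exactly.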
HTH