UserCodeException: java.lang.OutOfMemoryError: Java heap space when streaming autoscaling - google-cloud-dataflow

I have two streaming pipelines running in production that have had no trouble so far (both on n1-standard-4). However, when I decided to try autoscaling, I got the error in the title. I've tried both normal autoscaling and Streaming Engine (with n2-highmem1, n1-highmen1), but nothing works.
The pipeline reads from Pubsub and writes to BigQuery. This is the error:
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Java heap space
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:194)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:125)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1203)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Java heap space
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
drivemode.com.dataflow.reader.PubsubToAnalyticsTableRowFN$DoFnInvoker.invokeSetup(Unknown Source)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:80)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:62)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:95)
org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
... 11 more
Caused by: java.lang.OutOfMemoryError: Java heap space
org.tukaani.xz.lz.LZDecoder.<init>(Unknown Source)
org.tukaani.xz.LZMA2InputStream.<init>(Unknown Source)
org.tukaani.xz.LZMA2InputStream.<init>(Unknown Source)
org.apache.commons.compress.archivers.sevenz.LZMA2Decoder.decode(LZMA2Decoder.java:39)
org.apache.commons.compress.archivers.sevenz.Coders.addDecoder(Coders.java:76)
org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecoderStack(SevenZFile.java:933)
org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecodingStream(SevenZFile.java:909)
org.apache.commons.compress.archivers.sevenz.SevenZFile.getNextEntry(SevenZFile.java:222)
net.iakovlev.timeshape.TimeZoneEngine.lambda$initialize$0(TimeZoneEngine.java:111)
net.iakovlev.timeshape.TimeZoneEngine$$Lambda$76/1740730188.apply(Unknown Source)
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
net.iakovlev.timeshape.Index.build(Index.java:73)
net.iakovlev.timeshape.TimeZoneEngine.initialize(TimeZoneEngine.java:126)
net.iakovlev.timeshape.TimeZoneEngine.initialize(TimeZoneEngine.java:94)
drivemode.com.dataflow.reader.AnalyticsTableRowFN.setUp(AnalyticsTableRowFN.java:60)
drivemode.com.dataflow.reader.PubsubToAnalyticsTableRowFN$DoFnInvoker.invokeSetup(Unknown Source)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:80)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:62)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:95)
org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
or sometimes I get
org.apache.beam.vendor.guava.v20_0.com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: Could not initialize class drivemode.com.dataflow.reader.AnalyticsTableRowFN
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2212)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4899)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:91)
org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:125)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1203)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class drivemode.com.dataflow.reader.AnalyticsTableRowFN
java.io.ObjectStreamClass.hasStaticInitializer(Native Method)
java.io.ObjectStreamClass.computeDefaultSUID(ObjectStreamClass.java:1787)
java.io.ObjectStreamClass.access$100(ObjectStreamClass.java:72)
java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:253)
java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:251)
java.security.AccessController.doPrivileged(Native Method)
java.io.ObjectStreamClass.getSerialVersionUID(ObjectStreamClass.java:250)
java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:611)
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
org.apache.beam.sdk.util.SerializableUtils.deserializeFromByteArray(SerializableUtils.java:71)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory$UserDoFnExtractor.getDoFnInfo(UserParDoFnFactory.java:62)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.lambda$create$0(UserParDoFnFactory.java:93)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4904)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
... 18 more

Some of your ParDo functions or MapElements might be consuming more memory than they should. Going by the logs you pasted, start by reviewing PubsubToAnalyticsTableRowFN: the OutOfMemoryError is thrown while its setup method runs.
This Beam doc link might help you tune the pipeline.
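As a side note on the second trace in the question: "NoClassDefFoundError: Could not initialize class ..." is what the JVM reports when a class's static initialization already failed once and the class is touched again. Whether that is what happened to AnalyticsTableRowFN is an assumption, since its source is not shown. A minimal sketch of the mechanism, with a hypothetical class `Fragile` standing in for the real one:

```java
public class FailedInitSketch {

    /** Hypothetical class whose static initializer always fails. */
    static final class Fragile {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        private static int compute() {
            throw new IllegalStateException("simulated failure during static init");
        }
    }

    /** Touches Fragile and returns the simple name of whatever was thrown. */
    static String touch() {
        try {
            return String.valueOf(Fragile.VALUE);
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // First access: the initializer runs and fails.
        System.out.println(touch());
        // Later accesses: the JVM has marked the class as unusable.
        System.out.println(touch());
    }
}
```

The practical consequence is that the NoClassDefFoundError is usually a follow-up symptom, so the earliest failure in the worker log is the one worth chasing.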

It turned out the problem was in setup, where I initialized a time-zone lookup library that returns a tz string for a (lat, lng) pair. Its memory consumption was too high. I solved the issue by loading that class at construction time instead.
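For readers hitting the same thing: a runner may create several copies of a DoFn on one worker, and if each copy builds its own large lookup in setup, heap usage multiplies. The sketch below is not the asker's actual fix (that code is not shown); it illustrates one common alternative, building the expensive object once per JVM and sharing it. `ExpensiveLookup`, `Worker` and the placeholder lookup logic are hypothetical stand-ins, not part of Beam or of the time-zone library.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedLookupSketch {

    /** Hypothetical stand-in for a memory-hungry, read-only engine. */
    static final class ExpensiveLookup {
        static final AtomicInteger BUILD_COUNT = new AtomicInteger();
        private final byte[] index;

        ExpensiveLookup() {
            BUILD_COUNT.incrementAndGet();
            index = new byte[1024 * 1024]; // pretend this is a large index
        }

        String regionFor(double lat, double lng) {
            // Placeholder logic only; a real engine would query its index.
            return lng < 0 ? "WEST" : "EAST";
        }
    }

    /** Initialization-on-demand holder: the JVM builds INSTANCE once, thread-safely. */
    private static final class Holder {
        static final ExpensiveLookup INSTANCE = new ExpensiveLookup();
    }

    static ExpensiveLookup sharedLookup() {
        return Holder.INSTANCE;
    }

    /** Hypothetical worker object playing the role of one DoFn instance. */
    static final class Worker {
        private transient ExpensiveLookup lookup;

        void setup() {
            lookup = sharedLookup(); // reuse the shared instance, do not rebuild
        }

        String process(double lat, double lng) {
            return lookup.regionFor(lat, lng);
        }
    }

    public static void main(String[] args) {
        Worker a = new Worker();
        Worker b = new Worker();
        a.setup();
        b.setup();
        System.out.println(a.process(35.68, 139.69));
        System.out.println("lookups built: " + ExpensiveLookup.BUILD_COUNT.get());
    }
}
```

Sharing only works if the lookup is safe to read from several threads at once, which is assumed here.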

Related

Cannot nest operations in the same thread

I'm getting this exception on my Jenkins Windows slave node. Any ideas as to what may be causing it? I can't find any information about this exception anywhere.
java.lang.UnsupportedOperationException: Cannot nest operations in the same thread. Each nested operation must run in its own thread.
at org.gradle.internal.operations.DefaultBuildOperationWorkerRegistry.doStartOperation(DefaultBuildOperationWorkerRegistry.java:65)
at org.gradle.internal.operations.DefaultBuildOperationWorkerRegistry.access$400(DefaultBuildOperationWorkerRegistry.java:30)
at org.gradle.internal.operations.DefaultBuildOperationWorkerRegistry$DefaultOperation.operationStart(DefaultBuildOperationWorkerRegistry.java:163)
at org.gradle.api.internal.tasks.testing.worker.ForkingTestClassProcessor.processTestClass(ForkingTestClassProcessor.java:68)
at org.gradle.api.internal.tasks.testing.processors.RestartEveryNTestClassProcessor.processTestClass(RestartEveryNTestClassProcessor.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.FailureHandlingDispatch.dispatch(FailureHandlingDispatch.java:29)
at org.gradle.internal.dispatch.AsyncDispatch.dispatchMessages(AsyncDispatch.java:132)
at org.gradle.internal.dispatch.AsyncDispatch.access$000(AsyncDispatch.java:33)
at org.gradle.internal.dispatch.AsyncDispatch$1.run(AsyncDispatch.java:72)
at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
at org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:40)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Some child operations have not yet completed.
Reading this post, I upgraded from Gradle v3.2.1 to v3.4 and the problem has gone away.

Apache Beam PubSub Reader Exceptions

I'm running a pipeline with a PubSub source and I'm getting some strange exceptions that crash it. I can process a few elements (3-10) just fine, and then all of a sudden one of the following two errors is thrown. Neither gives me a clue what I might be doing wrong, so I removed all of my transforms and left only the source in, and the problem still exists. I'm posting just some test strings to the PubSub. Any help is appreciated.
Exception 1:
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.NullPointerException
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:640)
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:313)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:174)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:127)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Exception 2:
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.IllegalStateException: Cannot finalize a restored checkpoint
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:293)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishRead(UnboundedReadEvaluatorFactory.java:205)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:142)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Basic Code:
// Create default options and switch the pipeline to streaming mode.
PipelineOptions options = PipelineOptionsFactory.create();
PubsubOptions dataflowOptions = options.as(PubsubOptions.class);
dataflowOptions.setStreaming(true);
Pipeline p = Pipeline.create(options);
// Read UTF-8 strings from the subscription; no other transforms are applied.
p.apply(PubsubIO.<String>read().subscription("my-subscription")
    .withCoder(StringUtf8Coder.of()));
Execution:
mvn compile exec:java -Dexec.mainClass=my.package.SalesTransactions -Dexec.args="--runner BlockingDataflowRunner --project=my-project --tempLocation=gs://my-project/tmp"
This problem is caused by a bug (BEAM-1656) in the DirectRunner together with a precondition check in PubsubCheckpoint.
The answer to "Apache Beam: PubsubReader fails with NPE" contains more information about the bug and how to solve it. Thanks!

maven build fails with message in Jenkins

I am running Jenkins server version 2.36, and I intermittently get the failure below when building a Maven project.
I searched around and many people are experiencing this problem, but no one really knows what is causing it. Any ideas?
The error is the following:
ERROR: Aborted Maven execution for InterruptedIOException
java.net.SocketTimeoutException: Accept timed out
at java.net.DualStackPlainSocketImpl.waitForNewConnection(Native Method)
at java.net.DualStackPlainSocketImpl.socketAccept(Unknown Source)
at java.net.AbstractPlainSocketImpl.accept(Unknown Source)
at java.net.PlainSocketImpl.accept(Unknown Source)
at java.net.ServerSocket.implAccept(Unknown Source)
at java.net.ServerSocket.accept(Unknown Source)
at hudson.maven.AbstractMavenProcessFactory$SocketHandler$AcceptorImpl.accept(AbstractMavenProcessFactory.java:213)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:320)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:295)
at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:254)
at hudson.remoting.UserRequest.perform(UserRequest.java:121)
at hudson.remoting.UserRequest.perform(UserRequest.java:49)
at hudson.remoting.Request$2.run(Request.java:324)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at hudson.remoting.Engine$1$1.run(Engine.java:63)
at java.lang.Thread.run(Unknown Source)
at ......remote call to Channel to /10.0.9.100(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1537)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:253)
at hudson.remoting.Channel.call(Channel.java:822)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:256)
at hudson.maven.$Proxy66.accept(Unknown Source)
at hudson.maven.AbstractMavenProcessFactory.newProcess(AbstractMavenProcessFactory.java:282)
at hudson.maven.ProcessCache.get(ProcessCache.java:236)
at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:798)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
at hudson.model.Run.execute(Run.java:1729)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:544)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:404)
You can try to disable IPv6 by adding -Djava.net.preferIPv4Stack=true to jenkins.xml arguments tag.
I've found that this seems to be caused by some kind of memory and/or CPU usage problem. I was repeatedly getting this error message whilst building my maven application through jenkins whilst on a t2.micro ec2-instance on AWS.
The solution was to change my instance from t2.micro to t2.medium (small may have worked but untested). Not the best solution I'm sure, but I think the error stems from CPU maxing capacity - which was the one consistent factor when this issue occurred.
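To see why a starved machine can produce this error: the trace shows Jenkins blocked in `ServerSocket.accept()` inside `AbstractMavenProcessFactory`, which suggests it was waiting for the Maven process to connect back and gave up when the timeout expired. The sketch below is not Jenkins code; it only reproduces the same exception type with plain JDK sockets, and the 200 ms timeout is an arbitrary value chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;

public class AcceptTimeoutSketch {

    /** Returns true if no client connected within timeoutMillis. */
    static boolean acceptTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port.
        try (ServerSocket server = new ServerSocket(0)) {
            server.setSoTimeout(timeoutMillis);
            try {
                server.accept().close(); // blocks until a client connects
                return false;
            } catch (SocketTimeoutException e) {
                // Nobody connected in time: the situation in the Jenkins trace.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + acceptTimesOut(200));
    }
}
```

If the child process is slow to start because the host is short on CPU or memory, it can miss that window, which would be consistent with the larger instance making the error disappear.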

Datastax Enterprise: Error during nodetool cleanup

I've got a two-datacenter deployment with cassandra and search nodes. It looks like this:
ubuntu@ip-172-31-25-223:~$ dsetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address         DC         Rack  Workload  Status  State   Load     Owns    VNodes
172.31.47.194   Solr       2a    Unknown   Up      Normal  3.17 GB  0.00%   1
172.31.39.59    us-west-2  2a    Unknown   Up      Normal  2.32 GB  31.53%  512
172.31.9.36     us-west-2  2c    Unknown   Up      Normal  3.43 GB  33.48%  512
172.31.25.223   us-west-2  2b    Unknown   Up      Normal  3.25 GB  34.99%  512
Warning: Node 172.31.25.223 is serving 1.11 times the token space of node 172.31.39.59, which means it will be using 1.11 times more disk space and network bandwidth. If this is unintentional, check out http://wiki.apache.org/cassandra/Operations#Ring_management
I recently added the 172.31.39.59 node and I wanted to run cleanups on the other nodes to remove some of the duplicate data.
Running nodetool cleanup on the 172.31.25.223 node resulted in the following trace:
ubuntu@ip-172-31-25-223:~$ nodetool cleanup
Error occurred during cleanup
java.util.concurrent.ExecutionException: java.lang.IllegalStateException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:228)
at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:266)
at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1112)
at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2251)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at sun.rmi.transport.Transport$1.run(Transport.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException
at com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex.delete(Cql3SolrSecondaryIndex.java:61)
at org.apache.cassandra.db.index.SecondaryIndexManager.deleteFromIndexes(SecondaryIndexManager.java:470)
at org.apache.cassandra.db.compaction.CompactionManager$CleanupStrategy$Full.cleanup(CompactionManager.java:720)
at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:580)
at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63)
at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:275)
at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:223)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
... 3 more
Caused by: java.lang.NullPointerException
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.doDelete(AbstractSolrSecondaryIndex.java:619)
at com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex.delete(Cql3SolrSecondaryIndex.java:57)
... 10 more
I'm new to Datastax enterprise and I don't have much visibility into Cassandra->Solr and how the pipeline works. I'd appreciate any help!
Thanks!
Advait
Update: Tried running it again and got a different trace this time:
ubuntu@ip-172-31-25-223:~$ nodetool cleanup
Exception in thread "main" java.lang.AssertionError: [SSTableReader(path='/raid0/cassandra/data/liminex_ent/sub_accounts/liminex_ent-sub_accounts-jb-73-Data.db'), SSTableReader(path='/raid0/cassandra/data/liminex_ent/sub_accounts/liminex_ent-sub_accounts-jb-74-Data.db')]
at org.apache.cassandra.db.ColumnFamilyStore$13.call(ColumnFamilyStore.java:2130)
at org.apache.cassandra.db.ColumnFamilyStore$13.call(ColumnFamilyStore.java:2127)
at org.apache.cassandra.db.ColumnFamilyStore.runWithCompactionsDisabled(ColumnFamilyStore.java:2109)
at org.apache.cassandra.db.ColumnFamilyStore.markAllCompacting(ColumnFamilyStore.java:2140)
at org.apache.cassandra.db.compaction.CompactionManager.performAllSSTableOperation(CompactionManager.java:215)
at org.apache.cassandra.db.compaction.CompactionManager.performCleanup(CompactionManager.java:266)
at org.apache.cassandra.db.ColumnFamilyStore.forceCleanup(ColumnFamilyStore.java:1112)
at org.apache.cassandra.service.StorageService.forceKeyspaceCleanup(StorageService.java:2251)
at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
at sun.rmi.transport.Transport$1.run(Transport.java:177)
at sun.rmi.transport.Transport$1.run(Transport.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
This was a bug, fixed in 4.5.3.
The fix for the cleanup bug is in our latest release, 4.5.3, which went GA yesterday, Nov 4th 2014.
The rolling upgrade should be straightforward, with no downtime.
Here are the release notes:
http://www.datastax.com/documentation/datastax_enterprise/4.5/datastax_enterprise/RNdse45.html
Fixed an issue causing a null pointer exception on non Solr workload
nodes holding Solr data and attempting to run the nodetool cleanup
command on data. (DSP-4310)

Websphere Application server not starting

We have one WebSphere Application Server instance which went down with an OutOfMemory error and has not started since. The error message from the log is below. Any urgent help will be highly appreciated. The WAS version is 6.0.2.33.
[03/04/13 14:16:01:536 BST] 0000000a WsServerImpl E WSVR0009E: Error occurred during startup
META-INF/ws-server-components.xml
[03/04/13 14:16:01:547 BST] 0000000a WsServerImpl E WSVR0009E: Error occurred during startup
com.ibm.ws.exception.ConfigurationError: com.ibm.ws.exception.ConfigurationError: Problem initializing AdminImpl:
at com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:180)
at com.ibm.ws.runtime.WsServerImpl.start(WsServerImpl.java:133)
at com.ibm.ws.runtime.WsServerImpl.main(WsServerImpl.java:387)
at com.ibm.ws.runtime.WsServer.main(WsServer.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:85)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:58)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:60)
at java.lang.reflect.Method.invoke(Method.java:391)
at com.ibm.ws.bootstrap.WSLauncher.run(WSLauncher.java:219)
at java.lang.Thread.run(Thread.java:568)
Caused by: com.ibm.ws.exception.ConfigurationError: Problem initializing AdminImpl:
at com.ibm.ws.management.component.AdminImpl.initialize(AdminImpl.java:780)
at com.ibm.ws.runtime.component.ContainerImpl.initializeComponent(ContainerImpl.java:1160)
at com.ibm.ws.runtime.component.ContainerImpl.initializeComponents(ContainerImpl.java:1014)
at com.ibm.ws.runtime.component.ServerImpl.initialize(ServerImpl.java:284)
at com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:173)
... 10 more
Caused by: com.ibm.ws.exception.ConfigurationWarning: Problem registering JVM MBean.
at com.ibm.ws.management.component.AdminImpl.initialize(AdminImpl.java:405)
... 14 more
Caused by: com.ibm.websphere.management.exception.AdminException: ADMN0005E: The service is unable to activate MBean: type JVM, collaborator com.ibm.ws.management.component.JVMMBean@6da65714, configuration ID JVM, descriptor null.
at com.ibm.ws.management.MBeanFactoryImpl.activateMBean(MBeanFactoryImpl.java:654)
at com.ibm.ws.management.MBeanFactoryImpl.activateMBean(MBeanFactoryImpl.java:400)
at com.ibm.ws.management.component.AdminImpl.initialize(AdminImpl.java:394)
... 14 more
Caused by: com.ibm.websphere.management.exception.DescriptorParseException: ADMN0001W: The service is unable to parse the MBean descriptor file com/ibm/ws/management/descriptor/xml/JVM.xml.
at com.ibm.ws.management.descriptor.MBeanDescriptorLoader.loadDescriptor(MBeanDescriptorLoader.java:164)
at com.ibm.ws.management.descriptor.MBeanDescriptorManager.loadDescriptorFile(MBeanDescriptorManager.java:349)
at com.ibm.ws.management.descriptor.MBeanDescriptorManager.getDescriptor(MBeanDescriptorManager.java:147)
at com.ibm.ws.management.MBeanFactoryImpl.activateMBean(MBeanFactoryImpl.java:427)
... 16 more
Caused by: java.lang.NullPointerException
at javax.management.MBeanNotificationInfo.equals(MBeanNotificationInfo.java:155)
at com.ibm.ws.management.descriptor.MBeanDescriptorLoader.addFeatures(MBeanDescriptorLoader.java:461)
at com.ibm.ws.management.descriptor.MBeanDescriptorLoader.loadParentTypes(MBeanDescriptorLoader.java:434)
at com.ibm.ws.management.descriptor.MBeanDescriptorLoader.endDocument(MBeanDescriptorLoader.java:217)
at org.apache.xerces.parsers.AbstractSAXParser.endDocument(Unknown Source)
at org.apache.xerces.impl.dtd.XMLDTDValidator.endDocument(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl.endEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityManager.endEntity(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at com.ibm.ws.management.descriptor.MBeanDescriptorLoader.loadDescriptor(MBeanDescriptorLoader.java:155)
... 19 more
I think your WebSphere is trying to use the Sun JDK/JRE instead of the IBM JDK/JRE.
First check which JDK version is on the environment path used by WAS. After that, you can delete all the WebSphere server instances, then restart and synchronize the node and the node agent.
