Restarting a failed/stalled stream during bootstrap of new node - datastax-enterprise

We are trying to add a new Solr node to our cluster:
DC Cassandra
Cassandra node 1
DC Solr
Solr node 1 <-- new node (actually, a replacement for an old node)
Solr node 2
Solr node 3
Solr node 4
Solr node 5
During the bootstrap process:
The stream from node 3 to node 1 failed with an exception:
ERROR [STREAM-OUT-/IP_OF_NODE1] 2014-04-01 01:14:40,887 CassandraDaemon.java (line 196) Exception in thread Thread[STREAM-OUT-/IP_OF_NODE1,5,main]
java.lang.NullPointerException
at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:249)
at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:375)
at java.lang.Thread.run(Thread.java:744)
The stream from node 4 to node 1 never started. The last relevant line in node 4's system.log is:
Received streaming plan for Bootstrap.
It should have been followed by:
Prepare completed. Receiving 0 files(0 bytes), sending x files(y bytes)
It seems that the bootstrap process is now stalled because the data file sizes are not changing anymore. How can I force those streams to be retried?
EDIT:
I restarted all nodes today in an attempt to force the new node to retry the bootstrap process. Unfortunately, it encountered some stream failures again. This time, the exception on node 1 is as follows:
WARN [STREAM-IN-/IP_OF_NODE3] 2014-04-06 20:48:17,963 StreamSession.java (line 532) [Stream #c84effb0-bda9-11e3-a07d-89325af2f6bf] Retrying for following error
java.lang.RuntimeException: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-tmp-jb-1209-Data.db (Too many open files)
at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:75)
at org.apache.cassandra.io.compress.CompressedSequentialWriter.<init>(CompressedSequentialWriter.java:71)
at org.apache.cassandra.io.compress.CompressedSequentialWriter.open(CompressedSequentialWriter.java:42)
at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:107)
at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:60)
at org.apache.cassandra.streaming.StreamReader.createWriter(StreamReader.java:111)
at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:65)
at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:47)
at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:37)
at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:283)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-tmp-jb-1209-Data.db (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:71)
ERROR [STREAM-IN-/78.46.63.218] 2014-04-06 20:48:17,964 StreamSession.java (line 418) [Stream #c84effb0-bda9-11e3-a07d-89325af2f6bf] Streaming error occurred
java.lang.IllegalArgumentException: Unknown type 0
at org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:89)
at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54)
at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:283)
at java.lang.Thread.run(Thread.java:724)
There are many similar errors in the log, e.g.:
ERROR [CompactionExecutor:129] 2014-04-06 20:50:06,401 CassandraDaemon.java (line 196) Exception in thread Thread[CompactionExecutor:129,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-jb-51-Data.db (Too many open files)
at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:154)
at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:137)
at org.apache.cassandra.db.Keyspace.indexRow(Keyspace.java:400)
at org.apache.cassandra.db.index.SecondaryIndexBuilder.build(SecondaryIndexBuilder.java:62)
at org.apache.cassandra.db.compaction.CompactionManager$9.run(CompactionManager.java:833)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-jb-51-Data.db (Too many open files)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:47)
at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1195)
at org.apache.cassandra.db.columniterator.SimpleSliceReader.<init>(SimpleSliceReader.java:57)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:250)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1550)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
at org.apache.cassandra.service.pager.SliceQueryPager.queryNextPage(SliceQueryPager.java:77)
at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:84)
at org.apache.cassandra.service.pager.SliceQueryPager.fetchPage(SliceQueryPager.java:33)
at org.apache.cassandra.service.pager.QueryPagers$1.next(QueryPagers.java:148)
... 10 more
Caused by: java.io.FileNotFoundException: /home/cassandra/data/my_keyspace/my_table/my_keyspace-my_table-jb-51-Data.db (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
... 28 more

This appears to be very similar to a Cassandra bug/issue:
https://issues.apache.org/jira/browse/CASSANDRA-6965
I'll follow up on that.
Meanwhile, you could run rebuild/repair on that new node.
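In case it helps, that suggestion would look roughly like this with nodetool (a sketch; the datacenter name is taken from the cluster layout in the question, so adjust to your own):
  # run on the new node: stream its ranges from the existing datacenter
  nodetool rebuild Cassandra
  # or repair the node's ranges against its replicas
  nodetool repair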
EDIT: Another Cassandra issue that appears to be related:
CASSANDRA-6984 - "NullPointerException in Streaming During Repair"
https://issues.apache.org/jira/browse/CASSANDRA-6984
That issue is labeled as a Blocker, so it should get some prompt attention. I've inquired as to whether there is a workaround.
Stay tuned.

(Too many open files)
Looks like you need to increase your ulimit.
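For example, on Linux you could check and raise the limit along these lines (100000 is a commonly recommended starting point for Cassandra, but treat the exact value as an assumption for your environment):
  # check the open-file limit for the user running Cassandra
  ulimit -n
  # raise it for the current session (as root, before starting Cassandra)
  ulimit -n 100000
  # or persist it in /etc/security/limits.conf with a line like:
  # cassandra - nofile 100000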

Related

WebLogic not coming up inside the Docker container

WebLogic is not coming up. It is giving the following stack trace. Can anyone help in solving it?
<Jun 20, 2018 1:04:27,029 PM UTC> <Critical> <WebLogicServer> <BEA-000386> <Server subsystem failed. Reason: A MultiException has 4 exceptions. They are:
1. java.lang.ExceptionInInitializerError
2. java.lang.IllegalStateException: Unable to perform operation: post construct on weblogic.rjvm.RJVMService
3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.protocol.ProtocolRegistrationService errors were found
4. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.protocol.ProtocolRegistrationService
A MultiException has 4 exceptions. They are:
1. java.lang.ExceptionInInitializerError
2. java.lang.IllegalStateException: Unable to perform operation: post construct on weblogic.rjvm.RJVMService
3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of weblogic.protocol.ProtocolRegistrationService errors were found
4. java.lang.IllegalStateException: Unable to perform operation: resolve on weblogic.protocol.ProtocolRegistrationService
at org.jvnet.hk2.internal.Collector.throwIfErrors(Collector.java:89)
at org.jvnet.hk2.internal.ClazzCreator.resolveAllDependencies(ClazzCreator.java:250)
at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:358)
at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:487)
at org.glassfish.hk2.runlevel.internal.AsyncRunLevelContext.findOrCreate(AsyncRunLevelContext.java:305)
Truncated. see log file for complete stacktrace
Caused By: java.lang.ExceptionInInitializerError
at weblogic.utils.net.AddressUtils.getIPForLocalHost(AddressUtils.java:163)
at weblogic.rjvm.JVMID.setLocalID(JVMID.java:278)
at weblogic.rjvm.RJVMService.setJVMID(RJVMService.java:72)
at weblogic.rjvm.RJVMService.start(RJVMService.java:54)
at weblogic.server.AbstractServerService.postConstruct(AbstractServerService.java:76)
Truncated. see log file for complete stacktrace
Caused By: java.lang.NullPointerException
at weblogic.utils.net.AddressUtils$AddressMaker.getAllAddresses(AddressUtils.java:62)
at weblogic.utils.net.AddressUtils$AddressMaker.<clinit>(AddressUtils.java:45)
at weblogic.utils.net.AddressUtils.getIPForLocalHost(AddressUtils.java:163)
at weblogic.rjvm.JVMID.setLocalID(JVMID.java:278)
at weblogic.rjvm.RJVMService.setJVMID(RJVMService.java:72)
Truncated. see log file for complete stacktrace
>
The WebLogic Server encountered a critical failure
Reason: Assertion violated
Stopping Derby server...
Derby server stopped.
Actually, there was an interface resolution problem inside the Docker container that was causing this.
Make sure of the following points for resolution (see the commands sketched below):
1) /etc/hosts should have an entry corresponding to localhost
2) the docker0 interface should be in the up state
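A quick way to verify both points (assuming the default bridge name docker0):
  # 1) localhost must resolve inside the container
  cat /etc/hosts              # expect a line like: 127.0.0.1 localhost
  # 2) the bridge interface must be up on the host
  ip link show docker0        # state should be UP
  ip link set docker0 up      # bring it up if it is down (as root)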

Open Externally Locked Neo4j Store

I am currently unable to start a neo4j instance on a server where it was previously working. Neo4j produces an error saying one of the files is externally locked. From reading answers to similar questions, I know this usually means another instance is currently using the store.
I have used lsof to look for any open files with 'neo' in their name, but there are no open files.
I have used ps to look for any process with 'neo' or 'java' in its name, but no luck with that either.
I am not sure where to go from here to get neo4j working again. How do I figure out why this file is locked?
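For reference, the checks described above were along these lines; running fuser on the specific store file (path taken from the stack trace below) is one more thing worth trying:
  lsof | grep -i neo                  # no open files with 'neo' in the name
  ps aux | grep -iE 'neo|java'        # no matching processes either
  # show any process holding the locked store file
  fuser -v /scratch/neo4j/neo4j-community-2.3.4/data/graph.db/neostore.nodestore.db.labels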
[site#x neo4j]$ neo4jstart-234.sh
[site#x neo4j]$ WARNING: Max 1024 open files allowed, minimum of 40 000 recommended. See the Neo4j manual.
Starting Neo4j Server console-mode...
2017-07-20 10:37:51.073-0400 INFO Successfully shutdown Neo4j Server
2017-07-20 10:37:51.074-0400 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#79b877d2' was successfully initialized, but failed to start. Please see attached cause exception. Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#79b877d2' was successfully initialized, but failed to start. Please see attached cause exception.
org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabase#79b877d2' was successfully initialized, but failed to start. Please see attached cause exception.
at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:67)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:235)
at org.neo4j.server.Bootstrapper.start(Bootstrapper.java:97)
at org.neo4j.server.CommunityBootstrapper.start(CommunityBootstrapper.java:48)
at org.neo4j.server.CommunityBootstrapper.main(CommunityBootstrapper.java:35)
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabase#79b877d2' was successfully initialized, but failed to start. Please see attached cause exception.
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:462)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:195)
... 3 more
Caused by: java.lang.RuntimeException: Error starting org.neo4j.kernel.impl.factory.CommunityFacadeFactory, /scratch/neo4j/neo4j-community-2.3.4/data/graph.db
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:143)
at org.neo4j.kernel.impl.factory.CommunityFacadeFactory.newFacade(CommunityFacadeFactory.java:43)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:108)
at org.neo4j.server.CommunityNeoServer$1.newGraphDatabase(CommunityNeoServer.java:66)
at org.neo4j.server.database.LifecycleManagingDatabase.start(LifecycleManagingDatabase.java:95)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
... 5 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.NeoStoreDataSource#14c7f72f' was successfully initialized, but failed to start. Please see attached cause exception.
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:462)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.kernel.impl.transaction.state.DataSourceManager.start(DataSourceManager.java:112)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
at org.neo4j.kernel.impl.factory.GraphDatabaseFacadeFactory.newFacade(GraphDatabaseFacadeFactory.java:139)
... 10 more
Caused by: org.neo4j.kernel.impl.store.UnderlyingStorageException: Unable to open store file: /scratch/neo4j/neo4j-community-2.3.4/data/graph.db/neostore.nodestore.db.labels
at org.neo4j.kernel.impl.store.CommonAbstractStore.checkStorage(CommonAbstractStore.java:161)
at org.neo4j.kernel.impl.store.CommonAbstractStore.initialise(CommonAbstractStore.java:102)
at org.neo4j.kernel.impl.store.NeoStores.initialize(NeoStores.java:429)
at org.neo4j.kernel.impl.store.NeoStores$StoreType$1.open(NeoStores.java:85)
at org.neo4j.kernel.impl.store.NeoStores$StoreType$1.open(NeoStores.java:78)
at org.neo4j.kernel.impl.store.NeoStores.openStore(NeoStores.java:422)
at org.neo4j.kernel.impl.store.NeoStores.getOrCreateStore(NeoStores.java:467)
at org.neo4j.kernel.impl.store.NeoStores.<init>(NeoStores.java:365)
at org.neo4j.kernel.impl.store.StoreFactory.openNeoStores(StoreFactory.java:174)
at org.neo4j.kernel.impl.store.StoreFactory.openAllNeoStores(StoreFactory.java:138)
at org.neo4j.kernel.NeoStoreDataSource.buildNeoStore(NeoStoreDataSource.java:661)
at org.neo4j.kernel.NeoStoreDataSource.start(NeoStoreDataSource.java:535)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:452)
... 15 more
Caused by: org.neo4j.io.pagecache.impl.FileLockException: Externally locked: /scratch/neo4j/neo4j-community-2.3.4/data/graph.db/neostore.nodestore.db.labels
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.acquireLock(SingleFilePageSwapper.java:209)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapper.<init>(SingleFilePageSwapper.java:160)
at org.neo4j.io.pagecache.impl.SingleFilePageSwapperFactory.createPageSwapper(SingleFilePageSwapperFactory.java:64)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.<init>(MuninnPagedFile.java:95)
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.map(MuninnPageCache.java:357)
at org.neo4j.kernel.impl.store.CommonAbstractStore.checkStorage(CommonAbstractStore.java:140)
... 27 more

Slave lost error in pyspark

I'm using Spark 1.6.
I'm running a simple df.show(2) call and got errors like:
An error occurred while calling o143.showString.
: org.apache.spark.SparkException: Job aborted due to stage
failure: Task 6 in stage 6.0 failed 4 times, most recent failure:
Lost task 6.3 in stage 6.0
ExecutorLostFailure (executor 2 exited caused by one of the
running tasks) Reason: Slave lost
When I did a persist, I saw through the Spark UI that the shuffle write memory was very high; it took a long time and still returned errors.
Through some searching, I found this might be an out-of-memory problem.
Following this link: out of memory error Java
I did a repartition up to 1000 partitions, but it still didn't help much.
I set up the SparkConf as:
conf = (SparkConf()
        .set("spark.driver.maxResultSize", "150g")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer"))
My server-side memory could be up to 200 GB.
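For what it's worth, the same settings can also be passed at submit time, together with the executor memory knobs that usually matter for shuffle-heavy jobs (a sketch; the values and the script name my_job.py are hypothetical and must fit your cluster):
  spark-submit \
    --conf spark.driver.maxResultSize=150g \
    --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
    --executor-memory 16g \
    --conf spark.yarn.executor.memoryOverhead=4096 \
    my_job.py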
Do you have any good ideas, or can you point me to related links? PySpark would be most helpful.
Here is the error log from YARN:
Application application_1477088172315_0118 failed 2 times due to
AM Container for appattempt_1477088172315_0118_000006 exited
with exitCode: 10
For more detailed output, check application tracking page: Then,
click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1477088172315_0118_06_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
at org.apache.hadoop.util.Shell.run(Shell.java:479)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application.
Here is the error info from notebook:
Py4JJavaError: An error occurred while calling o71.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 15.0 failed 4 times, most recent failure: Lost task 1.3 in stage 15.0 (): ExecutorLostFailure (executor 26 exited caused by one of the running tasks) Reason: Slave lost
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:212)
at org.apache.spark.sql.execution.Limit.executeCollect(basicOperators.scala:165)
at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1499)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1505)
at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1375)
at org.apache.spark.sql.DataFrame$$anonfun$head$1.apply(DataFrame.scala:1374)
at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099)
at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1374)
at org.apache.spark.sql.DataFrame.take(DataFrame.scala:1456)
at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
Thank you

ERROR starting neo4j service on ubuntu

I installed the neo4j 2.0.0-M06 version on my Ubuntu PC. Its service worked fine, and I could use the new web browser perfectly.
Then, I used the sample Java project (https://github.com/neo4j/neo4j/blob/2.0.0-M06/community/embedded-examples/src/main/java/org/neo4j/examples/EmbeddedNeo4jWithIndexing.java) to connect the embedded way to the DB and add some nodes. (BTW, I'm sure I stopped the neo4j service before launching the Java application.)
I changed the number of nodes added by the program to 100,000, and the application crashed on exceeding the heap size (GC overhead limit).
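(As an aside, the GC overhead limit error in the embedded application can usually be avoided by giving the JVM a larger heap; a hypothetical invocation, with a made-up classpath:)
  # re-run the embedded example with a 4 GB heap (classpath is hypothetical)
  java -Xmx4g -cp "target/classes:lib/*" org.neo4j.examples.EmbeddedNeo4jWithIndexing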
Now, when trying to launch Neo4j, I get a startup error:
2013-11-01 09:53:13.806+0000 DEBUG [API] Failed to start Neo Server on port [7474]
2013-11-01 10:00:52.865+0000 INFO [API] Setting startup timeout to: 120000ms based on -1
2013-11-01 10:00:52.998+0000 DEBUG [API]
org.neo4j.server.ServerStartupException: Starting Neo4j Server failed: org/neo4j/helpers/Settings
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:193) ~[neo4j-server-2.0.0-M06.jar:2.0.0-M06]
at org.neo4j.server.Bootstrapper.start(Bootstrapper.java:87) [neo4j-server-2.0.0-M06.jar:2.0.0-M06]
at org.neo4j.server.Bootstrapper.main(Bootstrapper.java:50) [neo4j-server-2.0.0-M06.jar:2.0.0-M06]
Caused by: java.lang.NoClassDefFoundError: org/neo4j/helpers/Settings
at org.neo4j.shell.ShellSettings.<clinit>(ShellSettings.java:42) ~[neo4j-shell-2.0.0-M06.jar:2.0.0-M06]
at org.neo4j.server.database.CommunityDatabase.getDbTuningPropertiesWithServerDefaults(CommunityDatabase.java:106) ~[neo4j-server-2.0.0-M06.jar:2.0.0-M06]
at org.neo4j.server.enterprise.EnterpriseDatabase.start(EnterpriseDatabase.java:89) ~[neo4j-server-enterprise-2.0.0-M06.jar:2.0.0-M06]
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:141) ~[neo4j-server-2.0.0-M06.jar:2.0.0-M06]
... 2 common frames omitted
Caused by: java.lang.ClassNotFoundException: org.neo4j.helpers.Settings
at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[na:1.7.0_45]
at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[na:1.7.0_45]
at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_45]
at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[na:1.7.0_45]
at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[na:1.7.0_45]
at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[na:1.7.0_45]
... 6 common frames omitted
2013-11-01 10:00:53.000+0000 DEBUG [API] Failed to start Neo Server on port [7474]
I found the problem was with the jar files. Unfortunately, after solving the jar file problem, I had to reinstall Neo4j for the service to work again.

Cassandra streaming

I am using Cassandra version 1.1.4 for bulk loading data: I am creating SSTables and then trying to stream them to the Cassandra cluster. I am trying this only on a single node (my own system) and thus have kept the replication factor at 1. But streaming fails every time I try, and the following error message occurs:
progress: [/10.66.92.92 0/6 (0)] [total: 0 - 0MB/s (avg: 0MB/s)]Streaming session to /10.66.92.92 failed
Exception in thread "Streaming to /10.66.92.92:1" java.lang.RuntimeException: java.io.EOFException
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(Unknown Source)
at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:181)
at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
progress: [/10.66.92.92 1/6 (16)] [total: 16 - 0MB/s (avg: 0MB/s)]Streaming to the following hosts failed:
[/10.66.92.92]
When I close the server and restart it, I can see my tables inserted, but I don't want this. I want streaming to happen successfully and avoid restarting the server again and again. Cassandra folks, please help! I am using a Windows 7 environment.
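(For context, the SSTable streaming step described here is usually driven by the sstableloader tool shipped with Cassandra; a rough sketch with a hypothetical path, since the expected directory layout and options differ between versions and should be checked against the 1.1.x docs:)
  # in 1.1, the last directory name is expected to match the keyspace
  sstableloader /path/to/generated/MyKeyspace/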
