ArangoDB container reaches memory limit and crashes when filtering on 'path' during graph traversal - Docker

My Environment
ArangoDB Version: 3.6.2
Storage Engine: RocksDB
Deployment Mode: Single Server
Deployment Strategy: Manual Start in Docker
Infrastructure: Own
Operating System: Linux version 4.4.0-154-generic (gcc version 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10) )
Total RAM in your machine: 4GB
Disks in use: HDD
Used Package: Docker-Official Docker library
My Problem:
I have a graph with 60k nodes and 4×60k edges. Whenever I use the path variable ('p') in a FILTER or RETURN, the memory limit is reached, the ArangoDB container crashes, and it gets restarted. However, if I don't use the path and filter or return only the vertex ('v') or edge ('e'), the query executes and produces the expected result. This issue is seen in version 3.6.2.
In ArangoDB 3.1.18 this issue is not seen and everything works fine.
Sample Query:
FOR v, e, p IN 6 OUTBOUND "root_node" GRAPH "my_graph_db"
  FILTER (
    LENGTH(p.edges) == 6 &&
    LIKE(p.edges[3]._from, "Data_level_3%", true) &&
    (LIKE(p.edges[3]._to, "Data_level_4%") || LIKE(p.edges[3]._to, "Data_4%")) &&
    ...................................................................
  )
  LIMIT 0, 10
  RETURN {
    result: MERGE(
      {data: v},
      {parent_id: p.edges[5]._id}
    ),
    .................
  }
Expected result:
The ArangoDB container should not reach its memory limit and crash. 'path' attributes need to be accessible while running queries.
Please refer to https://github.com/arangodb/arangodb/issues/11277
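Not part of the original report, but a possible mitigation while the path-based filter is investigated: the AQL cursor API accepts a per-query memoryLimit (in bytes), so an over-sized query is aborted instead of the container being killed. A minimal sketch, assuming ArangoDB listens on localhost:8529 with root/password credentials and using a trimmed-down version of the query above:
```python
# Sketch only: run the traversal with a per-query memory limit so ArangoDB
# aborts the query instead of exhausting the container's memory.
# Host, credentials, and the simplified query are illustrative placeholders.
import requests

AQL = """
FOR v, e, p IN 6 OUTBOUND "root_node" GRAPH "my_graph_db"
  FILTER LENGTH(p.edges) == 6
  LIMIT 0, 10
  RETURN { data: v, parent_id: p.edges[5]._id }
"""

resp = requests.post(
    "http://localhost:8529/_db/_system/_api/cursor",
    auth=("root", "password"),
    json={
        "query": AQL,
        "batchSize": 10,
        "memoryLimit": 512 * 1024 * 1024,  # ~512 MB cap; tune to the container limit
    },
)
resp.raise_for_status()
print(resp.json()["result"])
```
This does not reduce the memory the traversal needs, but it should turn the container crash into an ordinary query error that can be caught.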

Related

Xen Linux + bare-metal Cortex-A53

I am trying to run Linux (Buildroot) plus a bare-metal app on Xen.
Xen boots fine and Buildroot runs OK on the first CPU.
My problem is the bare-metal application.
The bare-metal app is generated by Vitis (my A53 is in the FPGA).
It is the simplest example application: just a hello world that then exits.
I am able to boot this app directly via JTAG, but when I try to run it under Xen the app just hangs and doesn't print anything.
My configuration:
FILE : xen.cfg
name = "test"
kernel = "/opt/baremetal_app.bin"
memory = 8
vcpus = 1
cpus = [1]
iomem = [ "0xff010,1" ] # UART1
Xen command line:
console=dtuart dtuart=serial0 dom0_mem=1G bootscrub=0 maxcpus=1 timer_slop=0
Dom0 command line:
console=hvc0 earlycon=xen earlyprintk=xen maxcpus=1 clk_ignore_unused root=/dev/mmcblk0p2
Vitis is configured to include the hypervisor_guest option in the binary.
I already tried changing psu_ddr_0_MEM_0 to 0x40000000, len 100000.
When I run "xl create xen.cfg", the system prints "Parsing config from xen.cfg" and returns. I can see my app "test" in "xl list".
UART1 doesn't print anything, and neither does "xl console test".
According to "xl top", my app "test" seems to use 100% of the CPU.
Any idea?

Spark executor sends result to a random port though all the ports are explicitly set up

I am trying to run a Spark job with PySpark through a Jupyter notebook running in Docker. The workers are located on separate machines in the same network. I am performing a take operation on an RDD:
data.take(number_of_elements)
When number_of_elements is 2000, everything works fine. When it is 20000, an exception occurs. From my point of view it breaks when the size of the result exceeds 2 GB (or so it seems to me). The idea about 2 GB comes from the fact that Spark can send results smaller than 2 GB in one block, and when the result is bigger than 2 GB another mechanism kicks in and something breaks there (see here). Here is the exception from the executor log:
19/11/05 10:27:14 INFO CodeGenerator: Code generated in 205.7623 ms
19/11/05 10:27:40 INFO PythonRunner: Times: total = 25421, boot = 3, init = 1751, finish = 23667
19/11/05 10:27:42 INFO MemoryStore: Block taskresult_4 stored as bytes in memory (estimated size 927.7 MB, free 6.4 GB)
19/11/05 10:27:42 INFO Executor: Finished task 0.0 in stage 3.0 (TID 4). 972788748 bytes result sent via BlockManager)
19/11/05 10:27:49 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1585998572000, chunkIndex=0}, buffer=org.apache.spark.storage.BlockManagerManagedBuffer@4399ad49} to /10.0.0.9:56222; closing connection
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.spark.util.io.ChunkedByteBufferFileRegion.transferTo(ChunkedByteBufferFileRegion.scala:64)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:121)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:355)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:224)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:382)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:362)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:901)
at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1321)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.ChannelDuplexHandler.flush(ChannelDuplexHandler.java:117)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:776)
at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:768)
at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:749)
at io.netty.channel.DefaultChannelPipeline.flush(DefaultChannelPipeline.java:983)
at io.netty.channel.AbstractChannel.flush(AbstractChannel.java:248)
at io.netty.channel.nio.AbstractNioByteChannel$1.run(AbstractNioByteChannel.java:284)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
As we can see from the log, the executor tries to send the result to 10.0.0.9:56222. It fails because that port is not opened in Docker Compose. 10.0.0.9 is the IP address of the master node, but port 56222 is random, even though I explicitly set up all the ports I could find in the documentation to disable random port selection:
spark = SparkSession.builder\
    .master('spark://spark.cyber.com:7077')\
    .appName('My App')\
    .config('spark.task.maxFailures', '16')\
    .config('spark.driver.port', '20002')\
    .config('spark.driver.host', 'spark.cyber.com')\
    .config('spark.driver.bindAddress', '0.0.0.0')\
    .config('spark.blockManager.port', '6060')\
    .config('spark.driver.blockManager.port', '6060')\
    .config('spark.shuffle.service.port', '7070')\
    .config('spark.driver.maxResultSize', '14g')\
    .getOrCreate()
I mapped these ports with Docker Compose:
version: "3"
services:
jupyter:
image: jupyter/pyspark-notebook:latest
ports:
- "4040-4050:4040-4050"
- "6060:6060"
- "7070:7070"
- "8888:8888"
- "20000-20010:20000-20010"
You should probably configure your Spark driver memory to match your Docker container's memory settings.
I added
.config('spark.driver.memory', '14g')
as @ML_TN proposed, and everything works now.
From my point of view it is strange that the memory setting affects the ports that Spark uses.
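For completeness, a sketch of the session builder from the question with the driver-memory setting from the fix folded in; the host names and ports are the ones already shown above, and the 14g value assumes the Docker container actually has that much memory available:
```python
from pyspark.sql import SparkSession

# Builder from the question, plus spark.driver.memory as proposed in the answer.
spark = SparkSession.builder\
    .master('spark://spark.cyber.com:7077')\
    .appName('My App')\
    .config('spark.driver.memory', '14g')\
    .config('spark.driver.maxResultSize', '14g')\
    .config('spark.task.maxFailures', '16')\
    .config('spark.driver.port', '20002')\
    .config('spark.driver.host', 'spark.cyber.com')\
    .config('spark.driver.bindAddress', '0.0.0.0')\
    .config('spark.blockManager.port', '6060')\
    .config('spark.driver.blockManager.port', '6060')\
    .config('spark.shuffle.service.port', '7070')\
    .getOrCreate()

# data.take(20000)  # the take() from the question should now fit in driver memory
```
Note that spark.driver.memory generally only takes effect if it is set before the driver JVM is launched, i.e. before the first session is created in the notebook kernel.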

MySQL Cluster using Docker: Error 708 'No more attribute metadata records (increase MaxNoOfAttributes)'

I'm setting up a MySQL Cluster using Docker. I have 1 management node, 2 data nodes, and 2 SQL nodes. When I create a database on one SQL node, it gets replicated to the other SQL node, which is perfectly fine.
The problem is that when I import an SQL file containing many tables into one SQL node, I encounter the error 'No more attribute metadata records (increase MaxNoOfAttributes)'. I tried increasing MaxNoOfAttributes to its maximum (4294967039) and MaxNoOfTables to its maximum (20320), restarted the management node container, and tried again, but I'm still getting the same error. Here's my config.ini file:
[ndbd default]
NoOfReplicas=2
DataMemory=5G
IndexMemory=64M
MaxNoOfTables = 20320
MaxNoOfAttributes = 4294967039
MaxNoOfOrderedIndexes=5242
[mysqld default]
[ndb_mgmd default]
[tcp default]
[ndb_mgmd]
NodeId=2
hostname=180.168.0.2
[ndbd]
NodeId=3
hostname=180.168.0.3
DataDir= /var/lib/mysql-cluster
[ndbd]
NodeId=4
HostName=180.168.0.4
DataDir=/var/lib/mysql-cluster
[mysqld]
NodeId=5
hostname=180.168.0.10
[mysqld]
NodeId=6
hostname=180.168.0.11
The SQL file contains more than 90 tables.
I've been searching for quite a while now and I can't seem to find a working solution. Any help would be greatly appreciated.
root@swrcmsdbm:/# /usr/sbin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini --initial
MySQL Cluster Management Server mysql-5.7.32 ndb-7.6.16
root@swrcmsdbm:/# usr/bin/ndb_config -q MaxNoOfAttributes
2560000 2560000

Caffe-powered and GPU-enabled Microsoft Azure VM

I'm trying to build a VM for model training in Azure. I found the Data Science Virtual Machine for Linux (Ubuntu), which seems to be a suitable candidate.
Unfortunately, when I spun up the VM and installed the Caffe prerequisites, I wasn't able to run the tests. I'm getting the following error on make runtest (make all and make test completed without errors):
NVIDIA: no NVIDIA devices found
Cuda number of devices: 0
Setting to use device 0
Current device id: 0
Current device name:
Note: Randomizing tests' orders with a seed of 97204 .
[==========] Running 2041 tests from 267 test cases.
[----------] Global test environment set-up.
[----------] 11 tests from AdaDeltaSolverTest/3, where TypeParam = caffe::GPUDevice<double>
[ RUN ] AdaDeltaSolverTest/3.TestAdaDeltaLeastSquaresUpdateWithHalfMomentum
NVIDIA: no NVIDIA devices found
E0715 02:24:32.097311 59355 common.cpp:114] Cannot create Cublas handle. Cublas won't be available.
NVIDIA: no NVIDIA devices found
E0715 02:24:32.103780 59355 common.cpp:121] Cannot create Curand generator. Curand won't be available.
F0715 02:24:32.103914 59355 test_gradient_based_solver.cpp:80] Check failed: error == cudaSuccess (30 vs. 0) unknown error
*** Check failure stack trace: ***
@ 0x7f77a463f5cd google::LogMessage::Fail()
@ 0x7f77a4641433 google::LogMessage::SendToLog()
@ 0x7f77a463f15b google::LogMessage::Flush()
@ 0x7f77a4641e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7115e3 caffe::GradientBasedSolverTest<>::TestLeastSquaresUpdate()
@ 0x7122af caffe::AdaDeltaSolverTest_TestAdaDeltaLeastSquaresUpdateWithHalfMomentum_Test<>::TestBody()
@ 0x8e6023 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x8df63a testing::Test::Run()
@ 0x8df788 testing::TestInfo::Run()
@ 0x8df865 testing::TestCase::Run()
@ 0x8e0b3f testing::internal::UnitTestImpl::RunAllTests()
@ 0x8e0e63 testing::UnitTest::Run()
@ 0x466ecd main
@ 0x7f77a111c830 __libc_start_main
@ 0x46e589 _start
@ (nil) (unknown)
Makefile:532: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)
Is it possible to spin up a virtual machine in Azure suitable for GPU-enabled machine learning using Caffe?
All the details about the VM are here.
The Data Science Virtual Machine (DSVM) for Ubuntu already has Caffe installed in /opt/caffe. To use it on a GPU, create a VM with a K80 GPU by choosing one of the NC sizes. (Be sure to choose HDD as the storage type, or the NC sizes will not appear.) Caffe will then be available out of the box.
Also note that PyCaffe is available. At a terminal:
source activate root
Python will then have PyCaffe available.
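As a quick sanity check (not part of the original answer), you can confirm from Python that PyCaffe actually sees the GPU; the device index 0 below assumes a single K80 on the NC-size VM:
```python
# Sketch: verify that Caffe can use the GPU on the DSVM.
# Assumes PyCaffe is importable (e.g. after `source activate root`) and one GPU at index 0.
import caffe

caffe.set_device(0)    # select the first GPU
caffe.set_mode_gpu()   # switch Caffe to GPU mode; errors here indicate no visible NVIDIA device
print("Caffe GPU mode enabled on device 0")
```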

Change character set on Microsoft R Server 9.0.1

Q: How do you change/update the character set on Microsoft R Server?
Issue: I am trying to read a CSV that is delimited with '§', but R Server is not able to interpret the '§' character when I work remotely. The same goes for other characters like 'ø', 'æ' and 'å'. When I work locally it's not an issue.
For example:
This works fine:
> x <- '§'
> x
[1] "§"
But when I log in remotely to the server, the following happens:
REMOTE> x <- '§'
REMOTE> x
[1] "?"
Setup: I am running Microsoft R Server 9.0.1 on Windows Server 2012 R2
Detailed sessionInfo:
REMOTE> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)

locale:
[1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252
[2] LC_CTYPE=Norwegian (Bokmål)_Norway.1252
[3] LC_MONETARY=Norwegian (Bokmål)_Norway.1252
[4] LC_NUMERIC=C
[5] LC_TIME=Norwegian (Bokmål)_Norway.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RevoUtilsMath_10.0.0 RevoUtils_10.0.2     RevoMods_10.0.0
[4] RevoScaleR_9.0.1     lattice_0.20-34      rpart_4.1-10

loaded via a namespace (and not attached):
[1] R6_2.2.0               tools_3.3.2            CompatibilityAPI_1.1.0
[4] codetools_0.2-15       grid_3.3.2             iterators_1.0.8
[7] foreach_1.4.3          mrupdate_1.0.0         jsonlite_1.1
In addition to installing version 9.1 of Microsoft R Server, I also had to make the following change for the server to work correctly with remote login:
Stop the service 'RServe9.0.0.0', go to C:\Program Files\Microsoft\R Server\R_SERVER\o16n\RServe\RScripts\source.R on the compute nodes, and change
```
#add more here if necessary......
```
to
```
#add more here if necessary......
options(encoding = "UTF-8")
```
Then start the service again, and you should be able to use §.
Thanks to Microsoft for providing this fix.
This is a known bug that has been patched in Microsoft R Server 9.1; please upgrade to solve the issue.
