Not able to limit the memory usage with ImageMagick

I am configuring ImageMagick for PSD thumbnail generation using the command [1]. It works perfectly for small PSDs, but when processing a 2GB PSD the server's CPU and memory usage keep climbing to 100% within a few minutes until the process is killed by the server.
The CPU and memory utilization are not constrained by the configuration at /etc/ImageMagick/policy.xml [2]. The ImageMagick log [3] from when this happens is also attached.
Server: CentOS 6.9, 8-core CPU, 32G Memory, 500G Disk
[1] convert one-page.psd -flatten -thumbnail 1280x1280 thumbnail.jpeg
[2]
<policymap>
<policy domain="resource" name="temporary-path" value="/data01/imagemagick-tmp"/>
<policy domain="resource" name="memory" value="2GiB"/>
<policy domain="resource" name="map" value="4GiB"/>
<policy domain="resource" name="disk" value="50GiB"/>
<policy domain="resource" name="thread" value="1"/>
<policy domain="resource" name="throttle" value="5"/>
</policymap>
[3]
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Cache convert[1944]: cache.c/ClonePixelCacheRepository/842/Cache
Memory => Memory</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Resource convert[1944]: resource.c/RelinquishMagickResource/1069/Resource
Memory: 70224B/1.33725GiB/2GiB</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Cache convert[1944]: cache.c/OpenPixelCache/3778/Cache
open One-page.psd[0] (Heap Memory, 418x14x4 93632B)</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDLayer/1493/Coder
reading data for channel 1</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDChannelRLE/1158/Coder
layer data is RLE compressed</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDLayer/1493/Coder
reading data for channel 2</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDChannelRLE/1158/Coder
layer data is RLE compressed</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDLayer/1493/Coder
reading data for channel 3</event>
<event>2019-03-20T18:25:37+08:00 3:45.830 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDChannelRLE/1158/Coder
layer data is RLE compressed</event>
<event>2019-03-20T18:25:37+08:00 3:45.840 14.550u 7.0.8 Coder convert[1944]: psd.c/ApplyPSDLayerOpacity/393/Coder
applying layer opacity 22873</event>
<event>2019-03-20T18:25:37+08:00 3:45.840 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDImage/2374/Coder
reading the precombined layer</event>
<event>2019-03-20T18:25:37+08:00 3:45.840 14.550u 7.0.8 Coder convert[1944]: psd.c/ReadPSDChannelRLE/1158/Coder
layer data is RLE compressed</event>
<event>2019-03-20T18:25:39+08:00 3:47.980 14.660u 7.0.8 Coder convert[1944]: psd.c/ReadPSDChannelRLE/1158/Coder
layer data is RLE compressed</event>
<event>2019-03-20T18:25:41+08:00 3:50.100 14.770u 7.0.8 Coder convert[1944]: psd.c/ReadPSDChannelRLE/1158/Coder
layer data is RLE compressed</event>
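A hedged aside, not part of the original question: two quick cross-checks are often useful here. The same resource caps can be passed per invocation with -limit, and identify -list resource prints the limits the binary has actually resolved, which confirms whether policy.xml is being read at all. Independently of ImageMagick's own accounting, the operating system can also hard-cap a single run; the values below are illustrative only.
identify -list resource
convert -limit memory 2GiB -limit map 4GiB -limit disk 50GiB \
    one-page.psd -flatten -thumbnail 1280x1280 thumbnail.jpeg
# OS-level backstop (illustrative values): cap address space and wall-clock time
ulimit -v $((8 * 1024 * 1024))   # 8 GiB virtual-memory limit for this shell (ulimit -v takes KiB)
timeout 600 convert one-page.psd -flatten -thumbnail 1280x1280 thumbnail.jpeg
If identify -list resource does not show the values from [2], the policy file being edited is probably not the one this ImageMagick build reads.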

Related

Download a video from blob:https url from eventive

I want to download a video from eventive but the tag contains a blob URL (blob:https://watch.eventive.org/**************).
I have been reading answers to similar questions, but they all assume the video has an .m3u8 file extension, which in my case it doesn't.
I have also tried to download that URL directly (removing the blob: prefix) with curl, but I got back HTML that I can't really make sense of, containing a lot of references to JavaScript files.
Any tip would be really helpful.
Edit:
I have done some more testing and managed to find a URL to a manifest file which apparently points to the video data. I tried opening that URL in VLC: it loads some info about the video and even plays it, but the screen is completely black.
I have also tried to download it using ffmpeg, but that gives a bunch of errors.
ffmpeg -i https://eventiveprod-usea.streaming.media.azure.net/02968382-069a-4f02-ab3b-67c0b5a8d5b7/5f427f2d65a569006ebd326c.ism/manifest\(format\=mpd-time-csf\) output.mp4
ffmpeg output
ffmpeg version n4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 10.1.0 (GCC)
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-avisynth --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-nvdec --enable-nvenc --enable-omx --enable-shared --enable-version3
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
[h264 # 0x556c1e280bc0] top block unavailable for requested intra mode
[h264 # 0x556c1e280bc0] error while decoding MB 34 0, bytestream 41478
[h264 # 0x556c1e280bc0] concealing 8160 DC, 8160 AC, 8160 MV errors in I frame
[h264 # 0x556c1e2ce440] top block unavailable for requested intra mode -1
[h264 # 0x556c1e2ce440] error while decoding MB 34 0, bytestream 22093
[h264 # 0x556c1e2ce440] concealing 3600 DC, 3600 AC, 3600 MV errors in I frame
[h264 # 0x556c1e2f9f40] top block unavailable for requested intra mode -1
[h264 # 0x556c1e2f9f40] error while decoding MB 24 0, bytestream 15117
[h264 # 0x556c1e2f9f40] concealing 2040 DC, 2040 AC, 2040 MV errors in I frame
[h264 # 0x556c1e333bc0] top block unavailable for requested intra mode -1
[h264 # 0x556c1e333bc0] error while decoding MB 17 0, bytestream 8716
[h264 # 0x556c1e333bc0] concealing 920 DC, 920 AC, 920 MV errors in I frame
[aac # 0x556c1e3745c0] channel element 2.8 is not allocated
Input #0, dash, from 'https://eventiveprod-usea.streaming.media.azure.net/02968382-069a-4f02-ab3b-67c0b5a8d5b7/5f427f2d65a569006ebd326c.ism/manifest(format=mpd-time-csf)':
Duration: 00:21:18.00, start: 0.066000, bitrate: 0 kb/s
Program 0
Stream #0:0: Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 1317 kb/s, 29.97 fps, 29.97 tbr, 10000k tbn, 59.94 tbc
Metadata:
variant_bitrate : 1770930
id : 1_V_video_1
Stream #0:1: Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 686 kb/s, 29.97 fps, 29.97 tbr, 10000k tbn, 59.94 tbc
Metadata:
variant_bitrate : 904295
id : 1_V_video_2
Stream #0:2: Video: h264 (High) (avc1 / 0x31637661), yuv420p, 960x540 [SAR 1:1 DAR 16:9], 495 kb/s, 29.97 fps, 29.97 tbr, 10000k tbn, 59.94 tbc
Metadata:
variant_bitrate : 572685
id : 1_V_video_3
Stream #0:3: Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x360 [SAR 1:1 DAR 16:9], 308 kb/s, 29.97 fps, 29.97 tbr, 10000k tbn, 59.94 tbc
Metadata:
variant_bitrate : 301036
id : 1_V_video_4
Stream #0:4(en): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s
Metadata:
variant_bitrate : 127999
id : 5_A_aac_eng_2_127999
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Stream #0:4 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[aac # 0x556c1e392dc0] channel element 2.8 is not allocated
Error while decoding stream #0:4: Invalid data found when processing input
[h264 # 0x556c1e3da900] top block unavailable for requested intra mode
[h264 # 0x556c1e3da900] error while decoding MB 34 0, bytestream 41478
[aac # 0x556c1e392dc0] Reserved bit set.
[aac # 0x556c1e392dc0] Number of bands (13) exceeds limit (11).
[h264 # 0x556c1e3da900] concealing 8160 DC, 8160 AC, 8160 MV errors in I frame
Error while decoding stream #0:4: Invalid data found when processing input
[aac # 0x556c1e392dc0] channel element 2.4 is not allocated
Error while decoding stream #0:4: Invalid data found when processing input
[aac # 0x556c1e392dc0] Multiple frames in a packet.
...
[libx264 # 0x5588338c2d00] frame I:1 Avg QP: 4.13 size: 421
[libx264 # 0x5588338c2d00] frame P:32 Avg QP:14.06 size: 1216
[libx264 # 0x5588338c2d00] frame B:96 Avg QP:20.52 size: 306
[libx264 # 0x5588338c2d00] consecutive B-frames: 0.8% 0.0% 0.0% 99.2%
[libx264 # 0x5588338c2d00] mb I I16..4: 100.0% 0.0% 0.0%
[libx264 # 0x5588338c2d00] mb P I16..4: 0.6% 0.6% 0.0% P16..4: 9.4% 0.3% 0.3% 0.0% 0.0% skip:88.9%
[libx264 # 0x5588338c2d00] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 3.7% 0.0% 0.0% direct: 0.0% skip:96.2% L0:53.5% L1:46.3% BI: 0.2%
[libx264 # 0x5588338c2d00] 8x8 transform intra:13.0% inter:86.8%
[libx264 # 0x5588338c2d00] coded y,uvDC,uvAC intra: 1.3% 0.8% 0.2% inter: 0.2% 0.2% 0.0%
[libx264 # 0x5588338c2d00] i16 v,h,dc,p: 98% 0% 2% 0%
[libx264 # 0x5588338c2d00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 33% 2% 64% 0% 0% 0% 0% 0% 0%
[libx264 # 0x5588338c2d00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29% 18% 42% 3% 1% 1% 2% 2% 1%
[libx264 # 0x5588338c2d00] i8c dc,h,v,p: 99% 0% 0% 0%
[libx264 # 0x5588338c2d00] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 # 0x5588338c2d00] ref P L0: 86.7% 1.2% 8.8% 3.3%
[libx264 # 0x5588338c2d00] ref B L0: 62.7% 35.4% 1.9%
[libx264 # 0x5588338c2d00] ref B L1: 96.1% 3.9%
[libx264 # 0x5588338c2d00] kb/s:127.70
Conversion failed!
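A hedged aside, not part of the original post: since ffmpeg reads the manifest but the h264/aac decode is full of errors, one cheap thing to try is a straight stream copy of a single video rendition plus the audio track, which bypasses the decoders entirely. It will not help if the stream is encrypted/DRM-protected, which the black playback in VLC could also indicate.
ffmpeg -i 'https://eventiveprod-usea.streaming.media.azure.net/02968382-069a-4f02-ab3b-67c0b5a8d5b7/5f427f2d65a569006ebd326c.ism/manifest(format=mpd-time-csf)' \
    -map 0:0 -map 0:4 -c copy output.mp4
Here -map 0:0 picks the 1080p rendition and -map 0:4 the AAC track listed in the probe output above.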

ElasticSearch in Docker dies silently and restarts, but why?

I am monitoring a Docker container that runs Elasticsearch 6.2.3.
Every day it dies, and I have not been able to figure out why...
I think it is memory, but where can I find proof of that?
Running top on the (Linux) host right now I get:
Virtual memory = 53 GB
Resident memory = 26.7 GB
docker inspect gives me this:
"Memory": 34225520640,
"CpusetMems": "",
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 34225520640,
"MemorySwappiness": null,
"Name": "memlock",
The JVM params are
/usr/bin/java
-Xms1g
-Xmx1g
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+AlwaysPreTouch
-Xss1m
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djna.nosys=true
-XX:-OmitStackTraceInFastThrow
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=/usr/share/elasticsearch/tmp
-XX:+HeapDumpOnOutOfMemoryError
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-Xloggc:logs/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=32
-XX:GCLogFileSize=64m
-Des.cgroups.hierarchy.override=/
-Xms26112m
-Xmx26112m
-Des.path.home=/usr/share/elasticsearch
-Des.path.conf=/usr/share/elasticsearch/config
-cp
/usr/share/elasticsearch/lib/*
org.elasticsearch.bootstrap.Elasticsearch
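A hedged aside: the -Xms/-Xmx pair appears twice above, and with HotSpot the later values win, so the effective heap is 26112m (about 25.5 GB), consistent with the heap size [25.4gb] line in the log below. With the official image that heap is normally injected via ES_JAVA_OPTS, which appends to the 1g defaults from jvm.options and is most likely where the duplicated pair above comes from, e.g.:
# sketch only; the image tag is real for 6.2.3, the container name is illustrative
docker run -d --name elasticsearch \
    -e "ES_JAVA_OPTS=-Xms26g -Xmx26g" \
    docker.elastic.co/elasticsearch/elasticsearch:6.2.3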
and here is the log
[2019-01-30T07:14:01,278] [app-mesos-orders_api-2019.01.30/6l7Ga1I5T3qhLKYmWjQpRA] update_mapping [doc]
[2019-01-30T07:25:53,489] initializing ...
[2019-01-30T07:25:53,581] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/vg_tobias-lv_tobias)]], net usable_space [126.8gb], net total_space [199.8gb], types [xfs]
[2019-01-30T07:25:53,581] heap size [25.4gb], compressed ordinary object pointers [true]
[2019-01-30T07:26:12,390], node ID [-sJqW_h1TKy9c_Ka08In0A]
[2019-01-30T07:26:12,391] version[6.2.3], pid[1], build[c59ff00/2018-03-13T10:06:29.741383Z], OS[Linux/3.10.0-862.11.6.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_151/25.151-b12]
[2019-01-30T07:26:12,391] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/usr/share/elasticsearch/tmp, -XX:+HeapDumpOnOutOfMemoryError, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.cgroups.hierarchy.override=/, -Xms26112m, -Xmx26112m, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config]
[2019-01-30T07:26:13,008] loaded module [aggs-matrix-stats]
[2019-01-30T07:26:13,008] loaded module [analysis-common]
[2019-01-30T07:26:13,008] loaded module [ingest-common]
[2019-01-30T07:26:13,008] loaded module [lang-expression]
[2019-01-30T07:26:13,008] loaded module [lang-mustache]
[2019-01-30T07:26:13,009] loaded module [lang-painless]
[2019-01-30T07:26:13,009] loaded module [mapper-extras]
[2019-01-30T07:26:13,009] loaded module [parent-join]
[2019-01-30T07:26:13,009] loaded module [percolator]
[2019-01-30T07:26:13,009] loaded module [rank-eval]
[2019-01-30T07:26:13,009] loaded module [reindex]
[2019-01-30T07:26:13,009] loaded module [repository-url]
[2019-01-30T07:26:13,009] loaded module [transport-netty4]
[2019-01-30T07:26:13,009] loaded module [tribe]
[2019-01-30T07:26:13,009] no plugins loaded
[2019-01-30T07:26:19,947] using discovery type [zen]
[2019-01-30T07:26:20,444] initialized
[2019-01-30T07:26:20,444] starting ...
[2019-01-30T07:26:20,600] publish_address {172.16.44.8:9300}, bound_addresses {172.17.0.14:9300}
[2019-01-30T07:26:21,507] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2019-01-30T07:26:24,855] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {elasticsearch-1}{-sJqW_h1TKy9c_Ka08In0A}{iGTgehBjQ3yPRm9nlTLbYw}{172.16.44.8}{172.16.44.8:9300}{rack=rack1}
[2019-01-30T07:26:24,861] new_master {elasticsearch-1}{-sJqW_h1TKy9c_Ka08In0A}{iGTgehBjQ3yPRm9nlTLbYw}{172.16.44.8}{172.16.44.8:9300}{rack=rack1}, reason: apply cluster state (from master [master {elasticsearch-1}{-sJqW_h1TKy9c_Ka08In0A}{iGTgehBjQ3yPRm9nlTLbYw}{172.16.44.8}{172.16.44.8:9300}{rack=rack1} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2019-01-30T07:26:24,881] publish_address {172.16.44.8:9200}, bound_addresses {172.17.0.14:9200}
[2019-01-30T07:26:24,881] started
[2019-01-30T07:26:37,033] recovered [1535] indices into cluster_state
I am thinking that the Docker container runs out of memory and dies silently.
How can I prove this, and what can I do to solve it?
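A hedged sketch of how to get that proof (these are the usual commands, not from the original post): if the kernel OOM-killer terminated the JVM, Docker records it on the container state and the host kernel log shows the kill.
docker inspect -f '{{.State.OOMKilled}} {{.State.ExitCode}} {{.State.FinishedAt}}' <container>
dmesg -T | grep -iE 'out of memory|killed process'
OOMKilled=true together with exit code 137 points at the cgroup limit above; if instead the JVM ran out of heap on its own, the -XX:+HeapDumpOnOutOfMemoryError flag already in the argument list should have left a .hprof file behind.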

Out of memory exception in dataflow with windowing on bounded input

I have created a Dataflow pipeline which takes input from Datastore and applies a transform to convert it to BigQuery TableRow. I attach a timestamp to each element in a transform, then a window of one day is applied to the PCollection. The windowed output is written to a partition in a BigQuery table using Apache Beam's BigQueryIO.
The job fails with an OOM error.
Dataflow Job ID: 2018-03-26_20_45_39-10536011060742036262
workerMachineType used: n1-standard-8
The dataflow pipeline at a high level is:
// Read from datastore
PCollection<Entity> entities =
    pipeline.apply("ReadFromDatastore",
        DatastoreIO.v1().read().withProjectId(options.getProject())
            .withQuery(query).withNamespace(options.getNamespace()));

// Apply processing to convert it to BigQuery TableRow
PCollection<TableRow> tableRow =
    entities.apply("ConvertToTableRow", ParDo.of(new ProcessEntityFn()));

// Apply timestamp to TableRow element, and then apply windowing of one day on that
PCollection<TableRow> tableRowWindow =
    tableRow.apply("tableAddTimestamp", ParDo.of(new ApplyTimestampFn()))
        .apply("tableApplyWindow",
            Window.<TableRow>into(CalendarWindows.days(1)
                .withTimeZone(DateTimeZone.forID(options.getTimeZone()))));

// Write windowed output to BigQuery partitions
tableRowWindow.apply("WriteTableToBQ",
    BigQueryIO.writeTableRows()
        .withSchema(BigqueryHelper.getSchema())
        .to(TableRefPartition.perDay(options.getProject(),
            options.getBigQueryDataset(), options.getTableName()))
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
I am getting the following error log several times:
An OutOfMemoryException occurred. Consider specifying higher memory instances in PipelineOptions.
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Java heap space
at com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:182)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:104)
at com.google.cloud.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:54)
at com.google.cloud.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:37)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:117)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:74)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:113)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:187)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:148)
at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:68)
at com.google.cloud.dataflow.worker.DataflowWorker.executeWork(DataflowWorker.java:330)
at com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:302)
at com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:251)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at org.apache.beam.sdk.io.gcp.bigquery.WriteBundlesToFiles$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:116)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:211)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:436)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:424)
at org.apache.beam.sdk.io.gcp.bigquery.PrepareWrite$1.processElement(PrepareWrite.java:62)
at org.apache.beam.sdk.io.gcp.bigquery.PrepareWrite$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:116)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:211)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:443)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.outputWithTimestamp(SimpleDoFnRunner.java:430)
at com.ittiam.cvml.dataflow.service.LoadIsmDataflow$ApplyTimestampFn.processElement(LoadIsmDataflow.java:526)
at com.ittiam.cvml.dataflow.service.LoadIsmDataflow$ApplyTimestampFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:141)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:211)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:436)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:424)
at com.ittiam.cvml.dataflow.service.LoadIsmDataflow$ProcessEntityFn.processElement(LoadIsmDataflow.java:478)
at com.ittiam.cvml.dataflow.service.LoadIsmDataflow$ProcessEntityFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:211)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:436)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:424)
at org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn.processElement(DatastoreV1.java:919)
at org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read$ReadFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:141)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:211)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:436)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:424)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:122)
at org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:141)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:272)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:211)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:436)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:424)
at org.apache.beam.runners.dataflow.ReshuffleOverrideFactory$ReshuffleWithOnlyTrigger$1.processElement(ReshuffleOverrideFactory.java:84)
at org.apache.beam.runners.dataflow.ReshuffleOverrideFactory$ReshuffleWithOnlyTrigger$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:141)
at com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:324)
at com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:48)
at com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:180)
... 21 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at com.google.api.client.googleapis.media.MediaHttpUploader.setContentAndHeadersOnCurrentRequest(MediaHttpUploader.java:603)
at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:409)
at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:357)
The Dataflow pipeline then fails with the following error:
Workflow failed. Causes: S53:ReadIsmFromDatastore/Reshuffle/Reshuffle/GroupByKey/Read+ReadIsmFromDatastore/Reshuffle/Reshuffle/GroupByKey/GroupByWindow+ReadIsmFromDatastore/Reshuffle/Reshuffle/ExpandIterable+ReadIsmFromDatastore/Reshuffle/Values/Values/Map+ReadIsmFromDatastore/Read+ConvertToTableRow+IsmTableAddTimestamp+TrackerTableAddTimestamp+IsmTableApplyWindow/Window.Assign+TrackerTableApplyWindow/Window.Assign+WriteTrackerTableToBQ/PrepareWrite/ParDo(Anonymous)+WriteTrackerTableToBQ/BatchLoads/rewindowIntoGlobal/Window.Assign+WriteTrackerTableToBQ/BatchLoads/WriteBundlesToFiles+WriteTrackerTableToBQ/BatchLoads/ReifyResults/View.AsIterable/View.CreatePCollectionView/ParDo(ToIsmRecordForGlobalWindow)+WriteTrackerTableToBQ/BatchLoads/GroupByDestination/Reify+WriteTrackerTableToBQ/BatchLoads/GroupByDestination/Write+WriteIsmTableToBQ/PrepareWrite/ParDo(Anonymous)+WriteIsmTableToBQ/BatchLoads/rewindowIntoGlobal/Window.Assign+WriteIsmTableToBQ/BatchLoads/WriteBundlesToFiles+WriteIsmTableToBQ/BatchLoads/ReifyResults/View.AsIterable/View.CreatePCollectionView/ParDo(ToIsmRecordForGlobalWindow)+WriteIsmTableToBQ/BatchLoads/GroupByDestination/Reify+WriteIsmTableToBQ/BatchLoads/GroupByDestination/Write failed., A work item was attempted 4 times without success. Each time the worker eventually lost contact with the service.
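A hedged aside, not part of the original post: the OOM message above explicitly suggests higher-memory instances, and for a batch Dataflow job the machine type is just a launch-time pipeline option. n1-highmem-8 has 52 GB of RAM versus 30 GB on the n1-standard-8 used here, so each worker JVM gets a much larger heap.
# pipeline.jar, <your-project> and <your-bucket> are placeholders, not taken from the original post
java -jar pipeline.jar \
    --runner=DataflowRunner \
    --project=<your-project> \
    --tempLocation=gs://<your-bucket>/tmp \
    --workerMachineType=n1-highmem-8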

Neo4j bulk import “neo4j-admin import” OutOfMemoryError: Java heap space and OutOfMemoryError: GC overhead limit exceeded

My single machine's available resources are:
Total machine memory: 2.00 TB
Free machine memory: 1.81 TB
Max heap memory : 910.50 MB
Processors: 192
Configured max memory: 1.63 TB
My file1.csv file size is 600GB
Number of entries in my csv file = 3 000 000 000
Header structure
Attempt 1
item_col1:ID(label),item_col2,item_col3:IGNORE,item_col4:IGNORE,item_col5,item_col6,item_col7,item_col8:IGNORE
Attempt 2
item_col1:ID,item_col2,item_col3:IGNORE,item_col4:IGNORE,item_col5,item_col6,item_col7,item_col8:IGNORE
Attempt 3
item_col1:ID,item_col2,item_col3:IGNORE,item_col4:IGNORE,item_col5:LABEL,item_col6,item_col7,item_col8:IGNORE
Neo4j version: 3.2.1
Tried with Configuration combination 1
cat ../conf/neo4j.conf | grep "memory"
dbms.memory.heap.initial_size=16000m
dbms.memory.heap.max_size=16000m
dbms.memory.pagecache.size=40g
Tried with Configuration combination 2
cat ../conf/neo4j.conf | grep "memory"
dbms.memory.heap.initial_size=900m
dbms.memory.heap.max_size=900m
dbms.memory.pagecache.size=4g
Tried with Configuration combination 3
dbms.memory.heap.initial_size=1000m
dbms.memory.heap.max_size=1000m
dbms.memory.pagecache.size=1g
Tried with Configuration combination 4
dbms.memory.heap.initial_size=10g
dbms.memory.heap.max_size=10g
dbms.memory.pagecache.size=10g
Tried with Configuration combination 5 (settings commented out; no output)
# dbms.memory.heap.initial_size=10g
# dbms.memory.heap.max_size=10g
# dbms.memory.pagecache.size=10g
Commands tried
kaushik@machine1:/neo4j/import$ cl
kaushik@machine1:/neo4j/import$ rm -r ../data/databases/
kaushik@machine1:/neo4j/import$ mkdir ../data/databases/
kaushik@machine1:/neo4j/import$ cat ../conf/neo4j.conf | grep active
dbms.active_database=graph.db
kaushik@machine1:/neo4j/import$ ../bin/neo4j-admin import --mode csv --database social.db --nodes head.csv,file1.csv
Neo4j version: 3.2.1
Importing the contents of these files into /neo4j/data/databases/social.db:
Nodes:
/neo4j/import/head.csv
/neo4j/import/file1.csv
Available resources:
Total machine memory: 2.00 TB
Free machine memory: 1.79 TB
Max heap memory : 910.50 MB
Processors: 192
Configured max memory: 1.61 TB
Error 1
Nodes, started 2017-07-14 05:32:51.736+0000
[*NODE:7.63 MB---------------------------------------------------|PROPERTIE|LABEL SCAN--------] 0 ? 0
Done in 40s 439ms
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.neo4j.csv.reader.Extractors$StringArrayExtractor.extract0(Extractors.java:739)
at org.neo4j.csv.reader.Extractors$ArrayExtractor.extract(Extractors.java:680)
at org.neo4j.csv.reader.BufferedCharSeeker.tryExtract(BufferedCharSeeker.java:239)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.deserializeNextFromSource(InputEntityDeserializer.java:138)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:77)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:41)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer.lambda$new$0(ParallelInputEntityDeserializer.java:106)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer$$Lambda$150/1372918763.apply(Unknown Source)
at org.neo4j.unsafe.impl.batchimport
Error 2
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.neo4j.csv.reader.Extractors$StringArrayExtractor.extract0(Extractors.java:739)
at org.neo4j.csv.reader.Extractors$ArrayExtractor.extract(Extractors.java:680)
at org.neo4j.csv.reader.BufferedCharSeeker.tryExtract(BufferedCharSeeker.java:239)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.deserializeNextFromSource(InputEntityDeserializer.java:138)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:77)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:41)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer.lambda$new$0(ParallelInputEntityDeserializer.java:106)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer$$Lambda$150/1372918763.apply(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing.lambda$submit$0(TicketedProcessing.java:110)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing$$Lambda$154/1949503798.run(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:237)
Error 3
Nodes, started 2017-07-14 05:39:48.602+0000
[NODE:7.63 MB-----------------------------------------------|PROPER|*LABEL SCAN---------------] 0 ? 0
Done in 42s 140ms
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at org.neo4j.csv.reader.Extractors$StringExtractor.extract0(Extractors.java:328)
at org.neo4j.csv.reader.Extractors$AbstractSingleValueExtractor.extract(Extractors.java:287)
at org.neo4j.csv.reader.BufferedCharSeeker.tryExtract(BufferedCharSeeker.java:239)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.deserializeNextFromSource(InputEntityDeserializer.java:138)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:77)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:41)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer.lambda$new$0(ParallelInputEntityDeserializer.java:106)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer$$Lambda$150/310855317.apply(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing.lambda$submit$0(TicketedProcessing.java:110)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing$$Lambda$154/679112060.run(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:237)
Error 4
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.neo4j.csv.reader.Extractors$StringExtractor.extract0(Extractors.java:328)
at org.neo4j.csv.reader.Extractors$AbstractSingleValueExtractor.extract(Extractors.java:287)
at org.neo4j.csv.reader.BufferedCharSeeker.tryExtract(BufferedCharSeeker.java:239)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.deserializeNextFromSource(InputEntityDeserializer.java:138)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:77)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:41)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer.lambda$new$0(ParallelInputEntityDeserializer.java:106)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer$$Lambda$118/69048864.apply(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing.lambda$submit$0(TicketedProcessing.java:110)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing$$Lambda$122/951451297.run(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:237)
Error 5
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3664)
at java.lang.String.<init>(String.java:207)
at org.neo4j.csv.reader.Extractors$StringExtractor.extract0(Extractors.java:328)
at org.neo4j.csv.reader.Extractors$AbstractSingleValueExtractor.extract(Extractors.java:287)
at org.neo4j.csv.reader.BufferedCharSeeker.tryExtract(BufferedCharSeeker.java:239)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.deserializeNextFromSource(InputEntityDeserializer.java:138)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:77)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:41)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer.lambda$new$0(ParallelInputEntityDeserializer.java:106)
at org.neo4j.unsafe.impl.batchimport.input.csv.ParallelInputEntityDeserializer$$Lambda$118/950986004.apply(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing.lambda$submit$0(TicketedProcessing.java:110)
at org.neo4j.unsafe.impl.batchimport.staging.TicketedProcessing$$Lambda$122/151277029.run(Unknown Source)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:237)
In general, if you could explain Chapter 9 (Performance), section 9.1 (Memory tuning), with an example, it would be helpful for a lot of beginners:
https://neo4j.com/docs/operations-manual/current/performance/
Could you give an example of how to calculate dbms.memory.heap.initial_size, dbms.memory.heap.max_size and dbms.memory.pagecache.size for a sample data set of 500 GB with 3 billion entries of 10 equally sized columns, on a machine with 1 TB of RAM and 100 processors?
Actually the calculation is pretty simple if you're only importing nodes:
3 * 10^9 entries * ~20 bytes per entry / 1024^3 bytes per GiB ≈ 56 GiB
So I would go with a heap size of at least 55 GB.
Can you try that ?
Regards,
Tom
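A hedged addition to the answer above, worth verifying against the Neo4j 3.2 scripts: the dbms.memory.heap.* settings in neo4j.conf configure the database server, not the offline import tool, which would explain why "Max heap memory : 910.50 MB" stays the same across every configuration combination tried. The import JVM's heap is normally raised through the environment of bin/neo4j-admin, along these lines:
export HEAP_SIZE=64g   # assumed to be picked up by the neo4j-admin wrapper script; verify for 3.2.1
../bin/neo4j-admin import --mode csv --database social.db --nodes head.csv,file1.csv
With roughly 56 GiB of estimated node state and 1.8 TB of free RAM, a 64 GB heap for the import run is comfortably within budget.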

Stream h264/aacplus media over udp

Does anyone have any experience in this area?
I've tried so far with ffmpeg (using libx264 and libaacplus) muxing into mpegts over udp, but that mpegts muxer is obviously broken (confirmed from several different sources).
I've also tried with vlc, but it can only encode AAC-LC and not HE-AAC v2 (aacplus).
Anyway, the problem I need to solve is covering several different geographic locations with webcams, and I need UDP so that the incoming streams keep arriving all the time without me having to worry about network ups and downs (UDP will simply continue sending packets once the network is up again). So, do any of you have experience streaming live media using h264 with aacplus over UDP, and if so, can you please give me any links or directions on how to accomplish it?
Thanks a lot in advance.
I'm developing a system which is kind of a DVR that must periodically record an h264 file from a video device and at the same time provide a local preview to allow adjusting the video parameters and camera view. Although I am no expert at all in this field, I had relative success streaming h264 over UDP, so I'll try to share what did and didn't work out for me, all of it based on ffmpeg as the server (no audio in my case).
Initially I had set up my application to simultaneously record the video to a file and feed it to a .ffm file for ffserver to stream over RTP/UDP for the camera preview. The problem with that approach was precisely that when the feeding ffmpeg process stopped to change the video file, the preview would stop and never resume, even though the subsequent ffmpeg process had already started feeding ffserver again. With RTP, ffserver apparently complains that the frame timestamps start back at 0 instead of continuing where they left off. I then realized that if I were able to send the h264 packets over pure UDP I would get exactly the effect you described, with the preview resuming as soon as the next ffmpeg process takes over.
While trying to understand the ffmpeg documentation I also tried the mpegts format, but doing that I was getting MPEG2 video at the player on the other end (this example shows a multicast address, but it also worked for a specific target):
$ ffmpeg -y -f video4linux2 -i /dev/video2 -vcodec libx264 -preset ultrafast /mnt/hd/video.mp4 -an -f mpegts udp://224.124.0.1:5000
ffmpeg version N-35860-g62adc60, Copyright (c) 2000-2011 the FFmpeg developers
built on Dec 16 2011 09:47:41 with gcc 4.5.3
configuration: --prefix=/usr --libdir=/usr/lib --shlibdir=/usr/lib --mandir=/usr/man --disable-debug --enable-shared --disable-static --enab
le-pthreads --enable-libtheora --enable-libvorbis --enable-gpl --enable-version3 --enable-postproc --enable-swscale --enable-avfilter --enable
-libx264 --enable-libvpx --enable-librtmp --disable-indev='v4l,dv1394'
libavutil 51. 32. 0 / 51. 32. 0
libavcodec 53. 46. 0 / 53. 46. 0
libavformat 53. 26. 0 / 53. 26. 0
libavdevice 53. 4. 0 / 53. 4. 0
libavfilter 2. 53. 0 / 2. 53. 0
libswscale 2. 1. 0 / 2. 1. 0
libpostproc 51. 2. 0 / 51. 2. 0
[video4linux2,v4l2 # 0x8a96b00] Estimating duration from bitrate, this may be inaccurate
Input #0, video4linux2,v4l2, from '/dev/video2':
Duration: N/A, start: 1325538250.366878, bitrate: 27620 kb/s
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 320x240, 27620 kb/s, 29.97 tbr, 1000k tbn, 29.97 tbc
[buffer # 0x8a9d8c0] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
[buffer # 0x8a9c860] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
[libx264 # 0x8a97780] using cpu capabilities: MMX2 Cache64
[libx264 # 0x8a97780] profile Constrained Baseline, level 1.3
[libx264 # 0x8a97780] 264 - core 120 - H.264/MPEG-4 AVC codec - Copyleft 2003-2011 - http://www.videolan.org/x264.html - options: cabac=0 ref=
1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11
fast_pskip=1 chroma_qp_offset=0 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=
0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
[mpegts # 0x8a98100] muxrate VBR, pcr every 2 pkts, sdt every 200, pat/pmt every 40 pkts
Output #0, mp4, to '/mnt/hd/video.mp4':
Metadata:
encoder : Lavf53.26.0
Stream #0:0: Video: h264 (![0][0][0] / 0x0021), yuv420p, 320x240, q=-1--1, 30k tbn, 29.97 tbc
Output #1, mpegts, to 'udp://224.124.0.1:5000':
Metadata:
encoder : Lavf53.26.0
Stream #1:0: Video: mpeg2video, yuv420p, 320x240, q=2-31, 200 kb/s, 90k tbn, 29.97 tbc
Stream mapping:
Stream #0:0 -> #0:0 (rawvideo -> libx264)
Stream #0:0 -> #1:0 (rawvideo -> mpeg2video)
Press [q] to stop, [?] for help
On the client PC I was able to watch the video with ffplay, and it was indeed in MPEG2 format:
$ ffplay -f mpegts udp://224.124.0.1:5000
ffplay version N-35860-g62adc60, Copyright (c) 2003-2011 the FFmpeg developers
built on Dec 16 2011 09:47:41 with gcc 4.5.3
configuration: --prefix=/usr --libdir=/usr/lib --shlibdir=/usr/lib --mandir=/usr/man --disable-debug --enable-shared --disable-static --enable-pthreads --enable-libtheora --enable-libvorbis --enable-gpl --enable-version3 --enable-postproc --enable-swscale --enable-avfilter --enable-libx264 --enable-libvpx --enable-librtmp --disable-indev='v4l,dv1394'
libavutil 51. 32. 0 / 51. 32. 0
libavcodec 53. 46. 0 / 53. 46. 0
libavformat 53. 26. 0 / 53. 26. 0
libavdevice 53. 4. 0 / 53. 4. 0
libavfilter 2. 53. 0 / 2. 53. 0
libswscale 2. 1. 0 / 2. 1. 0
libpostproc 51. 2. 0 / 51. 2. 0
[mpegts # 0x80f02e0] Unable to seek back to the start
[mpeg2video # 0x8111a00] mpeg_decode_postinit() failure
Last message repeated 6 times
[mpegts # 0x80f02e0] max_analyze_duration 5000000 reached at 5005000
[mpegts # 0x80f02e0] Estimating duration from bitrate, this may be inaccurate
Input #0, mpegts, from 'udp://224.124.0.1:5000':
Duration: N/A, start: 255.420433, bitrate: 104857 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: FFmpeg
Stream #0:0[0x100]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p, 320x240 [SAR 1:1 DAR 4:3], 104857 kb/s, 30.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
VLC could also play the stream, but only after specifying the --demux ffmpeg switch (thanks to this):
vlc -vv --demux ffmpeg udp://#224.124.0.1:5000
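A hedged aside on the MPEG2 result above: ffmpeg applies encoding options to the output file that follows them, so in the command above -vcodec libx264 only covered /mnt/hd/video.mp4 and the mpegts leg fell back to its default encoder (mpeg2video, exactly what Output #1 reports). Repeating the codec selection for the UDP output should keep that stream h264 as well, at the cost of a second encode (see the note at the very end about the tee muxer in newer builds):
ffmpeg -y -f video4linux2 -i /dev/video2 \
    -vcodec libx264 -preset ultrafast -an /mnt/hd/video.mp4 \
    -vcodec libx264 -preset ultrafast -an -f mpegts udp://224.124.0.1:5000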
Since I also wanted the preview to be in h264 but I already had it encoding the file, I tried to use the copy codec for the UDP streaming, but ffmpeg failed with a segfault (versions included for reference):
ffmpeg -y -f video4linux2 -i /dev/video2 -vcodec libx264 -preset ultrafast /mnt/hd/video.mp4 -an -vcodec copy -f h264 udp://224.124.0.1:5000
ffmpeg version N-35860-g62adc60, Copyright (c) 2000-2011 the FFmpeg developers
built on Dec 16 2011 09:47:41 with gcc 4.5.3
configuration: --prefix=/usr --libdir=/usr/lib --shlibdir=/usr/lib --mandir=/usr/man --disable-debug --enable-shared --disable-static --enab
le-pthreads --enable-libtheora --enable-libvorbis --enable-gpl --enable-version3 --enable-postproc --enable-swscale --enable-avfilter --enable
-libx264 --enable-libvpx --enable-librtmp --disable-indev='v4l,dv1394'
libavutil 51. 32. 0 / 51. 32. 0
libavcodec 53. 46. 0 / 53. 46. 0
libavformat 53. 26. 0 / 53. 26. 0
libavdevice 53. 4. 0 / 53. 4. 0
libavfilter 2. 53. 0 / 2. 53. 0
libswscale 2. 1. 0 / 2. 1. 0
libpostproc 51. 2. 0 / 51. 2. 0
[video4linux2,v4l2 # 0x92c7b00] Estimating duration from bitrate, this may be inaccurate
Input #0, video4linux2,v4l2, from '/dev/video2':
Duration: N/A, start: 1325539132.411691, bitrate: 27620 kb/s
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 320x240, 27620 kb/s, 29.97 tbr, 1000k tbn, 29.97 tbc
[buffer # 0x92ce860] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
[libx264 # 0x92c8780] using cpu capabilities: MMX2 Cache64
[libx264 # 0x92c8780] profile Constrained Baseline, level 1.3
[libx264 # 0x92c8780] 264 - core 120 - H.264/MPEG-4 AVC codec - Copyleft 2003-2011 - http://www.videolan.org/x264.html - options: cabac=0 ref=
1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11
fast_pskip=1 chroma_qp_offset=0 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=
0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
Output #0, mp4, to '/mnt/hd/video.mp4':
Metadata:
encoder : Lavf53.26.0
Stream #0:0: Video: h264 (![0][0][0] / 0x0021), yuv420p, 320x240, q=-1--1, 30k tbn, 29.97 tbc
Output #1, h264, to 'udp://224.124.0.1:5000':
Metadata:
encoder : Lavf53.26.0
Stream #1:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 320x240, q=2-31, 27620 kb/s, 90k tbn, 29.97 tbc
Stream mapping:
Stream #0:0 -> #0:0 (rawvideo -> libx264)
Stream #0:0 -> #1:0 (copy)
Press [q] to stop, [?] for help
Segmentation fault
Although less than ideal, specifying the format h264 for the UDP streaming part led to a second concurrent h264 conversion, but it worked:
$ ffmpeg -y -f video4linux2 -i /dev/video2 -vcodec libx264 -preset ultrafast /mnt/hd/video.mp4 -an -f h264 -preset ultrafast udp://224.124.0.1:5000
ffmpeg version N-35860-g62adc60, Copyright (c) 2000-2011 the FFmpeg developers
built on Dec 16 2011 09:47:41 with gcc 4.5.3
configuration: --prefix=/usr --libdir=/usr/lib --shlibdir=/usr/lib --mandir=/usr/man --disable-debug --enable-shared --disable-static --enab
le-pthreads --enable-libtheora --enable-libvorbis --enable-gpl --enable-version3 --enable-postproc --enable-swscale --enable-avfilter --enable
-libx264 --enable-libvpx --enable-librtmp --disable-indev='v4l,dv1394'
libavutil 51. 32. 0 / 51. 32. 0
libavcodec 53. 46. 0 / 53. 46. 0
libavformat 53. 26. 0 / 53. 26. 0
libavdevice 53. 4. 0 / 53. 4. 0
libavfilter 2. 53. 0 / 2. 53. 0
libswscale 2. 1. 0 / 2. 1. 0
libpostproc 51. 2. 0 / 51. 2. 0
[video4linux2,v4l2 # 0x913ab00] Estimating duration from bitrate, this may be inaccurate
Input #0, video4linux2,v4l2, from '/dev/video2':
Duration: N/A, start: 1325539689.729735, bitrate: 27620 kb/s
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 320x240, 27620 kb/s, 29.97 tbr, 1000k tbn, 29.97 tbc
[buffer # 0x9141840] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
[buffer # 0x913e480] w:320 h:240 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
[libx264 # 0x913b780] using cpu capabilities: MMX2 Cache64
[libx264 # 0x913b780] profile Constrained Baseline, level 1.3
[libx264 # 0x913b780] 264 - core 120 - H.264/MPEG-4 AVC codec - Copyleft 2003-2011 - http://www.videolan.org/x264.html - options: cabac=0 ref=
1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11
fast_pskip=1 chroma_qp_offset=0 threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=
0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=crf mbtree=0 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=0
[libx264 # 0x913c820] using cpu capabilities: MMX2 Cache64
[libx264 # 0x913c820] profile Constrained Baseline, level 1.3
Output #0, mp4, to '/mnt/hd/video.mp4':
Metadata:
encoder : Lavf53.26.0
Stream #0:0: Video: h264 (![0][0][0] / 0x0021), yuv420p, 320x240, q=-1--1, 30k tbn, 29.97 tbc
Output #1, h264, to 'udp://224.124.0.1:5000':
Metadata:
encoder : Lavf53.26.0
Stream #1:0: Video: h264, yuv420p, 320x240, q=-1--1, 90k tbn, 29.97 tbc
Stream mapping:
Stream #0:0 -> #0:0 (rawvideo -> libx264)
Stream #0:0 -> #1:0 (rawvideo -> libx264)
Press [q] to stop, [?] for help
On the client side, ffplay repeatedly complained about some missing information, but after a few seconds it finally showed the video, which was OK but a bit choppy:
$ ffplay -f h264 udp://224.124.0.1:5000
[h264 # 0xa0be740] non-existing PPS referenced
[h264 # 0xa0be740] non-existing PPS 0 referenced
[h264 # 0xa0be740] decode_slice_header error
[h264 # 0xa0be740] no frame!
(...)
[h264 # 0xa0be740] non-existing PPS referenced
[h264 # 0xa0be740] non-existing PPS 0 referenced
[h264 # 0xa0be740] decode_slice_header error
[h264 # 0xa0be740] no frame!
[h264 # 0xa0e72e0] max_analyze_duration 5000000 reached at 5013967
[h264 # 0xa0e72e0] Estimating duration from bitrate, this may be inaccurate
Input #0, h264, from 'udp://224.124.0.1:5000':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: h264 (Constrained Baseline), yuv420p, 320x240, 47.27 fps, 29.97 tbr, 1200k tbn, 59.94 tbc
[h264 # 0xa0be740] Missing reference picture
[h264 # 0xa0be740] decode_slice_header error
[h264 # 0xa0be740] concealing 300 DC, 300 AC, 300 MV errors
[h264 # 0xa0be740] Missing reference picture 0KB sq= 0B f=0/0 0/0
[h264 # 0xa0be740] decode_slice_header error
[h264 # 0xa0be740] mmco: unref short failure
[h264 # 0xa0be740] concealing 300 DC, 300 AC, 300 MV errors
10.78 A-V: 0.000 fd= 0 aq= 0KB vq= 0KB sq= 0B f=0/0
I then tried VLC with the same --demux ffmpeg switch; it also complained about SPS/PPS (I don't know yet what that is about), but in the end it played the video really smoothly:
$ vlc -v --demux ffmpeg udp/h264://#224.124.0.1:5000
VLC media player 1.1.12 The Luggage (revision exported)
Blocked: call to unsetenv("DBUS_ACTIVATION_ADDRESS")
Blocked: call to unsetenv("DBUS_ACTIVATION_BUS_TYPE")
[0x943346c] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
Blocked: call to setlocale(6, "")
Blocked: call to setlocale(6, "")
[0x94cde8c] qt4 interface error: Unable to load extensions module
[0x96720e4] h264 demux error: this doesn't look like a H264 ES stream, continuing anyway
[0x963989c] access_udp access warning: unimplemented query in control
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x967328c] packetizer_h264 demux packetizer warning: waiting for SPS/PPS
[0x94be644] main input error: Invalid PCR value in ES_OUT_SET_(GROUP_)PCR !
[0xb2a325f4] avcodec decoder warning: disabling direct rendering
[0x96dfac4] main video output warning: vlc_object_find_name(postproc) is not safe!
[0x94c2474] signals interface warning: signal 17 overridden (0xb6f31030)
[0x94c2474] signals interface warning: /usr/lib/qt/lib/libQtCore.so.4(?)[(nil)]
[0x96dfac4] main video output warning: late picture skipped (32703 > -4)
I'm still trying to figure out what is missing, but the results are pleasing already. Once VLC is playing, I can stop the feeding ffmpeg: playback pauses and resumes right away when I restart ffmpeg on the other end.
Hope this can be of some help, and please let me know if you have any additional info on using h264 as the chosen format in place of mpegts - I suspect this missing SPS/PPS info might have something to do with it.
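Two hedged follow-ups, to be checked against your own ffmpeg version rather than the 2011 build shown above. First, the tee muxer (added in later ffmpeg releases) lets a single libx264 encode feed both the file and the UDP stream, avoiding the double encode used as a workaround above:
ffmpeg -f video4linux2 -i /dev/video2 -map 0:v -c:v libx264 -preset ultrafast \
    -f tee "/mnt/hd/video.mp4|[f=mpegts]udp://224.124.0.1:5000"
Second, a hedged guess about the SPS/PPS warnings: with the x264 settings shown (keyint=250, scenecut=0) the parameter sets are only sent ahead of each IDR frame, so a player joining mid-stream can wait up to roughly eight seconds at 30 fps before it has them; forcing more frequent keyframes (e.g. -g 30) should shorten that join delay.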
