InfluxDB 2.x - shard X removed during backup

When backing up a bucket in InfluxDB 2.3.0 with the influx backup command, the backup skips several shards, so their data is not backed up. How can I fix the shards so that all of the data is backed up?
Here is what happens when I run the command:
bash-5.1# influx backup /influxdb2-backup/backup --bucket MyBucket -t MySeCrEtToKeN!
2023/01/26 23:33:24 INFO: Downloading metadata snapshot
2023/01/26 23:33:24 INFO: Backing up TSM for shard 1
2023/01/26 23:33:24 WARN: Shard 1 removed during backup
2023/01/26 23:33:24 INFO: Backing up TSM for shard 2
2023/01/26 23:33:24 WARN: Shard 2 removed during backup
2023/01/26 23:33:24 INFO: Backing up TSM for shard 3
2023/01/26 23:33:24 WARN: Shard 3 removed during backup
2023/01/26 23:33:24 INFO: Backing up TSM for shard 4
2023/01/26 23:33:24 WARN: Shard 4 removed during backup
2023/01/26 23:33:24 INFO: Backing up TSM for shard 5
2023/01/26 23:33:24 WARN: Shard 5 removed during backup
2023/01/26 23:33:24 INFO: Backing up TSM for shard 6
2023/01/26 23:33:24 WARN: Shard 6 removed during backup
2023/01/26 23:33:24 INFO: Backing up TSM for shard 7
2023/01/26 23:33:50 INFO: Backing up TSM for shard 8
Of course, when restoring from that backup with missing shards, there is data missing.
This is running InfluxDB 2.3.0:
InfluxDB v2.3.0+SNAPSHOT.090f681737 Server: 090f681 Frontend: a2bd1f3
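To gauge how much data the skipped shards actually cost, the archive can be restored into a scratch bucket and the point counts compared against the live bucket. A minimal sketch, assuming the same token (quoted here so the shell does not expand the "!") and a made-up scratch bucket name:
# restore into a scratch bucket, then count what actually made it into the archive
influx restore /influxdb2-backup/backup --bucket MyBucket --new-bucket MyBucket_restored -t 'MySeCrEtToKeN!'
influx query 'from(bucket: "MyBucket_restored") |> range(start: 0) |> count()' -t 'MySeCrEtToKeN!'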

Related

Crunchy Postgres log messages

I am new to Crunchy Postgres, and I recently installed a Crunchy PostgresCluster in an OpenShift environment. After the cluster started, I had a look at the container log messages.
I also checked the script startup.sh, which is called during PostgreSQL startup. In this shell script, there are some lines (beginning with echo_info) used for log messages, for example:
echo_info "Starting PostgreSQL.."
But I could not see this message in the logs.
NAME READY STATUS RESTARTS AGE ROLE
demo-instance1-4vtv-0 5/5 Running 0 7h36m replica
demo-instance1-dg7j-0 5/5 Running 0 7h36m replica
demo-instance1-f696-0 5/5 Running 0 7h36m master
:~$ oc logs -f demo-instance1-f696-0 -c database | more
2022-07-08 07:42:31,064 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-07-08 07:42:31,068 INFO: Lock owner: None; I am demo-instance1-f696-0
2022-07-08 07:42:31,383 INFO: trying to bootstrap a new cluster
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf-8".
The default text search configuration will be set to "english".
Data page checksums are enabled.
fixing permissions on existing directory /pgdata/pg14 ... ok
creating directory /pgdata/pg14_wal ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
/usr/pgsql-14/bin/pg_ctl -D /pgdata/pg14 -l logfile start
2022-07-08 07:42:35.953 UTC [92] LOG: pgaudit extension initialized
2022-07-08 07:42:35,955 INFO: postmaster pid=92
/tmp/postgres:5432 - no response
2022-07-08 07:42:35.998 UTC [92] LOG: redirecting log output to logging collector process
2022-07-08 07:42:35.998 UTC [92] HINT: Future log output will appear in directory "log".
/tmp/postgres:5432 - accepting connections
/tmp/postgres:5432 - accepting connections
2022-07-08 07:42:37,038 INFO: establishing a new patroni connection to the postgres cluster
2022-07-08 07:42:37,334 INFO: running post_bootstrap
2022-07-08 07:42:37,754 INFO: initialized a new cluster
2022-07-08 07:42:38,039 INFO: no action. I am (demo-instance1-f696-0), the leader with the lock
2022-07-08 07:42:48,504 INFO: no action. I am (demo-instance1-f696-0), the leader with the lock
2022-07-08 07:42:58,476 INFO: no action. I am (demo-instance1-f696-0), the leader with the lock
2022-07-08 07:43:08,497 INFO: no action. I am (demo-instance1-f696-0), the leader with the lock
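Since the pod reports five containers (5/5), the echo_info output from startup.sh may simply be going to a container other than the database one shown above. A hedged way to check, assuming the standard kubectl-style flags also work with oc (the names of the other containers are not listed in the question):
# list the containers in the pod, then search every container's log for the startup.sh message
oc get pod demo-instance1-f696-0 -o jsonpath='{.spec.containers[*].name}'
oc logs demo-instance1-f696-0 --all-containers=true | grep -i "Starting PostgreSQL"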

XGBoost model failed due to Closing connection _sid_af1c at exit

We use an XGBoost model for regression prediction, and we tune its hyperparameters with a grid search.
We run this model on a 90 GB H2O cluster. The process had been running for over 1.2 years, but it suddenly stopped with "Closing connection _sid_af1c at exit".
The training data set has 800,000 rows; because of this error we reduced it to 500,000, but the same error occurred.
ntrees - 300, 400
depth - 8, 10
variables - 382
I have attached the H2O memory log and our application error log. Could you please help us fix this issue?
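For context, a minimal Python sketch of the grid search described above, run against the existing cluster; the file path, response column, and seed are placeholders I made up, and only the ntrees/depth values come from the question:
import h2o
from h2o.estimators.xgboost import H2OXGBoostEstimator
from h2o.grid.grid_search import H2OGridSearch

# connect to the running 2-node cluster (same endpoint as in the application log)
h2o.connect(url="http://localhost:54321")

train = h2o.import_file("train.csv")   # placeholder path; ~500k-800k rows, 382 predictors
y = "target"                           # placeholder response column
x = [c for c in train.columns if c != y]

# grid over the hyperparameter values mentioned above
hyper_params = {"ntrees": [300, 400], "max_depth": [8, 10]}
grid = H2OGridSearch(model=H2OXGBoostEstimator(seed=42), hyper_params=hyper_params)
grid.train(x=x, y=y, training_frame=train)
print(grid.get_grid(sort_by="rmse", decreasing=False))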
----------------------------------------H2o Log [Start]----------------------
We start H2O as a 2-node cluster, but the H2O log was created on only one node.
INFO water.default: ----- H2O started -----
INFO water.default: Build git branch: master
INFO water.default: Build git hash: 0588cccd72a7dc1274a83c30c4ae4161b92d9911
INFO water.default: Build git describe: jenkins-master-5236-4-g0588ccc
INFO water.default: Build project version: 3.33.0.5237
INFO water.default: Build age: 1 year, 3 months and 17 days
INFO water.default: Built by: 'jenkins'
INFO water.default: Built on: '2020-10-27 19:21:29'
WARN water.default:
WARN water.default: *** Your H2O version is too old! Please download the latest version from http://h2o.ai/download/ ***
WARN water.default:
INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone]
INFO water.default: Processed H2O arguments: [-flatfile, /usr/local/h2o/flatfile.txt, -port, 54321]
INFO water.default: Java availableProcessors: 20
INFO water.default: Java heap totalMemory: 962.5 MB
INFO water.default: Java heap maxMemory: 42.67 GB
INFO water.default: Java version: Java 1.8.0_262 (from Oracle Corporation)
INFO water.default: JVM launch parameters: [-Xmx48g]
INFO water.default: JVM process id: 83043@masterb.xxxxx.com
INFO water.default: OS version: Linux 3.10.0-1127.10.1.el7.x86_64 (amd64)
INFO water.default: Machine physical memory: 62.74 GB
INFO water.default: Machine locale: en_US
INFO water.default: X-h2o-cluster-id: 1644769990156
INFO water.default: User name: 'root'
INFO water.default: IPv6 stack selected: false
INFO water.default: Possible IP Address: ens192 (ens192), xxxxxxxxxxxxxxxxxxxx
INFO water.default: Possible IP Address: ens192 (ens192), xxxxxxxxxxx
INFO water.default: Possible IP Address: lo (lo), 0:0:0:0:0:0:0:1%lo
INFO water.default: Possible IP Address: lo (lo), 127.0.0.1
INFO water.default: H2O node running in unencrypted mode.
INFO water.default: Internal communication uses port: 54322
INFO water.default: Listening for HTTP and REST traffic on http://xxxxxxxxxxxx:54321/
INFO water.default: H2O cloud name: 'root' on /xxxxxxxxxxxx:54321, discovery address /xxxxxxxxxxxx:57653
INFO water.default: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
INFO water.default: 1. Open a terminal and run 'ssh -L 55555:localhost:54321 root@xxxxxxxxxxxx'
INFO water.default: 2. Point your browser to http://localhost:55555
INFO water.default: Log dir: '/tmp/h2o-root/h2ologs'
INFO water.default: Cur dir: '/usr/local/h2o/h2o-3.33.0.5237'
INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized
INFO water.default: HDFS subsystem successfully initialized
INFO water.default: S3 subsystem successfully initialized
INFO water.default: GCS subsystem successfully initialized
INFO water.default: Flow dir: '/root/h2oflows'
INFO water.default: Cloud of size 1 formed [/xxxxxxxxxxxx:54321]
INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
INFO water.default: XGBoost extension initialized
INFO water.default: KrbStandalone extension initialized
INFO water.default: Registered 2 core extensions in: 2632ms
INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone]
INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_gpu
INFO hex.tree.xgboost.XGBoostExtension: XGBoost supported backends: [WITH_GPU, WITH_OMP]
INFO water.default: Registered: 217 REST APIs in: 353ms
INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4]
INFO water.default: Registered: 291 schemas in 112ms
INFO water.default: H2O started in 4612ms
INFO water.default:
INFO water.default: Open H2O Flow in your web browser: http://xxxxxxxxxxxx:54321
INFO water.default:
INFO water.default: Cloud of size 2 formed [mastera.xxxxxxxxxxxx.com/xxxxxxxxxxxx:54321, masterb.xxxxxxxxxxxx.com/xxxxxxxxxxxx:54321]
INFO water.default: Locking cloud to new members, because water.rapids.Session$1
INFO hex.tree.xgboost.task.XGBoostUpdater: Initial Booster created, size=448
ERROR water.default: Got IO error when sending a batch of bytes:
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:51)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:468)
at water.H2ONode$SmallMessagesSendThread.sendBuffer(H2ONode.java:605)
at water.H2ONode$SmallMessagesSendThread.run(H2ONode.java:588)
----------------------------------------H2o Log [End]--------------------------------
----------------------------------------Application Log [Start]----------------------
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
Warning: Your H2O cluster version is too old (1 year, 3 months and 17 days)! Please download and install the latest version from http://h2o.ai/download/
-------------------------- ------------------------------------------------------------------
H2O_cluster_uptime: 19 mins 49 secs
H2O_cluster_timezone: Asia/Colombo
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.33.0.5237
H2O_cluster_version_age: 1 year, 3 months and 17 days !!!
H2O_cluster_name: root
H2O_cluster_total_nodes: 2
H2O_cluster_free_memory: 84.1 Gb
H2O_cluster_total_cores: 40
H2O_cluster_allowed_cores: 40
H2O_cluster_status: locked, healthy
H2O_connection_url: http://localhost:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.0 final
-------------------------- ------------------------------------------------------------------
-------------------------- ------------------------------------------------------------------
H2O_cluster_uptime: 19 mins 49 secs
H2O_cluster_timezone: Asia/Colombo
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.33.0.5237
H2O_cluster_version_age: 1 year, 3 months and 17 days !!!
H2O_cluster_name: root
H2O_cluster_total_nodes: 2
H2O_cluster_free_memory: 84.1 Gb
H2O_cluster_total_cores: 40
H2O_cluster_allowed_cores: 40
H2O_cluster_status: locked, healthy
H2O_connection_url: http://localhost:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.0 final
-------------------------- ------------------------------------------------------------------
release memory here...
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
Warning: Your H2O cluster version is too old (1 year, 3 months and 17 days)! Please download and install the latest version from http://h2o.ai/download/
-------------------------- ------------------------------------------------------------------
H2O_cluster_uptime: 19 mins 49 secs
H2O_cluster_timezone: Asia/Colombo
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.33.0.5237
H2O_cluster_version_age: 1 year, 3 months and 17 days !!!
H2O_cluster_name: root
H2O_cluster_total_nodes: 2
H2O_cluster_free_memory: 84.1 Gb
H2O_cluster_total_cores: 40
H2O_cluster_allowed_cores: 40
H2O_cluster_status: locked, healthy
H2O_connection_url: http://localhost:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.0 final
-------------------------- ------------------------------------------------------------------
Parse progress: |█████████████████████████████████████████████████████████| 100%
xgboost Grid Build progress: |████████Closing connection _sid_af1c at exit
H2O session _sid_af1c was not closed properly.
Closing connection _sid_9313 at exit
H2O session _sid_9313 was not closed properly.
----------------------------------------Application Log [End]----------------------
This typically means one of the nodes crashed. It can happen for many different reasons; running out of memory is the most common one.
I see your machine has about 64 GB of physical memory and H2O is getting 48 GB of it. XGBoost runs in native memory, not in the JVM memory. For XGBoost we recommend splitting the physical memory 50-50 between H2O and XGBoost.
You are also running a development version of H2O (3.33); I suggest upgrading to the latest stable release.
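As a rough illustration of that split on the ~63 GB machines from the log, each node could be relaunched with roughly half the RAM for the JVM, leaving the rest for XGBoost's native allocations; the jar path is an assumption on my part, while -flatfile and -port match the logged arguments:
# ~28 GB heap instead of 48 GB; run on each of the two nodes
java -Xmx28g -jar /usr/local/h2o/h2o-3.33.0.5237/h2o.jar -flatfile /usr/local/h2o/flatfile.txt -port 54321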

com.orientechnologies.orient.core.exception.OStorageException: File with name internal.pcl does not exist in storage metadata

I run OrientDB with Docker. The server was powered off; after restarting the server, starting OrientDB fails with the error shown below. However, I can find the file in the 'metadata' folder.
2020-04-05 11:37:39:282 INFO Detected limit of amount of simultaneously open files is 1048576, limit of open files for disk cache will be set to 523776 [ONative]
2020-04-05 11:37:39:298 INFO Loading configuration from: /orientdb/config/orientdb-server-config.xml... [OServerConfigurationLoaderXml]
2020-04-05 11:37:39:587 INFO OrientDB Server v3.0.22 - Veloce (build 8afc634a2ea9c351898ae57dee8d739ff4851252, branch 3.0.x) is starting up... [OServer]
2020-04-05 11:37:39:619 INFO OrientDB config DISKCACHE=800MB [orientechnologies]
2020-04-05 11:37:39:697 INFO System is started under an effective user : `root` [OEngineLocalPaginated]
2020-04-05 11:37:39:697 INFO Allocation of 12160 pages. [OEngineLocalPaginated]
2020-04-05 11:37:39:880 INFO WAL maximum segment size is set to 32,241 MB [OrientDBDistributed]
2020-04-05 11:37:39:882 INFO Databases directory: /orientdb/databases [OServer]
2020-04-05 11:37:39:987 INFO Direct IO for WAL located in /orientdb/databases/OSystem is allowed with block size 4096 bytes. [OCASDiskWriteAheadLog]
2020-04-05 11:37:39:988 INFO Page size for WAL located in /orientdb/databases/OSystem is set to 4096 bytes. [OCASDiskWriteAheadLog]
2020-04-05 11:37:40:587 INFO Storage 'plocal:/orientdb/databases/OSystem' is opened under OrientDB distribution : 3.0.22 - Veloce (build 8afc634a2ea9c351898ae57dee8d739ff4851252, branch 3.0.x) [OLocalPaginatedStorage]
2020-04-05 11:37:41:013 INFO Listening binary connections on 0.0.0.0:2424 (protocol v.37, socket=default) [OServerNetworkListener]
2020-04-05 11:37:41:019 INFO Listening http connections on 0.0.0.0:2480 (protocol v.10, socket=default) [OServerNetworkListener]
2020-04-05 11:37:41:024 INFO Found ORIENTDB_ROOT_PASSWORD variable, using this value as root's password [OServer]
2020-04-05 11:37:41:764 INFO Installing dynamic plugin 'orientdb-studio-3.0.22.zip'... [OServerPluginManager]
2020-04-05 11:37:41:767 INFO Installing dynamic plugin 'orientdb-etl-3.0.22.jar'... [OServerPluginManager]
2020-04-05 11:37:41:773 INFO ODefaultPasswordAuthenticator is active [ODefaultPasswordAuthenticator]
2020-04-05 11:37:41:776 INFO OServerConfigAuthenticator is active [OServerConfigAuthenticator]
2020-04-05 11:37:41:777 INFO OSystemUserAuthenticator is active [OSystemUserAuthenticator]
2020-04-05 11:37:41:778 INFO [OVariableParser.resolveVariables] Property not found: distributed [orientechnologies]
2020-04-05 11:37:41:785 WARNI Authenticated clients can execute any kind of code into the server by using the following allowed languages: [sql] [OServerSideScriptInterpreter]
2020-04-05 11:37:41:788 INFO OrientDB Studio available at http://172.18.0.2:2480/studio/index.html [OServer]
2020-04-05 11:37:41:788 INFO OrientDB Server is active v3.0.22 - Veloce (build 8afc634a2ea9c351898ae57dee8d739ff4851252, branch 3.0.x). [OServer]
2020-04-05 11:38:16:755 INFO Direct IO for WAL located in /orientdb/databases/metadata is allowed with block size 4096 bytes. [OCASDiskWriteAheadLog]
2020-04-05 11:38:16:755 INFO Page size for WAL located in /orientdb/databases/metadata is set to 4096 bytes. [OCASDiskWriteAheadLog]
Exception `6EA2C93F` in storage `plocal:/orientdb/databases/metadata`: 3.0.22 - Veloce (build 8afc634a2ea9c351898ae57dee8d739ff4851252, branch 3.0.x)
com.orientechnologies.orient.core.exception.OStorageException: File with name internal.pcl does not exist in storage metadata
at com.orientechnologies.orient.core.storage.cache.local.OWOWCache.loadFile(OWOWCache.java:732)
at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.openFile(ODurableComponent.java:188)
at com.orientechnologies.orient.core.storage.cluster.v0.OPaginatedClusterV0.open(OPaginatedClusterV0.java:192)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.openClusters(OAbstractPaginatedStorage.java:532)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.open(OAbstractPaginatedStorage.java:391)
at com.orientechnologies.orient.core.db.OrientDBEmbedded.openNoAuthenticate(OrientDBEmbedded.java:236)
at com.orientechnologies.orient.core.db.OrientDBEmbedded.openNoAuthenticate(OrientDBEmbedded.java:59)
at com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:936)
at com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:908)
at com.orientechnologies.orient.server.network.protocol.http.command.OServerCommandAuthenticatedDbAbstract.authenticate(OServerCommandAuthenticatedDbAbstract.java:162)
at com.orientechnologies.orient.server.network.protocol.http.command.OServerCommandAuthenticatedDbAbstract.beforeExecute(OServerCommandAuthenticatedDbAbstract.java:122)
at com.orientechnologies.orient.server.network.protocol.http.command.get.OServerCommandGetConnect.beforeExecute(OServerCommandGetConnect.java:50)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.service(ONetworkProtocolHttpAbstract.java:225)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.execute(ONetworkProtocolHttpAbstract.java:707)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:69)

Neo4j import runtime issue

I am trying to import a design with 230M nodes, 300M relationships, and 1.5B properties overall.
It takes nearly 5.5 hours to import the design, and I am wondering how to improve the runtime.
If I analyze the messages from the Neo4j import, it spends quite a bit of time in the Relationship --> Relationship stage. I am not sure what it does there.
Any suggestions to improve the load time?
My run command is:
/home/neo4j-enterprise-3.3.2/bin/neo4j-admin import --nodes "./instances.*" --relationships:SIGN_OF "./sign.*" --relationships:RIN_OF "./rin.*" --id-type=INTEGER --database graph.db
My initial and max heap size is set to 32G.
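For what it's worth, neo4j-admin reads its JVM heap from the HEAP_SIZE environment variable, and the importer in 3.x releases also accepts --max-memory for its off-heap buffers, so an explicit invocation might look like the sketch below (the 80% value is only an example, not something taken from this run):
# assumed way to pin the importer's heap and off-heap budget explicitly
HEAP_SIZE=32g /home/neo4j-enterprise-3.3.2/bin/neo4j-admin import --max-memory=80% --nodes "./instances.*" --relationships:SIGN_OF "./sign.*" --relationships:RIN_OF "./rin.*" --id-type=INTEGER --database graph.db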
Instance header:
NodeId:ID,:IGNORE,:Label,Din:int,LGit:int,RGit:int,Signed:int,Cens,Type,:IGNORE,Val:Float
Signal Header:
:IGNORE,:IGNORE,LGit:int,RGit:int,Signed:int,:IGNORE,:START_ID,:END_ID
Rin header
:START_ID,:END_ID,:IGNORE
Neo4j import output
Available resources:
Total machine memory: 504.70 GB
Free machine memory: 88.71 GB
Max heap memory : 26.67 GB
Processors: 16
Configured max memory: 55.84 GB
Nodes, started 2018-04-09 17:52:36.028+0000
[>:|NODE:1.75 GB--------------|PROPERTIES(3)=====|LABEL |*v:87.95 MB/s(4)=====================] 234M ∆ 819K
Done in 4m 13s 984ms
Prepare node index, started 2018-04-09 17:56:50.351+0000
[*DETECT:2.62 GB------------------------------------------------------------------------------] 234M ∆71.2M30000
Done in 33s 546ms
Relationships, started 2018-04-09 17:57:23.935+0000
[>||PREPARE-----------------------------------|||*v:20.96 MB/s(16)============================] 303M ∆ 256K
Done in 7m 7s 922ms
Node Degrees, started 2018-04-09 18:04:37.914+0000
[*>(16)=============================================================================|CALCULATE] 303M ∆1.97M
Done in 1m 30s 566ms
Relationship --> Relationship 1-2/2, started 2018-04-09 18:06:08.951+0000
[*>------------------------------------------------------------------------------------------|] 303M ∆ 144K
Done in 2h 8m 4s 36ms
RelationshipGroup 1-2/2, started 2018-04-09 20:14:13.059+0000
[>:4.44 MB/s----------|*v:2.22 MB/s(2)========================================================] 186K ∆9.81K
Done in 2s 105ms
Node --> Relationship, started 2018-04-09 20:14:15.178+0000
[*>------------------------------------------------------------------------------------------|] 234M ∆76.4K
Done in 27m 53s 408ms
Relationship --> Relationship 1-2/2, started 2018-04-09 20:42:08.654+0000
[*>------------------------------------------------------------------------------------------|] 303M ∆36.0K
Done in 2h 33m 24s 201ms
Count groups, started 2018-04-09 23:15:33.152+0000
[*>(16)=======================================================================================] 186K ∆59.8K
Done in 3s 898ms
Gather, started 2018-04-09 23:15:41.513+0000
[>(6)===|*CACHE-------------------------------------------------------------------------------] 186K ∆ 186K
Done in 322ms
Write, started 2018-04-09 23:15:41.859+0000
[>:1.30 |*v:1.11 MB/s(16)=====================================================================] 186K ∆21.2K
Done in 4s 161ms
Node --> Group, started 2018-04-09 23:15:46.117+0000
[*FIRST---------------------------------------------------------------------------------------] 148K ∆1.09K
Done in 4m 8s 747ms
Node counts, started 2018-04-09 23:19:55.032+0000
[*>(16)===========================================================|COUNT:1.79 GB--------------] 234M ∆4.63M
Done in 3m 33s 201ms
Relationship counts, started 2018-04-09 23:23:28.254+0000
[*>(16)===================================================================|COUNT--------------] 303M ∆ 450K
Done in 1m 29s 457ms
IMPORT DONE in 5h 32m 23s 509ms.
Imported:
234425118 nodes
303627293 relationships
1496022710 properties
Peak memory usage: 2.69 GB

Hazelcast memory is continuously increasing

I have a Hazelcast cluster with two machines.
The only object in the cluster is a map. Analysing the log files, I noticed that the health monitor starts to report a slow increase in memory consumption even though no new entries are being added to the map (see the sample log entries below).
Any ideas of what may be causing the memory increase?
2015-09-16 10:45:49 INFO HealthMonitor:? - [10.11.173.129]:5903 [dev] [3.2.1] memory.used=97.6M, memory.free=30.4M, memory.total=128.0M, memory.max=128.0M, memory.used/total=76.27%, memory.used/max=76.27%, load.process=0.00%, load.system=1.00%, load.systemAverage=3.00%, thread.count=96, thread.peakCount=107, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=1, operations.running.size=0, proxy.count=2, clientEndpoint.count=0, connection.active.count=2, connection.count=2
2015-09-16 10:46:02 INFO InternalPartitionService:? - [10.11.173.129]:5903 [dev] [3.2.1] Remaining migration tasks in queue = 51
2015-09-16 10:46:12 DEBUG TeleavisoIvrLoader:71 - Checking for new files...
2015-09-16 10:46:13 INFO InternalPartitionService:? - [10.11.173.129]:5903 [dev] [3.2.1] All migration tasks has been completed, queues are empty.
2015-09-16 10:46:19 INFO HealthMonitor:? - [10.11.173.129]:5903 [dev] [3.2.1] memory.used=103.9M, memory.free=24.1M, memory.total=128.0M, memory.max=128.0M, memory.used/total=81.21%, memory.used/max=81.21%, load.process=0.00%, load.system=1.00%, load.systemAverage=2.00%, thread.count=73, thread.peakCount=107, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=2, clientEndpoint.count=0, connection.active.count=2, connection.count=2
2015-09-16 10:46:49 INFO HealthMonitor:? - [10.11.173.129]:5903 [dev] [3.2.1] memory.used=105.1M, memory.free=22.9M, memory.total=128.0M, memory.max=128.0M, memory.used/total=82.11%, memory.used/max=82.11%, load.process=0.00%, load.system=1.00%, load.systemAverage=1.00%, thread.count=73, thread.peakCount=107, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=2, clientEndpoint.count=0, connection.active.count=2, connection.count=2
