I wrote a script to fetch some data from InfluxDB. It worked well, but one day the script started reporting this error:
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)'), IncompleteRead(0 bytes read)
I then restarted InfluxDB with service influxdb restart,
but after the restart I can no longer query the last 4 weeks of data in any of my databases.
How can I restore that data?
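For reference, a minimal sketch of this kind of fetch, assuming InfluxDB 1.x's HTTP /query endpoint; the database and measurement names are placeholders, not the ones from the real script:

import requests

# Minimal sketch: query InfluxDB 1.x over HTTP (placeholder database "mydb"
# and measurement "cpu"). ChunkedEncodingError surfaces here when the server
# closes the chunked response before it has been fully read.
def fetch_last_4_weeks():
    try:
        resp = requests.get(
            "http://localhost:8086/query",
            params={"db": "mydb", "q": "SELECT * FROM cpu WHERE time > now() - 4w"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()
    except requests.exceptions.ChunkedEncodingError as exc:
        print("Incomplete response from InfluxDB:", exc)
        return None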
I have a host running a Spark master along with 3 Spark workers, all in Docker containers. I have another host acting as the Spark driver, reading data from the first host.
I am able to successfully retrieve data from the first host as long as the data returned is tiny (<6000 rows),
but it fails when I try to read large blocks (100k+ rows).
I checked the executor logs. When the reads succeed, I see the following log messages:
19/07/23 21:54:17 INFO CassandraConnector: Connected to Cassandra cluster: DataMonitor
19/07/23 21:54:17 INFO Executor: Finished task 0.0 in stage 1.0 (TID 4). 1014673 bytes result sent to driver
19/07/23 21:54:24 INFO CassandraConnector: Disconnected from Cassandra cluster: DataMonitor
But when the reads fail, I see the following:
19/07/23 22:21:55 INFO CassandraConnector: Connected to Cassandra cluster: DataMonitor
19/07/23 22:22:03 INFO MemoryStore: Block taskresult_13 stored as bytes in memory (estimated size 119.2 MB, free 2.4 GB)
19/07/23 22:22:03 INFO Executor: Finished task 0.3 in stage 4.0 (TID 13). 124969484 bytes result sent via BlockManager)
19/07/23 22:22:10 INFO CassandraConnector: Disconnected from Cassandra cluster: DataMonitor
It looks like when the result is large enough, it gets "sent via BlockManager",
but when it's small enough, it gets "sent to driver".
So how do I get every result sent directly to the driver?
Each Executor runs tasks and sends the result of the task back to the driver.
If a task result is small, the Executor sends it directly along with the task status. But if the result is big, that is, if
taskResultSize > conf.getSizeAsBytes("spark.task.maxDirectResultSize", 1L << 20)
or
taskResultSize > conf.get("spark.driver.maxResultSize")
(see the source code), then the Executor stores the result locally via the BlockManager and sends an IndirectTaskResult with the blockId back to the driver.
The driver then uses Netty, via the BlockManager, to download the remote result.
Take a look here.
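If the goal is for results of this size to be sent directly with the task status, one option is to raise that threshold. A minimal sketch, assuming a PySpark session; the property names come from the formula above, while the app name and the concrete sizes are illustrative:

from pyspark.sql import SparkSession

# Sketch: raise the direct-result threshold so a ~120 MB task result is sent
# directly with the task status instead of via the BlockManager.
# Keep spark.task.maxDirectResultSize <= spark.driver.maxResultSize.
spark = (
    SparkSession.builder
    .appName("direct-task-results")                    # hypothetical app name
    .config("spark.task.maxDirectResultSize", "200m")  # default is 1m (1L << 20 bytes)
    .config("spark.driver.maxResultSize", "2g")        # default is 1g
    # Note: very large direct results may also be limited by spark.rpc.message.maxSize.
    .getOrCreate()
)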
If it is not detailed enough, let me know.
We have set 'reporting-disabled = true' in /etc/influxdb/influxdb.conf to prevent InfluxDB from sending usage data. However, when running tcpdump we still see connections from InfluxDB to usage.influxdata.com (IP: 104.131.151.204).
Does anyone know why InfluxDB is still connecting to this server? Is there any way to completely stop InfluxDB from sending data back to their servers?
With no apparent cause, and with nothing in the Neo4j logs, our application is getting this:
2019-01-30 14:15:08,715 WARN com.calenco.core.content3.ContentHandler:177 - Unable to acquire connection from the pool within configured maximum time of 60000ms
org.neo4j.driver.v1.exceptions.ClientException: Unable to acquire connection from the pool within configured maximum time of 60000ms
at org.neo4j.driver.internal.async.pool.ConnectionPoolImpl.processAcquisitionError(ConnectionPoolImpl.java:192)
at org.neo4j.driver.internal.async.pool.ConnectionPoolImpl.lambda$acquire$0(ConnectionPoolImpl.java:89)
at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.neo4j.driver.internal.util.Futures.lambda$asCompletionStage$0(Futures.java:78)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at org.neo4j.driver.internal.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:745)
The Neo4j server is still running and answering requests from both its web browser console and the cypher-shell CLI. Also, restarting our application re-acquires the connection to Neo4j with no issue.
Our application connects to Neo4j once when it starts and keeps that connection open for as long as it runs, opening and closing sessions against that connection as needed to fulfill the incoming requests.
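That pattern, one long-lived driver with a short-lived session per unit of work, looks roughly like the sketch below (shown with the official Python driver purely as an illustration; our application uses the Java driver, and the URI, credentials, and query are placeholders). Each session is closed when its work ends, which returns the underlying connection to the pool:

from neo4j import GraphDatabase

# Placeholder connection details, for illustration only.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

def handle_request(name):
    # One short-lived session per request; the context manager closes it,
    # returning the pooled connection.
    with driver.session() as session:
        result = session.run("MATCH (n {name: $name}) RETURN count(n) AS c", name=name)
        return result.single()["c"]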
This is the second time in less than a month that we have seen the above exception thrown.
Any ideas?
Thanks in advance
I need to calculate my service uptime (e.g. Redis, Memcached):
uptime = successful metric-fetch attempts / total metric-fetch attempts (sampled every 10 seconds over some period).
Can I somehow configure Telegraf to send 0/false if my input (service) is down?
Because right now, if the input service is down, InfluxDB doesn't receive any new metric points from Telegraf (there are only error logs on the Telegraf daemon side).
Daniel Nelson answered here: https://github.com/influxdata/telegraf/issues/4563#issuecomment-413653844
saying that each plugin can add custom metrics to the internal plugin (example: the http_listener input plugin).
I started InfluxDB. The meta server starts on port 8088 and I am seeing a series of [wal] logs. When I try to connect to the server using the influx command, it throws:
Failed to connect to http://localhost:8086
Please check your connection settings and ensure 'influxd' is running.
The server is running in the background. I had been writing data continuously and then restarted the server; after restarting, I am not able to connect. I also tried connecting an hour after the restart, to make sure it was not due to some startup tasks.
What could be the reason for this?
The database had a huge number of series, and it took more than 2 hours for the meta server to come up fully. The HTTP listener only came up after those initial startup tasks finished.