We know that we use Node Guarding to check for the existence of nodes, and of course Heartbeat is also used.
How can we change a system (NMT or M/S) that works in Node Guarding mode to Heartbeat mode?
I'm currently seeing delays of 2-3 seconds on my first requests coming into our APIs.
We've set the min instances to 1 to prevent cold starts, but this delay is still occurring.
If I check the metrics I don't see any startup latencies in the specified timeframe, so I have no insight into what is causing these delays. Tracing gives the following:
The only thing I can change is switching to "CPU is always allocated", but this isn't helping in any way.
Can somebody give more information on this?
As mentioned in the answer:
As per the doc:
Idle instances
As traffic fluctuates, Cloud Run attempts to reduce the chance of cold starts by keeping some idle instances around to handle spikes in traffic. For example, when a container instance has finished handling requests, it might remain idle for a period of time in case another request needs to be handled.
But Cloud Run will terminate unused containers after some time if no requests need to be handled. This means a cold start can still occur. Container instances are scaled as needed, and it will initialize the execution environment completely. While you can keep idle instances permanently available using the min-instance setting, this incurs cost even when the service is not actively serving requests.
So, let’s say you want to minimize both cost and response time latency
during a possible cold start. You don’t want to set a minimum number
of idle instances, but you also know any additional computation needed
upon container startup before it can start listening to requests means
longer load times and latency.
Cloud Run container startup
There are a few tricks you can do to optimize your service for container startup times. The goal here is to minimize the latency that delays a container instance from serving requests. But first, let's review the Cloud Run container startup routine.
When starting the service:
1a. Starting the container
1b. Running the entrypoint command to start your server
1c. Checking for the open service port
You want to tune your service to minimize the time needed for step 1a.
Let’s walk through 3 ways to optimize your service for Cloud Run
response times.
1. Create a leaner service
2. Use a leaner base image
3. Use global variables
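To illustrate the third point: the usual pattern is to create expensive objects (API or database clients) once per container instance, and lazily where possible, so that they neither slow down container startup nor get rebuilt on every request. A minimal Python sketch, assuming a Flask service; build_client is a hypothetical stand-in for whatever costly setup your service performs:

```python
import os
from flask import Flask

app = Flask(__name__)

# Global variable: reused across requests handled by the same container
# instance, so the expensive setup cost is paid at most once per instance.
_client = None


def build_client():
    # Placeholder for whatever costly initialization your service needs
    # (database connection, API client, loading a model, ...).
    return object()


def get_client():
    """Lazily build the expensive client the first time it is needed.

    Keeping this out of module import time shortens container startup,
    and caching the result in a global avoids re-creating it per request.
    """
    global _client
    if _client is None:
        _client = build_client()
    return _client


@app.route("/")
def handler():
    client = get_client()
    return "ok"


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```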
As mentioned in the documentation:
Background activity is anything that happens after your HTTP response
has been delivered. To determine whether there is background activity
in your service that is not readily apparent, check your logs for
anything that is logged after the entry for the HTTP request.
Avoid background activities if CPU is allocated only during request processing
If you need to set your service to allocate CPU only during request
processing, when the Cloud Run service finishes handling a
request, the container instance's access to CPU will be disabled or
severely limited. You should not start background threads or routines
that run outside the scope of the request handlers if you use this
type of CPU allocation. Review your code to make sure all asynchronous
operations finish before you deliver your response.
Running background threads with this kind of CPU allocation can create
unpredictable behavior because any subsequent request to the same
container instance resumes any suspended background activity.
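To illustrate the quoted guidance, here is a minimal Python sketch (assuming a Flask service and a hypothetical send_metrics task) contrasting work started in a background thread after the response with work finished before the response is returned:

```python
import threading
from flask import Flask

app = Flask(__name__)


def send_metrics():
    # Hypothetical post-processing work (logging, metrics, cleanup).
    pass


@app.route("/risky")
def risky():
    # Anti-pattern with request-only CPU allocation: the thread keeps
    # running after the response is sent, when CPU may be throttled.
    threading.Thread(target=send_metrics).start()
    return "done"


@app.route("/safe")
def safe():
    # Preferred: finish all work before returning the response, so it
    # completes while the instance still has CPU allocated.
    send_metrics()
    return "done"
```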
As mentioned in the thread, the reason could be that all the operations you performed happen after the response is sent.
According to the docs, CPU is allocated only during request processing by default, so the only thing you have to change is to enable CPU allocation for background activities ("CPU is always allocated").
You can refer to the documentation for more information on the steps to optimize Cloud Run response times.
You can also have a look at the blog about using Google API Gateway with Cloud Run.
I am reading the DBConnection documentation, and I don't quite understand the following quote:
Our goal is to wait at most :queue_target for a connection.
If all connections checked out during a :queue_interval takes more than
:queue_target, then we double the :queue_target. If checking out
connections take longer than the new target, then we start dropping
messages.
Could you please explain it with examples?
In my app I have a very large operation that is executed by a periodic worker. I would like it to have a timeout of 1 minute, or no timeout at all. Which queue_target and queue_interval should I set to avoid: Elixir.DBConnection.ConnectionError',message => <<"tcp recv: closed (the connection was closed by the pool, possibly due to a timeout or because the pool has been terminated)"
In the regular case I would like my queue timeout to be 5 seconds. How can I achieve this with queue_target and queue_interval?
The timeouts you're referring to are set with the :timeout option in execution functions (e.g. execute/4). :queue_target and :queue_interval only affect the pool's ability to begin new requests (that is, requests to check out connections from the pool), not requests that have already checked out connections and are already being processed.
Keep in mind that all attempts to check out connections during a :queue_interval must take longer than :queue_target for these values to have any effect. Normally you'd test different values and monitor your database's ability to keep up in order to find optimal values for your environment.
Pyramid (v1.5) application served by gunicorn (v19.1.1) behind nginx on a heroic BeagleBone Black "server".
One specific request requires significant I/O and processor time on the server (exporting data from the database, formatting it to xls, and serving it),
which results in a gunicorn worker timeout and a 'Bad gateway' error returned by nginx.
Is there a practical way to handle this per request instead of increasing the global request timeout for all requests?
It is just this one specific request so I'm looking for the quickest and dirtiest solution instead of implementing a correct, asynchronous client notification protocol.
From the docs:
timeout
-t INT, --timeout INT
Default: 30
Workers silent for more than this many seconds are killed and restarted.
Generally set to thirty seconds. Only set this noticeably higher if you’re sure of the repercussions for sync workers. For the non sync workers it just means that the worker process is still communicating and is not tied to the length of time required to handle a single request.
graceful_timeout
--graceful-timeout INT
Default: 30
Timeout for graceful workers restart.
Generally set to thirty seconds. This is the maximum time a worker can keep handling a request after receiving a restart signal; if the time is up, the worker is forcefully killed.
keepalive
--keep-alive INT
Default: 2
The number of seconds to wait for requests on a Keep-Alive connection.
Generally set in the 1-5 seconds range.
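All of the settings quoted above can be placed in a gunicorn.conf.py, which is just a Python file. A minimal sketch with illustrative values rather than recommendations:

```python
# gunicorn.conf.py -- illustrative values only, not recommendations

bind = "127.0.0.1:8000"
workers = 2

# Workers silent for more than this many seconds are killed and restarted.
# With the default sync worker this effectively bounds a single request, so
# the slow export must finish within it; raising it is the quick-and-dirty fix.
timeout = 120

# Per the docs quoted above, for non-sync worker classes the timeout only
# checks that the worker process is still communicating and is not tied to
# the length of a single request, so switching worker class is an alternative
# to a large global timeout (gevent must be installed separately).
# worker_class = "gevent"

# How long a worker may keep serving a request after a restart signal.
graceful_timeout = 30

# Seconds to wait for requests on a Keep-Alive connection.
keepalive = 2
```

Note that nginx applies its own upstream timeouts as well, so a very long export request may still need the proxy timeout raised even once gunicorn stops killing the worker.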
I understand from reading HikariCP's documentation (see below) that idle connections should be retired from a connection pool.
My question is: why and when should an idle database connection be retired from the connection pool?
This is the part of HikariCP documentation that sparked my question:
idleTimeout:
This property controls the maximum amount of time (in milliseconds)
that a connection is allowed to sit idle in the pool. Whether a
connection is retired as idle or not is subject to a maximum variation
of +30 seconds, and average variation of +15 seconds. A connection
will never be retired as idle before this timeout. A value of 0 means
that idle connections are never removed from the pool. Default: 600000
(10 minutes)
Two main reasons:
a) they take up resources on the server (not terribly much, since the connection is idle)
b) sometimes connections time out on their own after periods of inactivity, e.g. because the database server or a firewall in between closes them. You want to either close them before that happens, or run a periodic "ping" SQL statement to make sure they are still alive. Otherwise you'd get an error on the next SQL statement you want to execute.
Let's say I am using IMAP IDLE to monitor changes in a mail folder.
The IMAP spec says that IDLE connections should only stay alive for 30 minutes max, but it is recommended to pick a lower number of minutes - say 20 minutes - then cancel the IDLE and restart it.
I am wondering what would happen if the mail contents changed between the old IDLE being cancelled and the new IDLE being created: an email could potentially be missed. Given that RECENT is a bit vague, this could mean having to fetch a message list after the old IDLE ends and before the new one starts.
But that is almost the same as polling every 20 minutes, and defeats some of the benefit of IDLE.
Alternatively, a new idle session could be started prior to terminating the expiring one.
But in any case, I think this problem has already been solved so here I am asking for recommendations.
Thanks,
Paul
As you know, the purpose of the IMAP IDLE command (RFC 2177) is to make it possible for the server to transmit status updates to the client in real time. In this context, status updates mean untagged IMAP server responses such as EXISTS, RECENT, FETCH or EXPUNGE that are sent when new messages arrive, message status is updated or a message is removed.
However, these IMAP status updates can be returned by any IMAP command, not just the IDLE command - for example, the NOOP command (see RFC 3501 section 6.1.2) can be used to poll for server updates as well (it predates the IDLE command). IDLE only makes it possible to get these updates more efficiently: if you don't use the IDLE command, server updates will simply be sent when the client executes another command (or, in some cases, even when no command is in progress) - see RFC 3501 sections 5.2 and 5.3 for details.
This means that if a message is changed between the IDLE being cancelled and the new IDLE command, the status updates should not be lost, just as they are not lost if you never used IDLE in the first place (and used NOOP every few seconds instead, for example) - they should simply be sent after the new IDLE command is started.
Another approach would be to remember the highest UID of the folder being monitored. Whenever you think there is a chance that you missed an update, do a UID search from just above that UID to the end of the mailbox, i.e. UID SEARCH UID <last_uid+1>:*, as sketched below.
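A minimal sketch of that catch-up step using Python's imaplib; the host, credentials and the last_uid bookkeeping are assumptions for illustration:

```python
import imaplib

HOST = "imap.example.com"      # assumed server and credentials
USER = "paul@example.com"
PASSWORD = "secret"


def fetch_missed_uids(conn, last_uid):
    """Return UIDs greater than last_uid in the currently selected folder."""
    # UID SEARCH with an open-ended range. Note that "N:*" always matches at
    # least the message with the highest UID in the mailbox, even if that UID
    # is <= N, so the results still need to be filtered against last_uid.
    status, data = conn.uid("SEARCH", None, f"UID {last_uid + 1}:*")
    if status != "OK":
        return []
    uids = [int(u) for u in data[0].split()]
    return [u for u in uids if u > last_uid]


conn = imaplib.IMAP4_SSL(HOST)
conn.login(USER, PASSWORD)
conn.select("INBOX")

last_uid = 0  # in practice, persist the highest UID you have processed

# Run this between IDLE sessions, or any time you suspect an update was missed.
for uid in fetch_missed_uids(conn, last_uid):
    # Fetch/process the message here, then remember the new highest UID.
    last_uid = max(last_uid, uid)

conn.logout()
```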