How to calculate progress rate of DRBD?

WinDRBD shows resync progress only while a sync is running. I'd like to know how far the resync has gone whenever out-of-sync data remains.
drbdsetup status foo --v --s
Through the detail view command, the following contents were obtained.
foo node-id:2 role:Primary suspended:no
write-ordering:flush
volume:1 minor:1 disk:UpToDate backing_dev:\DosDevices\G: quorum:yes
size:524253532 read:7238338 written:5209825 al-writes:589 bm-writes:198 upper-pending:0 lower-pending:10
al-suspended:no blocked:no
Node1 node-id:1 connection:Connected role:Secondary congested:no ap-in-flight:0 rs-in-flight:7168
volume:1 replication:SyncSource peer-disk:Inconsistent done:85.32 resync-suspended:no
received:0 sent:1226764 out-of-sync:210484 pending:6 unacked:10 dbdt1:14.99 eta:14
done:85.32
This part is known as the progress rate.
How was this calculated?

When a resource becomes disconnected, the out-of-sync counter begins to increment on the peer that is currently Primary. When the resource reconnects, the bitmaps (stored in DRBD's metadata) are compared to determine which blocks became out-of-sync while disconnected, and DRBD then resyncs those blocks in the background. Any writes that occur during a background resync are replicated immediately, and if such a write touches a block that is part of the background resync, that block is removed from the resync queue (since it was already updated by the foreground replication).

Related

How to handle timeout in FreeRTOS - wake up task from interrupt before vTaskDelay expires?

Can I wake up task before vTaskDelay expires?
I have code like this:
In the task (hnd_uart_task) code:
transmit_frame();
vTaskDelay(100); // task should wait 100 ticks or be woken up by uart ISR
parse_response();
UART Interrupt:
// if byte was received
BaseType_t xYieldRequired = xTaskResumeFromISR(hnd_uart_task);
portYIELD_FROM_ISR(xYieldRequired);

Instead of using vTaskDelay(), you can use task notifications with timeout.
USART Interrupt:
// if byte was received
BaseType_t xHigherPriorityTaskWoken = pdFALSE;
vTaskNotifyGiveFromISR(hnd_uart_task, &xHigherPriorityTaskWoken);
portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
Task Code:
transmit_frame();
ulTaskNotifyTake(pdTRUE, 100);
parse_response();
ulTaskNotifyTake(pdTRUE, 100) returns when a task notification is received from the ISR, or when the 100-tick timeout elapses.
But as @Jacek Ślimok pointed out, byte-by-byte parsing may not be a good idea. The exact method depends on the protocol used. In general, you set up your DMA or interrupts to fill a reception buffer with incoming bytes. For example, when parsing Modbus frames, you can use the idle-line detection hardware interrupt to give a notification to a task which parses the RX buffer.
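The "return on notification or on timeout" behaviour described above can be tried on a desktop machine with a small POSIX analogue. This is not FreeRTOS code: notifier_t, notifier_give and notifier_take are hypothetical names that stand in for the task notification, vTaskNotifyGiveFromISR and ulTaskNotifyTake(pdTRUE, ...), and the timeout is in milliseconds rather than ticks.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical host-side stand-in for a FreeRTOS task notification. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned        count;   /* notifications given but not yet taken */
} notifier_t;

#define NOTIFIER_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Plays the role of the ISR: record one notification and wake the waiter. */
void notifier_give(notifier_t *n)
{
    pthread_mutex_lock(&n->lock);
    n->count++;
    pthread_cond_signal(&n->cond);
    pthread_mutex_unlock(&n->lock);
}

/* Plays the role of the task: wait at most timeout_ms for a notification.
 * Returns the pending count and clears it, or 0 if the timeout elapsed. */
unsigned notifier_take(notifier_t *n, long timeout_ms)
{
    struct timespec deadline;
    unsigned taken;

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&n->lock);
    while (n->count == 0) {
        /* Loop guards against spurious wakeups; leave only on timeout. */
        if (pthread_cond_timedwait(&n->cond, &n->lock, &deadline) == ETIMEDOUT)
            break;
    }
    taken = n->count;
    n->count = 0;
    pthread_mutex_unlock(&n->lock);
    return taken;
}
```

In this sketch the waiting thread would call transmit_frame(), then notifier_take(&n, 100), then parse_response(), while the receiving side calls notifier_give(&n). A return value of 0 means the wait timed out.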

No, because that's not what vTaskDelay was meant to be used for.
The closest solution to yours would be to create a semaphore that you attempt to take inside the task with a 100ms delay and that you give from ISR. This way the task will block for a maximum of 100ms waiting for semaphore to be given, after which it'll unblock and resume execution anyway. If it's given earlier, it'll unblock earlier, which is what I assume you want.
However, from what you've written I assume you want to achieve something like the following:
1) Send some data over UART
2) Wait for response
3) As soon as response is received, do something with the response
In this case, doing both blocking and parsing in the same task is going to be hard (you reaaaalllly don't want to do any sort of parsing inside of your ISR). I'd therefore recommend the following "layout" of two tasks, an interrupt and one semaphore shared between the two tasks:
"High level" task (let's call it ApplicationTask) that can do the following:
  - Construct whole frames and request them to be sent over UART (add them to some kind of queue). This "construction of whole frames" and sending them over to the other tasks would usually be wrapped into some functions.
  - Will block waiting for response
  - Will receive already parsed data (full frame or object/structure holding that parsed data)
"Byte level" task (let's call it ByteTask) that can do the following:
  - Has a queue for transmitted data (queue of frames or queue of raw bytes)
  - Has a queue for received data
  - "Pushes" data from "data to be transmitted" queue into UART
  - Parses UART data that appears in "received data" queue and gives semaphore to unblock the ApplicationTask
UART Interrupt:
  - Only transmits data that it's told to transmit by ByteTask
  - Pushes received data into ByteTask receive queue
Shared semaphore between ApplicationTask and ByteTask:
  - Whenever ApplicationTask wants to "wait to receive response", it attempts to take this semaphore. Maximum blocking time can be used as a "response receiving timeout".
  - Whenever ByteTask receives and parses enough data to decide that "yes, this is a complete response", it gives this semaphore.
This above is a super simple example of something that should be easy enough to scale to more tasks as you develop your application. You can get a lot more fancy than that (e.g. ByteTask handling multiple UARTs at the same time, have a pool of semaphores used for blocking for multiple tasks, do some more sophisticated message dispatching etc.), but the example above should hopefully give you a better idea of how something like this can be approached.

How to reset a IXAudio2SourceVoice's 'SamplesPlayed' counter after flushing source buffers?

IXAudio2SourceVoice has a GetState function which returns an XAUDIO2_VOICE_STATE structure. This structure has a SamplesPlayed member, which is:
Total number of samples processed by this voice since it last started, or since the last audio stream ended (as marked with the XAUDIO2_END_OF_STREAM flag).
What I want to be able to do is stop the source voice, flush all its buffers, and then reset the SamplesPlayed counter to zero. Neither calling Stop nor FlushSourceBuffers will by themselves reset SamplesPlayed. And while flagging the last buffer with XAUDIO2_END_OF_STREAM does correctly reset SamplesPlayed back to zero, this seemingly only works if that last buffer is played to completion; if the buffer is flushed, then SamplesPlayed does not get reset. I have also tried calling Discontinuity both before and after stopping/flushing with no effect.
My current workaround is, after stopping and flushing the source voice, to submit a tiny 1-sample silent buffer with the XAUDIO2_END_OF_STREAM flag set and then let the source voice play to process that buffer and thus reset SamplesPlayed to zero. This works fine-ish for my use case, but it seems pretty hacky/clumsy. Is there a better solution?

Looking at the XAudio2 source, there's no exposed way to do that in the API other than letting a packet play with XAUDIO2_END_OF_STREAM.
Calling Discontinuity sets up the end-of-stream flag on the currently playing buffer, or if there's none playing and a queued buffer it sets it there. You need to call Discontinuity and then let the voice play to completion before you recycle it.

Transactions not expired after timeout expired

We're using neo4j (3.1.5-enterprise) for one of our services. (Over HTTP)
We set dbms.transaction.timeout=150s in our neo4j config file.
We have a scenario which may take more than 150 seconds, but we would like the transaction to be expired after 150 seconds anyway.
For some reason that is not happening: the transaction continues until it has fully executed instead of being stopped after 150 seconds. Any guess why?
In our application logs I can see the following exception (more stacktrace details below):
neo.db.NeoHttpDriver - Errors in response:
[NeoResponseError{
code='Neo.DatabaseError.Statement.ExecutionFailed',
message='Transaction timeout. (Overtime: 23793 ms).',
stackTrace='org.neo4j.kernel.guard.GuardTimeoutException: Transaction timeout. (Overtime: 23793 ms).
...
Also, our service's steps (in the specific scenario that may take a long time) are in general: open a transaction, lock some common entity and proceed. Since the transaction is not expired and released after 150 seconds (and therefore the common entity stays locked), other threads may also be blocked for a long time.
Thanks!
Orel
Exception stacktrace:
15:00:59.627 [DefaultThreadPool-7] DEBUG c.e.e.m.neo.db.NeoHttpDriver - Errors in response: [NeoResponseError{code='Neo.DatabaseError.Statement.ExecutionFailed', message='Transaction timeout. (Overtime: 23793 ms).', stackTrace='org.neo4j.kernel.guard.GuardTimeoutException: Transaction timeout. (Overtime: 23793 ms).
at org.neo4j.kernel.guard.TimeoutGuard.check(TimeoutGuard.java:71)
at org.neo4j.kernel.guard.TimeoutGuard.check(TimeoutGuard.java:57)
at org.neo4j.kernel.guard.TimeoutGuard.check(TimeoutGuard.java:49)
at org.neo4j.kernel.impl.api.GuardingStatementOperations.nodeCursorById(GuardingStatementOperations.java:300)
at org.neo4j.kernel.impl.api.OperationsFacade.nodeHasProperty(OperationsFacade.java:343)
at org.neo4j.cypher.internal.spi.v3_1.TransactionBoundQueryContext$NodeOperations.hasProperty(TransactionBoundQueryContext.scala:319)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations$$anonfun$hasProperty$1.apply$mcZ$sp(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations$$anonfun$hasProperty$1.apply(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations$$anonfun$hasProperty$1.apply(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.spi.v3_1.ExceptionTranslationSupport$class.translateException(ExceptionTranslationSupport.scala:32)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1.translateException(ExceptionTranslatingQueryContextFor3_1.scala:34)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations.hasProperty(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.compiler.v3_1.spi.DelegatingOperations.hasProperty(DelegatingQueryContext.scala:221)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.AbstractSetPropertyOperation.setProperty(SetOperation.scala:98)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.SetEntityPropertyOperation.set(SetOperation.scala:117)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.SetPipe$$anonfun$internalCreateResults$1.apply(SetPipe.scala:31)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.SetPipe$$anonfun$internalCreateResults$1.apply(SetPipe.scala:30)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:71)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:68)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator$$anonfun$failIfThrows$1.apply(ResultIterator.scala:94)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.decoratedCypherException(ResultIterator.scala:103)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.failIfThrows(ResultIterator.scala:92)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.next(ResultIterator.scala:68)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.next(ResultIterator.scala:49)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.foreach(ResultIterator.scala:49)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:183)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.to(ResultIterator.scala:49)
at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:294)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.toList(ResultIterator.scala:49)
at org.neo4j.cypher.internal.compiler.v3_1.EagerResultIterator.<init>(ResultIterator.scala:35)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.toEager(ResultIterator.scala:53)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.buildResultIterator(DefaultExecutionResultBuilderFactory.scala:109)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.createResults(DefaultExecutionResultBuilderFactory.scala:99)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.build(DefaultExecutionResultBuilderFactory.scala:68)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.InterpretedExecutionPlanBuilder$$anonfun$getExecutionPlanFunction$1.apply(ExecutionPlanBuilder.scala:164)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.InterpretedExecutionPlanBuilder$$anonfun$getExecutionPlanFunction$1.apply(ExecutionPlanBuilder.scala:148)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.InterpretedExecutionPlanBuilder$$anon$1.run(ExecutionPlanBuilder.scala:123)
at org.neo4j.cypher.internal.compatibility.CompatibilityFor3_1$ExecutionPlanWrapper$$anonfun$run$1.apply(CompatibilityFor3_1.scala:275)
at org.neo4j.cypher.internal.compatibility.CompatibilityFor3_1$ExecutionPlanWrapper$$anonfun$run$1.apply(CompatibilityFor3_1.scala:273)
at org.neo4j.cypher.internal.compatibility.exceptionHandlerFor3_1$runSafely$.apply(CompatibilityFor3_1.scala:190)
at org.neo4j.cypher.internal.compatibility.CompatibilityFor3_1$ExecutionPlanWrapper.run(CompatibilityFor3_1.scala:273)
at org.neo4j.cypher.internal.PreparedPlanExecution.execute(PreparedPlanExecution.scala:26)
at org.neo4j.cypher.internal.ExecutionEngine.execute(ExecutionEngine.scala:107)
at org.neo4j.cypher.internal.javacompat.ExecutionEngine.executeQuery(ExecutionEngine.java:59)
at org.neo4j.server.rest.transactional.TransactionHandle.safelyExecute(TransactionHandle.java:371)
at org.neo4j.server.rest.transactional.TransactionHandle.executeStatements(TransactionHandle.java:323)
at org.neo4j.server.rest.transactional.TransactionHandle.execute(TransactionHandle.java:230)
at org.neo4j.server.rest.transactional.TransactionHandle.execute(TransactionHandle.java:119)
at org.neo4j.server.rest.web.TransactionalService.lambda$executeStatements$0(TransactionalService.java:203)

Most likely the problem is that the tx is waiting on a lock. Prior to Neo4j 3.2, dbms.transaction.timeout cannot cover the case of terminating a transaction that's waiting on a lock (or rather, it will mark it for termination, but the actual termination won't happen until the lock is acquired).
In Neo4j 3.2, dbms.lock.acquisition.timeout was introduced, which interrupts waiting on a lock and allows the thread to check if the tx has been terminated and take appropriate action.

The following is based on an answer provided by Neo4j Support:
dbms.lock.acquisition.timeout
As a starting point, dbms.lock.acquisition.timeout was only added in Neo4j 3.2; it does not exist in 3.1. Without a lock acquisition timeout, wait times on locks can overrun the configured limit. Things like GC can also extend the time. As you're currently on 3.1.5, dbms.lock.acquisition.timeout is not yet available and would not be enforced.
dbms.transaction.timeout
dbms.transaction.timeout marks a transaction for termination, but the actual logic of checking this and performing the termination happens on a running thread, not one waiting on locks, and doesn't cause the thread to wake up and check. Presumably the logic for terminating a thread upon timeout is that some other thread periodically checks execution time for a transaction, and if it has exceeded the transaction timeout, sets a boolean variable on the transaction to indicate that it is marked for termination. The actual termination of the thread likely happens in an event loop for the transaction, where it checks that variable to see if it's marked for termination, then terminates and rolls back. A thread that attempts to acquire a lock enters a waiting state when the lock is already held by another thread. During this waiting state, the event loop is not being processed, so the thread never reaches the point in the event loop where it can check if it's been marked as terminated and take care of it.
Bottom line:
dbms.transaction.timeout does not cause a hard timeout. It only marks the transaction as timed out, which causes it to roll back once the flag is checked.
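The "marked for termination but never checked while waiting on a lock" behaviour can be illustrated with a small pthread analogy. This is a sketch of the general pattern only, not Neo4j's implementation: tx_t, tx_lock_blocking, tx_lock_with_timeout and demo_terminate_while_waiting are hypothetical names, and the timed lock plays the role that the answers above attribute to dbms.lock.acquisition.timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

typedef struct {
    pthread_mutex_t *lock;        /* the contended "common entity" */
    atomic_bool      terminated;  /* set by a monitor once the timeout passes */
} tx_t;

enum tx_result { TX_GOT_LOCK, TX_TERMINATED, TX_ERROR };

/* No acquisition timeout: the flag is only seen after the lock is acquired. */
enum tx_result tx_lock_blocking(tx_t *tx)
{
    pthread_mutex_lock(tx->lock);            /* may block far past the timeout */
    if (atomic_load(&tx->terminated)) {
        pthread_mutex_unlock(tx->lock);
        return TX_TERMINATED;
    }
    return TX_GOT_LOCK;
}

/* With an acquisition timeout: wake every slice_ms and re-check the flag. */
enum tx_result tx_lock_with_timeout(tx_t *tx, long slice_ms)
{
    for (;;) {
        struct timespec deadline;
        int rc;

        if (atomic_load(&tx->terminated))
            return TX_TERMINATED;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec  += slice_ms / 1000;
        deadline.tv_nsec += (slice_ms % 1000) * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
        rc = pthread_mutex_timedlock(tx->lock, &deadline);
        if (rc == 0)
            return TX_GOT_LOCK;
        if (rc != ETIMEDOUT)
            return TX_ERROR;
    }
}

static void *waiting_tx(void *arg)
{
    return (void *)(intptr_t)tx_lock_with_timeout((tx_t *)arg, 10);
}

/* Another holder keeps the lock; the waiter is marked terminated meanwhile. */
int demo_terminate_while_waiting(void)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    struct timespec pause = { 0, 50 * 1000000L };
    pthread_t thread;
    void *result;
    tx_t tx;

    tx.lock = &lock;
    atomic_init(&tx.terminated, false);

    pthread_mutex_lock(&lock);                 /* lock is held elsewhere */
    pthread_create(&thread, NULL, waiting_tx, &tx);
    nanosleep(&pause, NULL);
    atomic_store(&tx.terminated, true);        /* the timeout monitor fires */
    pthread_join(thread, &result);             /* waiter exits without the lock */
    pthread_mutex_unlock(&lock);
    return (int)(intptr_t)result;
}
```

With tx_lock_blocking in place of tx_lock_with_timeout, the waiter in the demo would not return until the holder released the lock, which mirrors the behaviour reported in the question.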

Handing off a piece of work to a thread and waiting for it to accept

My application works as follows:
the worker-threads initialize and begin waiting in pthread_cond_wait()
the main thread connects to DB and starts handing over one row at a time to the proper worker
Because of the DB-driver internals, the next row can not be read until the current one is extracted, so the main thread has to wait for the worker to "accept" the row.
I achieve this by calling pthread_cond_wait() inside the main thread, waiting for a pthread_cond_signal() from the worker. This works cleanly on both Linux and FreeBSD, but usually takes much longer on Linux. Whereas I consistently process the entire 1.6M rows in about 27 seconds on FreeBSD, on Linux it usually takes over 2 minutes. Except sometimes the Linux box shows the same time...
The code is compiled from the same source and the program talks to the same DB-server. If anything, the Linux box is located on the same LAN as the DB, whereas the FreeBSD machine connects via VPN (so it should be a bit slower). But it is the wide inconsistency of the Linux results that bothers me, and I suspect the thread-coordination...
Here is what I have now:
MAIN THREAD                               WORKER
--------------------------------------------------------------------------
get new row
figure out, which worker it belongs to    lock my mutex
lock the worker's mutex                   go into pthread_cond_wait
signal the worker                         extract the row's data
unlock the worker's mutex                 signal the main thread
go into pthread_cond_wait                 unlock the mutex
go on back to getting the next row        go on to process the row's data
Is there a better way? Thanks!

If reading the next row must be serial anyway, why are you delegating this to the worker? As the main thread has to wait anyway, have the main thread do the extraction and have the hand-off occur as soon as the row has been sufficiently extracted that the master can proceed to the next row.
Other than that, you will need to provide code, as your description is incomplete, as would be any question of this nature submitted without code.

It looks like your problem is that you are calling pthread_cond_wait() without the mutex locked in the main thread. This means that there's a race-condition: if the worker thread wakes up, extracts the data and signals the condition before the parent executes pthread_cond_wait(), the wakeup will be lost.
What you should have is some shared state paired with the condition variable, like this:
Main Thread:
get_new_row();
worker = decide_worker();
pthread_mutex_lock(&mutex);
/* Tell the worker that data is available */
flag[worker] = 1;
/* cond is shared by all threads, so wake every waiter; each one rechecks its own flag */
pthread_cond_broadcast(&cond);
/* Wait for the worker to extract it */
while (flag[worker] == 1)
    pthread_cond_wait(&cond, &mutex);
pthread_mutex_unlock(&mutex);
Worker Thread:
pthread_mutex_lock(&mutex);
/* Wait for data to be available */
while (flag[worker] == 0)
    pthread_cond_wait(&cond, &mutex);
extract_row_data();
/* Tell the main thread that extraction is complete */
flag[worker] = 0;
pthread_cond_broadcast(&cond);
pthread_mutex_unlock(&mutex);
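A fuller version of this hand-off can be sketched as a self-contained program. It is only an illustration under assumptions: each worker gets its own mutex and a pair of condition variables instead of one shared pair, an int stands in for the database row, and slot_t, hand_off and run_handoff are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

#define NUM_WORKERS 4

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  row_ready;  /* main -> worker: a row is waiting */
    pthread_cond_t  row_taken;  /* worker -> main: the row was extracted */
    bool has_row;
    bool shutdown;
    int  row;                   /* stand-in for the driver's row buffer */
    long sum;                   /* worker-private result, read after join */
} slot_t;

static slot_t slots[NUM_WORKERS];

static void *worker_main(void *arg)
{
    slot_t *s = arg;
    for (;;) {
        int row;

        pthread_mutex_lock(&s->lock);
        while (!s->has_row && !s->shutdown)
            pthread_cond_wait(&s->row_ready, &s->lock);
        if (!s->has_row) {                  /* shutdown and nothing pending */
            pthread_mutex_unlock(&s->lock);
            return NULL;
        }
        row = s->row;                       /* "extract" the row */
        s->has_row = false;
        pthread_cond_signal(&s->row_taken); /* main may fetch the next row */
        pthread_mutex_unlock(&s->lock);

        s->sum += row;                      /* process outside the lock */
    }
}

/* Hand one row to a worker and return once the worker has accepted it. */
static void hand_off(slot_t *s, int row)
{
    pthread_mutex_lock(&s->lock);
    s->row = row;
    s->has_row = true;
    pthread_cond_signal(&s->row_ready);
    while (s->has_row)                      /* predicate prevents lost wakeups */
        pthread_cond_wait(&s->row_taken, &s->lock);
    pthread_mutex_unlock(&s->lock);
}

/* Distribute rows 1..num_rows round-robin and return the sum the workers saw. */
long run_handoff(int num_rows)
{
    pthread_t threads[NUM_WORKERS];
    long total = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_init(&slots[i].lock, NULL);
        pthread_cond_init(&slots[i].row_ready, NULL);
        pthread_cond_init(&slots[i].row_taken, NULL);
        slots[i].has_row = false;
        slots[i].shutdown = false;
        slots[i].sum = 0;
        pthread_create(&threads[i], NULL, worker_main, &slots[i]);
    }
    for (int row = 1; row <= num_rows; row++)
        hand_off(&slots[row % NUM_WORKERS], row);

    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_mutex_lock(&slots[i].lock);
        slots[i].shutdown = true;
        pthread_cond_signal(&slots[i].row_ready);
        pthread_mutex_unlock(&slots[i].lock);
        pthread_join(threads[i], NULL);
        total += slots[i].sum;
        pthread_cond_destroy(&slots[i].row_ready);
        pthread_cond_destroy(&slots[i].row_taken);
        pthread_mutex_destroy(&slots[i].lock);
    }
    return total;
}
```

Because every wait is guarded by a flag checked under the mutex, a signal sent before the other side starts waiting is not lost. Giving each worker its own condition variables also means a signal can only reach the intended thread.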

Resetting comm event mask

I have been doing overlapped serial port communication in Delphi lately and there is one problem I'm not sure how to solve.
I communicate with a modem. I write a request frame (an AT command) to the modem's COM port and then wait for the modem to respond. The event mask of the port is set to EV_RXCHAR, so when I write a request, I call WaitCommEvent() and start waiting for data to appear in the input queue. When overlapped waiting for event finishes, I immediately start reading data from the queue and read all that the device sends at once:
1) write a request
2) call WaitCommEvent() and wait until waiting finishes
3) read all the data that the device sends (not only the data being in the input queue at that moment)
4) do something and then goto 1
Waiting for event finishes after first byte appears in the input queue. During my read operation, however, more bytes appear in the queue and each of them causes an internal event flag to be set. This means that when I read all the data from the queue and then call WaitCommEvent() for the second time, it will immediately return with EV_RXCHAR mask, even though there is no data to be read.
How should I handle reading and waiting for the event to be sure that the event mask returned by WaitCommEvent() is always valid? Is it possible to reset the flags of the serial port so that, when I have read all data from the queue and call WaitCommEvent() afterwards, it will not return immediately with a mask that was valid before I read the data?
The only solution that comes to my mind is this:
1) write a request
2) call WaitCommEvent() and wait until waiting finishes
3) read all the data that the device sends (not only the data being in the input queue at that moment)
4) call WaitCommEvent() which should return true immediately at the same time resetting the event flag set internally
5) do something and goto 1
Is it a good idea or is it stupid? Of course I know that the modem almost always finishes its answers with CRLF characters so I could set the comm mask to EV_RXFLAG and wait for the #10 character to appear, but there are many other devices with which I communicate and they do not always send frame end characters.
Your help will be appreciated. Thanks in advance!
Mariusz.

Your solution does sound workable. I just use a state machine to handle the transitions.
(pseudocode)
ioState := ioIdle;
while (ioState <> ioFinished) and (not aborted) do
  case ioState of
    ioIdle     : if there is data to read then set state to ioMidFrame
    ioMidFrame : if there is data to read then read it; at end of frame set state to ioEndFrame
    ioEndFrame : process the data and set state to ioFinished
    ioFinished : // don't do anything, for doc purposes only.
  end;
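The same state machine can be sketched in C for experimenting off-target. This is an illustration, not the answerer's code: frame_reader_t and frame_reader_run are hypothetical names, the input array stands in for the serial input queue, and the end-of-frame test assumes a line feed terminator (the #10 character mentioned in the question), which will not hold for every device.

```c
#include <stddef.h>

typedef enum { IO_IDLE, IO_MID_FRAME, IO_END_FRAME, IO_FINISHED } io_state_t;

typedef struct {
    io_state_t state;
    char       frame[64];   /* bytes collected for the current frame */
    size_t     len;
} frame_reader_t;

#define FRAME_END '\n'       /* assumed terminator; device specific */

/* Run the state machine over the bytes currently available in `in`.
 * Returns how many bytes were consumed; r->state shows how far it got. */
size_t frame_reader_run(frame_reader_t *r, const char *in, size_t avail)
{
    size_t used = 0;

    while (r->state != IO_FINISHED) {
        switch (r->state) {
        case IO_IDLE:
            if (used == avail)
                return used;             /* nothing to read yet */
            r->len = 0;
            r->state = IO_MID_FRAME;
            break;
        case IO_MID_FRAME: {
            char c;
            if (used == avail)
                return used;             /* wait for more data */
            c = in[used++];
            if (r->len < sizeof r->frame)
                r->frame[r->len++] = c;  /* excess bytes are dropped */
            if (c == FRAME_END)
                r->state = IO_END_FRAME;
            break;
        }
        case IO_END_FRAME:
            /* Process r->frame[0 .. r->len) here. */
            r->state = IO_FINISHED;
            break;
        case IO_FINISHED:
            break;                       /* listed for completeness only */
        }
    }
    return used;
}
```

The caller keeps the reader between calls, so a frame that arrives in several pieces is assembled across calls, and bytes that follow the terminator are left unconsumed for the next frame.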
