Do I need to wrap accesses to Int64's with a critical section? - delphi

I have code that logs execution times of routines by accessing QueryPerformanceCounter. Roughly:
var
  FStart, FStop: Int64;
...
QueryPerformanceCounter(FStart);
... <code to be measured>
QueryPerformanceCounter(FStop);
<calculate FStop - FStart, update minimum and maximum execution times, etc>
Some of this logging code runs inside threads, but there is also a display UI that accesses the derived results. So the possibility exists that the VCL thread accesses the same variables the logging code is working on. The VCL will only ever read the data (and a mangled read would not be too serious), but the logging code will read and write the data, sometimes from another thread.
I assume QueryPerformanceCounter itself is thread-safe.
The code has run happily without any sign of a problem, but I'm wondering if I need to wrap my accesses to the Int64 counters in a critical section?
I'm also wondering what the speed penalty of the critical section access is?

Any time you access multi-byte, non-atomic data across threads when both reads and writes are involved, you need to serialize the access. Whether you use a critical section, mutex, semaphore, SRW lock, etc. is up to you.
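As an illustration of the kind of serialization meant here (not part of the original answer), a minimal sketch using TCriticalSection from the SyncObjs unit; TTimingStats and its members are made-up names. As for the speed penalty: entering an uncontended critical section is essentially a single interlocked operation, so the overhead is tiny next to a QueryPerformanceCounter call and only becomes noticeable under heavy contention.

uses
  SyncObjs;

type
  TTimingStats = class
  private
    FLock: TCriticalSection;   // guards the Int64 fields below
    FMin, FMax: Int64;
  public
    constructor Create;
    destructor Destroy; override;
    procedure RecordSample(const AElapsed: Int64);   // called by the logging threads
    procedure GetSnapshot(out AMin, AMax: Int64);    // called by the VCL thread
  end;

constructor TTimingStats.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FMin := High(Int64);
  FMax := 0;
end;

destructor TTimingStats.Destroy;
begin
  FLock.Free;
  inherited;
end;

procedure TTimingStats.RecordSample(const AElapsed: Int64);
begin
  FLock.Acquire;
  try
    if AElapsed < FMin then FMin := AElapsed;
    if AElapsed > FMax then FMax := AElapsed;
  finally
    FLock.Release;
  end;
end;

procedure TTimingStats.GetSnapshot(out AMin, AMax: Int64);
begin
  FLock.Acquire;
  try
    AMin := FMin;
    AMax := FMax;
  finally
    FLock.Release;
  end;
end;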

Related

Questions about passing strings and other data from UI to LV2 plugin

I need to pass a string from the UI to the plugin. From the eg-sample, it appears that an LV2 atom should be written to an atom port.
If I understand it correctly:
First allocate an LV2_Atom_Forge. May that object be on the stack, or does it have to survive after the UI event callback has returned?
Call lv2_atom_forge_set_buffer. How do I know the required size of the buffer? The example sets it to 1024 bytes without explanation. May the buffer be allocated on the stack, or does it have to survive after the UI event callback has returned?
The forge is just a utility for writing atoms. The buffer it writes to is provided by the code that uses it, so the lifetime of the forge itself is irrelevant. Allocating it on the stack is fine, though it may be more convenient to keep one around in your UI struct for use in various places.
You can estimate the space required by knowing the format of atoms as described in the documentation, or by simply implementing everything with a massive buffer at first and checking the size field of the top-level atom in your output. Keep in mind that this will change if you have variable-sized elements like strings in there. The data passed to the UI callback(s) is const and only valid during the call; it must be copied by the receiver if it needs to be available later.

Wrapper class for thread-safe objects

I have recently played around with an open-source demo project for the basic functionality of an INDY10 TCP/IP server and stumbled upon the problem of INDY's internal multithreading and its interaction with VCL components. Since there are many different topics on SO on the subject, I decided to make a simple client-server application and test some of the solutions and approaches suggested, at least the ones that I understood correctly. Below I would like to summarize and review an approach that was previously suggested on SO, and if possible hear your expert opinion on the subject.
Problem: Encapsulating the VCL for thread-safe usage inside an INDY10-based client/server application.
Description of the Development Env.:
Delphi Version: Delphi® XE2 Version 16.0
INDY Version 10.5.8.0
O.S. Windows 7 (32Bit)
As mentioned in the article "Is the VCL Thread-safe?" (sorry, I do not have enough reputation to post the link), special care should be taken when one wishes to use any kind of VCL component inside a multithreaded (multitasking) application. The VCL is not thread-safe, but it can be used in a thread-safe way!
The how and the why usually depend on the application at hand, but one can attempt to generalize a bit and suggest some kind of general approach to this problem. First of all, as in the case of INDY10, one does not need to explicitly parallelize one's code, i.e. create and execute multiple threads, in order to expose the VCL to deadlocks and data interdependencies.
In every client-server application, the server has to be able to handle multiple requests simultaneously, so naturally INDY10 implements this functionality internally. This means that the INDY10 classes are responsible for managing the program's thread creation, execution and destruction internally.
The most obvious place where our code is exposed to the inner workings of INDY10 and hence possible thread conflicts, is the IdTCPServerExecute (TIdTCPServer onExecute event) method.
Naturally, INDY10 provides classes (wrappers) that ensure thread-safe program flow, but since I did not manage to find enough explanation of their application and usage, I prefer a custom-made approach.
Below I summarize a method (the suggested technique is based on a previous comment I found on SO: How to use TIdThreadSafe class from Indy10) that attempts (and presumably succeeds) to deal with this problem:
The question I tackle below is: How to make a specific class "MyClass" ThreadSafe?
The main idea is to create a kind of wrapper class that encapsulates "MyClass" and queues the threads that try to access it on a first-in-first-out basis. The underlying objects used for synchronization are Windows Critical Section objects.
In the context of a client-server application, "MyClass" will contain all the thread-unsafe functionality of our server, so we will try to ensure that those procedures and functions are not executed by more than one worker thread simultaneously. This naturally means a loss of parallelism in our code, but since the approach is simple and seems to work, in some cases it may be useful.
Wrapper class Implementation:
constructor TThreadSafeObject<T>.Create(originalObject: T);
begin
tsObject := originalObject; // pass it already instantiated instance of MyClass
tsCriticalSection:= TCriticalSection.Create; // Critical section Object
end;
destructor TThreadSafeObject<T>.Destroy();
begin
FreeAndNil(tsObject);
FreeAndNil(tsCriticalSection);
inherited Destroy;
end;
function TThreadSafeObject<T>.Lock(): T;
begin
tsCriticalSection.Enter;
result:=tsObject;
end;
procedure TThreadSafeObject<T>.Unlock();
begin
tsCriticalSection.Leave;
end;
procedure TThreadSafeObject<T>.FreeOwnership();
begin
FreeAndNil(tsObject);
FreeAndNil(tsCriticalSection);
end;
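The post does not show the declaration of the wrapper type itself; judging from the method bodies above, a matching declaration would look roughly like this (my reconstruction, with TCriticalSection coming from the SyncObjs unit):

type
  TThreadSafeObject<T: class> = class
  private
    tsObject: T;                          // the wrapped, thread-unsafe instance
    tsCriticalSection: TCriticalSection;  // serializes access to tsObject
  public
    constructor Create(originalObject: T);
    destructor Destroy; override;
    function Lock: T;          // enter the critical section and return the wrapped object
    procedure Unlock;          // leave the critical section
    procedure FreeOwnership;   // free the wrapped object and the critical section
  end;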
MyClass Definition:
MyClass = class
public
procedure drawRandomBitmap(abitmap: TBitmap); //Draw Random Lines on TCanvas
function decToBin(i: LongInt): String; //convert decimal number to Bin.
procedure addLineToMemo(aLine: String; MemoFld: TMemo); // output message to TMemo
function randomColor(): TColor;
end;
Usage:
Since threads execute in order and wait for the thread that currently owns the critical section to finish (tsCriticalSection.Enter and tsCriticalSection.Leave), it is logical that if you want to manage that ownership relay, you need one unique instance of TThreadSafeObject (you can consider using the singleton pattern). So include:
tsMyclass:= TThreadSafeObject<MyClass>.Create(MyClass.Create);
in Form.Create and
tsMyclass.Destroy;
in Form.Close. Here tsMyclass is a global variable of type TThreadSafeObject<MyClass>.
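Spelled out as event handlers (just a sketch; TForm1 and the exact event signatures are assumed):

var
  tsMyclass: TThreadSafeObject<MyClass>;  // single shared instance

procedure TForm1.FormCreate(Sender: TObject);
begin
  tsMyclass := TThreadSafeObject<MyClass>.Create(MyClass.Create);
end;

procedure TForm1.FormClose(Sender: TObject; var Action: TCloseAction);
begin
  FreeAndNil(tsMyclass);  // frees the wrapper, which in turn frees the wrapped MyClass instance
end;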
Usage:
Regarding the usage of MyClass try the following:
with tsMyclass.Lock do
try
addLineToMemo('MemoLine1', Memo1);
addLineToMemo('MemoLine2', Memo1);
addLineToMemo('MemoLine3', Memo1);
finally
// release ownership
tsMyclass.unlock;
end;
where Memo1 is an instance of a TMemo component on the form.
With this, we are supposed to ensure that anything that happens while tsMyClass is locked will be executed by only one thread at a time. An obvious drawback of this approach, however, is that since I have only one instance of tsMyclass, even if one thread is trying to draw on the Canvas while another is writing to the Memo, the first thread will have to wait for the second to finish and only then will it be able to carry out its job.
My questions here are:
Is the above suggested method correct? Am I still free of race conditions, or are there "loopholes" in the code through which data conflicts could occur?
How can one, in general, test his/her application for thread-unsafety?
I would like to stress that the above approach is in no way my own doing. It is basically a summary of the solution found in 2. Nevertheless, I have decided to post it again in an attempt to get some kind of closure on the topic, or some kind of proof of validity for the suggested solution. Besides, repetition is the mother of all knowledge, as they say.
With this, we are supposed to ensure that anything that happens while tsMyClass is locked will be executed by only one thread at a time. An obvious drawback of this approach, however, is that since I have only one instance of tsMyclass, even if one thread is trying to draw on the Canvas while another is writing to the Memo, the first thread will have to wait for the second to finish and only then will it be able to carry out its job.
I see one big problem here: the VCL (forms, drawing, etc.) lives on the main thread. Even if you block concurrent thread access, the updates need to be done in the context of the main thread. This is the part where you need to use Synchronize(); the big difference from a lock (critical section) is that synchronized code is run in the context of the main thread. The end result is basically the same: your threaded code is serialized and you lose the advantage of using threads in the first place.
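For illustration (not part of the original answer), a minimal sketch of the Synchronize() route using an anonymous method; TMyWorker is an assumed TThread descendant, and Form1/Memo1 are assumed names:

procedure TMyWorker.Execute;
begin
  // ... do the non-UI work here, in the worker thread ...

  // Marshal the UI update to the main thread; Execute waits until it has run.
  Synchronize(
    procedure
    begin
      Form1.Memo1.Lines.Add('result computed by the worker thread');
    end);
end;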
Locking on the whole object can be much too coarse.
Imagine cases where some properties or methods are independent of others. If the lock works on a "global" level, many operations will be blocked needlessly.
From Reduce lock granularity – Concurrency optimization
So, how can we reduce lock granularity? The short answer: by asking for locks as little as possible. The basic idea is to use separate locks to guard multiple independent state variables of a class, instead of having only one lock in class scope.
First things first: You don't need to implement a LOCK for each of your objects, Delphi's done that for you with the TMonitor class:
TMonitor.Enter(WhateverObject);
try
  // Your code goes here.
finally
  TMonitor.Leave(WhateverObject);
end;
just make sure you free the WhateverObject when your application shuts down, or else you'll run into a bug that I've opened on QC: http://qc.embarcadero.com/wc/qcmain.aspx?d=111795
Secondly, making an application multi-threaded is a bit more involved. You can't just wrap each call between Enter/Leave calls: your locking needs to take into account what the object does and what the access pattern is. Wrapping calls within Enter/Leave simply makes sure that only one thread runs that method at any time, but race conditions are much more complex and might arise from successive calls to your locked methods. Even though each method is locked, and only one thread ever calls those methods at any given time, the state of the locked object might change in between as a consequence of other threads' activity.
This kind of code would be just fine in a single-threaded application, but locking at method level is not enough when switching to multi-threaded:
if List.IndexOf(Something) = -1 then
  List.Add(Something);
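To make that check-and-add atomic, the lock has to span both calls, for example (using the same List and Something as above):

TMonitor.Enter(List);
try
  // The test and the insertion now happen under one lock,
  // so no other thread can add Something in between.
  if List.IndexOf(Something) = -1 then
    List.Add(Something);
finally
  TMonitor.Leave(List);
end;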

How can I parallelize check spelling using Delphi?

I've got a sort of spell checker written in Delphi. It analyzes the text sentence by sentence.
It colors incorrect items according to some rules after parsing each sentence. The user is able to interrupt this process, which is important.
How can I parallelize this process in general using some 3rd party Delphi libraries?
In the current state I get on-the-fly sentence coloring after each check, so the user sees the progress.
The algorithm would be as such:
Create multiple workers.
Create a spell-checker in each worker.
Grab the text and split it into work units (words or sentences). Each work unit must be accompanied by its location in the original text.
Send work units to the workers (see the sketch below). A good approach is to push them into a common queue from which the workers take work units. The queue must either support multiple readers or you must use locking to access it.
Each worker takes a work unit, runs a spell-check and returns the result (together with the location in the original text) to the owner.
The simplest way to return a result is to send a message to the main thread.
Alternatively, you can write results into a result queue (which must either use locking or support multiple writers) and application can then poll those results (either from a timer or from the OnIdle handler).
How the multiple spell-checkers will access the dictionary is another problem. You can load a copy of the dictionary into each worker, or you can protect access to the dictionary with a lock (but that would slow things down). If you are lucky, the dictionary is thread-safe for reading and you can run simultaneous queries without locking.
Appropriate OmniThreadLibrary abstraction for the problem would be either a ParallelTask or a BackgroundWorker.
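As a rough sketch of steps 4-6 (not OmniThreadLibrary code; it uses a plain TThread plus TThreadedQueue<T> from Generics.Collections, and TWorkUnit, TSpellWorker and ColorSentence are made-up names):

uses
  Classes, Types, SyncObjs, Generics.Collections;

type
  TWorkUnit = record
    StartPos: Integer;   // location in the original text
    Sentence: string;
  end;

  TSpellWorker = class(TThread)
  private
    FQueue: TThreadedQueue<TWorkUnit>;   // shared input queue
    procedure PostResult(const AUnit: TWorkUnit; const AWrongWords: string);
  protected
    procedure Execute; override;
  public
    constructor Create(AQueue: TThreadedQueue<TWorkUnit>);
  end;

constructor TSpellWorker.Create(AQueue: TThreadedQueue<TWorkUnit>);
begin
  inherited Create(False);
  FQueue := AQueue;
end;

procedure TSpellWorker.Execute;
var
  WorkUnit: TWorkUnit;
begin
  // Each worker would own its own spell-checker instance (not shown).
  // PopItem blocks until a work unit arrives; calling DoShutDown on the queue ends the loop.
  while FQueue.PopItem(WorkUnit) = wrSignaled do
    PostResult(WorkUnit, '' {run the spell check on WorkUnit.Sentence here});
end;

procedure TSpellWorker.PostResult(const AUnit: TWorkUnit; const AWrongWords: string);
begin
  // TThread.Queue hands the result to the main thread asynchronously,
  // so the worker can continue with the next work unit immediately.
  TThread.Queue(nil,
    procedure
    begin
      // e.g. Form1.ColorSentence(AUnit.StartPos, AWrongWords);
    end);
end;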
To parallelize, just create a new class descending from TThread, create an object of it, give part of the job to the new thread, start it, and collect the results in the main thread.
Like this:
TMySpellChecker = class(TThread)
protected
FText: String;
FResult: String;
public
procedure Execute; override;
property Text: String read FText write FText;
property Result: String read FResult write FResult;
end;
procedure TMySpellChecker.Execute;
begin
// Analyze the text, and compute the result
end;
In the main thread:
NewThread := TMySpellChecker.Create(True); // Create suspended
NewThread.Text := TextSegment;
NewThread.Start; // do not call Execute directly; Start runs Execute in the new thread
The thread object will then do the analyzing in the background, while the main thread continues to run.
To handle the results, you need to assign a handler to the OnTerminate event of the thread object:
NewThread.OnTerminate := HandleMySpellCheckerTerminate;
This must be done before you start the thread object.
To allow for interruptions, one possibility is to break the main text up into segments, place the segments in a list in the main thread, and then analyze the segments one by one using the thread object. You can then allow for interruptions between each run.
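A sketch of what the OnTerminate handler mentioned above might look like (TForm1, Memo1 and the surrounding segment logic are assumed; OnTerminate is called in the main thread, so UI access is safe there):

procedure TForm1.HandleMySpellCheckerTerminate(Sender: TObject);
var
  Worker: TMySpellChecker;
begin
  Worker := TMySpellChecker(Sender);
  // Show or apply the result of the finished segment.
  Memo1.Lines.Add(Worker.Result);
  // If the user has not interrupted the process, start the next segment here.
end;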

Can I use pthread mutexes in the destructor function for thread-specific data?

I'm allocating my pthread thread-specific data from a fixed-size global pool that's controlled by a mutex. (The code in question is not permitted to allocate memory dynamically; all the memory it's allowed to use is provided by the caller as a single buffer. pthreads might allocate memory, I couldn't say, but this doesn't mean that my code is allowed to.)
This is easy to handle when creating the data, because the function can check the result of pthread_getspecific: if it returns NULL, the global pool's mutex can be taken there and then, the pool entry acquired, and the value set using pthread_setspecific.
When the thread is destroyed, the destructor function (as per pthread_key_create) is called, but the pthreads manual is a bit vague about any restrictions that might be in place.
(I can't impose any requirements on the thread code, such as needing it to call a destructor manually before it exits. So, I could leave the data allocated, and maybe treat the pool as some kind of cache, reusing entries on an LRU basis once it becomes full -- and this is probably the approach I'd take on Windows when using the native API -- but it would be neatest to have the per-thread data correctly freed when each thread is destroyed.)
Can I just take the mutex in the destructor? There's no problem with thread destruction being delayed a bit, should some other thread have the mutex taken at that point. But is this guaranteed to work? My worry is that the thread may "no longer exist" at that point. I use quotes, because of course it certainly exists if it's still running code! -- but will it exist enough to permit a mutex to be acquired? Is this documented anywhere?
The pthread_key_create() rationale seems to justify doing whatever you want from a destructor, provided you keep signal handlers from calling pthread_exit():
There is no notion of a destructor-safe function. If an application does not call pthread_exit() from a signal handler, or if it blocks any signal whose handler may call pthread_exit() while calling async-unsafe functions, all functions may be safely called from destructors.
Do note, however, that this section is informative, not normative.
The thread's existence or non-existence will most likely not affect the mutex in the least, unless the mutex is error-checking. Even then, the kernel is still scheduling whatever thread your destructor is being run on, so there should definitely be enough thread to go around.

Confusion of thread synchronization problem

I was confused when I read this article by Zarko Gajic today:
"Multithreaded Delphi Database Queries"
Article URL: http://delphi.about.com/od/kbthread/a/query_threading.htm
Source code: http://delphi.about.com/library/weekly/code/adothreading.zip
In the code of the "TCalcThread.Execute" procedure, why does the following code not need to be placed inside Synchronize() to run?
Line 173: ListBox.Clear;
Line 179: ListBox.Items.Insert(......);
Line 188: ListBox.Items.Add('*---------*');
Line 195: TicksLabel.Caption := 'Ticks: ' + IntToStr(ticks);
These lines operate on VCL components and are related to UI updates. To my knowledge, such operations should use thread synchronization and be executed by the main thread. Is my understanding flawed?
This is a rare case where you're benefiting from the fact that Windows is doing the thread synchronization for you. The reason is that for a listbox, the items are manipulated using SendMessage with control specific messages. Because of this, each SendMessage call makes sure the message is processed by the same thread on which the control was created, notably the main thread.
Like I said, this is a rare case. It is also causing a thread switch for each of those three calls, which will degrade performance. You're still better off using Synchronize to force that block of code to run in the main thread where it belongs. It also ensures that if you begin working with a control that doesn't internally use SendMessage, you won't get bitten.
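For example, the UI lines from the article's TCalcThread.Execute could be wrapped roughly like this (just a sketch; ListBox, TicksLabel and ticks are the names used in the article, and the anonymous-method form of Synchronize requires Delphi 2009 or later):

// inside TCalcThread.Execute
Synchronize(
  procedure
  begin
    ListBox.Items.Add('*---------*');
    TicksLabel.Caption := 'Ticks: ' + IntToStr(ticks);
  end);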
Indeed. Maybe the sample isn't problematic because there are no UI changes while the thread is executing, but UI work always has to happen inside the UI thread.
The only differences I see between the sync'ed and the non-sync'ed instructions are:
the non-sync'ed ones are not parameterless methods, so the program will be more difficult to write :)
the sync'ed method is updating a TLabel, which is not a windowed control (if I remember my Delphi days correctly), so it draws on the canvas directly...
But anyway: the UI is touched by a single thread. Always. Once I wanted to update a TTreeView inside a thread (no parallelism or cross-updates, simply a separate thread) and it was a very bad thing (random errors)...
