I need multithread safe Int variable in Swift that could be incremented or decremented in different threads. Moreover, this variable must be decremented by 1 every second and notify (by block or selector) when it has zero value.
What is the best way to implement this?
Regarding apple swift block you needn't think about it.
One of the primary reasons to choose value types over reference types
is the ability to more easily reason about your code. If you always
get a unique, copied instance, you can trust that no other part of
your app is changing the data under the covers. This is especially
helpful in multi-threaded environments where a different thread could
alter your data out from under you. This can create nasty bugs that
are extremely hard to debug.
https://developer.apple.com/swift/blog/?id=10
Int is a value type. So, you can use it in several threads without worries.
Related
I've already seen this question:
What's the difference between the atomic and nonatomic attributes?
I understand that #atomic does not guarantee thread safe, and I have to use other mechanisms (e.g. #synchronized) to realize that. Based on that, I don't still know EXACTLY when to use #atomic attribute. I'd like to know USE CASE of using #atomic alone.
The typical use-case for atomic properties is when dealing with a primitive data type across multiple threads. For example, let's say you have some background thread doing some processing and you have some BOOL state property, e.g. isProcessComplete and your main thread wants to check to see if the background process is complete:
if (self.isProcessComplete) {
// Do something
}
In this case, declaring this property as atomic allows us to use/update this property across multiple threads without any more complicated synchronization mechanism because:
we're dealing with a scalar, primitive data type, e.g. BOOL;
we declared it to be atomic; and
we're using the accessor method (e.g. self.) rather than accessing the ivar directly.
When dealing with objects or other more complicated situations, atomic generally is insufficient. As you point out, in practice, atomic, alone, is rarely sufficient to achieve thread safety, which is why we don't use it very often. But for simple, stand-alone, primitive data types, atomic can be an easy way to ensure safe access across multiple threads.
#atomic will guarantee You, that the value, that You will receive will not be gibberish. A possible situation is to read a given value from one thread and to set its value from another. Then the #atomic keyword will ensure that You receive a whole value. Important thing to now is that the value that You get is not guaranteed to be the one that has been set most recently.
The second part of the Your question, about the use case is purely circumstantial, depending on the implementation the intentions. For example if You have some kind of time based update on a list every second or so, You could use atomic to ensure that You will get whole values, and the timer that updates Your list can ensure that You will have the latest data on screen or under the hood in some implicit logic.
EDIT: After a remark from #Rob I saw the need of paraphrasing the last part of my answer. In most cases atomic is not enough to get the job done. And there is a need of a better solution, like #synchronized.
atomic guarantees that competing threads will get/set whole values, whether you're setting a primitive type or a pointer to an object. With nonatomic access, it's possible to get a torn read or write, especially if you're dealing with 64-bit values.
atomic also guarantees that an object pointer won't be garbage when you retrieve it:
https://github.com/opensource-apple/objc4/blob/master/runtime/objc-accessors.mm#L58-L61.
That being said, if you're pointing to a mutable object, atomic won't offer you any guarantees about the object's state.
atomic is appropriate if you're dealing with:
Immutable objects
Primitives (like int, char, float)
atomic is inappropriate if:
The object or any of its descendant properties are mutable
If you need thread safety for a primitive value or immutable object, this is one of the fastest ways to guard against torn reads/writes that you can find.
First, I understand the difference between value and reference types -this isn't that question. I am rewriting some of my code in Swift, and decided to also refactor some of the classes. Therefore, I thought I would see if some of the classes make sense as structs.
Memory: I have some model classes that hold very large arrays, that are constantly growing in size (unknown final size), and could exist for hours. First, are there any guidelines about a suggested or absolute size for a struct, since it lives on the stack?
Refactoring Use: Since I'm refactoring what right now is a mess with too much dependency, I wonder how I could improve on that. The views and view controllers are mostly easily, it's my model, and what it does, that's always left me wishing for better examples to follow.
WorkerManager: Singleton that holds one or two Workers at a time. One will always be recording new data from a sensor, and the other would be reviewing stored data. The view controllers get the Worker reference from the WorkerManager, and ask the Worker for the data to be displayed.
Worker: Does everything on a queue, to prevent memory access issues (C array pointers are constantly changing as they grow). Listening: The listening Worker listens for new data, sends it to a Processor object (that it created) that cleans up the data and stores it in C arrays held by the Worker. Then, if there is valid data, the Worker tells the Analyzer (also owned by the worker) to analyze the data and stores it in other C arrays to be fed to views. Both the Processor and Analyzer need state to know what has happened in the past and what to process and analyze next. The pure raw data is stored in a separate Record NSManaged object. Reviewer Takes a Record and uses the pure raw data to recreate all of the analyzed data so that it can be reviewed. (analyzed data is massive, and I don't want to store it to disk)
Now, my second question is, could/should Processor and Analyzer be replaced with structs? Or maybe protocols for the Worker? They aren't really "objects" in the normal sense, just convenient groups of related methods and the necessary state. And since the code is nearly a thousand lines for each, and I don't want to put it all in one class, or even the same file.
I just don't have a good sense of how to remove all of my state, use pure functions for all of the complex mathematical operations that are performed on the arrays, and where to put them.
While the struct itself lives on the stack, the array data lives on the heap so that array can grow in size dynamically. So even if you have an array with million items in it and pass it somewhere, none of the items are copied until you change the new array due to the copy-on-write implementation. This is described in details in 2015 WWDC Session 414.
As for the second question, I think that 2015 WWDC Session 414 again has the answer. The basic check that Apple engineers recommend for value types are:
Use a value type when:
Comparing instance data with == makes sense
You want copies to have independent state
The data will be used in code across multiple threads
Use a reference type (e.g. use a class) when:
Comparing instance identity with === makes sense
You want to create shared, mutable state
So from what you've described, I think that reference types fit Processor and Analyzer much better. It doesn't seem that copies of Processor and Analyzer are valid objects if you've not created new Producers and Analyzers explicitly. Would you not want the changes to these objects to be shared?
This question already has answers here:
What's the difference between the atomic and nonatomic attributes?
(27 answers)
Closed 7 years ago.
I know there are answers on atomic vs. non-atomic answers, but they mostly seem to be fairly old (2011 and earlier), so I'm hoping for updated advice. My understanding is that non-atomic properties are faster but not thread safe. Does this mean that any property that might be accessed from multiple threads at the same time should always be atomic? Are there conditions that can make it ok to make it non-atomic? What other concerns are there in determining whether to make a property atomic or nonatomic?
Declaring a property atomic makes compiler generate additional code that prevents concurrent access to the property. This additional code locks a semaphore, then gets or sets the property, and then unlock the semaphore. Compared to setting or getting a primitive value or a pointer, locking and unlocking a semaphore is expensive (although it is usually negligible if you consider the overall flow of your app).
Since most of your classes under iOS, especially the ones related to UI, will be used in a single-threaded environment, it is safe to drop atomic (i.e. write nonatomic, because properties are atomic by default), even though the operation is relatively inexpensive, you do not want to pay for things that you do not need.
Declaring a property as atomic does not necessarily make it thread safe.
Atomic is the default and involves some extra overhead compared to nonatomic. If thread A is halfway through the getter for that property and thread B changes the value in the setter, using atomic will ensure that a viable, whole value is returned from the getter. If you use nonatomic no such guarantee is made, the extra code is not generated, and thus nonatomic is faster.
However, This does not guarantee thread safety. If thread A calls the getter and threads B and C are updating the thread with different values then thread A could get either value and there is no guarantee which one it will get.
To specifically answer your question, many scenarios allow for nonatomic properties, if not most. Although the extra overhead of using atomic is probably negligable. Simply put, are your properties read or set on different threads? If not, you most likely don't need to declare them as atomic but the extra overhead might not even be noticable. It's just that simply declaring them as atomic does not guarantee thread safety.
In most situations it is unimportant, whether a property is atomic or not in a multithreaded environment.
What?
In most situations it is unimportant, whether a property is atomic or not in a multithreaded environment.
The reason for this is that making a property "thread-safe" by turning atomicity on does not make your code thread-safe. To get this you need more work and this work typically implicitly ensures that the property is not accessed parallel.
Let's have an example: You have a class Person with the properties firstName and lastName both of NSString*. You simply want to add the both name separated with a space to have a full name.
NSString *fullName = [NSString stringWithFormat:#"%# %#", person.firstName, person.lastName];
You know that other threads can update the properties while you do that. Making the properties atomic doesn't help anything. This does not help, if the last name of the persons are changed, after reading the first name, but before reading the second name. This does not help, if the names of the persons are changed after calculating the full name, because this can be invalid in the very next moment.
You have to serialize operations, not property accesses. But atomicity only serializes accesses. So it does not help, if at least one operation has more than one access. 99,999999373 % of all cases.
Forget about property atomicity. It is meaningless.
There are no other concerns. Yes, any property that can be accessed on multiple threads should be atomic or you can end up with unexpected results.
As i know, volatile is usually used to prevent unexpected compile optimization during some hardware operations. But which scenes volatile should be declared in property definition puzzles me. Please give some representative examples.
Thx.
A compiler assumes that the only way a variable can change its value is through code that changes it.
int a = 24;
Now the compiler assumes that a is 24 until it sees any statement that changes the value of a. If you write code somewhere below above statement that says
int b = a + 3;
the compiler will say "I know what a is, it's 24! So b is 27. I don't have to write code to perform that calculation, I know that it will always be 27". The compiler may just optimize the whole calculation away.
But the compiler would be wrong in case a has changed between the assignment and the calculation. However, why would a do that? Why would a suddenly have a different value? It won't.
If a is a stack variable, it cannot change value, unless you pass a reference to it, e.g.
doSomething(&a);
The function doSomething has a pointer to a, which means it can change the value of a and after that line of code, a may not be 24 any longer. So if you write
int a = 24;
doSomething(&a);
int b = a + 3;
the compiler will not optimize the calculation away. Who knows what value a will have after doSomething? The compiler for sure doesn't.
Things get more tricky with global variables or instance variables of objects. These variables are not on stack, they are on heap and that means that different threads can have access to them.
// Global Scope
int a = 0;
void function ( ) {
a = 24;
b = a + 3;
}
Will b be 27? Most likely the answer is yes, but there is a tiny chance that some other thread has changed the value of a between these two lines of code and then it won't be 27. Does the compiler care? No. Why? Because C doesn't know anything about threads - at least it didn't used to (the latest C standard finally knows native threads, but all thread functionality before that was only API provided by the operating system and not native to C). So a C compiler will still assume that b is 27 and optimize the calculation away, which may lead to incorrect results.
And that's what volatile is good for. If you tag a variable volatile like that
volatile int a = 0;
you are basically telling the compiler: "The value of a may change at any time. No seriously, it may change out of the blue. You don't see it coming and *bang*, it has a different value!". For the compiler that means it must not assume that a has a certain value just because it used to have that value 1 pico-second ago and there was no code that seemed to have changed it. Doesn't matter. When accessing a, always read its current value.
Overuse of volatile prevents a lot of compiler optimizations, may slow down calculation code dramatically and very often people use volatile in situations where it isn't even necessary. For example, the compiler never makes value assumptions across memory barriers. What exactly a memory barrier is? Well, that's a bit far beyond the scope of my reply. You just need to know that typical synchronization constructs are memory barriers, e.g. locks, mutexes or semaphores, etc. Consider this code:
// Global Scope
int a = 0;
void function ( ) {
a = 24;
pthread_mutex_lock(m);
b = a + 3;
pthread_mutex_unlock(m);
}
pthread_mutex_lock is a memory barrier (pthread_mutex_unlock as well, by the way) and thus it's not necessary to declare a as volatile, the compiler will not make an assumption of the value of a across a memory barrier, never.
Objective-C is pretty much like C in all these aspects, after all it's just a C with extensions and a runtime. One thing to note is that atomic properties in Obj-C are memory barriers, so you don't need to declare properties volatile. If you access the property from multiple threads, declare it atomic, which is even default by the way (if you don't mark it nonatomic, it will be atomic). If you never access it from multiple thread, tagging it nonatomic will make access to that property a lot faster, but that only pays off if you access the property really a lot (a lot doesn't mean ten times a minute, it's rather several thousand times a second).
So you want Obj-C code, that requires volatile?
#implementation SomeObject {
volatile bool done;
}
- (void)someMethod {
done = false;
// Start some background task that performes an action
// and when it is done with that action, it sets `done` to true.
// ...
// Wait till the background task is done
while (!done) {
// Run the runloop for 10 ms, then check again
[[NSRunLoop currentRunLoop]
runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.01]
];
}
}
#end
Without volatile, the compiler may be dumb enough to assume, that done will never change here and replace !done simply with true. And while (true) is an endless loop that will never terminate.
I haven't tested that with modern compilers. Maybe the current version of clang is more intelligent than that. It may also depend on how you start the background task. If you dispatch a block, the compiler can actually easily see whether it changes done or not. If you pass a reference to done somewhere, the compiler knows that the receiver may the value of done and will not make any assumptions. But I tested exactly that code a long time ago when Apple was still using GCC 2.x and there not using volatile really caused an endless loop that never terminated (yet only in release builds with optimizations enabled, not in debug builds). So I would not rely on the compiler being clever enough to do it right.
Just some more fun facts about memory barriers:
If you ever had a look at the atomic operations that Apple offers in <libkern/OSAtomic.h>, then you might have wondered why every operation exists twice: Once as x and once as xBarrier (e.g. OSAtomicAdd32 and OSAtomicAdd32Barrier). Well, now you finally know it. The one with "Barrier" in its name is a memory barrier, the other one isn't.
Memory barriers are not just for compilers, they are also for CPUs (there exists CPU instructions, that are considered memory barriers while normal instructions are not). The CPU needs to know these barriers because CPUs like to reorder instructions to perform operations out of order. E.g. if you do
a = x + 3 // (1)
b = y * 5 // (2)
c = a + b // (3)
and the pipeline for additions is busy, but the pipeline for multiplication is not, the CPU may perform instruction (2) before (1), after all the order won't matter in the end. This prevents a pipeline stall. Also the CPU is clever enough to know that it cannot perform (3) before either (1) or (2) because the result of (3) depends on the results of the other two calculations.
Yet, certain kinds of order changes will break the code, or the intention of the programmer. Consider this example:
x = y + z // (1)
a = 1 // (2)
The addition pipe might be busy, so why not just perform (2) before (1)? They don't depend on each other, the order shouldn't matter, right? Well, it depends. Consider another thread monitors a for changes and as soon as a becomes 1, it reads the value of x, which should now be y+z if the instructions were performed in order. Yet if the CPU reordered them, then x will have whatever value it used to have before getting to this code and this makes a difference as the other thread will now work with a different value, not the value the programmer would have expected.
So in this case the order will matter and that's why barriers are needed also for CPUs: CPUs don't order instructions across such barriers and thus instruction (2) would need to be a barrier instruction (or there needs to be such an instruction between (1) and (2); that depends on the CPU). However, reordering instructions is only performed by modern CPUs, a much older problem are delayed memory writes. If a CPU delays memory writes (very common for some CPUs, as memory access is horribly slow for a CPU), it will make sure that all delayed writes are performed and have completed before a memory barrier is crossed, so all memory is in a correct state in case another thread might now access it (and now you also know where the name "memory barrier" actually comes from).
You are probably working a lot more with memory barriers than you are even aware of (GCD - Grand Central Dispatch is full of these and NSOperation/NSOperationQueue bases on GCD), that's why your really need to use volatile only in very rare, exceptional cases. You might get away writing 100 apps and never have to use it even once. However, if you write a lot low level, multi-threading code that aims to achieve maximum performance possible, you will sooner or later run into a situation where only volatile can grantee you correct behavior; not using it in such a situation will lead to strange bugs where loops don't seem to terminate or variables simply seem to have incorrect values and you find no explanation for that. If you run into bugs like these, especially if you only see them in release builds, you might miss a volatile or a memory barrier somewhere in your code.
A good explanation is given here: Understanding “volatile” qualifier in C
The volatile keyword is intended to prevent the compiler from applying any optimizations on objects that can change in ways that cannot be determined by the compiler.
Objects declared as volatile are omitted from optimization because their values can be changed by code outside the scope of current code at any time. The system always reads the current value of a volatile object from the memory location rather than keeping its value in temporary register at the point it is requested, even if a previous instruction asked for a value from the same object. So the simple question is, how can value of a variable change in such a way that compiler cannot predict. Consider the following cases for answer to this question.
1) Global variables modified by an interrupt service routine outside the scope: For example, a global variable can represent a data port (usually global pointer referred as memory mapped IO) which will be updated dynamically. The code reading data port must be declared as volatile in order to fetch latest data available at the port. Failing to declare variable as volatile, the compiler will optimize the code in such a way that it will read the port only once and keeps using the same value in a temporary register to speed up the program (speed optimization). In general, an ISR used to update these data port when there is an interrupt due to availability of new data
2) Global variables within a multi-threaded application: There are multiple ways for threads communication, viz, message passing, shared memory, mail boxes, etc. A global variable is weak form of shared memory. When two threads sharing information via global variable, they need to be qualified with volatile. Since threads run asynchronously, any update of global variable due to one thread should be fetched freshly by another consumer thread. Compiler can read the global variable and can place them in temporary variable of current thread context. To nullify the effect of compiler optimizations, such global variables to be qualified as volatile
If we do not use volatile qualifier, the following problems may arise
1) Code may not work as expected when optimization is turned on.
2) Code may not work as expected when interrupts are enabled and used.
volatile comes from C. Type "C language volatile" into your favourite search engine (some of the results will probably come from SO), or read a book on C programming. There are plenty of examples out there.
I am working on CoreImage on IOS, you guys know CIContext and CIImage are immutable so they can be shared on threads. The problem is i am not sure why objects' mutability is closely relevant to its thread safe.
I can guess the rough reason is to prevent multiple threads from do something on a certain object at the same time. But can anybody provide some theoretical evidence or give a specific answer to it?
You are correct, mutable objects are not thread safe because multiple threads can write to that data at the same time. This is in contrast to reading data, an operation multiple threads can do simultaneously without causing problems. Immutable types can only be read.
However, when multiple threads are writing to the same data, these operations may interfere. For example, an NSMutableArray which two threads are editing could easily be corrupted. One thread is editing at one location, suddenly changing the memory the other thread was updating. iOS will use what is called an atomic operation for simple data types. What this means is that iOS requires the edit operation to fully finish before anything else can happen. This has an efficiency advantage over locks. Just Google about this stuff if you want to know more.
Well, actually, you are right.
If you have an immutable objects, all you can do is to read data from them. And every thread will get the same data, since the object can not be changed.
When you have a mutable object, the problem is that (in general) read and write operations are not atomic. It means that they are not performed instantaneously, they take time. So they can overlap each other. Even if we have a single-core processor, it can switch between threads at arbitrary moments of time. Possible problems it might cause:
1) Two threads write at the same object simultaneously. In this case the object might become corrupted (for instance, half of data comes from the first thread, the other half – from the second, the result is unpredictable.
2) One thread writes the data, another one reads it. One problem is that thread might read already outdated data. Another one is that it might read a corrupted data (if the writing thread haven't finished writing yet).
Take the example with an array of values. Let's say I'm doing some computation and I find the count of the array to be 4
var array = [1,2,3,4]
let count = array.count
Now maybe I want to loop through all the elements in the array, so I setup a loop that goes through index i<4 or something along those lines. So far so good.
This array is mutable though, so on a separate thread, I could easily remove an element. So perhaps I'm on thread 1 and I get the count to be 4, I perhaps start looping through the array. Now we switch to thread 2 and I remove an element, so now this same array has only 3 values in it. If I end up going back to thread 1 and I still assume I have 4 values while looping through the array, my program is going to crash when it tries to access the 4th element.
This is why immutability is desirable, it can guarantee some level of consistency across threads.