I have encountered a data race in my app using Xcode's Thread Sanitizer and I have a question on how to address it.
I have a var defined as:
var myDict = [Double : [Date:[String:Any]]]()
I have a thread setup where I call a setup() function:
let queue = DispatchQueue(label: "my-queue", qos: .utility)
queue.async {
self.setup {
}
}
My setup() function essentially loops through tons of data and populates myDict. This can take a while, which is why we need to do it asynchronously.
On the main thread, my UI accesses myDict to display its data. In a cellForRow: method:
if !myDict.keys.contains(someObject) {
//Do something
}
And that is where I get my data race alert and the subsequent crash.
Exception NSException * "-[_NSCoreDataTaggedObjectID objectForKey:]:
unrecognized selector sent to instance
0x8000000000000000" 0x0000000283df6a60
Please kindly help me understand how to access a variable in a thread safe manner in Swift. I feel like I'm possibly half way there with setting, but I'm confused on how to approach getting on the main thread.
One way to access it asynchronously:
typealias Dict = [Double : [Date:[String:Any]]]
var myDict = Dict()
func getMyDict(f: #escaping (Dict) -> ()) {
queue.async {
DispatchQueue.main.async {
f(myDict)
}
}
}
getMyDict { dict in
assert(Thread.isMainThread)
}
Making the assumption, that queue possibly schedules long lasting closures.
How it works?
You can only access myDict from within queue. In the above function, myDict will be accessed on this queue, and a copy of it gets imported to the main queue. While you are showing the copy of myDict in a UI, you can simultaneously mutate the original myDict. "Copy on write" semantics on Dictionary ensures that copies are cheap.
You can call getMyDict from any thread and it will always call the closure on the main thread (in this implementation).
Caveat:
getMyDict is an async function. Which shouldn't be a caveat at all nowadays, but I just want to emphasise this ;)
Alternatives:
Swift Combine. Make myDict a published Value from some Publisher which implements your logic.
later, you may also consider to use async & await when it is available.
Preface: This will be a pretty long non-answer. I don't actually know what's wrong with your code, but I can share the things I do know that can help you troubleshoot it, and learn some interesting things along the way.
Understanding the error
Exception NSException * "-[_NSCoreDataTaggedObjectID objectForKey:]: unrecognized selector sent to instance 0x8000000000000000"
An Objective C exception was thrown (and not caught).
The exception happened when attempting to invoke -[_NSCoreDataTaggedObjectID objectForKey:]. This is a conventional way to refer to an Objective C method in writing. In this case, it's:
An instance method (hence the -, rather than a + that would be used for class methods)
On the class _NSCoreDataTaggedObjectID (more on this later)
On the method named objectForKey:
The object receiving this method invocation is the one with address 0x8000000000000000.
This is a pretty weird address. Something is up.
Another hint is the strange class name of _NSCoreDataTaggedObjectID. There's a few observations we can make about it:
The prefixed _NS suggests that it's an internal implementation detail of CoreData.
We google the name to find class dumps of the CoreData framework, which show us that:
_NSCoreDataTaggedObjectID subclasses _NSScalarObjectID
Which subclasses _NSCoreManagedObjectID
Which subclasses NSManagedObjectID
NSManagedObjectID is a public API, which has its own first-party documentation.
It has the word "tagged" in its name, which has a special meaning in the Objective C world.
Some back story
Objective C used message passing as its sole mechanism for method dispatch (unlike Swift which usually prefers static and v-table dispatch, depending on the context). Every method call you wrote was essentially syntactic sugar overtop of objc_msgSend (and its variants), passing to it the receiver object, the selector (the "name" of the method being invoked) and the arguments. This was a special function that would do the job of checking the class of the receiver object, and looking through that classes' hierarchy until it found a method implementation for the desired selector.
This was great, because it allows you to do a lot of cool runtime dynamic behaviour. For example, menu bar items on a macOS app would just define the method name they invoke. Clicking on them would "send that message" to the responder chain, which would invoke that method on the first object that had an implementation for it (the lingo is "the first object that answers to that message").
This works really well, but has several trade-offs. One of them was that everything had to be an object. And by object, we mean a heap-allocated memory region, whose first several words of memory stored meta-data for the object. This meta-data would contain a pointer to the class of the object, which was necessary for doing the method-loopup process in objc_msgSend as I just described.
The issue is, that for small objects, (particularly NSNumber values, small strings, empty arrays, etc.) the overhead of these several words of object meta-data might be several times bigger than the actual object data you're interested in. E.g. even though NSNumber(value: true /* or false */) stores a single bit of "useful" data, on 64 bit systems there would be 128 bits of object overhead. Add to that all the malloc/free and retain/release overhead associated with dealing with large numbers of tiny object, and you got a real performance issue.
"Tagged pointers" were a solution to this problem. The idea is that for small enough values of particular privileged classes, we won't allocate heap memory for their objects. Instead, we'll store their objects' data directly in their pointer representation. Of course, we would need a way to know if a given pointer is a real pointer (that points to a real heap-allocated object), or a "fake pointer" that encodes data inline.
The key realization that malloc only ever returns memory aligned to 16-byte boundaries. This means that 4 bits of every memory address were always 0 (if they weren't, then it wouldn't have been 16-byte aligned). These "unused" 4 bits could be employed to discriminate real pointers from tagged pointers. Exactly which bits are used and how differs between process architectures and runtime versions, but the general idea is the same.
If a pointer value had 0000 for those 4 bits then the system would know it's a real object pointer that points to a real heap-allocated object. All other possible values of those 4-bit values could be used to signal what kind of data is stored in the remaining bits. The Objective C runtime is actually opensource, so you can actually see the tagged pointer classes and their tags:
{
// 60-bit payloads
OBJC_TAG_NSAtom = 0,
OBJC_TAG_1 = 1,
OBJC_TAG_NSString = 2,
OBJC_TAG_NSNumber = 3,
OBJC_TAG_NSIndexPath = 4,
OBJC_TAG_NSManagedObjectID = 5,
OBJC_TAG_NSDate = 6,
// 60-bit reserved
OBJC_TAG_RESERVED_7 = 7,
// 52-bit payloads
OBJC_TAG_Photos_1 = 8,
OBJC_TAG_Photos_2 = 9,
OBJC_TAG_Photos_3 = 10,
OBJC_TAG_Photos_4 = 11,
OBJC_TAG_XPC_1 = 12,
OBJC_TAG_XPC_2 = 13,
OBJC_TAG_XPC_3 = 14,
OBJC_TAG_XPC_4 = 15,
OBJC_TAG_NSColor = 16,
OBJC_TAG_UIColor = 17,
OBJC_TAG_CGColor = 18,
OBJC_TAG_NSIndexSet = 19,
OBJC_TAG_NSMethodSignature = 20,
OBJC_TAG_UTTypeRecord = 21,
// When using the split tagged pointer representation
// (OBJC_SPLIT_TAGGED_POINTERS), this is the first tag where
// the tag and payload are unobfuscated. All tags from here to
// OBJC_TAG_Last52BitPayload are unobfuscated. The shared cache
// builder is able to construct these as long as the low bit is
// not set (i.e. even-numbered tags).
OBJC_TAG_FirstUnobfuscatedSplitTag = 136, // 128 + 8, first ext tag with high bit set
OBJC_TAG_Constant_CFString = 136,
OBJC_TAG_First60BitPayload = 0,
OBJC_TAG_Last60BitPayload = 6,
OBJC_TAG_First52BitPayload = 8,
OBJC_TAG_Last52BitPayload = 263,
OBJC_TAG_RESERVED_264 = 264
You can see, strings, index paths, dates, and other similar "small and numerous" classes all have reserved pointer tag values. For each of these "normal classes" (NSString, NSDate, NSNumber, etc.), there's a special internal subclass which implements all the same public API, but using a tagged pointer instead of a regular object.
As you can see, there's a value for OBJC_TAG_NSManagedObjectID. It turns out, that NSManagedObjectID objects were numerous and small enough that they would benefit greatly for this tagged-pointer representation. After all, the value of NSManagedObjectID might be a single integer, much like NSNumber, which would be wasteful to heap-allocate.
If you'd like to learn more about tagged pointers, I'd recommend Mike Ash's writings, such as https://www.mikeash.com/pyblog/friday-qa-2012-07-27-lets-build-tagged-pointers.html
There was also a recent WWDC talk on the subject: WWDC 2020 - Advancements in the Objective-C runtime
The strange address
So in the previous section we found out that _NSCoreDataTaggedObjectID is the tagged-pointer subclass of NSManagedObjectID. Now we can notice something else that's strange, the pointer value we saw had a lot of zeros: 0x8000000000000000. So what we're dealing with is probably some kind of uninitialized-state of an object.
Conclusion
The call stack can shed further light on where this happens exactly, but what we know is that somewhere in your program, the objectForKey: method is being invoked on an uninitialized value of NSManagedObjectID.
You're probably accessing a value too-early, before it's properly initialized.
To work around this you can take one of several approaches:
A future ideal world, use would just use the structured concurrency of Swift 5.5 (once that's available on enough devices) and async/await to push the work on the background and await the result.
Use a completion handler to invoke your value-consuming code only after the value is ready. This is most immediately-easy, but will blow up your code base with completion handler boilerplate and bugs.
Use a concurrency abstraction library, like Combine, RxSwift, or PromiseKit. This will be a bit more work to set up, but usually leads to much clearer/safer code than throwing completion handlers in everywhere.
The basic pattern to achieve thread safety is to never mutate/access the same property from multiple threads at the same time. The simplest solution is to just never let any background queue interact with your property directly. So, create a local variable that the background queue will use, and then dispatch the updating of the property to the main queue.
Personally, I wouldn't have setup interact with myDict at all, but rather return the result via the completion handler, e.g.
// properties
var myDict = ...
private let queue = DispatchQueue(label: "my-queue", qos: .utility) // some background queue on which we'll recalculate what will eventually be used to update `myProperty`
// method doesn't reference `myDict` at all, but uses local var only
func setup(completion: #escaping (Foo) -> Void) {
queue.async {
var results = ... // some local variable that we'll use as we're building up our results
// do time-consuming population of `results` here;
// do not touch `myDict` here, though;
// when all done, dispatch update of `myDict` back to the main queue
DispatchQueue.main.async { // dispatch update of property back to the main queue
completion(results)
}
}
}
Then the routine that calls setup can update the property (and trigger necessary UI update, too).
setup { results in
self.myDict = results
// also trigger UI update here, too
}
(Note, your closure parameter type (Foo in my example) would be whatever type myDict is. Maybe a typealias as advised elsewhere, or better, use custom types rather than dictionary within dictionary within dictionary. Use whatever type you’d prefer.)
By the way, your question’s title and preamble talks about TSAN and thread safety, but you then share a “unrecognized selector” exception, which is a completely different issue. So, you may well have two completely separate issues going on. A TSAN data race error would have produced a very different message. (Something like the error I show here.) Now, if setup is mutating myDict from a background thread, that undoubtedly will lead to thread-safety problems, but your reported exception suggests there might also be some other problem, too...
How can I detect if malloc fails in swift?
The end goal is to simply allocate the required amount of space, and if ios can not allocate it, report this elegantly to the user (instead of being terminated).
When I try the code below, the pointer is never nil and errno is always 0.
let pointer : UnsafeMutableRawPointer? = malloc(fileSize)
print("errno = \(errno)")
if (pointer == nil) {
print("Malloc failed")
}
Why are you using malloc in Swift, at all?
let pointer = UnsafeMutablePointer<UInt8>.allocate(capacity: fileSize)
More to the point, under what circumstance does reading a file need you to manually allocate memory like this? 🤔 Foundation provides APIs for reading files directly into Data.
LLVM has an optimization which elides unused malloc calls. If your test code never actually needs the allocation, it will never be performed.
I wanted to see if you could pass struct through the stack and I manage to get a local var from a void function in another void function.
Do you guys thinks there is any use to that and is there any chance you can get corrupted data between the two function call ?
Here's the Code in C (I know it's dirty)
#include <stdio.h>
typedef struct pouet
{
int a,b,c;
char d;
char * e;
}Pouet;
void test1()
{
Pouet p1;
p1.a = 1;
p1.b = 2;
p1.c = 3;
p1.d = 'a';
p1.e = "1234567890";
printf("Declared struct : %d %d %d %c \'%s\'\n", p1.a, p1.b, p1.c, p1.d, p1.e);
}
void test2()
{
Pouet p2;
printf("Element of struct undeclared : %d %d %d %c \'%s\'\n", p2.a, p2.b, p2.c, p2.d, p2.e);
p2.a++;
}
int main()
{
test1();
test2();
test2();
return 0;
}
Output is :
Declared struct : 1 2 3 a '1234567890'
Element of struct undeclared : 1 2 3 a '1234567890'
Element of struct undeclared : 2 2 3 a '1234567890'
Contrary to the opinion of the majority, I think it can work out in most of the cases (not that you should rely on it, though).
Let's check it out. First you call test1, and it gets a new stack frame: the stack pointer which signifies the top of the stack goes up. On that stack frame, besides other things, memory for your struct (exactly the size of sizeof(struct pouet)) is reserved and then initialized. What happens when test1 returns? Does its stack frame, along with your memory, get destroyed?
Quite the opposite. It stays on the stack. However, the stack pointer drops below it, back into the calling function. You see, this is quite a simple operation, it's just a matter of changing the stack pointer's value. I doubt there is any technology that clears a stack frame when it is disposed. It's just too costy a thing to do!
What happens then? Well, you call test2. All it stores on the stack is just another instance of struct pouet, which means that its stack frame will most probably be exactly the same size as that of test1. This also means that test2 will reserve the memory that previously contained your initialized struct pouet for its own variable Pouet p2, since both variables should most probably have the same positions relative to the beginning of the stack frame. Which in turn means that it will be initialized to the same value.
However, this setup is not something to be relied upon. Even with concerns about non-standartized behaviour aside, it's bound to be broken by something as simple as a call to a different function between the calls to test1 and test2, or test1 and test2 having stack frames of different sizes.
Also, you should take compiler optimizations into account, which could break things too. However, the more similar your functions are, the less chances there are that they will receive different optimization treatment.
Of course there's a chance you can get corrupted data; you're using undefined behavior.
What you have is undefined behavior.
printf("Element of struct undeclared : %d %d %d %c \'%s\'\n", p2.a, p2.b, p2.c, p2.d, p2.e);
The scope of the variable p2 is local to function test2() and as soon as you exit the function the variable is no more valid.
You are accessing uninitialized variables which will lead to undefined behavior.
The output what you see is not guaranteed at all times and on all platforms. So you need to get rid of the undefined behavior in your code.
The data may or may not appear in test2. It depends on exactly how the program was compiled. It's more likely to work in a toy example like yours than in a real program, and it's more likely to work if you turn off compiler optimizations.
The language definition says that the local variable ceases to exist at the end of the function. Attempting to read the address where you think it was stored may or may produce a result; it could even crash the program, or make it execute some completely unexpected code. It's undefined behavior.
For example, the compiler might decide to put a variable in registers in one function but not in the other, breaking the alignment of variables on the stack. It can even do that with a big struct, splitting it into several registers and some stack — as long as you don't take the address of the struct it doesn't need to exist as an addressable chunk of memory. The compiler might write a stack canary on top of one of the variables. These are just possibilities at the top of my head.
C lets you see a lot behind the scenes. A lot of what you see behind the scenes can completely change from one production compilation or run to the next.
Understanding what's going on here is useful as a debugging skill, to understand where values that you see in a debugger might be coming from. As a programming technique, this is useless since you aren't making the computer accomplish any particular result.
Just because this works for one compiler doesn't mean that it will for all. How uninitialized variables are handled is undefined and one computer could very well init pointers to null etc without breaking any rules.
So don't do this or rely on it. I have actually seen code that depended on functionality in mysql that was a bug. When that was fixed in later versions the program stopped working. My thoughts about the designer of that system I'll keep to myself.
In short, never rely on functionality that is not defined. If you knowingly use it for a specific function and you are prepared that an update to the compiler etc can break it and you keep an eye out for this at all times it might be something you could explain and live with. But most of the time this is far from a good idea.
If not, what is the standard way to free up cudaMalloced memory when an exception is thrown? (Note that I am unable to use Thrust.)
You can use RAII idiom and put your cudaMalloc() and cudaFree() calls to the constructor and destructor of your object respectively.
Once the exception is thrown your destructor will be called which will free the allocated memory.
If you wrap this object into a smart-pointer (or make it behave like a pointer) you will get your CUDA smart-pointer.
You can use this custom cuda::shared_ptr implementation. As mentioned above, this implementation uses std::shared_ptr as a wrapper for CUDA device memory.
Usage Example:
std::shared_ptr<T[]> data_host = std::shared_ptr<T[]>(new T[n]);
.
.
.
// In host code:
fun::cuda::shared_ptr<T> data_dev;
data_dev->upload(data_host.get(), n);
// In .cu file:
// data_dev.data() points to device memory which contains data_host;
This repository is indeed a single header file (cudasharedptr.h), so it will be easy to manipulate it if is necessary for your application.
This function is in a loop. When I run the program, the line with IntPtr is giving me memory problems, I've put delete[], but it still doesn't solve the memory problem, can anyone help please? thanks
void showImage(IplImage *img,System::Windows::Forms::PictureBox^ picturebox)
{
IntPtr ip(new unsigned char[img->widthStep*img->height]); // this line causing memory usage to keep going up very fast
//memcpy(ip.ToPointer(),img->imageData,img->widthStep*img->height);
//picturebox->Image = gcnew Bitmap(img->width,img->height, img->widthStep, System:rawing::Imaging::PixelFormat::Format24bppRgb, ip);
delete[] ip;
}
This is C++\CLI
It is rather sad that this code compiles, but that is by design. The delete operator applied to a managed type doesn't actually free any memory. It calls the IDisposable::Dispose() method on the passed object. It is rather sad that this even works, the IntPtr gets boxed to turn it into an object and then checked to see if it implements the IDisposable interface. It doesn't of course, nothing happens.
You have to pass the pointer that you got back from the new operator. Don't forget to do this in a finally block so an exception cannot cause a leak.
Btw, there are more complications in the code that you commented. The Bitmap constructor you use requires you to keep the IntPtr valid, you cannot release the memory until the Bitmap is no longer used. So using delete isn't actually valid. Consider using Bitmap.LockBits() instead to get a pointer to a Bitmap that manages its own memory. And watch out for stride.