I read some code that checks data and updates the UI every second. It sounds like what we would usually do with NSTimer's scheduledTimerWithTimeInterval:, but this code is implemented by recursively calling dispatch_after, like this:
- (void)retriggerMethod {
    // ... do stuff here, assuming you want to do it on the first invocation ...
    dispatch_after( ..., ^{
        [self retriggerMethod];
    });
}
What's the difference between dispatch_after recursion and NSTimer's scheduledTimerWithTimeInterval:? Is there any potential risk in using the former? I thought that as long as the recursion never ends, the call stack would keep growing.
NSTimer:
1. Needs an NSRunLoop.
2. Can repeat.
3. Can be invalidated at any time if you want to cancel it.
4. Calls back through a target and selector.
5. High-level API.

dispatch_after:
1. Can run anywhere you want, on any dispatch queue.
2. Can't repeat by itself.
3. Can't be cancelled.
4. Runs a block.
5. Low-level GCD API.
You actually don't have any recursion in your code. Using dispatch_after in this way behaves similarly to using an NSTimer.
With dispatch_after, the next call to your retriggerMethod originates from GCD, which executes the enqueued block. So you don't have a call within a call; there is no recursion here.
So in terms of recursion there is no risk, though in this case it seems more reasonable to use an NSTimer, which would be easier to cancel if necessary.
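For reference, a minimal sketch of an NSTimer-based equivalent (the property, the helper method names, and the one-second interval are assumptions, not part of the original code):

@property (nonatomic, strong) NSTimer *refreshTimer;   // declared in a class extension

- (void)startRefreshing {
    self.refreshTimer = [NSTimer scheduledTimerWithTimeInterval:1.0
                                                          target:self
                                                        selector:@selector(timerFired:)
                                                        userInfo:nil
                                                         repeats:YES];
}

- (void)timerFired:(NSTimer *)timer {
    [self retriggerMethod];                 // the existing per-tick work
}

- (void)stopRefreshing {
    [self.refreshTimer invalidate];         // cancelling is a single call
    self.refreshTimer = nil;
}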
This is really late, but here are a couple of additional considerations now:
cancelling: the difference is not a huge deal
dispatch_after also accepts work items, which can be cancelled about as easily as a timer can.
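A rough sketch of a cancellable dispatch_after using a dispatch_block_t created with dispatch_block_create (available since iOS 8 / OS X 10.10); the one-second delay is an assumption:

dispatch_block_t work = dispatch_block_create(0, ^{
    [self retriggerMethod];
});
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(1.0 * NSEC_PER_SEC)),
               dispatch_get_main_queue(), work);

// Later, if the delayed call is no longer wanted:
dispatch_block_cancel(work);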
time slipping
For a high number of repetitions, dispatch_after recursion could slip in accuracy by a lot. Each slip is compounded into the next iteration, since the next call is scheduled only after the previous one has executed. Say, for example, that dispatch_after may slip by up to 10 ms. That means that after 100 iterations it may have slipped by up to 1 second.
In comparison, while each tick of the NSTimer may also be up to 10 ms off (for example), later ticks won't ever slip by more than 10 ms, since each tick time is relative to the original scheduled time.
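If you did want to stick with dispatch_after, one way to avoid the compounding is to compute every tick relative to one fixed anchor time instead of relative to "now". A rough sketch (the tick bookkeeping and the one-second spacing are illustrative):

- (void)scheduleTick:(uint64_t)tick fromAnchor:(dispatch_time_t)anchor {
    // Each tick is offset from the same anchor, so an individual slip is not carried forward.
    dispatch_time_t when = dispatch_time(anchor, (int64_t)(tick * NSEC_PER_SEC));
    dispatch_after(when, dispatch_get_main_queue(), ^{
        [self retriggerMethod];
        [self scheduleTick:tick + 1 fromAnchor:anchor];
    });
}

// Kick off with:
// [self scheduleTick:1 fromAnchor:dispatch_time(DISPATCH_TIME_NOW, 0)];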
How should I implement a lock/unlock sequence with compare-and-swap in a Metal compute shader?
I’ve tested this sample code but it does not seem to work. For some reason, the threads are not detecting that the lock was released.
Here is a brief explanation of the code below:
The depthFlag is an array of atomic_bools. In this simple example, I simply try to do a lock by comparing the content of depthFlag[1]. I then go ahead and do my operation and once the operation is done, I do an unlock.
As stated above, only one thread is able to do the locking/work/unlocking but the rest of the threads get stuck in the while loop. They NEVER leave. I expect another thread to detect the unlock and go through the sequence.
What am I doing wrong? My knowledge on CAS is limited, so I appreciate any tips.
kernel void testFunction(device float *depthBuffer [[buffer(4)]],
                         device atomic_bool *depthFlag [[buffer(5)]],
                         uint index [[thread_position_in_grid]]) {
    // lock
    bool expected = false;
    while (!atomic_compare_exchange_weak_explicit(&depthFlag[1], &expected, true,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        // wait
        expected = false;
    }

    // Do my operation here

    // unlock
    atomic_store_explicit(&depthFlag[1], false, memory_order_relaxed);

    // barrier
}
You essentially can't use the locking programming model for GPU concurrency. For one, the relaxed memory order model (the only one available) is not suitable for this; for another, you can't guarantee that other threads will make progress between your atomic operations. Your code must always be able to make progress, regardless of what the other threads are doing.
My recommendation is that you use something like the following model instead:
1. Read the atomic value to check whether another thread has already completed the operation in question.
2. If no other thread has done it yet, perform the operation, but don't cause any side effects yet, i.e. don't write to device memory.
3. Perform an atomic operation to indicate that your thread has completed the operation, while checking whether another thread got there first (e.g. compare-and-swap a boolean, but incrementing a counter also works).
4. If another thread got there first, don't perform the side effects.
5. If your thread "won" and no other thread registered completion, perform your operation's side effects, e.g. do whatever you need to do to write out the result.
This works well if there's not much competition, and if the result does not vary depending on which thread performs the operation.
The occasional discarded work should not matter. If there is significant competition, use thread groups; within a thread group, the threads can coordinate which thread will perform which operation. You may still end up with wasted computation from competition between groups. If this is a problem, you may need to change your approach more fundamentally.
If the results of the operation are not deterministic, and the threads all need to proceed using the same result, you will need to change your approach. For example, split your kernels up so any computation which depends on the result of the operation in question runs in a sequentially queued kernel.
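A rough sketch of this claim-then-write model as a Metal kernel, using a counter to register completion (the buffer layout and the stand-in computation are assumptions for illustration):

#include <metal_stdlib>
using namespace metal;

kernel void claimAndWrite(device float       *result [[buffer(0)]],
                          device atomic_uint *done   [[buffer(1)]],
                          uint index [[thread_position_in_grid]]) {
    // 1. Check whether another thread has already completed the operation.
    if (atomic_load_explicit(done, memory_order_relaxed) != 0) {
        return;
    }

    // 2. Perform the operation locally, with no side effects on device memory yet.
    float value = sin(float(index)) * 0.5f;   // stand-in for the real work

    // 3. Register completion; the thread that sees the old value 0 "wins".
    uint prior = atomic_fetch_add_explicit(done, 1u, memory_order_relaxed);

    // 4./5. Only the winner performs the side effect; every other thread discards its work.
    if (prior == 0) {
        result[0] = value;
    }
}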
I'm trying to create a label that shows the time on an NSTimer. My problem is that the interval on the timer isn't 1.0, so there is no way to update the label every second like I would like to. I have tried to synchronize two timers, but that is proving to be a challenge. So, is there a way to get per-second updates, or to synchronize two timers with different intervals?
I'm reading your question as follows:
I have a timer that needs to fire once every 5 seconds, but I would like to tell the user how many seconds remain until the timer fires.
The simplest way that I can think of is to make an intermediary method that the first timer will call. You would change:
timer -> METHOD_A
to
timer -> METHOD_B -> METHOD_A
The timer could then be set to fire every 0.1 seconds, and METHOD_B could keep track of the time and only call METHOD_A when 5 seconds have passed since the previous call.
For what it's worth, though, I don't think NSTimer will slip. When you are updating the time you are probably doing something like time = time + interval, where it might make more sense to compute (currentTime - startingTime) % interval; then the synchronization shouldn't be a problem.
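Putting both ideas together, here is a rough sketch (the property and method names are illustrative, and the 0.1-second tick and 5-second period are assumptions):

- (void)startCountdown {
    self.startDate = [NSDate date];
    self.lastFireElapsed = 0;
    self.displayTimer = [NSTimer scheduledTimerWithTimeInterval:0.1
                                                           target:self
                                                         selector:@selector(methodB:)
                                                         userInfo:nil
                                                          repeats:YES];
}

- (void)methodB:(NSTimer *)timer {
    // Elapsed time measured from the original start, so slip can't accumulate.
    NSTimeInterval elapsed = [[NSDate date] timeIntervalSinceDate:self.startDate];

    // Update the label with the seconds remaining until the next 5-second fire.
    NSTimeInterval remaining = 5.0 - fmod(elapsed, 5.0);
    self.timeLabel.text = [NSString stringWithFormat:@"%.0f", ceil(remaining)];

    // Call METHOD_A once every 5 seconds.
    if (elapsed - self.lastFireElapsed >= 5.0) {
        self.lastFireElapsed = elapsed;
        [self methodA];
    }
}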
Can NSTimer be used to fire a series of events? For instance, for effect:
Kick it off: click "start" to toss.
Create a random number.
Wait 5 seconds, then show the result.
Wait 3 seconds, then start the match?
You can use it to repeat at a given interval, but not at a variable one. If you really want to wait 5 seconds and then wait another 3 seconds, you would probably want to chain timers: when the first timer fires and calls a method, that method creates a second timer with a different time interval.
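A rough sketch of chaining one-shot timers in this way (the intervals follow your example; all names and the random-number range are illustrative):

- (void)startTapped {
    self.randomNumber = arc4random_uniform(100);    // create the random number
    [NSTimer scheduledTimerWithTimeInterval:5.0
                                      target:self
                                    selector:@selector(showResult:)
                                    userInfo:nil
                                     repeats:NO];
}

- (void)showResult:(NSTimer *)timer {
    // ... show the result of the toss ...
    [NSTimer scheduledTimerWithTimeInterval:3.0
                                      target:self
                                    selector:@selector(startMatch:)
                                    userInfo:nil
                                     repeats:NO];
}

- (void)startMatch:(NSTimer *)timer {
    // ... start the match ...
}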
This is actually a case where the Prototype Pattern would apply: make an NSTimer, set it up with all the properties you want, and then clone that object each time you need another. Or you could just make a factory. Objective-C does not have a clone method, but the NSCoding protocol is a workable and proper way of doing cloning (unlike Java's broken, and abandoned, clone interface).
The epoll_wait, select and poll functions all provide a timeout. However, with epoll it's at a coarse resolution of 1 ms; select and ppoll are the only ones providing a sub-millisecond timeout.
That would mean doing other things at 1 ms intervals at best, and I could do a lot of other things within 1 ms on a modern CPU.
So to do other things more often than once per millisecond, I actually have to provide a timeout of zero, essentially disabling the timeout, and I'd probably add my own usleep somewhere in the main loop to stop it chewing up too much CPU.
So the question is: why is the timeout in milliseconds, when there is clearly a case for a higher-resolution timeout?
Since you are on Linux, instead of providing a zero timeout value and manually usleeping in the loop body, you could simply use the timerfd API. This essentially lets you create a timer (with a resolution finer than 1 ms) associated with a file descriptor, which you can add to the set of monitored descriptors.
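A rough sketch of that approach (error handling is omitted, and the 250 µs period is just an example):

#include <sys/timerfd.h>
#include <sys/epoll.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    int epfd = epoll_create1(0);
    int tfd  = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK);

    // Arm a periodic timer with a 250 microsecond period (well below 1 ms).
    struct itimerspec its = {
        .it_interval = { .tv_sec = 0, .tv_nsec = 250000 },
        .it_value    = { .tv_sec = 0, .tv_nsec = 250000 },
    };
    timerfd_settime(tfd, 0, &its, NULL);

    // Monitor the timer descriptor alongside everything else.
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = tfd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev);

    for (;;) {
        struct epoll_event events[8];
        int n = epoll_wait(epfd, events, 8, -1);   // block until something is ready
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == tfd) {
                uint64_t expirations;
                read(tfd, &expirations, sizeof expirations);   // acknowledge the timer
                // ... do the frequent work here ...
            }
        }
    }
}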
The epoll_wait interface simply inherited its millisecond timeout from poll. While it doesn't make sense to poll for less than a millisecond (because of the overhead of adding the calling thread to every wait set), it does make sense for epoll_wait: a call to epoll_wait never puts the calling thread onto more than one wait set, the calling overhead is very low, and on rare occasions it can make sense to block for less than a millisecond.
I'd recommend just using a timing thread. Most of what you would want to do can just be done in that timing thread, so you won't need to break out of epoll_wait. If you do need to make a thread return from epoll_wait, just send a byte to a pipe that thread is polling and the wait will terminate.
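A rough sketch of the pipe trick (names are illustrative; error handling is omitted):

#include <sys/epoll.h>
#include <unistd.h>

static int wake_pipe[2];   // [0] = read end (added to epoll), [1] = write end

void setup_wakeup(int epfd) {
    pipe(wake_pipe);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = wake_pipe[0] };
    epoll_ctl(epfd, EPOLL_CTL_ADD, wake_pipe[0], &ev);
}

void wake_epoll_thread(void) {
    char b = 0;
    write(wake_pipe[1], &b, 1);   // epoll_wait in the polling thread now returns
}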
In Linux 5.11, an epoll_pwait2 API was added, which takes a struct timespec as the timeout. This means you can now wait with nanosecond precision.
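For example, a wait of roughly 250 µs could look like this (requires Linux 5.11+ and a glibc that exposes epoll_pwait2):

#define _GNU_SOURCE
#include <sys/epoll.h>
#include <time.h>

int wait_250us(int epfd, struct epoll_event *events, int maxevents) {
    struct timespec timeout = { .tv_sec = 0, .tv_nsec = 250000 };   // 250 microseconds
    return epoll_pwait2(epfd, events, maxevents, &timeout, NULL);
}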
I've tried every possible field but cannot find the number of times functions are called.
Also, I don't understand Self and # Self. What do these two numbers mean?
There are several other ways to accomplish this. One is obviously to create a static hit counter and an NSLog that emits and increments the counter (a small sketch of this follows the steps below). That is intrusive, though, and I found a way to do this with lldb:
1. Set a breakpoint.
2. Execute the program until you hit the breakpoint the first time, and note the breakpoint number on the right-hand side of the line you hit (e.g. "Thread 1: breakpoint 7.1"; note the 7.1).
3. Context-click on the breakpoint and choose "Edit Breakpoint".
4. Leave the condition blank and choose "Add Action".
5. Choose "Debugger Command".
6. In the command box, enter "breakpoint list 7.1" (using the breakpoint number for your breakpoint from step 2). I believe you can use "info break" if you are using gdb.
7. Check the option "Automatically continue after evaluating".
8. Continue.
Now, instead of stopping, lldb will emit info about the breakpoint, including the number of times it has been hit.
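And the intrusive alternative mentioned above, as a sketch (the method name is illustrative):

- (void)methodOfInterest {
    static NSUInteger hitCount = 0;
    NSLog(@"methodOfInterest hit count: %lu", (unsigned long)++hitCount);
    // ... original body of the method ...
}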
As for the discussion between Glenn and Mike on the previous answer, I'll describe a performance problem where function execution count was useful: I had a particular action in my app where performance degraded considerably with each execution of the action. The Instruments time profiler showed that each time the action was executed, a particular function was taking twice as long as the time before until quickly the app would hang if the action was performed repeatedly. With the count, I was able to determine that with each execution, the function was called twice as many times as it was during the previous execution. It was then pretty easy to look for the reason, which turned out to be that someone was re-registering for a notification in NotificationCenter on each event execution. This had the effect of doubling the number of response handler calls on each execution and thus doubling the "cost" of the function each time. Knowing that it was doubling because it was called twice as many times and not because the performance was just getting worse caused me to look at the calling sequence rather than for reasons the function itself could be degrading over time.
While it's interesting, knowing the number of times a function is called doesn't tell you anything about how much time is spent in it, which is what Time Profiler is all about. In fact, since it uses sampling, it cannot tell you how many times a function was called.
It seems you cannot use Time Profiler for counting function calls. This question seems to address potential methods for counting.
With respect to Self and # Self:
Self is "the number of times the symbol calls itself," according to the Apple docs on the Time Profiler.
From the way the numbers look, though, it seems Self is the summed duration of the samples that had this symbol at the bottom of their stack trace. That would make:
# Self: the number of samples where this symbol was at the bottom of the stack trace.
% Self: the percentage of Self samples relative to the total samples of the currently displayed call tree
(i.e. # Self / total samples).
So this wouldn't tell you how many times a method was called, but it would give you an idea of how much time is spent in a method or lower in the call tree.
NOTE: I too am unsure about the various 'Self' meanings, though. I'd love to see someone answer this authoritatively; I arrived here searching for that.
If your objective is to find out what you need to fix to make the program as fast as possible, the number of calls and self time may be interesting, but they are irrelevant.
Look at my answer to this question, in particular points 6 and 8.
EDIT: To clarify the point further, suppose the following is the timeline of execution of the program. Some of that time (in this case about 50%) is spent in an activity that can be removed, if you know what it is, such as needless buried I/O, excessive calls to new, runaway notifications, or "insignificant" data validation. If a random-time sample is taken, it has a 50% chance of occurring in that activity, and an examination of the call stack and/or program variables shows that it is doing something that can be removed. Then, if 10 such samples are taken, the activity will be seen on roughly 5 of them, regardless of whether the activity occurs in a few large chunks of time, or many small ones. The activity may be a few lines of code in a function doing something unnecessary, or it may be something much more generalized. Regardless, you recognize it, fix it, and get roughly a factor of 2 speedup. Call counts and self time contribute nothing to this process.