Does the following function block on its running core?
wait(Sec) ->
    receive
    after (1000 * Sec) -> ok
    end.
A great answer will detail the internal working of Erlang and/or the CPU.
The process that executes that code will block, but the scheduler that currently runs that process will not. The code you posted is equivalent to a yield, but with a timeout.
The Erlang VM scheduler for that core will continue to execute other processes until the timeout fires, at which point the waiting process is scheduled for execution again.
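You can see this from an Erlang shell (an illustrative transcript, not from the original post): spawn a process that runs such a wait, and the shell keeps computing on the same scheduler:

1> Pid = spawn(fun() -> receive after 10000 -> ok end, io:format("done~n") end).
<0.84.0>
2> lists:sum(lists:seq(1, 1000000)).  % evaluated immediately; the scheduler is free
500000500000

About ten seconds later the spawned process prints "done", having occupied no scheduler time in the meantime.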
Short answer: this will block only the current (lightweight) process; it will not block the whole VM. For more details you should read about the Erlang scheduler. A nice description comes from the book "Concurrent Programming" by Francesco Cesarini and Simon Thompson.
...snip...
When a process is dispatched, it is assigned a number of reductions
it is allowed to execute, a number which is reduced for every
operation executed. As soon as the process enters a receive clause
where none of the messages matches or its reduction count reaches
zero, it is preempted. As long as BIFs are not being executed, this
strategy results in a fair (but not equal) allocation of execution
time among the processes.
...snip...
There is nothing Erlang-specific here; it is a pretty classical problem: timeouts can only fire on a system clock interrupt. Same answer as above: that process is blocked waiting for the clock interrupt, while everything else keeps working just fine.
There is a separate discussion about the actual time the process will wait, which is not perfectly precise because it depends on the clock period (and that is system-dependent), but that is another topic.
Related
I'm writing an Erlang application that requires actively polling some remote resources, and I want the process that does the polling to fit into the OTP supervision trees and support all the standard facilities like proper termination, hot code reloading, etc.
However, the two default behaviours, gen_server and gen_fsm, seem to support only callback-based operation. I could abuse gen_server to do this through messages to self, or abuse gen_fsm by having a single state that always loops to itself with a timeout of 0, but I'm not sure that's safe (i.e. that it doesn't exhaust the stack or accumulate unread messages in the mailbox).
I could make my process into a special process and write all that handling myself, but that effectively makes me reimplement the Erlang equivalent of the wheel.
So is there a behavior for code like this?
loop(State) ->
    NewState = do_stuff(State), % act without waiting to be called
    loop(NewState).
And if not, is there a safe way to trick default behaviours into doing this without exhausting the stack or accumulating messages over time or something?
The standard way of doing that in Erlang is by using erlang:send_after/3. See this SO answer and also this example implementation.
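A minimal sketch of that pattern, assuming a 5-second poll interval and a placeholder do_poll/1 standing in for the real work:

-module(poller).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(INTERVAL, 5000). % poll every 5 seconds (placeholder value)

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

init([]) ->
    erlang:send_after(?INTERVAL, self(), poll), % schedule the first poll
    {ok, #{}}.

handle_info(poll, State) ->
    NewState = do_poll(State),                  % the actual polling (placeholder)
    erlang:send_after(?INTERVAL, self(), poll), % re-arm for the next poll
    {noreply, NewState};
handle_info(_Other, State) ->
    {noreply, State}.

handle_call(_Req, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.

do_poll(State) -> State. % stub

Because each poll re-arms a single timer, the mailbox never accumulates poll messages, and the process remains an ordinary gen_server, so supervision, termination, and code reloading all work as usual.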
Is it possible that you could employ an essentially non-OTP-compliant process? Although to be a good OTP citizen you do ideally want to make your long-running processes into gen_servers and gen_fsms, sometimes you have to look beyond the standard-issue rule book and consider why the rules exist.
What if, for example, your supervisor starts your gen_server, and your gen_server spawns another process (let's call it the active_poll process), and they link to each other so that they share fate (if one dies, the other dies)? The active_poll process is now indirectly supervised by the supervisor that spawned the gen_server, because if it dies, so will the gen_server, and they will both get restarted. The only problem you really have to solve now is code upgrade, but this is not too difficult: your gen_server gets a code_change callback when the code is to be upgraded, and it can simply send a message to the active_poll process, which can then make a fully qualified function call, and bingo, it's running the new code.
If this doesn't suit you for some reason and/or you MUST use gen_server/gen_fsm/similar directly...
I'm not sure that writing a 'special process' really gives you very much. If you wrote a special process correctly, such that it is in theory compliant with the OTP design principles, it could still be ineffective in practice if it blocks or busy-waits in a loop somewhere and doesn't invoke sys when it should. So you really have at most a small optimisation over using gen_server/gen_fsm with a zero timeout (or over having an async message handler which does the polling and sends a message to self to trigger the next poll).
If whatever you are doing to actively poll can block (a blocking socket read, for example), this is really big trouble, as a gen_server, gen_fsm or special process will all be stopped from fulfilling their usual obligations (which they would normally be able to do either because the callback returns, in the case of gen_server/gen_fsm, or because receive is called and the sys module invoked explicitly, in the case of a special process).
If what you are doing to actively poll is non-blocking, though, you can do it, but polling without any delay effectively becomes a busy wait. (It's not quite one, because the loop will include a receive call somewhere, which means the process will yield and give the scheduler a voluntary opportunity to run other processes, but it's not far off, and it will still be a relative CPU hog.) If you can have even a 1 ms delay between polls, that makes a world of difference versus polling as rapidly as you can. It's not ideal, but if you MUST, it'll work. So use a timeout (as big as you can without it becoming a problem), or have an async message handler which does the polling and sends a message to self to trigger the next poll; a sketch of the timeout variant follows.
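For completeness, a hedged sketch of that timeout variant (the module name and do_poll/1 are placeholders). Note that receiving any other message resets the timeout, which is why every callback returns one:

-module(timeout_poller).
-behaviour(gen_server).
-export([start_link/0, init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).

init([]) ->
    {ok, undefined, 0}.        % timeout 0: poll as soon as we start

handle_info(timeout, State) ->
    NewState = do_poll(State), % must be non-blocking (placeholder)
    {noreply, NewState, 1};    % ~1 ms breather before the next poll
handle_info(_Other, State) ->
    {noreply, State, 1}.

handle_call(_Req, _From, State) -> {reply, ok, State, 1}.
handle_cast(_Msg, State) -> {noreply, State, 1}.

do_poll(State) -> State.       % stub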
In the never-ending quest for UI responsiveness, I would like to gain more insight into cases where the main thread performs blocking operations.
I'm looking for some sort of "debugging mode", extra code, hook, or whatever, whereby I can set a breakpoint/log/something that will get hit and allow me to inspect what's going on if my main thread "voluntarily" blocks for I/O (or any other reason, really) other than going idle at the end of the runloop.
In the past, I've looked at runloop cycle wall-clock duration using a runloop observer, and that's valuable for seeing the problem, but by the time you can check, it's too late to get a good idea for what it was doing, because your code is already done running for that cycle of the runloop.
I realize that there are operations performed by UIKit/AppKit that are main-thread-only that will cause I/O and cause the main thread to block so, to a certain degree, it's hopeless (for instance, accessing the pasteboard appears to be a potentially-blocking, main-thread-only operation) but something would be better than nothing.
Anyone have any good ideas? Seems like something that would be useful. In the ideal case, you'd never want to block the main thread while your app's code is active on the runloop, and something like this would be very helpful in getting as close to that goal as possible.
So I set forth to answer my own question this weekend. For the record, this endeavor turned into something pretty complex, so like Kendall Helmstetter Glen suggested, most folks reading this question should probably just muddle through with Instruments. For the masochists in the crowd, read on!
It was easiest to start by restating the problem. Here's what I came up with:
I want to be alerted to long periods of time spent in
syscalls/mach_msg_trap that are not legitimate idle time. "Legitimate
idle time" is defined as time spent in mach_msg_trap waiting for the
next event from the OS.
Also importantly, I didn't care about user code that takes a long time to run. That problem is quite easy to diagnose and understand with Instruments' Time Profiler tool. I wanted to know specifically about blocked time. While it's true that you can also diagnose blocked time with Time Profiler, I've found it considerably harder to use for that purpose. Likewise, the System Trace instrument is useful for investigations like this, but it is extremely fine-grained and complex. I wanted something simpler, more targeted to this specific task.
It seemed evident from the get-go that the tool of choice here would be DTrace.
I started out by using a CFRunLoop observer that fired on kCFRunLoopAfterWaiting and kCFRunLoopBeforeWaiting. A call to my kCFRunLoopBeforeWaiting handler would indicate the beginning of a "legitimate idle time", and the kCFRunLoopAfterWaiting handler would signal that a legitimate wait had ended. I would then use the DTrace pid provider to trap on calls to those functions as a way to sort legitimate idle from blocking idle.
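For reference, a minimal sketch of that observer setup (the handler bodies are placeholders, not the original code):

#import <CoreFoundation/CoreFoundation.h>

static void RunLoopObserverCallback(CFRunLoopObserverRef observer,
                                    CFRunLoopActivity activity, void *info)
{
    if (activity == kCFRunLoopBeforeWaiting) {
        // A "legitimate idle" begins: the runloop is about to sleep.
    } else if (activity == kCFRunLoopAfterWaiting) {
        // The legitimate idle has ended: the runloop woke for an event.
    }
}

static void InstallMainRunLoopObserver(void)
{
    CFRunLoopObserverRef observer = CFRunLoopObserverCreate(
        kCFAllocatorDefault,
        kCFRunLoopBeforeWaiting | kCFRunLoopAfterWaiting,
        true,  // repeats
        0,     // order
        RunLoopObserverCallback, NULL);
    CFRunLoopAddObserver(CFRunLoopGetMain(), observer, kCFRunLoopCommonModes);
    CFRelease(observer);
}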
This approach got me started, but in the end proved to be flawed. The biggest problem is that many AppKit operations are synchronous, in that they block event processing in the UI, but actually spin the RunLoop lower in the call stack. Those spins of the RunLoop are not "legitimate" idle time (for my purposes), because the user can't interact with the UI during that time. They're valuable, to be sure -- imagine a runloop on a background thread watching a bunch of RunLoop-oriented I/O -- but the UI is still blocked when this happens on the main thread. For example, I put the following code into an IBAction and triggered it from a button:
NSMutableURLRequest *req = [NSMutableURLRequest requestWithURL: [NSURL URLWithString: @"http://www.google.com/"]
                                                   cachePolicy: NSURLRequestReloadIgnoringCacheData
                                               timeoutInterval: 60.0];
NSURLResponse *response = nil;
NSError *err = nil;
[NSURLConnection sendSynchronousRequest: req returningResponse: &response error: &err];
That code doesn't prevent the RunLoop from spinning -- AppKit spins it for you inside the sendSynchronousRequest:... call -- but it does prevent the user from interacting with the UI until it returns. This is not "legitimate idle" to my mind, so I needed a way to sort out which idles were which. (The CFRunLoopObserver approach was also flawed in that it required changes to the code, which my final solution does not.)
I decided that I would model my UI/main thread as a state machine. It was in one of three states at all times: LEGIT_IDLE, RUNNING or BLOCKED, and would transition back and forth between those states as the program executed. I needed to come up with Dtrace probes that would allow me to catch (and therefore measure) those transitions. The final state machine I implemented was quite a bit more complicated than just those three states, but that's the 20,000 ft view.
As described above, sorting out legitimate idle from bad idle was not straightforward, since both cases end up in mach_msg_trap() and __CFRunLoopRun. I couldn't find one simple artifact in the call stack that would reliably tell the difference; it appears that a simple probe on one function was not going to help me. I ended up using the debugger to look at the state of the stack during various instances of legitimate idle vs. bad idle. I determined that during legitimate idle I'd (seemingly reliably) see a call stack like this:
#0 in mach_msg
#1 in __CFRunLoopServiceMachPort
#2 in __CFRunLoopRun
#3 in CFRunLoopRunSpecific
#4 in RunCurrentEventLoopInMode
#5 in ReceiveNextEventCommon
#6 in BlockUntilNextEventMatchingListInMode
#7 in _DPSNextEvent
#8 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]
#9 in -[NSApplication run]
#10 in NSApplicationMain
#11 in main
So I endeavored to set up a bunch of nested/chained pid probes that would establish when I had arrived at, and subsequently left, this state. Unfortunately, for whatever reason DTrace's pid provider doesn't seem to be universally able to probe both entry and return on all arbitrary symbols. Specifically, I couldn't get probes on pid000:*:__CFRunLoopServiceMachPort:return or on pid000:*:_DPSNextEvent:return to work. The details aren't important, but by watching various other goings-on, and keeping track of certain state, I was able to establish (again, seemingly reliably) when I had entered and left the legit idle state.
Then I had to determine probes for telling the difference between RUNNING and BLOCKED. That was a bit easier. In the end, I chose to consider BSD system calls (using DTrace's syscall provider) and calls to mach_msg_trap() (using the pid provider) not occurring in periods of legit idle to be BLOCKED. (I did look at DTrace's mach_trap provider, but it did not seem to do what I wanted, so I fell back to using the pid provider.)
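To give a flavour of what that looks like, here is a hypothetical D fragment, not the actual script (run with something like sudo dtrace -s blocked.d -p <pid>); the syscall half can be as simple as:

/* Accumulate time the target spends in BSD syscalls. The real script
   also traps mach_msg_trap() via the pid provider and gates both on
   the legitimate-idle state machine. */
syscall:::entry
/pid == $target/
{
    self->ts = timestamp;
}

syscall:::return
/self->ts/
{
    @blocked["ns in syscalls"] = sum(timestamp - self->ts);
    self->ts = 0;
}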
Initially, I did some extra work with the DTrace sched provider to measure real blocked time (i.e. time when my thread had been suspended by the scheduler), but that added considerable complexity, and I ended up thinking to myself: "If I'm in the kernel, what do I care whether the thread is actually asleep or not? It's all the same to the user: it's blocked." So the final approach just measures all time in (syscalls || mach_msg_trap()) && !legit_idle and calls that the blocked time.
At this point, catching single kernel calls of long duration (like a call to sleep(5) for instance) is rendered trivial. However, more often UI thread latency comes from many little latencies accumulating over multiple calls into the kernel (think of hundreds of calls to read() or select()), so I thought it would also be desirable to dump SOME call stack when the overall amount of syscall or mach_msg_trap time in a single pass of the event loop exceeded a certain threshold. I ended up setting up various timers and logging accumulated time spent in each state, scoped to various states in the state machine, and dumping alerts when we happened to be transitioning out of the BLOCKED state, and had gone over some threshold. This method will obviously produce data that is subject to misinterpretation, or might be a total red herring (i.e. some random, relatively quick syscall that just happens to push us over the alert threshold), but I feel like it's better than nothing.
In the end, the DTrace script keeps a state machine in D variables, uses the described probes to track transitions between the states, and gives me the opportunity to do things (like print alerts) when the state machine transitions state, based on certain conditions. I played around a bit with a contrived sample app that does a bunch of disk I/O, network I/O, and calls sleep(), and was able to catch all three of those cases without distractions from data pertaining to legitimate waits. This was exactly what I was looking for.
This solution is obviously quite fragile, and thoroughly awful in almost every regard. :) It may or may not be useful to me, or anyone else, but it was a fun exercise, so I thought I'd share the story, and the resulting Dtrace script. Maybe someone else will find it useful. I also must confess being a relative n00b with respect to writing Dtrace scripts so I'm sure I've done a million things wrong. Enjoy!
It was too big to post inline, so it's kindly being hosted by @Catfish_Man over here: MainThreadBlocking.d
Really, this is the kind of job for the Time Profiler instrument. I believe you can see where time is spent in code per thread, so you'd go look at what code was taking a while to perform and get the answer as to what was potentially blocking the UI.
In Linux, with POSIX threads, is it possible to hint the scheduler to schedule a particular thread? The scenario is that I have a process which is a replica of another process. For deterministic execution, the follower process needs to acquire its locks in the same order as the leader process.
So, for example, say that in the leader process mutex a is locked first by thread 2, then by thread 3, then by thread 4. The follower must acquire it in the same order. So if in the follower thread 3 reaches mutex a first, I want thread 3 to tell the scheduler: OK, I'm giving up my time slice, please schedule thread 2 instead. I know this can be achieved by modifying the scheduler, but I do not want that; I want to be able to control this from a user-space program.
In any system, Linux, Windows POSIX or not, if you have to ask this sort of question then I'm afraid that your app is heading for a dark place :(
Even if thread 3 were to yield, say with sleep(0), an interrupt straight after might well just schedule thread 3 back on again, preempting thread 2, or the OS might run thread 3 straightaway on another free core and it could get to the mutex first.
You have to make your app work correctly (maybe not optimally), independently of the OS scheduling/dispatching algorithms. Even if you get your design to work on a test box, you will end up having to test your system on every combination of OS/hardware to ensure that it still works without deadlocking or performing incorrectly.
Fiddling with scheduling algorithms, thread priorities etc. should only be done to improve the performance of your app, not to try and make it work correctly or to stop it locking up!
Rgds,
Martin
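To make the point concrete, here is one hypothetical user-space way to enforce the leader's acquisition order without touching the scheduler: sequence the lock itself with a condition variable, so correctness no longer depends on who gets dispatched first. This is an illustrative C++ sketch, not part of the answer above.

#include <condition_variable>
#include <mutex>

// 'turn' numbers replay the acquisition order recorded in the leader.
class OrderedMutex {
public:
    void lock(int my_turn) {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return next_turn_ == my_turn; });
        lk.release(); // keep m_ held until unlock()
    }
    void unlock() {
        ++next_turn_;      // still holding m_ here
        m_.unlock();
        cv_.notify_all();  // wake whichever thread holds the next turn
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int next_turn_ = 0;
};

A thread that arrives "too early" simply waits for its turn instead of asking the OS to run somebody else.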
I am developing a program in C++ and I have to implement a cron-like task. It should be executed every hour, and also every 24 hours, for different reasons. My first idea was to make an independent pthread and put it to sleep for an hour each time. Is this correct? I mean, is it really efficient to have a thread asleep more than awake? What are the drawbacks of having a sleeping thread?
I would tend to prefer to have such a task running via cron/scheduler since it is to run at pre-determined intervals, as opposed to in response to some environmental event. So the program should just 'do' what ever it needs to do, and then be executed by the operating system as needed. This also makes it easy to change the frequency of execution - just change the scheduling, rather than needing to rebuild the app or expose extra configurability.
That said, if you really, really wanted to do it that way, you probably would not sleep for the whole hour; you would sleep in multiples of some smaller time frame (perhaps five minutes, or whatever seems appropriate) and keep a variable with the 'last run' time so you know when to run again.
Sleep() calls typically won't be exceptionally accurate as far as the time the thread ends up sleeping; it depends on what other threads have tasks waiting, etc.
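A hedged C++ sketch of that approach (run_hourly_job and the five-minute granularity are placeholders):

#include <atomic>
#include <chrono>
#include <thread>

void run_hourly_job() { /* the real work goes here (placeholder) */ }

void cron_loop(std::atomic<bool>& stop_requested) {
    using namespace std::chrono;
    auto last_run = steady_clock::now();
    while (!stop_requested) {
        std::this_thread::sleep_for(minutes(5)); // wake periodically, not hourly
        if (steady_clock::now() - last_run >= hours(1)) {
            run_hourly_job();
            last_run = steady_clock::now();
        }
    }
}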
There is no performance impact from having a thread sleep for a long time, aside from the fact that it will occupy some memory without doing anything useful. Maybe if you did this with thousands of threads there would be some slowdown in the OS's management of threads, but it doesn't look like you are planning to do that.
A practical disadvantage of having a thread sleep for long is, that you can't do anything with it. If you, for example, want to tell the thread that it should stop because the application wants to shut down, the thread could only get this message after the sleep. So your application either needs a very long time to shut down, or the thread will need to be stopped forcefully.
My first idea was to make an independent pthread and put it to sleep for an hour each time.
I see no problem.
Is this correct? I mean, is it really efficient to have a thread asleep more than awake?
As long as a thread sleeps and doesn't wake up, the OS hardly has to bother with its existence.
Though otherwise, if the thread sleeps for most of its life, why have a dedicated thread at all? Why can't another thread (e.g. the main thread) check the time and start a thread to do the cron job?
What are the drawbacks of having a sleeping thread?
None, except that a sleeping thread cannot easily be unblocked. That is a concern if you need a proper shutdown of the application, which is why it is a better idea to have another (busy) thread check the time and start the cron job when needed.
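Alternatively (my addition, not from the answer above), a condition-variable wait with a timeout gives you a sleep that can be unblocked for shutdown; a minimal C++11 sketch:

#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool shutting_down = false;

// Sleeps for 'd', but returns early (false) if shutdown is requested.
bool interruptible_sleep(std::chrono::seconds d) {
    std::unique_lock<std::mutex> lk(m);
    return !cv.wait_for(lk, d, [] { return shutting_down; });
}

void request_shutdown() {
    { std::lock_guard<std::mutex> lk(m); shutting_down = true; }
    cv.notify_all();
}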
When designing your solution keep these scenarios in mind:
At 07:03 the system time is reset to 06:59. What happens one minute later?
At 07:03 the system time is moved forward to 07:59. What happens one minute later?
At 07:59 the system time is moved forward to 08:01. Is the 08:00-job ever executed?
The answers to those questions will tell you a lot about how you should implement your solution.
The performance of the solution should not be an issue. A single sleeping thread will use minimal resources.
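One concrete consequence (a suggestion under the assumptions above, not part of the answer): if the job must run once per elapsed hour rather than at a wall-clock time, schedule against a monotonic clock so that none of those scenarios can skip or double-run it:

#include <chrono>
#include <thread>

void hourly(void (*job)()) {
    auto next = std::chrono::steady_clock::now();
    for (;;) {
        next += std::chrono::hours(1);
        std::this_thread::sleep_until(next); // steady_clock ignores wall-clock resets
        job();
    }
}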
I have a backgroundrb scheduled task that takes quite a long time to run. However, it seems that the process is ending after only 2.5 minutes.
My background.yml file:
:schedules:
  :named_worker:
    :task_name:
      :trigger_args: 0 0 12 * * * *
      :data: input_data
I have zero activity on the server when the process is running. (Meaning I am the only one on the server watching the log files do their thing until the process suddenly stops.)
Any ideas?
There's not much information here that allows us to get to the bottom of the problem.
Because backgroundrb operates in the background, it can be quite hard to monitor/debug.
Here are some ideas I use:
Write a unit test to test the worker code itself and make sure there are no problems there
Put "puts" statements at multiple points in the code so you can at least see some responses while the worker is running.
Wrap the entire worker in a begin..rescue..end block so that you can catch any errors that might be occurring and cutting the process short (see the sketch below).
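A minimal Ruby sketch of that wrapper (the worker method name is a placeholder):

def do_work(args)
  begin
    # ... the real worker code ...
  rescue => e
    # Log the failure so a silent crash doesn't look like an early exit.
    puts "Worker died: #{e.class}: #{e.message}"
    puts e.backtrace.join("\n")
    raise
  end
end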
Thanks Andrew. Those debugging tips helped. Especially the begin..rescue..end block.
It was still a pain to debug, though. In the end it wasn't BackgroundRB cutting the process short after 2.5 minutes; a network connection was being made that wasn't being closed properly. Once that was found and closed, everything worked great.