I have a solid idea of how GCD works, but I want to know more about the touted "operating system management" internals. Almost every technical explanation I've found of how Grand Central Dispatch works with the operating system is different. I'll paraphrase some of my findings.
"It's a daemon that's global to the OS
that distributes tasks over many
cores."
I'm not stupid enough to believe that.
"Support is built into the kernel to
be aware of all GCD applications. GCD
applications work in concert with the
kernel to make logical decisions on
how to manage threads within the
application."
Sounds like this synchronization scheme would be much slower than just managing the logic within the application.
"GCD is exists solely in the
application and uses current system
load as a metric to how it behaves."
This sounds more realistic to me, but I only saw a statement like this in one place.
What's really going on here? Is it just a library, or is it an entire "system"?
It is a library, but there are some kernel optimizations to allow for system-level control. In particular, there is an additional interface, pthread_workqueue, that allows GCD to tell the kernel it wants a thread to run some particular function, but doesn't actually start a thread (it is basically a continuation). At that point the kernel can choose to start that continuation or not, depending on system load.
So yes, there is a global, system-wide infrastructure that manages GCD threads in the kernel, and the second answer is the correct one. The mistake you are making is thinking that there is synchronization going on there that is going to cost something. The scheduler is going to run no matter what; what GCD has done is use a new interface that lets the scheduler decide not only whether or not to run the threads based on their relative priority, but whether or not to create or destroy the threads as well.
It is a (significant) optimization, but it is not strictly necessary, and the FreeBSD port doesn't actually have support for the system-wide stuff. If you want to look at the actual interfaces, here is pthread_workqueue.h, the implementation is in Apple's pthread.c, and you can see the assembly stub the kernel uses as the entry point for starting up workqueue threads in start_wqthread.s. You can also go crawling through xnu to see how it upcalls into the stub, if you really want.
I just finished the Erlang in Practice screencasts (code here) and have some questions about distribution.
Here's the overall architecture:
Here is what the supervision tree looks like:
Reading Distributed Applications leads me to believe that one of the primary motivations is for failover/takeover.
However, is it possible, for example, for the Message Router supervisor and its workers to be on one node, and the rest of the system to be on another, without many changes to the code?
Or should there be 3 different OTP applications?
Also, how can this system be made to scale horizontally? For example, if I realize that my system can handle 100 users and I've identified the Message Router as the main bottleneck, how can I "just add another node" so that it can then handle 200 users?
I've developed Erlang apps only during my studies, but generally we had many small processes doing only one thing and sending messages to other processes. And the beauty of Erlang is that it doesn't matter whether you send a message within the same Erlang VM, within the same computer, on the same LAN, or over the Internet: the call and the pid of the other process always look the same to the developer.
So you really want to have one application for every small part of the system.
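To make that location transparency concrete, here is a minimal sketch; the node name 'backend@host2', the registered name message_router, and the chat_worker module are made up for illustration:

    %% Sending to a (hypothetical) registered process on another node looks
    %% just like sending to a local one:
    {message_router, 'backend@host2'} ! {deliver, "jack", "joe", "hello"},

    %% Spawning work on another node is just as transparent
    %% (chat_worker:handle/1 is assumed to exist and be loaded on that node):
    Pid = spawn('backend@host2', chat_worker, handle, [self()]),
    Pid ! {deliver, "jack", "joe", "hello"}.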
That being said, it doesn't make it any simpler to construct an application that can scale out. A rule of thumb says that if you want an application to run on a factor of ten more nodes, you need to rewrite it, since otherwise the messaging overhead would be too large. And obviously, when you go from one node to two, you also need to think about it.
So if you have found a bottleneck, an application that is particularly slow when handling too many clients, you will want to run it a second time, and then you need to have some additional load balancing implemented before you start the second instance.
Let's assume the supervisor checks messages for inappropriate content and is therefore slow. In this case, the node everyone talks to would be a simple router application that forwards the messages to the different instances of the supervisor application in a round-robin manner. If one or two instances are not enough, you could write the router in a way that lets you change the number of instances by sending it control messages.
However, for this to work automatically, you would need another process monitoring the servers and detecting when they are overloaded or underutilized.
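A minimal sketch of such a round-robin router, under the assumption that the worker instances are plain processes identified by pid (a real system would use gen_server, supervision, and error handling):

    %% Round-robin router sketch. Workers is a non-empty list of pids of
    %% the (hypothetical) instances of the slow application.
    start_router(Workers) ->
        register(router, spawn(fun() -> router_loop(Workers) end)).

    router_loop(Workers) ->
        receive
            {add_worker, Pid} ->                  % control message: add an instance
                router_loop(Workers ++ [Pid]);
            Msg ->
                [Next | Rest] = Workers,          % take the next worker,
                Next ! Msg,                       % forward the message to it,
                router_loop(Rest ++ [Next])       % and rotate the list
        end.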
I know that dynamically adding and removing resources always sounds great when you hear about it, but as you can see it is a lot of work: you need a messaging scheme built to allow it, as well as a monitoring system that can detect the need.
Hope this gives you some idea of how it could be done. Unfortunately it's been over a year since I wrote my last Erlang application, and I didn't want to provide code that might be wrong.
If we have a system with really heavy processing, where processes are spawned to distribute the load somehow, that's clear.
If we are talking about a web server, it's a good idea to spawn a new process for each connection, because the connections can then be distributed. But what else? A separate process each for Model, View, and Controller? That sounds strange, because they run in a linear fashion, so they cannot be parallelized well and we only get overhead from switching between them. Also, the Model, View, and Controller are so lightweight that they can stay in a single process, can't they?
So, where is it good to spawn a new process, apart from the "new connection" situation?
Thank you in advance.
In general, it's anywhere you have a shared resource to manage. It may be a socket, or a database connection, but it may also be some shared in-memory data, or a state machine of some kind.
You may also want to do parallel processing of a list of values (see pmap).
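For illustration, a minimal pmap sketch (one process per list element, results collected in the original order, no error handling or timeouts):

    %% pmap/2: apply F to each element of List in a separate process.
    pmap(F, List) ->
        Parent = self(),
        Refs = [begin
                    Ref = make_ref(),
                    spawn(fun() -> Parent ! {Ref, F(X)} end),
                    Ref
                end || X <- List],
        %% Collect the results in the same order the work was spawned.
        [receive {Ref, Result} -> Result end || Ref <- Refs].

    %% Example: pmap(fun(N) -> N * N end, [1, 2, 3, 4]) returns [1, 4, 9, 16].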
To your "swapping" point you should know that Erlang processes do no use op-sys facilities for scheduling, and scheduling is all but free.
In the specific case of a web application server, I understand your question. If you are writing a conventional web application with very little shared state, your web framework probably already handles caching, session state, and the like (and those facilities will spawn processes).
We are all highly indoctrinated into this stateless web application model. We have all been told since we were pups that stateful systems are hard to develop and don't scale. I think you will find that there are those who are challenging that. As browser support for WebSockets improves, and with server-side languages like Erlang and Clojure providing scalable platforms with safe state management, there will be those who are able to make more interactive web applications. As an extreme example, could you imagine WoW as a web application?
One reason to spawn a new process for each connection is that it makes programming the connections much simpler. Because a process handles only one connection, things like blocking access to databases, long polling, or streaming become much easier: the fact that this process blocks will not affect any other connections.
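A minimal sketch of that pattern with gen_tcp (the port number and the echo behaviour are arbitrary placeholders):

    %% Accept loop: one spawned process per connection.
    listen(Port) ->
        {ok, LSock} = gen_tcp:listen(Port, [binary, {active, false}, {reuseaddr, true}]),
        accept_loop(LSock).

    accept_loop(LSock) ->
        {ok, Sock} = gen_tcp:accept(LSock),
        %% Spawn the handler, hand it the socket, then tell it to start;
        %% the handshake avoids it reading before it owns the socket.
        Pid = spawn(fun() -> receive go -> handle(Sock) end end),
        ok = gen_tcp:controlling_process(Sock, Pid),
        Pid ! go,
        accept_loop(LSock).

    %% This process may block as long as it likes (database calls, long
    %% polling, streaming) without affecting any other connection.
    handle(Sock) ->
        case gen_tcp:recv(Sock, 0) of
            {ok, Data}      -> gen_tcp:send(Sock, Data), handle(Sock);
            {error, closed} -> ok
        end.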
In Erlang the general "rule" is that you use processes to model concurrent activity and to manage shared resources. Processes are the fundamental way for structuring your system.
Perhaps the question isn't that simple to answer... but what is your opinion? Should I use non-blocking approaches (libevent, for example) or Erlang lightweight processes to:
Achieve as many connections as possible with a given amount of RAM
Achieve as much throughput as possible with a given amount of CPU
The background is that I am planning to code a pub/sub server and I cannot decide which approach I should use.
There is an article about making A Million-User Comet Application with Mochiweb that you can read. But I think stability, flexibility, and maintainability will be more important most of the time. Keeping this in mind, I would not consider anything other than Erlang, even if there were some better-performing solution.
Under the hood, the Erlang VM uses non-blocking I/O. If your Erlang lightweight process blocks, the VM does not really do a kernel-level thread context switch. Most of the time, it will just wake up another lightweight process on the same OS thread (thus, it's not "blocking" in the usual sense of the word).
You can even start the VM with the +A argument to specify how many async I/O threads you would like to allocate (AFAIK, Node.js is still single-threaded, and if a callback function hangs, your VM is done for).
I've grown to like Erlang, and it's a great (cough) architectural fit for my problem. Meanwhile, I still like to imagine that I can kludge Erlang processes and asynchronous message passing in Python (I am currently in therapy to rid myself of this obsession).
During a recent binge I came across 0MQ and I like its messaging features. These may be self-evident to an Erlang/OTP expert, but I'm just a humble Python programmer (my shrink will no doubt get to read this clever argument). The 0MQ user guide states that it uses native OS threads, not virtual "green" threads.
Is there a way to make 0MQ work with, say, eventlet/gevent?
Or, should I avoid the green-eyed monster and stick to a single Python app thread, with non-blocking I/O handled by 0MQ's message queuing & its own (skilled) use of native threads?
Or, check out of rehab & go back to Erlang?
Responding to a stale thread because I am kind of in the same boat. Thought I would share my thoughts.
1: It looks like all the heavy lifting has already been done: https://github.com/traviscline/gevent-zeromq has integrated the gevent loop with a non-blocking zmq socket, and even has some CPython speedups. It also seems to be, at the time of this writing, reasonably well maintained.
2: It depends; if you are writing something that can use zmq without a ton of external event logic, then you should just use zmq. If, on the other hand, you need to integrate with other protocols, you may want to use gevent (or perhaps Twisted, although it has no workable zmq at all right now). My projects generally require multiple protocols (i.e. private queue manager, public HTTP, public HTTPS, private memcache, etc.), so I am investigating switching to gevent for quicker project turnaround than my current favorite: Twisted.
3: You may want to skip zmq entirely and integrate with an existing Erlang-based solution like RabbitMQ; the performance advantages of zmq may not be as important as you think, and then you have an Erlang message queue that integrates easily with Python via existing libraries.
Also see: Message Queue comparison at the Second Life wiki.
ZeroMQ now works with Eventlet:
https://lists.secondlife.com/pipermail/eventletdev/2010-October/000907.html
Can anyone give a concise set of real-world considerations that would drive the choice of whether or not to use inetd to manage a program that acts as a network server?
(If inetd is used, it alters the requirements around networking code in the program, so I think this is definitely programming-related and not general IT.)
The question is based around an implementation I've seen that uses a control program managed by inetd to start a network listener that then runs forever and takes constant and heavy load. It didn't seem like a good fit with the textbook inetd usage profile (on-demand, infrequently used, lightweight) and got me interested in the more general question.
It depends on the usage pattern for your service. If the startup time for your daemon is low, and you expect it to be used infrequently, then inetd might be a good fit. It reduces or even eliminates the need to write any additional networking code.
If your daemon is more heavyweight or more frequently used, you're probably better off writing it standalone. You can just as easily write an init.d script and some conf.d configuration to go with it, and it will be no harder for an admin to manage. Most programming languages these days have easy-to-use socket libraries, so in many cases the networking code may not even be that difficult.
I've found in my experience that few admins these days are familiar with inetd. Most daemons just provide their own init script. In fact, of the few hundred systems which I manage I can't think of a single one that launches anything through inetd at all. That's something worth considering.
Hooking into inetd will make your service slightly easier to manage from an operational point of view, as inetd allows a sysadmin to control practically every aspect of how network communication with your program happens. However, it will require you to make a few code changes to your program. Also, it may not be as efficient as just making your program run as a daemon to begin with.
EDIT: I personally never use inetd and always opt to write server processes as standalone daemons.
I think another factor worth considering when deciding whether to use inetd is how much memory the process handling a request consumes on average. If this is fairly high, then under high load you risk running out of memory (since inetd forks). The same server might be implementable in a multithreaded or select/poll manner, possibly allowing for higher load and less memory per connection.
You can use inetd to give TCP/UDP capabilities to simple programs that operate over stdin/stdout.
Without inetd, your program will need to manage a slew of concerns, including network interfaces, sockets, forking, resource limits, etc.
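As an illustration only (the service name and program path are placeholders), a classic inetd.conf entry that hands each incoming TCP connection to a program's stdin/stdout looks roughly like this, with a matching line in /etc/services mapping the service name to a port:

    # /etc/inetd.conf -- illustrative entry, not a real service
    # service  socket  proto  wait    user    program                 arguments
    mysvc      stream  tcp    nowait  nobody  /usr/local/bin/mysvc    mysvc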
What alternative strategy are you considering?
inetd is a good way of ensuring your server starts when the OS boots in the appropriate run level. Even if you do design your server to have some other management mechanism, inetd could still wrap all the commands quite simply. It's only shell scripts after all.