I have C++ code that implements a special protocol over the serial port. The code is multi-threaded: it internally polls the serial port and does its own cyclic processing. I would like to call this driver from Erlang and also receive events from it. My concern is that this C++ code is multi-threaded and also stateful, meaning that when I call a certain function on the driver, it caches things internally which will be used/required on subsequent calls. My questions are:
1. Does a NIF run in the same OS process as the rest of my Erlang processes, or is it launched in a separate OS process?
2. Does it make sense to wrap this multi-threaded, stateful C++ code with a NIF?
3. If a NIF is not the right approach, what is a better way to make Erlang talk back and forth with this C++ code? I would also prefer my C++ code to live in the same OS process as the rest of my Erlang processes, and it looks like linked-in drivers are an option, but I'm not sure whether the multi-threaded nature of my C++ code fits that model. Plus, I hear they can mess up the Erlang scheduler?
Unlike ports, NIFs run within the Erlang VM process, similar to drivers. Because of that, any NIF crash will bring the VM down as well. And, answering your last question in advance: NIFs, like drivers, may block your scheduler.
That depends on the functionality you are implementing with this C++ code. Given the answer to 1), you probably want to avoid concurrency in the C++ part, since it's a potential source of errors. That's not always possible, of course. But if you are implementing, say, some worker pool, go ahead and write single-threaded code and spawn it as many times as you need.
Drivers can be multi-threaded too, with the same potential problems and quite similar performance (well, still slightly faster than NIFs). If you are not completely sure about your C++ code's stability, use it as an Erlang port.
Speaking of the difference between NIFs and drivers: the former are natively synchronous, while the latter can be asynchronous (which can be a huge advantage if you don't want to receive answers for most of the commands). Drivers are easier to mess up and harder to implement (but once you grasp the main patterns and problems, they actually seem okay).
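To give a sense of what the NIF side looks like, here is a minimal synchronous sketch in C (the module name my_module and the add/2 function are made-up placeholders, not part of the original discussion):

    #include <erl_nif.h>

    /* A trivial NIF: adds two integers and returns the result.
     * It runs on the calling scheduler thread, so it must return quickly. */
    static ERL_NIF_TERM add_nif(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
    {
        int a, b;
        if (!enif_get_int(env, argv[0], &a) || !enif_get_int(env, argv[1], &b))
            return enif_make_badarg(env);
        return enif_make_int(env, a + b);
    }

    static ErlNifFunc nif_funcs[] = {
        {"add", 2, add_nif}
    };

    /* Binds the C functions to the Erlang module my_module, which loads
     * the shared library via erlang:load_nif/2 in its -on_load function. */
    ERL_NIF_INIT(my_module, nif_funcs, NULL, NULL, NULL, NULL)

The whole call happens synchronously on the scheduler thread that invoked it, which is exactly why long-running work inside a NIF is a problem.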
Here's a good start for drivers:
http://www.erlang.org/doc/apps/erts/driver.html
And something similar (behold the difference in complexity) for NIFs:
http://www.erlang.org/doc/tutorial/nif.html
The Erlang Interoperability guide discusses the different interoperability mechanisms. Here are my conclusions:
Ports and Erl_Interface programs: OS-scheduled, which limits scalability.
Port Drivers: dangerous, because a crash in the port driver brings the emulator down too.
C Nodes: the node server needs to scale as well as the Erlang app does to avoid sacrificing scalability.
NIFs: Loic sums them up well.
Some advocate the use of OpenCL, basically delegating resource-hungry computations to the GPU while letting the Erlang emulator own the CPU. This sounds fantastic, but then you have a requirement that your servers have a suitable GPU.
Using JInterface and communicating with a Java process that spawns a thread for every request might be an option.
So has anyone come across a solution that has been tested in practice and turned out to work well?
Actually, all of these solutions have their place. As I've worked closely with some of them, I can say the following:
Ports are safe, but port communication is slow. If the port crashes, the VM continues working. If you do not communicate with your port extensively, or if you do not trust the port, this is your choice.
NIFs are extremely fast. If your data flow is heavy, you should use them. Of course, they are unsafe, so you have to program the NIF library carefully, and you'd better learn some C (a point that most NIF creators skip). The scheduling problems are easily overcome with a specific pattern: right after receiving data from Erlang, start a new C thread that does the actual job, detaching the processing from the Erlang scheduler thread. That way you exit the NIF function very quickly, return to Erlang, and wait there for a message from the C code, as in the sketch below.
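A minimal sketch of that pattern in C (the function names start_job and worker, and the doubling stand-in for the real work, are illustrative only; a real implementation would also join or detach the thread and check allocations):

    #include <erl_nif.h>
    #include <stdlib.h>

    struct job {
        ErlNifPid caller;   /* the Erlang process waiting for the result */
        int input;
    };

    /* Runs on a plain C thread, outside the Erlang schedulers. */
    static void* worker(void *arg)
    {
        struct job *j = arg;
        ErlNifEnv *msg_env = enif_alloc_env();
        /* ... the long-running work goes here; doubling stands in for it ... */
        ERL_NIF_TERM result = enif_make_int(msg_env, j->input * 2);
        enif_send(NULL, &j->caller, msg_env, result);   /* message the caller */
        enif_free_env(msg_env);
        free(j);
        return NULL;
    }

    /* The NIF itself only records the caller, spawns the thread and returns,
     * so the Erlang scheduler is not blocked. */
    static ERL_NIF_TERM start_job(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[])
    {
        struct job *j = malloc(sizeof *j);
        ErlNifTid tid;
        if (!enif_get_int(env, argv[0], &j->input)) {
            free(j);
            return enif_make_badarg(env);
        }
        enif_self(env, &j->caller);
        enif_thread_create("worker", &tid, worker, j, NULL);
        return enif_make_atom(env, "ok");
    }

On the Erlang side, the calling process simply sits in a receive, waiting for the result message.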
Java nodes or C nodes are for tasks that can be moved to the node completely, i.e. long and heavy jobs.
Bearing the above considerations in mind, choose the approach that fits your task best.
Perhaps the question isn't that simple to answer... but what is your opinion? Should I use non-blocking approaches (libevent, for example) or Erlang lightweight processes to:
Achieve as many connections as possible with a given amount of RAM
Achieve as much throughput as possible with a given amount of CPU
The background is that I am planning to code a pub/sub server, and I cannot decide which approach I should use.
There is an article about building A Million-user Comet Application with Mochiweb that is worth reading. But I think stability, flexibility, and maintainability will matter more most of the time. Keeping this in mind, I would not consider anything other than Erlang, even if there were some better-performing solution.
Under the hood, the Erlang VM uses non-blocking I/O. If your Erlang lightweight process blocks, the VM does not really do a kernel-level thread context switch. Most of the time, it will just wake up another lightweight process on the same OS thread (thus, it's not "blocking" in the true sense of the word).
You can even start the VM with the +A argument and specify how many async I/O threads you would like to allocate (AFAIK, Node.js is still single-threaded, and if a callback function hangs, your VM is done for).
I've grown to like Erlang, and it's a great (cough) architectural fit for my problem. Meanwhile, I still like to imagine that I can kludge Erlang processes and asynchronous message passing in Python (I am currently in therapy to rid myself of this obsession).
During a recent binge I came across 0MQ, and I like its messaging features. These may be self-evident to an Erlang/OTP expert, but I'm just a humble Python programmer (my shrink will no doubt get to read this clever argument). The 0MQ user guide states that it uses native OS threads, not virtual "green" threads.
Is there a way to make 0MQ work with say eventlet/gevent?
Or, should I avoid the green-eyed monster and stick to a single Python app thread, with non-blocking I/O handled by 0MQ's message queuing & its own (skilled) use of native threads?
Or, check out of rehab and go back to Erlang?
Responding to a stale thread because I am kind of in the same boat. Thought I would share my thoughts.
1: It looks like all the heavy lifting has already been done: https://github.com/traviscline/gevent-zeromq has integrated the gevent loop with a non-blocking zmq socket, and even has some CPython speedups. It also seems to be (at the time of this writing) reasonably well maintained.
2: It depends. If you are writing something that can use zmq without a ton of external event logic, then you should just use zmq. If, on the other hand, you need to integrate with other protocols, you may want to use gevent (or perhaps Twisted, although it has no workable zmq support at all right now). My projects generally require multiple protocols (i.e. a private queue manager, public HTTP, public HTTPS, private memcache, etc.), so I am investigating switching to gevent for quicker project turnaround than my current favorite: Twisted.
3: You may want to skip zmq entirely and integrate with an existing Erlang-based solution like RabbitMQ; the performance advantages of zmq may not be as important as you think, and then you would have an Erlang message queue that integrates easily with Python through existing libraries.
Also see: the Message Queue comparison on the Second Life wiki.
ZeroMQ now works with Eventlet:
https://lists.secondlife.com/pipermail/eventletdev/2010-October/000907.html
I have a solid idea of how GCD works, but I want to know more about the touted "operating system management" internals. It seems almost every technical explanation of how Grand Central Dispatch works with the "Operating System" is totally different. I'll paraphrase some of my findings.
"It's a daemon that's global to the OS
that distributes tasks over many
cores."
I'm not stupid enough to believe that.
"Support is built into the kernel to
be aware of all GCD applications. GCD
applications work in concert with the
kernel to make logical decisions on
how to manage threads within the
application."
Sounds like this synchronization scheme would be much slower than just managing the logic within the application.
"GCD is exists solely in the
application and uses current system
load as a metric to how it behaves."
This sounds more realistic to me, but I only saw a statement like this in one place.
What's really going on here? Is it just a library, or is it an entire "system"?
It is a library, but there are some kernel optimizations to allow for system-level control. In particular, there is an additional interface, pthread_workqueue, that allows GCD to tell the kernel it wants a thread to run some particular function, but doesn't actually start a thread (it is basically a continuation). At that point the kernel can choose to start that continuation or not, depending on system load.
So yes, there is a global, system-wide infrastructure that manages GCD threads in the kernel, and the second explanation is the correct one. The mistake you are making is thinking that there is synchronization going on there that is going to cost something. The scheduler is going to run no matter what; what GCD has done is use a new interface that lets the scheduler not only decide whether or not to run the threads based on their relative priority, but also whether or not to create or destroy the threads.
It is a (significant) optimization, but it is not strictly necessary, and the FreeBSD port doesn't actually have support for the system-wide stuff. If you want to look at the actual interfaces, see pthread_workqueue.h; the implementation is in Apple's pthread.c, and you can see the stub entry point the kernel uses for starting up the workqueues in the asm stubs in start_wqthread.s. You can also go crawling through xnu to see how it upcalls into the stub if you really want.
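To make the "it is a library" point concrete, here is a minimal sketch of submitting work to GCD from plain C, using dispatch_async_f rather than the blocks extension (the printed message is just an illustration). The caller never creates a thread itself; libdispatch, with the kernel workqueue support where available, decides when and on how many worker threads the function runs:

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    /* The unit of work handed to GCD; it runs on whatever worker thread
     * the library (and kernel) decide to provide. */
    static void work(void *context)
    {
        printf("running on a GCD worker thread: %s\n", (const char *)context);
    }

    int main(void)
    {
        dispatch_group_t group = dispatch_group_create();
        dispatch_queue_t queue =
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

        /* Enqueue the continuation; no thread is created by the caller here. */
        dispatch_group_async_f(group, queue, (void *)"hello", work);

        /* Wait until the submitted work has completed. */
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
        dispatch_release(group);
        return 0;
    }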
Can anyone give a concise set of real-world considerations that would drive the choice of whether or not to use inetd to manage a program that acts as a network server?
(If inetd is used, I think it alters the requirements around networking code in the program, so I think it's definitely programming-related and not general IT)
The question is based around an implementation I've seen that uses a control program managed by inetd to start a network listener that then runs forever and takes constant and heavy load. It didn't seem like a good fit with the textbook inetd usage profile (on-demand, infrequently used, lightweight) and got me interested in the more general question.
It depends on the usage pattern for your service. If the startup time for your daemon is low and you expect it to be used infrequently, then inetd might be a good fit. It reduces or even eliminates the need to write any additional networking code.
If your daemon is more heavyweight or more frequently used, you're probably better off writing it standalone. You can just as easily write an init.d script and some conf.d configuration to go with it, and it will be no harder for an admin to manage. Most programming languages these days have easy-to-use socket libraries, so in many cases the networking code may not even be that difficult.
In my experience, few admins these days are familiar with inetd. Most daemons just provide their own init script. In fact, of the few hundred systems I manage, I can't think of a single one that launches anything through inetd at all. That's something worth considering.
Hooking into inetd will make your service slightly easier to manage from an operational point of view, as inetd allows a sysadmin to control practically everything about how network communication with your program happens. However, it will require a few code changes to your program. Also, it may not be as efficient as just running your program as a daemon to begin with.
EDIT: I personally never use inetd and always opt to write server processes as standalone daemons.
Another factor worth considering when deciding whether to use inetd is how much memory the process handling a request consumes on average. If this is fairly high, then under high load you risk running out of memory (since inetd forks a process per connection). The same server might be implementable in a multithreaded or select/poll style, possibly allowing for higher load and less memory per connection.
You can use inetd to give TCP/UDP capabilities to simple programs that operate over stdin/stdout.
Without inetd, your program will need to manage a slew of concerns, including network interfaces, sockets, forking, resource limits, etc.
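As an illustration, an inetd-launched service can be nothing more than a filter over stdin/stdout, since inetd accepts the connection and hands it to the program on those descriptors. A minimal sketch (the line-echo behaviour and the sample inetd.conf entry are assumptions for the example, not anything discussed above):

    #include <stdio.h>

    /* inetd connects the accepted client socket to fds 0 and 1, so plain
     * stdio is all the "networking" code this program needs.
     * A matching /etc/inetd.conf entry might look like (hypothetical):
     *   myecho stream tcp nowait nobody /usr/local/bin/myecho myecho
     * with "myecho" also registered in /etc/services. */
    int main(void)
    {
        char line[1024];

        while (fgets(line, sizeof line, stdin) != NULL) {
            fputs(line, stdout);     /* echo the request line back */
            fflush(stdout);          /* push the reply over the socket */
        }
        return 0;
    }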
What alternative strategy are you considering?
inetd is a good way of ensuring your server starts when the OS boots into the appropriate run level. Even if you do design your server to have some other management mechanism, inetd could still wrap all the commands quite simply; it's only shell scripts, after all.