Running the Erlang VM inside a process

Is it possible to run the Erlang VM inside a process?
I'm asking because I'm trying to use some code through erl_nif, which is very cool indeed, but I need to send information back to the process that would spawn the VM. The only approach I've thought of is to set up some IPC, like pipes or reading from stdout, but that imposes the need for some protocol, and it would be nicer if I could get what I need directly from the function's return value.

Leaving aside the fact that the Erlang VM manages its own OS threads and has its own event loop, how could it stay stable and predictable while running inside an unpredictable OS process? No, you can't run the Erlang VM inside your own OS process.
Think about the Erlang VM as you would about an operating system:
Write all your code in Erlang;
Use NIFs/Port drivers only if you really need more speed. But be aware - you're in "kernel mode" now!
Use Ports/erl_interface/C nodes if you have a lot of code written in some other language;
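For the NIF case, the Erlang side is just a module that loads the shared library and exposes stub functions. A minimal sketch, assuming a hypothetical my_nif library in ./priv (the names are made up, not a real API):

-module(my_nif).
-export([fast_sum/1]).
-on_load(init/0).

%% Load the NIF shared library (./priv/my_nif.so) when this module is loaded.
init() ->
    erlang:load_nif("./priv/my_nif", 0).

%% Stub that is replaced by the C implementation once the library is loaded.
%% If loading fails, calling it raises an error instead of taking the VM down.
fast_sum(_List) ->
    erlang:nif_error(nif_not_loaded).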

Related

In Elixir, what's the difference between a node and a process?

This question is tagged "Erlang" as well, because these Elixir modules more or less just wrap Erlang functionality.
Nodes seem like named processes. They can execute functions concurrently, link to other nodes, and act like process supervisors. Many of the functions in each module appear to be the same, strengthening the similarities.
What is the value of the Node module? What does it offer that Process doesn't?
Nodes seem like named processes.
It seems you've misunderstood what a Node is. A Node is an instance of the Erlang VM, running as one Operating System process. An Erlang Process is a unit executing code, similar to an Operating System thread but lighter. An Erlang Process runs on an Erlang Node, just like Operating System processes run on an Operating System. An Erlang Process cannot run without an Erlang Node.
These are two distinct concepts. A node is an instance of an Erlang virtual machine, and a process is a very lightweight thread running inside that virtual machine.
Here is the definition of an Elixir process, according to the documentation:
In Elixir, all code runs inside processes. Processes are isolated from each other, run concurrent to one another and communicate via message passing. Processes are not only the basis for concurrency in Elixir, but they also provide the means for building distributed and fault-tolerant programs.
Elixir’s processes should not be confused with operating system processes. Processes in Elixir are extremely lightweight in terms of memory and CPU (unlike threads in many other programming languages).
And a node is the representation of an Erlang virtual machine. Here are some examples of functions from the Node module:
alive?()
Returns true if the local node is alive
connect(node)
Establishes a connection to node
disconnect(node)
Forces the disconnection of a node
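To make the distinction concrete, here is a sketch from an Erlang shell (the Elixir Node and Process modules wrap these same built-ins); the node names and the printed output are made up:

%% started with: erl -sname a
(a@host)1> node().        %% the node: this running VM instance
a@host
(a@host)2> self().        %% this shell's own lightweight Erlang process
<0.85.0>
(a@host)3> net_kernel:connect_node(b@host).   %% connect two VM instances
true
(a@host)4> spawn(b@host, io, format, ["hello from a process on node b~n"]).
<9001.95.0>

Stopping a node terminates every process running on it, while a single process can crash and be restarted without the node itself being affected.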

Can Erlang use named pipes instead of sockets?

NGINX and other servers offer the option to use named pipes (mkfifo).
Can Erlang use these instead of ports for NIF interaction? What if I wanted to make 70,000 connections to my NIF (don't judge)?
In short, no.
This is covered in the Erlang FAQ on opening device files. It boils down to it being hard/impossible to write the Erlang runtime in a portable way across Unices (not to mention Windows) so that it can access things like device files and named pipes without blocking on at least some of them. That blocking would screw up the "soft realtime" nature of the Erlang runtime.
What is possible is to write a C program that communicates with the Erlang runtime as a "port process", and that program can communicate over the named pipe (and block or not or whatever without screwing up the Erlang runtime).
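A rough sketch of that arrangement from the Erlang side, assuming a hypothetical C program fifo_bridge that opens the FIFO and forwards its contents as length-prefixed packets:

-module(fifo_bridge_port).
-export([start/1]).

%% Spawn the external bridge program; it is the one that opens and blocks on
%% the named pipe, so the Erlang schedulers are never blocked.
start(FifoPath) ->
    Port = open_port({spawn_executable, "./fifo_bridge"},
                     [{args, [FifoPath]}, {packet, 2}, binary, exit_status]),
    loop(Port).

loop(Port) ->
    receive
        {Port, {data, Data}} ->
            io:format("got ~p bytes from the pipe~n", [byte_size(Data)]),
            loop(Port);
        {Port, {exit_status, Status}} ->
            {bridge_exited, Status}
    end.

If fifo_bridge crashes or blocks forever, Erlang only sees a dead or silent port; the runtime itself is unaffected.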

NIF to wrap my multi-threaded C++ code

I have C++ code that implements a special protocol over the serial port. The code is multi-threaded: it internally polls the serial port and does its own cyclic processing. I would like to call this driver from Erlang and also receive events from it. My concern is that this C++ code is multi-threaded and also stateful, meaning that when I call a certain function on the driver, it caches things internally that will be used/required on subsequent calls. My questions are:
1. Does a NIF run in the same OS process as the rest of my Erlang processes, or is it launched in a separate OS process?
2. Does it make sense to wrap this multi-threaded, stateful C++ code with a NIF?
4. If a NIF is not the right approach, what is a better way to make Erlang talk back and forth with this C++ code? I would also prefer my C++ code to live in the same OS process as the rest of my Erlang processes; linked-in drivers look like an option, but I'm not sure whether the multi-threaded nature of my C++ code fits that model. Plus, I hear they can mess up the Erlang scheduler?
Unlike ports, NIFs run within the Erlang VM's OS process, just like linked-in drivers. Because of that, a crash in a NIF will bring the VM down as well. And, answering your last question in advance: NIFs, like drivers, may block your schedulers.
That depends on the functionality you are implementing in this C++ code. Given answer 1, you probably want to avoid concurrency in the C++ part, since it's a potential source of errors. That's not always possible, of course. But if you are implementing, say, a worker pool, go ahead and write single-threaded code, spawning it as many times as you need.
Drivers can be multi-threaded too, with the same potential problems and quite similar performance (well, still slightly faster than NIFs). If you are not completely sure about your C++ code's stability, run it as an Erlang port.
Speaking of the difference between NIFs and drivers: the former are natively synchronous, while the latter can be asynchronous (which can be a huge advantage if you don't need answers for most of the commands). Drivers are easier to mess up and harder to implement (but once you grasp the main patterns and pitfalls, they are actually okay).
Here's a good start for drivers:
http://www.erlang.org/doc/apps/erts/driver.html
And something similar (behold the difference in complexity) for NIFs:
http://www.erlang.org/doc/tutorial/nif.html
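If you do choose a NIF but want the driver-style asynchrony mentioned above, one common pattern is to pass the caller's pid into the NIF and have the C side deliver the result later with enif_send. A sketch of the Erlang half, where serial_nif:decode_async/2 is a hypothetical NIF:

%% The NIF returns immediately; a worker thread in C later sends
%% {serial_result, Result} to the pid it was given.
request_decode(Frame) ->
    ok = serial_nif:decode_async(self(), Frame),
    receive
        {serial_result, Result} -> {ok, Result}
    after 5000 ->
        {error, timeout}
    end.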

When I make a C plugin for Erlang, will it take full advantage of the spawning system? Does it block?

E.g. I have a program that eats a lot of CPU. I make a C plugin that can interact with Erlang. I spawn 16 threads with SMP +16. Will it give me performance similar to something like pthreads on a multicore machine? The threads do not need to communicate with each other.
"C plugin" is not clearly defined in the erlang context.
Either you are writing a port, which basically forks a system process,
or you are writing a linked-in driver, which runs in the same context as the Erlang VM.
In both cases you can take advantage of multicore CPUs. The first case just relies on the OS to place the OS processes on different CPUs (which any decent SMP OS should be capable of).
In the second case I'm not so sure, but I would expect the drivers to run on different CPU cores as well. Unless you have a strong reason for using linked-in drivers and you know exactly what you are doing, I recommend against them for complexity and stability reasons. If a port crashes, Erlang is notified and can restart it or take other precautions. If a driver crashes, the whole Erlang VM is taken down hard.
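To illustrate that last point, a minimal sketch (the my_plugin executable is hypothetical) of running the plugin as a port and restarting it when it dies - something a crashing linked-in driver or NIF would never give you the chance to do:

-module(plugin_port).
-export([start/0]).

%% Start the external plugin as a port and restart it whenever it exits.
start() ->
    Port = open_port({spawn_executable, "./my_plugin"}, [binary, exit_status]),
    monitor_plugin(Port).

monitor_plugin(Port) ->
    receive
        {Port, {data, Data}} ->
            io:format("plugin sent ~p~n", [Data]),
            monitor_plugin(Port);
        {Port, {exit_status, Status}} ->
            io:format("plugin exited with status ~p, restarting~n", [Status]),
            start()
    end.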
The main question is which part of the problem you want to solve in Erlang. If you use Erlang only to start your "plugins", this can be solved much more easily by just starting processes from the shell: since your "threads" don't need to communicate, why not pass the parameters on the command line and fork worker processes from a shell script?

Is it better to start multiple Erlang nodes per machine, or just one per machine?

Preface: when I say "machine" below, I mean either a physical dedicated server or a virtual private server. When I say "node" I mean an instance of the Erlang virtual machine, of which there could be several running as separate processes under a single Unix kernel.
I've got a project that involves multiple Erlang/OTP applications. The applications will be running together and talking to each other on the same machine. They will all be hitting the disk, using memory and spawning Erlang processes. They will also be using network resources, because they will be talking to similar machines with the same set of applications running on them in a cluster.
Almost all of this communication is via HTTP, so I could separate each Erlang/OTP application into a separate instance of the Erlang VM on the same machine and they could still talk to each other.
My question is: is it better to have them all running under one Erlang VM, so that this VM process can allocate resources among them and schedule the execution of the various Erlang processes?
Or is it better to have separate Erlang nodes on a given server?
If one is better than the other, why?
I'm assuming that running all of these apps in a single Erlang VM, which is given essentially full run of the server, will result in better performance. The OS is just managing the disk and RAM at a low level, and only has one significant process (the Erlang VM) to switch to... and the Erlang VM is probably smarter about allocating resources when it has a holistic view of all the Erlang processes.
This may be something that I need to test, but I'm not in a position to do so effectively in the near term.
The answer is: it depends.
Advantages of using a single node:
Memory is controlled by a single Erlang VM. It is way easier.
Inter-application communication (if using Erlang messaging) is faster; see the sketch below.
Fewer operating system context switches happen.
Advantages of using multiple nodes:
If the system is linking in C code to the VM, death of one node due to a bug in C will not kill the others.
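To illustrate the messaging point from the first list, a sketch (billing_server and the node name are made up): inside one node an application just sends a message, while across nodes the same send travels over the distribution connection:

-module(comm_sketch).
-export([local_notify/1, remote_notify/1]).

%% Same node: plain message passing to a locally registered process.
local_notify(Msg) ->
    billing_server ! Msg.

%% Separate nodes: send to the same registered name on another node;
%% the message is serialized and goes over the distribution channel.
remote_notify(Msg) ->
    {billing_server, 'billing@host'} ! Msg.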
I agree with #I GIVE CRAP ANSWERS.
I would go with one VM. Here is why:
dynamic handling of run-time queues belonging to schedulers (important when the CPU load has varied origins)
fewer VMs to monitor
better understanding of memory allocation, and it's easier to spot a malicious process (you can compare all of them at once)
much easier inter-application supervision
I wouldn't worry about a VM crash - you need to be prepared for that anyway. Heart works especially well in a cluster of equal units.
We've always used one VM per application because it's easier to manage.
The scheduler and SMP support in Erlang have come a long way in the past few years, so there isn't as much reason as there used to be to run multiple VMs on the same machine.
I agree with the previous answers, but there is one scenario where having multiple nodes per machine is the answer: when a heavy task hits the node. A task may take several minutes to complete, and in such a case a gen_server will hold the node until the task finishes.
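A sketch of the scenario being described, where do_heavy_work/1 is a hypothetical function that may run for minutes; while this clause runs, every other call to the same gen_server queues up behind it:

handle_call({heavy_task, Input}, _From, State) ->
    Result = do_heavy_work(Input),   %% minutes of CPU work happen here
    {reply, Result, State}.

Moving such workloads onto a separate node is one way out; another is to keep the server responsive by spawning the work and replying later with gen_server:reply/2.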

Resources