How to do Memory-mapped IO in Erlang?

How to do Memory-mapped IO in Erlang? - erlang

I have been considering using Erlang for an embedded system.
The one thing I am missing in my research is the ability to do direct memory mapping.
Is this expected to be done via a NIF (Native Interface) or some other method (if so, what)?

There is no memory mapped IO interface in the Erlang VM. You would need to use NIF or alternatively you can try to make such IO subsystem available as an file descriptor. Then erlang:open_port/2 can be used to communicate with that.

Related

Can I write a file to a specific cluster location?

You know, when an application opens a file and write to it, the system chooses in which cluster will be stored. I want to choose myself ! Let me tell you what I really want to do... In fact, I don't necessarily want to write anything. I have a HDD with a BAD range of clusters in the middle and I want to mark that space as it is occupied by a file, and eventually set it as a hidden-unmoveable-system one (like page file in windows) so that it won't be accessed anymore. Any ideas on how to do that ?
Later Edit:
I think THIS is my last hope. I just found it, but I need to investigate... Maybe a file could be created anywhere and then relocated to the desired cluster. But that requires writing, and the function may fail if that cluster is bad.

I believe the answer to your specific question: "Can I write a file to a specific cluster location" is, in general, "No".
The reason for that is that the architecture of modern operating systems is layered so that the underlying disk store is accessed at a lower level than you can access, and of course disks can be formatted in different ways so there will be different kernel mode drivers that support different formats. Even so, an intelligent disk controller can remap the addresses used by the kernel mode driver anyway. In short there are too many levels of possible redirection for you to be sure that your intervention is happening at the correct level.
If you are talking about Windows - which you haven't stated but which appears to assumed - then you need to be looking at storage drivers in the kernel (see https://learn.microsoft.com/en-us/windows-hardware/drivers/storage/). I think the closest you could reasonably come would be to write your own Installable File System driver (see https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/_ifsk/). This is really a 'filter' as it sits in the IO request chain and can intercept and change IO Request Packets (IRPs). Of course this would run in the kernel, not in userspace, and normally this would be written in C and I note your question is tagged for Delphi.
Your IFS Driver can sit at differnt levels in the request chain. I have used this technique to intercept calls to specific file system locations (paths / file names) and alter the IRP so as to virtualise the request - even calling back to user space from the kernel to resolve how the request should be handled. Using the provided examples implementing basic functionality with an IFS driver is not too involved because it's a filter and not a complete storgae system.
However the very nature of this approach means that another filter can also alter what you are doing in your driver.
You could look at replacing the file system driver that interfaces to the hardware, but I think that's likely to be an excessive task under the circumstances ... and as pointed out already by #fpiette the disk controller hardware can remap your request anyway.
In the days of MSDOS the access to the hardware was simpler and provided by the BIOS which could be hooked to allow the requests to be intercepted. Modern environments aren't that simple anymore. The IFS approach does allow IO to be hooked, but it does not provide the level of control you need.
EDIT regarding suggestion by the OP of using FSCTL_MOVE_FILE
For simple environment this may well do what you want, it is designed to support a defragmentation process.
However I still think there's no guarantee that this actually will do what you want.
You will note from the page you have linked to it states that it is moving one or more virtual clusters of a file from one logical cluster to another within the same volume
This is a code that's passed to the underlying storage drivers which I have referred to above. What the storage layer does is up to the storage layer and will depend on the underlying technology. With more advanced storage there's no guarantee this actually addresses the physical locations which I believe your question is asking about.
However that's entirely dependent on the underlying storage system. For some types of storage relocation by the OS may not be honoured in the same way. As an example consider an enterprise storage array that has a built in data-tiering function. Without the awareness of the OS data will be relocated within the storage based on the tiering algorithms. Also consider that there are technologies which allow data to be directly accessed (like NVMe) and that you are working with 'virtual' and 'logical' clusters, not physical locations.
However, you may well find that in a simple case, with support in the underlying drivers and no remapping done outside the OS and kernel, this does what you need.

Since you problem is to mark bad cluster, you don't need to write any program. Use the command line utility CHKDSK that Windows provides.
I an elevated command prompt (Run as administrator), run the command:
chkdsk /r c:
The check will be done on the next reboot.
Don't forget to read the documentation.

Using kqueue for simple async io

How does one actually use kqueue() for doing simple async r/w's?
It's inception seems to be as a replacement for epoll(), and select(), and thus the problem it is trying to solve is scaling to listening on large number of file descriptors for changes.
However, if I want to do something like: read data from descriptor X, let me know when the data is ready - how does the API support that? Unless there is a complimentary API for kicking-off non-blocking r/w requests, I don't see a way other than managing a thread pool myself, which defeats the purpose.
Is this simply the wrong tool for the job? Stick with aio?
Aside: I'm not savvy with how modern BSD-based OS internals work - but is kqueue() built on aio or visa-versa? I would imagine it would depend on whether the OS io subsystem system is fundamentally interrupt-driven or polling.

None of the APIs you mention, aside from aio itself, has anything to do with asynchronous IO, as such.
None of select(), poll(), epoll(), or kqueue() are helpful for reading from file systems (or "vnodes"). File descriptors for file system items are always "ready", even if the file system is network-mounted and there is network latency such that a read would actually block for a significant time. Your only choice there to avoid blocking is aio or, on a platform with GCD, dispatch IO.
The use of kqueue() and the like is for other kinds of file descriptors such as sockets, pipes, etc. where the kernel maintains buffers and there's some "event" (like the arrival of a packet or a write to a pipe) that changes when data is available. Of course, kqueue() can also monitor a variety of other input sources, like Mach ports, processes, etc.
(You can use kqueue() for reads of vnodes, but then it only tells you when the file position is not at the end of the file. So, you might use it to be informed when a file has been extended or truncated. It doesn't mean that a read would not block.)
I don't think either kqueue() or aio is built on the other. Why would you think they were?

I used kqueues to adapt a Linux proxy server (based on epoll) to BSD. I set up separate GCD async queues, each using a kqueue to listen on a set of sockets. GCD manages the threads for you.

Can erlang use named pipes instead of sockets?

NGINX and other servers offer the option to use named pipes (mkfifo).
Can erlang use these instead of ports for nif interaction. What if I wanted to make 70,000 connections to my NIF (don't judge).

In short, no.
This is covered in the Erlang FAQ on opening device files. It boils down to it being hard/impossible to write the Erlang runtime in a portable way across Unices (not to mention Windows) so that it can access things like device files and named pipes without blocking on at least some of them. That blocking would screw up the "soft realtime" nature of the Erlang runtime.
What is possible is to write a C program that communicates with the Erlang runtime as a "port process", and that program can communicate over the named pipe (and block or not or whatever without screwing up the Erlang runtime).

Pass an Interface to a different process

I use WM_COPYDATA to enable communication between my two processes A and B. There is no problem to exchange data with basic data types.
Now I have a problem, in some case I want to pass an Interface (IDispatch) from my process A to my process B. Is it possible?

It is not possible to directly pass an interface pointer to another process. Like any other pointer, an interface is only valid in the process address space that instantiates it at runtime. COM has its own mechanism for marshaling interfaces and data across process boundaries, even across different apartments in the same process. In the case of interfaces, that involves proxies and stubs which run in each process/apartment and communicate with each other using various IPC mechanisms, such as pipes, RPC, or TCP/IP. Have a look at these articles for how using interfaces across processes/apartments is accomplished:
Inter-Object Communication
Understanding Custom Marshaling Part 1
To do what you are asking for, without resorting to implementing custom marshaling, you would have to make one of the processes act as an out-of-process COM server, and then the other process can use CoCreateInstance() or GetActiveObject() to obtain an interface pointer to the server's object that works within its local address space, and let COM handle the marshaling details for you.

It can't be done directly, but you can use a Client-Server service framework, which may be interface based.
For instance, see the last feature of our Open Source mORMot framework: Interface based services sample code and this link.
You can execute an interface on a remote process. The feature handles all communication means of the framework, i.e. in-process call, GDI messages, named pipes and TCP/HTTP. Internally it will use WM_COPYDATA for GDI messages, then transmit the parameters and results as JSON. Use this link to download the source code (use the http://synopse.info/fossil 1.16+ version) and the documentation (there are several pages about how to implement those services).
It is an Open-Source project, working with Delphi 6 up to XE2.
You can also expose your interface with a SOAP or DataSnap Client-Server (if you have the corresponding version of Delphi), or n-Tier commercial packages (like http://www.remobjects.com/da). This is similar to the method implemented in mORMot.
COM is also a good candidate, native to Windows, but it is more difficult to initialize: you'll have to register the COM on each PC (with administrator rights), and you won't be able to make it work over a network (DCOM is deprecated, remember). COM is good if you want your service to be shared with other languages, like .Net, but only locally.

Running the erlang vm inside a process

It's possible to run the erlang VM inside a process?
I'm asking this because I'm trying to use some code using the erl_nif, witch is very cool indeed, but I have to send information back to the process that could possibily spawn the VM. The only approach I've thinked is to create some IPC communication, like pipes or reading from COUT, but this imposes the need of some protocol, and would be cool if I could call what I need directly from the function response.

Even don't mention that Erlang VM manage OS threads and has event loop, how do you want it will be stable and predictable when running inside an unpredictable OS process? No, you can't run Erlang VM inside an OS process.
Think about Erlang VM as about operating system:
Write all your code in Erlang;
Use NIFs/Port drivers only if you really need more speed. But be aware - you're in "kernel mode" now!
Use Ports/Erl_interface/C Nodes if you have many code written in some other language;

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart