popen() implementation, fd leaks, and vfork()

The documentation for the glibc implementation of popen() specifies that:
The popen() function shall ensure that any streams from previous popen() calls that remain open in the parent process are closed in the new child process.
Why? If the purpose is to avoid fd leaks, why not just close all open fds?
The glibc implementation of popen() uses fork(). Although there are dup2() and close() calls between fork() and exec(), is it possible to replace fork() with vfork() to improve performance?
Is the Linux implementation of popen() based on fork() rather than vfork()? Why (or why not)?
I'm going to write a bidirectional version of popen(), which returns two FILE*: one for reading and one for writing. How do I implement it correctly? It should be thread-safe and leak no file descriptors; faster is better.

vfork(2) is obsolete (removed from POSIX.1-2008), and fork(2) is quite efficient, since it uses copy-on-write techniques.
popen(3) cannot close all open files, because it does not know them and cannot know which are relevant. Imagine a program which gets a socket and passes its file descriptor as an argument to the popen-ed command (or simply popen("time cat /etc/issue >&9; date","r") ...). See also fcntl(2) with FD_CLOEXEC, open(2) with O_CLOEXEC, and execve(2).
File descriptors are program-wide and process-wide scarce resources, and it is your responsibility to manage them correctly. You should know which fds should be closed in your child process before execve. If you know what program is execve-d and what fds it needs, you can close all other fds (or most of them, perhaps with for (int i=STDERR_FILENO+1; i<64; i++) (void) close(i);) before execve.
If you are coding a reusable library, document its policy regarding file descriptors (and any other global process-wide resources), and probably use FD_CLOEXEC on any file descriptors it obtains itself (i.e. not passed in as explicit arguments or data), e.g. for internal use.
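For example, a library can mark descriptors it opens for internal use as close-on-exec; a minimal sketch, where the function name open_internal_log is purely illustrative:

#include <fcntl.h>

/* Open a file for the library's own use and keep it from leaking into
   any child later started with fork()/execve() (including popen()). */
int open_internal_log(const char *path)
{
    int fd = open(path, O_WRONLY | O_APPEND | O_CLOEXEC);   /* atomic: no race window */
    if (fd < 0)
        return -1;
    /* For descriptors obtained some other way, the flag can be set afterwards:
       fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);  */
    return fd;
}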
It looks like you are reinventing p2open (in which case you probably need to understand the implementation details of FILE in your C standard library, or else use fdopen(3) with care and caution); you might find some existing implementation of it. Beware: the process using it probably needs some event loop (e.g. built above poll(2)) to avoid a potential deadlock, with both parent and child processes blocked on reading.
Did you consider using some existing event loop infrastructure (e.g. libevent, libev, glib from GTK, etc....)?
BTW Linux has several free software implementations for its C standard library. The GNU libc is quite common, but there is musl-libc and several others. Study the source code of your libc.
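To make the p2open-style approach concrete, here is a minimal sketch of a bidirectional popen(); it assumes Linux's pipe2() so that both pipes are created with O_CLOEXEC atomically and cannot leak into children forked concurrently by other threads. The name popen2() and its signature are illustrative, not a standard API; the caller must still waitpid() for the child and, as noted above, interleave reads and writes (e.g. with poll(2)) to avoid deadlock.

#define _GNU_SOURCE              /* for pipe2() */
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>

int popen2(const char *cmd, FILE **rd, FILE **wr, pid_t *pidp)
{
    int tochild[2], fromchild[2];

    if (pipe2(tochild, O_CLOEXEC) < 0)
        return -1;
    if (pipe2(fromchild, O_CLOEXEC) < 0) {
        close(tochild[0]); close(tochild[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(tochild[0]); close(tochild[1]);
        close(fromchild[0]); close(fromchild[1]);
        return -1;
    }

    if (pid == 0) {                            /* child */
        dup2(tochild[0], STDIN_FILENO);        /* dup2() clears O_CLOEXEC on the copy */
        dup2(fromchild[1], STDOUT_FILENO);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                            /* exec failed */
    }

    close(tochild[0]);                         /* parent: drop the child's ends */
    close(fromchild[1]);
    *wr = fdopen(tochild[1], "w");             /* write to the child's stdin */
    *rd = fdopen(fromchild[0], "r");           /* read from the child's stdout */
    *pidp = pid;                               /* caller must waitpid() eventually */
    return 0;
}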

Why does VkAccessFlagBits include both read bits and write bits?

In vulkan.h, every instance of VkAccessFlagBits appears in a pair that contains a srcAccessMask and a dstAccessMask:
VkAccessFlags srcAccessMask;
VkAccessFlags dstAccessMask;
In every case, according to my understanding, the purpose of these masks is to help designate two sets of operations, such that results of operations in the first set will be visible to operations in the second set. For instance, write operations occurring prior to a barrier should not get hung up in caches but should instead propagate all the way to locations from which they can be read after the barrier. Or something like that.
The access flags come in both READ and WRITE forms:
/* ... */
VK_ACCESS_SHADER_READ_BIT = 0x00000020,
VK_ACCESS_SHADER_WRITE_BIT = 0x00000040,
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT = 0x00000080,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100,
/* ... */
But it seems to me that srcAccessMask should probably always be some sort of VK_ACCESS_*_WRITE_BIT combination, while dstAccessMask should always be a combination of VK_ACCESS_*_READ_BIT values. If that is true, then the READ/WRITE distinction is identical to and implicit in the src/dst distinction, and so it should be good enough to just have VK_ACCESS_SHADER_BIT etc., without READ_ or WRITE_ variants.
Why are there READ_ and WRITE_ variants, then? Is it ever useful to specify that some read operations must fully complete before some other operations have begun? Note that all operations using VkAccessFlagBits produce (I think) execution dependencies as well as memory dependencies. It seems to me that the execution dependencies should be good enough to prevent earlier reads from receiving values written by later writes.
While writing this question I encountered a statement in the Vulkan specification that provides at least part of an answer:
Memory dependencies are used to solve data hazards, e.g. to ensure that write operations are visible to subsequent read operations (read-after-write hazard), as well as write-after-write hazards. Write-after-read and read-after-read hazards only require execution dependencies to synchronize.
This is from the section 6.4. Execution And Memory Dependencies. Also, from earlier in that section:
The application must use memory dependencies to make writes visible before subsequent reads can rely on them, and before subsequent writes can overwrite them. Failure to do so causes the result of the reads to be undefined, and the order of writes to be undefined.
From this I surmise that, yes, the execution dependencies produced by the Vulkan commands that involve these access flags probably do free you from ever having to put a VK_ACCESS_*_READ_BIT into a srcAccessMask field--but that you might in fact want to have READ_ flags, WRITE_ flags, or both in some of your dstAccessMask fields, because apparently it's possible to use an explicit dependency to prevent read-after-write hazards in such a way that write-after-write hazards are NOT prevented. (And maybe vice-versa?)
Like, maybe your Vulkan implementation will sometimes decide that a write does not actually need to be propagated all the way through a particular cache to its final specified destination for the sake of a subsequent read operation, IF Vulkan happens to know that that read operation will simply read from that same cache, saving some time? But then a second write might happen, and write to a different cache, and there'll be two caches left in a race (with the choice of winner undefined) to send their two values to the same spot. Or something? Maybe my mental model of these caches is entirely wrong.
It is fairly solidly established, at least, that memory barriers are confusing.
Let's go over all the possibilities:
read–read: well, yeah, that one is pretty useless. Khronos seems to agree (#131) that it is a pointless value in src (basically equivalent to 0).
read–write: an execution dependency should be sufficient to synchronize this without any access flags. Khronos seems to agree (#131) that it is a pointless value in src (basically equivalent to 0).
write–read: that's the obvious and most common one.
write–write: similar reasoning to write–read above: without it, the order of the writes would be undefined. It is a bit pointless in most situations to overwrite something you haven't even read in between, but hey, now you have a way to synchronize it.
You can provide a bitmask combining several of these accesses in both src and dst, in which case it makes sense to have both kinds of flags so the driver can sort the dependencies out for you. (I don't expect any performance overhead from this at the API level, so it is allowed as a convenience.)
From an API design perspective, separating them could mean adding a different enum type for srcAccessMask. But the _READ variants could instead simply be forbidden in srcAccessMask through "Valid Usage" rules, which makes that argument weak. The _READ variants were probably kept valid in src because they are benign.
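To make the common write-then-read case concrete, here is a minimal sketch of a global memory barrier between two compute dispatches; cmdBuf is assumed to be a VkCommandBuffer already in the recording state, and the stage masks are just one plausible choice:

#include <vulkan/vulkan.h>

/* Make shader writes recorded before this barrier available to and visible
   for shader reads recorded after it (a classic read-after-write hazard). */
void record_write_to_read_barrier(VkCommandBuffer cmdBuf)
{
    VkMemoryBarrier barrier = {
        .sType         = VK_STRUCTURE_TYPE_MEMORY_BARRIER,
        .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT,   /* what must be made available */
        .dstAccessMask = VK_ACCESS_SHADER_READ_BIT     /* what it must be visible to  */
    };
    vkCmdPipelineBarrier(cmdBuf,
                         VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,  /* src stage mask */
                         VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,  /* dst stage mask */
                         0,                                     /* dependency flags */
                         1, &barrier,                           /* global memory barriers */
                         0, NULL,                               /* buffer memory barriers */
                         0, NULL);                              /* image memory barriers */
}

For a write-after-write hazard the same call would simply use VK_ACCESS_SHADER_WRITE_BIT in both masks.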

Using pthreads with MPICH

I am having trouble using pthreads in my MPI program. My program runs fine without involving pthreads, but I then decided to execute a time-consuming operation in parallel, and hence I create a pthread that does the following (MPI_Probe, MPI_Get_count, and MPI_Recv). My program fails at MPI_Probe and no error code is returned. This is how I initialize the MPI environment:
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided_threading_support);
The provided threading support is '3' which I assume is MPI_THREAD_SERIALIZED. Any ideas on how I can solve this problem?
The provided threading support is '3' which I assume is MPI_THREAD_SERIALIZED.
The MPI standard defines thread support levels as named constants and only requires that their values are monotonic, i.e. MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE. The actual numeric values are implementation-specific and should never be used or compared against.
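A minimal sketch of requesting and checking the level that way, comparing against the named constants rather than their numeric values:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI library does not provide MPI_THREAD_MULTIPLE\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    /* ... safe to make MPI calls from multiple threads here ... */
    MPI_Finalize();
    return 0;
}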
MPI communication calls by default never return error codes other than MPI_SUCCESS. The reason is that MPI calls the communicator's error handler before an MPI call returns, and all communicators are initially created with MPI_ERRORS_ARE_FATAL installed as their error handler. That error handler terminates the program and usually prints some debugging information, e.g. the reason for the failure. Both MPICH (and its countless variants) and Open MPI produce quite elaborate reports on what led to the termination.
To enable user error handling on communicator comm, you should make the following call:
MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);
Watch out for the error codes returned - their numerical values are also implementation-specific.
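Putting both points together, a minimal sketch that switches MPI_COMM_WORLD to MPI_ERRORS_RETURN and turns a failing MPI_Probe into a readable message (the wrapper function is illustrative):

#include <mpi.h>
#include <stdio.h>

static void probe_with_error_reporting(void)
{
    MPI_Status status;
    char msg[MPI_MAX_ERROR_STRING];
    int len, rc;

    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    rc = MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
    if (rc != MPI_SUCCESS) {
        MPI_Error_string(rc, msg, &len);   /* the text is readable; the numeric code is not portable */
        fprintf(stderr, "MPI_Probe failed: %s\n", msg);
    }
}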
If your MPI implementation isn't willing to give you MPI_THREAD_MULTIPLE, there are three things you can do:
Get a new MPI implementation.
Protect MPI calls with a critical section.
Cut it out with the threading thing.
I would suggest #3. The whole point of MPI is parallelism -- if you find yourself creating multiple threads for a single MPI subprocess, you should consider whether those threads should have been independent subprocesses to begin with.
Particularly with MPI_THREAD_MULTIPLE. I could maybe see a use for MPI_THREAD_SERIALIZED, if your threads are sub-subprocess workers for the main subprocess thread... but MULTIPLE implies that you're tossing data around all over the place. That loses you the primary convenience offered by MPI, namely synchronization. You'll find yourself essentially reimplementing MPI on top of MPI.
Okay, now that you've read all that, the punchline: 3 is MPI_THREAD_MULTIPLE. But seriously. Reconsider your architecture.

Unix: sharing already-mapped memory between processes

I have a pre-built userspace library that has an API along the lines of
void getBuffer (void **ppBuf, unsigned long *pSize);
void bufferFilled (void *pBuf, unsigned long size);
The idea being that my code requests a buffer from the lib, fills it with stuff, then hands it back to the lib.
I want another process to be able to fill this buffer. I can do this by creating some new shared buffer via shm*/shm_* APIs, have the other process fill that, then copy it to the lib's buffer in the lib's local process, but this has the overhead of an extra (potentially large) copy.
Is there a way to share memory that has ALREADY been mapped for a process? eg something like:
[local lib process]
getBuffer (&myLocalBuf, &mySize);
shmName = shareThisMemory (myLocalBuf, mySize);
[other process]
myLocalBuf = openTheSharedMemory (shmName);
That way the other process could write directly into the lib's buffer.
(Synchronization between the processes is already taken care of so no problems there).
There are good reasons for not allowing this functionality, particularly from the security side of things. A "share this mem" API would subvert the access permissions system.
Just assume an application holds some sort of critical/sensitive information in memory; the app links (e.g. via a shared library, a preload, or a modified linker/loader) to some outside component, and said component, for the sheer fun of it, decides to "share out the address space". It'd be a free-for-all, a method to bypass any sort of data access permission/restriction. You'd tunnel your way into the app.
Not good for your usecase, admitted, but rather justified from the system / application integrity point of view. Try searching the web for /proc/pid/mem mmap vulnerability for some explanation why this sort of access isn't wanted (in general).
If the library you use is designed to allow such shared access, it must itself provide the hooks to either allocate such a shared buffer, or use an elsewhere-preallocated (and possibly shared) buffer.
Edit: To make this clear, the process boundary is explicitly about not sharing the address space (amongst other things).
If you require a shared address space, either use threads (then the entire address space is shared and there's never any need to "export" anything), or explicitly set up a shared memory region in the same way as you'd set up a shared file.
Look at it from the latter point of view: two processes that do not open a file with O_EXCL share access to it. But if one process already has it open with O_EXCL, then the only way to "make it shared" (open-able by another process) is to close() it first and then open() it again without O_EXCL. There's no other way to "remove" exclusive access from a file that you've opened as such other than to close it first.
Just as there is no way to remove exclusive access to a memory region mapped as such other than to unmap it first - and for a process' memory, MAP_PRIVATE is the default, for good reasons.
More: a process-shared memory buffer really isn't much different from a process-shared file; using SysV-IPC style semantics, you have:
              | SysV IPC shared memory              Files
==============+====================================================================
creation      | id = shmget(key,..., IPC_CREAT);    fd = open("name",...,O_CREAT);
lookup        | id = shmget(key,...);               fd = open("name",...);
access        | addr = shmat(id,...);               addr = mmap(...,fd,...);
              |
global handle | IPC key                             filename
local handle  | SHM ID number                       file descriptor number
mem location  | created by shmat()                  created by mmap()
I.e. the key is the "handle" you're looking for; pass that the same way you would pass a filename, and both sides of the IPC connection can then use that key to check whether the shared resource exists, as well as access (attach to) the contents through it.
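In code, the left-hand column of that table looks roughly like this; a minimal sketch where the path handed to ftok() is just some agreed-upon existing file and the buffer size is arbitrary:

#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>

#define BUF_SIZE 4096

int main(void)
{
    key_t key = ftok("/some/agreed/path", 42);        /* global handle (like a filename) */
    if (key == (key_t)-1) { perror("ftok"); return 1; }

    int id = shmget(key, BUF_SIZE, IPC_CREAT | 0600); /* local handle (like an fd)       */
    if (id < 0) { perror("shmget"); return 1; }

    void *addr = shmat(id, NULL, 0);                  /* memory location (like mmap)     */
    if (addr == (void *)-1) { perror("shmat"); return 1; }

    /* ... both processes can now read/write the same buffer through addr ... */

    shmdt(addr);
    return 0;
}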
A more modern way to share memory among processes is to use the POSIX shm_open() API.
Essentially, it's a portable way of putting files on a ramdisk (tmpfs). So one process uses shm_open plus ftruncate plus mmap. The other uses shm_open (with the same name) plus mmap plus shm_unlink. (With more than two processes, the last one to mmap it can unlink it.)
This way the shared memory will get reclaimed automatically when the last process exits; no need to explicitly remove the shared segment (as with SysV shared memory).
You still need to modify your application to allocate shared memory in this way, though.
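A minimal sketch of that sequence for the creating side, with an arbitrary name and size (on older glibc, link with -lrt); the other process would shm_open() the same name, mmap() it, and eventually call shm_unlink():

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

#define SHM_NAME "/mybuf"
#define SHM_SIZE 4096

int main(void)
{
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);  /* a "file" on tmpfs */
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, SHM_SIZE) < 0) { perror("ftruncate"); return 1; }

    char *buf = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                          /* the mapping stays valid after close() */

    /* ... fill buf; the peer maps the same name and reads it ... */

    munmap(buf, SHM_SIZE);
    return 0;
}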
In theory at least, you can record the memory address of the buffer you got from your lib and have the other process mmap the /proc/$PID_OF_FIRST_PROCESS/mem file with that address as the offset.
I haven't tested it, I'm not sure /proc/PID/mem actually has an mmap file op implemented, and there are a ton of security considerations, but it might work. Best of luck :-)

What all APIs are affected by {$IOCHECKS OFF}?

We have some ancient Delphi code (might have even originated as Turbo Pascal code) that uses {$I-}, aka {$IOCHECKS OFF}, which makes the code use IOResult instead of exceptions for disk I/O errors.
I want to get rid of the {$I-} and bring this code forward into the 1990s, but to do that, I'd like to know what all is affected by {$IOCHECKS OFF}. Does this only affect the crufty old built-in I/O functions like AssignFile / Reset / Rewrite / Append / CloseFile? Or does it affect more modern things like TFileStream as well? More importantly, what else might be affected that I'm not thinking of? (Delphi Basics suggests that it also affects MkDir and RmDir. If it affects those, there have to be more.)
The Delphi 2007 Help topic "Input output checking (Delphi)" (ms-help://borland.bds5/devcommon/compdirsinput_outputchecking_xml.html) says that this affects "I/O procedure[s]", and that "I/O procedures are described in the Delphi Language Guide." This doesn't help much, since CodeGear has never shipped a Language Guide, and the last time Borland shipped one was Delphi 5.
Which functions and classes behave differently under {$I-}?
EDIT: The accepted answer gives some great background, but here's the quick summary in alphabetized list form: {$IOCHECKS OFF} only affects the following routines from the System unit.
Append
BlockRead
BlockWrite
ChDir
CloseFile
Eof
Eoln
Erase
FilePos
FileSize
Flush
MkDir
Read
Readln
Rename
Reset
Rewrite
RmDir
Seek
SeekEof
SeekEoln
SetLineBreakStyle
Truncate
Write
Writeln
Since $I is a compiler directive, it can only affect compiler-generated code, and it can only affect code that actually gets compiled.
For those two reasons, it cannot affect things like TFileStream. It's a class in Classes.pas, which is a unit you don't compile. Any code in it is not affected by the $I directive. Furthermore, the compiler doesn't treat that class specially in any way. It's just another ordinary class.
The $I directive affects the language built-in functions that you've mentioned. The compiler generates calls to those functions specially. It also affects calls to write, writeln, and readln. It should also affect BlockRead and BlockWrite.
You can check the source code. Anything that calls SetInOutRes is susceptible to $I. That includes functions that open files (Append, Reset, and Rewrite), as well as anything else that accepts a parameter of type file or TextFile (Flush, BlockRead, BlockWrite, Erase, FilePos, Seek, FileSize, Read, Readln, Write, Writeln, Rename, Eof, SeekEof, Eoln, SeekEoln, Truncate, SetLineBreakStyle, and CloseFile). Also, anything that calls InOutError (ChDir, MkDir, and RmDir).
Notably absent from the list is AssignFile. That function doesn't actually do any I/O. It just sets up the file record so that Append, Reset, and Rewrite will know what to do.
I should point out that looking at the source code is just inference. The $I directive controls whether the compiler will insert calls to the __IOTest function in your own code after you call certain other functions. That function checks the value of InOutRes, and if it's not zero, it raises a run-time error (which may yield an exception if SysUtils is included in your program). We can't check the source code to directly find out what functions are affected by $I (since it's only called in compiler-generated code), so we're really just looking for which functions set InOutRes, with the assumption that they wouldn't bother doing that if they didn't know the compiler would check for it afterward.

Ant (or NAnt) in Lisp

In his article The Nature of Lisp, Slava Akhmechet introduces people to Lisp by using Ant/NAnt as an example. Is there an implementation of Ant/NAnt in Lisp, where you can use actual Lisp code, instead of XML, for defining things? I've had to deal with creating additions to NAnt, and have wished for a way to bypass the XML system in the way Slava shows could be done.
Ant is a program that interprets commands written in some XML language. You can, as justinhj mentioned in his answer, use some XML parser (like the mentioned XMLisp) to convert the XML description into some kind of Lisp data, and then write additional code in Lisp. You would also need to reimplement some of the Ant interpretation.
Much of the primitive stuff in Ant is not needed in Lisp. Some file operations are built into Lisp (delete-file, rename-file, probe-file, ...). Some are missing and need to be implemented; alternatively, you can use one of the existing libraries. Also note that you can LOAD Lisp files into Lisp and execute code, and there is the REPL, so it already comes with an interactive frontend (unlike Java).
Higher-level build systems in Common Lisp usually implement an abstraction called a 'SYSTEM'. There are several of those; ASDF is a popular choice, but there are others. A system has subsystems and files. A system also has a few options, and its components have options too. A system has either a structural description of the components, a description of the dependencies, or a kind of description of 'actions' and their dependencies. Typically these things are implemented in an object-oriented way, and you can implement 'actions' as Lisp (generic) functions. Lisp also provides functions like COMPILE-FILE, which will use the Lisp compiler to compile a file. If your code has, say, C files, you would need to call a C compiler, usually through some implementation-specific function that allows calling external programs (here, the C compiler).
As mentioned by dmitry-vk, ASDF is a popular choice. LispWorks provides Common Defsystem. Allegro CL has its own DEFSYSTEM; its DEFSYSTEM manual also describes how to extend it.
All the Lisp solutions use some kind of Lisp syntax (not XML syntax), usually implemented by a macro that describes the system. Once that is read into Lisp, it turns into a data representation, often with CLOS instances for the system, modules, etc. The actions then are also Lisp functions. Some higher-order functions then walk over the component graph/tree and execute actions as necessary. Other tools walk over the component graph/tree and return a representation of the actions, which is then a proposed plan; the user can then let Lisp execute the whole plan, or parts of the plan.
On a Lisp Machine a simple system description looks like this:
(sct:defsystem scigraph
    (:default-pathname "sys:scigraph;"
     :required-systems "DWIM")
  (:serial "package" "copy" "dump" "duplicate" "random"
           "menu-tools" "basic-classes" "draw" "mouse"
           "color" "basic-graph" "graph-mixins" "axis"
           "moving-object" "symbol" "graph-data" "legend"
           "graph-classes" "present" "annotations" "annotated-graph"
           "contour" "equation" "popup-accept" "popup-accept-methods"
           "duplicate-methods" "frame" "export" "demo-frame"))
The above defines a system SCIGRAPH whose files should all be compiled and loaded in serial order.
Now I can see what the Lisp Machine would do to update the compiled code:
Command: Compile System (a system [default Scigraph]) Scigraph (keywords)
:Simulate (compiling [default Yes]) Yes
The plan for constructing Scigraph version Newest for the Compile
operation is:
Compile RJNXP:>software>scigraph>scigraph>popup-accept-methods.lisp.newest
Load RJNXP:>software>scigraph>scigraph>popup-accept-methods.ibin.newest
It would compile one file and load it - I have the software loaded and changed only this file so far.
For ASDF see the documentation mentioned on the CLIKI page - it works a bit different.
Stuart Halloway's upcoming book Programming Clojure goes through the construction of Lancet throughout the book as an example app. Lancet is a Clojure build system which (optionally) integrates directly with Ant. Source code and examples are available.
If all you want to do is generate Ant XML files using Lisp code, you could use something like clj-html for Clojure or CL-WHO for Common Lisp. Generating XML from Lisp s-exps is fun and easy.
Common Lisp's ASDF (Another System Definition Facility) is analogous to Make/Ant (but not a full analogue: it is aimed at building Lisp programs, not generic systems like make or ant). It is extensible with Lisp code (subclassing systems and components, adding operations to systems). E.g., there is an asdf-ecs extension that allows including (and compiling) C source files into a system.
Perhaps you could define things in Lisp and convert them to XML at the point you pass them to NAnt.
Something like XMLisp makes it easier to go back and forth between the two representations.
Edit: Actually, xml-emitter would make more sense.
