Does Tcl do any internal input buffering that's out of the script writer's control? Will the following code possibly waste entropy (read more than one byte), and if so, how can I prevent that?
set stream [open "/dev/srandom"]
chan configure $stream -translation binary
set randomByte [chan read $stream 1]
Yes, Tcl buffers input by default and will waste entropy (as much as a single read call decides to hand over).
I thought you could prevent this with
chan configure $stream -buffering none
But no: -buffering has no effect on the input queue (internally it is not a single buffer).
However,
chan configure $stream -buffersize 0
does the trick, as I've seen from an experiment with stdin under strace. It makes all input arrive via read() syscalls of size 1 (the count argument to Tcl's read doesn't matter), so it would be extremely slow for normal use.
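For comparison, this is roughly what the channel ends up doing at the syscall level once -buffersize 0 is set: unbuffered one-byte read() calls against the device. A minimal C sketch of the same access pattern (assuming the OpenBSD device name /dev/srandom; adjust the path on other systems):

#include <fcntl.h>   /* open */
#include <stdio.h>
#include <unistd.h>  /* read, close */

int main(void)
{
    /* Open the entropy device directly: no stdio layer, hence no
       readahead buffer that could swallow extra entropy. */
    int fd = open("/dev/srandom", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    unsigned char byte;
    /* A one-byte read(): the kernel hands over exactly one byte,
       which is what Tcl is reduced to with -buffersize 0. */
    if (read(fd, &byte, 1) == 1)
        printf("random byte: %02x\n", byte);

    close(fd);
    return 0;
}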
I would like to send a very large (~8 GB) data structure through the network, so I use the Marshal module to transform it into Bytes.
My problem is that memory usage doubles, because we need to store both representations (the initial data and the marshaled data).
Is there a simple way to marshal into a Stream instead? That would avoid holding the full marshaled representation of the initial data structure.
I thought of marshaling to an out_channel opened on a pipe, with a second thread, and reading from the pipe in the main thread into a Stream, but I guess there might be a simpler solution.
Thanks!
Edit: answering a comment.
In the toplevel:
let a = Array.make (1024*1024*1024) 0. ;; (* Takes 8GB of RAM *)
let data = Marshal.to_bytes a [Marshal.Closures] ;; (* Takes an extra 8GB *)
It's not possible. You would have to modify the Marshal module to stream the data as it marshals a value, and to reconstruct the data in place without buffering it all first.
In the short run it might be simpler to implement your own specialized marshal function specific to your data. For an 8 GiB array you might want to switch to using Bigarray so you can send/receive the data without having to copy it.
Note: an 8 GiB array will use 16 GiB if the GC ever copies it, at least temporarily.
From what I understand, MPI only allows sending data packets of known size, not a stream of data. You could implement a custom stream type that splits an incoming flow of data into packets of a constant, small size (on close, you flush whatever remains in the buffer); see the sketch below.
Also, you can only marshal arbitrarily long data to a channel, because anything else takes up too much space.
And then you need a way to connect the channel to the stream, which AFAIK is not easily possible. Maybe you could start another OCaml process: that process would convert the flow of bytes (you can wrap a custom stream over Stream.of_channel) and send it through MPI, while the main process marshals data to the process's input channel.
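A hedged sketch of the packetizing stream suggested above, in C: the names here (packetizer, send_packet) are hypothetical, and send_packet stands in for whatever transport you use (e.g. an MPI send). Incoming bytes are buffered, full packets are handed over immediately, and close flushes the partial tail.

#include <string.h>

#define PACKET_SIZE 4096

struct packetizer {
    unsigned char buf[PACKET_SIZE];
    size_t used;
    /* stand-in for the real transport, e.g. an MPI send */
    void (*send_packet)(const unsigned char *data, size_t len);
};

void packetizer_write(struct packetizer *p, const unsigned char *data, size_t len)
{
    while (len > 0) {
        size_t room = PACKET_SIZE - p->used;
        size_t n = len < room ? len : room;
        memcpy(p->buf + p->used, data, n);
        p->used += n;
        data += n;
        len -= n;
        if (p->used == PACKET_SIZE) {   /* packet full: ship it */
            p->send_packet(p->buf, PACKET_SIZE);
            p->used = 0;
        }
    }
}

void packetizer_close(struct packetizer *p)
{
    if (p->used > 0)                    /* flush whatever remains */
        p->send_packet(p->buf, p->used);
    p->used = 0;
}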
I feel that Stream Tags, Message Passing, and Packet Data Transmission are a bit of an overkill, and I have a hard time understanding them.
I have a simple wish: starting from a stream of bytes, I would like to extract only a fixed number of bytes following a known pattern. For example, from a stream like ...01h 55h XXh YYh ZZh..., it should extract XXh YYh ZZh.
I used Correlate Access Code Tag -- Tagged Stream Align -- Pack K Bits to convert a bit stream into a byte stream and sync to the desired access code (01h 55h), but how do I tell GNU Radio to only process 3 bytes every time the code is found? An OOT block would likely solve this, but is there some combination of standard GRC blocks to do it?
I think with correlate_access_code_tag_bb you can actually build this, with a bit of brain-twisting, from existing blocks alone. (Note: this does rely on stream tags, because those are the right tool to mark special points in a sample flow.)
However, your simple case might really not be worth it. Simply follow the guided tutorials up to the point where you can write your own Python block.
Use self.set_history(len(preamble) + len_payload) in the constructor of your new block to make sure you always see the last samples of the previous iteration in your current call to work. Then simply search for the preamble in your sample stream, outputting only the len_payload bytes that follow when you find it, and producing nothing when you don't.
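Independent of the GNU Radio API, the core of that work function is just a pattern scan. A C sketch of the logic (access code 01h 55h and a 3-byte payload, as in the question; in a real block the history mechanism supplies the overlap between successive calls):

#include <string.h>

#define PAYLOAD_LEN 3

static const unsigned char preamble[] = { 0x01, 0x55 };

/* Scan the n input bytes, append the payload that follows each
   occurrence of the preamble to out, and return how many payload
   bytes were produced. */
size_t extract_payloads(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t written = 0;
    for (size_t i = 0; i + sizeof preamble + PAYLOAD_LEN <= n; i++) {
        if (memcmp(in + i, preamble, sizeof preamble) == 0) {
            memcpy(out + written, in + i + sizeof preamble, PAYLOAD_LEN);
            written += PAYLOAD_LEN;
            i += sizeof preamble + PAYLOAD_LEN - 1;  /* skip past this frame */
        }
    }
    return written;
}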
I want to be able to write bytes and read them from standard input/output, but when I try this in SBCL I get the error "The stream has no suitable method [...]". Why is this, and how would I go about making my own stream that can handle bytes?
This seems to be because the standard input and output streams are streams with element type character, not (unsigned-byte 8). The element type of a stream is usually configured when the stream is opened, which, in the case of standard input/output, happens automatically when the interpreter starts.
However, SBCL has the notion of bivalent streams, which support both character and byte-oriented I/O. As it happens, on my machine,
* (read-byte *standard-input* nil)
a
97
* (read-char *standard-input* nil)
a
#\a
works fine. So, which version of SBCL are you using? Mine is SBCL 1.0.49.
I have a program which calculates the 'Printer Queues Total' value using '/usr/bin/lpstat' through a popen() call.
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main(void)
{
int n=0;
FILE *fp=NULL;
printf("Before popen()");
fp = popen("/usr/bin/lpstat -o | grep '^[^ ]*-[0-9]*[ \t]' | wc -l", "r");
printf("After popen()");
if (fp == NULL)
{
printf("Failed to start lpstat - %s", strerror(errno));
return -1;
}
printf("Before fscanf");
fscanf(fp, "%d", &n);
printf("After fscanf");
printf("Before pclose()");
pclose(fp);
printf("After pclose()");
printf("Value=%d",n);
printf("=== END ===");
return 0;
}
Note: on the command line, the '/usr/bin/lpstat' command hangs for some time, as there are many printers available on the network.
The problem here is that execution hangs at the popen() call, whereas I would expect it to hang at fscanf(), which reads the output from the file stream fp.
If anybody can tell me the reason for the hang at popen(), it will help me modify the program to meet my requirement.
Thanks for taking the time to read this post.
What people expect does not always have a basis in reality :-)
The command you're running doesn't actually generate any output until it's finished. That would be why it would seem to be hung in the popen rather than the fscanf.
There are two possible reasons for that which spring to mind immediately.
The first is that it's implemented this way, with popen capturing the output in full before delivering the first line. Based on my knowledge of UNIX, this seems unlikely but I can't be sure.
Far more likely is the impact of the pipe. One thing I've noticed is that some filters (like grep) batch up their lines for efficiency. So, while popen itself may be spewing forth its lines immediately (well, until it gets to the delay bit anyway), the fact that grep is holding on to the lines until it gets a big enough block may be causing the delay.
In fact, it's almost certainly the pipe-through-wc, which cannot generate any output until all lines are received from lpstat (you cannot figure out how many lines there are until all the lines have been received). So, even if popen just waited for the first character to be available, that would seem to be where the hang was.
It would be a simple matter to test this by removing the pipe-through-grep-and-wc bit and seeing what happens.
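For instance, you could keep the grep filter but drop wc and count the lines yourself as they arrive; a sketch reusing the question's pattern:

#include <stdio.h>

int count_queues(void)
{
    /* Without wc, output reaches us line by line, so any remaining
       delay shows up in the read loop rather than appearing to be
       inside popen(). */
    FILE *fp = popen("/usr/bin/lpstat -o | grep '^[^ ]*-[0-9]*[ \t]'", "r");
    if (fp == NULL)
        return -1;

    char line[1024];
    int n = 0;
    while (fgets(line, sizeof line, fp) != NULL)
        n++;                    /* one matching queue entry per line */

    pclose(fp);
    return n;
}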
Just one other point I'd like to raise. Your printf statements do not have newlines following and, even if they did, there are circumstances where the output may still be fully buffered (so that you probably wouldn't see anything until that program exited, or the buffer filled up).
I would start by changing them to the form:
printf ("message here\n"); fflush (stdout); fsync (fileno (stdout));
to ensure they're flushed fully before continuing. I'd hate this to be a simple misunderstanding of a buffering issue :-)
It sounds as if popen may be hanging whilst lpstat attempts to retrieve information from remote printers. There is a fair amount of discussion on this particular problem. Have a look at that thread, and especially the ones that are linked from that.
Section 7.19.3/7 of C99 states that:
At program start-up, three text streams are predefined and need not be opened explicitly - standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output).
As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.
So that makes sense. If you're pushing your standard output to a file, you want it fully buffered for efficiency.
But I can find no mention in the standard as to whether the output is line buffered or unbuffered when you can't determine that the device is non-interactive (i.e., normal output to a terminal).
The reason I ask was a comment to my answer here that I should insert an fflush(stdout); between the two statements:
printf ("Enter number> ");
// fflush (stdout); needed ?
if (fgets (buff, sizeof(buff), stdin) == NULL) { ... }
because I wasn't terminating the printf with a newline. Can anyone clear this up?
The C99 standard does not specify whether the three standard streams are unbuffered or line buffered: that is up to the implementation. All UNIX implementations I know have a line-buffered stdin. On Linux, stdout is line buffered and stderr unbuffered.
As far as I know, POSIX does not impose additional restrictions. POSIX's fflush page does note in the EXAMPLES section:
[...] The fflush() function is used because standard output is usually buffered and the prompt may not immediately be printed on the output or terminal.
So the remark that you should add fflush(stdout); is correct.
An alternative could be to make stdout unbuffered:
setbuf(stdout, NULL);
/* or */
setvbuf(stdout, NULL, _IONBF, 0);
But as R. notes, you can only do this once, and it must be before you write to stdout or perform any other operation on it (C99 7.19.5.5p2).
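A minimal sketch of that ordering: the setvbuf call sits at the very top of main, before anything touches stdout, after which the unterminated prompt appears without any fflush:

#include <stdio.h>

int main(void)
{
    /* Must precede the first operation on stdout (C99 7.19.5.5p2). */
    setvbuf(stdout, NULL, _IONBF, 0);

    printf("Enter number> ");    /* no newline, no fflush needed now */

    char buff[64];
    if (fgets(buff, sizeof buff, stdin) == NULL)
        return 1;
    return 0;
}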
I just read a recent thread on comp.lang.c about the same thing. One of the remarks:
Unix convention is that stdin and stdout are line-buffered when associated with a terminal, and fully-buffered (aka block-buffered) otherwise. stderr is always unbuffered.
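The "associated with a terminal" test is typically made with isatty(3); a quick check you can run both on a terminal and with stdout redirected to a file to see the two cases:

#include <stdio.h>
#include <unistd.h>   /* isatty */

int main(void)
{
    /* Report to stderr (unbuffered) what buffering stdout should get. */
    if (isatty(fileno(stdout)))
        fprintf(stderr, "stdout is a terminal: expect line buffering\n");
    else
        fprintf(stderr, "stdout is not a terminal: expect full buffering\n");
    return 0;
}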