Print statement versus stdout performance and Dart Editor versus command-line performance

This is probably not of major importance; however, I have noticed during testing that both the print statement and stdout are much faster in the Dart Editor than from the command line. From the command line, print takes around 36% longer than stdout. Running the program from within the editor, though, stdout takes around 900% longer than print, yet both are considerably faster than from the command line; i.e. print from a program running in the editor takes around 2.65% of the time it takes from the command line.
Some relative timings based on average performance from my tests:
Running the program from the command line (5000 iterations):
print: 1700 milliseconds
stdout: 1245 milliseconds
Running the program within the Dart Editor (5000 iterations):
print: 45 milliseconds
stdout: 447 milliseconds
Can someone explain the reason for these differences, in particular why performance in the Dart Editor is so much faster? Also, is it acceptable practice to use stdout, and what are the pros and cons versus using print?

Why is the Dart Editor faster?
Because output handling by the command-line console is just really slow; this blocks the output stream, and consequently the call to print/stdout.
You can test this for yourself - test the following java program (with your own paths, of course):
public static void main(String[] args) {
    try {
        // the dart file does print and stdout in a loop
        Process p = Runtime.getRuntime().exec("C:\\eclipse\\dart-sdk\\bin\\dart.exe D:\\DEVELOP\\Dart\\Console_Playground\\bin\\console_playground.dart");
        BufferedReader in = new BufferedReader(new InputStreamReader(p.getInputStream()));
        StringBuffer buf = new StringBuffer();
        String line;
        while ((line = in.readLine()) != null) {
            buf.append(line + "\r\n");
        }
        System.out.print(buf.toString());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
On my machine, this is even slightly faster than the Dart Editor (which probably does something like buffering the input and rendering it periodically, but I don't really know).
You will also see that adding a Thread.sleep(1); into the loop will severely impact the performance of the dart program, because the stream is blocked.
Should stdout be used?
I think that's highly subjective. I, for one, do whatever lets me write code more quickly. When I just want to dump a variable, I use print(myvar);. But with stdout, you can do neat stuff like this: stdout.addStream(new File(r"D:\test.csv").openRead());. Of course, if performance is an issue, it depends on how your application will be used - for example, called by another program (where print is faster) vs. the command line (where stdout is faster, for some reason).
Why is stdout faster in command line?
I have no idea, sorry. It's the only environment I tested where print() is slower, so I'd guess it has something to do with how the console handles incoming data.

Related

How do I intercept the unbuffered output of a Proc::Async in Raku?

With a snippet like
# Contents of ./run
my $p = Proc::Async.new: @*ARGS;
react {
    whenever Promise.in: 5 { $p.kill }
    whenever $p.stdout     { say "OUT: { .chomp }" }
    whenever $p.ready      { say "PID: $_" }
    whenever $p.start      { say "Done" }
}
executed like
./run raku -e 'react whenever Supply.interval: 1 { .say }'
I expected to see something like
PID: 1234
OUT: 0
OUT: 1
OUT: 2
OUT: 3
OUT: 4
Done
but instead I see
PID: 1234
OUT: 0
Done
I understand that this has to do with buffering: if I change that command into something like
# The $|++ disables buffering
./run perl -E '$|++; while(1) { state $i; say $i++; sleep 1 }'
I get the desired output.
I know that TTY IO::Handle objects are unbuffered, and that in this case the $*OUT of the spawned process is not one. And I've read that IO::Pipe objects are buffered "so that a write without a read doesn't immediately block" (although I cannot say I entirely understand what this means).
But no matter what I've tried, I cannot get the unbuffered output stream of a Proc::Async. How do I do this?
I've tried binding an open IO::Handle using $proc.bind-stdout but I still get the same issue.
Note that doing something like $proc.bind-stdout: $*OUT does work, in the sense that the Proc::Async object no longer buffers, but it's also not a solution to my problem, because I cannot tap into the output before it goes out. It does suggest to me that if I can bind the Proc::Async to an unbuffered handle, it should do the right thing. But I haven't been able to get that to work either.
For clarification: as suggested with the Perl example, I know I can fix this by disabling the buffering on the command I'll be passing as input, but I'm looking for a way to do this from the side that creates the Proc::Async object.
You can set the .out-buffer of a handle (such as $*OUT or $*ERR) to 0:
$ ./run raku -e '$*OUT.out-buffer = 0; react whenever Supply.interval: 1 { .say }'
PID: 11340
OUT: 0
OUT: 1
OUT: 2
OUT: 3
OUT: 4
Done
Proc::Async itself isn't performing buffering on the received data. However, spawned processes may do their own depending on what they are outputting to, and that's what is being observed here.
Many programs make decisions about their output buffering (among other things, such as whether to emit color codes) based on whether the output handle is attached to a TTY (a terminal). The assumption is that a TTY means a human is going to be watching the output, and thus low latency is preferable to high throughput, so buffering is disabled (or restricted to line buffering). If, on the other hand, the output is going to a pipe or a file, then the assumption is that latency is not so important, and buffering is used to achieve a significant throughput win (far fewer system calls to write data).
When we spawn something with Proc::Async, the standard output of the spawned process is bound to a pipe - which is not a TTY. Thus the invoked program may use this to decide to apply output buffering.
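As an illustration of that check, here is a minimal C++ sketch for a POSIX system (the buffering policy chosen in each branch is just an example for this sketch, not what any particular program does):
#include <unistd.h>   // isatty, STDOUT_FILENO
#include <cstdio>     // setvbuf, printf, BUFSIZ

int main()
{
    if (isatty(STDOUT_FILENO)) {
        // A human is probably watching: line-buffer for low latency.
        std::setvbuf(stdout, nullptr, _IOLBF, BUFSIZ);
        std::printf("stdout is a terminal\n");
    } else {
        // Output goes to a pipe or file: fully buffer for throughput.
        std::setvbuf(stdout, nullptr, _IOFBF, 1 << 16);
        std::printf("stdout is a pipe or a file\n");
    }
    return 0;
}
Run directly in a terminal it takes the first branch; run with its output piped (for example into cat) it takes the second, which is exactly the situation a program spawned by Proc::Async finds itself in.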
If you're willing to have another dependency, then you can invoke the program via something that fakes up a TTY, such as unbuffer (part of the expect package, it seems). Here's an example of a program that is suffering from buffering:
my $proc = Proc::Async.new: 'raku', '-e',
    'react whenever Supply.interval(1) { .say }';
react whenever $proc.stdout {
    .print
}
We only see a 0 and then have to wait a long time for more output. Running it via unbuffer:
my $proc = Proc::Async.new: 'unbuffer', 'raku', '-e',
    'react whenever Supply.interval(1) { .say }';
react whenever $proc.stdout {
    .print
}
Means that we see a number output every second.
Could Raku provide a built-in solution to this some day? Yes - by doing the "magic" that unbuffer itself does (I presume allocating a pty - kind of a fake TTY). This isn't trivial - although it is being explored by the libuv developers; at least so far as Rakudo on MoarVM goes, the moment there's a libuv release available offering such a feature, we'll work on exposing it.
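To give a rough idea of what that "magic" involves, here is a minimal C++ sketch for Linux using openpty (from libutil, so link with -lutil); this is purely illustrative and says nothing about how Rakudo/MoarVM would eventually expose the feature. The child sees a real TTY on its standard output, so it will typically pick line buffering, and the parent reads the output from the master side as it is produced:
#include <pty.h>      // openpty (Linux; link with -lutil)
#include <unistd.h>   // fork, dup2, execlp, read, close
#include <cstdio>     // fwrite, stdout

int main()
{
    int master, slave;
    // Allocate a pseudo-terminal pair; the slave end looks like a real TTY.
    if (openpty(&master, &slave, nullptr, nullptr, nullptr) == -1)
        return 1;

    pid_t pid = fork();
    if (pid == 0) {
        // Child: make the pty slave its stdout, then run some program.
        close(master);
        dup2(slave, STDOUT_FILENO);
        close(slave);
        execlp("ls", "ls", (char*)nullptr);
        _exit(127);   // only reached if exec fails
    }

    // Parent: read the child's output from the master end as it appears.
    close(slave);
    char buf[4096];
    ssize_t n;
    while ((n = read(master, buf, sizeof buf)) > 0)
        fwrite(buf, 1, n, stdout);
    close(master);
    return 0;
}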

What would standard input and output be if there's no terminal connected to the server?

This question came to my mind yesterday when I was thinking about ways of doing server logging.
Normally, we open a terminal connected to a local computer or a remote server, run an executable, and print (printf, cout) some debug/log information in the terminal.
But for those processes/executables/scripts running on the server which are not connected to a terminal, what are the standard input and output?
For example:
Suppose I have a crontab task running a program on the server many times a day. If I write something like cout << "blablabla" << endl; in the program, what's going to happen? Where will that output go?
Another example I came up with and wanted to know about is: if I write a CGI program (in C or C++) for, let's say, an Apache web server, what are the standard input and output of my CGI program? (According to this C++ CGI tutorial, I guess the standard input and output of the CGI program are somehow redirected to the Apache server, because it outputs the HTML contents using cout, not by returning them.)
I've read this What is “standard input”? before asking, which told me that standard input doesn't have to be tied to a keyboard and standard output doesn't have to be tied to a terminal/console/screen.
OS is Linux.
The standard input and standard output (and standard error) streams can point to basically any I/O device. This is commonly a terminal, but it can also be a file, a pipe, a network socket, a printer, etc. What exactly those streams direct their I/O to is usually determined by the process that launches your process, be that a shell or a daemon like cron or apache, but a process can redirect those streams itself if it would like.
I'll use Linux as an example, but the concepts are similar on most other OSes. On Linux, the standard input and standard output stream are represented by file descriptors 0 and 1. The macros STDIN_FILENO and STDOUT_FILENO are just for convenience and clarity. A file descriptor is just a number that matches up to some file description that the OS kernel maintains that tells it how to write to that device. That means that from a user-space process's perspective, you write to pretty much anything the same way: write(some_file_descriptor, some_string, some_string_length) (higher-level I/O functions like printf or cout are just wrappers around one or more calls to write). To the process, it doesn't matter what type of device some_file_descriptor represents. The OS kernel will figure that out for you and pass your data to the appropriate device driver.
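As a tiny sketch of that point (POSIX assumed), the very same call sends bytes to whatever happens to be behind file descriptor 1, be it a terminal, a file, or a pipe:
#include <unistd.h>   // write, STDOUT_FILENO
#include <cstring>    // strlen

int main()
{
    const char* msg = "hello from fd 1\n";
    // The process neither knows nor cares what fd 1 refers to;
    // the kernel routes the bytes to the appropriate driver.
    write(STDOUT_FILENO, msg, std::strlen(msg));
    return 0;
}
Whether you run the compiled program directly, redirect it to a file, or pipe it into another command, the code stays the same; only the file description behind fd 1 changes.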
The standard way to launch a new process is to call fork to duplicate the parent process, and then later to call one of the exec family of functions in the child process to start executing some new program. In between, it will often close the standard streams it inherited from its parent and open new ones to redirect the child process's output somewhere new. For instance, to have the child pipe its output back to the parent, you could do something like this in C++:
#include <unistd.h>   // pipe, fork, dup2, execlp, read, close
#include <cstdlib>    // EXIT_SUCCESS
#include <iostream>
#include <string>

int main()
{
    // create a pipe for the child process to use for its
    // standard output stream
    int pipefds[2];
    pipe(pipefds);

    // spawn a child process that's a copy of this process
    pid_t pid = fork();
    if (pid == 0)
    {
        // we're now in the child process

        // we won't be reading from this pipe, so close its read end
        close(pipefds[0]);
        // we won't be reading anything
        close(STDIN_FILENO);
        // close the stdout stream we inherited from our parent
        close(STDOUT_FILENO);
        // make stdout's file descriptor refer to the write end of our pipe
        dup2(pipefds[1], STDOUT_FILENO);
        // we don't need the old file descriptor anymore.
        // stdout points to this pipe now
        close(pipefds[1]);
        // replace this process's code with another program
        execlp("ls", "ls", nullptr);
    } else {
        // we're still in the parent process

        // we won't be writing to this pipe, so close its write end
        close(pipefds[1]);

        // now we can read from the pipe that the
        // child is using for its standard output stream
        std::string read_from_child;
        ssize_t count;
        constexpr size_t BUF_SIZE = 100;
        char buf[BUF_SIZE];
        while ((count = read(pipefds[0], buf, BUF_SIZE)) > 0) {
            std::cout << "Read " << count << " bytes from child process\n";
            read_from_child.append(buf, count);
        }
        std::cout << "Read output from child:\n" << read_from_child << '\n';
        return EXIT_SUCCESS;
    }
}
Note: I've omitted error handling for clarity
This example creates a child process and redirects its output to a pipe. The program run in the child process (ls) can treat the standard output stream just as it would if it were referencing a terminal (though ls changes some behaviors if it detects its standard output isn't a terminal).
This sort of redirection can also be done from a terminal. When you run a command, you can use the redirection operators to tell your shell to redirect that command's standard streams somewhere other than the terminal (a sketch of how a shell sets this up follows the list below). For instance, here's a convoluted way to copy a file from one machine to another using an sh-like shell:
gzip < some_file | ssh some_server 'zcat > some_file'
This does the following:
create a pipe
run gzip redirecting its standard input stream to read from "some_file" and redirecting its standard output stream to write to the pipe
run ssh and redirect its standard input stream to read from the pipe
on the server, run zcat with its standard input redirected from the data read from the ssh connection and its standard output redirected to write to "some_file"
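Here is the sketch referred to above: a rough C++ illustration (error handling omitted, POSIX assumed) of how a shell could set up something like gzip < some_file > some_file.gz, by redirecting the child's standard input and output before exec:
#include <fcntl.h>    // open
#include <unistd.h>   // fork, dup2, execlp, close
#include <sys/wait.h> // waitpid

int main()
{
    pid_t pid = fork();
    if (pid == 0) {
        // Child: redirect stdin to read from "some_file" (the < operator)...
        int in = open("some_file", O_RDONLY);
        dup2(in, STDIN_FILENO);
        close(in);
        // ...and stdout to write to "some_file.gz" (the > operator).
        int out = open("some_file.gz", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        dup2(out, STDOUT_FILENO);
        close(out);
        // gzip itself just reads stdin and writes stdout, none the wiser.
        execlp("gzip", "gzip", (char*)nullptr);
        _exit(127);   // exec failed
    }
    int status;
    waitpid(pid, &status, 0);
    return 0;
}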

Broken pipe error in CCL Lisp

I am using CCL Lisp to run batches of experiments in parallel. On my machine, everything is running fine. However, I would like to use this on a server. When I execute this on a server, I always get the following error message:
> Error: on #<BASIC-CHARACTER-OUTPUT-STREAM UTF-8 (PIPE/7) #x302001C2725D> :
> Broken pipe during write
> While executing: #<CCL::STANDARD-KERNEL-METHOD CCL::STREAM-IO-ERROR (STREAM T T)>, in process listener(1).
My code always reaches the same point when throwing this error. An excerpt of the code is given below:
;; ... A really long function
;; write commands to processes
(format t ".. writing commands to process ~a:~%" counter)
(loop for c in commands
      do (format t " ~a~%" c)
         (write-string c output-stream)
         (princ #\lf output-stream))
(force-output t)
(force-output output-stream)
(finish-output output-stream)
#-lispworks
(close output-stream))
I think this error occurs inside the loop statement, since not all of the commands are written to the output stream.
How can I further debug this and solve this issue?
"Broken pipe" means that the process which is supposed to be reading from the pipe is dead when the Lisp process is writing to the pipe.
In other words, the problem is probably outside of Lisp. You need to see what is happening with the other process.
PS. You can combine your write-string and princ into a single write-line. Also, you don't need force-output if you are calling finish-output immediately.
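For anyone who wants to see the failure mode in isolation, here is a minimal C++/POSIX sketch, unrelated to CCL itself: it writes to a pipe whose read end has already been closed, which is exactly the "no reader left" situation, and the write fails with a broken-pipe error (or the process is killed by SIGPIPE if the signal isn't ignored).
#include <unistd.h>   // pipe, write, close
#include <csignal>    // signal, SIGPIPE
#include <cstdio>     // printf
#include <cstring>    // strerror
#include <cerrno>     // errno

int main()
{
    // Ignore SIGPIPE so write() reports EPIPE instead of killing the process.
    signal(SIGPIPE, SIG_IGN);

    int fds[2];
    pipe(fds);

    // Close the read end: there is no reader anymore, as if the
    // process on the other side of the pipe had died.
    close(fds[0]);

    const char msg[] = "hello\n";
    if (write(fds[1], msg, sizeof msg - 1) == -1)
        std::printf("write failed: %s\n", std::strerror(errno));  // "Broken pipe"

    close(fds[1]);
    return 0;
}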

Unbuffered Output Very Slow

I generate a very large .csv file from a database using the method outlined in
https://stackoverflow.com/a/13456219/141172
It works fine, up to a point. When the exported file is too large, I get an OutOfMemoryException.
If I turn off output buffering by modifying that code like this:
protected override void WriteFile(System.Web.HttpResponseBase response)
{
    response.BufferOutput = false; // <--- Added this
    this.Content(response.OutputStream);
}
the file download completes. However, it is several orders of magnitude slower than when output buffering was enabled (measured for the same file with buffering true/false, on localhost).
I understand that it is slower, but why would it slow to a relative crawl? Is there anything I can do to improve processing speed?
UPDATE
It would also be an option to use File(Stream stream, String contentType) as suggested in the comments. However, I'm not sure how to create the stream. The data is dynamically assembled based on a DB query, and a MemoryStream would run out of contiguous physical memory. Suggestions are welcome.
UPDATE 2
It was suggested in the comments that alternately reading from the database and writing to the stream is causing a degradation. I modified the code to perform the stream writing in a separate thread (using the producer/consumer pattern). There is no appreciable difference in performance.
I don't know what exactly ASP.NET and IIS are doing with output streaming, but maybe chunks that are too small are being used. Hook in a BufferedStream with a very big buffer, like 4MB.
According to your comments it worked. Now, tune down the buffer size to save memory and have a smaller working set. Good for cache.
As a subjective comment I'm disappointed that this is even necessary. IIS should use the right buffers automatically which is extremely easy with TCP connections.
EDIT FROM OP
Here is the code derived from this answer
public ActionResult Export()
{
    // Domain specific stuff here
    return new FileGeneratingResult("MyFile.txt", "text/text",
        stream => this.StreamExport(stream), false);
}

private void StreamExport(Stream stream)
{
    using (BufferedStream bs = new BufferedStream(stream, 256 * 1024))
    using (StreamWriter sw = new StreamWriter(bs))
        foreach (var stuff in MyData())
        {
            sw.Write(stuff);
        }
}
In Eric's latest update, he mentioned using another thread. I too had this problem for implementing database exports. Here is some example code for the solution I used:
Handling with temporary file stream

Capturing output from WshShell.Exec using Windows Script Host

I wrote the following two functions, and call the second ("callAndWait") from JavaScript running inside Windows Script Host. My overall intent is to call one command-line program from another. That is, I'm running the initial script using cscript, and then trying to run something else (Ant) from that script.
function readAllFromAny(oExec)
{
    if (!oExec.StdOut.AtEndOfStream)
        return oExec.StdOut.ReadLine();
    if (!oExec.StdErr.AtEndOfStream)
        return "STDERR: " + oExec.StdErr.ReadLine();
    return -1;
}
// Execute a command line function....
function callAndWait(execStr) {
    var oExec = WshShell.Exec(execStr);
    while (oExec.Status == 0)
    {
        WScript.Sleep(100);
        var output;
        while ((output = readAllFromAny(oExec)) != -1) {
            WScript.StdOut.WriteLine(output);
        }
    }
}
Unfortunately, when I run my program, I don't get immediate feedback about what the called program is doing. Instead, the output seems to come in fits and starts, sometimes waiting until the original program has finished, and sometimes it appears to have deadlocked. What I really want to do is have the spawned process actually share the same StdOut as the calling process, but I don't see a way to do that. Just setting oExec.StdOut = WScript.StdOut doesn't work.
Is there an alternate way to spawn processes that will share the StdOut & StdErr of the launching process? I tried using WshShell.Run(), but that gives me a "permission denied" error. That's problematic, because I don't want to have to tell my clients to change how their Windows environment is configured just to run my program.
What can I do?
You cannot read from StdErr and StdOut in the script engine in this way, as there is no non-blocking IO as Code Master Bob says. If the called process fills up the buffer (about 4KB) on StdErr while you are attempting to read from StdOut, or vice-versa, then you will deadlock/hang. You will starve while waiting for StdOut and it will block waiting for you to read from StdErr.
The practical solution is to redirect StdErr to StdOut like this:
sCommandLine = """c:\Path\To\prog.exe"" Argument1 argument2"
Dim oExec
Set oExec = WshShell.Exec("CMD /S /C "" " & sCommandLine & " 2>&1 """)
In other words, what gets passed to CreateProcess is this:
CMD /S /C " "c:\Path\To\prog.exe" Argument1 argument2 2>&1 "
This invokes CMD.EXE, which interprets the command line. /S /C invokes a special parsing rule so that the first and last quote are stripped off, and the remainder used as-is and executed by CMD.EXE. So CMD.EXE executes this:
"c:\Path\To\prog.exe" Argument1 argument2 2>&1
The incantation 2>&1 redirects prog.exe's StdErr to StdOut. CMD.EXE will propagate the exit code.
You can now succeed by reading from StdOut and ignoring StdErr.
The downside is that the StdErr and StdOut output get mixed together. As long as they are recognisable you can probably work with this.
Another technique which might help in this situation is to redirect the standard error stream of the command to accompany the standard output.
Do this by adding "%comspec% /c" to the front and "2>&1" to the end of the execStr string.
That is, change the command you run from:
zzz
to:
%comspec% /c zzz 2>&1
The "2>&1" is a redirect instruction which causes the StdErr output (file descriptor 2) to be written to the StdOut stream (file descriptor 1).
You need to include the "%comspec% /c" part because it is the command interpreter which understands about the command line redirect. See http://technet.microsoft.com/en-us/library/ee156605.aspx
Using "%comspec%" instead of "cmd" gives portability to a wider range of Windows versions.
If your command contains quoted string arguments, it may be tricky to get them right: the specification for how cmd handles quotes after "/c" seems to be incomplete.
With this, your script needs only to read the StdOut stream, and will receive both standard output and standard error.
I used this with "net stop wuauserv", which writes to StdOut on success (if the service is running) and StdErr on failure (if the service is already stopped).
First, your loop is broken in that it always tries to read from oExec.StdOut first. If there is no actual output then it will hang until there is. You won't see any StdErr output until StdOut.AtEndOfStream becomes true (probably when the child terminates). Unfortunately, there is no concept of non-blocking I/O in the script engine, that is, calling read and having it return immediately if there is no data in the buffer. Thus there is probably no way to get this loop to work as you want.
Second, WshShell.Run does not provide any properties or methods to access the standard I/O of the child process. It creates the child in a separate window, totally isolated from the parent except for the return code. However, if all you want is to be able to SEE the output from the child then this might be acceptable. You will also be able to interact with the child (input) but only through the new window (see SendKeys).
As for using ReadAll(), this would be even worse, since it collects all the input from the stream before returning, so you wouldn't see anything at all until the stream was closed. I have no idea why the example places the ReadAll in a loop which builds a string; a single if (!WScript.StdIn.AtEndOfStream) should be sufficient to avoid exceptions.
Another alternative might be to use the process creation methods in WMI. How standard I/O is handled is not clear, and there doesn't appear to be any way to allocate specific streams as StdIn/Out/Err. The only hope would be that the child would inherit these from the parent, but that's what you want, isn't it? (This comment is based upon an idea and a little bit of research but no actual testing.)
Basically, the scripting system is not designed for complicated interprocess communication/synchronisation.
Note: Tests confirming the above were performed on Windows XP Sp2 using Script version 5.6. Reference to current (5.8) manuals suggests no change.
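For contrast, here is what the missing non-blocking/multiplexed reading looks like on a platform that has it: a minimal C++/POSIX sketch that drains a child's stdout and stderr through two pipes with poll(), so neither pipe can fill up while the parent is stuck on the other. This is purely illustrative; nothing comparable is exposed by WshShell.Exec.
#include <poll.h>     // poll, pollfd
#include <unistd.h>   // pipe, fork, dup2, execlp, read, close
#include <cstdio>     // fwrite, stdout, stderr

int main()
{
    int outpipe[2], errpipe[2];
    pipe(outpipe);
    pipe(errpipe);

    if (fork() == 0) {
        // Child: wire stdout and stderr to the two pipes and run a command
        // that writes to both streams.
        dup2(outpipe[1], STDOUT_FILENO);
        dup2(errpipe[1], STDERR_FILENO);
        close(outpipe[0]); close(outpipe[1]);
        close(errpipe[0]); close(errpipe[1]);
        execlp("sh", "sh", "-c", "echo out; echo err 1>&2; echo more", (char*)nullptr);
        _exit(127);
    }

    // Parent: close the write ends and wait on BOTH read ends at once.
    close(outpipe[1]);
    close(errpipe[1]);
    pollfd fds[2] = {{outpipe[0], POLLIN, 0}, {errpipe[0], POLLIN, 0}};
    int open_ends = 2;
    char buf[4096];
    while (open_ends > 0 && poll(fds, 2, -1) > 0) {
        for (auto& p : fds) {
            if (p.fd < 0 || !(p.revents & (POLLIN | POLLHUP)))
                continue;
            ssize_t n = read(p.fd, buf, sizeof buf);
            if (n > 0) {
                fwrite(buf, 1, n, p.fd == outpipe[0] ? stdout : stderr);
            } else {
                close(p.fd);
                p.fd = -1;        // poll() skips negative descriptors
                --open_ends;
            }
        }
    }
    return 0;
}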
Yes, the Exec function seems to be broken when it comes to terminal output.
I have been using a similar function, function ConsumeStd(e) { WScript.StdOut.Write(e.StdOut.ReadAll()); WScript.StdErr.Write(e.StdErr.ReadAll()); }, that I call in a loop similar to yours. Not sure if checking for EOF and reading line by line is better or worse.
You might have hit the deadlock issue described on this Microsoft Support site.
One suggestion is to always read both from stdout and stderr.
You could change readAllFromAny to:
function readAllFromAny(oExec)
{
    var output = "";
    if (!oExec.StdOut.AtEndOfStream)
        output = output + oExec.StdOut.ReadLine();
    if (!oExec.StdErr.AtEndOfStream)
        output = output + "STDERR: " + oExec.StdErr.ReadLine();
    return output ? output : -1;
}

Resources