How to read IO stream without reaching EOF? - ruby-on-rails

In one of the legacy application I am working, we are trying to get extracted zip file name through Open3 utility. This is divided into two commands. First command unzip the compressed file and the second one tries to list and grep filename with specific extension. But recently we started encountering error while reading IO buffer.
# unzip with 7z
Open3.capture3("7z", "x", my_zip.path, "-p#{my_pass}", "-o#{tmp_dir}")
# extract file name with .dbf extension
resp, _threads = Open3.pipeline_r(["7z", "-slt", "l", "-p#{my_pass}", my_zip.path], ["grep", "-oP", "(?<=Path = ).+dbf"])
# read IO stream for filename
res_filename = resp.readline.strip
# close stream
# resp.close
at resp.readline.strip we are encoutering EOFError. I tried to using ::gets instead of readline to avoid EOFError but it will lead to empty/nil value.
What am I missing? Is there any way to convert IO stream to String object or something like that?
Observed that this is working fine from console but raising exception when job invoked from sidekiq dashboard

Related

What standard input and output would be if there's no terminal connected to server?

This question came up into my mind when I was thinking about ways of server logging yesterday.
Normally, we open a terminal connected to local computer or remote server, run an executable, and print (printf, cout) some debug/log information in the terminal.
But for those processes/executables/scripts running on the server which are not connected to a terminal, what are the standard input and output?
For example:
Suppose I have a crontab task, running a program on the server many times a day. If I write something like cout << "blablabla" << endl; in the program. What's gonna happen? Where those output will flow into?
Another example I came up and wanted to know is, if I write a CGI program (use C or C++) for let's say a Apache web server, what is the standard input and output of my CGI program ? (According to this C++ CGI tutorial, I guess the standard input and output of the CGI program are in some ways redirected to the Apache server. Because it's using cout to output the html contents, not by return. )
I've read this What is “standard input”? before asking, which told me standard input isn't necessary to be tied to keyboard while standard output isn't necessary to be tied to a terminal/console/screen.
OS is Linux.
The standard input and standard output (and standard error) streams can point to basically any I/O device. This is commonly a terminal, but it can also be a file, a pipe, a network socket, a printer, etc. What exactly those streams direct their I/O to is usually determined by the process that launches your process, be that a shell or a daemon like cron or apache, but a process can redirect those streams itself it it would like.
I'll use Linux as an example, but the concepts are similar on most other OSes. On Linux, the standard input and standard output stream are represented by file descriptors 0 and 1. The macros STDIN_FILENO and STDOUT_FILENO are just for convenience and clarity. A file descriptor is just a number that matches up to some file description that the OS kernel maintains that tells it how to write to that device. That means that from a user-space process's perspective, you write to pretty much anything the same way: write(some_file_descriptor, some_string, some_string_length) (higher-level I/O functions like printf or cout are just wrappers around one or more calls to write). To the process, it doesn't matter what type of device some_file_descriptor represents. The OS kernel will figure that out for you and pass your data to the appropriate device driver.
The standard way to launch a new process is to call fork to duplicate the parent process, and then later to call one of the exec family of functions in the child process to start executing some new program. In between, it will often close the standard streams it inherited from its parent and open new ones to redirect the child process's output somewhere new. For instance, to have the child pipe its output back to the parent, you could do something like this in C++:
int main()
{
// create a pipe for the child process to use for its
// standard output stream
int pipefds[2];
pipe(pipefds);
// spawn a child process that's a copy of this process
pid_t pid = fork();
if (pid == 0)
{
// we're now in the child process
// we won't be reading from this pipe, so close its read end
close(pipefds[0]);
// we won't be reading anything
close(STDIN_FILENO);
// close the stdout stream we inherited from our parent
close(STDOUT_FILENO);
// make stdout's file descriptor refer to the write end of our pipe
dup2(pipefds[1], STDOUT_FILENO);
// we don't need the old file descriptor anymore.
// stdout points to this pipe now
close(pipefds[1]);
// replace this process's code with another program
execlp("ls", "ls", nullptr);
} else {
// we're still in the parent process
// we won't be writing to this pipe, so close its write end
close(pipefds[1]);
// now we can read from the pipe that the
// child is using for its standard output stream
std::string read_from_child;
ssize_t count;
constexpr size_t BUF_SIZE = 100;
char buf[BUF_SIZE];
while((count = read(pipefds[0], buf, BUF_SIZE)) > 0) {
std::cout << "Read " << count << " bytes from child process\n";
read_from_child.append(buf, count);
}
std::cout << "Read output from child:\n" << read_from_child << '\n';
return EXIT_SUCCESS;
}
}
Note: I've omitted error handling for clarity
This example creates a child process and redirects its output to a pipe. The program run in the child process (ls) can treat the standard output stream just as it would if it were referencing a terminal (though ls changes some behaviors if it detects its standard output isn't a terminal).
This sort of redirection can also be done from a terminal. When you run a command you can use the redirection operators to tell your shell to redirect that commands standard streams to some other location than the terminal. For instance, here's a convoluted way to copy a file from one machine to another using an sh-like shell:
gzip < some_file | ssh some_server 'zcat > some_file'
This does the following:
create a pipe
run gzip redirecting its standard input stream to read from "some_file" and redirecting its standard output stream to write to the pipe
run ssh and redirect its standard input stream to read from the pipe
on the server, run zcat with its standard input redirected from the data read from the ssh connection and its standard output redirected to write to "some_file"

Why does my python script not recognize speech from audio file?

I have the following piece of code successfully recognizing short (less than 1 min) test audio file, but failing with recognition another long audiofile (1.5h).
from google.cloud import speech
def run_quickstart():
speech_client = speech.Client()
sample = speech_client.sample(source_uri="gs://linear-arena-2109/zoom0070.flac", encoding=speech.Encoding.FLAC)
alternatives = sample.recognize('uk-UA')
for alternative in alternatives:
print(u'Transcript: {}'.format(alternative.transcript))
with open("Output.txt", "w") as text_file:
for alternative in alternatives:
text_file.write(alternative.transcript.encode('utf8'))
if __name__ == '__main__':
run_quickstart()
Both files are uploaded to Google Cloud.
The first one:
https://storage.googleapis.com/linear-arena-2109/sample.flac
The second one:
https://storage.googleapis.com/linear-arena-2109/zoom0070.flac
Both were converted from mp3 with ffmpeg utility:
ffmpeg -i sample.mp3 -ac 1 sample.flac
ffmpeg -i zoom0070.mp3 -ac 1 zoom0070.flac
First file was successfully recognized, but second file outputs the following error:
google.gax.errors.RetryError: GaxError(Exception occurred in retry method that was not classified as transient, caused by <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.)>)
But I have already used uri parameter in my python script. What is wrong?
update
#NieDzejkob helped to understand the error. So, method long_running_recognize should be used instead of recognize. The comprehensive long_running_recognize usage example can be found on the corresponding document page
For any audio file longer than 1 minute, you need to use Asynchronous Speech Recognition and the file has to be uploaded to Google Cloud Storage so that you can pass in a gcs_uri.
In addition, you will need to use the .long_running_recognize method in your script. An example from GCP documentation can be found here.
I realize that OP figured it out but thought it would be useful to provide an answer and generalize it a bit.

How do I view the source code of a GW-BASIC .BAS file?

I have an old .BAS file that I'm trying to view and for which I'm running into some problems. Searching online seems to indicate that I should be able to just open it in NOTEPAD.EXE or similar, but doing so gives me gibberish, like this:
þ*©¿TÜ…7[/C̸yõ»€¹Ù<Ñ~Æ-$Ì™}³nFuJ,ÖYòÎg)ʇŒ~Š¯DËðïþSnhœJN
‰=É™2+df”c).vX»[šû'Û9¹8%ñx5m#8úV4ÊBº)Eª;Iú¹ó‹|àÆ„72#Ž§i§Ë #îÑ?
í‘ú™ÞMÖæÕjYе‘_¢y<…7i$°Ò.ÃÅR×ÒTÒç_yÄÐ
}+d&jQ *YòÎg)ʇŒ~Š¯DË?úŽ©Ž5\šm€S{ÔÍo—#ìôÔ”ÜÍѱ]ʵ¬0wêÂLª¡öm#Å„Ws雦 X
Ô¶æ¯÷¦É®jÛ ¼§
”n ŸëÆf¿´ó½4ÂäÌ3§Œ®
I know the file is sound, because I can open it in GW-BASIC. However, list does not seem to work to view the file, and trying to save the file in ASCII format from within GW-BASIC, didn't work either. Both just gave me an "Illegal function call" error:
GW-BASIC 3.22
(C) Copyright Microsoft 1983,1984,1986,1987
60300 Bytes free
Ok
LOAD"Pwrharm
Ok
LIST
Illegal function call
Ok
SAVE "Pwrharm2",A
Illegal function call
Ok
RUN
[Program runs successfully]
Then again, the run command works just fine. What am I doing wrong?
You're not doing anything wrong; the file was originally saved in GWBASIC with the ,P option. There is a 'hack' to unprotect it, described at https://groups.google.com/forum/#!topic/comp.os.msdos.misc/PA9sve0eKAk - basically, you create a file (call it UNPROT.BAS) containing only the characters 0xff 0x1a, then load the protected file, then load UNPROT.BAS, and you should then be able to list and save the program.
If you can't LIST or EDIT a GW-BASIC .BAS file that you LOADed from disk, it means that the file was originally SAVEd in protected format via SAVE filespec, P.
The 1988 "Handbook of BASIC - third edition" by David I. Schneider describes it as follows:
A program that has been SAVEd in protected format can be unprotected with the following technique.
(a) Create a file called RECOVER.BAS with the following program.
10 OPEN "RECOVER.BAS" FOR OUTPUT AS #1
20 PRINT #1, CHR$(255);
30 CLOSE #1
(b) LOAD the protected program into memory.
(c) Enter LOAD "RECOVER.BAS"
The formerly protected program will now be in memory and can be LISTed or EDITed, and reSAVEd in an unprotected format. This technique appears to work with most versions of BASIC. I have used it successfully with IBM PC BASIC, Compaq BASIC, and several versions of GW-BASIC. LOADing the file RECOVER.BAS will also restore a program after a NEW command has been executed.

Net::SFTP Errors

I have been trying to download a file using Net::SFTP and it keeps getting an error.
The file is partially downloaded, and is only 2.1 MB, so it's not a huge file. I removed the loop over the files and even tried just downloading the one file and got the same error:
yml = YAML.load_file Rails.root.join('config', 'ftp.yml')
Net::SFTP.start(yml["url"], yml["username"], password: yml["password"]) do |sftp|
sftp.dir.glob(File.join('users', 'import'), '*.csv').each do |f|
sftp.download!(File.join('users', 'import', f.name), Rails.root.join('processing_files', 'download_files', f.name), read_size: 1024)
end
end
NoMethodError: undefined method `close' for #<Pathname:0x007fc8fdb50ea0>
from /[my_working_ap_dir]/gems/net-sftp-2.1.2/lib/net/sftp/operations/download.rb:331:in `on_read'
I have prayed to Google all I can and am not getting anywhere with it.
Rails.root returns a Pathname object, but it looks like the sftp code doesn't check to see whether it got a Pathname or a File handle, it just runs with it. When it runs into entry.sink.close it crashes because Pathnames don't implement close.
Pathnames are great for manipulating paths to files and directories, but they're not substitutes for file handles. You could probably tack on to_s which would return a string.
Here's a summary of the download call from the documentation that hints that the expected parameters should be a String:
To download a single file from the remote server, simply specify both the
remote and local paths:
downloader = sftp.download("/path/to/remote.txt", "/path/to/local.txt")
I suspect that if I dig into the code it will check to see whether the parameters are strings, and, if not, assumes that they are IO handles.
See ri Net::SFTP::Operations::Download for more info.
Here's an excerpt from the current download! code, and you can see how the problem occurred:
def download!(remote, local=nil, options={}, &block)
require 'stringio' unless defined?(StringIO)
destination = local || StringIO.new
result = download(remote, destination, options, &block).wait
local ? result : destination.string
end
local was passed in as a Pathname. The code checks to see if there's something passed in, but not what that is. If nothing is passed in it assumes it's something with IO-like features, which is what StringIO provides for the in-memory caching.
Apparently you can't use Rails.root.join, which was causing the problem. It is really stupid though because it would download part of the file.
Changed:
sftp.download!(File.join('users', 'import', f.name), Rails.root.join('processing_files', 'download_files', f.name))
To:
sftp.download!(File.join('users', 'import', f.name), File.join('processing_files', 'download_files', f.name))
argument remote can be a Pathname object while argument local when set should be a String or else an object that responds to #write method.
Below is the working code
local_stringified_path = Rails.root.join('processing_files', f.name).to_s
sftp.download!(Pathname.new('/users/import'), local_stringified_path)
For all those curious minds please read below to understand this behaviour..
The issue NoMethodError: undefined method close' for #<Pathname:0x007fc8fdb50ea0> happens exactly here
in the #on_read method and below is the code snippet of the concerned statements.
if response.eof?
update_progress(:close, entry)
entry.sink.close # ERRORED OUT LINE.. ideally when eof, file IO handler is supposed to be closed
WHAT IS entry.sink ?
We know already that #download! methods takes two args as below
sftp.download!(remote, local)
The given args remote and local is converted to an Entry object here
[Entry.new(remote, local, recursive?)]
and Entry is a nothing but a Struct here
Entry = Struct.new(:remote, :local, :directory, :size, :handle, :offset, :sink)
okay then what is sink attribute? we will jump to that right away..
Once the concerned remote file is open to be read, the #on_open method updates this sink attribute with a File handler here.
Find the snippet below,
entry.sink = entry.local.respond_to?(:write) ? entry.local : ::File.open(entry.local, "wb")
This actually happens only when given local path object doesn't implement it's own #write method In our scenario, Pathname objects does respond to write
Below are some snippets of the console outputs, I inspected in between multiple download chunk calls while debugging this.. which shows the entry and entry.sink displaying the above discussed objects.
Here I chose my remote to be a Pathname object and local to be String path which returns proper value for the entry.sink and there by downloading successfully..
0> entry
=> #<struct Net::SFTP::Operations::Download::Entry remote=#<Pathname:214010463.xml>, local="214010463.xml", directory=nil, size=nil, handle="1", offset=32000, sink=#<File:214010463.xml>>
0> entry.sink
=> #<File:214010463.xml>

How to capture console output from TDUMP.EXE?

TDUMP.exe is file dumping utility from Delphi RAD Studio. If I run
tdump.exe myapp.exe
It will return some information about the myapp.exe.
I want to capture the console output of tdump.exe to my VCL gui application. I have tried the RunDosInMemo in http://delphi.about.com/cs/adptips2001/a/bltip0201_2.htm. The output result isn't same as command line console output. It always return:
ERROR: Can not open output file myapp.exe.
And myapp.exe file will be overwritten.
Running other console command with RunDosInMemo works as expected but not Delphi tdump.exe.
Any ideas why redirect console output doesn't work with tdump?
I am using the following code to invoke RunDosInMemo:
RunDosInMemo('tdump.exe ' + ParamStr(0), Memo1);
ParamStr(0) returns the full name to your exe including path that may contain spaces that needs to be quoted. Try:
RunDosInMemo('tdump.exe "' + ParamStr(0) + '"', Memo1);
As evident from the error message 'tdump' gives, it is not trying to read the contents of the file name you pass to it, on the contrary, it takes the filename for output.
What 'tdump' actually expects is to read file contents from its 'stdin'. The code you linked in the question is not suitable. You need to create at least two pipes, write contents of the input file to the write end of 'tdump's standard input and read 'tdump's output through the read end of the output pipe.
But this is not required, you can tell 'tdump' to read the file that's passed with an argument, not from stdin. Issue a tdump -? at a console and see the help. You'll notice this option:
-ns Disable support for redirecting stdin
You just need to change your call to make your procedure work:
RunDosInMemo('tdump.exe -ns ' + ParamStr(0), Memo1);

Resources