haskell - Parsing command-line and REPL commands and options - parsing

I'm writing a program that has both a command-line interface and an interactive mode. In CLI mode it executes one command, prints results and exits. In interactive mode it repeatedly reads commands using GNU readline, executes them and prints results (in spirit of a REPL).
The syntax for commands and their parameters is almost the same regardless of whether they come from the command line or from stdin. I would like to maximize code reuse by using a single framework for parsing both command-line and interactive-mode input.
My proposed syntax is (square brackets denote optional parts, braces repetition) as follows:
From shell:
program-name {[GLOBAL OPTION] ...} <command> [{<command arg>|<GLOBAL OPTION>|<LOCAL OPTION> ...}]
In interactive mode:
<command> [{<command arg>|<GLOBAL OPTION>|<LOCAL OPTION> ...}]
Local options are only valid for one particular command (different commands may assign a different meaning to one option).
My problem is that there are some differences between the CL and interactive interfaces:
Some global options are only valid from command line (like --help, --version or --config-file). There is obviously also the 'quit'-command which is very important in interactive mode, but using it from CL makes no sense.
To solve this I've searched web and hackage for command-line parsing libraries. The most interesting ones I've found are cmdlib and optparse-applicative. However, I'm quite new to Haskell and even though I can create a working program by copying and modifying example code from library docs, I haven't quite understood the mechanics of these libraries and therefore have not been able to solve my problem.
I have these questions in mind:
How to make a base parser for commands and options that are common to CL and REPL interfaces and then be able to extend the base parser with new commands and options?
How to prevent these libraries from exiting my program upon incorrect input or when '--help' is used?
I plan to add complete i18n support to my program. Therefore I would like to prevent my chosen library from printing any messages, because all messages need to be translated. How to achieve this?
So I wish you could give me some hints on where to go from here. Does cmdlib or optparse-applicative (or some other library) support what I'm looking for? Or should I revert to a hand-crafted parser?

I think you could use my library http://hackage.haskell.org/package/options to do this. The subcommands feature exactly matches the command flag parsing behavior you're looking for.
It'd be a little tricky to share subcommands between two disjoint sets of options, but a helper typeclass should be able to do it. Rough sample code:
-- A type for options shared between CLI and interactive modes.
data CommonOptions = CommonOptions
    { optSomeOption :: Bool
    }
instance Options CommonOptions where ...
-- A type for options only available in CLI mode (such as --version or --config-file)
data CliOptions = CliOptions
    { common :: CommonOptions
    , version :: Bool
    , configFile :: String
    }
instance Options CliOptions where ...
-- if a command takes only global options, it can use this subcommand option type.
data NoOptions = NoOptions
instance Options NoOptions where
    defineOptions = pure NoOptions
-- typeclass to let commands available in both modes access common options
class HasCommonOptions a where
    getCommonOptions :: a -> CommonOptions
instance HasCommonOptions CommonOptions where
    getCommonOptions = id
instance HasCommonOptions CliOptions where
    getCommonOptions = common
commonCommands :: HasCommonOptions a => [Subcommand a (IO ())]
commonCommands = [... {- your commands here -} ...]
cliCommands :: HasCommonOptions a => [Subcommand a (IO ())]
cliCommands = commonCommands ++ [cmdRepl]
interactiveCommands :: HasCommonOptions a => [Subcommand a (IO ())]
interactiveCommands = commonCommands ++ [cmdQuit]
cmdRepl :: HasCommonOptions a => Subcommand a (IO ())
cmdRepl = subcommand "repl" $ \opts NoOptions -> do
    {- run your interactive REPL here -}
cmdQuit :: Subcommand a (IO ())
cmdQuit = subcommand "quit" (\_ NoOptions -> exitSuccess)
I suspect the helper functions like runSubcommand wouldn't be specialized enough, so you'll want to invoke the parser with parseSubcommand once you've split up the input string from the REPL prompt. The docs have examples of how to inspect the parsed options, including checking whether the user requested help.
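The flow described here (split the REPL line into words, then hand the words to the same parser used for command-line arguments) can be sketched independently of any particular library. What follows is a rough Python illustration, not the options API; the command names greet, repl and quit and the helpers run_words and run_line are made up for the example.

```python
import shlex

# Hypothetical command tables: each handler takes a list of argument words
# and returns a result string instead of printing or exiting.
COMMON = {"greet": lambda args: "hello " + " ".join(args)}
CLI_ONLY = {"repl": lambda args: "entering interactive mode"}
REPL_ONLY = {"quit": lambda args: "bye"}

CLI_COMMANDS = {**COMMON, **CLI_ONLY}
REPL_COMMANDS = {**COMMON, **REPL_ONLY}


def run_words(commands, words):
    """Dispatch already-split words; report errors as values, never exit."""
    if not words:
        return ("error", "no command given")
    name, args = words[0], words[1:]
    handler = commands.get(name)
    if handler is None:
        return ("error", "unknown command: " + name)
    return ("ok", handler(args))


def run_line(commands, line):
    """REPL entry point: split the line shell-style, then reuse run_words."""
    try:
        words = shlex.split(line)
    except ValueError as exc:  # e.g. an unterminated quote
        return ("error", str(exc))
    return run_words(commands, words)
```

In CLI mode the words would come from the argument vector, for example run_words(CLI_COMMANDS, sys.argv[1:]); in interactive mode each line read from the prompt goes through run_line(REPL_COMMANDS, line). Because errors come back as values, the caller decides how to translate and print them.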
The options parser itself won't print any output, but it may be difficult to internationalize error messages generated by the default type parsers. Please let me know if there's any changes to the library that would help.


Can Erlang source code be embedded in Elixir code? If so, how?

Elixir source may be injected using Code.eval_string/3. I don't see mention of running raw Erlang code in the docs:
https://hexdocs.pm/elixir/Code.html#eval_string/3
I am coming from a Scala world in which Java objects are callable using Scala syntax, and Scala is compiled into Java and visible by intercepting the compiler output (directly generated with scalac).
I get the sense that Elixir does not provide such interoperating features, nor allow injection of custom Erlang into the runtime. Is this the case?
You can use the erlang standard library modules from Elixir, as described here or here.
For example:
def random_integer(upper) do
:rand.uniform(upper) # rand is an erlang library
end
You can also add erlang packages to your mix.exs dependencies and use them in your project, as long as these packages are published on hex or on github.
You can also use erlang and elixir code together in a project as described here.
So yeah, it's perfectly possible to call erlang code from elixir.
Vice-versa is also possible, see here for more information:
Elixir compiles into BEAM byte code (via Erlang Abstract Format). This
means that Elixir code can be called from Erlang and vice versa,
without the need to write any bindings.
Expanding on what @zwippie has written:
All remote function calls (by that I mean calling function with explicitly set module/alias) are in form of:
<atom with module name>.<function name>(<arguments>)
# Technically it is the same as:
# apply(module, function_name_as_atom, [arguments])
And all "upper case module names" in Elixir are just atoms:
is_atom(Foo) == true
Foo == :"Elixir.Foo" # => true
So from Elixir viewpoint there is no difference between calling Erlang functions and Elixir functions. It is just different atom passed as the receiving module.
So you can easily call Erlang modules from Elixir. That means that without much hassle you should be able to compile Erlang AST from within Elixir as well:
"rand:uniform(100)"
|> :merl.quote()
|> :erl_eval.expr(%{})
No need for any mental translation.
Additionally, you can mix Erlang and Elixir code in a single Mix project without any problems, with a tree structure like:
.
|- mix.exs
|- src
|  `- example.erl
`- lib
   `- example.ex
Where example.erl is:
-module(example).
-export([hello/0]).
hello() -> <<"World">>.
And example.ex:
defmodule Example do
def print_hello, do: IO.puts(:example.hello())
end
You can compile project and run it with
mix run -e "Example.print_hello()"
And see that Erlang module was successfully compiled and executed from within Elixir code in the same project without problems.
One more thing to watch for when calling Erlang code from Elixir: Erlang uses charlists for strings. When you call an Erlang function that takes a string, convert the string to a charlist first, and convert the returned charlist back to a string.
Examples:
iex(17)> :string.to_upper "test"
** (FunctionClauseError) no function clause matching in :string.to_upper/1
The following arguments were given to :string.to_upper/1:
# 1
"test"
(stdlib 3.15.1) string.erl:2231: :string.to_upper/1
iex(17)> "test" |> String.to_charlist() |> :string.to_upper
'TEST'
iex(18)> "test" |> String.to_charlist() |> :string.to_upper |> to_string
"TEST"
iex(19)>

Validate URL in Informix 4GL program

In my Informix 4GL program, I have an input field where the user can insert a URL and the feed is later being sent over to the web via a script.
How can I validate the URL at the time of input, to ensure that it's a live link? Can I make a call and see if I get back any errors?
I4GL checking the URL
There is no built-in function to do that (URLs didn't exist when I4GL was invented, amongst other things).
If you can devise a C method to do that, you can arrange to call that method through the C interface. You'll write the method in native C, and then write an I4GL-callable C interface function using the normal rules. When you build the program with I4GL c-code, you'll link the extra C functions too. If you build the program with I4GL-RDS (p-code), you'll need to build a custom runner with the extra function(s) exposed. All of this is standard technique for I4GL.
In general terms, the C interface code you'll need will look vaguely like this:
#include <fglsys.h>
// Standard interface for I4GL-callable C functions
extern int i4gl_validate_url(int nargs);
// Using obsolescent interface functions
int i4gl_validate_url(int nargs)
{
if (nargs != 1)
fgl_fatal(__FILE__, __LINE__, -1318);
char url[4096];
popstring(url, sizeof(url));
int r = validate_url(url); // Your C function
retint(r);
return 1;
}
You can and should check the manuals but that code, using the 'old style' function names, should compile correctly. The code can be called in I4GL like this:
DEFINE url CHAR(256)
DEFINE rc INTEGER
LET url = "http://www.google.com/"
LET rc = i4gl_validate_url(url)
IF rc != 0 THEN
ERROR "Invalid URL"
ELSE
MESSAGE "URL is OK"
END IF
Or along those general lines. Exactly what values you return depends on your decisions about how to return a status from validate_url(). If need be, you can return multiple values from the interface function (e.g. error number and text of error message). This is about the simplest possible design for calling some C code to validate a URL from within an I4GL program.
Modern C interface functions
The function names in the interface library were all changed in the mid-00's, though the old names still exist as macros. The old names were:
popstring(char *buffer, int buflen)
retint(int retval)
fgl_fatal(const char *file, int line, int errnum)
You can find the revised documentation at IBM Informix 4GL v7.50.xC3: Publication library in PDF in the 4GL Reference Manual, and you need Appendix C "Using C with IBM Informix 4GL".
The new names start ibm_lib4gl_:
ibm_lib4gl_popMInt()
ibm_lib4gl_popString()
As to the error reporting function, there is one, but I don't have access to documentation for it any more. It'll be in the fglsys.h header. It takes an error number as one argument; the file name and line number are the other arguments. It will presumably be ibm_lib4gl_…, and there will probably be Fatal or perhaps fatal (or maybe Err or err) in the rest of the name.
I4GL running a script that checks the URL
Wouldn't it be easier to write a shell script to get the status code? That might work if I can return the status code or any existing results back to the program into a variable? Can I do that?
Quite possibly. If you want the contents of the URL as a string, though, you might end up wanting to call C. It is certainly worth thinking about whether calling a shell script from within I4GL is doable. If so, it will be a lot simpler (RUN "script", IIRC, where the literal string would probably be replaced by a built-up string containing the command and the URL). I believe there are file I/O functions in I4GL now, too, so if you can get the script to write a file (trivial), you can read the data from the file without needing custom C. For a long time, you needed custom C to do that.
I just need to validate the URL before storing it into the database. I was thinking about:
#!/bin/bash
read -p "URL to check: " url
if curl --output /dev/null --silent --head --fail "$url"; then
printf '%s\n' "$url exist"
else
printf '%s\n' "$url does not exist"
fi
but I just need the output instead of /dev/null to be into a variable. I believe the only option is to dump the output into a temp file and read from there.
Instead of having I4GL run the code to validate the URL, have I4GL run a script to validate the URL. Use the exit status of the script and dump the output of curl into /dev/null.
FUNCTION check_url(url)
DEFINE url VARCHAR(255)
DEFINE command_line VARCHAR(255)
DEFINE exit_status INTEGER
LET command_line = "check_url ", url
RUN command_line RETURNING exit_status
RETURN exit_status
END FUNCTION {check_url}
Your calling code can analyze exit_status to see whether it worked. A value of 0 indicates success; non-zero indicates a problem of some sort, which can be deemed 'URL does not work'.
Make sure the check_url script (a) exits with status zero on success and non-zero on any sort of failure, and (b) doesn't write anything to standard output (or standard error) by default. The writing to standard error or output will screw up screen layouts, etc, and you do not want that. (You can obviously have options to the script that enable standard output, or you can invoke the script with options to suppress standard output and standard error, or redirect the outputs to /dev/null; however, when used by the I4GL program, it should be silent.)
Your 'script' (check_url) could be as simple as:
#!/bin/bash
exec curl --output /dev/null --silent --head --fail "${1:-http://www.example.com/}"
This passes the first argument to curl, or the non-existent example.com URL if no argument is given, and replaces itself with curl, which generates a zero/non-zero exit status as required. You might add 2>/dev/null to the end of the command line to ensure that error messages are not seen. (Note that it will be hell debugging this if anything goes wrong; make sure you've got provision for debugging.)
The exec is a minor optimization; you could omit it with almost no difference in result. (I could devise a scheme that would probably spot the difference; it involves signalling the curl process, though — kill -9 9999 or similar, where the 9999 is the PID of the curl process — and isn't of practical significance.)
Given that the script is just one line of code that invokes another program, it would be possible to embed all that in the I4GL program. However, having an external shell script (or Perl script, or …) has merits of flexibility; you can edit it to log attempts, for example, without changing the I4GL code at all. One more file to distribute, but better flexibility — keep a separate script, even though it could all be embedded in the I4GL.
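The contract described above (exit status zero on success, non-zero on failure, nothing written to the terminal) is independent of the language the checker is written in. A minimal Python sketch of the calling side follows; the helper names exit_status_silent and is_ok are hypothetical, and the Python interpreter stands in for the real checker so the example runs without network access.

```python
import subprocess
import sys


def exit_status_silent(argv):
    """Run argv with stdout and stderr discarded; return its exit status."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return completed.returncode


def is_ok(argv):
    """Zero exit status means success; anything else counts as failure."""
    return exit_status_silent(argv) == 0


# Stand-in commands (assumption: these replace the real URL checker).
SUCCEEDS = [sys.executable, "-c", "print('noisy'); raise SystemExit(0)"]
FAILS = [sys.executable, "-c", "import sys; sys.stderr.write('err'); sys.exit(22)"]
```

Swapping the stand-in for ["curl", "--output", "/dev/null", "--silent", "--head", "--fail", url] would give the same behaviour as the check_url script: the caller sees only the status, and any output from the checker never reaches the screen.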
As Jonathan said, "URLs didn't exist when I4GL was invented, amongst other things". What you will find is that the products that have grown to supersede Informix 4GL, such as FourJs Genero, cater for new technologies and other things invented after I4GL.
Using FourJs Genero, the code below will do what you are after using the Informix 4gl syntax you are familiar with
IMPORT com
MAIN
-- Should succeed and display 1
DISPLAY validate_url("http://www.google.com")
DISPLAY validate_url("http://www.4js.com/online_documentation/fjs-fgl-manual-html/index.html#c_fgl_nf.html") -- link to some of the features added to I4GL by Genero
-- Should fail and display 0
DISPLAY validate_url("http://www.google.com/testing")
DISPLAY validate_url("http://www.google2.com")
END MAIN
FUNCTION validate_url(url)
DEFINE url STRING
DEFINE req com.HttpRequest
DEFINE resp com.HttpResponse
-- Returns TRUE if http request to a URL returns 200
TRY
LET req = com.HttpRequest.create(url)
CALL req.doRequest()
LET resp = req.getResponse()
IF resp.getStatusCode() = 200 THEN
RETURN TRUE
END IF
-- May want to handle other HTTP status codes
CATCH
-- May want to capture case if not connected to internet etc
END TRY
RETURN FALSE
END FUNCTION

FAKE Fsc task is writing build products to wrong directory

I'm just learning F#, and setting up a FAKE build harness for a hello-world-like application. (Though the phrase "Hell world" does occasionally come to mind... :-) I'm using a Mac and emacs (generally trying to avoid GUI IDEs by preference).
After a bit of fiddling about with documentation, here's how I'm invoking the F# compiler via FAKE:
let buildDir = @"./build-app/" // Where application build products go
Target "CompileApp" (fun _ -> // Compile application source code
    !! @"src/app/**/*.fs" // Look for F# source files
    |> Seq.toList // Convert FileIncludes to string list
    |> Fsc (fun p -> // which is what the Fsc task wants
        {p with
            FscTarget = Exe
            Platform = AnyCpu
            Output = (buildDir + "hello-fsharp.exe") }) // *** Writing to . instead of buildDir?
)
That uses !! to make a FileIncludes of all the sources in the usual way, then uses Seq.toList to change that to a string list of filenames, which is then handed off to the Fsc task. Simple enough, and it even seems to work:
...
Starting Target: CompileApp (==> SetVersions)
FSC with args:[|"-o"; "./build-app/hello-fsharp.exe"; "--target:exe"; "--platform:anycpu";
"/Users/sgr/Documents/laboratory/hello-fsharp/src/app/hello-fsharp.fs"|]
Finished Target: CompileApp
...
However, despite what the console output above says, the actual build products go to the top-level directory, not the build directory. The message above looks like the -o argument is being passed to the compiler with an appropriate filename, but the executable gets put in . instead of ./build-app/.
So, 2 questions:
Is this a reasonable way to be invoking the F# compiler in a FAKE build harness?
What am I misunderstanding that is causing the build products to go to the wrong place?
This, or a very similar problem, was reported in FAKE issue #521 and seems to have been fixed in FAKE pull request #601, which see.
Explanation of the Problem
As is apparently well-known to everyone but me, the F# compiler as implemented in FSharp.Compiler.Service has a practice of skipping its first argument. See FSharp.Compiler.Service/tests/service/FscTests.fs around line 127, where we see the following nicely informative comment:
// fsc parser skips the first argument by default;
// perhaps this shouldn't happen in library code.
Whether it should or should not happen, it's what does happen. Since the -o came first in the arguments generated by FscHelper, it was dutifully ignored (along with its argument, apparently). Thus the assembly went to the default place, not the place specified.
Solutions
The temporary workaround was to specify --out:destinationFile in the OtherParams field of the FscParams setter in addition to the Output field; the latter is the sacrificial lamb to be ignored while the former gets the job done.
The longer term solution is to fix the arguments generated by FscHelper to have an extra throwaway argument at the front; then these 2 problems will annihilate in a puff of greasy black smoke. (It's kind of balletic in its beauty, when you think about it.) This is exactly what was just merged into the master by @forki23:
// Always prepend "fsc.exe" since fsc compiler skips the first argument
let optsArr = Array.append [|"fsc.exe"|] optsArr
So that solution should be in the newest version of FAKE (3.11.0).
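The interaction between "the parser skips its first argument" and "the option list starts with -o" can be shown with a toy model. This is a Python sketch of the general mechanism only, not fsc's real parser; toy_parse and its rules are invented for illustration, and the toy treats the orphaned output path as an ordinary input, which may differ from what the real compiler does with it.

```python
def toy_parse(argv):
    """Toy argument parser that treats argv[0] as the program name and
    skips it unconditionally. Hypothetical; not fsc's actual logic."""
    output = None
    inputs = []
    flags = []
    rest = argv[1:]  # the first element is always dropped
    i = 0
    while i < len(rest):
        arg = rest[i]
        if arg == "-o" and i + 1 < len(rest):
            output = rest[i + 1]  # "-o" consumes the following word
            i += 2
        elif arg.startswith("--"):
            flags.append(arg)
            i += 1
        else:
            inputs.append(arg)
            i += 1
    return {"output": output, "inputs": inputs, "flags": flags}


generated = ["-o", "./build-app/hello-fsharp.exe", "--target:exe",
             "src/app/hello-fsharp.fs"]

broken = toy_parse(generated)               # "-o" is eaten as the program name
fixed = toy_parse(["fsc.exe"] + generated)  # dummy first argument absorbs the skip
```

In the broken case the parser never sees -o, so no output path is recorded and the default location is used; with the dummy "fsc.exe" in front, the same option list is parsed as intended.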
The answers to my 2 questions are thus:
Yes, this appears to be a reasonable way to invoke the F# compiler.
I didn't misunderstand anything; it was just a bug and a fix is in the pipeline.
More to the point: the actual misunderstanding was that I should have checked the FAKE issues and pull requests to see if anybody else had reported this sort of thing, and that's what I'll do next time.

How to pass extra arguments to RabbitMQ connection in Erlang client

I have written some extension modules for eJabberd most of which pass pieces of information to RabbitMQ for various reasons. All has been fine until we brought the server up in staging where we have a Rabbit cluster rather than a single box.
In order to utilize the cluster you need to pass "x-ha-policy" parameter to Rabbit with either the "all" or "nodes" value. This works fine for the Java and Python Producers and Consumers, but the eJabberd (using the Erlang AMQP client of course) has me a bit stumped. The x-ha-policy parameter needs to be passed into the "client_properties" parameter which is just the "catchall" for extra parameters.
In Python with pika I can do:
client_params = {"x-ha-policy": "all"}
queue.declare(host, vhost, username, password, arguments=client_params)
and that works. However the doc for the Erlang client says the arguments should be passed in as a list per:
[{binary(), atom(), binary()}]
If it were just [{binary(), binary()}] I could see the relationship with key/value but not sure what the atom would be there.
Just to be clear, I am a novice Erlang programmer so this may be a common construct that I am not familiar with, so no answer would be too obvious.
I found this in amqp_network_connection.erl, which looks like a wrapper to set some default values:
client_properties(UserProperties) ->
{ok, Vsn} = application:get_key(amqp_client, vsn),
Default = [{<<"product">>, longstr, <<"RabbitMQ">>},
{<<"version">>, longstr, list_to_binary(Vsn)},
{<<"platform">>, longstr, <<"Erlang">>},
{<<"copyright">>, longstr,
<<"Copyright (c) 2007-2012 VMware, Inc.">>},
{<<"information">>, longstr,
<<"Licensed under the MPL. "
"See http://www.rabbitmq.com/">>},
{<<"capabilities">>, table, ?CLIENT_CAPABILITIES}],
lists:foldl(fun({K, _, _} = Tuple, Acc) ->
lists:keystore(K, 1, Acc, Tuple)
end, Default, UserProperties).
Apparently the atom describes the value type. I don't know the available types, but there's a chance that longstr will work in your case.

What kind of types can be sent on an Erlang message?

Mainly I want to know if I can send a function in a message in a distributed Erlang setup.
On Machine 1:
F1 = fun() -> hey end,
gen_server:call(on_other_machine, F1)
On Machine 2:
handle_call(Function, _From, State) ->
    {reply, Function(), State}.
Does it make sense?
Here's an interesting article about "passing fun's to other Erlang nodes". To summarize it briefly:
[...] As you might know, Erlang distribution
works by sending the binary encoding
of terms; and so sending a fun is also
essentially done by encoding it using
erlang:term_to_binary/1; passing the
resulting binary to another node, and
then decoding it again using
erlang:binary_to_term/1.[...]
This is pretty obvious
for most data types; but how does it
work for function objects?
When you encode a fun, what is encoded
is just a reference to the function,
not the function implementation.
[...]
[...]the definition of the function is not passed along; just exactly enough information to recreate the fun at an other node if the module is there.
[...] If the module containing the fun has not yet been loaded, and the target node is running in interactive mode; then the module is attempted loaded using the regular module loading mechanism (contained in the module error_handler); and then it tries to see if a fun with the given id is available in said module. However, this only happens lazily when you try to apply the function.
[...] If you never attempt to apply the function, then nothing bad happens. The fun can be passed to another node (which has the module/fun in question) and then everybody is happy.
Maybe the target node has a module loaded of said name, but perhaps in a different version; which would then be very likely to have a different MD5 checksum, then you get the error badfun if you try to apply it.
I suggest reading the whole article, because it's extremely interesting.
You can send any valid Erlang term. Although you have to be careful when sending funs. Any fun referencing a function inside a module needs that module to exist on the target node to work:
(first@host)9> rpc:call(second@host, erlang, apply,
                        [fun io:format/1, ["Hey!~n"]]).
Hey!
ok
(first@host)10> mymodule:func("Hey!~n").
5
(first@host)11> rpc:call(second@host, erlang, apply,
                         [fun mymodule:func/1, ["Hey!~n"]]).
{badrpc,{'EXIT',{undef,[{mymodule,func,["Hey!~n"]},
{rpc,'-handle_call_call/6-fun-0-',5}]}}}
In this example, io exists on both nodes and it works to send a function from io as a fun. However, mymodule exists only on the first node and the fun generates an undef exception when called on the other node.
As for anonymous functions, it seems they can be sent and work as expected.
t1@localhost:
(t1@localhost)7> register(shell, self()).
true
(t1@localhost)10> A = me, receive Fun when is_function(Fun) -> Fun(A) end.
hello me you
ok
t2@localhost:
(t2@localhost)11> B = you.
you
(t2@localhost)12> Fn2 = fun (A) -> io:format("hello ~p ~p~n", [A, B]) end.
#Fun<erl_eval.6.54118792>
(t2@localhost)13> {shell, 't1@localhost'} ! Fn2.
I am adding coverage logic to an app built on riak-core, and the merge of results gathered can be tricky if anonymous functions cannot be used in messages.
Also check out riak_kv/src/riak_kv_coverage_filter.erl
riak_kv might be using it to filter result, I guess.
