Is Erlang a Constraint-Logic programming language? - erlang

Since Erlang is based upon Prolog, does this mean that Erlang is a Constraint-Logic Language?
Does Erlang have Prolog's building blocks: facts, rules and queries?

No.
Erlang's syntax is very similar to Prolog's, but the semantics are very different. An early version of Erlang was written using Prolog, but today's Erlang can no longer meaningfully be said to be "based on Prolog."
Erlang does not include backtracking or other features of Prolog regularly used for logic programming. You can of course implement Prolog atop other languages, and Erlang is an easier choice for this than some others. This can be seen in Robert Virding's "Erlog" project:
https://github.com/rvirding/erlog

Yes.
The first version of Erlang was not written in Prolog; it was written in one of the committed-choice logic programming languages. These languages dropped Prolog's backtracking, hence the name "committed choice": once a choice was made, it was not possible to backtrack and try another. This was done to make it simpler to turn logic programming into something concurrent. Another way of looking at it is that concurrent processes would apply constraints to variables, but since these are logic variables, and hence not re-assignable, these would be successive constraints rather than changes of value. A constraint might assign a partial value to a variable, containing another variable which would be assigned later. This is the underlying model of Erlang. The term "constraint logic programming" has tended to be used for languages where the constraints could also include mathematical statements about the possible ranges of variables with intended numerical values.
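What survives of that model in today's Erlang is the single-assignment nature of variables: = is a one-way match, not an update. A short illustration from the Erlang shell (note, though, that Erlang does not allow unbound variables inside data structures):
1> X = 1.
1
2> X = 1.
1
3> X = 2.
** exception error: no match of right hand side value 2
4> {ok, Y} = {ok, 42}.
{ok,42}
The match on line 3 fails because X is already bound to 1; a match can only confirm an existing binding or bind previously unbound variables, as on line 4.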
The syntax of Erlang shows its logic programming heritage, but it is important to understand that it picked this up through the committed-choice logic programming languages, which picked it up from Prolog; it did not come directly from Prolog. Although several committed-choice logic programming languages were devised during the 1980s, they were unable to pull out of the shadow of Prolog, and were pulled down by their association with the failed Japanese Fifth Generation initiative, and also by competing teams of developers who bickered over minor differences, so that no standard was ever established.
Erlang's developers introduced syntactic sugar which gave the code a more functional appearance, and made the marketing decision to promote it as a functional rather than a logic programming language, which saved it from being dragged down by the post-Fifth-Generation dismissal of logic programming.

In short, no it's not :) It doesn't have those building blocks. Its focus is on concurrency, parallel programming, distributed applications and fault tolerance (while being a functional, strict, declarative language).
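A minimal sketch of that focus (the module and function names are just illustrative): lightweight processes communicating by message passing, with no logic-programming machinery involved.
-module(echo).
-export([start/0, echo/0]).

start() ->
    Pid = spawn(fun echo/0),      % start a lightweight process
    Pid ! {self(), hello},        % send it a message
    receive
        {Pid, Reply} -> Reply     % wait for its answer
    end.

echo() ->
    receive
        {From, Msg} -> From ! {self(), Msg}
    end.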

You can, however, use Erlang's list comprehensions to program in a constraint-like style.
% Produces [{1, 0}]
%
constraint_test() ->
    [{A, B} ||
        A <- lists:seq(0, 1),
        B <- lists:seq(0, 1),
        A > B].
You alternate generators of elements taken from lists (A <- lists:seq(0, 1)) with constraints (A > B).
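A slightly larger example in the same style (a hedged sketch; the function name is just illustrative): all right triangles with integer sides up to N, found by alternating generators with constraints.
pythagorean(N) ->
    [{A, B, C} ||
        A <- lists:seq(1, N),
        B <- lists:seq(A, N),    % A =< B avoids listing each triple twice
        C <- lists:seq(B, N),
        A*A + B*B =:= C*C].
For example, pythagorean(20) returns [{3,4,5},{5,12,13},{6,8,10},{8,15,17},{9,12,15},{12,16,20}].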
I recently solved the problem linked below, and if you place the constraints correctly you get the answer in a fraction of a second.
http://www.geocaching.com/seek/cache_details.aspx?guid=a8605431-53b5-4c2c-97fb-d42ee299b167

Related

How is Coq's parser implemented?

I was entirely amazed by how Coq's parser is implemented. e.g.
https://softwarefoundations.cis.upenn.edu/lf-current/Imp.html#lab347
It's crazy that the parser seems able to accept almost any lexeme introduced by a Notation command, and the subsequent parser can then parse expressions using that notation as-is. This seems to imply that the grammar must be context-sensitive, but it is so flexible that it goes beyond my comprehension.
Any pointers on how this kind of parser is theoretically feasible? How does it work? Any materials or references would help; I just want to learn about this type of parser in general. Thanks.
Please do not ask me to read Coq's source myself. I want to understand the idea in general, not a specific implementation.
Indeed, this notation system is very powerful and was probably one of the reasons for Coq's success. In practice, it is a source of much complication in the source code. I think that #ejgallego should be able to tell you more about it, but here is a quick explanation:
At the beginning, Coq's documents were evaluated sentence by sentence (sentences are separated by dots) by coqtop. Some commands can define notations and these modify the parsing rules when they are evaluated. Thus, later sentences are evaluated with a slightly different parser.
Since version 8.5, there is also a mechanism (the STM) to evaluate a document fully (many sentences in parallel), but there is a special mechanism for handling these notation commands (basically, you have to wait for them to be evaluated before you can continue parsing and evaluating the rest of the document).
Thus, contrary to a normal programming language, where the compiler takes a document, passes it through the lexer, then the parser (parsing the full document in one go), and only then hands an AST to the typer or other later stages, in Coq each command is parsed and evaluated separately. So there is no need to resort to complex contextual grammars...
I'll drop my two cents to complement #Zimmi48's excellent answer.
Coq indeed features an extensible parser, which TTBOMK is mainly the work of Hugo Herbelin, built on the CAMLP4/CAMLP5 extensible parsing system by Daniel de Rauglaudre. Both of them are the canonical sources of information about the parser; I'll try to summarize what I know, but note that my experience with the system is short.
The CAMLPX system basically supports any LL(1) grammar. Coq exposes the whole set of grammar rules to the user, allowing the user to redefine them. This is the base mechanism on which extensible grammars are built. Notations are compiled into parsing rules in the Metasyntax module, and unfolded in a later post-processing phase. And that really is it, AFAICT.
The system itself hasn't changed much in the whole 8.x series, #Zimmi48's comments are more related to the internal processing of commands after parsing. I recently learned that Coq v7 had an even more powerful system for modifying the parser.
In the words of Hugo Herbelin, "the art of extensible parsing is a delicate one", and indeed it is, but Coq has achieved a pretty great implementation of it.

Is a 'standard module' part of the 'programming language'?

On page 57 of the book "Programming Erlang" by Joe Armstrong (2007) 'lists:map/2' is mentioned in the following way:
Virtually all the modules that I write use functions like
lists:map/2 —this is so common that I almost consider map
to be part of the Erlang language. Calling functions such
as map and filter and partition in the module lists is extremely
common.
The usage of the word 'almost' got me confused about what the difference between Erlang as a whole and the Erlang language might be, and whether there even is a difference at all. Is my confusion based on the semantics of the word 'language'? It seems to me as if a standard module floats around the border of what does and does not belong to the actual language it's implemented in. What are the differences between a programming language at its core and the standard libraries implemented in it?
I'm aware that this is quite the newbie question, but in my experience jumping to my own conclusions can lead to bad things. I was hoping someone could clarify this somewhat.
Consider this simple program:
1> List = [1, 2, 3, 4, 5].
[1,2,3,4,5]
2> Fun = fun(X) -> X*2 end.
#Fun<erl_eval.6.50752066>
3> lists:map(Fun, List).
[2,4,6,8,10]
4> [Fun(X) || X <- List].
[2,4,6,8,10]
Both produce the same output; however, the first one, lists:map/2, is a library function, while the second is a core language construct called a list comprehension. The first is implemented in Erlang (incidentally, also using a list comprehension); the second is parsed directly by Erlang. The library function can be optimized only as much as the compiler is able to optimize its Erlang implementation. The list comprehension, however, may be optimized as far as being written in assembler in the BEAM VM and called from the resulting beam file, for maximum performance.
Some constructs look like they are part of the language even though they are in fact implemented in the library, for example spawn/3. When it's used in code it looks like a keyword, but in Erlang it is not one of the reserved words. Instead, the Erlang compiler automatically adds the erlang module in front of it and calls erlang:spawn/3, which is a library function. Such functions are called BIFs (Built-In Functions).
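A minimal sketch of that point (the module name is just illustrative): spawn/3 and erlang:spawn/3 below refer to the same library function; the compiler resolves the unqualified call automatically.
-module(bif_demo).
-export([start/0, wait/0]).

start() ->
    Pid1 = spawn(?MODULE, wait, []),          % looks like a keyword, is a BIF
    Pid2 = erlang:spawn(?MODULE, wait, []),   % the same function, fully qualified
    {Pid1, Pid2}.

wait() ->
    receive stop -> ok end.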
In general, what belongs to the language itself is what that language's compiler can parse and translate into executable code (in other words, what is defined by the language's grammar). Everything else is a library. Libraries are usually written in the language for which they are designed, but that doesn't have to be the case; for example, some Erlang library functions are written in C, as Erlang NIFs.

Why do we use the term syntax in computer languages and not the term grammar instead

I am confused about the words syntax and grammar. Is there a reason that for computer languages we always use the word syntax to describe the word order, and not the word grammar?
The term "syntax" and "grammar" both comes from the field of linguistics. In linguistics, syntax refers to the rules by which sentences are constructed. Grammar refers to how the rules of the language relate to one another.
Grammar actually covers syntax, morphology and phonology. Morphology are the rules of how words can be modified to add meaning or context. Phonology are the rules of how words should sound like (which in turn govern how spelling works in that language).
So, how did concepts form linguistics got adopted by programmers?
If you look at really old papers and publications related to computing, for example Turing's seminal work on computability (Turing machines) or even older, Babbage's publications describing his Analytical Engine and Ada Lovelace's publications on programming, you'll find that they don't refer to computer programs as languages. Instead, they were just referred to as instructions or, if you want to get fancy, algorithms.
It was partly, perhaps mostly, the work of Noam Chomsky that related languages to programming.
Looking for a new way to study languages and how to extract meaning from sentences, Chomsky created the concept of the Chomsky hierarchy. His idea was to start with the simplest system that could process a string of "stuff" (sounds, letters, words), a Turing machine, and to categorize the instructions for a Turing machine as a type-0 grammar. He then went on to define grammar types 1, 2 and 3, each more restrictive than the last, hoping that as we understood how complexity gets introduced, we would end up with a parser for human languages such as English or Swahili.
Most programming languages are type 2. Indeed, we have discovered parsers for types 0, 1 and 2 in the form of language interpreters and CPU designs.
Inheriting Chomsky's work, we have defined "syntax" in computing to mean how symbols are arranged to implement a language feature and "grammar" to mean the collection of syntax rules.
Because a language has only "one" syntax (the set of strings it will accept), and probably very many grammars even if we exclude trivial variants.
This may be clearer if you think about the phrase, "the language syntax allows stuff". This phrase is independent of any grammars that might be used to describe the syntax.
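A tiny illustration of that point: the two grammars below describe exactly the same language, the strings n, n + n, n + n + n, and so on, yet they are different grammars, one left-recursive and one right-recursive.
E ::= n | E "+" n
E ::= n | n "+" E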

Specifying a dynamic priority and precedence for an operator in Menhir/Ocamlyacc

I'm trying to parse a language where the operators have dynamic attributes (priority and precedence), using the Menhir parser (similar to ocamlyacc). During the lexing phase, every operator becomes an OP:string token (so "+" turns into (OP "+"), etc.).
The operator attributes are determined at parse time and fill a table associating operators with their attributes. Given this table, how can I instruct Menhir to dynamically change the priority of the rule parsing the operators based on this table's data?
Thanks,
CharlieP.
I'm sorry for answering with a "you're doing it wrong" kind of comment. I have three objections I hope are constructive, in decreasing order of relevance:
Menhir is not meant for dynamic grammar updates; if you insist on changing your grammar at parse time, you should use a tool that provides this feature, such as the GLR parser Dypgen. The Dypgen manual mentions the possibility of dynamically updating operator priorities, in a constrained way (it seems you can add new operators and corresponding priorities, but not change the priority of existing ones) that may or may not match your needs. See section 6.6 of the Dypgen manual (PDF), page... 42.
Dynamically updating a CFG grammar is, I think, not the best way to handle user-defined operator precedences. Agda has very general user-defined mixfix operators, and their solution is roughly the following: use your CFG parser to parse the grammatical structure that is statically known, but for expressions that may use fancy precedences and associativities, just parse them into a flat list of tokens. For example, let x = if foo then x + y * z else bar would be parsed into something like Let(x, If(foo, Expr(x, +, y, *, z), bar)). A later, specialized pass can gather the required information to post-parse those Expr nodes into their final structure (a small sketch of such a pass is given at the end of this answer). Use parser generators for what they're good at (statically known, rich CFGs), and use a post-processing pass for the complex, ill-defined, dynamic stuff. The Agda developers have some literature on the topic, for example Parsing Mixfix Operators, Danielsson and Norell, 2009.
From a design point of view, I strongly urge you to separate your lexing and parsing into several distinct passes, each of them well-defined and using only information gathered by the previous ones, instead of trying to dynamically change their own behavior. You'll end up with something much simpler and much more robust.
Dynamic or user-defined precedences and priorities are, in my opinion, a bit evil. OCaml has a different system, where operator precedence and associativity are determined by the operator's first few characters (e.g. #, ## and #+ are all right-associative). It is a bit restrictive for the people choosing an infix operator, but it makes code readers' lives much more comfortable, as they have only one set of grammar rules to learn, instead of having to dynamically adapt their eyes to each new piece of code. If you want to allow the insertion of wild, foreign pieces of code with an entirely different syntax, quotation mechanisms (e.g. camlp4's <:foo< ... >>) are much more robust than fiddling with operator-level associativities and priorities, and also much simpler to parse.
That said, projects have different needs and I would completely understand if you insisted on having dynamically changing operator precedences and associativities for some application I don't know about. Just keep in mind that it's not the only way to go, and sometimes consistency and simplicity are better than absolute flexibility.
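As mentioned in the second objection above, here is a hedged sketch of such a post-parsing pass, written as precedence climbing over the flat operator/operand list (the module, the function names and the table layout are illustrative, and the sketch is in Erlang rather than OCaml, but the idea transfers directly):
-module(reparse).
-export([expr/2]).

%% Prec maps an operator to {Priority, left | right}.
%% The input is an alternating list [Operand, Op, Operand, Op, ...],
%% i.e. the flat Expr(x, +, y, *, z) node produced by the first pass.
expr([Lhs | Rest], Prec) ->
    {Tree, []} = climb(Lhs, Rest, 0, Prec),
    Tree.

climb(Lhs, [Op, Rhs0 | Rest0] = All, Min, Prec) ->
    case maps:get(Op, Prec) of
        {P, Assoc} when P >= Min ->
            NextMin = case Assoc of left -> P + 1; right -> P end,
            {Rhs, Rest} = climb(Rhs0, Rest0, NextMin, Prec),
            climb({Op, Lhs, Rhs}, Rest, Min, Prec);
        _ ->
            {Lhs, All}
    end;
climb(Lhs, [], _Min, _Prec) ->
    {Lhs, []}.
With Prec = #{'+' => {1, left}, '*' => {2, left}}, reparse:expr([x, '+', y, '*', z], Prec) returns {'+', x, {'*', y, z}}; changing the table at runtime changes how the same flat list is rebuilt, without ever touching the CFG parser.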

What are advantages and disadvantages of "point free" style in functional programming?

I know that in some languages (Haskell?) the goal is to achieve point-free style, that is, to never explicitly refer to function arguments by name. This is a very difficult concept for me to master, but it might help me to understand what the advantages (or maybe even disadvantages) of that style are. Can anyone explain?
The point-free style is considered by some authors as the ultimate functional programming style. To put things simply, a function of type t1 -> t2 describes a transformation of one element of type t1 into another element of type t2. The idea is that "pointful" functions (written using variables) emphasize elements (when you write \x -> ... x ..., you're describing what happens to the element x), while "point-free" functions (expressed without using variables) emphasize the transformation itself, as a composition of simpler transforms. Advocates of the point-free style argue that transformations should indeed be the central concept, and that the pointful notation, while easy to use, distracts us from this noble ideal.
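As a small illustration in Erlang (which has no built-in composition operator, so compose/2 below is a hedged helper rather than a standard library function), here is the same function written pointfully and point-freely:
-module(style).
-export([sum_sqr/1, sum_sqr_pf/0]).

compose(F, G) -> fun(X) -> F(G(X)) end.

sqr(L) -> [X * X || X <- L].

%% Pointful: the list is named and threaded through explicitly.
sum_sqr(L) -> lists:sum(sqr(L)).

%% Point-free: described purely as a composition of transformations.
sum_sqr_pf() -> compose(fun lists:sum/1, fun sqr/1).
Both style:sum_sqr([3, 4, 5]) and (style:sum_sqr_pf())([3, 4, 5]) return 50.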
Point-free functional programming has been around for a very long time. It was already known to the logicians who have studied combinatory logic since the seminal work of Moses Schönfinkel in 1924, and it was the basis for the first study of what would become ML type inference, by Robert Feys and Haskell Curry in the 1950s.
The idea of building functions from an expressive set of basic combinators is very appealing and has been applied in various domains, such as the array-manipulation languages derived from APL, or the parser combinator libraries such as Haskell's Parsec. A notable advocate of point-free programming is John Backus. In his 1978 speech "Can Programming Be Liberated From the von Neumann Style?", he wrote:
The lambda expression (with its substitution rules) is capable of
defining all possible computable functions of all possible types
and of any number of arguments. This freedom and power has its
disadvantages as well as its obvious advantages. It is analogous
to the power of unrestricted control statements in conventional
languages: with unrestricted freedom comes chaos. If one
constantly invents new combining forms to suit the occasion, as
one can in the lambda calculus, one will not become familiar with
the style or useful properties of the few combining forms that
are adequate for all purposes. Just as structured programming
eschews many control statements to obtain programs with simpler
structure, better properties, and uniform methods for
understanding their behavior, so functional programming eschews
the lambda expression, substitution, and multiple function
types. It thereby achieves programs built with familiar
functional forms with known useful properties. These programs are
so structured that their behavior can often be understood and
proven by mechanical use of algebraic techniques similar to those
used in solving high school algebra problems.
So here they are: the main advantage of point-free programming is that it forces a structured combinator style which makes equational reasoning natural. Equational reasoning has been particularly advertised by the proponents of the "Squiggol" movement (see [1] [2]), which indeed uses a fair share of point-free combinators and computation/rewriting/reasoning rules.
[1] "An introduction to the Bird-Merteens Formalism", Jeremy Gibbons, 1994
[2] "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire", Erik Meijer, Maarten Fokkinga and Ross Paterson, 1991
Finally, one cause for the popularity of point-free programming among Haskellites is its relation to category theory. In category theory, morphisms (which could be seen as "transformations between objects") are the basic object of study and computation. While partial results allow reasoning in specific categories to be performed in a pointful style, the common way to build, examine and manipulate arrows is still the point-free style, and other syntaxes such as string diagrams also exhibit this "pointfreeness". There are rather tight links between the people advocating "algebra of programming" methods and users of categories in programming (for example the authors of the banana paper [2] are/were hardcore categorists).
You may be interested in the Pointfree page of the Haskell wiki.
The downside of point-free style is rather obvious: it can be a real pain to read. The reason why we still love to use variables, despite the numerous horrors of shadowing, alpha-equivalence etc., is that the notation is just so natural to read and think about. The general idea is that a complex function (in a referentially transparent language) is like a complex plumbing system: the inputs are the parameters; they get into some pipes, are applied to inner functions, duplicated (\x -> (x,x)) or forgotten (\x -> (), a pipe leading nowhere), etc. And the variable notation is nicely implicit about all that machinery: you give a name to the input and names to the outputs (or auxiliary computations), but you don't have to describe the whole plumbing plan, where the small pipes must run so as not to get in the way of the bigger ones, and so on. The amount of plumbing inside something as short as \(f,x,y) -> ((x,y), f x y) is amazing. You may follow each variable individually, or read each intermediate plumbing node, but you never have to see the whole machinery together. When you use a point-free style, all the plumbing is explicit: you have to write everything down and look at it afterwards, and sometimes it's just plain ugly.
PS: this plumbing vision is closely related to stack programming languages, which are probably the least pointful programming languages (barely) in use. I would recommend trying to do some programming in them, just to get a feeling for it (as I would also recommend logic programming). See Factor, Cat or the venerable Forth.
I believe the purpose is to be succinct and to express pipelined computations as a composition of functions rather than thinking of threading arguments through. Simple example (in F#) - given:
let sum = List.sum
let sqr = List.map (fun x -> x * x)
Used like:
> sum [3;4;5]
12
> sqr [3;4;5]
[9;16;25]
We could express a "sum of squares" function as:
let sumsqr x = sum (sqr x)
And use like:
> sumsqr [3;4;5]
50
Or we could define it by piping x through:
let sumsqr x = x |> sqr |> sum
Written this way, it's obvious that x is being passed in only to be "threaded" through a sequence of functions. Direct composition looks much nicer:
let sumsqr = sqr >> sum
This is more concise and it's a different way of thinking of what we're doing; composing functions rather than imagining the process of arguments flowing through. We're not describing how sumsqr works. We're describing what it is.
PS: An interesting way to get your head around composition is to try programming in a concatenative language such as Forth, Joy, Factor, etc. These can be thought of as being nothing but composition (in Forth: : sumsqr sqr sum ;), in which the space between words is the composition operator.
PPS: Perhaps others could comment on the performance differences. It seems to me that composition may reduce GC pressure by making it more obvious to the compiler that there is no need to produce intermediate values as in pipelining; helping make the so-called "deforestation" problem more tractable.
While I'm attracted to the point-free concept and have used it for some things, and agree with all the positives said before, I found the following negatives (some are detailed above):
The shorter notation reduces redundancy, but in a heavily structured composition (ramda.js style, or point-free in Haskell, or whatever concatenative language) reading the code is more complex than linearly scanning through a bunch of const bindings and using a symbol highlighter to see which binding goes into which downstream calculation. Besides the tree-versus-linear structure, the loss of descriptive symbol names makes the function hard to grasp intuitively. Of course both the tree structure and the loss of named bindings have plenty of positives as well: for example, functions will feel more general, not bound to some application domain via the chosen symbol names, and the tree structure is semantically present even if bindings are laid out and can be comprehended sequentially (Lisp let/let* style).
Point-free is simplest when just piping through or composing a series of functions, as this also results in a linear structure that we humans find easy to follow. However, threading some interim calculation through multiple recipients is tedious. All kinds of wrapping into tuples, lensing and other painstaking mechanisms go into just making some calculation accessible, where otherwise it would just be multiple uses of some value binding. Of course the repeated part can be extracted out as a separate function, and maybe it's a good idea anyway, but there are also arguments for some non-short functions; and even if it's extracted, its arguments will have to be somehow threaded through both applications, and then there may be a need for memoizing the function so the calculation isn't actually repeated. One ends up using a lot of converge, lens, memoize, useWith etc.
JavaScript-specific: harder to casually debug. With a linear flow of let bindings, it's easy to add a breakpoint wherever. With the point-free style, even if a breakpoint is somehow added, the value flow is hard to read; e.g. you can't just query or hover over some variable in the dev console. Also, since point-free is not native to JS, library functions of ramda.js or similar will obscure the stack quite a bit, especially with the obligatory currying.
Code brittleness, especially in nontrivially sized systems and in production. If a new piece of requirement comes in, the above disadvantages come into play (e.g. it's harder to read the code for the next maintainer, who may be yourself a few weeks down the line, and harder to trace the dataflow for inspection). But most importantly, even a seemingly small and innocent new requirement can necessitate a wholly different structuring of the code. It may be argued that this is a good thing, in that the result will be a crystal-clear representation of the new requirement, but rewriting large swaths of point-free code is very time consuming, and then we haven't even mentioned testing. So it feels that the looser, less structured, lexical-assignment-based coding can be repurposed more quickly. Especially if the coding is exploratory, and in the domain of human data with weird conventions (time, etc.) that can rarely be captured 100% accurately, with an ever-possible upcoming request to handle something more accurately or closer to the customer's needs, whichever method allows faster pivoting matters a lot.
On the point-free variant, the concatenative programming languages, I have to add:
I had a little experience with Joy. Joy is a very simple and beautiful language built around lists. When converting a problem into a Joy function, you have to split your brain into one part for the stack-plumbing work and another part for the solution in the Joy syntax. The stack is always handled from the back. Since composition is built into Joy itself, there is no runtime cost for a composition combinator.
