Logic programming in Lua?

Is there a way to do logic programming (think of Prolog) in Lua?
In particular: is there any Lua module for logic programming (a miniKanren implementation would be best, but it isn't strictly required)? I couldn't find any [1]. And if not, are there any known (preferably tried) ways to do logic programming in Lua?
Also: is there anybody who has tried to do something like logic programming in Lua?
[1] So far I've found only a blog post mentioning the possibility of writing one in Metalua, but I would rather see one compatible with standard Lua.

There is a forward-chaining inference engine in Lua called lua-faces. In addition to miniKanren, there are several other logic programming systems in JavaScript that could be automatically translated into Lua using Castl.
I also wrote a translator that converts a subset of Lua into Prolog. Given this input:
function print_each(The_list)
    for _, Item in pairs(The_list) do
        print(Item)
    end
end
it will produce this output in Prolog:
print_each(The_list) :-
    forall(member(Item, The_list), (
        writeln(Item)
    )).

Would ASP (Answer Set Programming) be helpful? https://potassco.org/
Check section 3.1.14 of the manual https://github.com/potassco/guide/releases/download/v2.1.0/guide.pdf

Logic programming is a paradigm, and at bottom a specific syntax in which you state facts and derive results from logical relations over those facts, where the facts themselves can be the results of other relations.
Lua is not specifically designed for this, but you can simulate the behavior by defining the logic operators as functions (note that and and or are reserved words in Lua, so the combinators need other names, e.g. a function all(...) that returns true only if all its arguments are true) and by defining your "facts" as a table with lazy evaluation provided by a metatable.
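A minimal sketch of that idea; the combinator, rule and fact names here are all made up for illustration:

local function all(...)
    -- true only if every argument is truthy
    for i = 1, select("#", ...) do
        if not (select(i, ...)) then return false end
    end
    return true
end

local rules = {}
local facts = setmetatable({}, {
    __index = function(_, k)
        local rule = rules[k]
        return rule and rule() -- evaluated lazily, on lookup
    end,
})
rules.raining  = function() return true end
rules.cloudy   = function() return facts.raining end -- a fact derived from another fact
rules.umbrella = function() return all(facts.raining, facts.cloudy) end

print(facts.umbrella) --> true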

Related

Is a 'standard module' part of the 'programming language'?

On page 57 of the book "Programming Erlang" by Joe Armstrong (2007) 'lists:map/2' is mentioned in the following way:
Virtually all the modules that I write use functions like lists:map/2; this is so common that I almost consider map to be part of the Erlang language. Calling functions such as map and filter and partition in the module lists is extremely common.
The usage of the word 'almost' got me confused about what the difference between Erlang as a whole and the Erlang language might be, and whether there even is a difference at all. Is my confusion based on the semantics of the word 'language'? It seems to me as if a standard module floats around the border of what does and does not belong to the language it's implemented in. What are the differences between a programming language at its core and the standard libraries implemented in it?
I'm aware that this is quite the newbie question, but in my experience jumping to my own conclusions can lead to bad things. I was hoping someone could clarify this somewhat.
Consider this simple program:
1> List = [1, 2, 3, 4, 5].
[1,2,3,4,5]
2> Fun = fun(X) -> X*2 end.
#Fun<erl_eval.6.50752066>
3> lists:map(Fun, List).
[2,4,6,8,10]
4> [Fun(X) || X <- List].
[2,4,6,8,10]
Both produce the same output; however, the first one, lists:map/2, is a library function, and the second one is a language construct at its core, called a list comprehension. The first one is implemented in Erlang (incidentally, also using a list comprehension); the second one is parsed by Erlang. The library function can be optimized only as much as the compiler is able to optimize its implementation in Erlang. However, the list comprehension may be optimized as far as being written in assembler in the Beam VM and called from the resulting beam file for maximum performance.
Some language constructs look like they are part of the language, whereas in fact they are implemented in the library, for example spawn/3. When it's used in the code it looks like a keyword, but in Erlang it's not one of the reserved words. Because of that, the Erlang compiler will automatically add the erlang module in front of it and call erlang:spawn/3, which is a library function. Such functions are called BIFs (Built-In Functions).
In general, what belongs to the language itself is what that language's compiler can parse and translate to executable code (or in other words, what's defined by the language's grammar). Everything else is a library. Libraries are usually written in the language for which they are designed, but that doesn't necessarily have to be the case; e.g. some Erlang library functions are written in C as Erlang NIFs.

F# tail recursion and why not write a while loop?

I'm learning F#. I'm new to functional programming in general, though I've used functional aspects of C# for years (but let's face it, that's pretty different). One of the things I've read is that the F# compiler identifies tail recursion and compiles it into a while loop (see http://thevalerios.net/matt/2009/01/recursion-in-f-and-the-tail-recursion-police/).
What I don't understand is why you would write a recursive function instead of a while loop if that's what it's going to turn into anyway. Especially considering that you need to do some extra work to make your function recursive.
I have a feeling someone might say that the while loop is not particularly functional and that you want to act all functional and whatnot, so you use recursion. But then why is it fine for the compiler to turn it into a while loop?
Can someone explain this to me?
You could use the same argument for any transformation that the compiler performs. For instance, when you're using C#, do you ever use lambda expressions or anonymous delegates? If the compiler is just going to turn those into classes and (non-anonymous) delegates, then why not just use those constructions yourself? Likewise, do you ever use iterator blocks? If the compiler is just going to turn those into state machines which explicitly implement IEnumerable<T>, then why not just write that code yourself? Or if the C# compiler is just going to emit IL anyway, why bother writing C# instead of IL in the first place? And so on.
One obvious answer to all of these questions is that we want to write code which allows us to express ourselves clearly. Likewise, there are many algorithms which are naturally recursive, and so writing recursive functions will often lead to a clear expression of those algorithms. In particular, it is arguably easier to reason about the termination of a recursive algorithm than a while loop in many cases (e.g. is there a clear base case, and does each recursive call make the problem "smaller"?).
However, since we're writing code and not mathematics papers, it's also nice to have software which meets certain real-world performance criteria (such as the ability to handle large inputs without overflowing the stack). Therefore, the fact that tail recursion is converted into the equivalent of while loops is critical for being able to use recursive formulations of algorithms.
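As a concrete illustration (sketched here in Lua, the language of the page's original question, which also guarantees proper tail calls): the recursive sum below runs in constant stack space, just like its while-loop twin.

local function sum(t, i, acc)
    i, acc = i or 1, acc or 0
    if i > #t then return acc end
    return sum(t, i + 1, acc + t[i]) -- tail call: the stack frame is reused
end

local function sum_loop(t)
    local acc, i = 0, 1
    while i <= #t do
        acc = acc + t[i]
        i = i + 1
    end
    return acc
end

print(sum({1, 2, 3, 4, 5}))      --> 15
print(sum_loop({1, 2, 3, 4, 5})) --> 15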
A recursive function is often the most natural way to work with certain data structures (such as trees and F# lists). If the compiler wants to transform my natural, intuitive code into an awkward while loop for performance reasons that's fine, but why would I want to write that myself?
Also, Brian's answer to a related question is relevant here. Higher-order functions can often replace both loops and recursive functions in your code.
The fact that F# performs tail-call optimization is just an implementation detail that allows you to use tail recursion with the same efficiency (and no fear of a stack overflow) as a while loop. But it is just that, an implementation detail: on the surface your algorithm is still recursive and is structured that way, which for many algorithms is the most logical, functional way to represent it.
The same applies to some of the list handling internals as well in F# - internally mutation is used for a more efficient implementation of list manipulation, but this fact is hidden from the programmer.
What it comes down to is how the language allows you to describe and implement your algorithm, not what mechanics are used under the hood to make it happen.
A while loop is imperative by its nature. Most of the time, when using while loops, you will find yourself writing code like this:
let mutable x = ...
...
while someCond do
    ...
    x <- ...
This pattern is common in imperative languages like C, C++ or C#, but not so common in functional languages.
As the other posters have said, some data structures (more precisely, recursive data structures) lend themselves to recursive processing. Since the most common data structure in functional languages is by far the singly linked list, solving problems by using lists and recursive functions is common practice.
Another argument in favor of recursive solutions is the tight relation between recursion and induction. Using a recursive solution allows the programmer to think about the problem inductively, which arguably helps in solving it.
Again, as other posters said, the fact that the compiler optimizes tail-recursive functions (obviously, not all functions can benefit from tail-call optimization) is an implementation detail which lets your recursive algorithm run in constant space.

Is Erlang a Constraint-Logic programming language?

Since Erlang is based upon Prolog, does this mean that Erlang is a Constraint-Logic Language?
Does Erlang have Prolog's building blocks: Facts, Rules and Query
No.
Erlang's syntax is very similar to Prolog's, but the semantics are very different. An early version of Erlang was written using Prolog, but today's Erlang can no longer meaningfully be said to be "based on Prolog."
Erlang does not include backtracking or other features of Prolog regularly used for logic programming. You can of course implement Prolog atop other languages, and Erlang is an easier choice for this than some others. This can be seen in Robert Virding's "Erlog" project:
https://github.com/rvirding/erlog
Yes.
The first version of Erlang was not written in Prolog; it was written in one of the committed-choice logic programming languages. These languages dropped Prolog's backtracking, hence the name "committed choice": once a choice was made, it was not possible to backtrack and try another. This was done to simplify making a form of logic programming concurrent. Another way of looking at it is that concurrent processes would apply constraints to variables; but being logic variables, and hence not re-assignable, these would be successive constraints rather than changes of value. A constraint might assign a partial value to a variable, containing another variable which would be assigned later. This is the underlying model of Erlang. "Constraint logic programming" has tended to be used for versions where the constraints could also include mathematical statements about the possible ranges of variables with intended numerical values.
The syntax of Erlang shows its logic programming heritage, but it is important to understand that it picked this up through the committed-choice logic programming languages, which picked it up from Prolog, not directly from Prolog. Although several committed-choice logic programming languages were devised during the 1980s, they were unable to pull out of the shadow of Prolog. They were pulled down by their association with the failed Japanese Fifth Generation initiative, and also by competing teams of developers who bickered over minor differences, so no standard was established.
Erlang's developers introduced syntactic sugar that gave the code a more functional appearance, and made the marketing decision to promote it as a functional rather than a logic programming language, which kept it from being dragged down by the post-Fifth-Generation dismissal of logic programming.
In short, no, it's not :) It doesn't have those building blocks. Its focus is on concurrency, parallel programming, distributed applications and fault tolerance (while being a functional, strict, declarative language).
You may well use Erlang's list comprehension feature as a way of programming in a constraint-programming style.
% Produce the tuple {1, 0}
%
constraint_test() ->
    [ {A, B} ||
      A <- lists:seq(0, 1),
      B <- lists:seq(0, 1),
      A > B ].
You alternate generators of elements taken from lists (A <- lists:seq(0, 1)) with constraints (A > B).
I recently solved the problem linked below, and if you place the constraints correctly you will have the answer in a fraction of a second.
http://www.geocaching.com/seek/cache_details.aspx?guid=a8605431-53b5-4c2c-97fb-d42ee299b167

Parsing Source Code - Unique Identifiers for Different Languages? [closed]

I'm building an application that receives source code as input and analyzes several aspects of the code. It can accept code from many common languages, e.g. C/C++, C#, Java, Python, PHP, Pascal, SQL, and more (however many languages are unsupported, e.g. Ada, Cobol, Fortran). Once the language is known, my application knows what to do (I have different handlers for different languages).
Currently I'm asking the user to input the programming language the code is written in, and this is error-prone: although users know the programming languages, a small percentage of them (on rare occasions) click the wrong option out of carelessness, and that breaks the system (i.e. my analysis fails).
It seems to me like there should be a way to figure out (in most cases) what the language is, from the input text itself. Several notes:
I'm receiving pure text and not file names, so I can't use the extension as a hint.
The user is not required to input complete source files and may also input snippets (i.e. the include/import part may be missing).
It's clear to me that any algorithm I choose will not be 100% foolproof, certainly for very short inputs (e.g. code that could be accepted by both Python and Ruby), in which case I will still need the user's assistance; however, I would like to minimize user involvement in the process to minimize mistakes.
Examples:
If the text contains "x->y()", I may know for sure it's C++ (?)
If the text contains "public static void main", I may know for sure it's Java (?)
If the text contains "for x := y to z do begin", I may know for sure it's Pascal (?)
My question:
Are you familiar with any standard library/method for figuring out automatically what the language of an input source code is?
What are the unique code "tokens" with which I could certainly differentiate one language from another?
I'm writing my code in Python but I believe the question to be language agnostic.
Thanks
Vim has a filetype-autodetect feature. If you download the Vim source code you will find the file runtime/filetype.vim.
For each language it checks the file extension, and for some of them (the most common) it has a function that can get the filetype from the source code itself. You can check that out. The code is pretty easy to understand and there are some very useful comments there.
Build a generic tokenizer and then run a Bayesian filter on the tokens. Use the existing "user checks a box" system to train it; a sketch of the idea follows.
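A toy sketch of that idea in Lua (all names here are hypothetical, and the add-one smoothing is deliberately crude):

-- Tokenize into identifier-ish and punctuation-ish runs.
local function tokens(code)
    local out = {}
    for tok in code:gmatch("[%w_]+") do out[#out + 1] = tok end
    for tok in code:gmatch("%p+") do out[#out + 1] = tok end
    return out
end

local model = { counts = {}, totals = {} }

-- Train from a snippet whose language the user confirmed via the check box.
local function train(lang, code)
    model.counts[lang] = model.counts[lang] or {}
    model.totals[lang] = model.totals[lang] or 0
    for _, tok in ipairs(tokens(code)) do
        local c = model.counts[lang]
        c[tok] = (c[tok] or 0) + 1
        model.totals[lang] = model.totals[lang] + 1
    end
end

-- Pick the language with the highest smoothed log-likelihood.
local function classify(code)
    local best_lang, best_score
    for lang, c in pairs(model.counts) do
        local score = 0
        for _, tok in ipairs(tokens(code)) do
            score = score + math.log(((c[tok] or 0) + 1) / (model.totals[lang] + 1))
        end
        if best_score == nil or score > best_score then
            best_lang, best_score = lang, score
        end
    end
    return best_lang
end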
Here is a simple way to do it. Just run the parser on every language. Whatever language gets the farthest without encountering any errors (or has the fewest errors) wins.
This technique has the following advantages:
You already have most of the code necessary to do this.
The analysis can be done in parallel on multi-core machines.
Most languages can be eliminated very quickly.
This technique is very robust. Languages that might appear very similar under a fuzzy analysis (Bayesian, for example) would likely produce many errors when an actual parser is run.
If a program is parsed correctly in two different languages, then there was never any hope of distinguishing them in the first place.
I think the problem is impossible. The best you can do is to come up with some probability that a program is in a particular language, and even then I would guess producing a solid probability is very hard. Problems that come to mind at once:
use of features like the C pre-processor can effectively mask the underlying language altogether
looking for keywords is not sufficient as the keywords can be used in other languages as identifiers
looking for actual language constructs requires you to parse the code, but to do that you need to know the language
what do you do about malformed code?
Those seem enough problems to solve to be going on with.
One program I know which even can distinguish several different languages within the same file is ohcount. You might get some ideas there, although I don't really know how they do it.
In general you can look for distinctive patterns:
Operators might be an indicator, such as := for Pascal/Modula/Oberon, => or the whole of LINQ in C#
Keywords would be another one as probably no two languages have the same set of keywords
Casing rules for identifiers, assuming the piece of code was written conforming to best practices. Probably a very weak rule.
Standard library functions or types. Especially for languages that usually rely heavily on them, such as PHP, you might just use a long list of standard library functions.
You may create a set of rules, each of which indicates a possible set of languages if it matches. Intersecting the resulting lists will hopefully get you only one language.
The problem with this approach however, is that you need to do tokenizing and compare tokens (otherwise you can't really know what operators are or whether something you found was inside a comment or string). Tokenizing rules are different for each language as well, though; just splitting everything at whitespace and punctuation will probably not yield a very useful sequence of tokens. You can try several different tokenizing rules (each of which would indicate a certain set of languages as well) and have your rules match to a specified tokenization. For example, trying to find a single-quoted string (for trying out Pascal) in a VB snippet with one comment will probably fail, but another tokenizer might have more luck.
But since you want to perform analysis anyway you probably have parsers for the languages you support, so you can just try running the snippet through each parser and take that as indicator which language it would be (as suggested by OregonGhost as well).
Some thoughts:
$x->y() would be valid in PHP, so ensure that there's no $ symbol if you think C++ (though I think you can store function pointers in a C struct, so this could also be C).
public static void main is Java if it is cased properly - write Main and it's C#. This gets complicated if you take case-insensitive languages like many scripting languages or Pascal into account. The [] attribute syntax in C# on the other hand seems to be rather unique.
You can also try to use the keywords of a language - for example, Option Strict or End Sub are typical for VB and the like, while yield is likely C# and initialization/implementation are Object Pascal / Delphi.
If your application is analyzing the source code anyway, you could try to throw your analysis code at it for every language, and if it fails really badly, it was the wrong language :)
My approach would be:
Create a list of strings or regexes (with and without case sensitivity), where each element has an assigned list of languages that it is an indicator for:
class => C++, C#, Java
interface => C#, Java
implements => Java
[attribute] => C#
procedure => Pascal, Modula
create table / insert / ... => SQL
etc. Then parse the file line-by-line, match each element of the list, and count the hits.
The language with the most hits wins ;)
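A minimal sketch of that scoring loop in Lua (the rule table below is illustrative, not exhaustive):

local rules = {
    { pattern = "%f[%w]class%f[%W]",      langs = {"C++", "C#", "Java"} },
    { pattern = "%f[%w]interface%f[%W]",  langs = {"C#", "Java"} },
    { pattern = "%f[%w]implements%f[%W]", langs = {"Java"} },
    { pattern = ":=",                     langs = {"Pascal", "Modula"} },
    { pattern = "%f[%w]insert%f[%W]",     langs = {"SQL"} },
}

local function guess(code)
    local hits = {}
    for line in (code .. "\n"):gmatch("(.-)\n") do
        for _, rule in ipairs(rules) do
            if line:find(rule.pattern) then
                for _, lang in ipairs(rule.langs) do
                    hits[lang] = (hits[lang] or 0) + 1
                end
            end
        end
    end
    local best, best_n
    for lang, n in pairs(hits) do
        if best_n == nil or n > best_n then best, best_n = lang, n end
    end
    return best -- the language with the most hits wins
end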
How about word frequency analysis (with a twist)? Parse the source code and categorise it much like a spam filter does. This way when a code snippet is entered into your app which cannot be 100% identified you can have it show the closest matches which the user can pick from - this can then be fed into your database.
Here's an idea for you. For each of your N languages, find some files in the language, something like 10-20 per language would be enough, each one not too short. Concatenate all files in one language together. Call this lang1.txt. GZip it to lang1.txt.gz. You will have a set of N langX.txt and langX.txt.gz files.
Now take the file in question and append it to each of the langX.txt files, producing langXapp.txt and the corresponding gzipped langXapp.txt.gz. For each X, find the difference between the sizes of langXapp.txt.gz and langX.txt.gz. The smallest difference will correspond to the language of your file.
Disclaimer: this will work reasonably well only for longer files, and it's not very efficient. But on the plus side you don't need to know anything about the languages; it's completely automatic. It can also detect natural languages and tell French from Chinese, in case you ever need that :) But mainly, I just think it's an interesting thing to try :)
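A rough sketch of that idea in Lua, assuming a POSIX gzip command on the PATH (the corpus table and helper names are hypothetical):

-- Size of `text` after gzip compression, measured via the gzip CLI.
local function gzipped_size(text)
    local tmp = os.tmpname()
    local f = assert(io.open(tmp, "w"))
    f:write(text)
    f:close()
    local p = assert(io.popen(string.format("gzip -c %q | wc -c", tmp)))
    local n = tonumber(p:read("*a"))
    p:close()
    os.remove(tmp)
    return n
end

-- corpus[lang] holds the concatenated sample files for that language.
local function guess_language(corpus, snippet)
    local best_lang, best_delta
    for lang, text in pairs(corpus) do
        local delta = gzipped_size(text .. snippet) - gzipped_size(text)
        if best_delta == nil or delta < best_delta then
            best_lang, best_delta = lang, delta
        end
    end
    return best_lang
end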
The most bulletproof, but also most work-intensive, way is to write a parser for each language and just run them in sequence to see which one accepts the code. This won't work well if the code has syntax errors, though, and you most probably will have to deal with code like that; people do make mistakes. A fast way to implement this is to get common compilers for every language you support, run them, and check how many errors they produce.
Heuristics work up to a certain point, and the more languages you support the less help you will get from them. But for the first few versions it's a good start, mostly because it's fast to implement and works well enough in most cases. You could check for specific keywords, names of often-used API functions/classes, certain language constructs, etc. The best way is to check how many of these specific features a file has for each possible language; this helps with some syntax errors, user-defined functions with names like this() in languages that don't have such a keyword, and stuff written in comments and string literals.
Anyhow, you will most likely fail sometimes, so some mechanism for the user to override the language choice is still necessary.
I think you should never rely on one single feature, since its absence in a fragment (e.g. somebody systematically using WHILE instead of for) might confuse you.
Also try to stay away from global identifiers like IMPORT, MODULE, UNIT or INITIALIZATION/FINALIZATION, since they might not always exist, may be optional in complete sources, and may be totally absent in fragments.
Dialects and similar languages (e.g. Modula2 and Pascal) are dangerous too.
I would create simple lexers for a bunch of languages that keep track of key tokens, and then simply calculate the ratio of key tokens to "other" identifiers. Give each token a weight, since some might be a key indicator for disambiguating between dialects or versions.
Note that this is also a convenient way to allow users to plugin "known" keywords to increase the detection ratio, by e.g. providing identifiers of runtime library routines or types.
Very interesting question, I don't know if it is possible to be able to distinguish languages by code snippets, but here are some ideas:
One simple way is to watch out for single quotes: in some languages they wrap a single character, whereas in others they can wrap a whole string.
A unary asterisk or a unary ampersand operator is a fairly certain indication that it's one of C/C++/C#.
Pascal is the only language (of the ones given) to use two characters for assignments :=. Pascal has many unique keywords, too (begin, sub, end, ...)
The class initialization with a function could be a nice hint for Java.
Functions that do not belong to a class eliminate Java (there is no free-standing max(), for example)
Naming of basic types (bool vs boolean)
Which reminds me: C++ can look very different across projects (#define boolean int), so you can never guarantee that you've found the correct language.
If you run the source code through a hashing algorithm and it looks the same, you're most likely analyzing Perl
Indentation is a good hint for Python
You could use functions provided by the languages themselves - like token_get_all() for PHP - or third-party tools - like pychecker for python - to check the syntax
Summing it up: This project would make an interesting research paper (IMHO) and if you want it to work well, be prepared to put a lot of effort into it.
There is no way of making this foolproof, but I would personally start with operators, since they are in most cases "set in stone" (I can't say this holds true for every language, since I know only a limited set). This would narrow it down quite considerably, but not nearly enough. For instance, "->" is used in many languages (at least C, C++ and Perl).
I would go for something like this:
Create a list of features for each language, these could be operators, commenting style (since most use some sort of easily detectable character or character combination).
For instance:
Some languages have lines that start with the character "#", including C, C++ and Perl. Do languages other than the first two use #include and #define in their vocabulary? If you detect this character at the beginning of a line, the language is probably one of those; if the character is in the middle of a line, the language is most likely Perl.
Also, if you find the pattern := this would narrow it down to some likely languages.
Etc.
I would have a two-dimensional table with languages and patterns found and after analysis I would simply count which language had most "hits". If I wanted it to be really clever I would give each feature a weight which would signify how likely or unlikely it is that this feature is included in a snippet of this language. For instance if you can find a snippet that starts with /* and ends with */ it is more than likely that this is either C or C++.
The problem with keywords is that someone might use one as a normal variable, or even inside comments. They can be used as a tiebreaker (e.g. the word "class" is much more likely in C++ than in C if everything else is equal), but you can't rely on them.
After the analysis I would offer the most likely language as the choice for the user, with the rest listed in order and also selectable. So the user would accept your guess by simply clicking a button, or could switch it easily.
In answer to 2: if there's a "#!" and the name of an interpreter at the very beginning, then you definitely know which language it is. (Can't believe this wasn't mentioned by anyone else.)

Do concepts like Map and Reduce apply to all Functional Programming Languages?

I have just started delving into the world of functional programming.
A lot of OOP (Object Oriented Programming) concepts such as inheritance and polymorphism apply to most modern OO languages like C#, Java and VB.NET.
But how about concepts such as Map, Reduce, Tuples and Sets, do they apply to all FP (Functional Programming) languages?
I have just started with F#. But do aforementioned concepts apply to other FP like Haskell, Nemerle, Lisp, etc.?
You bet. The desirable thing about functional programming is that the mathematical concepts you describe are more naturally expressed in an FP language.
It's a bit of tough going, but John Backus' Turing Award paper in which he described functional (or "applicative") programming is a good read. The Wikipedia article is good too.
Yes; higher-order functions, algebraic data types, folds/catamorphisms, etc are common to almost all functional languages (though they sometimes go by slightly different names in each language).
Functional tools apply to all programming, not just languages that support them explicitly. For example, Python has map and reduce functions that do exactly what you expect (note that in Python 3, reduce lives in the functools module), minus out-of-order evaluation; you'll need something like the multiprocessing module to get really clever.
Even if the language doesn't provide the exact primitives, most modern languages still make it possible to get the desired effect with a bit more work. This is similar to the way a class-like concept can be coded in pure C.
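For instance, a minimal sketch of map and reduce in Lua (the language of this page's original question), which ships with neither:

local function map(fn, t)
    local out = {}
    for i, v in ipairs(t) do out[i] = fn(v) end
    return out
end

local function reduce(fn, acc, t)
    for _, v in ipairs(t) do acc = fn(acc, v) end
    return acc
end

local doubled = map(function(x) return x * 2 end, {1, 2, 3, 4, 5})
print(table.concat(doubled, ","))                          --> 2,4,6,8,10
print(reduce(function(a, b) return a + b end, 0, doubled)) --> 30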
I would interpret what you're asking as, "Are higher-order functions (map, reduce, filter, ...) and immutable data structures (tuples, cons lists, records, maps, sets, ...) common across FP languages?" and I would say, absolutely yes.
Like you say, OOP has well known pillars (encapsulation, inheritance, polymorphism). The "pillars" of functional programming I'd say are 1) Using functions as first-class values and 2) Expressing yourself without side effects.
You'll likely find common tools to apply these ideas across various FP languages (F# is an excellent choice BTW!) and you'll see them finding their way into more mainstream languages; maybe in a less recognizable form (e.g. LINQ's Select = map, Aggregate = reduce/fold, Where = filter, C# has light weight lambda syntax, System.Tuple, etc.).
As an aside, the thing that seems to be generally missing from non-explicitly-FP languages is good immutable data structures and syntax support for them (not merely a library) which makes it hard to stick to pillar #2 in those languages. F# lists, records, tuples, etc. all are good examples of great language and library combined support for this.
If you really want to jump into the deep end and understand why these concepts are not just conventional but, ahem, foundational, check out the paper "Functional programming with bananas, lenses, envelopes and barbed wire".
They apply to all languages that contain data types that can be "mapped" and "reduced", i.e. maps, arrays/vectors, or lists.
In a "pure lambda calculus" language, where every data structure is defined via function application, you can of course apply functions in parallel (i.e., in a call fn(expr1, expr2), you can evaluate expr1 and expr2 in parallel), but that isn't really what map/reduce is about.
