Categories library for Agda?

Are there any "recommended" libraries that provide an easy-to-use formalisation of basic category theory in Agda? The Agda standard library seems to provide very little in this regard.
I'm looking for something with a low barrier to entry, similar to how one uses the algebraic structures such as Semigroup defined in the standard library.
For example, there are several notions of morphism in my current project, and overloading syntax for composition and identity gets awkward. The natural thing to do would be to introduce a suitable record type and use Agda's "instance arguments" mechanism to emulate a Morphism type class. But no doubt this must be a wheel that has been invented many times. (Ok, there is a structure called Morphism in the standard library which perhaps could be adapted to this purpose, so this isn't necessarily the best example, but you get the idea.)
I'm aware of this library, which looks comprehensive, but doesn't seem to be particularly active.

This is an old question, but it still gets hits on google and other searches, so: the de facto library is now agda-categories.

I'm using the Categories library as mentioned above, and although I'm only using its basic features, it seems fine so far.

Related

Standards for when to use custom operators

I'm currently reading Real-World Functional Programming, and it briefly mentions infix operators as being one of the main benefits of custom operators. Are there any standards for when to use or not use custom operators in F#? I'm looking for answers equivalent to this.
For reference, here is the quote to which @JohnPalmer is referring, from here:
3.8 Operator Definitions
Avoid defining custom symbolic operators in F#-facing library designs.
Custom operators are essential in some situations and are highly useful notational devices within a large body of implementation code. For new users of a library, named functions are often easier to use. In addition custom symbolic operators can be hard to document, and users find it more difficult to lookup help on operators, due to existing limitations in IDE and search engines.
As a result, it is generally best to publish your functionality as named functions and members.
Custom infix operators are a nice feature in some situations, but when you use them, you have to be very careful to keep your code readable - so the recommendation from the F# design guidelines applies most of the time. If I were writing Real-World Functional Programming again, I would be a bit less enthusiastic about them, because they really should be used carefully :-).
That said, there are some F# libraries that make good use of custom operators and sometimes they work quite nicely. I think FParsec (the parser combinator library) is one case - though maybe they have too many of them. Another example is an XML DSL which uses #=.
In general, when you're writing an ordinary F# library, you probably do not want to expose custom operators. However, when you're writing a domain-specific language, they might be useful.
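To make the trade-off concrete, here is a small, hypothetical sketch (the path-combining operator and names are my own illustration, not from the book or the guidelines): the operator is pleasant inside implementation code, while the named function is what you would expose from a library.

```fsharp
// Hypothetical example: a named function plus a custom operator alias for it.
let combinePaths (a: string) (b: string) = System.IO.Path.Combine(a, b)

// Custom infix operator as shorthand for the same function.
let (</>) a b = combinePaths a b

// Inside implementation code the operator is pleasantly terse...
let opStyle = "usr" </> "local" </> "bin"

// ...but the named function is easier to discover, document and search for.
let namedStyle = combinePaths (combinePaths "usr" "local") "bin"
```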

F# parsing Abstract Syntax Trees

What is the best way to use F# to parse an AST to build an interpreter? There are plenty of F# examples for trivial syntax (basic arithmetic operations) but I can't seem to find anything for languages with a much larger range of features.
Discriminated unions look to be incredibly useful, but how would you go about constructing one with a large number of options? Is it better to define the types (say addition, subtraction, conditionals, control flow) elsewhere and just bring them together as predefined types in the union?
Or have I missed some far more effective means of writing interpreters? Is having an eval function for each type more effective, or perhaps using monads?
Thanks in advance
Discriminated unions look to be incredibly useful, but how would you go about constructing one with a large number of options? Is it better to define the types (say addition, subtraction, conditionals, control flow) elsewhere and just bring them together as predefined types in the union?
I am not sure what you are asking here; even with a large number of options, DUs are still simple to define. See e.g. this blog entry for a tiny language's DU structure (as well as a more general discussion about writing tree transforms). It's fine to have a DU with many more cases, and it's common in compilers/interpreters to use such a representation.
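As a rough illustration of what such a representation might look like (this is my own minimal sketch, not the code from the linked blog entry), a handful of cases plus one recursive eval function already covers arithmetic, conditionals and local bindings:

```fsharp
// Illustrative AST for a tiny expression language.
type Expr =
    | Num of int
    | Add of Expr * Expr
    | Sub of Expr * Expr
    | If  of Expr * Expr * Expr          // condition, then-branch, else-branch
    | Let of string * Expr * Expr        // name, bound value, body
    | Var of string

// One recursive interpreter over the whole union.
let rec eval (env: Map<string, int>) expr =
    match expr with
    | Num n            -> n
    | Add (a, b)       -> eval env a + eval env b
    | Sub (a, b)       -> eval env a - eval env b
    | If (c, t, e)     -> if eval env c <> 0 then eval env t else eval env e
    | Let (name, v, b) -> eval (env.Add(name, eval env v)) b
    | Var name         -> env.[name]
```

Adding another language feature is just another case in the union plus another branch in the match, which is why large unions remain manageable.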
As for parsing, I prefer monadic parser combinators; check out FParsec or see this old blog entry. After using such parser combinators, I can never go back to anything like lex/yacc/ANTLR - external DSLs seem so primitive in comparison.
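If you want a feel for the combinator style, here is a deliberately tiny FParsec sketch (just a whitespace-tolerant float parser; a real grammar would be built up from pieces like this):

```fsharp
open FParsec

// A float surrounded by optional whitespace.
let number : Parser<float, unit> = spaces >>. pfloat .>> spaces

match run number " 42.0 " with
| Success (value, _, _)   -> printfn "Parsed %f" value
| Failure (message, _, _) -> printfn "Error: %s" message
```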
(EDIT: The 'tiny arithmetic examples' you have found are probably pretty much representative of what larger solutions look like as well. The 'toy' examples usually show off the right architecture.)
You should take a copy of Robert Pickering's "Beginning F#".
Chapter 13, "Parsing Text", contains an example with FsLex and FsYacc, as suggested by Noldorin.
Other than that, in the same book, Chapter 12, the author explains how to build an actual simple compiler for an arithmetic language he proposes. Very enlightening. The most important part is what you are looking for: the AST parser.
Good luck.
I second Brian's suggestion to take a look at FParsec. If you're interested in doing things the old-school way with FsLex and FsYacc, one place to look for how to parse a non-trivial language is the F# source itself. See the source\fsharp\FSharp.Compiler directory in the distribution.
You may be interested in checking out the Lexing and Parsing section of the F# WikiBook. The F# PowerPack library contains the FsLex and FsYacc tools, which assist greatly with this. The WikiBook guide is a good way to get started.
Beyond that, you will need to think about how you actually want to execute the code from the AST form, which is common to the design of both compilers and interpreters. This is generally considered the easier part, however, and there are lots of general resources on compilers/interpreters out there that should provide information on this.
I haven't written an interpreter myself. Hope the following helps :)
Here's a compiler course taught at Yale using ML, which you might find useful. The lecture notes are very concise (short) and informative. You can follow the first few lecture notes and the assignments. As you know F#, you won't have any problem reading ML programs.
Btw, the professor was a student of A. Appel, a creator of the SML/NJ implementation. So from these notes you also get the most natural way to write a compiler/interpreter in an ML-family language.
This is an excellent example of a complete Small Basic implementation with F# and FParsec. It even includes an IL compiler. The whole code is very accessible and is accompanied by a series of blog posts from the author at http://trelford.com/blog/

How does "Language Oriented Programming" compare to OOP/Functional in the real world

I recently began to read some F#-related literature, e.g. "Real-World Functional Programming" and "Expert F#". At the beginning it's easy, because I have some background in Haskell and know C#. But when it comes to "Language Oriented Programming" I just don't get it. I read some explanations, and it's like reading an academic paper that gets more abstract and strange with every sentence.
Does anybody have an easy example for that kind of stuff and how it compares to existing paradigms? It's not just academic fantasy, is it? ;)
Thanks,
wishi
Language oriented programming (LOP) can be used to describe any of the following.
Creating an external language (DSL)
This is perhaps the most common use of LOP, and is where you have a specific domain - such as UPS shipping packages via transit types through routes, etc. Rather than try to encode all of these domain-specific entities inside program code, you instead create a separate programming language for just that domain. So you can encode your problem in a separate, external language.
Creating an internal language
Sometimes you want your program code to look less like 'code' and map more closely to the problem domain - that is, have the code 'read more naturally'. A fluent interface is an example of this. Also, F# has Active Patterns, which support this quite well.
I wrote a blog post on LOP a while back that provides some code examples.
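For instance, here is a small, hypothetical Active Patterns sketch (my own illustration, not taken from the blog post mentioned above) where the matching code reads in terms of the domain rather than string plumbing:

```fsharp
// Classify a piece of contact info with a multi-case active pattern.
let (|Email|Phone|Other|) (input: string) =
    if input.Contains "@" then Email input
    elif input |> Seq.forall System.Char.IsDigit then Phone input
    else Other

// Call sites now read in domain terms.
let describe contact =
    match contact with
    | Email e -> sprintf "email address %s" e
    | Phone p -> sprintf "phone number %s" p
    | Other   -> "unrecognised contact info"
```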
F# has a few mechanisms for doing programming in a style one might call "language-oriented".
First, the syntax niceties (function calls don't need parentheses, can define own infix operators, ...) make it so that many user-defined libraries have the appearance of embedded DSLs.
Second, the F# "quotations" mechanism can enable you to quote code and then run it with an alternative semantics/evaluation engine.
Third, F# "computation expressions" (aka workflows, monads, ...) also provide a way to provide a type of alternative semantics for certain blocks of code.
All of these kinda fall into the EDSL category.
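As a tiny illustration of the "alternative semantics for a block of code" idea, here is a minimal, hypothetical computation expression (a home-grown option builder, not from any particular library) that threads failure through a block automatically:

```fsharp
// A minimal option ("maybe") computation expression builder.
type MaybeBuilder() =
    member _.Bind(m, f) = Option.bind f m
    member _.Return(x) = Some x

let maybe = MaybeBuilder()

// Chain two lookups that may each fail, without nested matching.
let addLookups (table: Map<string, int>) =
    maybe {
        let! a = table.TryFind "a"
        let! b = table.TryFind "b"
        return a + b
    }
```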
In Object Oriented Programming, you try to model a problem using Objects. You can then connect those Objects together to perform functions...and in the end solve the original problem.
In Language Oriented Programming, rather than use an existing Object Oriented or Functional Programming Language, you design a new Domain Specific Language that is best suited to efficiently solve your problem.
The term Language Oriented Programming may be overloaded in that it might have different meanings to different people.
But in terms of how I've used it, it means that you create a DSL (http://en.wikipedia.org/wiki/Domain_Specific_Language) before you start to solve your problem.
Once your DSL is created, you would then write your program in terms of the DSL.
The idea is that your DSL is more suited to expressing the problem than a general-purpose language would be.
Some examples would be the makefile syntax or Ruby on Rails' ActiveRecord class.
I haven't directly used language oriented programming in real-world situations (creating an actual language), but it is useful to think about and helps design better domain-driven objects.
In a sense, any real-world development in Lisp or Scheme can be considered "language-oriented," since you are developing the "language" of your application and its abstract tree as you code along. Cucumber is another real-world example I've heard about.
Please note that there are some problems to this approach (and any domain-driven approach) in real-world development. One major problem that I've dealt with before is mismatch between the logic that makes sense in the domain and the logic that makes sense in software. Domain (business) logic can be extremely convoluted and senseless - and causes domain models to break down.
An easy example of a domain-specific language, mentioned here, is SQL. Also: UNIX shell scripts.
Of course, if you are doing a lot of basic ops and have a lot of overlap with the underlying language, it is probably overengineering.

Does F# provide you automatic parallelism?

By this I mean: when you design your app to be side-effect free, etc., will F# code be automatically distributed across all cores?
No, I'm afraid not. Given that F# isn't a pure functional language (in the strictest sense), it would be rather difficult to do so I believe. The primary way to make good use of parallelism in F# is to use Async Workflows (mainly via the Async module I believe). The TPL (Task Parallel Library), which is being introduced with .NET 4.0, is going to fulfil a similar role in F# (though notably it can be used in all .NET languages equally well), though I can't say I'm sure exactly how it's going to integrate with the existing async framework. Perhaps Microsoft will simply advise the use of the TPL for everything, or maybe they will leave both as an option and one will eventually become the de facto standard...
Anyway, here are a few articles on asynchronous programming/workflows in F# to get you started.
http://blogs.msdn.com/dsyme/archive/2007/10/11/introducing-f-asynchronous-workflows.aspx
http://strangelights.com/blog/archive/2007/09/29/1597.aspx
http://www.infoq.com/articles/pickering-fsharp-async
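To make the "explicit, but easy" point concrete, here is a minimal sketch (my own illustration) of fanning work out across cores with Async.Parallel - note that you have to ask for the parallelism yourself:

```fsharp
// Wrap each piece of work in an async; nothing runs in parallel by default.
let square x = async { return x * x }

// Explicitly combine the computations and run them in parallel.
let results =
    [ 1 .. 10 ]
    |> List.map square
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "%A" results
```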
F# does not make it automatic, it just makes it easy.
Yet another chance to link to Luca's PDC talk. Eight minutes starting at 52:20 are an awesome demo of F# async workflows. It rocks!
No, I'm pretty sure that it won't automatically parallelise for you. It would have to know that your code was side-effect free, which could be hard to prove, for one thing.
Of course, F# can make it easier to parallelise your code, particularly if you don't have any side effects... but that's a different matter.
Like the others mentioned, F# will not automatically scale across cores and will still require a framework such as the port of ParallelFX that Josh mentioned.
F# is commonly associated with potential for parallel processing because it defaults to objects being immutable, removing the need for locking for many scenarios.
On purity annotations: Code Contracts have a Pure attribute. I remember hearing that some parts of the BCL already use this. Potentially, this attribute could be used by parallelization frameworks as well, but I'm not aware of such work at this point. Also, I'm not even sure how well Code Contracts are usable from within F#, so there are a lot of unknowns here.
Still, it will be interesting to see how all this stuff comes together.
No it will not. You must still explicitly marshal calls to other threads via one of the many mechanisms supported by F#.
My understanding is that it won't, but Parallel Extensions is being modified to make it consumable from F#. While that won't make it automatically multi-threaded, it should make parallelism very easy to achieve.
Well, you have your answer, but I just wanted to add that I think this is the most significant limitation of F# stemming from the fact that it is a hybrid imperative/functional language.
I would like to see some extension to F# that declares a function to be pure. That is, it has no side-effects that are not denoted by the function's type. The idea would be that a function is pure only if it references other "known-pure" functions. Of course, this would only be useful if it were then possible to require that a delegate passed as a function parameter references a pure function.

Standard format for concrete and abstract syntax trees

I have an idea for a hobby project which performs some code analysis and manipulation. This project will require both the concrete and abstract syntax trees of a given source file. Additionally, bi-directional references between the two trees would be helpful. I would like to avoid the work of transcribing a grammar to construct my own lexer and parser.
Is there a standard format for describing either concrete or abstract syntax trees?
Do any widely-used tool chains support outputting to these formats?
I don't have a particular target programming language in mind. Any popular one will do for a prototype, but I'd prefer one I know well: Python, C#, Javascript, or C/C++.
I'd like the ability to run a source file through a tool or library and get back both trees. In an ideal world, it would be practical to run this tool on code as it is being edited by a user and be tolerant of errors. Again, I am simply trying to develop a prototype, so these requirements are pretty lax.
Thanks!
The research community decided that graph exchange was the right thing to do when moving information from one program analysis tool to another.
See http://www.gupro.de/GXL
More recently, the OMG has defined a standard for interchanging Abstract Syntax Trees.
See http://www.omg.org/spec/ASTM/1.0/Beta1/
This problem seems to get solved over and over again. There have been half a dozen "tool bus" proposals made over the years that all solved it, with no one ever overtaking the industry.
The problem is that a) it is easy to represent ASTs using any kind of nestable notation [parentheses like LISP, like XML, ...], so people roll their own solution easily, and b) for one tool to exchange an AST with another, they both have to agree essentially on what the AST nodes mean; but most ASTs are rather accidentally derived from the particular grammar/parsing technology used by each tool, and there's almost always disagreement about that between tools. So I've seen very few tools that exchange ASTs meaningfully.
If you're doing a hobby thing, I'd stick with a lisp-like encoding of trees, where each node has the following format:
( nodename child1 ... childn )
It's easy to generate, and easy to read.
I work on a professional tool to manipulate programs. If we have to print out the AST, we do the above. Mostly, individual ASTs are far too complicated to look at in practice, so we hardly ever print out the entire AST, at best only a node and a few children deep. Our tool doesn't exchange ASTs with anybody (see above reasons :) but it does just fine building the AST in memory, doing whizzy things with it for analysis or transformation reasons, and then either just deleting it (no need to send it anywhere) or regenerating the original language text from the tree. [The latter means you need anti-parsing or "prettyprinting" technology.]
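If it helps to see the idea in code, here is a rough sketch (in F#, chosen purely for illustration since the question isn't language-specific) of a generic tree type and a printer that emits that lisp-like node format:

```fsharp
// A generic tree node: a name plus zero or more children, or a plain leaf.
type Node =
    | Leaf of string
    | Tree of string * Node list

// Print a node as "(name child1 child2 ...)".
let rec toSExpr node =
    match node with
    | Leaf text -> text
    | Tree (name, children) ->
        let parts = name :: List.map toSExpr children
        "(" + String.concat " " parts + ")"

// Example output: (add (num 1) (num 2))
let ast = Tree ("add", [ Tree ("num", [ Leaf "1" ]); Tree ("num", [ Leaf "2" ]) ])
printfn "%s" (toSExpr ast)
```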
In our project we defined the AST metamodel in UML and use ANTLR (Java) to populate the model. We also maintain the token information from ANTLR after parsing, but we have not yet tried to update the underlying text-file with modifications made on the model.
This has a hideous overhead (in infrastructure, such as Eclipse UML2/EMF), but our goal is to use high-level tools for Model-based/driven Development (MDD, MDA) anyway, so we decided to use it on each level.
I think one of our students once played with OpenArchitectureWare and managed to get changes from the Eclipse-based, generated editor back into the syntax tree (not related to the UML model above) automatically, but I don't know the details about this.
You might also want to look at ANTLR's tree grammars.
Specific standards are what one would expect here, while more general-purpose standards may also be appropriate. Ira Baxter already mentioned GXL, and RDF may be added too, though it would require an appropriate ontology and is more oriented toward semantics than syntax. It may still be an option worth investigating.
For specific standards, Ira Baxter already mentioned ASTM; another one, although it rather targets a specific kind of programming language (logic languages), is the standard for semantic/conceptual graphs known as ISO/IEC 24707:2007.
Not a standard on its own, but a paper on the matter: "Towards Portable Source Code Representations Using XML".
I don't know of any effectively used standard (in this area, it's always home-made cooking everywhere); I'm just interested in this topic too.
