Generic core binary relations of M3 - Rascal

In the paper "M3: a General Model for Code Analytics in Rascal", three generic core binary relations for M3 are given: containment, declarations, and uses.
Looking at the M3 source code in analysis::m3::Core, I see a lot more binary relations:
Declarations
Types
Uses
Containment
Messages
Names
Documentation
Modifiers
Has this list simply been extended in the meantime? If so, must all of these relations be implemented for a correct implementation of M3?

containment, declarations, and uses are still necessary core relations. The others are sufficiently generic to be implementable for all languages, but are not strictly necessary; whether you need them depends on the tooling you use in the "back-end".


Is repr(C) a preprocessor directive?

I've seen some Rust codebases use the #[repr(C)] macro (is that what it's called?). However, I couldn't find much information about it other than that it sets the type's layout in memory to be the same as C's.
Here's what I would like to know: is this a preprocessor directive restricted to the compiler rather than part of the language itself (even though there aren't any other compiler front-ends for Rust)? And why does Rust even have a memory layout different from C's? (It's just that I've never had to do this in another language.)
Here's a situation to demonstrate what I mean: if someone creates another compiler for Rust, are they required to implement this macro, or is it a compiler-specific thing?
#[repr(C)] is not a preprocessor directive, since Rust doesn't use a preprocessor 1. It is an attribute. Rust doesn't have a complete specification, but the repr attribute is mentioned in the Rust reference, so it is absolutely a part of the language. Implementation-wise, attributes are parsed the same way all other Rust code is, and are stored in the same AST. Rust has no "attribute pass": attributes are an actual part of the language. If someone else were to implement a Rust compiler, they would need to implement #[repr(C)].
Furthermore, #[repr(C)] can't be implemented without some compiler magic. In the absence of a #[repr(...)], Rust compilers are free to arrange the fields of a struct/enum however they want to (and they do take advantage of this for optimization purposes!).
Rust does have a good reason for using its own memory layout. If compilers aren't tied to how a struct is written in the source code, they can do optimisations like not storing struct fields that are never read from, reordering fields for better performance, enum tag pooling2, and using the spare bits of NonZero*s in the struct to store data (the last one isn't happening yet, but might in the future). But the main reason is that Rust has things that just don't make sense in C. For instance, Rust has zero-sized types (like () and [i8; 0]) that can't exist in C, as well as trait vtables, enums with fields, and generic types, all of which cause problems when trying to translate them to C.
1 Okay, you could use the C preprocessor with Rust if you really wanted to. Please don't.
2 For example, enum Food { Apple, Pizza(Topping) } enum Topping { Pineapple, Mushroom, Garlic } can be stored in just 1 byte since there are only 4 possible Food values that can be created.
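To make the layout point concrete, here is a minimal sketch (the struct and enum names are made up for illustration). The numbers printed for the default representation are not guaranteed, because the default layout is deliberately unspecified, but on current rustc the reordered struct is smaller than its #[repr(C)] twin and the Food enum from footnote 2 fits in one byte:

use std::mem::size_of;

#[allow(dead_code)]
struct DefaultRepr { a: u8, b: u32, c: u8 }   // the compiler may reorder these fields

#[allow(dead_code)]
#[repr(C)]
struct CRepr { a: u8, b: u32, c: u8 }         // fields stay in declaration order

#[allow(dead_code)]
enum Topping { Pineapple, Mushroom, Garlic }

#[allow(dead_code)]
enum Food { Apple, Pizza(Topping) }

fn main() {
    println!("DefaultRepr: {} bytes", size_of::<DefaultRepr>()); // typically 8: `b` gets moved first
    println!("CRepr:       {} bytes", size_of::<CRepr>());       // 12 on common platforms: C-style padding
    println!("():          {} bytes", size_of::<()>());          // 0: zero-sized types take no space
    println!("Food:        {} bytes", size_of::<Food>());        // 1 on current rustc: only 4 possible values
}

Putting #[repr(C)] on everything would opt you out of all of these layout optimisations, which is why it is not the default.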
What is this?
It is not a macro; it is an attribute.
The book has a good chapter on what macros are and it mentions that there are "Attribute-like macros":
The term macro refers to a family of features in Rust: declarative macros with macro_rules! and three kinds of procedural macros:
Custom #[derive] macros that specify code added with the derive attribute used on structs and enums
Attribute-like macros that define custom attributes usable on any item
Function-like macros that look like function calls but operate on the tokens specified as their argument
Attribute-like macros are the ones you can use like attributes. For example:
#[route(GET, "/")]
fn index() {}
It does look like the repr attribute, doesn't it? 😃
So what is an attribute then?
Luckily, Rust has great resources like rust-by-example, which includes:
An attribute is metadata applied to some module, crate or item. This metadata can be used to/for:
conditional compilation of code
set crate name, version and type (binary or library)
disable lints (warnings)
enable compiler features (macros, glob imports, etc.)
link to a foreign library
mark functions as unit tests
mark functions that will be part of a benchmark
The Rust reference is also something you usually look at when you need to know something in more depth (see the chapter for attributes).
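For a sense of what these look like in practice, here is a small illustrative snippet covering a few of the roles listed above (the item names are made up):

#![allow(dead_code)]            // disable a lint for the whole crate

#[cfg(target_os = "linux")]     // conditional compilation: only compiled on Linux
fn linux_only() {}

#[derive(Debug, Clone)]         // derive macros are also invoked through an attribute
struct Point { x: i32, y: i32 }

#[test]                         // mark a function as a unit test
fn point_clones() {
    let p = Point { x: 1, y: 2 };
    assert_eq!(format!("{:?}", p.clone()), "Point { x: 1, y: 2 }");
}

fn main() {
    println!("{:?}", Point { x: 3, y: 4 });
}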
To the compiler authors out there:
If you were to write a Rust compiler and wanted to support things like the standard library or other crates, then you would absolutely need to implement these, because those libraries use them and rely on them.
Otherwise, I guess you could come up with a subset of Rust that your compiler supports, but then most people wouldn't use it.
Why does Rust not just use the C layout?
The Nomicon explains why Rust needs to be able to reorder struct fields, for example: to save space and be more efficient. This is related to, among other things, generics and monomorphization. With repr(C), struct fields must be laid out in the same order as in the definition.
The C representation is designed for dual purposes. One purpose is for creating types that are interoperable with the C Language. The second purpose is to create types that you can soundly perform operations on that rely on data layout such as reinterpreting values as a different type.
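As a small sketch of what that fixed layout buys you (Message is an invented example, and offset_of! requires Rust 1.77 or newer): with #[repr(C)] the offsets below are guaranteed, and they match the equivalent C struct.

use std::mem::{offset_of, size_of};

// With #[repr(C)], field order and padding follow C's rules, so the layout
// matches the equivalent C struct { unsigned char tag; unsigned int value; }.
#[allow(dead_code)]
#[repr(C)]
struct Message {
    tag: u8,
    value: u32,
}

fn main() {
    println!("tag at offset {}", offset_of!(Message, tag));     // 0
    println!("value at offset {}", offset_of!(Message, value)); // 4: three padding bytes after `tag`
    println!("total size {} bytes", size_of::<Message>());      // 8 on common platforms
}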

Elixir Enum vs Erlang Lists

Is there a reason to use the Enum library over Erlang's :lists when writing code in Elixir? :lists has many of the same functions, like takewhile, partition, any, all...
Also, elixir-lang.org states:
"Elixir provides excellent interoperability with Erlang libraries. In fact, Elixir discourages simply wrapping Erlang libraries in favor of directly interfacing with Erlang code."
Is it because the condition function needs to be formatted differently that Enum is used over :lists?
So, Enum is a module that holds functions that work with data structures implementing the Enumerable protocol.
Lists in Erlang/Elixir are linked lists, so prepending an element is cheap (O(1)) while indexing and appending are O(n).
Enumerables in Elixir are data structures such as lists, but also maps and MapSets.
However, as stated in the documentation:
the majority of the functions in Enum enumerate the whole enumerable and return a list as result
That being said, there are two main reasons to use Enum functions over :lists-based functions:
There are more of them, and they provide very useful ways to tackle a problem.
They are polymorphic: swap your map for a keyword list at any time without rewriting everything in your codebase.

Converting OCaml to F#: Differences between typing and type inference

In researching type inference differences between F# and OCaml, I found that discussions tended to focus on nominal vs. structural type systems. Then I found Distinctive traits of functional programming languages, which lists typing and type inference as distinct traits.
Since the trait article says OCaml and F# both use Damas-Milner type inference, which I thought was a standard algorithm, i.e. an algorithm that does not allow for variations, how do the two traits relate? Is it that Damas-Milner is the basis upon which both type inference systems are built, but that they each modify Damas-Milner based on the typing?
Also I checked the F# source code for the words Damas, Milner and Hindley and found none. A search for the word inference turned up the code for type inference.
If so, are there any papers that discuss the details of each type inference algorithm for the particular language, or do I have to look at the source code for OCaml and F#?
EDIT
Here is a page that highlights some differences related to type inference between OCaml and F#.
Concerning your DM question, you are right. For both F# and OCaml, the DM algorithm is just a pattern; the type checkers are extended to support custom features. In OCaml these features include objects with row types, polymorphic variants, and first-class modules; in F#, .NET type system interop (classes, interfaces, structs, subtyping, method overloads) and units of measure. I think F# type inference is also skewed in a left-to-right fashion to allow more efficient interactive checking, so some code surprisingly needs annotations.
As far as type checking and inference go, OCaml is more expressive and intuitive than F#. SML would be closer than either of them to vanilla HM, but SML also has a few extensions for some operator polymorphism and record support.
I believe that when they talk about structural typing in OCaml, they are probably referring to the object system (the "O" part of "OCaml"). The non-object parts of OCaml are pretty standard ML type system; it's the object system that is unusual.
The object system in OCaml is very different from the .NET class-based object system in F#. In OCaml, you can create objects directly without using a class; classes are basically a convenient way of creating objects. After creation, an object (whether created directly with an object literal or via a class) has no notion of its class.
Look at what happens when you write a function that takes an object and calls a particular method on it:
# let foo x = x#bar;;
val foo : < bar : 'a; .. > -> 'a = <fun>
The argument type is inferred to be an open object type (note the ..) that has a method named bar, so the function can accept any object with such a method.
That's what it means when they say that the object system is structurally typed. The only thing that matters about an object is its set of methods, which determines where it can be used. Compatibility is based purely on the "structure" of its methods, not on any notion of "class".
Since the trait article says OCaml and F# both use Damas-Milner type inference, which I thought was a standard algorithm, i.e. an algorithm that does not allow for variations, how do the two traits relate?
The Damas-Milner algorithm (also known as Algorithm W) can be extended and, indeed, all practically-relevant implementations of it have added many extensions including both OCaml and F#.
Is it that Damas-Milner is the basis upon which both type inference systems are built, but that they each modify Damas-Milner based on the typing?
Exactly, yes. In particular, OCaml has a great many different experimental extensions to a Damas-Milner core including polymorphic variants, objects, first-class modules. F# is simpler but also has some extensions that OCaml does not have, most notably overloading (primarily operators).
I don't believe there are summary papers describing the whole type systems of either OCaml or F#. Indeed, I do not know of a paper that describes today's F# type system. For OCaml, you have many different papers each covering different aspects. I would start with Jacques Garrigue's own publications and then follow the references therein.

What are the postfix numbers on F# core methods?

I was looking at the source code for the Append function in the SeqModule and noticed that there are a ton of duplicate methods with #xxx postfixed to them. Does anyone know why these are here?
In short, those are the concrete classes that back various local function values, and the #xxx values indicate the source code line number that caused them to be generated (though this is an implementation detail, and the classes could be given any arbitrary name).
Likewise, the C# compiler uses a conceptually similar scheme when defining classes to implement anonymous delegates, iterator state machines, etc. (see Eric Lippert's answer here for how the "magic names" in C# work).
These schemes are necessary because not every language feature maps perfectly to things that can be expressed cleanly in the CLR.

How to design a set of file readers and writers for different formats

Digging into a legacy project (C++) that needs to be extended, I realized that there are about 40 reader/writer/parser classes. They are used to read and write various types of data (different objects) in different file formats (binary, HDF5, XML, text, ...); one type of object is typically bound to one or two file formats. Most of the classes have no knowledge of the others. Interfaces and inheritance were apparently unknown to the original author, as were design patterns.
It seems to me a horrendous mess. On the other hand, I am not exactly sure how to handle this situation. I will at least extract interfaces. I would also like to see whether I can move common code into parent classes, for example whatever is specific to an HDF5 reader/writer. I also thought that the Abstract Factory pattern could help, but the objects I get out of the readers are completely different.
How would you handle this situation? How would you design the classes? Which design pattern would you use, if any? Would you keep the reading and writing parts split?
The Abstract Factory pattern isn't the right track here. You usually only need interfaces if you anticipate multiple implementations for a given file type and want them all to operate the same way.
Question: can one class be written to multiple file types? As in, object 'a' (of type Class A) potentially needs to be written to either/both XML or text formats?
If that is true, you need to decouple the classes from the readers/writers. Take a look at this question: What design pattern should I use for import/export?
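To illustrate what that decoupling can look like, here is a minimal sketch (written in Rust rather than C++ purely for brevity; the Mesh type and MeshWriter trait are invented for the example). The domain type knows nothing about serialization; each format gets its own writer behind a common interface:

use std::io::{self, Write};

// The domain type carries no serialization logic.
struct Mesh { vertices: Vec<[f32; 3]> }

// One small interface per direction; each format gets its own implementation.
trait MeshWriter {
    fn write(&self, mesh: &Mesh, out: &mut dyn Write) -> io::Result<()>;
}

struct TextMeshWriter;

impl MeshWriter for TextMeshWriter {
    fn write(&self, mesh: &Mesh, out: &mut dyn Write) -> io::Result<()> {
        for v in &mesh.vertices {
            writeln!(out, "{} {} {}", v[0], v[1], v[2])?;
        }
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mesh = Mesh { vertices: vec![[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]] };
    // The caller picks a format; the Mesh type itself never changes.
    let writer: Box<dyn MeshWriter> = Box::new(TextMeshWriter);
    writer.write(&mesh, &mut io::stdout())
}

Adding an XML or HDF5 writer then means adding one new implementation of the interface, not touching the data classes.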

Resources