What does "to close over the enclosing scope/class" mean? - closures

The Akka documentation is documenting dangerous variants of using Props:
// NOT RECOMMENDED within another actor:
// encourages to close over enclosing class
val props7 = Props(new MyActor)
Then carries on stating:
This method is not recommended to be used within another actor because
it encourages to close over the enclosing scope, resulting in
non-serializable Props and possibly race conditions (breaking the
actor encapsulation).
Could someone please explain the meaning of "closing over the enclosing scope"? Been looking all over and found nothing. Thanks.

It's a bit tricky to see in this example, that the new Actor is passed in as a so called "by name" parameter. Think of it as if it is turned into a function of type () => Actor. This function will be called every time the actor will be (re)created by it's supervisor during a restart.
The problem is that this function is a "closure" (very easy to Google ;)), which means it captures and remembers everything in the surrounding scope that it needs (sometimes, but very rarely referred to as "stack ripping"). E.g val f = (a: Int) => a + x. Where does the x come from? It comes from the surrounding scope. The function litetal, assigned to f is called an "open term". At runtime the function literal becomes a function value (that's a fancy way of saying "object"), which when executed closes the open term, while capturing everything in the surrounding scope. That's where the name "closure" comes from.
Closures are very very useful, but you have to be careful what you close over. Sometimes x is a def or god forbid a var, which leads to unpredictable results for f, because you don't have control over the time when f will be called/executed. Try it out!
Two very common anti paterns in Akka are/were:
Closing over (the outer) this reference when creating an Actor
from an inner class.
Closing over def sender when responding to
a message with a future.
I gave you a bunch of fancy terms to Google on purpose, btw ;)
Cheers and happy coding

As a supplement to #agilesteel's fine answer, some references:
Explains what closures are: Programming in Scala, 8.7, Closures
Explains why closures can cause serialization problems: SIP-21 - Spores
And here is a code example of creating a Props object that is not serializable because of closing over a non-serializable object, based on the example in SIP-21:
case class Helper(name: String)
object MyNonserializableObject {
val helper = Helper("the helper")
val props7 = Props(new MyActor(helper))
}
Even though helper itself is serializable, the "new MyActor(helper)" is passed by name and so captures this.helper, and this is not serializable.
You can see that the actor parameter is passed by name from the signature of the Props apply method where there is a ⇒ in the creator parameter:
def apply[T <: Actor](creator: ⇒ T)(implicit arg0: ClassTag[T]): Props

Related

Does dart have an equivalent to C# discards?

C# discards prevent allocation of values not needed. Is there something similar in dart? I saw a lot of people use the underscore as if it were a discard, but using two at the same time (like this (_, _) => method() will say the variable _ is already defined.
Dart does allow you to use the same discard operator as C#. You can define a variable or final with a name of _. This works well with the rule avoid-ignoring-return-values (Dart Code Metrics) Importantly, if you name the variable with this, you will not encounter the warning unused-local-variable. However, there is another code rule called no_leading_underscores_for_local_identifiers. You can safely turn this off as long as you don't have someone in your team that has a habit of prefixing variable names with an underscore.
Ignoring the return value
Discarding the return variable
Unfortunately, it doesn't work the same way as C# because it involves an assignment, and you cannot assign two different types to it. You need to declare it as an Object?

Arrow KT: Reader Monad vs #extension for Dependency Injection

I've read about Reader Monad from this article by Jorge Castillo himself and I've also got this article by Paco. It seems that both tackles the idea of Dependency Injection just in a different way. (Or am I wrong?)
I'm really confused whether I understand the whole Reader Monad and how it relates to the Simple Depenency Injection that Paco is talking about.
Can anyone help me understand these two things? Would I ever need both of them in one project depending on situations?
Your doubt is understandable, since yes both approaches share the same outcome: Passing dependencies implicitly for you all the way across your call stack, so you don't need to pass them explicitly at every level. With both approaches you will pass your dependencies once from the outer edge, and that's it.
Let's say you have the functions a(), b(), c() and d(), and let's say each one calls the next one: a() -> b() -> c() -> d(). That is our program.
If you didn't use any of the mentioned mechanisms, and you needed some dependencies in d(), you would end up forwarding your dependencies (let's call them ctx) all the way down on every single level:
a(ctx) -> b(ctx) -> c(ctx) -> d(ctx)
While after using any of the mentioned two approaches, it'd be like:
a(ctx) -> b() -> c() -> d()
But still, and this is the important thing to remember, you'd have your dependencies accessible in the scope of each one of those functions. This is possible because with the described approaches you enable an enclosing context that automatically forwards them on every level, and that each one of the functions runs within. So, being into that context, the function gets visibility of those dependencies.
Reader: It's a data type. I encourage you to read and try to understand this glossary where data types are explained, since the difference between both approaches requires understanding what type classes and data types are, and how they play together:
https://arrow-kt.io/docs/patterns/glossary/
As a summary, data types represent a context for the program's data. In this case, Reader stands for a computation that requires some dependencies to run. I.e. a computation like (D) -> A. Thanks to it's flatMap / map / and other of its functions and how they are encoded, D will be passed implicitly on every level, and since you will define every one of your program functions as a Reader, you will always be operating within the Reader context hence get access to the required dependencies (ctx). I.e:
a(): Reader<D, A>
b(): Reader<D, A>
c(): Reader<D, A>
d(): Reader<D, A>
So chaining them with the Reader available combinators like flatMap or map you'll get D being implicitly passed all the way down and enabled (accessible) for each of those levels.
In the other hand, the approach described by Paco's post looks different, but ends up achieving the same. This approach is about leveraging Kotlin extension functions, since by defining a program to work over a receiver type (let's call it Context) at all levels will mean every level will have access to the mentioned context and its properties. I.e:
Context.a()
Context.b()
Context.c()
Context.d()
Note that an extension function receiver is a parameter that without extension function support you'd need to manually pass as an additional function argument on every call, so in that way is a dependency, or a "context" that the function requires to run. Understanding those this way and understanding how Kotlin interprets extension functions, the receiver will not need to be forwarded manually on every level but just passed to the entry edge:
ctx.a() -> b() -> c() -> d()
B, c, and d would be called implicitly without the need for you to explicitly call each level function over the receiver since each function is already running inside that context, hence it has access to its properties (dependencies) enabled automatically.
So once we understand both we'd need to pick one, or any other DI approach. That's quite subjective, since in the functional world there are also other alternatives for injecting dependencies like the tagless final approach which relies on type classes and their compile time resolution, or the EnvIO which is still not available in Arrow but will be soon (or an equivalent alternative). But I don't want to get you more confused here. In my opinion the Reader is a bit "noisy" in combination with other common data types like IO, and I usually aim for tagless final approaches, since those allow to keep program constraints determined by injected type classes and rely on IO runtime for the complete your program.
Hopefully this helped a bit, otherwise feel free to ask again and we'll be back to answer 👍

What does that 2 dots mean? What is the difference between 1 and 2?

I have seen a lot of tutorials using dot, while some use 2. What is the actual meaning of this?
Example,
Array().add()
Animation()..addListener(() {})
The .. operator is dart "cascade" operator. Useful for chaining operations when you don't care about the return value.
This is also dart solution to chainable functions that always return this
It is made so that the following
final foo = Foo()
..first()
..second();
Is strictly equals to this:
final foo = Foo();
foo.first();
foo.second();
Just to be a nitpicker, .. isn't actually an operator in Dart, just part of Dart's syntactic sugar.
In addition to the mentioned use of cascades for chaining calls to functions, you can also use it to access fields on the same object.
Consider this code, taken from the Dart documentation:
querySelector('#confirm') // Get an object.
..text = 'Confirm' // Use its members.
..classes.add('important')
..onClick.listen((e) => window.alert('Confirmed!'));
The first method call, querySelector(), returns a selector object. The code that follows the cascade notation operates on this selector object, ignoring any subsequent values that might be returned.
For more information about cascades, check out Dart's outstanding documentation!

F# generalization by overloading

Given the type
type T =
static member OverloadedMethod(p:int) = ()
static member OverloadedMethod(p:string) = ()
Let's suppose we want to create a generic function that resolves to the specific overload based on the type of the parameter. The most intuitive way would be
//Case 1
let inline call o = T.OverloadedMethod o //error
call 3
call "3"
but this, despite the inline definition, doesn't work and the compiler complains
Error FS0041 A unique overload for method 'OverloadedMethod' could not
be determined based on type information prior to this program point. A
type annotation may be needed. Candidates: static member
T.OverloadedMethod : p:int -> unit, static member T.OverloadedMethod :
p:string -> unit
We can achieve what we want though, for example using the "operator trick"
//Case 2
type T2 =
static member ($) (_:T2, p:int) = T.OverloadedMethod(p)
static member ($) (_:T2, p:string) = T.OverloadedMethod(p)
let inline call2 o = Unchecked.defaultof<T2> $ o
call2 3
call2 "3"
The F# compiler here does (apparently) some more work and doesn't simply fall back to the .NET resolution.
Yet, this looks ugly and implies code duplication. It sounds like Case 1 should be possible.
What technical reasons justify this behaviour? My guess is that there is some trade-off (perhaps with .NET interoperability), but couldn't find more information.
EDIT
From the posts I extract this as a reason:
"a trait call is an F# compiler feature, so there must be two different ways of writing a simple call and a trait call. Using the same syntax for both is not convenient because it might be confusing, some uses could arise where a simple call is compiled as a trait call accidentally".
Let's put the question in a different perspective:
Looking at the code, it really seems straightforward what the compiler should do:
1) call is an inline function, so defer compilation to the use site
2) call 3 is an use site, where the parameter is of type int. But T.OverloadedMethod(int) exists, so let's generate a call to it
3) call "3" like previous case with string in place of int
4) call 3.0 error as T.OverloadedMethod(float) doesn't exist
I would really like to see a code example where letting the compiler do this would be a problem that justifies requiring the developer to write the additional code for the trait call.
At the end of the day, isn't one of F# strengths "conciseness and intuitiveness"?
Here we are in presence of a case where it looks like it could be better.
The trade-offs stem from the fact that this it's a completely erased compiler trick. This means that:
C# code can't see (and take advantage of) any call2 function, it can only see your two $ method overloads.
All information about call2 is gone at runtime, meaning you couldn't invoke it through reflection, for example.
It won't show up in call stacks. This may be a good or a bad thing -- for example, in recent versions of the F# compiler, they've selectively inlined certain functions to make async stacktraces a bit nicer.
The F# is baked in at the call site. If your calling code in is assembly A call2 comes from assembly B, you can't just replace assembly B with a new version of call2; you have to recompile A against the new assembly. This could potentially be a backwards compatibility concern.
A rather interesting benefit is that it can lead to drastic performance improvements in specialized cases: Why is this F# code so slow?. On the flip side, I'm sure there are circumstances where it could cause active harm, or just bloat the resulting IL code.
The reason of that behavior is that T.OverloadedMethod o in
let inline call o = T.OverloadedMethod o
is not a trait call. It's rather simple .NET overloading that must be solved at the call site, but since your function type doesn't imply which overload to solve it simply fails to compile, this functionality is desired.
If you want to "defer" the overload resolution you need to do a trait call, making the function inline is necessary but not sufficient:
let inline call (x:'U) : unit =
let inline call (_: ^T, x: ^I) = ((^T or ^I) : (static member OverloadedMethod: _ -> _) x)
call (Unchecked.defaultof<T>, x)
Using an operator saves you many keystrokes by inferring automatically these constraints but as you can see in your question, it requires to include a dummy parameter in the overloads.

How to name a function that produces a closure

I understand closures, even though I scarcely use them, but whenever I can squeeze one I have no idea of how to name it.
The best I can think of is sticking a "make" before what would be the name of the function:
function makeSortSelection(settings1, settings2) {
return function() {
/* sort stuff attending to settings1 and settings2 */
};
}
$("#sort-button").click(makeSortSelection('ascending',foo));
(I almost always use them in Javascript, but I guess this is a very language-agnostic question)
Sadly, most examples I found of closures just use "foo" or "sayHello". I like to give all my functions a verb as name: functions "do stuff", and their name reflects it ("sortSelection", "populateForm"). In the same spirit, how should I name closures, that "do things that do stuff"? What conventions do you use, and what are the most common?
PD: I tend to use Google's style guide when in doubt, but it says nothing about this.
Closures aren't nameable entities. Functions are nameable but a closure isn't a function.
Rather than define "a closure" it is easier to define the circumstances under which closure arises.
A (lexical) closure (in javascript) occurs and continues to exist as long a persistent external reference to an inner function exists, but neither the outer function nor the inner function nor the reference to the inner function is, in itself, a closure. Pragmatically speaking, a closure is a construct comprising all these elements plus a feature of the language by which garbage collection is suppressed when they exist.
In my opinion it is wrong, as some claim, that "all functions are closures". Yes, all functions have a scope chain, but a closure should only be regarded as existing once an outer function has completed and returned, and a persistent reference to an inner function exists.
By all means give the returned function a verb-based name, just like any other named function - there's no need to regard it differently just because it is was returned by another function. In every respect the returned function is just a function - it just happens to be a function with access to the scope chain of the (execution context of the) outer function that brought it into being - nothing more than that.
EDIT:
I just realised I didn't address the critical point of the question - rephrased "how to name a function that exists for the express purpose of forming a closure by returning a function".
Like you, I have a problem here.
Most often I use the "make" prefix, as in your example. This seems to be best most of the time.
I have also used "_closure" as a suffix, This doesn't obey the "verb rule" but has the advantage of being independent of natural language. "Make" is strictly an English word and speakers of other languages will probably choose to use their own "faire", "machen" etc. On the other hand, "closure" is universal - it remains, as far as I'm aware, untranslated in other languages. Therefore, "closure" as a suffix (or prefix) could be better in scripts that are likely to be used/modded on a world-wide basis.

Resources