Why data parameter comes last - f#

Why does the data parameter come last in F#, as the following code snippet shows?
let startsWith lookFor (s:string) = s.StartsWith(lookFor)
let str1 =
    "hello"
    |> startsWith "h"

I think part of your answer is in your question. The |> (forward pipe) operator lets you specify the last parameter to a function before you call it. If the parameters were in the opposite order, then that wouldn't work. The best examples of the power of this are with chaining of functions that operate on lists. Each function takes a list as its last parameter and returns a list that can be passed to the next function.
From http://www.tryfsharp.org/Learn/getting-started#chaining-functions:
[0..100]
|> List.filter (fun x -> x % 2 = 0)
|> List.map (fun x -> x * 2)
|> List.sum
The |> operator allows you to reorder your code by specifying the last
argument of a function before you call it. This example is
functionally equivalent to the previous code, but it reads much more
cleanly. First, it creates a list of numbers. Then, it pipes that list
of numbers to filter out the odds. Next, it pipes that result to
List.map to double it. Finally, it pipes the doubled numbers to
List.sum to add them together. The Forward Pipe Operator reorganizes
the function chain so that your code reads the way you think about the
problem instead of forcing you to think inside out.
As mentioned in the comments there is also the concept of currying, but I don't think that is as easy to grasp as chaining functions.
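To make the currying remark a little more concrete, here is a minimal sketch. The names startsWithH and startsWithDataFirst are made up for illustration, and lookFor is annotated as string so the example does not depend on overload resolution. With the data parameter last, partially applying the function yields a predicate that can be passed straight to List.filter; with the data parameter first, a lambda is needed at every call site.

```fsharp
// Data-last: supplying only the first argument gives a reusable predicate.
let startsWith (lookFor: string) (s: string) = s.StartsWith(lookFor)
let startsWithH = startsWith "h" // string -> bool

let hits = [ "hello"; "world"; "hi" ] |> List.filter startsWithH

// Data-first (hypothetical variant): partial application fixes the wrong
// argument, so each use needs an explicit lambda.
let startsWithDataFirst (s: string) (lookFor: string) = s.StartsWith(lookFor)

let hits2 =
    [ "hello"; "world"; "hi" ]
    |> List.filter (fun s -> startsWithDataFirst s "h")
```

Both hits and hits2 contain "hello" and "hi"; only the first form reads without extra plumbing.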

Related

Negate a string method in a lambda function gives error

I am calling the string method "contains" in a lambda function, and would like to negate it. I thought this could be done with not myString.Contains("abbr") but it gives me the error
Successive arguments should be separated by spaces or tupled, and arguments involving function or method applications should be parenthesized
My actual function is this
open System.IO
let createWordArray filePath =
    File.ReadLines(filePath)
    |> Seq.filter (fun line -> line <> "")
    |> Seq.filter (fun line -> not line.Contains("abbr.")) // Error occurs here
    |> Seq.map (fun line -> line.Split(' ').[0])
    |> Seq.filter (fun word -> word.StartsWith("-") || word.EndsWith("-"))
    |> Seq.toArray
Please point out any other obvious mistakes I'm making.
You just need to add parentheses around the argument of the not function:
|> Seq.filter (fun line -> not (line.Contains("abbr.")))
Without the parentheses, the compiler is interpreting your code as a call to not with two arguments:
not (line.Contains) ("abbr.")
F# syntax is not like C# (or C, or C++, or Java)
In particular, F# does not use parentheses for passing function arguments. Instead, F# uses whitespace for that:
let x = f y z
You are, of course, free to enclose any terms in parentheses if you wanted to indicate the order of operations, or just for aesthetic reasons:
let x = f (y+5) z // parens for order of operations
let x = f (y) (z) // parens just for the heck of it
So you see, when you write:
line.Contains("abbr.")
There is no special meaning to the parens. You could just as well write this:
line.Contains "abbr."
It would be equivalent.
See what's happening? Not yet? Well, ok, let's try to add the not to the mix:
not line.Contains "abbr."
Is it clearer now? This looks like you're trying to call the not function, and you're giving it two arguments: first argument is line.Contains, and the second argument is "abbr."
This is not what you meant, right? What you meant was probably to first call line.Contains, passing it "abbr." as the argument, and then pass the result of that to not.
The most straightforward way to do this is to use parentheses to indicate the order of operations:
not (line.Contains "abbr.")
Or, alternatively, you could use the operator <|, which is intended specifically for this kind of thing. It just passes a parameter to a function, so it pretty much does nothing. But the point is that it's an operator, so its precedence is lower than that of a function call:
not <| line.Contains "abbr."
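Putting the fix back into the original pipeline, a rough sketch might look like this. To keep it runnable without a file, it takes the lines as a parameter instead of calling File.ReadLines; the name keepWords and the sample lines are invented for illustration.

```fsharp
// Same pipeline as in the question, with the argument of `not` parenthesized.
let keepWords (lines: seq<string>) =
    lines
    |> Seq.filter (fun line -> line <> "")
    |> Seq.filter (fun line -> not (line.Contains("abbr.")))
    |> Seq.map (fun line -> line.Split(' ').[0])
    |> Seq.filter (fun word -> word.StartsWith("-") || word.EndsWith("-"))
    |> Seq.toArray

// Made-up sample input standing in for the file contents.
let result =
    keepWords [ "abbr. short form"; ""; "-ing suffix"; "pre- prefix"; "plain word" ]

// The backward-pipe spelling gives the same answer as the parenthesized one.
let sample = "plain word"
let withParens = not (sample.Contains "abbr.")
let withBackPipe = not <| sample.Contains "abbr."
```

Here result holds the two entries "-ing" and "pre-", and withParens and withBackPipe are both true.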

Different argument order for getting N-th element of Array, List or Seq

Is there a good reason for a different argument order in functions getting N-th element of Array, List or Seq:
Array.get source index
List .nth source index
Seq .nth index source
I would like to use pipe operator and it seems possible only with Seq:
s |> Seq.nth n
Is there a way to have the same notation with Array or List?
I can't think of any good reason to define Array.get and List.nth this way. Given that pipelining is very common in F#, they should have been defined so that the source argument came last.
In case of List.nth, it doesn't change much because you can use Seq.nth and time complexity is still O(n) where n is length of the list:
[1..100] |> Seq.nth 10
It's not a good idea to use Seq.nth on arrays because you lose random access. To keep O(1) running time of Array.get, you can define:
[<RequireQualifiedAccess>]
module Array =
    /// Get n-th element of an array in O(1) running time
    let inline nth index source = Array.get source index
In general, different argument order can be alleviated by using flip function:
let inline flip f x y = f y x
You can use it directly on the functions above:
[1..100] |> flip List.nth 10
[|1..100|] |> flip Array.get 10
    
Just use backward pipe operator:
[1..1000] |> List.nth <| 42
Since both operators are left associative, x |> f <| y is parsed as (x |> f) <| y, and this does the trick.
Backward pipe operator is also useful if you want to remove parentheses: f (very long expression) can be replaced with f <| very long expression.
Since Pad and bytebuster answered your last question I will focus on the why part.
This is based on my current knowledge and not historical facts.
F# is derived from OCaml, which has Array and List but no Seq, and which lacks the pipeline operator. Since F# uses |> for natural pipelining and type checking, the authors of F# made the switch for Seq. But evidently, to stay backward compatible with OCaml, they did not switch everything.
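For readers on a newer toolchain: assuming F# 4.0 or later, the core library also provides List.item, Array.item and Seq.item, which all take the index first and the source last, so the same pipeline notation works for all three collection types. A small sketch:

```fsharp
// All three take the index first and the collection last.
let fromList = [ 1 .. 100 ] |> List.item 10
let fromArray = [| 1 .. 100 |] |> Array.item 10
let fromSeq = seq { 1 .. 100 } |> Seq.item 10
```

Each of the three values is 11, the element at zero-based index 10.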

Using Array.map omitting the first element of the array in F#

I have just started playing with F#, so this question will probably be quite basic...
I would like to read a text file line by line, and then ignore the first line and process the other lines using a given function. So I was thinking of using something along the lines of:
File.ReadAllLines(path)
|> Array.(ignore first element)
|> Array.map processLine
what would be an elegant yet efficient way to accomplish it?
There is no simple function to skip the first line in an array, because the operation is not efficient (it would have to copy the whole array), but you can do that easily if you use lazy sequences instead:
File.ReadAllLines(path)
|> Seq.skip 1
|> Seq.map processLine
If you need the result in an array (as opposed to seq<'T>, which is an F# alias for IEnumerable<'T>), then you can add Seq.toArray to the end. However, if you just want to iterate over the lines later on, then you can probably just use sequences.
This is an addition to Tomas' answer, which I generally agree with. One thing to watch is what happens if your array or sequence contains no lines. (Or fewer lines than you want to skip.) In that case, Seq.skip will throw an exception. The most concise way around this that I can think of is:
System.IO.File.ReadAllLines fileName
|> Seq.mapi (fun i elem -> i, elem)
|> Seq.choose (fun (i, elem) -> if i > 0 then Some(elem) else None)
You skip the first element in an F# array simply by myArray.[1..]. Gotta love how elegant this language is.
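As a rough sketch of the safe variant discussed above, the mapi/choose approach can be wrapped in a helper and tried on in-memory data. The names skipFirst and processLine are placeholders (processLine here just upper-cases the line), and the sample arrays stand in for the result of File.ReadAllLines.

```fsharp
// Drops the first element; yields an empty sequence for empty input
// instead of throwing.
let skipFirst (lines: seq<string>) =
    lines
    |> Seq.mapi (fun i line -> i, line)
    |> Seq.choose (fun (i, line) -> if i > 0 then Some line else None)

// Hypothetical stand-in for the real per-line processing.
let processLine (line: string) = line.ToUpperInvariant()

let processed =
    [| "header"; "a"; "b" |]
    |> skipFirst
    |> Seq.map processLine
    |> Seq.toArray

let fromEmpty = Array.empty<string> |> skipFirst |> Seq.toArray
```

With this input, processed contains "A" and "B", and fromEmpty is an empty array.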

f# iterating over two arrays, using function from a c# library

I have a list of words and a list of associated part of speech tags. I want to iterate over both, simultaneously (matched index) using each indexed tuple as input to a .NET function. Is this the best way (it works, but doesn't feel natural to me):
let taggingModel = SeqLabeler.loadModel(lthPath +
                                        "models\penn_00_18_split_dict.model");
let lemmatizer = new Lemmatizer(lthPath + "v_n_a.txt")
let input = "the rain in spain falls on the plain"
let words = Preprocessor.tokenizeSentence( input )
let tags = SeqLabeler.tagSentence( taggingModel, words )
let lemmas = Array.map2 (fun x y -> lemmatizer.lookup(x,y)) words tags
Your code looks quite good to me - most of it deals with some loading and initialization, so there isn't much you could do to simplify that part. Alternatively to Array.map2, you could use Seq.zip combined with Seq.map - the zip function combines two sequences into a single one that contains pairs of elements with matching indices:
let lemmas = Seq.zip words tags
             |> Seq.map (fun (x, y) -> lemmatizer.lookup (x, y))
Since lookup function takes a tuple that you got as an argument, you could write:
// standard syntax using the pipelining operator
let lemmas = Seq.zip words tags |> Seq.map lemmatizer.lookup
// .. an alternative syntax doing exactly the same thing
let lemmas = (words, tags) ||> Seq.zip |> Seq.map lemmatizer.lookup
The ||> operator used in the second version takes a tuple containing two values and passes them to the function on the right side as two arguments, meaning that (a, b) ||> f means f a b. The |> operator takes only a single value on the left, so (a, b) |> f would mean f (a, b) (which would work if the function f expected tuple instead of two, space separated, parameters).
If you need lemmas to be an array at the end, you'll need to add Array.ofSeq to the end of the processing pipeline (all Seq functions work with sequences, which correspond to IEnumerable<T>)
One more alternative is to use sequence expressions (you can use [| .. |] to construct an array directly if that's what you need):
let lemmas = [| for wt in Seq.zip words tags do // wt is tuple (string * string)
                  yield lemmatizer.lookup wt |]
Whether to use sequence expressions or not - that's just a personal preference. The first option seems to be more succinct in this case, but sequence expressions may be more readable for people less familiar with things like partial function application (in the shorter version using Seq.map)
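The variants above can be compared side by side in a self-contained sketch. Since the tagging library is not available here, lookup is a made-up stand-in that takes a tuple the way a .NET method would, and the words and tags are invented sample data.

```fsharp
let words = [| "the"; "rain"; "falls" |]
let tags = [| "DT"; "NN"; "VBZ" |]

// Hypothetical replacement for lemmatizer.lookup: takes one tupled argument.
let lookup (word: string, tag: string) = sprintf "%s/%s" word tag

// Array.map2 walks both arrays in step.
let viaMap2 = Array.map2 (fun w t -> lookup (w, t)) words tags

// Seq.zip builds pairs, which can be handed to lookup directly.
let viaZip = Seq.zip words tags |> Seq.map lookup |> Array.ofSeq

// ||> spreads the tuple over the two parameters of Seq.zip.
let viaDoublePipe = (words, tags) ||> Seq.zip |> Seq.map lookup |> Array.ofSeq

// Array expression form.
let viaExpression = [| for wt in Seq.zip words tags do yield lookup wt |]
```

All four bindings produce the same array: "the/DT", "rain/NN", "falls/VBZ".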

What is the name of |> in F# and what does it do?

A real F# noob question, but what is |> called and what does it do?
It's called the forward pipe operator. It pipes the result of one function to another.
The Forward pipe operator is simply defined as:
let (|>) x f = f x
And has a type signature:
'a -> ('a -> 'b) -> 'b
Which resolves to: given a generic type 'a, and a function which takes an 'a and returns a 'b, then return the application of the function on the input.
You can read more detail about how it works in an article here.
I usually refer to |> as the pipelining operator, but I'm not sure whether the official name is pipe operator or pipelining operator (though it probably doesn't really matter as the names are similar enough to avoid confusion :-)).
@LBushkin already gave a great answer, so I'll just add a couple of observations that may also be interesting. Obviously, the pipelining operator got its name because it can be used for creating a pipeline that processes some data in several steps. The typical use is when working with lists:
[0 .. 10]
|> List.filter (fun n -> n % 3 = 0) // Get numbers divisible by three
|> List.map (fun n -> n * n) // Calculate squares of such numbers
This gives the result [0; 9; 36; 81]. Also, the operator is left-associative which means that the expression input |> f |> g is interpreted as (input |> f) |> g, which makes it possible to sequence multiple operations using |>.
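To see what the operator buys you, the same computation can be written three ways. This is only an illustrative sketch; pipe is a made-up ordinary function with the same definition as |>, used to show that nothing special is going on.

```fsharp
// Pipelined form: reads top to bottom in the order the steps happen.
let piped =
    [ 0 .. 10 ]
    |> List.filter (fun n -> n % 3 = 0)
    |> List.map (fun n -> n * n)

// Nested form: the same calls, but read inside out.
let nested =
    List.map (fun n -> n * n) (List.filter (fun n -> n % 3 = 0) [ 0 .. 10 ])

// A plain function with the same definition as (|>).
let pipe x f = f x

let viaPipe =
    pipe (pipe [ 0 .. 10 ] (List.filter (fun n -> n % 3 = 0))) (List.map (fun n -> n * n))
```

All three evaluate to [0; 9; 36; 81].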
Finally, I find it quite interesting that the pipelining operator in many cases corresponds to method chaining from object-oriented languages. For example, the previous list processing example would look like this in C#:
Enumerable.Range(0, 11) // Second argument is a count, so this yields 0 through 10
.Where(n => n % 3 == 0) // Get numbers divisible by three
.Select(n => n * n) // Calculate squares of such numbers
This may give you some idea about when the operator can be used if you're coming from an object-oriented background (although it is used in many other situations in F#).
As far as F# itself is concerned, the name is op_PipeRight (although no human would call it that). I pronounce it "pipe", like the unix shell pipe.
The spec is useful for figuring out these kinds of things. Section 4.1 has the operator names.
http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/manual/spec.html
Don't forget to check out the library reference docs:
http://msdn.microsoft.com/en-us/library/ee353754(v=VS.100).aspx
which list the operators.
