I'm new to F# and I tried to write a program that goes through all files in a given directory and, for each file of type ".txt", appends an id number + "DONE" to the file.
My program:
//const:
[<Literal>]
let notImportantString = "blahBlah"
let mutable COUNT = 1.0
//funcs:
//addNumber --> add the sequence number COUNT to each file.
let addNumber (file : string) =
    let mutable str = File.ReadAllText(file)
    printfn "%s" str //just for check
    let num = COUNT.ToString()
    let str4 = str + " " + num + "\n\n\n DONE"
    COUNT <- COUNT + 1.0
    let str2 = File.WriteAllText(file, str4)
    file
//matchFunc --> check if is ".txt"
let matchFunc (file : string) =
    file.Contains(".txt")
//allFiles --> go through all files of a given dir
let allFiles dir =
    seq
        { for file in Directory.GetFiles(dir) do
            yield file
        }
////////////////////////////
let dir = "D:\FSharpTesting"
let a = allFiles dir
        |> Seq.filter(matchFunc)
        |> Seq.map(addNumber)
printfn "%A" a
My question:
If I do not write the last line (printfn "%A" a), the files do not change. (If I DO write this line, it works and the files change.)
When I use the debugger, I see that it doesn't really compute the value of 'a' when it reaches the line "let a =......". It continues to the printfn line, and only when it "sees" the 'a' there does it go back and compute the value of 'a'.
Why is that, and how can I "start" the function without printing?
Also, can someone tell me why I have to add file as the return value of the function "addNumber"?
(I added it because that is how it works, but I don't really understand why.)
Last question:
If I write the COUNT variable right after the line of the [<Literal>] definition,
it gives an error and says that a constant cannot be "mutable", but if I add another line before it (like the string, and this is why I did so), it "forgets" the mistake and works.
Why is that? And if you really cannot have a mutable const, how can I make a static variable?
if I do not write the last line (printfn "%A" a) the files will not change.
F# sequences are lazy. So to force evaluation, you can execute some operation not returning a sequence. For example, you can call Seq.iter (have side effects, return ()), Seq.length (return an int which is the length of the sequence) or Seq.toList (return a list, an eager data structure), etc.
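A minimal sketch of that behaviour, using a counter in place of the file writes. The names touch, calls and pending are made up for illustration and are not part of the original program:

```fsharp
// Counts how many times the mapping function actually runs.
let mutable calls = 0

let touch (name : string) =
    calls <- calls + 1   // side effect, standing in for File.WriteAllText
    name

// Building the pipeline runs nothing yet, because seq is lazy.
let pending = [ "a.txt"; "b.txt"; "c.txt" ] |> Seq.map touch
let callsBefore = calls          // still 0 at this point

// Enumerating the sequence forces the side effects.
pending |> Seq.iter ignore
let callsAfter = calls           // 3 after one full enumeration
```

Seq.length or Seq.toList in place of Seq.iter ignore would force the same enumeration.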
Can some one tells me why do I have to add file : string as a return type of the function "addNumber"?
Method and property access don't play nicely with F# type inference. The type checker works from left to right, from top to bottom. When you write file.Contains, it does not yet know the type of file, so it cannot resolve the Contains member. Your type annotation therefore gives the F# type checker the hint it needs.
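As a rough illustration of the left-to-right rule, the sketch below shows one case that needs an annotation and one that does not. It uses EndsWith rather than the question's Contains, and the file names are invented sample data:

```fsharp
// Annotating the parameter tells the checker that 'file' is a string,
// so the method call .EndsWith can be resolved.
let isTxt (file : string) = file.EndsWith(".txt")

// Without the annotation this would not compile, because the type of
// 'file' is unknown at the point where .EndsWith is looked up:
// let isTxtBad file = file.EndsWith(".txt")

// In a pipeline the element type flows in from the left, so the lambda
// needs no annotation.
let txtFiles =
    [ "notes.txt"; "image.png"; "todo.txt" ]
    |> List.filter (fun file -> file.EndsWith(".txt"))
```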
if I write the COUNT variable right after the line of the [<Literal>] definition
it gives an error and says that a constant cannot be "mutable"
Quoting from MSDN:
Values that are intended to be constants can be marked with the Literal attribute. This attribute has the effect of causing a value to be compiled as a constant.
A mutable value can change at some point in your program, so the compiler complains for a good reason. You can simply delete the [<Literal>] attribute.
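It may also help to know that an attribute applies to the binding that immediately follows it, which is why putting another let between [<Literal>] and COUNT made the error go away. A small sketch with made-up names (Suffix, count, label):

```fsharp
// The attribute applies to the very next binding: Suffix is a
// compile-time constant and can never be reassigned.
[<Literal>]
let Suffix = "DONE"

// A mutable value is an ordinary binding without the attribute.
let mutable count = 1

count <- count + 1
let label = sprintf "%d %s" count Suffix
```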
To elaborate on Alex's answer -- F# sequences are lazily evaluated. This means that each element in the sequence is generated "on demand".
The benefit of this is that you don't waste computation time and memory on elements you don't ever need. Lazy evaluation does take a little getting used to though -- specifically because you can't assume order of execution (or that execution will even happen at all).
Your problem has a simple fix: just use Seq.iter to force execution/evaluation of the sequence, and pass the 'ignore' function to it since we don't care about the values returned by the sequence.
let a = allFiles dir
        |> Seq.filter(matchFunc)
        |> Seq.map(addNumber)
        |> Seq.iter ignore // Forces the sequence to execute
Seq.map is intended to map one value to another, not generally to mutate a value. seq<_> represents a lazily generated sequence so, as Alex pointed out, nothing will happen until the sequence is enumerated. This is probably a better fit for codereview, but here's how I would write this:
Directory.EnumerateFiles(dir, "*.txt")
|> Seq.iteri (fun i path ->
    let text = File.ReadAllText(path)
    printfn "%s" text
    let text = sprintf "%s %d\n\n\n DONE" text (i + 1)
    File.WriteAllText(path, text))
Seq.map requires a return type, as do all expressions in F#. If a function performs an action, as opposed to computing a value, it can return unit: (). Regarding COUNT, a value cannot be mutable and [<Literal>] (const in C#). Those are precise opposites. For a static variable, use a module-scoped let mutable binding:
module Counter =
    let mutable count = 1
open Counter
count <- count + 1
But you can avoid global mutable data by making count a function with a counter variable as a part of its private implementation. You can do this with a closure:
let count =
    let i = ref 0
    fun () ->
        incr i
        !i
let one = count()
let two = count()
F# is evaluated from top to bottom, but you are only creating lazy values until you reach printfn. So printfn is actually the first thing that forces execution, which in turn runs the rest of your code. I think you can get the same effect by piping the result of Seq.map(addNumber) into Seq.toList, which will force evaluation as well.
This is the general behaviour of lazy sequences. You get the same in, say, C# with IEnumerable, for which seq is an alias.
In pseudo code :
var lazyseq = "abcdef".Select(a => print a); //does not do anything
var b = lazyseq.ToArray(); //will evaluate the sequence
ToArray triggers the evaluation of the sequence.
This illustrates the fact that a sequence is just a description and does not tell you when it will be enumerated: that is under the control of the consumer of the sequence.
To go a bit further on the subject, you might want to look at this page from F# wikibook:
let isNebraskaCity_bad city =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
    cities.Contains(city)
let isNebraskaCity_good =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
    fun city -> cities.Contains(city)
Most notably, sequences are not cached (although you can make them so). The distinction between the description and the runtime behaviour can therefore have important consequences: the sequence itself is recomputed on every enumeration, which can be very costly and can introduce a quadratic number of operations if each value itself takes linear time to produce!
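A small sketch of the recomputation, and of Seq.cache as one way to avoid it. The counter and the names squares and cached are invented for this example:

```fsharp
let mutable evaluations = 0

// Each element bumps the counter when it is produced.
let squares =
    seq {
        for i in 1 .. 3 do
            evaluations <- evaluations + 1
            yield i * i
    }

// Two enumerations recompute every element: 3 + 3 evaluations.
let total1 = Seq.sum squares
let total2 = Seq.sum squares
let afterUncached = evaluations

// Seq.cache stores elements as they are first produced,
// so the second pass is served from the cache.
let cached = Seq.cache squares
let total3 = Seq.sum cached
let total4 = Seq.sum cached
let afterCached = evaluations
```

Here the uncached sequence runs its body six times for two passes, while the cached one adds only three more evaluations for two further passes.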
Related
I'm trying to implement a custom Arbitrary that generates glob syntax patterns like a*c?. I think my implementation is correct, it's just that, when running the test with Xunit, FsCheck doesn't seem to be using the custom arbitrary Pattern to generate the test data. When I use LINQPad however everything works as expected. Here's the code:
open Xunit
open FsCheck
type Pattern = Pattern of string with
    static member op_Explicit(Pattern s) = s
type MyArbitraries =
    static member Pattern() =
        (['a'..'c']@['?'; '*'])
        |> Gen.elements
        |> Gen.nonEmptyListOf
        |> Gen.map (List.map string >> List.fold (+) "")
        |> Arb.fromGen
        |> Arb.convert Pattern string
Arb.register<MyArbitraries>() |> ignore
[<Fact>]
let test () =
    let prop (Pattern p) = p.Length = 0
    Check.QuickThrowOnFailure prop
This is the output:
Falsifiable, after 2 tests (0 shrinks) (StdGen (1884571966,296370531)): Original: Pattern null with exception: System.NullReferenceException ...
And here is the code I'm running in LINQPad along with the output:
open FsCheck
type Pattern = Pattern of string with
    static member op_Explicit(Pattern s) = s
type MyArbitraries =
    static member Pattern() =
        (['a'..'c']@['?'; '*'])
        |> Gen.elements
        |> Gen.nonEmptyListOf
        |> Gen.map (List.map string >> List.fold (+) "")
        |> Arb.fromGen
        |> Arb.convert Pattern string
Arb.register<MyArbitraries>() |> ignore
let prop (Pattern p) = p.Length = 0
Check.Quick prop
Falsifiable, after 1 test (0 shrinks) (StdGen (1148389153,296370531)): Original: Pattern "a*"
As you can see FsCheck generates a null value for the Pattern in the Xunit test although I'm using Gen.elements and Gen.nonEmptyListOf to control the test data. Also, when I run it a couple times, I'm seeing test patterns that are out of the specified character range. In LINQPad those patterns are generated correctly. I also tested the same with a regular F# console application in Visual Studio 2017 and there the custom Arbitrary works as expected as well.
What is going wrong? Is FsCheck falling back to the default string Arbitrary when running in Xunit?
You can clone this repo to see for yourself: https://github.com/bert2/GlobMatcher
(I don't want to use Prop.forAll, because each test will have multiple custom Arbitrarys and Prop.forAll doesn't go well with that. As far as I know I can only tuple them up, because the F# version of Prop.forAll only accepts a single Arbitrary.)
Don't use Arb.register. This method mutates global state, and due to the built-in parallelism support in xUnit.net 2, it's undetermined when it runs.
If you don't want to use the FsCheck.Xunit Glue Library, you can use Prop.forAll, which works like this:
[<Fact>]
let test () =
    let prop (Pattern p) = p.Length = 0
    Check.QuickThrowOnFailure (Prop.forAll (MyArbitraries.Pattern()) prop)
(I'm writing this partially from memory, so I may have made some small syntax mistakes, but hopefully, this should give you an idea on how to proceed.)
If, on the other hand, you choose to use FsCheck.Xunit, you can register your custom Arbitraries in a Property annotation, like this:
[<Property(Arbitrary = [|typeof<MyArbitraries>|])>]
let test (Pattern p) = p.Length = 0
As you can see, this takes care of much of the boilerplate; you don't even have to call Check.QuickThrowOnFailure.
The Arbitrary property takes an array of types, so when you have more than one, this still works.
If you need to write many properties with the same array of Arbitraries, you can create your own custom attribute that derives from the [<Property>] attribute. Here's an example:
type Letters =
    static member Char() =
        Arb.Default.Char()
        |> Arb.filter (fun c -> 'A' <= c && c <= 'Z')
type DiamondPropertyAttribute() =
    inherit PropertyAttribute(
        Arbitrary = [| typeof<Letters> |],
        QuietOnSuccess = true)
[<DiamondProperty>]
let ``Diamond is non-empty`` (letter : char) =
    let actual = Diamond.make letter
    not (String.IsNullOrWhiteSpace actual)
All that said, I'm not too fond of 'registering' Arbitraries like this. I much prefer using the combinator library, because it's type-safe, which this whole type-based mechanism isn't.
I have some code which I'm expecting to pause when it asks for user input. It only does this however, if the last expression is Seq.initInfinite.
let consoleaction (i : int) =
    Console.WriteLine ("Enter Input: ")
    (Console.ReadLine().Trim(), i)
Seq.initInfinite (fun i -> consoleaction i) |> Seq.map (fun f -> printfn "%A" f)
printfn "foo" // program will not pause unless this line is commented out.
Very new to F# and I've spent way too much time on this already. Would like to know what is going on :)
If you try that piece of code in F# interactive you will see different effects depending on how you execute it.
For instance, if you execute it in one shot, it will create the values but nothing will be executed, because the Seq.initInfinite expression is 'lost': it is not let-bound to anything, and since it is a lazy expression, its side effects never run. If you remove the last instruction, it will start prompting. That's because fsi binds the last expression to 'it', so in order to show you its value it starts evaluating the seq expression.
Things are different if you put this in a function, for example:
open System
let myProgram() =
    let consoleaction ...
Now you will get a warning on the Seq.initInfinite:
warning FS0020: This expression should have type 'unit', but has type
'seq<unit>'. Use 'ignore' to discard the result of the expression, or
'let' to bind the result to a name.
Which is very clear. In addition to using ignore, as the warning suggests, you can change the Seq.map to Seq.iter, since you are not interested in the result of the map, which would be a seq of units.
But now, again, your program will not execute (try myProgram()) unless you remove the last line, the printfn. The reason is that the function returns its last expression, which is not the Seq.initInfinite; that one is lost, since it's lazy and ignored.
If you remove the printfn, the sequence becomes the 'return value' of your function, so it will be evaluated when calling the function.
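The difference between discarding a lazy Seq.map and running Seq.iter can be sketched without console input. The counter and function names below are invented, and a finite Seq.init replaces Seq.initInfinite so the example terminates:

```fsharp
let mutable printed = 0

// Stand-in for the printfn call: just records that it ran.
let show (x : int) = printed <- printed + 1

let lazyVersion () =
    // Seq.map builds a lazy seq<unit>; ignore discards it unevaluated.
    Seq.init 3 id |> Seq.map show |> ignore
    printed

let eagerVersion () =
    // Seq.iter enumerates immediately, so show runs three times.
    Seq.init 3 id |> Seq.iter show
    printed

let fromLazy = lazyVersion ()    // 0: nothing ran
let fromEager = eagerVersion ()  // 3: every element was visited
```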
I need to import a large text file (55MB) (525000 * 25) and manipulate the data and produce some output. As usual I started exploring with f# interactive, and I get some really strange behaviours.
Is this file too large or my code wrong?
My first test was to import the file and simply compute the sum over one column (not the end goal, but a first test):
let calctest =
    let reader = new StreamReader(path)
    let csv = reader.ReadToEnd()
    csv.Split([|'\n'|])
    |> Seq.skip 1
    |> Seq.map (fun line -> line.Split([|','|]))
    |> Seq.filter (fun a -> a.[11] = "M")
    |> Seq.map (fun values -> float(values.[14]))
As expected, this produces a seq of float, both in the type check and in interactive. If I now add:
|> Seq.sum
Type check works and says this function should return a float but if I run it in interactive I get this error:
System.IndexOutOfRangeException: Index was outside the bounds of the array
Then I removed the last line again and thought I look at the seq of float in a text file:
let writetest =
    let str = calctest |> Seq.map (fun i -> i.ToString())
    System.IO.File.WriteAllLines("test.txt", str )
Again, this passes the type check but throws errors in interactive.
Can the standard StreamReader not handle that amount of data, or am I going wrong somewhere? Should I use a different function than StreamReader?
Thanks.
Seq is lazy, which means that the mapping and filtering are only actually performed once you add the Seq.sum. That's why you don't see the error before adding that line. Are you sure you have at least 15 columns on all rows? That's probably the problem.
I would advise you to use the CSV Type Provider instead of just doing a string.Split. That way you can be sure not to hit an accidental IndexOutOfRangeException, and escaped commas will be handled correctly.
Additionally, you're reading the whole CSV file into memory by calling reader.ReadToEnd(). The CsvProvider supports streaming if you set the Cache parameter to false. That's not a problem with a 55MB file, but with something much larger it might be.
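If you want to stay with plain Split for a quick check, one option is to guard the index access with a length test. This is only a sketch with made-up sample rows and column positions, not the type-provider approach recommended above:

```fsharp
// Hypothetical sample data; a real run could stream lines with
// System.IO.File.ReadLines path instead of reading the whole file.
let lines =
    [ "id,flag,amount"      // header row
      "1,M,10.5"
      "2,F,3.0"
      "3,M"                 // short row: no amount column
      "" ]                  // trailing empty line, as Split on '\n' can yield

let flagIndex = 1
let amountIndex = 2

let total =
    lines
    |> Seq.skip 1
    |> Seq.map (fun (line : string) -> line.Split(','))
    // Keep only rows that are long enough before indexing into them.
    |> Seq.filter (fun cols -> cols.Length > amountIndex && cols.[flagIndex] = "M")
    |> Seq.sumBy (fun cols -> float cols.[amountIndex])
```

A trailing empty line or a truncated row is a common cause of the IndexOutOfRangeException, and the length check skips both.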
let x = [for p in db.ParamsActXes do
            if p.NumGroupPar = grp then
                yield p.Num, p.Name]
Here is my sequence. The problem is that it returns a list of tuples, and I can't access a single tuple element like this:
let str = "hello" + x.[1]
and that is the trouble.
How can I achieve this functionality?
To access the second element of a 2-tuple you can either use snd or pattern matching. So if tup is a 2-tuple, where the second element of tup is a string, you can either do:
let str = "hello" + snd tup
or
let (a,b) = tup
let st = "hello" + b
Note that snd only works with 2-tuples, not tuples with more than two elements.
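Applied to a list of tuples like the one in the question, both options might look as follows. The list contents are invented stand-ins for the (Num, Name) pairs from the query:

```fsharp
// Hypothetical stand-in for the query result: (Num, Name) pairs.
let x = [ (1, "alpha"); (2, "beta"); (3, "gamma") ]

// Option 1: snd on the tuple at index 1.
let str1 = "hello " + snd x.[1]

// Option 2: pattern matching on the same element.
let (_, name) = x.[1]
let str2 = "hello " + name

// All names at once, if the first component is not needed.
let names = x |> List.map snd
```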
To give you one more alternative solution, you can just create a filtered sequence containing values of the original type by writing yield p:
let x = [ for p in db.ParamsActXes do
            if p.NumGroupPar = grp then
                yield p ]
And then just access the property you need:
let str = "hello" + x.[1].Name
This is probably a good idea if you're returning only several properties of the p value. The only reason for yielding tuples would be to hide something from the rest of the code or to create sequence that matches some function you use later.
(Also, I would avoid indexing into lists using x.[i], because this is inefficient, but maybe this is just for illustration in the sample you posted. Use an array if you need index-based access, by wrapping the sequence expression in [| ... |].)
Just to throw one more possibility out there, if you are running the .NET 4.0 build of F# 2.0 you can perform a runtime cast from the F# tuple to the .NET 4.0 System.Tuple and then use the ItemX properties of the .NET 4.0 tuples to access the tuple element you need,
let x = (1, 1.2, "hello")
let y = ((box x) :?> System.Tuple<int, float, string>);;
y.Item3 //returns "hello"
However, I would never use that, instead opting for the pattern match extraction. (also, I've read places that the F# compiler may not always choose to represent its tuples as .NET 4.0 tuples, so there may be a possibility that the cast would fail).
Reading your comments in some of the other answers, I'm unsure why the pattern matching solution doesn't work for you. Perhaps you want to access the tuple item at a certain place within an expression? If so, the previous would certainly work:
let str = "hello" + ((box x.[1]) :?> System.Tuple<int,string>).Item2 //though might as well use snd and fst for length 2 F# tuples
but you can achieve the same ends using the pattern matching extraction technique too (again, assuming this is even what you're after):
let str = "hello" + (let (_,name) = x.[1] in name)
You can access an individual tuple from the list of tuples using List.nth.
let first, second = List.nth x 0
first and second represent the individual elements of the tuple.
I'm trying to learn F# by going through some of the Euler problems and I found an issue I haven't been able to figure out. This is my naive solution.
let compute =
    let mutable f = false
    let mutable nr = 0
    while f = false do
        nr <- nr + 20
        f = checkMod nr
    nr
When i do this I get the error message warning FS0020: This expression should have type 'unit', but has type 'bool' on the expression "nr <- nr +20". I've tried rewriting and moving the expressions around and I always get that error on the line below the while statement.
I'm writing this using VS2010 Beta.
Since I can imagine this web page becoming the 'canonical' place to look up information about warning FS0020, here's my quick summary of the three commonest cases in which you get the warning, and how to fix them.
Intentionally discarding the result of a function that is called only for its side-effects:
// you are calling a function for its side-effects, intend to ignore result
let Example1Orig() =
    let sb = new System.Text.StringBuilder()
    sb.Append("hi")       // warning FS0020
    sb.Append(" there")   // warning FS0020
    sb.ToString()
let Example1Fixed() =
    let sb = new System.Text.StringBuilder()
    sb.Append("hi") |> ignore
    sb.Append(" there") |> ignore
    sb.ToString()
Warning is useful, pointing out an error (function has no effects):
// the warning is telling you useful info
// (e.g. function does not have an effect, rather returns a value)
let Example2Orig() =
    let l = [1;2;3]
    List.map (fun x -> x * 2) l    // warning FS0020
    printfn "doubled list is %A" l
let Example2Fixed() =
    let l = [1;2;3]
    let result = List.map (fun x -> x * 2) l
    printfn "doubled list is %A" result
Confusing assignment operator and equality comparison operator:
// '=' versus '<-'
let Example3Orig() =
    let mutable x = 3
    x = x + 1          // warning FS0020
    printfn "%d" x
let Example3Fixed() =
    let mutable x = 3
    x <- x + 1
    printfn "%d" x
The following line:
f = checkMod nr
is an equality check, not an assignment as I believe you are intending. Change it to:
f <- checkMod nr
and all should work fine. I'm not sure why you've used the correct syntax on the previous line and not that line...
Also, the line while f = false do should really be simplified to while not f do; equality checks on booleans are rather convoluted.
As a side note, I feel a need to point out that you are effectively trying to use F# as an imperative language. Mutable variables and while loops are strongly discouraged in functional languages (including F#), especially when a purely functional (and simpler) solution exists, as in this situation. I recommend you read up a bit on programming in the functional style. Of course, just getting to grips with the syntax is a useful thing in itself.
If you're trying to adopt the functional style, try to avoid mutable values.
For example like this:
let nr =
    let rec compute nr =
        if checkMod nr then nr else compute (nr + 20)
    compute 0
while expressions in F# take a little getting used to if you're coming from an imperative language. Each line in a while expression must evaluate to unit (think void from C++/C#). The overall expression then also evaluates to unit.
In the example:
nr <- nr + 20
evaluates to unit whereas
f = checkMod nr
evaluates to a bool as Noldorin noted. This results in a warning message being reported. You can actually turn the warning off if you so desire. Just put the following at the top of your file:
#nowarn "0020"
I've been programming in an imperative style for a long time, so getting used to the functional programming mindset took a while.
In your example, you're trying to find the first multiple of 20 that passes your checkMod test. That's the what part. For the functional how part, I recommend browsing through the methods available to sequences. What you need is the first element of a sequence (multiples of 20) passing your test, like this:
let multi20 = Seq.initInfinite (fun i -> i*20)
let compute = multi20 |> Seq.find checkMod
The first let generates an infinite list of twentyples (I made that one up). The second let finds the first number in said list that passes your test. Your task is to make sure that there actually is a number that will pass the test, but that's of course also true for the imperative code.
If you want to condense the two above lines into one, you can also write
let computeCryptic = Seq.initInfinite ((*) 20) |> Seq.find checkMod
but I find that pulling stunts like that in code can lead to headaches when trying to read it a few weeks later.
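For a runnable comparison, here is a sketch with an assumed checkMod, since the question does not show its definition. This version tests divisibility by 1 to 10 so the search finishes quickly. With a divisibility test, 0 passes trivially, so both searches below start at 20:

```fsharp
// Hypothetical test: true when n is divisible by every number from 1 to 10.
let checkMod n = List.forall (fun d -> n % d = 0) [ 1 .. 10 ]

// Start at 20 rather than 0, because 0 passes any divisibility test.
let multi20 = Seq.initInfinite (fun i -> (i + 1) * 20)

// Sequence-based search: first multiple of 20 that passes the test.
let viaSeq = multi20 |> Seq.find checkMod

// The same search as a tail-recursive function.
let viaRec =
    let rec compute nr =
        if checkMod nr then nr else compute (nr + 20)
    compute 20
```

Both searches return 2520 for this assumed test.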
In the same spirit as Brian's post, here is another way to get warning FS0020: In a nutshell, I accidentally tupled the function arguments.
Being an F# newbie, I had a difficult time debugging the code below, which for the second line (let gdp...) gave the warning FS0020: This expression should have type 'unit', but has type '(string -> ^a -> unit) * string * float'. It turns out that line was not the problem at all; instead, it was the printfn line that was messed up. Removing the comma separators from the argument list fixed it.
for country in wb.Regions.``Arab World``.Countries do
    let gdp = country.Indicators.``GDP per capita (current US$)``.[2010]
    let gdpThous = gdp / 1.0e3
    printfn "%s, %s (%.2f)" country.Name, country.CapitalCity, gdpThous
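The corrected call passes the arguments separated by spaces. A self-contained sketch, where the Country record and its values are placeholders rather than the World Bank provider's types:

```fsharp
// Hypothetical record standing in for the type provider's country type.
type Country = { Name : string; CapitalCity : string; Gdp : float }

let country = { Name = "Exampleland"; CapitalCity = "Sample City"; Gdp = 2500.0 }
let gdpThous = country.Gdp / 1.0e3

// Arguments are separated by spaces, not commas; with commas the
// whole expression becomes a tuple and triggers warning FS0020.
let line = sprintf "%s, %s (%.2f)" country.Name country.CapitalCity gdpThous
printfn "%s" line
```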