Calling the File.ReadAllBytes method in F# - f#

Here is a sample of my code:
let getAllBooks bookDirectory =
seq {yield! Directory.GetFiles(bookDirectory)}
|> Seq.map (fun eachFile -> eachFile.ReadAllBytes)
It fails to work. What is wrong? What should I write instead?

Like you said in the title, the function is called File.ReadAllBytes ;)
It is a static method of the System.IO.File class, and therefore has to be called on the File type, not some instance:
open System.IO
let getAllBooks bookDirectory =
seq {yield! Directory.GetFiles (bookDirectory) }
|> Seq.map (fun eachFile -> File.ReadAllBytes eachFile)
Btw, this code could be improved, since the result of GetFiles is an array, which is also a seq:
open System.IO
let getAllBooks bookDirectory =
Directory.GetFiles bookDirectory
|> Seq.map (fun eachFile -> File.ReadAllBytes eachFile)
Furthermore, you could simply pass the ReadAllBytes function directly to map:
open System.IO
let getAllBooks bookDirectory =
Directory.GetFiles bookDirectory
|> Seq.map File.ReadAllBytes
Finally, you can get rid of the function parameter as well, if you like:
open System.IO
let getAllBooks =
Directory.GetFiles
>> Seq.map File.ReadAllBytes
Update for the comment:
I have an additional question: How may I convert a byte array into its numeric equivalent?
You may use the static methods in the System.BitConverter class, for example System.BitConverter.ToInt32:
open System
let array = [| 0x00uy; 0x10uy; 0x00uy; 0x00uy |]
BitConverter.ToInt32 (array, 0)
|> printfn "Array as Int32: %d"
// Prints: Array as Int32: 4096
Don't overlook the ToString method in there, which can convert a byte array to a hex string:
BitConverter.ToString array
|> printfn "%s"
// Prints: 00-10-00-00
All available conversion methods can be found on MSDN.

Related

Reading text file, iterating over lines to find a match, and return the value with FSharp

I have a text file that contains the following and I need to retrieve the value assigned to taskId, which in this case is AWc34YBAp0N7ZCmVka2u.
projectKey=ProjectName
serverUrl=http://localhost:9090
serverVersion=10.5.32.3
strong text**interfaceUrl=http://localhost:9090/interface?id=ProjectName
taskId=AWc34YBAp0N7ZCmVka2u
taskUrl=http://localhost:9090/api/ce/task?id=AWc34YBAp0N7ZCmVka2u
I have two different ways of reading the file that I've wrote.
let readLines (filePath:string) = seq {
use sr = new StreamReader (filePath)
while not sr.EndOfStream do
yield sr.ReadLine ()
}
readLines (FindFile currentDirectory "../**/sample.txt")
|> Seq.iter (fun line ->
printfn "%s" line
)
and
let readLines (filePath:string) =
(File.ReadAllLines filePath)
readLines (FindFile currentDirectory "../**/sample.txt")
|> Seq.iter (fun line ->
printfn "%s" line
)
At this point, I don't know how to approach getting the value I need. Options that, I think, are on the table are:
use Contains()
Regex
Record type
Active Pattern
How can I get this value returned and fail if it doesn't exist?
I think all the options would be reasonable - it depends on how complex the file will actually be. If there is no escaping then you can probably just look for = in the line and use that to split the line into a key value pair. If the syntax is more complex, this might not always work though.
My preferred method would be to use Split on string - you can then filter to find values with your required key, map to get the value and use Seq.head to get the value:
["foo=bar"]
|> Seq.map (fun line -> line.Split('='))
|> Seq.filter (fun kvp -> kvp.[0] = "foo")
|> Seq.map (fun kvp -> kvp.[1])
|> Seq.head
Using active patterns, you could define a pattern that takes a string and splits it using = into a list:
let (|Split|) (s:string) = s.Split('=') |> List.ofSeq
This then lets you get the value using Seq.pick with a pattern matching that looks for strings where the substring before = is e.g. foo:
["foo=bar"] |> Seq.pick (function
| Split ["foo"; value] -> Some value
| _ -> None)
The active pattern trick is quite neat, but it might be unnecessarily complicating the code if you only need this in one place.

Can I call a function by name in f#?

Is there any way to call a function by name in F#? Given a string, I want to pluck a function value from the global namespace (or, in general, a given module), and call it. I know the type of the function already.
Why would I want to do this? I'm trying to work around fsi not having an --eval option. I have a script file that defines many int->() functions, and I want to execute one of them. Like so:
fsianycpu --use:script_with_many_funcs.fsx --eval "analyzeDataSet 1"
My thought was to write a trampoline script, like:
fsianycpu --use:script_with_many_funcs.fsx trampoline.fsx analyzeDataSet 1
In order to write "trampoline.fsx", I'd need to look up the function by name.
There is no built-in function for this, but you can implement it using .NET reflection. The idea is to search through all types available in the current assembly (this is where the current code is compiled) and dynamically invoke the method with the matching name. If you had this in a module, you'd have to check the type name too.
// Some sample functions that we might want to call
let hello() =
printfn "Hello world"
let bye() =
printfn "Bye"
// Loader script that calls function by name
open System
open System.Reflection
let callFunction name =
let asm = Assembly.GetExecutingAssembly()
for t in asm.GetTypes() do
for m in t.GetMethods() do
if m.IsStatic && m.Name = name then
m.Invoke(null, [||]) |> ignore
// Use the first command line argument (after -- in the fsi call below)
callFunction fsi.CommandLineArgs.[1]
This runs hello world when called by:
fsi --use:C:\temp\test.fsx --exec -- "hello"
You can use reflection to get the functions as MethodInfo's by FSharp function name
open System
open System.Reflection
let rec fsharpName (mi:MemberInfo) =
if mi.DeclaringType.IsNestedPublic then
sprintf "%s.%s" (fsharpName mi.DeclaringType) mi.Name
else
mi.Name
let functionsByName =
Assembly.GetExecutingAssembly().GetTypes()
|> Seq.filter (fun t -> t.IsPublic || t.IsNestedPublic)
|> Seq.collect (fun t -> t.GetMethods(BindingFlags.Static ||| BindingFlags.Public))
|> Seq.filter (fun m -> not m.IsSpecialName)
|> Seq.groupBy (fun m -> fsharpName m)
|> Map.ofSeq
|> Map.map (fun k v -> Seq.exactlyOne v)
You can then invoke the MethodInfo
functionsByName.[fsharpFunctionNameString].Invoke(null, objectArrayOfArguments)
But you probably need to do more work to parse your string arguments using the MethodInfo.GetParameters() types as a hint.
You could also use FSharp.Compiler.Service to make your own fsi.exe with an eval flag
open System
open Microsoft.FSharp.Compiler.Interactive.Shell
open System.Text.RegularExpressions
[<EntryPoint>]
let main(argv) =
let argAll = Array.append [| "C:\\fsi.exe" |] argv
let argFix = argAll |> Array.map (fun a -> if a.StartsWith("--eval:") then "--noninteractive" else a)
let optFind = argv |> Seq.tryFind (fun a -> a.StartsWith "--eval:")
let evalData = if optFind.IsSome then
optFind.Value.Replace("--eval:",String.Empty)
else
String.Empty
let fsiConfig = FsiEvaluationSession.GetDefaultConfiguration()
let fsiSession = FsiEvaluationSession(fsiConfig, argFix, Console.In, Console.Out, Console.Error)
if String.IsNullOrWhiteSpace(evalData) then
fsiSession.Run()
else
fsiSession.EvalInteraction(evalData)
0
If the above was compiled into fsieval.exe it could be used as so
fsieval.exe --load:script_with_many_funcs.fsx --eval:analyzeDataSet` 1

F#: Approach to Trigrams

I have a string "ABCDEFG". I want to convert that into a string array with the contents: [|"ABC"; "BCD"; "CDE"; "DEF"; "EFG"|]
I first thought about using a loop. Then I thought about using a recursive function. Finally, I was wondering if there is a function in the F# spec like Seq.Fold I can use.
Take a look at Seq.windowed, should do what you want.
> "ABCDEFG" |> Seq.windowed 3 |> Seq.map (fun a -> System.String a);;
val it : seq<System.String> = seq ["ABC"; "BCD"; "CDE"; "DEF"; ...]

F# custom type provider: "Container type already set" error

Recently I attended a tutorial by Keith Battochi on type providers, in which he introduced a variant of the MiniCsv type provider in the MSDN tutorial. Unfortunately, my laptop wasn't available, so I had to write down the code by hand as well as I could. I believe I've recreated the type provider, but I'm getting
error FS3033: The type provider 'CsvFileTypeProvider+CsvFileTypeProvider' reported an error: container type for 'CsvFileProvider.Row' was already set to 'CsvFileProvider.CsvFile,Filename="events.csv"
When I look at the code, I can't see how I'm adding the Row type to the container twice (or to some other container). Removing selected lines of the code doesn't help.
Here's how I'm calling the code in fsi:
#r "CsvFileTypeProvider.dll"
open CsvFileProvider
let eventInfos = new CsvFile<"events.csv">() ;;
And here's the code itself:
module CsvFileTypeProvider
open Samples.FSharp.ProvidedTypes
open Microsoft.FSharp.Core.CompilerServices
let getType str =
if System.DateTime.TryParse(str, ref Unchecked.defaultof<_>) then
typeof<System.DateTime>, (fun (str:Quotations.Expr) -> <## System.DateTime.Parse(%%str) ##>)
elif System.Int32.TryParse(str, ref Unchecked.defaultof<_>) then
typeof<System.Int32>, (fun (str:Quotations.Expr) -> <## System.Int32.Parse(%%str) ##>)
elif System.Double.TryParse(str, ref Unchecked.defaultof<_>) then
typeof<System.Double>, (fun (str:Quotations.Expr) -> <## System.Double.Parse(%%str) ##>)
else
typeof<string>, (fun (str:Quotations.Expr) -> <## %%str ##>)
[<TypeProvider>]
type CsvFileTypeProvider() =
inherit TypeProviderForNamespaces()
let asm = typeof<CsvFileTypeProvider>.Assembly
let ns = "CsvFileProvider"
let csvFileProviderType = ProvidedTypeDefinition(asm, ns, "CsvFile", None)
let parameters = [ProvidedStaticParameter("Filename", typeof<string>)]
do csvFileProviderType.DefineStaticParameters(parameters, fun tyName [| :? string as filename |] ->
let rowType = ProvidedTypeDefinition(asm, ns, "Row", Some(typeof<string[]>))
let lines = System.IO.File.ReadLines(filename) |> Seq.map (fun line -> line.Split(','))
let columnNames = lines |> Seq.nth 0
let resultTypes = lines |> Seq.nth 1 |> Array.map getType
for idx in 0 .. (columnNames.Length - 1) do
let col = columnNames.[idx]
let ty, converter = resultTypes.[idx]
let prop = ProvidedProperty(col, ty)
prop.GetterCode <- fun [row] -> converter <## (%%row:string[]).[idx] ##>
rowType.AddMember(prop)
let wholeFileType = ProvidedTypeDefinition(asm, ns, tyName, Some(typedefof<seq<_>>.MakeGenericType(rowType)))
wholeFileType.AddMember(rowType)
let ctor = ProvidedConstructor(parameters = []) // the *type* is parameterized but the *constructor* gets no args
ctor.InvokeCode <- //given the inputs, what will we get as the outputs? Now we want to read the *data*, skip the header
fun [] -> <## System.IO.File.ReadLines(filename) |> Seq.skip 1 |> Seq.map (fun line -> line.Split(',')) ##>
wholeFileType.AddMember(ctor)
wholeFileType
)
do base.AddNamespace(ns, [csvFileProviderType])
[<TypeProviderAssembly>]
do()
Thanks for any help!
you need to use another constructor when defining 'Row' type. Existing ProvidedTypeDefinition type exposes two constructors:
(assembly, namespace, typename, base type) - defines top level type whose container is namespace.
(typename, basetype) - defines nested type that should be added to some another type.
Now Row type is defined using first constructor so it is treated as top level type. Exception is raised when this type is later added to wholeFileType as nested.

How to write a functional file "scanner"

First let me apologize for the scale of this problem but I'm really trying to think functionally and this is one of the more challenging problems I have had to work with.
I wanted to get some suggestions on how I might handle a problem I have in a functional manner, particularly in F#. I am writing a program to go through a list of directories and using a list of regex patterns to filter the list of files retrieved from the directories and using a second list of regex patterns to find matches in the text of the retreived files. I want this thing to return the filename, line index, column index, pattern and matched value for each piece of text that matches a given regex pattern. Also, exceptions need to be recorded and there are 3 possible exceptions scenarios: can't open the directory, can't open the file, reading content from the file failed. The final requirement of this is the the volume of files "scanned" for matches could be very large so this whole thing needs to be lazy. I'm not too worried about a "pure" functional solution as much as I'm interested in a "good" solution that reads well and performs well. One final challenge is to make it interop with C# because I would like to use the winform tools to attach this algorithm to a ui. Here is my first attempt and hopefully this will clarify the problem:
open System.Text.RegularExpressions
open System.IO
type Reader<'t, 'a> = 't -> 'a //=M['a], result varies
let returnM x _ = x
let map f m = fun t -> t |> m |> f
let apply f m = fun t -> t |> m |> (t |> f)
let bind f m = fun t -> t |> (t |> m |> f)
let Scanner dirs =
returnM dirs
|> apply (fun dirExHandler ->
Seq.collect (fun directory ->
try
Directory.GetFiles(directory, "*", SearchOption.AllDirectories)
with | e ->
dirExHandler e directory
Array.empty))
|> map (fun filenames ->
returnM filenames
|> apply (fun (filenamepatterns, lineExHandler, fileExHandler) ->
Seq.filter (fun filename ->
filenamepatterns |> Seq.exists (fun pattern ->
let regex = new Regex(pattern)
regex.IsMatch(filename)))
>> Seq.map (fun filename ->
let fileinfo = new FileInfo(filename)
try
use reader = fileinfo.OpenText()
Seq.unfold (fun ((reader : StreamReader), index) ->
if not reader.EndOfStream then
try
let line = reader.ReadLine()
Some((line, index), (reader, index + 1))
with | e ->
lineExHandler e filename index
None
else
None) (reader, 0)
|> (fun lines -> (filename, lines))
with | e ->
fileExHandler e filename
(filename, Seq.empty))
>> (fun files ->
returnM files
|> apply (fun contentpatterns ->
Seq.collect (fun file ->
let filename, lines = file
lines |>
Seq.collect (fun line ->
let content, index = line
contentpatterns
|> Seq.collect (fun pattern ->
let regex = new Regex(pattern)
regex.Matches(content)
|> (Seq.cast<Match>
>> Seq.map (fun contentmatch ->
(filename,
index,
contentmatch.Index,
pattern,
contentmatch.Value))))))))))
Thanks for any input.
Updated -- here is any updated solution based on feedback I received:
open System.Text.RegularExpressions
open System.IO
type ScannerConfiguration = {
FileNamePatterns : seq<string>
ContentPatterns : seq<string>
FileExceptionHandler : exn -> string -> unit
LineExceptionHandler : exn -> string -> int -> unit
DirectoryExceptionHandler : exn -> string -> unit }
let scanner specifiedDirectories (configuration : ScannerConfiguration) = seq {
let ToCachedRegexList = Seq.map (fun pattern -> new Regex(pattern)) >> Seq.cache
let contentRegexes = configuration.ContentPatterns |> ToCachedRegexList
let filenameRegexes = configuration.FileNamePatterns |> ToCachedRegexList
let getLines exHandler reader =
Seq.unfold (fun ((reader : StreamReader), index) ->
if not reader.EndOfStream then
try
let line = reader.ReadLine()
Some((line, index), (reader, index + 1))
with | e -> exHandler e index; None
else
None) (reader, 0)
for specifiedDirectory in specifiedDirectories do
let files =
try Directory.GetFiles(specifiedDirectory, "*", SearchOption.AllDirectories)
with e -> configuration.DirectoryExceptionHandler e specifiedDirectory; [||]
for file in files do
if filenameRegexes |> Seq.exists (fun (regex : Regex) -> regex.IsMatch(file)) then
let lines =
let fileinfo = new FileInfo(file)
try
use reader = fileinfo.OpenText()
reader |> getLines (fun e index -> configuration.LineExceptionHandler e file index)
with | e -> configuration.FileExceptionHandler e file; Seq.empty
for line in lines do
let content, index = line
for contentregex in contentRegexes do
for mmatch in content |> contentregex.Matches do
yield (file, index, mmatch.Index, contentregex.ToString(), mmatch.Value) }
Again, any input is welcome.
I think that the best approach is to start with the simplest solution and then extend it. Your current approach seems to be quite hard to read to me for two reasons:
The code uses a lot of combinators and function compositions in patterns that are not too common in F#. Some of the processing can be more easily written using sequence expressions.
The code is all written as a single function, but it is fairly complex and would be more readable if it was separated into multiple functions.
I would probably start by splitting the code in a function that tests a single file (say fileMatches) and a function that walks over the files and calls fileMatches. The main iteration can be quite nicely written using F# sequence expressions:
// Checks whether a file name matches a filename pattern
// and a content matches a content pattern.
let fileMatches fileNamePatterns contentPatterns
(fileExHandler, lineExHandler) file =
// TODO: This can be imlemented using
// File.ReadLines which returns a sequence.
// Iterates over all the files and calls 'fileMatches'.
let scanner specifiedDirectories fileNamePatterns contentPatterns
(dirExHandler, fileExHandler, lineExHandler) = seq {
// Iterate over all the specified directories.
for specifiedDir in specifiedDirectories do
// Find all files in the directories (and handle exceptions).
let files =
try Directory.GetFiles(specifiedDir, "*", SearchOption.AllDirectories)
with e -> dirExHandler e specifiedDir; [||]
// Iterate over all files and report those that match.
for file in files do
if fileMatches fileNamePatterns contentPatterns
(fileExHandler, lineExHandler) file then
// Matches! Return this file as part of the result.
yield file }
The function is still quite complicated, because you need to pass a lot of parameters around. Wrapping the parameters in a simple type or a record could be a good idea:
type ScannerArguments =
{ FileNamePatterns:string
ContentPatterns:string
FileExceptionHandler:exn -> string -> unit
LineExceptionHandler:exn -> string -> unit
DirectoryExceptionHandler:exn -> string -> unit }
Then you can define both fileMatches and scanner as functions that take just two parameters, which will make your code a lot more readable. Something like:
// Iterates over all the files and calls 'fileMatches'.
let scanner specifiedDirectories (args:ScannerArguments) = seq {
for specifiedDir in specifiedDirectories do
let files =
try Directory.GetFiles(specifiedDir, "*", SearchOption.AllDirectories)
with e -> args.DirectoryExceptionHandler e specifiedDir; [||]
for file in files do
// No need to propagate all arguments explicitly to other functions.
if fileMatches args file then yield file }

Resources