Loop through a string array to match a pattern - f#

I have a log file that I'm trying to parse with Regex.
I create an array of rows from the log file like this:
let loadLog =
    File.ReadAllLines "c:/access.log"
    |> Seq.filter (fun l -> not (l.StartsWith("#")))
    |> Seq.map (fun s -> s.Split())
    |> Seq.map (fun l -> l.[7],1)
    |> Seq.toArray
I then need to loop through this array and match each row against a pattern, as in the code below. But I don't think this will work, because line needs to be a string.
Is there a special way to handle something like this in F#?
type ActorDetails =
    {
        Date: DateTime
        Name: string
        Email: string
    }
for line in loadLog do
    let line queryString =
        match queryString with
        | Regex @"[\?|&]system=([^&]+)" [json] ->
            let jsonValue = JValue.Parse(Uri.UnescapeDataString(json))
            {
                Date = DateTime.UtcNow (* replace with parsed date *)
                Name = jsonValue.Value<JArray>("name").[0].Value<string>()
                Email = jsonValue.Value<JArray>("mbox").[0].Value<string>().[7..]
            }

Use a partial active pattern, (|Regex|_|), to do that:
open System.Text.RegularExpressions
// Partial active pattern: yields the matched text on success, None otherwise
let (|Regex|_|) regexPattern (input: string) =
    let regex = new Regex(regexPattern)
    let regexMatch = regex.Match(input)
    if regexMatch.Success
    then Some regexMatch.Value
    else None
// Takes a single string and matches it against the pattern
let queryString input =
    match input with
    | Regex @"[\?|&]system=([^&]+)" s -> s
    | _ -> sprintf "other: %s" input
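To tie this back to the loop in the question, here is a minimal sketch of applying such a pattern to every row. It assumes the rows are plain strings (so the ,1 tuple from loadLog is dropped), and it uses a variant of the active pattern that returns the capture groups rather than the whole match. The names trySystemValue and rows, and the sample data, are hypothetical.

```fsharp
open System.Text.RegularExpressions

// Variant of the active pattern: returns the capture groups as a list
// (group 0, the whole match, is skipped).
let (|Regex|_|) (pattern: string) (input: string) =
    let m = Regex.Match(input, pattern)
    if m.Success then
        Some [ for i in 1 .. m.Groups.Count - 1 -> m.Groups.[i].Value ]
    else
        None

// Hypothetical helper: the raw value of the "system" query parameter, if present.
let trySystemValue (queryString: string) =
    match queryString with
    | Regex @"[?&]system=([^&]+)" [ value ] -> Some value
    | _ -> None

// Hypothetical sample rows standing in for column 7 of the log file.
let rows = [| "/page?system=abc&x=1"; "/page?x=1"; "/other?y=2&system=def" |]

// Array.choose keeps only the rows where the pattern matched.
let systemValues = rows |> Array.choose trySystemValue
```

Array.choose replaces the explicit for loop here because the goal is a new array of results; rows that do not match are simply skipped.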

Related

F# equivalent of C# operator/symbol "?."

I have the following F# code:
product.code <- productPage.Html
                    .Descendants["li"]
                    .Select(fun node -> node.InnerText())
                    .Where(fun link -> (Regex.Match(link,@"code:").Success))
                    .FirstOrDefault()
                    .Replace("code:", "")
                    .Trim()
I'm having some trouble with nulls.
In C# I would do something like this:
product.code = productPage?.Html
                    ?.Descendants["li"]
                    ?.Select(node => node.InnerText())
                    ?.Where(link => Regex.Match(link,@"code:").Success)
                    ?.FirstOrDefault()
                    ?.Replace("code:", "")
                    ?.Trim() ?? "Not Found"
Is this possible?
In the second example, it looks to me like "?." has to be carried through the whole call chain due to its initial use. Rather than try to recreate this operator and preserve how this looks in C#, I suggest you go for more idiomatic F#. For example:
module String =
    let replace (oldValue: string) (newValue: string) (s: string) =
        s.Replace (oldValue, newValue)
    let trim (s: string) =
        s.Trim()
let result =
    match isNull productPage with
    | true -> None
    | false ->
        productPage.Html.Descendants ["li"]
        |> Seq.map (fun node -> node.InnerText())
        // Seq.tryFind takes a predicate and returns the first match as an option
        |> Seq.tryFind (fun link -> Regex.Match(link, "code:").Success)
let code =
    match result with
    | Some html ->
        html
        |> String.replace "code:" ""
        |> String.trim
    | None -> "Not Found"
product.code <- code
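If the null checks themselves are the main concern, another option is to lift possibly-null values into an option with Option.ofObj and chain the rest with Option.bind and Option.map. The sketch below uses plain strings in place of the HTML nodes, so tryCode and its input are hypothetical stand-ins rather than the code from the question.

```fsharp
// Hypothetical helper: accepts a possibly-null sequence of possibly-null strings.
let tryCode (items: seq<string>) =
    items
    |> Option.ofObj                                   // null sequence -> None
    |> Option.bind (Seq.tryFind (fun (s: string) -> not (isNull s) && s.Contains("code:")))
    |> Option.map (fun s -> s.Replace("code:", "").Trim())
    |> Option.defaultValue "Not Found"                // plays the role of C#'s ??
```

Each step only runs when the previous one produced a value, which is roughly what the chain of "?." operators does in the C# version.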

iterating JArray without for loop in F#

I don't want to use this for loop for iterating the JArray. Is there any other method which can replace this for loop?
let tablesInJson = jsonModel.["tables"] :?> JArray // Converting JObject into JArray
for table in tablesInJson do
    let TableName = table.["name"] :?> JValue
    let columns = table.["columns"] :?> JArray
    for col in columns do
        let name = col.["name"] :?> JValue
        let types = col.["type"] :?> JValue
        let length = col.["length"] :?> JValue
        let Result_ = sqlTableInfos
                      |> List.tryFind (fun s -> s.TableName = TableName.ToString() && s.ColumnName = name.ToString())
        if Result_ = Unchecked.defaultof<_> then
            printfn "is null"
        else
            printfn "not null"
If you want to iterate over a collection and perform an imperative operation, then a for loop is the idiomatic way of doing this in F#, and you should just use that. After all, for is an F# language construct! It exists because it lets you easily write code that iterates over a collection and does something for each element.
There are cases where a for loop is not a good fit, for example if you want to turn a collection of columns into a new collection with information about the tables. Then you could use Seq.map:
let tableInfos = columns |> Seq.map (fun col ->
    let name = col.["name"] :?> JValue
    let types = col.["type"] :?> JValue
    let length = col.["length"] :?> JValue
    let result = sqlTableInfos |> List.tryFind (fun s ->
        s.TableName = TableName.ToString() && s.ColumnName = name.ToString())
    if result = Unchecked.defaultof<_> then None
    else Some result)
This looks like something you might be trying to do, but it is difficult to say. Your question does not say what problem you are actually trying to solve.
Your example with printfn is probably misleading, because if you actually just want to print, then a for loop is the best way of doing that.
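Since List.tryFind already returns an option, one possible refinement is Seq.choose, which keeps only the columns that have a match and avoids the comparison against Unchecked.defaultof. This is a sketch with in-memory data instead of the JArray; the sample values, the fixed table name "users", and the name matched are assumptions for illustration.

```fsharp
type SqlTableInfo = { TableName: string; ColumnName: string }

// Hypothetical data standing in for sqlTableInfos and the JSON column names.
let sqlTableInfos = [ { TableName = "users"; ColumnName = "id" } ]
let columnNames = [ "id"; "email" ]

// List.tryFind returns Some/None, so Seq.choose can drop the misses directly.
let matched =
    columnNames
    |> Seq.choose (fun name ->
        sqlTableInfos
        |> List.tryFind (fun s -> s.TableName = "users" && s.ColumnName = name))
    |> Seq.toList
```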
You can use the Seq module to perform sequence-processing operations over the JArray. In your case, I think I would probably do this for the second for loop (over the columns), but not for the outer loop. The reason being, if you factor the code in the inner-loop out to a function, then you can use pipelining and partial application to clean up the code a bit:
open Newtonsoft.Json
open Newtonsoft.Json.Linq
type SqlTableInfo = {TableName: string; ColumnName: string}
let tablesInJson = JArray()
let sqlTableInfo : SqlTableInfo list = []
// Prints whether the given column of the given table is known
let tryFindColumn (tableName: JValue) (column: JToken) =
    let columnName = column.["name"] |> unbox<JValue>
    if sqlTableInfo |> List.exists (fun s -> s.TableName = tableName.ToString() && s.ColumnName = columnName.ToString())
    then printfn "Table %A, Column %A Found" tableName columnName
    else printfn "Table %A, Column %A Not Found" tableName columnName
for table in tablesInJson do
    let tableName = table.["name"] |> unbox<JValue>
    table.["columns"]
    |> unbox<JArray>
    |> Seq.iter (tryFindColumn tableName)

F# return empty string in case of null

I'm trying to learn some F# by developing a small "web crawler". I have a function declared like this:
let results = HtmlDocument.Load("http://joemonster.org//")
let images =
    results.Descendants ["img"]
    |> Seq.map (fun x ->
        x.TryGetAttribute("src").Value.Value(),
        x.TryGetAttribute("alt").Value.Value()
    )
which should of course return a sequence of the "src" and "alt" attributes of each "img" tag. But when one of those attributes is missing from a tag, I get an exception saying that TryGetAttribute returned null. I want to change the function to return the attribute value, or an empty string when it is missing.
I've tried out answers from this ticket but with no success.
TryGetAttribute returns an option type, and when it is None you can't get its value—you get an exception instead. You can pattern match against the returned option value and return an empty string for the None case:
let getAttrOrEmptyStr (elem: HtmlNode) attr =
    match elem.TryGetAttribute(attr) with
    | Some v -> v.Value()
    | None -> ""
let images =
    results.Descendants ["img"]
    |> Seq.map (fun x -> getAttrOrEmptyStr x "src", getAttrOrEmptyStr x "alt")
Or a version using defaultArg and Option.map:
let getAttrOrEmptyStr (elem: HtmlNode) attr =
    defaultArg (elem.TryGetAttribute(attr) |> Option.map (fun a -> a.Value())) ""
Or another option now that Option.defaultValue exists, and using HtmlAttribute.value function for a terser Option.map call:
let getAttrOrEmptyStr (elem: HtmlNode) attr =
    elem.TryGetAttribute(attr)
    |> Option.map HtmlAttribute.value
    |> Option.defaultValue ""

Convert String to Key Value Pair in F#

Given a string such as
one:1.0|two:2.0|three:3.0
how do we create a dictionary of the form string: float?
open System
open System.Collections.Generic
let ofSeq (src:seq<'a * 'b>) =
    // from fssnip
    let d = new Dictionary<'a, 'b>()
    for (k,v) in src do
        d.Add(k,v)
    d
let msg = "one:1.0|two:2.0|three:3.0"
let msgseq = msg.Split[|'|'|] |> Array.toSeq |> Seq.map (fun i -> i.Split(':'))
let d = ofSeq msgseq // The type ''a * 'b' does not match the type 'string []'
This operation would be inside a tight loop so efficiency would be a plus. Although I'd like to see a simple solution as well just to get my F# bearings.
Thanks.
How about something like this:
let msg = "one:1.0|two:2.0|three:3.0"
let splitKeyVal (str : string) =
    match str.Split(':') with
    |[|key; value|] -> (key, System.Double.Parse(value))
    |_ -> invalidArg "str" "str must have the format key:value"
let createDictionary (str : string) =
    str.Split('|')
    |> Array.map (splitKeyVal)
    |> dict
    |> System.Collections.Generic.Dictionary
You could drop the System.Collections.Generic.Dictionary if you don't mind an IDictionary return type.
If you expect the splitKeyVal function to fail then you'd be better off expressing it as a function that returns option, e.g.:
let splitKeyVal (str : string) =
    match str.Split(':') with
    |[|key; valueStr|] ->
        match System.Double.TryParse(valueStr) with
        |true, value -> Some (key, value)
        |false, _ -> None
    |_ -> None
But then you'd also have to decide how you wanted to handle failure in the createDictionary function.
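As one possible way to handle failure, the sketch below skips malformed pairs with Array.choose. That policy is an assumption, not the only reasonable choice (failing fast or collecting the errors are alternatives), and the name trySplitKeyVal and the use of the invariant culture for parsing are illustrative additions.

```fsharp
open System
open System.Collections.Generic
open System.Globalization

// Returns None for anything that is not "key:float".
let trySplitKeyVal (str: string) =
    match str.Split(':') with
    | [| key; valueStr |] ->
        match Double.TryParse(valueStr, NumberStyles.Float, CultureInfo.InvariantCulture) with
        | true, value -> Some (key, value)
        | false, _ -> None
    | _ -> None

// Assumed policy: silently drop pairs that fail to parse.
let createDictionary (str: string) =
    str.Split('|')
    |> Array.choose trySplitKeyVal
    |> dict
    |> fun d -> Dictionary<string, float>(d)
```

For example, createDictionary "one:1.0|bad|two:2.0" would yield a dictionary with the two entries "one" and "two".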
Not sure about the perf side, but if you're sure of your input and can "afford" a warning, you can go with the following (note that the values stay strings here, so they would still need to be parsed to float):
let d =
    msg.Split '|'
    |> Array.map (fun s -> let [|key; value|] (*warning here*) = s.Split ':' in key, value)
    |> dict
    |> System.Collections.Generic.Dictionary // optional if an IDictionary<string, string> suffices

parse log files with f#

I'm trying to parse data from iis log files.
Each row has a date that I need like this:
u_ex15090503.log:3040:2015-09-05 03:57:45
And a name and email address I need in here:
&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%2C%22mbox%22%3A%5B%22mailto%3AJames.Smith%40student.colled.edu%22%5D%7D&
I start off by getting the correct column like this. This part works fine.
//get the correct column
let getCol =
    let line = fileReader inputFile
    line
    |> Seq.filter (fun line -> not (line.StartsWith("#")))
    |> Seq.map (fun line -> line.Split())
    |> Seq.map (fun line -> line.[7],1)
    |> Seq.toArray
getCol
Now I need to parse the above and get the date, name, and email, but I'm having a hard time figuring out how to do that.
So far I have this, which gives me 2 errors (below):
//split the above column at every "&"
let getDataInCol =
    let line = getCol
    line
    |> Seq.map (fun line -> line.Split('&'))
    |> Seq.map (fun line -> line.[5], 1)
    |> Seq.toArray
getDataInCol
The errors:
Seq.map (fun line -> line.Split('&'))
the field constructor 'Split' is not defined
Seq.map (fun line -> line.[5], 1)
the operator 'expr.[idx]' has been used on an object of indeterminate type based on information prior to this program point.
Maybe I'm going about this all wrong. I'm very new to f# so I apologize for the sloppy code.
Something like this would get the name and email. You'll still need to parse the date.
#r "Newtonsoft.Json.dll"
open System
open System.Text.RegularExpressions
open Newtonsoft.Json.Linq
let (|Regex|_|) pattern input =
    let m = Regex.Match(input, pattern)
    if m.Success then Some(List.tail [ for g in m.Groups -> g.Value ])
    else None
type ActorDetails =
    {
        Date: DateTime
        Name: string
        Email: string
    }
let parseActorDetails queryString =
    match queryString with
    | Regex @"[\?|&]actor=([^&]+)" [json] ->
        let jsonValue = JValue.Parse(Uri.UnescapeDataString(json))
        {
            Date = DateTime.UtcNow (* replace with parsed date *)
            Name = jsonValue.Value<JArray>("name").[0].Value<string>()
            Email = jsonValue.Value<JArray>("mbox").[0].Value<string>().[7..]
        }
    | _ -> invalidArg "queryString" "Invalid format"
parseActorDetails "&actor=%7B%22name%22%3A%5B%22James%2C%20Smith%22%5D%2C%22mbox%22%3A%5B%22mailto%3AJames.Smith%40student.colled.edu%22%5D%7D&"
val it : ActorDetails = {Date = 11/10/2015 9:14:25 PM;
                         Name = "James, Smith";
                         Email = "James.Smith@student.colled.edu";}
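For the date that the code above leaves open, a minimal sketch could look like this. It assumes every row contains a timestamp in the form shown in the question (2015-09-05 03:57:45); the helper name tryParseLogDate is hypothetical, and no time zone is applied to the result.

```fsharp
open System
open System.Globalization
open System.Text.RegularExpressions

// Hypothetical helper: finds the first "yyyy-MM-dd HH:mm:ss" timestamp in a row.
let tryParseLogDate (row: string) =
    let m = Regex.Match(row, @"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
    if m.Success then
        // TryParseExact rejects values such as month 13 that the regex alone would accept.
        match DateTime.TryParseExact(m.Value, "yyyy-MM-dd HH:mm:ss",
                                     CultureInfo.InvariantCulture, DateTimeStyles.None) with
        | true, date -> Some date
        | false, _ -> None
    else
        None
```

The result is an option, so the caller can decide what to do with rows that have no usable date, for instance tryParseLogDate "u_ex15090503.log:3040:2015-09-05 03:57:45" gives Some date while a row without a timestamp gives None.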

Resources