I have the following code:
let getHtml location =
    let request (url:string) =
        let response = httpRequest (getFullUri url)
        response.Headers.TryFind "Location"
    request location
    |> Option.bind (fun x -> request x)
    |> Option.bind (fun x -> request x) // need the return of httpRequest inside request
I want the code to return the result of the last call to httpRequest, not the return value of request.
Update: I tried the following code, but I get an error on the last snd. I think I could use a mutable variable to implement it, but is that idiomatic F#?
let getHtml location =
    let request (url:string) =
        let response = httpRequest (getFullUri url)
        match response.Headers.TryFind "Location" with
        | Some location -> Some location, response
        | None -> None, response
    request location |> fst
    |> Option.bind (fun x -> request x |> fst)
    |> Option.bind (fun x -> request x |> snd) // Error on snd
Use mutable variable?
let getHtml location =
    let mutable resp : FSharp.Data.HttpResponse = ???
    let request (url:string) =
        let response = httpRequest (getFullUri url)
        resp <- response
        response.Headers.TryFind "Location"
    request location
    |> Option.bind (fun x -> request x)
    |> Option.bind (fun x -> request x)
    if not (resp = null) then Some resp else None
I think what you want to do is actually make getHtml recursive, so that when an HTTP request returns a 201 or a 300-level response code, you follow the Location header to the redirected page and return the correct HTML. You could do that with a simple pattern match on response.StatusCode and the Location header, as follows:
open FSharp.Data
// stub
let getFullUri (url: string) =
    sprintf "%A" <| System.UriBuilder(url)
// stub
let httpRequest = Http.Request
// fetches the requested URL, following redirects as necessary
let rec getHtml location =
    let response = httpRequest (getFullUri location)
    match response.StatusCode, response.Headers |> Map.tryFind "Location" with
    | (status, Some redirectUrl) when status = 201 || (status >= 300 && status < 400) ->
        getHtml redirectUrl
    | _ ->
        response
Is that what you were going for? I tested it with the following URL that returns a 302, and it got the HTML for the page to which it was redirected: https://jigsaw.w3.org/HTTP/300/302.html
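One caveat with a recursive version like this: if two pages redirect to each other, it never terminates. Here is a minimal sketch of a hop limit. FakeResponse, followRedirects, the fetch parameter and the in-memory pages table are all hypothetical names introduced for illustration; fetch stands in for httpRequest (getFullUri url) so the example runs without FSharp.Data.

```fsharp
// Hypothetical minimal response shape, standing in for FSharp.Data's HttpResponse.
type FakeResponse =
    { StatusCode: int
      Headers: Map<string, string>
      Body: string }

/// Follows Location headers until a non-redirect response arrives
/// or the hop budget is used up, then returns the last response seen.
/// `fetch` is an assumed stand-in for `httpRequest (getFullUri url)`.
let rec followRedirects (fetch: string -> FakeResponse) (hopsLeft: int) (url: string) =
    let response = fetch url
    match response.StatusCode, Map.tryFind "Location" response.Headers with
    | status, Some next when hopsLeft > 0 && (status = 201 || (status >= 300 && status < 400)) ->
        followRedirects fetch (hopsLeft - 1) next
    | _ -> response

// Usage with an in-memory table of pages (illustrative data only).
let pages =
    Map.ofList
        [ "/a", { StatusCode = 302; Headers = Map.ofList [ "Location", "/b" ]; Body = "" }
          "/b", { StatusCode = 200; Headers = Map.empty; Body = "<html>b</html>" } ]

let final = followRedirects (fun url -> pages.[url]) 5 "/a"
printfn "%d %s" final.StatusCode final.Body
```

When the budget runs out, the function simply returns the last redirect response; whether that should instead be an error is a design choice left open here.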
Related
I need to request data from several URLs and then use the results.
I am using plain Fable 3 with the Fable-Promise and Fable-Fetch libraries.
I have worked out how to fetch from multiple URLs and combine the results into a single Promise that I can then use to update the UI (the multiple results need to be drawn only once).
But if one of the fetches fails, the whole thing falls over. Ideally I'd like to use tryFetch and then propagate the Result<TermData, None | Exception>, but nothing I do seems to compile.
How exactly do I use tryFetch and then unwrap the result with a second let! in the CE? (The comments explain more)
module App
open Browser.Dom
open App
open System.Collections.Generic
open System.Text.RegularExpressions
open Fetch
open System
type TermData =
    abstract counts : int []
    abstract scores : int []
    abstract term : string
    abstract allWords : bool
type QueryTerm =
    { mutable Term: string
      mutable AllWords: bool }
let loadSingleSeries (term: QueryTerm) =
    promise {
        let url =
            $"/api/plot/{term.Term}?allWords={term.AllWords}"
        // Works but doesn't handle errors.
        let! plotData = fetch url [] // type of plotData: Response
        // let plotDataResult = tryFetch url []
        // This ideally becomes Promise<Result<TermData, None>>
        // let unwrapped = match plotDataResult with
        //                 | Ok res -> Ok (res.json<TermData>()) // type: Promise<TermData>
        //                 | Error err -> ??? // tried Error (Promise.create(fun resolve reject -> resolve None)) among others
        let! result = plotData.json<TermData>() // type of result: TermData
        return result
    }
let dataArrays =
    parsed // type Dictionary<int, TermData>
    |> Seq.map (fun term -> loadSingleSeries term.Value)
    |> Promise.Parallel
    |> Promise.map (fun allData -> console.log(allData))
    // Here we will handle None when we have it
I don't have much Fable experience, but if I understand your question correctly, I think the following should work:
let loadSingleSeries (term: QueryTerm) =
    promise {
        let url =
            $"/api/plot/{term.Term}?allWords={term.AllWords}"
        let! plotDataResult = tryFetch url []
        match plotDataResult with
        | Ok resp ->
            let! termData = resp.json<TermData>()
            return Ok termData
        | Error ex ->
            return Error ex
    }
The idea here is that if you get an error, you simply propagate that error in the new Result value. This returns a Promise<Result<TermData, Exception>>, as you requested.
Update: Fixed return type using a second let!.
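Once every promise resolves to a Result, the array produced by combining them can be split into successes and failures in one place, so the UI is still drawn only once. A minimal sketch in plain F#; partitionResults and the sample data are names made up for this illustration, and the function is assumed to be called on the array that the combined promise yields.

```fsharp
/// Splits an array of results into the successful values and the errors,
/// preserving the original order within each group.
let partitionResults (results: Result<'a, 'e> []) : 'a [] * 'e [] =
    let values =
        results |> Array.choose (function Ok v -> Some v | Error _ -> None)
    let failures =
        results |> Array.choose (function Error e -> Some e | Ok _ -> None)
    values, failures

// Illustrative data only: two successful loads and one failure.
let sample : Result<int, string> [] = [| Ok 1; Error "boom"; Ok 3 |]
let values, failures = partitionResults sample
printfn "%A %A" values failures
```

With something like this, the Promise.map callback could draw from the successful values and report the failures separately.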
I haven't run this code, but looking at the docs, it looks like you need to use Promise.catch:
let loadSingleSeries (term: QueryTerm) =
    promise {
        let url =
            $"/api/plot/{term.Term}?allWords={term.AllWords}"
        let! plotDataResult =
            fetch url []
            |> Promise.map Ok // Wraps the "happy path" in a Result.Ok
            |> Promise.catch (fun err ->
                //handle the error
                Error err)
        return
            match plotDataResult with
            | Ok res -> ...
            | Error err -> ...
    }
I ended up having to use the pipeline rather than CE approach for this as follows:
let loadSingleSeries (term: QueryTerm) =
    let url =
        $"/api/plot/{term.Term}?allWords={term.AllWords}"
    let resultPromise =
        fetch url []
        |> Promise.bind (fun response ->
            let arr = response.json<TermData> ()
            arr)
        |> Promise.map (Ok)
        |> Promise.catch (Error)
    resultPromise
The key was using Promise.bind to turn the promise of a Response into a Promise<TermData>. The map and catch then convert that into a Promise<Result<TermData, exn>>.
Building on a snippet and answer, would it be possible to return results to the caller from the throttling queue? I've tried PostAndAsyncReply to receive reply on a channel but it's throwing an error if I pipe it with Enqueue. Here's the code.
I'd appreciate a solution based on vanilla FSharp.Core, built around the queue or mailbox design patterns.
Question
The goal is to call functions asynchronously under a throttle (at most 3 at a time), passing each item from the array, wait until the whole queue/array has been processed while collecting all the results, and then return the results to the caller. (Returning the results to the caller is the part that is still missing.)
Callee Code
// Message type used by the agent - contains queueing
// of work items and notification of completion
type ThrottlingAgentMessage =
    | Completed
    | Enqueue of Async<unit>
/// Represents an agent that runs operations concurrently. When the number
/// of concurrent operations exceeds 'limit', they are queued and processed later
let throttlingAgent limit =
    MailboxProcessor.Start(fun inbox ->
        async {
            // The agent body is not executing in parallel,
            // so we can safely use mutable queue & counter
            let queue = System.Collections.Generic.Queue<Async<unit>>()
            let running = ref 0
            while true do
                // Enqueue new work items or decrement the counter
                // of how many tasks are running in the background
                let! msg = inbox.Receive()
                match msg with
                | Completed -> decr running
                | Enqueue w -> queue.Enqueue(w)
                // If we have less than limit & there is some work to
                // do, then start the work in the background!
                while running.Value < limit && queue.Count > 0 do
                    let work = queue.Dequeue()
                    incr running
                    do! // When the work completes, send 'Completed'
                        // back to the agent to free a slot
                        async {
                            do! work
                            inbox.Post(Completed)
                        }
                        |> Async.StartChild
                        |> Async.Ignore
        })
let requestDetailAsync (url: string) : Async<Result<string, Error>> =
    async {
        Console.WriteLine ("Simulating request " + url)
        try
            do! Async.Sleep(1000) // let's say each request takes about a second
            return Ok (url + ":body...")
        with :? WebException as e ->
            return Error {Code = "500"; Message = "Internal Server Error"; Status = HttpStatusCode.InternalServerError}
    }
let requestMasterAsync() : Async<Result<System.Collections.Concurrent.ConcurrentBag<_>, Error>> =
    async {
        let urls = [|
            "http://www.example.com/1";
            "http://www.example.com/2";
            "http://www.example.com/3";
            "http://www.example.com/4";
            "http://www.example.com/5";
            "http://www.example.com/6";
            "http://www.example.com/7";
            "http://www.example.com/8";
            "http://www.example.com/9";
            "http://www.example.com/10";
        |]
        let results = System.Collections.Concurrent.ConcurrentBag<_>()
        let agent = throttlingAgent 3
        for url in urls do
            async {
                let! res = requestDetailAsync url
                results.Add res
            }
            |> Enqueue
            |> agent.Post
        return Ok results
    }
Caller Code
[<TestMethod>]
member this.TestRequestMasterAsync() =
    match Entity.requestMasterAsync() |> Async.RunSynchronously with
    | Ok result -> Console.WriteLine result
    | Error error -> Console.WriteLine error
You could use Hopac.Streams for that. With such tool it is pretty trivial:
open Hopac
open Hopac.Stream
open System
let requestDetailAsync url = async {
    Console.WriteLine ("Simulating request " + url)
    try
        do! Async.Sleep(1000) // let's say each request takes about a second
        return Ok (url + ":body...")
    with :? Exception as e ->
        return Error e
}
let requestMasterAsync() : Stream<Result<string,exn>> =
    [| "http://www.example.com/1"
       "http://www.example.com/2"
       "http://www.example.com/3"
       "http://www.example.com/4"
       "http://www.example.com/5"
       "http://www.example.com/6"
       "http://www.example.com/7"
       "http://www.example.com/8"
       "http://www.example.com/9"
       "http://www.example.com/10" |]
    |> Stream.ofSeq
    |> Stream.mapPipelinedJob 3 (requestDetailAsync >> Job.fromAsync)
requestMasterAsync()
|> Stream.iterFun (printfn "%A")
|> queue //prints all results asynchronously
let allResults : Result<string,exn> list =
    requestMasterAsync()
    |> Stream.foldFun (fun results cur -> cur::results ) []
    |> run //fold stream into list synchronously
ADDED
If you want to use only vanilla FSharp.Core with mailboxes, try this:
type ThrottlingAgentMessage =
    | Completed
    | Enqueue of Async<unit>
let inline (>>=) x f = async.Bind(x, f)
let inline (>>-) x f = async.Bind(x, f >> async.Return)
let throttlingAgent limit =
    let agent = MailboxProcessor.Start(fun inbox ->
        let queue = System.Collections.Generic.Queue<Async<unit>>()
        let startWork work =
            work
            >>- fun _ -> inbox.Post Completed
            |> Async.StartChild |> Async.Ignore
        let rec loop curWorkers =
            inbox.Receive()
            >>= function
            | Completed when queue.Count > 0 ->
                queue.Dequeue() |> startWork
                >>= fun _ -> loop curWorkers
            | Completed ->
                loop (curWorkers - 1)
            | Enqueue w when curWorkers < limit ->
                w |> startWork
                >>= fun _ -> loop (curWorkers + 1)
            | Enqueue w ->
                queue.Enqueue w
                loop curWorkers
        loop 0)
    Enqueue >> agent.Post
The logic is much the same, but slightly optimized: if there is free worker capacity, the job is started directly instead of going through the queue.
throttlingAgent is a function of type int -> Async<unit> -> unit, because we don't want the client to deal with our internal ThrottlingAgentMessage type.
Use it like this:
let throttler = throttlingAgent 3
for url in urls do
    async {
        let! res = requestDetailAsync url
        results.Add res
    }
    |> throttler
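The snippets above add each result to a shared collection, but nothing tells the caller when all the work has finished. If a mailbox is not strictly required, a different technique is the maxDegreeOfParallelism argument of Async.Parallel, which as far as I know is available in FSharp.Core 4.7 and later. It is not the agent-based design from the question. The sketch below assumes a simulated requestDetailAsync with a shortened delay and an error type of exn; requestAllThrottled is a name invented for this example.

```fsharp
// Simulated request (assumption: the real one would do actual HTTP work).
let requestDetailAsync (url: string) : Async<Result<string, exn>> =
    async {
        try
            do! Async.Sleep 100 // shortened stand-in for a slow request
            return Ok (url + ":body...")
        with e ->
            return Error e
    }

/// Runs one request per URL with at most `limit` running at a time and
/// returns all results to the caller, in the same order as the input.
let requestAllThrottled (limit: int) (urls: seq<string>) : Async<Result<string, exn> []> =
    let jobs = urls |> Seq.map requestDetailAsync
    Async.Parallel(jobs, maxDegreeOfParallelism = limit)

// Usage: the caller waits for the whole batch and gets every result back.
let urls = [ for i in 1 .. 5 -> sprintf "http://www.example.com/%d" i ]
let allResults = requestAllThrottled 3 urls |> Async.RunSynchronously
allResults |> Array.iter (printfn "%A")
```

Because each request already returns a Result, a failure in one URL shows up as an Error entry rather than cancelling the rest of the batch.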
I've created a website using WebSharper and have stumbled into a problem. I wish to integrate the site with the VSTS REST API. To do that (seamlessly) I need to forward a session cookie. How do I do that in a WebSharper Ajax call? My current implementation of the Ajax call, from before I needed this, looks like this and works just fine for the other needs I've had so far:
let Ajax (request : Request) =
    let httpMethod = request.Method
    let url = request.EndPoint
    let data = request.AsJson
    let success ok =
        System.Action<obj,string,JqXHR>(
            fun res _ _ ->
                let result = (res :?> string |> Json.Parse)
                if JS.HasOwnProperty result "error" then
                    {
                        ErrorType = result?error
                        Reason = result?reason
                    } |> pushError
                else
                    result
                    |> Success
                |> ok
        )
    let contentType = Union<bool,string>.Union2Of2("application/json")
    try
        Async.FromContinuations
        <| fun (ok, ko, _) ->
            let settings = JQuery.AjaxSettings(
                               Url = url,
                               DataType = JQuery.DataType.Text,
                               Type = As<JQuery.RequestType> httpMethod,
                               Success = success ok,
                               ContentType = contentType,
                               Error = System.Action<JqXHR,string,string>(fun jqXHR _ _ ->
                                   let error =
                                       jqXHR?responseText
                                       |> Json.Parse
                                   {
                                       ErrorType = error?error
                                       Reason = error?reason
                                   } |> pushError |> ok
                               )
                           )
            match data with
            Some data ->
                settings.Data <- data
            | None -> ()
            JQuery.Ajax(settings) |> ignore
    with e ->
        async {
            return {
                ErrorType ="uncaught exception";
                Reason = e.Message
            } |> Error
        }
It turns out that the solution is pretty easy. After creating the AjaxSettings object, simply use dynamic typing to add the xhrFields object:
settings?xhrFields <- obj()
settings?xhrFields?withCredentials <- true
I'm doing many async web requests and using Async.Parallel. Something like:
xs
|> Seq.map (fun u -> downloadAsync u.Url)
|> Async.Parallel
|> Async.Catch
Some requests may throw exceptions; I want to log them and continue with the rest of the URLs. I found the Async.Catch function, but this stops the computation when the first exception is thrown. I know I can use a try...with expression within the async expression in order to process the entire list, but I think this implies passing a log function to my downloadAsync function, changing its type. Is there any other way to catch the exceptions, log them and continue with the rest of the URLs?
The 'trick' is to move the catch into the map such that catching is parallelized as well:
open System
open System.IO
open System.Net
type T = { Url : string }
let xs = [
    { Url = "http://microsoft.com" }
    { Url = "thisDoesNotExists" } // throws when constructing Uri, before downloading
    { Url = "https://thisDotNotExist.Either" }
    { Url = "http://google.com" }
]
let isAllowedInFileName c =
    not <| Seq.contains c (Path.GetInvalidFileNameChars())
let downloadAsync url =
    async {
        use client = new WebClient()
        let fn =
            [|
                __SOURCE_DIRECTORY__
                url |> Seq.filter isAllowedInFileName |> String.Concat
            |]
            |> Path.Combine
        printfn "Downloading %s to %s" url fn
        return! client.AsyncDownloadFile(Uri(url), fn)
    }
xs
|> Seq.map (fun u -> downloadAsync u.Url |> Async.Catch)
|> Async.Parallel
|> Async.RunSynchronously
|> Seq.iter (function
    | Choice1Of2 () -> printfn "Succeeded"
    | Choice2Of2 exn -> printfn "Failed with %s" exn.Message)
(*
    Downloading http://microsoft.com to httpmicrosoft.com
    Downloading thisDoesNotExists to thisDoesNotExists
    Downloading http://google.com to httpgoogle.com
    Downloading https://thisDotNotExist.Either to httpsthisDotNotExist.Either
    Succeeded
    Failed with Invalid URI: The format of the URI could not be determined.
    Failed with The remote name could not be resolved: 'thisdotnotexist.either'
    Succeeded
*)
Here I wrapped the download into another async to capture the Uri construction exception.
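If the logging should happen per item without changing the type of downloadAsync, the catch can be folded into a small wrapper that is applied at the call site. A minimal sketch, assuming a hypothetical helper named catchAndLog and a caller-supplied log function; the two sample computations merely simulate a success and a failure.

```fsharp
/// Runs `work`, logging any exception via `log` and turning it into None,
/// so one failure does not stop the other computations.
let catchAndLog (log: string -> unit) (label: string) (work: Async<'T>) : Async<'T option> =
    async {
        let! outcome = Async.Catch work
        match outcome with
        | Choice1Of2 value -> return Some value
        | Choice2Of2 ex ->
            log (sprintf "%s failed: %s" label ex.Message)
            return None
    }

// Illustrative usage: collect log lines in memory instead of printing them.
let logged = ResizeArray<string>()
let outcomes =
    [ "good", async { return 1 }
      "bad", async { return (failwith "boom" : int) } ]
    |> List.map (fun (label, work) -> catchAndLog (fun msg -> logged.Add msg) label work)
    |> Async.Parallel
    |> Async.RunSynchronously
printfn "%A %A" outcomes (List.ofSeq logged)
```

The wrapped function keeps its original signature; only the caller decides how failures are logged. Note that ResizeArray is not thread-safe, so a real logger used from parallel work would need its own synchronization.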
I'm trying to crawl a webpage, and get all the links, and add them to a list<string> which will be returned in the end, from the function.
My code:
let getUrls s : seq<string> =
    let doc = new HtmlDocument() in
    doc.LoadHtml s
    doc.DocumentNode.SelectNodes "//a[@href]"
    |> Seq.map(fun z -> (string z.Attributes.["href"]))
let crawler uri : seq<string> =
    let rec crawl url =
        let web = new WebClient()
        let data = web.DownloadString url
        getUrls data |> Seq.map crawl (* <-- ERROR HERE *)
    crawl uri
The problem is that at the last line in the crawl function (the getUrls seq.map...), it simply throws an error:
Type mismatch. Expecting a string -> 'a but given a string -> seq<'a>
The resulting type would be infinite when unifying ''a' and 'seq<'a>'
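crawl maps itself over the links, so its return type would have to be a seq of its own return type (the infinite type the compiler reports), whereas it is expected to return seq<string>. I think you want something like: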
let crawler uri =
    let rec crawl url =
        seq {
            let web = new WebClient()
            let data = web.DownloadString url
            for url in getUrls data do
                yield url
                yield! crawl url
        }
    crawl uri
Adding a type annotation to crawl should point out the issue.
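A recursive crawl like this can keep going indefinitely when pages link back to each other. A minimal sketch of one way to bound it, using a visited set and a depth limit. The names crawlAll, linksOf and maxDepth are hypothetical; linksOf stands in for downloading a page and calling getUrls on it, and the in-memory site map is illustrative data so the example runs without network access.

```fsharp
open System.Collections.Generic

/// Returns every URL reachable from `start` within `maxDepth` link hops,
/// visiting each URL at most once. `linksOf` is an assumed stand-in for
/// "download the page and extract its links".
let crawlAll (linksOf: string -> seq<string>) (maxDepth: int) (start: string) : string list =
    let visited = HashSet<string>()
    let rec crawl depth url =
        seq {
            // HashSet.Add returns false when the URL was already seen.
            if depth <= maxDepth && visited.Add url then
                yield url
                for next in linksOf url do
                    yield! crawl (depth + 1) next
        }
    // Materialize once, because the visited set makes the sequence single-use.
    crawl 0 start |> Seq.toList

// Illustrative site: "a" and "b" link to each other, "c" has no links.
let site =
    Map.ofList [ "a", [ "b"; "c" ]; "b", [ "a"; "c" ]; "c", [] ]
let found = crawlAll (fun url -> Seq.ofList site.[url]) 10 "a"
printfn "%A" found
```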
I think something like this:
let crawler (uri : seq<string>) =
    let rec crawl url =
        let data = Seq.empty
        getUrls data
        |> Seq.toList
        |> function
            | h :: t ->
                crawl h
                t |> List.iter crawl
            | _-> ()
    crawl uri
In order to fetch links:
open System.Net
open System.IO
open System.Text.RegularExpressions
type Url(x:string)=
    member this.tostring = sprintf "%A" x
    member this.request = System.Net.WebRequest.Create(x)
    member this.response = this.request.GetResponse()
    member this.stream = this.response.GetResponseStream()
    member this.reader = new System.IO.StreamReader(this.stream)
    member this.html = this.reader.ReadToEnd()
let linkex = "href=\s*\"[^\"h]*(http://[^&\"]*)\""
let getLinks (txt:string) = [
    for m in Regex.Matches(txt,linkex)
        -> m.Groups.Item(1).Value
    ]
let collectLinks (url:Url) = url.html |> getLinks