What can cause SystemLimitError when calling timer.tc? - erlang

I am writing a simple module in Elixir that loads data into a gen_server written in Erlang. I need to measure the time of the loading operations, and this is the part I have a problem with. When I call :timer.tc() on line 46 (the one with createStations) I get SystemLimitError. I have no idea what can cause such behaviour and would be grateful for any tips. Below is the code for the Elixir module used; I have worked with the Erlang gen_server before and no such errors occurred.
defmodule Data do
  defp nameStation({:cords, lat, lang}) do
    "station_#{lang}_#{lat}}}"
  end

  defp identifyStations(data) do
    data |> Enum.map(&(&1.location)) |> Enum.uniq |> Enum.map(fn {lat, lang} -> {:cords, lat, lang} end)
  end

  defp createStations(data) do
    identifyStations(data) |>
      Enum.each(fn cords -> :pollution_gen_server.addStation(nameStation(cords), cords) end)
  end

  defp createMesurements(data) do
    data |> Enum.each(fn value ->
      :pollution_gen_server.addValue(value.location, value.datetime, "PM10", value.pollutionLevel) end)
  end

  defp importLinesFromCSV(path) do
    File.read!(path) |> (&(String.split(&1, "\r\n"))).()
  end

  defp parse(line) do
    [date_l, time_l, lang_l, lat_l, polltion_l] = String.split(line, ",")
    lat = lat_l |> Float.parse() |> elem(0)
    lang = lang_l |> Float.parse() |> elem(0)
    pollution = polltion_l |> Integer.parse() |> elem(0)
    {hours, mins} = time_l |> String.split(":") |> Enum.map(&Integer.parse/1) |> Enum.map(&elem(&1, 0)) |>
      :erlang.list_to_tuple()
    date = date_l |> String.split(":") |> Enum.map(&Integer.parse/1) |> Enum.reverse() |> Enum.map(&elem(&1, 0)) |>
      :erlang.list_to_tuple()
    %{
      :datetime => {date, {hours, mins, 0}},
      :location => {lat, lang},
      :pollutionLevel => pollution
    }
  end

  def run(path) do
    :pollution_sup.start_link()
    lines = importLinesFromCSV(path) |> Enum.map(&parse/1)
    time_stations = :timer.tc(&createStations/1, lines) |> elem(0) |> Kernel./(1_000_000)
    time_measurements = :timer.tc(&createMesurements/1, lines) |> elem(0) |> Kernel./(1_000_000)
    time_mean = :timer.tc(&:pollution_gen_server.getStationMean/2, ["PM10", {:cords, 49.986, 20.06}]) |> elem(0) |> Kernel./(1_000_000)
    mean = :pollution_gen_server.getStationMean("PM10", {:cords, 49.986, 20.06})
    time_daily = :timer.tc(&:pollution_gen_server.getDailyMean/2, ["PM10", {2017, 5, 3}]) |> elem(0) |> Kernel./(1_000_000)
    daily = :pollution_gen_server.getDailyMean("PM10", {2017, 5, 3})
    IO.puts "Time of loading stations: #{time_stations}"
    IO.puts "Time of loading mesurements: #{time_measurements}"
    IO.puts "Time of getting mean: #{time_mean} result: #{mean}"
    IO.puts "Time of getting daily: #{time_daily} result: #{daily}"
  end
end

Does the call :pollution_gen_server.addStation(nameStation(cords), ...) create an atom from the name? In that case, you could be overflowing the atom table, which by default has room for 1048576 unique atoms (i.e., just over a million). If you can't rewrite the code, you could try raising the limit with the +t flag when starting the system.
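To check whether that is what is happening, here is a minimal sketch (assuming an OTP 20+ node, where :erlang.system_info/1 can report atom usage; the station name below is made up):
:erlang.system_info(:atom_count)   # atoms currently registered in the table
:erlang.system_info(:atom_limit)   # defaults to 1048576; raise it with +t, e.g. elixir --erl "+t 5000000"
# Turning every freshly built station name into an atom is what would fill the table;
# String.to_existing_atom/1 (or simply keeping the name as a string) avoids creating new atoms.
name = "station_20.06_49.986"
String.to_atom(name)            # adds a new atom each time the name is new
String.to_existing_atom(name)   # succeeds only if the atom already exists, so it never grows the table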

Related

F# sort by indexes

Let's say I have two lists:
let listOfValues = [100..105] //can be list of strings or whatever
let indexesToSortBy = [1;2;0;4;5;3]
Now I need listOfValues_sorted: 102;100;101;105;103;104
It can be done with zip and "conversion" to Tuple:
let listOfValues_sorted =
    listOfValues
    |> Seq.zip indexesToSortBy
    |> Seq.sortBy (fun x -> fst x)
    |> Seq.iter (fun c -> printfn "%i" (snd c))
But I guess there is a better solution for that?
I think your solution is pretty close. I would do this:
let listOfValues_sorted =
    listOfValues
    |> Seq.zip indexesToSortBy
    |> Seq.sortBy fst
    |> Seq.toList
    |> List.unzip
    |> snd
You can collapse fun x -> fst x into simply fst. Then unzip and take whichever of the two lists you want; here that is snd, since the values are the second element of each pair.
If indexesToSortBy is a complete set of indexes, you could simply use:
indexesToSortBy |> List.map (fun x -> listOfValues |> List.item x )
Your example sounds like precisely what the List.permute function is for:
let listOfValues = [100..105]
let indexesToSortBy = [|1;2;0;4;5;3|] // Note 0-based indexes
listOfValues |> List.permute (fun i -> indexesToSortBy.[i])
// Result: [102; 100; 101; 105; 103; 104]
Two things: First, I made indexesToSortBy an array, since I'll be looking up a value inside it N times, and doing that in a list would lead to O(N^2) run time. Second, List.permute expects to be handed a 0-based index into the original list, which your indexesToSortBy already provides. With that, this produces exactly the same ordering as the let listOfValues_sorted = ... example in your question.
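If the indexes arrive as a plain list, a minimal sketch (sortByIndexes is just a hypothetical helper name) is to copy them into an array once so the lookups stay O(1) and then reuse List.permute:
// Hypothetical helper: place values.[i] at position indexes.[i] of the result.
let sortByIndexes (indexes: int list) (values: 'a list) =
    let perm = Array.ofList indexes            // one-time copy for O(1) lookups
    values |> List.permute (fun i -> perm.[i])

// Usage with the data from the question:
let reordered = sortByIndexes [1; 2; 0; 4; 5; 3] [100 .. 105]
// [102; 100; 101; 105; 103; 104]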

http download to disk with fsharp.data.dll and async workflows stalls

The following .fsx file is supposed to download binary tablebase files, which are posted as links on an HTML page on the internet, and save them to disk, using FSharp.Data.dll.
What happens is that the whole thing stalls after a while, way before it is done, without even throwing an exception or the like.
I am pretty sure I am somehow mishandling the CopyToAsync() call in my async workflow. As this is supposed to run while I go for a nap, it would be nice if someone could tell me how it is supposed to be done correctly. (In more general terms: how do I handle a System.Threading.Task in an async workflow?)
#r @"E:\R\playground\DataTypeProviderStuff\packages\FSharp.Data.2.2.3\lib\net40\FSharp.Data.dll"
open FSharp.Data
open Microsoft.FSharp.Control.CommonExtensions

let document = HtmlDocument.Load("http://www.olympuschess.com/egtb/gaviota/")
let links =
    document.Descendants ["a"]
    |> Seq.choose (fun x -> x.TryGetAttribute("href") |> Option.map (fun a -> a.Value()))
    |> Seq.filter (fun v -> v.EndsWith(".cp4"))
    |> List.ofSeq
let targetFolder = @"E:\temp\tablebases\"
let downloadUrls =
    links |> List.map (fun name -> "http://www.olympuschess.com/egtb/gaviota/" + name, targetFolder + name)

let awaitTask = Async.AwaitIAsyncResult >> Async.Ignore

let fetchAndSave (s, t) =
    async {
        printfn "Starting with %s..." s
        let! result = Http.AsyncRequestStream(s)
        use fileStream = new System.IO.FileStream(t, System.IO.FileMode.Create)
        do! awaitTask (result.ResponseStream.CopyToAsync(fileStream))
        printfn "Done with %s." s
    }

let makeBatches n jobs =
    let rec collect i jl acc =
        match i, jl with
        | 0, _ -> acc, jl
        | _, [] -> acc, jl
        | _, x :: xs -> collect (i - 1) xs (acc @ [x])
    let rec loop remaining acc =
        match remaining with
        | [] -> acc
        | x :: xs ->
            let r, rest = collect n remaining []
            loop rest (acc @ [r])
    loop jobs []

let download () =
    downloadUrls
    |> List.map fetchAndSave
    |> makeBatches 2
    |> List.iter (fun l -> l |> Async.Parallel |> Async.RunSynchronously |> ignore)
    |> ignore

download ()
Note I updated the code so that it creates batches of 2 downloads at a time; only the first batch works. I also added the awaitTask from the first answer, as this seems to be the right way to do it.
News What is also funny: if I interrupt the stalled script and then #load it again into the same instance of fsi.exe, it stalls right away. I am starting to think it is a bug in the library I use, or something like that.
Thanks in advance!
Here fetchAndSave has been modified to handle the Task returned from CopyToAsync asynchronously. In your version you are waiting on the Task synchronously. Your script will appear to lock up because you are using Async.RunSynchronously to run the whole workflow; however, the files do download as expected in the background.
let awaitTask = Async.AwaitIAsyncResult >> Async.Ignore

let fetchAndSave (s, t) = async {
    let! result = Http.AsyncRequestStream(s)
    use fileStream = new System.IO.FileStream(t, System.IO.FileMode.Create)
    do! awaitTask (result.ResponseStream.CopyToAsync(fileStream))
}
Of course you also need to call
do download()
on the last line of your script to kick things into motion.
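As an aside, here is a minimal sketch of the same workflow without the AwaitIAsyncResult helper, assuming FSharp.Core 4.0 or later (which adds a non-generic Async.AwaitTask overload for plain Task values; fetchAndSave2 is just a hypothetical name):
let fetchAndSave2 (url, target) =
    async {
        let! result = Http.AsyncRequestStream(url)
        use fileStream = new System.IO.FileStream(target, System.IO.FileMode.Create)
        // Await the Task returned by CopyToAsync directly.
        do! result.ResponseStream.CopyToAsync(fileStream) |> Async.AwaitTask
    }
It drops into the same List.map ... pipeline in place of the original fetchAndSave.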

GroupBy Year then take Pairwise diffs except for the head value then Flatten Using Deedle and F#

I have the following variable:
data:seq<(DateTime*float)>
and I want to do something like the following F# code but using Deedle:
data
|> Seq.groupBy (fun (k, v) -> k.Year)
|> Seq.map (fun (k, v) ->
    let vals = v |> Seq.pairwise
    let first = seq { yield v |> Seq.head }
    let diffs = vals |> Seq.map (fun ((t0, v0), (t1, v1)) -> (t1, v1 - v0))
    (k, diffs |> Seq.append first))
|> Seq.collect snd
This works fine using F# sequences, but I want to do it using Deedle series. I know I can do something like:
(data: Series<DateTime, float>) |> Series.groupBy (fun k v -> k.Year)...
But then I need to take the within-group diffs, except for the head value, which should just be the value itself, and then flatten the results into one series... I am a bit confused by the Deedle syntax.
Thanks!
I think the following might be doing what you need:
ts
|> Series.groupInto
    (fun k _ -> k.Month)
    (fun m s ->
        let first = series [ fst s.KeyRange => s.[fst s.KeyRange] ]
        Series.merge first (Series.diff 1 s))
|> Series.values
|> Series.mergeAll
The groupInto function lets you specify a function that is called on each of the groups.
For each group, we create a series with the differences using Series.diff and append a series with the first value at the beginning using Series.merge.
At the end, we get all the nested series and flatten them using Series.mergeAll.
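A minimal end-to-end sketch of the same idea (assuming Deedle is referenced and ts is a Series<DateTime, float>; the sample values are made up):
open System
open Deedle

// Made-up sample data: one observation per day across a month boundary.
let ts =
    series [ DateTime(2017, 4, 29) => 10.0
             DateTime(2017, 4, 30) => 13.0
             DateTime(2017, 5, 1)  => 20.0
             DateTime(2017, 5, 2)  => 26.0 ]

let result =
    ts
    |> Series.groupInto
        (fun k _ -> k.Month)
        (fun _ s ->
            let first = series [ fst s.KeyRange => s.[fst s.KeyRange] ]
            Series.merge first (Series.diff 1 s))
    |> Series.values
    |> Series.mergeAll
// result keeps the first value of each group as-is and the consecutive differences otherwise:
// 29/04 => 10.0, 30/04 => 3.0, 01/05 => 20.0, 02/05 => 6.0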

How to filter rows using Deedle

In order to get comfortable with Deedle I made up a CSV file that represents a log of video rentals.
RentedOn,Shop,Title
12/dec/2013 00:00:00,East,Rambo
12/dec/2013 00:00:00,West,Rocky
12/dec/2013 00:00:00,West,Rambo
12/dec/2013 00:00:00,East,Rambo
13/dec/2013 00:00:00,East,Rocky
13/dec/2013 00:00:00,East,Rocky
13/dec/2013 00:00:00,East,Rocky
14/dec/2013 00:00:00,West,Rocky 2
I have the following function, which groups the rentals by Shop (East or West):
let overview =
    __SOURCE_DIRECTORY__ + "/rentallog.csv"
    |> Frame.ReadCsv
    |> Frame.groupRowsByString "Shop"
    |> Frame.nest
    |> Series.map (fun dtc df ->
        df.GetSeries<string>("Title")
        |> Series.groupBy (fun k v -> v)
        |> Frame.ofColumns
        |> Frame.countValues)
    |> Frame.ofRows
I'd like to be able to filter the rows by the date in the RentedOn column; however, I'm not sure how to do this. I know it's probably done with the Frame.filterRowValues function, but I'm unsure of the best way to use it. Any guidance on how to filter would be appreciated.
Update based on @jeremyh's advice
let overview rentedOnDate =
    let addRentedDate (f: Frame<_, _>) =
        f.AddSeries ("RentedOnDate", f.GetSeries<DateTime>("RentedOn"))
        f
    __SOURCE_DIRECTORY__ + "/rentallog.csv"
    |> Frame.ReadCsv
    |> addRentedDate
    |> Frame.filterRowValues (fun row -> row.GetAs<DateTime>("RentedOnDate") = rentedOnDate)
    |> Frame.groupRowsByString "Shop"
    |> Frame.nest
    |> Series.map (fun dtc df ->
        df.GetSeries<string>("Title")
        |> Series.groupBy (fun k v -> v)
        |> Frame.ofColumns
        |> Frame.countValues)
    |> Frame.ofRows
Thanks,
Rob
Hey, I think you might get a faster answer if you add an f# tag to your question too.
I used the following link, which has some helpful examples, to answer your question.
This is the solution I came up with. Please note that I added a new column, RentedOnDate, which actually has a DateTime type, and I do the filtering on that.
let overview rentedOnDate =
    let rentalLog =
        __SOURCE_DIRECTORY__ + "/rentallog.csv"
        |> Frame.ReadCsv
    rentalLog
    |> Frame.addSeries "RentedOnDate" (rentalLog.GetSeries<DateTime>("RentedOn"))
    |> Frame.filterRowValues (fun row -> row.GetAs<DateTime>("RentedOnDate") = rentedOnDate)
    |> Frame.groupRowsByString "Shop"
    |> Frame.nest
    |> Series.map (fun dtc df ->
        df.GetSeries<string>("Title")
        |> Series.groupBy (fun k v -> v)
        |> Frame.ofColumns
        |> Frame.countValues)
    |> Frame.ofRows

// Testing
overview (DateTime.Parse "12/dec/2013 00:00:00")
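Frame.filterRowValues takes any predicate over the row, so something looser than an exact match works the same way. A small sketch (filterByDateRange is a hypothetical helper, not part of the original answer) that keeps rentals within a date range:
// Hypothetical variant: keep rows whose RentedOnDate falls within [fromDate, toDate].
let filterByDateRange (fromDate: DateTime) (toDate: DateTime) (frame: Frame<_, string>) =
    frame
    |> Frame.filterRowValues (fun row ->
        let d = row.GetAs<DateTime>("RentedOnDate")
        d >= fromDate && d <= toDate)
It slots into the pipeline right after the RentedOnDate column is added, in place of the equality filter.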

F# Manage multiple lazy sequences from a single method?

I am trying to figure out how to manage multiple lazy sequences from a single function in F#.
For example, in the code below I am trying to get two sequences: one that returns all files in the directories, and one that returns a sequence of tuples of any directories that could not be accessed (for example, due to permissions) paired with the exception.
While the code below compiles and runs, errorSeq never has any elements when used by other code, even though I know that UnauthorizedAccessExceptions have occurred.
I am using F# 2.0.
#light
open System.IO
open System

let rec allFiles errorSeq dir =
    Seq.append
        (try
            dir |> Directory.GetFiles
         with
            e ->
                Seq.append errorSeq [| (dir, e) |] |> ignore
                [||])
        (try
            dir
            |> Directory.GetDirectories
            |> Seq.map (allFiles errorSeq)
            |> Seq.concat
         with
            e ->
                Seq.append errorSeq [| (dir, e) |] |> ignore
                Seq.empty)

[<EntryPoint>]
let main args =
    printfn "Arguments passed to function : %A" args
    let errorSeq = Seq.empty
    allFiles errorSeq args.[0]
    |> Seq.filter (fun x -> (Path.GetExtension x).ToLowerInvariant() = ".jpg")
    |> Seq.iter Console.WriteLine
    errorSeq
    |> Seq.iter (fun x ->
        Console.WriteLine("Error")
        x)
    0
If you wanted to take a more functional approach, here's one way to do it:
let rec allFiles (errorSeq, fileSeq) dir =
let files, errs =
try
Seq.append (dir |> Directory.GetFiles) fileSeq, errorSeq
with
e -> fileSeq, Seq.append [dir,e] errorSeq
let subdirs, errs =
try
dir |> Directory.GetDirectories, errs
with
e -> [||], Seq.append [dir,e] errs
Seq.fold allFiles (errs, files) subdirs
Now we pass the sequence of errors and the sequence of files into the function each time and return new sequences created by appending to them within the function. I think that the imperative approach is a bit easier to follow in this case, though.
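A minimal usage sketch for this version (rootDir is just a placeholder path, and the opens from the question's script are assumed): you seed the fold with two empty sequences and destructure the pair that comes back:
let rootDir = @"C:\some\folder"   // placeholder path
let errors, files = allFiles (Seq.empty, Seq.empty) rootDir

files
|> Seq.filter (fun f -> (Path.GetExtension f).ToLowerInvariant() = ".jpg")
|> Seq.iter Console.WriteLine

errors
|> Seq.iter (fun (dir, e) -> printfn "Could not read %s: %s" dir e.Message)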
Seq.append returns a new sequence, so this
Seq.append errorSeq [|(dir, e)|]
|> ignore
[||]
has no effect. Perhaps you want your function to return a tuple of two sequences? Or use some kind of mutable collection to write errors as you encounter them?
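A minimal sketch of the mutable-collection option (errorLog is a hypothetical name; a ResizeArray is used as the error sink) could look like this:
open System
open System.IO

// Hypothetical error sink: (directory, exception) pairs are appended as they are hit.
let errorLog = ResizeArray<string * exn>()

let rec allFiles dir =
    Seq.append
        (try Directory.GetFiles(dir) :> seq<string>
         with e -> errorLog.Add((dir, e)); Seq.empty)
        (try Directory.GetDirectories(dir) |> Seq.collect allFiles
         with e -> errorLog.Add((dir, e)); Seq.empty)

// After the sequence has been enumerated, errorLog holds every directory that failed.
allFiles @"C:\some\folder" |> Seq.iter Console.WriteLine
errorLog |> Seq.iter (fun (d, e) -> printfn "Error in %s: %s" d e.Message)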
