Subtracting dates with Deedle - f#

I want to get the difference between two dates, ideally adding it back into the frame as a new column. I have tried the code below but get an "Overloaded subtraction operator" error:
let df = Frame.ReadCsv("someDateColumns.csv", hasHeaders=true, separators="|")
let s1:Series<int,DateTime> = df.GetColumn<DateTime>("StartDate")
let s2:Series<int,DateTime> = df.GetColumn<DateTime>("EndDate")
let s3 = s2 - s1

I agree this would be a nice addition to Deedle. If you were thinking of contributing, it should be relatively easy to add - it would be a matter of adding another overload to the list of supported overloads in the Series module, which currently has vector binary operations for various numerical types and strings.
Without changing Deedle, the easiest option is to use the Series.zipInto operation, which aligns two series (no-op if they came from the same data frame) and applies an operation to matching elements:
#r "nuget: Deedle"
open Deedle
open System
let df =
    Frame.ofRecords
        [ {| Start=DateTime(2000,1,1); End=DateTime(2000,1,10) |}
          {| Start=DateTime(2000,12,1); End=DateTime(2001,1,1) |} ]
let s1 = df.GetColumn<DateTime>("Start")
let s2 = df.GetColumn<DateTime>("End")
let s3 = Series.zipInto (-) s2 s1

Is it possible to vectorize this calculation in numpy?

Can the following expression of numpy arrays be vectorized for speed-up?
k_lin1x = [2*k_lin[i]*k_lin[i+1]/(k_lin[i]+k_lin[i+1]) for i in range(len(k_lin)-1)]
Is it possible to vectorize this calculation in numpy?
import numpy as np
x1 = k_lin
s = len(k_lin) - 1
x2 = np.roll(x1, -1)  # shift the elements one position to the left, so x2[i] == x1[i+1]
result1 = x2[:s] + x1[:s]  # the denominator: every neighbouring pair, without the wrapped-around last element
result2 = x2[:s] * x1[:s]  # the numerator (before the factor of 2)
# in one line
result = 2 * x2[:s] * x1[:s] / (x2[:s] + x1[:s])
The last element is not included in the calculation, and np.roll does the shifting for you: x2[0] = x1[1], x2[1] = x1[2], and so on.
This is only a rough demo of the approach; search for numpy roll for the details. Also, instead of slicing x2 with s, you can simply drop its last element, since it is useless for the calculation.
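The same calculation can also be sketched with plain slicing instead of np.roll. This is a minimal sketch, assuming k_lin is a one-dimensional NumPy array in which no neighbouring pair sums to zero; the function name pairwise_harmonic_mean and the sample values are made up for illustration:

```python
import numpy as np

def pairwise_harmonic_mean(k_lin):
    """Harmonic mean of each pair of neighbouring elements (hypothetical helper)."""
    k = np.asarray(k_lin, dtype=float)
    left = k[:-1]   # k[i]   for i = 0 .. n-2
    right = k[1:]   # k[i+1] for i = 0 .. n-2
    return 2.0 * left * right / (left + right)

# Sample data, chosen only for the demonstration
k_lin = np.array([1.0, 2.0, 4.0, 4.0])
k_lin1x = pairwise_harmonic_mean(k_lin)
```

The two slices are views of the same array, so nothing is copied before the arithmetic, and the result has one element fewer than k_lin, just like the list comprehension in the question.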

iOS slow image pixel iterating

I am trying to implement RGB histogram computation for images in Swift (I am new to iOS).
However, the computation time for a 1500x1000 image is about 66 sec, which I consider to be too slow.
Are there any ways to speed up image traversal?
P.S. current code is the following:
func calcHistogram(image: UIImage) {
    let bins: Int = 20;
    let width = Int(image.size.width);
    let height = Int(image.size.height);
    let binStep: Double = Double(bins-1)/255.0
    var hist = Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Int())))
    for i in 0..<bins {
        for j in 0..<bins {
            for k in 0..<bins {
                hist[i][j][k] = 0;
            }
        }
    }
    var pixelData = CGDataProviderCopyData(CGImageGetDataProvider(image.CGImage))
    var data: UnsafePointer<UInt8> = CFDataGetBytePtr(pixelData)
    for x in 0..<width {
        for y in 0..<height {
            var pixelInfo: Int = ((width * y) + x) * 4
            var r = Double(data[pixelInfo])
            var g = Double(data[pixelInfo+1])
            var b = Double(data[pixelInfo+2])
            let r_bin: Int = Int(floor(r*binStep));
            let g_bin: Int = Int(floor(g*binStep));
            let b_bin: Int = Int(floor(b*binStep));
            hist[r_bin][g_bin][b_bin] += 1;
        }
    }
}
As noted in my comment on the question, there are some things you might rethink before you even try to optimize this code.
But even if you do move to a better overall solution like GPU-based histogramming, a library, or both... There are some Swift pitfalls you're falling into here that are good to talk about so you don't run into them elsewhere.
First, this code:
var hist = Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Array(count:bins, repeatedValue:Int())))
for i in 0..<bins {
    for j in 0..<bins {
        for k in 0..<bins {
            hist[i][j][k] = 0;
        }
    }
}
... is initializing every member of your 3D array twice, with the same result. Int() produces a value of zero, so you could leave out the triple for loop. (And possibly change Int() to 0 in your innermost repeatedValue: parameter to make it more readable.)
Second, arrays in Swift are copy-on-write, but this optimization can break down in multidimensional arrays: changing an element of a nested array can cause the entire nested array to be rewritten instead of just the one element. Multiply that by the depth of nested arrays and number of element writes you have going on in a double for loop and... it's not pretty.
Unless there's a reason your bins need to be organized this way, I'd recommend finding a different data structure for them. Three separate arrays? One Int array where index i is red, i + 1 is green, and i + 2 is blue? One array of a custom struct you define that has separate r, g, and b members? See what conceptually fits with your tastes or the rest of your app, and profile to make sure it works well.
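One way to picture a flat layout is the sketch below. It is written in Python purely for illustration rather than Swift, and the layout itself (a single flat list indexed by (r_bin * bins + g_bin) * bins + b_bin) is an assumption, not one of the options listed above. The function name flat_histogram and the integer binning are likewise hypothetical choices. The same index arithmetic should carry over to a single Swift [Int].

```python
def flat_histogram(pixels, bins=20):
    """Joint RGB histogram stored in one flat list (hypothetical layout).

    pixels is an iterable of (r, g, b) tuples with components in 0..255.
    """
    hist = [0] * (bins ** 3)  # one allocation, no nested lists to copy
    for r, g, b in pixels:
        # Integer binning: maps 0..255 onto 0..bins-1 without float rounding
        r_bin = r * (bins - 1) // 255
        g_bin = g * (bins - 1) // 255
        b_bin = b * (bins - 1) // 255
        # Row-major index into the flattened bins x bins x bins cube
        hist[(r_bin * bins + g_bin) * bins + b_bin] += 1
    return hist

# Tiny made-up input: one black pixel and two white pixels
h = flat_histogram([(0, 0, 0), (255, 255, 255), (255, 255, 255)])
```

Each write touches exactly one element of one array, which avoids the nested-array copying described above. Whether it is actually faster in your app is something to confirm by profiling.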
Finally, some Swift style points:
pixelInfo, r, g, and b in your second loop don't change. Use let, not var, and the optimizer will thank you.
Declaring and initializing something like let foo: Int = Int(whatever) is redundant. Some people like having all their variables/constants explicitly typed, but it does make your code a tad less readable and harder to refactor.
Int(floor(x)) is redundant here: converting a Double to Int truncates toward zero, which is the same as taking the floor for the non-negative values in this code.
If you have performance issues in your code, first of all use the Time Profiler from Instruments. You can start it from the Xcode menu Product->Profile; the Instruments app then opens, where you can choose Time Profiler.
Start recording and go through all the interactions in your app.
Stop recording and analyse where the "tightest" place in your code is.
Also check the options "Invert call tree", "Hide missing symbols" and "Hide system libraries" to make the profile results easier to read.
You can also double-click any listed function to view it in code and see the percentage of time spent on each line.

MS Chart Control Range Bar

I am trying to somehow replicate the range bar chart here.
I've found this reference but I don't fully grasp the code.
What I have is a series of tasks (sometimes accomplished in different periods).
let d = [("task1", DateTime.Parse("11/01/2014 08:30"), DateTime.Parse("12/01/2014 10:30"));
         ("task2", DateTime.Parse("15/01/2014 09:30"), DateTime.Parse("16/01/2014 10:30"));
         ("task3", DateTime.Parse("11/01/2014 08:30"), DateTime.Parse("16/01/2014 10:30"))]
let chart = d |> FSharp.Charting.Chart.RangeBar
chart.ShowChart()
I am struggling to understand the logic of the API.
I have also tried:
let chart = new Windows.Forms.DataVisualization.Charting.Chart(Dock = DockStyle.Fill)
let area = new ChartArea("Main")
chart.ChartAreas.Add(area)
let mainForm = new Form(Visible = true, TopMost = true, Width = 700, Height = 500)
mainForm.Controls.Add(chart)
let seriesColumns = new Series("NameOfTheSerie")
seriesColumns.ChartType <- SeriesChartType.RangeBar
type SupportToChart(serieVals: Series) =
    member this.addPointXY(lbl, [<ParamArray>] yVals: Object[]) =
        serieVals.Points.AddXY(lbl, yVals) |> ignore
let supporter = SupportToChart(seriesColumns)
supporter.addPointXY("AAA", DateTime.Parse("11/01/2014 08:30"), DateTime.Parse("12/01/2014 10:30") )
which results in
System.ArgumentOutOfRangeException: You can only set 1 Y values for this data point.
Has something changed in the API since then?
I'm not entirely sure that F# Charting is currently powerful enough to be able to reconstruct the above chart. However, one of the problems seems to be that it treats dates as float values (for some reason) and incorrectly guesses the ranges. You can at least see the chart if you use:
Chart.RangeBar(d)
|> Chart.WithYAxis(Min=41650.0, Max=41660.0)
Please submit this as an issue on GitHub. If you want to dig deeper into how F# Charting works and help us get this fixed, that would be amazing :-)
The trick is initializing the Series with
let serie = new Series("Range", yValues)
where yValues defines the maximum number of Y values per data point (a range bar needs two: the start and the end of the range).

How to add Tuples and apply a ceiling/clamp function in F#

So I am working on a project using F# for some SVG line manipulations.
I thought it would be good to represent an RGB color value as a tuple (R,G,B). It just made sense to me. Since my project involves generating SVG lines in a loop, I decided to have a color offset, conveniently also represented as a tuple (Roffset, Goffset, Boffset).
An offset in this case represents how much each line differs from the previous.
I got to a point where I needed to add the tuples. I thought since they were of the same dimensions and types, it would be fine. But apparently not. I also checked the MSDN on tuples, but I did not find anything about how to add them or combine them.
Here is what I tried. Bear in mind I tried to omit as much irrelevant code as possible since this is a long class definition with LOTS of members.
type lineSet ( 10+ params omitted ,count, colorOff :byte*byte*byte, color :byte*byte*byte ,strokeWid , strokeWidthOff ) =
    member val Color = color with get, set
    member val ColorOffset = colorOff with get, set
    member val lineCount = count with get, set
    interface DrawingInterfaces.IRepresentable_SVG with
        member __.getSVGRepresenation() =
            let mutable currentColor = __.Color
            for i in 1..__.lineCount do
                currentColor <- currentColor + __.ColorOffset
That last line of code is what I wanted to do. However, it appears you cannot add tuples directly.
I also need a way to clamp the result so it cannot go over 255, but I suspect a simple try with block will do the trick. OR I could let the params take a type int*int*int and just use an if to reset it back to 255 each time.
As I mentioned in the comments, the clamping function in your code does not actually work - you need to convert the numbers to integers before doing the addition (and then you can check if the integer is greater than 255). You can do something like this:
let addClamp (a:byte) (b:byte) =
    let r = int a + int b
    if r > 255 then 255uy else byte r
Also, if you work with colors, then it might make sense to define a custom color type rather than passing colors around as tuples. That way, you can also define + on colors (with clamping) and it will make your code simpler (but still, 10 constructor arguments is a bit scary, so I'd try to think if there is a way to simplify that a bit). A color type might look like this:
type Color(r:byte, g:byte, b:byte) =
    static let addClamp (a:byte) (b:byte) =
        let r = int a + int b
        if r > 255 then 255uy else byte r
    member x.R = r
    member x.B = b
    member x.G = g
    static member (+) (c1:Color, c2:Color) =
        Color(addClamp c1.R c2.R, addClamp c1.G c2.G, addClamp c1.B c2.B)
Using the type, you can then add colors pretty easily and do not have to add clamping each time you need to do that. For example:
Color(255uy, 0uy, 0uy) + Color(1uy, 0uy, 0uy)
But I still think you could make the code more readable and more composable by refactoring some of the visual properties (like stroke & color) to a separate type and then just pass that to LineSet. This way you won't have 10+ parameters to a constructor and your code will probably be more flexible too.
Here is a modified version of your code which I think is a bit nicer
let add3DbyteTuples (tuple1:byte*byte*byte , tuple2:byte*byte*byte) =
    let inline intify (a,b,c) = int a,int b,int c
    let inline tripleadd (a,b,c) (d,e,f) = a+d,b+e,c+f
    let clamp a = if a > 255 then 255 else a
    let R,G,B = tripleadd (intify tuple1) (intify tuple2)
    clamp R,clamp G,clamp B

F# lazy pixels reading

I want to lazily load image pixels into a 3-dimensional array of integers.
For example, the simple (non-lazy) way looks like this:
for i = 0 to Width-1 do
    for j = 0 to Height-1 do
        let point = image.GetPixel(i,j)
        pixels.[0,i,j] <- point.R
        pixels.[1,i,j] <- point.G
        pixels.[2,i,j] <- point.B
How can this be made lazy?
What would be slow is the call to GetPixel. If you want to call it only as needed, you could use something like this:
open System.Drawing
let lazyPixels (image:Bitmap) =
    let Width = image.Width
    let Height = image.Height
    let pixels : Lazy<byte>[,,] = Array3D.zeroCreate 3 Width Height
    for i = 0 to Width-1 do
        for j = 0 to Height-1 do
            let point = lazy image.GetPixel(i,j)
            pixels.[0,i,j] <- lazy point.Value.R
            pixels.[1,i,j] <- lazy point.Value.G
            pixels.[2,i,j] <- lazy point.Value.B
    pixels
GetPixel will be called at most once for every pixel, and then reused for the other components.
Another way of approaching this problem would be to do a bulk-load of the entire image. This will be a lot quicker than calling GetPixel over and over again.
open System.Drawing
open System.Drawing.Imaging
let pixels (image:Bitmap) =
    let Width = image.Width
    let Height = image.Height
    let rect = new Rectangle(0,0,Width,Height)
    // Lock the image for access
    let data = image.LockBits(rect, ImageLockMode.ReadOnly, image.PixelFormat)
    // Copy the data
    let ptr = data.Scan0
    let stride = data.Stride
    let bytes = stride * data.Height
    let values : byte[] = Array.zeroCreate bytes
    System.Runtime.InteropServices.Marshal.Copy(ptr,values,0,bytes)
    // Unlock the image
    image.UnlockBits(data)
    let pixelSize = 4 // <-- calculate this from the PixelFormat
    // Create and return a 3D-array with the copied data
    // (note: 32bpp ARGB bitmaps store their bytes in B, G, R, A order, so i = 0 is blue)
    Array3D.init 3 Width Height (fun i x y ->
        values.[stride * y + x * pixelSize + i])
(adapted from the C# sample on Bitmap.LockBits)
What do you mean by lazy?
An array is not a lazy data type, which means that if you want to use arrays, you need to load all pixels during the initialization. If we were using a single-dimensional array, an alternative would be to use seq<_>, which is lazy (but you can access elements only sequentially). There is nothing like seq<_> for multi-dimensional arrays, so you'll need to use something else.
Probably the closest option would be to use a three-dimensional array of lazy values (Lazy<int>[,,]). This is an array of delayed thunks that access pixels and are evaluated only when you actually read the value at the location. You could initialize it like this:
for i = 0 to Width-1 do
    for j = 0 to Height-1 do
        let point = lazy image.GetPixel(i,j)
        pixels.[0,i,j] <- lazy point.Value.R
        pixels.[1,i,j] <- lazy point.Value.G
        pixels.[2,i,j] <- lazy point.Value.B
The snippet creates a lazy value that reads the pixel (point) and then three lazy values to get the individual color components. When a color component is accessed, the point value is evaluated (by accessing Value).
The only difference in the rest of your code is that you'll need to call Value (e.g. pixels.[0,10,10].Value) to get the actual color component of the pixel.
You could define more complex data structures (such as your own type that supports indexing and is lazy), but I think that array of lazy values should be a good starting point.
As the other answers already mention, you can use lazy pixel loading in the 3D array, but that only makes the GetPixel operation lazy and not the memory allocation: the array is already allocated when you call the create method of Array3D.
If you want to make the memory allocation as well as GetPixel lazy, then you can use sequences, as shown in the code below:
let getPixels (bmp:Bitmap) =
    seq {
        for i = 0 to bmp.Height-1 do
            yield seq {
                for j = 0 to bmp.Width-1 do
                    let pixel = bmp.GetPixel(j,i)
                    yield (pixel.R,pixel.G,pixel.B)
            }
    }

Resources