fscheck doesn't generate random enough data - f#

I'm playing with FsCheck so I have this implementation:
let add a b =
    if a > 100
    then failwith "nasty bug"
    else a + b
...and this FsCheck based test:
fun (a:int) -> (add a 0) = a
|> Check.QuickThrowOnFailure
and the test never fails. My guess is that the 100 values produced by the random generator are never bigger than 100.
Shouldn't the values be more "random"?

When you use Check.QuickThrowOnFailure, it uses the configuration Config.QuickThrowOnFailure, which has these values:
> Config.QuickThrowOnFailure;;
val it : Config =
{MaxTest = 100;
MaxFail = 1000;
Replay = null;
Name = "";
StartSize = 1;
EndSize = 100;
QuietOnSuccess = false;
Every = <fun:get_Quick#342>;
EveryShrink = <fun:get_Quick#343-1>;
Arbitrary = [];
Runner = <StartupCode$FsCheck>.$Runner+get_throwingRunner#355;}
The important values here are StartSize and, in particular, EndSize. Some of the generators in FsCheck use the size context to determine the size or range of the values they generate.
If you change the EndSize to e.g. 1,000 you can make your test fail:
> Check.One({Config.QuickThrowOnFailure with EndSize = 1000}, fun (a:int) -> (add a 0) = a);;
System.Exception: Falsifiable, after 15 tests (0 shrinks) (StdGen (1912816373,296229213)):
Original:
101
with exception:
> System.Exception: nasty bug
at FSI_0040.add(Int32 a, Int32 b)
at FSI_0055.it#69-6.Invoke(Int32 a)
at FsCheck.Testable.evaluate[a,b](FSharpFunc`2 body, a a) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Testable.fs:line 161
at <StartupCode$FsCheck>.$Runner.get_throwingRunner#365-1.Invoke(String message) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Runner.fs:line 365
at <StartupCode$FsCheck>.$Runner.get_throwingRunner#355.FsCheck-IRunner-OnFinished(String , TestResult ) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Runner.fs:line 365
at FsCheck.Runner.check[a](Config config, a p) in C:\Users\Kurt\Projects\FsCheck\FsCheck\src\FsCheck\Runner.fs:line 275
at <StartupCode$FSI_0055>.$FSI_0055.main#()
Stopped due to error

Related

Having trouble converting some C# code to F# when having to do counts

I have the following C# code:
(no need to understand the details of it, it's just to illustrate the question)
long VolumeBeforePrice = 0;
long Volume = 0;
var ContractsCount = 0.0;
var VolumeRequested = Candle.ConvertVolumes(MinVolume);
// go through all entries
foreach (var B in Entries)
{
    // can we add the whole block?
    if (Volume + B.VolumeUSD <= VolumeRequested)
    {
        // yes, add the block and calculate the number of contracts
        Volume += B.VolumeUSD;
        ContractsCount += B.VolumeUSD / B.PriceUSD;
    }
    else
    {
        // no, we need to do a partial count
        var Difference = VolumeRequested - Volume;
        ContractsCount += Difference / B.PriceUSD;
        Volume = VolumeRequested; // we reached the max
    }
    VolumeBeforePrice += B.VolumeUSD;
    if (Volume >= VolumeRequested) break;
}
It goes through the entries of a trading order book and calculates the number of contracts available for a specific USD amount.
The logic is quite simple: each entry is a block of contracts at a given price, so the loop either adds the whole block or, if the block doesn't fit within the request, adds part of it.
I am trying to move this to F# and I am running into some problems since I'm new to the language.
This is a partial implementation:
let mutable volume = 0L
let mutable volumeBeforePrice = 0L
let mutable contractsCount = 0.0
entries |> List.iter (fun e ->
    if volume + e.VolumeUSD <= volumeRequested then
        volume <- volume + e.VolumeUSD;
        contractsCount <- contractsCount + float(e.VolumeUSD) / e.PriceUSD
    else
        let difference = volumeToTrade - volume
        contractsCount <- contractsCount + difference / B.PriceUSD
        volume = volumeRequested // this is supposed to trigger an exit on the test below, in C#
)
And I stopped there because it doesn't look like a very F# way to do this :)
So, my question is: how can I structure the List.iter so that:
- I can use counters from one iteration to the next? like sums and average passed to the next iteration
- I can exit the loop when I reached a specific condition and skip the last elements?
I'd avoid using mutable and use a pure function instead. For example, you could define a record for your result, e.g. Totals (you may have a more meaningful name):
type Totals =
    { VolumeBeforePrice : int64
      Volume : int64
      ContractsCount : float }
And then you can create a function that takes the current totals and an entry as input, and returns a new totals as its result. I've annotated the function below with types for clarity, but these could be removed as they'd be inferred:
let addEntry (volumeRequested:int64) (totals:Totals) (entry:Entry) : Totals =
    if totals.Volume >= volumeRequested then
        totals
    elif totals.Volume + entry.VolumeUSD <= volumeRequested then
        { Volume = totals.Volume + entry.VolumeUSD
          ContractsCount = totals.ContractsCount + float entry.VolumeUSD / entry.PriceUSD
          VolumeBeforePrice = totals.VolumeBeforePrice + entry.VolumeUSD }
    else
        let diff = volumeRequested - totals.Volume
        { Volume = volumeRequested
          ContractsCount = totals.ContractsCount + float diff / entry.PriceUSD
          VolumeBeforePrice = totals.VolumeBeforePrice + entry.VolumeUSD }
Now you can iterate over the list, passing in the latest totals each time. Fortunately, there's a built-in function, List.fold, that does this. You can read more about folds on F# for fun and profit.
let volumeRequested = Candle.ConvertVolumes(minVolume)
let zero =
    { VolumeBeforePrice = 0L
      Volume = 0L
      ContractsCount = 0. }
let result = entries |> List.fold (addEntry volumeRequested) zero
Note that this will give you the correct result, but it does always iterate all entries. Whether this is acceptable likely depends on the size of the entries list. If you want to avoid this, you'd need to use recursion. Something like this:
let rec calculateTotals (volumeRequested:int64) (totals:Totals) (entries:Entry list) : Totals =
    if totals.Volume >= volumeRequested then
        totals
    else
        match entries with
        | [] -> totals
        | entry::remaining ->
            let newTotals =
                if totals.Volume + entry.VolumeUSD <= volumeRequested then
                    { Volume = totals.Volume + entry.VolumeUSD
                      ContractsCount = totals.ContractsCount + float entry.VolumeUSD / entry.PriceUSD
                      VolumeBeforePrice = totals.VolumeBeforePrice + entry.VolumeUSD }
                else
                    let diff = volumeRequested - totals.Volume
                    { Volume = volumeRequested
                      ContractsCount = totals.ContractsCount + float diff / entry.PriceUSD
                      VolumeBeforePrice = totals.VolumeBeforePrice + entry.VolumeUSD }
            calculateTotals volumeRequested newTotals remaining
let result = calculateTotals volumeRequested zero entries
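If you would rather not write the recursion by hand, a lazy sequence gives the same early exit. The following is only a sketch under a few assumptions: the Entry record is a hypothetical shape inferred from the fields used in the question, the step function merges the "whole block" and "partial block" branches with min (which assumes volumes are never negative), and calculateTotalsLazy is a name made up for this example. Seq.scan yields the running totals one at a time, and Seq.tryFind stops pulling entries as soon as the request is filled.

```fsharp
// Hypothetical Entry shape, inferred from the fields used in the question.
type Entry = { VolumeUSD : int64; PriceUSD : float }

type Totals =
    { VolumeBeforePrice : int64
      Volume : int64
      ContractsCount : float }

let zero = { VolumeBeforePrice = 0L; Volume = 0L; ContractsCount = 0.0 }

// Fold one entry into the running totals, taking only as much volume as is
// still needed to fill the request (assumes non-negative volumes).
let step (volumeRequested : int64) (totals : Totals) (entry : Entry) : Totals =
    let taken = min entry.VolumeUSD (volumeRequested - totals.Volume)
    { Volume = totals.Volume + taken
      ContractsCount = totals.ContractsCount + float taken / entry.PriceUSD
      VolumeBeforePrice = totals.VolumeBeforePrice + entry.VolumeUSD }

let calculateTotalsLazy (volumeRequested : int64) (entries : Entry list) : Totals =
    // Lazy sequence of running totals, starting with zero.
    let running = entries |> Seq.scan (step volumeRequested) zero
    match running |> Seq.tryFind (fun t -> t.Volume >= volumeRequested) with
    | Some filled -> filled          // request filled: later entries are never visited
    | None -> Seq.last running       // book too small: totals over all entries

// Made-up sample data for illustration.
let book =
    [ { VolumeUSD = 100L; PriceUSD = 10.0 }
      { VolumeUSD = 200L; PriceUSD = 20.0 }
      { VolumeUSD = 300L; PriceUSD = 30.0 } ]

let result = calculateTotalsLazy 250L book
printfn "%A" result
```

With this sample book, a request of 250 is filled part-way through the second entry (100 / 10.0 + 150 / 20.0 = 17.5 contracts), so the third entry is never evaluated. When the book cannot fill the request, the sequence is enumerated a second time by Seq.last; that is harmless because step is pure, but the recursive version above avoids the extra pass.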

libsvm-java throws NullPointerException after a few iteration of training

I am using libsvm java package for a sentence classification task. I have 3 classes. Every sentence is represented as a vector of size 435. The format of vector_file is as follows:
1 0 0.12 0 0.5 0.24 0.32 0 0 0 ... 0.43 0
The first digit indicates the class label and the remainder is the vector.
The following is how I am making the svm_problem:
public void makeSvmProb(ArrayList<Float> inputVector,float label,int p){
    // p is 0 to 77 (total training sentences)
    int idx=0,count=0;
    svm_prob.y[p]=label;
    for(int i=0;i<inputVector.size();i++){
        if(inputVector.get(i)!=0) {
            count++; // To get the count of non-zero values
        }
    }
    svm_node[] x = new svm_node[count];
    for(int i=0;i<inputVector.size();i++){
        if(inputVector.get(i)!=0){
            x[idx] = new svm_node();
            x[idx].index = i;
            x[idx].value = inputVector.get(i);
            idx++;
        }
    }
    svm_prob.x[p]=x;
}
Parameter settings:
param.svm_type = svm_parameter.C_SVC;
param.kernel_type = svm_parameter.RBF;
param.degree = 3;
param.gamma = 0.5;
param.coef0 = 0;
param.nu = 0.5;
param.cache_size = 40;
param.C = 1;
param.eps = 1e-3;
param.p = 0.1;
param.shrinking = 1;
param.probability = 0;
param.nr_weight = 0;
param.weight_label = new int[0];
param.weight = new double[0];
When I run the program, I get a NullPointerException after 2 iterations. I can't figure out what is going wrong.
This is the error:
optimization finished, #iter = 85
nu = 0.07502654779820772
obj = -15.305162227093849, rho = -0.03157808477381625
nSV = 47, nBSV = 1
*
optimization finished, #iter = 88
nu = 0.08576821199868506
obj = -17.83925196551639, rho = 0.1297986754900152
nSV = 51, nBSV = 3
Exception in thread "main" java.lang.NullPointerException
at libsvm.Kernel.dot(svm.java:207)
at libsvm.Kernel.<init>(svm.java:199)
at libsvm.SVC_Q.<init>(svm.java:1156)
at libsvm.svm.solve_c_svc(svm.java:1333)
at libsvm.svm.svm_train_one(svm.java:1510)
at libsvm.svm.svm_train(svm.java:2067)
at SvmOp.<init>(SvmOp.java:130)
at Main.main(Main.java:8)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Any idea on what is going wrong?
According to the stack trace, the NullPointerException is thrown on line 207 of svm.java. Looking at the source code shows:
static double dot(svm_node[] x, svm_node[] y)
{
    double sum = 0;
    int xlen = x.length;
    ...
}
Line 207 is int xlen = x.length;, so one of the svm_node[] arrays (vectors) passed to dot is null.
We cannot pinpoint the cause without more information or source code, but I would go for the following strategy:
- Inspect the svm_node objects in a debugger after you have finished building the svm_problem, and look for null values.
- Check how you build your svm_problem. The problem might be there.
Another possibility would be to change your data format to comply with the official LIBSVM format.
As stated in the documentation, the data uses a sparse format and should look like this:
<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)
The ascending integer refers to the attribute or feature id, which is necessary for internal representation of the vector.
I previously replied to a similar question here and added an example for the data format.
This format can be read out of the box as the code to construct the svm_problem is included in the library.

F# Function where x is divisible by 2 or 3 but not 5

I have a function that determines whether a value is divisible by 2 or 3, but **NOT** 5:
let ttnf x =
    if (x % 2 = 0) || (x % 3 = 0) && not(x % 5 = 0) then true
    else
        false
I'm getting a weird response from Visual Studio 2015 in the interactive panel.
I execute the above code in the F# interactive panel then enter say...
ttnf 15
Hit enter, nothing...
hit alt + enter then it returns it on the second time.
Any idea why it isn't returning true/false from entering:
ttnf 15
The first time?
Thanks.
@ildjarn commented about the error in your code (&& binds more tightly than ||, so the "2 or 3" part needs its own parentheses). As for F# Interactive's behavior: when you type code directly into fsi, you need to terminate each declaration with ;; to tell fsi to interpret it; otherwise it just waits for you to continue your input (as you experienced). So:
> let ttnf x =
    if (x % 2 = 0 || x % 3 = 0) && not(x % 5 = 0) then true
    else
        false;;
val ttnf : x:int -> bool
> ttnf 15;;
val it : bool = false
>
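To see what the extra parentheses change, here is a small sketch that compares the predicate as written in the question with the parenthesized one from the session above. The names ttnfOriginal and ttnfFixed are made up for this example. Because && has higher precedence than || in F#, the original reads as "divisible by 2, or (divisible by 3 and not by 5)", so the two versions disagree exactly on numbers divisible by both 2 and 5.

```fsharp
// The predicate as written in the question: && is applied before ||.
let ttnfOriginal x = (x % 2 = 0) || (x % 3 = 0) && not (x % 5 = 0)

// The intended predicate: (divisible by 2 or 3) and not divisible by 5.
let ttnfFixed x = (x % 2 = 0 || x % 3 = 0) && x % 5 <> 0

// Inputs in 1..30 on which the two versions give different answers.
let disagreements =
    [ 1 .. 30 ] |> List.filter (fun x -> ttnfOriginal x <> ttnfFixed x)

printfn "%A" disagreements   // prints [10; 20; 30]
```

Note that 15 is not in that list: both versions return false for it, so testing with 15 alone would not have revealed the precedence problem. The if ... then true else false wrapper is also unnecessary, since the condition is already a bool.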

Why fsharp autogenerated gethashcode generates too many collisions?

In our F# code, the autogenerated GetHashCode implementation shows very bad performance and a high collision rate. Is this a problem in the F# GetHashCode generator, or just an edge case?
open System
open System.Collections.Generic
let check keys e name =
    let dict = new Dictionary<_,_>(Array.length keys, e)//, HashIdentity.Structural)
    let stopWatch = System.Diagnostics.Stopwatch.StartNew()
    let add k = dict.Add(k, 1.02)
    Array.iter add keys
    stopWatch.Stop()
    let hsahes = new HashSet<int>()
    let add_hash x = hsahes.Add(e.GetHashCode(x)) |> not
    let collisions = Array.filter add_hash keys |> Array.length
    printfn "%s %f sec %f collisions" name stopWatch.Elapsed.TotalSeconds (double(collisions) / double(keys.Length))
type StructTuple<'T,'T2> =
    struct
        val fst: 'T
        val snd : 'T2
        new(fst: 'T, snd : 'T2) = {fst = fst; snd = snd}
    end
let bad_keys = seq{
    let rnd = new Random();
    while true do
        let j = uint32(rnd.Next(0, 3346862))
        let k = uint16 (rnd.Next(0, 658))
        yield StructTuple(j,k)
}
let good_keys = seq{
    for k in 0us..658us do
        for j in 0u.. 3346862u do
            yield StructTuple(j,k)
}
module CmpHelpers =
    let inline combine (h1:int) (h2:int) = (h1 <<< 5) + h1 ^^^ h2;
type StructTupleComparer<'T,'T2>() =
    let cmparer = EqualityComparer<Object>.Default
    interface IEqualityComparer<StructTuple<'T,'T2>> with
        member this.Equals (a,b) = cmparer.Equals(a.fst, b.fst) && cmparer.Equals(a.snd, b.snd)
        member this.GetHashCode (x) = CmpHelpers.combine (cmparer.GetHashCode(x.fst)) (cmparer.GetHashCode(x.snd))
type AutoGeneratedStructTupleComparer<'T,'T2>() =
    let cmparer = LanguagePrimitives.GenericEqualityComparer
    interface IEqualityComparer<StructTuple<'T,'T2>> with
        member this.Equals (a:StructTuple<'T,'T2>,b:StructTuple<'T,'T2>) =
            LanguagePrimitives.HashCompare.GenericEqualityERIntrinsic<'T> a.fst b.fst
            && LanguagePrimitives.HashCompare.GenericEqualityERIntrinsic<'T2> a.snd b.snd
        member this.GetHashCode (x:StructTuple<'T,'T2>) =
            let mutable num = 0
            num <- -1640531527 + (LanguagePrimitives.HashCompare.GenericHashWithComparerIntrinsic<'T2> cmparer x.snd + ((num <<< 6) + (num >>> 2)))
            -1640531527 + (LanguagePrimitives.HashCompare.GenericHashWithComparerIntrinsic<'T> cmparer x.fst + ((num <<< 6) + (num >>> 2)));
let uniq (sq:seq<'a>) = Array.ofSeq (new HashSet<_>(sq))
[<EntryPoint>]
let main argv =
    let count = 15000000
    let keys = good_keys |> Seq.take count |> uniq
    printfn "good keys"
    check keys (new StructTupleComparer<_,_>()) "struct custom"
    check keys HashIdentity.Structural "struct auto"
    check keys (new AutoGeneratedStructTupleComparer<_,_>()) "struct auto explicit"
    let keys = bad_keys |> Seq.take count |> uniq
    printfn "bad keys"
    check keys (new StructTupleComparer<_,_>()) "struct custom"
    check keys HashIdentity.Structural "struct auto"
    check keys (new AutoGeneratedStructTupleComparer<_,_>()) "struct auto explicit"
    Console.ReadLine() |> ignore
    0 // return an integer exit code
output
good keys
struct custom 1.506934 sec 0.000000 collisions
struct auto 4.832881 sec 0.776863 collisions
struct auto explicit 3.166931 sec 0.776863 collisions
bad keys
struct custom 3.631251 sec 0.061893 collisions
struct auto 10.340693 sec 0.777034 collisions
struct auto explicit 8.893612 sec 0.777034 collisions
I am no expert on the overall algorithm used to produce auto-generated Equals and GetHashCode, but it just seems to produce something non-optimal here. I don't know offhand if that is normal for a general-purpose auto-generated implementation, or if there are practical ways of auto-generating close-to-optimal implementations reliably.
It's worth noting that if you just use the standard tuple, the autogenerated hashing and comparison give the same collision rate and equal performance as your custom implementation. And using the latest F# 4.0 bits (where there has recently been a significant perf improvement in this area), the autogenerated stuff becomes significantly faster than the custom implementation.
My numbers:
// F# 3.1, struct tuples
good keys
custom 0.951254 sec 0.000000 collisions
auto 2.737166 sec 0.776863 collisions
bad keys
custom 2.923103 sec 0.061869 collisions
auto 7.706678 sec 0.777040 collisions
// F# 3.1, standard tuples
good keys
custom 0.995701 sec 0.000000 collisions
auto 0.965949 sec 0.000000 collisions
bad keys
custom 3.091821 sec 0.061869 collisions
auto 2.924721 sec 0.061869 collisions
// F# 4.0, standard tuples
good keys
custom 1.018672 sec 0.000000 collisions
auto 0.619066 sec 0.000000 collisions
bad keys
custom 3.082988 sec 0.061869 collisions
auto 1.829720 sec 0.061869 collisions
I opened an issue in the fsharp issue tracker, and it was accepted as a bug: https://github.com/fsharp/fsharp/issues/343
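For intuition about where the collisions come from, here is a scaled-down sketch that applies the two combining formulas from the question to plain integer pairs. It is not a reproduction of the benchmark. It assumes the per-field hash of a small integer is the integer itself, and it uses a much smaller key grid; the names questionCombine, autoGeneratedFormula and collisionRate are made up for this example.

```fsharp
// The custom combiner from the question, with the precedence made explicit:
// ((h1 <<< 5) + h1) ^^^ h2.
let questionCombine (h1 : int) (h2 : int) = ((h1 <<< 5) + h1) ^^^ h2

// The arithmetic of the "struct auto explicit" comparer from the question,
// assuming each field hashes to its own value (an assumption of this sketch).
let autoGeneratedFormula (first : int) (second : int) =
    let num = -1640531527 + (second + ((0 <<< 6) + (0 >>> 2)))
    -1640531527 + (first + ((num <<< 6) + (num >>> 2)))

// Fraction of keys whose hash was already produced by another key,
// counted the same way as in the question's check function.
let collisionRate (hash : int -> int -> int) =
    let keys = [| for j in 0 .. 9999 do for k in 0 .. 15 -> (j, k) |]
    let distinct =
        keys |> Array.map (fun (j, k) -> hash j k) |> Array.distinct |> Array.length
    float (keys.Length - distinct) / float keys.Length

let customRate = collisionRate questionCombine
let autoRate = collisionRate autoGeneratedFormula
printfn "custom %f collisions, auto formula %f collisions" customRate autoRate
```

Under these assumptions the second formula is essentially additive: the result is a constant plus the first field plus a term that grows by roughly 64 for each step of the second field. When the first field spans a range much larger than that, many (first, second) pairs land on the same sum, which is consistent with the high collision rates reported above. The custom combiner keeps the two fields in separate bit ranges on this grid and produces no collisions here.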

Adding Overloaded Constructors to Implicit F# Type

I have created the following type using implicit type construction:
open System
type Matrix(sourceMatrix:double[,]) =
    let rows = sourceMatrix.GetUpperBound(0) + 1
    let cols = sourceMatrix.GetUpperBound(1) + 1
    let matrix = Array2D.zeroCreate<double> rows cols
    do
        for i in 0 .. rows - 1 do
            for j in 0 .. cols - 1 do
                matrix.[i,j] <- sourceMatrix.[i,j]
    //Properties
    ///The number of Rows in this Matrix.
    member this.Rows = rows
    ///The number of Columns in this Matrix.
    member this.Cols = cols
    ///Indexed Property for this matrix.
    member this.Item
        with get(x, y) = matrix.[x, y]
        and set(x, y) value =
            this.Validate(x,y)
            matrix.[x, y] <- value
    //Methods
    /// Validate that the specified row and column are inside of the range of the matrix.
    member this.Validate(row, col) =
        if(row >= this.Rows || row < 0) then raise (new ArgumentOutOfRangeException("row is out of range"))
        if(col >= this.Cols || col < 0) then raise (new ArgumentOutOfRangeException("column is out of range"))
However now I need to add the following overloaded constructor to this type (which is in C# here):
public Matrix(int rows, int cols)
{
    this.matrix = new double[rows, cols];
}
The problem that I have is that it seems any overloaded constructors in an implicit type must have a parameter list that is a subset of the first constructor. Obviously the constructor I want to add does not meet this requirement. Is there any way to do this using implicit type construction? Which way should I do this? I'm pretty new to F# so if you could show the whole type with your changes in it I would greatly appreciate it.
Thanks in advance,
Bob
P.S. If you have any other suggestions to make my class more in the functional style please feel free to comment on that as well.
I would probably just do this:
type Matrix(sourceMatrix:double[,]) =
    let matrix = Array2D.copy sourceMatrix
    let rows = (matrix.GetUpperBound 0) + 1
    let cols = (matrix.GetUpperBound 1) + 1
    new(rows, cols) = Matrix( Array2D.zeroCreate rows cols )
unless we are talking about very large arrays which are created very often (i.e. copying the empty array becomes a performance bottleneck).
If you want to emulate the C# version, you need an explicit field that can be accessed from both constructors, like so:
type Matrix(rows,cols) as this =
    [<DefaultValue>]
    val mutable matrix : double[,]
    do this.matrix <- Array2D.zeroCreate rows cols
    new(source:double[,]) as this =
        let rows = source.GetUpperBound(0) + 1
        let cols = source.GetUpperBound(1) + 1
        Matrix(rows, cols)
        then
            for i in 0 .. rows - 1 do
                for j in 0 .. cols - 1 do
                    this.matrix.[i,j] <- source.[i,j]
BTW, there is also a matrix type in the F# PowerPack.
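Since the question asked to see the whole type, here is one way the first suggestion (copy the source array, add a secondary constructor that delegates to the primary one) might look with the members from the question put back in. Treat it as a sketch: the explicit type annotations, the use of Array2D.length1 and Array2D.length2 in place of GetUpperBound, and the shortened exception arguments are choices made for this example.

```fsharp
open System

type Matrix(sourceMatrix : double[,]) =
    // Copy the source so later changes to the caller's array do not leak in.
    let matrix = Array2D.copy sourceMatrix
    let rows = Array2D.length1 matrix
    let cols = Array2D.length2 matrix

    /// Secondary constructor: a rows x cols matrix of zeros.
    new(rows : int, cols : int) = Matrix(Array2D.zeroCreate<double> rows cols)

    /// The number of rows in this matrix.
    member this.Rows = rows

    /// The number of columns in this matrix.
    member this.Cols = cols

    /// Indexed property for this matrix; the setter validates the indices.
    member this.Item
        with get (x : int, y : int) = matrix.[x, y]
        and set (x : int, y : int) (value : double) =
            this.Validate(x, y)
            matrix.[x, y] <- value

    /// Validate that the specified row and column are inside the matrix.
    member this.Validate(row : int, col : int) =
        if row < 0 || row >= rows then raise (ArgumentOutOfRangeException("row"))
        if col < 0 || col >= cols then raise (ArgumentOutOfRangeException("col"))

// Usage: both constructors are available.
let empty = Matrix(2, 3)
empty.[0, 1] <- 4.5

let source = array2D [ [ 1.0; 2.0 ]; [ 3.0; 4.0 ] ]
let copied = Matrix(source)
source.[0, 0] <- 99.0   // does not affect 'copied', which holds its own copy

printfn "%d x %d, empty.[0,1] = %f, copied.[0,0] = %f" empty.Rows empty.Cols empty.[0, 1] copied.[0, 0]
```

The secondary constructor allocates a zero array that the primary constructor then copies, which is the extra cost mentioned above; it only matters if large matrices are created very often.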
