Possible F# Interactive PInvoke bug - f#

While trying to prove to a colleague that it's possible to use C++ classes from F#, I came up with the following proof of concept. The first snippet is the code he provided for the challenge, and the code snippet below is my implementation in F#.
namespace testapp {
struct trivial_foo {
int bar;
__declspec(dllexport) void set(int n) { bar = n; }
__declspec(dllexport) int get() { return bar; }
}
}
open System.Runtime.InteropServices
type TrivialFoo =
struct
val bar: int
new(_bar: int) = { bar = _bar }
end
[<DllImport("Win32Project2.dll", EntryPoint="?get#trivial_foo#testapp##QAEHXZ", CallingConvention = CallingConvention.ThisCall)>]
extern int trivial_foo_get(TrivialFoo& trivial_foo)
[<DllImport("Win32Project2.dll", EntryPoint="?set#trivial_foo#testapp##QAEXH#Z", CallingConvention = CallingConvention.ThisCall)>]
extern void trivial_foo_set(TrivialFoo& trivial_foo, int bar)
type TrivialFoo with
member this.Get() = trivial_foo_get(&this)
member this.Set(bar) = trivial_foo_set(&this, bar)
When debugged in Visual Studio or run as a standalone program, this works predictably: TrivialFoo.Get returns the value of bar and TrivialFoo.Set assigns to it. When run from F# Interactive however, TrivialFoo.Set will not set the field. I suspect it might have something to do with accessing managed memory from unmanaged code, but that doesn't explain why it only happens when using F# Interactive. Does anyone know what's going on here?

I don't think this proof of concept is a good proof of interoperability. You may be better off creating DLL export definitions from your C++ project and use the de-decorated names instead.
As a PoC: F# creates MSIL that fits in the CLI, so it can interoperate with any other CLI language out there. If that is not enough and you want native-to-net interop, consider using COM, or as mentioned above, DLL export definitions on your C++. I personally wouldn't try to interop with C++ class definitions the way you suggest here, there are way easier ways to do that.
Alternatively, just change your C++ project into a .NET C++ project and you can access the classes directly from F#, while still having the power of C++.
Naturally, you may still be wondering why the example doesn't run in FSI. You can see a hint of an answer by running the following:
> System.IO.Directory.GetCurrentDirectory();;
val it : string = "R:\TMP"
To fix this, you have a myriad of options:
copy Win32Project2.dll to that directory
add whatever path it is in to PATH
use an absolute path
use a compile-time constant
or use an environment variable (the path will be expanded)
dynamically locate the dll and dynamically bind to it (complex)
Copying is probably the easiest of these solutions.
Since FSI is meant to be a REPL, it may not be best tailored for this kind of tasks that require multiple projects, libraries or otherwise complex configurations. You may consider voting on this FSI request for support for #package to import NuGet packages, which could be used to ease such tasks.

The counterpart of a C++ struct in F# is not necessarily a struct. In C++, the only difference between classes and structs resides in their default access restrictions.
In F#, structs are used for value types, classes are used for reference types. One problem with value types is that they are meant to be used as immutable values, and temporary copies are often created silently.
The problem you are observing is consistent with that scenario. For some reason, F# interactive creates a copy of your struct and passes a reference to that. The C++ code then modifies the copy, leaving the original untouched.
If you switch to using a class, make sure you pin the instance before letting native code use it, or you can end up in a situation where the garbage collector moves it after the native code gets a reference to it.

Related

Are there extension points in the Clang static libraries that I can leverage to add metadata to functions?

I'm working on a project that parses C headers with the Clang static libraries (not libclang) for binary program analysis purposes. I consume the AST and do not rely on codegen.
I would love to be able to add some additional metadata to functions by modifying the header files that they appear in. Specifically, I'd like to add the expected address of the function in the binary program. This could be one way to do it:
int foo(int bar) __attribute__((address(0xdeadbeef)));
Attributes are especially convenient because they are easy to embed in a macro, which could end up looking like:
#define ADDRESS(x) __attribute__((address(x)))
int foo(int bar) ADDRESS(0xdeadbeef);
Since this is an open-source project maintained by me only, I would prefer to not fork Clang, and just allow people to build my program with their distributions' stock Clang packages. Sadly, this apparently means that I can't use attributes for this purpose, because the attribute list appears to be hard-coded into Clang, and there doesn't seem to be any attribute that I can co-opt for this purpose.
Are there extension points that I can leverage in the Clang static libraries to allow this?
Per Eli Friedman, the annotate attribute can be used for this purpose. An example usage would be:
int foo(int bar) __attribute__((annotate("address:0xdeadbeef")));

Dynamically modify symbol table at runtime (in C)

Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
My eventual goal is the following:
Inside certain function say foo, I want to override malloc function to my custom handler my_malloc. But outside foo, any malloc should still call to malloc as in glibc.
Note: this is different from LD_PRELOAD which would override malloc during the entire program execution.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
In theory this is possible, but in practice it's too hard to do.
Inside certain function say foo, I want to override malloc function to my custom handler my_malloc. But outside foo, any malloc should still call to malloc as in glibc.
Modifying symbol table (even if it were possible) would not get you to your desired goal.
All calls from anywhere inside your ELF binary (let's assume foo is in the main executable), resolve to the same PLT import slot malloc#plt. That slot is resolved to glibc malloc on the first call (from anywhere in your program, assuming you are not using LD_BIND_NOW=1 or similar). After that slot has been resolved, any further modification to the symbol table will have no effect.
You didn't say how much control over foo you have.
If you can recompile it, the problem becomes trivial:
#define malloc my_malloc
int foo() {
// same code as before
}
#undef malloc
If you are handed a precompiled foo.o, you are linking it with my_malloc.o, and you want to redirect all calls from inside foo.o from malloc to my_malloc, that's actually quite simple to do at the object level (i.e. before final link).
All you have to do is go through foo.o relocation records, and change the ones that say "put address of external malloc here" to "put address of external my_malloc here".
If foo.o contains additional functions besides foo, it's quite simple to limit the relocation rewrite to just the relocations inside foo.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
Yes, it is not easy, but the functionality can be packaged into a library, so at the end of the day, it can be made practical.
Typemock Isolator++
(https://www.typemock.com/isolatorpp-product-page/isolate-pp/)
This is free-to-use, but closed source solution. The usage example from documentation should be instructive
TEST_F(IsolatorPPTests, IsExpired_YearIs2018_ReturnTrue) {
Product product;
// Prepare a future time construct
tm* fakeTime = new tm();
fakeTime->tm_year = 2018;
// Fake the localtime method
FAKE_GLOBAL(localtime);
// Replace the returned value when the method is called
// with the fake value.
WHEN_CALLED(localtime(_)).Return(fakeTime);
ASSERT_TRUE(product.IsExpired());
}
Other libraries of this kind
Mimick, from Q: Function mocking in C?
cpp-stub, from Q: Creating stub functionality in C++
Elfspy, for C++, but sometimes it's OK to test C code from C++ unittests, from Q: C++ mock framework capable of mocking non-virtual methods and C functions
HippoMocks, from Q: Mocking C functions in MSVC (Visual Studio)
the subprojects in https://github.com/coolxv/cpp-stub/tree/master/other
... there is still more, feel free to append ...
Alternate approaches
ld's --wrap option and linker scripts, https://gitlab.com/hedayat/powerfake
various approaches described in answers for Q: Advice on Mocking System Calls
and in answers to Q: How to mock library calls?
Rewrite code to make it testable
This is easier in other languages than C, but still doable even in C. Structure code into small functions without side-effects that can be unit-tested without resorting to trickery, and so on. I like this blog Modularity. Details. Pick One. about the tradeoffs this brings. Personally, I guturally dislike the "sea of small functions and tons of dependency injection" style of code, but I realize that that's the easiest to work with, all things considered.
Excursion to other languages
What you are asking for is trivial to do in Python, with the unittest.mock.patch, or possibly by just assigning the mock into the original function directly, and undoing that at the end of the test.
In Java, there is Mockito/PowerMock, which can be used to replace static methods for the duration of a test. Static methods in Java approximately correspond to regular functions in C.
In Go, there is Mockey, which works similarly to what needs to be done in C. It has similar limitations in that inlining can break this kind of runtime mocking. I am not sure if in C you can hit the issue that very short methods are unmockable because there is not enough space to inject the redirection code; I think more likely not, if all calls go through the Procedure Linkage Table.

Writing CLS compliant code in F#

I am very new to F# and I started to write my functional wrapper on top of OpenGL. I also intend to use it to write a graphics engine which should have interop with all .Net languages.
But it is hard to find information about which code constructs in F# are not CLS compliant.
For example I already know of a few that are not CLS compliant:
static type constrains
tupleless functions
probably 'T list and Seq
maybe even union types
How do I know what features of F# are not CLS compliant?
Use the CLSCompliant attribute. It will guarantee that your code is CLS-Compliant at compile time.
Like this:
module myProject.AssemblyInfo
open System
[<assembly: CLSCompliant(true)>]
do()
Source: Mike-Ward.Net: Learning F#–Assembly Level Attributes
For a more complete discussion of the CLSCompliant attribute, see C# Corner: Making Your Code CLS Compliant

Looking for robust, general op_Dynamic implementation

I've not been able to find a robust, general op_Dynamic implementation: can anyone point me to one? So far searches have only turned up toys or specific purpose implementations, but I'd like to have one on hand which, say, compares in robustness to C#'s default static dynamic implementation (i.e. handle lots / all cases, cache reflection calls) (it's been a while since I've looked at C#'s static dynamic, so forgive me if my assertions about it's abilities are false).
Thanks!
There is a module FSharp.Interop.Dynamic, on nuget that should robustly handle the dynamic operator using the dlr.
It has several advantages over a lot of the snippets out there.
Performance it uses Dynamitey for the dlr call which implements caching and is a .NET Standard Library
Handles methods that return void, you'll get a binding exception if you don't discard results of those.
The dlr handles the case of calling a delegate return by a function automatically, this will also allow you to do the same with an FSharpFunc
Adds an !? prefix operator to handle invoking directly dynamic objects and functions you don't have the type at runtime.
It's open source, Apache license, you can look at the implementation and it includes unit test example cases.
You can never get fully general implementation of the ? operator. The operator can be implemented differently for various types where it may need to do something special depending on the type:
For Dictionary<T, R>, you'd want it to use the lookup function of the dictionary
For the SQL objects in my article you referenced, you want it to use specific SQL API
For unknown .NET objects, you want it to use .NET Reflection
If you're looking for an implementation that uses Reflection, then you can use one I implemented in F# binding for MonoDevelop (available on GitHub). It is reasonably complete and handles property access, method calls as well as static members. (The rest of the linked file uses it heavily to call internal members of F# compiler). It uses Reflection directly, so it is quite slow, but it is quite feature-complete.
Another alternative would be to implement the operator on top of .NET 4.0 Dynamic Language Runtime (so that it would use the same underlying API as dynamic in C# 4). I don't think there is an implementation of this somewhere out there, but here is a simple example how you can get it:
#r "Microsoft.CSharp.dll"
open System
open System.Runtime.CompilerServices
open Microsoft.CSharp.RuntimeBinder
let (?) (inst:obj) name (arg:'T) : 'R =
// Create site (representing dynamic operation for converting result to 'R
let convertSite =
CallSite<Func<CallSite, Object, 'R>>.Create //'
(Binder.Convert(CSharpBinderFlags.None, typeof<'R>, null)) //'
// Create site for the method call with single argument of type 'T
let callSite =
CallSite<Func<CallSite, Object, 'T, Object>>.Create //'
(Binder.InvokeMember
( CSharpBinderFlags.None, name, null, null,
[| CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null);
CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null) |]))
// Run the method and perform conversion
convertSite.Target.Invoke
(convertSite, callSite.Target.Invoke(callSite, inst, arg))
let o = box (new Random())
let a : int = o?Next(10)
This works only for instance method calls with single argument (You can find out how to do this by looking at code generated by C# compiler for dynamic invocations). I guess if you mixed the completeness (from the first one) with the approach to use DLR (in the second one), you'd get the most robust implementation you can get.
EDIT: I also posted the code to F# Snippets. Here is the version using DLR: http://fssnip.net/2U and here is the version from F# plugin (using .NET Reflection): http://fssnip.net/2V

F# mutual recursion between modules

For recursion in F#, existing documentation is clear about how to do it in the special case where it's just one function calling itself, or a group of physically adjacent functions calling each other.
But in the general case where a group of functions in different modules need to call each other, how do you do it?
I don't think there is a way to achieve this in F#. It is usually possible to structure the application in a way that doesn't require this, so perhaps if you described your scenario, you may get some useful comments.
Anyway, there are various ways to workaround the issue - you can declare a record or an interface to hold the functions that you need to export from the module. Interfaces allow you to export polymorphic functions too, so they are probably a better choice:
// Before the declaration of modules
type Module1Funcs =
abstract Foo : int -> int
type Module2Funcs =
abstract Bar : int -> int
The modules can then export a value that implements one of the interfaces and functions that require the other module can take it as an argument (or you can store it in a mutable value).
module Module1 =
// Import functions from Module2 (needs to be initialized before using!)
let mutable module2 = Unchecked.defaultof<Module2Funcs>
// Sample function that references Module2
let foo a = module2.Bar(a)
// Export functions of the module
let impl =
{ new Module1Funcs with
member x.Foo(a) = foo a }
// Somewhere in the main function
Module1.module2 <- Module2.impl
Module2.module1 <- Module1.impl
The initializationcould be also done automatically using Reflection, but that's a bit ugly, however if you really need it frequently, I could imagine developing some reusable library for this.
In many cases, this feels a bit ugly and restructuring the application to avoid recursive references is a better approach (in fact, I find recursive references between classes in object-oriented programming often quite confusing). However, if you really need something like this, then exporting functions using interfaces/records is probably the only option.
This is not supported. One evidence is that, in visual stuido, you need to order the project files correctly for F#.
It would be really rare to recursively call two functions in two different modules.
If this case does happen, you'd better factor the common part of the two functions out.
I don't think that there's any way for functions in different modules to directly refer to functions in other modules. Is there a reason that functions whose behavior is so tightly intertwined need to be in separate modules?
If you need to keep them separated, one possible workaround is to make your functions higher order so that they take a parameter representing the recursive call, so that you can manually "tie the knot" later.
If you were talking about C#, and methods in two different assemblies needed to mutually recursively call each other, I'd pull out the type signatures they both needed to know into a third, shared, assembly. I don't know however how well those concepts map to F#.
Definetely solution here would use module signatures. A signature file contains information about the public signatures of a set of F# program elements, such as types, namespaces, and modules.
For each F# code file, you can have a signature file, which is a file that has the same name as the code file but with the extension .fsi instead of .fs.

Resources