How to embed OPA as a lib in a low latency C++ process - open-policy-agent

We are currently evaluating OPA as our main fine-grained access control engine. Our data path is written in C++ to meet high performance requirements. I see that it is possible to embed OPA in a Go process, but I am not sure whether this has been evaluated in a C++ container.
Are there any existing deployments where OPA was embedded as a library in a C++ container?
If we embed OPA as a library, will there be any communication over the network (to other processes or databases) when policies are evaluated?

For using OPA from C++, there are a few options, ordered roughly by complexity and increasingly uncharted territory:
1. Use the HTTP API, in a sidecar process or some standalone service. (Obviously not what you're looking for; included for completeness' sake.)
2. Use Wasm: there is no SDK for C++, but the ABI hopefully isn't too complicated; see the docs.
3. Embed OPA as a Cgo library: the amount of work is considerable. You'd have to define the surface API, i.e., do the work necessary to re-wrap OPA's core into a library you could link in.
I'd go with trying (1.) first, seeing if it really isn't feasible for your performance requirements (using a Unix socket, profiling the evaluation, having a good look at your policy code...); then I'd reach for Wasm (2.). OPA's Wasm modules contain the compiled recipe for evaluating your policy's logic; there is no interpreter overhead. With (3.), you'd have to do more work than for (2.), and (in my opinion) get less for it.
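To make option (1.) concrete, here is a minimal C++ sketch of querying OPA's REST Data API over a Unix domain socket. The request shape (POST /v1/data/<path> with an {"input": ...} body) follows OPA's REST API. The policy path authz/allow, the input document and the socket path are made-up examples, and the sketch assumes a POSIX system with an OPA server already listening on that socket. Response parsing and connection reuse are left out.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Build a raw HTTP/1.1 request for OPA's Data API (POST /v1/data/<path>).
// `policy_path` (e.g. "authz/allow") is a hypothetical example; `input_json`
// must already be valid, serialized JSON.
std::string build_opa_query(const std::string& policy_path,
                            const std::string& input_json) {
    assert(!policy_path.empty() && policy_path.front() != '/');
    const std::string body = "{\"input\":" + input_json + "}";
    std::string req;
    req += "POST /v1/data/" + policy_path + " HTTP/1.1\r\n";
    req += "Host: localhost\r\n";
    req += "Content-Type: application/json\r\n";
    req += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    req += "Connection: close\r\n\r\n";
    req += body;
    return req;
}

// Send `request` over a Unix domain socket and return the raw response.
// Relies on "Connection: close": the response is read until the peer closes.
std::string send_over_unix_socket(const std::string& socket_path,
                                  const std::string& request) {
    sockaddr_un addr{};
    addr.sun_family = AF_UNIX;
    if (socket_path.size() >= sizeof(addr.sun_path))
        throw std::runtime_error("socket path too long");
    std::memcpy(addr.sun_path, socket_path.c_str(), socket_path.size() + 1);

    const int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) throw std::runtime_error("socket() failed");
    if (::connect(fd, reinterpret_cast<const sockaddr*>(&addr),
                  sizeof(addr)) != 0) {
        ::close(fd);
        throw std::runtime_error("connect() failed");
    }

    std::size_t sent = 0;
    while (sent < request.size()) {
        const ssize_t n =
            ::write(fd, request.data() + sent, request.size() - sent);
        if (n <= 0) {
            ::close(fd);
            throw std::runtime_error("write() failed");
        }
        sent += static_cast<std::size_t>(n);
    }

    std::string response;
    char buf[4096];
    ssize_t n = 0;
    while ((n = ::read(fd, buf, sizeof(buf))) > 0)
        response.append(buf, static_cast<std::size_t>(n));
    ::close(fd);
    return response;
}
```

Passing the result of build_opa_query to send_over_unix_socket returns the raw HTTP response, whose JSON body carries the decision under the "result" key. When profiling, a persistent connection would likely be preferable to opening a new socket per query.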

Related

Are API scripting commands embedded or created externally?

I’m trying to use Lua/Moonsharp API scripting. There’s a command library, but there isn’t a function that I need. My question is, am I able to create my own function, or am I limited to what’s been written?
Specifically, a Lua plugin is available for a program called BobCAD. It lists commands such as Bcc.SetCamObjParameter, but the command I need isn’t among them. I’m assuming that some aspects of the BobCAD software are inaccessible to the API, but am I limited to the library in this plugin, or can I add my own commands? I thought that there would be some C# file somewhere in the program directory where I could read the functions and possibly learn how to create my own, but I don’t see anything like that. (Or are scripting functions set up internally by the software, and I’m only given access to what has been provided?)
(N.b. I'm not familiar with BobCAD specifically, but this answer should be generally applicable.)
Generally speaking, for use cases like this (where Lua scripting is provided as part of a larger program, or by a plugin for a larger program), it's the developer of the program or plugin who decides what API is available to Lua. In the latter case they may also be limited by what the program's plugin API allows; in the case of BobCAD you are likely limited to what the BobCAD plugin API permits, and then further limited by what parts of that API the developers of the Lua plugin chose to make available to Lua itself.
You can of course write your own functions in Lua, but in terms of actually talking to the host program you are restricted to the API that program makes available, unless they make available some mechanism (like LuaJIT's Foreign Function Interface) for reaching into the program from Lua and calling functions that were not made explicitly available -- which most do not.
As for finding "a C# file somewhere in the program directory" -- C# is a compiled language; C# libraries are generally shipped as pre-compiled CLR binaries (with a .dll extension) and do not contain source code. If the source code for BobCAD and/or the Lua plugin is available you could always modify it and re-compile it, but as BobCAD is commercial, closed-source software, I would not expect it to be available.
At that point, your options are basically:
figure out a way to do what you need with the commands that are available;
try to reverse engineer enough of the program to do what you need without access to the source code;
or look for another program that has the features you need.
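
The point that the host program decides what a script can reach can be illustrated with a toy registry. This is not MoonSharp or BobCAD code: ScriptHost, SetParameter and recalculate_toolpath are invented names, and the sketch only shows the general pattern of a host exposing some functions by name and keeping others internal.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// A toy "scripting host": scripts can only call what was registered by name.
class ScriptHost {
public:
    using Command = std::function<double(double)>;

    // Called by the host's developers to make a function visible to scripts.
    void expose(const std::string& name, Command fn) {
        commands_[name] = std::move(fn);
    }

    bool has(const std::string& name) const {
        return commands_.count(name) != 0;
    }

    // Called on behalf of a script; unknown names are rejected.
    double call(const std::string& name, double arg) const {
        const auto it = commands_.find(name);
        if (it == commands_.end())
            throw std::runtime_error("unknown command: " + name);
        return it->second(arg);
    }

private:
    std::map<std::string, Command> commands_;
};

// Hypothetical host internals. Only the first one is exposed below.
double set_parameter(double value) { return value; }
double recalculate_toolpath(double tolerance) { return tolerance * 2.0; }

ScriptHost make_host() {
    ScriptHost host;
    host.expose("SetParameter", set_parameter);
    // recalculate_toolpath exists in the binary but is never registered,
    // so a script has no way to name it.
    return host;
}
```

A script running against this host can call SetParameter, but any attempt to call RecalculateToolpath fails, however the script is written; only a change on the host side could expose it.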

Electron with C++ backend - secure?

I have written a UI in Electron and I would like to connect it with my C++ code. However, I will be selling this product and so I would like to know if this makes it easier for people to crack my C++ code? Obviously I know compiled C++ can be cracked anyway, but does this affect it in any way?
Additionally, what is the best way to go about this while preserving maximum possible security?
Thanks.
EDIT: How about this? Is it possible to use c++ as back-end for Electron.js?
EDIT2: To clarify, my Electron app will be showing the status of operations being performed in the C++ program. As such, I will need to send lists, dictionaries, strings etc. from C++ to JS which will then render it. Additionally, buttons on my Electron app need to trigger actions in the C++ code, such as stopping or starting certain parts of the program.
I have written a UI in Electron and I would like to connect it with my C++ code ...
I would like to know if this makes it easier for people to crack my C++ code?
Using Electron makes no meaningful difference to how well your C++ code (your intellectual property) is protected.
The JavaScript code running in Electron will be very easy to reverse engineer, though, which gives users a head start on experimenting with your C++ binary. Minification and obfuscation tools can at least make that harder.
For the C++ side, connecting C++ to Electron can be done in at least these two ways:
By dynamically linking to a shared library (Node.js C++ Addons)
In this case your C++ API would be functions that get exported by the shared library. There are many tools to inspect shared libraries (DLLs) and view these functions.
By communicating with another process using some sort of Inter-process communication.
In this case your API would depend on the IPC method used. If it was TCP/UDP messages you could use Wireshark to inspect the packets between the processes. There are ways to inspect messages going over any type of IPC.
Either way, your application must be delivered to the end-user with a compiled binary. Preventing reverse engineering of the binary itself is impossible if you actually give the binary to your users.
You should also expect that a savvy end-user will have access to other tools that can inspect the API and implement third-party code that talks to that API.
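
For the inter-process route, one simple arrangement is for the C++ program to write one JSON object per line to its standard output, which the Electron side reads and parses. That arrangement is an assumption made for illustration, not something the question specifies, and status_message and its field names are invented. The sketch covers only the C++ side and hand-rolls the JSON to stay dependency-free; a real project would more likely use a JSON library.

```cpp
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Escape a string so it can be embedded between double quotes in JSON.
std::string json_escape(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {  // remaining control characters
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x", c);
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Serialize a status update (a string, a dictionary and a list) as one
// JSON object. The message layout is a hypothetical example.
std::string status_message(const std::string& event,
                           const std::map<std::string, std::string>& fields,
                           const std::vector<std::string>& items) {
    std::string out = "{\"event\":\"" + json_escape(event) + "\",\"fields\":{";
    bool first = true;
    for (const auto& kv : fields) {
        if (!first) out += ',';
        first = false;
        out += "\"" + json_escape(kv.first) + "\":\"" +
               json_escape(kv.second) + "\"";
    }
    out += "},\"items\":[";
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i != 0) out += ',';
        out += "\"" + json_escape(items[i]) + "\"";
    }
    out += "]}";
    return out;
}

// Write one message per line and flush so the reader sees it immediately.
void emit(const std::string& message) {
    std::fputs(message.c_str(), stdout);
    std::fputc('\n', stdout);
    std::fflush(stdout);
}
```

Note that anyone can observe these messages just as easily as the Electron app can, which is the point made above: the IPC format becomes a de facto public API.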
Additionally, what is the best way to go about this while preserving maximum possible security?
By "maximum possible security", I will assume you are referring to preventing unauthorized use of the C++ code with other applications.
You would need a licensing system that can authenticate the application that is using your C++ binary's API. Explaining exactly what that would be is probably too large a topic for a Stack Overflow answer, and you will have to do some research on how licensing systems are implemented.
It may be theoretically impossible to develop a perfect licensing system, though. Look at the gaming industry: it takes just a matter of days for the licensing software to be circumvented for every new game that is released. The only software architecture that cracks haven't completely conquered is cloud-based applications, which don't actually deliver compiled code with their business logic to the end-user's computer.

Distributed Dask CPP workers

Dask has a very powerful distributed API. As far as I understand, though, it only supports native Python code and modules.
Does anyone know if Dask distributed can support C++ workers?
I could not find anything in the docs.
Would there be any approach to using that functionality other than adding Python bindings to the C++ code?
You are correct: if you wanted to call into C++ code using Dask, you would do it by calling from Python, which usually means writing some form of binding layer to make the calling convenient. If there is also a C API, you could use ctypes or cffi.
In theory, the scheduler is agnostic of the language of the client and workers, so long as they agree with each other, but no one has implemented a C++ client/worker. This has been done, at least as a proof of concept, for Julia.
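
As a rough sketch of the "C API" route, a C++ library can export a plain C entry point that ctypes or cffi is able to load without a dedicated binding layer. The names mylib::mean and mylib_mean are invented for illustration. The wrapper takes only C-compatible types and catches exceptions, since they must not propagate across the C boundary.

```cpp
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

namespace mylib {

// Ordinary C++ implementation, using C++-only types.
double mean(const std::vector<double>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0) /
           static_cast<double>(values.size());
}

}  // namespace mylib

// C entry point: unmangled name, pointer-plus-length instead of std::vector.
// Returns 0.0 for empty input and NaN if the C++ code throws.
extern "C" double mylib_mean(const double* values, std::size_t count) {
    if (values == nullptr || count == 0) return 0.0;
    try {
        return mylib::mean(std::vector<double>(values, values + count));
    } catch (...) {
        return std::numeric_limits<double>::quiet_NaN();
    }
}
```

Built as a shared library, this could be loaded on the Python side with ctypes (declaring the argument types and a double return type), and the resulting Python function handed to Dask like any other callable. Every worker would need the shared library available locally.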

WebAssembly: Reconstructing the stack from scratch

By transforming .wasm source files or interacting with a suitable debugger from JavaScript, it should be possible to serialize the full Wasm execution state (mainly the stack, call frames, local variables, etc.).
I wonder if it is possible to reconstruct it using this serialized representation and continue running the program where it was stopped on another machine.
Could current browser runtimes support this?
I'm not sure what transformation or debugger you have in mind, but your premise that it is possible to serialise JavaScript execution state is false. It would in fact be extremely difficult to implement such a mechanism in browser engines. No production JS engine I'm aware of can even serialise its heap in the general case (even though some, like V8, have a very limited snapshot mechanism for the start-up heap), let alone the call stack and live function state, which may be in one of many optimisation modes, may be arbitrarily intermixed with C or assembly stack frames from the runtime or the embedder, and are generally very tricky to capture.
The mechanism you have in mind would require general serialisation on top of first-class undelimited continuations. TC39, the JavaScript committee, discarded the idea of adding full-blown continuations to the language many years ago because it was deemed too hard and too expensive to implement in most engines (which is why ES6 instead introduced generators as a much more limited mechanism). Edit: Generic serialisation was never even considered, since it would actually break encapsulation via closures or proxies, and thus all existing security patterns of the language.

Is Dart statically compiled, or is code interpreted at runtime as it's parsed and loaded into the VM?

I'm trying to understand why adding traits to Dart would cause the shape of objects in memory to change, and am therefore curious how it loads in code right now.
Dart is a dynamically typed language whose VM generates machine code straight from source code, with no intermediate bytecode step. There is no generic bytecode (as with the JVM or LLVM); instead, the source is compiled directly into machine code.
I would add that despite compiling straight to machine code, the language itself is not designed in a way that would allow a C/C++ style compiler to generate fast, efficient code. This is by design, as Dart seems to be an attempt to fill the gap between JavaScript and Java rather than the gap between Java and C/C++. Dart addresses many issues that make JavaScript hard to optimize, most importantly the typing of numeric variables.
There are some efforts to port the Dart environment to various platforms beyond Windows/Mac/Linux but I have yet to see an actual straight to machine language compiler for Dart. That doesn't mean they don't exist, I just haven't seen anything other than ports of the Linux Dart environment onto Beagleboard and other small Linux distros.
From the Dart FAQ:
Q. Why didn’t Google build a bytecode VM targetable by multiple languages including Dart?
Each approach has advantages and disadvantages, but we feel that in the context of Dart it made sense to build a language-specific VM for the following reasons:
Google already works on a multi-language bytecode: LLVM bitcode in PNaCl.
Even if a bytecode VM is specialized for Dart, a language VM will be simpler and faster because it can work under stronger assumptions, for instance a structured control flow. These assumptions make the implementation cleaner and optimizations easier.
A general-purpose bytecode VM would be even larger and slower, as it generalizes assumptions and adds functionality that for Dart is dead code: for example, multithreading with a shared heap.
No bytecode VM is truly general-purpose; they all make assumptions that privilege some class of languages.
A language VM leaves more room to improve the VM and make deep changes to optimization of the language.
Some Dart engineers wrote an article talking about the VM question in more detail.
A pretty good presentation on Compiling Dart to Efficient Machine Code
