How to ask Clang++ not to cache function result during -O3 optimization?

This is my code:
int foo(int x) {
return x + 1; // I have more complex code here
}
int main() {
int s = 0;
for (int i = 0; i < 1000000; ++i) {
s += foo(42);
}
}
Without -O3 this code runs for a few minutes. With -O3 it returns the same result in no time. I believe Clang++ caches the value of foo(42) (it's a pure function) and doesn't call it a million times. How can I instruct it NOT to apply this particular optimization for this particular function call?

Out of curiosity, can you share why you would want to disable that optimization?
Anyway, about your question:
In your example code, s is never read after the loop, so the compiler would throw the whole loop away. So let's assume that s is used after the loop.
I'm not aware of any pragmas or compiler options to disable a particular optimization in a particular section of code.
Is changing the code an option?
To prevent that optimization in a portable manner, you can look for a creative way to compute the function call argument in a way such that the compiler is no longer able to treat the argument as constant. Of course the challenge here is to actually use a trick that does not rely on undefined behavior and that cannot be "outsmarted" by a newer compiler version.
See the commented example below.
pro: you use a trick that uses only the language that you can apply selectively
con: you get an additional memory access in every loop iteration; however, the access will be satisfied by your CPU cache most of the time
I verified the generated assembly for your particular example with clang++ -O3 -S. The compiler now generates your loop and no longer caches the result. However, the function gets inlined. If you want to prevent that as well, you can declare foo with __attribute__((noinline)), for example.
int foo(int x) {
return x + 1; // I have more complex code here
}
volatile int dummy = 0; // initialized to 0 and never changed
int main() {
int s = 0;
for (int i = 0; i < 1000000; ++i) {
// Because of the volatile variable, the compiler is forced to assume
// that the function call argument is different for each loop
// iteration and it is no longer able to use a cached result.
s += foo(42 + dummy);
}
}

Related

About extending a Look Up Table at compile time

I'd like to extend my instrumenting profiler so that it doesn't affect performance too much.
In my current implementation, I'm using a ProfilerHelper that takes one string and can be placed wherever you want in the f() being profiled.
The ctor starts the measurement and the dtor stops it, logging the delta in an unordered_map entry whose key is the string.
Now, I'd like to make all of that faster.
First of all, I'd like to create a string LUT (Look Up Table) containing the f() names at compile time, and turn the unordered_map into a plain vector that is indexed in parallel with the string LUT.
Now the question is: I've managed to create a LUT of std::string_view, but I cannot find a way to extend it at compile time.
A first rough attempt looks like this:
template<unsigned N>
constexpr auto LUT() {
std::array<std::string_view, N> Strs{};
for (unsigned n = 0; n < N; n++) {
Strs[n] = "";
}
return Strs;
};
constexpr std::array<std::string_view, 0> StringsLUT { LUT<0>() };
constexpr auto AddString(std::string_view const& Str)
{
constexpr auto Size = StringsLUT.size();
std::array<std::string_view, Size + 1> Copy{};
for (auto i = 0; i < Size; ++i)
Copy[i] = StringsLUT[i];
Copy[Size] = Str;
return Copy;
};
int main()
{
constexpr auto Strs = AddString(__builtin_FUNCTION());
//for (auto const Str : Strs)
std::cout << Strs[0] << std::endl;
}
So my idea is to call AddString wherever needed in the f()s to be profiled, extending this list at compile time.
But of course I would have to take the returned Copy and replace StringsLUT every time, to end up with a final StringsLUT containing all the f() names.
Is there a way to do that at compile time?
Sorry, I'm only just entering the magic "new" world of constexpr applied to LUTs.
Thanks in advance for your support.

Search for sequence in Uint8List

Is there a fast (native) method to search for a sequence in a Uint8List?
///
/// Return index of first occurrence of seq in list
///
int indexOfSeq(Uint8List list, Uint8List seq) {
...
}
EDIT: Changed List<int> into Uint8List
No. There is no built-in way to search for a sequence of elements in a list.
I am also not aware of any dart:ffi based implementations.
The simplest approach would be:
extension IndexOfElements<T> on List<T> {
  int indexOfElements(List<T> elements, [int start = 0]) {
    if (elements.isEmpty) return start;
    var end = length - elements.length;
    if (start > end) return -1;
    var first = elements.first;
    var pos = start;
    // Labeled so that a mismatch restarts the outer search, not the inner loop.
    outer:
    while (true) {
      pos = indexOf(first, pos);
      if (pos < 0 || pos > end) return -1;
      for (var i = 1; i < elements.length; i++) {
        if (this[pos + i] != elements[i]) {
          pos++;
          continue outer;
        }
      }
      return pos;
    }
  }
}
This has worst-case time complexity O(length*elements.length). There are several more algorithms with better worst-case complexity, but they also have larger constant factors and more expensive pre-computations (KMP, BMH). Unless you search for the same long list several times, or do so in a very, very long list, they're unlikely to be faster in practice (and they'd probably have an API where you compile the pattern first, then search with it.)
You could use dart:ffi to bind to memmem from string.h as you suggested.
We do the same with binding to malloc from stdlib.h in package:ffi (source).
final DynamicLibrary stdlib = Platform.isWindows
? DynamicLibrary.open('kernel32.dll')
: DynamicLibrary.process();
final PosixMalloc posixMalloc =
stdlib.lookupFunction<Pointer Function(IntPtr), Pointer Function(int)>('malloc');
Edit: as lrn pointed out, we cannot expose the inner data pointer of a Uint8List at the moment, because the GC might relocate it.
One could use dart_api.h and use the FFI to pass TypedData through the FFI trampoline as Dart_Handle and use Dart_TypedDataAcquireData from the dart_api.h to access the inner data pointer.
(If you want to use this in Flutter, we would need to expose Dart_TypedDataAcquireData and Dart_TypedDataReleaseData in dart_api_dl.h https://github.com/dart-lang/sdk/issues/40607 I've filed https://github.com/dart-lang/sdk/issues/44442 to track this.)
Alternatively, we could address https://github.com/dart-lang/sdk/issues/36707 so that we could just expose the inner data pointer of a Uint8List directly in the FFI trampoline.

How to modify a function's internal variables at runtime and pass it to another function?

Functions in Dart are first-class objects, allowing you to pass them to other objects or functions.
void main() {
var shout = (msg) => ' ${msg.toUpperCase()} ';
print(shout("yo"));
}
This made me wonder if there was a way to modify a function at run time, just like an object, prior to passing it to something else. For example:
int add(int input) {
  return input + 2;
}
If I wanted to make the function a generic addition function, then I would do:
int add(int input, int increment) {
  return input + increment;
}
But then the problem would be that the object I am passing the function to would need to specify the increment. I would like to pass the add function to another object, with the increment specified at run time, and declared within the function body so that the increment cannot be changed by the recipient of the function object.
The answer seems to be to use a lexical closure.
From here: https://dart.dev/guides/language/language-tour#built-in-types
A closure is a function object that has access to variables in its
lexical scope, even when the function is used outside of its original
scope.
Functions can close over variables defined in surrounding scopes. In
the following example, makeAdder() captures the variable addBy.
Wherever the returned function goes, it remembers addBy.
/// Returns a function that adds [addBy] to the
/// function's argument.
Function makeAdder(int addBy) {
return (int i) => addBy + i;
}
void main() {
// Create a function that adds 2.
var add2 = makeAdder(2);
// Create a function that adds 4.
var add4 = makeAdder(4);
assert(add2(3) == 5);
assert(add4(3) == 7);
}
In the above cases, we pass 2 or 4 into the makeAdder function. The makeAdder function uses the parameter to create and return a function object that can be passed to other objects.
You most likely don't need to modify a closure, just the ability to create customized closures.
The latter is simple:
int Function(int) makeAdder(int increment) => (int value) => value + increment;
...
foo(makeAdder(1)); // Adds 1.
foo(makeAdder(4)); // Adds 4.
You can't change which variables a closure is referencing, but you can change their values ... if you can access the variable. For local variables, that's actually hard.
Mutating state which makes an existing closure change behavior can sometimes be appropriate, but those functions should be very precise about how they change and where they are being used. For a function like add which is used for its behavior, changing the behavior is rarely a good idea. It's better to replace the closure in the specific places that need to change behavior, and not risk changing the behavior in other places which happen to depend on the same closure. Otherwise it becomes very important to control where the closure actually flows.
If you still want to change the behavior of an existing global, you need to change a variable that it depends on.
Globals are easy:
int increment = 1;
int globalAdd(int value) => value + increment;
...
foo(globalAdd); // Adds 1.
increment = 2;
foo(globalAdd); // Adds 2.
I really can't recommend mutating global variables. It scales rather badly. You have no control over anything.
Another option is to use an instance variable to hold the modifiable value.
class MakeAdder {
int increment = 1;
int instanceAdd(int value) => value + increment;
}
...
var makeAdder = MakeAdder();
var adder = makeAdder.instanceAdd;
...
foo(adder); // Adds 1.
makeAdder.increment = 2;
foo(adder); // Adds 2.
That gives you much more control over who can access the increment variable. You can create multiple independent mutable adders without them stepping on each other's toes.
To modify a local variable, you need someone to give you access to it, from inside the function where the variable is visible.
int Function(int) makeAdder(void Function(void Function(int)) setIncrementCallback) {
var increment = 1;
setIncrementCallback((v) {
increment = v;
});
return (value) => value + increment;
}
...
late void Function(int) setIncrement; // `late`: assigned inside the callback below
int Function(int) localAdd = makeAdder((inc) { setIncrement = inc; });
...
foo(localAdd); // Adds 1.
setIncrement(2);
foo(localAdd); // Adds 2.
This is one way of passing back a way to modify the local increment variable.
It's almost always far too complicated an approach for what it gives you; I'd go with the instance variable instead.
Often, the instance variable will actually represent something in your model, some state which can meaningfully change, and then it becomes predictable and understandable when and how the state of the entire model changes, including the functions referring to that model.
Using partial function application
You can use a partial function application to bind arguments to functions.
If you have something like:
int add(int input, int increment) => input + increment;
and want to pass it to another function that expects to supply fewer arguments:
int foo(int Function(int input) applyIncrement) => applyIncrement(10);
then you could do:
foo((input) => add(input, 2)); // `increment` is fixed to 2
foo((input) => add(input, 4)); // `increment` is fixed to 4
Using callable objects
Another approach would be to make a callable object:
class Adder {
int increment = 0;
int call(int input) => input + increment;
}
which could be used with the same foo function above:
var adder = Adder()..increment = 2;
print(foo(adder)); // Prints: 12
adder.increment = 4;
print(foo(adder)); // Prints: 14

C++ memory issue

I'm currently building a prime number finder, and am having a memory problem:
This may be due to a corruption of the heap, which indicates a bug in PrimeNumbers.exe or any of the DLLs it has loaded.
PS. Please don't tell me whether this is the right way to find prime numbers; I want to figure that out myself!
Code:
// PrimeNumbers.cpp : main project file.
#include "stdafx.h"
#include <vector>
using namespace System;
using namespace std;
int main(array<System::String ^> ^args)
{
Console::WriteLine(L"Until what number do you want to stop?");
signed const int numtstop = Convert::ToInt16(Console::ReadLine());
bool * isvalid = new bool[numtstop];
int allattempts = numtstop*numtstop; // Find all the possible combinations of numbers
for (int currentnumb = 0; currentnumb <= allattempts; currentnumb++) // For each number try to find a combination
{
for (int i = 0; i <= numtstop; i++)
{
for (int tnumb = 0; tnumb <= numtstop; tnumb++)
{
if (i*tnumb == currentnumb)
{
isvalid[currentnumb] = false;
Console::WriteLine("Error");
}
}
}
}
Console::WriteLine(L"\nAll prime number in the range of:" + Convert::ToString(numtstop));
for (int pnts = 0; pnts <= numtstop; pnts++)
{
if (isvalid[pnts] != false)
{
Console::WriteLine(pnts);
}
}
return 0;
}
I don't see the memory problem.
Please help.
You are allocating numtstop booleans, but you index that array using a variable that ranges from zero to numtstop*numtstop. This will be severely out of bounds for all numtstop values greater than 1.
You should either allocate more booleans (numtstop*numtstop) or use a different variable to index into isvalid (for example, i, which ranges from 0 to numtstop). I am sorry, I cannot be more precise than that because of your request not to comment on your algorithm of finding primes.
P.S. If you would like to read something on the topic of finding small primes, here is a link to a great book by Dijkstra. He teaches you how to construct a program for the first 1000 primes on pages 35..49.
The problem is that you use native C++ in managed C++/CLI code, and of course that you use new without delete.
currentnumb grows bigger than the size of the array, which is just numtstop. You are probably going out of bounds; this might be your issue.
You never delete[] your isvalid local; this is a memory leak.

Evaluating Mathematical Expressions using Lua

In my previous question I was looking for a way of evaluating complex mathematical expressions in C; most of the suggestions required implementing some type of parser.
However, one answer suggested using Lua for evaluating the expression. I am interested in this approach but I don't know anything about Lua.
Can someone with experience in Lua shed some light?
Specifically, what I'd like to know is:
Which API, if any, does Lua provide that can evaluate mathematical expressions passed in as a string? If there is no API to do such a thing, maybe someone can shed some light on the linked answer, as it seemed like a good approach :)
Thanks
The type of expression I'd like to evaluate is given some user input such as
y = x^2 + 1/x - cos(x)
evaluate y for a range of values of x
It is straightforward to set up a Lua interpreter instance, and pass it expressions to be evaluated, getting back a function to call that evaluates the expression. You can even let the user have variables...
Here's the sample code I cooked up and edited into my other answer. It is probably better placed on a question tagged Lua in any case, so I'm adding it here as well. I compiled this and tried it for a few cases, but it certainly should not be trusted in production code without some attention to error handling and so forth. All the usual caveats apply here.
I compiled and tested this on Windows using Lua 5.1.4 from Lua for Windows. On other platforms, you'll have to find Lua from your usual source, or from www.lua.org.
Update: This sample uses simple and direct techniques to hide the full power and complexity of the Lua API behind as simple as possible an interface. It is probably useful as-is, but could be improved in a number of ways.
I would encourage readers to look into the much more production-ready ae library by lhf for code that takes advantage of the API to avoid some of the quick and dirty string manipulation I've used. His library also promotes the math library into the global name space so that the user can say sin(x) or 2 * pi without having to say math.sin and so forth.
Public interface to LE
Here is the file le.h:
/* Public API for the LE library.
*/
int le_init();
int le_loadexpr(char *expr, char **pmsg);
double le_eval(int cookie, char **pmsg);
void le_unref(int cookie);
void le_setvar(char *name, double value);
double le_getvar(char *name);
Sample code using LE
Here is the file t-le.c, demonstrating a simple use of this library. It takes its single command-line argument, loads it as an expression, and evaluates it with the global variable x changing from 0.0 to 1.0 in 11 steps:
#include <stdio.h>
#include <stdlib.h> /* free() */
#include "le.h"
int main(int argc, char **argv)
{
int cookie;
int i;
char *msg = NULL;
if (!le_init()) {
printf("can't init LE\n");
return 1;
}
if (argc<2) {
printf("Usage: t-le \"expression\"\n");
return 1;
}
cookie = le_loadexpr(argv[1], &msg);
if (msg) {
printf("can't load: %s\n", msg);
free(msg);
return 1;
}
printf(" x %s\n"
"------ --------\n", argv[1]);
for (i=0; i<11; ++i) {
double x = i/10.;
double y;
le_setvar("x",x);
y = le_eval(cookie, &msg);
if (msg) {
printf("can't eval: %s\n", msg);
free(msg);
return 1;
}
printf("%6.2f %.3f\n", x,y);
}
}
Here is some output from t-le:
E:...>t-le "math.sin(math.pi * x)"
x math.sin(math.pi * x)
------ --------
0.00 0.000
0.10 0.309
0.20 0.588
0.30 0.809
0.40 0.951
0.50 1.000
0.60 0.951
0.70 0.809
0.80 0.588
0.90 0.309
1.00 0.000
E:...>
Implementation of LE
Here is le.c, implementing the Lua Expression evaluator:
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h> /* luaL_openlibs() */
#include <stdlib.h>
#include <string.h>
static lua_State *L = NULL;
/* Initialize the LE library by creating a Lua state.
*
* The new Lua interpreter state has the "usual" standard libraries
* open.
*/
int le_init()
{
L = luaL_newstate();
if (L)
luaL_openlibs(L);
return !!L;
}
/* Load an expression, returning a cookie that can be used later to
* select this expression for evaluation by le_eval(). Note that
* le_unref() must eventually be called to free the expression.
*
* The cookie is a luaL_ref() reference to a function that evaluates the
* expression when called. Any variables in the expression are assumed
* to refer to the global environment, which is _G in the interpreter.
* A refinement might be to isolate the function environment from the
* globals.
*
* The implementation rewrites the expr as "return "..expr so that the
* anonymous function actually produced by luaL_loadstring() looks like:
*
* function() return expr end
*
*
* If there is an error and the pmsg parameter is non-NULL, the char *
* it points to is filled with an error message. The message is
* allocated by strdup() so the caller is responsible for freeing the
* storage.
*
* Returns a valid cookie or the constant LUA_NOREF (-2).
*/
int le_loadexpr(char *expr, char **pmsg)
{
int err;
char *buf;
if (!L) {
if (pmsg)
*pmsg = strdup("LE library not initialized");
return LUA_NOREF;
}
buf = malloc(strlen(expr)+8);
if (!buf) {
if (pmsg)
*pmsg = strdup("Insufficient memory");
return LUA_NOREF;
}
strcpy(buf, "return ");
strcat(buf, expr);
err = luaL_loadstring(L,buf);
free(buf);
if (err) {
if (pmsg)
*pmsg = strdup(lua_tostring(L,-1));
lua_pop(L,1);
return LUA_NOREF;
}
if (pmsg)
*pmsg = NULL;
return luaL_ref(L, LUA_REGISTRYINDEX);
}
/* Evaluate the loaded expression.
*
* If there is an error and the pmsg parameter is non-NULL, the char *
* it points to is filled with an error message. The message is
* allocated by strdup() so the caller is responsible for freeing the
* storage.
*
* Returns the result or 0 on error.
*/
double le_eval(int cookie, char **pmsg)
{
int err;
double ret;
if (!L) {
if (pmsg)
*pmsg = strdup("LE library not initialized");
return 0;
}
lua_rawgeti(L, LUA_REGISTRYINDEX, cookie);
err = lua_pcall(L,0,1,0);
if (err) {
if (pmsg)
*pmsg = strdup(lua_tostring(L,-1));
lua_pop(L,1);
return 0;
}
if (pmsg)
*pmsg = NULL;
ret = (double)lua_tonumber(L,-1);
lua_pop(L,1);
return ret;
}
/* Free the loaded expression.
*/
void le_unref(int cookie)
{
if (!L)
return;
luaL_unref(L, LUA_REGISTRYINDEX, cookie);
}
/* Set a variable for use in an expression.
*/
void le_setvar(char *name, double value)
{
if (!L)
return;
lua_pushnumber(L,value);
lua_setglobal(L,name);
}
/* Retrieve the current value of a variable.
*/
double le_getvar(char *name)
{
double ret;
if (!L)
return 0;
lua_getglobal(L,name);
ret = (double)lua_tonumber(L,-1);
lua_pop(L,1);
return ret;
}
Remarks
The above sample consists of 189 lines of code total, including a spattering of comments, blank lines, and the demonstration. Not bad for a quick function evaluator that knows how to evaluate reasonably arbitrary expressions of one variable, and has a rich library of standard math functions at its beck and call.
You have a Turing-complete language underneath it all, and it would be an easy extension to allow the user to define complete functions as well as to evaluate simple expressions.
Since you're lazy, like most programmers, here's a link to a simple example that you can use to parse some arbitrary code using Lua. From there, it should be simple to create your expression parser.
This is for Lua users that are looking for a Lua equivalent of "eval".
The magic word used to be loadstring but it is now, since Lua 5.2, an upgraded version of load.
i=0
f = load("i = i + 1") -- f is a function
f() ; print(i) -- will produce 1
f() ; print(i) -- will produce 2
Another example, that delivers a value :
f=load('return 2+3')
print(f()) -- print 5
As a quick-and-dirty approach, you can consider the following equivalent of eval(s), where s is a string to evaluate:
load(s)()
As always, eval mechanisms should be avoided when possible, since they are expensive and produce code that is difficult to read.
I personally use this mechanism with LuaTeX/LuaLaTeX to do math operations in LaTeX.
The Lua documentation contains a section titled The Application Programming Interface which describes how to call Lua from your C program. The documentation for Lua is very good and you may even be able to find an example of what you want to do in there.
It's a big world in there, so whether you choose your own parsing solution or an embeddable interpreter like Lua, you're going to have some work to do!
function calc(operation)
return load("return " .. operation)()
end
