About extending a Look Up Table at compile time - c++17

I'd like to extend my instrumental Profiler in order to avoid it affect too much performances.
Im my current implementation, I'm using a ProfilerHelper taking one string, which is put whereever you want in the profiling f().
The ctor is starting the measurement and the dector is closing it, logging the Delta in an unordered_map entry, which is key is the string.
Now, I'd like to turn all of that into a faster stuff.
First of all, I'd like to create a string LUT (Look Up Table) contaning the f()s names at compile time, and turn the unordered_map to a plain vector which is paired by the string function LUT.
Now the question is: I've managed to create a LUT but std::string_view, but I cannot find a way to extend it at compile time.
A first rought trial sounds like this:
template<unsigned N>
constexpr auto LUT() {
std::array<std::string_view, N> Strs{};
for (unsigned n = 0; n < N; n++) {
Strs[n] = "";
}
return Strs;
};
constexpr std::array<std::string_view, 0> StringsLUT { LUT<0>() };
constexpr auto AddString(std::string_view const& Str)
{
constexpr auto Size = StringsLUT.size();
std::array<std::string_view, Size + 1> Copy{};
for (auto i = 0; i < Size; ++i)
Copy[i] = StringsLUT[i];
Copy[Size] = Str;
return Copy;
};
int main()
{
constexpr auto Strs = AddString(__builtin_FUNCTION());
//for (auto const Str : Strs)
std::cout << Strs[0] << std::endl;
}
So my idea should be to recall the AddString whenever needed in my f()s to be profiled, extending this list at compile time.
But of course I should take the returned Copy and replace the StringsLUT everytime, to land to a final StringsLUT with all the f() names inside it.
Is there a way to do that at compile time?
Sorry, but I'm just entering the magic "new" world of constexpr applied to LUT right in these days.
Tx for your support in advance.

Related

How to make an operation similar to _mm_extract_epi8 with non-immediate input?

What I want is extracting a value from vector using a variable scalar index.
Like _mm_extract_epi8 / _mm256_extract_epi8 but with non-immediate input.
(There are some results in the vector, the one with the given index is found out to be the true result, the rest are discarded)
Especially, if index is in a GPR, the easiest way is probably to store val to memory and then movzx it into another GPR. Sample implementation using C:
uint8_t extract_epu8var(__m256i val, int index) {
union {
__m256i m256;
uint8_t array[32];
} tmp;
tmp.m256 = val;
return tmp.array[index];
}
Godbolt translation (note that a lot of overhead happens for stack alignment -- if you don't have an aligned temporary storage area, you could just vmovdqu instead of vmovdqa): https://godbolt.org/z/Gj6Eadq9r
So far the best option seem to be using _mm_shuffle_epi8 for SSE
uint8_t extract_epu8var(__m128i val, int index) {
return (uint8_t)_mm_cvtsi128_si32(
_mm_shuffle_epi8(val, _mm_cvtsi32_si128(index)));
}
Unfortunately this does not scale well for AVX. vpshufb does not shuffle across lanes. There is a cross lane shuffle _mm256_permutevar8x32_epi32, but the resulting stuff seem to be complicated:
uint8_t extract_epu8var(__m256i val, int index) {
int index_low = index & 0x3;
int index_high = (index >> 2);
return (uint8_t)(_mm256_cvtsi256_si32(_mm256_permutevar8x32_epi32(
val, _mm256_zextsi128_si256(_mm_cvtsi32_si128(index_high))))
>> (index_low << 3));
}

Search for sequence in Uint8List

Is there a fast (native) method to search for a sequence in a Uint8List?
///
/// Return index of first occurrence of seq in list
///
int indexOfSeq(Uint8List list, Uint8List seq) {
...
}
EDIT: Changed List<int> into Uint8List
No. There is no built-in way to search for a sequence of elements in a list.
I am also not aware of any dart:ffi based implementations.
The simplest approach would be:
extension IndexOfElements<T> on List<T> {
int indexOfElements(List<T> elements, [int start = 0]) {
if (elements.isEmpty) return start;
var end = length - elements.length;
if (start > end) return -1;
var first = elements.first;
var pos = start;
while (true) {
pos = indexOf(first, pos);
if (pos < 0 || pos > end) return -1;
for (var i = 1; i < elements.length; i++) {
if (this[pos + i] != elements[i]) {
pos++;
continue;
}
}
return pos;
}
}
}
This has worst-case time complexity O(length*elements.length). There are several more algorithms with better worst-case complexity, but they also have larger constant factors and more expensive pre-computations (KMP, BMH). Unless you search for the same long list several times, or do so in a very, very long list, they're unlikely to be faster in practice (and they'd probably have an API where you compile the pattern first, then search with it.)
You could use dart:ffi to bind to memmem from string.h as you suggested.
We do the same with binding to malloc from stdlib.h in package:ffi (source).
final DynamicLibrary stdlib = Platform.isWindows
? DynamicLibrary.open('kernel32.dll')
: DynamicLibrary.process();
final PosixMalloc posixMalloc =
stdlib.lookupFunction<Pointer Function(IntPtr), Pointer Function(int)>('malloc');
Edit: as lrn pointed out, we cannot expose the inner data pointer of a Uint8List at the moment, because the GC might relocate it.
One could use dart_api.h and use the FFI to pass TypedData through the FFI trampoline as Dart_Handle and use Dart_TypedDataAcquireData from the dart_api.h to access the inner data pointer.
(If you want to use this in Flutter, we would need to expose Dart_TypedDataAcquireData and Dart_TypedDataReleaseData in dart_api_dl.h https://github.com/dart-lang/sdk/issues/40607 I've filed https://github.com/dart-lang/sdk/issues/44442 to track this.)
Alternatively, could address https://github.com/dart-lang/sdk/issues/36707 so that we could just expose the inner data pointer of a Uint8List directly in the FFI trampoline.

How do I find the SourceLocation of the commas between function arguments using libtooling?

My main goal is trying to get macros (or even just the text) before function parameters. For example:
void Foo(_In_ void* p, _Out_ int* x, _Out_cap_(2) int* y);
I need to gracefully handle things like macros that declare parameters (by ignoring them).
#define Example _In_ int x
void Foo(Example);
I've looked at Preprocessor record objects and used Lexer::getSourceText to get the macro names In, Out, etc, but I don't see a clean way to map them back to the function parameters.
My current solution is to record all the macro expansions in the file and then compare their SourceLocation to the ParamVarDecl SourceLocation. This mostly works except I don't know how to skip over things after the parameter.
void Foo(_In_ void* p _Other_, _In_ int y);
Getting the SourceLocation of the comma would work, but I can't find that anywhere.
The title of the questions asks for libclang, but as you use Lexer::getSourceText I assume that it's libTooling. The rest of my answer is viable only in terms of libTooling.
Solution 1
Lexer works on the level of tokens. Comma is also a token, so you can take the end location of a parameter and fetch the next token using Lexer::findNextToken.
Here is a ParmVarDecl (for function parameters) and CallExpr (for function arguments) visit functions that show how to use it:
template <class T> void printNextTokenLocation(T *Node) {
auto NodeEndLocation = Node->getSourceRange().getEnd();
auto &SM = Context->getSourceManager();
auto &LO = Context->getLangOpts();
auto NextToken = Lexer::findNextToken(NodeEndLocation, SM, LO);
if (!NextToken) {
return;
}
auto NextTokenLocation = NextToken->getLocation();
llvm::errs() << NextTokenLocation.printToString(SM) << "\n";
}
bool VisitParmVarDecl(ParmVarDecl *Param) {
printNextTokenLocation(Param);
return true;
}
bool VisitCallExpr(CallExpr *Call) {
for (auto *Arg : Call->arguments()) {
printNextTokenLocation(Arg);
}
return true;
}
For the following code snippet:
#define FOO(x) int x
#define BAR float d
#define MINUS -
#define BLANK
void foo(int a, double b ,
FOO(c) , BAR) {}
int main() {
foo( 42 ,
36.6 , MINUS 10 , BLANK 0.0 );
return 0;
}
it produces the following output (six locations for commas and two for parentheses):
test.cpp:6:15
test.cpp:6:30
test.cpp:7:19
test.cpp:7:24
test.cpp:10:17
test.cpp:11:12
test.cpp:11:28
test.cpp:11:43
This is quite a low-level and error-prone approach though. However, you can change the way you solve the original problem.
Solution 2
Clang stores information about expanded macros in its source locations. You can find related methods in SourceManager (for example, isMacroArgExpansion or isMacroBodyExpansion). As the result, you can visit ParmVarDecl nodes and check their locations for macro expansions.
I would strongly advice moving in the second direction.
I hope this information will be helpful. Happy hacking with Clang!
UPD speaking of attributes, unfortunately, you won't have a lot of choices. Clang does ignore any unknown attribute and this behaviour is not tweakable. If you don't want to patch Clang itself and add your attributes to Attrs.td, then you're limited indeed to tokens and the first approach.

Print cv::Mat opencv

I am trying to print cv::Mat which contains my image. However whenever I print the Mat using cout, a 2D array printed into my text file. I want to print one one pixel in one line only. How can i print line wise pixels from cv::Mat.
A generic for_each loop, you could use it to print your data
/**
*#brief implement details of for_each_channel, user should not use this function
*/
template<typename T, typename UnaryFunc>
UnaryFunc for_each_channel_impl(cv::Mat &input, int channel, UnaryFunc func)
{
int const rows = input.rows;
int const cols = input.cols;
int const channels = input.channels();
for(int row = 0; row != rows; ++row){
auto *input_ptr = input.ptr<T>(row) + channel;
for(int col = 0; col != cols; ++col){
func(*input_ptr);
input_ptr += channels;
}
}
return func;
}
use it like
for_each_channel_impl<uchar>(input, 0, [](uchar a){ std::cout<<(size_t)a<<", "; });
you could do some optimization to continuous channel, then it may looks like
/**
*#brief apply stl like for_each algorithm on a channel
*
* #param
* T : the type of the channel(ex, uchar, float, double and so on)
* #param
* channel : the channel need to apply for_each algorithm
* #param
* func : Unary function that accepts an element in the range as argument
*
*#return :
* return func
*/
template<typename T, typename UnaryFunc>
inline UnaryFunc for_each_channel(cv::Mat &input, int channel, UnaryFunc func)
{
if(input.channels() == 1 && input.isContinuous()){
return for_each_continuous_channels<T>(input, func);
}else{
return for_each_channel_impl<T>(input, channel, func);
}
}
This kind of generic loopsave me a lot of times, I hope you find it helpful.If there are
any bugs, or you have better idea, please tell me.
I would like to design some generic algorithms for opencl too, sadly it do not support
template, I hope one day CUDA will become an open standard, or opencl will support template.
This works for any number of channels as long as the channels type are base on byte, non-byte
channel may not work.

Evaluating Mathematical Expressions using Lua

In my previous question I was looking for a way of evaulating complex mathematical expressions in C, most of the suggestions required implementing some type of parser.
However one answer, suggested using Lua for evaluating the expression. I am interested in this approach but I don't know anything about Lua.
Can some one with experience in Lua shed some light?
Specifically what I'd like to know is
Which API if any does Lua provide that can evaluate mathematical expressions passed in as a string? If there is no API to do such a thing, may be some one can shed some light on the linked answer as it seemed like a good approach :)
Thanks
The type of expression I'd like to evaluate is given some user input such as
y = x^2 + 1/x - cos(x)
evaluate y for a range of values of x
It is straightforward to set up a Lua interpreter instance, and pass it expressions to be evaluated, getting back a function to call that evaluates the expression. You can even let the user have variables...
Here's the sample code I cooked up and edited into my other answer. It is probably better placed on a question tagged Lua in any case, so I'm adding it here as well. I compiled this and tried it for a few cases, but it certainly should not be trusted in production code without some attention to error handling and so forth. All the usual caveats apply here.
I compiled and tested this on Windows using Lua 5.1.4 from Lua for Windows. On other platforms, you'll have to find Lua from your usual source, or from www.lua.org.
Update: This sample uses simple and direct techniques to hide the full power and complexity of the Lua API behind as simple as possible an interface. It is probably useful as-is, but could be improved in a number of ways.
I would encourage readers to look into the much more production-ready ae library by lhf for code that takes advantage of the API to avoid some of the quick and dirty string manipulation I've used. His library also promotes the math library into the global name space so that the user can say sin(x) or 2 * pi without having to say math.sin and so forth.
Public interface to LE
Here is the file le.h:
/* Public API for the LE library.
*/
int le_init();
int le_loadexpr(char *expr, char **pmsg);
double le_eval(int cookie, char **pmsg);
void le_unref(int cookie);
void le_setvar(char *name, double value);
double le_getvar(char *name);
Sample code using LE
Here is the file t-le.c, demonstrating a simple use of this library. It takes its single command-line argument, loads it as an expression, and evaluates it with the global variable x changing from 0.0 to 1.0 in 11 steps:
#include <stdio.h>
#include "le.h"
int main(int argc, char **argv)
{
int cookie;
int i;
char *msg = NULL;
if (!le_init()) {
printf("can't init LE\n");
return 1;
}
if (argc<2) {
printf("Usage: t-le \"expression\"\n");
return 1;
}
cookie = le_loadexpr(argv[1], &msg);
if (msg) {
printf("can't load: %s\n", msg);
free(msg);
return 1;
}
printf(" x %s\n"
"------ --------\n", argv[1]);
for (i=0; i<11; ++i) {
double x = i/10.;
double y;
le_setvar("x",x);
y = le_eval(cookie, &msg);
if (msg) {
printf("can't eval: %s\n", msg);
free(msg);
return 1;
}
printf("%6.2f %.3f\n", x,y);
}
}
Here is some output from t-le:
E:...>t-le "math.sin(math.pi * x)"
x math.sin(math.pi * x)
------ --------
0.00 0.000
0.10 0.309
0.20 0.588
0.30 0.809
0.40 0.951
0.50 1.000
0.60 0.951
0.70 0.809
0.80 0.588
0.90 0.309
1.00 0.000
E:...>
Implementation of LE
Here is le.c, implementing the Lua Expression evaluator:
#include <lua.h>
#include <lauxlib.h>
#include <stdlib.h>
#include <string.h>
static lua_State *L = NULL;
/* Initialize the LE library by creating a Lua state.
*
* The new Lua interpreter state has the "usual" standard libraries
* open.
*/
int le_init()
{
L = luaL_newstate();
if (L)
luaL_openlibs(L);
return !!L;
}
/* Load an expression, returning a cookie that can be used later to
* select this expression for evaluation by le_eval(). Note that
* le_unref() must eventually be called to free the expression.
*
* The cookie is a lua_ref() reference to a function that evaluates the
* expression when called. Any variables in the expression are assumed
* to refer to the global environment, which is _G in the interpreter.
* A refinement might be to isolate the function envioronment from the
* globals.
*
* The implementation rewrites the expr as "return "..expr so that the
* anonymous function actually produced by lua_load() looks like:
*
* function() return expr end
*
*
* If there is an error and the pmsg parameter is non-NULL, the char *
* it points to is filled with an error message. The message is
* allocated by strdup() so the caller is responsible for freeing the
* storage.
*
* Returns a valid cookie or the constant LUA_NOREF (-2).
*/
int le_loadexpr(char *expr, char **pmsg)
{
int err;
char *buf;
if (!L) {
if (pmsg)
*pmsg = strdup("LE library not initialized");
return LUA_NOREF;
}
buf = malloc(strlen(expr)+8);
if (!buf) {
if (pmsg)
*pmsg = strdup("Insufficient memory");
return LUA_NOREF;
}
strcpy(buf, "return ");
strcat(buf, expr);
err = luaL_loadstring(L,buf);
free(buf);
if (err) {
if (pmsg)
*pmsg = strdup(lua_tostring(L,-1));
lua_pop(L,1);
return LUA_NOREF;
}
if (pmsg)
*pmsg = NULL;
return luaL_ref(L, LUA_REGISTRYINDEX);
}
/* Evaluate the loaded expression.
*
* If there is an error and the pmsg parameter is non-NULL, the char *
* it points to is filled with an error message. The message is
* allocated by strdup() so the caller is responsible for freeing the
* storage.
*
* Returns the result or 0 on error.
*/
double le_eval(int cookie, char **pmsg)
{
int err;
double ret;
if (!L) {
if (pmsg)
*pmsg = strdup("LE library not initialized");
return 0;
}
lua_rawgeti(L, LUA_REGISTRYINDEX, cookie);
err = lua_pcall(L,0,1,0);
if (err) {
if (pmsg)
*pmsg = strdup(lua_tostring(L,-1));
lua_pop(L,1);
return 0;
}
if (pmsg)
*pmsg = NULL;
ret = (double)lua_tonumber(L,-1);
lua_pop(L,1);
return ret;
}
/* Free the loaded expression.
*/
void le_unref(int cookie)
{
if (!L)
return;
luaL_unref(L, LUA_REGISTRYINDEX, cookie);
}
/* Set a variable for use in an expression.
*/
void le_setvar(char *name, double value)
{
if (!L)
return;
lua_pushnumber(L,value);
lua_setglobal(L,name);
}
/* Retrieve the current value of a variable.
*/
double le_getvar(char *name)
{
double ret;
if (!L)
return 0;
lua_getglobal(L,name);
ret = (double)lua_tonumber(L,-1);
lua_pop(L,1);
return ret;
}
Remarks
The above sample consists of 189 lines of code total, including a spattering of comments, blank lines, and the demonstration. Not bad for a quick function evaluator that knows how to evaluate reasonably arbitrary expressions of one variable, and has rich library of standard math functions at its beck and call.
You have a Turing-complete language underneath it all, and it would be an easy extension to allow the user to define complete functions as well as to evaluate simple expressions.
Since you're lazy, like most programmers, here's a link to a simple example that you can use to parse some arbitrary code using Lua. From there, it should be simple to create your expression parser.
This is for Lua users that are looking for a Lua equivalent of "eval".
The magic word used to be loadstring but it is now, since Lua 5.2, an upgraded version of load.
i=0
f = load("i = i + 1") -- f is a function
f() ; print(i) -- will produce 1
f() ; print(i) -- will produce 2
Another example, that delivers a value :
f=load('return 2+3')
print(f()) -- print 5
As a quick-and-dirty way to do, you can consider the following equivalent of eval(s), where s is a string to evaluate :
load(s)()
As always, eval mechanisms should be avoided when possible since they are expensive and produce a code difficult to read.
I personally use this mechanism with LuaTex/LuaLatex to make math operations in Latex.
The Lua documentation contains a section titled The Application Programming Interface which describes how to call Lua from your C program. The documentation for Lua is very good and you may even be able to find an example of what you want to do in there.
It's a big world in there, so whether you choose your own parsing solution or an embeddable interpreter like Lua, you're going to have some work to do!
function calc(operation)
return load("return " .. operation)()
end

Resources