What is LLVM ExtractValueInst?

I was studying some LLVM code, where I found these lines:
const ExtractValueInst *EI = cast<ExtractValueInst>(I);
st.setValue(I, st.getValue(EI->getAggregateOperand()));
Now, I understand why cast<...> is being used, but I can't place ExtractValueInst. Can you give me an example of what this instruction is in IR and what the equivalent C code is? I would also like to know about the getAggregateOperand() function. Thank you in advance.

Suppose you have a function that returns a 32-bit integer that isn't really one integer, but rather a set of smaller bitfields and/or bools. Perhaps the bottom six bits are an integer in the range 0-63, the seventh bit is a boolean, etc.
Somewhere you have a call to that function, and a little below the call, you have code that uses the i6 that's a part of the return value. So you create an extractvalue to extract a value from the composite return value. (If the composite were in main memory you'd probably create a getelementptr/load pair, but since it's most likely in a CPU register you create an extractvalue.)
This is quite often used e.g. in exception handling; a catch clause has a single parameter which is a composite of two things, and the catch clause tests one of the two components to determine whether to catch the exception or pass it on.
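To make that concrete (the struct and function names below are invented for illustration), the C side might be:

struct pair { int a; int b; };
struct pair make_pair(void);

int use_first(void)
{
    struct pair p = make_pair();
    return p.a;               /* read one member of the composite return value */
}

In LLVM IR, reading one member of such an aggregate SSA value is written with extractvalue, roughly:

%pair = type { i32, i32 }

declare %pair @make_pair()

define i32 @use_first() {
  %r = call %pair @make_pair()
  %a = extractvalue %pair %r, 0        ; take element 0 of the aggregate %r
  ret i32 %a
}

Here %r is what EI->getAggregateOperand() returns in the C++ API: the aggregate value being indexed into, as opposed to the constant indices (available through EI->getIndices()). That is why the snippet in the question maps the result of the extractvalue to whatever value st tracks for its aggregate operand.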

Related

Trying to understand this profiling result from F# code

A quick screenshot with the point of interest:
There are 2 questions here.
This happens in a tight loop. The 12.8% code is this:
{
this with Side = side; PositionPrice = position'; StopLossPrice = sl'; TakeProfitPrice = tp'; Volume = this.Volume + this.Quantity * position'
}
This object is passed around a lot and has 23 fields so it's not tiny. It looks like immutability is great for stable code, but it's horrible for performance.
Since this recursive loop is run in parallel, I need to store its context variables in an object.
I am looking for a general idea of what makes sense, not something specific to that code because I have a few tight loops with a bunch of math which I need to profile as well. I am sure I'll find the same pattern in several places.
The flaw here is that I store both the context for the calculations and its variables in a single type that gets passed through the loop. As the variable fields get updated, the whole object has to be recreated.
What would make sense here (in general, for this type of situation)?
make the fields that can change mutable. In this case, that means keeping the type as is (23 fields) and making some fields mutable (only 5 fields get regularly changed)
move the mutable fields to their own type, to have a general context object and one holding all the variables. In this case, that means having a context with (23 - 5) fields and a separate 5-field type
make the mutable fields plain variables and move them out of the type. In this case, these 5 fields would be passed as variables in the recursive loop?
and for the second question:
I have no idea what the 10.0% line with get_Tag is. I have nothing called 'Tag' in the code, so I assume that's a dotnet internal thing.
I have a type called Side and there is a field with the same name used in the loop, but what is the 'Tag' part?
What I would suggest is not to modify your existing immutable type at all. Instead, create a new type with mutable fields that is only used within your tight loop. If the type leaves that loop, convert it back to your immutable type (assuming you don't need a copy to go through the rest of your program with every iteration).
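A minimal sketch of that idea, with invented type and field names (the real type has 23 fields; only the handful that change each iteration go into the mutable scratch type):

// Immutable record used everywhere else in the program.
type TradeState =
    { // ... the other, rarely-changing fields live here too (23 in the real code) ...
      Side            : int
      PositionPrice   : decimal
      StopLossPrice   : decimal
      TakeProfitPrice : decimal
      Volume          : decimal
      Quantity        : decimal }

// Mutable scratch record that never escapes the hot loop.
type LoopScratch =
    { mutable Side            : int
      mutable PositionPrice   : decimal
      mutable StopLossPrice   : decimal
      mutable TakeProfitPrice : decimal
      mutable Volume          : decimal }

let toScratch (s : TradeState) : LoopScratch =
    { Side = s.Side; PositionPrice = s.PositionPrice; StopLossPrice = s.StopLossPrice
      TakeProfitPrice = s.TakeProfitPrice; Volume = s.Volume }

// One allocation at the end instead of one per iteration.
let toState (scratch : LoopScratch) (original : TradeState) : TradeState =
    { original with
        Side = scratch.Side
        PositionPrice = scratch.PositionPrice
        StopLossPrice = scratch.StopLossPrice
        TakeProfitPrice = scratch.TakeProfitPrice
        Volume = scratch.Volume }

Inside the loop you mutate the scratch fields in place, and convert back with toState once, when the value leaves the loop.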
get_Tag in this case is likely the auto-generated get-only property on a discriminated union; it's just how the F# compiler represents this sort of type in the CLR. The property can most easily be seen when looking at F# code from C#; here's a great page on F# decompiled:
https://fsharpforfunandprofit.com/posts/fsharp-decompiled/#unions
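For example, if Side is a discriminated union along these lines (a guess at the shape, not the actual code):

type Side =
    | Buy
    | Sell

// The compiler turns this into a class with an int-valued Tag property
// (Buy = 0, Sell = 1). Pattern matches such as
//     match this.Side with Buy -> ... | Sell -> ...
// compile down to reads of that property, which the profiler reports as get_Tag.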
For the performance issues I can only offer some suggestions:
If you can constrain the context object to your code only, then try making a mutable version and see what effect it has.
You mention that the context object is quite large; is it possible to split it up?

What's the difference between luaL_checknumber and lua_tonumber?

Clarification (sorry the question was not specific): They both try to convert the item on the stack to a lua_Number. lua_tonumber will also convert a string that represents a number. How does luaL_checknumber deal with something that's not a number?
There's also luaL_checklong and luaL_checkinteger. Are they the same as (int)luaL_checknumber and (long)luaL_checknumber respectively?
The reference manual does answer this question. I'm citing the Lua 5.2 Reference Manual, but similar text is found in the 5.1 manual as well. The manual is, however, quite terse. It is rare for any single fact to be restated in more than one sentence. Furthermore, you often need to correlate facts stated in widely separated sections to understand the deeper implications of an API function.
This is not a defect, it is by design. This is the reference manual to the language, and as such its primary goal is to completely (and correctly) describe the language.
For more information about "how" and "why" the general advice is to also read Programming in Lua. The online copy is getting rather long in the tooth as it describes Lua 5.0. The current paper edition describes Lua 5.1, and a new edition describing Lua 5.2 is in process. That said, even the first edition is a good resource, as long as you also pay attention to what has changed in the language since version 5.0.
The reference manual has a fair amount to say about the luaL_check* family of functions.
Each API entry's documentation block is accompanied by a token that describes its use of the stack, and under what conditions (if any) it will throw an error. Those tokens are described at section 4.8:
Each function has an indicator like this: [-o, +p, x]
The first field, o, is how many elements the function pops from the stack. The second field, p, is how many elements the function pushes onto the stack. (Any function always pushes its results after popping its arguments.) A field in the form x|y means the function can push (or pop) x or y elements, depending on the situation; an interrogation mark '?' means that we cannot know how many elements the function pops/pushes by looking only at its arguments (e.g., they may depend on what is on the stack). The third field, x, tells whether the function may throw errors: '-' means the function never throws any error; 'e' means the function may throw errors; 'v' means the function may throw an error on purpose.
At the head of Chapter 5, which documents the auxiliary library as a whole (all functions in the official API whose names begin with luaL_ rather than just lua_), we find this:
Several functions in the auxiliary library are used to check C function arguments. Because the error message is formatted for arguments (e.g., "bad argument #1"), you should not use these functions for other stack values.
Functions called luaL_check* always throw an error if the check is not satisfied.
The function luaL_checknumber is documented with the token [-0,+0,v] which means that it does not disturb the stack (it pops nothing and pushes nothing) and that it might deliberately throw an error.
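In practice that means a typical argument check in a C function registered with Lua looks like this (a minimal sketch; the function and parameter names are invented):

#include <lua.h>
#include <lauxlib.h>

static int l_setvolume(lua_State *L)
{
    /* If argument 1 is not a number (or a string convertible to one),
       this raises an error of the form "bad argument #1 to 'setvolume'"
       and longjmps out of the function; otherwise it returns the value. */
    lua_Number v = luaL_checknumber(L, 1);

    /* ... use v ... */
    (void)v;
    return 0;   /* number of results pushed onto the stack */
}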
The other functions that have more specific numeric types differ primarily in function signature. All are described similarly to luaL_checkint() "Checks whether the function argument arg is a number and returns this number cast to an int", varying the type named in the cast as appropriate.
The function lua_tonumber() is described with the token [-0,+0,-] meaning it has no effect on the stack and does not throw any errors. It is documented to return the numeric value from the specified stack index, or 0 if the stack index does not contain something sufficiently numeric. It is documented to use the more general function lua_tonumberx() which also provides a flag indicating whether it successfully converted a number or not.
It too has siblings named with more specific numeric types that do all the same conversions but cast their results.
Finally, one can also refer to the source code, with the understanding that the manual is describing the language as it is intended to be, while the source is a particular implementation of that language and might have bugs, or might reveal implementation details that are subject to change in future versions.
The source to luaL_checknumber() is in lauxlib.c. It can be seen to be implemented in terms of lua_tonumberx() and the internal function tagerror() which calls typerror() which is implemented with luaL_argerror() to actually throw the formatted error message.
They both try to convert the item on the stack to a lua_Number, and lua_tonumber will also convert a string that represents a number. luaL_checknumber throws a (Lua) error when the conversion fails: it longjmps and never returns, from the point of view of the C function. lua_tonumber merely returns 0 (which can be a valid result as well). So you could write the following, which should be faster than checking with lua_isnumber first:
double r = lua_tonumber(_L, idx);       /* returns 0 on failure, but 0 is also a legal value */
if (r == 0 && !lua_isnumber(_L, idx))   /* so only consult lua_isnumber in the rare 0 case */
{
    // Error handling code
}
return r;
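On Lua 5.2 and later you can avoid the zero ambiguity altogether by calling lua_tonumberx, which reports success through an out parameter (a small sketch along the same lines as the code above):

int isnum;
lua_Number r = lua_tonumberx(_L, idx, &isnum);  /* isnum is set to 1 on success, 0 otherwise */
if (!isnum)
{
    // Error handling code
}
return r;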

fortran save integer

I came across some code today that looked somewhat like this:
subroutine foo()
real blah
integer bar,k,i,j,ll
integer :: n_called=1
save integer
...
end
It seems like the intent here was probably save n_called, but is that even a valid statement to save all integers -- or is it implicitly declaring a variable named integer and saving it?
The second interpretation is correct. Fortran has many keywords, INTEGER being one of them, but it has no reserved words, which means that keywords can be used as identifiers, though this is usually a terrible idea (but nevertheless it carries over to C#, where one can prefix a keyword with @ and use it as an identifier, right?)
The SAVE statement, even if it was intended for n_called, is superfluous. Fortran automatically saves all variables that have initialisers, and that's why the code probably works as intended.
integer :: n_called=1
Here n_called is automatically SAVE. This usually comes as a really bad surprise to C/C++ programmers forced to maintain/extend/create new Fortran code :)
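A minimal sketch of that behaviour (subroutine and program names invented):

program demo
  call counter()
  call counter()
  call counter()
end program demo

subroutine counter()
  implicit none
  integer :: n_called = 1    ! initialised, therefore implicitly SAVEd
  print *, n_called          ! prints 1, then 2, then 3
  n_called = n_called + 1
end subroutine counter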
I agree with your 2nd interpretation, that is, the statement save integer implicitly declares a variable called integer and gives it the save attribute. Fortran, of course, has no rule against using keywords as program entity names, though most sensible software developers do have such a rule.
If I try to compile your code snippet as you have presented it, my compiler (Intel Fortran) makes no complaint. If I insert implicit none at the right place it reports the error
This name does not have a type, and must have an explicit type. [INTEGER]
The other interpretation, that it gives the save attribute to all integer variables, seems at odds with the language standards and it's not a variation that I've ever come across.

Recommended initialization values for numbers

Assume you have a variety of Number or int based variables that you want to be initialized to some default value. But using 0 could be problematic, because 0 is meaningful and could have side effects.
Are there any conventions around this?
I have been working in ActionScript lately and have a variety of value objects with optional parameters, so I set most variables to null, but for Numbers or ints I can't use null. An example:
package com.website.app.model.vo
{
    public class MyValueObject
    {
        public function MyValueObject (
            _id:String = null,
            _amount:Number = 0,
            _isPurchased:Boolean = false
        )
        { // Constructor
            if ( _id != null ) this.id = _id;
            if ( _amount != 0 ) this.amount = _amount;
            if ( _isPurchased != false ) this.isPurchased = _isPurchased;
        }

        public var id:String;
        public var amount:Number;
        public var isPurchased:Boolean;
    }
}
The difficulty is that using 0 in the above code might be problematic if the value is never changed from its initial value. It is easy to detect whether a variable has a null value, but detecting 0 may not be so easy because 0 might be a legitimate value. I want to set a default value to make the parameter optional, but I also want to detect later in my code whether the value was changed from its default, without hard-to-debug side effects.
I suppose I could use something like -1 for a value. I was wondering if there are any well known coding conventions for this kind of thing? I suppose it depends on the nature of the variable and the data.
This is my first Stack Overflow question. Hopefully the gist of it makes sense.
A lot of debuggers will use 0xdeadbeef for initializing registers. I always get a chuckle when I see that.
But, in all honesty, your question contains its own answer - use a value that your variable is not ever expected to become. It doesn't matter what the value is.
Since you asked in a comment I'll talk a little bit about C and C++. For efficiency reasons local variables and allocated memory are not initialized by default. But debug builds often do this to help catch errors. A common value used is 0xcdcdcdcd which is reasonably unlikely. It has the high bit set and is either a rather large unsigned or rather large negative signed number. As a pointer address it is odd which will cause an alignment exception if used on anything but a char (but not on X86). It has no special meaning as a 32 bit floating point number so it isn't a perfect choice.
Occasionally you'll see a partially overwritten fill value in a variable, such as 0xcdcd0000 or 0x0000cdcd. These can be treated as suspicious at the very least.
Sometimes different fill values are used depending on the allocation area or library. That gives you a clue where a bad value may have originated (i.e., it itself wasn't initialized but was copied from an uninitialized value).
The ideal value would be invalid no matter what alignment it is read from memory at, and invalid for all primitive types. It should also look suspicious to a human, so that even someone who doesn't know the convention can suspect something is afoot. That's why 0xdeadbeef can be a good choice: a programmer viewing the hex will recognize it as the work of a human and not random chance. Note also that it is odd and has the high bit set, so it has that going for it.
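A small C sketch of why a fill pattern like 0xcdcdcdcd stands out on inspection (it just reinterprets the same 32 bits several ways):

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned u = 0xcdcdcdcdu;
    int s;
    float f;
    memcpy(&s, &u, sizeof s);            /* same bits viewed as a signed int */
    memcpy(&f, &u, sizeof f);            /* same bits viewed as a 32-bit float */

    printf("unsigned: %u\n", u);         /* a very large unsigned value */
    printf("signed:   %d\n", s);         /* a large negative value */
    printf("float:    %g\n", f);         /* a legal but unremarkable float */
    printf("low bit:  %u\n", u & 1u);    /* 1: odd, so misaligned as an address
                                            for anything wider than a char */
    return 0;
}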
The value -1 is often traditionally used as an "out of range" or "invalid" value to indicate failure or non-initialised data. Then again, that goes right down the pan if -1 is a semantically valid value for the variable...or you're using an unsigned type.
You seem to like null (and for a good reason), so why not just use it throughout?
In ActionScript you can only assign Number.NaN to variables that are typed Number, not int or uint.
That being said, because AS3 does not support named arguments you can always look at the arguments array (it's a built-in array that all functions have, unless you use the ...rest construct). If that array's length is less than the position of your numeric argument you know it wasn't passed in.
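A sketch of the Number.NaN approach, reusing the constructor from the question (isNaN is the standard top-level function; NaN never compares equal to anything, including itself, so you have to test with isNaN rather than ==):

public function MyValueObject (
    _id:String = null,
    _amount:Number = NaN,
    _isPurchased:Boolean = false
)
{ // Constructor
    if ( _id != null ) this.id = _id;
    if ( !isNaN(_amount) ) this.amount = _amount;   // an untouched default stays NaN
    if ( _isPurchased != false ) this.isPurchased = _isPurchased;
}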
I often use a maximum value for this. As you say, zero often is a valid value. Generally max-int, while theoretically valid, is safe to exclude. But not always; be careful.
I like 0xD15EA5ED, it's similar to 0xDEADBEEF but is usually more accurate when debugging.

Pointer to generic type

In the process of transforming a given efficient pointer-based hash map implementation into a generic hash map implementation, I stumbled across the following problem:
I have a class representing a hash node (the hash map implementation uses a binary tree)
THashNode <KEY_TYPE, VALUE_TYPE> = class
public
  Key   : KEY_TYPE;
  Value : VALUE_TYPE;
  Left  : THashNode <KEY_TYPE, VALUE_TYPE>;
  Right : THashNode <KEY_TYPE, VALUE_TYPE>;
end;
In addition to that there is a function that should return a pointer to a hash node. I wanted to write
PHashNode = ^THashNode <KEY_TYPE, VALUE_TYPE>
but that doesn't compile (';' expected but '<' found).
How can I have a pointer to a generic type?
And addressed to Barry Kelly, if you read this: yes, this is based on your hash map implementation. You haven't written such a generic version of your implementation yourself, have you? That would save me some time :)
Sorry, Smasher. Pointers to open generic types are not supported because generic pointer types are not supported, although it is possible (compiler bug) to create them in certain circumstances (particularly pointers to nested types inside a generic type); this "feature" can't be removed in an update in case we break someone's code. The limitation on generic pointer types ought to be removed in the future, but I can't make promises when.
If the type in question is the one in JclStrHashMap I wrote (or the ancient HashList unit), well, the easiest way to reproduce it would be to change the node type to be a class and pass around any double-pointers as Pointer with appropriate casting. However, if I were writing that unit again today, I would not implement buckets as binary trees. I got the opportunity to write the dictionary in the Generics.Collections unit, though with all the other Delphi compiler work time was too tight before shipping for solid QA, and generic feature support itself was in flux until fairly late.
I would prefer to implement the hash map buckets as one of double-hashing, per-bucket dynamic arrays or linked lists of cells from a contiguous array, whichever came out best from tests using representative data. The logic is that cache miss cost of following links in tree/list ought to dominate any difference in bucket search between tree and list with a good hash function. The current dictionary is implemented as straight linear probing primarily because it was relatively easy to implement and worked with the available set of primitive generic operations.
That said, the binary tree buckets should have been an effective hedge against poor hash functions; if they were balanced binary trees (=> even more modification cost), they would give O(1) average and O(log n) worst-case performance.
To actually answer your question, you can't make a pointer to a generic type, because "generic types" don't exist. You have to make a pointer to a specific type, with the type parameters filled in.
Unfortunately, the compiler doesn't like finding angle brackets after a ^. But it will accept the following:
type
  TGeneric<T> = record
    value: T;
  end;
  TSpecific = TGeneric<string>;
  PGeneric = ^TSpecific;
But "PGeneric = ^TGeneric<string>;" gives a compiler error. Sounds like a glitch to me. I'd report that over at QC if I was you.
Why are you trying to make a pointer to an object, anyway? Delphi objects are a reference type, so they're pointers already. You can just cast your object reference to Pointer and you're good.
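A small sketch of that point, instantiating the node type from the question with concrete parameters (String/Integer chosen only for illustration):

var
  Node : THashNode<String, Integer>;
  P    : Pointer;
begin
  Node := THashNode<String, Integer>.Create;
  P := Pointer(Node);                            // an object reference is already a pointer
  THashNode<String, Integer>(P).Key := 'abc';    // cast back when you need the typed view
  THashNode<String, Integer>(P).Free;
end;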
If Delphi supported generic pointer types at all, it would have to look like this:
type
  PHashNode<K, V> = ^THashNode<K, V>;
That is, mention the generic parameters on the left side where you declare the name of the type, and then use those parameters in constructing the type on the right.
However, Delphi does not support that. See QC 66584.
On the other hand, I'd also question the necessity of having a pointer to a class type at all, generic or not. They are needed only very rarely.
There's a generic hash map called TDictionary in the Generics.Collections unit. Unfortunately, it's badly broken at the moment, but it's apparently going to be fixed in update #3, which is due out within a matter of days, according to Nick Hodges.
