I'm reading through two articles right now and am a little confused.
This article - http://blog.golang.org/laws-of-reflection says
> var r io.Reader
> tty, err := os.OpenFile("/dev/tty", os.O_RDWR, 0)
> if err != nil {
>     return nil, err
> }
> r = tty
>
> r contains, schematically, the (value, type) pair, (tty, *os.File). Notice that the type *os.File implements methods other than Read; even though the interface value provides access only to the Read method, the value inside carries all the type information about that value.
This other article says:
> In terms of our example, the itable for Stringer holding type Binary lists the methods used to satisfy Stringer, which is just String: Binary's other methods (Get) make no appearance in the itable.
It feels like these two are in opposition. According to the second article, the variable r in the first extract should be (tty, io.Reader), as that is the static type of r. Instead, the article says that *os.File is the type of tty. If the second example were right, then the diagram in the first example should have all of the methods implemented by the Binary type.
Where am I going wrong?
The two articles are explaining a similar concept at two very different levels of granularity. As Volker said, "Laws of Reflection" is a basic overview of what is actually happening when you examine objects via reflection. Your second article examines the dynamic dispatch properties of an interface (which can also be resolved via reflection) and how the runtime carries out that dispatch.
> According to the second article, the variable r in the first extract should be (tty, io.Reader)
Given that understanding, think of an interface at runtime as a "wrapper object". It exists to provide information about another object (the itable from your second article), so the runtime knows where to jump to in the wrapped object's layout (the implementation may differ between versions, but the principle is basically the same for most languages).
This is why calling Read on r works: first the runtime checks the itable, then it jumps to the function laid out for the *os.File type. If that target were itself an interface, you would be looking at another dereference and dispatch (which, IIRC, doesn't apply in Go).
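To make that concrete, here is a rough sketch in C of the two-word layout the second article describes (the struct and field names are illustrative, not the actual runtime definitions, which change between releases):

typedef struct Type Type;       /* runtime type descriptor (opaque here) */

typedef struct Itab {
    Type *inter;                /* the static interface type, e.g. io.Reader */
    Type *type;                 /* the dynamic concrete type, e.g. *os.File  */
    void *fun[1];               /* methods of the concrete type that satisfy */
} Itab;                         /* the interface; for io.Reader, just Read   */

typedef struct Iface {
    Itab *tab;                  /* shared per (interface type, concrete type) pair */
    void *data;                 /* the wrapped value, e.g. the tty */
} Iface;

/* A call like r.Read(buf) compiles to roughly:
 *     r.tab->fun[0](r.data, buf);
 * Only the methods in fun[] are reachable through r, but the concrete type
 * is always recoverable from r.tab->type, which is why reflection reports
 * (tty, *os.File) rather than (tty, io.Reader). */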
RE: Reflection - you're getting an easily digestible representation of this, in the form of a (value, type) pair (via the reflect.ValueOf and reflect.TypeOf functions).
Both are correct: r "holds" (tty, *os.File), and that is what the second article says as well. Note that "Laws of Reflection" is a bit more high-level and does not mention implementation details (which might change in each release) like those discussed in the second article. The diagram in the second article reads as follows: s contains, schematically, (b, *Binary); s is of type Stringer; its data is a Binary with value 200; and s's itable contains a single method, String. The other methods of Binary (or *Binary) are not represented in the itable and are thus not accessible through s.
Note that I think the actual implementation of interfaces in Go 1.4 differs from what the second article (Russ's?) states.
I was given a fragment of code (a function called bubbleSort(), written in Java, for example). How can I, or rather my program, tell whether a given piece of source code implements a particular sorting algorithm the correct way (using the bubble method, for instance)?
I can force a user to provide a legitimate function by analyzing its signature: making sure the argument and return value are arrays of integers. But I have no idea how to determine that the algorithm's logic is implemented the right way. The input code could sort values correctly, yet not with the aforementioned bubble method. How can my program discern that? I do realize a lot of code parsing would be involved, but maybe there's something else that I should know.
I hope I was somewhat clear.
I'd appreciate it if someone could point me in the right direction or give suggestions on how to tackle such a problem. Perhaps there are tested ways that ease the evaluation of program logic.
In general, you can't do this because of the Halting Problem: you can't even decide whether a given function will halt ("return").
As a practical matter, though, there's a bit more hope. If you are looking for a bubble sort, you can decide that it has a number of parts:
- a to-be-sorted datatype S with a partial order,
- a container datatype C with a single instance variable A ("the array") that holds the to-be-sorted data,
- a key type K ("array index"), itself with a partial order, used to access the container, such that container[K] has type S,
- a comparison of two members of the container, at keys A and B with A < B according to the key partial order, that determines whether container[B] > container[A],
- a swap operation on container[A], container[B] and some variable T of type S, conditionally dependent on that comparison,
- a loop wrapped around the container that enumerates keys according to the partial order on K.
You can build bits of code that find each of these bits of evidence in your source code, and if you find them all, claim you have evidence of a bubble sort.
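For concreteness, here is what those parts look like in a canonical C bubble sort (a reference implementation for illustration; the code you analyze may arrange the same parts quite differently):

/* container: the array a; keys: the indices i/j; to-be-sorted type S: int */
void bubble_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)        /* loop enumerating keys         */
        for (int j = 0; j < n - 1 - i; j++)
            if (a[j] > a[j + 1]) {         /* comparison of container[A],   */
                int t = a[j];              /* container[B]; conditional     */
                a[j] = a[j + 1];           /* swap via temporary T of       */
                a[j + 1] = t;              /* type S                        */
            }
}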
To do this concretely, you need standard program analysis machinery:
- parse the source code and build an abstract syntax tree (AST),
- build symbol tables (ST) that know the type of each identifier where it is used,
- construct a control flow graph (CFG), so that you can check that the various recognized bits occur in an appropriate order,
- construct a data flow graph (DFG), so that you can determine that values recognized in one part of the algorithm flow properly to another part.
[That's a lot of machinery just to get started]
From here, you can write ad hoc procedural code to climb over the AST, ST, CFG and DFG to "recognize" each of the individual parts. This is likely to be pretty messy, as each recognizer will be checking these structures for evidence of its bit. But you can do it.
This is messy enough, and interesting enough, that there are tools which can do much of this.
Our DMS Software Reengineering Toolkit is one. DMS already contains all the machinery to do standard program analysis for several languages. DMS also has a dataflow pattern matching language, inspired by Rich and Waters' 1980s "Programmer's Apprentice" ideas.
With DMS, you can express this particular problem roughly like this (untested):
dataflow pattern domain C;

dataflow pattern swap(in out v1:S, in out v2:S, T:S):statements =
  " \T = \v1;
    \v1 = \v2;
    \v2 = \T; ";

dataflow pattern conditional_swap(in out v1:S, in out v2:S, T:S):statements =
  " if (\v1 > \v2)
       \swap\(\v1,\v2,\T\); ";

dataflow pattern container_access(in out container:C, in key:K):expression =
  " \container.body[\key] ";

dataflow pattern size(in container:C, out: integer):expression =
  " \container . size ";

dataflow pattern bubble_sort(in out container:C, k1:K, k2:K):function =
  " \k1 = \smallestK\(\);
    while (\k1 < \size\(\container\)) {
       \k2 = \next\(\k1\);
       while (\k2 <= \size\(\container\)) {
          \conditional_swap\(\container_access\(\container\,\k1\),
                             \container_access\(\container\,\k2\)\);
       }
    }
  ";
Within each pattern, you can write what amounts to the concrete syntax of the chosen programming language (the "pattern domain"), referencing dataflows named in the pattern signature line. A subpattern can be mentioned inside another; one has to pass the dataflows to and from the subpattern by naming them. Unlike "plain old C", you have to pass the container explicitly rather than by implicit reference; that's because we are interested in the actual values that flow from one place in the pattern to another. (Just because two places in the code use the same variable doesn't mean they see the same value.)
Given these definitions, and asked to "match bubble_sort", DMS will visit the DFG (tied to the CFG/AST/ST) to try to match the pattern; where it matches, it will bind the pattern variables to the DFG entries. If it can't find a match for everything, the match fails.
To accomplish the match, each of the patterns above is converted essentially into its own DFG, and then each pattern is matched against the DFG for the code using what is called a subgraph isomorphism test. Constructing the DFG for the pattern takes a lot of machinery: parsing, name resolution, control and data flow analysis, applied to fragments of code in the original language, intermixed with various pattern meta-escapes. The subgraph isomorphism is "sort of easy" to code, but can be very expensive to run. What saves the DMS pattern matchers is that most patterns have many, many constraints [tech point: and they don't have knots], so each attempted match tends to fail pretty fast, or succeed completely.
Not shown, but by defining the various bits separately, one can provide alternative implementations, enabling the recognition of variations.
We have used this to implement quite complete factory control model extraction tools for Dow Chemical from real industrial plant controllers written in their peculiar Dowtran language (that meant building parsers, etc., as above, for Dowtran). We have a version of this prototyped for C; the data flow analysis is harder there.
Suppose one needs a numeric data type whose allowed values fall within a specified range. More concretely, suppose one wants to define an integral type whose minimum value is 0 and whose maximum value is 5000. This type of scenario arises in many situations, such as when modeling a database data type, an XSD data type and so on.
What is the best way to model such a type in F#? In C#, one way to do this would be to define a struct that implemented the range checking, overloaded operators, formatting and so on. An analogous approach in F# is described here: http://tomasp.net/blog/fsharp-custom-numeric.aspx/
I don't really need though a fully-fledged custom type; all I really want is an existing type with a constrained domain. For example, I would like to be able to write something like
type MyInt = Value of uint16 where Value <= 5000 (pseudocode)
Is there a shorthand way to do such a thing in F# or is the best approach to implement a custom numeric type as described in the aforementioned blog post?
You're referring to what are called refinement types in type theory; as Daniel pointed out, look at F*. But it is a research project.
As far as doing it with F#, in addition to Tomas' post, take a look at the "Designing with Types" series.
My suggestion would be to implement a custom struct wrapping your data type (e.g., int), just as you would in C#.
The idea behind creating this custom struct is that it allows you to "intercept" all uses of the underlying data value at run-time and check them for correctness. The alternative is to check all of these uses at compile-time, which is possible with something like F* (as others mentioned), although it's much more difficult and not something you would use for everyday code.
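To illustrate the run-time interception idea in a language-neutral way, here is a minimal sketch in C (the names are invented for illustration; in F# or C# you would get the same effect with a struct whose constructor and operators perform the check):

typedef struct { int value; } BoundedInt;  /* invariant: 0 <= value <= 5000 */

/* the only sanctioned way to write the field; rejects out-of-range values */
int bounded_int_set(BoundedInt *b, int v)
{
    if (v < 0 || v > 5000)
        return 0;              /* check failed; value left unchanged */
    b->value = v;
    return 1;                  /* check passed */
}

int bounded_int_get(const BoundedInt *b) { return b->value; }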
Clarification (sorry the question was not specific): They both try to convert the item on the stack to a lua_Number. lua_tonumber will also convert a string that represents a number. How does luaL_checknumber deal with something that's not a number?
There's also luaL_checklong and luaL_checkinteger. Are they the same as (long)luaL_checknumber and (int)luaL_checknumber respectively?
The reference manual does answer this question. I'm citing the Lua 5.2 Reference Manual, but similar text is found in the 5.1 manual as well. The manual is, however, quite terse. It is rare for any single fact to be restated in more than one sentence. Furthermore, you often need to correlate facts stated in widely separated sections to understand the deeper implications of an API function.
This is not a defect; it is by design. This is the reference manual to the language, and as such its primary goal is to completely (and correctly) describe the language.
For more information about "how" and "why" the general advice is to also read Programming in Lua. The online copy is getting rather long in the tooth as it describes Lua 5.0. The current paper edition describes Lua 5.1, and a new edition describing Lua 5.2 is in process. That said, even the first edition is a good resource, as long as you also pay attention to what has changed in the language since version 5.0.
The reference manual has a fair amount to say about the luaL_check* family of functions.
Each API entry's documentation block is accompanied by a token that describes its use of the stack, and under what conditions (if any) it will throw an error. Those tokens are described at section 4.8:
> Each function has an indicator like this: [-o, +p, x]
>
> The first field, o, is how many elements the function pops from the stack. The second field, p, is how many elements the function pushes onto the stack. (Any function always pushes its results after popping its arguments.) A field in the form x|y means the function can push (or pop) x or y elements, depending on the situation; an interrogation mark '?' means that we cannot know how many elements the function pops/pushes by looking only at its arguments (e.g., they may depend on what is on the stack). The third field, x, tells whether the function may throw errors: '-' means the function never throws any error; 'e' means the function may throw errors; 'v' means the function may throw an error on purpose.
At the head of Chapter 5, which documents the auxiliary library as a whole (all functions in the official API whose names begin with luaL_ rather than just lua_), we find this:
> Several functions in the auxiliary library are used to check C function arguments. Because the error message is formatted for arguments (e.g., "bad argument #1"), you should not use these functions for other stack values.
>
> Functions called luaL_check* always throw an error if the check is not satisfied.
The function luaL_checknumber is documented with the token [-0,+0,v] which means that it does not disturb the stack (it pops nothing and pushes nothing) and that it might deliberately throw an error.
The other functions that have more specific numeric types differ primarily in function signature. All are described similarly to luaL_checkint() "Checks whether the function argument arg is a number and returns this number cast to an int", varying the type named in the cast as appropriate.
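Putting that together, a typical use from a C function registered with Lua looks like this (a sketch; the function and its name are invented for illustration):

#include <lua.h>
#include <lauxlib.h>

/* square(x): requires a numeric first argument */
static int l_square(lua_State *L)
{
    lua_Number n = luaL_checknumber(L, 1);  /* [-0,+0,v]: throws on a bad   */
                                            /* argument, so the lines below */
                                            /* never see a non-number       */
    lua_pushnumber(L, n * n);
    return 1;                               /* number of results */
}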
The function lua_tonumber() is described with the token [-0,+0,-] meaning it has no effect on the stack and does not throw any errors. It is documented to return the numeric value from the specified stack index, or 0 if the stack index does not contain something sufficiently numeric. It is documented to use the more general function lua_tonumberx() which also provides a flag indicating whether it successfully converted a number or not.
It too has siblings named with more specific numeric types that do all the same conversions but cast their results.
Finally, one can also refer to the source code, with the understanding that the manual is describing the language as it is intended to be, while the source is a particular implementation of that language and might have bugs, or might reveal implementation details that are subject to change in future versions.
The source to luaL_checknumber() is in lauxlib.c. It can be seen to be implemented in terms of lua_tonumberx() and the internal function tagerror() which calls typerror() which is implemented with luaL_argerror() to actually throw the formatted error message.
They both try to convert the item on the stack to a lua_Number. lua_tonumber will also convert a string that represents a number. luaL_checknumber throws a (Lua) error when it fails a conversion - it long jumps and never returns, from the POV of the C function. lua_tonumber merely returns 0 (which can be a valid return as well). So you could write the following code, which should be faster than checking with lua_isnumber first.
double r = lua_tonumber(_L, idx);      /* 0 if the conversion fails...        */
if (r == 0 && !lua_isnumber(_L, idx))  /* ...but 0 is also a valid number, so */
{                                      /* only then check the actual type     */
    // Error handling code
}
return r;
I came across the original statement of the Liskov Substitution Principle on Ward's wiki tonight:
> What is wanted here is something like the following substitution property: If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T. - Barbara Liskov, Data Abstraction and Hierarchy, SIGPLAN Notices, 23(5), May 1988.
I've always been crap at parsing predicate logic (I failed Calc IV the first time through), so while I kind of understand how the above translates to:
> Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.
what I don't understand is why the property Liskov describes implies that S is a subtype of T and not the other way around.
Maybe I just don't know enough yet about OOP, but why does Liskov's statement only allow for the possibility S -> T, and not T -> S?
The hypothetical set of programs P (in terms of T) is not defined in terms of S, and therefore it doesn't say much about S. On the other hand, we do say that S works as well as T in that set of programs P, and so we can draw conclusions about S and its relationship to T.
One way to think about this is that P demands certain properties of T. S happens to satisfy those properties. Perhaps you could even say that 'every o1 in S is also in T'. This conclusion is used to define the word subtype.
As sepp2k pointed out, there are multiple views explaining this in the other post.
Here are my two cents about it.
I like to look at it this way:
If for each object o1 of type TallPerson there is an object o2 of type Person such that for all programs P defined in terms of Person, the behavior of P is unchanged when o1 is substituted for o2, then TallPerson is a subtype of Person. (Replacing S with TallPerson and T with Person.)
We typically have the perspective that an object deriving from some base class has more functionality, since it is extended. However, with that extra functionality we have specialized it and reduced the scope in which it can be used, thus making it a subtype of its base class (the broader type).
A derived class inherits its base class's public interface and is expected to either use the inherited implementation or to provide an implementation that behaves similarly (e.g., a Count() method should return the number of elements regardless of how those elements are stored.)
A base class wouldn't necessarily have the interface of any (let alone all) of its derived classes, so it wouldn't make sense to expect an arbitrary reference to a base class to be substitutable for a specified derived class. Even if it appears that only the subset of the interface supported by the base class is required, that might not be the case (e.g., a shadowing method in the specific derived class might be referred to).
In the process of transforming a given efficient pointer-based hash map implementation into a generic hash map implementation, I stumbled across the following problem:
I have a class representing a hash node (the hash map implementation uses a binary tree)
THashNode <KEY_TYPE, VALUE_TYPE> = class
public
Key : KEY_TYPE;
Value : VALUE_TYPE;
Left : THashNode <KEY_TYPE, VALUE_TYPE>;
Right : THashNode <KEY_TYPE, VALUE_TYPE>;
end;
In addition to that there is a function that should return a pointer to a hash node. I wanted to write
PHashNode = ^THashNode <KEY_TYPE, VALUE_TYPE>
but that doesn't compile (';' expected but '<' found).
How can I have a pointer to a generic type?
And addressed to Barry Kelly: if you read this: yes, this is based on your hash map implementation. You haven't written such a generic version of your implementation yourself, have you? That would save me some time :)
Sorry, Smasher. Pointers to open generic types are not supported because generic pointer types are not supported, although it is possible (compiler bug) to create them in certain circumstances (particularly pointers to nested types inside a generic type); this "feature" can't be removed in an update in case we break someone's code. The limitation on generic pointer types ought to be removed in the future, but I can't make promises when.
If the type in question is the one in JclStrHashMap I wrote (or the ancient HashList unit), well, the easiest way to reproduce it would be to change the node type to be a class and pass around any double-pointers as Pointer with appropriate casting. However, if I were writing that unit again today, I would not implement buckets as binary trees. I got the opportunity to write the dictionary in the Generics.Collections unit, though with all the other Delphi compiler work time was too tight before shipping for solid QA, and generic feature support itself was in flux until fairly late.
I would prefer to implement the hash map buckets as one of double-hashing, per-bucket dynamic arrays or linked lists of cells from a contiguous array, whichever came out best from tests using representative data. The logic is that cache miss cost of following links in tree/list ought to dominate any difference in bucket search between tree and list with a good hash function. The current dictionary is implemented as straight linear probing primarily because it was relatively easy to implement and worked with the available set of primitive generic operations.
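For illustration, "straight linear probing" amounts to a lookup loop like the following (a language-neutral sketch in C; the real Generics.Collections dictionary is generic Delphi code and also handles hashing, growth and deletion):

typedef struct {
    int occupied;   /* 0 = empty slot, which ends a probe sequence */
    int key;
    int value;
} Slot;

/* returns a pointer to the value for key, or 0 (NULL) if absent */
int *probe_find(Slot *slots, int cap, int key, unsigned hash)
{
    for (int i = 0; i < cap; i++) {
        Slot *s = &slots[(hash + i) % cap];  /* walk forward from the home slot */
        if (!s->occupied)
            return 0;                        /* hit a hole: key not present */
        if (s->key == key)
            return &s->value;                /* found it */
    }
    return 0;                                /* table full and key absent */
}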
That said, the binary tree buckets should have been an effective hedge against poor hash functions; if they were balanced binary trees (=> even more modification cost), they would give O(1) average and O(log n) worst-case performance.
To actually answer your question, you can't make a pointer to a generic type, because "generic types" don't exist. You have to make a pointer to a specific type, with the type parameters filled in.
Unfortunately, the compiler doesn't like finding angle brackets after a ^. But it will accept the following:
TGeneric<T> = record
value: T;
end;
TSpecific = TGeneric<string>;
PGeneric = ^TSpecific;
But "PGeneric = ^TGeneric<string>;" gives a compiler error. Sounds like a glitch to me. I'd report that over at QC if I was you.
Why are you trying to make a pointer to an object, anyway? Delphi objects are a reference type, so they're pointers already. You can just cast your object reference to Pointer and you're good.
If Delphi supported generic pointer types at all, it would have to look like this:
type
PHashNode<K, V> = ^THashNode<K, V>;
That is, mention the generic parameters on the left side where you declare the name of the type, and then use those parameters in constructing the type on the right.
However, Delphi does not support that. See QC 66584.
On the other hand, I'd also question the necessity of having a pointer to a class type at all, generic or not. They are needed only very rarely.
There's a generic hash map called TDictionary in the Generics.Collections unit. Unfortunately, it's badly broken at the moment, but it's apparently going to be fixed in update #3, which is due out within a matter of days, according to Nick Hodges.