In Dart can hashCode() method calls return different values on equal (==) Objects? - dart

My immediate project is to develop a system of CheckSums for proving that two somewhat complex objects are (functionally)EQUAL - in the sense that they have the same values for the critical properties. (Have discovered that dates/times cannot be included, so can't use JSON on the bigger object - duh :) (For my purposes) ).
To do this calling the hashCode() method on selected strings seemed to be the way to go.
Upon implementing this, I note that in practice I am getting very different values on multiple runs of highest level objects that are functionally 'identical'.
There are a number of "nums" that I have not rounded, there are integers, bools, Strings and not much more.
I have 'always' thought that a hashCode on the same set of values would return the same number, am I missing something?
BTW the only context that I have found material on hashCode() has been with WebSockets.
Of course I can write my own String to a unique value but I want to understand if this is a problem with Dart or something else.

I can attempt to answer the question posed in the title: "Can hashCode() method calls return different values on equal (==) Objects?"
Short answer: hash codes for two objects must be the same if those two objects are equals (==).
If you override hashCode you must also override equals. Two objects that are equal, as defined by ==, must also have the same hash code.
However, hash codes do not have to be unique. That is, a perfectly valid hash code is the value 1. A good hash code, however, should be uniformly distributed.
From the docs from Object:
Hash codes are guaranteed to be the same for objects that are equal
when compared using the equality operator ==. Other than that there
are no guarantees about the hash codes. They will not be consistent
between runs and there are no distribution guarantees.
If a subclass overrides hashCode it should override the equality
operator as well to maintain consistency.

I found the immediate problem. The object stringify() method, at one level, was not getting called, but rather some stringify property that must exist in all objects (?).
With this fixed everything is working as exactly as I would expect, and multiple runs of our Statistical Studies are returning exactly the same CheckSum at the highest levels (based on some 5 levels of hierarchy).
Meanwhile the JSON.stringify has continued to fail. Even in the most basic object. I have not been able to determine what is causing to fail. Of course, the question is not how "stringify" is accomplished.
So, empirically at least, I believe it is true that "objects with equal properties" will return equal checkSums in Dart. It was decided to round nums, I don't know if this was causing a problem - perhaps good to be aware of? And, of course, remember to be beware of things like dates, times, or anything that could legitimately vary.
_swarmii

The doc linked by Seth Ladd now include info:
They need not be consistent between executions of the same program and there are no distribution guarantees.`
so technically hashCode value can be change with same object in different executions for your question:
I have 'always' thought that a hashCode on the same set of values would return the same number, am I missing something?

Related

How the hashcode is calculated

When we write:
"exampleString".hashCode
Is there a genral mathematical way, some algorithm that calculates it or it gets it from somewhere else in Dart laungage?
And is the hashcode of a String value still the same in another languages like java, c++...?
To answer your questions, hashcodes are Dart specific, and perhaps even execution-run specific. To see how the various hashcodes work, see the source code for any given class.
The Fine Manual says:
A hash code is a single integer which represents the state of the object that affects operator == comparisons.
All objects have hash codes. The default hash code implemented by Object represents only the identity of the object, the same way as the default operator == implementation only considers objects equal if they are identical (see identityHashCode).
If operator == is overridden to use the object state instead, the hash code must also be changed to represent that state, otherwise the object cannot be used in hash based data structures like the default Set and Map implementations.
Hash codes must be the same for objects that are equal to each other according to operator ==. The hash code of an object should only change if the object changes in a way that affects equality. There are no further requirements for the hash codes. They need not be consistent between executions of the same program and there are no distribution guarantees.
Objects that are not equal are allowed to have the same hash code. It is even technically allowed that all instances have the same hash code, but if clashes happen too often, it may reduce the efficiency of hash-based data structures like HashSet or HashMap.
If a subclass overrides hashCode, it should override the operator == operator as well to maintain consistency.
That last point is important. If two objects are considered == by whatever strategy you want, they must also always have the same hashcode. The inverse is not necessarily true.

Is it legal to have a query string with key names but not associated values (or equals signs)?

I'm trying to understanding the legalities of query params. Specifically, I'm curious if it's a hard-and-fast rule that the query string must be made up of key-value pairs or if it's technically valid to have a key-only. No, I don't mean a key with an equals-sign then nothing after it. I mean not even the equals-sign. Just a key.
For instance 'demomode' in this example:
https://www.somesite.com/search?searchString=abc&demomode
Again, this is how I would actually do it...
https://www.somesite.com/search?searchString=abc&mode=demo
-or-
https://www.somesite.com/search?searchString=abc&demomode=1
-or-
https://www.somesite.com/search?searchString=abc&demomode=true
... and I am in no way promoting the first example above. I'm just curious.
I haven't been able to find anything that says if it's a hard-rule in the standard, a convention that people just agreed upon, or if it's more platform/parser specific, and therefore it depends.
Again, I am not encouraging this. Just curious. It came about when looking at some code earlier that contained some pound-defines. You can pound-define something with a value, like #define mode=4 then check for 4 later, but you can also simply pound-define something without giving it a value, such as #define header_x then simply check if that value is defined, not caring what the value is, or even if there is one in the first place.

There seem to be a lot of ruby methods that are very similar, how do I pick which one to use?

I'm relatively new to Ruby, so this is a pretty general question. I have found through the Ruby Docs page a lot of methods that seem to do the exact same thing or very similar. For example chars vs split(' ') and each vs map vs collect. Sometimes there are small differences and other times I see no difference at all.
My question here is how do I know which is best practice, or is it just personal preference? I'm sure this varies from instance to instance, so if I can learn some of the more important ones to be cognizant of I would really appreciate that because I would like to develop good habits early.
I am a bit confused by your specific examples:
map and collect are aliases. They don't "do the exact same thing", they are the exact same thing. They are just two names for the same method. You can use whatever name you wish, or what reads best in context, or what your team has decided as a Coding Standard. The Community seems to have settled on map.
each and map/collect are completely different, there is no similarity there, apart from the general fact that they both operate on collections. map transform a collection by mapping every element to a new element using a transformation operation. It returns a new collection (an Array, actually) with the transformed elements. each performs a side-effect for every element of the collection. Since it is only used for its side-effect, the return value is irrelevant (it might just as well return nil like Kernel#puts does, in languages like C, C++, Java, C♯, it would return void), but it is specified to always return its receiver.
split splits a String into an Array of Strings based on a delimiter that can be either a Regexp (in which case you can also influence whether or not the delimiter itself gets captured in the output or ignored) or a String, or nil (in which case the global default separator gets used). chars returns an Array with the individual characters (represented as Strings of length 1, since Ruby doesn't have an specific Character type). chars belongs together in a family with bytes and codepoints which do the same thing for bytes and codepoints, respectively. split can only be used as a replacement for one of the methods in this family (chars) and split is much more general than that.
So, in the examples you gave, there really isn't much similarity at all, and I cannot imagine any situation where it would be unclear which one to choose.
In general, you have a problem and you look for the method (or combination of methods) that solve it. You don't look at a bunch of methods and look for the problem they solve.
There'll typically be only one method that fits a specific problem. Larger problems can be broken down into different subproblems in different ways, so it is indeed possible that you may end up with different combinations of methods to solve the same larger problem, but for each individual subproblem, there will generally be only one applicable method.
When documentation states that 2 methods do the same, it's just matter of preference. To learn the details, you should always start with Ruby API documentation

id values of different variables in python 3

I am able to understand immutability with python (surprisingly simple too). Let's say I assign a number to
x = 42
print(id(x))
print(id(42))
On both counts, the value I get is
505494448
My question is, does python interpreter allot ids to all the numbers, alphabets, True/False in the memory before the environment loads? If it doesn't, how are the ids kept track of? Or am I looking at this in the wrong way? Can someone explain it please?
What you're seeing is an implementation detail (an internal optimization) calling interning. This is a technique (used by implementations of a number of languages including Java and Lua) which aliases names or variables to be references to single object instances where that's possible or feasible.
You should not depend on this behavior. It's not part of the language's formal specification and there are no guarantees that separate literal references to a string or integer will be interned nor that a given set of operations (string or numeric) yielding a given object will be interned against otherwise identical objects.
I've heard that the C Python implementation does include a set of the first hundred or so integers as statically instantiated immutable objects. I suspect that other very high level language run-time libraries are likely to include similar optimizations: the first hundred integers are used very frequently by most non-trivial fragments of code.
In terms of how such things are implemented ... for strings and larger integers it would make sense for Python to maintain these as dictionaries. Thus any expression yielding an integer (and perhaps even floats) and strings (at least sufficiently short strings) would be hashed, looked up in the appropriate (internal) object dictionary, added if necessary and then returned as references to the resulting object.
You can do your own similar interning of any sorts of custom object you like by wrapping the instantiation in your own calls to your own class static dictionary.

Recommended initialization values for numbers

Assume you have a variety of number or int based variables that you want to be initialized to some default value. But using 0 could be problematic because 0 is meaningful and could have side affects.
Are there any conventions around this?
I have been working in Actionscript lately and have a variety of value objects with optional parameters so for most variables I set null but for numbers or ints I can't use null. An example:
package com.website.app.model.vo
{
public class MyValueObject
{
public function MyValueObject (
_id:String=null,
_amount:Number=0,
_isPurchased:Boolean=false
)
{ // Constructor
if( _id != null ) this.id = _id;
if( _amount != 0 ) this.amount = _amount;
if( _isPurchased != false ) this.isPurchased = _isPurchased;
}
public var id:String;
public var amount:Number;
public var isPurchased:Boolean;
}
}
The difficulty is that using 0 in the above code might be problematic if the value is not ever changed from its initial value. It is easy to detect if a variable has a null value. But detecting 0 may not be so easy because 0 might be a legitimate value. I want to set a default value to make the parameter optional but I also want to later detect in my code if the value was changed from its default without hard to debug side affects.
I suppose I could use something like -1 for a value. I was wondering if there are any well known coding conventions for this kind of thing? I suppose it depends on the nature of the variable and the data.
This is first my stack overflow question. Hopefully the gist of my question makes sense.
A lot of debuggers will use 0xdeadbeef for initializing registers. I always get a chuckle when I see that.
But, in all honesty, your question contains its own answer - use a value that your variable is not ever expected to become. It doesn't matter what the value is.
Since you asked in a comment I'll talk a little bit about C and C++. For efficiency reasons local variables and allocated memory are not initialized by default. But debug builds often do this to help catch errors. A common value used is 0xcdcdcdcd which is reasonably unlikely. It has the high bit set and is either a rather large unsigned or rather large negative signed number. As a pointer address it is odd which will cause an alignment exception if used on anything but a char (but not on X86). It has no special meaning as a 32 bit floating point number so it isn't a perfect choice.
Occasionally you'll see a partially aligned value in a variable such as 0xcdcd0000 or 0x0000cdcd. These can be treated as suspcious at the very least.
Sometimes different values will be used depending on the allocation area of library. That gives you a clue where a bad value may have originated (i.e., it itself wasn't initialized but it was copied from an unititialized value).
The ideal value would be invalid no matter what alignment you read from memory and is invalid over all primitive types. It also should look suspicious to a human so even if they do not know the convention they can suspect something is a foot. That's why 0xdeadbeef can be a good choice because the (hex viewing) programmer will recognize that as the work of a human and not random chance. Note also that it is odd and has the high bit set so it has that going for it.
The value -1 is often traditionally used as an "out of range" or "invalid" value to indicate failure or non-initialised data. Then again, that goes right down the pan if -1 is a semantically valid value for the variable...or you're using an unsigned type.
You seem to like null (and for a good reason), so why not just use it throughout?
In ActionScript you can only assign Number.NaN to variables that are typed Number, not int or uint.
That being said, because AS3 does not support named arguments you can always look at the arguments array (it's a built-in array that all functions have, unless you use the ...rest construct). If that array's length is less than the position of your numeric argument you know it wasn't passed in.
I often use a maximum value for this. As you say, zero often is a valid value. Generally max-int, while theoretically valid, is safe to exclude. But not always; be careful.
I like 0xD15EA5ED, it's similar to 0xDEADBEEF but is usually more accurate when debugging.

Resources