What is Perl's default hash implementation?

According to perldoc perlsec:
Alternative Hash Functions
The source code includes multiple hash algorithms to choose from. While we believe that the default perl hash is robust to attack, we have included the hash function Siphash as a fall-back option. At the time of release of Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is not the default as it is much slower than the default hash.
What is the default Perl hash function? From the context we can infer that it's not SipHash.

From the INSTALL file in the source,
Since Perl 5.18 we have included support for multiple hash functions,
although from time to time we change which functions we support,
and which function is default (currently SBOX+STADTX on 64 bit builds
and SBOX+ZAPHOD32 for 32 bit builds).
You can see that default in hv_func.h in the source code. So for 64-bit builds it seems to be the StadtX "fast hash function", which is implemented in stadtx_hash.h.

Related

Why can Hashids be decoded?

I've used the popular library Hashids.
As this poster mentioned, hashes produced by these algorithms are designed to be 'one-way'. Then why is it possible for a hash value to be decoded?
I've read the documentation (and searched the issues), but don't see why hashes created by this library can be decoded.
I was about to ask this question in the git repo, but this is a question rather than an issue.
Any insight will be appreciated.
You can find the reason in the documentation on the project site:
How does it work?
Hashids works similarly to the way integers are converted to hex, but with a few exceptions:
The alphabet is not base16, but base62 by default.
The alphabet is also shuffled based on salt.
So, in short, this is not a hash at all, but merely an alternate encoding, closer to a simple substitution cipher than to a hash (which would work as a compression function and discard information). An encoding like this is, of course, pretty trivially reversible.
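To make the "integer to base62 with a shuffled alphabet" idea concrete, here is a minimal sketch in Rust. It is not the real Hashids algorithm: the function names and the salt-driven shuffle are made up for illustration. The only point is that anyone who knows the salt can rebuild the alphabet and reverse the encoding.

```rust
/// Base62 alphabet, shuffled deterministically from the salt.
/// The shuffle rule is hypothetical and is NOT the one Hashids uses.
fn shuffled_alphabet(salt: &str) -> Vec<u8> {
    let mut alphabet: Vec<u8> =
        b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".to_vec();
    let salt = salt.as_bytes();
    if salt.is_empty() {
        return alphabet;
    }
    // Fisher-Yates style pass; swap positions are derived from the salt bytes.
    let mut sum = 0usize;
    for i in (1..alphabet.len()).rev() {
        let s = salt[(alphabet.len() - 1 - i) % salt.len()] as usize;
        sum += s;
        let j = (s + sum + i) % i; // always in 0..i
        alphabet.swap(i, j);
    }
    alphabet
}

/// Write `n` in base `alphabet.len()`, using the shuffled alphabet as digits.
fn encode(mut n: u64, alphabet: &[u8]) -> String {
    let base = alphabet.len() as u64;
    let mut out = Vec::new();
    loop {
        out.push(alphabet[(n % base) as usize]);
        n /= base;
        if n == 0 {
            break;
        }
    }
    out.reverse();
    String::from_utf8(out).expect("alphabet is ASCII")
}

/// Reverse of `encode`: look each character up in the same alphabet.
/// Returns None for characters outside the alphabet or on overflow.
fn decode(s: &str, alphabet: &[u8]) -> Option<u64> {
    let base = alphabet.len() as u64;
    let mut n: u64 = 0;
    for b in s.bytes() {
        let digit = alphabet.iter().position(|&c| c == b)? as u64;
        n = n.checked_mul(base)?.checked_add(digit)?;
    }
    Some(n)
}

fn main() {
    let alphabet = shuffled_alphabet("my salt");
    let id = encode(12345, &alphabet);
    println!("{} -> {:?}", id, decode(&id, &alphabet));
}
```

No information is thrown away at any step, which is exactly why this is an encoding and not a hash.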

How can I change the formatter's decimal separator in Rust?

The function below results in "10.000". Where I live this means "ten thousand".
format!("{:.3}", 10.0);
I would like the output to be "10,000".
There is no support for internationalization (i18n) or localization (l10n) baked into the Rust standard library.
There are several reasons, in no particular order:
a locale-dependent output should be a conscious choice, not a default,
i18n and l10n are much more complicated than just formatting numbers,
the Rust std aims at being small.
The format! machinery is going to be used to write JSON or XML files. You really do NOT want to end up with a differently formatted file depending on the locale of the machine that encoded it. It's a recipe for disaster.
The detection of locale at run-time is also optimization unfriendly. Suddenly you cannot pre-compute things at compile-time (even partially), you cannot even know which size of buffer to allocate at compile-time.
And this ties in with a dubious usefulness. Dates and numbers are arguably important, however this American vs English formatting war is ultimately a drop in the ocean. A French grammar schooler will certainly appreciate that the number is formatted in the typical French format... but it will be of no avail to her if the surrounding text is in English (we French are notoriously bad at teaching/learning foreign languages). Locale should influence language selection, sorting order, etc... merely changing the format of numbers is pointless, everything should switch with it, and this requires much more serious support (check gettext for a C library that provides a good base).
Basing the detection of the locale on the host locale, and it being global to the whole process, is also a very dubious architectural choice in this age of multi-threaded web servers. Imagine if Facebook was served in Swedish in Europe just because its datacenter is running there.
Finally, all this language/date/... support requires a humongous amount of data. ICU has several dozens (or is it hundreds?) of MBs of such data embedded inside it. This would make the size of the std explode, and make it completely unsuitable for embedded development, which probably does not care about this anyway.
Of course, you could cut down on this significantly if you only chose to support a handful of languages... which is yet another argument for putting this outside the standard library.
Since the standard library doesn't have this functionality (localization of number format), you can just replace the dot with a comma:
fn main() {
println!("{}", format!("{:.3}", 10.0).replacen(".", ",", 1));
}
There are other ways of doing this, but this is probably the most straightforward solution.
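If this is needed in more than one place, the same trick can be wrapped in a small helper. This is only a sketch: format_decimal is a made-up name, and it swaps the separator without doing any real locale handling (no digit grouping, no locale lookup).

```rust
/// Format `value` with `decimals` fractional digits, using `sep` instead of '.'.
/// Hypothetical helper; it only swaps the decimal separator.
fn format_decimal(value: f64, decimals: usize, sep: char) -> String {
    // `{:.*}` takes the precision from the argument list.
    let plain = format!("{:.*}", decimals, value);
    // A formatted float contains at most one '.', so one replacement is enough.
    plain.replacen('.', &sep.to_string(), 1)
}

fn main() {
    println!("{}", format_decimal(10.0, 3, ','));
}
```

Passing the separator explicitly keeps the choice visible at the call site, in line with the point above that locale-dependent output should be a conscious decision.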
This is not the role of the format! macro. This option should be handled by Rust. Unfortunately, my search led me to the conclusion that Rust doesn't handle locales (yet?).
There is a library, rust-locale, but it is still in alpha.

Lua floating point operations

I run Lua on a CPU without dedicated floating point HW, depending on SW emulation.
From luaopt.h I can see that some macros are set to double, but it does not clearly state when floats are used, and it's a little hard to track.
If my script does simple stuff like:
a=0
a=a+1
for...
Would that involve floating point operations at any level?
If not, that's fine, but what is then the benefit of changing the macros to long?
(I tried of course but did not work....)
All numeric operations in Lua are performed (according to the default configuration) in floating point. There is no distinction made between floating point and integer, all values are simply numbers.
The actual C type used to store a Lua number is set in luaconf.h, and it is both allowed and even practical to change that to a suitable integral type. You start by changing LUA_NUMBER from double to int, long, or perhaps ptrdiff_t. Then you will find you need to tweak the related macros that control the conversions between strings and numbers. And, of course, you will likely need to eliminate most or all of the base math library since math.sin() and its friends and neighbors are not particularly useful over integers.
The result will be a Lua interpreter where all numbers are integers. The language will still allow you to type 3.14, but it will be stored as 3. Your code will likely not be completely portable to a Lua interpreter built with the standard configuration since a huge amount of Lua code casually assumes that floating point arithmetic is permitted, and remember that your compiled byte code will definitely not be compatible since byte code will store numbers as LUA_NUMBER.
There is the LNUM patch (used, for example, by the OpenWrt project, which relies heavily on Lua for providing a Web UI on hardware without an FPU) that allows a dual integer/floating point representation of numbers in Lua, with conversions happening behind the scenes when required. With it, most integer computations will be performed without resorting to floating point. Unfortunately, it's only applicable to Lua 5.1; 5.2 is not supported.

Is there a bcrypt implementation available for Delphi?

I'm trying to find a bcrypt implementation I can use in Delphi. About the only useful thing that Googling brings me is this download page, containing translated headers for a winapi unit called bcrypt.h. But when I look at the functionality it provides, bcrypt.h doesn't appear to actually contain any way to use the Blowfish algorithm to hash passwords!
I've found a few bcrypt implementations in C that I could build a DLL from and link to, except they seem to all require *nix or be GCC-specific, so that won't work either!
This is sorta driving me up the wall. I'd think that it would be easy to find an implementation, but that doesn't seem to be the case at all. Does anyone know where I could get one?
Okay, so i wrote it.
Usage:
hash: string;
hash := TBCrypt.HashPassword('mypassword01');
returns something like:
$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm
The useful thing about this (OpenBSD) style password hash is:
that it identifies the algorithm (2a = bcrypt)
the salt is automatically created for you, and shipped with the hash (Ro0CUfOqk6cXEKf3dyaM7O)
the cost factor parameter is also carried with the hash (10).
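The layout described above can be illustrated by splitting such a string back into its parts. The sketch below is in Rust and is not part of the Delphi units; BcryptParts and parse_bcrypt are made-up names, and the split assumes the 22-character salt followed by a 31-character digest seen in the example hash.

```rust
/// The fields carried inside an OpenBSD-style bcrypt hash string.
#[derive(Debug, PartialEq)]
struct BcryptParts<'a> {
    version: &'a str, // e.g. "2a"
    cost: u32,        // e.g. 10
    salt: &'a str,    // 22 characters
    digest: &'a str,  // 31 characters
}

/// Split "$<version>$<cost>$<salt><digest>" into its fields.
/// Hypothetical helper for illustration; it does not verify passwords.
fn parse_bcrypt(hash: &str) -> Option<BcryptParts<'_>> {
    let mut fields = hash.split('$');
    // The string starts with '$', so the first field must be empty.
    if !fields.next()?.is_empty() {
        return None;
    }
    let version = fields.next()?;
    let cost = fields.next()?.parse::<u32>().ok()?;
    let rest = fields.next()?;
    if fields.next().is_some() || rest.len() != 53 || !rest.is_ascii() {
        return None;
    }
    let (salt, digest) = rest.split_at(22);
    Some(BcryptParts { version, cost, salt, digest })
}

fn main() {
    let hash = "$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm";
    println!("{:?}", parse_bcrypt(hash));
}
```

Because everything needed for verification travels inside the string, a single text column is enough to store it.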
To check a password is correct:
isValidPassword: Boolean;
isValidPassword := TBCrypt.CheckPassword('mypassword01', hash);
BCrypt uses a cost factor, which determines how many iterations the key setup will go through. The higher the cost, the more expensive it is to compute the hash. The constant BCRYPT_COST contains the default cost:
const
BCRYPT_COST = 10; //cost determines the number of rounds. 10 = 2^10 rounds (1024)
In this case a cost of 10 means the key will be expanded and salted for 2^10 = 1,024 rounds. This is the commonly used cost factor at this point in time (early 21st century).
It is also interesting to note that, for no known reason, OpenBSD hashed passwords are converted to a Base-64 variant that is different from the Base64 used by everyone else on the planet. So TBCrypt contains a custom base-64 encoder and decoder.
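To show what "a different Base-64 variant" means in practice, here is a small Rust sketch (again not part of the Delphi code). It encodes the same bytes with the standard alphabet and with the alphabet bcrypt is commonly documented as using, both without '=' padding. encode_unpadded is a made-up name, and the sketch assumes that the alphabet and the missing padding are the only differences.

```rust
/// Alphabet used by OpenBSD-style bcrypt strings: "./" first, digits last.
const BCRYPT_ALPHABET: &[u8; 64] =
    b"./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
/// Alphabet of ordinary Base64.
const STANDARD_ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Base-64 encode `data` with the given alphabet and no '=' padding.
fn encode_unpadded(data: &[u8], alphabet: &[u8; 64]) -> String {
    let mut out = String::new();
    for chunk in data.chunks(3) {
        // Pack up to three bytes into a 24-bit group, first byte on top.
        let mut group = 0u32;
        for (i, &b) in chunk.iter().enumerate() {
            group |= (b as u32) << (16 - 8 * i);
        }
        // n input bytes yield n + 1 output characters.
        for i in 0..=chunk.len() {
            let index = (group >> (18 - 6 * i)) & 0x3f;
            out.push(alphabet[index as usize] as char);
        }
    }
    out
}

fn main() {
    println!("standard: {}", encode_unpadded(b"foo", STANDARD_ALPHABET));
    println!("bcrypt:   {}", encode_unpadded(b"foo", BCRYPT_ALPHABET));
}
```

With this scheme a 16-byte salt comes out as 22 characters, which matches the salt length in the example hash above.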
It's also useful to note that the hash algorithm version 2a is used to mean:
bcrypt
include the password's null terminator in the hashed data
unicode strings are UTF-8 encoded
So that is why the HashPassword and CheckPassword functions take a WideString (aka UnicodeString), and internally convert them to UTF-8. If you're running this on a version of Delphi where UnicodeString is a reserved word, then simply define out:
type
UnicodeString = WideString;
i, as David Heffernan knows, don't own Delphi XE 2. i added the UnicodeString alias, but didn't include compilers.inc and define away UnicodeString (since i don't know the define name, nor could i test it). What do you want from free code?
The code comprises two units:
Bcrypt.pas (which i wrote, with embedded DUnit tests)
Blowfish.pas (which Dave Barton wrote, and which i adapted and extended, fixing some bugs and adding DUnit tests).
Where on the intertubes can i put some code where it can live in perpetuity?
Update 1/1/2015: It was placed onto GitHub some time ago: BCrypt for Delphi.
Bonus 4/16/2015: There is now Scrypt for Delphi

Faster CompareText implementation for D2009

I'm extensively using hash map data structures in my program. I'm using a hash map implementation by Barry Kelly posted on the Codegear forums. That implementation internally uses RTL's CompareText function. Profiling made me realize that A LOT of time is spent in SysUtils CompareText function.
I had a look at the Fastcode site and found some faster implementations of CompareText. Unfortunately they seem not to work for D2009 and its Unicode strings.
Now for the question: Is there a similar faster version that supports D2009 strings? The CompareText function seems to be called a lot when using hash maps (at least in the implementation I'm currently using), so even small performance improvements could really make a difference. Or should the implementations presented there also work for Unicode strings?
Many of the FastCode functions will probably compile and appear to work just fine in Delphi 2009, but they won't be right for all input. The ones that are implemented in assembler will fail because they assume characters are just one byte each. The ones implemented in Delphi will fare a little better, but they'll still return incorrect results sometimes because the old CompareText's notion of "case-insensitive" is based on ASCII whereas the new one should be based on Unicode. The rules for which characters are considered the same save for case are much different for Unicode from how they are for ASCII.
Andreas says in a comment below that Unicode CompareText still uses the ASCII case-comparison rules, so a number of the FastCode functions should work fine. Just look them over before using them to make sure they're not making any character-size assumptions. I seem to recall that some FastCode functions were incorporated into the Delphi RTL already. I have no idea whether CompareText was one of them.
If you're calling CompareText a lot in a hash table, then that suggests your hash table isn't doing a very good job. CompareText should only get called when the hash of the thing you're searching for designates a non-empty bucket in the hash table. From there, a hash table will often use a linear search to find the right item in the bucket, and it will call CompareText for every item during that search. I don't know whether that's how the one you're using works.
You might solve this by using a different hash function that distributes its results more evenly over the available buckets. If your buckets are already evenly filled, then you may need more buckets (and then make sure the hash function still distributes evenly over that number as well).
If the hash-map class you're using is based on TBucketList, then there is room for improvement in the bucket storage. That class doesn't calculate a hash on the entire input. It uses the input only to determine the bucket to use. If the class would also keep track of the full hash computed for a string, then comparisons during the linear search could go much faster. Just compare the hashes, and only compare the strings when the hashes match completely. (For a 256-bucket bucket-list, the largest supported size, only one byte of the input determines the bucket, and the rest of the bytes are ignored.) I've written about TBucketList here before.
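The "compare the hashes first" idea can be sketched as follows. This is Rust, not Delphi, and it is not how TBucketList or the hash map from the question is implemented: TextMap, Entry and hash_ignore_case are made-up names, the hash is a plain FNV-1a over ASCII-lowercased bytes, and eq_ignore_ascii_case stands in for CompareText.

```rust
/// Case-insensitive (ASCII only) FNV-1a hash of a string.
fn hash_ignore_case(s: &str) -> u32 {
    let mut h: u32 = 0x811c9dc5;
    for b in s.bytes() {
        h ^= b.to_ascii_lowercase() as u32;
        h = h.wrapping_mul(0x01000193);
    }
    h
}

/// Each entry remembers the full hash of its key.
struct Entry {
    hash: u32,
    key: String,
    value: i32,
}

struct TextMap {
    buckets: Vec<Vec<Entry>>,
    string_compares: usize, // how often the expensive comparison ran
}

impl TextMap {
    fn new(bucket_count: usize) -> Self {
        TextMap {
            buckets: (0..bucket_count).map(|_| Vec::new()).collect(),
            string_compares: 0,
        }
    }

    fn bucket_index(&self, hash: u32) -> usize {
        hash as usize % self.buckets.len()
    }

    fn insert(&mut self, key: &str, value: i32) {
        let hash = hash_ignore_case(key);
        let idx = self.bucket_index(hash);
        let bucket = &mut self.buckets[idx];
        let found = bucket
            .iter()
            .position(|e| e.hash == hash && e.key.eq_ignore_ascii_case(key));
        match found {
            Some(i) => bucket[i].value = value,
            None => bucket.push(Entry { hash, key: key.to_string(), value }),
        }
    }

    fn get(&mut self, key: &str) -> Option<i32> {
        let hash = hash_ignore_case(key);
        let idx = self.bucket_index(hash);
        for entry in &self.buckets[idx] {
            // Cheap integer test first; skip the string compare on mismatch.
            if entry.hash != hash {
                continue;
            }
            self.string_compares += 1;
            if entry.key.eq_ignore_ascii_case(key) {
                return Some(entry.value);
            }
        }
        None
    }
}

fn main() {
    // A single bucket forces every key to share it, the worst case.
    let mut map = TextMap::new(1);
    map.insert("alpha", 1);
    map.insert("beta", 2);
    map.insert("gamma", 3);
    println!("{:?}", map.get("GAMMA"));
    println!("string compares: {}", map.string_compares);
}
```

Even with every key crammed into one bucket, the string comparison only runs for entries whose stored hash equals the hash of the search key, so a crowded bucket costs mostly integer comparisons.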
