I'm trying to find a bcrypt implementation I can use in Delphi. About the only useful thing that Googling brings me is this download page, containing translated headers for a winapi unit called bcrypt.h. But when I look at the functionality it provides, bcrypt.h doesn't appear to actually contain any way to use the Blowfish algorithm to hash passwords!
I've found a few bcrypt implementations in C that I could build a DLL from and link to, except they seem to all require *nix or be GCC-specific, so that won't work either!
This is sorta driving me up the wall. I'd think that it would be easy to find an implementation, but that doesn't seem to be the case at all. Does anyone know where I could get one?
Okay, so i wrote it.
Usage:
hash: string;
hash := TBCrypt.HashPassword('mypassword01');
returns something like:
$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm
The useful thing about this (OpenBSD) style password hash is:
that it identifies the algorithm (2a = bcrypt)
the salt is automatically created for you, and shipped with the hash (Ro0CUfOqk6cXEKf3dyaM7O)
the cost factor parameter is also carried with the hash (10).
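Because the three fields listed above travel inside the hash string, they can be recovered by splitting on the '$' separators. The following is only a sketch: TBcryptParts and ParseBcryptHash are hypothetical names made up for this illustration and are not part of TBCrypt. It assumes the usual layout of a 22-character salt followed by a 31-character digest.

```pascal
program ParseBcryptHashDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

type
  // Hypothetical record for illustration; not part of TBCrypt.
  TBcryptParts = record
    Valid: Boolean;
    Version: string; // e.g. '2a'
    Cost: Integer;   // e.g. 10
    Salt: string;    // 22 characters
    Digest: string;  // remaining 31 characters
  end;

// Splits '$<version>$<cost>$<salt><digest>' into its fields.
function ParseBcryptHash(const Hash: string): TBcryptParts;
var
  p2, p3: Integer;
  Rest: string;
begin
  Result.Valid := False;
  Result.Version := '';
  Result.Cost := -1;
  Result.Salt := '';
  Result.Digest := '';
  if (Length(Hash) < 4) or (Hash[1] <> '$') then
    Exit;
  p2 := PosEx('$', Hash, 2);       // end of the version field
  if p2 = 0 then
    Exit;
  p3 := PosEx('$', Hash, p2 + 1);  // end of the cost field
  if p3 = 0 then
    Exit;
  Rest := Copy(Hash, p3 + 1, MaxInt);
  if Length(Rest) <> 53 then       // 22 salt + 31 digest characters
    Exit;
  Result.Version := Copy(Hash, 2, p2 - 2);
  Result.Cost := StrToIntDef(Copy(Hash, p2 + 1, p3 - p2 - 1), -1);
  Result.Salt := Copy(Rest, 1, 22);
  Result.Digest := Copy(Rest, 23, 31);
  Result.Valid := Result.Cost >= 0;
end;

var
  Parts: TBcryptParts;
begin
  Parts := ParseBcryptHash(
    '$2a$10$Ro0CUfOqk6cXEKf3dyaM7OhSCvnwM9s4wIX9JeLapehKK5YdLxKcm');
  if Parts.Valid then
  begin
    WriteLn('version: ', Parts.Version);
    WriteLn('cost:    ', Parts.Cost);
    WriteLn('salt:    ', Parts.Salt);
    WriteLn('digest:  ', Parts.Digest);
  end
  else
    WriteLn('not a recognised bcrypt hash');
end.
```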
To check a password is correct:
isValidPassword: Boolean;
isValidPassword := TBCrypt.CheckPassword('mypassword01', hash);
BCrypt uses a cost factor, which determines how many iterations the key setup will go through. The higher the cost, the more expensive it is to compute the hash. The constant BCRYPT_COST contains the default cost:
const
BCRYPT_COST = 10; //cost determines the number of rounds. 10 = 2^10 rounds (1024)
In this case a cost of 10 means the key will be expanded and salted for 2^10 = 1,024 rounds. This is the commonly used cost factor at this point in time (early 21st century).
It is also interesting to note that, for no known reason, OpenBSD hashed passwords are converted to a Base-64 variant that is different from the Base64 used by everyone else on the planet. So TBCrypt contains a custom base-64 encoder and decoder.
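To make the difference concrete, here is a minimal sketch of an encoder for the OpenBSD variant. It is not the encoder from TBCrypt; BsdBase64Encode is a hypothetical name. It assumes the variant packs bits the same way as standard Base64 but uses the alphabet './A-Za-z0-9' and emits no '=' padding.

```pascal
program BsdBase64Demo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // OpenBSD bcrypt alphabet. Standard Base64 uses 'A-Za-z0-9+/' instead.
  BsdAlphabet: string =
    './ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';

// Hypothetical helper: encodes bytes with the bcrypt alphabet, unpadded.
function BsdBase64Encode(const Data: array of Byte): string;
var
  i, n, Acc: Integer;
begin
  Result := '';
  n := Length(Data);
  i := 0;
  while i < n do
  begin
    // Top 6 bits of the first byte.
    Result := Result + BsdAlphabet[(Data[i] shr 2) + 1];
    Acc := (Data[i] and $03) shl 4;
    Inc(i);
    if i >= n then
    begin
      Result := Result + BsdAlphabet[Acc + 1];
      Break;
    end;
    // Low 2 bits of the first byte plus top 4 bits of the second.
    Acc := Acc or (Data[i] shr 4);
    Result := Result + BsdAlphabet[Acc + 1];
    Acc := (Data[i] and $0F) shl 2;
    Inc(i);
    if i >= n then
    begin
      Result := Result + BsdAlphabet[Acc + 1];
      Break;
    end;
    // Low 4 bits of the second byte plus top 2 bits of the third.
    Acc := Acc or (Data[i] shr 6);
    Result := Result + BsdAlphabet[Acc + 1];
    // Low 6 bits of the third byte.
    Result := Result + BsdAlphabet[(Data[i] and $3F) + 1];
    Inc(i);
  end;
end;

var
  Salt: array[0..15] of Byte;
begin
  FillChar(Salt, SizeOf(Salt), 0);
  WriteLn(BsdBase64Encode([$FF, $FF, $FF]));  // standard Base64 gives '////'
  WriteLn(Length(BsdBase64Encode(Salt)));     // a 16-byte salt needs 22 characters
end.
```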
It's also useful to note that the hash algorithm version 2a is used to mean:
bcrypt
include the password's null terminator in the hashed data
unicode strings are UTF-8 encoded
So that is why the HashPassword and CheckPassword functions take a WideString (aka UnicodeString), and internally convert them to UTF-8. If you're running this on a version of Delphi where UnicodeString is already a built-in type (Delphi 2009 and later), then simply define out the alias:
type
UnicodeString = WideString;
i, as David Heffernan knows, don't own Delphi XE 2. i added the UnicodeString alias, but didn't include compilers.inc and define away UnicodeString (since i don't know the define name, nor could i test it). What do you want from free code?
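The version 2a input preparation described above (UTF-8 encode the password, then include the null terminator) can be sketched as follows. PasswordToKeyBytes is a hypothetical helper written for this illustration, not the code inside TBCrypt, and it assumes Delphi 2009 or later, where TEncoding and a native UnicodeString are available.

```pascal
program BcryptKeyBytes;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: UTF-8 bytes of the password plus a trailing #0.
function PasswordToKeyBytes(const Password: UnicodeString): TBytes;
begin
  Result := TEncoding.UTF8.GetBytes(Password);
  SetLength(Result, Length(Result) + 1);
  Result[High(Result)] := 0; // the null terminator is part of the hashed data
end;

var
  Key: TBytes;
  i: Integer;
begin
  // 'caf' followed by U+00E9; that character takes two bytes in UTF-8.
  Key := PasswordToKeyBytes('caf'#$00E9);
  for i := 0 to High(Key) do
    Write(IntToHex(Key[i], 2), ' ');
  WriteLn;
end.
```

The point of the sketch is that the byte count, not the character count, is what the key setup sees: four characters become six bytes here.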
The code comprises two units:
Bcrypt.pas (which i wrote, with embedded DUnit tests)
Blowfish.pas (which Dave Barton wrote, and which i adapted and extended, fixing some bugs and adding DUnit tests).
Where on the intertubes can i put some code where it can live in perpetuity?
Update 1/1/2015: It was placed onto GitHub some time ago: BCrypt for Delphi.
Bonus 4/16/2015: There is now Scrypt for Delphi
The Spring4D library has cryptography classes, however I cannot get them to work as expected. I'm probably using them incorrectly, but the lack of any examples makes it difficult.
For example on the website https://quickhash.com/hash-sha256-online, I can hash the word "test" to generate the following hash:
9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08
Using the Spring4D library, the following code produces a different hash:
CreateSHA256.ComputeHash('test').ToString;
results in:
9EFEA1AEAC9EDA04A892885A65FDAE0E6D9BE8C9FC96DA76D31B929262E12B1D
Upper/lower case aside, it is a different hash altogether. I know I must be doing something wrong, but again there are no examples of use, so I'm stuck on how to do this.
Hashing algorithms operate on binary data, typically represented using byte arrays.
Unfortunately, both of the resources you have used offer the ability to hash text. In order to hash text, you first need to convert from text to binary. To do so requires a choice of encoding. And neither method makes it clear what that choice is.
When I use this Delphi code:
LowerCase(CreateSHA256.ComputeHash(TEncoding.UTF8.GetBytes('test')).ToString)
I get the same hash as appears in your question.
I urge you never to attempt to encrypt/hash text and instead regard these operations as operating on binary. Always use an explicit encoding and then encrypt/hash the array of bytes that the encoding produced.
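To see why the choice of encoding changes the hash, it helps to look at the bytes that each encoding hands to the hash function. This small sketch does no hashing at all; it only dumps the bytes. BytesToHex is a helper written for this illustration, and the code assumes Delphi 2009 or later for TEncoding.

```pascal
program EncodingMatters;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Helper for this illustration: renders a byte array as hex digits.
function BytesToHex(const B: TBytes): string;
var
  i: Integer;
begin
  Result := '';
  for i := 0 to High(B) do
    Result := Result + IntToHex(B[i], 2);
end;

begin
  // Same text, two different byte sequences, therefore two different hashes.
  WriteLn('UTF-8:    ', BytesToHex(TEncoding.UTF8.GetBytes('test')));    // 74657374
  WriteLn('UTF-16LE: ', BytesToHex(TEncoding.Unicode.GetBytes('test'))); // 7400650073007400
end.
```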
I've picked the UTF-8 encoding here, because it is a full Unicode encoding, and tends to be efficient in terms of space. However, I don't think your online encoder uses UTF-8. In fact I've no idea what encoding it uses, it is unclear on the matter. This is of course the same old issue of text being different from binary.
In my opinion it is a design flaw of the Delphi library that you use that it allows you to hash text without an explicit choice of encoding. If this library must offer a function that hashes text, then it should require the caller to supply an extra TEncoding parameter.
There is no conversion going on internally so it hashes the UnicodeString which is at least 2 bytes per character.
If you want the same result as on the page you have to use UTF8Encode or directly pass as AnsiString.
However I tried some strings that contained different unicode characters and the page returned a different result. So I am not quite sure how they treat the strings there. I guess it's a codepage thing.
Edit: If you use this page http://www.xorbin.com/tools/sha256-hash-calculator it generates the same hash as TSHA256 with UTF8Encode.
Which type of string are you using? Do you use AnsiString or UnicodeString? Delphi 2009 and newer use UnicodeString by default.
Why is the string type important? All hashing algorithms operate on raw byte data, so it matters whether each character of your string is stored in one byte of memory (AnsiString) or in multiple bytes of memory (UnicodeString).
There's a bunch of ways you can compare strings in modern Delphi (say 2010-XE3):
'<=' operator which resolves to UStrCmp / LStrCmp
CompareStr
AnsiCompareStr
Can someone give (or point to) a description of what those methods do, in principle?
So far I've figured that AnsiCompareStr calls CompareString on Windows, which is a "textual" comparison (i.e. takes into account unicode combined characters etc). Simple CompareStr does not do that and seems to do a binary comparison instead.
But what is the difference between CompareStr and UStrCmp? Between UStrCmp and LStrCmp? Do they all produce identical results? Do those results change between versions of Delphi?
I'm asking because I need a comparison which will always produce the same results, so that indexes in an app built with one version of Delphi remain consistent with code built with another.
AnsiCompareStr is specified as taking locale into account, and should return identical results regardless of Delphi version, but may return different results based on Windows version and/or settings. CompareStr is a pure binary comparison: "The comparison operation is based on the 16-bit ordinal value of each character and is not affected by the current locale" (for the CompareStr(const S1, S2: string) overload). UStrCmp also uses a pure binary comparison: "Strings are compared according to the ordinal values that make up the characters that make up the string." So there should not be a difference between the latter two. The way they return the result is different, so two implementations are needed (although it would be possible to make one rely on the other).
As for the differences between LStrCmp and UStrCmp, LStrCmp takes AnsiStrings, UStrCmp takes UnicodeStrings. It's entirely possible that two characters (let's say A and B) are ordered in the misnamed "ANSI" code page as A < B, but are ordered in Unicode as A > B. You should almost always just use the comparison appropriate for the data you have.
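A short sketch of the distinction drawn above: the ordinal comparisons put 'a' after 'B' because 97 > 66, while the locale-aware comparison may order them the other way round. SignOf is a helper written for this illustration. The AnsiCompareStr result depends on the system locale, so no particular value is claimed for it here.

```pascal
program CompareDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Helper for this illustration: reduces a comparison result to -1, 0 or 1.
function SignOf(Value: Integer): Integer;
begin
  if Value < 0 then
    Result := -1
  else if Value > 0 then
    Result := 1
  else
    Result := 0;
end;

var
  A, B: string;
begin
  A := 'a';
  B := 'B';
  // Ordinal comparison: stable across locales, suitable for persistent indexes.
  WriteLn('CompareStr:     ', SignOf(CompareStr(A, B)));
  WriteLn('A < B operator: ', A < B);
  // Locale-aware comparison: the result depends on Windows settings.
  WriteLn('AnsiCompareStr: ', SignOf(AnsiCompareStr(A, B)));
end.
```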
I'm using the DCPcrypt library in Delphi 2007 for encrypting text for an in-house app.
I'm currently using the following code (not my actual key):
Cipher := TDCP_rijndael.Create(nil);
try
  Cipher.InitStr('5t#ck0v3rf10w', TDCP_md5);
  Result := Cipher.EncryptString('Test string');
finally
  Cipher.Burn;
  Cipher.Free;
end;
The comment for InitStr is:
Do key setup based on a hash of the key string
Will exchanging the MD5 algorithm for, say, SHA2-256 or SHA2-512 make any theoretical or actual difference to the strength of the encryption?
The direct answer to your question is 'No' - it won't make any appreciable difference to cryptographic strength. Yes, MD5 is broken, but really its weakness does not make any difference in this particular application. AES has key sizes of 128, 192 and 256 bits. All you are doing here is creating a string pseudonym for a key (being either 16 bytes, 24 bytes or 32 bytes). When cryptographic experts say that a hash function is broken, what they mean by this is that given a known hash output, it is feasible to compute a message different from the original message, which also hashes to the same output. In other words, in order for the cryptographic strength or weakness of the hash function to have any meaning, the binary key must already be known to the malicious party, which means that it is only relevant when your security is already completely defeated.
The strength of the hashing algorithm is COMPLETELY irrelevant to the strength of the symmetric cipher.
However...
Of much more serious concern is the lack of salting in your code. Unless you plan to manually salt your message (unlikely), your communications are very vulnerable to replay attack. This will be infinitely worse if you use ECB mode, but without salting, it is a major security issue for any mode. 'Salting' means injecting a sufficiently large non-predictable non-repeating value in either the IV or at the head of the message before encryption.
This highlights a huge problem with DCPcrypt. Most users of DCPcrypt will not know enough about cryptography to appreciate the importance of proper salting, and will use the crypto component in exactly the way you have. When you use DCPcrypt in this way (which is very natural), DCPcrypt does NOT salt. In fact, it sets the IV to zero. And it gets worse... If you have chosen a key-streaming type of chaining mode (which is very popular), and your IV is habitually zero, your security will be completely and utterly broken if a single plaintext message is known or guessed (OR even just a fragment of the message is guessed). DCPcrypt does offer an alternative way to initialize a binary key (not from string), together with allowing the user to set the IV (you must generate a random IV yourself). The next problem is that the whole IV management gets a bit complicated.
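The keystream problem can be demonstrated without any real cipher. In the sketch below, FixedKeystream is a toy stand-in for what a key-streaming mode produces when the key and IV never change; it is not AES and not DCPcrypt code. All names are made up for this illustration. The attack itself is just XOR: one known plaintext reveals the keystream, which then decrypts every other message.

```pascal
program KeystreamReuse;
{$APPTYPE CONSOLE}

uses
  SysUtils;

function XorBytes(const A, B: TBytes): TBytes;
var
  i, n: Integer;
begin
  n := Length(A);
  if Length(B) < n then
    n := Length(B);
  SetLength(Result, n);
  for i := 0 to n - 1 do
    Result[i] := A[i] xor B[i];
end;

// Toy stand-in for a cipher keystream: same key + same IV = same bytes.
function FixedKeystream(Len: Integer): TBytes;
var
  i: Integer;
begin
  SetLength(Result, Len);
  for i := 0 to Len - 1 do
    Result[i] := Byte((i * 73 + 41) and $FF);
end;

function Encrypt(const Plain: string): TBytes;
var
  P: TBytes;
begin
  P := TEncoding.ASCII.GetBytes(Plain);
  Result := XorBytes(P, FixedKeystream(Length(P)));
end;

// The attacker knows one plaintext/ciphertext pair and a second ciphertext.
function RecoverSecondPlaintext(const Cipher1: TBytes; const KnownPlain1: string;
  const Cipher2: TBytes): string;
var
  Keystream: TBytes;
begin
  Keystream := XorBytes(Cipher1, TEncoding.ASCII.GetBytes(KnownPlain1));
  Result := TEncoding.ASCII.GetString(XorBytes(Cipher2, Keystream));
end;

begin
  // Prints the second message without ever using the key.
  WriteLn(RecoverSecondPlaintext(Encrypt('ATTACK AT DAWN'), 'ATTACK AT DAWN',
    Encrypt('RETREAT AT TEN')));
end.
```

A fresh random IV per message makes the keystream differ each time, which is what defeats this recovery.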
Disclosure
I am the author of TurboPower LockBox 3. Dave Barton's DCPcrypt, an admirable and comprehensive engineering work, was one of my inspirations for writing LockBox 3.
You should specify the type of attack on your encryption. Suppose a known-plaintext attack is used, and the intruder uses precomputed hash values to find the key string. Then there should be no difference between the hash algorithms used: any hash algorithm will require nearly the same time to find the key string.
I finally upgraded to Delphi XE. I have a library of units where I use strings to store plain ANSI characters (chars between A and U). I am 101% sure that I will never ever use UNICODE characters in those places.
I want to convert all other libraries to Unicode, but for this specific library I think it will be better to stick with ANSI. The advantage is the memory requirement, as in some cases I load very large TXT files (containing ONLY Ansi characters). The disadvantage might be that I have to do lots and lots of typecasts when I make those libraries interact with normal (Unicode) libraries.
Are there some general guidelines to show when it is good to convert to Unicode and when to stick with Ansi?
The problem with general guidelines is that something like this can be very specific to a person's situation. Your example here is one of those.
However, for people Googling and arriving here, some general guidelines are:
Yes, convert to Unicode. Don't try to keep an old app fully using AnsiStrings. The reason is that the whole VCL is Unicode, and you shouldn't try to mix the two, because you will convert every time you assign a Unicode string to an ANSI string, and that is a lossy conversion. Trying to keep the old way because it's less work (or some similar reason) will cause you pain; just embrace the new string type, convert, and go with it.
Instead of randomly mixing the two, explicitly perform any conversions you need to, once - for example, if you're loading data from an old version of your program you know it will be ANSI, so read it into a Unicode string there, and that's it. Ever after, it will be Unicode.
You should not need to change the type of your string variables - string pre-D2009 is ANSI, and in D2009 and later is Unicode. Instead, follow compiler warnings and watch which string methods you use - some still take an AnsiString parameter and I find it all confusing. The compiler will tell you.
If you use strings to hold bytes (in other words, using them as an array of bytes because a character was a byte) switch to TBytes.
You may encounter specific problems for things like encryption (strings are no longer byte/characters, so 'character' for 'character' you may get different output); reading text files (use the stream classes and TEncoding); and, frankly, miscellaneous stuff. Search here on SO, most things have been asked before.
Commenters, please add more suggestions... I mostly use C++Builder, not Delphi, and there are probably quite a few specific things for Delphi I don't know about.
Now for your specific question: should you convert this library?
If:
The values between A and U are truly only ever in this range, and
These values represent characters (A really is A, not byte value 65 - if so, use TBytes), and
You load large text files and memory is a problem
then not converting to Unicode, and instead switching your strings to AnsiStrings, makes sense.
Be aware that:
There is an overhead every time you convert from ANSI to Unicode
You could use UTF8String, which is a specific type of AnsiString that will not be lossy when converted, and will still store most text (Roman characters) in a single byte
Changing all the instances of string to AnsiString could be a bit of work, and you will need to check all the methods called with them to see if too many implicit conversions are being performed (for performance), etc
You may need to change the outer layer of your library to use Unicode so that conversion code or ANSI/Unicode compiler warnings are not visible to users of your library
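If you convert to Unicode, sets of characters (the syntax is if C in MySet) no longer work cleanly: a set can only hold single-byte values, so the compiler warns that the WideChar is reduced to a byte char and suggests CharInSet instead. From your description of characters A to U, I could guess you would like to use this syntax.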
My recommendation? Personally, the only reason I would do this from the information you've given is the memory use, and possibly performance depending on what you're doing with this huge amount of A..Us. If that truly is significant, it's both the driver and the constraint, and you should convert to ANSI.
You should be able to wrap up the conversion at the interface between this unit and its clients. Use AnsiString internally and string everywhere else and you should be fine.
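A minimal sketch of that boundary, assuming Delphi 2009 or later: the unit stores AnsiString internally and exposes plain string to its clients, with the explicit casts confined to two functions. ToCompact and FromCompact are hypothetical names, and the A..U validation is only there to mirror the question's data.

```pascal
program AnsiInside;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Boundary in: validate, then convert once to one byte per character.
function ToCompact(const S: string): AnsiString;
var
  i: Integer;
begin
  for i := 1 to Length(S) do
    if not CharInSet(S[i], ['A'..'U']) then
      raise EConvertError.CreateFmt('Character "%s" is outside A..U', [S[i]]);
  Result := AnsiString(S); // explicit cast, so no implicit-conversion warning
end;

// Boundary out: convert back to the default string type for callers.
function FromCompact(const A: AnsiString): string;
begin
  Result := string(A);
end;

var
  Text: string;
  Stored: AnsiString;
begin
  Text := 'ABCDEFGHIJKLMNOPQRSTU';
  Stored := ToCompact(Text);
  WriteLn('as string:     ', Length(Text) * SizeOf(Char), ' bytes');
  WriteLn('as AnsiString: ', Length(Stored) * SizeOf(AnsiChar), ' bytes');
  WriteLn('round trip ok: ', FromCompact(Stored) = Text);
end.
```

Keeping the casts in one place means the rest of the unit never mixes the two string types, and clients never see an AnsiString.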
In general only use AnsiString if it is important that the Chars are single bytes; otherwise the use of string ensures future compatibility with Unicode.
You need to check all libraries anyway because all Windows API functions in Delphi XE are replaced by their Unicode analogues, etc. If you will never use Unicode you need to use Delphi 7.
Use AnsiString explicitly everywhere in this unit and then you'll get compiler warnings (which you should never ignore) for String to AnsiString conversions if you happen to access the routines incorrectly.
Alternately, perhaps preferably depending on your situation, simply convert everything to UTF8.
Stick with Ansi strings ONLY if you do not have the time to convert the code properly. The use of Ansi strings is really only for backward compatibility - to my knowledge C# does not have an equivalent to Ansi strings. Otherwise use the standard Unicode strings. If you have a look on my web-site I have a whole strings routines unit (about 5,000 LOC) that works with both Delphi 2007 (non-Unicode) and XE (Unicode) with only "string" interfaces and contains almost all of the conversion issues you might face.
I'm extensively using hash map data structures in my program. I'm using a hash map implementation by Barry Kelly posted on the Codegear forums. That implementation internally uses RTL's CompareText function. Profiling made me realize that A LOT of time is spent in SysUtils CompareText function.
I had a look at the Fastcode site and found some faster implementations of CompareText. Unfortunately they seem not to work for D2009 and its Unicode strings.
Now for the question: Is there a similar faster version that supports D2009 strings? The CompareText function seems to be called a lot when using hash maps (at least in the implementation I'm currently using), so even small performance improvements could really make a difference. Or should the implementations presented there also work for Unicode strings?
Many of the FastCode functions will probably compile and appear to work just fine in Delphi 2009, but they won't be right for all input. The ones that are implemented in assembler will fail because they assume characters are just one byte each. The ones implemented in Delphi will fare a little better, but they'll still return incorrect results sometimes because the old CompareText's notion of "case-insensitive" is based on ASCII whereas the new one should be based on Unicode. The rules for which characters are considered the same save for case are much different for Unicode from how they are for ASCII.
Andreas says in a comment below that Unicode CompareText still uses the ASCII case-comparison rules, so a number of the FastCode functions should work fine. Just look them over before using them to make sure they're not making any character-size assumptions. I seem to recall that some FastCode functions were incorporated into the Delphi RTL already. I have no idea whether CompareText was one of them.
If you're calling CompareText a lot in a hash table, then that suggests your hash table isn't doing a very good job. CompareText should only get called when the hash of the thing you're searching for designates a non-empty bucket in the hash table. From there, a hash table will often use a linear search to find the right item in the bucket, and it will call CompareText for every item during that search. I don't know whether that's how the one you're using works.
You might solve this by using a different hash function that distributes its results more evenly over the available buckets. If your buckets are already evenly filled, then you may need more buckets (and then make sure the hash function still distributes evenly over that number as well).
If the hash-map class you're using is based on TBucketList, then there is room for improvement in the bucket storage. That class doesn't calculate a hash on the entire input. It uses the input only to determine the bucket to use. If the class would also keep track of the full hash computed for a string, then comparisons during the linear search could go much faster. Just compare the hashes, and only compare the strings when the hashes match completely. (For a 256-bucket bucket-list, the largest supported size, only one byte of the input determines the bucket, and the rest of the bytes are ignored.) I've written about TBucketList here before.
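The idea of comparing stored hashes before comparing strings can be sketched like this. It is not TBucketList and not Barry Kelly's hash map; TEntry, HashText and FindInBucket are hypothetical names, and the hash is a case-insensitive FNV-1a chosen only for illustration. It folds case with UpCase, which follows the same ASCII-only rule the answer attributes to CompareText.

```pascal
program HashFirstSearch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TEntry = record
    Hash: Cardinal; // full hash, stored once when the entry is added
    Key: string;
    Value: Integer;
  end;
  TBucket = array of TEntry;

// Case-insensitive FNV-1a over the characters (ASCII case folding only).
function HashText(const S: string): Cardinal;
var
  i: Integer;
begin
  Result := 2166136261;
  for i := 1 to Length(S) do
  begin
    Result := Result xor Cardinal(Ord(UpCase(S[i])));
    // Multiply in 64 bits and mask, so overflow checking never triggers.
    Result := Cardinal((Int64(Result) * 16777619) and $FFFFFFFF);
  end;
end;

procedure AddEntry(var Bucket: TBucket; const Key: string; Value: Integer);
var
  n: Integer;
begin
  n := Length(Bucket);
  SetLength(Bucket, n + 1);
  Bucket[n].Hash := HashText(Key);
  Bucket[n].Key := Key;
  Bucket[n].Value := Value;
end;

// Returns the entry index or -1. CompareText runs only when the hashes match.
function FindInBucket(const Bucket: TBucket; const Key: string;
  var TextCompares: Integer): Integer;
var
  i: Integer;
  h: Cardinal;
begin
  h := HashText(Key);
  for i := 0 to High(Bucket) do
    if Bucket[i].Hash = h then
    begin
      Inc(TextCompares);
      if CompareText(Bucket[i].Key, Key) = 0 then
      begin
        Result := i;
        Exit;
      end;
    end;
  Result := -1;
end;

var
  Bucket: TBucket;
  Compares, Idx: Integer;
begin
  AddEntry(Bucket, 'alpha', 1);
  AddEntry(Bucket, 'beta', 2);
  AddEntry(Bucket, 'gamma', 3);
  Compares := 0;
  Idx := FindInBucket(Bucket, 'GAMMA', Compares);
  WriteLn('index: ', Idx, ', CompareText calls: ', Compares);
end.
```

With a reasonable hash, a lookup in a crowded bucket costs a handful of integer comparisons and usually a single CompareText call.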