DCPcrypt Hashing German Umlauts - Delphi

I am using DCPcrypt and SHA512 to hash strings.
I am using the version by Warren Postma: https://bitbucket.org/wpostma/dcpcrypt2010
It is working fine. However, it fails with German umlauts like ä, ö, ü, and probably other Unicode characters.
I am using the library like this:
function TForm1.genhash(str: string): string;
var
  Hash: TDCP_sha512;
  Digest: array[0..63] of byte;
  i: integer;
  s: string;
begin
  s := '';
  Hash := TDCP_sha512.Create(nil);
  if Hash <> nil then
  begin
    try
      Hash.Init;
      Hash.UpdateStr(str);
      Hash.Final(Digest);
      for i := 0 to Length(Digest) - 1 do
        s := s + IntToHex(Digest[i], 2);
    finally
      Hash.Free;
    end;
  end;
  Result := s;
end;
When I input the letter ä, I expect the output to be:
64868C5784A6004E675BCF405F549369BF607CD3269C0CAC1711E21BA9F40A5ABBF0C7535856E7CF77EA55A072DD04AA89EEA361E95F497AA965309B50587157
I checked it with those sites:
http://hashgenerator.de/
http://passwordsgenerator.net/sha512-hash-generator/
However, I get:
1A7F725BD18E062020A646D4639F264891368863160A74DF2BFC069C4DADE04E6FA854A2474166EED0914B922A9D8BE0C89858D437DDD7FBCA5C9C89FC07323A
So my question is:
How can I use the DCPcrypt library to generate hashes for German umlauts? Thanks

This must be the single most common mistake that people make with hashing and encryption. These algorithms operate on binary data, but you are passing text. Something, somewhere, has to encode that text as binary. And what encoding should be used? How do you know that your library uses the same encoding as the online tool? You don't.
So, here's a rule for you to follow. Never hash text. Just don't do it. Encode the text as binary using a well-defined, explicitly chosen encoding, and hash that. I suggest UTF-8; TEncoding.UTF8.GetBytes(...) is your friend here.
Now, looking at the actual detail here, you are calling this method:
procedure UpdateStr(const Str: RawByteString);
The RawByteString parameter means that your Unicode text is being converted to an ANSI string with the default system code page. I'm sure that's not what you intend to happen. Indeed, the compiler says this:
[dcc32 Warning] W1058 Implicit string cast with potential data loss from 'string' to 'RawByteString'
So the compiler is telling you that you are doing something wrong. You really must take good heed of compiler messages.
Now, you could call UpdateUnicodeStr instead of UpdateStr. But again, how do you know what encoding is used? It happens to be the native internal encoding, UTF-16LE.
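For comparison, here is roughly what UpdateUnicodeStr ends up hashing - a sketch in terms of the library's general-purpose Update method, not its exact source:
Hash.Update(Pointer(str)^, Length(str) * SizeOf(Char)); // the raw UTF-16LE payload
That is why it could never match the online tools, which encode the text as UTF-8.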
But let's follow my rule of never hashing text.
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, DCPsha512;

function genhash(str: string): string;
var
  Bytes: TBytes;
  Hash: TDCP_sha512;
  Digest: array[0..63] of byte;
begin
  Bytes := TEncoding.UTF8.GetBytes(str); // encode text as UTF-8 bytes
  Hash := TDCP_sha512.Create(nil);
  try
    Hash.Init;
    Hash.Update(Pointer(Bytes)^, Length(Bytes));
    Hash.Final(Digest);
  finally
    Hash.Free;
  end;
  // convert the digest to a hex hash string
  SetLength(Result, Length(Digest) * 2);
  BinToHex(Digest, PChar(Result), Length(Digest));
end;

begin
  Writeln(genhash('ä'));
  Readln;
end.
Output
64868C5784A6004E675BCF405F549369BF607CD3269C0CAC1711E21BA9F40A5ABBF0C7535856E7CF77EA55A072DD04AA89EEA361E95F497AA965309B50587157
Note that I simplified the code in some other ways. I removed the local string variable and worked directly with Result. I used BinToHex from the Classes unit to do the digest-to-hex conversion. I also changed this code:
Hash := TDCP_sha512.Create(nil);
if Hash <> nil then
....
to remove the if statement, which is not needed: if a constructor fails, an exception is raised rather than nil being returned.
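To make that concrete:
Hash := TDCP_sha512.Create(nil);
// Control only reaches this point if Create succeeded, so the
// nil test is always True and can simply be dropped:
if Hash <> nil then
  Writeln('always reached');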
Please follow my rule never to hash text. It will serve you well!

Related

Base64 decode fails with "No mapping for the Unicode character exists in the target multi-byte code page."

I am trying to create the signature string needed for Azure Storage Tables as described here:
https://learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-shared-key#encoding-the-signature
The Access key is "p1qPD2lBiMlDtl/Vb/T/Xt5ysd/ZXNg5AGrwJvtrUbyqiAQxGabWkR9NmKWb8Jhd6ZIyBd9HA3kCRHr6eveKaA=="
The failing code is:
property AccessKey: string read FAccessKey write SetAccessKey;

function TAzureStorageAPI.GetAuthHeader(StringToSign: UTF8String): UTF8String;
var
  AccessKey64: string;
begin
  AccessKey64 := URLEncoding.Base64.Decode(AccessKey);
  Result := URLEncoding.Base64.Encode(CalculateHMACSHA256(StringToSign, AccessKey64));
end;
What can I do to get this working?
I have checked that AccessKey decodes on https://www.base64decode.org/ and it converts to strange data, but it is what I am supposed to do... :-/
It looks like CalculateHMACSHA256() comes from blind copying, with everything fiddled until the compiler was happy. But there are still logical flaws all over. Without testing, I'd write it this way:
function GetAuthHeader
( StringToSign: UTF8String  // Expecting UTF-8 encoded text
; AzureAccountKey: String   // Base64 is ASCII, so text encoding is almost irrelevant
): String;                  // Result is ASCII, too, not UTF-8
var
  HMAC: TIdHMACSHA256;
begin
  HMAC := TIdHMACSHA256.Create;
  try
    // Your key must be treated as pure binary - don't confuse it with
    // a human-readable text-like password. This corresponds to the part
    // "Base64.decode(<your_azure_storage_account_shared_key>)"
    HMAC.Key := URLEncoding.Base64.DecodeStringToBytes( AzureAccountKey );
    Result := URLEncoding.Base64.EncodeBytesToString   // The part "Signature=Base64(...)"
    ( HMAC.HashValue                                   // The part "HMAC-SHA256(...,...)"
      ( IndyTextEncoding_UTF8.GetBytes( StringToSign ) // The part "UTF8(StringToSign)"
      )
    );
  finally
    HMAC.Free;
  end;
end;
No guarantee, as I have no chance to compile it myself.
Using ToHex() anywhere makes no sense if you encode it all in Base64 anyway, and stuffing everything into UTF8String makes no sense either. Hashing always means working with bytes, not text - never mix up binary with String.
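The resulting signature then goes into the Authorization header in the SharedKey format from the linked Microsoft documentation. A hypothetical usage sketch - AuthHeaderValue and AccountName are placeholder names, not part of the code above:
AuthHeaderValue := 'SharedKey ' + AccountName + ':'
  + GetAuthHeader( StringToSign, AzureAccountKey );
// used as:  Authorization: SharedKey <AccountName>:<Signature>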

LockBox 3 Encrypting not matching online tool, suspect padding or key problem

I have a client providing an API that dictates that the data sent to them must be encrypted with AES, a 128-bit key, ECB mode, and PKCS5Padding. I'm trying to use LockBox 3 in Delphi 10.3 Rio and am not getting the same encrypted string as an online test tool they pointed to for verification. It is close, but not quite there.
With lots of reading here about Unicode, PKCS5Padding, and related questions, I've come to the end of what to try. I must admit I don't do very much with encryption and have been reading as much as I can to get my head around this before I ask questions.
There are a couple of things I'd like confirmation on:
The difference between a password and a key. I've read that LB3 uses a password to generate a key, but I have specific instructions from the client on how the key is to be generated, so I've made my own Base64-encoded key and am calling InitFromStream to initialize it. I believe this takes the place of setting a password; is that right? Or maybe passwords are only used by asymmetric ciphers (not symmetric ones, like AES)?
PKCS5Padding: I was worried by something I read on the LB3 Help site that said padding is done intelligently depending on the choice of cipher, chaining mode, etc. So does that mean there's no way to force it to use a specific padding method? I've converted the data to a byte array and implemented my own PKCS5Padding, but I think LB3 may still be padding beyond that. (I've tried looking through the code and have not found any evidence this is what it's doing.)
Should I use a different encryption library in Delphi to accomplish this? I've checked out DelphiEncryptionCompendium and DCPcrypt v2, but LB3 seems to have the most support and I felt it was the easiest to work with, especially in my Unicode version of Delphi. Plus I have used LockBox 2 quite a bit in years past, so I figured it would be more familiar (this turned out not to be the case).
To illustrate what I've tried, I extracted my code from the project to a console application. Perhaps my assumptions above are correct and there's a glaring error in my code or a LB3 parameter I don't understand that someone will point out:
program LB3ConsoleTest;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, System.Classes, System.NetEncoding,
  uTPLb_Codec, uTPLb_CryptographicLibrary,
  uTPLb_StreamUtils, uTPLb_Constants;

var
  Codec: TCodec;
  CryptographicLibrary: TCryptographicLibrary;

function PKCS5PadStringToBytes(RawData: string; const PadSize: Integer): TBytes;
{ implement our own block padding }
var
  DataLen: Integer;
  PKCS5PaddingCount: ShortInt;
begin
  Result := TEncoding.UTF8.GetBytes(RawData);
  DataLen := Length(RawData);
  PKCS5PaddingCount := PadSize - DataLen mod PadSize;
  if PKCS5PaddingCount = 0 then
    PKCS5PaddingCount := PadSize;
  Inc(DataLen, PKCS5PaddingCount);
  SetLength(Result, DataLen);
  FillChar(Result[DataLen - PKCS5PaddingCount], PKCS5PaddingCount, PKCS5PaddingCount);
end;

procedure InitializeAESKey(const AESKey: string);
{ convert the string to a byte array,
  use that to initialize a ByteStream,
  and call LB3's InitFromStream }
var
  AESKeyBytes: TBytes;
  AESKeyStream: TBytesStream;
begin
  AESKeyBytes := TEncoding.UTF8.GetBytes(AESKey);
  AESKeyStream := TBytesStream.Create(AESKeyBytes);
  Codec.InitFromStream(AESKeyStream);
end;

const
  RawData = '{"invoice_id":"456456000018047","clerk_id":"0023000130234234","trans_amount":1150034534,"cust_code":"19455605000987890641","trans_type":"TYPE1"}';
  AESKeyStr = 'CEAA31AD1EE4BDC8';

var
  DataBytes: TBytes;
  DataStream: TBytesStream;
  ResultStream: TBytesStream;
  ResultBytes: TBytes;
  Base64Encoder: TBase64Encoding;

begin
  // create the LockBox3 objects
  Codec := TCodec.Create(nil);
  CryptographicLibrary := TCryptographicLibrary.Create(nil);
  try
    // setup LB3 for AES, 128-bit key, ECB
    Codec.CryptoLibrary := CryptographicLibrary;
    Codec.StreamCipherId := uTPLb_Constants.BlockCipher_ProgId;
    Codec.BlockCipherId := Format(uTPLb_Constants.AES_ProgId, [128]);
    Codec.ChainModeId := uTPLb_Constants.ECB_ProgId;
    // prep the data, the key, and the resulting stream
    DataBytes := PKCS5PadStringToBytes(RawData, 8);
    DataStream := TBytesStream.Create(DataBytes);
    InitializeAESKey(AESKeyStr);
    ResultStream := TBytesStream.Create;
    // ENCRYPT!
    Codec.EncryptStream(DataStream, ResultStream);
    // take the result stream, convert it to a byte array
    ResultStream.Seek(0, soFromBeginning);
    ResultBytes := Stream_to_Bytes(ResultStream);
    // convert the byte array to a Base64-encoded string and display
    Base64Encoder := TBase64Encoding.Create(0);
    Writeln(Base64Encoder.EncodeBytesToString(ResultBytes));
    Readln;
  finally
    Codec.Free;
    CryptographicLibrary.Free;
  end;
end.
This program generates an encrypted string 216 characters long, with only the last 25 characters differing from what the online tool produces.
Why?
AES uses 16-byte blocks, not 8-byte blocks.
So you need PKCS#7 padding with a 16-byte block size, not PKCS#5, which is defined for 8-byte blocks.
Please try
DataBytes := PKCS5PadStringToBytes(RawData, 16);
Also consider moving away from the ECB chaining mode, which is weak and should be avoided for any serious work.
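For completeness, the matching removal step after decryption could look like this - a minimal sketch under my own naming (PKCS7UnpadBytes), untested, requiring System.SysUtils for Exception:
function PKCS7UnpadBytes(const Data: TBytes): TBytes;
var
  PadCount: Integer;
begin
  if Length(Data) = 0 then
    raise Exception.Create('Empty input');
  PadCount := Data[High(Data)]; // the last byte holds the pad length
  if (PadCount < 1) or (PadCount > 16) or (PadCount > Length(Data)) then
    raise Exception.Create('Invalid padding'); // 16 = AES block size
  Result := Copy(Data, 0, Length(Data) - PadCount);
end;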

Encoding in Indy 10 and Delphi

I am using Indy 10 with Delphi. The following is my code, which uses Indy's EncodeString method to encode a string.
var
  EncodedString: String;
  StringToBeEncoded: String;
  EncoderMIME: TIdEncoderMIME;
....
....
EncodedString := EncoderMIME.EncodeString(StringToBeEncoded);
I am not getting the correct value in the encoded string.
What is the purpose of IndyTextEncoding_OSDefault?
Here's the source code for IndyTextEncoding_OSDefault.
function IndyTextEncoding_OSDefault: IIdTextEncoding;
var
  LEncoding: IIdTextEncoding;
begin
  if GIdOSDefaultEncoding = nil then begin
    LEncoding := TIdMBCSEncoding.Create;
    if InterlockedCompareExchangeIntf(IInterface(GIdOSDefaultEncoding), LEncoding, nil) <> nil then begin
      LEncoding := nil;
    end;
  end;
  Result := GIdOSDefaultEncoding;
end;
Note that I stripped out the .NET conditional code for simplicity. Most of this code is there to arrange singleton thread-safety. The actual value returned is synthesised by a call to TIdMBCSEncoding.Create. Let's look at that.
constructor TIdMBCSEncoding.Create;
begin
  Create(CP_ACP, 0, 0);
end;
Again, I've removed conditional code that does not apply to your Windows setting. Now, CP_ACP is the Active Code Page, the current system Windows ANSI code page. So, on Windows at least, IndyTextEncoding_OSDefault is an encoding for the current system Windows ANSI code page.
Why did using IndyTextEncoding_OSDefault give the same behaviour as my Delphi 7 code?
That's because the Delphi 7 / Indy 9 code for TIdEncoderMIME.EncodeString does not perform any code page transformation and MIME-encodes the input string as though it were a byte array. Since the Delphi 7 string is encoded in the active ANSI code page, this has the same effect as passing IndyTextEncoding_OSDefault to TIdEncoderMIME.EncodeString in your Unicode version of the code.
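In other words, something like this in the Unicode version should reproduce the old behaviour - a sketch assuming IdGlobal (for IndyTextEncoding_OSDefault) is in the uses clause:
EncodedString := TIdEncoderMIME.EncodeString(StringToBeEncoded, IndyTextEncoding_OSDefault);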
What is the difference between IndyTextEncoding_Default and IndyTextEncoding_OSDefault?
Here is the source code for IndyTextEncoding_Default:
function IndyTextEncoding_Default: IIdTextEncoding;
var
  LType: IdTextEncodingType;
begin
  LType := GIdDefaultTextEncoding;
  if LType = encIndyDefault then begin
    LType := encASCII;
  end;
  Result := IndyTextEncoding(LType);
end;
This returns an encoding that is determined by the value of GIdDefaultTextEncoding. By default, GIdDefaultTextEncoding is encIndyDefault, which the code above maps to encASCII. And so, by default, IndyTextEncoding_Default yields an ASCII encoding.
Beyond all this, you should be asking yourself which encoding you want to use. Relying on default values leaves you at the mercy of those defaults. What if those defaults don't do what you want? Especially as the defaults are not Unicode encodings, they support only a limited range of characters and, what's more, are dependent on system settings.
If you wish to encode international text, you would normally choose to use the UTF-8 encoding.
One other point to make is that you are calling EncodeString as though it were an instance method, but it is actually a class method. You can remove EncoderMIME and call TIdEncoderMIME.EncodeString. Or keep EncoderMIME and call EncoderMIME.Encode.
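Putting both points together - the class-method call and an explicit Unicode encoding - the call might look like this; a sketch relying on the optional byte-encoding parameter that Indy 10's EncodeString accepts:
EncodedString := TIdEncoderMIME.EncodeString(StringToBeEncoded, IndyTextEncoding_UTF8);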

Cast from RawByteString to string does automatically invoke UTF8Decode?

I want to store arbitrary binary data as a BLOB in a SQLite database.
The data will be added as a value with this function:
procedure TSQLiteDatabase.AddParamText(name: string; value: string);
Now I want to convert a WideString into its UTF-8 representation so it can be stored in the database. After calling UTF8Encode and storing the result in the database, I noticed that the data inside the database is not UTF-8 encoded. Instead, it is encoded as an AnsiString in my computer's locale.
I ran following test to check what happened:
type
{$IFDEF Unicode}
  TBinary = RawByteString;
{$ELSE}
  TBinary = AnsiString;
{$ENDIF}

procedure TForm1.Button1Click(Sender: TObject);
var
  original: WideString;
  blob: TBinary;
begin
  original := 'ä';
  blob := UTF8Encode(original);
  // Delphi 6:   Ã¤ (as expected)
  // Delphi XE4: ä (unexpected! How did it do an automatic UTF8Decode???)
  ShowMessage(blob);
end;
After the character "ä" has been converted to UTF-8, the data is correct in memory ("Ã¤"); however, as soon as I pass the TBinary value to a function (as string or AnsiString), Delphi XE4 does a "magic typecast", invoking UTF8Decode for some reason I don't know.
I have already found a workaround to avoid this:
function RealUTF8Encode(AInput: WideString): TBinary;
var
  tmp: TBinary;
begin
  tmp := UTF8Encode(AInput);
  SetLength(result, Length(tmp));
  CopyMemory(@result[1], @tmp[1], Length(tmp));
end;
procedure TForm1.Button2Click(Sender: TObject);
var
  original: WideString;
  blob: TBinary;
begin
  original := 'ä';
  blob := RealUTF8Encode(original);
  // Delphi 6:   Ã¤ (as expected)
  // Delphi XE4: Ã¤ (as expected)
  ShowMessage(blob);
end;
However, this workaround with RealUTF8Encode looks dirty to me and I would like to understand why a simple call of UTF8Encode did not work and if there is a better solution.
In Ansi versions of Delphi (prior to D2009), UTF8Encode() returns a UTF-8 encoded AnsiString. In Unicode versions (D2009 and later), it returns a UTF-8 encoded RawByteString with a code page of CP_UTF8 (65001) assigned to it.
In Ansi versions, ShowMessage() takes an AnsiString as input, and the UTF-8 string is an AnsiString, so it gets displayed as-is. In Unicode versions, ShowMessage() takes a UTF-16 encoded UnicodeString as input, so the UTF-8 encoded RawByteString gets converted to UTF-16 using its assigned CP_UTF8 code page.
If you actually wrote the blob data directly to the database you would find that it may or may not be UTF-8 encoded, depending on how you are writing it. But your approach is wrong; the use of RawByteString is incorrect in this situation. RawByteString is meant to be used as a procedure parameter only. Do not use it as a local variable. That is the source of your problem. From the documentation:
The purpose of RawByteString is to reduce the need for multiple overloads of procedures that read string data. This means that parameters of routines that process strings without regard for the string's code page should typically be of type RawByteString. RawByteString should only be used as a parameter type, and only in routines which otherwise would need multiple overloads for AnsiStrings with different codepages. Such routines need to be written with care for the actual codepage of the string at run time.
For Unicode versions of Delphi, instead of RawByteString, I would suggest that you use TBytes to hold your UTF-8 data, and encode it with TEncoding:
var
  utf8: TBytes;
  str: string;
...
str := ...;
utf8 := TEncoding.UTF8.GetBytes(str);
You are looking for a data type that does not perform implicit text encodings when passed around, and TBytes is that type.
For Ansi versions of Delphi, you can use AnsiString, WideString and UTF8Encode exactly as you do.
Personally, however, I would recommend using TBytes consistently for your UTF-8 data. So if you need a single code base that supports Ansi and Unicode compilers (ugh!), then you should create some helpers:
{$IFDEF Unicode}
function GetUTF8Bytes(const Value: string): TBytes;
begin
  Result := TEncoding.UTF8.GetBytes(Value);
end;
{$ELSE}
function GetUTF8Bytes(const Value: WideString): TBytes;
var
  utf8str: UTF8String;
begin
  utf8str := UTF8Encode(Value);
  SetLength(Result, Length(utf8str));
  Move(Pointer(utf8str)^, Pointer(Result)^, Length(utf8str));
end;
{$ENDIF}
The Ansi version incurs more heap allocations than are necessary. You might well choose to write a more efficient helper that calls WideCharToMultiByte() directly.
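Such a helper might look something like this - a minimal, Windows-only sketch under my own naming (GetUTF8BytesDirect), untested, requiring the Windows unit:
function GetUTF8BytesDirect(const Value: WideString): TBytes;
var
  Len: Integer;
begin
  // first call measures the required buffer size
  Len := WideCharToMultiByte(CP_UTF8, 0, PWideChar(Value), Length(Value),
    nil, 0, nil, nil);
  SetLength(Result, Len);
  if Len > 0 then
    // second call performs the actual conversion
    WideCharToMultiByte(CP_UTF8, 0, PWideChar(Value), Length(Value),
      PAnsiChar(@Result[0]), Len, nil, nil);
end;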
In Unicode versions of Delphi, if for some reason you don't want to use TBytes for UTF-8 data, you can use UTF8String instead. This is a special AnsiString that always uses the CP_UTF8 code page. You can then write:
var
  utf8: UTF8String;
  str: string;
....
utf8 := str;
and the compiler will convert from UTF-16 to UTF-8 behind the scenes for you. I would not recommend this though, because it is not supported on mobile platforms, or in Ansi versions of Delphi (UTF8String has existed since Delphi 6, but it was not a true UTF-8 string until Delphi 2009). That is, amongst other reasons, why I suggest that you use TBytes. My philosophy is, at least in the Unicode age, that there is the native string type, and any other encoding should be held in TBytes.
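Going the other way is symmetric. Assuming the UTF-8 payload is held in TBytes as recommended above, the decode step in Unicode Delphi is simply:
str := TEncoding.UTF8.GetString(utf8); // TBytes holding UTF-8, back to the native string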

Delphi Unicode and Console

I am writing a C# application that interfaces with a pair of hardware sensors. Unfortunately, the only interface exposed on the devices requires a provided DLL written in Delphi.
I am writing a Delphi executable wrapper that calls the necessary functions from the DLL and returns the sensor data over stdout. However, the return type of this data is a PWideChar (or PChar), and I have been unable to convert it to ANSI for printing on the command line.
If I directly pass the data to WriteLn, I get '?' for each character. If I loop through the array of characters and attempt to print them one at a time with an ANSI conversion, only a few of the characters print (they do confirm the data, though), and they often print out of order (printing with the index exposed simply jumps around). I also tried converting the PWideChar values straight to integers: 'I' corresponds to 21321. I could potentially figure out all the conversions, but some of the data has a multitude of values.
I am unsure what version of Delphi the DLL uses, but I believe it is 4. Definitely prior to 7.
Any help is appreciated!
TLDR: Need to convert UTF-16 PWideChar to AnsiString for printing.
Example application:
program SensorReadout;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  Windows,
  SysUtils,
  dllFuncUnit in 'dllFuncUnit.pas'; // This is my dll interface.

var
  state: integer;
  answer: PChar;
  I: integer;
  J: integer;
  output: string;
  ch: char;
begin
  try
    for I := 0 to 9 do
    begin
      answer := GetDeviceChannelInfo_HSI(1, Ord('a'), I, state); // DLL function, returns a PChar with the results. See example in comments.
      if state = HSI_NO_ERRORCODE then
      begin
        output := '';
        for J := 0 to Length(answer) do
        begin
          ch := answer[J]; // This copies the char. Originally had an AnsiChar convert here.
          output := output + ch;
        end;
        WriteLn(output);
      end;
    end;
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  ReadLn(I);
end.
The issue was that PAnsiChar needed to be the return type of the function imported from the DLL.
To convert a PWideChar to an AnsiString:
function WideCharToAnsiString(P: PWideChar): AnsiString;
begin
  Result := P;
end;
The code converts from a UTF-16, null-terminated PWideChar to an AnsiString. If you are getting question marks in the output, then either your input is not UTF-16, or it contains characters that cannot be encoded in your ANSI code page.
My guess is that what is actually happening is that your Delphi DLL was created with a pre-Unicode Delphi and so uses ANSI text. But now you are trying to link to it from a post-Unicode Delphi where PChar has a different meaning. I'm sure Rob explained this to you in your other question. So you can simply fix it by declaring your DLL import to return PAnsiChar rather than PChar. Like this:
function GetDeviceChannelInfo_HSI(PortNumber, Address, ChNumber: Integer;
  var State: Integer): PAnsiChar; stdcall; external DLL_FILENAME;
And when you have done this you can assign to a string variable in a similar vein as I describe above.
What you need to absorb is that in older versions of Delphi, PChar is an alias for PAnsiChar. In modern Delphi it is an alias for PWideChar. That mismatch would explain everything that you report.
It does occur to me that writing a Delphi wrapper to the DLL and communicating via stdout with your C# app is a very roundabout approach. I'd just p/invoke the DLL directly from the C# code. You seem to think that this is not possible, but it is quite simple.
[DllImport(@"mydll.dll")]
static extern IntPtr GetDeviceChannelInfo_HSI(
    int PortNumber,
    int Address,
    int ChNumber,
    ref int State
);
Call the function like this:
IntPtr ptr = GetDeviceChannelInfo_HSI(Port, Addr, Channel, ref State);
If the function returns a UTF-16 string (which seems doubtful) then you can convert the IntPtr like this:
string str = Marshal.PtrToStringUni(ptr);
Or if it is actually an ANSI string which seems quite likely to me then you do it like this:
string str = Marshal.PtrToStringAnsi(ptr);
And then of course you'll want to call into your DLL to deallocate the string pointer that was returned to you, assuming it was allocated on the heap.
Changed my mind on the comment - I'll make it an answer. :)
According to that code, if "state" is a code <> HSI_NO_ERRORCODE and there is no exception, then it will write the uninitialised string "output" to the console, which could be anything, including accidentally showing "S" and "4" and a series of one or more question marks.
The type of the answer variable is PChar. The Length function is good for string variables; use StrLen instead of Length:
for J := 0 to StrLen(answer)-1 do
Also, the accessible range of a PChar (char *) is 0..n-1.
To convert a UTF-16 PWideChar to an AnsiString, you can use a simple cast:
var
  WStr: WideString;
  pWStr: PWideString;
  AStr: AnsiString;
begin
  WStr := 'test';
  pWStr := PWideString(WStr);
  AStr := AnsiString(WideString(pWStr));
end;
