I'm having a strange problem that is affecting at least some of the international users of my Delphi 6 application. Here's the scenario:
My program requests status reports periodically from an external device that acts as an HTTP server.
The device sends back the status report as a response document containing a series of fields delimited by the pipe character, in name=value pair format (e.g. field1=-0.437).
I split the report string into fields and then split each field again to get the field name and numeric value.
I use StrToFloat() to convert each floating point field value from its string form and assign the result to a Variant variable.
This works fine on most PCs, but some of my international users are getting EConvertError exceptions when I call StrToFloat() on the numeric values. Here's a concrete example of an error message from my logs:
EConvertError: '-0.685' is not a valid floating point value
As you can see, -0.685 is a valid floating point number, yet I am getting the EConvertError exception. Normally I would expect to see a comma where the decimal point is, or some other locale specific punctuation problem, but the number appears fine in this case. Also, to the best of my knowledge the external device does not even have an option to set the character set.
So what subtle nuance of Delphi 6 and international character sets might be causing this problem, perhaps related to the user's Windows XP/Win7 character settings? Note that I use standard Delphi 6 "string"-typed strings throughout my program, so I don't see how a multi-byte character set issue could be the root cause. Has anyone had this problem and found out what to do about it?
Your remote user's machine is expecting ',' as the decimal separator. When it encounters '.', the EConvertError exception is raised. On a machine which expects ',' as the decimal separator (e.g. in most European and South American countries), -0.685 is indeed not a valid floating point value.
Normally I would expect to see a comma where the decimal point is, or some other locale specific punctuation problem, but the number appears fine in this case.
Your current problem is just the flip-side of the above issue. Normally, because your locale uses '.' as the separator, you are accustomed to seeing problems when data with ',' is used instead. Put yourself in the position of somebody from a country which uses ',' as the separator. They will be accustomed to seeing exceptions when data with '.' is used.
You could solve the problem by normalising the input to use the same decimal separator as the machine locale. On a modern Delphi you could instead use the StrToFloat overload that receives a TFormatSettings parameter and explicitly specify that '.' is to be used as the decimal separator for this conversion. Unfortunately that facility is not available in Delphi 6.
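As a minimal sketch of that normalisation for Delphi 6 (the helper name DeviceStrToFloat is made up, and it assumes the device always sends '.' as the decimal mark and never sends thousands separators):

uses
  SysUtils;

function DeviceStrToFloat(const S: string): Double;
var
  Normalised: string;
begin
  // The device always uses '.'; swap it for whatever separator
  // this machine's locale expects before calling StrToFloat.
  Normalised := StringReplace(S, '.', DecimalSeparator, []);
  Result := StrToFloat(Normalised);
end;

With that in place, DeviceStrToFloat('-0.685') should work whether the local DecimalSeparator is '.' or ','.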
I faced this issue with Belgian users. I also had to manually replace the '.' or ',' in the input data.
Also, if you are inserting the data into a SQL database, you will have to replace the ',' with '.' when inserting the data.
I am trying to check if a string is empty by doing the following:
if Trim(somestring) = '' then
begin
  // that's an empty string
end;
I am doing this empty check in my client application, but I notice that some clients insert empty strings even with this check applied.
On my Linux server side those 'empty' characters show up as squares, and when I copy those characters I am able to bypass the empty-string check, like in the following example.
If you copy these 'empty' characters, the check has no effect. How can I avoid that?
Your code is working correctly and the strings are not empty. An empty string is one whose length is zero; it has no characters. You refer to 'empty characters', but there is no such thing. When your system displays a small empty square, that indicates that the chosen font has no glyph for that character.
Let us assume that these strings are invalid. In that case the real problem you have is that you don't yet fully understand what properties are required for a string to be valid. Your code is written assuming that a string is valid if it contains at least one character with an ordinal value greater than 32. Clearly that test is not correct. You need to step back and work out the precise rules for validity. Only when these are clear in your mind can you correct your program and check for validity correctly.
On the other hand perhaps these strings are valid and the mistake is simply that you are erroneously determining otherwise when you inspect the data. Only you can know this, we don't have the information.
A useful technique in all of this is to inspect the ordinal values of the strings. Loop through the characters printing the ordinal value of each one. That allows you to see what is really there and not be at the mercy of non-printing characters, characters with no glyph, invalid encodings, etc.
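A minimal sketch of that technique (DumpOrdinals is a made-up helper name; it assumes the single-byte strings of Delphi 6 and a console app for Writeln):

uses
  SysUtils;

procedure DumpOrdinals(const S: string);
var
  I: Integer;
begin
  // Print the position and ordinal value of every character,
  // including non-printing ones that Trim would not remove.
  for I := 1 to Length(S) do
    Writeln(Format('%d: #%d', [I, Ord(S[I])]));
end;

Running it over one of the strings that slips past your check shows exactly which character codes are really there.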
Trim is a pretty simple function: it strips only characters whose ordinal value is less than or equal to 32 (decimal) in the ASCII table.
(sample from System.SysUtils.pas)
while S.Chars[L] <= ' ' do Dec(L);
Therefore it's possible that you just can't see some exotic characters (ordinal value > 128) due to a bad encoding being used for your input string.
Try using:
IntToStr(Ord(SomeChar))
on every character that is not "trimmed", and either remove the offending characters by hand or check your encoding.
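If it turns out that something like the non-breaking space (#160 in Windows-1252) is what is slipping through, a stricter check might look like the sketch below (IsVisiblyEmpty is a made-up name, the set of 'blank' ordinals is an assumption to adjust to your own validity rules, and it assumes the single-byte strings of Delphi 6):

function IsVisiblyEmpty(const S: string): Boolean;
var
  I: Integer;
begin
  Result := True;
  for I := 1 to Length(S) do
    // Treat control characters, the space and the non-breaking
    // space as blank; anything else makes the string non-empty.
    if not (Ord(S[I]) in [0..32, 160]) then
    begin
      Result := False;
      Exit;
    end;
end;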
Kind Regards
My application is parsing incoming emails. I try to parse them as well as possible, but every now and then I get one with puzzling content. This time it is an email that looks to be in ASCII, but the specified charset is: ansi_x3.110-1983.
My application handles it correctly by defaulting to ASCII, but it throws a warning which I'd like to stop receiving, so my question is: what is ansi_x3.110-1983 and what should I do with it?
According to this page on the IANA's site, ANSI_X3.110-1983 is also known as:
iso-ir-99
CSA_T500-1983
NAPLPS
csISO99NAPLPS
Of those, only the name NAPLPS seems interesting or informative. If you can, consider getting in touch with the people sending those mails. If they're really using Prodigy in this day and age, I'd be amazed.
The IANA site also has a pointer to RFC 1345, which contains a description of the bytes and the characters that they map to. Compared to ISO-8859-1, the control characters are the same, as are most of the punctuation, all of the numbers and letters, and most of the remaining characters in the first 7 bits.
You could possibly use the guide in the RFC to write a tool to map the characters over, if someone hasn't written a tool for it already. To be honest, it may be easier to simply ignore the whines about the weird character set given that the character mapping is close enough to what is expected anyway...
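If you do end up writing such a tool, a rough sketch of the idea (in Delphi, to stay consistent with the rest of this page; the function name is made up, and a real converter would take its mappings from the RFC 1345 table rather than the crude pass-through/placeholder used here):

function DecodeAnsiX3110(const Raw: AnsiString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(Raw) do
    if Ord(Raw[I]) < 128 then
      // Per the RFC, most of the 7-bit range lines up with
      // ISO-8859-1/ASCII, so pass those bytes straight through.
      Result := Result + Char(Raw[I])
    else
      // High bytes differ from ISO-8859-1; a real converter would
      // look them up in the RFC 1345 mapping. Placeholder for now.
      Result := Result + '?';
end;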
As already pointed out in the topic, I got the following error:
Character #\u009C cannot be represented in the character set CHARSET:CP1252
It occurred while trying to print out a string given back by drakma:http-request. As far as I understand the error, the problem is that the Windows encoding (CP1252) does not support this character.
Therefore, to be able to process it, I presumably have to convert the whole string.
My question is: what package/library supports converting strings to particular character sets efficiently?
A similar question is this one, but just ignoring the error would not help in my case.
Drakma already does the job of "converting strings": after all, when it reads from some random webserver, it just gets a stream of bytes. It then has to convert that to a lisp string. You probably want to bind *drakma-default-external-format* to something else, although I can't remember off-hand what the allowable values are. Maybe something like :utf-8?
In EDIFACT there are numeric data elements, specified e.g. as format n..5. We want to store those fields in a database table (in alphanumeric columns, so we can check them). How long must the DB columns be so that we can be sure of storing every possible valid value? I know it's at least two additional characters: one for the decimal mark (comma, point, or whatever) and possibly one for a leading minus sign.
We are building our tables based on the UN/EDIFACT standard used in our message, not on the specific implementation guide involved, so we want to be able to store everything that matches the standard. But the documentation on numeric data elements isn't really straightforward (or at least I could not find the relevant part).
Thanks for any help
I finally found the information on the UNECE web site, in the documentation of the UN/EDIFACT rules: Part 4 (UN/EDIFACT Rules), Chapter 2.2 (Syntax Rules). They don't say it directly, but when you put all the parts together, you get it. See TOC entry 10: REPRESENTATION OF NUMERIC DATA ELEMENT VALUES.
Here's what it basically says:
10.1: Decimal Mark
The decimal mark must be transmitted (if needed) as specified in UNA (comma or point, but always one character). It shall not be counted as a character of the value when computing the maximum field length of a data element.
10.2: Triad Separator
Triad separators shall not be used in interchange.
10.3: Sign
[...] If a value is to be indicated to be negative, it shall in transmission be immediately preceded by a minus sign e.g. -112. The minus sign shall not be counted as a character of the value when computing the maximum field length of a data element. However, allowance has to be made for the character in transmission and reception.
To put it together:
Other than the digits themselves, there are only two (optional) characters allowed in a numeric field: the decimal separator and a minus sign (no blanks are permitted between any of the characters). These two extra characters are not counted against the maximum length of the value in the field.
So the maximum number of characters in a numeric field is the maximum length of the numeric field plus 2. If you want your database to be able to store every syntactically correct value transmitted in a field specified as n..17, your column would have to be 19 characters long (something like varchar(19)). Any EDIFACT message that has a value longer than 19 characters in a field specified as n..17 does not need to be stored in the DB for semantic checking, because it is already syntactically wrong and can be rejected.
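As a small illustration of that arithmetic, here is a hypothetical Delphi helper (it assumes the format specification always has the shape 'n..<max>'):

uses
  SysUtils;

function EdifactColumnWidth(const FormatSpec: string): Integer;
var
  MaxDigits: Integer;
begin
  // 'n..17' -> 17 digits, plus one character for an optional
  // decimal mark and one for an optional leading minus sign.
  MaxDigits := StrToInt(Copy(FormatSpec, 4, MaxInt));
  Result := MaxDigits + 2;
end;

EdifactColumnWidth('n..17') then returns 19, matching the varchar(19) above.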
I used EDI Notepad from Liaison to solve a similar challenge. https://liaison.com/products/integrate/edi/edi-notepad
I recommend anyone looking at EDI to at least get their free (express) version of EDI Notepad.
The "high end" version (EDI Notepad Productivity Suite) of their product comes with a "Dictionary Viewer" tool that you can export the min / max lengths of the elements, as well as type. You can export the document to HTML from the Viewer tool. It would also handle ANSI X12 too.
I have a stored procedure for SQL 2000 that has an input parameter with a data type of varchar(17) to handle a vehicle identification number (VIN), which is alphanumeric. However, whenever I execute it with a parameter value that contains a numeric digit, I get an error. It appears to only accept alphabetic characters. What am I doing wrong here?
Based on comments, there is a subtle "feature" of SQL Server that allows values made up of letters a-z to be passed as stored proc parameters without delimiters. It's been there forever (since 6.5 at least).
I'm not sure of the full rules, but it's demonstrated in MSDN (rename SQL Server, etc.): there are no delimiters around the "local" parameter. And I just found this KB article on it.
In this case, it could be the leading number that breaks it. I assume it works for a value that merely contains a number (but, as I said, I'm not sure of the full rules).
Edit: confirmed by Martin as "breaks with leading number", OK for "containing number"
This doesn't help much, but somewhere you have a bug, typo, or oversight in your code. I spent 2+ years working with VINs as parameters, and other than regretting not having made it char(17) instead of varchar(17), we never had any problems passing in alphanumeric VIN values. Somewhere, and I'd guess it's in the application layer, something is not liking digits -- perhaps a filter looking for only alphabetic characters?