What character encoding should I use when adding events to Keen?
Whenever I use international characters (e.g. ö / ë), they are stored fine (when I look in the event explorer).
But whenever I use them in reports or the data explorer, they get mangled.
It seems this problem has been fixed over the last couple of days. Awesome!
I am facing a weird problem.
I have extracted data from an Excel file. It should contain an IBAN account number.
Then I tried to analyze the set of account numbers (which the source guarantees to be good) with a Java library.
To keep the scope of the question narrow: I can't explain the following. The two strings below are different:
03069
03069
The first is a copy & paste from the Excel file; the second was typed by hand. Google returns different results for abi [above number], and in fact in the second case I can find that it is the bank code for the Intesa Sanpaolo bank (the exact page displaying the ABI code, localized, is here).
So, to keep the scope narrow: how is that possible? Is it something to do with the encoding?
Try it yourself: press CTRL+F and type "030"; it will match both lines. Now type 6; it will match only the second line.
The same happened in Notepad++.
There's a U+200B ZERO WIDTH SPACE between 030 and 69 in the first string.
Paste the text into https://www.branah.com/unicode-converter, for example, or inspect it in an editor that can show the bytes in hexadecimal.
A solution for cleaning such strings could be, for example, to whitelist characters: everything that isn't A-Z or 0-9 gets scrubbed.
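As an illustration of that whitelist idea (not part of the original answer), here is a minimal Java sketch; the class name, method name and sample value are made up for the example:

public class IbanScrubber {

    // Keep only the characters we trust in an ABI/IBAN code: A-Z and 0-9.
    // Everything else, including invisible characters such as U+200B, is dropped.
    static String scrub(String raw) {
        return raw.toUpperCase().replaceAll("[^A-Z0-9]", "");
    }

    public static void main(String[] args) {
        String pasted = "030\u200B69"; // the pasted value with a hidden ZERO WIDTH SPACE
        System.out.println(pasted.equals("03069"));        // false
        System.out.println(scrub(pasted).equals("03069")); // true
    }
}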
I have been using TCPDF for many years. Recently I had to work on Arabic-language display. The client wanted the SakkalMajalla font (available in Windows/fonts), and I converted it using the TCPDF tool. The conversion process was successful without error.
Now I am facing a small issue that I have not been able to solve for the last 2 months. One of the special characters (called tanween) is placed at the bottom of the preceding character, whereas it should be on top. Everything else is working fine, but this little mark ( ٍ ) displayed in the wrong place changes the meaning of the word.
يمنع استخدام الهاتف الجوال داخل صالة الاختبار
منعاً باتاً
(I cannot upload an image as I need 10 reputation points for that, but please notice the little mark on top of this letter تاً. Here it is displayed properly, but in the PDF it is displayed at the bottom of the letter.)
Is there any way to manually edit the positioning of this character?
I have been searching for a solution for the last 2 months. I even wrote 2 emails to the author of TCPDF, Nicola, but he did not give any response.
Please help.
Even though the font conversion process appeared to complete successfully, you should double-check with a font editor (such as FontForge) that the character is actually encoded correctly in the converted font file.
I have found, after many years of trying to convert all sorts of non-Latin fonts from one format to another, that the most reliable solution for font conversion is this site:
http://www.xml-convert.com/en/convert-tff-font-to-afm-pfa-fpdf-tcpdf
I've done my homework, and specifically:
1) Read the whole FastReport 4 manual. It mentions neither UTF8 nor Unicode support
2) Looked for an answer here on SO
3) Googled around
If I set up a Text field and fill it with Thai characters, they are printed perfectly, so FastReport CAN handle Unicode characters; at least it can print them.
If I try to "pass" a value using the callbacks provided by frxUserDataSet, what I see is some garbled non-Unicode text. In particular, if I pass e.g. a string made of the same 10 Thai characters, I see the same "set" of 3 or 4 garbled characters repeated ten times, so I am sure the data is passed correctly, but FastReport probably has no way to know that it should be handled as Unicode.
The callback requires the data passed back to be of "variant" type, so I guess it's useless to cast it to any particular type, because a variant will accept any of them.
I forgot to mention that I get the strings from a MySql DB where the data is stored as UTF8, and I do not even copy the data into a local variable: what I get from the DB is put straight into the variant.
Is there a way to force FastReport to print the data received as Unicode?
Thank you
Yes, FR4 with Delphi 7 supports UTF8 using frxUserDataSet.
Just for future reference:
1) You MUST set your DB (MySql in my case) to use UTF8
2) You MUST set the character set in the component you use to access the DB to utf8 ("DAC for MySql" in my case, and the property is called ConnectionCharacterSet)
3) In all the frxUserDataSet callbacks, before setting the "value" variable, you MUST CONVERT whatever you have using the Utf8decode Delphi system routine, like this:
value := Utf8decode(fReports.q1.FieldValueByFieldName('yourDBfield'));
where fReports is the form name, and q1 the component used to access the DB.
I keep reading that using D7 and Unicode is almost impossible, but, as long as you use XP and up, it's only harder, from what I am seeing. Unfortunately, I must use XP and D7 and cannot upgrade. But, as I said, I am quickly getting used to solving these problems, so in the future I hope to be able to give back some help in the same way everybody has always helped me here :)
My application parses incoming emails. I try to parse them as well as possible, but every now and then I get one with puzzling content. This time it is an email that looks to be in ASCII, but the specified charset is ansi_x3.110-1983.
My application handles it correctly by defaulting to ASCII, but it throws a warning which I'd like to stop receiving, so my question is: what is ansi_x3.110-1983 and what should I do with it?
According to this page on the IANA's site, ANSI_X3.110-1983 is also known as:
iso-ir-99
CSA_T500-1983
NAPLPS
csISO99NAPLPS
Of those, only the name NAPLPS seems interesting or informative. If you can, consider getting in touch with the people sending those mails. If they're really using Prodigy in this day and age, I'd be amazed.
The IANA site also has a pointer to RFC 1345, which contains a description of the bytes and the characters that they map to. Compared to ISO-8859-1, the control characters are the same, as are most of the punctuation, all of the numbers and letters, and most of the remaining characters in the first 7 bits.
You could possibly use the guide in the RFC to write a tool that maps the characters over, if someone hasn't written such a tool already. To be honest, it may be easier to simply ignore the warnings about the weird character set, given that the character mapping is close enough to what is expected anyway.
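For illustration only, here is a minimal Java sketch of such a mapping tool. It assumes the bytes can be decoded as ISO-8859-1 except for an override table that would have to be filled in from RFC 1345; the class name is made up, and the OVERRIDES table is deliberately left empty rather than guessing at real mappings:

import java.util.Map;

public class AnsiX31101983Decoder {

    // Byte values whose meaning differs from ISO-8859-1, mapped to the correct
    // Unicode character. The actual entries have to be taken from RFC 1345;
    // only the structure is sketched here.
    private static final Map<Integer, Character> OVERRIDES = Map.of();

    static String decode(byte[] raw) {
        StringBuilder out = new StringBuilder(raw.length);
        for (byte b : raw) {
            int code = b & 0xFF;
            Character mapped = OVERRIDES.get(code);
            // ISO-8859-1 maps each byte value directly to the same Unicode code
            // point, so the fallback is a plain cast.
            out.append(mapped != null ? mapped.charValue() : (char) code);
        }
        return out.toString();
    }
}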
Using Crystal Reports 11, I want to report on the number of characters within a memo field. The problem is that the system is feeding HTML through into it.
I want to remove all text that is within "<" and ">" and am having some problems doing this.
I guessed that my best way of doing this was a Replace formula, but it only seems to allow a specific character or group of characters to be replaced. I tried the following:
Replace ({Field_Name},"","")
Basically, I want to be able to remove all text within the "<>" symbols so that I can then view the remaining data (see the sketch after this question).
I know that there is an option to format the field with HTML which will then display only the text, but when a count is done on the field, it still includes the HTML text in the character count.
Any help would be great.
Thanks
Dave
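A minimal sketch of the tag-stripping logic the question describes, in Java purely to illustrate the idea (it would still have to be expressed as a Crystal Reports formula); the class name, method name and sample memo value are made up:

public class TagStripper {

    // Remove everything between '<' and '>' (inclusive) and keep the rest,
    // so the remaining text can be counted without the HTML markup.
    static String stripTags(String memo) {
        return memo.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String memo = "<p>Hello <b>world</b></p>";
        System.out.println(stripTags(memo));          // Hello world
        System.out.println(stripTags(memo).length()); // 11, the count without the HTML
    }
}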
After struggling for about a week trying to remove HTML from my emails, I decided to go into System Restore and went back to a date before I thought this had happened (about a week ago), and voila: gone! Nada! No HTML! Problem solved so easily.