When I run my program, an error happens, and when I look into the log, this appears: {k,3108,"s"},{k,3109,"}. How can a single double quote be a variable's value?
In the text font it is a little hard to see exactly what you actually got in the log, but I am guessing it is:
{k,3108,"s"},{k,3109,''}
The first, the true double quotes, makes an Erlang string (which is really a list of integers), while the second is actually a pair of ', which is the quote character for atoms. In this case it is the atom with the empty name, which is allowed. This is what #shk indicated.
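For illustration, checking the two values in the Erlang shell shows the difference (a quick sketch, not taken from the original log):
1> is_list("s").   % "s" is a string, i.e. a list of integers
true
2> is_atom('').    % '' is the atom with the empty name
true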
But without more information from you it is really hard to give a proper answer.
I am trying to check if a string is empty by doing the following:
if Trim(somestring) = '' then
begin
  // that is an empty string
end
I am doing this empty check in my client application, but I notice that some clients insert empty strings even with this check applied.
On my Linux server side those empty characters show up as squares, and when I copy those characters I am able to bypass the empty string check, as in the following example.
If you copy these empty characters the check will not be affected. How can I avoid that?
Your code is working correctly and the strings are not empty. An empty string is one whose length is zero; it has no characters. You refer to empty characters, but there is no such thing. When your system displays a small empty square, that indicates that the chosen font has no glyph for that character.
Let us assume that these strings are invalid. In that case the real problem you have is that you don't yet fully understand what properties are required for a string to be valid. Your code is written assuming that a string is valid if it contains at least one character with ordinal value greater than 32. Clearly that test is not correct. You need to step back and work out the precise rules for validity. Only when these are clear in your mind can you correct your program and check for validity correctly.
On the other hand, perhaps these strings are valid and the mistake is simply that you are erroneously determining otherwise when you inspect the data. Only you can know this; we don't have the information.
A useful technique in all of this is to inspect the ordinal values of the strings. Loop through the characters printing the ordinal value of each one. That allows you to see what is really there and not be at the mercy of non-printing characters, characters with no glyph, invalid encodings, etc.
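A minimal Delphi sketch of that technique (the procedure name is illustrative, not from the question):
procedure DumpOrdinals(const S: string);
var
  I: Integer;
begin
  // Print each character together with its ordinal value, so that
  // non-printing or glyphless characters become visible.
  for I := 1 to Length(S) do
    Writeln(Format('%3d: %s = %d', [I, S[I], Ord(S[I])]));  // Format lives in System.SysUtils
end;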
Since Trim is a pretty simple function, it removes only characters whose ordinal value is less than or equal to 32 (decimal) in the ASCII table.
(Sample from System.SysUtils.pas)
while S.Chars[L] <= ' ' do Dec(L);
Therefore it's possible that you just can't see some exotic characters (above ASCII 127) due to a bad encoding used with your input string.
Try to use:
IntToStr(Ord(SomeChar))
on every char that is not "trimmed", and remove them by hand or check your encoding.
I have a string from which, by using string.format("%02X", char) on each character, I've produced the following:
74657874000000EDD37001000300
In the end, I'd like that string to look like the following:
t e x t NUL NUL NUL í Ó p SOH NUL ETX NUL (spaces are there just for clarification of characters desired in example).
I've tried to use \x..(hex#) and string.char(0x..(hex#)) (where (hex#) is the hexadecimal representation of my desired character), and I am still having issues getting the result I'm looking for. After reading another thread about this topic, what is the way to represent a unichar in lua, and the links provided in the answers, I am still not fully understanding what I need to do in my final code to make this work.
I'm looking for some help in better understanding an approach that would let me achieve the desired result provided above.
ETA:
Well I thought that I had fixed it with the following code:
function hexToAscii(input)
  local convString = ""
  for char in input:gmatch("(..)") do
    convString = convString..(string.char("0x"..char))
  end
  return convString
end
It appeared to work, but I didn't think about characters above 127. Rookie mistake. Now I'm unsure how I can get the additional characters, up to 255, to display their ASCII values.
I did the following to check since I couldn't truly "see" them in the file.
function asciiSub(input)
  input = input:gsub(string.char(0x00), "<NUL>") -- suggested by a coworker
  print(input)
end
I did a few gsub calls to substitute in other characters, and my file comes back with the replacement strings. But when I ran into characters in the extended ASCII table, it all fell apart.
Can anyone assist me in understanding a fix or a new approach to this problem? As I've stated before, I've read other topics on this and am still confused as to the best approach to this issue.
The simple way to transform a base16-encoded string is just to
function unhex( input )
  return (input:gsub( "..", function(c)
    return string.char( tonumber( c, 16 ) )
  end))
end
This is basically what you have, just a bit cleaner. (There's no need to say "(..)", ".." is enough – if you specify no captures, you'll automatically get the whole match. And while it might work if you write string.char( "0x"..c ), it's just evil – you concatenate lots of strings and then trigger the automatic conversion to numbers. Much better to just specify the base when explicitly converting.)
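For example, feeding it the start of the hex dump from the question gives back the readable prefix:
print(unhex("74657874"))  --> text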
The resulting string should be exactly what went into the hex-dumper, no matter the encoding.
If you cannot correctly display the result, your viewer will also be unable to display the original input. If you used different viewers for the original input and the resulting output (e.g. a text editor and a terminal), try writing the output to a file instead and looking at it with the same viewer you used for the original input; then the two should be exactly the same.
Getting viewers that assume different encodings (e.g. one of the "old" 8-bit code pages or one of the many versions of Unicode) to display the same thing will require conversion between different formats, which tends to be quite complicated or even impossible. As you did not mention what encodings are involved (nor any other information like OS or programs used that might hint at the likely encodings), this could be just about anything, so it's impossible to say anything more specific on that.
You actually have a couple of problems:
First, make sure you know the meaning of the term character encoding, and that you know the difference between characters and bytes. A popular post on the topic is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Then, what encoding was used for the bytes you just received? You need to know this, otherwise you don't know what byte 234 means. For example it could be ISO-8859-1, in which case it is U+00EA, the character ê.
The characters 0 to 31 are control characters (e.g. 0 is NUL). Use a lookup table for these.
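A minimal Lua sketch of that lookup idea (the table is abbreviated and the function name is just illustrative):
-- Abbreviated table of ASCII control-character names (0-31)
local control_names = {
  [0] = "NUL", [1] = "SOH", [2] = "STX", [3] = "ETX",
  [9] = "TAB", [10] = "LF", [13] = "CR", [27] = "ESC",
}

-- Replace control characters with a readable <NAME> marker for display
local function showControls(s)
  return (s:gsub("%c", function(c)
    local b = string.byte(c)
    return "<" .. (control_names[b] or string.format("\\%03d", b)) .. ">"
  end))
end

print(showControls("text\0\0\0"))  --> text<NUL><NUL><NUL>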
Then, displaying the characters on the terminal is the hard part. There is no platform-independent way to display ê on the terminal. It may well be impossible with the standard print function. If you can't figure this step out, you can search for a question dealing specifically with how to print Unicode text from Lua.
As already pointed out in the topic, I got the following error:
Character #\u009C cannot be represented in the character set CHARSET:CP1252
while trying to print out a string given back by drakma:http-request. As far as I understand the error, the problem is that the Windows encoding (CP1252) does not support this character.
Therefore, to be able to process it, I might, or even must, convert the whole string.
My question is: what package/library supports converting strings to certain character sets efficiently?
A similar question is this one, but just ignoring the error would not help in my case.
Drakma already does the job of "converting strings": after all, when it reads from some random webserver, it just gets a stream of bytes. It then has to convert that to a lisp string. You probably want to bind *drakma-default-external-format* to something else, although I can't remember off-hand what the allowable values are. Maybe something like :utf-8?
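A minimal sketch of that suggestion (assuming :utf-8 is among the accepted external-format designators; the URL is just a placeholder):
;; Bind Drakma's default external format so the response body is decoded as UTF-8
(let ((drakma:*drakma-default-external-format* :utf-8))
  (drakma:http-request "http://www.example.com/"))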
I understand the functional difference between single and double quotes in Ruby, but I'm wondering what concrete reasons people have for varying between the two. In my mind it seems like you should just always use double quotes and not think about it.
A couple of rationales that I've read while researching the topic:
Use single quotes unless double quotes are required.
There's a very, very minor performance advantage to single quotes.
Any other interesting thoughts out there? (Or maybe this is a case of the freedom of Ruby leaving the door open for no One Right Way to do something...)
I usually follow the following rule:
never use double quotes (or %Q or %W) if you don't interpolate
The reason for this is that if you're trying to track down an error or a security bug, you immediately know when looking at the beginning of the string that there cannot possibly be any code inside it, and therefore the bug cannot be in there.
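For example (an illustrative sketch, not from the original answer):
name = 'world'
"Hello #{name.upcase}"   # interpolation: the code inside #{} runs  => "Hello WORLD"
'Hello #{name.upcase}'   # single quotes: just the literal characters #{name.upcase}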
However, I also follow the following exception to the rule:
use double quotes if they make the code more readable
I.e. I prefer
"It's time"
over
'It\'s time'
%q{It's time}
It is technically true that single quoted strings are infinitesimally faster to parse than double quoted strings, but that's irrelevant because
the program only gets parsed once, during startup, so there is no difference in runtime performance
the performance difference really is extremely small
the time taken to parse strings is irrelevant compared to the time taken to parse some of Ruby's crazier syntax
So, the answer to your question is: Yes, there is an advantage, namely that you can spot right away whether or not a string may contain code.
I can think of three reasons to use single quoted strings:
They look cleaner (the reason I use them)
They make it easier to create a string you'd otherwise have to escape ('he said "yes"' vs "he said \"yes\"")
They are slightly more performant.
I would assume using a single-quoted string is faster, since double quotes allow string interpolation, and single-quoted strings do not.
That's the only difference I know of. For that reason, it's probably best to only use a single-quoted string unless you need string interpolation:
num = 59
"I ate #{num} pineapples"`
Well, there is a lot of fuss about the "performance gain" of single-quoted strings vs double-quoted strings.
The fact is that it doesn't really matter if you don't interpolate. There are a lot of benchmarks around the web that corroborate that assertion (some here at Stack Overflow).
Personally, I use double quotes for strings that have interpolation, just for the sake of readability. I prefer to see the double quotes when I need them. But in fact there are methods in Ruby for interpolating strings other than "double quoting" them:
%q{#{this} doesn't get interpolated}
%Q{#{this} is interpolated}
1.9.2-p290 :004 > x = 3
=> 3
1.9.2-p290 :005 > "#{x}"
=> "3"
1.9.2-p290 :006 > '#{x}'
=> "\#{x}"
In any other case, I prefer single quotes, because they're easier to type and just make the code less bloated to my eyes.
Since asking this question I've discovered this unofficial Ruby Style Guide that addresses this, and many many more styling questions I've had floating around in my head. I'd highly recommend checking it out.
I found that putting variables in a string using #{} did not work in single quotes, but did work in double quotes, as below.
comp_filnam and num (integer) are the variables I used to create the file name in the file path:
file_path_1 = "C:/CompanyData/Components/#{comp_filnam}#{num.to_s}.skp"
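For comparison, the single-quoted version of the same line keeps the #{...} sequences as literal text (illustrative only):
file_path_1 = 'C:/CompanyData/Components/#{comp_filnam}#{num.to_s}.skp'
# => "C:/CompanyData/Components/\#{comp_filnam}\#{num.to_s}.skp"  (no interpolation)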
When would it be appropriate to localize a single ASCII character?
For instance / or |?
Is it ever necessary to add these "strings" to the localization effort?
I just want to give some people the benefit of the doubt and make sure there's not something I didn't think of.
Generally it wouldn't be appropriate to use something like that except as a graphic element (which of course wouldn't be I18N'd in the first place, much less L10N'd). If you are trying to use it to e.g. indicate a ratio then you should have something like "%d / %d" instead, and localize the whole thing.
Yes, there are cases where these individual characters change in localization. This is not a comprehensive list, just examples I happen to know.
Not every locale uses , to separate thousands and . for the decimal. (However, these will usually be handled by your number formatter. If you do so yourself, you're probably doing it wrong. See this MSDN blog post by Michael Kaplan, Number format and currency format are not always the same.) A short sketch of letting the formatter handle this appears at the end of this answer.
Not every language uses the same quotation marks (“, ”, ‘ and ’). See Wikipedia on Non-English Uses of Quotation Marks. (Many of these are only easy to replace if you use full quote marks. If you use the " and ' on your keyboard to mark both the start and end of sentences, you won't know which of two symbols to substitute.)
In Spanish, a question or exclamation is preceded by an inverted ? or !. ¿Question? ¡Exclamation! (Obviously, you can't fix this with a locale substitution for a single character. Any questions or exclamations in your application should be entire strings anyway, unless you're writing some stunningly intelligent natural language generator.)
If you do find a circumstance where you need to localize these symbols, be extra cautious not to accidentally localize a symbol like / used as a file separator, " to denote a string literal or ? for a search wildcard.
However, this has already happened with CSV files. These may be separated by a comma (,), or may be separated by the local list separator. See What would happen if you defined your system's CSV delimiter as being a quotation mark?
In Greek, questions end with a semicolon rather than ?, so essentially the ? is replaced with ; ... however, you should aim to always translate the question as a complete string including question mark anyway.
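As a closing illustration of the number-formatter point above (Java here purely as an example; nothing in it comes from the question):
import java.text.NumberFormat;
import java.util.Locale;

public class SeparatorDemo {
    public static void main(String[] args) {
        double value = 1234567.89;
        // The same value, formatted per locale; the roles of , and . swap
        System.out.println(NumberFormat.getNumberInstance(Locale.US).format(value));      // 1,234,567.89
        System.out.println(NumberFormat.getNumberInstance(Locale.GERMANY).format(value)); // 1.234.567,89
    }
}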