Characters and Strings in Swift

Reading the documentation and this answer, I see that I can initialize a Unicode character in either of the following ways:
let narrowNonBreakingSpace: Character = "\u{202f}"
let narrowNonBreakingSpace = "\u{202f}"
As I understand, the second one would actually be a String. And unlike Java, both of them use double quotes (and not single quotes for characters). I've seen several examples, though, where the second form (without Character) is used even though the variable is only holding a single character. Is that people just being lazy or forgetting to write Character? Or does Swift take care of all the details and I don't need to bother with it? If I know I have a constant that contains only a single Unicode value, should I always use Character?

When a type isn't specified, Swift will create a String instance out of a string literal when creating a variable or constant, no matter the length. Since Strings are so prevalent in Swift and Cocoa/Foundation methods, you should just use that unless you have a specific need for a Character—otherwise you'll just need to convert to String every time you need to use it.
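A minimal sketch of that default behaviour (reusing the constant name from the question; nothing here beyond standard Swift):
let narrowNonBreakingSpace = "\u{202f}"            // inferred as String, even though it holds a single character
print(type(of: narrowNonBreakingSpace))            // String
let padded = "1" + narrowNonBreakingSpace + "km"   // can be passed straight to String/Foundation APIs
print(padded.count)                                // 4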

Without an explicit annotation, the Swift compiler will infer String for the literal in the second case, not Character, so adding : Character is needed whenever you actually want a Character. I would add it in a case like this anyway, because it's easy to mistakenly assume that the constant is a String and another developer might try to treat it as such. Once the type is Character, though, the compiler will throw errors for any String-specific usage.
So in my opinion adding Character is not a matter of being lazy or forgetting it, it's a matter of telling the compiler which type you mean and then relying on it to throw the correct error whenever you try to use this constant wrong.
Beyond that, Swift's compiler takes care of the details: once the type is pinned down, it will (should) catch any misuse.
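And a minimal sketch of the other direction, assuming you do want a Character: the annotation pins the type down, and the compiler then rejects String-only usage:
let narrowNonBreakingSpace: Character = "\u{202f}"
// narrowNonBreakingSpace + "km"                   // error: no '+' that takes a Character and a String
// narrowNonBreakingSpace.count                    // error: Character has no 'count' property
let asString = String(narrowNonBreakingSpace)      // explicit conversion whenever a String API is needed
print(asString.unicodeScalars.map { $0.value })    // [8239]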

Related

Can code injection in Lua be performed with just a variable definition?

Assuming I define a variable like this in Lua
local input = "..."
Where the ... comes from a user-provided string. Would that user be able to perform code injection just from a variable definition? Do I need to sanitize the string?
As a general rule, if you ever need to ask yourself if you need to sanitize your inputs, the correct answer is "yes".
As to this particular case, if you just copy/paste the user's string directly into the Lua source file, even inside quotes like that, they will be able to execute arbitrary code. It's not even particularly difficult; they can provide something like:
some text"; my_code = 20; last = "end of string
which, once pasted between the quotes, turns the definition into
local input = "some text"; my_code = 20; last = "end of string"
and everything between the two string literals runs as ordinary Lua code.
The best way to sanitize this is by using a long-form literal string with [[...]] syntax. But even that can be broken out of, so you need to search the given string for runs of the = character. Each time you find a run, note how many = characters it contains. After searching, build your long brackets with a number of = characters that isn't one of the lengths you found.
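The counting idea itself is language-agnostic. Here is a rough sketch (written in Swift, with a hypothetical helper name) that simply picks a bracket level longer than any run of = in the input:
// Hypothetical helper: wrap untrusted text in a Lua long-bracket literal whose level
// (number of '=' signs) exceeds the longest '=' run in the text, so the closing
// delimiter cannot occur inside it.
func luaLongBracketLiteral(for userInput: String) -> String {
    var longestRun = 0
    var currentRun = 0
    for character in userInput {
        currentRun = (character == "=") ? currentRun + 1 : 0
        longestRun = max(longestRun, currentRun)
    }
    let level = String(repeating: "=", count: longestRun + 1)
    // Lua drops a newline that immediately follows the opening bracket, so add one
    // ourselves in case the user input starts with a newline of its own.
    return "[\(level)[\n\(userInput)]\(level)]"
}
print(luaLongBracketLiteral(for: "some text\"; my_code = 20; last = \"end of string"))
The resulting long-bracket literal stores the text verbatim, since Lua long strings do not interpret escape sequences at all.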
Of course, the internal implementation of Lua may have some limits on the length of the = sequence in a long-form literal string. In such a case, an external user could break your code by forcing you to use a longer sequence than the implementation supports. But they won't be able to cause arbitrary code execution; you'll just get a compile error.

How to use chr() function with unicode symbols?

I know that if I want to add an ASCII character to a string, like a blank space for example, all I need to do is add it with a call to CHAR(32) and the space will be added to the string.
But what if I want to put the infinite symbol ∞ (U+221E) how should I do it?
If I paste it into a literal string like 'infinite is ∞' then Delphi wants to change the file to UTF8.
Char is a data type, so Char() is a typecast, not a function. Chr() is a function.
In D2009+, you can use any of the following:
Char($221E) or Char(8734) (in earlier versions, use WideChar() instead)
Chr($221E) or Chr(8734)
#$221E or #8734 character constants
TCharacter.ConvertFromUtf32()
TCharHelper.ConvertFromUtf32()
'∞'. There is nothing wrong with using this in code and letting the IDE decide how to save it. This is actually the preferred solution. The only time you would need to avoid this is if you are using other tools to process your source files and they don't support UTF-8 files.
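For comparison with the Swift question at the top of this page, the same code point can also be built from its scalar value in Swift; a minimal sketch:
let infinity = Character(Unicode.Scalar(UInt32(0x221E))!)   // the scalar initializer is failable, hence the '!'
print(infinity)                                             // ∞
print("infinite is \u{221E}")                               // or simply use the escape in a literal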

Inserting Unicode Hyphen-minus into String Causes Error

I am trying to insert a unicode hyphen-minus character into a text string. I am seeing an "Invalid universal character" error with the following:
U+002D (hyphen-minus)
[textViewContent insertString:@"\u002D" atIndex:cursorPosition.location];
However, these work fine:
U+2212 (minus)
[textViewContent insertString:@"\u2212" atIndex:cursorPosition.location];
U+2010 (hyphen)
[textViewContent insertString:@"\u2010" atIndex:cursorPosition.location];
I've poked at several of the existing Unicode discussions here, but I have not found one that explains what is different amongst my examples that causes the first one to error. Insight greatly appreciated.
Universal character names have some restrictions on their use. In C99 and C++98 you were not allowed to use one that referred to a character in the basic character set (which includes U+002D).
C++11 has updated this requirement so if you are inside a string or character literal then you are allowed to use a UCN that refers to basic characters. Depending on the compiler version you're using I would guess that you could use Objective-C++11 to make your code legal.
That said, since this character is part of ASCII and the basic character set, why don't you just write it literally?
#"-"

Character #\u009C cannot be represented in the character set CHARSET:CP1252 - how to fix it

As already pointed out in the title, I get the following error:
Character #\u009C cannot be represented in the character set CHARSET:CP1252
when trying to print out a string returned by drakma:http-request. As far as I understand the error, the problem is that the Windows encoding (CP1252) does not support this character.
Therefore, to be able to process it, I probably have to convert the whole string.
My question is what package/library does support converting strings to certain character-sets efficiently?
A similar question is this one, but simply ignoring the error would not help in my case.
Drakma already does the job of "converting strings": after all, when it reads from some random webserver, it just gets a stream of bytes. It then has to convert that to a lisp string. You probably want to bind *drakma-default-external-format* to something else, although I can't remember off-hand what the allowable values are. Maybe something like :utf-8?

What are the negatives of adding a to_str method to Symbol?

I'm working in a ruby app in which symbols are used in various places where one would usually use strings or enums in other languages (to specify configurations mostly).
So my question is, why should I not add a to_str method to symbol?
It seems sensible, as it allows implicit conversion between symbol and string. So I can do stuff like this without having to worry about calling :symbol.to_s:
File.join(:something, "something_else") # => "something/something_else"
The negative is the same as the positive: it implicitly converts symbols to strings, which can be REALLY confusing if it causes an obscure bug. But given how symbols are generally used, I'm not sure if this is a valid concern.
Any thoughts?
When an object responds to to_str, you expect it to really act like a String. This means it should implement all of String's methods, so you could potentially break code that relies on this.
to_s means that you get a string representation of your object; that's why so many objects implement it. But the string you get is far from being 'semantically' equivalent to your object (an_hash.to_s is far from being a Hash). The absence of :symbol.to_str reflects this: a symbol is NOT and MUST NOT be confused with a string in Ruby, because they serve totally different purposes.
You wouldn't think about adding to_str to an Integer, right? Yet an Integer has a lot in common with a symbol: each one of them is unique. When you have a symbol, you expect it to be unique and immutable as well.
You don't have to add the implicit conversion at all, right? Doing something like this will already coerce it to a string:
"#{:something}/something_else" # "something/something_else"
The negative is what you say: at one point, anyway, some core Ruby code behaved differently depending on symbol vs. string. I don't know if that's still the case. The threat alone makes me a little twitchy, but I don't have a solid technical reason at this point. I guess the thought of making a symbol more string-like just makes me nervous.
