Parsing strings into integers - parsing

So I'm trying to find a pattern in a string and convert it to an integer.
Firstly I look for a string:
let haystack = "HTTP/1.1 200\r\n";
let needle = "HTTP/1.";
let http_location = haystack.rfind(needle);
if (http_location.is_some()) {
Now that I've found it I can think of two ways to get the numerical status. Either:
let mut temp_str = haystack.char_at(http_location.unwrap());
let status = String::from_str(temp_str);
}
Or:
let status = String::from_str(&haystack[http_location.unwrap()]);
}
Unfortunately both of them are deprecated (and probably wrong anyway). What is currently the correct way of doing this?
Also, is this part stylistically correct?:
let http_location = haystack.rfind(needle);
if (http_location.is_some())

Parsing is a wide and varied topic. There are easy parsing tools and there are performant parsing tools and a spectrum in between.
fn main() {
let haystack = "HTTP/1.1 200\r\n";
let needle = "HTTP/1.";
let z: Option<u8> = haystack.rfind(needle).and_then(|pt| {
let after_match = &haystack[(pt + needle.len())..];
after_match.splitn(2, " ").next()
}).and_then(|val| {
val.parse().ok()
});
println!("{:?}", z)
}
Here, we use rfind as you did before, which can fail. We use and_then to run the closure if the result was Some. The first closure slices the string after the needle, then splits it on spaces, with a maximum of 2 parts. That can fail, so we use a second and_then to use parse, which can also fail with a Result, so we convert that into an Option to preserve the type.
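At the end of this, we still might have failed, as the thing we parsed might not have been a parseable number! Note that for this input the text right after the needle is the minor version 1, so z is Some(1); the status code 200 is the next space-separated field.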
Rust really helps you make explicit places you can fail, and you have to deal with them. ^_^
In this case:
Maybe the string doesn't have "HTTP/1." in it
Iterators have to end at some point, so they can return None.
Parsing a string to a number can fail.
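The same three failure points can also be written with the ? operator on Option. This is only a sketch: parse_status is a hypothetical helper name, and it assumes the status code is the second whitespace-separated field after the needle and that it fits in a u16.

```rust
/// Hypothetical helper: extracts the status code from a line such as "HTTP/1.1 200\r\n".
fn parse_status(line: &str) -> Option<u16> {
    let needle = "HTTP/1.";
    // Failure point 1: the needle may not be in the string at all.
    let start = line.rfind(needle)? + needle.len();
    // Failure point 2: the iterator may run out of fields.
    let mut fields = line[start..].split_whitespace();
    let _minor_version = fields.next()?;
    let status = fields.next()?;
    // Failure point 3: the field may not be a parseable number.
    status.parse().ok()
}

fn main() {
    println!("{:?}", parse_status("HTTP/1.1 200\r\n"));
    println!("{:?}", parse_status("HTTP/1.1 cow"));
    println!("{:?}", parse_status("no match here"));
}
```

Each ? returns None from the function early, so the happy path reads top to bottom without nested closures.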
Here's an alternate solution that uses the regex crate:
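extern crate regex;
use regex::Regex;
fn main() {
let haystack = "HTTP/1.1 200\r\n";
// The dot is escaped so that it matches a literal '.' rather than any character
let re = Regex::new(r"HTTP/1\.(\d) (\d+)\r\n").unwrap();
let captures = re.captures(haystack).unwrap();
// Captures::get replaces the older Captures::at in current versions of the regex crate
let version: Option<u8> = captures.get(1).and_then(|m| m.as_str().parse().ok());
// u16 rather than u8, so that status codes above 255 (such as 404) also parse
let status: Option<u16> = captures.get(2).and_then(|m| m.as_str().parse().ok());
assert_eq!(Some(1), version);
assert_eq!(Some(200), status);
println!("Version: {:?}, Status: {:?}", version, status);
}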
You'll see that we have the same types of failure modes, but the structure is a bit different.
Or maybe a version that uses Result and the ? operator (which replaced the older try! macro):
#[derive(Debug,Copy,Clone,PartialEq)]
enum Error {
StartNotFound,
NotANumber,
}
fn parse_it(haystack: &str) -> Result<u8, Error> {
let needle = "HTTP/1.";
// Turn the Option into a Result, then return early on Err
let pt = haystack.rfind(needle).ok_or(Error::StartNotFound)?;
let after_match = &haystack[(pt + needle.len())..];
// splitn always yields at least one item, so this unwrap cannot fail
let val = after_match.splitn(2, " ").next().unwrap();
val.parse().map_err(|_| Error::NotANumber)
}
fn main() {
println!("{:?}", parse_it("HTTP/1.1 200\r\n"));
println!("{:?}", parse_it("HTTP/1"));
println!("{:?}", parse_it("HTTP/1.cow"));
}

Related

Parsing an f64 variable into a usize variable in Rust

I have currently been dabbling in the Rust programming language and decided a good way to test my skills was to program an application that would find the median of any given list of numbers.
Eventually I got to the final stretch of the code and ran into a problem.
I needed to parse an f64 variable into a usize variable.
However, I don't know how to go about doing this (Wow what a surprise!).
Take a look at the second function, calc_med() in my code. The variable n2 is supposed to take n and parse it into a usize. The code is not finished yet, but if you can see any more problems with the code please let me know.
use std::io;
use std::sync::Mutex;
#[macro_use]
extern crate lazy_static;
lazy_static! {
static ref v1: Mutex<Vec<f64>> = Mutex::new(Vec::new());
}
fn main() {
loop {
println!("Enter: ");
let mut inp: String = String::new();
io::stdin().read_line(&mut inp).expect("Failure");
let upd_inp: f64 = match inp.trim().parse() {
Ok(num) => num,
Err(_) => if inp.trim() == String::from("q") {
break;
} else if inp.trim() == String::from("d"){
break
{
println!("Done!");
calc_med();
}
} else {
continue;
}
};
v1.lock().unwrap().push(upd_inp);
v1.lock().unwrap().sort_by(|a, b| a.partial_cmp(b).unwrap());
println!("{:?}", v1.lock().unwrap());
}
}
fn calc_med() { // FOR STACKOVERFLOW: THIS FUNCTION
let n: f64 = ((v1.lock().unwrap().len()) as f64 + 1.0) / 2.0;
let n2: usize = n.to_usize().expect("Failure");
let median: f64 = v1[n2];
println!("{}", median)
}
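For what it's worth, here is a minimal sketch of one way to approach this, not a definitive fix for the program above. As far as I know the standard library's f64 has no to_usize method (that comes from the ToPrimitive trait in the num-traits crate), but an `as` cast works, and the median index can be computed with integer arithmetic so no float conversion is needed at all. The median function below is a hypothetical helper and assumes the input contains no NaN.

```rust
/// Hypothetical helper: median of a slice, or None if the slice is empty.
/// Assumes the values contain no NaN.
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("NaN in input"));
    // Integer division: the result is already a usize, so no cast is needed.
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 1 {
        Some(sorted[mid])
    } else {
        // Even length: average the two middle values.
        Some((sorted[mid - 1] + sorted[mid]) / 2.0)
    }
}

fn main() {
    // An `as` cast from f64 to usize truncates toward zero.
    let n: f64 = 2.5;
    let n2 = n as usize;
    println!("{}", n2);
    println!("{:?}", median(&[3.0, 1.0, 2.0]));
    println!("{:?}", median(&[4.0, 1.0, 3.0, 2.0]));
}
```

Passing the data in as a slice also avoids indexing the Mutex directly; with the global from the question you would lock it first and pass the guarded Vec.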

The compiler is unable to type-check this expression in reasonable time; try breaking up the expression into distinct sub-expressions

I am new to iOS programming and the expression below is giving an error:
let combine = date.enumerated().map {index, date in
return (date,self.arrFriendId[index],self.arrFriendName[index],self.arrFriendImage[index],self.arrMsgType[index],self.arrMessage[index], self.arrLastMsgTime[index], self.arrNotifyStatus[index])}
Please help me solve this.
Thanks in advance.
This error generally occurs when a single expression is doing too many things at once, so the compiler tells you to break it into sub-expressions.
Assuming you want combine to be of type Array<Any>, you can do it like this:
let combine = date.enumerated().map { index, date -> Any in
let id = self.arrFriendId[index]
let name = self.arrFriendName[index]
let image = self.arrFriendImage[index]
let messageType = self.arrMsgType[index]
let message = self.arrMessage[index]
let messageTime = self.arrLastMsgTime[index]
let status = self.arrNotifyStatus[index]
return (date, id, name, image, messageType, message, messageTime, status)
}

NSNetService dictionaryFromTXTRecord fails an assertion on invalid input

The input to dictionary(fromTXTRecord:) comes from the network, potentially from outside the app, or even the device. However, Apple's docs say:
... Fails an assertion if txtData cannot be represented as an NSDictionary object.
Failing an assertion leaves the programmer (me) with no way of handling the error, which seems illogical for a method that processes external data.
If I run this in Terminal on a Mac:
dns-sd -R 'My Service Name' _myservice._tcp local 4567 asdf asdf
my app, running on an iPhone, crashes.
dictionary(fromTXTRecord:) expects the TXT record data (asdf asdf) to be in key=val form. If, like above, a word doesn't contain any =, the method can't parse it and fails the assertion.
I see no way of solving this problem other than not using that method at all and implementing my own parsing, which feels wrong.
Am I missing something?
Here's a solution in Swift 4.2, assuming the TXT record has only strings:
/// Decode the TXT record as a string dictionary, or [:] if the data is malformed
public func dictionary(fromTXTRecord txtData: Data) -> [String: String] {
var result = [String: String]()
var data = txtData
while !data.isEmpty {
// The first byte of each record is its length, so prefix that much data
let recordLength = Int(data.removeFirst())
guard data.count >= recordLength else { return [:] }
let recordData = data[..<(data.startIndex + recordLength)]
data = data.dropFirst(recordLength)
guard let record = String(bytes: recordData, encoding: .utf8) else { return [:] }
// The format of the entry is "key=value"
// (According to the reference implementation, = is optional if there is no value,
// and any equals signs after the first are part of the value.)
// `omittingEmptySubsequences: false` is necessary, otherwise an empty string will crash the next line
let keyValue = record.split(separator: "=", maxSplits: 1, omittingEmptySubsequences: false)
let key = String(keyValue[0])
// If there's no value, make the value the empty string
switch keyValue.count {
case 1:
result[key] = ""
case 2:
result[key] = String(keyValue[1])
default:
fatalError()
}
}
return result
}
I'm still hoping there's something I'm missing here, but in the meantime, I ended up checking the data for correctness and only then calling Apple's own method.
Here's my workaround:
func dictionaryFromTXTRecordData(data: NSData) -> [String:NSData] {
let buffer = UnsafeBufferPointer<UInt8>(start: UnsafePointer(data.bytes), count: data.length)
var pos = 0
while pos < buffer.count {
let len = Int(buffer[pos])
// Only buffer.count - pos - 1 bytes remain after the length byte
if len > (buffer.count - pos - 1) {
return [:]
}
let subdata = data.subdataWithRange(NSRange(location: pos + 1, length: len))
guard let substring = String(data: subdata, encoding: NSUTF8StringEncoding) else {
return [:]
}
if !substring.containsString("=") {
return [:]
}
pos = pos + len + 1
}
return NSNetService.dictionaryFromTXTRecordData(data)
}
I'm using Swift 2 here. All contributions are welcome. Swift 3 versions, Objective-C versions, improvements, corrections.
I just ran into this one using Swift 3. In my case the problem only occurred when I used NetService.dictionary(fromTXTRecord:) but did not occur when I switched to Objective-C and called NSNetService dictionaryFromTXTRecord:. When the Objective-C call encounters an entry without an equal sign it creates a key containing the data and shoves it into the dictionary with an NSNull value. From what I can tell the Swift version then enumerates that dictionary and throws a fit when it sees the NSNull. My solution was to add an Objective-C file and a utility function that calls dictionaryFromTXTRecord: and cleans up the results before handing them back to my Swift code.

What is a safe way to turn streamed (utf8) data into a string?

Suppose I'm a server written in objc/swift. The client is sending me a large amount of data, which is really a large utf8 encoded string. As the server, I have my NSInputStream firing events to say it has data to read. I grab the data and build up a string with it.
However what if the next chunk of data I get falls on an unfortunate position in the utf8 data? Like on a composed character. It seems like it would mess the string up if you try to append a chunk of non compliant utf8 to it.
What is a suitable way to deal with this? I was thinking I could just keep the data as an NSData, but then I don't have any way to know when the data has finished being received (think HTTP where the length of data is in the header).
Thanks for any ideas.
The tool you probably want to use here is UTF8. It will handle all the state issues for you. See How to cast decrypted UInt8 to String? for a simple example that you can likely adapt.
The major concern in building up a string from UTF-8 data isn't composed characters, but rather multi-byte characters. "LATIN SMALL LETTER A" + "COMBINING GRAVE ACCENT" works fine even if you decode each of those characters separately. What doesn't work is gathering the first byte of 你, decoding it, and then appending the decoded second byte. The UTF8 type will handle this for you, though. All you need to do is bridge your NSInputStream to a GeneratorType.
Here's a basic (not fully production-ready) example of what I'm talking about. First, we need a way to convert an NSInputStream into a generator. That's probably the hardest part:
final class StreamGenerator {
static let bufferSize = 1024
let stream: NSInputStream
var buffer = [UInt8](count: StreamGenerator.bufferSize, repeatedValue: 0)
var buffGen = IndexingGenerator<ArraySlice<UInt8>>([])
init(stream: NSInputStream) {
self.stream = stream
stream.open()
}
}
extension StreamGenerator: GeneratorType {
func next() -> UInt8? {
// Check the stream status
switch stream.streamStatus {
case .NotOpen:
assertionFailure("Cannot read unopened stream")
return nil
case .Writing:
preconditionFailure("Impossible status")
case .AtEnd, .Closed, .Error:
return nil // FIXME: May want a closure to post errors
case .Opening, .Open, .Reading:
break
}
// First see if we can feed from our buffer
if let result = buffGen.next() {
return result
}
// Our buffer is empty. Block until there is at least one byte available
// Use count rather than capacity: capacity may exceed the number of initialized elements
let count = stream.read(&buffer, maxLength: buffer.count)
if count <= 0 { // FIXME: Probably want a closure or something to handle error cases
stream.close()
return nil
}
buffGen = buffer.prefix(count).generate()
return buffGen.next()
}
}
Calls to next() can block here, so it should not be called on the main queue, but other than that, it's a standard Generator that spits out bytes. (This is also the piece that probably has lots of little corner cases that I'm not handling, so you want to think this through pretty carefully. Still, it's not that complicated.)
With that, creating a UTF-8 decoding generator is almost trivial:
final class UnicodeScalarGenerator<ByteGenerator: GeneratorType where ByteGenerator.Element == UInt8> {
var byteGenerator: ByteGenerator
var utf8 = UTF8()
init(byteGenerator: ByteGenerator) {
self.byteGenerator = byteGenerator
}
}
extension UnicodeScalarGenerator: GeneratorType {
func next() -> UnicodeScalar? {
switch utf8.decode(&byteGenerator) {
case .Result(let scalar): return scalar
case .EmptyInput: return nil
case .Error: return nil // FIXME: Probably want a closure or something to handle error cases
}
}
}
You could of course trivially turn this into a CharacterGenerator instead (using Character(_:UnicodeScalar)).
The last problem is if you want to combine all combining marks, such that "LATIN SMALL LETTER A" followed by "COMBINING GRAVE ACCENT" would always be returned together (rather than as the two characters they are). That's actually a bit trickier than it sounds. First, you'd need to generate Strings, not Characters. And then you'd need a good way to know what all the combining characters are. That's certainly knowable, but I'm having a little trouble deriving a simple algorithm. There's no "combiningMarkCharacterSet" in Cocoa. I'm still thinking about it. Getting something that "mostly works" is easy, but I'm not sure yet how to build it so that it's correct for all of Unicode.
Here's a little sample program to try it out:
let textPath = NSBundle.mainBundle().pathForResource("text.txt", ofType: nil)!
let inputStream = NSInputStream(fileAtPath: textPath)!
inputStream.open()
dispatch_async(dispatch_get_global_queue(0, 0)) {
let streamGen = StreamGenerator(stream: inputStream)
let unicodeGen = UnicodeScalarGenerator(byteGenerator: streamGen)
var string = ""
for c in GeneratorSequence(unicodeGen) {
print(c)
string += String(c)
}
print(string)
}
And a little text to read:
Here is some normalish álfa你好 text
And some Zalgo i̝̲̲̗̹̼n͕͓̘v͇̠͈͕̻̹̫͡o̷͚͍̙͖ke̛̘̜̘͓̖̱̬ composed stuff
And one more line with no newline
(That second line is some Zalgo encoded text, which is nice for testing.)
I haven't done any testing with this in a real blocking situation, like reading from the network, but it should work based on how NSInputStream works (i.e. it should block until there's at least one byte to read, but then should just fill the buffer with whatever's available).
I've made all of this match GeneratorType so that it plugs into other things easily, but error handling might work better if you didn't use GeneratorType and instead created your own protocol with next() throws -> Self.Element instead. Throwing would make it easier to propagate errors up the stack, but would make it harder to plug into for...in loops.
I'm revisiting this question because I've had the very same problem to solve.
My solution adopts UTF8.ForwardParser, so it works with chunks of UInt8 values and keeps around the bytes of a scalar that might be split across two consecutive chunks.
// This class generates chunks of bytes from the given InputStream
final class ChunksGenerator {
static let bSize = 1024
let stream: InputStream
var buffer = Array<UInt8>(repeating: 0, count: bSize)
init(_ stream: InputStream) {
self.stream = stream
self.stream.open()
}
// Pull a chunk of bytes from the stream
func pull() throws -> ArraySlice<UInt8> {
switch stream.streamStatus {
// We've got to read the stream
case .opening: fallthrough
case .open: fallthrough
case .reading: break
// We're either done reading or having an error
case .error: fallthrough
case .atEnd:
stream.close()
if let error = stream.streamError {
throw error
} else {
fallthrough
}
case .closed: fallthrough
case .notOpen: return []
// Let's also address other status
case .writing: fallthrough
@unknown default: preconditionFailure("status: \(stream.streamStatus) not manageable for InputStream")
}
// read from stream in buffer
let length = stream.read(&buffer, maxLength: Self.bSize)
guard
length > 0
else {
// stream.read(_:maxLength:) returned either 0 or -1
defer {
stream.close()
}
if length == 0 {
return []
}
throw stream.streamError!
}
return buffer.prefix(length)
}
}
// This Iterator returns Character from an InputStream
struct CharacterParser: IteratorProtocol {
typealias Element = Character
let chunksGenerator: ChunksGenerator
var chunk: Array<UInt8> = []
var chunkIterator: IndexingIterator<Array<UInt8>> = [].makeIterator()
var error: Swift.Error? = nil
var utf8Parser = UTF8.ForwardParser()
init(inputStream: InputStream) {
self.chunksGenerator = ChunksGenerator(inputStream)
self.chunk = _pulledChunk ?? []
self.chunkIterator = chunk.makeIterator()
}
mutating func next() -> Character? {
switch utf8Parser.parseScalar(from: &chunkIterator) {
case .valid(let encoded):
// We've parsed a scalar encoded in UTF8,
// let's decode it and return the Character:
let scalar = UTF8.decode(encoded)
return Character(scalar)
case .emptyInput:
// We've consumed this chunk of bytes,
// let's pull another one from the stream
// and update the iterator's underlying data:
chunk = _pulledChunk ?? []
self.chunkIterator = chunk.makeIterator()
guard
// In case we've pulled an empty one then
// we're done and we return nil
!chunk.isEmpty
else { return nil }
return next()
case .error(length: let length):
// We've gotten a parsing error, therefore
// the suffix of the current chunk up to the
// length from the parse error contains bytes
// of a potential encoded scalar spanning
// across two chunks:
let remainingChunk = chunk.suffix(length)
let pulledChunk = _pulledChunk ?? []
chunk = remainingChunk + pulledChunk
chunkIterator = chunk.makeIterator()
guard
chunk.count > remainingChunk.count
else {
// No more data could be pulled from the stream:
// this is it, the stream ends with bytes that aren't a UTF-8 scalar, thus we set the error and return nil.
self.error = DecodingError.dataCorrupted(DecodingError.Context(codingPath: [], debugDescription: "Parse error. Bytes: \(remainingChunk) cannot be parsed into a valid UTF8 scalar", underlyingError: self.error))
return nil
}
return next()
}
}
// Attempt to pull a chunk from the stream,
// in case there was an error we set it in this
// iterator and return nil.
private var _pulledChunk: Array<UInt8>? {
mutating get {
do {
let pulled = try chunksGenerator.pull()
return Array(pulled)
} catch let e {
self.error = e
return nil
}
}
}
}

Compare a part of text to a String

I'm trying to get the range of a text, I did this:
let telRange = Range(start: tfTelephone.text!.startIndex, end: tfTelephone.text!.endIndex.advancedBy(2))
After this I'm trying to compare:
if (telRange == "08")
But I get an error:
Binary operator '==' cannot be applied to operands of type
'Range' (aka 'Range') and 'String'
Firstly, don't force unwrap, use an if-let instead. Secondly, you can't compare a Range and a String, most likely you want to compare the substring for that range. And thirdly, you can't advance forward from the endIndex.
let telRange: String
if let text = tfTelephone.text {
telRange = text[text.startIndex...text.endIndex.advancedBy(-2)]
} else {
telRange = ""
}
Something like the above should work; you can tweak the indexes and the advancedBy arguments to match your expectations.
You should be doing something like this instead:
if (tfTelephone.text![telRange] == "08") {
// ...
}
However, before that will work you have to fix this:
let telRange = Range(
start: tfTelephone.text!.startIndex,
end: tfTelephone.text!.endIndex.advancedBy(2) // <- This will fail. Use negative
) // number, and make sure to stay
// within the length of the input
// string still.
Try something like this:
let yourString = tfTelephone.text
let telRange = Range(start: yourString!.startIndex, end: yourString!.endIndex.advancedBy(-2)) //It's important to use only negative numbers!
let stringToCompare = yourString!.substringWithRange(telRange)
Now you can compare the two strings
if (stringToCompare == "08")

Resources