I'm working from an earlier AppCoda post called "Core Data Basics: Preload Data and Use Existing SQLite Database", located here: https://www.appcoda.com/core-data-preload-sqlite-database/
Simon Ng's post includes a function called parseCSV, which does the heavy lifting of scanning through a .csv file and breaking it into its rows, so that each row's elements can then be saved into Core Data through the managedObjectContext.
Unfortunately, all of the code appears to be written in either Swift 1.0 or Swift 2.0, and I have been unable to resolve the errors I get when converting it to Swift 4.
I've made all of the renaming changes suggested by Xcode ("this" has been replaced with "that"). The final remaining error is "Argument labels '(contentsOfURL:, encoding:, error:)' do not match any available overloads", which I have been unable to understand or correct.
// https://www.appcoda.com/core-data-preload-sqlite-database/
func parseCSV (contentsOfURL: NSURL, encoding: String.Encoding, error: NSErrorPointer) -> [(name:String, detail:String, price: String)]? {
// Load the CSV file and parse it
let delimiter = ","
var items:[(name:String, detail:String, price: String)]?
if let content = String(contentsOfURL: contentsOfURL, encoding: encoding, error: error) {
items = []
let lines:[String] = content.componentsSeparatedByCharactersInSet(NSCharacterSet.newlineCharacterSet()) as [String]
for line in lines {
var values:[String] = []
if line != "" {
// For a line with double quotes
// we use NSScanner to perform the parsing
if line.range(of: "\"") != nil {
var textToScan:String = line
var value:NSString?
var textScanner:Scanner = Scanner(string: textToScan)
while textScanner.string != "" {
if (textScanner.string as NSString).substring(to: 1) == "\"" {
textScanner.scanLocation += 1
textScanner.scanUpTo("\"", into: &value)
textScanner.scanLocation += 1
} else {
textScanner.scanUpTo(delimiter, into: &value)
}
// Store the value into the values array
values.append(value! as String)
// Retrieve the unscanned remainder of the string
if textScanner.scanLocation < textScanner.string.count {
textToScan = (textScanner.string as NSString).substring(from: textScanner.scanLocation + 1)
} else {
textToScan = ""
}
textScanner = Scanner(string: textToScan)
}
// For a line without double quotes, we can simply separate the string
// by using the delimiter (e.g. comma)
} else {
values = line.components(separatedBy: delimiter)
}
// Put the values into the tuple and add it to the items array
let item = (name: values[0], detail: values[1], price: values[2])
items?.append(item)
}
}
}
return items
}
The 5th line:
if let content = String(contentsOfURL: contentsOfURL, encoding: encoding, error: error) {
is throwing the following error:
Argument labels '(contentsOfURL:, encoding:, error:)' do not match any available overloads
This is beyond my understanding and skill level. I'm really just trying to find the best way of importing a comma-separated .csv file into a Core Data object.
Any assistance would be appreciated. The original example by Simon Ng appears perfect for what I'm trying to achieve. It just hasn't been updated in a very long time.
First of all - you all are brilliant contributors and bloody fast at your intel. I'd like to thank all of you for answering so quickly. Here's where I ended up with that particular function in the latest Swift 5 syntax.
func parseCSV (contentsOfURL: NSURL, encoding: String.Encoding, error: NSErrorPointer) -> [(name:String, detail:String, price: String)]? {
// Load the CSV file and parse it
let delimiter = ","
var items:[(name:String, detail:String, price: String)]?
//if let content = String(contentsOfURL: contentsOfURL, encoding: encoding, error: error) {
if let content = try? String(contentsOf: contentsOfURL as URL, encoding: encoding) {
items = []
let lines:[String] = content.components(separatedBy: NSCharacterSet.newlines) as [String]
for line in lines {
var values:[String] = []
if line != "" {
// For a line with double quotes
// we use NSScanner to perform the parsing
if line.range(of: "\"") != nil {
var textToScan:String = line
var value:NSString?
var textScanner:Scanner = Scanner(string: textToScan)
while textScanner.string != "" {
if (textScanner.string as NSString).substring(to: 1) == "\"" {
textScanner.scanLocation += 1
textScanner.scanUpTo("\"", into: &value)
textScanner.scanLocation += 1
} else {
textScanner.scanUpTo(delimiter, into: &value)
}
// Store the value into the values array
values.append(value! as String)
// Retrieve the unscanned remainder of the string
if textScanner.scanLocation < textScanner.string.count {
textToScan = (textScanner.string as NSString).substring(from: textScanner.scanLocation + 1)
} else {
textToScan = ""
}
textScanner = Scanner(string: textToScan)
}
// For a line without double quotes, we can simply separate the string
// by using the delimiter (e.g. comma)
} else {
values = line.components(separatedBy: delimiter)
}
// Put the values into the tuple and add it to the items array
let item = (name: values[0], detail: values[1], price: values[2])
items?.append(item)
}
}
}
return items
}
As of Swift 3, that initializer has been renamed to String(contentsOf:encoding:), so you just need to update the argument labels in your code.
It's also worth mentioning that this initializer now throws, so you will have to handle the error. It wouldn't do any harm to take a look at this page on error handling in Swift.
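As a rough illustration of handling the thrown error, here is a minimal sketch using do/catch. The helper name loadText(from:encoding:) and the sample file are hypothetical and only there to keep the example self-contained; they are not part of the original tutorial.

```swift
import Foundation

/// Hypothetical helper: returns the file's text, or nil if reading fails.
func loadText(from url: URL, encoding: String.Encoding = .utf8) -> String? {
    do {
        // The Swift 3+ initializer throws instead of filling in an NSErrorPointer.
        return try String(contentsOf: url, encoding: encoding)
    } catch {
        // A missing file, an unreadable file, or a wrong encoding all end up here.
        print("Could not read \(url.path): \(error)")
        return nil
    }
}

// Usage with a hypothetical sample file written to the temporary directory.
let sampleURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("menu-sample.csv")
try? "Latte,Espresso with milk,3.50\n".write(to: sampleURL, atomically: true, encoding: .utf8)
if let content = loadText(from: sampleURL) {
    print(content)
}
```

Using try? as in the updated parseCSV above is shorter, but it discards the error, so do/catch is probably the better fit when you want to know why a file failed to load.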
Because Scanner has been changed up in iOS 13 in ways that seem to be poorly explained, I rewrote this to work without it. For my application, the header row is of interest, so it's captured separately; if it's not meaningful then that part can be omitted.
The code starts with workingText which has been read from whatever file or URL is the source of the data.
var headers : [String] = []
var data : [[String]] = []
let workingLines = workingText.split{$0.isNewline}
if let headerLine = workingLines.first {
headers = parseCsvLine(ln: String(headerLine))
for ln in workingLines {
if ln != headerLine {
let fields = parseCsvLine(ln: String(ln))
data.append(fields)
}
}
}
print("-----------------------------")
print("Headers: \(headers)")
print("Data:")
for d in data {
print(d) // gives each data row its own printed row; print(data) has no line breaks anywhere + is hard to read
}
print("-----------------------------")
func parseCsvLine(ln: String) -> [String] {
// takes a line of a CSV file and returns the separated values
// so input of 'a,b,2' should return ["a","b","2"]
// or input of '"Houston, TX","Hello",5,"6,7"' should return ["Houston, TX","Hello","5","6,7"]
let delimiter = ","
let quote = "\""
var nextTerminator = delimiter
var andDiscardDelimiter = false
var currentValue = ""
var allValues : [String] = []
for char in ln {
let chr = String(char)
if chr == nextTerminator {
if andDiscardDelimiter {
// we've found the closing quote of a quoted value. The value itself is appended when the following comma is reached, so just clear this flag.
andDiscardDelimiter = false
}
else {
// we've found the comma terminating one value
allValues.append(currentValue)
currentValue = ""
}
nextTerminator = delimiter // either way, next thing we look for is the comma
} else if chr == quote {
// this is an OPENING quote, so clear currentValue (which should be nothing but maybe a single space):
currentValue = ""
nextTerminator = quote
andDiscardDelimiter = true
} else {
currentValue += chr
}
}
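// the last value on the line has no trailing comma, so append it here
allValues.append(currentValue)
return allValues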
}
I freely acknowledge that I probably use more conversions to String than those smarter than I am in the ways of Apple strings, substrings, scanners, and such would find necessary. Parsing a file of a few hundred rows x about a dozen columns, this approach seems to work fine; for something significantly larger, the extra overhead may start to matter.
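For comparison, here is a sketch of the same idea that compares Character values directly instead of converting each one to String. The function name splitCSVLine is hypothetical, and the handling of doubled quotes ("" inside a quoted field) is an addition that the code above does not attempt.

```swift
import Foundation

/// Hypothetical helper: splits one CSV line into fields.
/// Commas inside double quotes are kept; a doubled quote inside a quoted
/// field is treated as one literal quote (an assumption about the input).
func splitCSVLine(_ line: String, delimiter: Character = ",") -> [String] {
    var fields: [String] = []
    var current = ""
    var insideQuotes = false
    var iterator = line.makeIterator()
    var pending: Character? = iterator.next()

    while let char = pending {
        pending = iterator.next()   // one-character lookahead
        if insideQuotes {
            if char == "\"" {
                if pending == "\"" {
                    // Doubled quote: keep one quote and skip the second.
                    current.append("\"")
                    pending = iterator.next()
                } else {
                    insideQuotes = false   // closing quote
                }
            } else {
                current.append(char)
            }
        } else if char == "\"" {
            insideQuotes = true            // opening quote
        } else if char == delimiter {
            fields.append(current)
            current = ""
        } else {
            current.append(char)
        }
    }
    // The last field has no trailing delimiter, so append it explicitly.
    fields.append(current)
    return fields
}

print(splitCSVLine("\"Houston, TX\",\"Hello\",5,\"6,7\""))
```

This still works one line at a time, so quoted fields that contain line breaks are out of scope here, just as they are in the code above.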
An alternative is to use a library for this. https://github.com/dehesa/CodableCSV supports it and also lists other Swift CSV libraries.
I'm upgrading code from Swift 2 to Swift 3 and ran across this error:
wordcount.swift:7:5: error: value of type 'String' has no member 'enumerateSubstringsInRange'
line.enumerateSubstringsInRange(range, options: .ByWords) {w,_,_,_ in
In Swift 2, this method comes from a String extension of which the compiler is aware.
I have not been able to locate this method in the Swift 3 library. It appears in the documentation for Foundation here:
https://developer.apple.com/library/ios/documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/index.html#//apple_ref/occ/instm/NSString/enumerateSubstringsInRange:options:usingBlock:
My entire script is:
import Foundation
var counts = [String: Int]()
while let line = readLine()?.lowercased() {
let range = line.characters.indices
line.enumerateSubstringsInRange(range, options: .ByWords) {w,_,_,_ in
guard let word = w else {return}
counts[word] = (counts[word] ?? 0) + 1
}
}
for (word, count) in (counts.sorted {$0.0 < $1.0}) {
print("\(word) \(count)")
}
It works with Swift 2.2 (modulo the changes I have already made for Swift 3, such as lowercase -> lowercased and sort -> sorted) but fails to compile with Swift 3.
And very strangely, neither the Swift 3 command-line compiler nor the Swift Migration Assistant in Xcode 8 beta suggests a replacement, as they do for many other renamed methods. Perhaps enumerateSubstringsInRange is deprecated or its parameter names changed?
If you type str.enumerateSubstrings in a Playground, you'll see the following as a completion option:
enumerateSubstrings(in: Range<Index>, options: EnumerationOptions, body: (substring: String?, substringRange: Range<Index>, enclosingRange: Range<Index>, inout Bool) -> ())
In addition to addressing the new enumerateSubstrings(in:options:body:) syntax, you need to also change how you get the range for the string:
import Foundation
var counts = [String: Int]()
while let line = readLine()?.lowercased() {
let range = line.startIndex ..< line.endIndex
line.enumerateSubstrings(in: range, options: .byWords) {w,_,_,_ in
guard let word = w else {return}
counts[word] = (counts[word] ?? 0) + 1
}
}
for (word, count) in (counts.sorted {$0.0 < $1.0}) {
print("\(word) \(count)")
}
I am trying to read a file given in an NSURL and load it into an array, with items separated by a newline character \n.
Here is the way I've done it so far:
var possList: NSString? = NSString.stringWithContentsOfURL(filePath.URL) as? NSString
if var list = possList {
list = list.componentsSeparatedByString("\n") as NSString[]
return list
}
else {
//return empty list
}
I'm not very happy with this for a couple of reasons. One, I'm working with files that range from a few kilobytes to hundreds of MB in size. As you can imagine, working with strings this large is slow and unwieldy. Secondly, this freezes up the UI when it's executing--again, not good.
I've looked into running this code in a separate thread, but I've been having trouble with that, and besides, it still doesn't solve the problem of dealing with huge strings.
What I'd like to do is something along the lines of the following pseudocode:
var aStreamReader = new StreamReader(from_file_or_url)
while aStreamReader.hasNextLine == true {
currentline = aStreamReader.nextLine()
list.addItem(currentline)
}
How would I accomplish this in Swift?
A few notes about the files I'm reading from: All files consist of short (<255 chars) strings separated by either \n or \r\n. The length of the files range from ~100 lines to over 50 million lines. They may contain European characters, and/or characters with accents.
(The code is for Swift 2.2/Xcode 7.3 now. Older versions can be found in the edit history if somebody needs it. An updated version for Swift 3 is provided at the end.)
The following Swift code is heavily inspired by the various answers to "How to read data from NSFileHandle line by line?". It reads from the file in chunks and converts complete lines to strings.
The default line delimiter (\n), string encoding (UTF-8), and chunk size (4096) can be set with optional parameters.
class StreamReader {
let encoding : UInt
let chunkSize : Int
var fileHandle : NSFileHandle!
let buffer : NSMutableData!
let delimData : NSData!
var atEof : Bool = false
init?(path: String, delimiter: String = "\n", encoding : UInt = NSUTF8StringEncoding, chunkSize : Int = 4096) {
self.chunkSize = chunkSize
self.encoding = encoding
if let fileHandle = NSFileHandle(forReadingAtPath: path),
delimData = delimiter.dataUsingEncoding(encoding),
buffer = NSMutableData(capacity: chunkSize)
{
self.fileHandle = fileHandle
self.delimData = delimData
self.buffer = buffer
} else {
self.fileHandle = nil
self.delimData = nil
self.buffer = nil
return nil
}
}
deinit {
self.close()
}
/// Return next line, or nil on EOF.
func nextLine() -> String? {
precondition(fileHandle != nil, "Attempt to read from closed file")
if atEof {
return nil
}
// Read data chunks from file until a line delimiter is found:
var range = buffer.rangeOfData(delimData, options: [], range: NSMakeRange(0, buffer.length))
while range.location == NSNotFound {
let tmpData = fileHandle.readDataOfLength(chunkSize)
if tmpData.length == 0 {
// EOF or read error.
atEof = true
if buffer.length > 0 {
// Buffer contains last line in file (not terminated by delimiter).
let line = NSString(data: buffer, encoding: encoding)
buffer.length = 0
return line as String?
}
// No more lines.
return nil
}
buffer.appendData(tmpData)
range = buffer.rangeOfData(delimData, options: [], range: NSMakeRange(0, buffer.length))
}
// Convert complete line (excluding the delimiter) to a string:
let line = NSString(data: buffer.subdataWithRange(NSMakeRange(0, range.location)),
encoding: encoding)
// Remove line (and the delimiter) from the buffer:
buffer.replaceBytesInRange(NSMakeRange(0, range.location + range.length), withBytes: nil, length: 0)
return line as String?
}
/// Start reading from the beginning of file.
func rewind() -> Void {
fileHandle.seekToFileOffset(0)
buffer.length = 0
atEof = false
}
/// Close the underlying file. No reading must be done after calling this method.
func close() -> Void {
fileHandle?.closeFile()
fileHandle = nil
}
}
Usage:
if let aStreamReader = StreamReader(path: "/path/to/file") {
defer {
aStreamReader.close()
}
while let line = aStreamReader.nextLine() {
print(line)
}
}
You can even use the reader with a for-in loop
for line in aStreamReader {
print(line)
}
by implementing the SequenceType protocol (compare http://robots.thoughtbot.com/swift-sequences):
extension StreamReader : SequenceType {
func generate() -> AnyGenerator<String> {
return AnyGenerator {
return self.nextLine()
}
}
}
Update for Swift 3/Xcode 8 beta 6: Also "modernized" to use guard and the new Data value type:
class StreamReader {
let encoding : String.Encoding
let chunkSize : Int
var fileHandle : FileHandle!
let delimData : Data
var buffer : Data
var atEof : Bool
init?(path: String, delimiter: String = "\n", encoding: String.Encoding = .utf8,
chunkSize: Int = 4096) {
guard let fileHandle = FileHandle(forReadingAtPath: path),
let delimData = delimiter.data(using: encoding) else {
return nil
}
self.encoding = encoding
self.chunkSize = chunkSize
self.fileHandle = fileHandle
self.delimData = delimData
self.buffer = Data(capacity: chunkSize)
self.atEof = false
}
deinit {
self.close()
}
/// Return next line, or nil on EOF.
func nextLine() -> String? {
precondition(fileHandle != nil, "Attempt to read from closed file")
// Read data chunks from file until a line delimiter is found:
while !atEof {
if let range = buffer.range(of: delimData) {
// Convert complete line (excluding the delimiter) to a string:
let line = String(data: buffer.subdata(in: 0..<range.lowerBound), encoding: encoding)
// Remove line (and the delimiter) from the buffer:
buffer.removeSubrange(0..<range.upperBound)
return line
}
let tmpData = fileHandle.readData(ofLength: chunkSize)
if tmpData.count > 0 {
buffer.append(tmpData)
} else {
// EOF or read error.
atEof = true
if buffer.count > 0 {
// Buffer contains last line in file (not terminated by delimiter).
let line = String(data: buffer as Data, encoding: encoding)
buffer.count = 0
return line
}
}
}
return nil
}
/// Start reading from the beginning of file.
func rewind() -> Void {
fileHandle.seek(toFileOffset: 0)
buffer.count = 0
atEof = false
}
/// Close the underlying file. No reading must be done after calling this method.
func close() -> Void {
fileHandle?.closeFile()
fileHandle = nil
}
}
extension StreamReader : Sequence {
func makeIterator() -> AnyIterator<String> {
return AnyIterator {
return self.nextLine()
}
}
}
An efficient and convenient class for reading a text file line by line (Swift 4, Swift 5)
Note: This code is platform independent (macOS, iOS, Ubuntu)
import Foundation
/// Read text file line by line in efficient way
public class LineReader {
public let path: String
fileprivate let file: UnsafeMutablePointer<FILE>!
init?(path: String) {
self.path = path
file = fopen(path, "r")
guard file != nil else { return nil }
}
public var nextLine: String? {
var line:UnsafeMutablePointer<CChar>? = nil
var linecap:Int = 0
defer { free(line) }
return getline(&line, &linecap, file) > 0 ? String(cString: line!) : nil
}
deinit {
fclose(file)
}
}
extension LineReader: Sequence {
public func makeIterator() -> AnyIterator<String> {
return AnyIterator<String> {
return self.nextLine
}
}
}
Usage:
guard let reader = LineReader(path: "/Path/to/file.txt") else {
return; // cannot open file
}
for line in reader {
print(">" + line.trimmingCharacters(in: .whitespacesAndNewlines))
}
Repository on github
Swift 4.2 Safe syntax
class LineReader {
let path: String
init?(path: String) {
self.path = path
guard let file = fopen(path, "r") else {
return nil
}
self.file = file
}
deinit {
fclose(file)
}
var nextLine: String? {
var line: UnsafeMutablePointer<CChar>?
var linecap = 0
defer {
free(line)
}
let status = getline(&line, &linecap, file)
guard status > 0, let unwrappedLine = line else {
return nil
}
return String(cString: unwrappedLine)
}
private let file: UnsafeMutablePointer<FILE>
}
extension LineReader: Sequence {
func makeIterator() -> AnyIterator<String> {
return AnyIterator<String> {
return self.nextLine
}
}
}
Usage:
guard let reader = LineReader(path: "/Path/to/file.txt") else {
return
}
reader.forEach { line in
print(line.trimmingCharacters(in: .whitespacesAndNewlines))
}
I'm late to the game, but here's a small class I wrote for that purpose. After a few different attempts (such as trying to subclass NSInputStream), I found this to be a reasonable and simple approach.
Remember to #import <stdio.h> in your bridging header.
// Use it like this:
let readLine = ReadLine(path: somePath)
while let line = readLine.readline() {
// do something...
}
class ReadLine {
private var buf = UnsafeMutablePointer<Int8>.alloc(1024)
private var n: Int = 1024
let path: String
let mode: String = "r"
private lazy var filepointer: UnsafeMutablePointer<FILE> = {
let csmode = self.mode.withCString { cs in return cs }
let cspath = self.path.withCString { cs in return cs }
return fopen(cspath, csmode)
}()
init(path: String) {
self.path = path
}
func readline() -> String? {
// unsafe for unknown input
if getline(&buf, &n, filepointer) > 0 {
return String.fromCString(UnsafePointer<CChar>(buf))
}
return nil
}
deinit {
buf.dealloc(n)
fclose(filepointer)
}
}
This function takes a file URL and returns a sequence which will return every line of the file, reading them lazily. It works with Swift 5. It relies on the underlying getline:
typealias LineState = (
// pointer to a C string representing a line
linePtr:UnsafeMutablePointer<CChar>?,
linecap:Int,
filePtr:UnsafeMutablePointer<FILE>?
)
/// Returns a sequence which iterates through all lines of the file at the URL.
///
/// - Parameter url: file URL of a file to read
/// - Returns: a Sequence which lazily iterates through lines of the file
///
/// - warning: the caller of this function **must** iterate through all lines of the file, since aborting iteration midway will leak memory and a file pointer
/// - precondition: the file must be UTF-8 encoded (which includes ASCII-encoded files)
func lines(ofFile url:URL) -> UnfoldSequence<String,LineState>
{
let initialState:LineState = (linePtr:nil, linecap:0, filePtr:fopen(url.path,"r"))
return sequence(state: initialState, next: { (state) -> String? in
if getline(&state.linePtr, &state.linecap, state.filePtr) > 0,
let theLine = state.linePtr {
return String.init(cString:theLine)
}
else {
if let actualLine = state.linePtr { free(actualLine) }
fclose(state.filePtr)
return nil
}
})
}
So for instance, here's how you would use it to print every line of a file named "foo" in your app bundle:
let url = Bundle.main.url(forResource: "foo", withExtension: nil)!
for line in lines(ofFile:url) {
// suppress print's automatically inserted line ending, since
// lines(ofFile:) keeps each line's own newline character.
print(line, separator: "", terminator: "")
}
I developed this answer by modifying Alex Brown's answer to remove a memory leak mentioned in Martin R's comment, and by updating it for Swift 5.
Try this answer, or read the Mac OS Stream Programming Guide.
You may find that performance will actually be better using the stringWithContentsOfURL, though, as it will be quicker to work with memory-based (or memory-mapped) data than disc-based data.
Executing it on another thread is well documented, also, for example here.
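As a rough sketch of the background-thread idea, something like the following could work. The helper name loadLines(from:completion:) is hypothetical, and the completion handler runs on the background queue, so in an app you would hop back with DispatchQueue.main.async before touching the UI.

```swift
import Foundation

/// Hypothetical helper: reads a whole file on a background queue and passes
/// its lines to `completion`, which is called on that background queue.
func loadLines(from url: URL, completion: @escaping ([String]) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        // The blocking read happens here, off the calling thread.
        let text = (try? String(contentsOf: url, encoding: .utf8)) ?? ""
        // Character.isNewline matches both "\n" and "\r\n".
        // split drops empty lines by default.
        let lines = text.split(whereSeparator: { $0.isNewline }).map { String($0) }
        completion(lines)
    }
}
```

This still loads the entire file into memory, so it only addresses the frozen UI, not the cost of working with very large strings.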
Update
If you don't want to read it all at once, and you don't want to use NSStreams, then you'll probably have to use C-level file I/O. There are many reasons not to do this (blocking, character encoding, handling I/O errors, and speed, to name but a few), and this is what the Foundation libraries are for. I've sketched a simple answer below that just deals with ASCII data:
class StreamReader {
var eofReached = false
let fileHandle: UnsafePointer<FILE>
init (path: String) {
self.fileHandle = fopen(path.bridgeToObjectiveC().UTF8String, "rb".bridgeToObjectiveC().UTF8String)
}
deinit {
fclose(self.fileHandle)
}
func nextLine() -> String {
var nextChar: UInt8 = 0
var stringSoFar = ""
var eolReached = false
while (self.eofReached == false) && (eolReached == false) {
if fread(&nextChar, 1, 1, self.fileHandle) == 1 {
switch nextChar & 0xFF {
case 13, 10 : // CR, LF
eolReached = true
case 0...127 : // Keep it in ASCII
stringSoFar += NSString(bytes:&nextChar, length:1, encoding: NSASCIIStringEncoding)
default :
stringSoFar += "<\(nextChar)>"
}
} else { // EOF or error
self.eofReached = true
}
}
return stringSoFar
}
}
// OP's original request follows:
var aStreamReader = StreamReader(path: "~/Desktop/Test.text".stringByStandardizingPath)
while aStreamReader.eofReached == false { // Changed property name for more accurate meaning
let currentline = aStreamReader.nextLine()
//list.addItem(currentline)
println(currentline)
}
Or you could simply use a Generator:
let stdinByLine = GeneratorOf({ () -> String? in
var input = UnsafeMutablePointer<Int8>(), lim = 0
return getline(&input, &lim, stdin) > 0 ? String.fromCString(input) : nil
})
Let's try it out
for line in stdinByLine {
println(">>> \(line)")
}
It's simple, lazy, and easy to chain with other Swift things like enumerators and functors such as map, reduce, and filter, using the lazy() wrapper.
It generalises to all FILE as:
let byLine = { (file:UnsafeMutablePointer<FILE>) in
GeneratorOf({ () -> String? in
var input = UnsafeMutablePointer<Int8>(), lim = 0
return getline(&input, &lim, file) > 0 ? String.fromCString(input) : nil
})
}
called like
for line in byLine(stdin) { ... }
Following up on @dankogai's answer, I made a few modifications for Swift 4+:
let bufsize = 4096
let buf = UnsafeMutablePointer<Int8>.allocate(capacity: bufsize)
// fopen returns nil if the file cannot be opened
if let fp = fopen(jsonURL.path, "r") {
while (fgets(buf, Int32(bufsize-1), fp) != nil) {
print( String(cString: buf) )
}
// close the file once all lines have been read
fclose(fp)
}
buf.deallocate()
This worked for me.
Thanks
Swift 5.5: use url.lines
ADC Docs are here
Example usage:
guard let url = URL(string: "https://www.example.com") else {
return
}
// Manipulating an `Array` in memory seems to be a requirement.
// This will balloon in size as lines of data get added.
var myHugeArray = [String]()
do {
// This should keep the inbound data memory usage low
for try await line in url.lines {
myHugeArray.append(line)
}
} catch {
debugPrint(error)
}
You can run this in a SwiftUI .task { } modifier, or wrap it in a Task, to get its work off the main thread.
It turns out the good old-fashioned C API is pretty comfortable in Swift once you grok UnsafePointer. Here is a simple cat that reads from stdin and prints to stdout line by line. You don't even need Foundation. Darwin suffices:
import Darwin
let bufsize = 4096
// let stdin = fdopen(STDIN_FILENO, "r") it is now predefined in Darwin
var buf = UnsafePointer<Int8>.alloc(bufsize)
while fgets(buf, Int32(bufsize-1), stdin) {
print(String.fromCString(CString(buf)))
}
buf.destroy()
(Note: I'm using Swift 3.0.1 on Xcode 8.2.1 with macOS Sierra 10.12.3)
All of the answers I've seen here missed that the asker could be looking for either LF or CRLF. If everything goes well, they could just match on LF and check the returned string for an extra CR at the end. But the general query involves multiple search strings. In other words, the delimiter needs to be a Set<String>, where the set is neither empty nor contains the empty string, instead of a single string.
On my first try at this last year, I tried to do the "right thing" and search for a general set of strings. It was too hard; you need a full blown parser and state machines and such. I gave up on it and the project it was part of.
Now I'm doing the project again, and facing the same challenge again. Now I'm going to hard-code searching on CR and LF. I don't think anyone would need to search on two semi-independent and semi-dependent characters like this outside of CR/LF parsing.
I'm using the search methods provided by Data, so I'm not doing string encodings and stuff here. Just raw binary processing. Just assume I got an ASCII superset, like ISO Latin-1 or UTF-8, here. You can handle string encoding at the next-higher layer, and you punt on whether a CR/LF with secondary code-points attached still counts as a CR or LF.
The algorithm: just keep searching for the next CR and the next LF from your current byte offset.
If neither is found, then consider the next data string to be from the current offset to the end-of-data. Note that the terminator length is 0. Mark this as the end of your reading loop.
If a LF is found first, or only a LF is found, consider the next data string to be from the current offset to the LF. Note that the terminator length is 1. Move the offset to after the LF.
If only a CR is found, do like the LF case (just with a different byte value).
Otherwise, we got a CR followed by a LF.
If the two are adjacent, then handle like the LF case, except the terminator length will be 2.
If there is one byte between them, and said byte is also CR, then we got the "Windows developer wrote a binary \r\n while in text mode, giving a \r\r\n" problem. Also handle it like the LF case, except the terminator length will be 3.
Otherwise the CR and LF aren't connected, and handle like the just-CR case.
Here's some code for that:
struct DataInternetLineIterator: IteratorProtocol {
/// Descriptor of the location of a line
typealias LineLocation = (offset: Int, length: Int, terminatorLength: Int)
/// Carriage return.
static let cr: UInt8 = 13
/// Carriage return as data.
static let crData = Data(repeating: cr, count: 1)
/// Line feed.
static let lf: UInt8 = 10
/// Line feed as data.
static let lfData = Data(repeating: lf, count: 1)
/// The data to traverse.
let data: Data
/// The byte offset to search from for the next line.
private var lineStartOffset: Int = 0
/// Initialize with the data to read over.
init(data: Data) {
self.data = data
}
mutating func next() -> LineLocation? {
guard self.data.count - self.lineStartOffset > 0 else { return nil }
let nextCR = self.data.range(of: DataInternetLineIterator.crData, options: [], in: lineStartOffset..<self.data.count)?.lowerBound
let nextLF = self.data.range(of: DataInternetLineIterator.lfData, options: [], in: lineStartOffset..<self.data.count)?.lowerBound
var location: LineLocation = (self.lineStartOffset, -self.lineStartOffset, 0)
let lineEndOffset: Int
switch (nextCR, nextLF) {
case (nil, nil):
lineEndOffset = self.data.count
case (nil, let offsetLf):
lineEndOffset = offsetLf!
location.terminatorLength = 1
case (let offsetCr, nil):
lineEndOffset = offsetCr!
location.terminatorLength = 1
default:
lineEndOffset = min(nextLF!, nextCR!)
if nextLF! < nextCR! {
location.terminatorLength = 1
} else {
switch nextLF! - nextCR! {
case 2 where self.data[nextCR! + 1] == DataInternetLineIterator.cr:
location.terminatorLength += 1 // CR-CRLF
fallthrough
case 1:
location.terminatorLength += 1 // CRLF
fallthrough
default:
location.terminatorLength += 1 // CR-only
}
}
}
self.lineStartOffset = lineEndOffset + location.terminatorLength
location.length += self.lineStartOffset
return location
}
}
Of course, if you have a Data block of a length that's at least a significant fraction of a gigabyte, you'll take a hit whenever no more CR or LF exist from the current byte offset; always fruitlessly searching until the end during every iteration. Reading the data in chunks would help:
struct DataBlockIterator: IteratorProtocol {
/// The data to traverse.
let data: Data
/// The offset into the data to read the next block from.
private(set) var blockOffset = 0
/// The number of bytes remaining. Kept so the last block is the right size if it's short.
private(set) var bytesRemaining: Int
/// The size of each block (except possibly the last).
let blockSize: Int
/// Initialize with the data to read over and the chunk size.
init(data: Data, blockSize: Int) {
precondition(blockSize > 0)
self.data = data
self.bytesRemaining = data.count
self.blockSize = blockSize
}
mutating func next() -> Data? {
guard bytesRemaining > 0 else { return nil }
defer { blockOffset += blockSize ; bytesRemaining -= blockSize }
return data.subdata(in: blockOffset..<(blockOffset + min(bytesRemaining, blockSize)))
}
}
You have to mix these ideas together yourself, since I haven't done it yet. Consider:
Of course, you have to consider lines completely contained in a chunk.
But you have to handle when the ends of a line are in adjacent chunks.
Or when the endpoints have at least one chunk between them
The big complication is when the line ends with a multi-byte sequence, but said sequence straddles two chunks! (A line ending in just CR that's also the last byte in the chunk is an equivalent case, since you need to read the next chunk to see if your just-CR is actually a CRLF or CR-CRLF. There are similar shenanigans when the chunk ends with CR-CR.)
And you need to handle when there are no more terminators from your current offset, but the end-of-data is in a later chunk.
Good luck!
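One possible way to combine the two ideas is a small byte-wise state machine that remembers whether the previous byte was a CR, so a CRLF that straddles two chunks is still treated as one terminator. This is only a sketch under simplifying assumptions: the type name ChunkedLineSplitter is hypothetical, it copies line bytes instead of returning offsets, and it treats CR-CRLF as two terminators rather than one.

```swift
import Foundation

/// Hypothetical helper: accepts data chunk by chunk and emits complete lines.
/// Recognises LF, CRLF, and lone CR. The CR-CRLF quirk is not special-cased.
struct ChunkedLineSplitter {
    private var pending = Data()        // bytes of the line being assembled
    private var previousWasCR = false   // survives across chunk boundaries

    /// Feed one chunk and get back every line it completed (terminators removed).
    mutating func feed(_ chunk: Data) -> [Data] {
        var lines: [Data] = []
        for byte in chunk {
            if previousWasCR {
                previousWasCR = false
                // LF right after CR completes a CRLF whose line was already emitted.
                if byte == 10 { continue }
            }
            switch byte {
            case 10:                    // LF
                lines.append(pending)
                pending = Data()
            case 13:                    // CR: emit now, swallow a following LF later
                lines.append(pending)
                pending = Data()
                previousWasCR = true
            default:
                pending.append(byte)
            }
        }
        return lines
    }

    /// Call once after the last chunk to get an unterminated final line, if any.
    mutating func finish() -> Data? {
        defer { pending = Data(); previousWasCR = false }
        return pending.isEmpty ? nil : pending
    }
}
```

Emitting the line as soon as a CR is seen avoids having to look ahead into the next chunk. The price is the per-byte loop, which is likely slower than the range(of:) searches used above.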
I wanted a version that did not continually modify the buffer or duplicate code, as both are inefficient, and would allow for any size buffer (including 1 byte) and any delimiter. It has one public method: readline(). Calling this method will return the String value of the next line or nil at EOF.
import Foundation
// LineStream(): path: String, [buffSize: Int], [delim: String] -> nil | String
// ============= --------------------------------------------------------------
// path: the path to a text file to be parsed
// buffSize: an optional buffer size, (1...); default is 4096
// delim: an optional delimiter String; default is "\n"
// ***************************************************************************
class LineStream {
let path: String
let handle: NSFileHandle!
let delim: NSData!
let encoding: NSStringEncoding
var buffer = NSData()
var buffSize: Int
var buffIndex = 0
var buffEndIndex = 0
init?(path: String,
buffSize: Int = 4096,
delim: String = "\n",
encoding: NSStringEncoding = NSUTF8StringEncoding)
{
self.handle = NSFileHandle(forReadingAtPath: path)
self.path = path
self.buffSize = buffSize < 1 ? 1 : buffSize
self.encoding = encoding
self.delim = delim.dataUsingEncoding(encoding)
if handle == nil || self.delim == nil {
print("ERROR initializing LineStream") /* TODO use STDERR */
return nil
}
}
// PRIVATE
// fillBuffer(): _ -> Int [0...buffSize]
// ============= -------- ..............
// Fill the buffer with new data; return with the buffer size, or zero
// upon reaching end-of-file
// *********************************************************************
private func fillBuffer() -> Int {
buffer = handle.readDataOfLength(buffSize)
buffIndex = 0
buffEndIndex = buffer.length
return buffEndIndex
}
// PRIVATE
// delimLocation(): _ -> Int? nil | [0..<buffSize]
// ================ --------- ....................
// Search the remaining buffer for a delimiter; return with the location
// of a delimiter in the buffer, or nil if one is not found.
// ***********************************************************************
private func delimLocation() -> Int? {
let searchRange = NSMakeRange(buffIndex, buffEndIndex - buffIndex)
let rangeToDelim = buffer.rangeOfData(delim,
options: [], range: searchRange)
return rangeToDelim.location == NSNotFound
? nil
: rangeToDelim.location
}
// PRIVATE
// dataStrValue(): NSData -> String ("" | String)
// =============== ---------------- .............
// Attempt to convert data into a String value using the supplied encoding;
// return the String value or empty string if the conversion fails.
// ***********************************************************************
private func dataStrValue(data: NSData) -> String? {
if let strVal = NSString(data: data, encoding: encoding) as? String {
return strVal
} else { return "" }
}
// PUBLIC
// readLine(): _ -> String? nil | String
// =========== ____________ ............
// Read the next line of the file, i.e., up to the next delimiter or end-of-
// file, whichever occurs first; return the String value of the data found,
// or nil upon reaching end-of-file.
// *************************************************************************
func readLine() -> String? {
guard let line = NSMutableData(capacity: buffSize) else {
print("ERROR setting line")
exit(EXIT_FAILURE)
}
// Loop until a delimiter is found, or end-of-file is reached
var delimFound = false
while !delimFound {
// buffIndex will equal buffEndIndex in three situations, resulting
// in a (re)filling of the buffer:
// 1. Upon the initial call;
// 2. If a search for a delimiter has failed
// 3. If a delimiter is found at the end of the buffer
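if buffIndex == buffEndIndex {
if fillBuffer() == 0 {
// End-of-file: return any data already gathered (a final line
// with no trailing delimiter); otherwise signal EOF with nil
return line.length > 0 ? dataStrValue(line) : nil
}
}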
var lengthToDelim: Int
let startIndex = buffIndex
// Find a length of data to place into the line buffer to be
// returned; reset buffIndex
if let delim = delimLocation() {
// SOME VALUE when a delimiter is found; append that amount of
// data onto the line buffer, and then return the line buffer
delimFound = true
lengthToDelim = delim - buffIndex
// Skip the whole delimiter, not just one byte; this will trigger a
// refill on the next call if it lands at the end of the buffer.
// NOTE: a multi-byte delimiter split across two buffer reads is
// still not detected by delimLocation().
buffIndex = delim + self.delim.length
} else {
// NIL if no delimiter left in the buffer; append the rest of
// the buffer onto the line buffer, refill the buffer, and
// continue looking
lengthToDelim = buffEndIndex - buffIndex
buffIndex = buffEndIndex // will trigger a refill of buffer
// on the next loop
}
line.appendData(buffer.subdataWithRange(
NSMakeRange(startIndex, lengthToDelim)))
}
return dataStrValue(line)
}
}
It is called as follows:
guard let myStream = LineStream(path: "/path/to/file.txt")
else { exit(EXIT_FAILURE) }
while let s = myStream.readLine() {
print(s)
}
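For readers on current Swift, here is a rough sketch of the same idea using FileHandle and Data. It is not a port of the class above: the name LineReader is hypothetical, it assumes UTF-8 text, and it deliberately trades the "never modify the buffer" goal for simplicity by trimming a growing buffer. In exchange, searching the accumulated buffer means a multi-byte delimiter that straddles two reads is still found.

```swift
import Foundation

// Hypothetical modern-Swift sketch; assumes UTF-8 content.
final class LineReader {
    private let handle: FileHandle
    private let delimiter: Data
    private let chunkSize: Int
    private var buffer = Data()   // unread bytes carried between calls
    private var atEOF = false

    init?(path: String, delimiter: String = "\n", chunkSize: Int = 4096) {
        guard let handle = FileHandle(forReadingAtPath: path),
              let delimiterData = delimiter.data(using: .utf8),
              !delimiterData.isEmpty else { return nil }
        self.handle = handle
        self.delimiter = delimiterData
        self.chunkSize = max(1, chunkSize)
    }

    deinit { handle.closeFile() }

    // Returns the next line without its delimiter, or nil at end-of-file.
    // Note: nil is also returned if the bytes are not valid UTF-8.
    func readLine() -> String? {
        while true {
            if let range = buffer.range(of: delimiter) {
                let lineData = buffer.subdata(in: buffer.startIndex..<range.lowerBound)
                buffer.removeSubrange(buffer.startIndex..<range.upperBound)
                return String(data: lineData, encoding: .utf8)
            }
            if atEOF {
                // Hand back a final line that has no trailing delimiter.
                guard !buffer.isEmpty else { return nil }
                defer { buffer.removeAll() }
                return String(data: buffer, encoding: .utf8)
            }
            let chunk = handle.readData(ofLength: chunkSize)
            if chunk.isEmpty { atEOF = true } else { buffer.append(chunk) }
        }
    }
}
```

Usage mirrors the original: create it with a path, then loop with while let line = reader.readLine().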
I have buttons that, when pressed, will call/message a number from an array, i.e. button1 will call the number at index 0 of the array, button2 at index 1, etc. For some reason, whenever the number from the array is in a format other than xxx-xxx-xxx (e.g. (xxx) xxx-xxx), it crashes. And yet, the log gives me the following error even though the array isn't nil:
Anyone know why this is happening?
Here is the code for everything:
import UIKit
import AddressBook
var contactInfo: [String] = []
[...]
override func viewDidLoad() {
super.viewDidLoad()
//this is the function that grabs the array from an app group
setUpCallMessageButtons()
[...]
callButton1.addTarget(self, action: "call:", forControlEvents: UIControlEvents.TouchUpInside)
}
func call(sender:UIButton!)
{
if (sender == callButton1) {
println("\(contactInfo)")
var url:NSURL? = NSURL(string: "tel:\(contactInfo[0])")
self.extensionContext?.openURL(url!, completionHandler:{(success: Bool) -> Void in
})
}
}
func setUpCallMessageButtons(){
let appGroupID = "**redacted**"
let defaults = NSUserDefaults(suiteName: appGroupID)
contactInfo = (defaults!.objectForKey("contactInfo") as! [String])
println("\(contactInfo)")
// This gives the log down below. As you can see, none are nil.
}
Buttons 1,2 and 5 work while 3 and 4 always crash.
My guess is that if the phone number isn't formatted correctly, the call to convert it to an NSURL is failing and returning nil.
You probably need to wrap your call to openURL in an optional binding ("if let") block:
var url:NSURL? = NSURL(string: "tel:\(contactInfo[0])")
if let url = url
{
// url is already unwrapped inside this block, so no "!" is needed
self.extensionContext?.openURL(url,
completionHandler:
{
(success: Bool) -> Void in
})
}
else
{
println("Phone number \(contactInfo[0]) is not in a valid format")
}
You might want to strip away parentheses from your phone number before trying to create your URL. A simple way would be to use the NSString method stringByReplacingOccurrencesOfString:withString:.
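As a rough illustration of that suggestion in current Swift, where the method is spelled replacingOccurrences(of:with:): the helper name telURL(for:) and the list of characters to remove are assumptions made for this sketch, not part of the original code.

```swift
import Foundation

// Hypothetical helper: removes parentheses and spaces, then builds a tel: URL.
// Returns nil if the cleaned string still cannot form a URL.
func telURL(for rawNumber: String) -> URL? {
    var cleaned = rawNumber
    for unwanted in ["(", ")", " "] {
        cleaned = cleaned.replacingOccurrences(of: unwanted, with: "")
    }
    return URL(string: "tel:\(cleaned)")
}

// Optional binding avoids the crash from force-unwrapping a nil URL.
if let url = telURL(for: "(555) 111-2222") {
    print(url.absoluteString)
}
```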
Here's a little storyboard - which shows you where the nil is coming from
Unexpectedly found nil means there is a variable which is expected to be non-nil but at run time was nil
This is the line of code that is causing the issue
self.extensionContext?.openURL(url!, completionHandler:{(success: Bool)
It expects url to be non-nil (i.e. the !) but it is definitely nil (see image)
If this data comes from the user or from the internet, you might want a method to strip away all non-numeric characters. Something like this (from a working playground I just banged out) :
import UIKit
func digitsOnly(#fromString: String) -> String
{
var workString = NSMutableString(string: fromString)
let digitsSet = NSCharacterSet.decimalDigitCharacterSet()
var index: Int
for index = count(fromString)-1; index>=0; index--
{
if !digitsSet.characterIsMember(workString.characterAtIndex(index))
{
workString.deleteCharactersInRange(NSRange(location:index, length:1))
}
}
return workString as String
}
let testString = "(555) 111-2222"
let result = digitsOnly(fromString:testString)
println("digitsOnly(\"\(testString)\") = \"\(result)\" ")
This displays:
digitsOnly("(555) 111-2222") = "5551112222"
Edit:
Or alternately a more Swift-like version of the same function:
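func digitsOnly(#fromString: String) -> String
{
var result = String()
let digitsSet = NSCharacterSet.decimalDigitCharacterSet()
// characterIsMember() takes a unichar, so walk the UTF-16 code units
for codeUnit in fromString.utf16
{
if digitsSet.characterIsMember(codeUnit)
{
result.append(UnicodeScalar(codeUnit))
}
}
return result
}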
EDIT #2:
You can increase the set of characters that is left in place by changing the character set you use. Replace the line
let digitsSet = NSCharacterSet.decimalDigitCharacterSet()
With
let digitsSet = NSCharacterSet(charactersInString: "0123456789+-")
To preserve "+" signs and dashes. (Edit the string to include the characters you need.)
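In current Swift the same filtering can be sketched without NSCharacterSet at all. The function name keepCharacters(in:allowed:) and its default argument are hypothetical; the idea is just that the set of characters to keep is passed in as a string, as in the edit above.

```swift
// Hypothetical helper: keeps only the characters listed in `allowed`.
func keepCharacters(in input: String, allowed: String = "0123456789") -> String {
    let allowedSet = Set(allowed)   // Set<Character> for fast membership checks
    return input.filter { allowedSet.contains($0) }
}

let digits = keepCharacters(in: "(555) 111-2222")
// digits == "5551112222"
let dialable = keepCharacters(in: "+1 (555) 111-2222", allowed: "0123456789+-")
// dialable == "+1555111-2222"
```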
Given the name of a file in the bundle, I want to load the file into my Swift app. So I need to use this method:
let soundURL = NSBundle.mainBundle().URLForResource(fname, withExtension: ext)
For whatever reason, the method needs the filename separated from the file extension. Fine, it's easy enough to separate the two in most languages. But so far I'm not finding it to be so in Swift.
So here is what I have:
var rt: String.Index = fileName.rangeOfString(".", options:NSStringCompareOptions.BackwardsSearch)
var fname: String = fileName .substringToIndex(rt)
var ext = fileName.substringFromIndex(rt)
If I don't include the typing on the first line, I get errors on the two subsequent lines. With it, I'm getting an error on the first line:
Cannot convert the expression's type '(UnicodeScalarLiteralConvertible, options: NSStringCompareOptions)' to type 'UnicodeScalarLiteralConvertible'
How can I split the filename from the extension? Is there some elegant way to do this?
I was all excited about Swift because it seemed like a much more elegant language than Objective C. But now I'm finding that it has its own cumbersomeness.
Second attempt: I decided to make my own string-search method:
func rfind(haystack: String, needle: Character) -> Int {
var a = Array(haystack)
for var i = a.count - 1; i >= 0; i-- {
println(a[i])
if a[i] == needle {
println(i)
return i;
}
}
return -1
}
But now I get an error on the line var rt: String.Index = rfind(fileName, needle: "."):
'Int' is not convertible to 'String.Index'
Without the cast, I get an error on the two subsequent lines.
Can anyone help me to split this filename and extension?
Swift 5.0 update:
As pointed out in the comment, you can use this.
let filename: NSString = "bottom_bar.png"
let pathExtension = filename.pathExtension
let pathPrefix = filename.deletingPathExtension
This is with Swift 2, Xcode 7: If you have the filename with the extension already on it, then you can pass the full filename in as the first parameter and a blank string as the second parameter:
let soundURL = NSBundle.mainBundle()
.URLForResource("soundfile.ext", withExtension: "")
Alternatively nil as the extension parameter also works.
If you have a URL, and you want to get the name of the file itself for some reason, then you can do this:
soundURL.URLByDeletingPathExtension?.lastPathComponent
Swift 4
let soundURL = Bundle.main.url(forResource: "soundfile.ext", withExtension: "")
soundURL?.deletingPathExtension().lastPathComponent
Works in Swift 5. Adding these behaviors to String class:
extension String {
func fileName() -> String {
return URL(fileURLWithPath: self).deletingPathExtension().lastPathComponent
}
func fileExtension() -> String {
return URL(fileURLWithPath: self).pathExtension
}
}
Example:
let file = "image.png"
let fileNameWithoutExtension = file.fileName()
let fileExtension = file.fileExtension()
Solution Swift 4
This solution will work for all instances and does not depend on manually parsing the string.
let path = "/Some/Random/Path/To/This.Strange.File.txt"
let fileName = URL(fileURLWithPath: path).deletingPathExtension().lastPathComponent
Swift.print(fileName)
The resulting output will be
This.Strange.File
In Swift 2.1 String.pathExtension is not available anymore. Instead you need to determine it through NSURL conversion:
NSURL(fileURLWithPath: filePath).pathExtension
In Swift you can cast to NSString to get the extension more directly:
extension String {
func getPathExtension() -> String {
return (self as NSString).pathExtension
}
}
Latest Swift 4.2 works like this:
extension String {
func fileName() -> String {
return URL(fileURLWithPath: self).deletingPathExtension().lastPathComponent
}
func fileExtension() -> String {
return URL(fileURLWithPath: self).pathExtension
}
}
In Swift 2.1, it seems that the current way to do this is:
let filename = fileURL.URLByDeletingPathExtension?.lastPathComponent
let ext = fileURL.pathExtension // "extension" is a reserved word, so use another name
Swift 5 with code sugar
extension String {
var fileName: String {
URL(fileURLWithPath: self).deletingPathExtension().lastPathComponent
}
var fileExtension: String{
URL(fileURLWithPath: self).pathExtension
}
}
SWIFT 3.x Shortest Native Solution
let fileName:NSString = "the_file_name.mp3"
let onlyName = fileName.deletingPathExtension
let onlyExt = fileName.pathExtension
No extension or any extra stuff
(I've tested it. Based on @gabbler's solution for Swift 2.)
Swift 5
url.deletingPathExtension().lastPathComponent // where url is a URL value
Strings in Swift can definitely be tricky. If you want a pure Swift method, here's how I would do it:
Use find to find the last occurrence of a "." in the reverse of the string
Use advance to get the correct index of the "." in the original string
Use String's subscript function that takes an IntervalType to get the strings
Package this all up in a function that returns an optional tuple of the name and extension
Something like this:
func splitFilename(str: String) -> (name: String, ext: String)? {
if let rDotIdx = find(reverse(str), ".") {
let dotIdx = advance(str.endIndex, -rDotIdx)
let fname = str[str.startIndex..<advance(dotIdx, -1)]
let ext = str[dotIdx..<str.endIndex]
return (fname, ext)
}
return nil
}
Which would be used like:
let str = "/Users/me/Documents/Something.something/text.txt"
if let split = splitFilename(str) {
println(split.name)
println(split.ext)
}
Which outputs:
/Users/me/Documents/Something.something/text
txt
Or, just use the already available NSString methods like pathExtension and stringByDeletingPathExtension.
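The free functions find, reverse, and advance used above were removed in later Swift versions. A minimal sketch of the same split for Swift 5, assuming lastIndex(of:) is an acceptable stand-in for the reverse-and-find step, might look like this (the unlabeled parameter is a small change from the original signature):

```swift
// Sketch: split at the last "." and return nil when there is no dot at all.
func splitFilename(_ str: String) -> (name: String, ext: String)? {
    guard let dotIdx = str.lastIndex(of: ".") else { return nil }
    let name = String(str[..<dotIdx])                   // everything before the dot
    let ext = String(str[str.index(after: dotIdx)...])  // everything after the dot
    return (name, ext)
}

if let split = splitFilename("/Users/me/Documents/Something.something/text.txt") {
    print(split.name)
    print(split.ext)
}
```

Like the original, this works purely on the string, so a dot in a directory name is treated as an extension separator when the file itself has no extension.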
Swift 5
URL(string: filePath)?.pathExtension
Try this for a simple Swift 4 solution
extension String {
func stripExtension(_ extensionSeperator: Character = ".") -> String {
let selfReversed = self.reversed()
guard let extensionPosition = selfReversed.index(of: extensionSeperator) else { return self }
return String(self[..<self.index(before: (extensionPosition.base.samePosition(in: self)!))])
}
}
print("hello.there.world".stripExtension())
// prints "hello.there"
Swift 3.0
let sourcePath = NSURL(string: fnName)?.pathExtension
let pathPrefix = fnName.replacingOccurrences(of: "." + sourcePath!, with: "")
Swift 3.x extended solution:
extension String {
func lastPathComponent(withExtension: Bool = true) -> String {
let lpc = self.nsString.lastPathComponent
return withExtension ? lpc : lpc.nsString.deletingPathExtension
}
var nsString: NSString {
return NSString(string: self)
}
}
let path = "/very/long/path/to/filename_v123.456.plist"
let filename = path.lastPathComponent(withExtension: false)
filename constant now contains "filename_v123.456"
A better way (or at least an alternative in Swift 2.0) is to use the String pathComponents property. This splits the pathname into an array of strings, e.g.:
if let pathComponents = filePath.pathComponents {
if let last = pathComponents.last {
print(" The last component is \(last)") // This is the file name, including its extension
// Getting the last but one component is a bit harder
// Note the edge case of a string with no delimiters!
}
}
// Otherwise you're out of luck, this wasn't a path name!
They got rid of pathExtension for whatever reason.
let str = "Hello/this/is/a/filepath/file.ext"
let l = str.componentsSeparatedByString("/")
let file = l.last?.componentsSeparatedByString(".")[0]
let ext = l.last?.componentsSeparatedByString(".")[1]
A cleaned up answer for Swift 4 with an extension off of PHAsset:
import Photos
extension PHAsset {
var originalFilename: String? {
if #available(iOS 9.0, *),
let resource = PHAssetResource.assetResources(for: self).first {
return resource.originalFilename
}
return value(forKey: "filename") as? String
}
}
As noted in Xcode, the originalFilename is the name of the asset at the time it was created or imported.
Maybe I'm too late for this, but a solution that worked for me, and that I consider quite simple, is using the #file compiler directive. Here is an example where I have a class FixtureManager, defined in FixtureManager.swift inside the /Tests/MyProjectTests/Fixtures directory. This works both in Xcode and with swift test.
import Foundation
final class FixtureManager {
static let fixturesDirectory = URL(fileURLWithPath: #file).deletingLastPathComponent()
func loadFixture(in fixturePath: String) throws -> Data {
return try Data(contentsOf: fixtureUrl(for: fixturePath))
}
func fixtureUrl(for fixturePath: String) -> URL {
return FixtureManager.fixturesDirectory.appendingPathComponent(fixturePath)
}
func save<T: Encodable>(object: T, in fixturePath: String) throws {
let data = try JSONEncoder().encode(object)
try data.write(to: fixtureUrl(for: fixturePath))
}
func loadFixture<T: Decodable>(in fixturePath: String, as decodableType: T.Type) throws -> T {
let data = try loadFixture(in: fixturePath)
return try JSONDecoder().decode(decodableType, from: data)
}
}
Creates a unique "file name" from a URL, including the two preceding folders:
func createFileNameFromURL (colorUrl: URL) -> String {
var arrayFolders = colorUrl.pathComponents
// -3 because last element from url is "file name" and 2 previous are folders on server
let indx = arrayFolders.count - 3
var fileName = ""
switch indx{
case 0...:
fileName = arrayFolders[indx] + arrayFolders[indx+1] + arrayFolders[indx+2]
case -1:
fileName = arrayFolders[indx+1] + arrayFolders[indx+2]
case -2:
fileName = arrayFolders[indx+2]
default:
break
}
return fileName
}
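If the goal is simply "join up to the last three path components", the switch can probably be collapsed with suffix(_:). This is a sketch of that idea; the function name compactFileName(from:keeping:) and the keeping parameter are made up for illustration.

```swift
import Foundation

// Hypothetical compact variant: joins the last `count` path components.
// suffix(_:) returns fewer elements when the path is shorter, which covers
// the -1 and -2 cases of the switch without extra branches.
func compactFileName(from url: URL, keeping count: Int = 3) -> String {
    return url.pathComponents.suffix(count).joined()
}

let name = compactFileName(from: URL(fileURLWithPath: "/srv/colors/red/swatch.png"))
// name == "colorsredswatch.png"
```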