String handling not working — am I going nuts?


Rick Aurbach
 

Can you take a look at this for me? I think I’ve been cooped up at home too long, because whatever is going on, I’m missing it.
 
Context:
I’m working on a data import function for a new app. It looks like I’ve set things up correctly, since if I drop the file (from my desktop to the Simulator), the sceneDelegate’s scene(_:openURLContexts:) method is called. I set up an operation queue to handle the input in the background and instantiate the Operation subclass, passing it a URLContext.
 
In the Operation subclass’s main() method, I do the following:
 
do {
    var str = try String(contentsOf: urlContext.url) // (1)
    str.removeAll { $0 == "\r" } // (2)
    let lines = str.split(separator: "\n") // (3)
 
The file is UTF-8, with lines terminated in “\r\n”.
 
After executing (1), the debugger shows:
 
str String "sugar,NO\r\n1/1/20,154\r\n1/2/20,141\r\n1/3/20,147\r\n1/4/20,154\r\n1/5/20,172\r\n1/6/20,170\r\n1/7/20,164\r\n1/8/20,201\r\n1/9/20,163\r\n1/10/20,171\r\n1/11/20,209\r\n1/12/20,202\r\n1/13/20,164\r\n1/14/20,232\r\n1/15/20,182\r\n1/16/20,211\r\n1/17/20,175\r\n1/18/20,174\r\n1/19/20,147\r\n1/20/20,148\r\n1/21/20,150\r\n1/22/20,141\r\n1/23/20,165\r\n1/24/20,169\r\n1/25/20,260\r\n1/26/20,168\r\n1/27/20,179\r\n1/28/20,160\r\n1/29/20,155\r\n1/30/20,181\r\n1/31/20,156\r\n2/1/20,178\r\n2/2/20,202\r\n2/3/20,188\r\n2/4/20,175\r\n2/5/20,157\r\n2/6/20,166\r\n2/7/20,165\r\n2/8/20,146\r\n2/9/20,192\r\n2/10/20,178\r\n2/11/20,172\r\n2/12/20,180\r\n2/13/20,140"
 
which is what I expect.
 
After executing (2), the string is UNCHANGED (i.e., the “\r” characters were not removed).
 
After executing (3), the ‘lines’ variable contains 1 item, which is the whole string (with both the “\r” and “\n” characters present).
 
I’m going nuts here. What’s going on??

Cheers,

Rick Aurbach


Marco S Hyman
 

On Apr 6, 2020, at 3:29 PM, Rick Aurbach via groups.io <rlaurb=me.com@groups.io> wrote:

The file is UTF8, with lines terminated in “\r\n”
Are the lines terminated with a carriage return/line feed pair, or with the literal character string "\r\n"? The results of steps 2 and 3 are what I’d expect if the string contained the character string \r\n instead of a carriage return/line feed pair.
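
One way to tell the two cases apart is to inspect the string's Unicode scalars instead of relying on the debugger's escaped display. The following is only a sketch: `LineEndingKind` and `lineEndingKind(of:)` are hypothetical names made up for illustration and do not appear in the original code.

```swift
import Foundation

// Hypothetical result type, for illustration only.
enum LineEndingKind {
    case realCRLF       // U+000D followed by U+000A
    case literalEscape  // the four characters backslash, r, backslash, n
    case neither
}

// Hypothetical helper: reports which kind of "\r\n" the text contains.
func lineEndingKind(of text: String) -> LineEndingKind {
    let scalars = Array(text.unicodeScalars)
    // Look for a carriage return scalar immediately followed by a line feed scalar.
    let hasRealCRLF = zip(scalars, scalars.dropFirst()).contains {
        $0.0.value == 0x0D && $0.1.value == 0x0A
    }
    if hasRealCRLF { return .realCRLF }
    // "\\r\\n" is the literal text: backslash, r, backslash, n.
    if text.contains("\\r\\n") { return .literalEscape }
    return .neither
}
```

Calling `lineEndingKind(of: str)` right after step (1) would show which of the two situations applies.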


 

Doesn't Swift's String class have higher-level methods to break a string by lines? NSString certainly did.

—Jens


Marco S Hyman
 

do {
    var str = try String(contentsOf: urlContext.url) // (1)
    str.removeAll { $0 == "\r" } // (2)
    let lines = str.split(separator: "\n") // (3)
Oh. Yeah. That’s too much work.


var str = try String(contentsOf: urlContext.url) // (1)
let lines = str.split(separator: "\r\n") // (3)

works and is faster in that you’ve removed an O(n) operation.

I created your test file with CRLF line endings (thank you, BBEdit) and tried it using this code in a playground:

import Foundation

let url = URL(fileURLWithPath: "/path/to.foo.txt")
// try? yields nil if the file cannot be read
if let str = try? String(contentsOf: url) {
    let lines = str.split(separator: "\r\n") // (3)
    print(lines)
}

Marc
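
A likely reason the "\r\n" separator works where "\n" does not (nobody in the thread confirms this, so treat it as an assumption about the cause): Swift's String is a collection of Characters, each an extended grapheme cluster, and a carriage return followed by a line feed forms a single Character. A minimal sketch, with `sample`, `crlf`, and `lf` as illustrative names and a shortened copy of the test data:

```swift
let sample = "sugar,NO\r\n1/1/20,154\r\n1/2/20,141"

// CR followed by LF is one Character made of two Unicode scalars.
let crlf: Character = "\r\n"
let lf: Character = "\n"
print(crlf.unicodeScalars.count)                 // 2

// No Character in the sample equals a lone "\r" or a lone "\n" ...
print(sample.filter { $0 == "\r" }.count)        // 0
print(sample.split(separator: lf).count)         // 1

// ... but the combined Character does match.
print(sample.split(separator: crlf).count)       // 3
```

Under that reading, steps (2) and (3) of the original code compare each Character against "\r" or "\n" and never find a match, which is consistent with the unchanged string and the single-element array reported above.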


Rick Aurbach
 

Ok, I still don’t understand what’s going on, but I’ve got an alternative solution. (Maybe somebody can clarify this for me??)

WHAT DOESN’T WORK IN MY CONTEXT:

do {
    var str = try String(contentsOf: urlContext.url)
    str.removeAll { $0 == "\r" }
    let lines = str.split(separator: "\n")
    …
} catch {}

BUT THE FOLLOWING WORKS NICELY:

var lines = [String]()
do {
    let str = try String(contentsOf: urlContext.url)
    str.enumerateLines { (line, stop) in
        lines.append(line)
        stop = false
    }
} catch {}

Note: since enumerateLines is not labeled ‘throws’, Xcode complains if I try to throw from its closure. So instead of working one line at a time within the closure, I need to build a line array and then process it (one line at a time) in a second do-block.

So I have a solution, but I don’t understand why the original code didn’t work. Any insights?

Cheers,

Rick Aurbach
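
The two-pass pattern described in this post (collect lines in the non-throwing closure, then do the throwing work afterwards) might look roughly like the sketch below. `RowError`, `parseRow`, and `parseRows` are hypothetical names, and the "date,value" row format is only inferred from the sample data shown earlier in the thread.

```swift
import Foundation

// Hypothetical error type, for illustration only.
struct RowError: Error {
    let line: String
}

// Hypothetical parser for one "date,value" row.
func parseRow(_ line: String) throws -> (date: String, value: Int) {
    let fields = line.split(separator: ",")
    guard fields.count == 2, let value = Int(fields[1]) else {
        throw RowError(line: line)
    }
    return (String(fields[0]), value)
}

func parseRows(from text: String) throws -> [(date: String, value: Int)] {
    var lines = [String]()
    // The enumerateLines closure cannot throw, so it only collects lines.
    text.enumerateLines { line, _ in
        lines.append(line)
    }
    // The throwing work happens here, outside the closure.
    return try lines.map(parseRow)
}
```

A header row such as "sugar,NO" would make `parseRow` throw in this sketch, so a real importer would need to skip or handle it separately.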


 



On Apr 6, 2020, at 4:53 PM, Marco S Hyman <marc@...> wrote:

var str = try String(contentsOf: urlContext.url) // (1)
let lines = str.split(separator: "\r\n") // (3)

works and is faster in that you’ve removed an O(n) operation.

You've also removed support for Unix line endings. Now it only supports Windows line endings; probably a bad idea.
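
If both Unix and Windows line endings need to be accepted while still using split, one option is to split on any newline Character. This is a sketch rather than anything proposed in the thread; `splitIntoLines` is a hypothetical name. It relies on `Character.isNewline`, which is true for "\n", "\r", and the combined "\r\n" Character.

```swift
// Hypothetical helper: splits on any newline Character.
func splitIntoLines(_ text: String) -> [Substring] {
    // split omits empty subsequences by default, so blank lines are dropped.
    return text.split(whereSeparator: { $0.isNewline })
}

let mixed = "sugar,NO\r\n1/1/20,154\n1/2/20,141"
print(splitIntoLines(mixed).count)   // 3
```

Because empty subsequences are omitted by default, this variant silently drops blank lines; pass `omittingEmptySubsequences: false` if they matter.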



On Apr 7, 2020, at 10:30 AM, Rick Aurbach via groups.io <rlaurb@...> wrote:

So I have a solution, but I don’t understand why the original code didn’t work. Any insights?

I'm mystified too, but IMHO enumerateLines is a better way to do it.

—Jens
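
For anyone who wants to keep the original three-step approach: assuming the cause is that Swift treats a CR LF pair as a single Character (an assumption, since no one in the thread confirms it), the carriage returns can be removed from the string's Unicode scalar view, where CR and LF are separate elements. A minimal sketch using a shortened copy of the test data:

```swift
var str = "sugar,NO\r\n1/1/20,154\r\n1/2/20,141"

// In the unicodeScalars view, "\r" and "\n" are individual elements,
// so the carriage returns can be matched and removed.
str.unicodeScalars.removeAll { $0 == "\r" }

// With the "\r" scalars gone, each "\n" is a Character of its own.
let lines = str.split(separator: "\n")
print(lines.count)   // 3
```

This keeps the extra O(n) pass that was mentioned earlier, but it also accepts files that use plain "\n" line endings.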