Exact Swift replacement for NSString's boolValue method?


There are several threads on how to replicate NSString's boolValue method, but most are guesses or simplifications.

Apple only gives us this brief description:

This property is YES on encountering one of "Y", "y", "T", "t", or a digit 1-9—the method ignores any trailing characters. This property is NO if the receiver doesn’t begin with a valid decimal text representation of a number. The property assumes a decimal representation and skips whitespace at the beginning of the string. It also skips initial whitespace characters, or optional -/+ sign followed by zeroes.

So what does it consider as whitespace? What does "assumes a decimal representation" mean?

I am converting an existing Objective-C app to Swift, and would like a true Swift boolValue implementation that does not require first converting the string to an NSString (i.e., (string as NSString).boolValue).

David H On

UPDATE: As Duncan pointed out in the comments, whitespace includes much more than ASCII whitespace; the original code and test harness have been updated to reflect that.


It turns out that Apple's terse description is complete and correct! Whitespace is CharacterSet.whitespaces plus NULL. Runs of "0"s can optionally be prefixed by a "+" or "-".
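To make the whitespace definition concrete, here is a minimal sketch that builds the set described above and probes a few scalars. The name `skipped` and the list of probes are illustrative choices, not part of the original answer.

```swift
import Foundation

// The set the answer describes: CharacterSet.whitespaces plus NUL (U+0000).
let skipped = CharacterSet.whitespaces.union(CharacterSet(charactersIn: "\u{0000}"))

// (label, scalar) pairs to probe; chosen for illustration only.
let probes: [(String, Unicode.Scalar)] = [
    ("space", " "),
    ("tab", "\t"),
    ("no-break space", "\u{00A0}"),
    ("ideographic space", "\u{3000}"),
    ("NUL", "\u{0000}"),
    ("line feed", "\n"),
]

for (label, scalar) in probes {
    print(label, skipped.contains(scalar) ? "in set" : "not in set")
}
```

CharacterSet.whitespaces covers Unicode category Zs plus the tab character, so the line feed is the only probe that falls outside the set. NUL is a member only because of the explicit union.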

This Swift code exactly duplicates the NSString function (at least for strings made up of ASCII characters and the Unicode whitespace characters):

import Foundation

extension String {

    // Characters skipped at the start of the string: CharacterSet.whitespaces plus NUL.
    static let whiteSpaces: CharacterSet = {
        let asciiNull = CharacterSet(charactersIn: "\u{0000}")
        return CharacterSet.whitespaces.union(asciiNull)
    }()

    var boolValue: Bool {
        guard !self.isEmpty else { return false }

        let trueChar: Set<Character> = ["t", "T", "y", "Y", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
        let trueNum: Set<Character>  = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
        let zeroRun: Set<Character>  = ["+", "-", "0"]

        var index = self.startIndex
        let lastIndex = self.endIndex

        // Skip leading whitespace. A Character counts as whitespace only if
        // every one of its Unicode scalars is in the whitespace set.
        var c: Character
        repeat {
            c = self[index]
            index = self.index(after: index)
        } while c.unicodeScalars.allSatisfy({ Self.whiteSpaces.contains($0) }) && index < lastIndex

        if zeroRun.contains(c) && index < lastIndex {
            // c is a sign or a zero: skip the run of zeroes that follows,
            // then the first character after the run decides the result.
            repeat {
                c = self[index]
                index = self.index(after: index)
            } while c == "0" && index < lastIndex
            return trueNum.contains(c)
        } else {
            // No sign or zero prefix: the first non-whitespace character decides.
            return trueChar.contains(c)
        }
    }

}

The reason I can claim compatibility is that I created a test harness to compare the above code to the NSString boolValue method, and ensured that both returned the same value for 1,000,000,000 "fuzz" tests, using this code:

func test() {
    func random() -> String {
        // from https://stackoverflow.com/a/51200984/1633251 and https://stackoverflow.com/a/52133561/1633251
        let unicode: [String] = [
            "\u{00A0}",
            "\u{1680}",
            "\u{2000}",
            "\u{2001}",
            "\u{2002}",
            "\u{2003}",
            "\u{2004}",
            "\u{2005}",
            "\u{2006}",
            "\u{2007}",
            "\u{2008}",
            "\u{2009}",
            "\u{200A}",
            "\u{200B}",
            "\u{202F}",
            "\u{205F}",
            "\u{3000}",
            "\u{1D6A8}",
            "\u{CC791}",
        ]

        let val = Int.random(in: 0..<(128+unicode.count))

        let c: String
        if val < 128 {
            c = String(format: "%c", val)
        } else {
            c = unicode[val - 128]
        }

        return c
    }

    var i = 0
    while true {
        var s = ""
        let len = Int.random(in: 1..<16)
        for _ in 0..<len {
            s += random()
        }

        let oTest = (s as NSString).boolValue
        let sTest = s.boolValue
        if oTest != sTest {
            print("Testing \(s) oTest=\(oTest) sTest=\(sTest)")
            for c in s {
                var hex: String = ""
                for v in c.unicodeScalars {
                    hex += String(format: "%4.4X", v.value)
                }
                print("HEX:", String(format: "%@ char=%@", hex, String(c) ) )
            }
            fatalError()
        }

        i += 1
        if i % 1_000_000 == 0 {
            print("Count \(i)")
            if i == 1_000_000_000 {
                break
            } else {
                sleep(1)
            }
        }
    }
}

Note that the strings contain random characters in the range from ASCII 0 to 127 and the Unicode white spaces extracted from CharacterSet.whitespaces.
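Alongside the fuzz loop, a handful of fixed inputs can serve as a quick regression check. The sketch below is only an illustration. `sketchBoolValue` is a hypothetical stand-in that walks Unicode scalars rather than Characters. The expected values are read off Apple's description quoted in the question, plus the "+ 4" case mentioned in the Scanner version further down, and have not been verified against NSString here.

```swift
import Foundation

// Hypothetical stand-in implementation (not the answer's code): it follows
// the rules in Apple's description, scalar by scalar.
func sketchBoolValue(_ s: String) -> Bool {
    let skip = CharacterSet.whitespaces.union(CharacterSet(charactersIn: "\u{0000}"))
    let nonZeroDigits: ClosedRange<Unicode.Scalar> = "1"..."9"

    // Drop leading whitespace and NUL scalars.
    var rest = s.unicodeScalars.drop(while: { skip.contains($0) })
    guard let first = rest.first else { return false }
    if "tTyY".unicodeScalars.contains(first) { return true }

    // One optional sign, then any run of zeroes.
    if first == "+" || first == "-" { rest = rest.dropFirst() }
    rest = rest.drop(while: { $0 == "0" })

    // The next scalar must be a digit 1-9; anything after it is ignored.
    guard let digit = rest.first else { return false }
    return nonZeroDigits.contains(digit)
}

// Fixed cases; expectations are assumptions derived from the description.
let cases: [(input: String, expected: Bool)] = [
    ("Y", true), ("yes", true), ("true", true), ("1", true),
    ("  +007", true), ("\u{00A0}9abc", true), ("\u{0000}1", true),
    ("", false), ("0", false), ("-0", false), ("no", false), ("+ 4", false),
]

for (input, expected) in cases {
    let actual = sketchBoolValue(input)
    print(actual == expected ? "ok  " : "FAIL", input.debugDescription, actual)
}
```

On an Apple platform, the same table could be run against (input as NSString).boolValue to check the expectations themselves.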

Of course, my code failed several times before I got it right!

---

Based on Martin's comment, there is an open-source NSString implementation that has a boolValue, but it doesn't match the iOS version. However, a slight modification makes it match (changes commented):

import Foundation

extension String {

private static var whiteSpaces: CharacterSet = {
    var set = CharacterSet.whitespaces
    let asciiNull = CharacterSet(charactersIn: "\u{0000}")
    return set.union(asciiNull)
}()

// https://github.com/apple/swift-corelibs-foundation/blob/main/Sources/Foundation/NSString.swift#L666C33-L666C33

var boolValue: Bool {
    let scanner = Scanner(string: self)
    scanner.charactersToBeSkipped = nil // must do this or "+ 4" passes

    // skip initial whitespace if present
    let _ = scanner.scanCharacters(from: Self.whiteSpaces) // Need the null character

    if scanner.scanCharacters(from: CharacterSet(charactersIn: "tTyY")) != nil {
        return true
    }

    // scan a single optional '+' or '-' character, followed by zeroes
    if scanner.scanString("+") == nil {
        let _ = scanner.scanString("-")
    }

    // scan any following zeroes
    let _ = scanner.scanCharacters(from: CharacterSet(charactersIn: "0"))
    return scanner.scanCharacters(from: CharacterSet(charactersIn: "123456789")) != nil
}

}
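The `charactersToBeSkipped = nil` line in the Scanner version matters because a Scanner skips whitespace and newlines before each scan by default. The sketch below isolates that effect. `digitFollowsPlus` is a hypothetical helper written for this illustration, and it assumes the String-returning Scanner methods (macOS 10.15 / iOS 13 or later).

```swift
import Foundation

// Hypothetical helper: consumes an optional "+" and reports whether a
// digit 1-9 can be scanned immediately afterwards.
func digitFollowsPlus(_ text: String, skipWhitespace: Bool) -> Bool {
    let scanner = Scanner(string: text)
    if !skipWhitespace {
        // Turn off the default skipping of whitespace and newlines.
        scanner.charactersToBeSkipped = nil
    }
    _ = scanner.scanString("+")
    return scanner.scanCharacters(from: CharacterSet(charactersIn: "123456789")) != nil
}

// With default skipping, the space between "+" and "4" is silently consumed.
print(digitFollowsPlus("+ 4", skipWhitespace: true))
// With skipping disabled, the space stops the scan.
print(digitFollowsPlus("+ 4", skipWhitespace: false))
```

With skipping left on, "+ 4" would be accepted as a signed number, which is the mismatch the comment in the code above refers to.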