Word detection without separating spaces
Sandor Szatmari
You know how Apple apps can turn ‘itwas’ into ‘it was’ as you’re typing in a text field? Is there an API for this? I’d like to be able to take something like AREALLYLONGSTRING and turn it into ‘a really long string’ or ‘a_really_long_string’, or basically insert whatever separator I want.
Something like -(NSString*)componentsSeparatedByDictionaryWord:(NSString*)string; Does my magical imaginary API exist? I’d imagine it’s part of the autocorrection functionality…

Thanks,
Sandor

Trying hard not to reinvent the wheel
Ben Kennedy
On 27 Sep 2018, at 5:21 am, Sandor Szatmari <admin.szatmari.net@...> wrote:

Some quick googlage suggests that UITextChecker (iOS) and NSSpellChecker (Mac OS) might fit the bill...?

-ben
Sandor Szatmari
Yes, thanks! I have been playing with NSSpellChecker but haven’t been able to configure it with the magic sauce to get it to tokenize the way I am describing.
Sandor

On Sep 27, 2018, at 12:41, Ben Kennedy <ben-groups@...> wrote:
> Some quick googlage suggests that UITextChecker (iOS) and NSSpellChecker (Mac OS) might fit the bill...?
Jeremy Hughes
You could try taking the first n characters of the string (where n runs from 1 to the maximum word length) and spell-checking that substring. When you find a substring that spell-checks, continue with the next n characters, and so on. If you get to a point where you can’t find a word in the remaining characters, backtrack and look for a longer word in the previous characters.
Some strings might have multiple answers: ‘findale’ could be ‘find ale’ or ‘fin dale’.

Jeremy
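The backtracking search described above can be sketched roughly as follows. This is only an illustration in Python, not an existing Apple API. `SAMPLE_WORDS` and `is_word` are hypothetical stand-ins for a real spell-checker query (for instance, one backed by NSSpellChecker or UITextChecker), and `max_word_length` is an arbitrary cap. The sketch collects every valid split instead of stopping at the first one, because some strings are ambiguous.

```python
from typing import Callable, List

# Hypothetical stand-in for a spell checker's dictionary; a real
# implementation would ask the platform spell checker instead.
SAMPLE_WORDS = {"a", "really", "long", "string",
                "find", "ale", "fin", "dale", "it", "was"}


def is_word(candidate: str) -> bool:
    """Assumed predicate: True if the candidate 'spell-checks'."""
    return candidate.lower() in SAMPLE_WORDS


def segmentations(text: str,
                  check: Callable[[str], bool] = is_word,
                  max_word_length: int = 20) -> List[List[str]]:
    """Return every way to split text into words accepted by check."""
    if not text:
        return [[]]  # one way to split nothing: no words at all
    results = []
    # Try each prefix length n; recurse on what is left after the prefix.
    for n in range(1, min(max_word_length, len(text)) + 1):
        prefix = text[:n]
        if check(prefix):
            # An empty result for the remainder drops this prefix,
            # which is the backtracking step: a longer prefix is tried next.
            for rest in segmentations(text[n:], check, max_word_length):
                results.append([prefix] + rest)
    return results


def split_words(text: str, separator: str = " ") -> List[str]:
    """Join each segmentation with the caller's separator, lowercased."""
    return [separator.join(words).lower() for words in segmentations(text)]


if __name__ == "__main__":
    print(split_words("AREALLYLONGSTRING"))       # ['a really long string']
    print(split_words("AREALLYLONGSTRING", "_"))  # ['a_really_long_string']
    print(split_words("findale"))                 # ['fin dale', 'find ale']
```

Without memoization, the recursion can redo work on long inputs that have many valid splits. Caching results per remaining suffix would be the usual remedy. Choosing among several valid splits would need some ranking, such as word frequency, which this sketch does not attempt.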
NSLinguisticTagger provides a lot of operations like finding word boundaries and identifying parts of speech. I don’t know if it can be made to identify English words without spaces between them.* Apple has added some newer APIs that use machine learning to analyze natural language, but I can’t remember what the framework is called; maybe it has some support for that?

-Jens

* It does, however, detect word breaks in East Asian languages that don’t put any separator characters between words.