NSString stringWithContentsOfFile:usedEncoding:error:

Alexey Proskuryakov ap-carbon at rambler.ru
Mon Jun 20 01:35:36 PDT 2005


On 17.06.2005 19:59, "Fritz Anderson" <fritza at manoverboard.org> wrote:

>> What is the equivalent of NSString
>> stringWithContentsOfFile:usedEncoding:error: under 10.3.9? I need
>> to read in a file but I'm not sure what the encoding is. How do you
>> deal with this?
> 
> You've no doubt discovered that encoding-sniffing is new with Tiger.

  In my tests, the only encoding this method could sniff was UTF-16. It did
not detect UTF-8, ASCII, or Windows Cyrillic.

> Before this, people muddled through as best they could. You can load
> an NSData with the contents of the file and sniff the first four
> bytes of the file for Unicode byte-order markers (a file can still be
> UTF without a BOM). You can sniff the whole contents for bit 7 being
> set, and if it never is, pick ASCII.
> 
> After that, you guess based on your market. Mac Roman encoding is
> often the safest 8-bit encoding, though UTF-8 is taking over. ISO
> Latin-1, if you're dealing with Windows-origin text that isn't Unicode.

  Of course, guessing by market alone would render your product unusable
for many people. A preferences option plus a runtime override (in your Open
and Save dialogs) is a much more robust approach.

  By the way, TextEdit is an almost perfect example of how this should be
done.
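
  For the pre-Tiger case, the BOM and bit-7 checks Fritz describes above
could be sketched in plain C roughly as follows. This is only an
illustration, not anything from Apple's frameworks: the enum and the
function name sniff_encoding are hypothetical, and it assumes you already
have the file's bytes in memory (for example from NSData's -bytes and
-length).

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch; not a Cocoa type. */
typedef enum {
    ENC_UNKNOWN_8BIT,   /* high-bit bytes, no BOM: caller must guess or ask */
    ENC_ASCII,          /* no byte has bit 7 set */
    ENC_UTF8_BOM,
    ENC_UTF16_BE,
    ENC_UTF16_LE,
    ENC_UTF32_BE,
    ENC_UTF32_LE
} sniffed_encoding;

static sniffed_encoding sniff_encoding(const unsigned char *p, size_t n)
{
    size_t i;

    /* Check UTF-32 before UTF-16: FF FE 00 00 begins with the UTF-16 LE
       BOM. Assumption: a UTF-16 LE file whose first character is U+0000
       is rare enough to ignore here. */
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF)
        return ENC_UTF32_BE;
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00)
        return ENC_UTF32_LE;
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return ENC_UTF8_BOM;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return ENC_UTF16_BE;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return ENC_UTF16_LE;

    /* No BOM: scan the whole buffer for any byte with bit 7 set. */
    for (i = 0; i < n; i++) {
        if (p[i] & 0x80)
            return ENC_UNKNOWN_8BIT;
    }
    return ENC_ASCII;
}
```

  Note that ENC_UNKNOWN_8BIT is exactly the case where no amount of
sniffing helps (BOM-less UTF-8, Mac Roman, Windows Cyrillic and so on all
land there), which is where the preference and the per-file override come
in.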

- WBR, Alexey Proskuryakov




