Subject:
Re: [dcdev] New Encoding Scheme First
From:
eric
Date:
2003-10-28 3:51
To:
Direct Connect developers


> 6) UTF8 is the encoding of choice for me as well...it should suffice for
> now, although we should keep an eye open for 4-byte unicode support once
> that becomes popular (don't remember if utf8 can encode that or not)

UTF8 can encode every Unicode character (including the ones that need 2 or
4 bytes in other encodings), and it generally uses less space on Western
European text. For Asian languages, on the other hand, UTF16 is the better
choice in terms of space/bandwidth.
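
To make the size trade-off concrete, here is a small Python sketch. The sample strings and the helper name sizes() are arbitrary choices for illustration, not something from this thread:

```python
# Compare encoded sizes of a few sample strings (samples are illustrative).
def sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count without BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

western = "h\u00e9llo w\u00f6rld"      # 11 characters, 2 of them accented
japanese = "\u65e5\u672c\u8a9e"        # 3 CJK characters
beyond_bmp = "\U0001D11E"              # 1 character outside the 16-bit range

for label, text in [("western", western),
                    ("japanese", japanese),
                    ("beyond_bmp", beyond_bmp)]:
    utf8_len, utf16_len = sizes(text)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

On these samples UTF8 is smaller for the Western text, UTF16 is smaller for the Japanese text, and both need 4 bytes for the character outside the 16-bit range.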

Is UTF16 compatible with the standard strlen (& co.) functions? When UTF8 needs multiple bytes to encode a character, all of those bytes have a value >= 128. Does UTF16 have a similar property? If not, we may see some annoying bytes in the middle of strings (like \0 or even | ).
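
A quick way to check this is to look at the raw bytes. The Python sketch below is only an illustration: the helper names c_strlen() and has_ascii_bytes() are made up for this example, and little-endian UTF16 without a byte order mark is assumed.

```python
def c_strlen(data: bytes) -> int:
    """Mimic C strlen: count bytes up to the first zero byte."""
    pos = data.find(b"\x00")
    return len(data) if pos == -1 else pos

def has_ascii_bytes(data: bytes) -> bool:
    """True if any byte is below 128 (could be read as \\0, '|', etc.)."""
    return any(b < 128 for b in data)

# U+00E9 (e with acute accent): two bytes in UTF-8, both >= 128.
utf8_e = "\u00e9".encode("utf-8")        # C3 A9
# The same character in UTF-16LE contains a zero byte.
utf16_e = "\u00e9".encode("utf-16-le")   # E9 00
# U+017C (z with dot above) in UTF-16LE contains 0x7C, which is '|'.
utf16_z = "\u017c".encode("utf-16-le")   # 7C 01

print(has_ascii_bytes(utf8_e))                # False
print(b"\x00" in utf16_e)                     # True
print(b"|" in utf16_z)                        # True
print(c_strlen("abc".encode("utf-8")))        # 3
print(c_strlen("abc".encode("utf-16-le")))    # 1
```

So under these assumptions UTF16 does not have the property: plain ASCII text already contains zero bytes, and a byte equal to | can show up inside a non-ASCII character.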

Eric
DCTC/dchub

-- 
DC Developers mailinglist
http://3jane.ashpool.org/cgi-bin/mailman/listinfo/dcdev