Unicode
This would mean that nicknames and MyINFO content would be forced to UTF-8
(which I do not have a problem with, per se)
http://dc.selwerd.nl/hublist.xml.bz2
http://www.b.ali.btinternet.co.uk/DCPlusPlus/index.html (TheParanoidOne's DC++ Guide)
http://www.dslreports.com/faq/dc (BSOD2600's Direct Connect FAQ)
Well, it could be. I haven't discussed it or thought about it. It might be a good idea to require hub support for those things; that way the hub can tell every joining client that it should use UTF-8 in descr and nick.
If you want to make a client which doesn't need any hub support, it's conceivable (imo) to give the GUI some way of manually tagging a user as using UTF-8 or not. The user will probably know, after all, and if not she can try, which should clear things up (hopefully; otherwise your nick/descr is seriously screwy).
Making a heuristic to detect UTF-8 might also be possible, but it will probably not be 100% reliable. UTF-8 text with a lot of, for example, Asian characters will contain a very high proportion of bytes above 127.
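A minimal sketch of such a heuristic in Python, not taken from any existing client. The function name `classify_encoding` and its three labels are made up for illustration. It relies only on the fact that a strict UTF-8 decoder rejects ill-formed byte sequences.

```python
def classify_encoding(raw: bytes) -> str:
    """Guess how a raw nick/descr is encoded: 'ascii', 'utf8' or 'legacy'.

    Hypothetical helper for illustration only.
    """
    if all(b < 0x80 for b in raw):
        # Pure 7-bit text is the same in UTF-8 and in the legacy codepages,
        # so it tells us nothing about what the sender uses.
        return "ascii"
    try:
        raw.decode("utf-8")  # strict mode: raises on any ill-formed sequence
    except UnicodeDecodeError:
        return "legacy"
    # Well-formed UTF-8. A legacy string can still land here by coincidence.
    return "utf8"
```

This shows why the guess cannot be 100%: the two bytes 0xC3 0xA9 are "Ã©" in a Windows-1252 description but also a perfectly valid UTF-8 "é", so the function would report `utf8` for both.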
uhm.. I don't like heuristics and guesswork; either a hub forces UTF-8 or it doesn't
It would mean that if someone tries to name himself björn (in an IBM charset) on a hub, he'd be rejected because of the illegal 0x94, and even worse, those characters can get you booted when used in a MyINFO (in the description, for instance). This would confuse the hell out of people (read: arne wouldn't like it, for the same reason he rejected the original $UserIP).
...unless you were to break the UTF-8 spec and allow illegal sequences (correct me if I'm wrong on the specs, I merely glanced at the docs)
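To make the björn case concrete, here is a small Python sketch. It assumes "ibm-charset" means code page 437, where ö is the single byte 0x94; `hub_accepts` is a hypothetical check, not something an actual hub is known to implement.

```python
def hub_accepts(raw: bytes) -> bool:
    """Hypothetical hub-side rule: accept only well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ibm_nick = "björn".encode("cp437")   # ö becomes the lone byte 0x94
utf8_nick = "björn".encode("utf-8")  # ö becomes the two bytes 0xC3 0xB6

# 0x94 is a continuation byte with no lead byte in front of it,
# so a strict decoder refuses the IBM-encoded nick.
rejected = not hub_accepts(ibm_nick)

# A tolerant receiver could swap the bad byte for U+FFFD instead of rejecting.
lenient = ibm_nick.decode("utf-8", errors="replace")
```

With strict checking the user is turned away without knowing why; with the tolerant variant he gets in, but his nick shows up with a replacement character where the ö should be.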
Clients will have to deal with decoding errors, hubs won't. I was confused; I blame the heat.
Anyway, the problem is a client-side one. Clients can easily use UTF-8 without hub support, and without the hub even knowing. The problem is merely one of communicating the encoding in use to other clients; decoding problems arise when a client wrongly believes the encoding is UTF-8 when it's not. A decoding failure could itself be used as a flag that the other client isn't using UTF-8; it should be a sure sign. It would be nice to have a better method though.
We could use a bit of the speed byte in the descr for this; I believe there are a few bits left?
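A rough Python sketch of how a receiving client might combine the two ideas. Everything named here is an assumption: `UTF8_FLAG = 0x40` is a made-up bit that may or may not be free in the speed byte, and `LEGACY_CODEC` stands in for whatever local codepage the receiver would otherwise use.

```python
UTF8_FLAG = 0x40          # hypothetical: a spare bit in the speed byte
LEGACY_CODEC = "cp1252"   # assumption: the receiver's local codepage


def decode_field(raw: bytes, speed_byte: int) -> str:
    """Decode a nick or descr using the sender's advertised encoding."""
    if speed_byte & UTF8_FLAG:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # The sender claimed UTF-8 but the bytes say otherwise;
            # treat the failure as the sign that it is not UTF-8.
            pass
    # No flag, or the claim was wrong: fall back to the legacy codepage.
    return raw.decode(LEGACY_CODEC, errors="replace")
```

A client that never sets the bit is simply decoded as before, so old clients would keep working unchanged.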