Handling non-unicode users/hubs
How well does DC++ cope when a user is using a character encoding other than UTF-8? As far as I can tell, it assumes that the remote user is using the same encoding as your local system and converts on that basis. Is there any support for converting text from a different encoding to your local encoding?
The reason I'm asking is that, as you may or may not know, I am more or less in charge of the Linux port of DC++. We have discovered a need to specify, on a per-hub or maybe even per-user basis, which character encoding should be used when communicating with a user or hub that doesn't support UTF-8. As it is now, the program assumes the other party is using the same encoding as the user's native one, which causes a number of problems:
1) A lot of Linux distributions use UTF-8 locales now, and a UTF-8 locale is obviously no good as the source encoding when converting a foreign string TO UTF-8.
2) Russian has different encodings on Windows and Unix/Linux.
3) Maybe somebody just wants to be on differently encoded hubs, or speak with people who use different encodings, at the same time.
So my question is really: is there any interest in adding recoding functions to the main DC++? At least 3 seems like it might be a problem for you too, but like I said, I'm not 100% sure how Windows DC++ handles these things.
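To make problems 1) and 2) concrete, here is a small sketch in Python (not DC++ code; the helper name `decode_as` and the sample word are made up for illustration). It assumes a Windows client sending Russian text as CP1251 and a receiver that guesses KOI8-R or UTF-8 instead, which are the encodings commonly associated with those platforms.

```python
# Illustration only: the same bytes read under three different assumptions.
# "Привет" encoded the way a CP1251 (Windows Cyrillic) client would send it.
raw = "Привет".encode("cp1251")

def decode_as(data: bytes, encoding: str) -> str:
    """Decode data as `encoding`, marking undecodable bytes with U+FFFD."""
    return data.decode(encoding, errors="replace")

as_cp1251 = decode_as(raw, "cp1251")  # correct guess: original text comes back
as_koi8r = decode_as(raw, "koi8_r")   # wrong guess: valid but different Cyrillic letters
as_utf8 = decode_as(raw, "utf-8")     # UTF-8 locale guess: replacement characters

print(as_cp1251)
print(as_koi8r)
print(as_utf8)
```

The wrong guesses do not raise errors by themselves, which is why the receiver needs to be told the encoding rather than infer it from its own locale.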
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
Re: Handling non-unicode users/hubs
paskharen wrote: Is there any interest in adding recoding functions to the main DC++? At least 3 seems like it might be a problem for you to but like I said I'm not 100% sure how windows DC++ handles these things.
Unicode support is a feature of ADC. In order to try to get it into NMDC, you would have to resort to a lot of hackery. If your locales are Unicode, you have a problem, as it's safest to assume that remote users have the same locale as you do (this way Chinese hubs will work for Chinese users, Russian hubs will work for Russian users, etc.). This is the way DC++ does it, and it's unlikely to change. If you want Unicode support, go to ADC.
Just out of curiosity, are there many ADC hubs around? For example, is there a public hub list for ADC hubs? All hubs within my no limit zone are NMDC so I kind of have to use that.
To say something about solving the actual problem: I think making it possible to specify which encoding to convert from will do (like I said before). I'm not sure how much "hackery" I will have to resort to, but I think I'll try something like this for ldcpp in the future. Right now everyone who uses Unicode locales has a lot of problems, and just saying "only use ADC" is a bit harsh, I think.
-
- Posts: 164
- Joined: 2005-01-06 08:39
- Location: HU
bastya_elvtars wrote: It's not harsh, if you think it over. Or are DC++ developers responsible for a protocol that they just use/optimize? They aren't the creators of the NMDC protocol.
Dude, relax, I wasn't talking about the DC++ devs. But the linuxdc++ client is quite unusable as it is for people with a Unicode locale. And I don't think *I* can tell them "just use ADC"; in that case it would probably be better to just say "this client doesn't support UTF-8, don't use it".
paskharen wrote: For example, is there a public hub list for ADC hubs?
Yes. http://project.bandicoot.nl/
paskharen wrote: Right now everyone who uses Unicode locales has a lot of problems and just saying "only use ADC" is a bit harsh I think.
Yeah, well.
I understand what you're saying, since I've been through there before. However, ADC is a clean way to support this. And if you do implement something on NMDC, Linux dcpp will be the only client to support it.
GargoyleMT wrote: if you do implement something on NMDC, Linux dcpp will be the only client to support it.
I'm not talking about a protocol extension or anything like that, just a way for the user to say "when receiving data not in UTF-8, assume it is in this encoding" and then choose the encoding from a list, instead of the client assuming it is in the system locale. So I don't think it will matter that it's only in Linux dcpp.
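A minimal sketch of that idea in Python, assuming the receiver first tries strict UTF-8 and only then falls back to the encoding the user picked. The function name `decode_incoming` and its parameter are hypothetical and not taken from DC++ or the Linux port.

```python
def decode_incoming(data: bytes, fallback_encoding: str) -> str:
    """Return text for bytes received from a hub or user.

    Assumption for this sketch: valid UTF-8 is taken as UTF-8; anything else
    is decoded with the user-selected fallback instead of the system locale.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8, so trust the user's choice; undecodable bytes become U+FFFD.
        return data.decode(fallback_encoding, errors="replace")

# A KOI8-R sender and a UTF-8 sender both come out readable.
from_koi8r_peer = decode_incoming("Привет".encode("koi8_r"), "koi8_r")
from_utf8_peer = decode_incoming("Привет".encode("utf-8"), "koi8_r")
print(from_koi8r_peer, from_utf8_peer)
```

One caveat with trying UTF-8 first: a short legacy-encoded string can occasionally happen to be valid UTF-8, so a stricter variant would skip the first step and always use the chosen encoding.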
paskharen wrote: I'm not talking about a protocol extension or something like that, just a way for the user to say "when receiving data not in UTF-8, assume it is in this encoding", and then choosing encoding from a list. Instead of assuming it is in system locale. So I don't think it will matter that it's only in Linux dcpp.
True, that wasn't the idea I came away with earlier.
That is something that would make sense to track/override on a per-hub basis. I'm not sure how well that would work, since all the AcpToUtf code is in a central location.
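One way a per-hub override could coexist with a single central conversion routine is to have that routine look the encoding up by hub, as in the Python sketch below. Everything here is hypothetical (the `HubEncodings` class, its method names, the hub address); it shows the shape of the idea and says nothing about how DC++ actually organizes its conversion code.

```python
class HubEncodings:
    """Central conversion point with optional per-hub encoding overrides."""

    def __init__(self, default: str) -> None:
        self.default = default            # stands in for the system-locale encoding
        self.overrides: dict[str, str] = {}

    def set_override(self, hub: str, encoding: str) -> None:
        self.overrides[hub] = encoding

    def encoding_for(self, hub: str) -> str:
        return self.overrides.get(hub, self.default)

    def from_hub(self, hub: str, data: bytes) -> str:
        # Incoming bytes: decode with the hub's encoding.
        return data.decode(self.encoding_for(hub), errors="replace")

    def to_hub(self, hub: str, text: str) -> bytes:
        # Outgoing text: characters the hub's encoding lacks become "?".
        return text.encode(self.encoding_for(hub), errors="replace")


encodings = HubEncodings(default="cp1252")
encodings.set_override("hub.example.org:411", "koi8_r")  # made-up hub address

wire = encodings.to_hub("hub.example.org:411", "Привет")
print(encodings.from_hub("hub.example.org:411", wire))
```

Callers only pass the hub identifier, so the conversion code stays in one place while the encoding choice varies per hub.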