Handling non-unicode users/hubs

paskharen
Posts: 29
Joined: 2004-01-27 14:32

Post by paskharen » 2006-01-08 17:45

How well does DC++ handle a user who uses a character encoding other than UTF-8? As far as I can tell, it assumes that the remote user is using the same encoding as your local system and converts accordingly. Is there any support for converting text from a different encoding to your local encoding?

The reason I'm asking is that, as you may or may not know, I am more or less in charge of the Linux port of DC++. We have discovered a need to specify, on a per-hub or maybe even per-user basis, which character encoding to use when communicating with a user or hub that doesn't support UTF-8. As it is now, the program assumes the other party is using the same encoding as the user's native one, which causes a number of problems:

1) A lot of Linux distributions use UTF-8 locales now, so the local locale says nothing useful about how to convert a foreign, non-UTF-8 string TO UTF-8.

2) Russian has different encodings in windows and unix/linux.

3) Maybe somebody just wants to be on differently encoded hubs/speak with people who use different encodings at the same time.

So my question is really: is there any interest in adding recoding functions to the main DC++? At least 3 seems like it might be a problem for you too, but like I said, I'm not 100% sure how Windows DC++ handles these things.
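
To make problems 1 and 2 above concrete, here is a minimal sketch in Python (not DC++ code). The sample word and the choice of cp1251 and KOI8-R as the "Windows" and "Unix" Russian encodings are assumptions made for illustration.

```python
# Illustration only: one Russian word, sent by a Windows client in cp1251.
text = "Привет"
raw = text.encode("cp1251")      # bytes as they would travel over NMDC

# Problem 2: a receiver that assumes its own KOI8-R locale gets the wrong letters.
wrong = raw.decode("koi8_r")

# A receiver that knows the sender's encoding recovers the original text.
right = raw.decode("cp1251")

# Problem 1: a receiver with a UTF-8 locale cannot even decode the bytes.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

The same bytes give three different outcomes depending only on what the receiver assumes, which is why a user-selectable encoding would help.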

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2006-01-08 18:50

For NMDC, the algorithm is to simply assume everyone is using the same encoding you are. There really is no better way to do it. For ADC, everything is sent as UTF-8, so you don't need to worry.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: Handling non-unicode users/hubs

Post by GargoyleMT » 2006-01-09 12:01

paskharen wrote:Is there any interest in adding recoding functions to the main DC++? At least 3 seems like it might be a problem for you too, but like I said, I'm not 100% sure how Windows DC++ handles these things.
Unicode support is a feature of ADC. In order to try to get it into NMDC, you would have to resort to a lot of hackery. If your locales are Unicode, you have a problem, as it's safest to assume that remote users have the same locale as you do (this way Chinese hubs will work for Chinese users, Russian hubs will work for Russian users, etc.) This is the way DC++ does it, and it's unlikely to change. If you want Unicode support, go to ADC.

paskharen
Posts: 29
Joined: 2004-01-27 14:32

Post by paskharen » 2006-01-14 06:52

Just out of curiosity, are there many ADC hubs around? For example, is there a public hub list for ADC hubs? All hubs within my no limit zone are NMDC so I kind of have to use that.

As for solving the actual problem, I think (as I said before) that making it possible to specify which encoding to convert from will do. I'm not sure how much "hackery" I will have to resort to, but I think I'll try something like this for ldcpp in the future. Right now everyone who uses Unicode locales has a lot of problems, and just saying "only use ADC" is a bit harsh, I think.

bastya_elvtars
Posts: 164
Joined: 2005-01-06 08:39
Location: HU

Post by bastya_elvtars » 2006-01-14 09:17

It's not harsh, if you think it over. Or are DC++ developers responsible for a protocol that they just use/optimize? They aren't the creators of the NMDC protocol.
Hey you, / Don't help them to bury the light... / Don't give in / Without a fight. (Pink Floyd)

paskharen
Posts: 29
Joined: 2004-01-27 14:32

Post by paskharen » 2006-01-14 11:35

bastya_elvtars wrote:It's not harsh, if you think it over. Or are DC++ developers responsible for a protocol that they just use/optimize? They aren't the creators of the NMDC protocol.
Dude, relax, I wasn't talking about the DC++ devs :D But the linuxdc++ client is quite unusable as it is for people with a Unicode locale. And I don't think *I* can tell them "just use ADC"; in that case it would probably be better to just say "this client doesn't support UTF-8, don't use it".

ullner
Forum Moderator
Posts: 333
Joined: 2004-09-10 11:00

Post by ullner » 2006-01-14 11:39

paskharen wrote:For example, is there a public hub list for ADC hubs?
Yes. http://project.bandicoot.nl/

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2006-01-14 12:24

Though it currently gives a PHP error when I try to submit mine :)

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2006-01-16 12:36

paskharen wrote:Right now everyone who uses Unicode locales has a lot of problems and just saying "only use ADC" is a bit harsh I think.
Yeah well.

I understand what you're saying, since I've been through there before. However, ADC is a clean way to support this. And if you do implement something on NMDC, Linux dcpp will be the only client to support it.

paskharen
Posts: 29
Joined: 2004-01-27 14:32

Post by paskharen » 2006-01-17 10:24

GargoyleMT wrote: if you do implement something on NMDC, Linux dcpp will be the only client to support it.
I'm not talking about a protocol extension or anything like that, just a way for the user to say "when receiving data that is not UTF-8, assume it is in this encoding" and then choose the encoding from a list, instead of the client assuming the system locale. So I don't think it will matter that it's only in Linux dcpp.
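
A minimal sketch of that idea in Python, assuming a user-chosen fallback encoding. The function name `decode_incoming` and its parameters are hypothetical and are not taken from DC++ or linuxdc++.

```python
def decode_incoming(raw: bytes, fallback_encoding: str) -> str:
    """Decode received bytes: try UTF-8 first, else the user-chosen encoding.

    `fallback_encoding` is the value the user picked from a list
    (hypothetical setting), replacing the system-locale assumption.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: treat it as the encoding the user selected.
        # Undecodable bytes become U+FFFD rather than raising an error.
        return raw.decode(fallback_encoding, errors="replace")


# Usage: the same call handles both a UTF-8 peer and a cp1251 peer.
from_utf8_peer = decode_incoming("Привет".encode("utf-8"), "cp1251")
from_legacy_peer = decode_incoming("Привет".encode("cp1251"), "cp1251")
```

Trying UTF-8 first is a heuristic: a short legacy string could by chance also be valid UTF-8 and would then be misread, so this is a sketch of the approach rather than a guaranteed detection method.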

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2006-01-18 12:37

paskharen wrote:I'm not talking about a protocol extension or anything like that, just a way for the user to say "when receiving data that is not UTF-8, assume it is in this encoding" and then choose the encoding from a list, instead of the client assuming the system locale. So I don't think it will matter that it's only in Linux dcpp.
True, that wasn't the idea I came away with earlier.

That is something that would make sense to track/override on a per-hub basis. I'm not sure how well that would work, since all the AcpToUtf code is in a central location.
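
One way a per-hub override could coexist with a single central conversion routine is to pass the encoding in as a parameter. The Python sketch below only illustrates that design and is not how AcpToUtf is actually written. The hub address, the `hub_encodings` table and both function names are made up for the example.

```python
import locale

# Hypothetical per-hub settings: hub address -> encoding chosen by the user.
hub_encodings = {"hub.example.ru:411": "cp1251"}


def encoding_for_hub(address: str, overrides: dict) -> str:
    """Return the user's override for this hub, else the system locale encoding."""
    return overrides.get(address) or locale.getpreferredencoding(False)


def to_utf8(raw: bytes, address: str, overrides: dict) -> bytes:
    """Central conversion point: legacy bytes from a hub -> UTF-8 bytes."""
    encoding = encoding_for_hub(address, overrides)
    return raw.decode(encoding, errors="replace").encode("utf-8")


# Usage: bytes from the overridden hub are converted using cp1251.
converted = to_utf8("Привет".encode("cp1251"), "hub.example.ru:411", hub_encodings)
```

Hubs without an entry fall back to the current behaviour (the system locale), so existing setups would be unaffected.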
