Ticket #3033 (closed defect: fixed)

Opened 22 months ago

Last modified 22 months ago

Messages received from ICQ transport (WINDOWS-1251 codepage) are not correctly encoded to UTF-8

Reported by: Lucky Owned by: asterix
Priority: normal Milestone:
Component: xmpppy Version: 0.11.1
Severity: normal Keywords: encoding
Cc: OS:

Description

English is not my native language, so please excuse my English. I use Gajim with an ICQ transport. My friends use ICQ clients (ICQ, QIP, Miranda, ...) with the WINDOWS-1251 codepage. Sending messages from Gajim to ICQ clients works, but receiving messages from ICQ clients does not. Gajim seems to assume that the incoming text is ASCII and converts it from ASCII to UTF-8.
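The symptom described above is consistent with WINDOWS-1251 bytes being mapped one-to-one onto code points below 256 somewhere between the ICQ client and Gajim. The ticket does not confirm where this happens, so the following Python 3 sketch only illustrates the effect under that assumption:

```python
# Assumption (not confirmed by the ticket): somewhere on the path each
# WINDOWS-1251 byte is mapped to the code point with the same number,
# which is what a Latin-1 decode does.
original = "Привет"                      # what the ICQ user typed
wire_bytes = original.encode("cp1251")   # bytes sent by the ICQ client
garbled = wire_bytes.decode("latin-1")   # wrong decoding on the way to XMPP

print(garbled)                                # unreadable accented Latin letters
print(all(ord(ch) < 256 for ch in garbled))   # every code point fits in one byte
```

Because every garbled character still carries the original byte value, the text can in principle be recovered by reversing the wrong decoding step.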

Some servers work correctly, for example jabber.ru (but that server is too busy). The problem could also be fixed in the Jabber client.

Please add a "Codepage" option to the configuration dialog where I can select the WINDOWS-1251 codepage, or add a "GAJIM_CODEPAGE" environment variable.
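A minimal Python 3 sketch of how such an environment variable could be read and validated. The helper name `configured_codepage` is hypothetical and is not part of Gajim; only the variable name GAJIM_CODEPAGE comes from the request above:

```python
import codecs
import os


def configured_codepage(env=os.environ, name="GAJIM_CODEPAGE"):
    """Return the normalized codec name from the environment, or None.

    Hypothetical helper: Gajim has no such function.
    """
    value = env.get(name, "").strip()
    if not value:
        return None                        # variable unset or empty
    try:
        # codecs.lookup() accepts aliases such as "WINDOWS-1251"
        return codecs.lookup(value).name
    except LookupError:
        return None                        # unknown codepage: ignore the setting
```

Validating the name once at startup avoids a failure on every incoming message when the variable contains a typo.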

I don't know the Python programming language, but I know Java. I fixed this problem in the JBother client (a Jabber client written in Java) by patching the Smack library (the Java Jabber library used by JBother).

I modified the org.jivesoftware.smack.packet.Message class (org/jivesoftware/smack/packet/Message.java):

...

   public String getBody() {
       // my code begin
       String codepage = null;
       try {
           // "SMACK_CODEPAGE" is an environment variable, set to "WINDOWS-1251" for me
           codepage = System.getenv("SMACK_CODEPAGE");
       } catch (Exception ex) {
           // environment not accessible: leave the body unchanged
       }
       if (codepage != null) {
           try {
               StringBuilder mesgBody = new StringBuilder(); // new message body
               for (int i = 0; i < body.length(); i++) {
                   char c = body.charAt(i); // a Java char is 2 bytes
                   if ((c & 0xff00) == 0) { // high byte = 0 -> single-byte char
                       // reinterpret the byte as a character of 'codepage'
                       mesgBody.append(new String(new byte[] { (byte) c }, codepage));
                   } else {
                       mesgBody.append(c); // already a real Unicode character
                   }
               }
               body = mesgBody.toString();
           } catch (Exception ex) {
               // unknown codepage: keep the original body
           }
       }
       // my code end
       return body;
   }

...

Attachments

Change History

follow-ups: ↓ 2 ↓ 3   Changed 22 months ago by asterix

  • priority changed from highest to normal
  • severity changed from critical to normal
  • milestone 0.11.1 deleted

This sounds like a transport problem. Do you know which transport version is used on the server you are on?

Can you try with icq.gajim.org?

in reply to: ↑ 1   Changed 22 months ago by Lucky

Replying to asterix:

This sounds like a transport problem. Do you know which transport version is used on the server you are on? Can you try with icq.gajim.org?

I have tried many servers, for example Wildfire ICQ transport 3.2.2 - 1.0 beta 8.

icq.gajim.org does not work.

in reply to: ↑ 1   Changed 22 months ago by anonymous

I have fixed this bug!

This is the relevant part of the modified file "src/common/xmpp/transports_nb.py":

   ...
   # we have received some bytes
   self.renew_send_timeout()
   if self.on_receive:
           # my code begins
           import os
           # GAJIM_ENCODING names the legacy codepage, e.g. windows-1251
           enc = os.environ.get('GAJIM_ENCODING', '')
           if enc:
               r = u''
               for c in ustr(received):
                   if ord(c) > 255:
                       r += c  # already a real Unicode character
                   else:
                       r += chr(ord(c)).decode(enc)  # reinterpret the byte in enc
               received = str(r)  # back to a byte string (uses the default encoding)
           # my code ends
           if received.strip():
                   self.DEBUG(received, 'got')
   ...

This works for me. I set the GAJIM_ENCODING environment variable to windows-1251 and then run Gajim.
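The per-character repair used in both patches can be sketched as a standalone Python 3 function. The name `repair_legacy_text` is hypothetical, and this is an illustration of the technique, not the code that Gajim ships:

```python
def repair_legacy_text(text, codepage):
    """Re-decode every code point below 256 as one byte of `codepage`.

    Hypothetical helper illustrating the workaround from this ticket.
    """
    out = []
    for ch in text:
        if ord(ch) < 256:
            # Treat the code point as a raw byte; bytes that the codepage
            # does not define become U+FFFD instead of raising an error.
            out.append(bytes([ord(ch)]).decode(codepage, errors="replace"))
        else:
            out.append(ch)  # already a real Unicode character
    return "".join(out)


garbled = "Привет".encode("cp1251").decode("latin-1")
print(repair_legacy_text(garbled, "cp1251"))
```

The approach cannot tell mis-decoded text from legitimate Latin-1 characters: with the codepage set to cp1251, a correctly transmitted "é" (U+00E9) is rewritten as "й". That is why it is only usable as an opt-in workaround.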

follow-up: ↓ 5   Changed 22 months ago by spike411

XMPP clients are required to use UTF-8 encoding. Don't break Gajim! Tell your friends with buggy ICQ clients to: a) fix their settings (usually it's QiP which claims it's sending different encoding than it actually sends and it's easy to fix), b) use a different client.

Official ICQ 5.1 client supports UTF-8, Miranda supports UTF-8. If QiP or any other client uses something different than UTF-8, it should claim the correct encoding/character-set in the OSCAR protocol encoding header.
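The point about declaring the correct character set can be sketched from the transport's side in Python 3. The function `to_xmpp_payload` and its `declared_charset` parameter are hypothetical; the sketch does not reproduce the actual OSCAR header layout or any real transport's code:

```python
def to_xmpp_payload(raw, declared_charset):
    """Decode `raw` with the charset the sender declared, then emit UTF-8.

    Hypothetical transport-side helper. A wrong declaration surfaces as
    UnicodeDecodeError (or LookupError for an unknown charset name).
    """
    text = raw.decode(declared_charset)
    return text.encode("utf-8")


icq_bytes = "Привет".encode("cp1251")
print(to_xmpp_payload(icq_bytes, "cp1251").decode("utf-8"))
```

When the declaration matches the bytes, the XMPP side receives valid UTF-8 and no client-side workaround is needed.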

in reply to: ↑ 4   Changed 22 months ago by anonymous

  • status changed from new to closed
  • resolution set to fixed

Replying to spike411:

XMPP clients are required to use UTF-8 encoding. Don't break Gajim! Tell your friends with buggy ICQ clients to: a) fix their settings (usually it's QiP which claims it's sending different encoding than it actually sends and it's easy to fix), b) use a different client. Official ICQ 5.1 client supports UTF-8, Miranda supports UTF-8. If QiP or any other client uses something different than UTF-8, it should claim the correct encoding/character-set in the OSCAR protocol encoding header.

Thank you. I switched to a Jabber server with ICQ Transport 0.8 - SVN r0, and this transport works fine. I tested it with the official ICQ client and QiP.
