Opened 10 years ago

Closed 9 years ago

Last modified 9 years ago

#2514 closed defect (fixed)

strange utf8 stuff

Reported by: nk
Owned by: asterix
Priority: normal
Milestone: 0.11
Component: None
Version: hg
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
OS:

Description

conv_txtview.py

format += '%X' + after_str
tim_format = time.strftime(format, tim)
becomes:
[3:57:20 πμ] <nkour> format += '%X' + after_str
print type(format), format.decode('utf-8')
tim_format = time.strftime(format, tim)
print type(tim_format), tim_format.decode('utf-8')
[3:57:31 πμ] <nkour> also before buffer.insert
[3:57:33 πμ] <nkour> do:
[3:57:47 πμ] <nkour> one more:
print type(tim_format), tim_format.decode('utf-8')
<type 'unicode'> [Вчера %X]
<type 'str'> [Вчера 20:59:31]
<type 'str'> [п▓я┤п╣я─п╟ 20:59:31]
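The garbled third line is consistent with the UTF-8 bytes of 'Вчера' being displayed by a terminal that expects KOI8-R. A minimal sketch of that effect, written for Python 3 (the ticket itself uses Python 2); the variable names are illustrative and do not come from conv_txtview.py:

```python
# Python 3 sketch: reproduce the garbled label seen in the ticket.
# Assumption: the terminal interprets output bytes as KOI8-R.
label = 'Вчера'

# The bytes a conversion to UTF-8 produces.
utf8_bytes = label.encode('utf-8')

# What a KOI8-R terminal shows when it receives those bytes.
shown_as_koi8 = utf8_bytes.decode('koi8-r')

print(repr(utf8_bytes))
print(shown_as_koi8)
```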

Change History (9)

comment:1 Changed 10 years ago by asterix

The only thing between the last two prints is a helpers.ensure_utf8_string() call.

What does sys.getfilesystemencoding() return?

What does 'Вчера'.decode(sys.getfilesystemencoding()) return?

And finally, 'Вчера'.decode(sys.getfilesystemencoding()).encode('utf-8')?
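For anyone repeating these checks on a current interpreter, the encodings can be collected in one go. This is only a sketch for Python 3; report_encodings is a hypothetical helper name, not part of Gajim:

```python
import locale
import sys


def report_encodings():
    """Return the encodings Python reports for this environment.

    Hypothetical diagnostic helper, not Gajim code.
    """
    return {
        'filesystem': sys.getfilesystemencoding(),
        'preferred': locale.getpreferredencoding(),
    }


for name, value in report_encodings().items():
    print(name, value)
```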

comment:2 Changed 10 years ago by sl

[slav0nic@[00]t] ~ $ python -c "import sys;print sys.getfilesystemencoding()"
KOI8-R

[slav0nic@[00]t] ~ $ python -c "import sys;print 'Вчера'.decode(sys.getfilesystemencoding())"
Вчера

[slav0nic@[00]t] ~ $ python -c "import sys;print 'Вчера'.decode(sys.getfilesystemencoding()).encode('utf-8')"
п▓я┤п╣я─п╟

[slav0nic@[00]t] ~ $ python -c "import sys;print repr('Вчера'.decode(sys.getfilesystemencoding()).encode('utf-8'))"
'\xd0\x92\xd1\x87\xd0\xb5\xd1\x80\xd0\xb0'

comment:3 Changed 10 years ago by nk

helpers.ensure_utf8_string()

is where the corruption happens, and it is expected, because the reporter's own test gives the same result:

python -c "import sys;print 'Вчера'.decode(sys.getfilesystemencoding()).encode('utf-8')"
п▓я┤п╣я─п╟

Not sure why, though.

comment:4 Changed 10 years ago by nk

python -c "import sys;print 'Вчера'.decode(sys.getfilesystemencoding()).encode('utf-8')"

I bet your terminal inputs UTF-8.

Could you set your terminal to input KOI8-R and try again?

comment:5 Changed 10 years ago by asterix

could you try

import locale
print locale.getpreferredencoding()

and then

print 'Вчера'.decode('utf-8')
print 'Вчера'.decode('utf-8').encode('utf-8')

And finally, could you try the latest svn (>= 7326) and see if the problem is fixed?

comment:6 Changed 10 years ago by sl

The latest svn doesn't help.

>>> import locale
>>> print locale.getpreferredencoding()
KOI8-R
>>> print 'Вчера'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "encodings/utf_8.py", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-3: invalid data
>>> print 'Вчера'.decode('utf-8').encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "encodings/utf_8.py", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-3: invalid data
>>>
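The tracebacks above are what one would expect if the terminal sent 'Вчера' as KOI8-R bytes, since those bytes are not valid UTF-8. A small Python 3 sketch under that assumption (the Python 2 session above reports the error with slightly different wording):

```python
# Python 3 sketch. Assumption: the terminal sent 'Вчера' as KOI8-R bytes.
koi8_bytes = 'Вчера'.encode('koi8-r')

try:
    koi8_bytes.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    # KOI8-R Cyrillic bytes do not form a valid UTF-8 sequence here.
    decoded_ok = False

print(decoded_ok)
# Decoding with the encoding the bytes were written in works.
print(koi8_bytes.decode('koi8-r'))
```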

comment:7 Changed 9 years ago by asterix

What do locale and locale -a return in your terminal?

comment:8 Changed 9 years ago by asterix

  • Resolution set to fixed
  • Status changed from new to closed

(In [78eed36e961cc5c4d15dce1dab7be7d5af5839ac]) don't ensure_utf8_string() when prefferedencoding is not 'UTF-8'. fixes #2514
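The idea in that commit message can be sketched as follows. This is not the actual changeset: maybe_ensure_utf8 and is_utf8 are hypothetical names, and the real conversion routine is passed in as a parameter because its behaviour is not shown in this ticket.

```python
import codecs
import locale


def is_utf8(encoding):
    """Return True if the encoding name refers to UTF-8 (e.g. 'utf8', 'UTF-8')."""
    return codecs.lookup(encoding).name == 'utf-8'


def maybe_ensure_utf8(raw, ensure_utf8, preferred=None):
    """Call ensure_utf8 only when the preferred encoding is UTF-8.

    Hypothetical sketch: ensure_utf8 stands in for the real conversion
    routine; under any other locale the input is returned untouched.
    """
    if preferred is None:
        preferred = locale.getpreferredencoding()
    if is_utf8(preferred):
        return ensure_utf8(raw)
    return raw
```

Under a KOI8-R locale the time string is then passed through unchanged, which avoids the double conversion seen in the description.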

comment:9 Changed 9 years ago by nk

(In [45baab311dd9907beb7687ef92d2b9224e6c9b34]) show time correctly [it was failing at least in Windows]. see #2514
