Using Unicode console output with Python
On Windows, the console and Unicode are not quite friends. Here is some code that I use to ensure that the output of my Python scripts is consistent on all platforms and supports Unicode encoded as UTF-8.
You'll be able to see the right characters if you are running on Windows 7, but on Windows XP you'll see the UTF-8 bytes displayed as ANSI characters. Even with this problem, you can still redirect stdout or stderr to a file in order to store the output as UTF-8.
On other platforms, like Linux or OS X, it will just use UTF-8 without any problems.
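To make the Windows XP symptom concrete, here is a minimal sketch of what happens when UTF-8 bytes are displayed through an ANSI code page. The helper name `utf8_seen_as_ansi` is hypothetical, and cp1252 is only an assumption for "ANSI"; the actual code page depends on the Windows locale.

```python
def utf8_seen_as_ansi(text, ansi_codec='cp1252'):
    """Return `text` as it would look if its UTF-8 bytes were shown as ANSI.

    cp1252 is an assumed ANSI code page; other locales use other code pages.
    """
    utf8_bytes = text.encode('utf-8')
    # Bytes that are undefined in the ANSI code page become U+FFFD.
    return utf8_bytes.decode(ansi_codec, 'replace')


garbled = utf8_seen_as_ansi(u"\u00e9")  # U+00E9 is "e" with an acute accent
# Print code points rather than the characters, so this is safe on any console.
print([hex(ord(ch)) for ch in garbled])  # ['0xc3', '0xa9']
```

A single accented letter turns into two unrelated characters, which is the kind of garbage you see in a console that does not honour UTF-8.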
<code>
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import codecs, sys

# Python 2 only: reload(sys) restores setdefaultencoding(),
# which site.py removes at startup.
reload(sys)
sys.setdefaultencoding('utf-8')

print sys.getdefaultencoding()

if sys.platform == 'win32':
    try:
        import win32console
    except ImportError:
        print "Python Win32 Extensions module is required.\n You can download it from https://sourceforge.net/projects/pywin32/ (x86 and x64 builds are available)\n"
        sys.exit(-1)
    # win32console implementation of SetConsoleCP does not return a value,
    # so read the code page back to check that the change was applied.
    # CP_UTF8 = 65001
    win32console.SetConsoleCP(65001)
    if (win32console.GetConsoleCP() != 65001):
        raise Exception("Cannot set console codepage to 65001 (UTF-8)")
    win32console.SetConsoleOutputCP(65001)
    if (win32console.GetConsoleOutputCP() != 65001):
        raise Exception("Cannot set console output codepage to 65001 (UTF-8)")

# Wrap stdout and stderr so that everything written to them is encoded as UTF-8.
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)

print "This is an Е乂αmp١ȅ testing Unicode support using Arabic, Latin, Cyrillic, Greek, Hebrew and CJK code points.\n"
</code>
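If you want to check what the `codecs.getwriter('utf8')` wrapper does without touching the real console, the sketch below wraps an in-memory byte stream instead. The `io.BytesIO` buffer is an assumption of mine that stands in for stdout redirected to a file, and `encode_through_writer` is a hypothetical helper name, not part of the script above.

```python
import codecs
import io


def encode_through_writer(text):
    """Write `text` through a UTF-8 stream writer and return the raw bytes."""
    raw = io.BytesIO()                      # stand-in for a redirected stdout
    writer = codecs.getwriter('utf8')(raw)  # same kind of wrapper as in the script
    writer.write(text)
    return raw.getvalue()


# Cyrillic, CJK and Greek code points, written as escapes to keep the source ASCII.
sample = u"\u0415\u4e42\u03b1"
encoded = encode_through_writer(sample)
print(repr(encoded))
```

The bytes that come out are the same ones `sample.encode('utf-8')` produces, so a file that captures the redirected output holds valid UTF-8 no matter what the console managed to display.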