I would just like the Python code to work, but I don't understand these conversion errors (I always get some kind of 'ascii' encoding or decoding error). I went crazy and added a decode and an encode to every part of the line, and it is still giving me trouble. The code is available via Git at https://github.com/TBOpen/papercut if you would be so kind as to correct it. (I also solved a similar error on line 885, not yet checked in, using self.wfile.write(message.decode('cp1250', 'replace').encode('ascii', 'replace') + "\r\n").)

However here's the traceback for the one I can't solve (where I gave up).

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/SocketServer.py", line 535, in process_request
    self.finish_request(request, client_address)
  File "/usr/local/lib/python2.6/SocketServer.py", line 320, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python2.6/SocketServer.py", line 615, in __init__
    self.handle()
  File "./papercut.py", line 221, in handle
    getattr(self, "do_%s" % (command))()
  File "./papercut.py", line 410, in do_ARTICLE
    self.send_response("%s\r\n%s\r\n\r\n%s\r\n.".decode('cp1250', 'replace').encode('ascii', 'replace') % (response.decode('cp1250', 'replace').encode('ascii', 'replace'), result[0].decode('cp1250', 'replace').encode('ascii', 'replace'), result[1].decode('cp1250', 'replace').encode('ascii', 'replace')))
  File "/usr/local/lib/python2.6/encodings/cp1250.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in position 20: ordinal not in range(128)

TIA!!

  • are you using python 2.x or python 3.x? Commented Jan 13, 2014 at 9:11
  • @mig-25foxbat: Python 2.6, from the traceback. Commented Jan 13, 2014 at 9:12
  • Can you paste your py code out? Commented Jan 13, 2014 at 9:12
  • This is a horribly unreadable and undebuggable way to do things. Why not decode all of your input at input, process everything (including the %-formatting) as Unicode, and then encode at output, instead of decoding and encoding all over the place? Commented Jan 13, 2014 at 9:13
  • You don't need to decode, then encode your string template (the string with %s placeholders); it is already just ASCII. Commented Jan 13, 2014 at 9:16

1 Answer

The root problem is that one of response, result[0], or result[1] is actually a unicode string, not an encoded str string.

So, when you call (picking one arbitrarily) response.decode('cp1250', 'replace'), you're asking to decode something that's already decoded to Unicode. Python 2.x handles this by first encoding the value with your default encoding (ASCII) so that it has bytes to decode as you requested. That implicit encode uses strict error handling; the 'replace' argument applies only to the decode step. And that's why you're getting a UnicodeEncodeError from a call to decode.*
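
A minimal sketch of that hidden step, written so it runs on Python 2.6+ and Python 3.3+ alike. The sample value `text` and the helper `implicit_step` are hypothetical; `text` only stands in for whichever variable turns out to be unicode. On Python 2, calling decode on a unicode value behaves roughly like the explicit ASCII encode below followed by the decode you asked for, and the encode is the part that fails:

```python
# Hypothetical sample value: a Unicode string containing U+2122 (trade mark sign),
# the same character named in the traceback.
text = u"Papercut\u2122 news server"


def implicit_step(value):
    """Mimic the strict ASCII encode Python 2 performs before unicode.decode()."""
    return value.encode("ascii")


try:
    implicit_step(text)
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("encode failed: %s" % exc)
```

Pure-ASCII unicode values pass through this step silently, which is probably why the statement works for some articles and not others.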

To fix this, you're going to have to figure out which one of the three is wrong, and why. That's not possible with a giant mess of a statement with 4 decode calls in it, but it's easy if you break it up into separate statements, or just add some print debugging to see what's in those variables right before they get used.
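
As a rough sketch of that print debugging, the hypothetical helper `describe` below reports the type and repr of a value. The three sample values are made up for illustration; in papercut.py you would pass the real response, result[0], and result[1] just before the send_response call:

```python
def describe(name, value):
    """Return a one-line summary: variable name, type name, and repr."""
    return "%s: %s %r" % (name, type(value).__name__, value)


# Hypothetical stand-ins for response, result[0], and result[1].
samples = [
    ("response", b"220 0 article"),
    ("result[0]", u"Subject: Papercut\u2122"),
    ("result[1]", b"body text"),
]

for name, value in samples:
    print(describe(name, value))
```

On Python 2, any line that reports the type as unicode rather than str marks a value that must not have decode called on it.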

However, it would make your life a whole lot easier to reorganize your code completely. Instead of converting everything back and forth all over the place, giving yourself dozens of places to make a simple mistake that ends up causing an un-debuggable error halfway across your program, just decode all of your input at input time, process everything as Unicode, then encode everything at output time.
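
One possible shape for that reorganization is sketched below. The helper names (`to_text`, `build_article`, `to_wire`) are hypothetical and not part of papercut; the cp1250 input encoding and the ASCII output with 'replace' are assumptions carried over from the calls in the question, and the format string is the one from the traceback:

```python
INPUT_ENCODING = "cp1250"  # assumption: taken from the question's decode calls


def to_text(value, encoding=INPUT_ENCODING):
    """Decode byte strings once, at the boundary; pass Unicode through untouched."""
    if isinstance(value, bytes):  # bytes is an alias for str on Python 2.6+
        return value.decode(encoding, "replace")
    return value


def build_article(response, head, body):
    """Do the %-formatting entirely in Unicode."""
    return u"%s\r\n%s\r\n\r\n%s\r\n." % (
        to_text(response),
        to_text(head),
        to_text(body),
    )


def to_wire(text):
    """Encode once, at output time."""
    return text.encode("ascii", "replace")


# Mixed input: two byte strings and one value that is already Unicode.
reply = to_wire(build_article(b"220 0 article", u"Subject: Papercut\u2122", b"caf\xe9"))
print(repr(reply))
```

Because `to_text` accepts both kinds of string, it does not matter which of the three values arrives as unicode, and the only encode in the program sits right next to the socket write.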

By the way, if you haven't read Python's Unicode HOWTO, and the blog post The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), go read them before going any further.


* If you think this is a silly design for a language… well, that's the main reason Python 3 exists. In Python 3, you can't decode a unicode or encode a bytes, so the error shows up as early as possible, and tells you exactly what's wrong, instead of making you try to hunt down where you called the wrong method on the wrong type and got an error that makes no sense. So if you want to use Python 2 instead of 3, you don't get to complain that Python 2's design is sillier than 3's.
