[Product-Developers] Re: Re: Re: UnicodeDecodeError

Derek Broughton news at pointerstop.ca
Tue Mar 25 13:06:18 UTC 2008

Andreas Jung wrote:

> --On 24 March 2008 17:19:48 -0300 Derek Broughton
> <news at pointerstop.ca> wrote:
>>> What are you talking about? Python has nothing like a 'unicode' default.
>>> Likely you're referring to sys.getdefaultencoding() which is ascii by
>>> default.
>> Why do you always have to be so confrontational?
> Because part of the content of your postings is wrong

Which part?  There's no need to be insulting - it's OBVIOUS that none of us
know the right way to do this, and your best response is to tell us that. 
Well thanks for nothing.

> and the implicit 
> hint for changing the default encoding of Python is the wrong way and will

What do you mean "implicit"! I SAID I've changed the default BUT THERE MUST
can I be?

> lead to other problems (see below). I am so confrontational because I want
> people
> to write and design clean code when it comes to unicode-awareness.
> Unicode-awareness is a big problem within Python-based applications...but
> you can get around it by doing it the right way.
>> That is, of course, what I'm talking about - and I know perfectly well
>> that it's NOT a unicode default.  If you read the posts, that would be
>> clear.
> You're mixing unicode encodings with the Python 'unicode' type.

No, I'm not.  

> There is no way to make strings unicode strings by default.

And I never suggested there was.

>>> General rule #1: don't touch that. Rule #2: if you have the need
>>> to touch the default encoding as a workaround: better fix your code
>>> first.
>> Funny, but it's not _my_ code that runs into problems - in fact it's some
>> of yours.  SQLAlchemyDA won't read non-ascii data off my UTF-8 postgres
>> or Oracle databases.
> SQLAlchemy has options to deal with different encodings. The Oracle client
> libraries also provide environment variables for controlling the client
> encoding (e.g. for an implicit conversion of the server side database
> encoding into some client side encoding).
>> I _have_ non-ascii strings in my data, I'm not going to change that.
> We all have that. There is no problem building a clean application dealing
> with various encodings at a time in a sane way.

Then why do I get an error any time I read data from a UTF-8 database
(either Oracle or Postgres)?  Try to be at least a little bit helpful.

>> It
>> seems to me that the only way to make Plone work with it is to set my
>> default encoding to unicode.
> Wrong again. You set an *encoding*. An *encoding* like 'utf-8' is NOT
> unicode. 

There you go again.  From wiki:utf-8, "UTF-8 (8-bit UCS/Unicode
Transformation Format) is a variable-length character encoding for
Unicode."  I'm NOT talking about unicode strings.  I'm talking about ANY
attempt to get data out of a Unicode-encoded database with SQLAlchemy.
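
A minimal sketch of the distinction being argued over here, written in
Python 3 syntax (this thread concerns Python 2, where `str` and `unicode`
play the roles of `bytes` and `str` below). The sample string and the helper
name decode_as_ascii are illustrative assumptions, not anything taken from
SQLAlchemyDA:

```python
# Text as Python sees it internally (Python 2: a `unicode` object).
text = "Zo\u00eb"

# The same text as UTF-8 encoded bytes, as a database client might return it
# (Python 2: a plain `str`).
raw = text.encode("utf-8")


def decode_as_ascii(data):
    """Decode with the ascii codec; return the text or the error raised."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc


# Decoding UTF-8 bytes with the ascii codec fails on the non-ASCII byte.
ascii_result = decode_as_ascii(raw)

# Decoding with the codec the data was actually written in succeeds.
utf8_result = raw.decode("utf-8")
```

The bytes and the text are different objects of different lengths; the error
appears only when something decodes the bytes with the wrong codec.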

> Don't mix the 'unicode' type of Python with the _various_ 


>> It's worked so far, and it _may_ be
>> hackish,
> Yes, it is hackish. And if you write code that depends on a particular


>> As I said, it seems logical that there's a reason why it's not
>> Unicode by default, but you're not helping any by just saying we should
>> "fix our code".
> Well, the code is yours :-)

IT'S STILL NOT MY CODE. And don't think that putting a smiley on a statement
makes it less insulting.

> And because the approaches for building clean applications in such a case
> are well known:
>  - represent your data *internally* as unicode strings
>    (*NOT* as utf-X encoded byte strings)

That's not an option.  I don't control the SQL databases.

>  - do all the processing internally on top of Python unicode strings
>  - convert all incoming string data from your input encoding to unicode
>  - convert all outgoing data from unicode to your output encoding

Also not an option - it's not my code.
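
The three steps quoted above can be sketched roughly as follows. This is only
an illustration under stated assumptions: Python 3 syntax, UTF-8 on both the
input and output side, and hypothetical function names (read_input, process,
write_output). It is not code from Plone or SQLAlchemyDA:

```python
def read_input(raw, encoding="utf-8"):
    """Boundary in: decode incoming bytes to text exactly once."""
    return raw.decode(encoding)


def process(text):
    """Internal work operates on text only, never on encoded bytes."""
    return text.upper()


def write_output(text, encoding="utf-8"):
    """Boundary out: encode text to bytes exactly once."""
    return text.encode(encoding)


# Assumed sample data: UTF-8 bytes such as a database driver might hand back.
incoming = "caf\u00e9".encode("utf-8")
outgoing = write_output(process(read_input(incoming)))
```

Keeping the decode and encode calls at the edges means the processing step
never depends on the interpreter's default encoding.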
