[Product-Developers] PloneGazette and Unicode problem

Jean Jordaan jean.jordaan at gmail.com
Thu Aug 28 11:18:42 UTC 2008


Hi all

I'm having trouble with PloneGazette and Unicode. Logged as
http://plone.org/products/plonegazette/issues/46

Here's the gist, from the report:

The breakage occurs here, in 'write':

        tree = ElementTree.ElementTree(rootnode)
        output = StringIO.StringIO()
        tree.write(output)
        text = output.getvalue()
        output.close()

Before:

<p>English - <a
href="..."><span>\xd8\xa7\xd9\x84\xd8\xb9\xd8\xb1\xd8\xa8\xd9\x8a\xd8\xa9</span></a>

After:

<p>English - <a
href="..."><span>&#216;&#167;&#217;&#132;&#216;&#185;&#216;&#177;&#216;&#168;&#217;&#138;&#216;&#169;</span></a>

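Each byte of the UTF-8 encoded text gets turned into its own numeric
character reference (\xd8\xa7 becomes &#216;&#167; and so on), instead
of each character being written out once.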
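
To check where those numbers come from, here is a minimal sketch in
Python 3 (not the Python 2 code that PloneGazette runs, so treat it as
an illustration only). It assumes the byte string reaches ElementTree
with every byte treated as a separate Latin-1 character. The helper
name 'serialize' is made up for this example.

```python
import io
import xml.etree.ElementTree as ET

# The UTF-8 bytes from the "Before" snippet above.
raw = b"\xd8\xa7\xd9\x84\xd8\xb9\xd8\xb1\xd8\xa8\xd9\x8a\xd8\xa9"


def serialize(text):
    """Write a <span> holding `text` with the default (US-ASCII) encoding."""
    span = ET.Element("span")
    span.text = text
    output = io.BytesIO()
    ET.ElementTree(span).write(output)  # no encoding given: US-ASCII
    return output.getvalue().decode("ascii")


# Assumption: each byte is taken as one character (Latin-1 style).
per_byte = serialize(raw.decode("latin-1"))

# Decoding as UTF-8 first gives one reference per real character.
per_char = serialize(raw.decode("utf-8"))

print(per_byte)  # matches the "After" snippet: &#216;&#167;&#217;&#132;...
print(per_char)  # starts with <span>&#1575;&#1604;
```

If that assumption holds, the text would have to be decoded from UTF-8
to a unicode object before it is put into the tree.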

In the rendered pages attached to the original issue report (see URL
above), which were taken from Firefox's "view source" view, you don't
see the entities. I don't know why.

The default encoding for 'write' is US-ASCII. If I change the call to:

        tree.write(output, encoding='UTF-8')

then I don't get the entities any more, but I do get gobbledygook (the
kind in the attachment to the original report).
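
One common way to get this kind of gobbledygook is double encoding:
UTF-8 bytes are read as if they were Latin-1 and then encoded as UTF-8
a second time. I have not confirmed that this is what happens inside
PloneGazette, so the following Python 3 sketch is only a guess at the
mechanism. All variable names are made up for the example.

```python
# UTF-8 bytes for the Arabic word in the snippets above.
utf8_bytes = b"\xd8\xa7\xd9\x84\xd8\xb9\xd8\xb1\xd8\xa8\xd9\x8a\xd8\xa9"
original = utf8_bytes.decode("utf-8")

# Step 1 (the assumed mistake): the bytes are read as Latin-1.
misread = utf8_bytes.decode("latin-1")

# Step 2: the misread text is encoded as UTF-8 again.
double_encoded = misread.encode("utf-8")

# A browser that trusts "charset=utf-8" now shows two junk characters
# for every original byte.
garbled = double_encoded.decode("utf-8")

# Undoing the two steps in reverse order recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(utf8_bytes), len(double_encoded))
print(repaired == original)
```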

Subsequently, 'safe_unicode' gets called on the text (in
'renderTextHTML'), then '...encode('utf8')' is called on it (in
'renderTextPlain'), and it is finally served with:

        REQUEST.RESPONSE.setHeader('Content-Type',
            'text/plain; charset=%s' % self.ploneCharset())

where I suppose 'ploneCharset' may or may not return a Unicode charset
such as UTF-8.
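
The risk in that chain is that the body is always encoded as UTF-8
while the header announces whatever the site charset is. A minimal
Python 3 sketch of keeping the two in step follows. 'render_plain' and
its return value are hypothetical and are not PloneGazette's API; the
assumption that incoming byte strings are UTF-8 is mine.

```python
def render_plain(text, charset="utf-8"):
    """Encode `text` with the same charset that the header declares."""
    if isinstance(text, bytes):
        # Assumption for this sketch: stored byte strings are UTF-8.
        text = text.decode("utf-8")
    body = text.encode(charset)
    headers = {"Content-Type": "text/plain; charset=%s" % charset}
    return body, headers


body, headers = render_plain("caf\u00e9", "iso-8859-1")
print(body)
print(headers["Content-Type"])
```

With a single 'charset' argument driving both the encode call and the
header, the two cannot drift apart.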

When I log the output in 'changeRelativeToAbsolute' I see:

  <?xml version='1.0' encoding='UTF-8'?><div><meta content="text/html;
charset=ISO-8859-1" http-equiv="content-type" />

Hmm, where did that 'charset=ISO-8859-1' come from?

In the final rendered HTML preview I see:

 <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
 <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">

Eugh! ;-)

The Plone view of the newsletter also has both charsets in the HTML
source. I'm not sure where in this chain I should try to fix this.
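
To narrow down which step introduces the second declaration, something
like the following could be run on the output of each stage. It is a
rough Python 3 sketch using only the standard library. The names
'MetaCharsetFinder' and 'declared_charsets' are made up for this
example.

```python
import re
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Collect every charset declared via <meta http-equiv=content-type>."""

    def __init__(self):
        super().__init__()
        self.charsets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "content-type":
            return
        match = re.search(r"charset\s*=\s*([\w.:-]+)",
                          attrs.get("content") or "", re.IGNORECASE)
        if match:
            self.charsets.append(match.group(1).lower())


def declared_charsets(html):
    finder = MetaCharsetFinder()
    finder.feed(html)
    finder.close()
    return finder.charsets


# The two meta tags from the rendered preview above.
preview = (
    '<meta http-equiv="Content-Type" content="text/html;charset=utf-8">\n'
    '<meta content="text/html; charset=ISO-8859-1" http-equiv="content-type">'
)
found = declared_charsets(preview)
print(found)  # ['utf-8', 'iso-8859-1']
```

Any stage whose output reports more than one distinct charset is a
candidate for where the stray meta tag gets in.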

-- 
jean                         . .. .... //\\\oo///\\



