[PLIP-Advisories] Re: [Plone] #9328: content im-/export
plip-advisories at lists.plone.org
Tue Jun 30 12:16:38 UTC 2009
#9328: content im-/export
---------------------+------------------------------------------------------
Reporter: csenger | Owner: csenger
Type: PLIP | Status: new
Priority: minor | Milestone: 4.0
Component: Unknown | Resolution:
Keywords: |
---------------------+------------------------------------------------------
New description:
== Motivation ==
Content ex-/import is important for several tasks, e.g. editing on a
dedicated site during development and then transferring the content to a
production site without risking migration issues. It also matters when an
in-place migration is not possible, which might be the case for a Plone
release after 4.x. Solutions currently in use include:
* portal_setup's content step: discouraged in most discussions; at the
  very least it lacks support for binary data and references.
* gsxml [http://pypi.python.org/pypi/collective.plone.gsxml pypi page]:
  a generic product for Archetypes-based content types. It uses atxml and
  is buggy, incomplete, under-documented and difficult to set up, but
  generally does the job.
* hand-written scripts like
  [http://www.zopyx.de/blog/updated-when-the-plone-migration-fails-doing-content-migration-only those from Andreas Jung (svn down atm)].
* collective.transmogrifier
  [http://dev.plone.org/collective/browser/collective.transmogrifier trac]:
  a generic, extensible solution with a number of add-ons (e.g.
  [http://dev.plone.org/collective/browser/collective.transmogrifier plone.app.transmogrifier],
  [http://svn.quintagroup.com/products/quintagroup.transmogrifier quintagroup.transmogrifier])
  that together already provide a working im-/export. Downsides:
  Transmogrifier has no final release, no end-user interface or
  documentation, and is complex.

All of these solutions are incomplete, under-documented, difficult to set
up or not flexible enough.
== Definitions ==
transmogrifier vocabulary
pipeline::
A sequence of sections that is processed.
section::
A blueprint plus optional configuration variables.
blueprint::
A class that provides ISectionBlueprint and implements ISection. In
practice it is just a callable whose result implements __iter__, so it
can be used with Python's iteration protocol.
source::
A blueprint that reads in data to be used by later blueprints in the
pipeline. A pipeline can have more than one source; each additional
source injects new items into the pipeline.
constructor::
A blueprint that reads the data and constructs an object.
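
The vocabulary above can be illustrated with a minimal sketch in plain Python. It does not import collective.transmogrifier; the classes `ExampleSource` and `ExampleTransform` and the helper `build_pipeline` are hypothetical and only mimic the idea that each section wraps the iterator of the previous one. The constructor arguments follow the (transmogrifier, name, options, previous) convention, which is assumed here rather than checked against a particular release.

```python
class ExampleSource:
    """Source section: passes on upstream items, then injects its own."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        self.size = int(options.get("size", 3))

    def __iter__(self):
        # Items from earlier sections go through untouched.
        for item in self.previous:
            yield item
        # Then this source adds new items to the pipeline.
        for i in range(self.size):
            yield {"id": "item-%02d" % i}


class ExampleTransform:
    """Transformer section: modifies every item that flows through it."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            item["title"] = item["id"].upper()
            yield item


def build_pipeline(sections):
    """Chain sections so that each one consumes the one before it."""
    previous = iter(())  # the first section has no upstream items
    for name, blueprint, options in sections:
        previous = iter(blueprint(None, name, options, previous))
    return previous


pipeline = build_pipeline([
    ("source", ExampleSource, {"size": "2"}),
    ("transform", ExampleTransform, {}),
])
items = list(pipeline)
```

Nothing is processed until the final iterator is consumed, so items move through all sections one at a time instead of being collected in memory at each stage.
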
== Proposal ==
This PLIP aims to provide a solution for Plone that
* can export the out-of-the-box content types and most add-on content types
* is extensible, so add-ons can add ex-/import data that a generic
  solution cannot cover
* is ready for an administrator to use out of the box
* is integrated into the control panel
* can be used by developers to write a custom import for external data
Why a proposal for Plone 4?
* It should be the canonical ex-/import mechanism that add-on developers
  extend if the generic part does not cover enough data.
* With Dexterity and plone.app.content, there are ways other than
  Archetypes to construct content. It seems impossible to support them
  while maintaining the code outside of Plone core.
* It is regularly requested, and import is one of the problems people
  face when migrating external content to Plone.

It could just as well be added in a later Plone 4.x release, since it does
not need changes to Plone core and does not introduce backward
incompatibilities, but I submit it for Plone 4.0 to begin with.
== Assumptions ==
The im-/export system covered by this PLIP handles only Archetypes
content and a few special cases like comments. Generic blueprints for
handling Zope 3 schemata are not part of this PLIP.
== Implementation ==
The export will be implemented with collective.transmogrifier. The main
reasons are that it is extensible and fast, and that most of the necessary
blueprints are already implemented in plone.app.transmogrifier,
quintagroup.transmogrifier and collective.blueprint.translationlinker.
These cover Archetypes and ATCT content, topics and their criteria (a port
of gsxml, I think), references, comments, translation links, browser
defaults and workflow state.

quintagroup.transmogrifier already implements a working ex-/import into a
tarball. It uses the atxml handler from Products.Marshall to export an
Archetypes object to XML
([http://dev.plone.org/archetypes/browser/Products.Marshall/trunk/Products/Marshall/tests/input/atxml/Document.xml
example output]).
To write and read the data, it uses GenericSetup's TarballExportContext
and TarballImportContext (with two small monkey patches). The structure of
the tarball is similar to that of the GenericSetup content im-/export step
and contains folders and XML files:
* '''structure/'''
  * '''.objects.xml'''[[BR]]
    <?xml version="1.0" ?>[[BR]]
    <manifest> ... <record type="Document">front-page</record> ...
  * '''.properties.xml'''[[BR]]
    Xml produced with GenericSetup's propertymanager support and contains
    properties like default_page.
  * '''front-page/'''
    * '''.marshall.xml'''[[BR]]
      See atxml's example output.
  * '''news/'''
    * ...
  * '''aggregator/'''
    * '''.properties.xml'''
    * '''.objects.xml'''
    * '''.marshall.xml'''
    * '''crit__effective_ATSortCriterion/'''
      * '''.marshall.xml'''
  * ...
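
As a rough illustration of how such a layout could be walked on import, the sketch below parses a manifest like the `.objects.xml` shown above with the standard library and derives the path of each child's `.marshall.xml`. The sample manifest contains only the single record quoted above; the function names `read_manifest` and `marshall_paths` are hypothetical and not part of quintagroup.transmogrifier or GenericSetup.

```python
import xml.etree.ElementTree as ET

# Sample manifest, limited to the one record quoted in the description.
SAMPLE_MANIFEST = """<?xml version="1.0" ?>
<manifest>
  <record type="Document">front-page</record>
</manifest>
"""


def read_manifest(xml_text):
    """Return (object id, portal type) pairs listed in a manifest."""
    root = ET.fromstring(xml_text)
    return [
        (record.text.strip(), record.get("type"))
        for record in root.findall("record")
    ]


def marshall_paths(folder_path, xml_text):
    """Return the expected .marshall.xml path for each listed child."""
    return [
        "%s/%s/.marshall.xml" % (folder_path, object_id)
        for object_id, _portal_type in read_manifest(xml_text)
    ]


records = read_manifest(SAMPLE_MANIFEST)
paths = marshall_paths("structure", SAMPLE_MANIFEST)
```

Because folders carry their own `.objects.xml`, the same two functions could be applied recursively to descend into containers such as `aggregator/`.
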
The work on this PLIP is split into two major steps:
1. Get a reliable, complete, hard-wired content im-/export for
   an out-of-the-box Plone site.
2. Make the system flexible enough to support add-on products and
   maybe TTW (through-the-web) configuration of the export process.
=== 1. Out-of-the-box Plone im-/export ===
This already supports add-ons as long as all their information is stored
in Archetypes schemata.
1. Review the existing blueprints.
2. Determine what additional information needs to be exported and
   write the missing blueprints.
3. Write pipeline configurations for import and for export that work
   within one Plone version.
4. Write a utility and a basic export control panel.
5. Get all used packages into the collective or the Plone repository,
   where they can be maintained.
=== 2. Flexibility to support add-ons and configuration ===
A transmogrifier pipeline consists of a number of sections, each of which
names the blueprint to use and sets a number of configuration variables.
{{{
>>> exampleconfig = """\
... [transmogrifier]
... pipeline =
...     section 1
...     section 2
...
... [section 1]
... blueprint = collective.transmogrifier.tests.examplesource
... size = 5
...
... [section 2]
... blueprint = collective.transmogrifier.tests.exampletransform
... """
}}}
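
To make the relation between the `pipeline` option and the individual sections concrete, here is a small sketch that reads a configuration of this shape with Python's standard `configparser`. It is not transmogrifier's own configuration loader; `resolve_pipeline` is a hypothetical helper, and the indentation of the continuation lines under `pipeline =` is assumed, since an ini-style parser needs it to treat them as one value.

```python
from configparser import ConfigParser

EXAMPLE_CONFIG = """\
[transmogrifier]
pipeline =
    section 1
    section 2

[section 1]
blueprint = collective.transmogrifier.tests.examplesource
size = 5

[section 2]
blueprint = collective.transmogrifier.tests.exampletransform
"""


def resolve_pipeline(config_text):
    """Return (section name, blueprint name, options) in pipeline order."""
    parser = ConfigParser()
    parser.read_string(config_text)
    raw = parser.get("transmogrifier", "pipeline")
    # One section name per line; names may contain spaces.
    names = [line.strip() for line in raw.splitlines() if line.strip()]
    resolved = []
    for name in names:
        options = dict(parser.items(name))
        blueprint = options.pop("blueprint")
        resolved.append((name, blueprint, options))
    return resolved


sections = resolve_pipeline(EXAMPLE_CONFIG)
```

All option values arrive as strings, so a section that needs a number (such as `size`) has to convert it itself.
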
We split the configuration into PloneTransmogrifierConfigProviders. They
provide
* information for the user interface (title, description)
* one or more sections, together with information about
  * which kind of blueprint the section contains (source, transformer,
    writer; reader, transformer, constructor)
  * the priority of the section (like init scripts) within the group
The utility that composes the pipeline can then order the sections it
receives from different ConfigProviders without knowing anything more
about them. If an add-on registers a ConfigProvider, its sections can be
integrated into the pipeline with little risk of breaking the export.
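
One way the composing utility might order sections is sketched below. Everything in it is an assumption made for illustration: the PLIP does not define the provider interface, so the `ConfigProvider` and `SectionInfo` classes, the group order in `EXPORT_GROUPS`, the numeric priorities and all section and blueprint names are hypothetical.

```python
from dataclasses import dataclass

# Assumed group order for an export pipeline.
EXPORT_GROUPS = ("source", "transformer", "writer")


@dataclass(frozen=True)
class SectionInfo:
    name: str        # section name used in the pipeline
    blueprint: str   # dotted name of the blueprint (hypothetical here)
    group: str       # kind of blueprint, one of EXPORT_GROUPS
    priority: int = 50  # lower runs earlier within its group


@dataclass(frozen=True)
class ConfigProvider:
    title: str
    description: str
    sections: tuple


def compose_pipeline(providers, groups=EXPORT_GROUPS):
    """Order all sections by group first, then by priority, then by name."""
    sections = [s for provider in providers for s in provider.sections]
    ordered = sorted(
        sections,
        key=lambda s: (groups.index(s.group), s.priority, s.name),
    )
    return [s.name for s in ordered]


core = ConfigProvider(
    title="Core content",
    description="Exports Archetypes content.",
    sections=(
        SectionInfo("tarball-writer", "example.core.writer", "writer", 90),
        SectionInfo("content-source", "example.core.source", "source", 10),
        SectionInfo("marshall", "example.core.marshall", "transformer", 50),
    ),
)
addon = ConfigProvider(
    title="Ratings add-on",
    description="Adds rating data to exported items.",
    sections=(
        SectionInfo("ratings-export", "example.addon.ratings", "transformer", 60),
    ),
)

pipeline_order = compose_pipeline([addon, core])
```

The add-on's section lands between the core transformer and the writer regardless of registration order, which is the property that keeps the risk of breaking the export low.
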
==== Why not have one config provider per available blueprint? ====
1. Several sections (blueprints) belong together if they do one job at
   different points in the pipeline. An example is one blueprint that
   reads which object is the canonical version of a translation and a
   second blueprint that links the objects together after another
   blueprint has constructed them.
2. One blueprint can also be used several times, e.g. one that
   transforms parts based on a regular expression, so more than one
   PloneTransmogrifierConfigProvider can use the same blueprint.
==== Configurability ====
PloneTransmogrifierConfigProviders can also be used to give the user the
option to disable or configure certain tasks. Every provider could contain
a Zope schema from which an edit form is displayed, including an option to
disable the provider. Whether this makes sense in general has yet to be
explored.

Another option would be to write a set of configurable filter blueprints
that let the user choose e.g. the set of content types that are removed
before the export archive is generated or before the imported data is
written to the database.
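
A filter blueprint of this kind could look roughly like the following sketch. It is written in plain Python without importing collective.transmogrifier; the class name `TypeFilterSection` and the option name `exclude-types` are hypothetical, and the use of a `_type` key for an item's portal type is an assumption about how items are represented.

```python
class TypeFilterSection:
    """Drops items whose portal type is listed in the section options."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        # Whitespace-separated list of portal types to remove.
        self.excluded = set(options.get("exclude-types", "").split())

    def __iter__(self):
        for item in self.previous:
            if item.get("_type") in self.excluded:
                continue  # filtered out: never reaches later sections
            yield item


incoming = [
    {"_path": "/front-page", "_type": "Document"},
    {"_path": "/news/aggregator", "_type": "Topic"},
    {"_path": "/events/party", "_type": "Event"},
]
section = TypeFilterSection(
    None, "type-filter", {"exclude-types": "Topic Event"}, iter(incoming)
)
kept = list(section)
```

Since the filter is just another section, the same class could sit before the writer in an export pipeline or before the constructor in an import pipeline.
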
== Risks ==
The key component for reading and writing Archetypes content is atxml
from the Products.Marshall package. This package is kept in a working
state but is not well maintained, and its unit tests do not pass. This
seems to be an acceptable risk, as it has been the case for a long time
and the package seems to be used by many people.

The im-/export package itself might not be finished within the 4.0
release cycle. Besides the glue code, there are lots of details to be
implemented and tested. But it is no problem to introduce the package in a
later Plone 4.x release.
== Deliverables ==
* Consolidated blueprint packages
* A Plone package that contains the configuration backend and the control
  panel
* ConfigProviders, partly in the Plone package, partly in the external
  packages that implement the blueprints
* Unit tests
* Developer and end-user documentation
== Participants ==
Carsten Senger (csenger)
== Progress and further information ==
See PlipContentImExport
--
Comment(by csenger):
Add link to PlipContentImExport to Description
--
Ticket URL: <https://dev.plone.org/old/plone/ticket/9328#comment:18>
Plone <http://plone.org>
Plone Content Management System