[Product-Developers] collective.recipe.backup doesn't clean up older blob backups and snapshots

Maurits van Rees m.van.rees at zestsoftware.nl
Wed Sep 14 00:03:52 UTC 2011

On 12-09-11 12:56, Maurits van Rees wrote:
> Hi Gil,
> On 09-09-11 12:48, Gil Forcada wrote:
>> Hi,
>> I'm enjoying c.r.backup a lot, actually it saved my day thanks to the
>> new version with blob backups!
>> One thing I noticed though is that keeping all defaults[1] the only
>> folder that gets cleaned is var/snapshotbackups, the other
>> backup/snapshot folders keep growing and growing:
>> - var/backups (incremental backups)
>> - var/blobstoragebackups (incremental blob backups)
>> - var/blobstoragesnapshots (full blob backups)
>> Was it meant to be like this or is just because a .0 release is always
>> waiting for a .1 to enhance the release? :)
> Well, there is already a 2.1 release. :-)

And a 2.2 now. :-)

> The incremental backups only need to be cleaned up when a zeopack has
> been run as the next backup then automatically makes a full backup
> (standard repozo behaviour). So if you are not doing zeopacks then
> var/backups will keep increasing. Is that maybe what is happening for
> you? Or are you regularly packing the database? (For most projects I
> pack once a week.) It should work as far as I know, and there are tests.
> For the blobstorage backups or snapshots I realize I have not added code
> for clearing (removing) old backups. In a standard setup on Linux or Mac
> it should not matter much, as hard links are used even for the full
> backups, so for each backup only the new blobs take up extra space. Note
> that changing a file or image in Plone will lead to a new blob; existing
> blobs are never changed.
> For blobstoragesnapshots it should be pretty simple to remove the oldest
> backups as for each snapshot taken from the Data.fs there is one
> snapshot taken from the blobstorage. I'll have a look at that.


> For blobstoragebackups it is trickier, as we cannot count the backups
> and keep the last two: we need to keep all backups that correspond to
> the latest two full Data.fs backups including their incremental backups.
> We could order by last modified date and remove all blob backup
> directories that were created before the last kept full Data.fs backup.

I have added a keep_blob_days option.  From the docs:

Number of days of blob backups to keep. Defaults to 14, so two weeks. 
This is only used for partial (full=False) backups, so this is what gets 
used normally when you do a bin/backup. This option has been added in 
2.2. For full backups (snapshots) we just use the keep option. 
Recommended is to keep these values in sync with how often you do a 
zeopack on the Data.fs, according to the formula keep * 
days_between_zeopacks = keep_blob_days. The default matches one zeopack 
per seven days (2*7=14).
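
As a rough illustration of the formula and of age-based cleanup, here is a minimal Python sketch. It is not the recipe's actual code: the function names, the `backup_root` argument, and the choice to compare directory modification times against a cutoff are all assumptions made for this example.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def recommended_keep_blob_days(keep, days_between_zeopacks):
    """Apply the formula keep * days_between_zeopacks = keep_blob_days."""
    return keep * days_between_zeopacks


def stale_blob_backups(backup_root, keep_blob_days, now=None):
    """List subdirectories of backup_root older than keep_blob_days.

    Hypothetical helper: age is judged by the directory's modification
    time. Nothing is deleted; the caller decides what to do with the
    returned paths, which are sorted oldest first.
    """
    if now is None:
        now = time.time()
    cutoff = now - keep_blob_days * SECONDS_PER_DAY
    stale = []
    for entry in os.scandir(backup_root):
        if entry.is_dir(follow_symlinks=False):
            mtime = entry.stat(follow_symlinks=False).st_mtime
            if mtime < cutoff:
                stale.append((mtime, entry.path))
    return [path for mtime, path in sorted(stale)]
```

With keep = 2 and a weekly zeopack, `recommended_keep_blob_days(2, 7)` gives the default of 14. Something like `stale_blob_backups("var/blobstoragebackups", 14)` would then list the directories that are candidates for removal.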

> In fact, when restoring blobs the generated scripts simply copy over the
> most recent backup. Any option you give that restores a backup of the
> Data.fs in a specific point in time is ignored for the blobs: as far as
> I know existing filesystem blobs are never changed and having a few
> extra blobs that are not used does not hurt. If anyone knows this is
> bad, please speak up.
> We could be smarter about restoring blobs to a specific point in time
> (pick the first blob backup with a modification date after the requested
> time), but it may be safer not to try to be too smart here.

Then again, if you restore a Data.fs from before the last zeopack then 
you need a blob backup from before that last zeopack too.  So that is a 
to-do item.
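
The "first blob backup with a modification date after the requested time" idea could look roughly like the sketch below. This is only an illustration of the selection rule, not something the recipe does today; the function name and the (mtime, path) input format are made up for the example.

```python
def blob_backup_for(backups, requested_time):
    """Pick the earliest blob backup made at or after requested_time.

    backups is an iterable of (mtime, path) pairs, with mtime and
    requested_time both in seconds since the epoch. Returns the path,
    or None when every backup is older than the requested time.
    """
    candidates = [
        (mtime, path) for mtime, path in backups if mtime >= requested_time
    ]
    if not candidates:
        return None
    # min() compares on mtime first, so this is the earliest candidate.
    return min(candidates)[1]
```

Whether "at or after" is the right comparison depends on how the Data.fs and blob backups are timed relative to each other, so treat the `>=` as an assumption too.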

Meanwhile, if you are using zc.buildout 1.5.x then you will want to use 
the latest 2.2 release: the scripts generated by 2.0 and 2.1 are broken 
with that zc.buildout version and a 'system python' that does not have a 
broken 'dash S' behaviour.  If you don't know what that means: be happy.
Just upgrade the recipe or stick to zc.buildout 1.4.4.

Note for recipe authors: watch out when creating scripts with the 
sitepackage_safe_scripts introduced in zc.buildout 1.5.  Test it well. 
The mentioned breakage was caused because I actually tried to be a good 
citizen and create those sitepackage_safe_scripts when possible.  In 2.2 
I almost got it working, but ran into a problem because the separately 
generated bin/repozo was not site package safe and this printed an error 
each time you ran bin/buildout.  The error seemed to be wrong and did 
not have ill effects, but it was printed nonetheless.  So I reverted to
always generating normal 'old style' scripts, which work fine in both
zc.buildout 1.4 and 1.5.  So for me the hassle of trying to upgrade to
site package safe scripts was not worth it, at least in this case. 
Don't let my momentary frustration deter you though. :-)

Maurits van Rees   http://maurits.vanrees.org/
Web App Programmer at Zest Software: http://zestsoftware.nl
"Logical thinking shows conclusively that logical thinking
is inconclusive." - My summary of Gödel, Escher, Bach
