[Product-Developers] collective.recipe.backup doesn't clean up older blob backups and snapshots

Maurits van Rees m.van.rees at zestsoftware.nl
Mon Sep 12 10:56:47 UTC 2011


Hi Gil,

On 09-09-11 12:48, Gil Forcada wrote:
> Hi,
>
> I'm enjoying c.r.backup a lot, actually it saved my day thanks to the
> new version with blob backups!
>
> One thing I noticed though is that keeping all defaults[1] the only
> folder that gets cleaned is var/snapshotbackups, the other
> backup/snapshot folders keep growing and growing:
> - var/backups (incremental backups)
> - var/blobstoragebackups (incremental blob backups)
> - var/blobstoragesnapshots (full blob backups)
>
> Was it meant to be like this, or is it just because a .0 release is
> always waiting for a .1 to enhance the release? :)

Well, there is already a 2.1 release. :-)

The incremental backups only need to be cleaned up after a zeopack has
been run, because the next backup then automatically becomes a full
backup (standard repozo behaviour).  So if you are not doing zeopacks,
var/backups will keep growing.  Is that maybe what is happening for
you?  Or are you regularly packing the database?  (For most projects I
pack once a week.)  As far as I know the cleanup works, and there are
tests.

For the blobstorage backups or snapshots I realize I have not added code 
for clearing (removing) old backups.  In a standard setup on Linux or 
Mac it should not matter much, as hard links are used even for the full 
backups, so for each backup only the new blobs take up extra space. 
Note that changing a file or image in Plone will lead to a new blob; 
existing blobs are never changed.
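
To illustrate why hard-linked full backups cost so little extra space,
here is a minimal sketch.  The function name hardlink_tree and the
directory names are made up for illustration, and this is not the code
the recipe itself uses; it only shows that a hard-linked "copy" shares
its data with the original file.

```python
import os


def hardlink_tree(src, dst):
    """Recreate the directory tree of src under dst, hard-linking files.

    Hypothetical helper for illustration only.  Assumes src and dst are
    on the same filesystem and that dst is not located inside src.
    """
    for dirpath, dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # A hard link is a second name for the same data on disk,
            # so no blob contents are duplicated.
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))
```

After such a copy, os.path.samefile() on an original blob and its
counterpart in the backup returns True: both names point at the same
data on disk.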

For blobstoragesnapshots it should be pretty simple to remove the oldest
backups, as for each snapshot taken from the Data.fs there is one
snapshot taken from the blobstorage.  I'll have a look at that.
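
A rough sketch of what such a cleanup could look like, assuming each
blob snapshot is a separate subdirectory and that its modification time
reflects when the snapshot was made.  The function name and the keep
parameter are hypothetical, not part of the recipe.

```python
import os
import shutil


def remove_old_snapshots(snapshot_root, keep=2):
    """Delete all but the `keep` most recently modified subdirectories.

    Hypothetical helper: assumes one subdirectory per blob snapshot and
    that directory mtime is a usable proxy for snapshot age.
    Returns the list of removed paths.
    """
    dirs = [os.path.join(snapshot_root, name)
            for name in os.listdir(snapshot_root)]
    dirs = [d for d in dirs if os.path.isdir(d)]
    # Newest first, so everything after the first `keep` entries is old.
    dirs.sort(key=os.path.getmtime, reverse=True)
    removed = dirs[keep:]
    for path in removed:
        shutil.rmtree(path)
    return removed
```

With keep set to the same number as for the Data.fs snapshots, the two
snapshot directories would stay in step.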

For blobstoragebackups it is trickier, as we cannot simply count the
backups and keep the last two: we need to keep all backups that
correspond to the latest two full Data.fs backups, including their
incremental backups.  We could order by last modified date and remove
all blob backup directories that were created before the oldest full
Data.fs backup that we keep.
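
As a sketch of that idea, under a few assumptions: full repozo backups
are recognised by their .fs (or gzipped .fsz) extension, each blob
backup is one subdirectory, and modification times reflect when each
backup was made.  The function names and the keep_full parameter are
hypothetical.

```python
import os
import shutil


def oldest_kept_full_backup_time(backup_dir, keep_full=2):
    """Return the mtime of the oldest full Data.fs backup we keep.

    Full repozo backups end in .fs (or .fsz when gzipped); incremental
    ones end in .deltafs / .deltafsz and are skipped here.
    Returns None when there is nothing to base a cutoff on.
    """
    fulls = [os.path.join(backup_dir, name)
             for name in os.listdir(backup_dir)
             if name.endswith(('.fs', '.fsz'))]
    if not fulls or keep_full < 1:
        return None
    mtimes = sorted((os.path.getmtime(f) for f in fulls), reverse=True)
    return mtimes[:keep_full][-1]


def remove_stale_blob_backups(blob_backup_root, backup_dir, keep_full=2):
    """Remove blob backup directories older than the cutoff.

    Hypothetical helper: assumes directory mtime is a usable proxy for
    the time the blob backup was made.  Returns the removed paths.
    """
    cutoff = oldest_kept_full_backup_time(backup_dir, keep_full)
    if cutoff is None:
        return []
    removed = []
    for name in sorted(os.listdir(blob_backup_root)):
        path = os.path.join(blob_backup_root, name)
        if os.path.isdir(path) and os.path.getmtime(path) < cutoff:
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Blob backups made after the cutoff, including those that belong to
incremental Data.fs backups, are left alone.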

In fact, when restoring blobs the generated scripts simply copy over the 
most recent backup.  Any option you give that restores a backup of the
Data.fs to a specific point in time is ignored for the blobs: as far as
I know existing filesystem blobs are never changed and having a few 
extra blobs that are not used does not hurt.  If anyone knows this is 
bad, please speak up.

We could be smarter about restoring blobs to a specific point in time 
(pick the first blob backup with a modification date after the requested 
time), but it may be safer not to try to be too smart here.
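
For reference, the "smarter" selection could be sketched as below.  This
is only an illustration of the idea, not something the recipe does; the
function name is hypothetical and it assumes one subdirectory per blob
backup whose modification time reflects when the backup was made.

```python
import os


def pick_blob_backup(blob_backup_root, requested_time):
    """Return the earliest blob backup modified after requested_time.

    requested_time is a timestamp in seconds since the epoch.
    Hypothetical helper; returns None if no backup is recent enough.
    """
    candidates = []
    for name in os.listdir(blob_backup_root):
        path = os.path.join(blob_backup_root, name)
        if not os.path.isdir(path):
            continue
        mtime = os.path.getmtime(path)
        if mtime > requested_time:
            candidates.append((mtime, path))
    if not candidates:
        return None
    # The smallest mtime is the first backup made after the requested time.
    return min(candidates)[1]
```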


-- 
Maurits van Rees   http://maurits.vanrees.org/
Web App Programmer at Zest Software: http://zestsoftware.nl
"Logical thinking shows conclusively that logical thinking
is inconclusive." - My summary of Gödel, Escher, Bach


