Re: [Sks-devel] corrupt PTree?


From: Jonathon Weiss
Subject: Re: [Sks-devel] corrupt PTree?
Date: Wed, 14 Oct 2009 18:42:17 -0400

> 
> > > Whenever I try to start the reconciliation server, it dies:
> > 
> > > 2009-09-24 15:06:09 Raising Sys.Break -- PTree may be corrupted: Failure("remove_from_node: attempt to delete non-existant element from prefix tree")
> > > 2009-09-24 15:06:09 DB closed
> >
> > > Even after I rebuilt the PTree from scratch (admittedly with the sks
> > > server running the whole time (that should be safe, right?)) I
> > > continue to get the same error.  Any thoughts or recommendations?
> > 
> > This can be made into a nonfatal error, but at the expense of
> > losing new and updated keys from the pTree, thus breaking gossip
> > synchronization.  Because the key DBs are separate from the pTree
> > DB, and transactions can't span both DBs, the pTree DB cannot be
> > kept fully synchronized with the key DBs.  Thus the ./KDB/time DB
> > is used as a workaround.  Unless/until the DBs are consolidated into
> > a single BDB instance, you'll need to either run "sks db" in read-only
> > mode (not currently supported) or shut it down during the "sks pbuild."
> 
> Well, I tried shutting down sksd and rebuilding the PTree db, and am
> still losing.  The recon server log says:
> 
> 2009-10-03 04:36:16 Opening log
> 2009-10-03 04:36:16 sks_recon, SKS version 1.1.0
> 2009-10-03 04:36:16 Copyright Yaron Minsky 2002-2003
> 2009-10-03 04:36:16 Licensed under GPL.  See COPYING file for details
> 2009-10-03 04:36:16 Opening PTree database
> 2009-10-03 04:36:16 Setting up PTree data structure
> 2009-10-03 04:36:16 PTree setup complete
> 2009-10-03 04:36:16 Initiating catchup
> 2009-10-03 04:36:16 Added 5000 hash-updates. Caught up to 1245856578.239066
> 2009-10-03 04:36:16 Added 5000 hash-updates. Caught up to 1245856578.424982
> 2009-10-03 04:36:16 Added 5000 hash-updates. Caught up to 1245856578.608924
> 2009-10-03 04:36:16 Added 5000 hash-updates. Caught up to 1245856579.369999
> 2009-10-03 04:36:17 Added 5000 hash-updates. Caught up to 1245856579.570681
> 2009-10-03 04:36:17 Added 5000 hash-updates. Caught up to 1245856579.757335
> 2009-10-03 04:36:35 Added 5000 hash-updates. Caught up to 1245856580.592294
> 2009-10-03 04:36:36 Added 5000 hash-updates. Caught up to 1245856580.785099
> 2009-10-03 04:36:38 Added 5000 hash-updates. Caught up to 1245856580.973482
> 2009-10-03 04:36:39 Added 5000 hash-updates. Caught up to 1245856581.867670
> 2009-10-03 04:36:45 Added 5000 hash-updates. Caught up to 1245856582.054365
> 2009-10-03 04:36:46 Added 5000 hash-updates. Caught up to 1245856582.240482
> 2009-10-03 04:36:46 Added 5000 hash-updates. Caught up to 1245856583.526407
> 2009-10-03 04:36:47 Raising Sys.Break -- PTree may be corrupted: Failure("add_to_node: attempt to reinsert element into prefix tree")
> 2009-10-03 04:36:47 DB closed
> 2009-10-05 16:39:10 Opening log
> 2009-10-05 16:39:10 sks_recon, SKS version 1.1.0
> 2009-10-05 16:39:10 Copyright Yaron Minsky 2002-2003
> 2009-10-05 16:39:10 Licensed under GPL.  See COPYING file for details
> 2009-10-05 16:39:10 Opening PTree database
> 2009-10-05 16:39:14 Setting up PTree data structure
> 2009-10-05 16:39:14 PTree setup complete
> 2009-10-05 16:39:14 Initiating catchup
> 2009-10-05 16:39:15 Raising Sys.Break -- PTree may be corrupted: Failure("add_to_node: attempt to reinsert element into prefix tree")
> 2009-10-05 16:39:15 DB closed
> 
> Which means it isn't getting very far before finding corruption, even
> on a brand new DB.  Any thoughts?  The only things I can even think of
> to try are different arguments to "sks pbuild" or running "sks
> cleandb" in case that's where the problem really is.
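
One way to get a little more out of the recon log quoted above is to decode
the "Caught up to" values. The sketch below is a small Python helper, not
part of SKS; it assumes those values are Unix epoch seconds (the log does
not say so explicitly), and the function name catchup_times is made up for
illustration.

```python
import re
from datetime import datetime, timezone

# Matches the "Caught up to <seconds>" suffix seen in the recon log lines.
CAUGHT_UP = re.compile(r"Caught up to (\d+\.\d+)")


def catchup_times(log_lines):
    """Return each 'Caught up to' value as a timezone-aware UTC datetime.

    Assumption: the logged value is seconds since the Unix epoch.
    """
    times = []
    for line in log_lines:
        match = CAUGHT_UP.search(line)
        if match:
            seconds = float(match.group(1))
            times.append(datetime.fromtimestamp(seconds, tz=timezone.utc))
    return times


# First and last catchup lines from the 2009-10-03 run quoted above.
sample = [
    "2009-10-03 04:36:16 Added 5000 hash-updates. Caught up to 1245856578.239066",
    "2009-10-03 04:36:46 Added 5000 hash-updates. Caught up to 1245856583.526407",
]

for when in catchup_times(sample):
    print(when.isoformat())
```

If the epoch assumption holds, both sample values fall on 24 June 2009 (UTC),
and the 13 batches of 5000 hash-updates in that run cover only about five
seconds of logged time.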

I eventually got around to trying the sks cleandb, but that was
apparently a no-op:

        2009-10-14 18:38:43 Opening log
        2009-10-14 18:38:43 Opening KeyDB database
        2009-10-14 18:38:43 Keydb opened
        2009-10-14 18:38:43 Database already deduped
        2009-10-14 18:38:43 Database already merged


Anyone have any thoughts on this?  Either solutions, or next steps in
debugging that might garner some additional information?
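
To reason about the two failure messages in this thread, here is a toy
Python model. It is only a sketch and not how SKS implements the PTree:
StrictSet, find_conflicts, and the sample data are all hypothetical. It
shows how replaying logged updates onto a tree that already holds (or
already lacks) an element produces "reinsert" and "delete non-existent"
style errors, and how the offending updates could be listed instead of
stopping at the first one.

```python
class StrictSet:
    """Toy stand-in for a structure that rejects inconsistent updates."""

    def __init__(self):
        self._items = set()

    def add(self, item):
        if item in self._items:
            raise ValueError("attempt to reinsert element")
        self._items.add(item)

    def remove(self, item):
        if item not in self._items:
            raise ValueError("attempt to delete non-existent element")
        self._items.remove(item)


def find_conflicts(initial, events):
    """Replay (timestamp, op, item) events; return the ones that would fail."""
    tree = StrictSet()
    for item in initial:
        tree.add(item)
    conflicts = []
    for timestamp, op, item in events:
        try:
            if op == "add":
                tree.add(item)
            elif op == "remove":
                tree.remove(item)
            else:
                raise ValueError("unknown operation: " + op)
        except ValueError as err:
            conflicts.append((timestamp, op, item, str(err)))
    return conflicts


# Hypothetical data: "b" is already in the freshly built tree and the
# update log adds it again; "z" is removed without ever being present.
initial = ["a", "b"]
events = [(1.0, "add", "c"), (2.0, "add", "b"), (3.0, "remove", "z")]

for conflict in find_conflicts(initial, events):
    print(conflict)
```

Collecting the conflicts rather than aborting mirrors the "nonfatal error"
option mentioned earlier in the thread, with the same caveat: skipping
updates hides the inconsistency instead of repairing it.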

        Jonathon

        Jonathon Weiss <address@hidden>
        MIT/IS&T/OIS  Server Operations



