monotone-devel

Re: [Monotone-devel] Freeze on Windows using ssh transport


From: Daniel Carosone
Subject: Re: [Monotone-devel] Freeze on Windows using ssh transport
Date: Wed, 21 Feb 2007 07:03:23 +1100
User-agent: Mutt/1.5.13 (2006-08-11)

On Tue, Feb 20, 2007 at 04:19:15PM +1100, William Uther wrote:
>   A central Unix box has a db.  Everyone has a login.  Everyone is  
> using ssh to sync.  Everyone (but me) is using 0.32 binaries from the  
> website (incl. the unix box).  (I'm using a more recent revision.)
> 
>   The people using windows (and only them) are having intermittent  
> sync issues.  When they sync, mtn counts everything, then exchanges  
> about 2k each direction, then freezes.  The weird part is that this  
> is intermittent.  It seems to work for an initial sync, and if it  
> just worked then it will work again. 

There are a couple of possibilities here, and a cautionary note.

The ssh:// db access method is really not meant for multiuser work;
there's no support for multiple concurrent 'clients' accessing the
same server db in parallel, other than via netsync.  So multiple
concurrent users will trip over db locking issues.

The first -- unlikely -- possibility is that you're tripping over this
but in some odd way that isn't making the db locking error obvious.  I
doubt that.

The second is that there still appear to be some cases where netsync
gets wedged.  The usual workaround is to try either just pulling or
pushing (rather than sync) or to change the selection set (by syncing
just a subset of branches, or by committing some more local revs).  We
thought we had killed the last of these some time ago, but we hit
another instance during the summit, in that case with nearly-empty
databases (i.e., containing only a handful of revs).
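
As a rough illustration of that workaround, here is a minimal Python
sketch that only builds the command lines one might try in place of a
plain sync: a pull, then a push, then optionally a sync restricted to a
narrower branch pattern. The database file name, the ssh:// address and
the branch patterns are all hypothetical placeholders, and the helper
name is invented for this sketch; check the exact address syntax against
the manual for the mtn version in use.

```python
def workaround_commands(db, server, pattern, narrower_pattern=None):
    """Build mtn command lines to try when a plain 'sync' wedges.

    All argument values are supplied by the caller; nothing here is run.
    """
    commands = [
        # Try each direction on its own rather than a two-way sync.
        ["mtn", "--db=" + db, "pull", server, pattern],
        ["mtn", "--db=" + db, "push", server, pattern],
    ]
    if narrower_pattern is not None:
        # Changing the selection set: sync only a subset of branches.
        commands.append(["mtn", "--db=" + db, "sync", server, narrower_pattern])
    return commands


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for argv in workaround_commands(
        "local.mtn",
        "ssh://user@unixbox.example.com/path/to/shared.mtn",
        "com.example.project*",
        narrower_pattern="com.example.project.stable",
    ):
        print(" ".join(argv))
```

The sketch prints the commands instead of running them, so each one can
be tried by hand and the first one that gets past the freeze noted.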

If it's the latter possibility, it's not at all clear to me why it
should only happen for you on windows, or why it doesn't happen over
native netsync. It may be that some data is not getting flushed to the
socket, in the windows ssh case, and that produces the same symptoms.
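
To make the "not getting flushed" idea concrete, here is a small Python
sketch of the general effect, assuming a POSIX system. It is not
monotone's code and says nothing about where such a bug would actually
be; it only shows that bytes written through a buffered writer are
invisible to the reading end of a pipe until the writer flushes, which
from the reader's side looks exactly like a peer that has stopped
talking.

```python
import os
import select


def readable(fd, timeout=0.1):
    """Return True if fd has data waiting within `timeout` seconds."""
    ready, _, _ = select.select([fd], [], [], timeout)
    return bool(ready)


def demo():
    read_fd, write_fd = os.pipe()
    # Buffered writer: small writes stay in user space until flushed.
    writer = os.fdopen(write_fd, "wb", buffering=4096)

    writer.write(b"netsync-like packet")
    before = readable(read_fd)   # nothing has reached the pipe yet

    writer.flush()
    after = readable(read_fd)    # now the reader can see the data

    data = os.read(read_fd, 4096)
    writer.close()
    os.close(read_fd)
    return before, after, data


if __name__ == "__main__":
    print(demo())
```

If both ends of a protocol are waiting for the other's next packet and
one packet is sitting in such a buffer, neither side makes progress.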

> If it is failing for a user it will continue to fail for them.  If
> they have no local changes to send, then they can just blow away
> their local db, make a new one, re-sync, and it will work... until
> they try to sync 30 minutes later and it fails.

That sounds very much like the sync getting wedged problem.  Next time
you see it, see if you can provoke it to continue by changing the set
of revs being exchanged, as above.  It would also be very interesting
to know if you do see it over netsync some time later, or if you stop
seeing it at all (even on ssh/windows) after you have accumulated some
more revs overall.

>   We've switched to a netsync server on one of the windows boxes and  
> that seems to work fine.

It's odd that it should be different, but you want to use it this way
anyway.

--
Dan.


