[Gluster-devel] debugging ping timeouts
From: Pranith Kumar Karampuri
Subject: [Gluster-devel] debugging ping timeouts
Date: Fri, 21 Mar 2014 05:25:29 -0400 (EDT)
hi,
I do not think glusterfs can currently tell why a ping-timeout happened.
By the time a user learns of such an event, the client has already
disconnected and reconnected, so the issue can no longer be debugged. One
reason ping-timeouts may happen is that the epoll thread is busy doing
something else, most probably waiting on a mutex lock. So I am thinking we
should record some extra information around lock acquisition, namely how long
each lock was waited on and how long each critical section ran, and report it
at the time of disconnect.
pseudo code:
PTHREAD_MUTEX_LOCK(lock) {
    get the current time as T1;
    pthread_mutex_lock (lock);
    get the current time as T2;
    if T2-T1 (time spent waiting for the lock) is greater than the longest
    wait recorded so far, update it //maybe we should also remember the
    xlator in which it happened.
}
PTHREAD_MUTEX_UNLOCK(lock) {
    get the current time as T3;
    pthread_mutex_unlock (lock);
    if T3-T2 (time the lock was held) is greater than the longest hold
    recorded so far, update it
}
Something similar should be done for spin locks as well.
When a disconnect event arrives, this information will be logged along with
the disconnect messages.
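
To make the idea concrete, here is a minimal C sketch of the wrappers described in the pseudo code. It is only an illustration under assumptions: timed_lock_t, timed_lock_acquire, timed_lock_release and timed_lock_report are hypothetical names, not existing glusterfs code, and the xlator bookkeeping is left out. Unlike the pseudo code, the hold time is recorded just before pthread_mutex_unlock so that the counters are only ever updated while the mutex is held.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical wrapper type: none of these names exist in glusterfs. */
typedef struct {
    pthread_mutex_t mutex;
    struct timespec acquired_at; /* T2, only touched by the current holder */
    int64_t max_wait_ns;         /* longest T2 - T1 seen so far */
    int64_t max_hold_ns;         /* longest T3 - T2 seen so far */
} timed_lock_t;

static int64_t elapsed_ns(const struct timespec *from,
                          const struct timespec *to)
{
    return (int64_t)(to->tv_sec - from->tv_sec) * 1000000000LL
           + (int64_t)(to->tv_nsec - from->tv_nsec);
}

static int timed_lock_init(timed_lock_t *tl)
{
    assert(tl != NULL);
    tl->max_wait_ns = 0;
    tl->max_hold_ns = 0;
    return pthread_mutex_init(&tl->mutex, NULL);
}

static int timed_lock_acquire(timed_lock_t *tl)
{
    struct timespec t1, t2;

    clock_gettime(CLOCK_MONOTONIC, &t1);      /* T1: before waiting */
    int ret = pthread_mutex_lock(&tl->mutex);
    if (ret != 0)
        return ret;
    clock_gettime(CLOCK_MONOTONIC, &t2);      /* T2: lock acquired */

    /* We hold the mutex here, so these updates are serialized. */
    int64_t waited = elapsed_ns(&t1, &t2);
    if (waited > tl->max_wait_ns)
        tl->max_wait_ns = waited;
    tl->acquired_at = t2;
    return 0;
}

static int timed_lock_release(timed_lock_t *tl)
{
    struct timespec t3;

    clock_gettime(CLOCK_MONOTONIC, &t3);      /* T3: end of critical section */
    int64_t held = elapsed_ns(&tl->acquired_at, &t3);
    if (held > tl->max_hold_ns)
        tl->max_hold_ns = held;
    return pthread_mutex_unlock(&tl->mutex);
}

/* Meant to be called from the disconnect path to print the worst cases. */
static void timed_lock_report(timed_lock_t *tl, FILE *out)
{
    pthread_mutex_lock(&tl->mutex);
    fprintf(out, "max lock wait: %lld ns, max lock hold: %lld ns\n",
            (long long)tl->max_wait_ns, (long long)tl->max_hold_ns);
    pthread_mutex_unlock(&tl->mutex);
}
```

CLOCK_MONOTONIC is used so that wall-clock adjustments do not distort the measurements. The two clock_gettime calls per lock operation add some overhead on hot paths, so whether this should always be enabled or be switched on only for debugging is an open question.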
If you can think of anything else, please add it to the thread. We will
make a call after a while on what can be done to debug such issues
further.
Pranith