gluster-devel

SV: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_


From: Fredrik Widlund
Subject: SV: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_
Date: Mon, 8 Feb 2010 22:59:16 +0100


Hi,

The system is an iPhone storage/streaming platform.

I had performance problems with glusterfs a couple of months ago, for
example when re-exporting over NFS, but these limitations seem to be gone.
Last week I benchmarked a live distribution server at 8 Gbps throughput on a
10GbE NIC, using glusterfs as the storage backend, which is impressive even
though iocache took most of the load. The bottleneck ended up being the
glusterfs-server backend, which eventually got LOOKUP() conflicts followed by
OPEN() errors and "desynced" identifiers that only reset when the
glusterfs-server backend was restarted, i.e. similar to the pure-ftpd
problems below.

[2010-01-27 14:13:50] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: 
LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode (ptr=0x7f2820017460, 
ino=5305, gen=5431421406367711815) found conflict (ptr=0x7f282000dd50, 
ino=5305, gen=5431421406367711815)
[2010-01-27 14:14:19] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: 
LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode (ptr=0x7f28180056c0, 
ino=5307, gen=5431421406367711864) found conflict (ptr=0x7f2810002910, 
ino=5307, gen=5431421406367711864)
[2010-01-27 14:15:19] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: 
LOOKUP(/download/91001/live/Layer4/prog_index.m3u8) inode (ptr=0x7f2818013290, 
ino=5301, gen=5431421406367711959) found conflict (ptr=0x7f28180043b0, 
ino=5301, gen=5431421406367711959)
[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 
4335255: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 (No such file 
or directory)
[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 
4335270: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 (No such file 
or directory)
[2010-01-27 14:16:09] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 
4335272: OPEN() /download/91001/live/Layer4/prog_index.m3u8 => -1 (No such file 
or directory)
[...] Multiple OPEN() errors...
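
To see which paths are affected and how often, the warnings above can be tallied with a short script. This is only a sketch: the two patterns are modelled on the excerpts quoted in this message, the function names are made up for illustration, and other glusterfs versions may format these lines differently.

```python
import re
from collections import Counter

# Patterns modelled on the log excerpts quoted in this message (assumption:
# other glusterfs versions may word these warnings differently).
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]")
LOOKUP_CONFLICT = re.compile(r"LOOKUP\((?P<path>[^)]+)\) inode .*? found conflict")
OPEN_FAILURE = re.compile(r"OPEN\(\) (?P<path>\S+) => -1")


def unwrap(text):
    """Rejoin log entries that were wrapped across several lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)          # a new entry starts with a timestamp
        else:
            entries[-1] += " " + line     # continuation of the previous entry
    return entries


def summarize(text):
    """Count LOOKUP conflicts and failed OPEN() calls per path."""
    conflicts, failures = Counter(), Counter()
    for entry in unwrap(text):
        match = LOOKUP_CONFLICT.search(entry)
        if match:
            conflicts[match.group("path")] += 1
            continue
        match = OPEN_FAILURE.search(entry)
        if match:
            failures[match.group("path")] += 1
    return conflicts, failures
```

Calling `summarize(open(logfile).read())` on a client log returns two counters keyed by path, which makes it easy to check whether the failing OPEN() calls hit the same files that reported conflicts.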

The raid+storage backend handles semi-random I/O with 100 concurrent streaming 
reads at 500MB/s, 40 at around 750MB/s, and 1 at around 1200MB/s, without any 
obvious performance bottlenecks in glusterfs.

Now, if you could just please add a cachefiles client-side translator similar 
to the NFS one? ;)

Btw, the problems below seem to be undefined behaviour in glusterfs.
Pure-ftpd manages to "desync" the glusterfs server so that it fails to read
files in the posix storage until the glusterfs server is restarted.

Kind regards,
Fredrik Widlund

-----Original Message-----
From: Tejas N. Bhise [mailto:address@hidden
Sent: 8 February 2010 20:06
To: Fredrik Widlund
Cc: address@hidden
Subject: Re: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_

Hi Fredrik,

Good to know it works and thanks for letting us know what caused the problem 
with your setup. Feel free to ask more questions. Just a word of caution - a 
couple of GlusterFS users recently saw XFS errors - just something to keep at 
the back of your mind in case you are trying to debug any problems later.

Do let us know more about what you are using the system for and how you have 
configured it etc - it could be a good use case for others on the user list.

Regards,
Tejas.

----- Original Message -----
From: "Fredrik Widlund" <address@hidden>
To: address@hidden
Sent: Monday, February 8, 2010 10:59:23 PM GMT +05:30 Chennai, Kolkata, Mumbai, 
New Delhi
Subject: [Gluster-devel] RE: LOOKUP conflict => OPEN failS_







Hi,



Ok, it seems to be solved for now. The writer was a pure-ftpd server, and the 
“-O, atomic replace” flag caused the behavior. I browsed through the code 
briefly and it uses among other things hard-link schemes to do atomic changes.
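
For readers unfamiliar with the technique, a hard-link based atomic replace can be sketched roughly as follows. This is not pure-ftpd's code; the function name and the temporary file naming are assumptions made for illustration. The point it shows is that the visible path is switched to a different inode in a single rename().

```python
import os
import tempfile


def atomic_replace(path, data):
    """Publish data at path so readers see either the old or the new file.

    Illustrative sketch only; the temporary names are hypothetical.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".upload-")
    staged = tmp + ".link"
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())   # make sure the content is on disk
        os.link(tmp, staged)            # second name for the same inode
        os.rename(staged, path)         # atomically replaces any old target
    finally:
        # Remove the working names; after a successful rename only path
        # still refers to the new inode.
        for leftover in (tmp, staged):
            try:
                os.unlink(leftover)
            except FileNotFoundError:
                pass
```

After every call the path refers to a new inode, while the old inode disappears once its last name and open handle are gone. Whether this inode turnover is what triggers the LOOKUP conflicts is not established in this thread.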



Kind regards,

Fredrik Widlund





From: address@hidden [mailto:address@hidden On Behalf Of Fredrik Widlund
Sent: 8 February 2010 16:57
To: address@hidden
Subject: [Gluster-devel] RE: LOOKUP conflict => OPEN fails_






It’s getting worse and worse. Upgraded to 3.0.2 but to no avail.



The prog_index.m3u8 files are being rewritten every 10 seconds. Every other 
read of a newly written index-file results in -1 and the file not being 
available, possibly until the next update of the file.



The strange thing is that until a few days ago this problem wasn’t noticeable
at all, and now it is huge. The only difference is the quickly growing number
of files on the filesystem, now around 190k files.



Kind regards,

Fredrik Widlund




From: address@hidden [mailto:address@hidden On Behalf Of Fredrik Widlund
Sent: 8 February 2010 15:02
To: address@hidden
Subject: [Gluster-devel] LOOKUP conflict => OPEN fails_






Hi,



I’m running a simple AFR setup, though currently with only one backend, and 2
tcp clients. Version is 3.0.0 from Jan 20.



Basically one client is writing a large number of files, continuously, and the 
other client is reading.



I have a growing problem with lookup “conflicts”, resulting in files that are
listed in directories but whose reads return “-1 (No such file…”.



Restarting the client does not solve the conflict, but restarting the server
does, and the files become available again.



The filesystem is a 5TB XFS hw raid-5 with around 150k files.



Debug trace of client:

[2010-02-08 13:39:29] N [trace.c:148:trace_open_cbk] replicated: 3073: 
(op_ret=0, op_errno=117, *fd=0x129a430)

[2010-02-08 13:39:37] N [trace.c:1837:trace_open] replicated: 3094: (loc 
{path=/download/90910/live/webb1/webb1/Layer3/prog_index.m3u8, ino=5042185}, 
flags=32768, fd=0x1296fc0, wbflags=0)

[2010-02-08 13:39:37] N [trace.c:148:trace_open_cbk] replicated: 3094: 
(op_ret=-1, op_errno=2, *fd=0x1296fc0)

[2010-02-08 13:39:37] W [fuse-bridge.c:858:fuse_fd_cbk] glusterfs-fuse: 3094: 
OPEN() /download/90910/live/webb1/webb1/Layer3/prog_index.m3u8 => -1 (No such 
file or directory)

[2010-02-08 13:39:38] N [trace.c:1837:trace_open] replicated: 3100: (loc 
{path=/download/90910/live/webb1/webb1/Layer4/prog_index.m3u8, ino=5013773}, 
flags=32768, fd=0x1296fc0, wbflags=0)

[2010-02-08 13:39:38] N [trace.c:148:trace_open_cbk] replicated: 3100: 
(op_ret=0, op_errno=117, *fd=0x1296fc0)

[2010-02-08 13:39:38] N [trace.c:1837:trace_open] replicated: 3106: (loc 
{path=/download/90910/live/webb1/webb1/Layer4/Period1/segment277.ts, 
ino=5050371}, flags=32768, fd=0x129a430, wbflags=0)

[…]
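
The numeric op_errno values in the client trace above can be translated with a few lines of Python. This is a sketch that reports whatever the local platform defines, and the helper name is made up. On Linux, 2 is ENOENT and 117 is EUCLEAN ("Structure needs cleaning"). Since the lines carrying 117 also have op_ret=0, that value may be leftover state rather than a reported failure.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# The two op_errno values that appear in the client trace.
for code in (2, 117):
    print(code, *describe_errno(code))
```

Errno numbering differs between operating systems, so the script should be run on the same platform as the glusterfs client.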



And server:

[2010-02-08 13:39:09] D [dict.c:303:dict_get] dict: @this=(nil) 
@key=0x7fedee4e43f3

[2010-02-08 13:39:09] D [dict.c:303:dict_get] dict: @this=(nil) 
@key=0x7fedee4e440b

[2010-02-08 13:39:17] D [server-protocol.c:2037:server_open_cbk] server: 1719: 
OPEN (null) (0) ==> -1 (No such file or directory)

[2010-02-08 13:39:18] D [server-protocol.c:2037:server_open_cbk] server: 1724: 
OPEN (null) (0) ==> -1 (No such file or directory)

[2010-02-08 13:39:28] D [server-resolve.c:238:resolve_path_deep] store0: 
RESOLVE OPEN() seeking deep resolution of 
/download/90910/live/webb1/webb1/Layer3/prog_index.m3u8

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) 
@key=0x7fedee4e43db

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) 
@key=0x7fedee4e43f3

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) 
@key=0x7fedee4e440b

[2010-02-08 13:39:28] D [dict.c:303:dict_get] dict: @this=(nil) 
@key=0x7fedee4e43db

[…]



Kind regards,

Fredrik Widlund




_______________________________________________
Gluster-devel mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/gluster-devel




