Re: Performance issue of find function in Gluster File System
From: Pierre Gaston
Subject: Re: Performance issue of find function in Gluster File System
Date: Thu, 17 Aug 2017 08:52:15 +0300
On Wed, Aug 16, 2017 at 11:02 PM, Zhao Li <lizhao.informatics@gmail.com>
wrote:
> Hi,
>
> I found there is a big difference of time performance between "ls" function
> and "find" function in Gluster File System
> <https://gluster.readthedocs.io/en/latest/Administrator%20Guide/GlusterFS%20Introduction/>.
> Here is the minimal working example.
>
> mkdir tmp
> touch tmp/{000..300}.txt
>
> time find ./ -path '*tmp*' -name '*.txt'> /dev/null
> real 0m42.629s
> user 0m0.675s
> sys 0m1.438s
>
> time ls tmp/*.txt > /dev/null
> real 0m0.042s
> user 0m0.003s
> sys 0m0.003s
>
> So I am wondering what C code you use for "ls" and "find" and how you
> explain "*" in "ls" and "find" to lead to this big difference in Gluster
> File System.
>
> Thanks a lot.
> Zhao
>
There are several differences. First, note that "ls" is not what finds
the files: the shell expands *.txt and then passes all the matching file
names to ls as arguments.
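A minimal sketch of that expansion, assuming a POSIX-like shell with mktemp available. The scratch directory, the file names, and the count_args helper are made up for illustration:

```shell
# Hypothetical helper: reports how many arguments it was given.
count_args() { echo "$#"; }

# Scratch directory with two .txt files and one unrelated file.
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.log"

# Unquoted pattern: the shell replaces it with the matching names
# before the command runs, so the command sees one argument per file.
glob_count=$(count_args "$demo"/*.txt)
echo "unquoted pattern expands to $glob_count arguments"

# Quoted pattern: no expansion happens, so ls receives the literal
# string and fails because no file has that exact name.
if ls "$demo/*.txt" >/dev/null 2>&1; then
    quoted_result=found
else
    quoted_result=missing
fi
echo "quoted pattern: $quoted_result"

rm -rf "$demo"
```

In other words, ls itself does no searching here; it only reports on the names the shell hands it.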
*.txt is not recursive, so only the files directly under the tmp directory
are searched.
In your find command, -path matches against the whole path (slashes included)
and only filters what is printed: find still descends into all the
directories, whether they match tmp or not. So, depending on where you
started the search from, it may walk your whole / partition.
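A small sketch of this, again with a made-up scratch tree. It assumes that a plain "find ." lists every entry find has to walk, which is the same set the -path version must examine before it can filter anything:

```shell
# Scratch tree: one file under tmp, the rest under an unrelated directory.
root=$(mktemp -d)
mkdir -p "$root/tmp" "$root/other/deep"
touch "$root/tmp/1.txt" "$root/other/deep/2.txt" "$root/other/deep/3.log"

# Run from inside the tree so the paths being matched are relative
# (./tmp/1.txt) and the scratch directory's own name does not interfere.
# Every entry find walks when started at the top of the tree:
visited=$(cd "$root" && find . | wc -l | tr -d ' ')

# -path and -name are tests applied to each walked entry; they decide
# what is printed, not which directories are entered.
matched=$(cd "$root" && find . -path '*tmp*' -name '*.txt' | wc -l | tr -d ' ')

echo "entries walked: $visited, entries printed: $matched"

rm -rf "$root"
```

On a network file system each walked entry costs at least one lookup, so the number of entries walked matters far more than the number printed.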
A more comparable command would be:
find /tmp -name tmp -o -prune -name '*.txt' -print
or if your find command supports it:
find /tmp -maxdepth 1 -name '*.txt'
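To check that both forms stay at the top level, here is a sketch on a made-up scratch tree. The starting directory is named tmp on purpose, because the -prune form relies on "-name tmp" matching the starting point; -maxdepth is an extension (GNU and BSD find have it) and is assumed to be available:

```shell
# Scratch tree: two .txt files at the top level, one in a subdirectory.
scratch=$(mktemp -d)
mkdir -p "$scratch/tmp/sub"
touch "$scratch/tmp/1.txt" "$scratch/tmp/2.txt" "$scratch/tmp/sub/3.txt"

# Plain recursive search: also reports sub/3.txt.
recursive=$(find "$scratch/tmp" -name '*.txt' | wc -l | tr -d ' ')

# Portable form: the starting directory matches -name tmp, so it is
# entered; every other entry is pruned (not descended into) and then
# printed only if its name matches *.txt.
pruned=$(find "$scratch/tmp" -name tmp -o -prune -name '*.txt' -print | wc -l | tr -d ' ')

# Extension form: never goes deeper than the starting directory's entries.
shallow=$(find "$scratch/tmp" -maxdepth 1 -name '*.txt' | wc -l | tr -d ' ')

echo "recursive: $recursive, pruned: $pruned, maxdepth: $shallow"

rm -rf "$scratch"
```

One caveat with the -prune form: a subdirectory that is itself named tmp would also match "-name tmp" and be entered.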
Note also that ls and find are separate tools; they are not developed as
part of bash.
For GNU find: https://www.gnu.org/software/findutils/
For GNU ls: https://www.gnu.org/software/coreutils/coreutils.html
There are also other implementations for various systems.