bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time

bug-gnu-emacs

From:	Gregory Heytings
Subject:	bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date:	Mon, 27 Sep 2021 00:43:05 +0000

Out of curiosity, because of your "it doesn't scale" remark, I just compared the efficiency of ripgrep and idutils on the latest Linux kernel tarball (1.4 GB in 78464 files):
mkid takes 31 seconds

rg O_CREAT takes 0.18 seconds
gid O_CREAT takes 0.02 seconds
rg O.?CREAT takes 0.18 seconds
gid O.?CREAT takes 0.93 seconds
rg O.*CREAT takes 0.19 seconds
gid O.*CREAT takes 1.73 seconds

Isn't idutils the one that doesn't scale?
No.  You compare apples with oranges.

No. I compare apples with apples. I compare regexp searches in a code base with regexp searches in a code base. Because this is a thread about regexp searches in a code base. It's you who started talking about oranges instead, namely searching for identifiers in a code base.

The only case in which idutils is faster (if one does not take the time that was spent to build the database into account, and if one considers that it's okay to ignore some matches in comments) is a plain identifier; from a user viewpoint getting an answer in 0.2 seconds on such a big code base is as good as getting an answer in 0.02 seconds. It's slower, much slower in all other cases, whenever a regexp is used --- which is what project-find-regexp is all about.
See what I mean?  Even when it's better, it's worse.  Perfect reasoning.

Perfect reading. Nowhere did I say that it's worse when it's better. I said that from a user viewpoint, a tool that is 155 ms faster in one (and only one) case, and slower in all other cases, is worse, and that from a user viewpoint this single "155 ms faster case" does not matter enough to justify the use of a more complex tool.

Note that Emacs takes some time (55 ms for a search for O_CREAT on the Emacs trunk) to read, process and display the output, which must be taken into account to calculate the perceived difference between search tool candidates.
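To make the "perceived difference" concrete, here is a minimal sketch of the arithmetic, using only figures quoted in this message: 55 ms of Emacs overhead, and 10 ms (gid) versus 25 ms (rg) for a plain O_CREAT search on the Emacs trunk. It assumes the overhead is simply added to the search time and is the same for both tools, which the message does not state explicitly. The function and variable names are illustrative only.

```python
# Figures quoted in this message, in milliseconds.
# Assumption: the Emacs overhead is additive and identical for both tools.
EMACS_OVERHEAD_MS = 55           # read, process and display the output
RAW_MS = {"gid": 10, "rg": 25}   # plain O_CREAT search on the Emacs trunk


def perceived_ms(raw_ms, overhead_ms=EMACS_OVERHEAD_MS):
    """Search time plus the fixed cost of showing the results in Emacs."""
    return raw_ms + overhead_ms


perceived = {tool: perceived_ms(ms) for tool, ms in RAW_MS.items()}
raw_ratio = RAW_MS["rg"] / RAW_MS["gid"]
perceived_ratio = perceived["rg"] / perceived["gid"]

print(perceived)
print(f"raw ratio: {raw_ratio:.2f}, perceived ratio: {perceived_ratio:.2f}")
```

Under that assumption, a 2.5x gap in raw search time shrinks to roughly 1.23x (65 ms versus 80 ms) once the display cost is included.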


Some more detailed numbers:

1. on Emacs' trunk (4759 files, 174 MB)

gid O_CREAT : 10 ms
gid O[A-Z_]CREAT : 75 ms
gid O.?CREAT : 70 ms
gid O.*CREAT : 70 ms

rg O_CREAT : 25 ms
rg O[A-Z_]CREAT : 25 ms
rg O.?CREAT : 25 ms
rg O.*CREAT : 25 ms

rg -w O_CREAT : 30 ms
rg -w O[A-Z_]CREAT : 30 ms
rg -w O.?CREAT : 30 ms
rg -w O.*CREAT : 30 ms

2. on the latest Linux kernel tarball (78464 files, 1.4 GB)

gid O_CREAT : 25 ms
gid O[A-Z_]CREAT : 1375 ms
gid O.?CREAT : 930 ms
gid O.*CREAT : 1730 ms

rg O_CREAT : 180 ms
rg O[A-Z_]CREAT : 185 ms
rg O.?CREAT : 185 ms
rg O.*CREAT : 185 ms

rg -w O_CREAT : 185 ms
rg -w O[A-Z_]CREAT : 190 ms
rg -w O.?CREAT : 190 ms
rg -w O.*CREAT : 190 ms
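For anyone who wants to collect comparable numbers on their own tree, the following is a rough sketch of a timing harness. It is not the method used for the figures above (the message does not say how they were measured). The helper name `time_command`, the choice of the median, and the number of runs are all assumptions. It expects to be run from the root of the source tree, with `rg` and `gid` on the PATH and an ID database already built with `mkid`.

```python
import shutil
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Return the median wall-clock time of running `argv`, in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so that terminal rendering is not measured.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    patterns = ["O_CREAT", "O[A-Z_]CREAT", "O.?CREAT", "O.*CREAT"]
    tools = [["gid"], ["rg"], ["rg", "-w"]]
    for tool in tools:
        if shutil.which(tool[0]) is None:
            print(f"{tool[0]}: not installed, skipped", file=sys.stderr)
            continue
        for pattern in patterns:
            # The pattern is passed as a single argument, so no shell quoting is needed.
            ms = time_command(tool + [pattern])
            print(f"{' '.join(tool)} {pattern} : {ms:.0f} ms")
```

Results will vary with disk cache state and hardware, so the first run on a cold cache is best discarded or reported separately.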

I initially reacted to your paragraph:

Btw, I don't understand why we focus on general-purpose text-searching tools for these features. Why not focus on packages like ID Utils instead, they are so much faster. Daniel, could you time the same search in that large tree when xref-search-program is 'gid'? (You'd need to run 'mkid' first, to create the ID database, but that is one-time, and is very fast.) As I told many times, I think this is the future: program language sensitive tools that use a precomputed DB.

It should now be clear that idutils is not "so much faster", it is marginally faster in one case, and slower in all other cases. And it doesn't do what project-find-regexp needs, because it ignores (most, but not all) tokens in comments (oh, BTW, including tokens in comments has been on its TODO list for at least 20 years). Creating the ID database is also not "very fast", and the ID database cannot be updated incrementally (oh, BTW, incremental database updates have been on its TODO list for at least 20 years). In short, it's an outdated tool, that isn't maintained anymore, and that can't be the future.
