bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time

bug-gnu-emacs

From:	Gregory Heytings
Subject:	bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date:	Mon, 27 Sep 2021 09:23:28 +0000

To get back to the issue at hand: we are talking (or at least I was talking) about scalability of an algorithm, not about some particular implementation of the algorithm.

Are you now again shifting the discussion to something else, a theoretical comparison between various algorithms?

Ripgrep is a multithreaded program, whereas idutils is single-threaded. So for a fair comparison of scalability of these two main ideas: file-based search vs DB search, you need at the very least to limit ripgrep to a single thread. And then you need to run each program on code bases of various sizes, preferably those which differ by orders of magnitude or close to that, and see their O(n) behavior. And exclude from your comparison command-line options that require IDUtils to access the files in addition to the DB. That would be at least an approximation to comparing apples to apples.
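The measurement procedure described above could be sketched roughly as follows. This is only an illustration, not something either participant ran: the helper names and the example paths are hypothetical, and the only tool-specific detail it relies on is ripgrep's `--threads` option for capping the number of worker threads.

```python
import subprocess
import time


def single_thread_rg_command(pattern, root):
    """Build a ripgrep invocation restricted to one worker thread."""
    # --threads caps ripgrep's worker threads; 1 makes the search single-threaded.
    return ["rg", "--threads", "1", "--regexp", pattern, root]


def best_wall_time(command, runs=5):
    """Run `command` `runs` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; a non-zero exit status (e.g. "no match") is not an error here.
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(pattern, roots, runs=5):
    """Time the single-threaded search on each source tree in `roots`."""
    return {root: best_wall_time(single_thread_rg_command(pattern, root), runs)
            for root in roots}
```

Calling `benchmark("some_regexp", ["/path/to/small-tree", "/path/to/large-tree"])` (hypothetical paths, ideally trees whose sizes differ by about an order of magnitude) would give one timing per tree; the same `best_wall_time` helper could be pointed at a database-backed tool's command line for the other side of the comparison.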

You're asking me to disable everything that makes ripgrep a modern tool, and to disable everything that makes idutils an outdated tool, to make the outdated tool shine in comparison? Interesting viewpoint.

But frankly, I don't understand why this all would be needed at all, because it should be absolutely clear that searching the files in the filesystem will always scale worse than reading a well-indexed DB.

Which is precisely what I don't believe. It is, at least to me, not at all "absolutely clear" when you look at the whole picture, IOW, when you include the necessity to create and keep a database up to date in your comparison, the added complexity of that solution, and the purpose of the tool.

IDUtils is an example of the latter, and it beats many utilities that search the files, including ripgrep, as long as it doesn't need to access the files themselves. But even if it doesn't always beat them (which you didn't yet demonstrate), it just means the ideas of its design should be taken further and/or implemented better, that's all.

I provided you with many numbers and comparisons, which IMO demonstrate what they were meant to demonstrate. A tool which finds matches for a regexp in an O(100 MB) code base in O(10 ms), and in an O(1 GB) code base in O(100 ms), is clearly good enough in practice. (Note that I made these comparisons on a six- or seven-year-old laptop; these numbers would be even lower on a more recent machine.)
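As a rough check of what those two data points imply about scaling, the exponent k in time ∝ size^k can be estimated from the log-log slope between them. The sketch below assumes the quoted order-of-magnitude figures (100 MB in 10 ms, 1 GB in 100 ms) are taken at face value; the function name is hypothetical.

```python
import math


def scaling_exponent(size_small, time_small, size_large, time_large):
    """Estimate k in time ~ size**k from two (size, time) measurements."""
    return math.log(time_large / time_small) / math.log(size_large / size_small)


# Order-of-magnitude figures quoted in the message: sizes in bytes, times in seconds.
exponent = scaling_exponent(100e6, 10e-3, 1e9, 100e-3)
print(f"estimated scaling exponent: {exponent:.2f}")
```

An exponent close to 1 corresponds to search time growing linearly with the size of the code base, which is what these two figures suggest; with only two approximate points, this is an indication rather than a measurement.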

I'm still waiting for some numbers from you to demonstrate *your* viewpoint.

I said that such tools are the future, not that IDUtils itself is necessarily the future (though it could be, if someone picks up its development).

Is it not simply because it's not useful/better in practice that nobody is picking up its development (and pretty much nobody is using it)?

Again, this is about looking for the best tools for this job, and I still stand by my opinion: focusing only on general-purpose search tools is sub-optimal.

The message to which you replied and which started this subthread did not suggest to "focus only on general-purpose search tools", it suggested to focus only on *one* particular general-purpose search tool, ripgrep, which is currently the best tool for the job, and to bundle it with Emacs. It has a public domain license, so I guess it should be possible.
