bug-gnu-emacs

bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time


From: Gregory Heytings
Subject: bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date: Mon, 27 Sep 2021 09:23:28 +0000



> To get back to the issue at hand: we are talking (or at least I was talking) about scalability of an algorithm, not about some particular implementation of the algorithm.


Are you now again shifting the discussion to something else, a theoretical comparison between various algorithms?


> Ripgrep is a multithreaded program, whereas idutils is single-threaded. So for a fair comparison of scalability of these two main ideas: file-based search vs DB search, you need at the very least to limit ripgrep to a single thread. And then you need to run each program on code bases of various sizes, preferably those which differ by orders of magnitude or close to that, and see their O(n) behavior. And exclude from your comparison command-line options that require IDUtils to access the files in addition to the DB. That would be at least an approximation to comparing apples to apples.


You're asking me to disable everything that makes ripgrep a modern tool, and to disable everything that makes idutils an outdated tool, to make the outdated tool shine in comparison? Interesting viewpoint.


> But frankly, I don't understand why this all would be needed at all, because it should be absolutely clear that searching the files in the filesystem will always scale worse than reading a well-indexed DB.


Which is precisely what I don't believe. It is, at least to me, not at all "absolutely clear" when you look at the whole picture, IOW, when you include the necessity to create and keep a database up to date in your comparison, the added complexity of that solution, and the purpose of the tool.


> IDUtils is an example of the latter, and it beats many utilities that search the files, including ripgrep, as long as it doesn't need to access the files themselves. But even if it doesn't always beat them (which you didn't yet demonstrate), it just means the ideas of its design should be taken further and/or implemented better, that's all.


I provided you with many numbers and comparisons, which IMO demonstrate what they were meant to demonstrate. A tool which finds matches for a regexp in an O(100 MB) code base in O(10 ms), and in an O(1 GB) code base in O(100 ms), is clearly good enough in practice. (Note that I made these comparisons on a six- or seven-year-old laptop; these numbers would be even lower on a more recent machine.)

I'm still waiting for some numbers from you to demonstrate *your* viewpoint.
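
Such numbers could be gathered with something like the following minimal sketch, assuming Python 3 is available. The function names are hypothetical, and the commands to time (for example an `rg` invocation, optionally restricted with `--threads 1`, or an idutils query) are supplied by the caller; nothing here is taken from either tool.

```python
import math
import subprocess
import time


def best_wall_time(cmd, repeats=5):
    """Return the fastest wall-clock time (seconds) over `repeats` runs of cmd."""
    best = math.inf
    for _ in range(repeats):
        start = time.perf_counter()
        # The exit status is ignored on purpose: grep-like tools
        # return 1 when nothing matches, which is not an error here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 means the time grows linearly with the size of
    the code base; a slope near 0 means it barely depends on it.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

Each tool would be timed with `best_wall_time` on code bases of different sizes, and the resulting sizes and times passed to `scaling_exponent`. Fed the order-of-magnitude figures quoted above (100 MB in 10 ms, 1 GB in 100 ms), the slope comes out at about 1, i.e. linear scaling; two points are of course only a rough estimate.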


> I said that such tools are the future, not that IDUtils itself is necessarily the future (though it could be, if someone picks up its development).


Is it not simply because it's not useful/better in practice that nobody is picking up its development (and pretty much nobody is using it)?


> Again, this is about looking for the best tools for this job, and I still stand by my opinion: focusing only on general-purpose search tools is sub-optimal.


The message to which you replied and which started this subthread did not suggest that we "focus only on general-purpose search tools"; it suggested focusing only on *one* particular general-purpose search tool, ripgrep, which is currently the best tool for the job, and bundling it with Emacs. It has a public domain license, so I guess it should be possible.




