emacs-devel

Re: dired-do-find-regexp failure with latin-1 encoding


From: Dmitry Gutov
Subject: Re: dired-do-find-regexp failure with latin-1 encoding
Date: Sun, 29 Nov 2020 18:27:24 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 29.11.2020 17:19, Eli Zaretskii wrote:
>> From: Dmitry Gutov <dgutov@yandex.ru>
>> Cc: stephen.berman@gmx.net, emacs-devel@gnu.org
>> Date: Sun, 29 Nov 2020 02:49:25 +0200
>>
>> On 28.11.2020 23:04, Dmitry Gutov wrote:
>>> or latin-1 (AND the current system locale matches that encoding), the
>>> search should work fine across such files in different encodings, and
>>> without 'C-x RET c'
>>
>> Correction: only utf-8 and utf-16 detection is automatic. latin-1 needs
>> explicit arguments '-E latin-1' passed to rg.
>>
>> The official recommended workaround is to use a --pre flag which is
>> similar to what Stephen did originally by inserting 'iconv ...' in the
>> shell command string: https://github.com/BurntSushi/ripgrep/issues/746
>
> How can --pre help?  It still cannot easily support different
> encodings in the same command, right?

It can help by calling iconv with different arguments depending on the contents of each file. That is valuable, I think, because we normally don't pipe file contents to grep (or, potentially, rg); instead, we pass multiple file names to it using xargs.

That wouldn't be easy, but some script that performs conversion based on file contents could work.
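
To make the idea concrete, here is a minimal sketch of such a script in Python. It rests on a deliberately crude assumption: a BOM decides the encoding if present, otherwise the file is UTF-8 if it decodes cleanly, and anything else is treated as latin-1. The function names and the latin-1 fallback are hypothetical choices for illustration, not what Emacs or ripgrep do.

```python
#!/usr/bin/env python3
"""Hypothetical preprocessor for `rg --pre`: write the file to stdout as UTF-8."""
import codecs
import sys


def guess_encoding(data: bytes) -> str:
    """Crude guess: BOM first, then strict UTF-8, else latin-1 (assumed fallback)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this never fails,
        # but it may well be the wrong guess for other 8-bit encodings.
        return "latin-1"


def to_utf8(data: bytes) -> bytes:
    """Decode with the guessed encoding and re-encode as UTF-8."""
    return data.decode(guess_encoding(data)).encode("utf-8")


def main(path: str) -> None:
    # Reads the whole file: this is the "slow" variant.
    with open(path, "rb") as f:
        sys.stdout.buffer.write(to_utf8(f.read()))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved as an executable file (say, a hypothetical ./to-utf8.py), it would be passed as 'rg --pre ./to-utf8.py PATTERN'; ripgrep then runs the command once per file with the file name as its argument and searches the command's standard output instead of the file.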

>> I suppose if we really wanted, we could insert some custom program that
>> chooses what to 'iconv' with, but that would be slower, of course. But
>> it could work with Grep, too.
>
> It would be brittle, unless that program actually reads the entire
> file (which will be slow).

How does Emacs do it? Does it read until the end of the file? If not, we could try to reuse some of its logic.

Otherwise, yes, our options are either slow or brittle. That might be why ripgrep's author decided to offload this responsibility, looking at the discussion referenced above.
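
The trade-off can be illustrated with a small Python sketch. The helper name and the sample sizes are made up for the example, and the check is only "does this decode as UTF-8", which is an assumption and far simpler than any real detection logic: sniffing a prefix is cheap, but misses a file whose only non-ASCII byte comes late.

```python
import codecs


def looks_like_utf8(data: bytes, limit=None) -> bool:
    """Check whether the first `limit` bytes (all of them if None) decode as UTF-8."""
    whole_file = limit is None or limit >= len(data)
    sample = data if whole_file else data[:limit]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte sequence cut off at the sample boundary.
        decoder.decode(sample, final=whole_file)
        return True
    except UnicodeDecodeError:
        return False


# Mostly ASCII file with a single latin-1 byte at the very end.
late_latin1 = b"a" * 4096 + "é".encode("latin-1")

brittle = looks_like_utf8(late_latin1, limit=1024)  # only sees ASCII
slow = looks_like_utf8(late_latin1)                 # reads everything
```

Here the prefix check reports the file as valid UTF-8, while the full read correctly rejects it.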

In any case, using --pre will already be significantly slower than the current behavior (it spawns a process for each searched file), so we can probably afford the "slow" approach here, because we won't enable it by default anyway.


