Re: dired-do-find-regexp failure with latin-1 encoding

emacs-devel

From:	Dmitry Gutov
Subject:	Re: dired-do-find-regexp failure with latin-1 encoding
Date:	Sun, 29 Nov 2020 18:27:24 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 29.11.2020 17:19, Eli Zaretskii wrote:

From: Dmitry Gutov <dgutov@yandex.ru>
Cc: stephen.berman@gmx.net, emacs-devel@gnu.org
Date: Sun, 29 Nov 2020 02:49:25 +0200

On 28.11.2020 23:04, Dmitry Gutov wrote:

or latin-1 (AND the current system locale matches that encoding), the
search should work fine across such files in different encodings, and
without 'C-x RET c'


Correction: only utf-8 and utf-16 detection is automatic. latin-1 needs
explicit arguments '-E latin-1' passed to rg.

The official recommended workaround is to use a --pre flag which is
similar to what Stephen did originally by inserting 'iconv ...' in the
shell command string: https://github.com/BurntSushi/ripgrep/issues/746


How can --pre help?  It still cannot easily support different
encodings in the same command, right?

It can help by calling iconv with different arguments depending on the
contents of each file. Which is valuable, I think, because we're
normally not piping file contents to grep (or, potentially, rg), instead
we pass multiple file names to it using xargs.

That wouldn't be easy, but some script that performs conversion based on
file contents could work.
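
To make that concrete, here is a minimal sketch of such a script, written
in Python rather than as a wrapper around iconv. Everything in it is an
assumption for illustration: the function name to_utf8, the latin-1
fallback, and the rule "valid UTF-8 stays as-is, everything else is
treated as latin-1". It relies only on ripgrep passing the file path as
an argument to the --pre command.

```python
#!/usr/bin/env python3
"""Hypothetical preprocessor for `rg --pre`: emit each file as UTF-8."""
import sys

# Assumption: any file that is not valid UTF-8 (and has no UTF-16 BOM)
# is treated as latin-1. A real script would need a better heuristic.
FALLBACK_ENCODING = "latin-1"


def to_utf8(data: bytes) -> bytes:
    """Return DATA re-encoded as UTF-8, guessing the source encoding."""
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        # UTF-16 byte order mark; the "utf-16" codec consumes it.
        return data.decode("utf-16", errors="replace").encode("utf-8")
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8 (or plain ASCII)
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail.
        return data.decode(FALLBACK_ENCODING).encode("utf-8")


def main(path: str) -> None:
    # Reads the whole file: the "slow but not brittle" variant.
    with open(path, "rb") as f:
        sys.stdout.buffer.write(to_utf8(f.read()))


# Only act when invoked with exactly one file path, as --pre does.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

It would be used as something like 'rg --pre ./to-utf8.py PATTERN',
where to-utf8.py is a made-up file name for the script above.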

I suppose if we really wanted, we could insert some custom program that
chooses what to 'iconv' with, but that would be slower, of course. But
it could work with Grep, too.


It would be brittle, unless that program actually reads the entire
file (which will be slow).

How does Emacs do it? Does it read until the end of the file? If not, we
could try to reuse some of its logic.

Otherwise, yes, our options are either slow or brittle. That might be
why ripgrep's author decided to offload this responsibility, looking at
the discussion referenced above.
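
The trade-off can be illustrated with a small Python sketch. The
function looks_like_utf8 and the 1024-byte limit are hypothetical and
are not a description of what Emacs, Grep or ripgrep actually do. A
check that samples only the start of a file is cheap, but it
misclassifies a file whose first non-ASCII byte comes later.

```python
import codecs


def looks_like_utf8(data: bytes, limit=None) -> bool:
    """True if DATA (or only its first LIMIT bytes) decodes as UTF-8."""
    truncated = limit is not None and limit < len(data)
    sample = data[:limit] if truncated else data
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multibyte sequence cut off by LIMIT.
        decoder.decode(sample, final=not truncated)
    except UnicodeDecodeError:
        return False
    return True


# 4096 ASCII bytes followed by a latin-1 encoded "e acute" (0xE9).
data = b"x" * 4096 + b"caf\xe9\n"

fast_guess = looks_like_utf8(data, limit=1024)  # brittle: sees ASCII only
full_check = looks_like_utf8(data)              # slow: scans every byte
```

Here fast_guess is True (the wrong answer) while full_check is False,
which is the "slow or brittle" choice in miniature.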

In any case, --pre will already become significantly slower than the
current behavior (it will spawn a process for each searched file), so we
might afford the "slow" approach here because we won't enable it by
default anyway.
