Re: Multibyte and unibyte file names

emacs-devel

From:	Stephen J. Turnbull
Subject:	Re: Multibyte and unibyte file names
Date:	Sun, 27 Jan 2013 02:57:58 +0900

Richard Stallman writes:

 > Removing unibyte mode could probably be a big slowdown for visiting
 > binary files,

This was benchmarked for XEmacs in the early 2000s, and for files big
enough to matter the decoding time is swamped by disk I/O.  If it's
possible to mmap files into buffers, the difference in visiting time
might be perceptible, but even SSDs can't transfer data faster than a
CPU can decode it if you actually fill the buffer.

The biggest difference would be in `goto-char', which becomes
O(distance from the nearest point in the position cache).  It would be
possible to initialize the cache at read time, and a cache 1% of the
size of the file would allow you to cache a position every 2KB or so,
for effectively O(1) performance.[1]  We never implemented that, though;
the most important uses for large files were log files, and all of the
people who were using Emacsen to read log files had pure ASCII files
which did allow random access.
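
To make the position-cache idea concrete, here is a minimal sketch in
Python.  It is not XEmacs code (as noted, that was never implemented).
It assumes UTF-8 text, 0-based character indices, and a checkpoint
roughly every 2KB; the names `build_cache' and `char_to_byte' are made
up for illustration.  It only scans forward from the preceding
checkpoint, though a real implementation could also scan backward from
the following one.

```python
from bisect import bisect_right

def build_cache(data: bytes, stride: int = 2048):
    """Record (char index, byte offset) pairs about every `stride` bytes.

    Assumes `data` is valid UTF-8.  Checkpoints always fall on the lead
    byte of a character, never inside a multibyte sequence.
    """
    char_positions = [0]
    byte_offsets = [0]
    chars = 0
    next_mark = stride
    for offset, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:        # not a continuation byte
            if offset >= next_mark:
                char_positions.append(chars)
                byte_offsets.append(offset)
                next_mark = offset + stride
            chars += 1
    return char_positions, byte_offsets

def char_to_byte(data: bytes, cache, char_index: int) -> int:
    """Return the byte offset of the character at `char_index`."""
    char_positions, byte_offsets = cache
    # Last checkpoint at or before the requested character.
    i = bisect_right(char_positions, char_index) - 1
    chars, offset = char_positions[i], byte_offsets[i]
    # Scan forward at most about `stride` bytes, one character at a time.
    while chars < char_index and offset < len(data):
        offset += 1
        while offset < len(data) and (data[offset] & 0xC0) == 0x80:
            offset += 1
        chars += 1
    return offset
```

With a fixed stride the scan after the lookup is bounded by the stride,
not by the file size, which is what makes the lookup effectively
constant time for practical purposes.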

 > and might make it unreliable.

Depends on how you implement it.  XEmacs's implementation has never
had a bug that I've heard of.

Footnotes: 
[1]  Since you know the cache is "nearly uniform", you can do better
than O(log(0.0005*size)).
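
One way to read the footnote, sketched in Python under my own
assumptions: if the checkpoints' character positions are sorted and
nearly evenly spaced, you can guess the checkpoint index by
interpolation and correct the guess with a short local walk, instead
of doing a binary search.  `find_checkpoint' is a hypothetical helper,
not anything in XEmacs, and it assumes the list starts at 0, is
strictly increasing, and that the index asked for is non-negative.

```python
def find_checkpoint(char_positions, char_index):
    """Index of the last checkpoint whose position is <= char_index.

    Assumes `char_positions` is strictly increasing and starts at 0.
    """
    n = len(char_positions)
    last = char_positions[-1]
    if last == 0:                      # only the initial checkpoint exists
        return 0
    # Interpolated guess, clamped to the valid index range.
    i = min(n - 1, char_index * (n - 1) // last)
    # Fix up the guess; for nearly uniform spacing this is a step or two.
    while i > 0 and char_positions[i] > char_index:
        i -= 1
    while i + 1 < n and char_positions[i + 1] <= char_index:
        i += 1
    return i
```

The worst case is still linear if the spacing is badly skewed, so this
only pays off when the "nearly uniform" assumption actually holds.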
