guix-devel

Re: Relaxing the restrictions for store item names


From: Saku Laesvuori
Subject: Re: Relaxing the restrictions for store item names
Date: Fri, 25 Aug 2023 21:14:03 +0300

> > Although now, just a few hours later, I'm having second thoughts on
> > this. When you really think about it, it's very unlikely that some
> > user would prefer typing something like
> > 
> > guix install %D0%B8%D0%BC%D0%B0%D0%B3%D0%B8%D0%BD%D0%B0%D1%80%D0%B8-%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC
> > 
> > over
> > 
> > guix install имагинари-програм
> 
> I imagine that, for usability, the percent encoding (or other encoding
> or transliteration) of non-ASCII characters could be handled
> transparently, i.e. for "guix install имагинари-програм", guix would
> translate "имагинари-програм" to the encoded form for operations. And
> if the escape character (e.g. the "%" in percent encoding) isn't also
> a valid character for store or package names then the values can be
> handled transparently. For example, "guix install git", "guix
> install %67%69%74" and "guix install g%69t" would all install git.
>
> > [...]
>
> > It would also make
> > store name unnecessarily long (they're already long as is), and
> > there's a 255 char limit for filenames that we have to keep in mind as
> > well. Searching the store using standard utilities such as find and
> > grep would too, as a consequence,
> 
> I split out the quote above as a bit of reference. While I agree that
> we have to keep in mind the 255 char limit for filenames, with percent
> encoding causing a single byte in ASCII or UTF-8 to become ~3 bytes
> (with iirc most non-latin characters having multi-byte encodings in
> UTF-8) and the store hashes being a 33 byte prefix (counting the
> dash), 255 chars is still quite a bit. Specifically, the extracted
> quote above--without the "> " prefixes and with line breaks treated as
> single characters--is exactly 255 characters. (I find a bit of
> readable text to be helpful for wrapping my brain around a value like
> "255 characters".)
>
> > break... There's just too many problems with this.
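To put the length concern in numbers, here is a minimal sketch in Python
(used purely for illustration; store_name_lengths is a hypothetical
helper, not part of guix). It assumes the figures quoted above: a 255
byte filename limit and a 33 byte hash prefix.

```python
from urllib.parse import quote

NAME_MAX = 255     # filename limit mentioned above, in bytes
HASH_PREFIX = 33   # store hash plus the dash, as stated above


def store_name_lengths(name: str) -> tuple[int, int]:
    """Return (utf8_length, percent_encoded_length), hash prefix included."""
    utf8 = HASH_PREFIX + len(name.encode("utf-8"))
    # safe="" percent-encodes everything except letters, digits and "_.-~".
    encoded = HASH_PREFIX + len(quote(name, safe=""))
    return utf8, encoded


utf8_len, encoded_len = store_name_lengths("имагинари-програм")
print(utf8_len, encoded_len, NAME_MAX)
```

For the example name, the percent-encoded form is roughly three times as
long as the UTF-8 one, yet both stay well under the limit.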

The encoding could also be transparent in the other direction so the
percent encoded form would be usable on the command line (in addition to
the UTF-8 one, of course), but guix would translate it to UTF-8 for
operations. This would allow typing all package names with only ASCII
characters but still keep the store readable and greppable. There are
most likely simple utility programs that can decode percent encoding, so
the store is also greppable with only ASCII characters.
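
A rough sketch of that two-way transparency, in Python for illustration
only (normalize_name and ascii_form are hypothetical names, and nothing
here reflects how guix actually parses its arguments): whatever spelling
the user types is reduced to the UTF-8 form before any lookup.

```python
from urllib.parse import quote, unquote


def normalize_name(name: str) -> str:
    """Return the UTF-8 form of a package name typed in either spelling."""
    # unquote() decodes %XX escapes as UTF-8 and leaves other text alone;
    # errors="strict" rejects escapes that are not valid UTF-8.
    return unquote(name, errors="strict")


def ascii_form(name: str) -> str:
    """Return an ASCII-only spelling that normalize_name() maps back."""
    return quote(name, safe="")


for typed in ("git", "g%69t", "%67%69%74"):
    print(typed, "->", normalize_name(typed))
```

This only works cleanly if "%" itself is not a valid character in
package names, as noted in the quoted message.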

There is no reason (that I can see) not to allow UTF-8 in the store
paths, other than it being hard to type with a keyboard for a different
locale. But how often do people actually want to type store paths by
hand? I, at least, avoid it whenever possible by using $(guix build ...),
$(herd configuration ...), $(realpath /var/guix/profiles/...) etc.
Even when recovering a broken system the only store path you really need to
type is that of a working guix (and /var/guix/profiles/... probably also
works in a broken system).

> > even if they don't have the Russian (or whatever other language)
> > keyboard layout set up on their system, so just for accessibility
> > purposes, the solution wouldn't be all that great.

I agree. It is really annoying and hard to write percent encoding by
hand, so this doesn't really solve the issue of UTF-8 being hard to
write with an ASCII keyboard.

Maybe some sort of fuzzy character matching could be used in guix search
instead of percent encoding. That way people could find the packages
even if they can't type the entire name and then use the name from guix
search (by copy-pasting or shell piping) to install it (or perform
whatever operation they want on it).
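
One possible shape for such matching, sketched in Python for
illustration (fold and fuzzy_search are hypothetical helpers, and this
is not how guix search works today): names are folded by stripping case
and combining marks, then compared with a similarity cutoff. Folding
alone does not bridge different scripts; that would need a
transliteration table, which is left out here.

```python
import difflib
import unicodedata


def fold(text: str) -> str:
    """Casefold and strip combining marks, e.g. 'Café' -> 'cafe'."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fuzzy_search(query: str, names: list[str], cutoff: float = 0.6) -> list[str]:
    """Return the names whose folded form is close to the folded query."""
    folded: dict[str, list[str]] = {}
    for name in names:
        folded.setdefault(fold(name), []).append(name)
    hits = difflib.get_close_matches(fold(query), list(folded), n=5, cutoff=cutoff)
    # Map each folded hit back to the original, copy-pasteable names.
    return [name for hit in hits for name in folded[hit]]


names = ["café-tool", "git", "имагинари-програм"]
print(fuzzy_search("cafe-tool", names))
print(fuzzy_search("имагинари", names))
```

The results are the original names, so they can be piped straight into
an install command.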
