Re: uft8 character set -> "repository problem"

info-cvs

From:	Pierre Asselin
Subject:	Re: uft8 character set -> "repository problem"
Date:	Fri, 26 Aug 2005 01:41:17 +0000 (UTC)
User-agent:	tin/1.6.2-20030910 ("Pabbay") (UNIX) (NetBSD/2.0 (i386))

Marko Kaening <address@hidden> wrote:

> [ trouble with non-ascii characters in files ]
>
> Well, the contents isn't the problem so far. It rather begins already with 
> the file name ITSELF. If the filename itself contains umlauts, then it
> gets close to unreadable because of the appearing question marks.
> [ ... ]
> Since I do my cvs access to the server via ssh, I wonder how I can change 
> the LANG variable for these connections properly...

But what you said was this (emphasis added):

>>> Files in our repository do contain sometimes umlauts. *These will be
>>> handled okay on our clients*, but if you look into the repository on the
>>> server itself they will look strange. Every umlaut gets displayed just as
>>> a question mark.

So I gather that
    1)  The way your cvs clients connect to the server, whatever
        it may be, is working just fine.  File names are transmitted to
        the CVS server in some form or other, the server creates
        ,v files in the repository with possibly weird names, but
        *when you check out or update from a client, the names come
        back correctly*.
    2)  When you ssh to a shell prompt on the CVS server and look
        at the repository, the ,v files have strange characters.

At which point I ask:  why are you looking directly at the repository?

You may also have said, though I'm not sure:

    3)  When you ssh to the server under your normal user account
        and checkout a sandbox under your home directory, the file
        names have strange characters.

If *that* is the problem, you need to set the right environment
variable after you log in, possibly from your .bash_profile.  Here's
an experiment pasted in from an xterm window on my laptop.  I don't
know how it will show up after going through Usenet, so I will
describe each step.

Default environment:

    $ touch café
(I used right_alt-i for the last character, which gives
me an e-acute.)
    $ ls
    caf?
(Last character prints as a question mark.)
    $ ls | od -c
    0000000   c   a   f 351  \n
    0000005
(The 'od' command shows the octal code of the
last character.)

Now with the right environment:
    $ export LC_ALL=en_US
    $ ls
    café
(Last character prints as an e-acute.)
    $ ls | od -c
    0000000   c   a   f   é  \n
    0000005
(The 'od' command shows the last character
as is, and it prints as an e-acute.)
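
The point of the two runs is that the file name is the same bytes
either way; only the way the tools display the last byte changes.
A minimal sketch of that check, assuming a POSIX shell with printf
and od (name_bytes is a made-up helper, and the byte 351 octal is
taken from the od output above):

```shell
# Emit the same bytes as the file name above: "caf", then the single
# byte 351 (octal), then a newline.  name_bytes is a hypothetical helper.
name_bytes() { printf 'caf\351\n'; }

# Force the C locale for od only, so the non-ASCII byte is shown as
# its octal code no matter what the login environment is set to.
dump=$(name_bytes | LC_ALL=C od -c)
printf '%s\n' "$dump"
```

The dump should show the last character as 351 and a final offset
of 0000005, that is, five bytes in all.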


So it *could* be a question of setting the environment of your
login shell so the various commands know how to handle non-ascii
characters.  I am not very familiar with this myself, but the
"locale" man page seems like a good starting point.  This has
nothing to do with CVS.
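
If you do end up setting it from .bash_profile, something along
these lines might do.  This is only a sketch: it assumes bash as the
login shell and that the server has a locale called en_US installed,
and have_locale is a name I made up, not a standard command.

```shell
# have_locale NAME: succeed if NAME is one of the locale names read
# from standard input, one per line.  Hypothetical helper.
have_locale() {
    grep -qx -- "$1"
}

# Export the locale only if "locale -a" lists it; otherwise leave
# the environment as it was.
if locale -a 2>/dev/null | have_locale en_US; then
    export LC_ALL=en_US
fi
```

Running "locale" with no arguments after logging in again shows
which settings actually took effect.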


-- 
pa at panix dot com
