Bug: UTF-8 expansion results in extra characters

From: L A Walsh
Subject: Bug: UTF-8 expansion results in extra characters
Date: Mon, 06 Mar 2017 04:50:09 -0800
User-agent: Thunderbird

I didn't see that this was caught, and wasn't sure whether
it was already covered by what I previously posted as a
follow-up to a similar problem.

It may be the same bug, but I wasn't sure.
If I paste the text in quotes into bash, bash tosses in
an extra character, as evidenced by 'wc':

echo 'あa a '|wc -m

There should only be 5 characters.
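One way to take the paste step out of the picture is sketched below.
It builds the same string from octal escapes and counts bytes and
characters separately. Note that 'echo' appends a newline, which
'wc -m' also counts, so the sketch uses printf '%s' to count only the
string itself. The locale name C.UTF-8 is an assumption and may not
exist on every system; if it is missing, 'wc -m' falls back to
counting bytes.

```shell
# Build the test string from octal escapes so no cut/paste is involved.
# \343\201\202 is the UTF-8 encoding of あ (U+3042).
s=$(printf '\343\201\202a a ')

# Byte count: 3 bytes for あ plus 4 single-byte characters.
bytes=$(printf '%s' "$s" | wc -c | tr -d ' ')

# Character count; C.UTF-8 is an assumed locale name, substitute any
# UTF-8 locale that is installed (see `locale -a`).
chars=$(printf '%s' "$s" | LC_ALL=C.UTF-8 wc -m | tr -d ' ')

echo "bytes=$bytes chars=$chars"
```

If the byte count here is 7 and the character count in a UTF-8 locale
is 5, but the pasted version reports more, then the extra character is
being introduced somewhere between the paste and the pipeline.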

If I cut/paste the text (in quotes) directly into 'wc -m' (so it
doesn't go through bash, but is read directly by 'wc' on its
stdin), then I get '7' (2 extra chars for the quotes):

 wc -m
'あa a '7
      ^^ pressed Ctl-D twice to not end line w/another char (like LF).

Maybe that example allows you to duplicate the problem?

I don't think your development setup allows for cut/paste
from an editor or mail-window that accurately copies the
characters.   Something in your text display+copy+insert
doesn't seem to copy the actual characters, but something
that looks similar.
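To check whether a copy step really preserved the characters, the
bytes can be dumped and compared against the expected UTF-8 encoding.
This is only a sketch: 'hex_bytes' is a hypothetical helper name, and
the value of 'pasted' stands in for whatever text actually arrived
after the paste.

```shell
# hex_bytes (hypothetical helper): print the bytes of its argument as
# space-separated lowercase hex, with od's padding squeezed out.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Expected encoding of the 5-character test string:
# あ = e3 81 82, 'a' = 61, space = 20.
expected='e3 81 82 61 20 61 20'

# Replace this with the text exactly as it was pasted.
pasted='あa a '

actual=$(hex_bytes "$pasted")
if [ "$actual" = "$expected" ]; then
    echo "bytes match"
else
    echo "bytes differ: $actual"
fi
```

A look-alike character or a dropped trailing space would show up as a
different byte sequence, even when the text looks the same on screen.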

For example, copying text like this sometimes misses a space on the end:

あa a

Copying from an editor, or from a line that has tabs in it, doesn't
seem to preserve the tabs, so you can't cut+paste a script from
an editor into bash without it becoming corrupted.

I think it's a side-effect of the development tools you are
using (I don't know what they are). But it's also a guess, based
on sparse data, where specific characters aren't mapped the same
and don't produce the same output as what was actually presented
when someone tries to describe a problem.

It doesn't make a difference in most cases, but it does in a few,
like the ones mentioned above... ;-(

Anyway, like I was trying to say, it's, perhaps, an inaccurate
deduction based on scant evidence...  oh well...
