lilypond-user

Re: Multi-byte characters in Lyrics


From: Maurits Lamers
Subject: Re: Multi-byte characters in Lyrics
Date: Fri, 27 Oct 2017 11:49:59 +0200


On 27 Oct 2017, at 10:18, David Kastrup <address@hidden> wrote:

Maurits Lamers <address@hidden> writes:

Hi,


I cannot convert a multi-byte character to a symbol, unless I do some
very inelegant hacks.

Huh?  string->symbol works just fine.  So what do you mean when you say
"symbol"?

This is partly because of a mistake on my end. I defined my braille
dots lookup alist through symbols.

brailleSymbols = #`(
(1 . 1)
(2 . 12)
(3 . 14)
(4 . 145)
)

There is no symbol here whatsoever.

Hmm, I couldn't get it to work without converting the char to a symbol.
That probably also comes from having to dive back into Scheme after only a few lessons quite a few years ago. :)
But anyhow, the code has changed since then, so this no longer applies.


etc...
This required me to do a (char->symbol) in order for assoc-ref to
return something.

That makes no sense.

Then I must have made some other kind of mistake which caused that. 
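
For reference, here is a minimal sketch of how assoc-ref matches keys, assuming plain Guile (in a .ly file each top-level form would need the leading #). assoc-ref compares keys with equal?, so a lookup only succeeds when the key has the same type as the keys stored in the alist. The names int-table and string-table are made up for illustration and do not come from the thread.

```scheme
;; Hypothetical table for illustration: integer keys, like the
;; brailleSymbols alist quoted above.
(define int-table '((1 . 1) (2 . 12) (3 . 14) (4 . 145)))

;; Hypothetical variant keyed by single-character strings.
(define string-table '(("1" . 1) ("2" . 12) ("3" . 14) ("4" . 145)))

;; An integer key finds the integer entry.
(define r1 (assoc-ref int-table 2))                  ; 12

;; A character is never equal? to an integer, so the lookup fails.
(define r2 (assoc-ref int-table #\2))                ; #f

;; Strings are compared by content, so a fresh string still matches.
(define r3 (assoc-ref string-table (string #\2)))    ; 12
```

So with integer or string keys no symbol conversion is needed; the lookup key only has to be converted to whatever type the table uses.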


I should really read mails to the end before coming up with code.

I think I will use your solution instead of the other one, as it is
much more elegant and easier to read and understand than the
bitshifting variation.

Oh, bit shifting?  Probably for arriving at integers (rather than
characters)?  I was thinking of that but decided that sticking with
single-character strings was more likely to result in readable code.

The bitshifting solution looks like this:

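#(define ((clz n) x)
  ;; Count the leading zero bits of x, viewed as an n-bit integer:
  ;; smear the highest set bit to the right, then subtract the popcount from n.
  (let loop ((i 1) (x x))
    (if (< i n)
        (loop (ash i 1) (logior x (ash x (- i))))
        (- n (logcount x)))))

#(define (bitwise-andc1 x y)
  ;; AND of y with the complement of x.
  (logand (lognot x) y))

#(define (utf8-n o)
  ;; Length of the UTF-8 sequence whose first byte is o:
  ;; the number of leading one bits, or 1 for a plain ASCII byte.
  (max 1 ((clz 8) (bitwise-andc1 o #xff))))

#(define (string->utf8-list str)
  ;; Split str into a list of strings, one per (possibly multi-byte) character.
  (if (string-null? str)
      '()
      (let ((n (utf8-n (char->integer (string-ref str 0)))))
        (cons (string-copy str 0 n)
              (string->utf8-list (string-copy str n))))))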

The bit shifting counts how many high bits are set in the first byte: in UTF-8 that count tells you how many bytes (one char each in this code) make up the full character.
The solution then splits the string into pieces of that many chars, one piece per full character.
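
To make the high-bit counting concrete, here is a small worked sketch, assuming plain Guile (in a .ly file each top-level form would need the leading #). It traces the lead byte #xC3 (the first byte of "é" in UTF-8) and then classifies lead bytes with simple range comparisons. The helper lead-byte-length is hypothetical and not part of either solution in this thread; it is a different technique that should give the same lengths for valid lead bytes.

```scheme
;; Lead byte of "é" (UTF-8 bytes C3 A9): binary 11000011.
(define lead #xC3)

;; Complement restricted to 8 bits: binary 00111100.
;; Its two leading zeros are the two leading ones of the lead byte,
;; so the character occupies two bytes.
(define inverted (logand (lognot lead) #xff))

;; Hypothetical helper for illustration: the same classification
;; written as range comparisons on the first byte.
(define (lead-byte-length byte)
  (cond ((< byte #x80) 1)    ; 0xxxxxxx: ASCII
        ((< byte #xC0) 1)    ; 10xxxxxx: continuation byte, treated as length 1
        ((< byte #xE0) 2)    ; 110xxxxx
        ((< byte #xF0) 3)    ; 1110xxxx
        (else 4)))           ; 11110xxx; bytes from #xF8 up are not valid UTF-8

(define len-e-acute (lead-byte-length #xC3))   ; 2
(define len-sup-nine (lead-byte-length #xE2))  ; 3, "⁹" is E2 81 B9
```

The two approaches only differ for bytes from #xF8 upwards, which cannot start a valid UTF-8 sequence: this sketch returns 4 for them, whereas counting leading ones returns 5 or more.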

From reading your solution, I figured you do something similar, but it uses the char code instead.



Though I figured with some consternation that something like

"⁹" resulted in garbage being printed, so the readability does not
really extend to the output.

Interesting, will have to try the bitshifting solution with that :)

cheers and thanks!

Maurits
