Re: Worrying development

From: Dirk Herrmann
Subject: Re: Worrying development
Date: Thu, 22 Jan 2004 17:11:52 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030821

Hi folks,

first, sorry for the late answer...

Marius Vollmer wrote:

> Roland Orre <address@hidden> writes:
>
>> I suggest that shared substrings are moved back to guile.
>
> I'm sorry for previously giving the impression that shared substrings
> won't come back.
>
> There is no problem on the Scheme side of things: we can just add
> shared substrings and make it a proper subtype of 'string'.
>
> The problem lies with C code, and there only with the low-level API
> consisting of SCM_STRINGP, SCM_STRING_CHARS etc.  Functions like
> scm_c_string2str can be updated to just continue to work.
>
> Shared substrings also touch on the issues of using Unicode in Guile
> and on making sure we have a nice type conversion API that can replace
> gh_ in all respects.
>
> I'd like to do it in this order:
>
> - type conversion API (which allows for different encodings of
>   strings, but doesn't need it immediately) (the first part of this
>   was the 'frame' stuff for handling unwinds in C).
>
> - Unicode (with shared substrings in mind).
>
> - shared substrings
>
> Of course, we shouldn't do too much lest 1.8 won't happen...
>
> I'll try to put forth a proposal in the next days for the string part
> of the type conversion API that allows Unicode and shared substrings.

I am not quite sure that everybody is talking about the same issue here. Marius, when you talk about re-introducing shared substrings, do you mean implicitly shared copy-on-write substrings, or guile's explicitly shared substrings?

Shared substrings as they have been provided by guile could have served two purposes:

1) saving resources (run time and memory)

2) communicating changes via something like a shared memory interface

The first purpose is purely a matter of performance and should not change the functional behaviour of strings. However, guile's former implementation was poorly suited to this kind of usage: whoever used shared substrings for this purpose had to know exactly which strings were shared and which were not, because modifying one string would cause side effects on the others. For this purpose, implicitly shared copy-on-write substrings are safer, and this is what we had intended to implement for guile. It would not require the user to take any explicit action to have substrings shared.
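
To illustrate the copy-on-write idea, here is a minimal sketch in portable Scheme. It is not guile's implementation: the names cow-substring, cow-ref, cow-set! and cow->string are hypothetical, and a view is modelled as a plain vector. The sketch only detaches a view when the view itself is written to; a real implementation would also have to protect views against writes to the original string.

```scheme
;; Hypothetical model: a view is a vector #(buffer start end owned?).
(define (cow-substring str start end)
  ;; No characters are copied here; the view borrows STR's storage.
  (vector str start end #f))

(define (cow-ref view k)
  (string-ref (vector-ref view 0) (+ (vector-ref view 1) k)))

(define (cow-set! view k ch)
  ;; On the first write, copy the viewed range into a private buffer.
  (if (not (vector-ref view 3))
      (let ((private (substring (vector-ref view 0)
                                (vector-ref view 1)
                                (vector-ref view 2))))
        (vector-set! view 0 private)
        (vector-set! view 1 0)
        (vector-set! view 2 (string-length private))
        (vector-set! view 3 #t)))
  (string-set! (vector-ref view 0) (+ (vector-ref view 1) k) ch))

(define (cow->string view)
  (substring (vector-ref view 0) (vector-ref view 1) (vector-ref view 2)))

;; Usage: writing through the view leaves the original untouched.
(define base (string-copy "hello world"))
(define v (cow-substring base 0 5))
(cow-set! v 0 #\J)
(cow->string v)   ; "Jello"
base              ; "hello world"
```

The sharing is invisible to the caller: the only observable difference from an eager copy is when the copying happens.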

The second purpose is about a change in behaviour, and I am not sure that it should be brought back in the old way. Providing shared substrings in that way changes the semantics of strings: two strings s1 and s2 which are not eq? would become connected in a user-visible way, such that modifications to s1 influence s2 and vice versa. What may users of a string data type expect? Should it be guaranteed that the following expression always evaluates to true?
 (if (and
       (not (eq? s1 s2))
       (not (eq? s2 s3))
       (equal? s1 s3))
      (begin
        (string-set! s2 0 #\x)
        (equal? s1 s3))
      #t)
My assumption is that most users will expect the above expression to evaluate to true. If that were not the case, we would require users to perform aliasing checks in their code. Do we really want that? How would we extend the string API so that users can perform the necessary aliasing checks? IIRC, the old shared substring API did not provide a means for such checks.
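
As a rough idea of what such an aliasing check would involve, here is a sketch in portable Scheme. It does not use the old shared substring API: explicitly shared substrings are modelled as vectors over a base string, and make-view, view-set!, view->string and views-alias? are hypothetical names chosen for illustration.

```scheme
;; Hypothetical model: a shared substring is a vector #(base start end).
(define (make-view base start end) (vector base start end))
(define (view-base v) (vector-ref v 0))
(define (view-start v) (vector-ref v 1))
(define (view-end v) (vector-ref v 2))

(define (view-set! v k ch)
  ;; Writes go straight to the shared base string.
  (string-set! (view-base v) (+ (view-start v) k) ch))

(define (view->string v)
  (substring (view-base v) (view-start v) (view-end v)))

;; Two views alias when they share one base and their ranges overlap.
(define (views-alias? a b)
  (and (eq? (view-base a) (view-base b))
       (< (view-start a) (view-end b))
       (< (view-start b) (view-end a))))

;; Usage: s1 and s2 are distinct objects, yet a write to s2 changes s1.
(define base (string-copy "abcdef"))
(define s1 (make-view base 0 4))     ; views "abcd"
(define s2 (make-view base 2 6))     ; views "cdef"
(define s3 (view->string s1))        ; independent copy "abcd"
(views-alias? s1 s2)                 ; #t
(view-set! s2 0 #\x)                 ; base is now "abxdef"
(equal? (view->string s1) s3)        ; #f
```

Every piece of code that mutates a string it did not create would need a check of this kind before it could rely on other strings staying unchanged.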

The shared substring feature was deprecated because we had come to consider it a bug in guile's design. I propose not to officially re-introduce it in its former way. The best thing would be to change the code that relies on the old behaviour. To allow applications to be migrated incrementally, we have provided the feature as deprecated since guile-1.6. If migration is not possible for some applications, then a workaround like the one that Mikael and Roland have developed can be used. With that solution, the feature may even remain part of guile, but deprecated and provided only for backwards compatibility! Whoever uses it should be aware that, due to the aliasing, it may lead to problems with other string libraries.

Best regards
Dirk Herrmann
