Re: About shared substrings (now working)


From: Roland Orre
Subject: Re: About shared substrings (now working)
Date: Mon, 19 Jan 2004 00:12:45 +0100

As Mikael Djurfeldt commented, his solution on Friday to the shared
substring problem was a little too magical. I added his suggestion about
guardians and that works fine. The essential code is below.

I'm happy to have shared substrings working again in guile 1.7, and it
is of course elegant not to need the substring tag. Along the way I
also learned about both weak hash tables and guardians. Still, I have
to say that I'm somewhat disappointed that the guile developers removed
an essential function like this before providing an alternative.

The reason I use shared substrings is the side effects: the substring
shares storage with its parent, so new data in the parent is visible in
the substring at once. I was very happy when the make-shared-substring
function was introduced in '93 or '94. It made tremendous speedups
possible when reading fixed-length text records, since I could
immediately treat the different fields of each record with standard
scheme conversion routines. Our production environment mostly uses
specialized C functions for this, but some essential parts of our
production code are still based upon shared substrings, as is the
prototype environment of course. When reading many millions of records
from a fixed-width text file it makes a tremendous difference in speed
if you can avoid substring allocation to reach the fields of these
records, and it is also nice to be able to access every field with
standard scheme routines.
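
To illustrate the principle outside of guile, here is a minimal C sketch
of allocation-free field access on a fixed-width record. The names
field_view, field_at, field_to_long and field_equals, as well as the
record layout, are hypothetical and made up for this example. It is not
how make-shared-substring is implemented, only the same idea of pointing
into the parent buffer instead of copying out of it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view type: a pointer into the record buffer plus a
   length.  Nothing is copied, so the view is only valid while the
   buffer is, and it sees whatever is read into the buffer next. */
struct field_view {
  const char *ptr;
  size_t len;
};

/* Return a view of record[start, end) without allocating. */
static struct field_view
field_at (const char *record, size_t record_len, size_t start, size_t end)
{
  struct field_view f;
  assert (start <= end && end <= record_len);
  f.ptr = record + start;
  f.len = end - start;
  return f;
}

/* Parse a space-padded, non-negative decimal field in place.
   Returns -1 if the field is blank or holds a non-digit character. */
static long
field_to_long (struct field_view f)
{
  long value = 0;
  size_t i = 0, end = f.len;
  while (i < end && f.ptr[i] == ' ')
    i++;
  while (end > i && f.ptr[end - 1] == ' ')
    end--;
  if (i == end)
    return -1;
  for (; i < end; i++) {
    if (f.ptr[i] < '0' || f.ptr[i] > '9')
      return -1;
    value = value * 10 + (f.ptr[i] - '0');
  }
  return value;
}

/* Compare a field with a C string, ignoring trailing padding spaces. */
static int
field_equals (struct field_view f, const char *s)
{
  size_t n = f.len;
  while (n > 0 && f.ptr[n - 1] == ' ')
    n--;
  return strlen (s) == n && memcmp (f.ptr, s, n) == 0;
}
```

The views can be set up once per record layout and then reused for every
record read into the same buffer, which is the side effect I rely on
with shared substrings.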

During the fall of 2003 I was not following the guile-devel list, as I
mainly consider myself a guile user. As far as I'm aware, the removal
of make-shared-substring was not mentioned at all on the guile-user
list. From the comments in 1.6 about make-shared-substring being
deprecated, I understood that it would be replaced with implicitly
shared substrings in 1.7. OK, 1.7 is a development version, so I should
not count on the functions there; this is my mistake.

Anyway, here is the working code. The gc-hook is not necessary and is
therefore commented out, as the unused substrings will be cached in
the guardian (OK, that was Mikael Djurfeldt's genius again :))

Hmm, maybe this is getting closer to the future "substring"
as well :) (at least for those of us who want the side effects...)

        Best regards
        Roland Orre

SCM substring_table;
SCM substring_guardian;

/* Optional after-gc hook: drain the guardian and clear every dead
   substring cell so that it no longer points into its parent. */
static void *
substrings_zombify (void *dummy1 SCM_UNUSED,
                    void *dummy2 SCM_UNUSED,
                    void *dummy3 SCM_UNUSED)
{
  SCM str;
  str=scm_call_0(substring_guardian);
  while (SCM_STRINGP(str)) {
    SCM_SETCAR(str,SCM_EOL); SCM_SETCDR(str,SCM_EOL);
    str=scm_call_0(substring_guardian);
  }
  return 0;
} /* substrings_zombify */


SCM_DEFINE
(scm_make_shared_substring, "make-shared-substring", 1, 2, 0,
 (SCM parent, SCM start, SCM end),
 "Return a shared substring of @var{parent}.  The arguments are the\n"
 "same as for the @code{substring} function: the shared substring\n"
 "returned includes all of the text from @var{parent} between\n"
 "indexes @var{start} (inclusive) and @var{end} (exclusive).  If\n"
 "@var{end} is omitted, it defaults to the end of @var{parent}.  The\n"
 "shared substring returned by @code{make-shared-substring}\n"
 "occupies the same storage space as @var{parent}.")
#define FUNC_NAME s_scm_make_shared_substring
{
  SCM substring;
  char *mem;
  int c_start, c_end;
  SCM_VALIDATE_SUBSTRING_SPEC_COPY (1, parent, mem,
                                    2, start, c_start,
                                    3, end, c_end);
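  /* Ask the guardian for a dead substring cell to recycle; it
     returns #f when none is available. */
  substring = scm_call_0 (substring_guardian);
  if (SCM_FALSEP(substring))
    substring = scm_cell(SCM_MAKE_STRING_TAG (c_end - c_start),
                         (scm_t_bits) (mem + c_start));
  else
    {
      SCM_DEFER_INTS;
      SCM_SETCAR(substring,SCM_MAKE_STRING_TAG (c_end - c_start));
      SCM_SETCDR(substring,(scm_t_bits) (mem + c_start));
      SCM_ALLOW_INTS;
    }
  /* The weak-key table keeps the parent alive for as long as the
     substring itself is reachable. */
  scm_hash_set_x (substring_table, substring, parent);
  /* Register the cell with the guardian so it can be reused later. */
  scm_apply (substring_guardian,substring,scm_listofnull);
  return substring;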
}
#undef FUNC_NAME

/* in init */
substring_table
  = scm_permanent_object(scm_make_weak_key_hash_table(SCM_UNDEFINED));

substring_guardian
  = scm_permanent_object(scm_make_guardian(SCM_UNDEFINED));

/* not needed: scm_c_hook_add(&scm_after_gc_c_hook, substrings_zombify,0,0); */





