octave-maintainers

m file cache


From: John W. Eaton
Subject: m file cache
Date: Wed, 24 May 2006 04:36:51 -0400

On  4-May-2006, I wrote the following about path searching:

| Looking at the code in kpse.cc, I think we can do better.  The
| home-brew lists and hash tables should probably be replaced with a few
| STL containers.  I'd guess that we can reduce the 2600 lines down to
| something a bit more manageable, especially if we decide to skip the
| more elaborate expansions in path elements as you suggest.
| 
| I'm not sure of the details yet, but I think the basic searching
| algorithm could be changed so that we maintain some kind of map
| (multimap? set? multiset?) of names to directories that allows a quick
| search through the filesystem, updating the map elements for each
| directory if the directory hasn't been cached before, or if a prompt
| has been printed since the last cache update.
| 
| It's not clear to me whether there should be one large map (fast
| lookups, but forces a rehash of the entire map at each prompt) or one
| map for each directory (possibly slower lookups, but updates to each
| per-directory map can happen only as needed) but I'm leaning toward
| one map for each directory since that may also make it simpler to list
| the complete contents of the path by directory (I'm thinking of what
| is needed for the help list).

I've been giving this problem some more thought, and I think it should
be possible to avoid having to clear the entire cache if we keep a
list of files for each directory in the load path AND if directory
timestamps are reliable.  Are there any widely used filesystems where
the timestamp of a directory does not change if a file is added to or
removed from it?  I think POSIX filesystems (including NFS?) do update
timestamps correctly, but I'm not sure about Windows.  What about
network filesystems with Windows?  Samba?

If we can rely on directory timestamps, then we shouldn't have to
clear the cache.  We should only need to check to see if any of the
directories in the path have changed since the last time we checked
and then if so, we refresh just those directories and proceed with the
file lookup.  This check only needs to happen once per prompt.  The
first lookup will be slow (all directories will need to be searched)
but the rest should be fast.

Also, I'm now thinking that it might be useful to maintain both a
per-directory list of file names and a master list of file names.
Each name in the master list would have an associated list of
directories in path order (the first entry is the first directory in
the path where the named file is found, the second entry is the next
such directory, and so on).  This way we can easily list all
shadowed/overloaded functions in the load path (I think this could be
very useful when searching for @directories for function overloading
for class objects).
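
One possible shape for the master list, again only as a sketch: the
type aliases dir_file_map and master_map and the functions
build_master_list and shadowed_names are hypothetical names, and the
per-directory listings are assumed to have been collected already.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Per-directory listing: directory name -> file names found in it.
using dir_file_map = std::map<std::string, std::set<std::string>>;

// Master list: file name -> directories containing it, in path order.
using master_map = std::map<std::string, std::vector<std::string>>;

// Build the master list by walking the load path in order, so that
// the first directory recorded for a name is the one that wins the
// lookup and the remaining ones are shadowed.
master_map
build_master_list (const std::vector<std::string>& path,
                   const dir_file_map& per_dir)
{
  master_map master;

  for (const auto& dir : path)
    {
      auto p = per_dir.find (dir);
      if (p == per_dir.end ())
        continue;  // No listing for this directory.

      for (const auto& name : p->second)
        master[name].push_back (dir);
    }

  return master;
}

// Names found in more than one directory of the load path.
std::vector<std::string>
shadowed_names (const master_map& master)
{
  std::vector<std::string> names;

  for (const auto& kv : master)
    if (kv.second.size () > 1)
      names.push_back (kv.first);

  return names;
}
```

A lookup is then master[name].front (), and everything after the
first element of that vector is a shadowed copy.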

Comments?

jwe

