libtool
path normalization


From: Ralf Wildenhues
Subject: path normalization
Date: Tue, 18 Jan 2005 12:27:45 +0100
User-agent: Mutt/1.4.1i

One step toward integrating Linux multilib support, but a Libtool
requirement independent of that goal, is comparison of normalized
paths.  In a nutshell, I'd like to be able to decide that
  ../foo/../lib
  ../lib
are equal.
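
A minimal sketch of the intended comparison, assuming a POSIX shell.
The helpers sketch_normalize and sketch_paths_equal are hypothetical
names used only for illustration.  This is not the sed-based function
posted in this message: it is purely lexical, handles only "/" as a
separator, drops trailing slashes, and ignores DOS drive letters.

```shell
# Hypothetical helper: lexically normalize $1, result in $sketch_result.
sketch_normalize ()
{
    case $1 in
      /*) sketch_root=/ ;;
      *)  sketch_root= ;;
    esac
    sketch_out=
    sketch_save_IFS=$IFS
    IFS=/
    set -f                      # no globbing while splitting $1
    for sketch_part in $1; do
      case $sketch_part in
        '' | .) ;;              # empty component or ".": skip
        ..)
          case $sketch_out in
            '')
              # nothing to pop: keep ".." if relative, drop it at "/"
              test -n "$sketch_root" || sketch_out=.. ;;
            .. | */..) sketch_out=$sketch_out/.. ;;
            */*)       sketch_out=${sketch_out%/*} ;;  # pop last component
            *)         sketch_out= ;;
          esac ;;
        *) sketch_out=${sketch_out:+$sketch_out/}$sketch_part ;;
      esac
    done
    set +f
    IFS=$sketch_save_IFS
    sketch_result=$sketch_root$sketch_out
    test -n "$sketch_result" || sketch_result=.
}

# Hypothetical helper: succeed if both paths normalize to the same string.
sketch_paths_equal ()
{
    sketch_normalize "$1"; sketch_first=$sketch_result
    sketch_normalize "$2"
    test "X$sketch_first" = "X$sketch_result"
}

if sketch_paths_equal ../foo/../lib ../lib; then
  echo equal
else
  echo different
fi
```

With the two paths from above this prints "equal".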

Unfortunately, libtool has so far neither required its input to be
normalized nor performed that normalization itself.  Installed .la
files with unnormalized paths will therefore be with us forever, so
adding such a requirement as an afterthought is hopeless.

Now I figure we may need to compare all sorts of paths:
- relative or absolute (but maybe not one group against the other),
- existing or not existing on the file system.
Thus `pwd -L' or portable approximations thereof won't work in all
cases, since they need a directory to change into.  If we ever have
to compare relative against absolute paths, I think we can rely on
them being present.
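
To illustrate why the cd/pwd approach is not enough, here is a small
sketch assuming a POSIX shell.  sketch_pwd_normalize is a hypothetical
helper, and the second path is assumed not to exist on the system.

```shell
# Hypothetical helper: normalize a directory name by entering it.
# This only works if the directory exists and may be entered.
sketch_pwd_normalize ()
{
    (cd "$1" 2>/dev/null && pwd -L)
}

# An existing directory: prints "/".
sketch_pwd_normalize /./ || echo "cannot normalize /./"

# A path assumed not to exist: cd fails, so there is nothing to print.
sketch_pwd_normalize /nonexistent-libtool-sketch/a/../lib ||
  echo "cannot normalize /nonexistent-libtool-sketch/a/../lib"
```

The second call fails although the path could be normalized textually,
which is the case a sed-based function has to cover.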

This is my first try at a shell function that implements this with
sed (and with little overhead in the most trivial cases).  I'm posting
it because it's not trivial, and I'd like to hear about bugs in it,
general comments on the problem (before integrating it into Libtool),
the choices I had to make about normalization, or possible
simplifications.  The size of the script is partly due to the fact
that we cannot use alternation `\|' in portable sed.
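
As a rough sketch of what avoiding alternation looks like: each
alternative becomes its own `s' command, and a `t' loop repeats them
until nothing changes.  The helper sketch_strip_dotdot is hypothetical
and handles only "/" separators.  Appending a slash first, so that ".."
only matches as a whole component, is an assumption of this sketch and
not taken from the script below.

```shell
# Hypothetical helper: remove "/name/.." after a non-slash character.
# Reads "/"-separated paths on stdin, one per line.
sketch_strip_dotdot ()
{
    sed -e '
        # temporary trailing slash, so ".." must be followed by "/"
        s#$#/#
        :again
        # name does not start with a dot
        s#\([^/]\)/[^/.][^/]*/\.\./#\1/#
        # name starts with exactly one dot
        s#\([^/]\)/\.[^/.][^/]*/\.\./#\1/#
        # name starts with two dots but is longer than ".."
        s#\([^/]\)/\.\.[^/]\{1,\}/\.\./#\1/#
        t again
        # remove the temporary slash again
        s#/$##
        '
}

printf '%s\n' 'a/b/../c' 'a/.b/../c' 'a/..b/../c' '../../c' |
  sketch_strip_dotdot
```

The first three lines come out as "a/c"; the last one is left alone
because ".." itself is never treated as a removable name.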

You'd also do me a favor if you tried this on your system and reported
back any output (it prints nothing if all goes well) or non-zero exit
status.  Your sed may require removing all comments from the sed
script; filtering the whole script through
  sed '/^[      ]\{1,\}#/d'
should do the trick (a space and a tab within [ ]).  Please try
different shells as well.
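
A small sketch of that filter in use, assuming a POSIX shell.  The
names sketch_tab, sketch_strip_sed_comments and sketch_script are
hypothetical, and the tab is generated with printf so that it survives
mail transport.

```shell
sketch_tab=$(printf '\t')

# Hypothetical helper: drop indented sed comment lines from stdin.
sketch_strip_sed_comments ()
{
    sed "/^[ $sketch_tab]\{1,\}#/d"
}

# A tiny commented sed script, standing in for the real one.
sketch_script='
        # collapse runs of slashes
        s#//*#/#g
'

sketch_stripped=$(printf '%s\n' "$sketch_script" | sketch_strip_sed_comments)
printf '%s\n' 'a//b///c' | sed -e "$sketch_stripped"
```

This prints "a/b/c".  Note that the filter only removes comment lines
that are indented by at least one space or tab.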

This function keeps trailing slashes on purpose; removing them is IMHO
an independent task.  I added the necessary shell-sanitizing preamble
for ease of testing.
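
For illustration, that independent task could look roughly like this
for "/"-separated paths.  sketch_strip_trailing_slash is a hypothetical
helper, assuming a POSIX shell; it leaves paths that consist only of
slashes untouched.

```shell
# Hypothetical helper: strip trailing slashes from $1, but keep paths
# made up of slashes only (such as "/" or "//") as they are.
# The result is left in $sketch_result.
sketch_strip_trailing_slash ()
{
    case $1 in
      *[!/]*/)
        sketch_result=$(printf '%s\n' "$1" | sed -e 's#//*$##') ;;
      *)
        sketch_result=$1 ;;
    esac
}

sketch_strip_trailing_slash ../y/
echo "$sketch_result"
```

This prints "../y".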

Thanks for reading this far,
Ralf

--- cut here ---
#! /bin/sh

if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
  emulate sh
  NULLCMD=:
  # Zsh 3.x and 4.x perform word splitting on ${1+"$@"}, which
  # is contrary to our usage.  Disable this feature.
  alias -g '${1+"$@"}'='"$@"'
  setopt NO_GLOB_SUBST
elif test -n "${BASH_VERSION+set}${KSH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then
  set -o posix
fi
BIN_SH=xpg4; export BIN_SH # for Tru64
DUALCASE=1; export DUALCASE # for MKS sh

set -e

for ECHO in "${ECHO-echo}" 'print -r' 'printf %s\n' false
do
  if test "`($ECHO '\t') 2>/dev/null || :`" = '\t'; then break; fi
done

: ${SED=sed}
: ${Xsed="$SED -e s,^X,,"}
: ${VERBOSE=false}

# func_path_normalize pathname
# Remove /./ and /../ parts from PATHNAME and store the result in
# func_path_normalize_result.
# Do not fork in most of the trivial cases.
# Preserve the number of consecutive slashes at the beginning,
# DOS drive letters, a trailing slash, and whether the path is
# absolute or relative.
# Limitations: newlines in PATHNAME are not supported, backslashes are
# always treated as separators, and DOS paths may not contain a colon
# (except for the drive letter).
func_path_normalize ()
{
    case $1 in
      *[\\/]..* | *[\\/].[\\/]* | *[\\/]. )
      func_path_normalize_result=`$ECHO "X$1" | $Xsed -e '
        # remove multiple slashes except at beginning
        s#\([^\\/]\)\([\\/]\)\{1,\}#\1\2#g

        #### /./

        :one
        # common case
        s#\([^\\/][\\/]\)\.[\\/]#\1#
        # at beginning
        s#^\([\\/]\{1,\}\)\.[\\/]#\1#
        # /. at end
        s#\([\\/]\)\.$#\1#
        t one

        #### /../

        # three cases for path elements:
        #   foo
        #   .foo
        #   ..foo
        # where foo is nonempty and may not start with a dot.

        # DOS drive letters
        /^[A-Za-z]:/ {
          :dos
          s#^\(..[\\/]\)[^\\/.][^\\/]*[\\/]\.\.#\1#
          s#^\(..[\\/]\)\.[^\\/.][^\\/]*[\\/]\.\.#\1#
          s#^\(..[\\/]\)\.\.[^\\/]\{1,\}[\\/]\.\.#\1#
          t dos

          # common case
          s#\([^\\/:]\)[\\/][^\\/.][^\\/]*[\\/]\.\.#\1#
          s#\([^\\/:]\)[\\/]\.[^\\/.][^\\/]*[\\/]\.\.#\1#
          s#\([^\\/:]\)[\\/]\.\.[^\\/]\{1,\}[\\/]\.\.#\1#
          t dos

          s#^\(..\)[^\\/]\{1,\}[\\/]\.\.[\\/]*$#\1#

          # we may have picked up multiple slashes again
          s#\([^\\/]\)\([\\/]\)\{1,\}#\1\2#g

          b end
        }

        # common case
        :common
        s#\([^\\/]\)[\\/][^\\/.][^\\/]*[\\/]\.\.#\1#
        s#\([^\\/]\)[\\/]\.[^\\/.][^\\/]*[\\/]\.\.#\1#
        s#\([^\\/]\)[\\/]\.\.[^\\/]\{1,\}[\\/]\.\.#\1#
        t common

        # do not add slashes to the root
        :root
        s#^\([\\/]\{1,\}\)[^\\/.][^\\/]*[\\/]\.\.[\\/]*$#\1#
        s#^\([\\/]\{1,\}\)\.[^\\/.][^\\/]*[\\/]\.\.[\\/]*$#\1#
        s#^\([\\/]\{1,\}\)\.\.[^\\/]\{1,\}[\\/]\.\.[\\/]*$#\1#
        t root

        # root special cases
        s#^[^\\/]\{1,\}[\\/]\.\.$#.#
        :end
        s#^\([\\/]\{1,\}\)\.\{1,2\}$#\1#
        '`;;
      *) func_path_normalize_result=$1;;
    esac
}

#  input                          output (desired)
tests='
/                                 /
/.                                /
/..                               /
..                                ..
../x                              ../x
a                                 a
a/b                               a/b
a/b/..                            a
a/b/../                           a/
a/b/../c                          a/c
a/b/c/d/../../../                 a/
a/b/c/d/../../..                  a
//a/b/c/d/../../../..             //
//a/b/c/d/../../../../            //
../x/../y                         ../y
../x/../y/                        ../y/
../../../a/b/../c/../../d         ../../../d
../../..//a/b//..////c/..//../d   ../../../d
/a/..                             /
/a/../                            /
a/b/..                            a
/.lib/..                          /
//a                               //a
///a/b/../c/../d                  ///a/d
C:\a\b\c\..\d                     C:\a\b\d
C:\a\b\c\..\d\                    C:\a\b\d\
C:\a\..                           C:\
C:\a\b\..\..                      C:\
C:\a\b\..\..\                     C:\
C:a\b\c\..\d                      C:a\b\d
C:a\b\c\..\d\                     C:a\b\d\
C:a\..                            C:
C:a\b\..\..                       C:
C:a\b\..\..\                      C:
.                                 .
.x                                .x
./                                ./
./x                               ./x
././x                             ./x
x/.                               x/
x/..                              .
x/./                              x/
x/././                            x/
/./                               /
///./                             ///
/x/./../                          /
/x/y/..                           /x
/x/y/../                          /x/
/x/y/./../..                      /
/x/y/./../../                     /
'

set dummy $tests
shift
while :
do
  in=$1; out=$2
  if test -z "$in"; then break; fi
  func_path_normalize "$in"
  if $VERBOSE || test "X$out" != "X$func_path_normalize_result"; then
    $ECHO "|$in|   |$func_path_normalize_result|   |$out|"
  fi
  shift; shift
done



