bug-gnulib
Re: parse_time()


From: Bruce Korb
Subject: Re: parse_time()
Date: Sun, 02 Nov 2008 13:21:57 -0800
User-agent: Thunderbird 2.0.0.12 (X11/20071114)

Bruno Haible wrote:

> Looking at wikipedia [1], I would find it good if
>   1) the function was called 'parse_duration', not 'parse_time' (since "time"
>      often denotes a time instant within a day),
>   2) the three duration formats described in [1] were also supported.

Hi Bruno,
You'll find it "good" now.  It makes the file a little longer, though:
about 800 lines' worth.  It could probably be condensed, but that would
take some thought.

It is much more forgiving than the ISO-8601 spec.  It doesn't require a
count to be lower than its container size (36 hours is fine, for example).
The result just has to fit in a time_t value.  So there are lots of ways
to write a duration:

  PyyyymmddThhmmss
  P nnnn Y nn M nn D T nn H nn M nn S
  nn d HH:MM:SS
  SSSS
  MMM:SS

whatever.  :)  There are probably bugs now because there are a lot of
permutations allowed.  Undoubtedly some are untested.
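
For illustration, here is a minimal sketch of how the colon forms above
(SSSS, MMM:SS, HH:MM:SS) could fold into a seconds count: each colon
multiplies the running total by 60 before the next field is added.  The
helper name colon_fields_to_seconds is hypothetical and is not part of the
attached file; the sketch assumes non-negative fields and at most three of
them.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the attached file: fold up to
   three colon-separated decimal fields into a seconds count.
   Returns 0 on success and stores the total in *out; returns -1 on a
   malformed string or on overflow.  */
static int
colon_fields_to_seconds (const char *s, long *out)
{
  long total = 0;
  int fields = 0;

  for (;;)
    {
      char *end;
      unsigned long v;

      /* Every field must start with a digit (no sign, no blanks).  */
      if (*s < '0' || *s > '9')
        return -1;

      errno = 0;
      v = strtoul (s, &end, 10);
      if (errno != 0 || v > (unsigned long) LONG_MAX)
        return -1;

      /* total = total * 60 + v, refusing to overflow a long.  */
      if (total > (LONG_MAX - (long) v) / 60)
        return -1;
      total = total * 60 + (long) v;
      fields++;

      if (*end == '\0')
        break;
      /* Only a colon may follow a field, and only two colons total.  */
      if (*end != ':' || fields == 3)
        return -1;
      s = end + 1;
    }

  *out = total;
  return 0;
}
```

With this sketch, "2:30" yields 150, "1:00:00" yields 3600, and "100:05"
yields 6005, since a field is allowed to exceed its container size.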

Cheers - Bruce
Bruno

[1] http://en.wikipedia.org/wiki/ISO_8601#Durations
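
As a worked example of the arithmetic in the attachment that follows, here
is a sketch that totals a fully split duration using the same fixed
conversions the attached code defines (30-day months, 365-day years) and
the same 0x7FFFFFFF ceiling.  The names checked_scale_add,
duration_seconds, and SKETCH_MAX are hypothetical and chosen only for this
illustration; the attached file does its own string parsing, which this
sketch leaves out.

```c
#define SEC_PER_MIN    60L
#define SEC_PER_HR     (SEC_PER_MIN * 60)
#define SEC_PER_DAY    (SEC_PER_HR * 24)
#define SEC_PER_MONTH  (SEC_PER_DAY * 30)   /* fixed 30-day month */
#define SEC_PER_YEAR   (SEC_PER_DAY * 365)  /* fixed 365-day year */
#define SKETCH_MAX     0x7FFFFFFFL          /* hypothetical ceiling */

/* Hypothetical helper: add val * scale to *acc without ever exceeding
   SKETCH_MAX.  Returns 0 on success, -1 on a negative count or on
   overflow.  *acc is assumed to be within 0 .. SKETCH_MAX on entry.  */
static int
checked_scale_add (long *acc, long val, long scale)
{
  if (val < 0 || scale <= 0 || val > SKETCH_MAX / scale)
    return -1;
  val *= scale;
  if (*acc > SKETCH_MAX - val)
    return -1;
  *acc += val;
  return 0;
}

/* Hypothetical helper: total seconds for the six counts of a duration,
   or -1 if any count is negative or the total would overflow.  */
static long
duration_seconds (long y, long mo, long d, long h, long mi, long s)
{
  long acc = 0;

  if (checked_scale_add (&acc, y, SEC_PER_YEAR) != 0
      || checked_scale_add (&acc, mo, SEC_PER_MONTH) != 0
      || checked_scale_add (&acc, d, SEC_PER_DAY) != 0
      || checked_scale_add (&acc, h, SEC_PER_HR) != 0
      || checked_scale_add (&acc, mi, SEC_PER_MIN) != 0
      || checked_scale_add (&acc, s, 1L) != 0)
    return -1;

  return acc;
}
```

Under these assumptions, the counts of "P3Y6M4DT12H30M5S" total 110550605
seconds, "PT36H" and "P1DT12H" both total 129600 seconds, and a count of
100 years is rejected because it exceeds the ceiling.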



/* Parse a time duration and return a seconds count
   Copyright (C) 2008 Free Software Foundation, Inc.
   Written by Bruce Korb <address@hidden>, 2008.

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

/*

http://en.wikipedia.org/wiki/ISO_8601#Durations

Durations are a component of time intervals and define the amount of
intervening time in a time interval. They should only be used as part of a time
interval as prescribed by the standard.

Durations are represented by the format P[n]Y[n]M[n]DT[n]H[n]M[n]S or P[n]W.
In these representations, the [n] is replaced by the value
for each of the date and time elements that follow the [n]. Leading zeros are
not required, but the maximum number of digits for each element should be
agreed to by the communicating parties. The capital letters 'P', 'Y', 'M', 'W',
'D', 'T', 'H', 'M', and 'S' are designators for each of the date and time
elements and are not replaced.

    * P is the duration designator (historically called "period") placed
           at the start of the duration representation.
    * Y is the year designator that follows the value for the number of years.
    * M is the month designator that follows the value for the number of months.
    * W is the week designator that follows the value for the number of weeks.
    * D is the day designator that follows the value for the number of days.
    * T is the time designator that precedes the time components of the
           representation.
    * H is the hour designator that follows the value for the number of hours.
    * M is the minute designator that follows the value for the number of
           minutes.
    * S is the second designator that follows the value for the number of
           seconds.

For example, "P3Y6M4DT12H30M5S" represents a duration of "three years, six
months, four days, twelve hours, thirty minutes, and five seconds". Date and
time elements including their designator may be omitted if their value is zero,
and lower order elements may also be omitted for reduced precision. For
example, "P23DT23H" and "P4Y" are both acceptable duration representations.

To resolve ambiguity, "P1M" is a one-month duration and "PT1M" is a one-minute
duration (note the time designator, T, that precedes the time value). The
smallest value used may also have a decimal fraction, as in "P0.5Y" to indicate
half a year. The standard does not prohibit date and time values in a duration
representation from exceeding their "carry-over points" except as noted
below. Thus, "PT36H" could be used as well as "P1DT12H" for representing the
same duration.

Alternately, a format for duration based on combined date and time
representations may be used by agreement between the communicating parties
either in the basic format PYYYYMMDDThhmmss or in the extended format
P[YYYY]-[MM]-[DD]T[hh]:[mm]:[ss]. For example, the same duration as shown above
would be "P0003-06-04T12:30:05". However, individual date and time values
cannot exceed their "carry-over point" (ex., a value of "13" for the month or
"25" for the hour would not be permissible).

 */
#include <config.h>

#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#include "xalloc.h"     /* xstrdup, used by parse_period */

#ifndef _
#define _(_s)  _s
#endif

#ifndef NUL
#define NUL '\0'
#endif

typedef enum {
  NOTHING_IS_DONE,
  DAY_IS_DONE,
  HOUR_IS_DONE,
  MINUTE_IS_DONE,
  SECOND_IS_DONE
} whats_done_t;

#define SEC_PER_MIN     60
#define SEC_PER_HR      (SEC_PER_MIN * 60)
#define SEC_PER_DAY     (SEC_PER_HR  * 24)
#define SEC_PER_WEEK    (SEC_PER_DAY * 7)
#define SEC_PER_MONTH   (SEC_PER_DAY * 30)
#define SEC_PER_YEAR    (SEC_PER_DAY * 365)

#define TIME_MAX        0x7FFFFFFF
#define BAD_TIME        ((time_t)~0)

static inline time_t
scale_n_add(time_t base, time_t val, int scale)
{
  if (base == BAD_TIME)
    {
      if (errno == 0)
        errno = EINVAL;
      return BAD_TIME;
    }

  if (val > TIME_MAX / scale)
    {
      errno = ERANGE;
      return BAD_TIME;
    }

  val *= scale;
  if (base > TIME_MAX - val)
    {
      errno = ERANGE;
      return BAD_TIME;
    }

  return base + val;
}

static time_t
parse_hr_min_sec(time_t start, char * pz)
{
  int lpct = 0;

  /* For as long as our scanner pointer points to a colon *AND*
     we've looped fewer than two times, keep looping.  (two iterations max) */
  while ((errno == 0) && (*pz == ':') && (lpct++ <= 1))
    {
      unsigned long v;

      pz++;
      errno = 0;
      v = strtoul(pz, &pz, 10);
      if (errno != 0)
        return BAD_TIME;

      start = scale_n_add(v, start, 60);
    }

  /* allow for trailing spaces */
  while (isspace(*pz))   pz++;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return start;
}

static time_t
parse_year_month_day(char const * pz, char const * ps)
{
  time_t res = 0, val;

  errno = 0;
  val = strtoul(pz, (char **)&pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (pz != ps)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_YEAR);


  ps = strchr(++pz, '-');
  if (ps == NULL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(pz, (char **)&pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (pz != ps)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_MONTH);

  pz++;
  errno = 0;
  val = strtoul(pz, (char **)&pz, 10);
  if (errno != 0)
    return BAD_TIME;
  while (isspace((int)*pz))   pz++;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return scale_n_add(res, val, SEC_PER_DAY);
}

static time_t
parse_yearmonthday(char const * in_pz)
{
  time_t res = 0, val;
  char buf[8];
  char * pz;

  memcpy(buf, in_pz, 4);
  buf[4] = NUL;
  if (strlen(buf) != 4)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(buf, &pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(0, val, SEC_PER_YEAR);

  memcpy(buf, in_pz + 4, 2);
  buf[2] = NUL;
  if (strlen(buf) != 2)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(buf, &pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_MONTH);

  memcpy(buf, in_pz + 6, 2);
  if (strlen(buf) != 2)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(buf, &pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_DAY);
  if (in_pz[8] != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return res;
}

static time_t
parse_YMD(char const * pz)
{
  time_t res = 0, val;
  char * ps = strchr(pz, 'Y');
  if (ps != NULL)
    {
      errno = 0;
      val = strtoul(pz, (char **)&pz, 10);
      if (errno != 0)
        return BAD_TIME;
      while (isspace((int)*pz))   pz++;
      if (pz != ps)
        {
          errno = EINVAL;
          return BAD_TIME;
        }
      pz++;
      res = scale_n_add(0, val, SEC_PER_YEAR);
    }

  ps = strchr(pz, 'M');
  if (ps != NULL)
    {
      errno = 0;
      val = strtoul(pz, (char **)&pz, 10);
      if (errno != 0)
        return BAD_TIME;
      while (isspace((int)*pz))   pz++;
      if (pz != ps)
        {
          errno = EINVAL;
          return BAD_TIME;
        }
      pz++;
      res = scale_n_add(res, val, SEC_PER_MONTH);
    }

  ps = strchr(pz, 'D');
  if (ps != NULL)
    {
      errno = 0;
      val = strtoul(pz, (char **)&pz, 10);
      if (errno != 0)
        return BAD_TIME;
      while (isspace((int)*pz))   pz++;
      if (pz != ps)
        {
          errno = EINVAL;
          return BAD_TIME;
        }
      pz++;
      res = scale_n_add(res, val, SEC_PER_DAY);
    }

  while (isspace((int)*pz))   pz++;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return res;
}

static time_t
parse_hour_minute_second(char const * pz, char const * ps)
{
  time_t res = 0, val;

  errno = 0;
  val = strtoul(pz, (char **)&pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (pz != ps)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_HR);


  ps = strchr(++pz, ':');
  if (ps == NULL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(pz, (char **)&pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (pz != ps)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_MIN);

  pz++;
  errno = 0;
  val = strtoul(pz, (char **)&pz, 10);
  if (errno != 0)
    return BAD_TIME;
  while (isspace((int)*pz))   pz++;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return scale_n_add(res, val, 1);
}

static time_t
parse_hourminutesecond(char const * in_pz)
{
  time_t res = 0, val;
  char buf[8];
  char * pz;

  memcpy(buf, in_pz, 2);
  buf[2] = NUL;
  if (strlen(buf) != 2)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(buf, &pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(0, val, SEC_PER_HR);

  memcpy(buf, in_pz + 2, 2);
  buf[2] = NUL;
  if (strlen(buf) != 2)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(buf, &pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, SEC_PER_MIN);

  memcpy(buf, in_pz + 4, 2);
  if (strlen(buf) != 2)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  errno = 0;
  val = strtoul(buf, &pz, 10);
  if (errno != 0)
    return BAD_TIME;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  res = scale_n_add(res, val, 1);
  if (in_pz[6] != NUL)   /* hhmmss is six characters, not eight */
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return res;
}

static time_t
parse_HMS(char const * pz)
{
  time_t res = 0, val;
  char * ps = strchr(pz, 'H');
  if (ps != NULL)
    {
      errno = 0;
      val = strtoul(pz, (char **)&pz, 10);
      if (errno != 0)
        return BAD_TIME;
      while (isspace((int)*pz))   pz++;
      if (pz != ps)
        {
          errno = EINVAL;
          return BAD_TIME;
        }
      pz++;
      res = scale_n_add(0, val, SEC_PER_HR);
    }

  ps = strchr(pz, 'M');
  if (ps != NULL)
    {
      errno = 0;
      val = strtoul(pz, (char **)&pz, 10);
      if (errno != 0)
        return BAD_TIME;
      while (isspace((int)*pz))   pz++;
      if (pz != ps)
        {
          errno = EINVAL;
          return BAD_TIME;
        }
      pz++;
      res = scale_n_add(res, val, SEC_PER_MIN);
    }

  ps = strchr(pz, 'S');
  if (ps != NULL)
    {
      errno = 0;
      val = strtoul(pz, (char **)&pz, 10);
      if (errno != 0)
        return BAD_TIME;
      while (isspace((int)*pz))   pz++;
      if (pz != ps)
        {
          errno = EINVAL;
          return BAD_TIME;
        }
      pz++;
      res = scale_n_add(res, val, 1);
    }

  while (isspace((int)*pz))   pz++;
  if (*pz != NUL)
    {
      errno = EINVAL;
      return BAD_TIME;
    }

  return res;
}

static time_t
parse_time(char const * pz)
{
  char * ps;

  time_t res = 0;
  time_t val;

  /*
   *  Scan for a colon
   */
  ps = strchr(pz, ':');
  if (ps != NULL)
    {
      res = parse_hour_minute_second(pz, ps);
    }

  /*
   *  Try for a 'H', 'M' or 'S' suffix
   */
  else if (ps = strpbrk(pz, "HMS"),
           ps == NULL)
    {
      /* It's an hhmmss format: */
      res = parse_hourminutesecond(pz);
    }

  else
    res = parse_HMS(pz);

  return res;
}

/*
 *  Parse the year/months/days of a time period
 */
static time_t
parse_period(char const * in_pz)
{
  char * pz  = xstrdup(in_pz);
  char * pT  = strchr(pz, 'T');
  char * ps;
  void * fptr = pz;

  time_t res = 0;
  time_t val;

  if (pT != NULL)
    *(pT++) = NUL;

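  /*
   *  An empty date part (e.g. "PT36H") contributes nothing.
   */
  if (*pz == NUL)
    {
      errno = 0;
      res = 0;
    }

  /*
   *  Scan for a hyphen
   */
  else if (ps = strchr(pz, '-'),
           ps != NULL)
    {
      res = parse_year_month_day(pz, ps);
    }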

  /*
   *  Try for a 'Y', 'M' or 'D' suffix
   */
  else if (ps = strpbrk(pz, "YMD"),
           ps == NULL)
    {
      /* It's a YYYYMMDD format: */
      res = parse_yearmonthday(pz);
    }

  else
    res = parse_YMD(pz);

  if ((errno == 0) && (pT != NULL))
    {
      val = parse_time(pT);
      res = scale_n_add(res, val, 1);
    }

  free(fptr);
  return res;
}

time_t
parse_duration(char const * in_pz)
{
  whats_done_t whatd_we_do = NOTHING_IS_DONE;

  char * pz;
  time_t val;
  time_t res = 0;

  while (isspace(*in_pz))      in_pz++;
  if (*in_pz == 'P')
    {
      res = parse_period(in_pz + 1);
      if ((errno != 0) || (res == BAD_TIME))
        goto bad_time;
      return res;
    }

  if (*in_pz == 'T')
    {
      res = parse_time(in_pz + 1);
      if ((errno != 0) || (res == BAD_TIME))
        goto bad_time;
      return res;
    }

  if (! isdigit((int)*in_pz))  goto bad_time;
  pz = (char *)in_pz;

  do  {

    errno = 0;
    val = strtol(pz, &pz, 10);
    if (errno != 0)
      goto bad_time;

    /*  IF we find a colon, then we're going to have a seconds value.
        We will not loop here any more.  We cannot already have parsed
        a minute value and if we've parsed an hour value, then the result
        value has to be less than an hour. */
    if (*pz == ':')
      {
        if (whatd_we_do >= MINUTE_IS_DONE)
          break;

        val = parse_hr_min_sec(val, pz);

        if (errno != 0)
          break;

        if ((whatd_we_do == HOUR_IS_DONE) && (val >= SEC_PER_HR))
          break;

        /* Check for overflow */
        if (res > TIME_MAX - val)
          break;

        return res + val;
      }

    {
      unsigned int mult;

      while (isspace(*pz))   pz++;

      switch (*pz)
        {
        default:  goto bad_time;
        case NUL:
          /* Check for overflow */
          if (res > TIME_MAX - val)
            goto bad_time;

          return val + res;

        case 'd':
          if (whatd_we_do >= DAY_IS_DONE)
            goto bad_time;
          mult = SEC_PER_DAY;
          whatd_we_do = DAY_IS_DONE;
          break;

        case 'h':
          if (whatd_we_do >= HOUR_IS_DONE)
            goto bad_time;
          mult = SEC_PER_HR;
          whatd_we_do = HOUR_IS_DONE;
          break;

        case 'm':
          if (whatd_we_do >= MINUTE_IS_DONE)
            goto bad_time;
          mult = SEC_PER_MIN;
          whatd_we_do = MINUTE_IS_DONE;
          break;

        case 's':
          mult = 1;
          whatd_we_do = SECOND_IS_DONE;
          break;
        }

      /*  Check for overflow:  value that overflows or an overflowing
          result when "val" gets added to it.  */
      if (val > TIME_MAX / mult)
        break;

      val *= mult;
      if (res > TIME_MAX - val)
        break;

      res += val;

      while (isspace(*++pz))   ;
      if (*pz == NUL)
        return res;

      if (! isdigit(*pz))
        break;
    }

  } while (whatd_we_do < SECOND_IS_DONE);

 bad_time:

  fprintf(stderr, _("Invalid time duration:  %s\n"), in_pz);
  errno = EINVAL;
  return BAD_TIME;
}

/*
 * Local Variables:
 * mode: C
 * c-file-style: "gnu"
 * indent-tabs-mode: nil
 * End:
 * end of parse-duration.c */
