emacs-orgmode


From: Sterling Hooten
Subject: Re: [FEATURE REQUEST] Timezone support in org-mode datestamps and org-agenda
Date: Fri, 27 Jan 2023 03:06:08 -0300

Hi all,

Collaborating around the subject of "time" is difficult; subtleties
abound in the implementation, in the perspectives people come from, and
in the language used in discussions. I'm going to provide a glossary to
establish common terminology, use these terms to analyze our current
state, offer a roadmap for solving the problem in stages, suggest a
format for timestamps, urge compatibility with "exotic" use cases, and
finally call for outside help with implementing a timezone aware agenda
system.

Summary and references are at the end.

This is an initial glossary compiled from various standards and sources;
it's incomplete, probably incorrect, and open to critique, but is useful
in articulating a possible road map forward.

• Time

  Time (concept)
        What clocks measure (Einstein)
  Time axis
        Mathematical representation of the succession in time according
        to the space-time model of instantaneous events along a unique
        axis (ISO).

  Instant (object)
        A single point on time axis (ISO).
  Moment in time
        See: instant.
  Mark
        A set of symbols related to the object, or carrying some
        symbolic meaning
  Time scale
        System of ordered marks which can be attributed to instants on
        the time axis, one instant being chosen as the origin. e.g.,
        GMT, UTC, TAI.
  Basis time
        See: time scale.
  Time (mark)
        The designation of an instant on a selected time scale, used in
        the sense of time of day.
  Time interval (object)
        Part of the time axis limited by two instants and, unless
        otherwise stated, the limiting instants themselves (ISO). A
        part of time limited by two instants or moments in time. The
        elapsed time between two events (NIST).
  Duration (object)
        A quantity characterizing a time interval. Durations can be
        written in different formats.
  UTC
        Time scale with the same rate as International Atomic Time
        (TAI), but differing from TAI only by an integral number of
        seconds.
  Offset
        Constant duration difference between times of two time scales
        (ISO). i.e., a quantity to combine with a time scale to produce
        a wall time. e.g., Nepal uses a +5:45 offset from the UTC time
        scale.
  Time shift
        See: offset.
• Calendar and civil time
  Wall time
        what shows on the clock on the wall at a location. Like "local
        system time" but needn't reference a computer to do the
        calculation.
  Standard time
        Time scale derived from UTC, by a time shift established in a
        given location by the competent authority (ISO).
  Local system time
        Local system time is determined by applying the system's time
        zone offset and year offset values to UTC. The Time of day
        system value displays the local system time. Local system time
        and system time are used interchangeably.
  Time Zone
        A place/region. Can map between wall time and a time scale with
        a table and an offset. A set of rules for determining the local
        observed time (wall time) as it relates to incremental time (as
        used in most computing systems) for a particular geographical
        region. e.g., Brasília time presently has an offset of −03:00
        from the UTC time.
  Calendar event
        A calendar object that is commonly used to represent things that
        mark time or use time. Examples include meetings, appointments,
        anniversaries, start times, arrival times, closing times.

• Implementation: how we actually decide to record, reference, or
  manipulate time.
  Representation
        Expression indicating a time point, time interval or recurring
        time interval. e.g., [2023-02-02 Thu 12:58 +1w], "this next
        Sunday at 2pm EST", 3600 seconds from Unix epoch.
  Format
        A description of the abstract form used for a representation.
        e.g., [YYYY-MM-DD] 'Explain in prose relative to this moment in
        time using locale and include your timezone'
  Encoding
        The method of storing a representation of time e.g., datestruct
        in memory, Org timestamp in body of heading, value of a
        "created" key in a database
  Format syntax
        Rules that allow for parsing an encoding unambiguously into some
        time scale.
  Timestamp (mark)
        An encoded representation in a selected format. e.g., 24/01/2023
        or 2023-01-24
  Delimiting syntax
        Rules that allow for detection and extraction of an encoding.
        Necessary for encodings embedded in prose. e.g., '[]' for org
        timestamps.

  Displayed time
        The formatting of a representation exposed to a user.
  Calculating
        Manipulating a set of time points, time intervals, or recurring
        time intervals. e.g., determining instant from an offset,
        comparing two representations in some lattice.
  Incremental time
        A datetime value consisting of monotonically increasing integer
        units measured from a specific moment in time (epoch). For
        example, the moment 1970-01-02T00:00:00.000Z might have an
        incremental time value (measured in milliseconds) of 86400000,
        since there are 86,400 seconds in a day and 1000 ms in a second.
  Floating time
        A wall time value without time zone or offset information. E.g.,
        2023-01-24 or 6:45pm.
  Fixed time
        A representation of a (past or future) UTC time.
  Absolute time
        See: fixed time.
  Unfixed time (from UTC)
        A representation which is not referenced to a past or future UTC
        time. e.g., Future time given as a local time in some specified
        time zone, where changes to the definition of that time zone
        (e.g., a political decision to enact or rescind daylight saving
        time) affect the instant in time corresponding with the
        timestamp.
• Time formats
  Incremental timestamp
        Timestamps that can be directly compared: their integer values
        determine which is earlier or later. e.g., Unix seconds since
        epoch.
  Field-Based timestamp
        Timestamps which must be normalized and their individual fields
        compared. Field based times can have certain kinds of logical
        operations performed on them (for example, rolling the month
        forward or back), while incremental time requires a logical
        transformation. e.g., ISO8601 style timestamps.
  ISO Basic format
        A format which omits hyphen separators e.g., YYYYMMDD
  ISO Extended format
        A format which includes hyphen separators e.g., YYYY-MM-DD
  Extended Date/Time Format EDTF
        An extension of the ISO 8601 created by the Library of Congress
        to cover date formats and conditions useful in metadata systems
        but not handled in the ISO standard.
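
To make a few of the glossary terms concrete (fixed vs. floating,
incremental vs. field-based), here is a minimal sketch. Python and its
standard datetime module are my choice purely for illustration; none of
this is an existing or proposed Org API.

```python
from datetime import datetime, timezone

# Fixed, field-based: calendar fields plus an explicit reference to UTC.
fixed = datetime(1970, 1, 2, 0, 0, 0, tzinfo=timezone.utc)

# Incremental: integer milliseconds counted from the Unix epoch.
incremental_ms = int(fixed.timestamp() * 1000)

# Floating: the same fields, but with no zone or offset attached,
# so no particular instant on the time axis is implied.
floating = datetime(1970, 1, 2, 0, 0, 0)

# A field-based operation: roll the month forward by editing one field.
next_month = fixed.replace(month=fixed.month + 1)

print(incremental_ms)          # 86400000, as in the glossary example
print(floating.tzinfo)         # None
print(next_month.isoformat())  # 1970-02-02T00:00:00+00:00
```

The incremental value can be compared as a plain integer, while rolling
the month forward is only natural on the field-based form.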


What format does Org have now?

• The format currently in use for timestamps is floating, field-based,
  and has a resolution/precision of minutes.

What kinds of representations would a calendar system capable of
handling timezones require?

• Instant (fixed)
  • This is referring to an unambiguous moment in time
  • e.g., 2007-02-03T05:00:00.000Z
• Offset (fixed)
  • This captures the idea of "when did it happen for the person who
    made the observation"
  • e.g., 2007-02-03T04:00:00.000+01:00
• Instant with explicit offset and zone (fixed)
  • e.g., 2007-01-01T02:00:00.000-06:00[America/Chicago]
• Zoned local date time (floating)
  • Tricky, requires decisions about how to interpret timestamps after
    political changes.
  • e.g., 2007-01-01T01:00:00.000[America/Chicago]
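
As a rough sketch of the first two representations, the following
parses an instant and an offset timestamp and shows that both resolve
to a point on the UTC time scale, while the offset form also keeps the
observer's local fields. The helper name parse_fixed is hypothetical.
The two zoned forms are left out because they need a time zone database
and a parser for the bracketed zone suffix, which Python's
datetime.fromisoformat does not handle.

```python
from datetime import datetime, timedelta, timezone

def parse_fixed(text):
    """Parse an ISO 8601 extended timestamp ending in 'Z' or a numeric offset."""
    # Older Python versions reject a trailing 'Z', so spell it as +00:00.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))

instant = parse_fixed("2007-02-03T05:00:00.000Z")
offset = parse_fixed("2007-02-03T04:00:00.000+01:00")

# The offset form remembers what the observer's clock showed...
local_hour = offset.hour                  # 4
observer_offset = offset.utcoffset()      # one hour ahead of UTC

# ...and still maps to a single instant on the UTC time scale.
offset_as_utc = offset.astimezone(timezone.utc)

# Both are fixed, so they can be ordered against each other.
offset_is_earlier = offset < instant
```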


I claim that before dealing with the nuances of calendar appointments,
repeating events, agenda displays etc., Org must first support
fixed/absolute time instead of just floating time. Without some basis
time scale, converting from time zones and offsets to an incremental
time point is impossible. Resolving this prerequisite will
also simplify the timezone discussion because we won't be mixing
calendar issues with time issues.
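
A small sketch of why floating time alone is not enough: a floating
timestamp cannot be ordered against a fixed one until somebody supplies
an offset, and the answer depends on which offset is supplied. The
helper fix_with_offset is hypothetical; the offsets are the Brasília
and Nepal values mentioned in the glossary.

```python
from datetime import datetime, timedelta, timezone

floating = datetime(2023, 1, 25, 12, 12)                  # like [2023-01-25 Wed 12:12]
fixed = datetime(2023, 1, 25, 12, 0, tzinfo=timezone.utc)  # like [2023-01-25T12:00:00Z]

# Python refuses to order a floating (naive) value against a fixed one.
try:
    floating < fixed
    comparable = True
except TypeError:
    comparable = False

def fix_with_offset(naive, offset_hours):
    """Attach a constant UTC offset to a floating timestamp."""
    return naive.replace(tzinfo=timezone(timedelta(hours=offset_hours)))

as_brasilia = fix_with_offset(floating, -3)    # 15:12 UTC
as_nepal = fix_with_offset(floating, 5.75)     # 06:27 UTC

print(comparable)            # False
print(as_brasilia > fixed)   # True: after the fixed instant
print(as_nepal < fixed)      # True: before the fixed instant
```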

What would a roadmap be?

• Design and implement an absolute and offset timestamp system
  • Decide on a time scale
  • Decide on a format and syntax
  • Implement instant timestamps
  • Implement offset timestamps
• Design and implement the time zone aware calendar system. This is a
  separate project.

What time scale should Org use?

There are only two decent options, either TAI or UTC. The rest of the
world has agreed upon UTC; we should too. Conversion to TAI can be done
by users or on export.
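
To show how small the UTC-to-TAI conversion is, here is a hedged
sketch. The table holds only the three most recent leap-second steps
(TAI minus UTC has been 37 seconds since 2017-01-01); a real exporter
would load the full, maintained table. The function names are
hypothetical.

```python
from datetime import datetime, timedelta, timezone

# (effective from, TAI - UTC in seconds). Abbreviated on purpose:
# an assumption of this sketch, not a complete leap-second table.
LEAP_TABLE = [
    (datetime(2012, 7, 1, tzinfo=timezone.utc), 35),
    (datetime(2015, 7, 1, tzinfo=timezone.utc), 36),
    (datetime(2017, 1, 1, tzinfo=timezone.utc), 37),
]

def tai_minus_utc(utc_dt):
    """Return TAI - UTC in whole seconds for an aware UTC datetime."""
    result = None
    for start, seconds in LEAP_TABLE:
        if utc_dt >= start:
            result = seconds
    if result is None:
        raise ValueError("timestamp predates this abbreviated table")
    return result

def utc_to_tai_fields(utc_dt):
    """Return the TAI clock reading (as naive fields) for a UTC timestamp."""
    return utc_dt.replace(tzinfo=None) + timedelta(seconds=tai_minus_utc(utc_dt))

stamp = datetime(2023, 1, 25, 15, 13, 42, tzinfo=timezone.utc)
print(utc_to_tai_fields(stamp).isoformat())   # 2023-01-25T15:14:19
```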

What format and syntax should Org use?

A heretical suggestion: We should abandon the day of week abbreviation
and use a new format.

The current format generates a three-letter abbreviation of the day of
the week, e.g., [2023-01-25 Wed 12:12]. I suggest supporting this as a
legacy/simple format but switching to a format/encoding like
[2023-01-25T15:13:42Z] for the new system. Specifically I'm advocating
for an extended ISO 8601 format, compatible with expanded dates and
Level 2 of the EDTF, with some (bracket?) notation surrounding it such
that Org can parse the syntax as a timestamp. I advocate further for the
use of durations and repeating intervals to follow the same standard
format. Thus instead of a range being formatted as:

[2023-01-25 Wed 13:57]--[2023-01-26 Thu 13:57]

it would be:

[2023-01-25T16:57:42Z/2023-01-26T16:57:42Z].

If the square bracket delimiter syntax is insufficient or too difficult
to parse unambiguously, we could just encapsulate the ISO format in a
sub-syntax (e.g., [&&(ISO format)] similar to the [%%(diary sexp)]
technique). This is ugly, but perhaps a stepping stone during
development to separate syntax parsing concerns from calculating etc.
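
As a sketch of how little delimiting syntax is needed, the following
finds bracketed ISO instants and '/'-separated intervals embedded in
prose and hands the body to an existing parser. The pattern covers only
whole-second instants with 'Z' or a numeric offset, which is an
assumption of this example and far less than the full ISO 8601 or EDTF
grammar. The names are hypothetical.

```python
import re
from datetime import datetime

# Hypothetical delimiting syntax: square brackets around an ISO 8601
# body, with '/' separating the two ends of an interval.
ISO_INSTANT = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})"
NEW_STAMP = re.compile(r"\[(" + ISO_INSTANT + r")(?:/(" + ISO_INSTANT + r"))?\]")

def parse_instant(text):
    """Defer the actual parsing to the standard library."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))

def find_new_timestamps(prose):
    """Return (start, end) pairs; end is None for a single instant."""
    found = []
    for match in NEW_STAMP.finditer(prose):
        start = parse_instant(match.group(1))
        end = parse_instant(match.group(2)) if match.group(2) else None
        found.append((start, end))
    return found

text = ("Met [2023-01-25T15:13:42Z], travelled "
        "[2023-01-25T16:57:42Z/2023-01-26T16:57:42Z], "
        "old note [2023-01-25 Wed 12:12].")
stamps = find_new_timestamps(text)
```

Here stamps holds one instant and one interval; the legacy timestamp
is not matched and is left for the existing parser.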

What are the problems with the day of the week in existing format?

• The day of the week is redundant information and can be rebuilt from
  an ISO date. Any user who wishes to display a format with the day of
  the week can do so.
• It's a nonstandard format. Although the Org documentation says that
  the timestamps are "inspired by the standard ISO 8601 date/time
  format", the use of a day name is not contained in the ISO
  specification. The present Org format is actually two ISO components,
  the date and the time, with a non-standard day name sandwiched between
  them with space separators. Spaces are no longer allowed in the ISO
  format. By deviating from an existing standard we place the burden of
  parsing on ourselves and make sharing more difficult.
• Day of the week is irrelevant in many situations. For a timestamp
  from a year ago, the day of the week on which it was created is often
  unimportant.
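
To illustrate the redundancy, the day name can be regenerated from the
date alone whenever a display wants it. A minimal sketch, with a
hypothetical helper name and English abbreviations hard-coded to avoid
locale differences:

```python
from datetime import date

DAY_ABBREVIATIONS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def legacy_display(iso_date, hhmm):
    """Rebuild the legacy display form from an ISO date and a time of day."""
    day = date.fromisoformat(iso_date)
    # date.weekday() counts from Monday = 0.
    return "[%s %s %s]" % (iso_date, DAY_ABBREVIATIONS[day.weekday()], hhmm)

print(legacy_display("2023-01-25", "12:12"))   # [2023-01-25 Wed 12:12]
```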

What are the advantages of switching to a standard format for the new
system?

• We can allow the legacy/simple system to coexist and interpret it as a
  floating timestamp. This simplifies the issues of maintaining
  compatibility with existing Org documents. It also placates those who
  have single-user systems in a single time zone and do not want to have
  any calendar complexity imposed on them.
• We have a way of distinguishing new timestamps from legacy/simple
  ones. By making a change in syntax we reduce (or eliminate?) the
  possibility of ambiguity about "which version" of a timestamp is being
  parsed. A legacy timestamp can be treated as such, and new timestamps
  are easily identified by the 'T' present instead of spaces, or by the
  delimiters wrapping the representation.
• We free ourselves from the constraints of the legacy timestamp format.
  Trying to engineer a new syntax which also parses as an extension of
  the legacy one is more complex and embeds things like "day of the
  week" and the use of spaces as separators into this new system. It is
  easier to have the two side by side.
• We can defer to existing parsing and calculating systems. There are
  already programs written which support the ISO standard and EDTF.
  • We can directly (or nearly directly) import the regular expressions
    and parsing mechanisms already written.
  • These enable decent testing suites as we build the system, as we can
    check against existing packages to see if our parsing and
    calculations agree.
  • Users who wish to use external libraries (irrespective of language
    or license) can extract the new timestamp and parse or calculate
    externally.
• Org is part of a standard
  • We are able to defer to experts and 35 years of knowledge rather
    than debate among ourselves
  • Interfacing with other programs is simplified as the area inside the
    delimiter notation can be passed as a string without parsing.
  • New users and collaborators can be onboarded faster without needing
    to learn a new system
  • Org documentation can refer to the standard instead of bearing the
    burden of exposition.
• The move to include time zones in the format is simplified
  • The ISO standard has recently adopted a format for time zones from
    RFC 3339 and JAVAZDT; we can adhere to 8601 and keep things
    consistent.
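
The "which version is this?" test from the list above can be sketched
in a few lines. The two patterns are simplified assumptions: the legacy
one ignores repeaters, warning periods, and non-ASCII day names, and
the new one only checks for a 'T' directly after the date.

```python
import re

# Simplified legacy form: date, space, day abbreviation, optional time.
LEGACY = re.compile(r"\[\d{4}-\d{2}-\d{2} [A-Za-z]{2,3}(?: \d{1,2}:\d{2})?\]")
# Hypothetical new form: date, then 'T', then anything without spaces.
NEW = re.compile(r"\[\d{4}-\d{2}-\d{2}T[^\]\s]+\]")

def classify(stamp):
    """Label a bracketed timestamp as legacy (floating), new, or unknown."""
    if LEGACY.fullmatch(stamp):
        return "legacy-floating"
    if NEW.fullmatch(stamp):
        return "new"
    return "unknown"

print(classify("[2023-01-25 Wed 12:12]"))    # legacy-floating
print(classify("[2023-01-25T15:13:42Z]"))    # new
```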


What other perspectives should the new format support?

In addition to the representations necessary for a timezone aware
calendar system, I suggest the new format be compatible with two other
representations: finer/arbitrary resolution for scientific work, and
Level 2 of the Extended Date/Time Format for bibliographic and metadata
systems.

Although most implementations come from the computer/database
perspective, where precision is limited by clock speed, scientific data
may be finer grained. Adopting a format which allows for arbitrary
precision enables Org to be useful in more scenarios. This would allow
data of higher frequency to be collected and stored into org headings as
a plain text database. Even if the data was stored externally it would
be convenient to be able to comment or discuss collected data by
referencing its time point.
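
One possible way to keep arbitrary precision without a special data
type is to split the timestamp into whole seconds and an exact decimal
fraction. This is only a sketch with a hypothetical function name, and
it assumes UTC ('Z') timestamps with at most one fractional part.

```python
from datetime import datetime, timezone
from decimal import Decimal

def parse_high_resolution(text):
    """Split 'YYYY-MM-DDTHH:MM:SS[.fraction]Z' into (datetime, Decimal)."""
    if not text.endswith("Z"):
        raise ValueError("this sketch only handles UTC ('Z') timestamps")
    whole, _, fraction = text[:-1].partition(".")
    second = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    second = second.replace(tzinfo=timezone.utc)
    # Decimal keeps every digit, unlike a float or a microsecond field.
    sub_second = Decimal("0." + fraction) if fraction else Decimal(0)
    return second, sub_second

a = parse_high_resolution("2023-01-25T15:13:42.123456789012Z")
b = parse_high_resolution("2023-01-25T15:13:42.123456789013Z")
print(a < b)   # True: the two readings differ by one picosecond
```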

The Extended Date/Time Format (EDTF) was designed by the Library of
Congress to address limitations of the ISO standard for metadata and
archival purposes. A draft specification was created in 2012 and EDTF
functionality has now been integrated into ISO 8601:2019. Of great
interest is the ability to express the concepts of uncertainty and
approximation. Archival work includes scenarios where the precise date
may be unknown, so a format was created with qualifiers capable of
handling these situations. In the EDTF format '1984?' expresses possibly
the year 1984, but not definitely, while '2004-06~' expresses year-month
approximate. This format has been implemented by multiple library
systems and in 2021 Wikibase added an extension to support EDTF.
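
The two qualifiers above are cheap to recognise. The following sketch
handles only a tiny subset of EDTF (year, year-month, or full date with
one trailing qualifier) and does not validate month or day ranges; for
anything real, one of the existing EDTF libraries listed in the
references would be the better choice. The function name is
hypothetical.

```python
import re

# Trailing qualifiers: '?' uncertain, '~' approximate, '%' both.
EDTF_SIMPLE = re.compile(r"(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?([?~%])?")

def parse_edtf_simple(text):
    """Parse a date with an optional trailing EDTF qualifier into a dict."""
    match = EDTF_SIMPLE.fullmatch(text)
    if match is None:
        raise ValueError("not in the subset handled by this sketch: " + text)
    year, month, day, qualifier = match.groups()
    return {
        "year": int(year),
        "month": int(month) if month else None,
        "day": int(day) if day else None,
        "uncertain": qualifier in ("?", "%"),
        "approximate": qualifier in ("~", "%"),
    }

print(parse_edtf_simple("1984?"))
print(parse_edtf_simple("2004-06~"))
```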

The initial technical or code burden to support these perspectives is
minimal. Both can be parsed and calculated with by existing libraries,
and the functionality to actually calculate with them can be delayed.
The important thing is selecting a format which won't exclude them.

These features are omitted in many systems as a result of the
restricted domain and the data types used for storage; Org does not have
these constraints. Further, both of these communities tend to attract
people who are talented and sympathetic with (even occasionally funded
to support!) open source projects. By expanding Org's format to be more
inclusive we provide a haven rather than shutting them out.

The calendar implementation should elicit help from experts

Everyone seems in agreement that leveraging existing libraries is
desirable. We should also read and defer to documentation and
recommendations available from legitimate projects (e.g., W3, ISO). But
I think these are still insufficient for architecting an elegant time
system capable of satisfying the various perspectives. Calendar
applications in particular contain many edge cases and decisions about
display and interface etc. The knowledge concerning these is more likely
tacit than explicit, so I suggest we reach out to people who have
already designed/engineered solutions and get their input.

Here are some projects, organizations, or perspectives we could seek
help from:

• Calendar applications
  • iCal standard
  • CalConnect standard
  • Thunderbird/Lightning calendar
  • Google Calendar
  • Outlook
  • Lotus Notes
• Standard organizations
  • NIST
  • ISO
• Database or computer applications
  • SQL
  • Oracle
  • Java's time system
  • NumPy
  • Rust
• Archival or research users
  • Library of Congress
  • Metadata systems
• Academic users
  • History
• Scientific users
  • Astronomers
  • Physicists
  • Chemists
  • Geologists
  • Metrologists

To summarize:

Org presently only supports simple floating timestamps. A calendar
system capable of handling time zones requires some form of fixed or
incremental timestamp with offsets. We can solve the absolute timestamp
system first, and deal with calendar concerns after. If we're
implementing a new time system the format and syntax should allow for
"exotic" use cases like arbitrary precision, uncertainty, and expanded
dates. The mechanics for calculating with those exotic cases needn't be
implemented by Org immediately.

We should adopt UTC as the time scale, EDTF (an extension of ISO 8601)
as the time format, and merely encapsulate this format with a delimiting
syntax (using brackets if possible) that Org can parse and distinguish
from the present format. The existing Org format should be considered
simple/legacy and can be interpreted or translated internally into the
new system as calculations require. The new format can be implemented
alongside the simple/legacy system.

This discussion of absolute and offset timestamps should be split off
from timezone, calendar, and display concerns. Implementing a calendar
application with timezones is complicated and we should seek help from
those who have built such systems before.

References:

Time

https://www.iso.org/obp/ui/#iso:std:iso:8601:-1:ed-1:v1:en
https://www.w3.org/International/articles/definitions-time/
https://www.ibm.com/docs/en/i/7.3?topic=concepts-time
https://tc39.es/proposal-temporal/docs/ambiguity.html

EDTF

https://www.loc.gov/standards/datetime/
  Main page on EDTF
https://edtf.wikibase.wiki/wiki/Property:P1
  Has examples of EDTF codes
https://www.wikibase.consulting/wikibase-edtf/
  Wikibase implemented EDTF in 2021
https://github.com/ProfessionalWiki/WikibaseEdtf#wikibase-edtf
https://github.com/corylown/edtf-humanize
  Transform EDTF strings into human friendly display
https://github.com/unt-libraries/edtf-validate
  Validate EDTF strings
https://github.com/plk/biblatex/issues/656
  Discussion of Biblatex's implementation of EDTF
https://www.npmjs.com/package/edtf
  Parser for EDTF
https://github.com/inukshuk/edtf.js/tree/main
  Parser for EDTF

Implementation details

https://www.w3.org/TR/international-specs/#loc_time
https://dev.mysql.com/doc/refman/5.7/en/date-and-time-type-syntax.html

Time zones

https://datatracker.ietf.org/doc/draft-ietf-sedate-datetime-extended/
  An extension syntax for representing time zone. We should follow this.
  Very helpful for implementing time zones.
https://www.w3.org/TR/timezone/#representing
  Very relevant
https://www.w3.org/International/core/2005/09/timezone.html#IDALFAT

Calendar and scheduling

https://www.calconnect.org/resources/glossary



