The year 2000 is predicted to bring chaos to software
that is unable to handle dates beyond 1999. The question is
what effect the change of century will have on the Internet,
the Web, and Apache in particular. This feature shows what the
risks are.
First published in Apache Week issue 56 (23
June 1997).
The theory of the year 2000 problem is that many older
programs use only two digits for the year, such as "97" or
"06". This might be part of the internal storage, input
fields, output display, or network communication protocol. If
a program does use a two-digit year, it might either not
accept dates from 2000 onwards such as "02", or it might make
incorrect comparisons (thinking that 02 is earlier than 97,
because it assumes that 02 is 1902). There are some areas
where two digit years are widely used - for example, on
credit card expiry dates - and the software which handles
these dates will have to be capable of knowing that smaller
values for the date are really in the 21st century.
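The comparison error described above can be sketched in a few lines. This is an illustration only: the function names and the pivot value of 50 are assumptions made for the example, not taken from any particular program.

```python
def naive_full_year(two_digit_year: str) -> int:
    """Expand a two-digit year the way an unprepared program might: always 19xx."""
    return 1900 + int(two_digit_year)


def windowed_full_year(two_digit_year: str, pivot: int = 50) -> int:
    """Expand a two-digit year using a fixed window.

    Values below `pivot` are read as 20xx, the rest as 19xx. The pivot of 50
    is an arbitrary choice for this sketch.
    """
    yy = int(two_digit_year)
    return (2000 if yy < pivot else 1900) + yy


# The naive expansion makes "02" look earlier than "97".
print(naive_full_year("02") < naive_full_year("97"))        # True (wrong ordering)
print(windowed_full_year("02") < windowed_full_year("97"))  # False
```

A fixed window only postpones the problem, since the software still has to guess the century. It is the usual approach where the two-digit field itself cannot be changed.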
There are three things which can affect how Apache treats
year 2000 issues:
- Apache code itself
- The HTTP and other protocols that Apache implements
- The underlying operating system
The Apache code internally never stores years as two digits -
it processes dates and times as standard Unix epoch times
(the number of seconds since 1st January 1970). When it
outputs the year (e.g. to the log file) it writes years as
four digits.
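As a rough illustration of this approach (in Python rather than Apache's own C code), the sketch below keeps a time as seconds since 1st January 1970 and only produces a year, with four digits, when formatting it. The function name and the exact layout, which resembles the timestamp in an access log, are assumptions for the example.

```python
import time


def log_timestamp(epoch_seconds: int) -> str:
    """Format seconds since 1 January 1970 (UTC) with a four-digit year."""
    return time.strftime("%d/%b/%Y:%H:%M:%S +0000", time.gmtime(epoch_seconds))


# 946684800 is midnight UTC on 1 January 2000.
print(log_timestamp(946684800))  # 01/Jan/2000:00:00:00 +0000
```

Because the stored value is a plain count of seconds, the change of century is not a special case for the stored value.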
The HTTP protocol may be more troublesome. It allows for three
different date formats in requests and responses, one of
which uses a two-digit year. Dates are used on every
response, in fields such as "Date", "Last-Modified" and
"Expires", and requests can contain dates in the
"If-Modified-Since" and similar fields. The date formats
listed in HTTP/1.1 and HTTP/1.0 are:
- Sun, 06 Nov 1994 08:49:37 GMT (defined in RFC
  822 as updated by RFC 1123)
- Sunday, 06-Nov-94 08:49:37 GMT (defined in RFC
  850 and RFC 1036)
- Sun Nov 6 08:49:37 1994 (as defined in ANSI
  C's asctime() format)
The first format is the only one that HTTP/1.1 servers are
allowed to generate, and Apache uses it. This format includes
a four-digit year. However, to be compatible with older
browsers and servers, Apache recognizes the other formats.
The main problem will be older applications which generate
RFC 850 format dates - these only have a two-digit year field.
RFC 850 format was used in early web servers and browsers, and
its replacement with RFC 1123 format in the early 1990s was
not fully documented until HTTP/1.0 was published in 1996.
However, if Apache sees this format and the two-digit year is
less than 70 (that is, it would otherwise fall before 1970), it
assumes that the first two digits of the four-digit year are
"20" rather than "19".
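The following sketch shows one way a recipient could accept all three date formats and apply the rule just described to the two-digit year. It is written in Python for brevity and is not Apache's actual parser: the names parse_http_date and expand_two_digit_year are hypothetical, and the day-of-week name is not checked against the date.

```python
import re
from datetime import datetime

MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}
MON = "(" + "|".join(MONTH_NAMES) + ")"
TIME = r"(\d{2}):(\d{2}):(\d{2})"

RFC1123 = re.compile(r"^[A-Za-z]{3}, (\d{2}) " + MON + r" (\d{4}) " + TIME + r" GMT$")
RFC850 = re.compile(r"^[A-Za-z]+, (\d{2})-" + MON + r"-(\d{2}) " + TIME + r" GMT$")
ASCTIME = re.compile(r"^[A-Za-z]{3} " + MON + r" +(\d{1,2}) " + TIME + r" (\d{4})$")


def expand_two_digit_year(yy: int) -> int:
    """Read two-digit years below 70 as 20xx and the rest as 19xx."""
    return (2000 if yy < 70 else 1900) + yy


def parse_http_date(text: str) -> datetime:
    """Parse any of the three HTTP date formats into a naive datetime (GMT)."""
    match = RFC1123.match(text)
    if match:
        day, mon, year, hh, mm, ss = match.groups()
        return datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    match = RFC850.match(text)
    if match:
        day, mon, yy, hh, mm, ss = match.groups()
        year = expand_two_digit_year(int(yy))
        return datetime(year, MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    match = ASCTIME.match(text)
    if match:
        mon, day, hh, mm, ss, year = match.groups()
        return datetime(int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss))
    raise ValueError("unrecognised HTTP date: " + text)


print(parse_http_date("Sunday, 06-Nov-94 08:49:37 GMT").year)     # 1994
print(parse_http_date("Wednesday, 06-Nov-02 08:49:37 GMT").year)  # 2002
```

With a pivot of 70, two-digit years are read as falling between 1970 and 2069, which matches the range of dates a Unix epoch time is normally used for.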
The final area which affects Apache's ability to handle dates
is the underlying operating system. If the OS has problems
with dates past the year 2000, Apache will as well. Most Unix
systems store dates internally as signed 32-bit integers which
contain the number of seconds since 1st January 1970. This
allows dates up to January 2038 to be stored. For dates past
2038, the OS will have to be updated to store dates in larger
fields (for example, as a 64-bit value).
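The 2038 limit follows directly from the size of the field. Assuming the value is a signed 32-bit count of seconds, the last representable moment can be worked out as below. The names are chosen for this example only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_INT32 = 2**31 - 1  # largest value of a signed 32-bit integer


def last_representable_time() -> datetime:
    """Return the latest time a signed 32-bit seconds counter can hold."""
    return EPOCH + timedelta(seconds=MAX_INT32)


print(last_representable_time().isoformat())  # 2038-01-19T03:14:07+00:00
```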
There may also be problems before 2038 with OS calls which
accept or return year numbers. For example, many date
functions use a structure called tm which
contains a field tm_year. This field holds the
number of years since 1900, so for example the year 2002 will
be stored as 102. This should not be a problem, provided that
the OS and applications do not assume that the
tm_year value is always a two-digit number, which is true only
for the years 1900 to 1999. All modern operating systems should be OK.
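The kind of mistake to look for can be illustrated as follows. The sketch uses a plain integer to stand in for the C tm_year field (Python's own time structure holds the full year, so the value here is simulated), and both function names are made up for the example.

```python
def buggy_year_string(tm_year: int) -> str:
    """Prefix "19" to tm_year, which only works while tm_year is below 100."""
    return "19" + str(tm_year)


def correct_year_string(tm_year: int) -> str:
    """Add 1900 to tm_year, which works for any year."""
    return str(1900 + tm_year)


print(buggy_year_string(97), correct_year_string(97))    # 1997 1997
print(buggy_year_string(102), correct_year_string(102))  # 19102 2002
```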