Release: 1.1.1
Beta: None
Bugs in 1.1.1:
- Language negotiation doesn't match languages against
  sub-languages, i.e. it treats en and en-US as completely
  different languages.
- Doesn't support HTTP continuation headers
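To illustrate the first bug, here is a minimal Python sketch of the kind of prefix matching that would treat en-US as a sub-language of en. It is not Apache's code: the function names language_matches and pick_variant are hypothetical, and the rule that a language range covers any variant starting with that range plus a hyphen is an assumption about the intended behaviour.

```python
def language_matches(accepted: str, available: str) -> bool:
    """Return True if an accepted language range covers an available variant.

    Assumption: a range such as "en" covers "en" itself and any
    sub-language such as "en-US"; the comparison ignores case.
    """
    accepted = accepted.lower()
    available = available.lower()
    return available == accepted or available.startswith(accepted + "-")


def pick_variant(accept_languages, variants):
    """Return the first available variant covered by the client's list, or None."""
    for wanted in accept_languages:
        for variant in variants:
            if language_matches(wanted, variant):
                return variant
    return None
```

With this rule a client asking for en would be served an en-US document, while a plain string comparison (the behaviour described in the bug) would find no match.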
There has been some discussion over whether to make a Windows
NT release of Apache. A couple of groups have ported Apache
to NT, but these have consisted of rewrites rather than
integrating NT support into the current Apache code. At
present there is no source code released for these ports.
Currently, Apache supports a range of different Unix systems,
and OS/2. Supporting NT would be a logical extension, and
would fit in well with plans for a multi-threaded version
since NT is a multi-threaded operating system. However, there
are decisions to take about how the Apache source code should
be structured to support multiple different platforms.
Ideally, the code should be as similar as possible across all
systems, with only changes to lower-level system-dependent
parts.
The following items are under development for the next
release of Apache.
Configuration file simplified
The process of configuring Apache has been simplified for the
next release. It still involves editing the
Configuration file, but this file has been
re-written to make it easier. Previously, selecting an
operating system involved finding the appropriate section in
the file, then uncommenting one or more lines. Now that is
reduced to simply uncommenting a line like
'PLATFORM=SOLARIS2'.
In addition, the various compile-time options, such as
'STATUS' for extra status information, previously required
making manual additions to the CFLAGS line. These have been
replaced by simple 'Rule' lines. For example, to use the
extra status information, the line 'Rule STATUS=yes' would be
included in the Configuration file.
This new configuration system will be in place for release
1.2. After that, the next major release (2.0) will probably
have a significantly different configuration procedure. This
might include fully automatic detection of the operating
system and its capabilities.
Debugging CGI Scripts
Problems with CGI scripts can be notoriously difficult to
identify. To help make it easier, the next release of Apache
will let the administrator log the input and output of the
script.
The ScriptLog directive sets a log file to receive
the debugging information. Each time Apache has a problem
with a CGI program it will log all the relevant details such
as the URL, CGI filename, request headers, POST data, script
output and script error output. Additional directives can
limit the total log file size, and the size of POST data
logged.
Config Log to be Default
Apache currently comes with two modules for doing logging:
the 'common log' module logs in standard Common Log Format
(CLF) and is compiled in by default, while the 'config log'
module makes the log file format customizable (but defaulting
to working exactly like the common log module unless
configured otherwise). From the next release the config log
module will be the default.
With the config log module, the LogFormat directive
is used to set a format for the log file (it defaults to CLF
format). Any of the CLF elements can be logged, along with
any incoming or outgoing header. In addition, the module can
perform simple tests before logging some items: for example,
it can be told to log referer headers only when the request
failed.
Besides becoming the default log module, the config log
module can log some additional things: the host and port of
the request, the duration of the request, the outgoing
content-type, and a configurable format for the date and
time.
New Default Modules
So far, there will be three major changes to the modules
included in the next release of Apache:
The use of server-side-includes is covered in our feature
Executing CGI as Other Users
The next release of Apache will include the ability to
execute some scripts as users other than the main server
owner. At present, when a CGI is executed, it runs as the
user specified by the User directive in the
configuration files. While this is fine on small sites, when
a site offers Web space to different users (such as with
multiple virtual hosts), the use of a single user means that
any user can access (and potentially change) any other user's
data. The way around this at present is to use a 'setuid'
wrapper script (see Apache Week issue 18 for more information
about wrapper scripts).
The next Apache release will include such a wrapper program.
The server itself will also be updated to set the user that a
script runs as in a couple of ways: firstly, the
User directive can be used inside
<VirtualHost> sections to set a user for that
VHost, and secondly, if a request comes in for a URL starting
/~user, the script will be run as the 'user' named in the URL.
This also applies to other sub-processes, for example,
commands run from server-side-includes. It is also planned to
allow the user to be set for each directory in a future
release, but this might not make it into Apache 1.2.
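The two rules above can be sketched in a few lines of Python. This is only an illustration, not the server's implementation: the function script_user and its parameters are hypothetical, the fallback user "nobody" is a placeholder, and letting a /~user URL take precedence over the virtual host's User setting is an assumption, since the ordering is not stated here.

```python
import re


def script_user(url_path, vhost_user=None, server_user="nobody"):
    """Pick the user a script would run as (illustrative only).

    Assumed order: a /~user URL first, then the User set in the
    <VirtualHost> section, then the main server's User.
    """
    match = re.match(r"^/~([^/]+)", url_path)
    if match:
        return match.group(1)
    if vhost_user:
        return vhost_user
    return server_user
```

For example, script_user("/~alice/cgi-bin/test.cgi") gives "alice", while a request for /cgi-bin/test.cgi on a virtual host configured with its own User falls back to that user.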
To enable this functionality, the Apache API will have a new
function call, call_exec. Modules which run
sub-programs (such as the CGI and includes modules) now call
this function to run the program as the correct user.
Encoding and Content Type Duplication
The extensions .Z and .gz usually represent the 'encodings'
for Unix compress and gzip. Apache can be configured to set
the encoding on the transmitted reply with
AddEncoding. This lets browsers decode the file
before handling it. However, the default mime.types file
supplied with Apache also includes entries for .gz and .Z
files, so their content type will be set to the encoding
scheme, which is wrong. These lines should be removed from
mime.types, and will not be included in the next release.
Each browser implements a different range of HTML
commands. Some extend HTML with their own additions. It is
now common to see pages marked as 'designed for Netscape' or
'best viewed with Internet Explorer'. These pages are
clearly indicating a browser preference and might be
off-putting to people with other browsers. Even when designed
for a single browser, different releases of the browser have
different capabilities. Older versions of Netscape and MSIE
do not support frames, tables or Java, for instance.
Content providers who want their content to be acceptable on
a wide range of browsers of various ages need to write pages
carefully, and often cannot take advantage of the latest
features. It would be better to have a way that the server
could somehow know what browser is being used, and what its
capabilities are, and tailor the HTML it sends to the browser
appropriately.
Of course, one way of tailoring responses is to have links to
different pages (e.g. "click here for a non-frames version"),
but that is intrusive and implies that the user knows exactly
what their browser can do. A better alternative is to get the
server to automatically output the correct HTML, using either
CGI or server side includes (SSIs). Both methods use the
User-Agent HTTP header, which says what browser is being used
and is passed to scripts as the HTTP_USER_AGENT environment
variable. For example, a Perl CGI script might contain the
code:
    # Assume tables are supported by Netscape (Mozilla) 2.x and 3.x
    if ($ENV{'HTTP_USER_AGENT'} =~ m|^Mozilla/[2-3]|) {
        $tables = 1;
    }
The trouble with this is that the knowledge about the browser
capabilities needs to be repeated in every CGI script or SSI.
As new browsers come out and others are updated, modifying
every CGI or SSI which uses this would be tedious.
However, the next release of Apache will make it easier for
both CGI and SSIs to know what capabilities the browser has.
An additional module will set environment variables based on
user-definable rules which match the User-Agent header. The
administrator only has to maintain the knowledge about the
browser capabilities in one place: the Apache config file.
Then every CGI and SSI can use the environment variables to
tailor their output, if desired.
For example, the following rules could be used:
BrowserMatch ^Mozilla/[2-3] tables java
The first argument is a regular expression to match against
the User-Agent header. If it matches, the rest of the line is
treated as environment variables to set (and these can be
specified with values, for instance "html=3.2").
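As a rough illustration of how such rules might be evaluated, the Python sketch below parses a BrowserMatch line and works out which variables a given User-Agent would receive. It is not the module's code: parse_rule and environment_for are hypothetical names, and giving a variable listed without a value the value "1" is an assumption.

```python
import re


def parse_rule(line):
    """Split a 'BrowserMatch regex var [var=value ...]' line into its parts."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "BrowserMatch":
        raise ValueError("not a BrowserMatch rule: " + line)
    pattern = re.compile(parts[1])
    variables = {}
    for item in parts[2:]:
        name, sep, value = item.partition("=")
        # Assumption: a bare name such as "tables" is set to "1".
        variables[name] = value if sep else "1"
    return pattern, variables


def environment_for(user_agent, rules):
    """Collect the variables from every rule whose regex matches the User-Agent."""
    env = {}
    for pattern, variables in rules:
        if pattern.search(user_agent):
            env.update(variables)
    return env
```

With the rule shown above, a User-Agent beginning "Mozilla/3.0" would get both tables and java set, while one beginning "Lynx" would get neither.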
Now CGI scripts and SSI files can use the environment
variable tables to determine if the browser supports tables.
Using XSSI, it is easy to create tailored HTML:
<!--#if expr="$tables" -->
<table>
<tr><td>Welcome to my Page!
...
</table>
<!--#else -->
<h1>Welcome to my Page!</h1>
...
<!--#endif -->
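A CGI script could make the same decision by reading the variable from its environment. The following Python sketch assumes the server has set a variable called tables as in the rule above; the function name welcome_html is hypothetical and the markup simply mirrors the XSSI example.

```python
import os


def welcome_html(environ=os.environ):
    """Return a table-based greeting only if the 'tables' variable is set."""
    if environ.get("tables"):
        return "<table>\n<tr><td>Welcome to my Page!</td></tr>\n</table>"
    return "<h1>Welcome to my Page!</h1>"


if __name__ == "__main__":
    # Minimal CGI response: header line, blank line, then the body.
    print("Content-Type: text/html")
    print()
    print(welcome_html())
```

The script itself contains no knowledge of individual browsers; that stays in the server configuration.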
While standard HTML files are fine for storing pages, it is
very useful to be able to create some content dynamically,
for example to add a footer or header to all files, or to
insert document information such as last modified times
automatically. This can be done with CGI, but that can be
complex and requires programming or scripting skills. For
simple dynamic documents there is an alternative:
server-side-includes (SSI).
SSI lets you embed a number of special 'commands' into the
HTML itself. When the server reads an SSI document, it looks
for these commands and performs the necessary action. For
example, there is an SSI command which inserts the document's
last modification time. When the server reads a file
containing this command, it replaces the command with the
appropriate time.
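The replacement step can be pictured with a toy processor. The Python sketch below handles only the echo command and is in no way the server's parser: expand_echo is a hypothetical name, the variable values are supplied by the caller, and printing "(none)" for an unknown variable is an assumption.

```python
import re

# Matches commands of the form <!--#echo var="NAME" -->
SSI_ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')


def expand_echo(document, variables):
    """Replace each echo command with the value of the named variable."""
    def replace(match):
        # Assumption: unknown variables are shown as "(none)".
        return variables.get(match.group(1), "(none)")
    return SSI_ECHO.sub(replace, document)
```

Given a page containing <!--#echo var="LAST_MODIFIED" --> and a value for LAST_MODIFIED, the function returns the page with the command replaced by that value, and leaves ordinary HTML untouched.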
Apache includes a set of SSI commands based on those found in
the NCSA server. These are implemented by the includes module
(mod_include). An extension of the standard SSI commands is
available in the XSSI module, which will be a standard part
of the Apache distribution from the next release. XSSI adds
the following abilities to the standard SSI:
- Variables in commands: XSSI allows variables to be used in
  any SSI commands. For example, the last modification time
  of the current document could be obtained with
  <!--#flastmod file="$DOCUMENT_NAME" -->
- Setting variables: the set command can be used
  within the SSI to set variables.
- Conditionals: SSI commands if, else,
  elif and endif can be used to include parts
  of the file based on conditional tests. For example, the
  $HTTP_USER_AGENT variable could be tested to see the type
  of browser and different HTML codes output depending on the
  browser capabilities.
For details of how to use SSI in your HTML documents, see our
feature on Using Server Side Includes.
The feature on Apache and SSL in issue 25 contained the
following errors:
- Microsoft Internet Explorer 3 beta still cannot handle
  arbitrary certificate authorities
- SSL security makes use of randomly generated symmetric keys
  as well as public key encryption
- To use RSA in the US, you must use the RSA libraries