O'Reilly Open Source convention in
San Diego:
Apache Week visited the five day O'Reilly Open Source Conference in
San Diego this week and found an overwhelming source of
Apache information.
First published: 27th July 2001
It is exactly a year ago that we had the pleasure of visiting
Monterey California to report on the 4th O'Reilly Open Source
software convention (Apache Week
issue #208).
When we managed to get invited back to San Diego in July 2001
we thought we'd been given the ideal assignment; we get to fly
to California in July, avoiding the British rain, and
spend a week right on the West Coast with other open
source gurus and advocates. In fact, with only one direct flight
a day from England, we were unsurprised to find a large number of
delegates on the plane, wearing penguin badges and snapping pictures
of the clear views over Greenland with a variety of digital cameras.
To accommodate feeding over a thousand delegates, the conference had erected
a huge tent outside the hotel with views overlooking the harbour. It was there
we started off Monday morning with the complimentary breakfast.
The conference was split over two buildings,
with a 10-15 minute walk between the two.
With 16 simultaneous tutorial sessions on the first
day and with only two Apache Week staff we found it really hard to
choose between the talks. We spoke to other delegates who had been
similarly overwhelmed by the choice.
Apache Week has reported on the ApacheCon and O'Reilly conferences
over the last
few years, so this time we wanted to avoid the talks that were copies
of ones we've already covered. We decided to mix Apache talks with others
that seemed new or interesting.
AxKit
Matt Sergeant gave the first tutorial we visited on his XML application
server for Apache, AxKit.
AxKit performs a similar function to the Apache Cocoon project, but
is written in Perl and C rather than Java. Matt even describes AxKit
as "the C version of Cocoon". AxKit was born as a way of collecting
together the various Perl XML technologies and using them to deliver
the same XML data in different formats. The use of XML allows for the
separation of content, presentation, and logical site management.
The tutorial focussed on
the various Perl XML tools available, the evolution of AxKit,
and ways to use the result to power both static and dynamic sites.
Matt highlighted some
exciting and powerful features of AxKit: the intelligent compression of
pages being returned to the client (gzip), the ability to parse and
serve OpenOffice files on the fly, and AxPoint which powered his presentation
by converting an XML outline to PDF.
AxKit allows any number of ways to process the XML for output, from
the well-known XSLT (which has a steep learning curve) to
XPathScript, which has been designed to allow easy dynamic
functionality and is also found within Cocoon.
Future plans for AxKit were also covered; these included a port to
Apache 2.0 and a complete Content Management System.
Perl for System Administrators
After the provided lunch we headed over to the Perl for System
Administrators talk.
The presenter, David Blank-Edelman, played music and danced around the
hall to get into the mood for the tutorial. The talk had a heavy bias towards
security, giving reasons why administrators should be paranoid and
numerous stories and anecdotes about hacks and security vulnerabilities.
David suggested
some best practices that can help protect your scripts; for example
there is no need to run a log analysis script as root. Other areas where
users can overlook potential security problems are when appending to files,
or creating temporary files in Perl.
Although this talk was primarily about Perl,
David made the important point that "a cutting sysadmin is platform agnostic",
and his tips applied as much to sysadmin scripts as to CGI programs.
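As a minimal sketch of the temporary-file advice (written in Python rather than Perl, and not taken from David's tutorial), the standard library's tempfile.mkstemp creates the file atomically with an unpredictable name, which avoids the race where another user pre-creates a guessable file. The function name write_report is hypothetical.

```python
import os
import tempfile


def write_report(lines):
    """Write lines to a freshly created private temporary file; return its path."""
    # mkstemp opens the file exclusively (and owner-only on POSIX systems),
    # so the name cannot be guessed and pre-created by someone else.
    fd, path = tempfile.mkstemp(prefix="report-", suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        for line in lines:
            handle.write(line + "\n")
    return path
```

The caller is responsible for removing the file once it is no longer needed.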
WebDAV and Apache
Also that afternoon, Jim Whitehead presented a tutorial on WebDAV and
Apache. Jim, the chair of the IETF's WebDAV working group, began by
giving a brief overview of authoring over HTTP, and gave examples of
how collaborative web authoring can take place using WebDAV. The
current state of client and server support was described, and an
insight into some of the future extensions of the DAV protocol was
given (including versioning, searching and access control). The talk
continued giving a detailed description of the DAV protocol,
explaining the support for properties, and the overwrite prevention
mechanisms.
The tutorial finished up with a guide to setting up the WebDAV module
for Apache, mod_dav,
covering the basic operation of the module and the usual configuration
issues. Jim noted that Apache 2.0 bundles mod_dav inside the source
tree, making it easier to set up than Apache 1.3, where mod_dav must
be compiled as an external module.
Film: Revolution OS
In the evening we took a coach to a local multiplex cinema for the west coast
premiere of the film "Revolution OS" by director J.T.S. Moore. The
aim of the film was to document the history of the open source
movement from Richard Stallman's founding of the GNU project, through
the VA Linux IPO, to events taking place today. The film focussed on the
key people responsible for a few of the historical turning points in
the movement.
Early into the film, Eric Raymond said that "Apache was the killer app[lication]" and was responsible for the mass adoption of the Linux operating system.
A number of other key people were interviewed including Brian Behlendorf
from the Apache Project and Michael Tiemann from Red Hat.
We were impressed at the balance and accuracy of the film, especially the
positive way the people interviewed were portrayed. The film would be
interesting to engineers as well as outsiders.
At the end of the film the director took questions from the
audience aided by Eric Raymond and Bruce Perens. They explained that
the film took two years to make and was planned to be shown in the
future at film festivals and other conferences.
We kicked off the second day much as the first, spending our
breakfast trying to decide amongst the 17 simultaneous
tutorials. Amongst the sessions we didn't get to see was Ryan
Bloom's "Writing an Apache 2.0 Filter" which was given to a small,
but enthusiastic group of developers.
Introduction to Zope
We hear a lot of positive comments from people using the
Python-based Zope application server, so we decided to attend
the tutorial "Introduction to Zope" given by Mike Homyack.
Mike ran through what Zope is, and its architecture, telling us
that "Zope is full Object-Orientated" and "really good at dynamic
stuff". Zope has a built-in
server, ZServer, that handles access to the internal content
via a number of mechanisms including HTTP, FTP, and DAV. It
is usual to let Zope handle all your web site content, but
in most situations another server such as Apache or a reverse proxy
such as Squid is placed in front
in order to accelerate any static content. The main zope.org site
itself uses Zope together with Apache; using Rewrite rules to proxy
and cache requests to a Zope backend.
Zope currently has its own license but we were told that there
was "motivation to give Zope some license like Python" to make
it GPL compatible. Zope is in production use by some major companies
including CBS New York.
Introduction to PostgreSQL
At the same time as the Zope tutorial, Bruce Momjian gave an
introductory tutorial on the PostgreSQL database. Attendees received
a complimentary copy of Bruce's book, which the tutorial was based
upon. Only a small amount of database expertise was presumed so this
talk was very open to beginners.
The half-day session allowed many chapters of the book to be
covered in reasonable detail, starting with the basic architecture
of a database, how to input data, modify data, and make simple
queries. The talk then progressed to describe the construction of
more complex queries, joins, and how to utilize the relational
database capabilities of PostgreSQL. Bruce also presented a
follow-up tutorial in the afternoon, covering some of the more
advanced features.
Tuesday afternoon
In the afternoon we visited a talk on "Secure Internet
Servers and Firewalls with OpenBSD". Although not directly
related to Apache, it was interesting to see how much security
had been added into the OpenBSD system by default. OpenBSD
ships with an SSL-enabled version of Apache by default.
We were also lucky to catch the second of a pair of tutorials by
Mark-Jason Dominus, entitled "Stolen Secrets of the Wizards of the
Ivory Tower". In an enigmatic talk, he described a set of Perl programming
techniques including memoization and the use of iterators,
and drew particular attention to closures and anonymous subroutines.
The obscure title alludes to the LISP heritage of many of these ideas.
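To illustrate the kind of techniques mentioned, here is a small sketch in Python rather than Perl. It is not taken from the talk; the names memoize, make_counter and fib are our own. It shows a closure that caches a function's results, and a closure that hands out successive values like a simple iterator.

```python
def memoize(func):
    """Return a wrapper that caches func's results by argument tuple."""
    cache = {}  # captured by the closure below

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


def make_counter(start=0):
    """Return a closure that yields start, start + 1, ... on each call."""
    state = {"next": start}

    def step():
        value = state["next"]
        state["next"] += 1
        return value

    return step


@memoize
def fib(n):
    # The recursive calls go through the memoized wrapper, so each
    # value of n is computed only once.
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Without the cache the recursive fib repeats the same work exponentially often; with it, each value is computed once and then looked up.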
In the evening Larry Wall gave an entertaining and lightning talk
on the new features in Perl 6. Larry's talk didn't touch on anything
Apache related, so if you are interested read all
about it in
"The State
of the Onion 5" at perl.com.
Wednesday started as usual with the complimentary breakfast. With
14 simultaneous talks split across the two hotel blocks we spent most
of our breakfast choosing which to visit. Four of the day's tracks were
dedicated to Perl, two to XML, and the remainder split across Tcl/Tk,
Mozilla, mod_perl, Java, MySQL, Python, and Emerging Topics. The
dedicated Apache track was due to start on Thursday. We noticed that the
number of Perl tracks had shrunk slightly this year, with other
open-source technology tracks becoming more prominent. In particular
we were pleased to see the two XML tracks, something we said was
missing last year.
Before the keynotes of the day a short film was shown which was made up
from interviews of the various conference attendees during the tutorial
days. Tim O'Reilly appeared on stage and reminded the packed ballroom
that we should "think the Internet" and think of "technologies such as
Apache, PHP" and not just Linux.
Keynotes
Fred Baker, previous chair of the IETF, gave his keynote
presentation titled "Will the next Internet generation still depend on
open source?". He explained that Linux was the only real
technology that could threaten Windows, and that successful open source is
"all about getting good documentation and predictable quality".
He welcomed the involvement
of commercial interests in open source: "Once the open source
technology has to be used by real people then real companies have to
do code freezes and manage the development in a way that makes a
quality product". He predicted that in the coming years we'll see
more open source projects in partnership with the business world.
Open source leads to rapid prototyping and exploratory code, with the
business partnerships being able to productise them.
W. Phillip Moore from Morgan Stanley Dean Witter then took the
stage to show "an open source success story on Wall Street". He
showed why open source was important to their business, allowing
them to tailor existing applications to their complex environment with
a bit of Perl glue thrown in.
MSDW are an enterprise class business that have decided to slowly
migrate from using Sun hardware with Solaris to using commodity hardware
and Linux, with Apache as their primary web server. They've also made
contributions back to open source, and have been covertly
submitting patches back into the community
as well as funding open source development. "It all comes down
to vendor risk management", he said, with proprietary software "you're
placing a bet on the security of that company and the security
of their product, a bet you're not always aware you're making". With
open source this dependency is removed and it's possible to get
enterprise level support for open source software from a number of
vendors.
Open Source Strategies Summit
Also taking place at the convention was the O'Reilly summit on
Open Source strategies, aimed at CTOs, CIOs, and CEOs who want to
find out how to use open source as a strategic advantage. Although
this summit was separate to the main conference we decided to take
a look at the opening talk given by Tim O'Reilly, and the subsequent
panel discussion with the economist Hal Varian, Brian Behlendorf, and
Michael Olson from Sleepycat.
To begin the session, Tim O'Reilly discussed the reasons underlying
the success of the Internet and Open Source software, finding many
common themes. The highlights were the emphasis on decentralisation,
the combination of many small modules into large complex systems, and
the ability to easily extend existing technologies - all important to
the wide adoption seen in both arenas. By looking at current trends,
Tim talked about some emerging projects which may prove key to the
Next Generation Internet.
One of the biggest challenges for Open Source and Internet
companies is the search for an appropriate business model. The panel
discussion which followed the talk gave many interesting insights from
those who have been successful in that search. Brian Behlendorf spoke
about the need to identify which intellectual property is released
freely, and which is "owned" by the company generating it. All
speakers noted that embedded systems would be increasingly
important.
Real World Performance Tuning
After lunch, Apache Software Foundation member Ask Bjoern
Hansen gave a talk on
how to use mod_perl in an efficient way. He explained that it
is generally preferable to compile mod_perl statically into Apache
rather than load it as a dynamic (shared object) module. However, this leaves you
with a server that has a much larger memory footprint, and since the server
spends the majority of its time buffering data to slow clients, that
memory is largely wasted.
The solution presented was to run a separate server that has
mod_perl compiled into it behind a reverse proxy. Apache can also be
used as this reverse proxy and can serve static content as well as
cache the content created by the dedicated Apache+mod_perl server. In
this way the memory usage can be decreased and performance increased.
The slides from the full presentation are available online.
Why SOAP sucks, Why SOAP rocks
There were a large number of talks throughout the conference
on SOAP and XML-RPC.
Matt Sergeant took a step back to examine what all the fuss was
about in a short talk renamed "Why SOAP sucks, Why SOAP rocks".
He started out by asking why we are using SOAP when we could use HTTP
instead, since
HTTP already has all the features that are normally needed, and more. Using HTTP
natively allows caching and logging for example. The talk then showed
how to do SOAP without SOAP; using mod_perl to control the URL space
and using Perl HTTP modules for the transport. The current major
advantage of SOAP is that modules such as the Perl SOAP::Lite module
exist which allow applications to be developed quickly and easily. There
currently is no simple library that would do the equivalent directly
over HTTP.
Finally we were shown some services that are already doing the
equivalent of a SOAP transaction without SOAP; such as the ability to get search
results from Google in XML format (for example try
http://www.google.com/xml?q=apacheweek).
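As a rough sketch of the idea (in Python rather than the Perl modules used in the talk), a remote call can be expressed as an ordinary HTTP GET whose arguments sit in the query string and whose reply is a small XML document. The function names, the example.com URL in the usage and the `<results>` response format below are all hypothetical; a real service defines its own.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_query_url(base, **params):
    """Encode the call's arguments as an ordinary HTTP GET query string."""
    return base + "?" + urlencode(sorted(params.items()))


def parse_results(xml_text):
    """Extract (title, url) pairs from a hypothetical <results> document."""
    root = ET.fromstring(xml_text)
    return [(r.findtext("title"), r.findtext("url")) for r in root.iter("result")]


# Hypothetical response body, used here in place of a network request.
SAMPLE = """<results>
  <result><title>Apache Week</title><url>http://www.apacheweek.com/</url></result>
</results>"""
```

Because the request is a plain GET, ordinary HTTP caches and server logs work on it unchanged, which is the advantage the talk pointed out.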
The slides to this talk
are available online.
XML Content management
For the remainder of the afternoon we visited the XML track; in
particular we were interested in XML application servers. The first
session "XML Content management using XSLT, Schematron and Ant", showed
one extensible way of serving XML content to browsers. Following that talk a
panel discussion "XML-based Application Frameworks" took place.
The basic idea of an XML application server is that you create all the content
for your site in XML. The use of XML allows the separation of content from presentation, a useful extra abstraction layer.
The XML content can come from static files, from a database,
or be dynamically generated content from scripts. In its simplest form
you take your XML content then apply a style-sheet to generate HTML for a browser.
Application servers usually perform this style-sheet conversion
on the fly, caching the results for speed.
XSLT is one language that is used to transform XML data in this way.
Tools also exist that will take XML and generate PDF, PostScript,
presentations (and more) on the fly.
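The transform-and-cache cycle described above can be sketched as follows. This is only an illustration under our own assumptions: the "stylesheet" is a plain Python function rather than XSLT, the `<article>` document layout is made up, and the names render_html and serve are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

_cache = {}  # rendered HTML keyed by the source XML text


def render_html(xml_text):
    """Apply a fixed 'stylesheet' (plain Python, not XSLT) to an article document."""
    root = ET.fromstring(xml_text)
    title = escape(root.findtext("title", default=""))
    paras = "".join("<p>%s</p>" % escape(p.text or "") for p in root.iter("para"))
    return "<html><head><title>%s</title></head><body><h1>%s</h1>%s</body></html>" % (
        title, title, paras)


def serve(xml_text):
    """Transform on the fly, reusing the cached result for repeated requests."""
    if xml_text not in _cache:
        _cache[xml_text] = render_html(xml_text)
    return _cache[xml_text]
```

The content stays in XML throughout; swapping render_html for a different function changes the presentation without touching the content.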
The most well-known open source
XML application server is Apache Cocoon,
which relies on Java. Other solutions such as
AxKit (C/Perl/mod_perl) and
Charlie (C/C++/Perl/mod_perl),
and technologies such as
Xerces/Xalan (Java),
Sablotron (C++), and
LibXML/LibXSLT (C), are
also available. Even scripting languages such as PHP now have their
own XML solutions, although during his tutorial earlier
in the week mod_perl guru
Matt Sergeant said that the "PHP XML solutions are not very strong".
When the attendees were asked which application server they
were using for their applications, the majority said they
were using a system they developed themselves (home grown) from the underlying
technologies. The rest were a pretty even split between the application
frameworks listed. However, having such a wide choice of technologies and
servers is no bad thing. As one panel member said "no matter what, if your
content is in XML you win".
Pathologically Polluting Perl with Inline.pm
Brian Ingerson presented this talk on the award-winning Inline
module (which only celebrated its 1st birthday a few days before the
conference). Inline.pm allows programmers to embed code from a
variety of programming languages directly inside a Perl script, from
C, C++, and assembler through to Java and Python. Brian covered some
of the advanced features available when using embedded C,
notably caching of compiled object files.
A demonstration was given showing some "one-liners" using
Inline.pm, including an ASCII Mandelbrot set generator. The talk went
on to discuss some of the different ways to use Inline.pm: replacing
the traditional usage of XS and MakeMaker, and also explained how to
extend the module to support new languages.
Microsoft and Open Source
Wednesday had ended with a night of Mexican food and drink in the conference
tent, followed by a party from Stonehenge. Even with all
the free drink and food the night before, by 8.45am on Thursday the ballroom
was packed for the much anticipated debate between Craig Mundie of Microsoft
and Michael Tiemann of Red Hat.
The details of the debate have been covered in a
number of other articles.
However, we were interested in the comments with relevance to Apache
made during the panel discussion. Craig Mundie stated that Microsoft's concern
was not about open source but "about the GPL" as it "creates its own closed
community". Tim O'Reilly commented that University licenses (like the
BSD License and Apache Software License) "give the best balance between
freedom and the right to make money". Also on the panel was Apache Software
Foundation member Brian Behlendorf, who said the Apache model has worked well
to build up momentum. Although with the Apache license
there are no obligations placed on commercial users, history has shown that
the companies involved do re-invest and give back to the community.
Apache 2.0; where is it?
With the provocative title "Apache 2.0; where is it?", Ryan Bloom
proved a popular start to the Apache track, with over 80 attendees packed
in to hear his session.
The aim of the talk was to cover what was new in Apache 2.0
but also answer the question of why Apache 2.0 is taking so long.
Ryan explained that since Apache is now so big there are "only three or four
people who know 100% of Apache 2.0", and that fortunately he was one of them.
The new features of 2.0 were then explained, stopping at Layered
IO which is "the Holy Grail" of Apache.
Ryan then gave a demonstration of Apache 2.0 acting as a POP3 server to
show that it is easy to have Apache serve up other protocols as well
as HTTP.
Apache Week asked Ryan if he was correct in using the name "Apache 2.0"
throughout his talk given that the Apache group have a number of other
products and that the
binary downloads have been renamed to "httpd". Ryan said that the name
was officially "Apache httpd 2.0" but hinted that there was talk of changing
the name to something other than httpd in the future.
To answer the question of Apache 2.0 availability Ryan said that he
expected to see a full release "next year."
PostgreSQL & The Web
After attending the PostgreSQL tutorial on Tuesday, we decided to
follow up with this talk from Gavin Roy, which gave a practical guide
to using PostgreSQL in web applications. Gavin gave an overview of
which web platforms could make use of a PostgreSQL database (for
instance, PHP and Perl), and gave testimony to the product's
reliability and performance in large scale web applications.
The talk proceeded to discuss the architecture of systems using a
web server together with a PostgreSQL database, covering the
advantages and disadvantages of using a single machine or two separate
machines. Some tips on optimising performance in a production
database were also given, emphasizing the use of database indices, and
regularly vacuuming the database.
In closing, Gavin briefly covered security, authentication and
authorization issues when using Postgres in a web environment.
mod_perl 2.0
To end the day we were expecting good things from Doug MacEachern's
talk on "mod_perl 2.0". We were not disappointed as over 50 people packed
into the last mod_perl
session to hear a heavily technical talk about Apache 2.0 and mod_perl.
Doug showed Apache 2.0.22-dev working with both mod_ssl and Perl/mod_perl.
This is perhaps the first demonstration of its kind, as mod_ssl is only
just becoming usable in the Apache 2.0 tree.
He continued and took a program that
communicated entirely using stdin and stdout (in this case an NNTP server)
and showed how easy it was to make this function as an Apache protocol
handler. This allowed Apache to serve newsgroups to his news reader, whilst
still allowing other filters to be included such as SSL and authentication.
Future plans for mod_perl 2.0 include the ability to write an MPM completely in Perl, and
to continue with the Apache-TestKit, a package not tied to mod_perl that
has been designed to test Apache.
Doug said that there was still plenty left to do on mod_perl even
though it currently seems stable and that there would be "probably
a release of some sort at the end of the summer."
Web Security for Business: Introduction to mod_ssl
At the same time as the talk on mod_perl, Paul Weinstein was giving
his popular introduction to mod_ssl in the Apache track. The history of mod_ssl
for Apache 1.3 was discussed together with some of the decision making
process for including mod_ssl in Apache 2.0. The slides to this talk
are available online.
Extreme Programming and Open Source Software
In a pair of talks which attracted 60 people into a room designed
for 40, the speaker known as "chromatic" described the basics of the
Extreme Programming (XP) software development method, and in
particular its application in the Open Source world.
The first talk gave an introduction to XP, its differences from
more traditional software development, and the motivations behind the
techniques it uses to promote the development of high quality
software. The talk highlighted that the most important aspect of XP
is the emphasis on writing unit tests, and also covered the principles
of incremental change, and pair programming.
The room remained packed into the second half of the session, where
chromatic discussed how XP can be used within Open Source software
(OSS) development. Some elements of XP are already employed in many
OSS projects, for instance, the tight feedback loop between users and
developers. Many other XP techniques could also be usefully employed,
but some, such as pair programming, were considered inappropriate in
the majority of Open Source development.
Last year, the conference sessions were held over just two days and we
were pleased to see they were extended to a third day to
fit in more presentations. Friday consisted of the extension
of tracks from previous days together with tracks dedicated to PHP,
Zope, and Open Source Speech.
After breakfast, Michael Tiemann was the moderator for the morning
keynote looking at the "big hairy problems: open source challenges in the
enterprise".
The first speaker was from DreamWorks, the animation company behind
such epics as Antz, Chicken Run, and now Shrek. He told us how
DreamWorks were slowly switching thousands of machines from SGI
to Linux giving them increased performance and value for money.
When working on their strategy for adopting Linux they analysed
six key factors: performance, scalability, stability, software, support,
and transition.
W. Phillip Moore from Morgan Stanley Dean Witter took the stage and
built upon his previous keynote. He explained that it was important that
the enterprise customers have a support number they can call with problems, the
ability to get fixes to existing problems, and the ability to get
enhancements. He complimented Covalent and Red Hat specifically but said
that there
was a need to see more companies providing commercial support for open
source software: "you need to know there is a 800 number and a staff
of people that will be able to solve the problem."
Apache Portable Run-time: Why?
Ryan Bloom gave this talk on APR, the Apache Portable Run-Time,
which began with a quick history lesson explaining how Apache 1.3
addressed portability issues, and how APR and Apache 2.0 grew out of
that experience. Ryan explained what the initial goals for the
library were, and showed how it provides an abstraction layer for
commonly used operating system interfaces which has been ported to a
range of 50 Unix platforms, BeOS, Windows, and OS/2.
The talk gave a breakdown of the different components which make up
APR: from file and network I/O, memory handling, through to some of
the more complex interfaces providing threading support. For each
component an overview of the API was given, showing how it could be
used in applications. Ryan also gave an insight into why various OS
interfaces (such as POSIX) cannot be used portably, justifying the
need for the abstraction layer which APR provides.
To give a more in-depth look at the API, the talk gave a
walk-through of a code sample using the threading interface, and took
a look at some of the test code present in APR which exercises most of
the library's capabilities. Although APR's primary user is the Apache
httpd server, the library is also used by a number of other projects
such as Subversion.
Web Security for Business
Paul Weinstein closed off the afternoon with his talk all about
private certificate authorities. The session showed the basics of how
to create and then use a private certificate authority, then went into
the more advanced details.
The examples were based around the OpenSSL toolkit; showing
which parameters to use on the OpenSSL command line, and how to integrate
the certificates into Apache with mod_ssl. Finally the tricky subject
of certificate revocation was covered.
The slides to this talk
are available online.
Exhibition
The vendor exhibition area was very popular with a large number of
companies attending.
We didn't find much information specific to Apache at the
exhibition: NuSphere were giving out MySQL CDs that come with a packaged
version of Apache, and Red Hat had some information on their Apache services.
However there were plenty of free promotional t-shirts to add to our collection,
as well as more of the flashing clear rubber bouncy balls we picked up
last year from collab.net. Oh, and let's not
forget the "Apache by night" Apache Week postcards of course.
Even if you were not interested in any of the other tracks there were plenty
of talks and tutorials relevant to Apache users, although a number of them
were direct copies or updates of talks given at previous Apache conferences
such as ApacheCon 2001.
Apache Week talked to a large number of the attendees of the conference and
the overall impression was very positive. One attendee said that "the keynotes
alone were worth the trip". We were also particularly impressed by the child
care facilities; allowing conference speakers and participants to bring their
families and enjoy a mini holiday in San Diego. The night-time
activities and the food were also excellent. The only complaint we heard
repeated by a number of attendees was that lunch was not included on
Friday, even though there was a full day of sessions.
With 802.11b wireless internet connectivity to most of the conference rooms
it was hard to escape from work; and with five intensive days packed with
new material we found ourselves tired and in need of a holiday by the
end of the week. Next time we'll bring our swimming trunks and sun cream.
Please note that although Apache Week is an O'Reilly Network affiliate,
O'Reilly had no editorial control over this review of their conference, even
though they did give us free beer.
Apache Week will give you our unbiased opinion of all the conferences
we attend that have things of interest to Apache users and developers.
For more coverage of the rest of the conference visit
the O'Reilly Network web site.