Apache Week had the pleasure of visiting San Diego last week
to report on the 5th O'Reilly Open Source Software
Convention. In the last issue
(Apache Week 256)
we reported on the start of the conference and the first two days of tutorials.
Wednesday started as usual with the complimentary breakfast. With
14 simultaneous talks split across the two hotel blocks we spent most
of our breakfast choosing which to visit. Four of the day's tracks were
dedicated to Perl, two to XML, and the remainder split across Tcl/Tk,
Mozilla, mod_perl, Java, MySQL, Python, and Emerging Topics. The
dedicated Apache track was due to start on Thursday. We noticed that the
number of Perl tracks had shrunk slightly this year, with other
open-source technology tracks becoming more prominent. In particular
we were pleased to see the two XML tracks, something we said was
missing from last year.
Before the keynotes of the day a short film was shown, made up
of interviews with various conference attendees during the tutorial
days. Tim O'Reilly appeared on stage and reminded the packed ballroom
that we should "think the Internet" and think of "technologies such as
Apache, PHP" and not just Linux.
Keynotes
Fred Baker, former chair of the IETF, gave his keynote
presentation titled "Will the next Internet generation still depend on
open source?". He explained that Linux was the only real
technology that could threaten Windows, and that successful open source is
"all about getting good documentation and predictable quality".
He welcomed the involvement
of commercial interests in open source: "Once the open source
technology has to be used by real people then real companies have to
do code freezes and manage the development in a way that makes a
quality product". He predicted that in the coming years we'll see
more open source projects in partnership with the business world.
Open source leads to rapid prototyping and exploratory code, with the
business partnerships being able to productise them.
W. Phillip Moore from Morgan Stanley Dean Witter then took the
stage to show "an open source success story on Wall Street". He
showed why open source was important to their business, allowing
them to tailor existing applications to their complex environment with
a bit of Perl glue thrown in.
MSDW are an enterprise class business that have decided to slowly
migrate from using Sun hardware with Solaris to using commodity hardware
and Linux, with Apache as their primary web server. They've also made
contributions back to open source, and have been covertly
submitting patches back into the community
as well as funding open source development. "It all comes down
to vendor risk management", he said: with proprietary software "you're
placing a bet on the security of that company and the security
of their product, a bet you're not always aware you're making". With
open source this dependency is removed and it's possible to get
enterprise level support for open source software from a number of
vendors.
Open Source Strategies Summit
Also taking place at the convention was the O'Reilly summit on
Open Source strategies, aimed at CTOs, CIOs, and CEOs who want to
find out how to use open source as a strategic advantage. Although
this summit was separate to the main conference we decided to take
a look at the opening talk given by Tim O'Reilly, and the subsequent
panel discussion with the economist Hal Varian, Brian Behlendorf, and
Michael Olson from Sleepycat.
To begin the session, Tim O'Reilly discussed the reasons underlying
the success of the Internet and Open Source software, finding many
common themes. The highlights were the emphasis on decentralisation,
the combination of many small modules into large complex systems, and
the ability to easily extend existing technologies - all important to
the wide adoption seen in both arenas. Looking at current trends,
Tim talked about some emerging projects which may prove key to the
Next Generation Internet.
One of the biggest challenges for Open Source and Internet
companies is the search for an appropriate business model. The panel
discussion which followed the talk gave many interesting insights from
those who have been successful in that search. Brian Behlendorf spoke
about the need to identify which intellectual property is released
freely, and which is "owned" by the company generating it. All
speakers noted that embedded systems would be increasingly
important.
Real World Performance Tuning
After lunch, Apache Software Foundation member Ask Bjoern
Hansen gave a talk on
how to use mod_perl efficiently. He explained that it
is generally preferable to compile mod_perl statically into Apache
rather than load it as a dynamic (shared object) module. However, this
produces a server with a much larger memory footprint, and since the
server spends the majority of its time buffering data to slow clients,
much of that memory is wasted.
The solution presented was to run a separate server that has
mod_perl compiled into it behind a reverse proxy. Apache can also be
used as this reverse proxy and can serve static content as well as
cache the content created by the dedicated Apache+mod_perl server. In
this way the memory usage can be decreased and performance increased.
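The arrangement can be sketched in a few lines of Python. This is an
illustration of the idea only, not Apache configuration and not code from
the presentation: the FrontEndProxy class, the slow_dynamic_backend
function, and the paths are hypothetical names made up for the example.
A lightweight front end answers static requests itself and keeps a copy
of whatever the heavyweight back end generates, so the back end is only
asked once per page.

```python
from typing import Callable, Dict


class FrontEndProxy:
    """Lightweight front end: serves static content itself, and proxies
    and caches everything else (hypothetical class for illustration)."""

    def __init__(self, static_files: Dict[str, str],
                 backend: Callable[[str], str]) -> None:
        self.static_files = static_files  # path -> body, never hits the back end
        self.backend = backend            # stands in for the Apache+mod_perl server
        self.cache: Dict[str, str] = {}   # path -> body generated by the back end
        self.backend_hits = 0             # how often the back end was really asked

    def handle(self, path: str) -> str:
        # Static content is answered directly by the small front end.
        if path in self.static_files:
            return self.static_files[path]
        # Dynamic content is generated once, then served from the cache.
        if path not in self.cache:
            self.backend_hits += 1
            self.cache[path] = self.backend(path)
        return self.cache[path]


def slow_dynamic_backend(path: str) -> str:
    """Hypothetical page generator standing in for the heavy server."""
    return f"<html><body>generated for {path}</body></html>"


proxy = FrontEndProxy({"/logo.txt": "static logo"}, slow_dynamic_backend)
first = proxy.handle("/report")
second = proxy.handle("/report")   # answered from the cache
static = proxy.handle("/logo.txt")  # answered without the back end
```

A real cache would also need to expire entries and respect the caching
headers sent by the back end; the sketch leaves that out.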
The slides from the full presentation are available online.
Why SOAP sucks, Why SOAP rocks
There were a large number of talks throughout the conference
on SOAP and XML-RPC.
Matt Sergeant took a step back to examine what all the fuss was
about in a short talk renamed "Why SOAP sucks, Why SOAP rocks".
He started out by asking why we are using SOAP when we could use HTTP
instead, since
HTTP already has all the features that are normally needed, and more. Using HTTP
natively allows caching and logging, for example. The talk then showed
how to do SOAP without SOAP, using mod_perl to control the URL space
and Perl HTTP modules for the transport. The current major
advantage of SOAP is that modules such as the Perl SOAP::Lite module
exist which allow applications to be developed quickly and easily. There
is currently no simple library that would do the equivalent directly
over HTTP.
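As a rough illustration of the plain HTTP approach, the following sketch
expresses a request as an ordinary URL with query parameters and reads
the answer as XML. It is written in Python rather than the Perl used in
the talk and makes no network request. The response document (the
results, result, title, and url elements) is a made-up format for the
example, not the format returned by any real service, and the function
names are our own.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_request_url(base_url: str, **params: str) -> str:
    """Encode the call as a plain GET URL; caches and logs see it as usual."""
    return f"{base_url}?{urlencode(params)}"


def parse_results(xml_text: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Extract (title, url) pairs from the hypothetical response format."""
    root = ET.fromstring(xml_text)
    return [(r.findtext("title"), r.findtext("url"))
            for r in root.iter("result")]


# Hypothetical response body, invented for this example.
SAMPLE_RESPONSE = """<results>
  <result><title>Apache Week</title><url>http://www.apacheweek.com/</url></result>
  <result><title>Apache HTTP Server</title><url>http://httpd.apache.org/</url></result>
</results>"""

url = build_request_url("http://www.google.com/xml", q="apacheweek")
results = parse_results(SAMPLE_RESPONSE)
```

Because the request is just a URL, the same call can be made from a
browser or logged by any web server, which is the advantage the talk
attributed to using HTTP natively.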
Finally we were shown some services that are already doing the
equivalent of a SOAP transaction without SOAP, such as the ability to get search
results from Google in XML format (for example, try
http://www.google.com/xml?q=apacheweek).
The slides to this talk
are available online.
XML Content management
For the remainder of the afternoon we visited the XML track; in
particular we were interested in XML application servers. The first
session, "XML Content management using XSLT, Schematron and Ant", showed
one extensible way of serving XML content to browsers. Following that talk a
panel discussion, "XML-based Application Frameworks", took place.
The basic idea of an XML application server is that you create all the content
for your site in XML. The use of XML allows the separation of content from presentation, a useful extra abstraction layer.
The XML content can come from static files, from a database,
or be dynamically generated content from scripts. In its simplest form
you take your XML content then apply a style-sheet to generate HTML for a browser.
Application servers usually perform this style-sheet conversion
on the fly, caching the results for speed.
XSLT is one language that is used to transform XML data in this way.
Tools also exist that will take XML and generate PDF, PostScript,
presentations, and more on the fly.
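The simplest form of this pipeline can be sketched as follows. Note the
substitution: the Python standard library has no XSLT processor, so a
small hand-written function stands in for the style-sheet here, and a
real application server would apply an XSLT style-sheet instead. The
element names (article, title, para) and the function names are
assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape
from typing import Dict

# Content only: nothing in the XML says how it should look.
ARTICLE_XML = """<article>
  <title>Conference report</title>
  <para>Content lives in XML.</para>
  <para>Presentation lives in the style-sheet.</para>
</article>"""


def render_html(xml_text: str) -> str:
    """Stand-in for a style-sheet: <title> becomes <h1>, each <para> a <p>."""
    root = ET.fromstring(xml_text)
    parts = [f"<h1>{escape(root.findtext('title', ''))}</h1>"]
    parts += [f"<p>{escape(p.text or '')}</p>" for p in root.findall("para")]
    return "<html><body>" + "".join(parts) + "</body></html>"


_cache: Dict[str, str] = {}  # XML source -> rendered HTML
render_count = 0             # how many real transformations were run


def serve(xml_text: str) -> str:
    """Transform on the fly, but only once per distinct document."""
    global render_count
    if xml_text not in _cache:
        render_count += 1
        _cache[xml_text] = render_html(xml_text)
    return _cache[xml_text]


page = serve(ARTICLE_XML)
page_again = serve(ARTICLE_XML)  # served from the cache
```

Swapping render_html for a different function would give a different
presentation of the same content, which is the separation the panel was
describing.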
The best-known open source
XML application server is Apache Cocoon,
which relies on Java. Other solutions such as
AxKit (C/Perl/mod_perl) and
Charlie (C/C++/Perl/mod_perl),
and technologies such as
Xerces/Xalan (Java),
Sablotron (C++), and
LibXML/LibXSLT (C), are
also available. Even scripting languages such as PHP now have their
own XML solutions, although during his tutorial earlier
in the week mod_perl guru
Matt Sergeant said that the "PHP XML solutions are not very strong".
When the attendees were asked which application server they
were using for their applications, the majority said they
were using a system they developed themselves (home grown) from the underlying
technologies. The rest were a pretty even split between the application
frameworks listed. However, having such a wide choice of technologies and
servers is no bad thing. As one panel member said, "no matter what, if your
content is in XML you win".
Pathologically Polluting Perl with Inline.pm
Brian Ingerson presented this talk on the award-winning Inline
module (which only celebrated its 1st birthday a few days before the
conference). Inline.pm allows programmers to embed code from a
variety of programming languages directly inside a Perl script, from
C, C++, and assembler through to Java and Python. Brian covered some
of the advanced features available when using embedded C,
notably caching of compiled object files.
A demonstration was given showing some "one-liners" using
Inline.pm, including an ASCII Mandelbrot set generator. The talk went
on to discuss some of the different ways to use Inline.pm, such as
replacing the traditional usage of XS and MakeMaker, and explained how
to extend the module to support new languages.
Microsoft and Open Source
Wednesday had ended with a night of Mexican food and drink in the conference
tent, followed by a party from Stonehenge. Even with all
the free drink and food the night before, by 8.45am on Thursday the ballroom
was packed for the much anticipated debate between Craig Mundie of Microsoft
and Michael Tiemann of Red Hat.
The details of the debate have been covered in a
number of other articles.
However, we were interested in the comments with relevance to Apache
made during the panel discussion. Craig Mundie stated that Microsoft's concern
was not about open source but "about the GPL" as it "creates its own closed
community". Tim O'Reilly commented that University licenses (like the
BSD License and Apache Software License) "give the best balance between
freedom and the right to make money". Also on the panel was Apache Software
Foundation member Brian Behlendorf, who said the Apache model has worked well
to build up momentum. Although with the Apache license
there are no obligations placed on commercial users, history has shown that
the companies involved do re-invest and give back to the community.
Apache 2.0; where is it?
With the provocative title "Apache 2.0; where is it?", Ryan Bloom
proved a popular start to the Apache track, with over 80 attendees packed
in to hear his session.
The aim of the talk was to cover what was new in Apache 2.0
and also to answer the question of why Apache 2.0 is taking so long.
Ryan explained that since Apache is now so big there are "only three or four
people who know 100% of Apache 2.0", and that fortunately he was one of them.
The new features of 2.0 were then explained, stopping at Layered
IO which is "the Holy Grail" of Apache.
Ryan then gave a demonstration of Apache 2.0 acting as a POP3 server to
show that it is easy to have Apache serve up other protocols as well
as HTTP.
Apache Week asked Ryan if he was correct in using the name "Apache 2.0"
throughout his talk given that the Apache group have a number of other
products and that the
binary downloads have been renamed to "httpd". Ryan said that the name
was officially "Apache httpd 2.0" but hinted that there was talk of changing
the name to something other than httpd in the future.
To answer the question of Apache 2.0 availability Ryan said that he
expected to see a full release "next year."
PostgreSQL & The Web
After attending the PostgreSQL tutorial on Monday, we decided to
follow up with this talk from Gavin Roy, which gave a practical guide
to using PostgreSQL in web applications. Gavin gave an overview of
which web platforms could make use of a PostgreSQL database (for
instance, PHP and Perl), and gave testimony to the product's
reliability and performance in large scale web applications.
The talk proceeded to discuss the architecture of systems using a
web server together with a PostgreSQL database, covering the
advantages and disadvantages of using a single machine or two separate
machines. Some tips on optimising performance in a production
database were also given, emphasising the use of database indices, and
regularly vacuuming the database.
In closing, Gavin briefly covered security, authentication and
authorization issues when using Postgres in a web environment.
mod_perl 2.0
To end the day we were expecting good things from Doug MacEachern's
talk on "mod_perl 2.0". We were not disappointed as over 50 people packed
into the last mod_perl
session to hear a heavily technical talk about Apache 2.0 and mod_perl.
Doug showed Apache 2.0.22-dev working with both mod_ssl and Perl/mod_perl.
This is perhaps the first demonstration of its kind, as mod_ssl is only
just becoming usable in the Apache 2.0 tree.
He continued by taking a program that
communicated entirely using stdin and stdout (in this case an NNTP server)
and showing how easy it was to make this function as an Apache protocol
handler. This allowed Apache to serve newsgroups to his news reader, whilst
still allowing other filters to be included such as SSL and authentication.
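The general idea, a program written only against an input stream and an
output stream being reused as a network protocol handler, can be sketched
in Python. This is not the mod_perl or Apache 2.0 API, and it does not
model filters such as SSL or authentication. The toy protocol, its
response codes, and the names line_protocol and ProtocolHandler are
invented for the example and are not NNTP.

```python
import io
import socketserver
from typing import TextIO


def line_protocol(inp: TextIO, out: TextIO) -> None:
    """Toy line-based protocol that only knows about two text streams."""
    out.write("200 ready\n")
    out.flush()
    for line in inp:
        command = line.strip().upper()
        if command == "QUIT":
            out.write("205 goodbye\n")
            out.flush()
            break
        out.write(f"500 unknown command: {command}\n")
        out.flush()


class ProtocolHandler(socketserver.BaseRequestHandler):
    """Runs the same function over a TCP connection instead of stdin/stdout."""

    def handle(self) -> None:
        # Wrap the connected socket in text-mode file objects.
        with self.request.makefile("r", encoding="utf-8") as inp, \
             self.request.makefile("w", encoding="utf-8") as out:
            line_protocol(inp, out)


# The unchanged function also runs against in-memory streams.
session = io.StringIO()
line_protocol(io.StringIO("HELP\nQUIT\n"), session)
transcript = session.getvalue()
```

To try it over the network, ProtocolHandler could be passed to
socketserver.TCPServer along with an address of your choosing; the
protocol function itself does not change, which is the point of the
demonstration.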
Future plans for mod_perl 2.0 include the ability to write an MPM completely in Perl, and
to continue with the Apache-TestKit, a package not tied to mod_perl that
has been designed to test Apache.
Doug said that there was still plenty left to do on mod_perl even
though it currently seems stable and that there would be "probably
a release of some sort at the end of the summer."
Web Security for Business: Introduction to mod_ssl
At the same time as the talk on mod_perl, Paul Weinstein was giving
his popular introduction to mod_ssl in the Apache track. The history of mod_ssl
for Apache 1.3 was discussed together with some of the decision-making
process for including mod_ssl in Apache 2.0. The slides to this talk
are available online.
Extreme Programming and Open Source Software
In a pair of talks which attracted 60 people into a room designed
for 40, the speaker known as "chromatic" described the basics of the
Extreme Programming (XP) software development method, and in
particular its application in the Open Source world.
The first talk gave an introduction to XP, its differences from
more traditional software development, and the motivations behind the
techniques it uses to promote the development of high quality
software. The talk highlighted that the most important aspect of XP
is the emphasis on writing unit tests, and also covered the principles
of incremental change and pair programming.
The room remained packed into the second half of the session, where
chromatic discussed how XP can be used within Open Source software
(OSS) development. Some elements of XP are already employed in many
OSS projects, for instance, the tight feedback loop between users and
developers. Many other XP techniques could also be usefully employed,
but some, such as pair programming, were considered inappropriate in
the majority of Open Source development.
Last year, the conference sessions were held over just two days and we
were pleased to see they were extended to a third day to
fit in more presentations. Friday consisted of the extension
of tracks from previous days together with tracks dedicated to PHP,
Zope, and Open Source Speech.
After breakfast, Michael Tiemann was the moderator for the morning
keynote looking at the "big hairy problems: open source challenges in the
enterprise".
The first speaker was from DreamWorks, the animation company behind
such epics as Antz, Chicken Run, and now Shrek. He told us how
DreamWorks were slowly switching thousands of machines from SGI
to Linux, giving them increased performance and value for money.
When working on their strategy for adopting Linux they analysed
six key factors: performance, scalability, stability, software, support,
and transition.
W. Phillip Moore from Morgan Stanley Dean Witter took the stage and
built upon his previous keynote. He explained that it was important that
the enterprise customers have a support number they can call with problems, the
ability to get fixes to existing problems, and the ability to get
enhancements. He complimented Covalent and Red Hat specifically but said
that there
was a need to see more companies providing commercial support for open
source software: "you need to know there is a 800 number and a staff
of people that will be able to solve the problem."
Apache Portable Run-time: Why?
Ryan Bloom gave this talk on APR, the Apache Portable Run-Time,
which began with a quick history lesson explaining how Apache 1.3
addressed portability issues, and how APR and Apache 2.0 grew out of
that experience. Ryan explained what the initial goals for the
library were, and showed how it provides an abstraction layer for
commonly used operating system interfaces, and has been ported to a
range of 50 Unix platforms, BeOS, Windows, and OS/2.
The talk gave a breakdown of the different components which make up
APR, from file and network I/O and memory handling through to some of
the more complex interfaces providing threading support. For each
component an overview of the API was given, showing how it could be
used in applications. Ryan also gave an insight into why various OS
interfaces (such as POSIX) cannot be used portably, justifying the
need for the abstraction layer which APR provides.
To give a more in-depth look at the API, the talk gave a
walk-through of a code sample using the threading interface, and took
a look at some of the test code present in APR which exercises most of
the library's capabilities. Although APR's primary user is the Apache
httpd server, the library is also used by a number of other projects
such as Subversion.
Web Security for Business
Paul Weinstein closed off the afternoon with his talk all about
private certificate authorities. The session showed the basics of how
to create and then use a private certificate authority, then went into
the more advanced details.
The examples were based around the OpenSSL toolkit, showing
which parameters to use on the OpenSSL command line, and how to integrate
the certificates into Apache with mod_ssl. Finally the tricky subject
of certificate revocation was covered.
The slides to this talk
are available online.
Exhibition
The vendor exhibition area was very popular, with a large number of
companies attending.
We didn't find much information specific to Apache at the
exhibition: NuSphere were giving out MySQL CDs that come with a packaged
version of Apache, and Red Hat had some information on their Apache services.
However there were plenty of free promotional t-shirts to add to our collection,
as well as more of the flashing clear rubber bouncy balls we picked up
last year from collab.net. Oh, and let's not
forget the "Apache by night" Apache Week postcards of course.
Even if you were not interested in any of the other tracks there were plenty
of talks and tutorials relevant to Apache users, although a number of them
were direct copies or updates of talks given at previous Apache conferences
such as ApacheCon 2001.
Apache Week talked to a large number of the attendees of the conference and
the overall impression was very positive. One attendee said that "the keynotes
alone were worth the trip". We were also particularly impressed by the child
care facilities; allowing conference speakers and participants to bring their
families and enjoy a mini holiday in San Diego. The night-time
activities and the food were also excellent. The only complaint we heard
repeated by a number of attendees was that lunch was not included on
Friday, even though there was a full day of sessions.
With 802.11b wireless internet connectivity to most of the conference rooms
it was hard to escape from work; and with five intensive days packed with
new material we found ourselves tired and in need of a holiday by the
end of the week. Next time we'll bring our swimming trunks and sun cream.
Please note that although Apache Week is an O'Reilly Network affiliate,
O'Reilly had no editorial control over this review of their conference, even
though they did give us free beer.
Apache Week will give you our unbiased opinion of all the conferences
we attend that have things of interest to Apache users and developers.
For more coverage of the rest of the conference visit
the O'Reilly Network web site.
Next Week...
Apache Week will be back to normal next week and catching up on the
Apache news and features from the last few weeks.