The Apache Web Server Documentation Project
Sep 27, 2000, 10:00 UTC
By Ken Coar
Copyright © 2000 by Ken Coar. All rights reserved. Limited rights granted to Internet.Com.
Over the years, a lot of people have become interested in the idea of contributing to the Apache HTTP Server project, but have hung back or remained silent because they felt only hardcore C programmers with tons of experience need apply. Some actually have contacted the Project, saying they'd like to help out but don't have the coding skills and so didn't know what they could do. And some have offered specifically to help out on the documentation, either translating it, or correcting technical nits, or improving its readability or navigability.
Until recently, there has been no real organised way for non-programmers to contribute; it's been a rather haphazard matter of sending comments or suggestions to the best-guess contact they could find.
In July of 2000, however, the Apache HTTP Server project created a subproject and reorganised the documentation files so that they can be worked on directly by non-programmers. This article describes that subproject and how you can get involved.
The mailing list dedicated to the documentation project is called apache-docs. Discussions about what needs to be done and how to do it take place on that list; reports of changes actually made to the master documentation files are sent there as well. (There's an example message of this sort later in this article.)
All comers are welcome to join and participate in the mailing list, even if they don't intend to actually submit changes but just want to express opinions about how they'd like to see things go. To join the mailing list, send an empty email message to
[email protected]
You'll receive a confirmation message in response, which you must handle as it instructs in order to complete your subscription.
There are currently four parts to the Apache Web server documentation project. There are two versions, one for Apache 1.3 and one for Apache 2.0 (still in development at this point), and for each version there are two sections: the 'user manual,' which describes the directives and how to use the server, and the API (Application Programming Interface) documentation, which exists to help developers such as those writing modules for Apache. Ordinarily when you download the files, you request either the 1.3 or 2.0 version and get both sections, but you can cut it more finely than that if you like and only download a single section.
Just like the source code, the master documentation files are maintained using CVS, the Concurrent Versions System. CVS allows multiple people to work on the same files and coordinates their changes, which makes resolving conflicts much simpler.
There are essentially two different ways to obtain a copy of the master documentation files and keep the copy up to date. One method works for people who have been granted commit access (the ability to make changes directly to the master files), and the other method is for people who don't have commit access.
Workers who don't have commit access to the project need to work with a copy of the documentation, make their changes, sum those changes up in the form of a patch, submit the patch to the mailing list for approval, and wait for someone who does have commit access to apply the changes to the master copies. On the other hand, someone with commit access can just make changes to his or her working copies and then apply them directly to the master files.
Commit access to the documentation project is granted pretty readily. The main prerequisites are a demonstrated desire and willingness to work on it.
The Web server documentation project has been separated into two distinct areas: one for version 1.3 of the server, and one for version 2.0. Apache 1.3 is stable, released, and widely deployed, so there are a lot of people (millions!) who can benefit from improvements to its documentation. Apache 2.0, however, is still in development and hasn't been released yet. There are a lot of significant differences between it and version 1.3, so there are lots of areas needing new documentation written or 1.3 documentation adapted. There's no shortage of needed documentation work! Which version you choose to work on is entirely up to you -- you can even choose to work on both.
For people who haven't (yet) been granted commit access, there are three basic ways to keep up with the changes made to the master copies:
- Anonymous CVS
- rsync
- Regular tarballs
I've listed these in descending order of ease-of-use when it comes to submitting changes. These procedures are essentially the same as would be used to keep up with changes to the source code (as opposed to the documentation), except that the bits being downloaded are coming from different locations. How to use them will be described in the following sections.
This is probably the best way to stay synchronised with the master documentation sources while actively working on changes. This method allows CVS to do one of the things it does best, keeping your local working files up to date with any changes other people are making while keeping your own alterations intact.
Anonymous CVS uses a real CVS repository, but the username used to access it does not have permission to upload changes. So it's strictly a one-way proposition, keeping your local copies synchronised with the master files, with no commit ability.
There are currently two anonymous CVS repositories available. One is actually the master Apache repository in read-only mode; if you use that you're guaranteed to get the very latest and up-to-the-second version of the files. The other repository is on another system, and is synchronised with the master every two hours, so you can use that if the one on the Apache site becomes unavailable or you don't need its steaming freshness.
The first step in using anoncvs is setting your CVSROOT environment variable. This tells CVS to what server it should connect. There are currently two servers, and the only available method to access these anoncvs servers is pserver; you need a reasonably recent CVS client (such as 1.10) to use it. An example of setting it from a Bourne shell:
user@host:~$ CVSROOT=:pserver:[email protected]:/home/cvspublic
user@host:~$ export CVSROOT
- Note:
- It is very important that you not put a trailing "/" on the value of CVSROOT.
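A quick check before exporting the variable can catch the trailing-slash mistake. This is only a sketch: check_cvsroot is a hypothetical helper (it is not part of CVS), and the user and host in the example value are placeholders to be replaced with one of the two anoncvs servers listed in this article.

```shell
#!/bin/sh
# check_cvsroot is a hypothetical helper, not part of CVS itself.
# It accepts a :pserver: CVSROOT value and rejects one that ends in "/".
check_cvsroot() {
    case "$1" in
        */)
            echo "CVSROOT must not end with a slash: $1" >&2
            return 1 ;;
        :pserver:*)
            return 0 ;;
        *)
            echo "expected a :pserver: CVSROOT, got: $1" >&2
            return 1 ;;
    esac
}

# Placeholder user and host; substitute a real anoncvs server value.
CVSROOT=":pserver:user@cvs.example.org:/home/cvspublic"
if check_cvsroot "$CVSROOT"; then
    export CVSROOT
fi
```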
The two anoncvs servers currently available are:
:pserver:[email protected]:/home/cvspublic
and
:pserver:[email protected]:/cvs/apache
Choose one of these values for your setting of CVSROOT. Once you have set your CVSROOT, you need to log in:
user@host:~$ cvs login
(Logging in to [email protected]:/home/cvspublic)
CVS password:
The password is "anoncvs" (without the quotation marks) for both servers.
To check out the Apache 1.3 documentation:
user@host:~$ cvs checkout httpd-docs-1.3
cvs server: Updating httpd-docs-1.3
cvs server: Updating httpd-docs-1.3
cvs server: Updating httpd-docs-1.3/apidoc
U httpd-docs-1.3/apidoc/.cvsignore
U httpd-docs-1.3/apidoc/APIdict.pm
U httpd-docs-1.3/apidoc/README
U httpd-docs-1.3/apidoc/TODO
U httpd-docs-1.3/apidoc/api-dict.html
[...]
user@host:~$ ls httpd-docs-1.3
CVS apidoc htdocs
To update your local tree to the latest version:
user@host:~$ cvs update -dP httpd-docs-1.3
or
user@host:~/httpd-docs-1.3/$ cvs update -dP .
P apidoc/APIdict.pm
RCS file: /home/cvs/httpd-docs-1.3/apidoc/TODO,v
retrieving revision 1.109
retrieving revision 1.110
Merging differences between 1.109 and 1.110 into TODO
M apidoc/README
The P means that the local copy was patched to update it to the current version in the master repository. The M means that your local copy is different from the master copy, but that any changes were merged into your copy successfully. If you see a C that means that there was a conflict in merging the changes and that you need to review the file manually (hint: search for >>>> in the file) to merge the changes.
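If an update reports several C lines, it can help to list every file that still contains conflict markers before you go on. The sketch below assumes the usual CVS marker format, in which the conflicting region starts with a line of seven '<' characters followed by a space; list_conflicts is a hypothetical helper name, not a CVS command.

```shell
#!/bin/sh
# list_conflicts is a hypothetical helper, not a CVS command.
# It prints every file under the given directory that still contains a
# conflict marker, skipping the CVS administrative directories.
list_conflicts() {
    find "$1" -name CVS -prune -o -type f \
        -exec grep -l '^<<<<<<< ' {} \;
}

# Example:
#   list_conflicts httpd-docs-1.3
```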
To obtain a diff (patch) of changes between your checked out copy and the source tree at the time you checked it out:
user@host:~$ cvs diff -u httpd-docs-1.3
To obtain a diff against the current source tree, be sure to do an update before the diff.
The idea of having an anoncvs server is to make it much easier for people interested in doing development to have access to the CVS tree so they can submit patches against the current tree and can keep their patched version up to date without having to manually merge their patches all the time.
The main disadvantage of using the anonymous CVS method is that it requires you to have a CVS client on your local system.
The rsync (Remote SYNChroniser) tool is a very fast and easy way to keep a local copy of a directory tree up-to-date with the remote master files. It's a client/server application, and the local rsync client command works with the remote server daemon to figure out which files on the remote end are new, deleted, or have changed, and then propagates the changes to the local copy, adding, deleting, or modifying files locally as needed.
Here's how you can use it with Apache:
% rsync -avz dev.apache.org::doc-module ~/doc-module
where doc-module is either 'httpd-docs-1.3' or 'httpd-docs-2.0'.
The second parameter, '~/doc-module', is the name of the local directory under which you want the remote files copied. You can call that directory anything you like and put it anywhere you have access; for simplicity's sake, I suggest giving it the same name as the documentation module. The first time you use rsync, it will correctly think that all of the remote files are 'new,' and will download everything.
If you've already rsynced the files into that directory before, the command will only apply changes -- that's where the real speed comes in.
There are two main drawbacks to using this method to keep synchronised with the master files:
- rsync only notes that files have been changed; it doesn't try to coordinate local and remote alterations. You can instruct it to not download files from the remote system if the local copies of them have been modified more recently, or to download them regardless. The former will keep you from receiving changes to the files you're actively editing, and the latter will cause you to lose your edits.
- Because of the way the documentation for Apache 2.0 is being done, only the 2.0 user manual is available via rsync, so you can't use this method to keep up with the API section if you want to work on that. Unless, of course, some solution is figured out that allows the API files to be made available through rsync, too.
An automatic job runs on the Apache system every six hours and bundles up the master CVS repositories into a tar archive (called a tarball or tarchive). These tarballs are put into a directory for downloading.
Unlike a lot of download areas on the Internet, the Apache download areas do not allow access using FTP. The only method permitted is HTTP, which means you need to use a browser or other Web tool to download the tarballs. The URL is:
<URL:http://dev.apache.org/dist/from-cvs/>
The tarballs are gzipped and named with a timestamp indicating when they were created. For example,
apache-1.3_20000819060015.tar.gz
was created at 6:00:15 AM (PDT) on Saturday, 19 August 2000. (All times are U.S. West Coast time.)
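The timestamp can be pulled out of a tarball name with nothing more than shell parameter expansion, which is handy when deciding which of several downloads is the newest. This is a sketch only; tarball_stamp is a hypothetical helper, and it assumes the name always ends in an underscore, fourteen digits, and '.tar.gz' as in the example above.

```shell
#!/bin/sh
# tarball_stamp is a hypothetical helper for illustration only.
# Given a name such as apache-1.3_20000819060015.tar.gz it prints the
# embedded timestamp as "YYYY-MM-DD hh:mm:ss" (U.S. West Coast time).
tarball_stamp() {
    name=${1##*/}            # drop any leading directory
    stamp=${name##*_}        # keep what follows the last underscore
    stamp=${stamp%.tar.gz}   # drop the extension
    case "$stamp" in
        ""|*[!0-9]*)
            echo "no timestamp found in: $1" >&2
            return 1 ;;
    esac
    if [ "${#stamp}" -ne 14 ]; then
        echo "unexpected timestamp length in: $1" >&2
        return 1
    fi
    echo "$stamp" |
        sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/'
}

tarball_stamp apache-1.3_20000819060015.tar.gz
# prints: 2000-08-19 06:00:15
```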
Once you've downloaded a tarball, you need to unpack it into a directory tree in order to work on the files. This is where one of the shortcomings of this method becomes clear, because unpacking the tarball into the same directory where you've been working will replace any of your changed files with the as-yet-unchanged ones from the master repository. For this reason, this method is not very well suited to ongoing changes being made to existing files; it's acceptable for translating the existing files into other-language files, however.
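One way to sidestep the overwrite problem is to unpack each tarball into a brand-new directory and compare it with your working tree afterwards. The following is a minimal sketch under that assumption; unpack_fresh is a hypothetical helper, and the paths in the example comment are placeholders.

```shell
#!/bin/sh
# unpack_fresh is a hypothetical helper, not an Apache-provided script.
# It unpacks a gzipped tarball into a new directory and refuses to touch
# a path that already exists, so your edited files are never replaced.
unpack_fresh() {
    tarball=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to unpack into existing path: $dest" >&2
        return 1
    fi
    mkdir -p "$dest" && gzip -dc "$tarball" | tar -xf - -C "$dest"
}

# Example (placeholder paths):
#   unpack_fresh apache-1.3_20000819060015.tar.gz "$HOME/pristine"
#   diff -ru "$HOME/pristine" "$HOME/my-working-tree"
```

The diff -ru comparison at the end shows what changed between the fresh copy and your own tree, which you then have to merge by hand.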
Once you've been granted direct access to the CVS repository for the documentation, you need to go through a few actions before you can take advantage of it. This section describes those actions. Getting commit access isn't that difficult, so if you really want to help out with the project, it's almost certainly just a matter of time.
Setting things up so you can take advantage of your direct access to the project files takes a number of steps, but almost all of them only need to be done once. So even though this section is rather long, it's not like it's something you need to run through each and every time you want to make a change.
When you're given commit access, you'll be assigned an account on the machine where the documentation sources live. You will need to log onto the system at least once, but not very often after that. Telnet access isn't permitted, and all the CVS activity goes through channels secured with SSH (Secure SHell), so you'll need to log on using it. The first thing you need to do is change your password on the Apache machine:
% ssh -l jdoe locus.apache.org
[email protected]'s password: enter the password you were given
[welcome message]
bash-2.03$ passwd
Old password: enter the password you were given
New password: enter the new password you want
Retype new password: enter the new password again
passwd: updating the database...
passwd: done
The next thing you should do is create an .ssh directory on the Apache machine so that your transmissions will be secure, and make sure the directory itself is secure:
bash-2.03$ mkdir .ssh
bash-2.03$ chmod 700 .ssh
Next you need to set up an SSH environment on your working machine (if you don't already have one), so log out of the Apache system again. How you set up your local SSH environment depends on whether you're using Windows or some flavour of Unix; the example here is for the latter, since that's common to all Unix environments while the Windows procedure varies depending on the SSH tools you're using:
% ssh-keygen
Initializing random number generator...
Generating p: ........................++ (distance 378)
Generating q: ......++ (distance 78)
Computing the keys...
Testing the keys...
Key generation complete.
Enter file in which to save the key (/home/jdoe/.ssh/identity):
Enter passphrase: some phrase
Enter the same passphrase again: some phrase
Your identification has been saved in /home/jdoe/.ssh/identity.
Your public key is:
1024 37 1757929984416721208730364810902293553450996411072075783191053944
754656443598770447187276288423727562812292885286108547911953028869014076
111044194416145436315271955021155359447781640675839182941840834261082262
672358626432123431097764865833697997894646219556601685973804104269238278
14361628743139739328209517121 [email protected]
Your public key has been saved in /tmp/identity.pub
The next step is to make the Apache system aware of your SSH key; you do this by copying a file to it:
% scp .ssh/identity.pub [email protected]:.ssh/authorized_keys
Enter passphrase for RSA key '[email protected]': some phrase
identity.pub | 0 KB | 0.8 kB/s | ETA: 00:00:00 | 100%
At this point, you should be able to use the ssh command to log directly into the Apache system with the passphrase you assigned to your SSH key instead of the login password:
% ssh locus.apache.org
Enter passphrase for RSA key '[email protected]': some phrase
[welcome message]
bash-2.03$
Now you should be all set up to take advantage of your commit access. You need to set up a set of working directories in order to do this, so do the following on your working machine:
% cvs -d locus.apache.org:/home/cvs checkout httpd-docs-1.3
Enter passphrase for RSA key '[email protected]': some phrase
cvs server: Updating httpd-docs-1.3
cvs server: Updating httpd-docs-1.3/apidoc
U httpd-docs-1.3/apidoc/.cvsignore
U httpd-docs-1.3/apidoc/README
U httpd-docs-1.3/apidoc/TODO
U httpd-docs-1.3/apidoc/api-dict.html
[more output]
Now when you want to work on the documentation, make your changes in the newly checked-out working tree, and then use cvs commit to check them into the master repository.
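The checkout above relies on the CVS client reaching the server through SSH. Many CVS clients use rsh for this kind of remote access unless the CVS_RSH environment variable says otherwise, so a setup along the following lines may be needed. Treat it as a sketch: jdoe is the example account name used earlier, and the ':ext:' form of CVSROOT is an assumption about how you prefer to name the repository, not something the project requires.

```shell
#!/bin/sh
# Tell the CVS client to use ssh instead of rsh for remote access.
CVS_RSH=ssh
export CVS_RSH

# jdoe is the example account name used earlier in this article.
CVSROOT=":ext:jdoe@locus.apache.org:/home/cvs"
export CVSROOT

# With both variables set, the -d option can be left off:
#   cvs checkout httpd-docs-1.3
#   cvs commit -m "Wrap long lines" htdocs/index.html.en
```

Putting these lines in your shell startup file means you only have to get them right once.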
If you have any questions about this process, you should ask them on the apache-docs mailing list.
One of the major outcomes we hope to see from the documentation project is complete versions of the files translated to other languages. This will be an ongoing task, since the master copy of the documentation will always be in English, and anyone translating it to another language will need to be alert to updates to the English version in order to make the appropriate changes to the translation.
The translated versions are being maintained alongside the English versions, with the goal of allowing Apache's content negotiation process to choose the appropriate language variant. This means that for a particular section, there may be the following files:
File                Language
foo.html.en         English
foo.html.tw.Big5    Taiwan Chinese, in the Big5 character set
foo.html.po         Polish
foo.html.fr         French
foo.html.pt         Portuguese
foo.html.pt-br      Portuguese (Brazilian)
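Because each translation sits next to its English original and differs only in the suffix, a translator can find out what is still untranslated with a short loop. The sketch below assumes that naming scheme; missing_translations is a hypothetical helper, not a project tool.

```shell
#!/bin/sh
# missing_translations is a hypothetical helper for translators.
# For every English file (*.html.en) in a directory, it prints the name
# of the variant that does not exist yet for the given suffix.
missing_translations() {
    dir=$1
    lang=$2
    for en in "$dir"/*.html.en; do
        [ -e "$en" ] || continue       # the directory has no English files
        base=${en%.en}
        [ -e "$base.$lang" ] || echo "${base##*/}.$lang"
    done
}

# Example:
#   missing_translations httpd-docs-1.3/htdocs fr
```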
A lot of translation work has already been done by some Web sites around the world; we hope to be able to merge their efforts into the master documentation.
Just like the Apache source code, changes to the documentation are submitted in the form of patches. A patch is a list of differences between what's currently in the master repository and what it should be changed to, and understanding patches is critical to keeping up with what's going on. Figure 1 shows an example patch:
Figure 1: Sample Patch Text
Index: index.html.en
===================================================================
RCS file: /home/cvs/httpd-docs-1.3/htdocs/index.html.en,v
retrieving revision 1.4
diff -u -r1.4 index.html.en
--- index.html.en 1999/11/20 21:29:40 1.4
+++ index.html.en 2000/08/16 11:50:51
@@ -12,9 +12,12 @@
ALINK="#FF0000"
>
-
-<P>
-If you can see this, it means that the installation of the <A HREF=...1
+<p>
+If you can see this, it means that the installation of the
+<a href="http://www.apache.org/httpd">Apache web server</a> software on
+this system was successful. You may now add content to this directory
+and replace this page.
+</p>
<P><HR WIDTH="50%" SIZE="8">
1 This line from the original source was very long and has been truncated for display purposes. Part of the purpose of the patch is to fix that by wrapping the line.
This is what's called a unified diff, but don't worry about the name so much. If you see a patch that has bangs ('!') at the beginning of some of the lines, you know it's a context diff instead.
The first few lines of the patch identify the file involved, the sources of the changes, and the original against which they were compared. In Figure 1, the '1.4' at the end of the 'retrieving revision' line indicates that the original source used as a reference was version 1.4 of the file in the CVS repository. You don't generally need to worry about this stuff, either.
In the body of the patch, there are one or more sections separated by lines beginning with '@@'. Each of these introduces a new change section; within such a section, lines may begin with '+', '-', '!', or a space.
Lines beginning with '-' need to be deleted from the master copy; those beginning with '+' need to be added. When taken in combination, you can interpret the following:
Line 0
-Line 1
+Line 2
Line 3
as meaning 'Line 1 is replaced by Line 2', and 'Line 0' and 'Line 3' don't get changed.
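The same reading rule can be applied mechanically to get a quick feel for the size of a patch. The sketch below counts the '+' and '-' lines inside the '@@' sections of a unified diff; diff_stat is a hypothetical helper, and it assumes a new file in a multi-file patch is introduced by a line starting with 'Index: ' or 'diff ', as in the CVS output shown in Figure 1.

```shell
#!/bin/sh
# diff_stat is a hypothetical helper; it reads a unified diff on stdin
# and counts the lines that would be added and removed.
diff_stat() {
    awk '
        /^Index: / || /^diff / { in_hunk = 0; next }  # next file begins
        /^@@ /                 { in_hunk = 1; next }  # change section begins
        !in_hunk               { next }               # skip header lines
        /^\+/                  { added++ }
        /^-/                   { removed++ }
        END { printf "added=%d removed=%d\n", added, removed }
    '
}

diff_stat <<'EOF'
--- index.html.en
+++ index.html.en
@@ -1,3 +1,3 @@
 Line 0
-Line 1
+Line 2
 Line 3
EOF
# prints: added=1 removed=1
```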
So much for what patches look like. A lot of the patch messages which will be sent to the mailing list will merely be signals of changes that have been made to the master files -- but some of them will be proposed changes, sent to the list for review and suggestions.
In order to properly review such proposals, you need to be able to apply the changes to your own working copies of the documentation files so you can see how the result looks.
On either Windows or Unix, the key to this is the patch tool. It accepts as input a text file containing a unified or context diff, and applies the changes to the indicated files. It will ignore irrelevant bits like mail message headers or signatures, so you can simply save a patch mail message in a file and use that.
% cd httpd-docs-1.3/htdocs/
% patch < /tmp/index.html.en.patch
patching file `index.html.en'
You may need to define the environment variable POSIXLY_CORRECT in order for patches to apply properly, or use the -p0 option on the patch command. Experiment with both until you can successfully apply a patch. For example, you might add the following to your .bash_profile file on your Unix system:
POSIXLY_CORRECT=1
export POSIXLY_CORRECT
Again, if you have any difficulty, explain the problem and ask for help on the apache-docs mailing list.
Once you've applied a patch to your working files and reviewed it, it's generally a good idea to undo it again. If you don't, you run the risk of getting conflict messages on future synchronisations with the master files if the patch is altered at all before actually being committed.
To undo a patch, use the same steps as you did to apply it, but add the -R flag to the patch command:
% cd httpd-docs-1.3/htdocs/
% patch -R < /tmp/index.html.en.patch
patching file `index.html.en'
The -R means "reverse this patch".
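Since applying and undoing always come as a pair, the two steps can be wrapped together so the undo is not forgotten. This is a sketch under a few assumptions: review_patch is a hypothetical helper, it uses the -p0 option mentioned above, and it expects to be run from the directory the patch was made in. If the patch does not apply cleanly it stops without reversing anything, so check for reject files in that case.

```shell
#!/bin/sh
# review_patch is a hypothetical helper, not part of the patch tool.
# Usage: review_patch PATCHFILE [COMMAND [ARGS...]]
# It applies PATCHFILE in the current directory, runs COMMAND so you can
# inspect the result, then reverses the patch whatever COMMAND returned.
review_patch() {
    patchfile=$1
    shift
    patch -p0 < "$patchfile" || return 1
    status=0
    if [ "$#" -gt 0 ]; then
        "$@" || status=$?
    fi
    patch -p0 -R < "$patchfile" || return 1
    return "$status"
}

# Example, run from httpd-docs-1.3/htdocs/ (the pager is an arbitrary choice):
#   review_patch /tmp/index.html.en.patch more index.html.en
```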
Patches can be generated automatically by CVS, which is one reason people intending to submit changes are strongly encouraged to use it. Here's an example:
% cvs update # make sure we're up to date
% cvs diff -u index.html.en # generate the patch
[output from Figure 1 displayed]
Redirect the output of cvs diff to a file, and you have a patch to submit. As shown in the example above, unidiffs (unified diffs) are vastly preferred to context diffs.
To actually submit a patch, send a message to the [email protected] mailing list (you must be subscribed to it first). The subject of the message should start with '[PATCH]' and describe briefly what the patch is changing. In the body of the message, give a more detailed explanation of the patch, such as what it's altering and why, and follow it with the patch text itself.
DO NOT send patches as attachments! They should definitely be included as inline text. Attachments have a tendency to be difficult to apply, and often get tagged as being binary, making them difficult to read as well.
Don't try to mix operating systems, either! If you're working with a tree that you checked out on Windows, generate the patch and send the mail on Windows as well. Likewise, if you are working on Unix, generate the patch and send the mail from the Unix system. If you don't abide by this stricture, the different formats of the two systems' text files will cause confusion and require careful editing to undo. (I.e., files will end up with embedded carriage-return characters that will need to be manually removed.)
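Before mailing a patch it is cheap to check whether any carriage returns have crept into it. The sketch below does that with grep; has_cr is a hypothetical helper name.

```shell
#!/bin/sh
# has_cr is a hypothetical helper: it succeeds (exit status 0) when the
# named file contains at least one carriage-return character.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Example:
#   if has_cr /tmp/index.html.en.patch; then
#       echo "patch contains carriage returns; regenerate it" >&2
#   fi
```

If the check fires, the safest fix is to regenerate the patch on the system where the tree was checked out rather than editing the characters away.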
Figure 2 shows a sample patch submission message, including the inline patch text and the 'cover letter' explaining what it fixes and why it's being submitted. If the patch text looks familiar, that's because it was used in examples earlier in this article. Note the prefix '[PATCH]' in the message's subject line.
Figure 2: Sample Patch Submission Message
From: Rodent of Unusual Size <[email protected]>
To: [email protected]
Reply-to: [email protected]
Subject: [PATCH] to wrap long lines
The English 'welcome' page has a reeeally long line in it; this
is just a cosmetic change to wrap the text for easier maintainability.
Index: index.html.en
===================================================================
RCS file: /home/cvs/httpd-docs-1.3/htdocs/index.html.en,v
retrieving revision 1.4
diff -u -r1.4 index.html.en
--- index.html.en 1999/11/20 21:29:40 1.4
+++ index.html.en 2000/08/16 11:50:51
@@ -12,9 +12,12 @@
ALINK="#FF0000"
>
-
-<P>
-If you can see this, it means that the installation of the <A HREF=...1
+<p>
+If you can see this, it means that the installation of the
+<a href="http://www.apache.org/httpd">Apache web server</a> software on
+this system was successful. You may now add content to this directory
+and replace this page.
+</p>
<P><HR WIDTH="50%" SIZE="8">
1 This line from the original source was very long and has been truncated for display purposes. Part of the purpose of the patch is to fix that by wrapping the line.
Also note that the patch text is included inline as part of the message's body, and not as an attachment.
The rules governing the format and syntax of the Apache documentation are quite loose, but there are some guiding principles:
- HTML tags should be lowercase wherever possible. In other words, '<a href="foo.html">Link</a>' is preferred over '<A HREF="foo.html">Link</A>'. This is because lowercase letters compress more efficiently when documents are compressed.
- The HTML source should be wrapped at column 72 or less when possible. (This may not be possible for long URLs.)
- The documentation should be valid HTML, and pass the HTML validator at <URL:http://validator.w3.org/>.
- Container tags with optional closing tags (such as <p> and <li>) should always include the closing tag regardless.
- Use indentation and blank lines to make the HTML more easily readable, since readable HTML is easier to maintain.
- If you have to split an anchor tag across multiple lines, it is preferable to split it before the '>' character. For example:
Follow
<a href="some-really-long-and-involved-URL"
>this link</a>.
rather than
Follow <a href="some-really-long-and-involved-URL">
this link</a>.
This is because that newline and leading whitespace will be displayed by some browsers, giving the link a little tail.
- Remember that every byte of HTML has a network cost associated with it, and try to strike a balance between readability of the HTML source and the amount of whitespace and other nonsignificant bytes that will consume bandwidth.
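Two of the guidelines above, the 72-column limit and lowercase tags, are easy to check mechanically before you generate a patch. The following is a rough sketch, not an official project tool: check_style is a hypothetical helper, it treats any '<' followed by an uppercase letter as an uppercase tag, and it will also flag long URLs, which the guidelines allow.

```shell
#!/bin/sh
# check_style is a hypothetical helper, not an official project tool.
# For each file named on the command line it prints "file:line: message"
# for lines longer than 72 columns and for lines with an uppercase tag.
check_style() {
    LC_ALL=C awk '
        length($0) > 72 { printf "%s:%d: longer than 72 columns\n", FILENAME, FNR }
        /<\/?[A-Z]/     { printf "%s:%d: uppercase tag\n", FILENAME, FNR }
    ' "$@"
}

# Example:
#   check_style httpd-docs-1.3/htdocs/index.html.en
```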
Getting set up to work on any part of any of the Apache projects is a fairly involved process, and lots of things can go wrong or be misconfigured. Go ahead and follow the steps listed here; if you have any problems, send mail either to the apache-docs mailing list or to me personally.
You may find that you like the experience of working on a distributed project, and want to work on others. If so, there are always the other Apache projects, most of which can use help with their documentation (as well as their source code), and other open projects that welcome contributors, such as the PHP projects. Ask around on the mailing lists and someone will almost certainly point you to something that can use your skills.
Though the initial setup investment may be a little high in terms of time, remember that the Apache documentation is used by literally millions of people. Working on it to improve it is therefore improving the lives of all of those people -- and the advertisement of your talents doesn't hurt, either. ;-) So come aboard and be part of one of the most successful open-source projects in the world!
Got a Topic You Want Covered?
If you have a particular Apache-related topic that you'd like covered in a future article in this column, please let me know; drop me an email at <>. I do read and answer my email, usually within a few hours (although a few days may pass if I'm travelling or my mail volume is 'way up). If I don't respond within what seems to be a reasonable amount of time, feel free to ping me again.
About the Author
Ken Coar is a member of the Apache Group and a director and vice president of the Apache Software Foundation. He is also a core member of the Jikes open-source Java compiler project, a contributor to the PHP project, the author of Apache Server for Dummies, a lead author of Apache Server Unleashed, and is currently working with Ryan Bloom on a book for Addison-Wesley tentatively entitled Apache Module Development in C. He can be reached via email at <>.