An Introduction to the LWGate

This page explains the basic design concepts of the LWGate. To an experienced HTTP server administrator it may seem a bit too basic; the other pages in the installation process are written in a much more technical tone.

The LWGate is a CGI script, which means that it uses the Common Gateway Interface (CGI) specification, which allows WWW clients to send information to programs run by HTTP servers. If you are unfamiliar with CGI, installing the LWGate may be very difficult. The NCSA CGI Overview does a good job of defining CGI.

The LWGate does three general things: it presents pages of information to a WWW client; it processes data from the client and sends an appropriate mailing list command to a mailing list server; and it builds a hypertext interface from mailing list archives. Browsing the NetSpace LWGate should give you a fairly good idea of what the LWGate does.

One important thing to note as you browse the NetSpace LWGate is that the URLs all begin with "http://www.netspace.org/cgi-bin/lwgate". All pages whose URLs begin with this string are calling the NetSpace LWGate; the HTML page you see is generated on the fly by the script, which fills in the information specific to each list. The LWGate uses the PATH_INFO variable (as per the CGI specs) to know which "page" of information you want. For instance, when you ask for the URL that ends in "request-add.html", the LWGate knows that you want the request-add form, as opposed to the main LWGate page or some other page.
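As a rough illustration of this kind of PATH_INFO dispatch, here is a minimal Python sketch. It is not the LWGate's own code (the LWGate is a Perl script), and the function name `parse_path_info`, the "main" and "info" page labels, and the exact URL layout are assumptions made for the example.

```python
import os


def parse_path_info(path_info):
    """Split a CGI PATH_INFO string into (list_name, page).

    Assumed layout (hypothetical, for illustration only):
      ""  or "/"                   -> the main page
      "/listname/"                 -> that list's info page
      "/listname/request-add.html" -> a named page for that list
    """
    parts = [p for p in path_info.split("/") if p]
    if not parts:
        return (None, "main")
    if len(parts) == 1:
        # A lone component ending in ".html" is treated as a top-level page.
        if parts[0].endswith(".html"):
            return (None, parts[0])
        return (parts[0], "info")
    return (parts[0], parts[-1])


if __name__ == "__main__":
    # An HTTP server sets PATH_INFO to whatever follows the script name.
    print(parse_path_info(os.environ.get("PATH_INFO", "")))
```

The point is only that one script can serve many "pages": everything after ".../cgi-bin/lwgate" arrives in PATH_INFO, and the script decides what to generate from it.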

Also note that there are only a certain number of mailing lists available on the NetSpace LWGate and that each mailing list's info page is available from the URL "http://www.netspace.org/cgi-bin/lwgate/listname/". The LWGate is specifically designed only to provide a WWW interface for mailing lists that request this service. The LWGate stores which lists it makes available in the LWGate database. You will notice that the various mailing lists all have different information: some have certain commands unavailable, some are of different mailing list server types, and all have a different description. The LWGate database keeps track of this information. The format of this file is described later, in the page Create the LWGate Database.

When someone submits a form to send a mailing list command, the LWGate receives this data via the POST method. However, the LWGate is designed to also accept the GET method, and in fact uses the GET method in a few instances to pass information from one hypertext link to itself. The LWGate takes the data and constructs a mailing list command appropriate to the type of server that provides the particular mailing list. The LWGate sends this command via electronic mail, forging the return address to make it look as though the mail was sent from the user, and then gives the client a response.
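The flow just described can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the LWGate's actual code: the function names `read_form` and `build_command` and the server type labels are hypothetical, and the command strings are the commonly documented subscribe and unsubscribe forms for each server family, which may differ from what the LWGate really sends. Mail delivery is left out on purpose.

```python
from urllib.parse import parse_qs


def read_form(method, query_string, body):
    """Decode form data from either GET (query string) or POST (request body)."""
    raw = body if method.upper() == "POST" else query_string
    # Keep only the first value of each field for simplicity.
    return {key: values[0] for key, values in parse_qs(raw).items()}


def build_command(server_type, action, list_name, full_name=""):
    """Return a command line suited to the given kind of list server."""
    server_type = server_type.lower()
    if action == "subscribe":
        if server_type in ("listserv", "listproc"):
            # These servers expect the subscriber's real name after the list name.
            return f"SUBSCRIBE {list_name} {full_name}".strip()
        if server_type == "majordomo":
            return f"subscribe {list_name}"
    elif action == "unsubscribe":
        if server_type == "listserv":
            return f"SIGNOFF {list_name}"
        if server_type in ("listproc", "majordomo"):
            return f"unsubscribe {list_name}"
    raise ValueError(f"unsupported combination: {server_type!r}, {action!r}")


if __name__ == "__main__":
    form = read_form("GET", "list=demo-l&name=Ann+Lee", "")
    print(build_command("listserv", "subscribe", form["list"], form["name"]))
```

Keeping the server-specific syntax in one function means the forms presented to the user can stay identical across list server types.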

The other main function of the LWGate is to make a list's archives available through a hypertext interface. For this service to be available to a mailing list, the archives must be available from the filesystem on which the script is being run. While the LWGate can allow external lists to have commands processed by the LWGate, the hypertext archive feature is available only to lists that store their archives on your filesystem.

This function of the LWGate operates by generating an index, on the fly, to a particular mailing list's archive file. Here's how this is done:

The LWGate examines the first part of the archive file line by line. It ignores each line unless it matches the LISTSERV(TM) archive delimiter (a line with 73 equals signs), the ListProcessor/Majordomo archive delimiter (a line which begins with "From ", as in a standard UNIX mail file), or the RFC 1153 defined preamble separator for a digest (a line with 70 hyphens). Once the LWGate finds one of these delimiters, it parses the messages in blocks as appropriate to the archive type. For the LISTSERV(TM) and ListProcessor/Majordomo archives, it cycles through the file, looking at the blocks defined by the message delimiter. For the RFC 1153 compliant digest, it parses the digest as suggested by that document, grabbing blocks delimited by a blank line, a line of 30 hyphens, and another blank line; when it comes to the end-of-digest indicator (two lines, one which begins with "End of " and the next which is a series of asterisks), the LWGate ignores all further lines in the file.
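To make the procedure concrete, here is a minimal Python sketch of delimiter detection and block splitting. It is an approximation written for this page, not the LWGate's Perl code; the names `is_delimiter`, `split_archive` and `_close` are hypothetical, and the sketch treats any line made up solely of 30 or more hyphens as a digest separator.

```python
import re

LISTSERV_DELIM = "=" * 73               # LISTSERV(TM) archive delimiter
HYPHEN_RULE = re.compile(r"-{30,}")     # digest separators (30 or more hyphens)
ASTERISKS = re.compile(r"\*+")          # second line of the end-of-digest marker


def is_delimiter(line):
    """Return the archive type a delimiter line implies, or None."""
    if line == LISTSERV_DELIM:
        return "listserv"
    if line.startswith("From "):
        return "mbox"                    # ListProcessor/Majordomo style
    if HYPHEN_RULE.fullmatch(line):
        return "digest"
    return None


def _close(blocks, current):
    """Trim blank lines around a block and keep it if anything is left."""
    if current is None:
        return
    while current and not current[0].strip():
        current.pop(0)
    while current and not current[-1].strip():
        current.pop()
    if current:
        blocks.append(current)


def split_archive(text):
    """Return (archive_type, blocks), each block being a list of lines."""
    lines = text.splitlines()
    kind, blocks, current = None, [], None
    for i, line in enumerate(lines):
        found = is_delimiter(line)
        if kind is None:
            if found is None:
                continue                 # ignore lines before the first delimiter
            kind = found
        if (kind == "digest" and line.startswith("End of ")
                and i + 1 < len(lines) and ASTERISKS.fullmatch(lines[i + 1])):
            break                        # end of digest: ignore all further lines
        if found == kind:
            _close(blocks, current)
            # An mbox "From " line belongs to the message it starts.
            current = [line] if kind == "mbox" else []
        else:
            current.append(line)
    _close(blocks, current)
    return kind, blocks
```

Each returned block holds one message (headers, blank line, body), which is enough to build an index of subjects, authors and dates.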

I do not know in what format SmartList archives are stored. I suspect that it is the same format as ListProcessor and Majordomo archives. If this is the case, the LWGate should support SmartList archives. Otherwise, I'll need someone to give me more information about SmartList.

After creating an index to the archive/digest file, the LWGate then does one of two things: When a user is trying to see what articles are in the archive file, the LWGate displays an index to the messages contained in the file, sorted in a variety of manners. When a user is trying to view an individual article, the LWGate displays the body of that message. It also interprets the message and tries to translate all URLs mentioned into appropriate anchors.
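Translating URLs into anchors can be sketched in Python like this. The regular expression, the list of URL schemes and the function name `linkify` are assumptions for the example; the LWGate's own pattern may well differ.

```python
import html
import re

# Hypothetical pattern: a scheme, "://", then everything up to whitespace,
# an angle bracket or a double quote.
URL_RE = re.compile(r"\b(?:https?|ftp|gopher)://[^\s<>\"]+")


def linkify(body):
    """Escape a plain-text message body for HTML and wrap URLs in anchors."""
    out, pos = [], 0
    for match in URL_RE.finditer(body):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:!?)")
        end = match.start() + len(url)
        out.append(html.escape(body[pos:match.start()]))
        safe = html.escape(url, quote=True)
        out.append(f'<a href="{safe}">{safe}</a>')
        pos = end
    out.append(html.escape(body[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(linkify("See http://www.netspace.org/ for details."))
```

Escaping the surrounding text matters as much as the anchors: a message body is untrusted input, and it should never be copied into an HTML page verbatim.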

The LWGate can also allow a user to navigate down the filesystem from the archive directory (but not upwards). These subdirectories must match the regular expression the LWGate uses to identify appropriate files for the list; this regular expression, described later, is set for each list that will have its archives provided by the LWGate. One limitation of this is that the LWGate keeps track of directories with a backslash character (\), and thus will not be able to deal with subdirectories that contain that character.
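One way to picture these restrictions is the following Python sketch. How the LWGate actually encodes and checks directory paths is not described here, so the function name `resolve_subdir`, the backslash-joined key and the individual checks are all assumptions made for illustration.

```python
import re


def resolve_subdir(components, pattern):
    """Validate subdirectory names below the archive directory.

    Returns the names joined with a backslash (the hypothetical tracking
    key), or None if the request should be refused.
    """
    allowed = re.compile(pattern)
    for name in components:
        # Refuse upward navigation and names that would confuse the key.
        if name in ("", ".", "..") or "/" in name or "\\" in name:
            return None
        # Each name must match the per-list regular expression.
        if not allowed.search(name):
            return None
    return "\\".join(components)


if __name__ == "__main__":
    print(resolve_subdir(["1996", "jan"], r"^[\w.-]+$"))
```

Because the backslash serves as the separator in the key, a directory whose name contains one cannot be represented, which is the limitation mentioned above.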

The LWGate source distribution contains a few files. The README file, as suggested by its name, explains the purpose of each file in the distribution and points you to these directions for assistance in installing the script. The lwgate file is the script source itself. The LICENCE file is your copy of the GNU General Public Licence. Finally, there are two image files, lwgate.gif and lwgate-button.gif, which are used by the script when displaying certain pages. The instructions to come will explain what to do with these files.

NB: CGI scripts present a possible security hole for HTTP servers. As a system administrator, I do not recommend installing any CGI script unless you understand what each line of code is doing. If you agree that this is a prudent and necessary task, I have tried to make your job easier by extensively commenting the entire script.

If you are familiar with CGI, Perl, the WWW, and HTTP, the rest of the directions should (hopefully) be fairly explicit. The next step is to obtain the LWGate source.

Copyright 1996, David W. Baker.