Web Alpine Technical Notes

Why Web Alpine

Web Alpine was originally conceived as a means of providing reliable, ubiquitous, and consistent access to UW email facilities via the Internet Message Access Protocol (IMAP). As with Pine, it is intended to present a simple, approachable interface that can be navigated with minimal familiarity. It is also intended to be as efficient as possible, in both page service and user navigation, while meshing as neatly as possible with the central campus IMAP infrastructure.

How Web Alpine

Web Alpine's foundation rests firmly on Unix Alpine's many years of development and deployment. Its core element is a per-session server (or serverette) that provides data to the scripts generating the web pages the user sees and manages the connection to the IMAP mail server where the user's mail resides. This serverette is built from the same Alpine sources used to build the Unix Alpine and PC-Alpine mail user agents. Thus, it inherits most of the efficiency that has been built into Alpine, such as data caching and request ordering and grouping, as well as many of its features and functions.

Components

Browser
Performs usual browser role: sends requests, renders responses, and displays or hands off for display non-HTML data.
Web Server
Performs extraordinary web server role: produces HTML responses to requests, maintains relationship between user's browser and IMAP server for the life of the user's session.
IMAP Server
Performs usual mail server role: provides access to the various bits and pieces of mail data via the Internet Message Access Protocol.

Protocols, Services, Technologies

HTTP(s)
Protocol between browser and web server. Common language allowing users to get at their mail from the widest variety of platforms and locations.
IMAP(s)
Broadly implemented distributed mail protocol which permits users to access mail from a variety of clients: Web Alpine, Pine, PC-Pine, Outlook Express, Eudora.
Pubcookie
Secure, distributed web-based authentication service optionally used as the mechanism to authenticate users in the Web Alpine login process.
GSSAPI
SASL authentication mechanism used to establish session between Web Alpine serverette and IMAP server on behalf of Pubcookie authenticated user.
Tcl
Tool Command Language. Chosen CGI scripting language for the Web Alpine page templates. Reasonable language, reasonably implemented, reasonably supported, particularly well suited for exporting functionality in pre-existing C-based tools.
Unix Domain Sockets
Communication channel used to pass requests/responses between the Tcl interpreters processing scripts and the Web Alpine serverette.

Web Alpine Distribution Layout

The Web Alpine application consists of three main components: the CGI scripts that generate pages containing the user's email data, a serverette running on the HTTP server spanning the life of the user's session, and a couple of libraries to aid page generation and serverette communication.

Web Alpine is packaged as part of the Alpine Distribution and its source resides within the web/ subdirectory. Subdirectories are organized by the service they provide, and break down as follows:

bin/ Contains scripts and generated binaries that initiate and maintain Web Alpine sessions.
cgi/ Contains scripts referenced by the browser to generate the Web Alpine interface. Subdirectories within organize scripts by function, and allow for suitable scoping of session key.
  alpine/ Browser's view of Web Alpine application. Contains scripts to generate pages the user interacts with.
  session/ Scripts referenced by the browser to manage session initiation.
  images/
  sounds/
  pub/
    Scripts and data that don't require restricted access control.
config/ Server configuration scripts, default settings.
lib/ Contains components to support IPC, CGI processing and HTML generation.
src/ Contains source files for Web Alpine's binary components which will be linked against the pith/ library components.
  alpined.d/ Source files for alpined serverette. This is a per-user, per-session server that services requests for email data from the CGI scripts via UNIX domain socket.
  pubcookie/ Source files providing UW Pubcookie web authentication support.
  cgi.tcl-1.10/ Source for Tcl library providing CGI/HTML support.
detach@ Typically a symbolic link to a subdirectory within /tmp. It is used to hold temporary copies of message attachments as they're downloaded to the browser. In the Pubcookie case, it must have world read/write/execute mode set due to alpined pseudo-uid partitioning.

Web Alpine Configuration

CGI Script Configuration

Most Web Alpine configuration is contained in the config/alpine.tcl configuration file. Most of the interesting settings are toward the top of the file and pretty much suggest what they set. The most important settings, though, are probably:
_wp(fileroot)
which defines where the Web Alpine application was unpacked on your system, and
_wp(urlprefix)
which defines where browsers can find Web Alpine's CGI scripts. This is set to the null string if the server's DocumentRoot is synonymous with the root of Web Alpine's CGI directory. Otherwise, it's typically set to the subdirectory exposed through an Alias in the web server's configuration.
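
As an illustration of the two urlprefix cases described above, here is a minimal Python sketch of how a prefix and a script path might combine into the URL a browser requests. It is not Web Alpine code: the helper name script_url, the /webmail alias, and the session/greeting.tcl location are all assumptions made for the example.

```python
def script_url(urlprefix: str, script_path: str) -> str:
    """Join a URL prefix and a script path with exactly one slash between them."""
    prefix = urlprefix.rstrip("/")
    return f"{prefix}/{script_path.lstrip('/')}"


# Case 1: DocumentRoot is the CGI directory itself, so the prefix is empty.
root_case = script_url("", "session/greeting.tcl")

# Case 2: the CGI directory is reached through a hypothetical Alias of /webmail.
alias_case = script_url("/webmail", "session/greeting.tcl")
```

With an empty prefix the script is addressed from the server root; with an alias the same script path simply gains the alias as its leading component.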

Web Server Configuration

Typically, a Web Alpine server is used solely to serve Web Alpine pages. That is, no other hosting is done by the server. Thus, it is usually convenient to configure the web server to treat the web/cgi/ directory within the distribution as the root directory of the pages it serves (or to move those files and directories into the web server's document root). Similarly, it may be necessary to configure the web server to process CGI scripts from its root (since this should be a dedicated server, this shouldn't matter).

IMAP Server Configuration

Generally, no configuration is required of the IMAP server.
However, in the Pubcookie case the Web and IMAP servers need to coordinate the existence of a meta-user, such as webalpine, used for SASL proxy authentication. For UW imapd this means creating an account on the IMAP server that is a member of the "mailadm" group. A SASL GSSAPI authentication handshake is used between the Web and IMAP servers when the web server initiates a session on behalf of a particular user.

User Configuration

Since no user-initiated local file or mailbox access is permitted by (much less compiled into) alpined, user configuration and data files are stored using Pine's remote pinerc and address book capabilities. The configuration settings in web/config/ are used to set per-user defaults and direct Web Alpine toward the user's configuration settings on the IMAP server. Similarly, the default address book is stored as an IMAP folder on the server. Concurrent Web Alpine, Unix Pine and PC-Pine users who would like a consistent mail environment can easily configure their other Pines to use the remote_pinerc and remote_addrbook on their IMAP server.

Browser Configuration

A Web Alpine goal is to run reasonably on as many browsers as possible. Toward that end, little beyond basic table and form support is required of the browser. And while JavaScript is not a requirement to access Web Alpine functions, when it is enabled in the browser some enhanced capability is available, such as keyboard-accessible commands and implicit selection of various listbox choices.

Session Lifecycle

  1. User requests greeting.tcl, which consists of a form to be filled out with any necessary authentication tokens and mail server choice.
    In the Pubcookie case, the user is not presented the username/password option unless they have chosen to connect to a mail server outside the locally managed, predefined set.
  2. User submits form with authentication tokens and initial mail server.
    By default, the submitted authentication tokens consist of a username/password pair. When Pubcookie is in use, the browser sends the Pubcookie-specific authentication token with the form submission.
  3. Web Alpine CGI logon script processes form and instantiates serverette. The logon script:
    1. validates form data
    2. generates session key
    3. launches the Web Alpine serverette, alpined, passing session key via stdin
      serverette reads session key, creates Unix domain socket, and enters command loop waiting for input on the fresh socket
    4. sends serverette the command to establish a session with the requested IMAP server on behalf of the given user
      By default, the logon script simply passes the username/password pair to the serverette, where it's the serverette's job to present them to the IMAP server for validation. If the IMAP server declines, a "bad user or password" error page is generated and sent to the browser and the serverette exits.

      The Pubcookie case is a bit more involved. The CGI scripts rely on the netid specified in the REMOTE_USER environment variable, which is set as a side effect of Pubcookie module processing. The trusted netid is not passed directly to the serverette; rather, all CGI processing is done via a setuid Tcl interpreter. The uid is unique to each netid on the system, but not related to any netid/uid binding on general access systems. Running the CGI scripts and serverette under a netid-bound uid provides a convenient way to implement the authentication mechanism between the serverette and the IMAP server, as well as a useful way to partition serverettes such that one compromised serverette can't affect others.

    5. With a valid IMAP session established, the logon script redirects the user's browser to the initial Message List page.
  4. The user navigates/manipulates their email environment based on web pages generated by Tcl script templates which were fleshed out via requests to the alpined serverette. The serverette in turn may draw on its cache of IMAP data, make new requests of the IMAP server, post messages via SMTP or the local mail queue, formulate LDAP queries, or perform other tasks as required.
    Note: Web Alpine sessions run as long as the user's browser requests pages. In the absence of user interaction, Web Alpine will self-refresh every few minutes to maintain the session. Sessions only end when the user logs out or closes the browser.
  5. User ends session and confirms
    Note: If the user simply closes their browser, the serverette will self-exit after 30 minutes.
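
The lifecycle above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the alpined implementation: the serverette runs as a thread rather than a separate process, it receives the socket path as an argument instead of reading a session key from stdin, and the command names (open-mailbox, logout) and reply format are invented for the example. What it does show is the shape of the exchange: a per-session key names a Unix domain socket, the serverette loops on that socket, and it exits on logout or after an idle period with no requests.

```python
import os
import secrets
import socket
import tempfile
import threading


def serverette(sock_path, ready, idle_timeout=1800.0):
    """Toy stand-in for a per-session server: one socket, one command loop."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(sock_path)
        srv.listen(1)
        srv.settimeout(idle_timeout)  # assumed 30-minute idle window
        ready.set()
        try:
            while True:
                try:
                    conn, _ = srv.accept()
                except socket.timeout:
                    break  # no request within the idle window: self-exit
                with conn:
                    # A single recv is enough for these short messages.
                    command = conn.recv(4096).decode().strip()
                    if command == "logout":
                        conn.sendall(b"bye\n")
                        break
                    conn.sendall(f"ok {command}\n".encode())
        finally:
            os.unlink(sock_path)  # remove the socket when the session ends


def request(sock_path, command):
    """What a page-generating script would do: connect, ask, read the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
        cli.connect(sock_path)
        cli.sendall(command.encode() + b"\n")
        return cli.recv(4096).decode().strip()


def demo():
    session_key = secrets.token_hex(8)  # hypothetical key format
    sock_path = os.path.join(tempfile.mkdtemp(), session_key)
    ready = threading.Event()
    worker = threading.Thread(target=serverette, args=(sock_path, ready))
    worker.start()
    ready.wait()  # do not connect before the socket exists
    replies = [request(sock_path, "open-mailbox INBOX"),
               request(sock_path, "logout")]
    worker.join()
    return replies, os.path.exists(sock_path)


if __name__ == "__main__":
    print(demo())
```

Each page request opens a fresh connection, which mirrors the short-lived nature of a CGI process talking to a long-lived serverette.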

Web Server Considerations

Web Alpine has been developed under Apache (versions 1.x through 2.x). However, because the intent was to be as flexible and manageable as possible, little aside from SSL and basic CGI services is required of the web server. It's conceivable Web Alpine could be made to run under another server, or even Windows and IIS, modulo the Unix domain socket communication issues between the CGI scripts and alpined.

The downside of relying on plain CGI, of course, is that it requires somewhat redundant parses of the configuration and CGI-helper library with each page request. It's a trade-off. A slightly more efficient approach might be to create an Apache module that understands requests and passes them directly to the corresponding alpined, which would execute the script and return HTML directly. However, the additional cost in installation and management complexity stands to offset those gains.

Similarly, it is assumed that the Web Alpine service is provided on a black box server. That is, a host that has no general user accounts. Unmodified, Web Alpine creates the Unix domain sockets, named pretty directly after the user's session key, in the /tmp directory. In addition, depending on the nature of the connection, the session key may also be exposed via ordinary httpd logging. Important safety tip: make sure ordinary users do not have access to the Web Alpine system or httpd log files. In the future those sockets may be moved into an access-restricted subdirectory, but the httpd log file record may be harder to conceal reliably.
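
To make the access-restricted subdirectory idea concrete, here is a minimal Python sketch that creates a directory only its owner can enter and binds a session socket inside it. It is an assumption-laden illustration rather than anything Web Alpine does today: the function name bind_private_socket, the "sockets" directory name, and the sample key are hypothetical, and a real deployment would also have to account for the CGI scripts and serverette running under the same uid.

```python
import os
import socket
import stat
import tempfile


def bind_private_socket(base_dir, session_key):
    """Bind a Unix domain socket inside a directory with mode 0700."""
    sock_dir = os.path.join(base_dir, "sockets")  # hypothetical directory name
    os.makedirs(sock_dir, mode=0o700, exist_ok=True)
    os.chmod(sock_dir, 0o700)  # enforce the mode regardless of the umask
    path = os.path.join(sock_dir, session_key)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv, path


if __name__ == "__main__":
    server, sock_path = bind_private_socket(tempfile.mkdtemp(), "example-key")
    mode = stat.S_IMODE(os.stat(os.path.dirname(sock_path)).st_mode)
    print(oct(mode), sock_path)
    server.close()
```

Because other users cannot traverse the directory, they can neither list the socket names (and so learn session keys) nor connect to the sockets.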

CGI Considerations

Most Web Alpine pages are generated via CGI scripts written in Tcl. A library of Tcl functions called cgi.tcl is used heavily to help with the HTML generation. Of course, this means that a web developer who wishes to change or enhance Web Alpine pages will have to acquire some Tcl knowledge. Additionally, the library has one or two interface inconsistencies (not unlike Tcl, but that's another discussion), which will mean a somewhat steeper learning curve, but we think this is only slightly more difficult than the amount of Tcl one would have to learn in a more template-oriented approach. We think the script's logic flow is much easier to understand and maintain than the substitution and recursion necessary in an HTML-template approach.

alpined Considerations

Tcl, incidentally, is also the language used to move data in and out of the Web Alpine serverette, alpined. Tcl lends itself nicely to string-oriented data, and provides a convenient, simple interface to export functionality contained in C-based utilities.

HTML Considerations

Much of the HTML generated by Web Alpine does layout based on tables. This somewhat sub-optimal state mostly has to do with when the Web Alpine development effort was initiated and the concurrent browser chaos. The goal is to move scripts toward generating more CSS-oriented layout over time.

Similarly, earlier versions of Web Alpine relied heavily on JavaScript in a misguided attempt to make the browser-based experience feel as similar as possible to a dedicated desktop application. Beyond the fact that JavaScript support varied widely across browsers at the time, it should also have been obvious that by presenting a familiar desktop-like interface, we also set desktop-like performance expectations which we had no hope of meeting.

Clustering Considerations

Since alpined persists for the life of the user's session, the session is bound to the particular server that initiated it. In order to provide service to a sizeable constituency, it may be necessary to spread usage across a group or cluster of servers. There exist numerous strategies to distribute connecting users across a cluster, such as an initial server that redirects randomly to one of the servers in a cluster, DNS-based randomizing, or load balancing. The first can lead to web server names users find distracting (though this doesn't appear to be too much of an issue), and the latter two could lead to misdirected requests over time (or as loads change), so it is necessary for servers to either redirect or proxy requests to the appropriate server.

As a basic allowance for such installations, Web Alpine's session key contains the hostname of the server that created it. Similarly, the access routines that parse the key for access to the appropriate alpined are aware of the hostname and will redirect misguided requests to the appropriate server. This isn't particularly satisfying, since each redirect costs additional network round trips.
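
The hostname-in-key idea can be sketched as follows. The actual layout of Web Alpine's session key is not described here, so the "hostname:token" format, the function names, the example hostnames, and the script path are all assumptions chosen for illustration.

```python
import secrets
from typing import Optional


def make_session_key(hostname: str) -> str:
    """Build a key that records which server owns the session (assumed format)."""
    return f"{hostname}:{secrets.token_hex(16)}"


def redirect_target(session_key: str, local_host: str,
                    request_path: str) -> Optional[str]:
    """Return None if this server owns the session, else the URL to redirect to."""
    owner, _, _token = session_key.partition(":")
    if owner == local_host:
        return None  # the serverette lives here; handle the request locally
    return f"https://{owner}{request_path}"


if __name__ == "__main__":
    key = make_session_key("wa2.example.edu")
    # A request that lands on the wrong cluster member is sent to the owner.
    print(redirect_target(key, "wa1.example.edu", "/alpine/index.tcl"))
```

A request that arrives at the owning server is handled locally; one that arrives anywhere else costs the browser an extra round trip to follow the redirect.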

One alternative that saves network performance at the expense of slightly increased server load is to introduce a directory above the web alpine/ script directory and then add one alongside that new directory for each server in the cluster. It's then possible to use the Apache directive to proxy requests within the scope of those directories to the corresponding server.

Security Considerations

Future Plans

Through the semi-formal usability testing process, early testing phases and regular campus use, overall response has been very favorable. Usability testing concurrent with ongoing feature development and interface adjustments continues to hone rough edges, particularly where the drive for performance has led to less intuitive interface choices. We plan to continue emphasis on the refine/feedback loop as we roll in many of the features Pine users have come to appreciate.

Performance, in terms of both user-perceived response time and users per web server, is always a concern, but must, of course, be balanced against additional maintenance and complexity costs. Less obvious complicating factors must also be considered, such as alpined process partitioning and exposure of the cookie containing the session key in the face of malicious HTML attachments. We plan, of course, to continue exploring various methods to improve performance.

Appendix: Installation Tests

If you can get the login greeting page and then log into a session, things should, for the most part, be working. Some things you might try to verify a complete installation include: