Chapter 11
Gateways
· Using Existing Network Applications
· Running a Program Using C
· Parsing the Output in C
· Running a Program Using Perl
· Parsing the Output in Perl
· Finger Gateway
· Security
· True Client/Server Gateways
· Network Programming
· A Direct Finger Gateway
· E-Mail Gateway
· A Simple Mail Program (C)
· Extending the Mail Program (Perl)
· Summary
Several
different types of network services are available on the Internet, ranging
from e-mail to database lookups to the World Wide Web. The ability to use
one service to access other services is sometimes convenient. For example,
you might want to send e-mail or post to USENET news from your Web browser.
You might also want to do a WAIS search and have the results sent to your
Web browser.
A gateway is a link between
these various services. Think of a gateway between two different pastures:
one representing one service and the other representing another. In order to
access one service through another, you must pass through the gateway.
Very often, your CGI programs act
as gateways between the World Wide Web and other services. After all, CGI
stands for Common Gateway Interface, and it was designed so that you could
use the World Wide Web as an interface to other services and applications.
In this chapter, you see a couple
of examples of gateway applications, beginning with a simple finger gateway.
You learn how to take advantage of existing client applications within your
CGI applications, and you learn the related security issues. You see an
example of developing a gateway from scratch rather than using an existing
application. Finally, you learn how to design a powerful e-mail gateway.
Using Existing Network Applications
Network applications all work in a
similar fashion. You need to know how to do two things: connect to the
service and communicate with it. The language that you use to communicate
with the service is called the protocol. You have already seen one
type of protocol in great detail: the web or http protocol.
Most network services already have
clients that know how to properly connect to the server and that understand
the protocol. For example, any Web browser understands the http protocol. If
you want to get information from a Web server, you don't need to know the
protocol. All you need to do is tell the browser what information you want,
and the browser does all the communicating for you.
If you already have a suitable
client for various services, you can easily write a Web gateway that gets
input from the browser, calls the program using the input, and sends the
output back to the browser.
Because the existing client does
all the communicating for you, your CGI program only needs to do a few
things:
· Get the input from the browser
· Call the program with the specified input
· Possibly parse the output from the program
· Send the output to the browser
The first and last steps are easy.
You know how to get input from and send output to the browser using CGI. The
middle two steps are slightly more challenging.
Running a Program Using C
Several ways exist to run a
program from within another program; some of them are platform specific. In
C, the standard function for running other programs is system() from
stdlib.h. The parameters for system() and the behavior of this function
usually depend on the operating system. In the following examples, assume
the UNIX platform, although the concepts can apply generally to all
platforms and languages.
On UNIX, system() accepts a single string containing the program and its
command-line parameters, exactly as you would type them on the command
line. For example, if you wanted to write an application that printed the
contents of your current directory, you could use the system() function to
call the UNIX program /bin/ls. The program myls.c in Listing 11.1 does just
that.
-----------------------------------
Listing 11.1. The myls.c
program.
#include <stdlib.h>
int main()
{
system("/bin/ls"); /* assumes ls resides in the /bin directory */
}
----------------------------------------
Tip: When you use system() or any other function that calls programs,
remember to use the full pathname. This measure provides a reliable way to
make sure the program you want to run is run, and it reduces the security
risk by not depending on the PATH environment.
When the system() function is
called on UNIX, the C program spawns a shell process (usually /bin/sh) and
tells the shell to use the input as its command line. Although this is a
simple and portable way to run programs, some inherent risks and extra
overhead occur when using it in UNIX. When you use system(), you spawn
another shell and run the program rather than run the program directly.
Additionally, because UNIX shells interpret special characters (metacharacters),
you can inadvertently allow the user to run any program he or she wishes.
For more information about the risks of the system() call, see the
"Security" section later in this chapter.
Running programs directly in C on
UNIX platforms is more complex and requires using the exec() family of
functions from unistd.h. Descriptions of each exec() function are in
Table 11.1.
Table 11.1. The exec() family.
| Function | Description |
| execv() | The first argument indicates the path to the program. The second is a null-terminated array of pointers to the argument list; the first element of the array is usually the name of the program. |
| execl() | The first argument is the path to the program. The remaining arguments are the program arguments, ending with a null pointer; the second argument is usually the name of the program. |
| execvp() | Same as execv(), except the first argument stores the name of the program, and the function searches the PATH environment for that program. |
| execlp() | Same as execl(), except the first argument stores the name of the program, and the function searches the PATH environment for that program. |
| execle() | Same as execl(), except it includes the environment for the program. The environment is specified following the null pointer that terminates the argument list. |
In order to execute a program
directly under UNIX, you need to create a new process for it. You can do
this using the fork() function. After you create a new process (known as the
child), your program (the parent) must wait until the child is finished
executing. You do this using the wait() function.
Using the exec() function, I
rewrote myls.c, shown in Listing 11.2. The program is longer and more
complex, but it is more efficient. If you do not understand this example,
you might want to either read a book on UNIX system programming or just
stick to the system() function, realizing the implications.
-------------------------------------
Listing 11.2. The myls.c
program (using exec()).
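#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    pid_t pid;
    int status;

    if ((pid = fork()) < 0) {
        perror("fork");
        exit(1);
    }
    if (pid == 0) { /* child process */
        /* the argument list must end with a null pointer */
        execl("/bin/ls","ls",(char *)NULL);
        perror("execl"); /* reached only if execl() fails */
        exit(1);
    }
    /* parent process: wait for the child to finish */
    while (wait(&status) != pid) ;
    return 0;
}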
--------------------------------------
Parsing the Output in C
These programs print their output,
unparsed, to stdout. Although most of the time this is satisfactory,
sometimes you might want to parse the output. How do you capture the output
of these programs?
Instead of using the system()
function, you use the popen() function, which uses UNIX pipes (popen stands
for pipe open). UNIX users will be familiar with the concept of the pipe.
For example, if you had a program that could manipulate the output of the ls
command, in order to feed the output to this program you could use a pipe
from the command line (| is the pipe symbol).
ls | dosomething
This step takes the output of ls
and feeds it into the input of dosomething.
The popen() function emulates the
UNIX pipe from within a program. For example, if you wanted to pipe the
output of the ls command to the parse_output() function, your code might
look like the following:
FILE *output;
output = popen("/bin/ls","r");
parse_output(output);
pclose(output);
popen() works like system(),
except instead of sending the output to stdout, it sends the output to a
file handle and returns the pointer to that file handle. You can then read
from that file handle, parse the data, and print the parsed data to stdout
yourself. The second argument of popen() determines whether you read from or
write to a pipe. If you want to write to a pipe, you would replace "r" with
"w". Because popen() works like system(), it is also susceptible to the same
security risks as system(). You should filter any user input for
metacharacters before using it inside of popen().
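The parse_output() function used above is left undefined in this chapter.
As one possible illustration, here is a minimal sketch of such a function.
The name parse_output_to() and its two-argument form are assumptions made
for this example. It copies everything from one file handle to another and
replaces the characters <, >, and & with HTML entities, which is a common
need when a program's output ends up inside a Web page.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical example function (not part of the chapter's libraries).
   Copies every character from in to out, replacing <, >, and & with
   HTML entities. Returns the number of characters read from in. */
long parse_output_to(FILE *in, FILE *out)
{
    int c;
    long count = 0;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        count++;
        switch (c) {
        case '<': fputs("&lt;", out);  break;
        case '>': fputs("&gt;", out);  break;
        case '&': fputs("&amp;", out); break;
        default:  putc(c, out);        break;
        }
    }
    return count;
}
```

To use it with the earlier fragment, you would pass it the handle returned
by popen("/bin/ls","r") along with stdout, and then call pclose() on the
handle as before.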
Because popen() suffers from the
same problems as system(), you might sometimes prefer to use the pipe()
function in conjunction with an exec() function. pipe() takes an array of
two integers as its argument. If the call works, the array contains the read
and write file descriptors, which you can then manipulate. pipe() must be
called before you fork and execute the program. Again, this process is
complex. If you don't understand this, don't worry about it; you probably
don't need to use it. The equivalent technique in Perl appears later in this
chapter, in "Parsing the Output in Perl."
In each of these examples, the
output is buffered by default, which means that the system stores the output
until it reaches a certain size before sending the entire chunk of output to
the file handle. This process usually operates faster and more efficiently
than sending one byte of output to the file handle at a time. Sometimes,
however, buffered output appears out of order or is lost; for example, a
program you run might write directly to stdout while text that your own
program printed earlier is still sitting in the buffer. To prevent this from
happening, you need to tell your file handles to flush their buffers. In C,
you do this using the fflush() function, which writes out everything
currently buffered for the given file handle. For example, to flush stdout
before running another program, you would use the following call:
fflush(stdout);
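The pipe() approach described earlier in this section can be sketched as
follows. This is only an illustration under stated assumptions: the function
name capture_output() is hypothetical, it passes at most one argument to the
program, and it stores the output in a fixed-size buffer rather than parsing
it as it arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at path (with one optional
   argument, which may be NULL) without spawning a shell, and copies up
   to size-1 bytes of its standard output into buf. Returns the number
   of bytes stored, or -1 if the pipe or the fork fails. */
long capture_output(const char *path, const char *arg,
                    char *buf, size_t size)
{
    int fd[2];             /* fd[0] is the read end, fd[1] the write end */
    pid_t pid;
    int status;
    size_t total = 0;
    ssize_t n;

    assert(path != NULL && buf != NULL && size > 0);

    if (pipe(fd) < 0)      /* the pipe must exist before fork() */
        return -1;
    if ((pid = fork()) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {        /* child: send stdout into the pipe */
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);
        /* if arg is NULL, it simply ends the argument list early */
        execl(path, path, arg, (char *)NULL);
        _exit(127);        /* reached only if execl() fails */
    }
    /* parent: read what the child writes until end of file */
    close(fd[1]);
    while (total < size - 1 &&
           (n = read(fd[0], buf + total, size - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fd[0]);
    waitpid(pid, &status, 0);
    return (long)total;
}
```

A call such as capture_output("/bin/ls", "/", buf, sizeof buf) would leave
the directory listing in buf, ready to be parsed before you print it.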
Running a Program Using Perl
The syntax for running a program
within a Perl program is less complex than in C, but no less powerful. Perl
also has a system() function, which usually works exactly like its C
equivalent. myls.pl in Listing 11.3 demonstrates the Perl system() function.
Listing 11.3. The myls.pl
program.
#!/usr/local/bin/perl
system("/bin/ls");
As you can see, the syntax is
exactly like the C syntax. Perl's system() function, however, will not
necessarily spawn a new shell. If you pass the program and each of its
arguments to system() as separate parameters, Perl runs the program
directly, which is equivalent to forking and executing the program in C. For
example, Listing 11.4 shows the Perl code for listing the contents of the
root directory and Listing 11.5 shows the C equivalent.
Listing 11.4. The lsroot.pl
program.
#!/usr/local/bin/perl
system "/bin/ls","/";
Listing 11.5. The lsroot.c
program.
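#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    pid_t pid;
    int status;

    if ((pid = fork()) < 0) {
        perror("fork");
        exit(1);
    }
    if (pid == 0) { /* child process */
        /* the argument list must end with a null pointer */
        execl("/bin/ls","ls","/",(char *)NULL);
        perror("execl"); /* reached only if execl() fails */
        exit(1);
    }
    /* parent process: wait for the child to finish */
    while (wait(&status) != pid) ;
    return 0;
}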
You will find it considerably
easier to obtain the efficiency and security of forking and then executing a
program in Perl than in C. Note, however, that if you had used the
following:
system("/bin/ls /");
instead of this:
system "/bin/ls","/";
then the system call would have
been exactly equivalent to the C system call; in other words, it would spawn
a shell.
Note: You can also run programs directly in Perl using fork() and exec().
The syntax is the same as the C syntax using fork() and any of the exec()
functions. Perl has only one exec() function, however, which is equivalent
to C's execvp().
The exec() function by itself is equivalent to system() except that it
terminates the currently running Perl script. In other words, if you
included all of the arguments in one argument in exec(), it would spawn a
shell and run the program, and the Perl script would not resume after the
program finished. To prevent exec() from spawning a shell, separate the
arguments just as you would with system().
Parsing the Output in Perl
Capturing and parsing the output
of programs in Perl is also simpler than in C. The easiest way to store the
output of a program in Perl is to call it using back ticks (`). Perl spawns a
shell and executes the command within the back ticks, returning the output
of the command. For example, the following spawns a shell, runs /bin/ls, and
stores the output in the scalar $files:
$files = `/bin/ls`;
You can then parse $files or
simply print it to stdout.
You can also use pipes in Perl
using the open() function. If you want to pipe the output of a command (for
example, ls) to a file handle, you would use the following:
open(OUTPUT,"ls|");
Similarly, you could pipe data
into a program using the following:
open(PROGRAM,"|sort");
This syntax is equivalent to C's
popen() function and suffers from similar problems. In order to read from a
pipe without opening a shell, use
open(OUTPUT,"-|") || exec "/bin/ls";
To write to a pipe, use
open(PROGRAM,"|-") || exec "/usr/bin/sort";
Make sure each argument for the
program gets passed as a separate argument to exec().
To unbuffer a file handle in Perl,
use
select(FILEHANDLE); $| = 1;
For example, to unbuffer the
stdout, you would do the following:
select(stdout); $| = 1;
Finger Gateway
Using the methods described in the
preceding section, you can create a Web gateway using existing clients.
Finger serves as a good example. Finger enables you to get certain
information about a user on a system. Given a username and a hostname (in
the form of an e-mail address), finger will contact the server and return
information about that user if it is available.
The usage for the finger program
on most UNIX systems is
finger username@hostname
For example, the following returns
finger information about user eekim at the machine hcs.harvard.edu:
finger eekim@hcs.harvard.edu
You can write a Web-to-finger CGI
application, as shown in Listings 11.6 (in C) and 11.7 (in Perl). The
browser passes the username and hostname to the CGI program finger.cgi,
which in turn runs the finger program. Because finger already returns the
output to stdout, the output appears on the browser.
You want the finger program to be
flexible. In other words, you should have the capability to specify the user
and host from the URL, and you should be able to receive information from a
form. Input for finger.cgi must be in the following form:
finger.cgi?who=username@hostname
If you use finger.cgi as the
action parameter of a form, you must make sure you have a text field with
the name who.
Listing 11.6. The finger.cgi.c
program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "cgi-lib.h"
#include "html-lib.h"
#include "string-lib.h"
#define FINGER "/usr/bin/finger "
void print_form()
{
html_begin("Finger Gateway");
h1("Finger Gateway");
printf("<form>\n");
printf("Who? <input name=\"who\">\n");
printf("</form>\n");
html_end();
}
int main()
{
char *command,*who;
llist entries;
html_header();
if (read_cgi_input(&entries)) {
if (cgi_val(entries,"who")) {
who = newstr(escape_input(cgi_val(entries,"who")));
html_begin("Finger results");
printf("<pre>\n");
command = malloc(strlen(FINGER) + strlen(who) + 1);
strcpy(command,FINGER);
strcat(command,who);
fflush(stdout);
system(command);
printf("</pre>\n");
html_end();
}
else
print_form();
}
else
print_form();
list_clear(&entries);
}
Listing 11.7. The finger.cgi
program (Perl).
#!/usr/local/bin/perl
require 'cgi-lib.pl';
select(stdout); $| = 1;
print &PrintHeader;
if (&ReadParse(*input)) {
if ($input{'who'}) {
print &HtmlTop("Finger results"),"<pre>\n";
system "/usr/bin/finger",$input{'who'};
print "</pre>\n",&HtmlBot;
}
else {
&print_form;
}
}
else {
&print_form;
}
sub print_form {
print &HtmlTop("Finger Gateway");
print "<form>\n";
print "Who? <input name=\"who\">\n";
print "</form>\n";
print &HtmlBot;
}
The C and Perl versions of
finger.cgi are remarkably similar. Both parse the input, unbuffer stdout,
and run finger. The two versions, however, differ in how they run the
program. The C version uses the system() call, which spawns a shell and runs
the command. Because it spawns a shell, it must escape all metacharacters
before passing the input to system(); hence, the call to escape_input(). In
the Perl version, the arguments are separated so it runs the program
directly. Consequently, no filtering of the input is necessary.
You can avoid filtering the input
in the C version as well, if you avoid the system() call. Listing 11.8 lists
a version of finger.cgi.c that uses execl() instead of system(). Notice that
in this version of finger.cgi.c, you no longer need escape_input() because
no shell is spawned.
Listing 11.8. The finger.cgi.c
program (without spawning a shell).
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include "cgi-lib.h"
#include "html-lib.h"
#include "string-lib.h"

#define FINGER "/usr/bin/finger"

void print_form()
{
    html_begin("Finger Gateway");
    h1("Finger Gateway");
    printf("<form>\n");
    printf("Who? <input name=\"who\">\n");
    printf("</form>\n");
    html_end();
}

int main()
{
    char *who;
    llist entries;
    pid_t pid;
    int status;

    html_header();
    if (read_cgi_input(&entries)) {
        if (cgi_val(entries,"who")) {
            who = newstr(cgi_val(entries,"who"));
            html_begin("Finger results");
            printf("<pre>\n");
            fflush(stdout);
            if ((pid = fork()) < 0) {
                perror("fork");
                exit(1);
            }
            if (pid == 0) { /* child process */
                /* no shell is involved; who is passed as a single argument,
                   and the argument list ends with a null pointer */
                execl(FINGER,"finger",who,(char *)NULL);
                exit(1);
            }
            /* parent process */
            while (wait(&status) != pid) ;
            printf("</pre>\n");
            html_end();
        }
        else
            print_form();
    }
    else
        print_form();
    list_clear(&entries);
    return 0;
}
For a variety of reasons, you
might want to parse the output before sending it to the browser. Perhaps,
for example, you want to surround e-mail addresses and URLs with <a href>
tags. The Perl version of finger.cgi in Listing 11.9 has been modified to
pipe the output to a file handle. If you want to, you can then parse the
data from the file handle before sending it to the output.
Listing 11.9. The finger.cgi
program (Perl using pipes).
#!/usr/local/bin/perl
require 'cgi-lib.pl';
select(stdout); $| = 1;
print &PrintHeader;
if (&ReadParse(*input)) {
if ($input{'who'}) {
print &HtmlTop("Finger results"),"<pre>\n";
open(FINGER,"-|") || exec "/usr/bin/finger",$input{'who'};
while (<FINGER>) {
print;
}
print "</pre>\n",&HtmlBot;
}
else {
&print_form;
}
}
else {
&print_form;
}
sub print_form {
print &HtmlTop("Finger Gateway");
print "<form>\n";
print "Who? <input name=\"who\">\n";
print "</form>\n";
print &HtmlBot;
}
Security
It is extremely important to
consider security when you write gateway applications. Two specific security
risks exist that you need to avoid. First, as previously stated, avoid
spawning a shell if possible. If you cannot avoid spawning a shell, make
sure you escape any non-alphanumeric characters (metacharacters). You do
this by preceding the metacharacter with a backslash (\).
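The escaping rule just described can be sketched in a few lines of C. The
function below is a hypothetical stand-in, not the escape_input() function
from the library used in this chapter, whose exact behavior may differ. It
places a backslash before every character that is not a letter or a digit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: returns a newly allocated copy of in with a
   backslash in front of every non-alphanumeric character, or NULL if
   memory cannot be allocated. The caller frees the result. */
char *escape_meta(const char *in)
{
    size_t i, j = 0, len;
    char *out;

    assert(in != NULL);
    len = strlen(in);
    out = malloc(2 * len + 1);   /* worst case: every character escaped */
    if (out == NULL)
        return NULL;
    for (i = 0; i < len; i++) {
        if (!isalnum((unsigned char)in[i]))
            out[j++] = '\\';
        out[j++] = in[i];
    }
    out[j] = '\0';
    return out;
}
```

One caveat about this sketch: a shell treats a backslash followed by a
newline as a line continuation, so input containing newlines is probably
better rejected outright than escaped.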
Second, note that a Web
gateway can circumvent certain access restrictions. For example, suppose
your school, school.edu, only allowed people to finger from within the
school. If you set up a finger gateway running on www.school.edu, then
anyone outside the school could finger machines within the school. Because
the finger gateway runs the finger program from within school.edu, the
gateway sends the output to anyone who requests it, including those outside
of school.edu.
If you want to maintain access
restrictions, you need to build an access layer on your CGI program as well.
You can use the REMOTE_ADDR and REMOTE_HOST environment variables to
determine from where the browser is connecting.
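As a rough sketch of such an access layer, the following C functions check
whether the connecting host belongs to a given domain. The function names
are hypothetical. The sketch also assumes that your server fills in
REMOTE_HOST; if it does not perform hostname lookups, you would have to
compare REMOTE_ADDR against your network's addresses instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Compares two strings, ignoring case. Returns 1 if they match. */
static int same_ci(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;       /* both strings must end together */
}

/* Hypothetical helper: returns 1 if host is exactly domain or ends
   with "." followed by domain; returns 0 otherwise. */
int host_in_domain(const char *host, const char *domain)
{
    size_t hlen, dlen;

    assert(domain != NULL);
    if (host == NULL)
        return 0;
    hlen = strlen(host);
    dlen = strlen(domain);
    if (dlen == 0 || hlen < dlen)
        return 0;
    if (!same_ci(host + hlen - dlen, domain))
        return 0;
    /* reject near misses such as "evilschool.edu" */
    return hlen == dlen || host[hlen - dlen - 1] == '.';
}

/* Returns 1 if the browser's host, as reported by the server in the
   REMOTE_HOST environment variable, lies inside domain. */
int client_in_domain(const char *domain)
{
    return host_in_domain(getenv("REMOTE_HOST"), domain);
}
```

A gateway could call client_in_domain("school.edu") before running finger
and print an error page instead when the function returns 0.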
True Client/Server Gateways
If you do not already have an
adequate client for certain network services, or if you want to avoid the
extra overhead of calling a separate program, you can include the
appropriate protocol within your CGI application. This way, your CGI gateway
talks directly to the network service rather than calling another program
that communicates with the service.
Although this way has an
efficiency advantage, your programs are longer and more complex, which means
longer development time. Additionally, you generally duplicate the work in
the already existing client that handles the network connections and
communication for you.
If you do decide to write a
gateway client from scratch, you need to first find the protocol. You can
get most of the Internet network protocols via ftp at ds.internic.net. A
nice Web front-end to various Internet protocols and RFCs exists at
<URL:http://www.cis.ohio-state.edu/hypertext/information/rfc.html>.
Network Programming
To write any direct gateways, you
need to know some basic network programming. This
section briefly describes network client programming on UNIX using Berkeley
sockets. The information in this section is not meant to serve as a
comprehensive tutorial to network programming; you should refer to other
sources for more information.
TCP/IP (Internet) network
communication on UNIX is performed using something called a socket (or a
Berkeley socket). As far as the programmer is concerned, the socket works
the same as a file handle (although internally, a socket is very different
from a file handle).
Before you can do any network
communication, you must open a socket using the socket() function (in both C
and Perl). In C, socket() takes three arguments (a domain, a socket type, and
a protocol) and returns a file descriptor; Perl's version takes a file
handle as an extra first argument. The domain tells the operating system
which address family to use, and therefore how to interpret the addresses
you give it. Because you are doing Internet programming, you use the domain
AF_INET as defined in the header file, socket.h, which is located in
/usr/include/sys.
The socket type is either
SOCK_STREAM or SOCK_DGRAM. You almost definitely will use SOCK_STREAM, which
guarantees reliable, orderly delivery of information to the server. Network
services such as the World Wide Web, ftp, gopher, and e-mail use SOCK_STREAM.
SOCK_DGRAM sends data in datagrams, little packets of information that
are not guaranteed to be delivered or to arrive in order. Network File
System (NFS) is an example of a protocol that uses SOCK_DGRAM.
Finally, the protocol defines the
transport layer protocol. Because you are using TCP/IP, you want to define
the network protocol as TCP.
Note: AF_INET, SOCK_STREAM, and SOCK_DGRAM are defined in <sys/socket.h>.
In Perl, these values are not defined unless you have converted your C
headers into Perl headers using the h2ph utility. The following values will
work for almost any UNIX system:
· AF_INET: 2
· SOCK_STREAM: 1 (2 if using Solaris)
· SOCK_DGRAM: 2 (1 if using Solaris)
Solaris users should note that the values for SOCK_STREAM and SOCK_DGRAM
are reversed.
After you create a socket, your
client tries to connect to a server through that socket. It uses the
connect() function to do so (again, this process works in both Perl and C).
In order for connect() to work properly, it needs to know the socket, the IP
address of the server, and the port to which to connect.
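In C, the two steps just described (creating the socket and connecting it)
might look like the following sketch. The function names make_inet_addr()
and tcp_connect() are hypothetical, and the sketch assumes you already have
the server's IP address as a dotted-quad string; looking up a hostname is a
separate step.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fills in addr for the given dotted-quad IP
   address and port. Returns 0 on success or -1 if ip is not valid. */
int make_inet_addr(const char *ip, unsigned short port,
                   struct sockaddr_in *addr)
{
    assert(ip != NULL && addr != NULL);
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);          /* network byte order */
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: opens a TCP socket and connects it to ip:port.
   Returns the socket descriptor, or -1 on any failure. */
int tcp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    if (make_inet_addr(ip, port, &addr) < 0)
        return -1;
    /* domain, socket type, and protocol, as described in the text */
    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Once tcp_connect() returns a descriptor, you can use read() and write() on
it much as you would with a file, and call close() when you are done.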
A Direct Finger Gateway
In order to demonstrate network
programming, this chapter shows finger.cgi programmed to do a direct network
connection. This example appears in Perl; the C equivalent works in a
similar way. Once again, check a book on network programming for more
information.
In order to modify finger.cgi into
a direct finger gateway, you need to change three things. First, you need to
initialize various network variables. Second, you need to split the value
of who, which is in e-mail address form, into a separate username and
hostname. Finally, you
need to create the socket, make the network connection, and communicate
directly with the finger server. Listings 11.10 and 11.11 show the code for
the first two tasks.
Listing 11.10. Initialize
network variables.
$AF_INET = 2;
$SOCK_STREAM = 1; # Use 2 if using Solaris
$sockaddr = 'S n a4 x8';
$proto = (getprotobyname('tcp'))[2];
$port = (getservbyname('finger', 'tcp'))[2];
Listing 11.11. Separate the
username and hostname and determine IP address from hostname.
($username,$hostname) =
split(/@/,$input{'who'});
$hostname = $ENV{'SERVER_NAME'} unless $hostname;
$ipaddr = (gethostbyname($hostname))[4];
if (!$ipaddr) {
print "Invalid hostname.\n";
}
else {
&do_finger($username,$ipaddr);
}
Communicating directly with the
finger server requires understanding how the finger server communicates.
Normally, the finger server runs on port 79 on the server. To use it, the
client sends the username followed by a CRLF. After it has the username, the
server searches for information about that user, sends it to the client over
the socket, and closes the connection.
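For readers following along in C, here is a minimal sketch of the two string
operations involved: separating the username from the hostname, and building
the request that the finger server expects. Both function names are
hypothetical. Note that this version splits at the first @ only, so it does
not behave identically to the Perl split() call for input containing more
than one @.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "user@host" at the first '@'. If there
   is no '@', host becomes an empty string. Returns 0 on success or -1
   if either part does not fit in its buffer. */
int split_who(const char *who, char *user, size_t usize,
              char *host, size_t hsize)
{
    const char *at;
    size_t ulen;

    assert(who != NULL && user != NULL && host != NULL);
    at = strchr(who, '@');
    ulen = at ? (size_t)(at - who) : strlen(who);
    if (ulen >= usize || hsize == 0)
        return -1;
    memcpy(user, who, ulen);
    user[ulen] = '\0';
    if (at == NULL) {
        host[0] = '\0';
        return 0;
    }
    if (strlen(at + 1) >= hsize)
        return -1;
    strcpy(host, at + 1);
    return 0;
}

/* Hypothetical helper: writes the finger request (the username followed
   by CRLF) into buf. Returns its length, or -1 if buf is too small. */
int finger_request(const char *user, char *buf, size_t size)
{
    int n;

    assert(user != NULL && buf != NULL);
    n = snprintf(buf, size, "%s\r\n", user);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The string produced by finger_request() is what you would write to the
socket after connecting to port 79.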
Tip: You can communicate directly with the finger server using the telnet
command. Suppose you want to finger ed@gunther.org:
% telnet gunther.org 79
Trying 0.0.0.0...
Connected to gunther.org
Escape character is '^]'.
ed
After you press Enter, the finger information is displayed.
The code for connecting to and
communicating with the finger server appears in the &do_finger function,
listed in Listing 11.12.
Listing 11.12. The &do_finger
function.
sub do_finger {
local($username,$ipaddr) = @_;
$them = pack($sockaddr, $AF_INET, $port, $ipaddr);
# get socket
socket(FINGER, $AF_INET, $SOCK_STREAM, $proto) || die "socket: $!";
# make connection
if (!connect(FINGER,$them)) {
die "connect: $!";
}
# unbuffer output
select(FINGER); $| = 1; select(stdout);
print FINGER "$username\r\n";
while (<FINGER>) {
print;
}
}
The completed program, dfinger.cgi, appears
in Listing 11.13. Although this program works more efficiently overall than
the older version (finger.cgi), you can see that it is more complex, and that
the extra complexity might not be worth the minute gain in efficiency. For
larger client/server gateways, however, you might see a noticeable advantage
to making a direct connection versus running an existing client from the
gateway.
Listing 11.13. The dfinger.cgi
program (Perl).
#!/usr/local/bin/perl
require 'cgi-lib.pl';
# initialize network variables
$AF_INET = 2;
$SOCK_STREAM = 1; # Use 2 if using Solaris
$sockaddr = 'S n a4 x8';
$proto = (getprotobyname('tcp'))[2];
$port = (getservbyname('finger', 'tcp'))[2];
# unbuffer output
select(stdout); $| = 1;
# begin main
print &PrintHeader;
if (&ReadParse(*input)) {
if ($input{'who'}) {
print &HtmlTop("Finger results"),"<pre>\n";
($username,$hostname) = split(/@/,$input{'who'});
$hostname = $ENV{'SERVER_NAME'} unless $hostname;
$ipaddr = (gethostbyname($hostname))[4];
if (!$ipaddr) {
print "Invalid hostname.\n";
}
else {
&do_finger($username,$ipaddr);
}
print "</pre>\n",&HtmlBot;
}
else {
&print_form;
}
}
else {
&print_form;
}
sub print_form {
print &HtmlTop("Finger Gateway");
print "<form>\n";
print "Who? <input name=\"who\">\n";
print "</form>\n";
print &HtmlBot;
}
sub do_finger {
local($username,$ipaddr) = @_;
$them = pack($sockaddr, $AF_INET, $port, $ipaddr);
# get socket
socket(FINGER, $AF_INET, $SOCK_STREAM, $proto) || die "socket: $!";
# make connection
if (!connect(FINGER,$them)) {
die "connect: $!";
}
# unbuffer output
select(FINGER); $| = 1; select(stdout);
print FINGER "$username\r\n";
while (<FINGER>) {
print;
}
}
E-Mail Gateway
This chapter ends with examples of
a very common gateway found on the World Wide Web: a Web to e-mail gateway.
The idea is that you can take the content of a form and e-mail it to the
specified location using this gateway.
Many current browsers have
built-in e-mail capabilities that enable users to e-mail anyone and anywhere
from their browsers. Clicking on a tag such as the following will cause the
browser to run a mail client that will send a message to the recipient
specified in the <a href> tag:
<a href="mailto:eekim@hcs.harvard.edu">E-mail me</a>
Why does anyone need a Web to
e-mail gateway if most browsers can act as e-mail clients?
An e-mail gateway can offer
considerable advantages over the built-in mail clients and the mailto
references.
For example, you could force all e-mail to have the same format by using a
fill-out form and a custom mail gateway. This example becomes useful if you
are collecting information for future parsing, such as a poll. Having people
e-mail their answers in all sorts of different forms would make parsing
extremely difficult.
This section shows the development
of a rudimentary mail gateway in C. This gateway requires certain fields
such as to and uses an authentication file to limit the potential recipients
of e-mail from this gateway. Next, you see form.cgi (the generic form
parsing CGI application developed in Chapter 10, "Basic Applications")
extended to support e-mail.
A Simple Mail Program (C)
mail.cgi is a simple e-mail
gateway with the following specifications:
· If called with no input, it displays a generic mail entry form.
· If no to field is specified, it sends e-mail by default to a predefined
Web administrator.
· It uses only the to, name, email, subject, and message fields, and ignores
all other fields.
· It sends an error message if the user does not fill out any fields.
· It uses an authentication file to make sure only certain people receive
e-mail from this gateway.
As you can see, mail.cgi is fairly
inflexible, but it serves its purpose adequately. It will ignore any field
other than those specified. You could not include a poll on your HTML form
because that information would simply be ignored by mail.cgi. This CGI
program functions essentially like the mailto reference tag, except for the
authentication file.
Why use an authentication file?
Mail using this gateway is easily forged. Because the CGI program has no way
of knowing the identity of the user, it asks the user to fill out that
information. The user could easily fill out false information. In order to
prevent people from using this gateway to send forged e-mail to anyone on
the Internet, it will enable you to send e-mail only to those specified in a
central authentication file maintained by the server administrator. As an
added protection against forged e-mail, mail.cgi adds an X-Sender mail
header that says this e-mail was sent using this gateway.
The authentication file contains
valid e-mail recipients, one on each line. For example, your authentication
file might look like this:
eekim@hcs.harvard.edu
president@whitehouse.gov
In this case, you could only use
mail.cgi to send e-mail to me and the President.
Finally, you need to decide how to
send the e-mail. A direct connection does not seem like a good solution: the
Internet e-mail protocol can be a fairly complex thing, and making direct
connections to mail servers seems unnecessary. The sendmail program, which
serves as an excellent mail transport agent for e-mail, is up-to-date,
fairly secure, and fairly easy to use. This example uses popen() to pipe the
data into the sendmail program, which consequently sends the information to
the specified address.
The code for mail.cgi appears in
Listing 11.14. There are a few features of note. First, even though this
example uses popen(), it doesn't bother escaping the user input because
mail.cgi checks every user-supplied e-mail address against the ones in the
central authentication file. Assume that both the e-mail addresses in the
central access file and the hard-coded Web administrator's e-mail address
(defined as WEBADMIN) are valid.
Listing 11.14. The mail.cgi.c
program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "cgi-lib.h"
#include "html-lib.h"
#include "string-lib.h"
#define WEBADMIN "web@somewhere.edu"
#define AUTH "/usr/local/etc/httpd/conf/mail.conf"
void NullForm()
{
html_begin("Null Form Submitted");
h1("Null Form Submitted");
printf("You have sent an empty form. Please go back and fill out\n");
printf("the form properly, or email <i>%s</i>\n",WEBADMIN);
printf("if you are having difficulty.\n");
html_end();
}
void authenticate(char *dest)
{
FILE *access;
char s[80];
short FOUND = 0;
if ( (access = fopen(AUTH,"r")) != NULL) {
while ( (fgets(s,80,access)!=NULL) && (!FOUND) ) {
s[strlen(s) - 1] = '\0';
if (!strcmp(s,dest))
FOUND = 1;
}
if (!FOUND) {
/* not authenticated */
html_begin("Unauthorized Destination");
h1("Unauthorized Destination");
html_end();
exit(1);
}
}
else { /* access file not found */
html_begin("Access file not found");
h1("Access file not found");
html_end();
exit(1);
}
}
int main()
{
llist entries;
FILE *mail;
char command[256] = "/usr/lib/sendmail ";
char *dest,*name,*email,*subject,*content;
html_header();
if (read_cgi_input(&entries)) {
if ( !strcmp("",cgi_val(entries,"name")) &&
!strcmp("",cgi_val(entries,"email")) &&
!strcmp("",cgi_val(entries,"subject")) &&
!strcmp("",cgi_val(entries,"content")) )
NullForm();
else {
dest = newstr(cgi_val(entries,"to"));
name = newstr(cgi_val(entries,"name"));
email = newstr(cgi_val(entries,"email"));
subject = newstr(cgi_val(entries,"subject"));
if (dest[0]=='\0')
strcpy(dest,WEBADMIN);
else
authenticate(dest);
/* no need to escape_input() on dest, since we assume there aren't
insecure entries in the authentication file. */
strcat(command,dest);
mail = popen(command,"w");
if (mail == NULL) {
html_begin("System Error!");
h1("System Error!");
printf("Please mail %s and inform\n",WEBADMIN);
printf("the web maintainers that the comments script is improperly\n");
printf("configured. We apologize for the inconvenience<p>\n");
printf("<hr>\r\nWeb page created on the fly by ");
printf("<i>%s</i>.\n",WEBADMIN);
html_end();
}
else {
content = newstr(cgi_val(entries,"content"));
fprintf(mail,"From: %s (%s)\n",email,name);
fprintf(mail,"Subject: %s\n",subject);
fprintf(mail,"To: %s\n",dest);
fprintf(mail,"X-Sender: %s\n\n",WEBADMIN);
fprintf(mail,"%s\n\n",content);
pclose(mail);
html_begin("Comment Submitted");
h1("Comment Submitted");
printf("You submitted the following comment:\r\n<pre>\r\n");
printf("From: %s (%s)\n",email,name);
printf("Subject: %s\n\n",subject);
printf("%s\n</pre>\n",content);
printf("Thanks again for your comments.<p>\n");
printf("<hr>\nWeb page created on the fly by ");
printf("<i>%s</i>.\n",WEBADMIN);
html_end();
}
}
else {
html_begin("Comment Form");
h1("Comment Form");
printf("<form method=POST>\n";
printf("<input type=hidden name=\"to\" value=\"%s\">\n",WEBADMIN);
printf("<p>Name: <input name=\"name\"><br>\n");
printf("E-mail: <input name=\"email\"><br>\n");
printf("Subject: <input name=\"subject\"></p>\n");
printf("<p>Comments:<br>\n");
printf("<textarea name="content" rows=10 cols=70></textarea></p>\n");
printf("<input type=submit value=\"Mail form\">\n");
printf("</form>\n");
html_end();
}
list_clear(&entries);
return 0;
}
You might notice that the example
uses statically allocated strings for some values, such as the command
string. The assumption is that you know the maximum size limit of this
string because you know where the command is located (in this case, /usr/lib/sendmail),
and you assume that any authorized e-mail address will not put this combined
string over the limit. The example essentially cheats on this step to save
coding time. If you want to extend and generalize this program, however, you
might need to change this string to a dynamically allocated one.
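As a rough sketch of that change, the command string can be sized from its two parts at run time. build_command() is a hypothetical helper name that does not appear in mail.cgi, and the sketch assumes a C library that provides snprintf().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<program> <dest>" in freshly allocated memory.
   build_command() is a hypothetical helper; the caller must free()
   the result. Returns NULL if memory cannot be allocated. */
char *build_command(const char *program, const char *dest)
{
    /* program + one space + dest + terminating NUL */
    size_t len = strlen(program) + 1 + strlen(dest) + 1;
    char *command = malloc(len);

    if (command != NULL)
        snprintf(command, len, "%s %s", program, dest);
    return command;
}
```

The result could then be handed to popen() in place of the fixed 256-byte array, and released with free() after pclose(). Sizing the string dynamically only removes the length limit; it does not make an unchecked address safe to pass to the shell.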
Extending the Mail
Program (Perl)
mail.cgi doesn't serve as a
tremendously useful gateway for most people, although it offers some nice
features over a plain <a href="mailto:"> link. A fully configurable mail
program that could parse anything, that could send customized default forms,
and that could send e-mail in a customizable format would be ideal.
These desires sound suspiciously
like the specifications for form.cgi. In fact, the only difference between
the form.cgi program described earlier and the program described here is
that the program described here sends the results via e-mail rather than
saving them to a file.
Instead of writing a completely
new program, you can use form.cgi as a foundation and extend the application
to support e-mail as well. Doing so requires two major changes:
·
A MAILTO option in the configuration file.
·
Code that e-mails the data rather than saving it to a file.
If a MAILTO option is in the
configuration file, form.cgi e-mails the results to the address specified by
MAILTO. If neither a MAILTO nor OUTPUT option is specified in the
configuration file, then form.cgi returns an error. The new form.cgi with
e-mail support appears in Listing 11.15.
Listing 11.15. The form.cgi
program (with mail support).
#!/usr/local/bin/perl

require 'cgi-lib.pl';

$global_config = '/usr/local/etc/httpd/conf/form.conf';
$sendmail = '/usr/lib/sendmail';

# parse config file
$config = $ENV{'PATH_INFO'};
if (!$config) {
    $config = $global_config;
}
open(CONFIG,$config) || &CgiDie("Could not open config file");
while ($line = <CONFIG>) {
    $line =~ s/[\r\n]+$//;    # strip the whole line ending
    if ($line =~ /^FORM=/) {
        ($form = $line) =~ s/^FORM=//;
    }
    elsif ($line =~ /^TEMPLATE=/) {
        ($template = $line) =~ s/^TEMPLATE=//;
    }
    elsif ($line =~ /^OUTPUT=/) {
        ($output = $line) =~ s/^OUTPUT=//;
    }
    elsif ($line =~ /^RESPONSE=/) {
        ($response = $line) =~ s/^RESPONSE=//;
    }
    elsif ($line =~ /^MAILTO=/) {
        ($mailto = $line) =~ s/^MAILTO=//;
    }
}
close(CONFIG);

# process input or send form
if (&ReadParse(*input)) {
    # read template into list
    if ($template) {
        open(TMPL,$template) || &CgiDie("Can't Open Template");
        @TEMPLATE = <TMPL>;
        close(TMPL);
    }
    else {
        &CgiDie("No template specified");
    }
    # mail the results according to template
    if ($mailto) {
        $mail = 1;
        # "|-" forks; the child execs sendmail with MAIL as its STDIN,
        # so no shell ever sees $mailto
        open(MAIL,"|-") || exec $sendmail,$mailto;
        print MAIL "To: $mailto\n";
        print MAIL "From: $input{'email'} ($input{'name'})\n";
        # use the form's subject field, if one was submitted
        print MAIL "Subject: $input{'subject'}\n" if $input{'subject'};
        print MAIL "X-Sender: form.cgi\n\n";
        foreach $line (@TEMPLATE) {
            if ( ($line =~ /\$/) || ($line =~ /\%/) ) {
                # form variables
                $line =~ s/^\$(\w+)/$input{$1}/;
                $line =~ s/([^\\])\$(\w+)/$1$input{$2}/g;
                # environment variables
                $line =~ s/^\%(\w+)/$ENV{$1}/;
                $line =~ s/([^\\])\%(\w+)/$1$ENV{$2}/g;
            }
            print MAIL $line;
        }
        close(MAIL);
    }
    else {
        $mail = 0;
    }
    # write to output file according to template
    if ($output) {
        open(OUTPUT,">>$output") || &CgiDie("Can't Append to $output");
        foreach $line (@TEMPLATE) {
            if ( ($line =~ /\$/) || ($line =~ /\%/) ) {
                # form variables
                $line =~ s/^\$(\w+)/$input{$1}/;
                $line =~ s/([^\\])\$(\w+)/$1$input{$2}/g;
                # environment variables
                $line =~ s/^\%(\w+)/$ENV{$1}/;
                $line =~ s/([^\\])\%(\w+)/$1$ENV{$2}/g;
            }
            print OUTPUT $line;
        }
        close(OUTPUT);
    }
    elsif (!$mail) {
        &CgiDie("No output file specified");
    }
    # send either specified response or dull response
    if ($response) {
        print "Location: $response\n\n";
    }
    else {
        print &PrintHeader,&HtmlTop("Form Submitted");
        print &HtmlBot;
    }
}
elsif ($form) {
    # send default form
    print "Location: $form\n\n";
}
else {
    &CgiDie("No default form specified");
}
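The changes to form.cgi are very
minor. All that you had to add was an extra condition in the loop that
parses the configuration file and a few lines of code that pipe the message
to the sendmail program, much as mail.cgi does. Unlike mail.cgi, form.cgi
runs sendmail through a forking open() and exec rather than through a shell,
so the address in MAILTO is never interpreted by the shell.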
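Listing 11.15 starts sendmail with exec $sendmail,$mailto, which passes the address as a separate argument instead of building a shell command line. For comparison, here is a minimal C sketch of the same idea using pipe(), fork(), and execv(). It is not code from mail.cgi; pipe_to_program() is a hypothetical helper name, and the sketch assumes a POSIX system.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at path with the given argument vector, feed msg to
   its standard input, and return its exit status (or -1 on failure).
   pipe_to_program() is a hypothetical helper. No shell is involved, so
   metacharacters in the arguments are never interpreted. */
int pipe_to_program(const char *path, char *const argv[], const char *msg)
{
    int fds[2];
    int status;
    pid_t pid;
    const char *p = msg;
    size_t left = strlen(msg);

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: read end of the pipe becomes standard input */
        close(fds[1]);
        if (fds[0] != STDIN_FILENO) {
            dup2(fds[0], STDIN_FILENO);
            close(fds[0]);
        }
        execv(path, argv);
        _exit(127);               /* reached only if execv() failed */
    }
    /* parent: write the whole message, then close to signal end of input.
       A real gateway would also decide how to handle SIGPIPE. */
    close(fds[0]);
    while (left > 0) {
        ssize_t n = write(fds[1], p, left);
        if (n <= 0)
            break;
        p += n;
        left -= (size_t)n;
    }
    close(fds[1]);
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

To send mail this way, the argument vector would hold the program name, the destination address, and a terminating NULL, and msg would hold the headers, a blank line, and the body. The design trade-off is more code than popen() in exchange for never handing user-influenced text to a shell.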
Summary
You can write CGI programs that
act as gateways between the World Wide Web and other network applications.
You can take one of two approaches to writing a CGI gateway: either embed an
existing client into a CGI program, or program your CGI application to
understand the appropriate protocols and to make the network connections
directly. Advantages and disadvantages exist with both methods, although for
most purposes, running the already existing client from within your CGI
application provides a more than adequate solution. If you do decide to take
this approach, you must remember to carefully consider any possible security
risks in your code, including filtering out shell metacharacters and
redefining access restrictions.