Common Gateway Interface

From Wikipedia, the free encyclopedia

The Common Gateway Interface (CGI) is a standard (see RFC 3875: CGI Version 1.1) that defines how webserver software can delegate the generation of webpages to a text-based application. Such applications are known as CGI scripts; they can be written in any programming language, although scripting languages are often used.

Purpose

The task of a webserver is to respond to requests for webpages issued by clients (usually web browsers) by analyzing the content of the request (including the URL, the HTTP 'method', request headers and any message body), determining appropriate actions to take and creating an appropriate document to send in response, then returning that to the client.

If the request is simply to GET a file that exists on disk, the server can return the file's contents. Alternatively, the document's content may need to be composed on the fly, and other actions, such as updating a database, may be required. One way of achieving this is to have a console application handle the request and compute the returned document's contents, with the web server configured to use that application. CGI specifies which information is communicated between the webserver and such a console application, and how.

The webserver software invokes the console application as a command. CGI defines how information about the request (such as the URL, request headers, etc.) is passed to the command, chiefly as environment variables and in some cases as command-line arguments. The application then writes the output document to standard output. CGI also defines how the application should pass back extra information about the output (such as the MIME type and other response headers).
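
The invocation described above can be sketched from the server's side. This is a minimal illustration, not how any particular webserver implements CGI: the function name run_cgi and the inline DEMO_SCRIPT are hypothetical, and the child simply inherits the parent's environment, which a real server would restrict.

```python
import os
import subprocess
import sys


def run_cgi(command, request_env, body=b""):
    """Run `command` roughly the way a CGI-capable server might.

    Request metadata goes into environment variables, the request body
    goes to standard input, and the response is read from standard output.
    """
    env = dict(os.environ)  # simplification: inherit the server's environment
    env.update(request_env)
    env["GATEWAY_INTERFACE"] = "CGI/1.1"
    result = subprocess.run(
        command, env=env, input=body, capture_output=True, check=True
    )
    return result.stdout


# Hypothetical stand-in for a CGI script: header, blank line, then the body.
DEMO_SCRIPT = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('method=' + os.environ['REQUEST_METHOD'])\n"
)

if __name__ == "__main__":
    output = run_cgi(
        [sys.executable, "-c", DEMO_SCRIPT],
        {"REQUEST_METHOD": "GET", "QUERY_STRING": ""},
    )
    print(output.decode())
```

The script never sees a socket or an HTTP parser; everything it needs arrives through the environment and standard input.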

More details

From the Web server's point of view, certain locators, e.g. http://www.example.com/wiki.cgi, are defined as corresponding to a program to execute via CGI. When a request for the URL is received, the corresponding program is executed.

Web servers often have a cgi-bin/ directory at the base of their directory tree to hold executable files called with CGI.
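
As a rough sketch of how a server might decide that a URL path corresponds to a CGI program, the function below matches the path against a set of known script locations and splits off any trailing path. The name resolve_cgi and the script set are assumptions made for illustration; real servers use their own configuration rules.

```python
def resolve_cgi(url_path, script_names):
    """Return (script, trailing_path) if url_path maps to a known script.

    Returns None when no script matches, in which case the server would
    handle the request some other way (for example, as a static file).
    """
    # Try longer script paths first so the most specific match wins.
    for script in sorted(script_names, key=len, reverse=True):
        if url_path == script or url_path.startswith(script + "/"):
            return script, url_path[len(script):]
    return None


if __name__ == "__main__":
    scripts = {"/wiki.cgi", "/cgi-bin/script.cgi"}  # hypothetical configuration
    print(resolve_cgi("/wiki.cgi", scripts))
    print(resolve_cgi("/cgi-bin/script.cgi/extra/path", scripts))
    print(resolve_cgi("/index.html", scripts))
```

The two parts of a successful match correspond to what CGI calls SCRIPT_NAME and PATH_INFO.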

Input data

Data is passed into the program using environment variables. This is in contrast to typical execution, where command-line arguments are used. For HTTP PUT or POST requests, the user-submitted data is provided to the program via standard input.[1]
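
A script might read the submitted data along these lines. This is a minimal sketch: the helper name read_request_body is hypothetical, and a real script would pass os.environ and sys.stdin.buffer rather than the in-memory stand-ins used in the demonstration.

```python
import io


def read_request_body(environ, stream):
    """Return the request body as bytes, reading CONTENT_LENGTH octets."""
    if environ.get("REQUEST_METHOD", "GET") not in ("POST", "PUT"):
        return b""  # other methods carry no body in this sketch
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # malformed length: treat as an empty body
    return stream.read(length) if length > 0 else b""


if __name__ == "__main__":
    # Stand-ins for os.environ and sys.stdin.buffer.
    fake_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "8"}
    fake_stdin = io.BytesIO(b"name=Ada")
    print(read_request_body(fake_env, fake_stdin))
```

Reading only CONTENT_LENGTH octets matters because the script cannot rely on the server closing standard input at the end of the body.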

The following environment variables are passed to a CGI program:

Server specific variables

SERVER_SOFTWARE: name/version of HTTP server.
SERVER_NAME: host name of the server; may be a dot-decimal IP address.
GATEWAY_INTERFACE: CGI/version.

Request specific variables

SERVER_PROTOCOL: HTTP/version.
SERVER_PORT: TCP port (decimal).
REQUEST_METHOD: name of the HTTP method (see above).
PATH_INFO: path suffix, if one is appended to the URL after the program name and a slash.
PATH_TRANSLATED: corresponding full path as determined by the server, if PATH_INFO is present.
SCRIPT_NAME: relative path to the program, like /cgi-bin/script.cgi.
QUERY_STRING: the part of the URL after the ? character. It is conventionally composed of name=value pairs separated by ampersands (such as var1=val1&var2=val2…) when form data are transferred via the GET method.
REMOTE_HOST: host name of the client; unset if the server did not perform such a lookup.
REMOTE_ADDR: IP address of the client (dot-decimal).
AUTH_TYPE: identification type, if applicable.
REMOTE_USER: used for certain AUTH_TYPEs.
REMOTE_IDENT: see ident; set only if the server performed such a lookup.
CONTENT_TYPE: MIME type of the input data if the PUT or POST method is used, as provided via the HTTP header.
CONTENT_LENGTH: similarly, size of the input data (decimal, in octets), if provided via the HTTP header.
Variables passed by the user agent (HTTP_ACCEPT, HTTP_ACCEPT_LANGUAGE, HTTP_USER_AGENT, HTTP_COOKIE and possibly others) contain the values of the corresponding HTTP headers and therefore have the same meaning.
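
Two of the conventions above lend themselves to a short sketch: how a request header name becomes a variable name, and how a script can decode QUERY_STRING. The helper names header_to_variable and form_fields are hypothetical; parse_qs is the standard library's query-string parser.

```python
from urllib.parse import parse_qs

# Headers exposed under their own variable names rather than with HTTP_.
UNPREFIXED = {
    "CONTENT-TYPE": "CONTENT_TYPE",
    "CONTENT-LENGTH": "CONTENT_LENGTH",
}


def header_to_variable(header_name):
    """Map an HTTP header name to the environment variable that carries it."""
    upper = header_name.upper()
    if upper in UNPREFIXED:
        return UNPREFIXED[upper]
    return "HTTP_" + upper.replace("-", "_")


def form_fields(environ):
    """Decode QUERY_STRING into a dict mapping each name to a list of values."""
    return parse_qs(environ.get("QUERY_STRING", ""))


if __name__ == "__main__":
    print(header_to_variable("Accept-Language"))
    print(form_fields({"QUERY_STRING": "var1=val1&var2=val2"}))
```

Each name maps to a list because the same name may appear more than once in a query string.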

Output format

The program returns its result to the web server on standard output: a header, then a blank line, then the document itself.

The header is encoded in the same way as an HTTP header and must include the MIME type of the document returned.[2] The headers are generally forwarded with the response back to the user, supplemented by the web server.
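
On the receiving end, a server has to separate the header from the document before forwarding the response. The sketch below shows one way to do that; the function name split_cgi_output is hypothetical, and accepting both CRLF and bare LF line endings is a choice made here for robustness.

```python
def split_cgi_output(raw):
    """Split raw CGI output (bytes) into (headers dict, body bytes)."""
    # The header block ends at the first blank line, with either line ending.
    found = []
    for separator in (b"\r\n\r\n", b"\n\n"):
        position = raw.find(separator)
        if position != -1:
            found.append((position, len(separator)))
    if not found:
        raise ValueError("no blank line after the header block")
    position, width = min(found)  # earliest blank line wins
    head, body = raw[:position], raw[position + width:]

    headers = {}
    for line in head.splitlines():
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


if __name__ == "__main__":
    sample = b"Content-Type: text/html\n\n<p>hi</p>\n"
    print(split_cgi_output(sample))
```

The body is left as untouched bytes, since only the header block has a line-oriented format.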

Example

An example of a CGI program is one implementing a wiki. The user agent requests the name of an entry; the server will retrieve the source of that entry's page (if one exists), transform it into HTML, and send the result.
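
A toy version of such a wiki script might look like the following. Everything specific here is an assumption made for illustration: the in-memory PAGES store, the "entry" query parameter, and the markup rule (blank-line-separated paragraphs) are all hypothetical.

```python
import html
import os
import sys
from urllib.parse import parse_qs

# Hypothetical page store; a real wiki would read from disk or a database.
PAGES = {"CGI": "The Common Gateway Interface.\n\nSee RFC 3875."}


def render_page(source):
    """Turn blank-line-separated paragraphs into escaped HTML paragraphs."""
    paragraphs = [part.strip() for part in source.split("\n\n") if part.strip()]
    return "\n".join("<p>%s</p>" % html.escape(part) for part in paragraphs)


def handle_request(environ, pages=PAGES):
    """Build the full CGI output (header, blank line, document) as a string."""
    name = parse_qs(environ.get("QUERY_STRING", "")).get("entry", [""])[0]
    source = pages.get(name)
    if source is None:
        return "Status: 404 Not Found\nContent-Type: text/plain\n\nNo such entry.\n"
    return "Content-Type: text/html\n\n" + render_page(source) + "\n"


if __name__ == "__main__":
    # When run by a web server, os.environ carries the request variables.
    sys.stdout.write(handle_request(os.environ))
```

Keeping the request handling in a function that takes the environment as a parameter makes the script easy to exercise without a web server.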

History

In 1993, the World Wide Web (WWW) was small but booming. WWW software developers and web site developers kept in touch on the www-talk mailing list, so it was there that a standard for calling command line executables was agreed upon. RFC 3875[3] specifically acknowledges several contributors to that discussion.

The NCSA team wrote the specification,[4] and NCSA still hosts it at its original location.[5][6] The other webserver developers adopted it, and it has been a standard for webservers ever since.

Drawbacks

Calling a command generally means the invocation of a newly created process. Starting up the process can take up much more time and memory than the actual work of generating the output, especially when the program still needs to be interpreted or compiled. If the command is called often, the resulting workload can quickly overwhelm the web server.

The overhead involved in interpretation may be reduced by using compiled CGI programs, such as those in C/C++, rather than using Perl or other scripting languages. The overhead involved in process creation can be reduced by solutions such as FastCGI, or by running the application code entirely within the webserver using special extension modules.
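
The cost of starting a process per request can be observed with a rough measurement like the one below. This is only an illustrative sketch: the function names are hypothetical, the page being generated is trivial, and the actual numbers depend heavily on the machine and interpreter.

```python
import subprocess
import sys
import time


def generate_page():
    """Stand-in for the actual work of producing a document."""
    return "<p>hello</p>"


def time_in_process(runs):
    """Average seconds per call when the code runs inside one process."""
    start = time.perf_counter()
    for _ in range(runs):
        generate_page()
    return (time.perf_counter() - start) / runs


def time_new_process(runs):
    """Average seconds per call when each call starts a new interpreter."""
    code = "print('<p>hello</p>')"
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([sys.executable, "-c", code],
                       capture_output=True, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print("in process:  %.6f s per call" % time_in_process(5))
    print("new process: %.6f s per call" % time_new_process(5))
```

The gap between the two figures is the per-request overhead that approaches such as FastCGI aim to avoid.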

Alternatives

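Several approaches can be adopted to remedy the process-creation overhead described above, such as FastCGI or extension modules that run the application code inside the webserver.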

The optimal configuration for any web application depends on application-specific details, amount of traffic, and complexity of the transaction; these tradeoffs need to be analyzed to determine the best implementation for a given task and time budget.

