Requests

To retrieve a document, a client sends a request to the server in charge, which in turn returns a response that contains the requested document. The following sections describe how such requests are performed and processed.

Accessing Resources

The World Wide Web is based on resources that servers provide to clients. These resources are often also called documents and are identified and located by different kinds of URLs. Access to such documents follows the so-called request-response principle: to fetch a document, a client makes a request to a server, which then returns a response that contains the requested document. If something goes wrong, e.g. when the requested document does not exist, the server returns an error message instead. Requests and responses also include certain meta information about the exchanged entities, e.g. the content type of a document.

The underlying protocol for the communication between a client and a server is the "Hyper Text Transfer Protocol" [HTTP]. Much of its flexibility comes from its incorporation of the "Multipurpose Internet Mail Extensions" [MIME], so that even other protocols such as the "File Transfer Protocol" [FTP] can be logically wrapped into the request-response pattern (see also section "Server-based URLs" for more examples). For such protocols, the actual work is done by so-called protocol-converters, which are part of either the client itself or a proxy-server that acts on behalf of clients. A proxy-server is itself both a server (for clients) and a client (for other servers).
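The request-response principle can be tried out with a minimal sketch that uses only the Python standard library. It is not part of w3browse; the document table, the handler class and the helper names are assumptions made for illustration. A small server on the loopback interface answers one request with a document and its content type, and another one with an error.

```python
import http.client
import http.server
import threading

# Hypothetical in-memory documents: path -> (body, content type).
DOCUMENTS = {"/index.html": (b"<p>Hello</p>", "text/html")}


class DemoHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        entry = DOCUMENTS.get(self.path)
        if entry is None:
            # Error response: the requested document does not exist.
            self.send_error(404)
            return
        body, content_type = entry
        self.send_response(200)
        # Meta information about the returned entity.
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    """Make one GET request; return (status, content type, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Content-Type"), resp.read()
    finally:
        conn.close()


def demo():
    """Start the server, request one existing and one missing document."""
    server = http.server.HTTPServer(("127.0.0.1", 0), DemoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch(port, "/index.html"), fetch(port, "/missing.html")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns two tuples: the first carries status 200 together with the content type and the document, the second carries status 404.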

Processing of Requests

w3browse makes extensive use of the concepts of HTTP and the request-response model for its internal processing of requests. The starting point is the creation of an initial request by some means such as following a hyperlink or submitting a form. The following process of retrieving the associated document can be divided into three main steps:

  1. The request is passed through an internal chain of request handlers. Each handler can add, modify or delete certain parts of the request, depending on its functionality and configuration. It is also possible for a handler to break the chain and return a response immediately, thus skipping the rest of the chain.

  2. At the end of the request chain, the resulting request is sent to a server which is expected to return a response. The contacted server may in fact be a web-server, a proxy-server or gateway, or it may even be a so-called internal application of w3browse.

  3. The received (or generated) response is passed back through the chain of request handlers. This time, a handler can change or extract parts of the response and may make use of this information in subsequent requests, e.g. for the processing of cookies or for caching documents. Furthermore, a handler is also allowed to perform multiple requests using the rest of the request chain that follows the handler, e.g. in order to deal with redirects (diversions).
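The three steps above can be sketched as a chain of nested handlers. This is only an illustration of the pattern, not the actual implementation of w3browse; all class and function names (Request, Response, build_chain and the individual handlers) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Request:
    method: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str = ""
    headers: Dict[str, str] = field(default_factory=dict)


# A "send" function passes a request to the rest of the chain.
Send = Callable[[Request], Response]
Handler = Callable[[Request, Send], Response]


def _link(handler: Handler, rest: Send) -> Send:
    def send(req: Request) -> Response:
        return handler(req, rest)
    return send


def build_chain(handlers: List[Handler], final: Send) -> Send:
    """Wrap the final sender in the handlers, first handler outermost."""
    send = final
    for handler in reversed(handlers):
        send = _link(handler, send)
    return send


def add_header(req: Request, send: Send) -> Response:
    # Step 1: a handler modifies the request on its way down the chain.
    req.headers["User-Agent"] = "sketch/0.1"
    return send(req)


def tag_response(req: Request, send: Send) -> Response:
    resp = send(req)
    # Step 3: a handler modifies the response on its way back.
    resp.headers["X-Seen-By"] = "tag_response"
    return resp


def answer_about(req: Request, send: Send) -> Response:
    # Step 1: a handler may break the chain and answer immediately.
    if req.url.startswith("about:"):
        return Response(200, "generated internally")
    return send(req)


def fake_server(req: Request) -> Response:
    # Step 2: stands in for the contacted server at the end of the chain.
    echo = {"X-Echo-UA": req.headers.get("User-Agent", "")}
    return Response(200, "document for " + req.url, echo)


chain = build_chain([add_header, tag_response, answer_about], fake_server)
```

With this chain, a request for an about: URL never reaches fake_server, but its response still passes back through tag_response, because that handler sits earlier in the chain.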

The resulting response is usually given to a dispatcher which tries to find an application that is suitable to display the document that is contained in the response. During this process, a document may be converted internally multiple times before an appropriate application is found. A common target application for any sort of hypertext including plain text is a viewer window, but if no application can be found, the dialog "Save Document" is invoked instead.
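The dispatching described above might look roughly like the following sketch. The tables of applications and converters, the content types and the conversion limit are assumptions for illustration only; the real dispatcher of w3browse may work differently.

```python
import html

# Hypothetical applications, keyed by the content type they can display.
APPLICATIONS = {
    "text/html": lambda body: "viewer window: " + body,
}

# Hypothetical converters: source type -> (target type, conversion function).
CONVERTERS = {
    "text/plain": ("text/html", lambda body: "<pre>" + html.escape(body) + "</pre>"),
    "text/csv": ("text/plain", lambda body: body.replace(",", "\t")),
}


def dispatch(content_type, body, max_conversions=5):
    """Convert the document until an application accepts it."""
    for _ in range(max_conversions + 1):
        application = APPLICATIONS.get(content_type)
        if application is not None:
            return application(body)
        converter = CONVERTERS.get(content_type)
        if converter is None:
            break
        content_type, convert = converter
        body = convert(body)
    # No suitable application was found: fall back to saving the document.
    return "Save Document dialog (" + content_type + ")"
```

In this sketch a text/csv document is converted twice (to text/plain, then to text/html) before the viewer accepts it, while an unknown type such as image/png ends up in the fallback.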

Request Contexts

A request context is usually created by the dialog "Open URL Window" and is constructed from the request parameters that are specified within the dialog, and from the request headers and other options as defined within the dialog "Request Settings". The request context is subsequently passed on to other windows, so that it is possible to have multiple URL windows and other windows such as viewer windows with different request contexts at the same time. Admittedly, it can become hard to keep track of which window uses which request context when too many windows are open simultaneously.

A request context is sometimes also called a request processing chain, because the internal processing of a request generally requires many steps to be performed in a particular order. Such a chain usually includes several components, some of which can be accessed directly by using certain special URLs, e.g. in order to inspect their state or to influence their operation.

In the following, all available components of a request processing chain are presented in the order in which they are traversed as a request is being processed, starting with an initial request:

  1. If the URL of the initial request is a file URL (type file:), then the request is directly passed to the corresponding handler which does the necessary request processing itself.

  2. All enabled static HTTP headers that are specified by text fields within the dialog "Request Settings" are added to the initial request. An additional Referer: header field is added if the option Send Referrer is enabled within the dialog and if the referring URL is not of type file:, about: or internal:. Authentication information that may be embedded within server-based URLs (the user:password@ part) is stripped before the referring URL is used for reference.

  3. If present, the fragment part #fragment of the URL of the request is stripped and remembered in order to be added again to the URL of the returned response.

  4. If the option Follow Redirects is enabled, a component that deals with redirects of GET and HEAD requests is inserted into the chain. In order to prevent endless loops, up to 10 redirects per initial request are allowed. The last response of such a loop is returned if this limit is exceeded. Relative URLs of the Location: header field of a response are resolved (in the usual way) with respect to the URL of the response.

  5. The URL of the request is normalized to a standard form. All components that follow in the chain assume this form. An invalid URL causes further processing of the request to be aborted. This step also adds the (normalized) URL of the request to the corresponding response for the sake of upper levels (the previous steps and the initiator of the request).

  6. If the parameter CacheDir is specified within the dialog "Open URL Window", a component that maintains a cache is inserted into the chain. All GET and HEAD requests that match already cached entries are satisfied from the cache. All other requests are passed through in the request chain, whereby the returned responses of successful GET requests are cached and are used to satisfy further requests. Responses to requests with URLs of type about: and internal: are never cached.

  7. A component that remembers the URLs of all successful GET requests that pass this step is added to the chain. This component implements the functionality for the recognition of visited links (see also the dialog "Color Settings") and provides an additional special request method that makes it possible to query the status of URLs. This request method is also supported by a cache, which additionally propagates negative (unsuccessful) queries.

  8. If the option Accept Cookies is enabled, a cookie handler is inserted into the chain as well as an additional component, the "Cookie Manager", which makes it possible to inspect and delete currently defined cookies. This manager can be accessed by using the special URL about:cookies. Cookies are not shared between different request contexts.

  9. An authorization handler is added to the request chain as well as an additional component, the "Authorization Manager", which makes it possible to inspect, add and delete the authorization records that are used to authenticate the user to a server when necessary. This manager can be accessed by using the special URL about:authorization, but it also intercepts "Authentication Required" messages when they are returned within responses. All authorization records are shared among all request contexts.

  10. If the parameter MailDir is specified within the dialog "Open URL Window", an instance of the "e-Mail Application" is inserted as a component into the request processing chain. This component can be accessed by using internal URLs that start with the prefix internal://mail/. All settings that are specific to an e-mail environment are made within the e-mail application; in particular, the mail accounts for sending and receiving messages are defined there. URLs of type mailto: are also handled by this component and are redirected to the e-mail composer.

  11. A component that catches all remaining requests to URLs that start with the prefix internal: or about: is always added immediately before the end of the request processing chain. Some of the caught requests are processed by certain "Internal Applications", which can be accessed by using the following internal URLs:

    internal://help/
        Refers to the pages of the built-in help system.
    internal://admin/
        Provides access to a collection of administration tools.

  12. The job of the final component of a request processing chain is to make requests to external servers and return their responses. Three possibilities are provided by the dialog "Open URL Window":

    1. If the option Offline is enabled, then connections to external servers are not made at all.
    2. If the parameter Proxy is specified, it defines the location of a proxy-server or gateway that is subsequently used to fetch documents.
    3. Otherwise, connections to external servers are made directly.
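Steps 2 to 4 of the list above can be illustrated with a short sketch based on Python's urllib.parse. It is not taken from w3browse: the function names, the send callback (which takes a URL and returns a status code and a header dictionary) and the set of status codes that are treated as redirects are assumptions. Only the limit of 10 redirects and the excluded URL types come from the description above.

```python
from urllib.parse import urldefrag, urljoin, urlsplit, urlunsplit

MAX_REDIRECTS = 10  # limit stated in step 4
REDIRECT_CODES = (301, 302, 303, 307, 308)  # assumption of this sketch


def referrer_for(url):
    """Step 2: return the URL to use in Referer:, or None to send none."""
    parts = urlsplit(url)
    if parts.scheme in ("file", "about", "internal"):
        return None
    # Strip an embedded user:password@ part from the network location.
    netloc = parts.netloc.rsplit("@", 1)[-1]
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


def fetch(url, send):
    """Steps 3 and 4: strip the fragment, follow redirects, re-add it.

    `send` is a hypothetical callback for the rest of the chain; it takes
    a URL and returns (status, headers).
    """
    base, fragment = urldefrag(url)
    status, headers = send(base)
    followed = 0
    while status in REDIRECT_CODES and "Location" in headers:
        if followed == MAX_REDIRECTS:
            break  # limit exceeded: the last response is returned
        # Relative locations are resolved against the URL of the response.
        base = urljoin(base, headers["Location"])
        status, headers = send(base)
        followed += 1
    final_url = base + ("#" + fragment if fragment else "")
    return status, final_url
```

For example, referrer_for("http://user:secret@example.org/a") yields "http://example.org/a", and a fetch of a URL with a fragment returns the final URL with the same fragment attached again.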

Restricted Request Contexts

Another way to create a request context is to invoke the built-in help system of w3browse by using the "Help Menu" or by using the context-sensitive keyboard shortcut Alt-H. This request context has certain limitations, because it cannot be built from the request parameters of the dialog "Open URL Window", but it can and in fact does make use of the request headers and other options of the dialog "Request Settings". Consequently, such a restricted request context consists only of those components that do not depend on the request parameters of the former dialog. This means that steps 6, 10 and 12 of the request processing chain are not available. Because the final component is missing, external resources cannot be accessed, and attempts to do so may be answered with an error message.
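The difference between a full and a restricted request context can be summarized in a small sketch. The step numbers follow the list in the section "Request Contexts"; the short component names and the function are hypothetical labels chosen for this illustration, and options that enable or disable individual steps are ignored here.

```python
# (step, short name, depends on parameters of the dialog "Open URL Window")
STEPS = [
    (1, "file URL handler", False),
    (2, "static request headers", False),
    (3, "fragment handling", False),
    (4, "redirect handling", False),
    (5, "URL normalization", False),
    (6, "cache (CacheDir)", True),
    (7, "visited links", False),
    (8, "cookie handler", False),
    (9, "authorization handler", False),
    (10, "e-mail application (MailDir)", True),
    (11, "internal and about URLs", False),
    (12, "external connections (Offline, Proxy)", True),
]


def available_steps(restricted):
    """Return the step numbers available in the given kind of context."""
    return [
        number
        for number, _name, needs_parameters in STEPS
        if not (restricted and needs_parameters)
    ]
```

Here available_steps(True) leaves out steps 6, 10 and 12, which matches the restricted request context of the help system.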