During form submission, the values of all currently active, so-called successful form elements contained in a particular form are sent to a server. Submission is initiated by activating a submit or image button. All information about a particular form can be displayed by invoking the FormInfo window.
A form is a container of form elements and defines the basic parameters used to submit the values of those elements that belong to the form and are suitable for submission. Most parameters can be specified on the originating element of the form, but others may come from other sources such as the document that contains the form.
The following parameters are relevant for form submission:
application/x-www-form-urlencoded
The request method specifies how the form data, once it has been processed, is transmitted to the server. Two common methods are available: GET and POST.
With GET requests, the form data is simply appended as a URL-encoded query part to the target URL of the receiver; see also section "Server-based URLs". In contrast to many other web browsers, w3browse is also able to append the form data to a query part that is already present in the target URL. In either case, the character encoding (charset) of the form data values cannot be transmitted to the server in a standard way, so the receiving process has to guess it or determine it by other means if needed.
With POST requests, the form data is sent in the body of the request, so data of any format and any size can be transmitted. It is also possible to specify the character encoding of the transmitted form data values for the benefit of receiving processes. Several common form data encodings can be used with POST requests; they are described in the next section.
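The GET case can be illustrated with a minimal Python sketch. It is not w3browse code: the function name `build_get_url`, the default charset and the decision to keep a fragment untouched are assumptions made for illustration only.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_get_url(target_url, fields, charset="utf-8"):
    """Append URL-encoded form fields to target_url (hypothetical helper).

    A query part that is already present in target_url is kept, and the
    form data is appended to it with "&".
    """
    parts = urlsplit(target_url)
    # Transcode to the assumed charset, then URL-encode names and values.
    encoded = urlencode(fields, encoding=charset, errors="replace")
    query = f"{parts.query}&{encoded}" if parts.query else encoded
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


url = build_get_url("http://example.org/search?lang=de", [("q", "grüße")])
print(url)  # http://example.org/search?lang=de&q=gr%C3%BC%C3%9Fe
```

Nothing in the resulting URL tells the server that UTF-8 was used; with `charset="iso-8859-1"` the same value would be sent as `gr%FC%DFe`, which is why the receiver has to guess.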
Before the form data can be sent to the server, it needs to be collected, transcoded, encoded and packed according to the basic form parameters. The process of transforming the form element values into the form data output that is finally transmitted consists of several steps:
The following encoding types are implemented in w3browse and are described in detail together with their handling in further subsections:
application/x-www-form-urlencoded
text/plain
multipart/form-data
The names and values of all suitable form elements are first transcoded to the target character encoding and then URL-encoded. Finally, the resulting names and values are packed as follows:
name1=value1&name2=value2&...
The name and the value of a name-value pair are separated by a "=" character, while adjacent name-value pairs are separated by "&" characters.
For POST requests, the target character encoding is always transmitted to the server by means of the charset parameter of the Content-Type: header field, e.g.
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Note that some old web server software has trouble when faced with such header field contents. Note also that file selection elements are ignored completely for this encoding type.
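The packing described for this encoding type can be sketched in Python as follows. The function name `encode_urlencoded` is hypothetical, and the sketch assumes that spaces are encoded as "+", as is usual for this media type; the text above does not say how w3browse encodes them.

```python
from urllib.parse import quote_plus


def encode_urlencoded(fields, charset="utf-8"):
    """Return (Content-Type value, body bytes) for a POST request.

    fields is a list of (name, value) string pairs; file selection
    elements are assumed to have been left out by the caller.
    """
    pairs = []
    for name, value in fields:
        # Transcode to the target charset first, then URL-encode.
        enc_name = quote_plus(name, encoding=charset, errors="replace")
        enc_value = quote_plus(value, encoding=charset, errors="replace")
        pairs.append(f"{enc_name}={enc_value}")
    body = "&".join(pairs).encode("ascii")
    content_type = f"application/x-www-form-urlencoded; charset={charset}"
    return content_type, body


content_type, body = encode_urlencoded([("name", "Jürgen Müller"), ("a&b", "1=2")])
print(content_type)  # application/x-www-form-urlencoded; charset=utf-8
print(body)          # b'name=J%C3%BCrgen+M%C3%BCller&a%26b=1%3D2'
```

Because "=" and "&" inside names and values are themselves URL-encoded, the packed output can be split again without ambiguity.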
The names and values of all suitable form elements are packed as follows:
name1=value1
name2=value2
...
The name and the value of a name-value pair are separated by a "=" character, while each name-value pair is terminated by CRLF, which causes all name-value pairs to appear on separate lines. Note that the contents of textarea form elements can also span multiple lines, which can lead to ambiguities when the output of such a form submission is parsed on the server.
Finally, the whole form data output is transcoded to the target character encoding before it is submitted. The character encoding used is always transmitted to the server by means of the charset parameter of the Content-Type: header field, e.g.
Content-Type: text/plain; charset=utf-8
Note that file selection elements are ignored completely for this encoding type.
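A short Python sketch of this encoding type, including the ambiguity caused by multi-line textarea values, might look like this. The function name `encode_text_plain` and the sample field names are made up for illustration.

```python
def encode_text_plain(fields, charset="utf-8"):
    """Return (Content-Type value, body bytes) for the text/plain encoding.

    Each name=value pair is terminated by CRLF; the whole output is
    transcoded to the target charset only at the end.
    """
    text = "".join(f"{name}={value}\r\n" for name, value in fields)
    body = text.encode(charset, errors="replace")
    return f"text/plain; charset={charset}", body


# Two fields are submitted, but the textarea value contains a line break.
content_type, body = encode_text_plain(
    [("user", "anna"), ("comment", "hi\r\nfake=1")]
)
print(body)  # b'user=anna\r\ncomment=hi\r\nfake=1\r\n'
```

A server that splits this body into lines sees three apparent name-value pairs, although only two fields were submitted.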
This is the most powerful encoding type, but also the most wasteful because of its overhead. The contents of all form elements including file selection elements (see next section) can be transmitted to the server, and it is further possible to attach character encoding information to each name-value pair separately.
The form data output that is to be transmitted to the server has the format of a multipart MIME message and looks like this:
--boundary
Content-Disposition: form-data; name="name1"
Content-Type: text/plain; charset=utf-8

value1
--boundary
Content-Disposition: form-data; name="name2"
Content-Type: text/plain; charset=utf-8

value2
--boundary
...
--boundary--
For each name-value pair that is to be submitted, a message part is created that starts with a boundary line and ends at either the next part or the terminating boundary line (the last line of the example). The names and values are transcoded before they are inserted into a part. The value of the charset parameter of the Content-Type: header field reflects the character encoding used. If a value is empty, the corresponding Content-Type: line is simply omitted, because it would not provide any useful information to the server in this case.
The Content-Type: header field of the form data output that is finally transmitted to the server looks like this:
Content-Type: multipart/form-data; boundary="boundary"
Note that the boundary value (with or without the "--" prefix and/or suffix) must not occur anywhere within the message contents; it is therefore recommended to choose a sufficiently long random string. The same applies to other boundary strings that may be used in subparts.
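The assembly of a multipart body for plain name-value pairs, together with the choice of a boundary, can be sketched in Python as follows. This is an illustration under assumptions: the names `make_boundary` and `encode_multipart` are hypothetical, the boundary is a random hex string, and quoting of special characters inside field names is not handled.

```python
import secrets


def make_boundary(payloads):
    """Pick a random boundary string that occurs in none of the payloads."""
    while True:
        boundary = secrets.token_hex(16)
        marker = boundary.encode("ascii")
        if not any(marker in payload for payload in payloads):
            return boundary


def encode_multipart(fields, charset="utf-8"):
    """Return (Content-Type value, body bytes) for text-only form fields."""
    # Transcode names and values before they are inserted into the parts.
    encoded = [
        (name.encode(charset, "replace"), value.encode(charset, "replace"))
        for name, value in fields
    ]
    boundary = make_boundary([item for pair in encoded for item in pair])
    delimiter = b"--" + boundary.encode("ascii")
    out = bytearray()
    for name, value in encoded:
        out += delimiter + b"\r\n"
        out += b'Content-Disposition: form-data; name="' + name + b'"\r\n'
        if value:
            # The Content-Type line is omitted for empty values.
            out += f"Content-Type: text/plain; charset={charset}\r\n".encode("ascii")
        out += b"\r\n" + value + b"\r\n"
    out += delimiter + b"--\r\n"
    return f'multipart/form-data; boundary="{boundary}"', bytes(out)


content_type, body = encode_multipart([("name1", "value1"), ("name2", "")])
print(content_type)
print(body.decode("utf-8"))
```

Checking the candidate boundary against the payloads makes the "must not occur" requirement explicit; with a sufficiently long random string the loop will practically never need a second attempt.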
The processing of file selection elements is more complex for two reasons: first, the contents of a file are sent instead of the element value, and second, one file, several files or no file at all may be sent:
If no files are to be sent, a file selection element is handled like an empty text element.
If only one file is to be sent, the format of the corresponding message part looks like this:
--boundary
Content-Disposition: form-data; name="name"; filename="basename"; modification-date="last-modified"

contents of file
The name is the assigned name of the file selection element, the basename denotes the basename component of the filename or it may be the value of othername in case one has been specified as a replacement for basename, and the value for last-modified is the last modification date of the file. The body of the message part is just the (unmodified, binary) contents of the file itself.
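The single-file case can be sketched in Python as shown below. The function name `file_part` is hypothetical, and the date format is an assumption: the text does not state how w3browse formats the last modification date, so an RFC 822 style date in GMT is used here.

```python
import os
from email.utils import formatdate


def file_part(boundary, field_name, path, othername=None):
    """Build the message part for a single selected file (hypothetical helper).

    othername, if given, replaces the basename of the file.
    """
    filename = othername or os.path.basename(path)
    # Assumed date format, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    modified = formatdate(os.path.getmtime(path), usegmt=True)
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"; modification-date="{modified}"\r\n'
        "\r\n"
    ).encode("utf-8")
    with open(path, "rb") as handle:
        content = handle.read()  # unmodified, binary file contents
    return header + content + b"\r\n"
```

A call such as `file_part("boundary", "upload", "/tmp/report.pdf", othername="q3.pdf")` would announce the file as `q3.pdf` while sending the contents of `/tmp/report.pdf`; both paths are made-up examples.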
If more than one file is to be sent, the corresponding message part becomes itself a multipart message of type multipart/mixed and looks like this:
--boundary
Content-Disposition: form-data; name="name"
Content-Type: multipart/mixed; boundary="boundary2"

--boundary2
Content-Disposition: file; filename="basename1"; modification-date="last-modified1"

contents of file1
--boundary2
Content-Disposition: file; filename="basename2"; modification-date="last-modified2"

contents of file2
--boundary2
...
--boundary2--
The values of the parameters are similar to those of the previous one file case, so their description is not repeated here.
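The nested structure for several files can be sketched in Python as follows. To keep the example self-contained, the files are passed in memory as (basename, last-modified, contents) tuples; the function name `multi_file_part` and this calling convention are assumptions, not part of w3browse.

```python
def multi_file_part(boundary, boundary2, field_name, files):
    """Build one form-data part that wraps several files as multipart/mixed.

    files is a list of (basename, last_modified, content_bytes) tuples.
    boundary2 must differ from boundary and must not occur in any file.
    """
    out = bytearray()
    out += f"--{boundary}\r\n".encode("ascii")
    out += f'Content-Disposition: form-data; name="{field_name}"\r\n'.encode("utf-8")
    out += f'Content-Type: multipart/mixed; boundary="{boundary2}"\r\n\r\n'.encode("ascii")
    for basename, last_modified, content in files:
        out += f"--{boundary2}\r\n".encode("ascii")
        out += (
            f'Content-Disposition: file; filename="{basename}"; '
            f'modification-date="{last_modified}"\r\n\r\n'
        ).encode("utf-8")
        out += content + b"\r\n"
    # Only the inner multipart is closed here; the outer form data
    # continues with the next part or the terminating boundary line.
    out += f"--{boundary2}--\r\n".encode("ascii")
    return bytes(out)


part = multi_file_part(
    "outer", "inner", "attachments",
    [("a.txt", "Thu, 01 Jan 1970 00:00:00 GMT", b"first"),
     ("b.txt", "Thu, 01 Jan 1970 00:00:00 GMT", b"second")],
)
print(part.decode("utf-8"))
```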
The ability to send othername as a replacement for basename and the transmission of the last modification date of a file are extensions specific to w3browse; most if not all other web browsers are not capable of this. Even the ability to submit more than one file per file selection element is not widely supported.