The Hypertext Transfer Protocol (HTTP) is an application-level
protocol for distributed, collaborative, hypermedia information systems.
This is the foundation for data communication for the World Wide Web
(ie. internet) since 1990. HTTP is a generic and stateless protocol
which can be used for other purposes as well using extension of its
request methods, error codes and headers.
Basically, HTTP is an TCP/IP based communication protocol, which is used to deliver data (HTML files, image files, query results etc) on the World Wide Web. The default port is TCP 80, but other ports can be used. It provides a standardized way for computers to communicate with each other. HTTP specification specifies how clients request data will be constructed and sent to the serve, and how servers respond to these requests.
The HTTP protocol is a request/response protocol based on client/server based architecture where web browser, robots and search engines, etc. act like HTTP clients and Web server acts as server.
All content-coding values are case-insensitive. HTTP/1.1 uses content-coding values in the Accept-Encoding and Content-Encoding header fields which we will see in subsequent chapters.
An HTTP "client" is a program (Web browser or any other client) that establishes a connection to a server for the purpose of sending one or more HTTP request messages. An HTTP "server" is a program ( generally a web server like Apache Web Server or Internet Information Services IIS etc. ) that accepts connections in order to serve HTTP requests by sending HTTP response messages.
HTTP makes use of the Uniform Resource Identifier (URI) to identify a given resource and to establish a connection. Once connection is established, HTTP messages are passed in a format similar to that used by Internet mail [RFC5322] and the Multipurpose Internet Mail Extensions (MIME) [RFC2045]. These messages are consisted of requests from client to server and responses from server to client which will have following format:
Basically, HTTP is an TCP/IP based communication protocol, which is used to deliver data (HTML files, image files, query results etc) on the World Wide Web. The default port is TCP 80, but other ports can be used. It provides a standardized way for computers to communicate with each other. HTTP specification specifies how clients request data will be constructed and sent to the serve, and how servers respond to these requests.
Basic Features
There are following three basic features which makes HTTP a simple but powerful protocol:- HTTP is connectionless: The HTTP client ie. browser initiates an HTTP request and after a request is made, the client disconnects from the server and waits for a response. The server process the request and re-establish the connection with the client to send response back.
- HTTP is media independent: This means, any type of data can be sent by HTTP as long as both the client and server know how to handle the data content. This is required for client as well as server to specify the content type using appropriate MIME-type.
- HTTP is stateless: As mentioned above, HTTP is a connectionless and this is a direct result that HTTP is a stateless protocol. The server and client are aware of each other only during a current request. Afterwards, both of them forget about each other. Due to this nature of the protocol, neither the client nor the browser can retain information between different request across the web pages.
HTTP/1.0 uses a new connection for each request/response exchange where as HTTP/1.1 connection may be used for one or more request/response exchanges.
Basic Architecture
Following diagram shows a very basic architecture of a web application and depicts where HTTP sits:The HTTP protocol is a request/response protocol based on client/server based architecture where web browser, robots and search engines, etc. act like HTTP clients and Web server acts as server.
Client
The HTTP client sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME-like message containing request modifiers, client information, and possible body content over a TCP/IP connection.Server
The HTTP server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity metainformation, and possible entity-body content.HTTP - Parameters
This chapter is going to list down few of the important HTTP Protocol Parameters and their syntax in a way they are used in the communication. For example, format for date, format of URL etc. This will help you in constructing your request and response messages while writing HTTP client or server programs. You will see complete usage of these parameters in subsequent chapters while explaining message structure for HTTP requests and responses.HTTP Version
HTTP uses a <major>.<minor> numbering scheme to indicate versions of the protocol. The version of an HTTP message is indicated by an HTTP-Version field in the first line. Here is the general syntax of specifying HTTP version number:HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
Example
HTTP/1.0 or HTTP/1.1
Uniform Resource Identifiers (URI)
Uniform Resource Identifiers (URI) is simply formatted, case-insensitive string containing name, location etc to identify a resource, for example a website, a web service etc. A general syntax of URI used for HTTP is as follows:URI = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]Here if the port is empty or not given, port 80 is assumed for HTTP and an empty abs_path is equivalent to an abs_path of "/". The characters other than those in the reserved and unsafe sets are equivalent to their ""%" HEX HEX" encoding.
Example
Following two URIs are equivalent:http://abc.com:80/~smith/home.html http://ABC.com/%7Esmith/home.html http://ABC.com:/%7esmith/home.html
Date/Time Formats
All HTTP date/time stamps MUST be represented in Greenwich Mean Time (GMT), without exception. HTTP applications are allowed to use any of the following three representations of date/time stamps:Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
Character Sets
You use character set to specify the character sets that the client prefers. Multiple character sets can be listed separated by commas. If a value is not specified, the default is US-ASCII.Example
Following are valid character sets:US-ASCII or ISO-8859-1 or ISO-8859-7
Content Encodings
A content ecoding values indicate an encoding algorithm has been used to encode the content before passing it over the network. Content codings are primarily used to allow a document to be compressed or otherwise usefully transformed without losing the identity.All content-coding values are case-insensitive. HTTP/1.1 uses content-coding values in the Accept-Encoding and Content-Encoding header fields which we will see in subsequent chapters.
Example
Following are valid encoding schemes:Accept-encoding: gzip or Accept-encoding: compress or Accept-encoding: deflate
Media Types
HTTP uses Internet Media Types in the Content-Type and Accept header fields in order to provide open and extensible data typing and type negotiation. All the Media-type values are registered with the Internet Assigned Number Authority ((IANA). Following is a general syntax to specify media type:media-type = type "/" subtype *( ";" parameter )The type, subtype, and parameter attribute names are case- insensitive.
Example
Accept: image/gif
Language Tags
HTTP uses language tags within the Accept-Language and Content-Language fields. A language tag is composed of 1 or more parts: A primary language tag and a possibly empty series of subtags:language-tag = primary-tag *( "-" subtag )White space is not allowed within the tag and all tags are case- insensitive.
Example
Example tags include:en, en-US, en-cockney, i-cherokee, x-pig-latinWhere any two-letter primary-tag is an ISO-639 language abbreviation and any two-letter initial subtag is an ISO-3166 country code.
HTTP - Messages
HTTP is based on client-server architecture model and a stateless request/response protocol that operates by exchanging messages across a reliable TCP/IP connection.An HTTP "client" is a program (Web browser or any other client) that establishes a connection to a server for the purpose of sending one or more HTTP request messages. An HTTP "server" is a program ( generally a web server like Apache Web Server or Internet Information Services IIS etc. ) that accepts connections in order to serve HTTP requests by sending HTTP response messages.
HTTP makes use of the Uniform Resource Identifier (URI) to identify a given resource and to establish a connection. Once connection is established, HTTP messages are passed in a format similar to that used by Internet mail [RFC5322] and the Multipurpose Internet Mail Extensions (MIME) [RFC2045]. These messages are consisted of requests from client to server and responses from server to client which will have following format:
HTTP-message = <Request> | <Response> ; HTTP/1.1 messagesHTTP request and HTTP response use a generic message format of RFC 822 for transferring the required data. This generic message format consists of following four items.
- A Start-line
- Zero or more header fields followed by CRLF
- An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields
- Optionally a message-body
No comments:
Post a Comment