Web Application Penetration Testing - HTTP Protocol

You can’t start web application penetration testing without first studying the protocol it is built on: HTTP, or its secure version, HTTPS.

So, what is the HTTP protocol?

HTTP (HyperText Transfer Protocol) is a client-server protocol used to transfer web pages and web application data.

The client and the server exchange messages: the client sends requests and the server answers with responses.

Web clients: normally web browsers (Firefox, Chrome, Internet Explorer, Edge, Chromium, ...), but they can also be tools (netcat, spidering tools, custom scripts, ...)
Web servers: for example Apache, Microsoft IIS, Google's web server (gws), ...

An HTTP request or response consists of a header section and an optional body.

Let us look at a simple example of each.

HTTP Request :

GET / HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, deflate
Connection: keep-alive

So, what does each line mean? Let’s see:

GET : called the request verb (or method); it is the default when you type a URL and press Enter in your browser,

followed by “/”, which is the path [it could be a folder name, a page name, ... such as /images/img.jpg],

followed by the HTTP protocol version, 1.1 [HTTP/1.1]

There are other verbs, such as POST, PUT, DELETE, OPTIONS, TRACE…
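To make the three parts of the request line concrete, here is a minimal Python sketch that splits it apart. The helper name parse_request_line is made up for this illustration, and it assumes a well-formed line with single spaces between the parts.

```python
def parse_request_line(line):
    """Split an HTTP request line into (verb, path, version)."""
    # Hypothetical helper: assumes exactly three space-separated parts.
    verb, path, version = line.strip().split(" ")
    return verb, path, version


verb, path, version = parse_request_line("GET / HTTP/1.1")
print(verb)     # GET
print(path)     # /
print(version)  # HTTP/1.1
```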

Note that a TCP connection to the webserver (www.google.com) on port 80 is initiated before sending HTTP commands.
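The order "TCP connection first, HTTP text second" can be sketched with a plain socket. To keep the example runnable without Internet access, it assumes a throwaway local server on 127.0.0.1 standing in for www.google.com on port 80; DemoHandler and send_raw_request are names invented for this sketch.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Throwaway local server (an assumption, stands in for a real site)."""

    def do_GET(self):
        body = b"<PAGE CONTENTS>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default request logging


def send_raw_request(host, port, path="/"):
    """Open a TCP connection, send a GET request, return the raw response."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    # Step 1: the TCP connection is established first...
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: ...and only then is the HTTP request sent over it.
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
raw_response = send_raw_request("127.0.0.1", server.server_port)
server.shutdown()
server.server_close()
print(raw_response.decode("iso-8859-1"))
```

Against a real site you would pass its host name and port 80 instead of the local address; this is roughly what tools like netcat let you do by hand.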

Host header allows a web server to host multiple websites at a single IP address.

The browser uses the Host header to specify which website you are requesting.

Note: the Host value and the path combine to form the full URL you are requesting: www.google.com/, the home page, in our example
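As a small illustration of how the Host value and the path combine, the sketch below rebuilds the URL with Python's standard urllib.parse module. The helper build_url and the default "http" scheme are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def build_url(host, path, scheme="http"):
    """Combine the Host header value and the request path into a URL."""
    # Hypothetical helper; query string and fragment are left empty.
    return urlunsplit((scheme, host, path, "", ""))


url = build_url("www.google.com", "/")
print(url)  # http://www.google.com/

# Going the other way: split a URL back into Host value and path.
parts = urlsplit(url)
print(parts.netloc)  # www.google.com
print(parts.path)    # /
```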

User-Agent reveals your browser and its version and your operating system to the remote web server (older browsers also included the language).

Accept header is used by the browser to specify which document types (media types) it is willing to receive in the response to this request.

Accept-Encoding : is similar to Accept, but it restricts the content codings that are acceptable in the response.

Content codings are primarily used to allow a document to be compressed or transformed without losing the identity of its media type and without loss of information.

Connection: keep-alive : With HTTP 1.1 you can keep your connection to the web server open for an unspecified amount of time using the value “keep-alive”.
This indicates that subsequent requests to the web server will be sent through the same connection instead of opening a new connection every time (which was an HTTP 1.0 issue). In HTTP 1.1 persistent connections are the default, so the header mainly matters when talking to HTTP 1.0 peers.
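The keep-alive rule can be written down as a few lines of logic. This is only a sketch of the decision, not a client implementation: parse_headers and is_persistent are hypothetical helpers, and they assume each header sits on one line.

```python
def parse_headers(header_lines):
    """Turn 'Name: value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def is_persistent(version, headers):
    """Decide whether the connection stays open after this message."""
    connection = headers.get("connection", "")
    tokens = [token.strip().lower() for token in connection.split(",")]
    if "close" in tokens:
        return False
    if version == "HTTP/1.1":
        return True  # HTTP/1.1 connections are persistent by default
    return "keep-alive" in tokens  # HTTP/1.0 has to opt in


request_headers = parse_headers([
    "Host: www.google.com",
    "Accept-Encoding: gzip, deflate",
    "Connection: keep-alive",
])
print(is_persistent("HTTP/1.1", request_headers))          # True
print(is_persistent("HTTP/1.0", {}))                       # False
print(is_persistent("HTTP/1.1", {"connection": "close"}))  # False
```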

HTTP Response :

HTTP/1.1 200 OK
Date: Fri, 13 Mar 2015 11:26:05 GMT
Cache-Control: private, max-age=0
Content-Type: text/html; charset=UTF-8
Content-Encoding: gzip
Server: gws
Content-Length: 258

<PAGE CONTENTS>

HTTP/1.1 200 OK : the status line. It starts with the protocol version (HTTP/1.1), followed by a numeric status code (200) and its textual meaning (OK).
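The status line can be taken apart the same way as the request line. A minimal sketch, assuming a well-formed line; parse_status_line is a name invented for this example.

```python
def parse_status_line(line):
    """Split a status line into (version, numeric code, reason phrase)."""
    # maxsplit=2 keeps multi-word reason phrases such as "Not Found" intact.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 200 OK"))
# ('HTTP/1.1', 200, 'OK')
print(parse_status_line("HTTP/1.1 404 Not Found"))
# ('HTTP/1.1', 404, 'Not Found')
```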

Common status codes are:

200 OK, the request succeeded and the resource is returned.
301 Moved Permanently, the requested resource has been assigned a new permanent URI.
302 Found, the resource is temporarily under another URI.
403 Forbidden, the client does not have enough privileges and the server refuses to fulfill the request.
404 Not Found, the server cannot find a resource matching the request.
500 Internal Server Error, the server encountered an unexpected condition that prevented it from fulfilling the request.
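The first digit of a status code tells you its class, which is handy when you triage many responses during a test. The sketch below uses Python's standard http.HTTPStatus enum to look up the reason phrases; the describe helper and the category labels are this example's own wording.

```python
from http import HTTPStatus

# The first digit of the code identifies the class of the response.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return 'code phrase (category)' for a standard status code."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({CATEGORIES[code // 100]})"


for code in (200, 301, 302, 403, 404, 500):
    print(describe(code))
# 200 OK (success)
# 301 Moved Permanently (redirection)
# 302 Found (redirection)
# 403 Forbidden (client error)
# 404 Not Found (client error)
# 500 Internal Server Error (server error)
```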

Date : the date and time at which the message was generated.

Cache headers : allow the browser and the server to agree on caching rules.
They prevent your browser from re-requesting content that has not changed since the first request, which saves bandwidth.
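To see what a cache header actually carries, here is a small sketch that splits the Cache-Control value from the example response into its directives. parse_cache_control is a hypothetical helper and handles only the simple comma-separated form.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (like "private") are stored as True.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


directives = parse_cache_control("private, max-age=0")
print(directives)  # {'private': True, 'max-age': '0'}
```

Here "private" means shared caches (such as proxies) should not store the response, and "max-age=0" means the stored copy is stale immediately, so the browser has to check with the server before reusing it.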

Content-Type : lets the client know how to interpret the body of the message.

Content-Encoding complements Content-Type: it lists the content codings that were applied to the message body.
In this case the message body is compressed with gzip.
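The gzip content coding is easy to try out with Python's standard gzip module. This is a sketch of what a server and a client do with the body; the sample HTML and the function names are made up for the example.

```python
import gzip


def encode_body(body):
    """What a server does before sending 'Content-Encoding: gzip'."""
    return gzip.compress(body)


def decode_body(body, content_encoding):
    """What a client does after reading the Content-Encoding header."""
    if content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body  # no coding applied, use the bytes as they are


html = ("<html><body>" + "<p>Hello, HTTP!</p>" * 50 + "</body></html>").encode("utf-8")
compressed = encode_body(html)
restored = decode_body(compressed, "gzip")

print(len(html), "bytes before,", len(compressed), "bytes after compression")
print(restored == html)  # True: nothing is lost, the media type is unchanged
```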

Server header : displays the Web Server banner.
Apache and IIS are common web servers.

Google uses a custom webserver banner: gws (that stands for Google Web Server).

Content-Length indicates the length, in bytes, of the message body.
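Note the words "in bytes": the value is not a character count. A quick sketch, using a made-up four-character body, shows the difference once a non-ASCII character is involved.

```python
def content_length(text, charset="utf-8"):
    """Length of the body in bytes once it is encoded for the wire."""
    return len(text.encode(charset))


body_text = "caf\u00e9"  # "café": 4 characters, the last one is non-ASCII

print(len(body_text))             # 4 (characters)
print(content_length(body_text))  # 5 (bytes in UTF-8)
```

When a content coding such as gzip is applied, Content-Length refers to the encoded (compressed) bytes that are actually sent.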

<Page Contents> is the actual content of the requested resource.
The content can be an HTML page, a document, or even a binary file.
The type of the content is, of course, given in the Content-Type header.

For deeper digging into HTTP/1.1, see RFC 2616 (now obsoleted; the current specifications are RFC 9110 and RFC 9112).

HTTP is a clear-text protocol, so it can easily be intercepted or altered by an attacker on the way to its destination. That is why we need HTTPS.

HTTPS or HTTP over SSL/TLS is a method to run HTTP, which is a clear-text protocol, over SSL/TLS, a cryptographic protocol.

HTTPS adds confidentiality, integrity protection and authentication to HTTP.
So an attacker can neither sniff nor alter the application-layer data,
and the client can verify the real identity of the server from its certificate.
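On the client side, the authentication part comes down to certificate and host name checks. The sketch below uses Python's standard ssl module to show that a default client context enforces both; fetch_certificate_subject is a hypothetical helper that needs network access, so it is defined here but not called.

```python
import socket
import ssl

# A default client-side context verifies the server certificate chain
# and checks that the certificate matches the requested host name.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True


def fetch_certificate_subject(host, port=443):
    """Connect over TLS and return the subject of the server certificate."""
    # Hypothetical helper: requires network access to the given host.
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["subject"]
```

If the certificate is invalid or does not match the host name, wrap_socket raises an error instead of silently continuing, which is the behaviour you want to confirm when testing a client.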

HTTPS does not protect against web application flaws such as XSS and SQL injection; it only protects the data exchanged between the client and the server.

You may use the Developer tools from your browser to check any HTTP/S request and response [Firefox: Developer Tools , then Network].
