HTTP Transport Basics for Web Developers

By Najaf Ali

All websites work using a system called HTTP. As a web developer you've seen parts of HTTP in your web browser and web apps, so chances are you're already quite familiar with it.

HTTP is based on requests and responses. A web browser (like Chrome or Firefox) sends a request to another computer running a web server (like nginx, Apache or your Rails application). The web server then sends back a response. The response contains the body of a web page (i.e. all the HTML required to render a page).

Sending Messages Over TCP

Actually we're jumping ahead a bit here. We haven't really said how these messages get passed back and forth between computers.

We're not going to go down to the bare metal, but it's worth getting a feel for how computers pass messages to each other over the internet.

Before we can send any messages from one computer to another, we need to establish a TCP connection. You can think of a TCP connection as a pipe between two computers. You can push data into one end of the pipe and that same data will reliably appear at the other end.

To make a TCP connection, you need to know the host you're connecting to. TCP also has the concept of 'ports', which you can think of as door numbers on a computer. Different programs listen on different port numbers on a given host computer.

HTTP requests and responses are sent across a TCP connection. HTTP servers listen on port 80 by default, though your local Rails application probably listens for connections on port 3000.
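To make the host-and-port idea concrete, here is a minimal Python sketch that checks whether anything is listening on a given host and port. The function name is_listening is made up for illustration; it only uses the standard library's socket module.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    is_listening is a hypothetical helper name, not part of any library.
    """
    try:
        # create_connection resolves the host and opens a TCP connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts and lookup failures all end up here
        return False
```

Calling is_listening("localhost", 3000) should return True while a local Rails application (or anything else) is bound to port 3000, and False otherwise.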

Let's work with TCP on the command line (play along, it'll be fun!). A useful tool for working with TCP is netcat. You can install netcat using Homebrew:

λ brew install netcat

Netcat works in a similar way to the Unix cat command, except that it outputs data to a TCP host and port instead of to your terminal window. It also has a 'listen' mode, where it waits for connections and outputs any data it gets from them.

In one terminal window, run this command:

# use netcat to listen for incoming TCP connections on port 3000
λ nc -lp 3000

Note: This won't work if you have something else bound to port 3000, like a Rails app that you happen to be working on. If 3000 doesn't work, try another port, like 3001.

That instance of netcat is now waiting for a connection. Once you connect to it, it will act like one end of the 'pipe' we were talking about earlier.

In another terminal window, run this command:

# use netcat to establish a TCP connection
λ nc localhost 3000

You've set up both ends of the TCP connection. If you type into one of those terminal windows and hit enter, you should see whatever you typed appear in the other window. Any data put through one end of the pipe will appear at the other end of the pipe.

Once you're done playing with it, terminate the connection from either end by going into either of the two windows and hitting Ctrl+c.
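The same two-ended pipe can be sketched in Python with the standard socket module. This is only an illustration of the idea, not how netcat is implemented: the names run_listener and pipe_demo are invented here, and the listener runs in a thread so that both ends fit in one script.

```python
import socket
import threading


def run_listener(server: socket.socket, received: list) -> None:
    """Accept one connection, record one line from it, then reply."""
    conn, _addr = server.accept()
    with conn, conn.makefile("rb") as reader:
        received.append(reader.readline())
        conn.sendall(b"hello from the listener\n")


def pipe_demo() -> tuple:
    """Set up both ends of a TCP connection on this machine."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the operating system for any free port
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    received = []
    listener = threading.Thread(target=run_listener, args=(server, received))
    listener.start()

    # The connecting end: push a line in, then read the line that comes back
    with socket.create_connection(("127.0.0.1", port)) as client:
        with client.makefile("rb") as reader:
            client.sendall(b"hello from the client\n")
            reply = reader.readline()

    listener.join()
    server.close()
    return received[0], reply
```

pipe_demo() returns what each end read from the connection: the listener sees the client's line and the client sees the listener's line, just like the two terminal windows.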

Sending a HTTP Request

HTTP (usually) uses TCP connections for its requests and responses. Your web server or Rails application is listening for connections, usually on port 80. Your web browser creates a TCP connection on port 80 to a computer identified by a host. A host of localhost tells the web browser to connect to the computer it's running on.

Your web browser and web server are the two ends of the pipe.

Try this:

  • As before, use netcat to listen for connections on a port. 3000 is fine if it's available.
  • Open your web browser, type localhost:3000 into your address bar and press enter (it will hang).
  • Take a look at the contents of the terminal window running the listening netcat process. What do you see?

This is a HTTP request that your browser sent to port 3000 on localhost. Using Google Chrome on my MacBook, it looks like this:

GET / HTTP/1.1
Host: localhost:3000
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36...
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8

Don't kill this process just yet. We can actually send an (albeit malformed) HTTP response to the web browser now.
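The request text your browser sent has a simple structure: a request line (method, path and HTTP version), then one "Name: value" header per line, then a blank line. On the wire each line ends with a carriage return and a newline. The following Python sketch splits a raw request into those parts; parse_request is a hypothetical helper written for this article, not a library function, and it skips the edge cases a real server has to handle.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into method, path, version, headers and body."""
    # The blank line (CRLF CRLF) separates the head from the body
    head, _sep, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line looks like: GET / HTTP/1.1
    method, path, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so "localhost:3000" stays intact
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, path, version, headers, body
```

Header names are lower-cased here because HTTP treats them as case-insensitive, so the Host header of the request above would come back under the key "host".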

Returning a HTTP Response

Now, just as you did with the two terminal windows, you can send a message back to the web browser. Try this:

  • In the terminal window where you're running netcat, type in <h1>Hello From Netcat</h1>.
  • Hit enter.
  • Kill the connection with Ctrl+c.
  • Have a look at the web browser making the request. You should see the title you sent back in the <h1> tags.

You've now sent data back down the pipe to the web browser, which treats it as HTML and renders the content.

A real HTTP response is split into headers and a response body, but browsers tend to be forgiving in what they'll render so your h1 tag should appear as expected.
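For comparison, here is a rough Python sketch of what a well-formed response looks like: a status line, a few headers, a blank line and then the body. build_response is an invented name, and the set of headers is just a reasonable minimum chosen for illustration. Some current browsers may be stricter about header-less responses than the one used above, in which case sending something like this is the safer bet.

```python
def build_response(body: str, status: str = "200 OK") -> bytes:
    """Build a minimal HTTP/1.1 response carrying an HTML body."""
    payload = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        # Content-Length counts bytes of the encoded body, not characters
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        "\r\n"  # blank line separating headers from the body
    )
    return head.encode("ascii") + payload
```

build_response("<h1>Hello From Netcat</h1>") produces the bytes a server would write back down the TCP connection in place of the bare h1 tag.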

This is all a web application does

Any web application that you create will do three things:

  • Listen for incoming TCP connections on a port (probably 80).
  • Receive HTTP requests over the connection, each with a method, host, path, body and headers.
  • Based on the contents of the HTTP request, build and send a HTTP response back down the TCP connection.

That is literally all they do. The details of all this are hidden away from you by your programming language, libraries and other tools you use.

Everything your web application does is driven by the data that comes out of an anonymous TCP connection.
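The three steps listed above can be sketched in a few dozen lines of Python using nothing but sockets. This is a toy under stated assumptions, not how Rails or any real server works: listen_on, handle_connection and serve are names made up here, only the request line is inspected, and each connection is handled one at a time and then closed.

```python
import socket


def listen_on(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Step 1: listen for incoming TCP connections on a port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()
    return server


def handle_connection(conn: socket.socket) -> None:
    """Steps 2 and 3: read one request, then build and send a response."""
    with conn:
        data = b""
        # Keep reading until the blank line that ends the request headers
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                return  # the other end hung up early
            data += chunk

        request_line = data.split(b"\r\n", 1)[0].decode("iso-8859-1")
        parts = request_line.split(" ")
        if len(parts) != 3:
            status, body = "400 Bad Request", "<h1>Bad Request</h1>"
        elif parts[0] == "GET" and parts[1] == "/":
            status, body = "200 OK", "<h1>Hello From Python</h1>"
        else:
            status, body = "404 Not Found", "<h1>Not Found</h1>"

        payload = body.encode("utf-8")
        head = (
            f"HTTP/1.1 {status}\r\n"
            "Content-Type: text/html; charset=utf-8\r\n"
            f"Content-Length: {len(payload)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        conn.sendall(head.encode("ascii") + payload)


def serve(server: socket.socket, max_requests: int) -> None:
    """Handle a fixed number of connections, one after another."""
    for _ in range(max_requests):
        conn, _addr = server.accept()
        handle_connection(conn)
```

Running serve(listen_on(3000), max_requests=1) and then opening localhost:3000 in a browser should show the heading, much like the netcat experiment, except that this time the response is well formed.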

Further Reading