HTTP Transport Basics for Web Developers
All websites work using a system called HTTP. As a web developer you've seen parts of HTTP in your web browser and web apps, so chances are you're already quite familiar with it.
HTTP is based on requests and responses. A web browser (like Chrome or Firefox) sends a request to another computer running a web server (like nginx, Apache or your Rails application). The web server then sends back a response. The response contains the body of a web page (i.e. all the HTML required to render a page).
Sending Messages Over TCP
Actually we're jumping ahead a bit here. We haven't really said how these messages get passed back and forth between computers.
We're not going to go down to the bare metal, but it's worth getting a feel for how computers pass messages to each other over the internet.
Before we can send any messages from one computer to another, we need to establish a TCP Connection. You can think of a TCP connection as a pipe between two computers. You can push data into one end of the pipe and that same data will reliably appear at the other end.
To make a TCP connection, you need to know the host you're connecting to (e.g. example.com). TCP also has the concept of 'ports' which you can think of as a door number on a computer. Different programs connect to different port numbers on a given host computer.
HTTP requests and responses are sent across a TCP connection. HTTP servers listen on port 80 by default, though your local Rails application probably listens for connections on port 3000.
Let's work with TCP on the command line (play along, it'll be fun!). A useful tool for working with TCP is netcat. You can install netcat using:
λ brew install netcat
Netcat works in a similar way to the cat Unix command, except that it outputs data to a TCP host and port instead of to your terminal window. It also has a 'listen' mode, where it waits for a connection and outputs any data it receives.
In one terminal window, run this command:
# use netcat to listen for incoming TCP connections on port 3000
λ nc -lp 3000
Note: This won't work if you have something else bound to port 3000, like a rails app that you happen to be working on. If 3000 doesn't work, try something else, like 3001.
That instance of netcat is now waiting for a connection. Once you connect to it, it will act like one end of the 'pipe' we were talking about earlier.
In another terminal window, run this command:
# use netcat to establish a TCP connection
λ nc localhost 3000
You've set up both ends of the TCP connection. If you type into one of those terminal windows and hit enter, you should see whatever you typed appear in the other window. Any data put through one end of the pipe will appear at the other end of the pipe.
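If you'd rather see the same pipe from code, here is a minimal Python sketch of the two netcat windows, using only the standard library. The function name `listen_once` and the messages are made up for illustration, and port 0 is used so the operating system picks a free port instead of assuming 3000 is available.

```python
import socket
import threading


def listen_once(server: socket.socket, received: list) -> None:
    """Accept one connection, keep what arrives, and send something back."""
    conn, _addr = server.accept()
    with conn:
        received.append(conn.recv(1024))
        conn.sendall(b"hello from the listener\n")


# The listening end, roughly what `nc -lp 3000` does.
# Port 0 asks the operating system for any free port.
server = socket.create_server(("127.0.0.1", 0))
port = server.getsockname()[1]

received = []
listener = threading.Thread(target=listen_once, args=(server, received))
listener.start()

# The connecting end, roughly what `nc localhost 3000` does.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello from the client\n")
    reply = client.recv(1024)

listener.join()
server.close()

print(received[0])
print(reply)
```

Each side writes bytes into its end of the connection and reads whatever the other side wrote, which is all the two terminal windows were doing.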
Once you're done playing with it, terminate the connection from either end by going into either of the two windows and hitting Ctrl-C.
Sending a HTTP Request
HTTP (usually) uses TCP connections for its requests and responses. Your web server or Rails application is listening for connections, usually on port 80. Your web browser creates a TCP connection on port 80 to a computer identified by a host. A host of localhost tells the web browser to connect to the computer it's running on.
Your web browser and web server are the two ends of the pipe.
- As before, use netcat to listen for connections on a port; 3000 is fine if it's available.
- Open your web browser, type localhost:3000 into your address bar and press enter (it will hang).
- Take a look at the contents of the terminal window running the listening netcat process. What do you see?
This is a HTTP Request that your browser sent to port 3000 on localhost. Using Google Chrome on my MacBook, it looks like this:
GET / HTTP/1.1
Host: localhost:3000
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36...
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
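The request is plain text with a predictable shape: a request line (method, path, version), then one header per line, then a blank line. As a rough sketch of how a server might pick that apart, here is a small Python function. The name `parse_request` is hypothetical, and the sample bytes are a shortened version of the request above with the line endings (`\r\n`) written out explicitly.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request into request line, headers and body."""
    # A blank line (two consecutive line endings) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line: method, path and protocol version, separated by spaces.
    method, path, version = lines[0].split(" ", 2)

    # Remaining lines: "Name: value" headers. Header names are case-insensitive,
    # so they are stored in lower case here.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, path, version, headers, body


raw = (
    b"GET / HTTP/1.1\r\n"
    b"Host: localhost:3000\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)

method, path, version, headers, body = parse_request(raw)
print(method, path, version)  # GET / HTTP/1.1
print(headers["host"])        # localhost:3000
```

This ignores plenty of detail that real servers handle (repeated headers, malformed input, request bodies that arrive in several pieces), but it shows that the request is just structured text coming out of the pipe.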
Don't kill this process just yet. We can actually send an (albeit malformed) HTTP response to the web browser now.
Returning a HTTP Response
Now, just like you did with the two terminal windows, you can send a message back to the web browser. Try this:
- In the terminal window where you're running netcat, type in <h1>Hello From Netcat</h1>.
- Hit enter.
- Kill the connection with Ctrl-C.
- Have a look at the web browser making the request. You should see the heading you sent back rendered on the page.
You've now sent data back down the pipe to the web browser, which treats it as HTML and renders the content.
A real HTTP response is split into headers and a response body, but browsers tend to be forgiving in what they'll render, so your h1 tag should appear as expected.
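For comparison, here is a sketch of what a well-formed response to the same request could look like, built in Python. The helper name `build_response` is hypothetical, and the particular headers are one reasonable choice rather than a required set.

```python
def build_response(body: str, status: str = "200 OK") -> bytes:
    """Build a minimal HTTP/1.1 response: status line, headers, blank line, body."""
    payload = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        # Content-Length is counted in bytes, not characters.
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        "\r\n"  # the blank line that separates headers from body
    )
    return head.encode("ascii") + payload


response = build_response("<h1>Hello From Netcat</h1>")
print(response.decode("utf-8"))
```

You could type the same lines into the listening netcat window by hand: the status line, the headers, an empty line, and then the HTML.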
This is all a web application does
Any web application that you create will do three things:
- Listen for incoming TCP connections on a port (probably 80).
- Receive HTTP requests over the connection, each with a method, host, path, body and headers.
- Based on the contents of the HTTP request, build and send a HTTP response back down the TCP connection.
That is literally all they do. The details of all this are hidden away from you by your programming language, libraries and other tools you use.
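To make those three steps concrete, here is a deliberately tiny web application in Python that uses nothing but a TCP socket. It is a sketch under stated assumptions, not how any real framework is implemented: the names `handle`, `serve` and `fetch` are invented for this example, it only reads the request line, it serves a fixed number of requests, and it listens on an OS-chosen port rather than 80.

```python
import socket
import threading


def handle(conn: socket.socket) -> None:
    """Steps 2 and 3: read one HTTP request, then send back a response."""
    with conn:
        data = b""
        # Keep reading until the blank line that ends the request headers.
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                return  # the client went away before finishing the request
            data += chunk

        request_line = data.split(b"\r\n", 1)[0].decode("iso-8859-1")
        method, path, _version = request_line.split(" ", 2)

        # The response depends on the contents of the request.
        if method == "GET" and path == "/":
            status, body = "200 OK", "<h1>Hello From Python</h1>"
        else:
            status, body = "404 Not Found", "<h1>Not Found</h1>"

        payload = body.encode("utf-8")
        head = (
            f"HTTP/1.1 {status}\r\n"
            "Content-Type: text/html; charset=utf-8\r\n"
            f"Content-Length: {len(payload)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        conn.sendall(head.encode("ascii") + payload)


def serve(server: socket.socket, max_requests: int) -> None:
    """Step 1: accept incoming TCP connections, one at a time."""
    for _ in range(max_requests):
        conn, _addr = server.accept()
        handle(conn)


def fetch(port: int, path: str) -> bytes:
    """Play the browser: send a request and read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        request = f"GET {path} HTTP/1.1\r\nHost: localhost\r\n\r\n"
        client.sendall(request.encode("ascii"))
        chunks = []
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)


server = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
port = server.getsockname()[1]
worker = threading.Thread(target=serve, args=(server, 2))
worker.start()

home = fetch(port, "/")
missing = fetch(port, "/nope")

worker.join()
server.close()

print(home.decode("utf-8"))
print(missing.split(b"\r\n", 1)[0].decode("ascii"))
```

Swap `fetch` for a real browser pointed at the chosen port and the server code would not need to change, because all it ever sees is bytes arriving on a TCP connection.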
Everything your web application does is driven by the data that comes out of an anonymous TCP connection.