Making a GET request

In order to understand how the next few applications we are going to be developing work, we need to understand how the internet works.

A question you may get in an interview is the following:

What happens when you type a URL into a browser and press Enter?

An extremely in-depth answer is available here.

However, a lot of those steps are not needed in our case, as for now we are only concerned with the network side of things.

A more suitable example is here: http://edusagar.com/articles/view/70/What-happens-when-you-type-a-URL-in-browser.

But really, we are focusing on just a couple of those steps. Let's have a go at explaining them while programming in Python.

Install the required library

The required library is called requests, so you can include that library in your requirements.txt file. I recommend, as always, finding the current version of the library and using that in your requirements.txt file. At the time of writing, the latest version was 2.7.2, and my requirements.txt file looked like this:

requests==2.7.2

Import the library for use in your application

Before using a library in Python, we need to import it. First, create the Python file which will run your application. I tend to call this app.py or run.py.

Then, amongst the first lines, write some code to tell Python that this file is going to be using the requests library. Note: the line does not necessarily have to be at the top, but that is the most common place for it.

import requests

__author__ = "Your Name"

That first line now tells the Python interpreter that it needs to load the contents of the requests library for use when executing any code from this file.

Get the content of a page

All of the internet-related communications we are going to be doing in this course happen in a network layer that uses a specific protocol to transfer data: the HyperText Transfer Protocol. You'll know this as it often appears in front of URLs as http://...

This protocol states how the transfer happens and, to some extent, what data is transferred. The name itself tells us what type of data it was designed for: hypertext, which is just another name for text that has links to other pieces of text. The protocol is as old as the web itself, dating from a time when pages were just bits of text with links to other pages.

Today, we use HTTP to transfer the pages themselves, images, videos, and everything in between.

Thus, the pages we will be writing are all going to be text. The browser (e.g. Google Chrome, Safari, or others) interprets that text (which is HTML and CSS code, mostly) to show us a rendered version of the page. The way the browser gets the content of the page is by requesting it from a server. A server is just a computer that has a program designed to answer these requests.

Thus, when a browser connects to the server, the server gives it the content of the page that it has asked for, and then the browser renders it and shows you a version that isn't all just plain text.
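To make this exchange concrete, here is a minimal sketch that plays both roles on one machine, using only Python's standard library. The class name HelloHandler, the page content, and the local address 127.0.0.1 are assumptions made for illustration; a real website runs on a remote server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello!</h1></body></html>"


class HelloHandler(BaseHTTPRequestHandler):
    """A tiny server program: it answers every GET request with the same page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def fetch_from_local_server():
    """Start the server, ask it for its root page, and return the page text."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")  # the part a browser would do for us
        body = conn.getresponse().read().decode("utf-8")
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()


print(fetch_from_local_server())
```

What gets printed is the raw HTML text. A browser would take that same text and render it as a heading.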

When we make a request using Python, we do not have a browser, so all we are going to be getting back is the text that makes up the page--the HTML and CSS code. Let's make our first request!

import requests

__author__ = "Your Name"

requests.get("http://google.com")

We've made a program that will ask one of the servers hosting http://google.com for the contents of the page!

However, we are not storing the response anywhere, so there's not much we can do with it. Let's store the response in a variable, and then print the content of the page:

import requests

__author__ = "Your Name"

r = requests.get("http://google.com")
print(r.content)

If you run this program, you'll see an extremely long line printed out to your console. That's the Google page!
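In requests, r.content holds the raw bytes of the response, while r.text holds the same data decoded into a string. The sketch below imitates that difference with a hard-coded byte string, so it runs without a network connection. The sample HTML and the helper name to_text are assumptions for illustration, not part of requests.

```python
# A hard-coded stand-in for r.content (assumption: no real request is made here).
raw_bytes = b"<html><body>Hello</body></html>"


def to_text(content, encoding="utf-8"):
    """Decode raw response bytes into a string, roughly what r.text gives you."""
    return content.decode(encoding)


print(raw_bytes)           # bytes: shown with a leading b and quotes
print(to_text(raw_bytes))  # str: shown as plain text
```

If the leading b'...' in your own output bothers you, printing r.text instead of r.content is the usual alternative.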

A GET request

What we have done is requests.get("<page>"). This has replicated what a browser would do when asking Google for a page, and it's called a GET request.

A GET request just retrieves something from a server. In this case, the page content.

There are many other types of requests that HTTP supports: POST, PUT, DELETE, and many more. Some are self-explanatory, whereas others are not:

HTTP "verb"	Meaning
GET	Retrieve something from the server
POST	Create a new element in the server, using the data provided in the request
PUT	Update an existing element in the server, using the data provided in the request
DELETE	Remove an element in the server
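The four verbs in the table can be sketched against a toy server that runs on your own machine. Everything here is an assumption made for illustration: the ItemHandler class, the in-memory ITEMS dictionary, the /courses/1 path, and the choice of status codes. Real applications often create elements by sending POST to a collection URL rather than to the element's own path.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

ITEMS = {}  # in-memory "database": path -> stored text


class ItemHandler(BaseHTTPRequestHandler):
    """Maps each HTTP verb from the table to an action on ITEMS."""

    def _read_body(self):
        length = int(self.headers.get("Content-Length", 0))
        return self.rfile.read(length).decode("utf-8")

    def _reply(self, status, text=""):
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):  # retrieve something
        if self.path in ITEMS:
            self._reply(200, ITEMS[self.path])
        else:
            self._reply(404)

    def do_POST(self):  # create a new element
        ITEMS[self.path] = self._read_body()
        self._reply(201)

    def do_PUT(self):  # update an existing element
        if self.path in ITEMS:
            ITEMS[self.path] = self._read_body()
            self._reply(200)
        else:
            self._reply(404)

    def do_DELETE(self):  # remove an element
        if self.path in ITEMS:
            del ITEMS[self.path]
            self._reply(204)
        else:
            self._reply(404)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def send(port, verb, path, body=None):
    """Send one request to the local server; return (status code, body text)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(verb, path, body=body)
    response = conn.getresponse()
    result = (response.status, response.read().decode("utf-8"))
    conn.close()
    return result


def demo():
    """Create, read, update, read, delete, and read again one element."""
    server = HTTPServer(("127.0.0.1", 0), ItemHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        return [
            send(port, "POST", "/courses/1", b"Python"),
            send(port, "GET", "/courses/1"),
            send(port, "PUT", "/courses/1", b"Python and the web"),
            send(port, "GET", "/courses/1"),
            send(port, "DELETE", "/courses/1"),
            send(port, "GET", "/courses/1"),
        ]
    finally:
        server.shutdown()
        server.server_close()


for status, text in demo():
    print(status, text)
```

The last GET returns a 404 status because the element was removed by the DELETE before it.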

The URL

The GET request retrieves something from the server. But when we access http://google.com/ we are not retrieving anything specific. We aren't telling the server what we want. Are we?

It turns out we are, and the key is that last character of the URL: /.

The forward slash character by itself means "the root". In the case of web applications, the root of the application tends to be the home page. So when we access any of the following pages, we are always accessing the root:

http://google.com/
http://schoolofcode.me/
http://facebook.com/

If the forward slash character is not at the end, then it is assumed, so sometimes you may not see it!
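One way to see the assumed slash is to ask requests to prepare a request without sending it. This is a small sketch that assumes the requests library is installed; nothing goes over the network, and the helper name prepared_url is made up for illustration.

```python
import requests


def prepared_url(url):
    """Build (but do not send) a GET request; return the URL requests would use."""
    prepared = requests.Request("GET", url).prepare()
    return prepared.url


for url in ["http://google.com", "http://google.com/"]:
    print(url, "->", prepared_url(url))
```

Both lines should end in the same URL, with the trailing slash filled in for the first one.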

Accessing other parts of the site

If we wanted to access School of Code's courses, we could go into the courses "folder":

http://schoolofcode.me/courses

I read this as a folder of courses because it makes sense to then access a specific course, living inside that folder:

http://schoolofcode.me/courses/complete-python-web
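The "folder" reading of a URL can be sketched with Python's standard urllib.parse module, which splits a URL into its protocol, server name, and path. The helper name describe is an assumption for illustration.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into (protocol, server, path, list of "folders")."""
    parts = urlsplit(url)
    path = parts.path or "/"  # a missing path means the root
    segments = [segment for segment in path.split("/") if segment]
    return parts.scheme, parts.netloc, path, segments


for url in [
    "http://schoolofcode.me",
    "http://schoolofcode.me/courses",
    "http://schoolofcode.me/courses/complete-python-web",
]:
    print(describe(url))
```

Each extra segment in the list is one level deeper inside the site, much like nested folders on a disk.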

What does HTTP look like?

When the browser makes the request to view the courses page, it really sends something like this:

GET /courses HTTP/1.1
Host: schoolofcode.me

That is sent to the server, which knows that the browser is expecting something back: the content of the page.
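To see the text that actually travels, the sketch below writes a request by hand over a plain socket and prints the server's reply. On the wire, a request is a first line with the verb, the path, and the protocol version, followed by headers and a blank line. The server here is a local stand-in, and the names CoursesHandler and raw_get are assumptions for illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CoursesHandler(BaseHTTPRequestHandler):
    """A stand-in server that reports which path it was asked for."""

    def do_GET(self):
        body = f"You asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def raw_get(host, port, path):
    """Send a hand-written GET request; return (request text, response text)."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"  # a blank line marks the end of the headers
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection: reply is complete
                break
            chunks.append(data)
    return request, b"".join(chunks).decode("utf-8")


def demo():
    """Run the stand-in server and request /courses from it."""
    server = HTTPServer(("127.0.0.1", 0), CoursesHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return raw_get("127.0.0.1", server.server_address[1], "/courses")
    finally:
        server.shutdown()
        server.server_close()


request_text, response_text = demo()
print(request_text)
print(response_text)
```

The reply has the same shape as the request: a status line, some headers, a blank line, and then the content itself.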

Is it really this simple?

Well... No. Not really. There's a lot more going on. HTTP isn't magical: traffic still has to travel from one computer to another, and for that we need a lot of things. Feel free to look these up: Physical Network Layer, Ethernet, TCP, IP, DNS. That will help you understand a bit more of what is happening behind the scenes, although it is not required for this course.

Our first GET request