Web(uilt) This City on Rock and Roll: An Intro to Web Hacking

Estimated difficulty: 💚💚💚

Hey Securiteenies! And welcome to another blog written just for you.

Following on from Sarah’s “Castle on a Cloud” post about the basics of the internet – are you ready to learn a little bit about web hacking?

First of all let’s recap…

The Internet vs. the World Wide Web

As Sarah mentioned in her post, the internet can be described as the following:

A “large system of connected computers around the world that allows people to share information and communicate with each other”.
Cambridge Dictionary

But what is the World Wide Web?

The World Wide Web is a service that runs on the internet and lets us browse it. It’s a collection of webpages written in “HTML” (a markup language that describes what goes on a webpage), and we access these pages through a protocol called HTTP. Webpages on the World Wide Web are linked together using URLs. To access a page or website, we type its URL into a browser and the web server sends back the page we asked for!

Accessing Web Pages

When we access a web page, the browser acts as a “client”, which communicates with the web server to retrieve the website for us. The client first requests the page from the web server. The server accepts the request, finds the correct page and sends the data back to the client. Once this is done, the browser displays the page. Usually, browsers will not directly communicate with each other over the Internet.

If you’re a little lost, don’t worry – have a look at the diagram below or at Sarah’s post to refresh your mind.

[Image: client-server diagram (Client_Server-1-1.png)]
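If you’d like to see the client and server conversation in action, here is a minimal Python sketch. It starts a tiny web server on your own computer and then plays the part of the client by requesting a page from it. The page contents and the names `DemoHandler` and `fetch_demo_page` are made up for this example, and nothing leaves your machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A made-up page for the demo server to hand out.
PAGE = b"<html><body><h1>Hello, Securiteenies!</h1></body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """The 'web server' side: answers every GET request with one small page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_demo_page():
    """Start a local server, request a page from it as a client, then stop it."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The 'client' side: send the request and read the response.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        status, body = response.status, response.read()
        connection.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_demo_page()
    print(status, body.decode())
```

A browser does the same job as the client half of this sketch, and then draws the HTML it receives instead of printing it.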

Let’s get hack-y!

So! Now that we know a little bit about how websites work, let’s go through some basic web hacking techniques…

Remember: Do not attempt to hack or gain access to a website unless you have written permission or it is a training platform built for practice. Hacking a website without permission is illegal, and you will be breaking the law!

Back to Basics

The following are the most basic web hacking and assessment techniques. Most of them are central to assessing a website before you do all the mad hax!

Recon

Before you hack a website, get to know the website first.

Reconnaissance is a technique used to help you understand how the website works. Before you try to hack into anything, website or not, always do a little recon beforehand. By understanding the website, you can try to pinpoint vulnerabilities you can exploit, and this can help you tailor your attack.

Recon may involve browsing the website or checking out what functions it has. Try navigating to different pages. Are there any search boxes? How do they work? Have a good old look around before you even attempt any attacks/hacking 🙂

Page Source

So remember me saying that webpages are written in something called HTML? This is exactly what the page source is! Checking out the “page source” lets you see the HTML used to write the page/site. This should be a part of your recon process, and allows you to have a good look at how the website works behind the scenes.

HTML: Underrated tags. Useful functionalities not widely used | by Adrian Legaspi | ITNEXT — Source: ITNext

Poorly designed websites may have sensitive information left by developers in the page source, including hard-coded passwords and comments.

To view the page source of a webpage, simply right click and select “View Page Source”. You can also press Ctrl + U, or press the F12 key to open the browser’s developer tools, which show the page’s HTML too.
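Reading the page source by eye works fine, but you can also let a script pick out the comments for you. Below is a small Python sketch using the standard library’s `html.parser`. The sample page and the names `CommentFinder` and `find_comments` are invented for this example.

```python
from html.parser import HTMLParser


class CommentFinder(HTMLParser):
    """Collects every <!-- comment --> found in a page's HTML."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called by HTMLParser once for each HTML comment it meets.
        self.comments.append(data.strip())


def find_comments(page_source):
    """Return a list of the comments left in the given HTML text."""
    finder = CommentFinder()
    finder.feed(page_source)
    finder.close()
    return finder.comments


# A made-up page source with a note a developer forgot to remove.
SAMPLE_SOURCE = """
<html>
  <body>
    <h1>Welcome</h1>
    <!-- TODO: remove the test login before launch -->
    <p>Nothing to see here.</p>
  </body>
</html>
"""

if __name__ == "__main__":
    for comment in find_comments(SAMPLE_SOURCE):
        print(comment)
```

You could paste the source of any page you are allowed to test into `SAMPLE_SOURCE` and see what turns up.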

Try it out! Can you see the page source of this page?

File Inclusion

Do you know how files are stored on a system? Have a look at your computer. Open your Documents, go to a folder – and you should see loads of files stored inside a directory. Web servers work in a very similar way! Web servers usually have a dedicated folder/directory – with loads of files inside. Each file is usually a web page, and all together – they make the website.

On a poorly configured web server, you can potentially put a file’s name in the URL to retrieve it. Sometimes these are sensitive documents (like a password file!) or directories and files that are meant to be hidden from public view.
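To picture what “putting a file’s name in the URL” looks like, here is a tiny Python sketch that swaps the last part of a URL for other file names. The website `example.com`, the file names and the helper `swap_last_segment` are all made up for illustration. The sketch only builds the URLs and never visits them.

```python
from urllib.parse import urljoin

# A hypothetical page on a hypothetical website.
BASE_URL = "https://example.com/files/index.html"


def swap_last_segment(url, new_name):
    """Replace the last part of the URL's path with a different file name."""
    return urljoin(url, new_name)


if __name__ == "__main__":
    # File names a curious tester might try (only with permission!).
    for name in ["about.html", "backup.zip", "passwords.txt"]:
        print(swap_last_segment(BASE_URL, name))
```

Each printed URL points at the same folder on the server, just with a different file at the end.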


Try it out! Play around with this Wikipedia URL. If you change the last word (where “Page” is), can you see it change to different pages?

Robots.txt

Robots - Robot jokes, hardware comics and cartoons — A little bit of a binary joke for you!
Source: Browserling

Search engines use “robots” (also called crawlers) that crawl through the web, collecting websites and webpages so they can be matched against your query later.

So if you go to Google and search for “the weather”, when you click “search” (or press enter!) you get back the websites about the weather that the robots have already found.

Many websites use a file called robots.txt. This text file lists the items that should not be included in search engines. In other words, it tells the robots that they’re not allowed to retrieve those pages! Some people disallow pages on their website that may have sensitive information, or perhaps they are hiding a login page/portal away from the public eye. But anyone can check what’s in the file by simply visiting <the website>/robots.txt
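Here is a rough Python sketch of how you might pull the disallowed entries out of a robots.txt file once you have its text. The sample file and the function name `disallowed_paths` are invented for this example, and real robots.txt files can contain more than this simple parser looks at.

```python
def disallowed_paths(robots_txt):
    """Return the paths listed on 'Disallow:' lines of a robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        # Anything after a '#' is a comment, so drop it.
        line = line.split("#", 1)[0].strip()
        key, separator, value = line.partition(":")
        if separator and key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# A made-up robots.txt for a made-up website.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /secret-login.html  # hidden portal
Allow: /
"""

if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE_ROBOTS):
        print(path)
```

Every path it prints is a page the site owner would rather robots skipped, which makes it an interesting place to look during recon.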

Have a look at Google’s robots.txt file. Can you see what’s disallowed?

Practice Makes Perfect

So, shall we try a little bit of hacking?! Have a look at the Natas wargame here and attempt the first few levels. Using the techniques we have gone through so far, you should be able to get to at least level 4!

PSA: Natas is a game that has been created for people to learn how to do web hacking, so you will not be breaking the law attempting this game.

To play the game, use the following instructions…

Each level’s username will follow the format:

natas<level>

e.g. Level 1 = natas1

Each level’s URL will follow the format:

natas<level>.natas.labs.overthewire.org

e.g. Level 1 = http://natas1.natas.labs.overthewire.org
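If you like, the two formats above can be written as a couple of small Python helpers. The function names `natas_username` and `natas_url` are just made up for this sketch.

```python
def natas_username(level):
    """Username for a Natas level, e.g. level 1 gives 'natas1'."""
    return f"natas{level}"


def natas_url(level):
    """URL for a Natas level, following the format described above."""
    return f"http://natas{level}.natas.labs.overthewire.org"


if __name__ == "__main__":
    for level in range(5):
        print(natas_username(level), natas_url(level))
```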

To get to the next level, find the password on the page. Then visit the next level’s URL, use the next level’s username and try accessing it with the password you found!

The first level is level 0. Access it here using the following credentials:

Username: natas0
Password: natas0

A little stuck? Here are a few hints!

Level 1 = How are websites built? Recon, recon, recon…

Level 2 = Even more recon! Keyboard shortcuts 🙂

Level 3 = … More recon! Directories are fun, and so are files!

Level 4 = Even MORE recon! Beep bop beep… you’re not allowed here.

Too easy?

Finding it all too easy so far? Let’s talk about a few more “advanced” web hacking techniques all about intercepting connections 🙂

Cookie Manipulation


Cookies are small files stored on your computer when you visit a website. These files hold a number of values that can help the server customise the site to the client, for example…

FirstName=John
loggedIn=1
sessionID=123456789

With an interception tool, you can read these cookies or change their values, which may reveal hidden information.
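To give a feel for what changing a cookie value means, here is a short Python sketch using the standard library’s `http.cookies`. The cookie header and the helper `tamper_cookie` are invented for this example. It only edits a piece of text and does not send anything to a website.

```python
from http.cookies import SimpleCookie


def tamper_cookie(cookie_header, name, new_value):
    """Return the cookie header with one cookie's value replaced."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)   # parse "a=1; b=2" into separate cookies
    cookie[name] = new_value     # overwrite (or add) the chosen cookie
    return "; ".join(f"{key}={morsel.value}" for key, morsel in cookie.items())


if __name__ == "__main__":
    # A made-up cookie header, like one a browser might send to a server.
    original = "FirstName=John; loggedIn=0; sessionID=123456789"
    print(tamper_cookie(original, "loggedIn", "1"))
```

An interception tool does the same kind of edit, but on the real request as it travels from your browser to the server.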

Try having a look at this Wikipedia page. Right click and select “Inspect”, and go to the “Application” tab (in Firefox, it’s called “Storage”). Open the “Cookies” drop-down. Can you see the cookies?

Referrer Spoofing

What is a referrer? This is the website we are coming from.

What is referrer spoofing? This is faking the place we are coming from!

Hackers can use this technique in an attempt to “cover their tracks” or attempt to reveal sensitive information. You can use an interception tool to do this, changing the referrer header (spelled “Referer” in HTTP) to a value of your choosing!
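Here is a minimal Python sketch of what a spoofed referrer looks like inside a request. Note that the header is spelled `Referer` (one “r” in the middle) in HTTP. The URLs and the helper `build_request` are made up for this example, and the request is only built, never sent.

```python
import urllib.request


def build_request(url, fake_referer):
    """Build (but do not send) a request that claims to come from fake_referer."""
    return urllib.request.Request(url, headers={"Referer": fake_referer})


if __name__ == "__main__":
    # Both URLs are hypothetical.
    request = build_request("http://example.com/members", "http://example.com/admin")
    print(request.full_url)
    print(request.get_header("Referer"))
```

The server has no way of checking where you really came from. It only sees whatever the header says.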

What Is Spoofing? - Cisco — Source: Cisco

Directory Traversal

Remember talking about file inclusion? This technique is pretty similar, but we’re “hopping” through directories to get to different files on the server. So instead of including a file from the web server directory, you can use special notations and commands to navigate to a different directory and retrieve different files.

A Guide to WordPress Server Directories — Source: iThemes

You use Linux command line syntax such as “../” (which means “go up one directory”) to navigate to a different directory, so an example of this attack could be:

https://mywebsite.com/page.php?view=../../../thisfileplease.txt

To do a directory traversal attack you can either use an interception tool to play around with the requests, or try it directly using the URL. It’s your choice!
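To see why “../” is so powerful, here is a small Python sketch that works out where a requested file would actually end up on a server. The folder `/var/www/html/pages` is an assumption made up for this example, as are the helpers `resolve` and `stays_inside_root`. Nothing is read from disk, it is only path arithmetic.

```python
import posixpath

# Hypothetical folder where the website's pages live on the server.
WEB_ROOT = "/var/www/html/pages"


def resolve(requested):
    """Work out the final path the server would open for the requested file."""
    return posixpath.normpath(posixpath.join(WEB_ROOT, requested))


def stays_inside_root(requested):
    """True if the resolved path is still inside the website's folder."""
    resolved = resolve(requested)
    return resolved == WEB_ROOT or resolved.startswith(WEB_ROOT + "/")


if __name__ == "__main__":
    for requested in ["about.txt", "../../../thisfileplease.txt"]:
        print(requested, "->", resolve(requested), stays_inside_root(requested))
```

Each “../” climbs one folder up, so three of them escape the website’s folder completely. A well-built server would run a check like `stays_inside_root` and refuse the second request.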

Feeling 1337?

If you’re feeling brave, and want to attempt more levels of Natas – you will need to use an interception tool like Burp. I wrote an introduction to Burp here, have a read – and see if you can hack the next few levels!


And at last, we are at the end of your introduction to web hacking!

A massive well done if you managed to do any of the Natas levels…

Happy hacking! (Legally, of course)

– Sophia x