How a Browser Works: A Beginner-Friendly Guide to Browser Internals

What a browser actually is (beyond “it opens websites”)
A browser is software (ultimately just code) that follows our commands. When we type a URL into the browser, it contacts a server, downloads the website's files (HTML, CSS, JS), interprets them, builds a page, and displays it on the screen.
Before going into its internals, let's look at its main parts.
Main Parts of a Browser (High-Level View)

At a high level, a browser is made of several parts working together:
User Interface – what you see and click
Networking – fetching data from the internet
Browser / Rendering Engine – understanding and displaying pages
JavaScript Engine – running JS code
Storage & Security layers – cookies, cache, sandboxing
User Interface: What You Actually Interact With
This is the visible part of the browser:
address bar
back / forward buttons
tabs
bookmarks
Important point: the UI does NOT decide how a webpage looks. It only helps you control the browser.
Browser Engine vs Rendering Engine (simple distinction)
1. The Rendering Engine (The "Painter")
The Rendering Engine is responsible for everything you see on the screen. Its job is to take web content (HTML, XML, CSS) and format it into a visual representation.
Primary Tasks:
Parsing HTML/CSS.
Building the DOM tree (structure) and CSSOM tree (style).
Calculating the layout (where things go).
Painting the actual pixels on your screen.
Examples:
Blink (Chrome, Edge, Opera)
WebKit (Safari)
Gecko (Firefox)
2. The Browser Engine (The "Manager")
The Browser Engine is the bridge between the User Interface (address bar, back button, bookmarks) and the Rendering Engine. It handles the high-level logic and marshals actions between different components.
Primary Tasks:
Handling user inputs (clicks, typing in the address bar).
Managing data storage and cookies.
Coordinating with the rendering engine to display the right page.
Comparison Table
| Feature | Rendering Engine | Browser Engine |
| --- | --- | --- |
| Main Role | Turns code into pixels (builds and draws the webpage) | Manages the browser’s overall flow and coordination |
| Inputs | HTML, CSS, Images (and other visual resources) | User actions (clicks, typing), Network requests |
| Output | A visual webpage on the screen | A fully functioning browser application |
| Focus | Performance, visual accuracy, following web standards | Security, storage, navigation, user experience |
| Relationship | Works as a component inside the browser | Controls and directs the rendering engine |
Now let's take a deep dive into browser internals and how they work:

Suppose you type a URL (www.google.com). As soon as you press Enter, Google appears on the screen. Have you ever wondered how that actually works? As CS students, we should understand this.
So I will try to keep things simple and easy to follow. Just stay with me.
As soon as we press Enter, the browser recognizes this as a domain name, but it can only talk to a server using an IP address. So the browser asks the operating system: "Give me the IP address of this server so that I can send it a request." The OS first checks its cache to see whether a domain name → IP address mapping is already there. Suppose you are visiting for the first time, so the OS finds no IP address mapped to this domain name.
The OS then asks a DNS (Domain Name System) resolver. But the resolver doesn't know the IP address of the domain either, so it asks a root server: "Hey, do you know the IP address of www.google.com?" The root server says, "I don't know the IP, but I know which TLD server handles .com domains. Ask it." The resolver then asks the TLD server, which says, "I don't know the exact IP, but I know which authoritative name server manages google.com. Ask them." Finally, the resolver asks the authoritative server, "Do you know the IP address of www.google.com?" and this time the answer is yes, along with the IP address.
Now the Answer Travels Back
Authoritative Server → DNS Resolver
DNS Resolver → Browser
Browser now knows the IP address
Browser can finally connect to the web server
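To make the chain of questions concrete, here is a toy sketch of the lookup in Python. Nothing here touches the network: the server names and records are made up for illustration, and the IP is just the example address used in this article. A real resolver speaks the DNS protocol, which this does not attempt.

```python
# Toy model of the lookup chain: OS cache -> root -> TLD -> authoritative.
# All server names and records below are hypothetical, for illustration only.

ROOT_SERVER = {"com": "tld-server-for-com"}  # root knows who handles each TLD
TLD_SERVERS = {"tld-server-for-com": {"google.com": "ns-for-google"}}
AUTHORITATIVE_SERVERS = {"ns-for-google": {"www.google.com": "142.250.190.132"}}

os_cache = {}  # domain name -> IP address


def resolve(domain):
    """Return (ip, steps), where steps lists who was asked, in order."""
    steps = []

    # 1. The OS cache is checked first.
    if domain in os_cache:
        steps.append("os cache")
        return os_cache[domain], steps

    labels = domain.split(".")
    tld = labels[-1]                    # "com"
    registered = ".".join(labels[-2:])  # "google.com"

    # 2. The root server points to the TLD server.
    steps.append("root")
    tld_server = ROOT_SERVER[tld]

    # 3. The TLD server points to the authoritative name server.
    steps.append("tld")
    auth_server = TLD_SERVERS[tld_server][registered]

    # 4. The authoritative server knows the actual IP address.
    steps.append("authoritative")
    ip = AUTHORITATIVE_SERVERS[auth_server][domain]

    os_cache[domain] = ip  # remember the answer for next time
    return ip, steps


first = resolve("www.google.com")   # walks the whole chain
second = resolve("www.google.com")  # answered from the cache
print(first)
print(second)
```

The second call never reaches the root server, which is why a repeat visit usually skips most of this work.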
Now the browser has the IP address, let's say 142.250.190.132, and it will send a request to the server at that address. But here is the catch. When you send documents to a friend, what do you do? You first call your friend and say, "I am going away and want to send you my documents. Should I?" Once your friend agrees, you send them through a trusted organization or company.
In the same way, after DNS gives the browser an IP address, the browser still needs a reliable connection to the server. TCP is the protocol that sets up this connection before any HTTP data is sent. TCP establishes the connection using a three-step handshake that confirms both sides are ready to send and receive data.
Once the TCP connection is established, the browser can send an HTTP request to the server.
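The three steps of the handshake are usually called SYN, SYN-ACK, and ACK. The sketch below only models the messages and their numbers; it opens no real connection. The `Segment` class and the starting sequence numbers (1000 and 5000) are assumptions chosen for readability, since real systems pick unpredictable starting values.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: str    # "SYN", "SYN-ACK", or "ACK"
    seq: int      # sender's sequence number
    ack: int = 0  # next sequence number the sender expects (0 = not used)


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, in order."""
    # Step 1: client -> server. "I want to connect; my numbering starts here."
    syn = Segment("SYN", seq=client_isn)

    # Step 2: server -> client. "OK, my numbering starts here, and I got your SYN."
    syn_ack = Segment("SYN-ACK", seq=server_isn, ack=syn.seq + 1)

    # Step 3: client -> server. "I got your SYN too." The connection is now open.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)

    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=1000, server_isn=5000)
for s in segments:
    print(s.flags, "seq =", s.seq, "ack =", s.ack)
```

Each side acknowledges the other's starting number plus one, which is how both confirm they can send and receive.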
So when you press Enter, the HTTP request travels to the server and the HTTP response returns to the browser. When the response arrives, the browser reads the raw HTTP response and starts rendering the HTML content.
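Both the request and the raw response are plain text with a fixed shape: a first line, some headers, a blank line, and then an optional body. As a rough sketch, the Python below builds a minimal GET request and splits a response into its parts. The helper names `build_request` and `split_response` are made up for this example, and the response is hand-written rather than captured from a real server.

```python
def build_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split a raw HTTP response into (status_line, headers, body)."""
    # Headers and body are separated by the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# A hand-written response standing in for what a server might send back.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 23\r\n"
    b"\r\n"
    b"<h1>Example Domain</h1>"
)

request = build_request("www.google.com")
status_line, headers, body = split_response(raw_response)
print(request.decode("ascii"))
print(status_line)
print(headers)
print(body)
```

The `body` bytes are what gets handed to the HTML parser in the next step.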
Parsing HTML to build the DOM tree
After the HTTP response arrives, the browser separates the headers from the body and feeds the HTML bytes into the parser. The parser turns tags like <h1> into tokens and builds a DOM tree.

Suppose the server sends back the response below:
<!doctype html>
<html>
  <head>
    <title>Example Domain</title>
  </head>
  <body>
    <main>
      <h1 style="color: red;">Example Domain</h1>
      <p>An example paragraph.</p>
      <p>
        <a href="https://example.com">An example link</a>
      </p>
    </main>
  </body>
</html>
The DOM tree
Document
|- <!doctype html>
`- html
   |- head
   |  `- title
   |     `- "Example Domain"
   `- body
      `- main
         |- h1 (style: color: red)
         |  `- "Example Domain"
         |- p
         |  `- "An example paragraph."
         `- p
            `- a (href="https://example.com")
               `- "An example link"
Parsing is streaming and error-tolerant: the browser starts building nodes before the full document is downloaded, and it inserts missing tags to keep the tree valid. When a <script> tag appears, parsing may pause so the script can run.
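Here is a small sketch of the streaming idea using Python's built-in `html.parser`. The `Node` and `DomBuilder` classes are made up for this example, and the builder is far simpler than a real browser parser: it ignores the doctype, does not insert missing tags, and does not run scripts. It is fed one line at a time to mimic bytes arriving from the network.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # never have a closing tag

HTML = """<!doctype html>
<html>
  <head>
    <title>Example Domain</title>
  </head>
  <body>
    <main>
      <h1 style="color: red;">Example Domain</h1>
      <p>An example paragraph.</p>
      <p>
        <a href="https://example.com">An example link</a>
      </p>
    </main>
  </body>
</html>
"""


class Node:
    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.children = []


class DomBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("Document")
        self.stack = [self.root]  # open elements; the last one is the current parent

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].name == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace between tags
            self.stack[-1].children.append(Node(f'"{text}"'))


def dump(node, depth=0):
    """Return the tree as a list of indented lines."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


builder = DomBuilder()
lines = HTML.splitlines(keepends=True)

for line in lines[:4]:        # only the first four lines have "arrived" so far
    builder.feed(line)
partial = dump(builder.root)  # html, head, and title nodes already exist

for line in lines[4:]:        # the rest of the document arrives
    builder.feed(line)
builder.close()
tree = dump(builder.root)

print("\n".join(tree))
```

The `partial` snapshot is the point of the exercise: part of the tree exists before the rest of the document has been fed in.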
The DOM tree then combines with CSS to produce the render tree that layout and paint use to draw pixels.
The DOM is the browser's in-memory model of the document. It is the shared contract between the HTML parser, CSS selector engine, and JavaScript runtime, so changes to it immediately affect layout, styling, and what users can interact with.
Layout, Paint, and Composite
Once the DOM and CSS are ready, the browser runs the rendering pipeline: Layout (reflow) to calculate sizes and positions, Paint to fill pixels, then Composite to stitch layers together on the GPU.
Not every change reruns every stage. Changing colors usually repaints, while changing sizes forces layout and paint to recompute.
This is why layout-heavy pages feel slower: more work needs to happen before the next frame can be shown.
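One way to picture this is as a pipeline where a change restarts work at some stage and everything after it runs again. The mapping below is a simplified assumption for illustration, not how any particular engine decides. In particular, `transform` is included as a commonly cited case that can often be handled at the composite stage alone, which goes beyond what this article covers.

```python
PIPELINE = ["layout", "paint", "composite"]

# First stage that has to rerun for a given kind of change.
# Assumed mapping for illustration; real engines decide this in more detail.
FIRST_STAGE = {
    "color": "paint",
    "background-color": "paint",
    "width": "layout",
    "height": "layout",
    "transform": "composite",
}


def stages_to_rerun(changed_property):
    """Return the stages that rerun, from the first affected one onward."""
    # Unknown properties fall back to the full pipeline to stay on the safe side.
    first = FIRST_STAGE.get(changed_property, "layout")
    return PIPELINE[PIPELINE.index(first):]


for prop in ["color", "width", "transform"]:
    print(prop, "->", stages_to_rerun(prop))
```

The earlier a change lands in the pipeline, the more stages have to run before the next frame.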




