Why Building a Modern Web Browser Is a Multi-Billion-Dollar Engineering Challenge
Author’s note:
Question: Why is a web browser so hard to make?
Context: I saw Cursor made a web browser
Executive Summary
Building a web browser is often compared to building an operating system because it requires managing untrusted code, coordinating complex hardware resources, and enforcing strict security boundaries. The modern browser is not a single application but a distributed system of cooperating processes.
- Process Architecture is the First Hurdle: To prevent a single page crash from bringing down the entire application, browsers must implement a multi-process architecture. This involves separating the browser kernel, rendering engine, GPU handling, and network stack into distinct, sandboxed processes 1.
- Security is a Moving Target: Features like Site Isolation are now mandatory to defend against speculative side-channel attacks (like Spectre). This requires putting pages from different websites into different processes, significantly increasing memory usage and architectural complexity 2 3.
- The Rendering Pipeline is Massive: The Blink rendering engine (used in Chrome) has evolved into “RenderingNG,” a pipeline that must coordinate parsing, styling, layout, painting, and compositing across multiple threads and the GPU to deliver smooth performance 4 5.
- Standards Compliance is Non-Negotiable: A browser must pass thousands of Web Platform Tests (WPT) to ensure interoperability. In 2024, focus areas included complex accessibility standards like computed roles and accessible names, which are critical for enterprise adoption 6 7.
1. The Architectural Foundations: Multi-Process & Sandboxing
The days of the single-process browser are long gone. Early browsers were like cooperative multitasking operating systems: one misbehaving page could hang the entire application 1. Today, building a browser means building a multi-process system that mimics a modern OS.
The Process Model
Chromium (the basis for Chrome, Edge, Arc, and others) uses a complex process model to ensure stability and security.
| Process Type | Responsibility | Security Context |
|---|---|---|
| Browser Process | Manages the application state, coordinates other processes, and handles the UI. | Trusted, high privilege. |
| Renderer Process | Interprets HTML/CSS/JS using the Blink engine. One process per tab/site (typically). | Sandboxed: Restricted access to disk, network, and display 1. |
| GPU Process | Handles GPU tasks and compositing to prevent driver crashes from taking down the browser. | Isolated to protect against driver instability 1. |
| Network Service | Handles all network requests (HTTP/DNS/Caching). | Separate process to isolate sensitive network parsing 1. |
Site Isolation
Beyond just separating tabs, modern browsers enforce Site Isolation. This security feature ensures that pages from different websites are put into different renderer processes, even if they are in the same tab (e.g., an iframe) 2. This is a defense-in-depth measure against “arbitrary code execution” attacks and speculative side-channel attacks, ensuring a malicious page cannot read data from another origin within the same process address space 2 3.
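As a rough illustration of the idea, the sketch below assigns renderer processes by site rather than by tab. It is a minimal model, not Chromium's implementation: `SiteKey` and `ProcessAllocator` are hypothetical names, and the "last two host labels" rule is a stated simplification (real browsers consult the Public Suffix List to find the registrable domain).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified site key: scheme plus the last two host labels.
// Assumption: real browsers use the Public Suffix List instead of this rule.
std::string SiteKey(const std::string& scheme, const std::string& host) {
  const std::size_t last = host.rfind('.');
  if (last == std::string::npos || last == 0) return scheme + "://" + host;
  const std::size_t prev = host.rfind('.', last - 1);
  const std::string domain =
      (prev == std::string::npos) ? host : host.substr(prev + 1);
  return scheme + "://" + domain;
}

// Hands out one renderer process id per site and reuses it for any later
// frame from the same site, whichever tab the frame lives in.
class ProcessAllocator {
 public:
  int ProcessFor(const std::string& scheme, const std::string& host) {
    assert(!host.empty());
    const std::string key = SiteKey(scheme, host);
    const auto it = site_to_process_.find(key);
    if (it != site_to_process_.end()) return it->second;
    const int id = next_id_++;
    site_to_process_[key] = id;
    return id;
  }

 private:
  std::map<std::string, int> site_to_process_;
  int next_id_ = 1;
};
```

Under this model, a page on `mail.example.com` and an iframe from `docs.example.com` share a process, while an iframe from `evil.test` embedded in the same tab gets its own.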
2. Rendering Engine Deep Dive
The rendering engine is responsible for translating HTML, CSS, and JavaScript into pixels on the screen. This is not a linear process but a highly parallelized pipeline. In Chromium, this architecture is known as RenderingNG 4.
The Critical Path
The pipeline involves several distinct stages that must execute rapidly to maintain 60fps (or higher) animation:
- Parsing: Converting HTML into the Document Object Model (DOM) 1.
- Style: Resolving CSS rules to determine the visual properties of every element 8.
- Layout: Calculating the geometry (position and size) of elements. Modern engines like Blink use LayoutNG to handle complex fragmentation and text flow 4.
- Paint: Creating a list of drawing instructions (display lists) 8.
- Compositing: Breaking the page into layers and using the GPU to draw them. This runs on a separate compositor thread rather than on the main thread, allowing for smooth scrolling even if the main thread is busy 5.
Why it’s hard: The engine must handle “invalidation” efficiently. If JavaScript changes a single style property, the engine must determine exactly which parts of the layout need to be recalculated and repainted to avoid unnecessary work 8.
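One way to picture invalidation is as a mapping from "what changed" to "which pipeline stages must run again". The sketch below is a deliberately tiny model under stated assumptions: the `Stage` flags and the property table are illustrative, and a real engine tracks dirtiness per node and per subtree rather than per property name.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Pipeline stages as bit flags so that several can be marked dirty at once.
enum Stage : std::uint8_t {
  kNone = 0,
  kStyle = 1,
  kLayout = 2,
  kPaint = 4,
  kComposite = 8,
};

// Illustrative table (an assumption, far smaller than a real engine's).
// Returns the set of stages that must re-run after `property` changes.
std::uint8_t StagesToRerun(const std::string& property) {
  assert(!property.empty());
  if (property == "width" || property == "height" || property == "font-size") {
    return kStyle | kLayout | kPaint | kComposite;  // geometry changes
  }
  if (property == "color" || property == "background-color") {
    return kStyle | kPaint | kComposite;  // pixels change, geometry does not
  }
  if (property == "transform" || property == "opacity") {
    // Often handled by the compositor alone when the element has its own layer.
    return kStyle | kComposite;
  }
  return kStyle | kLayout | kPaint | kComposite;  // unknown: be conservative
}
```

The design point is the conservative default: skipping a stage that was actually needed produces wrong pixels, while running an unneeded stage only costs time.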
3. JavaScript Engine Wars
A browser must include a high-performance JavaScript engine. These are among the most sophisticated compilers in existence.
- V8 (Chrome/Node.js): Uses a pipeline including the Ignition interpreter and the TurboFan optimizing compiler. It compiles JavaScript to machine code at runtime, handles memory allocation, and performs garbage collection 9 10.
- SpiderMonkey (Firefox): Features a multi-tier architecture including a Baseline Interpreter and the IonMonkey JIT (Just-In-Time) compiler to balance startup speed with peak execution performance 11.
- JavaScriptCore (WebKit): The engine powering Safari, which also utilizes multiple tiers of optimization (LLInt, Baseline, DFG, FTL) 12.
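The common thread in all three engines is tiering: code starts in a cheap tier and is promoted once it proves hot. The sketch below models only that promotion decision. The thresholds, the three-tier split, and the names `FunctionProfile`, `RecordCall`, and `Deoptimize` are assumptions for illustration; real engines also weigh loop iterations, type feedback, and code size.

```cpp
#include <cassert>

enum class Tier { kInterpreter, kBaseline, kOptimized };

// Hypothetical thresholds chosen for illustration only.
constexpr int kBaselineThreshold = 10;
constexpr int kOptimizeThreshold = 1000;

struct FunctionProfile {
  int call_count = 0;
  Tier tier = Tier::kInterpreter;
};

// Counts one call and promotes the function when it crosses a threshold.
Tier RecordCall(FunctionProfile& f) {
  assert(f.call_count >= 0);
  ++f.call_count;
  if (f.call_count >= kOptimizeThreshold) {
    f.tier = Tier::kOptimized;
  } else if (f.call_count >= kBaselineThreshold) {
    f.tier = Tier::kBaseline;
  }
  return f.tier;
}

// When a speculative assumption fails (for example, an argument changes
// type), optimized code is discarded and profiling starts over.
void Deoptimize(FunctionProfile& f) {
  f.tier = Tier::kInterpreter;
  f.call_count = 0;
}
```

The deoptimization path is what makes speculative optimization safe: the optimized code may assume a great deal, because the engine can always fall back to the interpreter.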
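Code Snippet: V8 Embedding Concept
Building a browser involves “embedding” these engines. The engine handles the JS, but the browser must provide the “host environment” (the DOM).
```cpp
// Conceptual C++ sketch for embedding V8 (a fragment, not a complete program).
// The browser must expose its C++ DOM objects to JavaScript.
#include <v8.h>

void InstallDocument(v8::Isolate* isolate, v8::Local<v8::Context> context) {
  v8::Local<v8::String> name =
      v8::String::NewFromUtf8(isolate, "document").ToLocalChecked();
  // WrapDOMObject and current_document are hypothetical placeholders for the
  // browser-provided DOM implementation and its JavaScript wrapper.
  v8::Local<v8::Object> wrapper = WrapDOMObject(isolate, current_document);
  // Install the wrapper as a property of the context's global object.
  context->Global()->Set(context, name, wrapper).Check();
}
```
Note: The DOM is not part of the JS engine; it is provided by the browser (e.g., Blink) 9.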
4. Networking Stack & Protocols
A modern browser doesn’t just “speak HTTP.” It must maintain a massive networking stack that supports multiple protocol generations simultaneously.
Protocol Evolution
- HTTP/1.1: The legacy standard.
- HTTP/2 (RFC 7540, since superseded by RFC 9113): Introduced binary framing, header compression (HPACK), and stream multiplexing over a single TCP connection 13 14.
- HTTP/3 (RFC 9114): The newest standard, which maps HTTP semantics over QUIC (RFC 9000) instead of TCP. This provides stream multiplexing with better performance on lossy networks but requires a completely different transport layer implementation (UDP-based) 15 16.
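To make "binary framing" concrete: every HTTP/2 frame begins with a fixed 9-octet header holding a 24-bit payload length, an 8-bit type, an 8-bit flags field, one reserved bit, and a 31-bit stream identifier. The sketch below decodes just that header. The `FrameHeader` struct and `ParseFrameHeader` function are illustrative names, not part of any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FrameHeader {
  std::uint32_t length;     // 24-bit payload length
  std::uint8_t type;        // for example 0x0 = DATA, 0x1 = HEADERS
  std::uint8_t flags;       // meaning depends on the frame type
  std::uint32_t stream_id;  // 31 bits; the top bit is reserved
};

// Decodes the fixed 9-octet header at the start of an HTTP/2 frame.
// All multi-octet fields are in network (big-endian) byte order.
FrameHeader ParseFrameHeader(const std::uint8_t* data, std::size_t size) {
  assert(data != nullptr && size >= 9);
  FrameHeader h;
  h.length = (std::uint32_t{data[0]} << 16) |
             (std::uint32_t{data[1]} << 8) |
             std::uint32_t{data[2]};
  h.type = data[3];
  h.flags = data[4];
  const std::uint32_t raw = (std::uint32_t{data[5]} << 24) |
                            (std::uint32_t{data[6]} << 16) |
                            (std::uint32_t{data[7]} << 8) |
                            std::uint32_t{data[8]};
  h.stream_id = raw & 0x7FFFFFFFu;  // mask off the reserved bit
  return h;
}
```

The stream identifier in every header is what allows frames from many requests to be interleaved on one connection and reassembled on the other side.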
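Why it’s hard: The browser must negotiate these protocols automatically and handle fallback seamlessly. Over TCP, the choice between HTTP/1.1 and HTTP/2 is made via ALPN during the TLS handshake; HTTP/3 must first be discovered (for example, through an Alt-Svc header or an HTTPS DNS record) before a QUIC connection is attempted. HTTP/3 also moves features that HTTP/2 implemented itself, such as stream multiplexing and flow control, into the QUIC transport layer, requiring a distinct architectural approach 15.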
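The ALPN step can be sketched as a simple list intersection: the client offers protocol identifiers in its TLS ClientHello, and the server selects one. The function below is a minimal model of a server that applies its own preference order; `SelectAlpn` is a hypothetical name, and real TLS libraries expose this through their own callbacks. The identifiers `h2` and `http/1.1` are the registered ALPN IDs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the first protocol in the server's preference list that the client
// also offered, or an empty string when there is no overlap.
std::string SelectAlpn(const std::vector<std::string>& server_prefs,
                       const std::vector<std::string>& client_offer) {
  assert(!server_prefs.empty());
  for (const std::string& preferred : server_prefs) {
    for (const std::string& offered : client_offer) {
      if (preferred == offered) return preferred;
    }
  }
  return std::string();
}
```

For example, a server preferring `{"h2", "http/1.1"}` and a client offering `{"http/1.1", "h2"}` settle on `h2`. When nothing overlaps, a browser needs a fallback path, which is the "seamless" part that takes real engineering.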
5. Security & Permissions Model
Browsers are the primary interface to the web, making them a prime target for attacks. The security model is built on strict isolation.
Same-Origin Policy (SOP) & CORS
The Same-Origin Policy restricts how a document or script loaded from one origin can interact with resources from another 17.
- Cross-Origin Resource Sharing (CORS): A mechanism defined in the Fetch standard that allows servers to explicitly relax SOP restrictions using specific HTTP headers 18.
- Implementation: If a browser fails to enforce these checks correctly, malicious sites could steal data from authenticated sessions on other sites.
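The two checks above can be sketched in a few lines. An origin is the tuple of scheme, host, and port. The CORS check below follows the general shape of the check in the Fetch standard, but it is a simplified sketch: the `Origin` struct and the function names are assumptions, and preflight requests, header parsing, and opaque origins are all omitted.

```cpp
#include <cassert>
#include <string>

struct Origin {
  std::string scheme;
  std::string host;
  int port;
};

// Two origins match only if scheme, host, and port are all equal.
bool SameOrigin(const Origin& a, const Origin& b) {
  return a.scheme == b.scheme && a.host == b.host && a.port == b.port;
}

// Serializes an origin, omitting the port when it is the scheme's default.
std::string Serialize(const Origin& o) {
  const bool default_port = (o.scheme == "https" && o.port == 443) ||
                            (o.scheme == "http" && o.port == 80);
  std::string out = o.scheme + "://" + o.host;
  if (!default_port) out += ":" + std::to_string(o.port);
  return out;
}

// Simplified CORS check for a cross-origin response.
// allow_origin: value of the Access-Control-Allow-Origin response header.
// allow_credentials_true: whether Access-Control-Allow-Credentials is "true".
bool CorsCheck(const Origin& requester, const std::string& allow_origin,
               bool with_credentials, bool allow_credentials_true) {
  assert(requester.port > 0);
  if (!with_credentials && allow_origin == "*") return true;
  if (allow_origin != Serialize(requester)) return false;
  if (!with_credentials) return true;
  return allow_credentials_true;
}
```

Note that the wildcard `*` never satisfies a credentialed request; this is the rule that stops a permissive server from accidentally exposing authenticated responses to every site.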
The “Rule of Two”
Chromium engineers often cite the “Rule of Two,” which states that when handling untrusted content, you can pick no more than two of the following:
- Written in an unsafe language (C++).
- Parses untrusted input.
- Runs with high privileges.
Since browsers are largely written in C++ and parse untrusted web content, they must drop privileges, necessitating the sandbox architecture described earlier 1.
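The rule is simple enough to state as a predicate. The sketch below is purely illustrative: the `Component` struct and both function names are hypothetical and do not mirror any Chromium tooling.

```cpp
#include <cassert>

// The three risk factors named by the Rule of Two.
struct Component {
  bool unsafe_language;         // for example, written in C++
  bool parses_untrusted_input;  // for example, handles web content
  bool high_privilege;          // runs outside a sandbox
};

// A design satisfies the rule when at most two of the three factors hold.
bool SatisfiesRuleOfTwo(const Component& c) {
  const int risks =
      c.unsafe_language + c.parses_untrusted_input + c.high_privilege;
  return risks <= 2;
}

// Debug-build guard that a design review helper might call.
void CheckDesign(const Component& c) {
  assert(SatisfiesRuleOfTwo(c));
}
```

A C++ HTML parser that runs with full privileges sets all three flags and fails the check; moving it into a sandboxed renderer clears `high_privilege` and passes.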
6. Accessibility & Interoperability
A browser isn’t useful if it doesn’t work with the existing web. This requires strict adherence to standards.
Web Platform Tests (WPT)
Browser vendors collaborate on the Web Platform Tests, a shared test suite containing over 1.8 million tests. Passing these tests is the benchmark for interoperability 7.
Interop Projects
Initiatives like Interop 2024 focus on specific areas where browsers historically diverged. A major focus recently has been Accessibility.
- Computed Roles & Names: Browsers must consistently calculate the “accessible name” and “role” of HTML elements so screen readers work correctly. This involves complex logic defined in the ARIA and AccName specifications 6 7.
- Testing: In 2023 alone, over 1300 new accessibility tests were added to WPT to ensure consistent behavior across Chrome, Firefox, and Safari 7.
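To give a sense of what "computing the accessible name" involves, the sketch below models only the precedence order among name sources. It is a heavy simplification of the AccName algorithm under stated assumptions: the `Element` struct is hypothetical, the referenced `aria-labelledby` text is assumed to be resolved by the caller, and recursion, hidden content, and embedded controls are left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical flattened view of an element's possible name sources.
struct Element {
  std::string labelledby_text;     // text of nodes named by aria-labelledby
  std::string aria_label;          // aria-label attribute
  std::string native_label;        // for example <label> text or img alt
  std::string content_text;        // the element's own text content
  bool name_from_content = false;  // true for roles such as button or link
  std::string title;               // title attribute (tooltip), last resort
};

// Returns the first available name source in simplified precedence order.
std::string AccessibleName(const Element& e) {
  if (!e.labelledby_text.empty()) return e.labelledby_text;
  if (!e.aria_label.empty()) return e.aria_label;
  if (!e.native_label.empty()) return e.native_label;
  if (e.name_from_content && !e.content_text.empty()) return e.content_text;
  return e.title;  // may be empty: the element then has no accessible name
}
```

For a close button whose visible text is "X" and whose `aria-label` is "Close dialog", a screen reader should announce "Close dialog". Each browser has to reach that same answer, which is what the shared tests verify.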
Bottom Line
Making a web browser is hard because it requires integrating several distinct, highly complex systems into a cohesive whole that is secure, fast, and compatible with decades of web content.
Key Takeaways:
- Concurrency is King: You aren’t building one app; you are building a distributed system of sandboxed processes 1.
- Security is Structural: Features like Site Isolation dictate the architecture, not just the code 2.
- Performance is Granular: The rendering pipeline (RenderingNG) breaks down pixel generation into minute, parallelizable stages 4.
- Compatibility is Quantifiable: Success is measured against the shared Web Platform Tests suite (over 1.8 million tests), covering everything from HTTP/3 networking to ARIA accessibility roles 7 15.
For a team like Cursor (or any new entrant), “making a browser” likely means forking an existing engine (like Chromium) and building on top of these foundational layers, rather than rewriting the rendering or JavaScript engines from scratch.
References
Footnotes