Nov 26, 2025

Replacing My Linux Desktop With Google Chrome

All code in this blog post is fully open source at github.com/FoxMoss/DoteWM/.

This is an image of my window manager with full X11 support but border decorations, background, and window interactions are handled by a web browser.

Let me explain.

A Primer

To display a program window, an operating system uses a display server. On Linux this is mainly X11, the protocol behind the X Window System that MIT started in 1984. It’s old and starting to show its age, and Wayland is taking some of X11’s market share. With a display server you can display windows, which makes sense, but to get the nice behavior of movement, resizing buttons, and keybinds, we need a window manager. This is a separate program that the display server consults on how it should render and handle the windows. All the windows need to know is how to talk to the display server, and all the window manager needs to know is how to handle the windows. This is a lovely system which allows window managers to take on all different kinds of shapes, sizes, and appearances while maintaining general compatibility with most programs.

The goal of the project was to decrease the skill needed to write a fully customized window manager. For a while I’ve been aware of many projects that try to do the very opposite, i.e. put the desktop environment into the web, with some decent success. See puter and AnuraOS for two great examples of such projects. It’s quite a bit easier to tweak CSS constants and JS snippets than it is to change styles already embedded in a long-standing modern desktop or window manager. So let’s bring the web to the desktop and have a browser control the system. That’s the pitch and that’s what we’re going to do.

The Browser

So how do we actually pull off this magic trick? I started with the web browser because I figured this would be the hardest bit. We need to communicate with the Javascript process while keeping access to low-level interfaces so we can interact with our windows. The other requirement is being able to serve our webpage without an HTTPS request leaving our computer.

How? CEF.

CEF, also known as the Chromium Embedded Framework, is a multiplatform framework for writing desktop web apps, with a nice C++ interface to boot. Like Electron but slightly lower level, it exposes the functionality we need while abstracting the actual browser process away. It also comes as an easy-to-download binary, so there’s no need to compile the browser from scratch. To load our files off the user’s machine we can write a custom scheme. Just as the traditional Chromium browser has chrome://, we can do the same using an interface CEF provides.

// ...
  // registers dote://base/
  CefRegisterSchemeHandlerFactory("dote", "base",
                                  new ClientSchemeHandlerFactory(sock));
// ...

And the actual code for finding and returning the file is fairly simple. Just read the file out of the ~/.config/dote/ directory, and if there is none return a 404. We also have a quick and dirty MIME type generator working purely on file extensions, so every .html turns into text/html and can be easily interpreted by the browser. I won’t bore you with the code; check out the repo if you want an exact idea of the implementation.
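For readers who want a feel for the extension-based lookup without opening the repo, here is a minimal sketch. It is not the code from DoteWM: the function name `mime_from_extension`, the table contents, and the fallback type are all assumptions made for illustration.

```cpp
#include <filesystem>
#include <string>
#include <unordered_map>

// Hypothetical helper: guess a MIME type from the file extension alone.
// The table is illustrative, not the list DoteWM actually ships.
std::string mime_from_extension(const std::filesystem::path& file) {
  static const std::unordered_map<std::string, std::string> known = {
      {".html", "text/html"},     {".css", "text/css"},
      {".js", "text/javascript"}, {".json", "application/json"},
      {".png", "image/png"},      {".jpg", "image/jpeg"},
      {".svg", "image/svg+xml"},
  };
  const auto found = known.find(file.extension().string());
  // Unknown or missing extensions fall back to a generic binary type.
  return found != known.end() ? found->second : "application/octet-stream";
}
```

A lookup like this is case-sensitive and trusts the file name entirely, which is usually acceptable when every file comes from the user’s own config directory.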

Great. Here I just give the body tag a background-image to act as a wallpaper.

The Layer Cake

Now that we can load HTML and Javascript files we have a nice, neat, secure sandbox. Unfortunately, we need to let it control our machine. I considered writing a websocket implementation to work over the scheme. Then I found the cefQuery interface, which makes this a whole lot easier. I’m unsure what the intended use case was supposed to be, but it just exposes a function to Javascript that calls a C++ function:

class MessageHandler : public CefMessageRouterBrowserSide::Handler {
 public:
  explicit MessageHandler(int sock) : ipc_sock(sock) {}

  int ipc_sock;

  bool OnQuery(CefRefPtr<CefBrowser> browser,
               CefRefPtr<CefFrame> frame,
               int64_t query_id,
               const CefString& request,
               bool persistent,
               CefRefPtr<Callback> callback) override {
    // i prefer std::optional where possible but nlohmann throws exceptions occasionally
    try {
      nlohmann::json from_browser = nlohmann::json::parse(request.ToString());

      // ...

      callback->Success(to_browser.dump());
    } catch (const std::exception& e) {
      callback->Failure(-1, std::string(e.what()));
    }
    return true;
  }

 private:
  DISALLOW_COPY_AND_ASSIGN(MessageHandler);
};

Then on the browser side we can just tell the browser to call this function every frame and we have a basic event loop.

// send the browser start message in the first call
let message_queue: WindowDataSegment[] = [{ t: "browser_start" }];
let message_back_buffer: WindowDataSegment[] = [];

let start: DOMHighResTimeStamp;
function step(timestamp: DOMHighResTimeStamp) {
  if (start === undefined) {
    start = timestamp;
  }
  state.elapsed = timestamp - start;

  message_back_buffer = message_queue;
  message_queue = [];

  window.cefQuery({
    request: JSON.stringify(message_back_buffer),
    onSuccess: (response: string) => {
      // flush queue
      message_back_buffer  = [];

      const response_parsed = JSON.parse(response) as WindowDataSegment[];
      for (let segment in response_parsed) {
        // ...
      }
    },
    onFailure: function (_error_code: number, _error_message: string) {
      // message parsing error likely cause the bug
      message_queue = [];
    },
  });

  requestAnimationFrame(step);
}

requestAnimationFrame(step);

I was at first very skeptical of the performance of this; surely there was no way it would run in real time. But my mentality has always been shoot first, optimize later, and the basic event loop is still in the codebase, so it’s been good enough for my standards so far.

Now that we have the browser and a bit of the actual client working, the next component is the window manager. I’m going to need to do some tricks with rendering later down the line, which we will get to, but the only reasonable choice here was to write a compositing window manager. Compositing just means that instead of letting the display server do all the work, we get the textures from the display server and render them to the screen ourselves, in this case with the 3D graphics API OpenGL. We’re also doing this in X11 because Xlib is fairly easy to write and lets me experiment more quickly.

To give credit where credit is due, the window manager is largely based on x-compositing-wm by obiwac, then rewritten by hand in C++ with a modern, semi-error-tolerant style. The original project is slightly broken in some areas, but it provided a good basis for how X11 can interact with OpenGL.

I keep the window manager as a separate process from the browser so I can implement escape hatches. If the browser hangs, I want to be able to implement features that kill windows, or a simple rudimentary backup for debugging or rebooting the browser process. The downside of implementing it this way is that now we have another event loop.

void DoteWindowManager::run() {
  glDepthFunc(GL_LESS);
  glEnable(GL_DEPTH_TEST);
  glEnable(GL_BLEND);
  glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

  while (true) {
    while (process_events())
      ;
    glClearColor(1, 1, 1, 1);
    glClearDepth(1.2);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    ipc_step();
    for (auto& window : windows) {
      render_window(window.second.window);
    }

    glXSwapBuffers(display, output_window);
  }
}

So now we need to communicate up the chain from window manager to browser. There are a billion and one ways of doing Inter-Process Communication; I like nanomsg. The interface is going to be very familiar if you’ve done a Berkeley sockets loop before.

    if ((ipc_sock = nn_socket(AF_SP, NN_PAIR)) < 0) {
      printf("ipc sock failed\n");
    }
    if (nn_bind(ipc_sock, "ipc:///tmp/dote.ipc") < 0) {
      printf("ipc bind failed\n");
    }

    // non-blocking
    int to = 0;
    if (nn_setsockopt(ipc_sock, NN_SOL_SOCKET, NN_RCVTIMEO, &to, sizeof(to)) <
        0) {
      printf("ipc non_block failed\n");
    }

nanomsg will just be abstracting local file-based UNIX sockets here, especially in the PAIR configuration (one to one). The benefit over raw UNIX sockets is that it makes some memory management easier, especially since we’re dealing with variably sized buffers.

Protobuf is another technology that seems to end up in nearly every single one of my projects, I just can’t help myself. Writing serialization is annoying. Plus we get free type annotations which beats using JSON.

So combined, we end up with a fairly simple Protobuf over nanomsg protocol, with the .proto looking like this.

message DataSegment {
  oneof data {
    WindowMapReply window_map_reply = 1;
    WindowMapRequest window_map_request = 2;

    // ...
  }
}

message Packet {
  repeated DataSegment segments = 1;
}

Every time we need to proxy a new event we just add a new segment data type.

Requests are from the browser, replies are from the window manager. We don’t have a strict request-reply structure, since events can be fired at any time, but it’s useful to frame all actions in the context of the JS developer. The philosophy here is that all interactions should be initiated by Javascript. The web dev is the one in control.

A Divergence For A DVD

I thought it would be funny to make a DVD window manager quickly as a demo. In doing so I found a critical flaw with the networking scheme.

If we have requests constantly streaming out of our browser and our window manager is unable to keep up, we get to a point where before the window manager can finish processing the old packets even more come. Then the event loops grind to a halt. This is very bad! Luckily we sidestep the issue with some pretty basic networking principles.

Each end of the nanomsg connection says it can send 100 messages before it will stop, the other end likewise keeps track of how many it can receive. Once the receive number hits zero the receiving end will tell the sending end that it can continue sending messages. This way neither end will get swamped with messages. This logic is duplicated to both sides to prevent swamping from either end.

This allows for a speedy protocol that doesn’t have to wait for a confirmation for each request while maintaining the stability of the communication.
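The counting described above can be sketched in a few lines. This is an illustration of the idea rather than the code in the repo: the struct and method names are hypothetical, and only the window of 100 messages comes from the description.

```cpp
#include <cstdint>

// Hypothetical names; only the window size of 100 comes from the post.
constexpr std::uint32_t kWindow = 100;

// Sender side: spend one credit per message and stall at zero.
struct SendCredits {
  std::uint32_t remaining = kWindow;

  // Returns true if a message may be sent now, consuming one credit.
  bool try_send() {
    if (remaining == 0) return false;
    --remaining;
    return true;
  }

  // Called when the peer's "continue" message arrives.
  void on_continue() { remaining = kWindow; }
};

// Receiver side: count down per message and signal the sender at zero.
struct ReceiveCredits {
  std::uint32_t remaining = kWindow;

  // Returns true when a "continue" message should be sent to the peer.
  bool on_receive() {
    if (remaining > 0) --remaining;
    if (remaining != 0) return false;
    remaining = kWindow;  // start the next window
    return true;
  }
};
```

Each process would hold one `SendCredits` and one `ReceiveCredits`, so the same limit applies in both directions. A sender that runs out of credits simply keeps its messages queued until the next "continue" arrives.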

The Protocol In Action

To initiate, CEF just sends the browser’s X11 window ID down to the window manager, and we can place it at the back and fullscreen it to make it appear as if the browser is our wallpaper.

Now, just by writing some protocol buffers, we can send X11 events up and down the chain and let the web page do whatever it so chooses with our windows.

So let’s make some window decorations! We can quite simply define a window frame in Javascript and let the user drag it around to move the underlying window. Now what happens if the windows overlap? We can reorder the actual OpenGL windows quite easily just by changing the depth of the vertices we render them at, but the underlying browser window can’t be both in front of and behind a window. Or can it?

Here comes my reasoning behind making a compositing manager. When we’re really just rendering textures to a polygon we can get creative with it. So first we need a quad to work with which we can clone from the real window rendering and apply the browser window’s textures to it. But like that we get a heavily warped view of the actually big browser window.

That’s because right now we’re just trying to use the entire texture so we need to apply some cropping.

Here’s how we accomplish the task:

    // borders are defined in offset to the window
    uint32_t pixel_border_width =
        window->width - window->border->x + window->border->width;
    uint32_t pixel_border_height =
        window->height - window->border->y + window->border->height;

    // convert from pixel space to opengl space
    float border_x = x_coordinate_to_float(window->border->x + window->x +
                                           pixel_border_width / 2);
    float border_y = y_coordinate_to_float(window->border->y + window->y +
                                           pixel_border_height / 2);
    float border_width = width_dimension_to_float(pixel_border_width);
    float border_height = height_dimension_to_float(pixel_border_height);

    glUniform2f(cropped_position_uniform, border_x, border_y);
    glUniform2f(cropped_size_uniform, border_width, border_height);

// vertex shader
#version 330

layout(location = 0) in vec2 vertex_position;
out vec2 local_position;

uniform float depth;
uniform vec2 position;
uniform vec2 size;
uniform vec2 cropped_position;
uniform vec2 cropped_size;

void main(void) {
    local_position = vertex_position;

    gl_Position = vec4(
        vertex_position * (cropped_size / 2) + cropped_position,
        depth,
        1.0
    );
}

We shrink the quad to the cropped size and move it to the position we want. On a normal window the crop is simply the normal size of the window. The division by 2 comes from OpenGL’s normalized device coordinates: the center of the screen is (0, 0) and the screen spans from (-1, 1) in the top left to (1, -1) in the bottom right, so a rectangle covering the whole screen is 2 units wide.
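The pixel-to-OpenGL helpers called in the C++ snippet above are not shown in this post. A plausible version, assuming the usual mapping from X11 pixel coordinates (origin in the top left, y growing downward) to normalized device coordinates, could look like the sketch below. The function names match the calls above, but the bodies and the explicit screen-size parameters are assumptions.

```cpp
// Assumed bodies for helpers whose names appear in the snippet above.
// The screen size is passed explicitly here, which the original
// one-argument calls do not do.

// Pixel x in [0, screen_w] maps to [-1, +1].
float x_coordinate_to_float(float x_px, float screen_w) {
  return 2.0f * x_px / screen_w - 1.0f;
}

// Pixel y in [0, screen_h] maps to [+1, -1]: X11's y axis points down,
// OpenGL's points up.
float y_coordinate_to_float(float y_px, float screen_h) {
  return 1.0f - 2.0f * y_px / screen_h;
}

// Sizes only scale: a quad as wide as the screen is 2 units wide.
float width_dimension_to_float(float w_px, float screen_w) {
  return 2.0f * w_px / screen_w;
}

float height_dimension_to_float(float h_px, float screen_h) {
  return 2.0f * h_px / screen_h;
}
```

With this mapping, a position passed to the shader is the center of the quad, which is why the snippet above adds half the pixel width and height before converting.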

// fragment shader
#version 330

in vec2 local_position;
out vec4 fragment_colour;

uniform float opacity;
uniform sampler2D texture_sampler;
uniform vec2 position;
uniform vec2 size;
uniform vec2 cropped_position;
uniform vec2 cropped_size;

void main(void) {
    vec2 uncroped_position = (
        local_position / ((size / 2) / (cropped_size / 2)) 
        - position 
        + cropped_position
    );
    
    vec4 colour = texture(
        texture_sampler,
        uncroped_position * vec2(0.5, -0.5) + vec2(0.5)
    );
    
    float alpha = opacity * colour.a;
    fragment_colour = vec4(colour.rgb, alpha);
}

Then here we actually sample the texture. If size and position are equal to cropped_size and cropped_position, the terms cancel out. If the cropped size is smaller than size, it rescales how we sample the texture to account for the smaller quad size. Hurts my head a little, took a hot second to figure out on a whiteboard.
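One way to convince yourself of the cancellation is to mirror the shader expression on the CPU and plug in numbers. The sketch below copies the arithmetic from the fragment shader component by component; the `Vec2` struct and the function name are made up for the example.

```cpp
// Minimal 2D vector, standing in for GLSL's vec2.
struct Vec2 {
  float x, y;
};

// CPU mirror of the fragment shader expression:
//   local / ((size / 2) / (cropped_size / 2)) - position + cropped_position
Vec2 uncropped_position(Vec2 local, Vec2 position, Vec2 size,
                        Vec2 cropped_position, Vec2 cropped_size) {
  const float scale_x = (size.x / 2.0f) / (cropped_size.x / 2.0f);
  const float scale_y = (size.y / 2.0f) / (cropped_size.y / 2.0f);
  return {local.x / scale_x - position.x + cropped_position.x,
          local.y / scale_y - position.y + cropped_position.y};
}
```

When the crop equals the full quad, the scale is 1 and the two position terms cancel, so the function returns `local` unchanged. With a crop half the size of the quad and both positions at the origin, the scale is 2 and the corner `local = (1, 1)` maps to `(0.5, 0.5)`.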

So now we can just put our cropped base window to act as a frame in between another window and the window it frames.

In pictures:

On the browser end, we can just give it a command to define these raised window borders.

  message_queue.push({
    t: "window_register_border",
    window: window_map_reply.window,
    x: -BORDER_BASE,
    y: -BORDER_WIDTH + -BORDER_BASE,
    width: BORDER_BASE,
    height: BORDER_BASE,
  } as WindowRegisterBorderRequest);

Then when we handle interactions we just pass all click events that would go to a border to the base window instead of the window the user is actually hovering according to X.
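The routing decision boils down to a rectangle test. Here is a rough sketch under a few assumptions: the `Rect` and `ClickTarget` types and all function names are hypothetical, and the outer frame rectangle is derived with the same arithmetic as pixel_border_width and pixel_border_height in the rendering snippet earlier.

```cpp
#include <cstdint>

// Hypothetical types. For a border, x/y/width/height are offsets
// relative to the window, as in the rendering snippet earlier.
struct Rect {
  std::int32_t x, y, width, height;
};

bool contains(const Rect& r, std::int32_t px, std::int32_t py) {
  return px >= r.x && px < r.x + r.width &&
         py >= r.y && py < r.y + r.height;
}

// Outer frame rectangle in screen pixels.
Rect frame_rect(const Rect& window, const Rect& border) {
  return {window.x + border.x, window.y + border.y,
          window.width - border.x + border.width,
          window.height - border.y + border.height};
}

enum class ClickTarget { None, Window, Base };

// Clicks inside the client area go to the window itself; clicks on the
// surrounding frame are redirected to the base (browser) window.
ClickTarget route_click(const Rect& window, const Rect& border,
                        std::int32_t px, std::int32_t py) {
  if (contains(window, px, py)) return ClickTarget::Window;
  if (contains(frame_rect(window, border), px, py)) return ClickTarget::Base;
  return ClickTarget::None;
}
```

For a 200x150 window at (100, 100) with border offsets (-4, -24, 4, 4), a click at (150, 90) lands on the title bar area and is routed to the base window, while a click at (150, 150) goes to the window itself.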

The last technical hurdle is window icons. X provides window icons to the window manager as an uncompressed array of RGBA values, which for obvious reasons isn’t easy for a browser to process, so we first need to convert the image to a PNG. Then, to send it over Protobuf and subsequently JSON, I encode it as a base64 data URL. This has the added benefit that it can be passed directly into the src= attribute of an img tag.
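The data URL step is small enough to show in full. The sketch below assumes the PNG bytes already exist (the PNG encoding itself is left out) and uses hypothetical function names; it is a plain RFC 4648 base64 encoder, not necessarily what the repo uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Standard base64 (RFC 4648) with '=' padding.
std::string base64_encode(const std::vector<std::uint8_t>& bytes) {
  static const char table[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve((bytes.size() + 2) / 3 * 4);
  for (std::size_t i = 0; i < bytes.size(); i += 3) {
    const std::size_t left = bytes.size() - i;
    // Pack up to three bytes into the top 24 bits of a chunk.
    std::uint32_t chunk = static_cast<std::uint32_t>(bytes[i]) << 16;
    if (left > 1) chunk |= static_cast<std::uint32_t>(bytes[i + 1]) << 8;
    if (left > 2) chunk |= static_cast<std::uint32_t>(bytes[i + 2]);
    // Emit four 6-bit symbols, padding where input bytes were missing.
    out += table[(chunk >> 18) & 63];
    out += table[(chunk >> 12) & 63];
    out += left > 1 ? table[(chunk >> 6) & 63] : '=';
    out += left > 2 ? table[chunk & 63] : '=';
  }
  return out;
}

// Wrap already-encoded PNG bytes so they can go straight into <img src=...>.
std::string png_data_url(const std::vector<std::uint8_t>& png) {
  return "data:image/png;base64," + base64_encode(png);
}
```

Base64 grows the payload by roughly a third, which is a reasonable trade for being able to carry the icon through JSON as a plain string.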

How do I write one?

I’m leaving writing a full tutorial for another time, but I have a few demo repos.

Windows 98 styled, written in dreamland

github.com/FoxMoss/dote-dreamland-win95-example

This is the one I actually developed on and will have the least amount of bugs. dreamland.js is a great framework, if you’re new take a look at the docs.

Windows XP styled, written in React

github.com/FoxMoss/dote-react-xp-example

A port of the Win 98 one to React and XP.css, a bit more rough.

DVD Logo Window Manager, written in vanilla js

github.com/FoxMoss/dote-vanilla-dvd-example

The most rough of the bunch, completely unusable. Many hours will be wasted waiting for your terminal to hit the corner.

Some Musings

There are two major places I would consider taking the project. You might have noticed how modular the project is: you can easily replace any of the 3 layers and have the others still behave. So in that vein I would be interested in trying to switch out the X11 compositing for Wayland compositing. From my observations and single attempt, the libraries and ecosystem for Wayland aren’t ready for quick prototyping and rapid development. The other part that I would attempt to switch out is the browser. We will occasionally get some syncing delay as the browser either tries to catch up with the window manager or the window manager tries to catch up with the browser. It’s not very noticeable on faster machines, but I would like to expand the range of usability. These problems could be sidestepped with greater control of the Javascript engine and frame rendering, so possibly a full Chromium or Ladybird fork may help out.

So that’s about it for the technical details. We’re on the AUR. Go try it out, see the GitHub for up-to-date install instructions, and go forth and write your own window manager.

Huge thank you to Nihaal and Addy for proofreading and giving me feedback.