Why Kosshi Stays Fast at Scale

Kosshi is built around keeping everything in one outline. Projects, journals, and tasks all go into the same outline, not separate files. Used this way, the outline grows to thousands, then tens of thousands of rows.

Keyboard input and drawing speed are handled by being a native app, but once the row count reaches hundreds of thousands or millions, that alone is not enough. The work behind each draw and edit has to be designed so that it does not grow with the row count.

This article explains what Kosshi does internally to keep the outline responsive as it grows.

Cost that grows with row count

Inside an outliner, each operation triggers several processes. Picking out which rows to draw, toggling visibility on a collapse, finding rows that match a search, updating layout after an edit. If all of this fits within the draw deadline, the screen stays smooth.

Problems appear when some of that work scales with the number of rows.

For example, finding "the third visible row from the top" looks simple, but a naive implementation walks the outline from the beginning — "this one is hidden by a collapse, skip; this one is visible, first; next one visible, second." Double the rows and the work doubles; ten times the rows, ten times the work. In programming this is called O(n), where n is the row count.

No matter how cheap each per-row step is, O(n) work eventually misses the draw deadline as n grows. A design with this kind of work scattered through it works up to around tens of thousands of rows. At 100,000 rows you start to feel pauses between operations; at 1,000,000 rows it is no longer usable.

Used this way, an outline can reach tens of thousands of rows in a year and hundreds of thousands in a few years. Kosshi is developed against a one-million-row benchmark.

Outline properties and efficiency

The outline has a few properties:

You usually zoom into the part you care about
Collapse hides descendants that are not relevant right now
Only 30 to 50 rows fit on the screen at any moment

Even with a million-row outline open, only a small fraction is on screen. The rest is either folded, outside the scroll range, or outside the zoom. Drawing and editing should only do work for what is on screen. The total row count can grow freely; the number of rows on screen does not, and neither does response time.

This approach is the foundation for handling large outlines. The rest of this article explains the mechanisms that make it work.

Speedup with SumTree

There is one problem. To draw the screen, you need to know each time which rows are between the top and the bottom of the viewport. Every search jump, collapse toggle, or scroll triggers this "which row goes here" calculation again.

A naive implementation ends up walking the outline from the start each time — "this row is hidden by a collapse, skip; this one is visible, first" — which is O(n) again. If this lookup is not fast, the earlier approach does not hold.

Kosshi uses a data structure called a SumTree (an augmented B+Tree) internally. B+Trees are widely used in databases and filesystems to index large amounts of data, letting you reach a target element by walking down a tree of nodes. A SumTree extends this by storing, at each node, a summary of what is in that subtree.

The name SumTree follows the one used by Zed, a fast code editor written in Rust¹. In Zed it keeps keystrokes instant even in large files; Kosshi adapts the same idea for an outliner.

Locating rows with aggregates

The SumTree's tree structure is not the outline's hierarchy. The outline's parent/child structure is what you see. The SumTree is an internal structure for handling all those rows efficiently, and it is invisible to the user.

The structure is shown below.

Diagram of the SumTree. Each node carries a summary of its subtree, with the actual rows at the bottom. — The internal structure of the SumTree. Internal nodes carry aggregates such as row count and completed-row count.

Each node has aggregates like:

how many rows are below it
how many of those are completed
the total pixel height

To find row 500,000, you walk down from the root: "left child has 200,000 rows, pass; next has 200,000, pass; the next has 150,000, it is in there — find row 100,000 within this one." No row is ever counted one at a time.

Since there are far fewer nodes than rows, even a one-million-row outline is reached in just a few steps.

Collapsing and zooming

Two operations characteristic of outliners are collapse and zoom. Both instantly change a large portion of what's on screen.

Suppose a parent has 10,000 descendants. A naive implementation visits all 10,000 to mark them hidden the moment you fold. Zoom is the same — every row placed outside the range has to be marked individually. The delay is proportional to the row count.

With the SumTree, you handle the subtree covering those 10,000 rows as a unit. For collapse: "this subtree has 0 visible rows." For zoom: "this subtree is outside the display range." None of the 10,000 rows inside need to be touched individually.

Drawing works the same way. "This node has 0 visible rows, skip it entirely" — this decision costs the work of just one node. Zoom-excluded parts, collapsed parts, and off-screen parts are all pruned at this level.

Applying aggregates

The aggregate mechanism has a side benefit: features that involve aggregation become easier to add.

Suppose you want to show "how many unfinished tasks are under this parent" next to the parent row. A naive implementation counts descendants on every open, invalidates the count on every edit, and becomes a fair amount of code.

In the SumTree, you add one field — "unfinished task count" — to the internal nodes' aggregate. Child aggregates sum up into parent aggregates automatically, so editing a row does not make the count stale. From the outline's perspective, the count under any given row is just a read of the aggregate on the SumTree node that covers that range.

Other features can be built the same way:

character count over a range
ratio of completed tasks
number of rows containing images
latest modification time
number of bookmarked rows

Each works by adding one field to the aggregate.

About Kosshi

Kosshi is a native outliner for macOS and iOS. Collapsing, moving, and restructuring respond to keyboard input without perceptible delay. It syncs between Mac and iPhone via iCloud.

For the design philosophy, see Everything in One Outline. For the basics, see What Is an Outliner?.

Try Kosshi Free for 7 Days

Zed Blog (2024). Zed Decoded: Rope & SumTree. https://zed.dev/blog/zed-decoded-rope-sumtree ↩