Leetcode 631. Design Excel Sum Formula
Asked at:
Airbnb
DESCRIPTION
Design a spreadsheet system that supports setting cell values (integers) and formulas (sum of other cells). The core challenge is maintaining a dependency graph where formulas automatically recompute when their dependencies change. For example, if cell C sums cells A and B, updating A should automatically propagate to C and any cells depending on C.
Input:
set_value("A", 5) set_value("B", 10) set_sum("C", ["A", "A", "B"]) get_value("C")
Output:
20
Explanation: Cell C computes A + A + B = 5 + 5 + 10 = 20
Constraints:
- All operations are serial (no concurrency)
- No circular dependencies will occur
- Non-existent cells have value 0
- Formulas can reference the same cell multiple times
Understanding the Problem
The main challenge is efficiently propagating updates through a dependency graph. When a cell value changes, all cells that directly or indirectly depend on it must recompute. This requires tracking two relationships: forward dependencies (which cells does this cell reference) and reverse dependencies (which cells reference this cell). A naive approach recomputes everything on each change, but we need targeted propagation to only affected cells.
Building Intuition
A naive approach stores cell values and recomputes all formulas on every get_value call, which is inefficient. A better approach maintains a dependency graph where each cell tracks its dependents, enabling targeted updates. When cell A changes, we propagate updates only to cells that depend on A, then recursively to their dependents. For example, updating A triggers recomputation of C, which then triggers D.
This pattern appears in spreadsheet applications (Excel, Google Sheets), reactive programming frameworks (React, Vue), and build systems (Make, Bazel). Understanding dependency tracking and change propagation is fundamental to building systems where derived values must stay synchronized with their sources efficiently.
Common Pitfalls
Implementation
Core Data Structures and Storage
Implement the foundational storage layer with three key structures: a values dictionary storing computed cell values, a formulas dictionary storing sum formulas, and a dependents graph tracking reverse dependencies (which cells depend on each cell). The dependents graph is critical for propagation—when cell A changes, we look up dependents[A] to find all cells that need recomputation. For example, if C depends on A, then dependents[A] contains C.
class Spreadsheet:def __init__(self):self.values = {} # key -> computed valueself.formulas = {} # key -> list of dependency keysself.dependents = {} # key -> set of keys that depend on it
Setting Values and Managing Dependencies
Implement set_value and set_sum operations that update the storage structures from Section 1. When setting a sum formula, first clear old dependencies by removing the cell from its previous dependencies' dependent lists, then register new dependencies by adding the cell to each referenced cell's dependent list. For example, changing set_sum("C", ["A"]) to set_sum("C", ["B"]) removes C from dependents[A] and adds it to dependents[B]. After updating dependencies, trigger recomputation of the changed cell.
def set_value(self, key: str, value: int):self._clear_dependencies(key)self.values[key] = valueself.formulas.pop(key, None)self._recompute(key)def set_sum(self, key: str, values: list):self._clear_dependencies(key)self.formulas[key] = valuesfor dep_key in values:if dep_key not in self.dependents:self.dependents[dep_key] = set()self.dependents[dep_key].add(key)self._recompute(key)def _clear_dependencies(self, key: str):if key in self.formulas:for dep_key in self.formulas[key]:if dep_key in self.dependents:self.dependents[dep_key].discard(key)
Dependency Propagation and Recomputation
Implement the update propagation mechanism that recursively recomputes affected cells. When a cell changes, compute its new value (either the set integer or sum of referenced cells), then recursively propagate to all cells in its dependents set. Use a visited set or topological ordering to handle complex dependency chains efficiently. For example, updating A recomputes C (which depends on A), then recomputes D (which depends on C). The get_value operation simply returns the precomputed value from the values dictionary.
def _recompute(self, key: str):if key in self.formulas:self.values[key] = sum(self.values.get(dep, 0) for dep in self.formulas[key])if key in self.dependents:for dependent in self.dependents[key]:self._recompute(dependent)def get_value(self, key: str) -> int:return self.values.get(key, 0)
Complete Integration
Combine the storage structures (Section 1), dependency management (Section 2), and propagation logic (Section 3) into a unified Spreadsheet class. The class encapsulates all three dictionaries and provides the public API (set_value, set_sum, get_value). Demonstrate how setting a value triggers dependency updates, which trigger propagation, resulting in all dependent cells being automatically recomputed. For example, show the complete flow: set_value("A", 5) → update values[A] → propagate to dependents[A] → recompute all affected cells.
# Complete Integration - Spreadsheet class already defined above# Demonstration of usage flow:if __name__ == "__main__":sheet = Spreadsheet()# Setting values triggers storage updatesheet.set_value("A", 5)sheet.set_value("B", 10)# Setting formulas establishes dependencies and computessheet.set_sum("C", ["A", "A", "B"]) # C = 5+5+10 = 20sheet.set_sum("D", ["C", "C", "A"]) # D = 20+20+5 = 45# Updating A propagates through C to D automaticallysheet.set_value("A", 100) # D = 200+200+100 = 500print(sheet.get_value("D")) # 500
What We've Learned
- Pattern: Maintain both forward (what I depend on) and reverse (who depends on me) dependency graphs for efficient propagation
- Use Case: Reactive systems, spreadsheets, build tools—any system where derived values must stay synchronized with sources
- Optimization: Precompute and cache values; only recompute on changes rather than on every read
- Critical Detail: Always clear old dependencies before registering new ones to prevent stale references
Problems to Practice
Topological sort deals with dependency graphs where nodes depend on other nodes, similar to how Excel cells depend on other cells through formulas. This lesson teaches how to handle and traverse dependency relationships, which is the core challenge in the Excel sum formula problem.
medium
This problem requires detecting cycles in a dependency graph and validating that prerequisites can be satisfied. The Excel problem similarly requires maintaining dependencies without cycles and propagating updates through the dependency chain, making this an excellent practice problem for understanding dependency management.
medium
This problem extends Course Schedule by requiring the actual ordering of dependent nodes, which directly parallels computing cell values in the correct order when dependencies exist. It teaches how to process nodes in dependency order, essential for propagating updates in the Excel formula system.
Question Timeline
See when this question was last asked and where, including any notes left by other candidates.
Late March, 2025
Airbnb
Senior
Part 1: Set Value as Sum Your task is to design a data structure which allows you to set and retrieve (key, value) pairs. There are two cases: - The key is a string and the value is an integer. - The key is a string and the value is a list of strings (keys). The value of the cell is the *sum* of the value of the keys. API: > set_value(key: str, value: int) > set_sum(key: str, values: List[str]) > get_value(key: str) If the value of a key changes, any sum using that value should update as well. You may assume all calls are serial and there will be no cycles. If a key doesn't exist, you can treat its value as 0. An example sequence of actions: * set_value("A", 5) * set_value("B", 10) * set_sum("C", ["A", "A", "B"]) * set_sum("D", ["C", "C", "A"]) * get_value("A") # -> 5 * get_value("B") # -> 10 * get_value("C") # -> 20 * get_value("D") # -> 45 * set_value("A", 100) # Anything depending on "A" should update * get_value("D") # -> 520 D / | \ C A C visualization of the / | \ / | \ dependencies above A A B A A B
Comments
Hello Interview Premium
Your account is free and you can post anonymously if you choose.