Brent Haskins / Applied AI
Dark Mode Isn't a Theme — It's a Design System Stress Test
Dark mode is no longer optional in 2026, but most teams treat it as a simple color inversion — and fail. Drawing from Etsy's engineering post-mortem, the Design Tokens Format Module, and real shipped experience, this post argues that dark mode is the ultimate stress test for your design system. If your tokens aren't semantic, your contrast ratios will break, your brand will look inconsistent, and your accessibility will suffer. The fix is not a theme toggle — it's a token architecture that separates semantics from values. Written for senior engineers and product builders who want to ship dark mode without the mess.
The short answer
Dark mode is not a theme toggle. It's a design system maturity test. Every team I've seen that treats it as a simple color inversion ends up with a mess: washed-out text, broken contrast ratios, and a brand that looks like a different product at night. The teams that ship dark mode cleanly — Etsy, for example — all share one thing: they had a semantic token system before they started.
The real work isn't picking dark colors. It's defining what "surface" means across modes, ensuring text meets WCAG contrast in both contexts, and deciding which components need special treatment. Dark mode exposes every inconsistency you've been hiding under a light background. If your design system isn't ready for that, dark mode will force it to grow up.
Key takeaways
- Dark mode exposes every inconsistency in your design system — fix those first, not after.
- Semantic tokens (e.g., color-surface-primary) are essential; primitive tokens (e.g., color-gray-900) are not enough.
- Contrast ratios must be re-evaluated per mode: what passes on white may fail on dark gray.
- Automation can generate token sets quickly, but only if you have a semantic layer to map to.
- Etsy's approach: treat dark mode as a product feature with its own UX decisions, not a CSS filter.
- Start with a small core set of tokens (background, text, border, interactive) and expand only after validation.
Why dark mode breaks most design systems
The most common mistake is assuming dark mode is just a color inversion. In reality, dark mode changes the entire visual hierarchy. A shadow that signals elevation on a light surface can all but vanish on a dark one. A blue link that works on white may disappear on dark blue. Etsy's engineering team discovered that their existing color palette had "ghost" inconsistencies: colors that looked fine in light mode but clashed in dark mode because they were never designed for that context.
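Another trap is accessibility. WCAG 2.x computes contrast from the relative luminance of the two colors, so the formula is the same in both modes, but the result is not: a foreground color that clears 4.5:1 (the AA threshold for normal-size text) against white can fall well short against a dark gray surface. Perceived contrast also shifts on dark backgrounds, so a pair that passes numerically may still feel insufficient. Test each token pair in both modes rather than assuming the math transfers.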
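To make the per-mode check concrete, here is a minimal sketch of the WCAG 2.x relative luminance and contrast ratio calculation. The function names and the sample hex values are my own illustrations, not taken from any real palette, and the parser only accepts 6-digit hex.

```typescript
/** Parse "#rrggbb" into [r, g, b] in the 0..255 range. */
function parseHex(hex: string): [number, number, number] {
  const clean = hex.trim().replace(/^#/, "");
  if (!/^[0-9a-f]{6}$/i.test(clean)) {
    throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  }
  const n = parseInt(clean, 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

/** Linearize one sRGB channel. Older WCAG text used 0.03928 as the
 *  threshold; for 8-bit colors the result is the same. */
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

/** Contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 to 21. */
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

/** AA threshold for normal-size text. */
function passesAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

With these helpers, a saturated link blue such as #0000ee passes AA against #ffffff but fails against a dark surface such as #121212, which is exactly the kind of pair that has to be checked in each mode.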
The token architecture that scales
Semantic tokens are the only way to manage multiple modes at scale. The Design Tokens Format Module 2025.10 defines a clean structure: primitive tokens hold raw values (like hex codes), and semantic tokens reference them with overrides per mode. For example, a color-surface-primary token might point to {color.white} in light mode and {color.gray-950} in dark mode. The $extends keyword lets you inherit a base set and override only what changes — essential for keeping token files manageable.
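The primitive/semantic split can be sketched as a small in-memory model. This is not the spec's JSON file format, only an illustration of the idea: the token names, hex values, and the resolveToken helper are assumptions of mine, and only the curly-brace alias style mirrors the references mentioned above.

```typescript
type Mode = "light" | "dark";

// Primitive tokens hold raw values. Names and hex codes are illustrative.
const primitives: Record<string, string> = {
  "color.white": "#ffffff",
  "color.gray-50": "#fafafa",
  "color.gray-900": "#171717",
  "color.gray-950": "#0a0a0a",
};

// Semantic tokens name a role and point at one primitive per mode.
const semantic: Record<string, Record<Mode, string>> = {
  "color-surface-primary": { light: "{color.white}", dark: "{color.gray-950}" },
  "color-text-primary": { light: "{color.gray-900}", dark: "{color.gray-50}" },
};

/** Resolve a semantic token to a raw value for the given mode. */
function resolveToken(name: string, mode: Mode): string {
  const entry = semantic[name];
  if (!entry) throw new Error(`Unknown semantic token: ${name}`);
  const alias = entry[mode];
  // Anything not wrapped in braces is treated as an already-raw value.
  if (!(alias.startsWith("{") && alias.endsWith("}"))) return alias;
  const value = primitives[alias.slice(1, -1)];
  if (value === undefined) throw new Error(`Alias ${alias} does not resolve`);
  return value;
}
```

Components only ever ask for color-surface-primary; which primitive that means is decided once, per mode, in the token layer.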
This isn't theoretical. In shipped products, I've seen token files grow to thousands of entries. Without a semantic layer, you end up with duplicate values, missed overrides, and a design system that no one trusts. The spec is a good starting point, but you also need tooling to validate that every semantic token has a value for every mode.
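The validation step can be as simple as the sketch below: walk every semantic token and report any mode without a value. The data shape and the findMissingModes name are hypothetical, chosen for illustration rather than taken from any particular tool.

```typescript
type ModeValues = { [mode: string]: string | undefined };

/** Return one message per semantic token that lacks a value for a required mode. */
function findMissingModes(
  tokens: { [name: string]: ModeValues },
  requiredModes: readonly string[],
): string[] {
  const problems: string[] = [];
  for (const [name, values] of Object.entries(tokens)) {
    for (const mode of requiredModes) {
      const value = values[mode];
      if (value === undefined || value.trim() === "") {
        problems.push(`${name}: missing value for mode "${mode}"`);
      }
    }
  }
  return problems;
}

// Illustrative input: the second token never got a dark override.
const exampleTokens: { [name: string]: ModeValues } = {
  "color-surface-primary": { light: "{color.white}", dark: "{color.gray-950}" },
  "color-border-subtle": { light: "{color.gray-200}" },
};

const missing = findMissingModes(exampleTokens, ["light", "dark"]);
```

Run as a CI check, a non-empty result would fail the build before a half-themed component reaches users.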
Automation isn't a silver bullet
The OneMinuteBranding article shows you can auto-generate tokens in seconds. That's true — if you already have a semantic structure. But automation without semantic mapping is just generating more primitive tokens. You'll get a list of dark colors that don't map to your components. The real value of automation is in generating the dark-mode values from your light-mode tokens using algorithms (like luminance adjustment), then letting designers tweak the outliers.
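As one example of what a luminance adjustment could look like, the sketch below flips HSL lightness and clamps it so pure white does not become pure black. This is a deliberately naive algorithm of my own choosing, not the method from the article or from Etsy, and the clamp bounds are arbitrary assumptions.

```typescript
/** Convert "#rrggbb" to [hue 0..360, saturation 0..1, lightness 0..1]. */
function hexToHsl(hex: string): [number, number, number] {
  const clean = hex.trim().replace(/^#/, "");
  if (!/^[0-9a-f]{6}$/i.test(clean)) throw new Error(`Bad hex color: "${hex}"`);
  const n = parseInt(clean, 16);
  const r = ((n >> 16) & 0xff) / 255;
  const g = ((n >> 8) & 0xff) / 255;
  const b = (n & 0xff) / 255;
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const l = (max + min) / 2;
  const d = max - min;
  if (d === 0) return [0, 0, l]; // achromatic: hue and saturation are zero
  const s = d / (1 - Math.abs(2 * l - 1));
  let h: number;
  if (max === r) h = ((g - b) / d) % 6;
  else if (max === g) h = (b - r) / d + 2;
  else h = (r - g) / d + 4;
  h *= 60;
  if (h < 0) h += 360;
  return [h, s, l];
}

/** Convert HSL back to "#rrggbb". */
function hslToHex(h: number, s: number, l: number): string {
  const c = (1 - Math.abs(2 * l - 1)) * s;
  const x = c * (1 - Math.abs(((h / 60) % 2) - 1));
  const m = l - c / 2;
  let rgb: [number, number, number];
  if (h < 60) rgb = [c, x, 0];
  else if (h < 120) rgb = [x, c, 0];
  else if (h < 180) rgb = [0, c, x];
  else if (h < 240) rgb = [0, x, c];
  else if (h < 300) rgb = [x, 0, c];
  else rgb = [c, 0, x];
  return "#" + rgb
    .map((v) => Math.round((v + m) * 255).toString(16).padStart(2, "0"))
    .join("");
}

/** Derive a dark-mode candidate by flipping lightness, clamped to [minL, maxL]. */
function deriveDarkCandidate(hex: string, minL = 0.07, maxL = 0.93): string {
  const [h, s, l] = hexToHsl(hex);
  const flipped = Math.min(maxL, Math.max(minL, 1 - l));
  return hslToHex(h, s, flipped);
}
```

Neutrals behave sensibly here, but a fully saturated mid-lightness color such as #0000ff comes back unchanged, which is the kind of outlier a designer still has to adjust by hand.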
Etsy didn't automate their way out. They manually audited every color usage, then built a token system that could be extended. Automation helped with the grunt work, but the hard part was deciding which tokens existed in the first place.
What Etsy learned the hard way
Etsy's engineering blog details how dark mode forced them to improve their entire design system. They started with iOS and Android, not web, because mobile had the clearest user demand. They built a semantic token system that separated concerns: background, text, border, icon, and interactive tokens. The process revealed gaps — missing hover states, inconsistent border widths, and colors that were used interchangeably but should have been distinct tokens.
Their key insight: dark mode isn't just a UI change; it's a product feature. They had to decide which components should invert and which should stay light (like product images). They also had to handle edge cases like dark mode in a dark room — contrast needed to be lower to avoid eye strain. That's product thinking, not just CSS.
The real product decision
Dark mode is a product decision, not a frontend task. It affects readability, battery life (on OLED screens), and brand perception. Some users prefer dark mode for accessibility reasons (migraines, low vision). Others just like the aesthetic. Your job is to make it feel intentional, not like a half-baked toggle.
This means choosing which surfaces get dark — do you darken the entire app or just content areas? Do you offer a system-default option or force a choice? Etsy chose to respect the OS setting but also allow manual override. That's the right call for most products: give users control, but default to the system.
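The "default to the system, allow an override" rule reduces to a small pure function. This is a sketch of the decision logic only, not Etsy's code; the type and function names are hypothetical, and the browser wiring appears only as a comment.

```typescript
type ThemePreference = "system" | "light" | "dark";
type ResolvedTheme = "light" | "dark";

/**
 * Decide which theme to render.
 * - "system" follows the OS setting.
 * - "light" or "dark" is an explicit user override and always wins.
 */
function resolveTheme(
  preference: ThemePreference,
  systemPrefersDark: boolean,
): ResolvedTheme {
  if (preference === "system") {
    return systemPrefersDark ? "dark" : "light";
  }
  return preference;
}

// In a browser, systemPrefersDark would typically come from:
//   window.matchMedia("(prefers-color-scheme: dark)").matches
// and the stored preference from wherever the app keeps user settings.
```

Keeping the decision separate from the DOM makes it easy to test, and new users get the "system" default until they choose otherwise.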
Closing: start with tokens, not themes
If you're planning dark mode for 2026, don't start by picking dark colors. Start by auditing your current design system. Find every color usage, group them into semantic categories, and build a token file with a clear structure. Then generate dark-mode values, test contrast in both modes, and iterate. Dark mode will ship cleaner, and your design system will be stronger for it.
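For the audit step, a first pass can be as crude as counting hex literals in your stylesheets. The sketch below is a hypothetical helper, not a real tool: it handles only 3- and 6-digit hex, ignores rgb(), hsl(), and named colors, and will also count ID selectors that happen to look like hex (for example #fab).

```typescript
/** Count hex color literals in a stylesheet string, normalized to lowercase "#rrggbb". */
function countHexColors(source: string): Record<string, number> {
  const counts: Record<string, number> = {};
  const pattern = /#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/g;
  for (const raw of source.match(pattern) ?? []) {
    let hex = raw.toLowerCase();
    if (hex.length === 4) {
      // Expand shorthand: "#abc" becomes "#aabbcc".
      hex = "#" + hex.slice(1).split("").map((ch) => ch + ch).join("");
    }
    counts[hex] = (counts[hex] ?? 0) + 1;
  }
  return counts;
}

// Illustrative input: the same white written two different ways.
const sampleCss = `
  .card { color: #1a1a1a; background: #FFF; }
  .modal { background: #ffffff; }
`;
const usage = countHexColors(sampleCss);
```

Sorting the result by count gives a rough map of which raw values deserve a semantic token first and which are one-off strays.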
Dark mode is a stress test. Pass it, and your design system can handle anything.
FAQ
Why is dark mode harder than just inverting colors?
Inverting colors destroys contrast ratios, brand identity, and readability. Dark mode requires rethinking every color's role: backgrounds need depth, text needs adjusted luminance, and interactive states must remain distinguishable. Etsy found that dark mode exposed inconsistencies they didn't know they had — it forced a complete color system overhaul, not a quick flip.
What are design tokens and why do they matter for dark mode?
Design tokens are the atomic values — colors, spacing, typography — that define your UI. For dark mode, you need semantic tokens (like 'background-primary' or 'text-body') that map to different values per mode. Without them, you're hardcoding colors everywhere, which makes dark mode a nightmare to maintain. The Design Tokens Format Module standardizes this with $extends and $type.
How did Etsy approach dark mode in their design system?
Etsy treated dark mode as a product feature, not a skin. They started with iOS and Android, built a semantic token system, and used dark mode as a forcing function to clean up their entire color palette. The process revealed gaps in their design system — like missing hover states and inconsistent borders — which they fixed before shipping. The result was a stronger system for both modes.
Sources