Why We Replaced Cube.js with an In-House Analytics Engine
And cut infrastructure costs by 97% while getting 20× faster
At some point, every fast-growing engineering team runs into the same uncomfortable truth: the tool that helped you move quickly at the start is now quietly slowing you down.
When we started, we were using Cube.js, a great abstraction layer that helped us move fast, structure our metrics, and bridge warehouse-specific SQL between various application layers. But as our use cases stretched from dashboards to real-time decisioning, the cracks started to show, specifically in scaling patterns.
This blog walks through why we replaced Cube.js with a purpose-built in-house analytics engine called Antman, and how we ended up with a system that’s 20× faster, simpler, and deeply aligned with our needs.
We’re also preparing to open-source it soon, so here’s the backstory →
1. Where Cube.js Shined (And Why We Started There)
When we first integrated it, Cube.js was exactly what we needed.
We were building GobbleCube, a real-time operating layer for commerce brands, and analytics was core to everything. We needed to move fast, and Cube.js delivered:
- quick integration,
- standard metrics and definitions,
- SQL abstraction across warehouses,
- shipping dashboards and analytics without reinventing the wheel.
For early and mid-stage scale, the open-source offering worked beautifully. But once our production traffic crossed ~1,000 requests per minute, performance wobbled and latency crept up.
2. The Breaking Point
a. The Architectural Mismatch
Cube.js is designed to be flexible. It supports multiple heterogeneous data sources and complex semantic modeling. That’s powerful… if and when you need it.
In our case, supporting complex semantic models was only part of the problem. Our workloads demanded a system that could sustain extremely high throughput while consistently delivering sub-second latencies.
The abstraction layer that made us productive early on was now slowing us down.
On top of that, Cube.js Open Source offered no reliable way to observe or monitor the system in production. That was a big no for us!
b. Application-Level Use Cases
Our analytics layer did more than serve dashboards. It powered:
- Application-facing queries
- Internal services
- Near-real-time decisioning paths
These aren’t passive queries - they sit directly in request workflows. That meant we needed:
- Predictable, low-latency responses
- Request-level execution control
- Tight integration with app logic
And Cube.js couldn’t keep up.
We hit several roadblocks:
- No reliable fallback to alternate data sources
- No support for bi-directional joins across schemas
- No native support for column-level search using PostgreSQL or warehouse indexing
- Inconsistent behavior with window functions
- No way to attach contextual information to data definitions so our AI agents could autonomously generate queries
At its core, Cube.js is built for BI-style workloads. But we were running application-critical query paths, and that required a whole different level of precision and performance.
Cube.js’s core advantage lies in its pre-aggregation model, which proved effective for certain workloads but fell short of our particular requirements.
c. The Resource Problem
This is where it got absurd. Each Cube.js router pod consumed:
- ~6 GB of RAM
- ~4 vCPUs
And these pods were mostly generating SQL queries and routing them to our actual databases.
At peak traffic, we were running 30 router pods. That's 180 GB of RAM and 120 vCPUs just for the query orchestration layer. Not the databases actually doing the work, just the middleware sitting in front of them.
The infrastructure cost was climbing month over month, but more frustrating was the operational complexity. We couldn't fine-tune performance at the granular level we needed. Memory spikes were hard to predict. OOMs were rampant. CPU usage didn't scale linearly with query complexity.
We were stuck in a loop - throwing more pods at the problem, hoping to stay ahead of latency.
d. The Scaling Ceiling
And then came the final constraint - we couldn’t scale even if we wanted to.
On non-enterprise plans, Cube.js placed hard limits on infrastructure scaling. It wasn’t a technical ceiling; it was a contractual one.
Cube.js’s enterprise cloud plan wasn’t the right fit for us at the time, and what should’ve been an operational decision became a gated one. And as traffic kept climbing, the system simply couldn’t keep pace with our growth.
3. The Decision: Build Instead of Patch
We didn't jump straight to "let's build our own." That would have been naive.
First, we tried everything else:
- Aggressive caching at multiple layers
- Reducing the Cube.js surface area
- Hybrid approaches
- Other open-source alternatives (none fit our use cases directly)
None of it got us far enough. Each ‘hack’ would have delayed the problem; it wouldn’t have solved it.
Then came the breakthrough: the abstraction layer itself was the bottleneck.
Not our configuration of it. Not our usage patterns. The very architecture of a flexible, general-purpose semantic layer stood in the way of what we needed: raw, deterministic speed.
Once we saw that, the decision became obvious. We needed a purpose-built analytics engine designed specifically for our workload: high-concurrency, application-facing analytics against a known set of backends, optimized for the context-aware, state-driven decision-making that powers GobbleCube.
4. How We Designed the In-House System: Antman
Before writing a single line of code, we laid down four non-negotiables:
- Performance first, abstractions second: We'd only add abstraction layers when they demonstrably improved something without hurting latency or throughput.
- Explicit control over query planning and execution: Every query path should be predictable and debuggable.
- Predictable behavior under high concurrency: The system had to scale linearly under load. No sudden cliffs or degradation. It had to hold steady during flash sales and festive spikes, when our brands rely on us the most (see the sketch after this list).
- Minimal layers, minimal surprises: When something breaks, we should instantly know where and why.
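To make the second and third principles concrete, here’s a minimal Go sketch of the pattern — illustrative, not Antman’s actual code: every query runs under an explicit per-request deadline, and a buffered-channel semaphore caps in-flight queries so the engine sheds load predictably instead of queueing into a latency cliff.

```go
package engine

import (
	"context"
	"database/sql"
	"errors"
	"time"
)

// ErrOverloaded is returned immediately when all slots are taken, so the
// caller can shed load instead of queueing into a latency cliff.
var ErrOverloaded = errors.New("engine at capacity")

// Engine caps in-flight queries with a buffered-channel semaphore and
// forces an explicit deadline on every query path.
type Engine struct {
	db      *sql.DB
	slots   chan struct{} // one token per allowed in-flight query
	timeout time.Duration // hard per-query latency budget
}

func New(db *sql.DB, maxInFlight int, timeout time.Duration) *Engine {
	return &Engine{
		db:      db,
		slots:   make(chan struct{}, maxInFlight),
		timeout: timeout,
	}
}

// Query acquires a slot (or fails fast), runs the statement under a
// context deadline, and hands the rows to the caller's scan function
// before the slot and deadline are released.
func (e *Engine) Query(ctx context.Context, q string, scan func(*sql.Rows) error, args ...any) error {
	select {
	case e.slots <- struct{}{}:
		defer func() { <-e.slots }()
	default:
		return ErrOverloaded // at capacity: fail fast, stay predictable
	}

	qctx, cancel := context.WithTimeout(ctx, e.timeout)
	defer cancel()

	rows, err := e.db.QueryContext(qctx, q, args...)
	if err != nil {
		return err
	}
	defer rows.Close()

	if err := scan(rows); err != nil {
		return err
	}
	return rows.Err()
}
```

Failing fast with ErrOverloaded keeps tail latency bounded: callers can retry or degrade gracefully, which is far easier to reason about than an unbounded queue.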
We built our custom engine, Antman, to excel at three specific jobs:
- Application-facing analytics with strict latency SLAs
- Multi-destination query execution with intelligent routing and fallbacks (sketched below)
- Context-aware data access for our AI pipelines that needed to pull analytical context in real-time
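For the second job, the core idea is simple: try the primary destination, and on error or timeout route the same query to the next candidate. Here’s a hedged sketch of that fallback loop; the Destination type and function names are illustrative, not Antman’s API.

```go
package engine

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
	"time"
)

// Destination is one place a query can run, e.g. a ClickHouse primary
// with a PostgreSQL fallback. The type and names are illustrative.
type Destination struct {
	Name    string
	DB      *sql.DB
	Timeout time.Duration
}

// ExecuteWithFallback tries each destination in order. An error or a
// per-destination timeout routes the query to the next candidate instead
// of failing the whole request.
func ExecuteWithFallback(ctx context.Context, dests []Destination, q string, scan func(*sql.Rows) error) error {
	var lastErr error
	for _, d := range dests {
		err := func() error {
			dctx, cancel := context.WithTimeout(ctx, d.Timeout)
			defer cancel()

			rows, err := d.DB.QueryContext(dctx, q)
			if err != nil {
				return err
			}
			defer rows.Close()
			if err := scan(rows); err != nil {
				return err
			}
			return rows.Err()
		}()
		if err == nil {
			return nil // served successfully
		}
		lastErr = fmt.Errorf("%s: %w", d.Name, err) // fall through to next
	}
	if lastErr == nil {
		return errors.New("no destinations configured")
	}
	return lastErr
}
```

In practice you’d also want per-destination health tracking so a flapping backend gets skipped quickly, but the ordered loop is the essential shape.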
5. Technology Choices
We built the engine in Go, and this choice paid off immediately:
- Efficient concurrency model
- Predictable memory and CPU usage: Unlike Node.js (which Cube.js runs on), Go's memory management is deterministic. We can predict resource usage based on query patterns.
- Simple, observable runtime behavior
For data stores, we made an intentional decision to stay narrow (a minimal wiring sketch follows):
→ PostgreSQL (for transactional state: current inventory levels, SKU metadata, campaign configurations)
→ ClickHouse (for high-volume time-series data: demand signals, rank changes, pricing history across thousands of localities)
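Wiring this up in Go is deliberately boring. The sketch below opens both stores through database/sql and routes by workload shape; the specific drivers (lib/pq and clickhouse-go) are assumptions for illustration, not a statement of what we run in production.

```go
package engine

import (
	"database/sql"

	// Driver choices are assumptions for this sketch.
	_ "github.com/ClickHouse/clickhouse-go/v2" // registers the "clickhouse" driver
	_ "github.com/lib/pq"                      // registers the "postgres" driver
)

// Stores holds the two deliberately narrow backends.
type Stores struct {
	PG *sql.DB // transactional state: inventory levels, SKU metadata, campaign configs
	CH *sql.DB // high-volume time series: demand signals, rank changes, pricing history
}

func Open(pgDSN, chDSN string) (*Stores, error) {
	pg, err := sql.Open("postgres", pgDSN)
	if err != nil {
		return nil, err
	}
	ch, err := sql.Open("clickhouse", chDSN)
	if err != nil {
		pg.Close()
		return nil, err
	}
	return &Stores{PG: pg, CH: ch}, nil
}

// Route picks a backend from the shape of the workload: point lookups on
// mutable state go to PostgreSQL; append-only, time-bucketed scans go to
// ClickHouse.
func (s *Stores) Route(timeSeries bool) *sql.DB {
	if timeSeries {
		return s.CH
	}
	return s.PG
}
```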
6. The Result
The performance improvement was exactly what we expected. After integrating Antman, we observed a measurable reduction in end-to-end latency during the initial rollout, and since then the system has continued to deliver consistently low latency and stable performance in production: the 20× speedup and 97% infrastructure-cost reduction we opened with.
7. Why This Worked for Us
It worked because we were operating with clarity, constraints, and the willingness to go deep.
- We deeply understood our access patterns: We'd been running these queries through Cube.js for over a year. We knew exactly which patterns were common, which were rare, and what our latency budget was for each use case.
- We were willing to sacrifice generality: We built exactly what GobbleCube needed and nothing more.
- We had tight control over concurrency: Go's goroutines gave us fine-grained control over how queries are scheduled and executed.
- We had full ownership of the analytics layer: We control the application, the analytics engine, and the databases. We could co-design optimizations across all three layers.
If you don't have this level of clarity and control, a general-purpose solution like Cube.js might still be your best option. This worked for us because we were past the exploration phase and into the optimization phase.
8. Closing Thoughts
Cube.js was the right choice when we needed to move fast and standardize our analytics layer.
But as our needs evolved, so did the limits of general-purpose tools. Building Antman wasn’t about reinventing the wheel; it was about removing the parts we didn’t need so we could move faster.
The "build vs. buy" decision is always contextual. If you're hitting similar scaling walls with your analytics infrastructure, the lesson isn't "build your own immediately."
The lesson is: understand your workload deeply, measure ruthlessly, and be willing to trade generality for performance when the abstraction cost outweighs the abstraction benefit.