Refactor Safely with AI: Using MCP and Traffic Replay to Validate Code Changes

As software engineers using AI coding assistants, we’re quickly learning about a new anti-pattern: Hallucinated Success. You give your agent (e.g., Claude via the terminal or an IDE code assistant) the command “refactor the billing controller.” The agent happily complies, churning out nice, clean code. It even writes a new unit test suite that passes at 100%. You integrate it. Your test suites pass. Your production code breaks. Why?

ROI of Digital Twin Testing: Cut Testing Costs by 50%

When engineering leaders review their cloud bills, they often focus on production costs—the infrastructure serving real users, processing real transactions, generating real revenue. But there’s a shadow cost lurking in every cloud environment that often goes unnoticed until it becomes painful: non-production infrastructure.
Sponsored Post

Peeking Under the Hood with Claude Code

Claude Code is one of the go-to AI-native coding tools for developers. Because it's a simple chatbot interface housed inside a familiar CLI, it provides a pretty smooth path between traditional IDEs and agentic AI. But what's actually happening behind the scenes when you ask it to write code, generate a test, or debug an issue? Who and what is it talking to? Can I prevent data leakage, or do I need to add another layer to my tin foil hat? To answer these questions, I used proxymock to inspect the network traffic flowing from Claude Code.

Moving Our Observability Data Collector from Sidecars to eBPF

For years, the Kubernetes sidecar pattern has been a practical way to capture observability data. Running a collector alongside each application pod gave us deep visibility into traffic, including full request and response payloads across supported protocols. However, as cloud-native environments have grown more complex, the limitations of sidecars—such as resource overhead, operational complexity, and scaling challenges—have become more apparent.

Mock vs Stub: Essential Differences

When discussing API testing, one of the most common pairs of terms you might encounter is “mocks” and “stubs.” These terms are ubiquitous, but understanding exactly how they differ from one another, and when each is the correct method for software testing, is critical to building an appropriate test and validation framework. In this blog, we’re going to talk about the differences and similarities between mocks and stubs.

The CES Hangover: 3 Expensive Hardware Fails That Were Actually Software Problems

The dust has settled on Las Vegas. We saw transparent TVs, cars that drive sideways, and enough “AI-powered” toothbrushes to confuse a dentist. CES is incredible at selling the dream of hardware. The demos are slick, the lighting is perfect, and everything works on the showroom floor. But as engineers, we know the dirty secret of CES: The hardware is the easy part.

Let Your LLM Debug Using Production Recordings

Modern LLM coding agents are great at reading code, but they still make assumptions. When something breaks in production, those assumptions can slow you down—especially when the real issue lives in live traffic, API responses, or database behavior. In this post, I’ll walk through how to connect an MCP server to your LLM coding assistant so it can pull real production data on demand, validate its assumptions, and help you debug faster.

Speedscale vs. LocalStack for Realistic Mocks

API mocking plays a crucial role in modern software development, allowing developers to simulate external API endpoints. It’s an effective way to isolate your application for testing and ensure that code changes don’t inadvertently break critical dependencies. Essentially, API mocking helps you create robust, reliable software by letting you test how your application interacts with external services.

How to Do Full-Text Search Across All Application Traffic with Speedscale

Modern DevOps observability tools are excellent for monitoring system health, tracking distributed traces, and aggregating metrics. However, they lack the fidelity needed for full-text search across application traffic. While observability platforms excel at showing what happened and when, they often fall short when you need to find where a specific piece of data (like an email address, user ID, or transaction token) appears as it flows through your entire application stack.