
Two Radency engineers built the same MVP, with and without AI. We measured their efficiency

  • Writer: Eduard Dolynskyi
  • Oct 15
  • 7 min read

Updated: Oct 16

At Radency, we’ve always been obsessed with radical efficiency (it’s literally what our name stands for). So when AI tools started entering the developer workflow, we made a strategic decision to integrate them into our delivery. Over the past year, we’ve put that into practice across client and internal projects. The next step was to quantify AI’s impact.


Since AI in software development is still new, there’s no single best practice for measuring its effect. Different tech teams track different things: Atlassian measures the acceptance rate of AI suggestions, while Dropbox looks at time saved per engineer each week.


We started with one simple metric: how much time AI saves a single engineer in their development workflow. To find out, we ran an R&D project.


Here’s what we learned: where AI helped, where it didn’t, and how those findings shaped the approach to AI-assisted development that now lets us ship 30%+ faster with the same quality.

TL;DR: AI-assisted vs traditional development

Note: By AI-assisted we mean a human-led process enhanced by AI tools.


Two Radency engineers built the same set of features for a Task Manager. One worked the traditional way. The other used Claude Code.


Goal: Measure developer efficiency with and without AI.


Result: 99h (traditional) vs 41h (AI) → 59% faster overall, with the development phase alone 62% faster (23h vs 61h).


Where AI helped most: backend development (architecture, business logic, data schemas, APIs), debugging & troubleshooting, test generation, boilerplate setup, and documentation.


Weakest areas: frontend/UI consistency, cross-layer integration, and long-session memory.


Key lesson: AI is most valuable when engineers know how to frame requirements and spot errors.


Impact: This experiment confirmed where AI adds the most value and gave us the first metrics that helped shape Radency’s AI-powered SDLC, which now cuts ~30% of delivery time → about $60K saved on a $200K project. 


Two certified Radency engineers got the same task: build a Task Manager MVP with a defined feature set, intended as the base for a real product with maintainable code. We had detailed requirements and mockups. One worked the traditional way, the other with Claude Code, the coding assistant we’d found most cost-effective and easiest to integrate into our workflow at the time. Both engineers completed our Certification program before the experiment began.


Tech task: Build the MVP of a PM tool for development teams.


Key features: 

  • Task board with lists and cards (create, edit, delete, move).

  • Cards include name, description, due date, priority, and assignee.

  • Activity log records all changes (create, edit, move, delete, reassign).

  • Slack integration sends notifications when tasks are created, assigned, or moved.

  • Authentication with email + password and JWT token.


Tech stack: React, Node.js, PostgreSQL, and the Slack API for integration, following a modular architecture with AI-generated boilerplate and human-reviewed core logic.
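
To make that scope concrete, here is a minimal TypeScript sketch of the Slack-notification piece, assuming the official @slack/web-api client. The notifySlack() helper, the TaskNotification shape, and the TASKS_CHANNEL variable are illustrative assumptions, not the actual project code.

```typescript
// Illustrative sketch only (assumed names), using the official @slack/web-api client.
import { WebClient } from "@slack/web-api";

const slack = new WebClient(process.env.SLACK_BOT_TOKEN);

type TaskEvent = "created" | "assigned" | "moved";

interface TaskNotification {
  event: TaskEvent;
  taskName: string;
  assignee?: string;
  listName?: string;
}

// Post a short message to a single, pre-configured channel whenever a task
// is created, assigned, or moved (the three events listed in the requirements).
export async function notifySlack(n: TaskNotification): Promise<void> {
  const details = [
    n.assignee ? `assignee: ${n.assignee}` : null,
    n.event === "moved" && n.listName ? `list: ${n.listName}` : null,
  ].filter(Boolean);

  await slack.chat.postMessage({
    channel: process.env.TASKS_CHANNEL ?? "#task-manager",
    text: `Task "${n.taskName}" ${n.event}${details.length ? ` (${details.join(", ")})` : ""}`,
  });
}
```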


Note on engineer selection: We chose two engineers of roughly the same skill level. The one using Claude Code was already AI-fluent (trained in our methodology of AI-powered engineering) and familiar with the tool, so no time was lost on AI adoption.


Here are the results.👇



Time spent with and without AI: 41h vs 99h


We tracked the hours spent in six development phases. The main question was how much AI could speed up the development stage, since it usually takes the most time. As expected, that’s where we saw the biggest absolute savings: 


  • Planning → 50% faster with Claude Code

  • Environment setup → 43% faster

  • Development → 62% faster 

  • Testing → 75% faster

  • Deployment → 33% faster

  • Documentation → 60% faster

| Phase | With AI (h) | Without AI (h) | Savings (%) | Claude Code usage |
| --- | --- | --- | --- | --- |
| Planning | 2 | 4 | 50 | Acceptance criteria, Slack message template |
| Setup | 4 | 7 | 43 | Dockerfile, docker-compose, GitHub Actions config |
| Development | 23 | 61 | 62 | Frontend (UI/UX, business logic), backend (API, models, database, authentication), Slack integration, bug fixing |
| Testing | 2 | 8 | 75 | Unit tests (API, app layer, domain layer, infrastructure, components, effects, services), Storybook setup |
| Deployment | 6 | 9 | 33 | Terraform scripts and deployment configs (ECS, Load Balancer, CI/CD), infrastructure design and environment setup |
| Documentation | 4 | 10 | 60 | Project docs (Claude.md, conventions, patterns) |
| Total | 41 | 99 | 59 (saved with Claude Code) | |

AI cut total development time by more than half. The biggest wins were in testing, development, and documentation, where AI reduced effort by 60% or more.



AI cut 33–75% of the time across different phases. Yet hours saved are just one metric. To understand AI’s real value, we wanted to see which types of tasks Claude Code helped with most, and why it fell short on others.



Where AI sped things up, and where it made things slower

The biggest wins in using AI were with tasks that had clear rules and repeatable patterns — backend logic, boilerplate, debugging. UI, cross-layer integration, and deployment still required a lot of manual effort.

Claude Code was strongest in the backend. APIs, database schema, and queries came together fast and clean, with little rework. It also saved time in setup: boilerplate like Docker configs and dependencies took minutes instead of hours. 


Debugging was also a big win: Claude quickly interpreted error messages and suggested fixes.


Testing was another area of advantage, but we found it worked best when we generated tests in small groups: for a single component or feature at a time. This made them easier to check, edit, and fix, while also reducing token use during iterations.
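
As a hedged illustration of that “one component or feature at a time” approach, a single focused Jest test might look like the sketch below; the moveCard() function, the module path, and the board shape are hypothetical stand-ins, not the experiment’s actual code.

```typescript
// Hypothetical example: one small test file per feature keeps AI-generated
// tests easy to review and cheap to regenerate.
import { moveCard } from "../domain/board"; // assumed module path

describe("moveCard", () => {
  it("moves a card to the target list and keeps its fields intact", () => {
    const board = {
      lists: [
        { id: "todo", cards: [{ id: "c1", name: "Write docs", priority: "high" }] },
        { id: "done", cards: [] },
      ],
    };

    const updated = moveCard(board, { cardId: "c1", toListId: "done" });

    expect(updated.lists.find((l) => l.id === "todo")?.cards).toHaveLength(0);
    expect(updated.lists.find((l) => l.id === "done")?.cards[0]).toMatchObject({
      id: "c1",
      name: "Write docs",
    });
  });
});
```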

Frontend was a different story. Claude quickly generated UI code, but often misread mockups (wrong colors, broken layouts, missing details), which meant extra debugging. 


Keeping the frontend and backend in sync was another recurring problem: enums, DTOs, and API endpoints didn’t always match. And while configs were generated fast, cloud networking, TLS, and secrets still required manual setup.
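
For illustration (our assumption about a typical remedy, not a description of the experiment’s codebase), one common way to stop that drift is a shared types module that both the React frontend and the Node.js backend import, so enums and DTOs cannot silently diverge:

```typescript
// shared/task.ts (hypothetical path): one source of truth for both layers.
export enum TaskPriority {
  Low = "low",
  Medium = "medium",
  High = "high",
}

export interface TaskDto {
  id: string;
  name: string;
  description?: string;
  dueDate?: string; // ISO 8601 string over the wire
  priority: TaskPriority;
  assigneeId?: string;
}
```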


In longer sessions, Claude tended to forget past instructions (like using inject() instead of dependency injection on the frontend).



When Slack integration went wrong (and why it mattered)


We knew from the start that we wanted Slack notifications, but we didn’t add this integration to Claude.md. When our engineer introduced it mid-development, Claude treated it as a brand-new feature and over-engineered the logic. For example, it created unnecessary Slack channels for each board. The engineer ended up spending about 1.5 hours fixing the output and re-prompting with clearer expectations.


Lesson learned: If you know about the integration upfront, document it early. If you add it later, give the AI the same level of detail you would for a core feature. Otherwise it’ll “guess” and over-engineer.
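
As a purely hypothetical illustration (not the file we actually used), a Claude.md entry for this case might look something like:

```markdown
## Slack integration
- Post notifications to the one existing workspace channel; do NOT create a channel per board.
- Notify on exactly three events: task created, task assigned, task moved.
- Keep messages short: task name, event, assignee.
- Reuse the existing notification service; no new queues, workers, or channels.
```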


After frontend glitches and the over-engineering story, we saw a clear pattern emerge:



3 signs AI will cost you more time than it saves


  • Vague or missing requirements → If the AI doesn’t have the full picture (like with Slack integration), it will “guess” and over-engineer.

  • Visual/UI tasks → Claude Code often misread mockups. Other AI tools may perform better here. 

  • Long sessions → The AI assistant loses track of earlier instructions, so code quality drifts or errors repeat.



3 things we did to fix Claude’s weak spots 


These three things helped most:

  • Context management strategy → We split conversations by domain (frontend, backend, project-level) to reduce confusion and token waste from context mixing.

  • Documentation-driven development → Created a Claude.md configuration file with conventions and patterns (e.g. project structure, CQRS, dependency injection) as a reference.  

  • Prompt engineering → Used precise prompts: error + context + expected fix.


Our Claude.md file for Task Manager

AI works best when the task is well-defined and self-contained. It falls short when tasks depend on context, interpretation, or architecture decisions.


The problem teams face with AI-assisted development, and how we avoid it  


From what we see in the market, many teams buy AI assistants or agents hoping for faster delivery. Often they discover the opposite: the tools eat budget, produce more bugs because there’s no validation, or don’t integrate well.


AI doesn’t magically speed things up unless you adapt your whole process around it.



Inside our AI-assisted SDLC methodology


This R&D project gave us data and validation for a direction we were already moving toward: improving and formalizing AI-assisted SDLC. We’ve since rebuilt our software development lifecycle around AI and documented a methodology that integrates AI tools at every stage, with guardrails for quality, security, and cost control. Tool costs range from $20 to $1,000 per developer per month, higher for self-hosted or local-only setups that require stricter security.



Real-world impact: efficiency that actually scales


In live client projects, the average efficiency gain per developer is about 30%, intentionally lower than the 59% we saw in R&D.


That’s because production work includes the layers that make software reliable at scale: code reviews, QA pipelines, stakeholder feedback loops, and security audits. These introduce coordination overhead that AI can’t fully automate (yet).


Even so, 30% faster delivery at enterprise quality levels means ~$60K saved on a $200K project and a six-month roadmap completed in four, without cutting corners.


To achieve this, we set the following practices and guardrails:


  • End-to-end AI integration: using specific AI tools across the development lifecycle. For example, Jira AI Assistant for user stories and acceptance criteria, Figma AI for wireframes, and CodeRabbit for automated analysis, alongside Claude Code or Cursor for coding.

  • Context-driven development: We created a 5-file system (including specs, style guides, CLAUDE.md conventions, user stories, and constraints) to feed AI assistants structured context up front. 

  • Human-in-the-loop quality control. Every AI output goes through a 5-layer review process:

    • Prompts embed coding standards and rules

    • Engineers always validate their own AI-generated code

    • Automated checks catch inconsistencies

    • Peer review with two reviewers on AI-heavy PRs

    • QA stress-tests the final feature

  • Security by design: We configure AI tools differently depending on project sensitivity: from Copilot Business with retention policies, to fully self-hosted setups, to local-only processing for regulated industries.

  • Cost-effectiveness: To control API costs, we limit token waste by splitting conversations by domains and reusing structured docs. For example, instead of pasting the full project setup into every prompt, we use a configuration markdown file (like Claude.md) with conventions and reference it.


This experiment also confirmed something we’ve always believed: AI doesn’t replace engineering experience, but amplifies it. To code effectively with AI, you need to know how to clearly frame requirements and criteria. That’s exactly what we’ve been fostering across our engineering teams, and will continue to build on. 



[Checklist]

Can your team benefit from AI? Here’s how to tell


After testing AI in our own work, we’ve seen what can make or break AI adoption. If your team is considering AI in software development, you can use this checklist to quickly evaluate if you’re AI-ready:


  • Standardized coding practices → documented conventions, style guides, and system rules.

  • Accessible project knowledge → specs, user stories, and architecture docs your AI tools can “read.”

  • Engineer maturity → developers who will review and own AI output.

  • Quality pipeline → automated checks and peer review steps that catch AI mistakes early.

  • Security alignment → understanding the level of data protection (enterprise tools, self-hosted, local-only) your projects need.

  • Token discipline → processes for managing API calls, reusing docs, and avoiding runaway costs.


In any case, if you need AI-powered developers to scale your product or team, without the headache of long hiring and onboarding, you know who to call.  
