Case Study — Enterprise Pharma · Regulated Industry

200+ AI Agents for Clinical Data Extraction:
From Pilot to HIPAA-Compliant Production

8 weeks. 200+ agents. 94% extraction accuracy. Zero data leakage incidents. How codetoday.io automated 20 manual clinical workflows for a mid-size pharma company — with full HIPAA compliance baked in from day one.

200+ AI Agents Deployed
94% Extraction Accuracy
$340K Annual Labor Saved
0 Data Leakage Incidents

The Challenge

Our client — a mid-size pharmaceutical company running drug discovery and clinical trials — had 20 separate manual workflows for extracting structured data from research documents: trial reports, lab results, regulatory filings, and clinical notes. Twelve full-time employees were spending 60% of their working hours on copy-paste extraction tasks.

The pain was compounding: document volume was growing 40% year-over-year as the pipeline expanded, while headcount was frozen. The team was falling further behind each quarter. Manual errors in extraction were also creating downstream issues in regulatory submissions, requiring expensive re-review cycles.

The constraint: every solution had to be HIPAA-compliant, with a full audit trail, no data leaving the AWS environment, and documented evidence for SOC2 Type II review. Their previous attempt at automation, using a SaaS NLP vendor, had stalled in compliance review for 11 months.

Our Approach

01. Architecture Scoping
A one-week deep dive into the 20 workflow types, categorized by complexity: 6 simple extraction, 9 structured reasoning, and 5 multi-document synthesis. From this we designed the agent taxonomy and data flow.
02. Bedrock KB + Agent Design
Built the Bedrock Knowledge Base from the document corpus. Designed an Orchestrator plus 6 sub-agent roles. Implemented Step Functions as the backbone for visibility, retries, and auditability.
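
As a rough illustration of the retrieval step, the sketch below queries a Bedrock Knowledge Base through the `retrieve` operation of the boto3 `bedrock-agent-runtime` client. The helper name `retrieve_passages`, the knowledge base ID, and the query text are hypothetical and not taken from the client's system. The client is passed in as a parameter so the function can be exercised offline with a stub.

```python
from typing import Any


def retrieve_passages(client: Any, kb_id: str, query: str, top_k: int = 5) -> list[dict]:
    """Return up to top_k passages for a query from a Bedrock Knowledge Base.

    `client` is expected to be boto3.client("bedrock-agent-runtime") or a
    stand-in that exposes the same `retrieve` method.
    """
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    # Keep only what a downstream agent needs: passage text and relevance score.
    return [
        {"text": r["content"]["text"], "score": r.get("score")}
        for r in response.get("retrievalResults", [])
    ]


class StubClient:
    """Offline stand-in (assumption) so the sketch runs without AWS access."""

    def retrieve(self, **kwargs):
        return {
            "retrievalResults": [
                {"content": {"text": "Sample trial report passage."}, "score": 0.91}
            ]
        }


if __name__ == "__main__":
    print(retrieve_passages(StubClient(), "KB_ID_PLACEHOLDER", "primary endpoint"))
```

In a real deployment the stub would be replaced by a boto3 client created inside the VPC, so the call travels over the Bedrock VPC endpoint.
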
03. Guardrails + Audit Trail
Configured Bedrock Guardrails: PII redaction, denied topics, content filters, and custom word policies. Every agent invocation is logged to CloudTrail with a hash of the full input and output as compliance evidence.
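
To make the input/output hashing concrete, here is a minimal sketch of an audit record that stores SHA-256 digests instead of raw text. The field names and the `build_audit_record` helper are assumptions for illustration. Where the record is written (CloudTrail, DynamoDB, or elsewhere) is outside the scope of the sketch.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(payload: str) -> str:
    """Hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_audit_record(agent: str, execution_id: str, prompt: str, output: str) -> dict:
    """Build an audit entry holding hashes rather than the raw, possibly sensitive, text."""
    return {
        "agent": agent,  # hypothetical field names throughout
        "execution_id": execution_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_sha256": sha256_hex(prompt),
        "output_sha256": sha256_hex(output),
    }


if __name__ == "__main__":
    record = build_audit_record(
        agent="clinical-nlp",
        execution_id="exec-0001",
        prompt="Extract dosage from the attached note.",
        output='{"dosage": "10 mg"}',
    )
    print(json.dumps(record, indent=2))
```

Storing digests lets an auditor confirm that a given input produced a given output without the audit log itself becoming a second copy of regulated data.
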
04. Production Rollout
Phased rollout: 3 workflows in week 5, all 20 by week 8. Automated output ran in shadow mode against the manual process for 2 weeks before cutover. Low-confidence outputs (score below 85%) go to a human-in-the-loop review queue.
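
The review-queue rule can be sketched in a few lines. The 85% threshold comes from the rollout description above. The `Extraction` type and the queue names are hypothetical.

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.85  # outputs scoring below this go to human review


@dataclass
class Extraction:
    document_id: str
    confidence: float  # 0.0 to 1.0
    fields: dict = field(default_factory=dict)


def route(extraction: Extraction) -> str:
    """Return the destination for an extraction result (queue names are assumptions)."""
    if extraction.confidence < REVIEW_THRESHOLD:
        return "human_review"
    return "auto_accept"


if __name__ == "__main__":
    for score in (0.97, 0.85, 0.84):
        print(score, route(Extraction("doc-1", score)))
```

A score of exactly 0.85 is accepted automatically here, matching the "<85%" wording. A stricter deployment could route boundary cases to review as well.
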

System Architecture

The full architecture runs entirely within the client's AWS VPC. No data egresses to external services. All model inference runs through Amazon Bedrock via VPC endpoints, so no traffic crosses the public internet.
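
One common way to enforce a "VPC-only" bucket is a bucket policy that denies any request not arriving through a specific VPC endpoint, using the `aws:SourceVpce` condition key. The sketch below only builds the policy document. The bucket name and endpoint ID are placeholders, and the source does not say this exact policy was used.

```python
import json


def vpc_only_bucket_policy(bucket: str, vpc_endpoint_id: str) -> dict:
    """Deny all S3 actions on the bucket unless the request comes via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder names, not the client's real resources.
    policy = vpc_only_bucket_policy("example-clinical-docs", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

A deny of this kind also blocks console and administrator access from outside the endpoint, so it is usually tested on a non-production bucket first.
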


  ┌─────────────────────────────────────────────────────────────────┐
  │  DOCUMENT INGESTION                                             │
  │  S3 Bucket (encrypted, VPC-only) ──► Bedrock Knowledge Base    │
  └────────────────────────┬────────────────────────────────────────┘
                           │ Retrieval
  ┌────────────────────────▼────────────────────────────────────────┐
  │  ORCHESTRATOR AGENT (Claude 3.5 Sonnet via Bedrock)             │
  │  - Classifies document type                                     │
  │  - Routes to appropriate sub-agent                              │
  │  - Tracks extraction confidence                                 │
  └──┬──────────┬──────────┬──────────┬──────────┬─────────────────┘
     │          │          │          │          │
  ┌──▼──┐  ┌───▼───┐  ┌───▼───┐  ┌──▼───┐  ┌───▼──────────────┐
  │Doc  │  │Regex  │  │Clin-  │  │Valid-│  │Audit             │
  │Class│  │Extract│  │NLP    │  │ation │  │Logger            │
  │-ify │  │-or    │  │Agent  │  │Agent │  │(CloudTrail+DDB)  │
  └──┬──┘  └───┬───┘  └───┬───┘  └──┬───┘  └───────────────────┘
     └─────────┴──────────┴─────────┘
                           │
  ┌────────────────────────▼────────────────────────────────────────┐
  │  DynamoDB — Agent State + Extraction Results                    │
  └────────────────────────┬────────────────────────────────────────┘
                           │
  ┌────────────────────────▼────────────────────────────────────────┐
  │  REPORT GENERATOR AGENT                                         │
  │  - Assembles structured output                                  │
  │  - Confidence scoring                                           │
  │  - Routes <85% confidence to human review queue                 │
  └────────────────────────┬────────────────────────────────────────┘
                           │
                   ┌───────▼──────────┐
                   │ S3 Output Bucket  │
                   │ (structured JSON) │
                   └──────────────────┘

  Step Functions Express Workflows orchestrate the entire pipeline.
  All agent-to-agent calls are synchronous invocations via Bedrock.
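
The orchestrator's classify-then-route behavior in the diagram can be approximated with a dispatch table. This is a simplified sketch: the mapping from document type to sub-agent, the function names, and the confidence values are all hypothetical, and the stand-in handlers do not call Bedrock.

```python
import re
from typing import Callable


def regex_extractor(text: str) -> dict:
    """Stand-in for the regex sub-agent: pull a 'value unit' pair from lab text."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*(mg/dL|mmol/L)", text)
    if match is None:
        return {"fields": {}, "confidence": 0.0}
    fields = {"value": float(match.group(1)), "unit": match.group(2)}
    return {"fields": fields, "confidence": 0.95}


def clinical_nlp_agent(text: str) -> dict:
    """Stand-in for a model-backed sub-agent; a real one would invoke a model."""
    return {"fields": {"summary": text[:40]}, "confidence": 0.80}


# Hypothetical routing table: document type -> sub-agent.
ROUTES: dict[str, Callable[[str], dict]] = {
    "lab_result": regex_extractor,
    "clinical_note": clinical_nlp_agent,
}


def orchestrate(doc_type: str, text: str) -> dict:
    """Route a classified document to its sub-agent and carry the confidence forward."""
    handler = ROUTES.get(doc_type)
    if handler is None:
        raise ValueError(f"no sub-agent registered for {doc_type!r}")
    return {"doc_type": doc_type, **handler(text)}


if __name__ == "__main__":
    print(orchestrate("lab_result", "Glucose 98 mg/dL"))
    print(orchestrate("clinical_note", "Patient reports mild headache."))
```

Keeping the routing table as plain data makes it easy to log which sub-agent handled each document, which matters for the audit trail.
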
      
Architecture Decision

We chose Step Functions Express Workflows as the backbone rather than direct agent-to-agent chaining. This gave us execution history viewable in the AWS console (for Express Workflows this depends on CloudWatch Logs being enabled, since Step Functions does not retain Express execution history itself), automatic retries on transient Bedrock API errors, and per-execution cost visibility. The 50 ms overhead per step is easily worth the operational benefit in a regulated environment.
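
The retry behavior described here maps onto the `Retry` field of a Task state in Amazon States Language. The sketch below builds a one-state definition as a Python dict. It assumes each agent call is wrapped in a Lambda function, which the source does not state, and the state name, function ARN, and retry numbers are placeholders.

```python
import json


def extraction_state_machine(function_arn: str) -> dict:
    """Build a minimal Amazon States Language definition with retries on task failure."""
    return {
        "Comment": "Sketch: one agent invocation with exponential-backoff retries",
        "StartAt": "InvokeAgent",
        "States": {
            "InvokeAgent": {
                "Type": "Task",
                # Step Functions' Lambda service integration.
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": function_arn, "Payload.$": "$"},
                "Retry": [
                    {
                        # States.TaskFailed covers task errors other than timeouts.
                        "ErrorEquals": ["States.TaskFailed", "States.Timeout"],
                        "IntervalSeconds": 2,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    arn = "arn:aws:lambda:us-east-1:123456789012:function:invoke-agent"  # placeholder
    print(json.dumps(extraction_state_machine(arn), indent=2))
```

With these values a failed step waits 2 seconds before the first retry, 4 seconds before the second, and 8 seconds before the third, then the execution fails.
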

What We Built

Compliance & Governance

HIPAA Controls

SOC2 Evidence

Lesson Learned

Configure Bedrock Guardrails PII redaction from day one — retrofitting it after the Knowledge Base is populated requires a full re-ingestion. We learned this the hard way during a pre-production audit where a test document containing synthetic patient IDs was indexed without redaction. Budget an extra sprint for compliance hardening.
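
For reference, a PII policy of this kind might be expressed as keyword arguments for the boto3 `bedrock` client's `create_guardrail` call. This is a sketch under assumptions: the guardrail name, the blocked messages, the chosen entity types, and especially the `PT-` patient ID pattern are illustrative and not the client's actual configuration.

```python
import re


def guardrail_request(name: str) -> dict:
    """Keyword arguments shaped for bedrock.create_guardrail (values are illustrative)."""
    return {
        "name": name,
        "blockedInputMessaging": "This request was blocked by policy.",
        "blockedOutputsMessaging": "This response was blocked by policy.",
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [
                {"type": "NAME", "action": "ANONYMIZE"},
                {"type": "EMAIL", "action": "ANONYMIZE"},
                {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
            ],
            "regexesConfig": [
                {
                    "name": "patient_id",
                    "description": "Hypothetical internal patient ID format",
                    "pattern": r"PT-\d{6}",
                    "action": "ANONYMIZE",
                }
            ],
        },
    }


if __name__ == "__main__":
    request = guardrail_request("clinical-extraction-guardrail")
    pattern = request["sensitiveInformationPolicyConfig"]["regexesConfig"][0]["pattern"]
    # Check the custom pattern against a synthetic ID before relying on it.
    print(bool(re.search(pattern, "Subject PT-004217 enrolled in arm B")))
    # With AWS access: boto3.client("bedrock").create_guardrail(**request)
```

Testing the custom pattern against synthetic documents before any ingestion is a cheap way to catch the kind of gap described above.
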

Results

Metric | Before | After
Extraction accuracy | 87% (manual, spot-checked) | 94% (automated, every document)
Documents processed/day | ~200 (12 FTE × 60%) | 3,000+ (fully automated)
Average extraction time | 8–25 minutes per document | 4.7 seconds per document
FTE hours on extraction | ~340 hrs/week | ~28 hrs/week (review queue only)
Annual labor cost (extraction) | ~$420K | ~$80K (review + oversight)
Regulatory re-review cycles | 14 per quarter | 2 per quarter
HIPAA audit readiness | Manual evidence collection, 3 weeks | Automated, continuous, <1 day
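
The headline deltas follow directly from the Before and After figures. A quick check, using only the numbers reported in the results (variable names are ours):

```python
before = {"labor_cost": 420_000, "hours_per_week": 340, "docs_per_day": 200, "re_reviews": 14}
after = {"labor_cost": 80_000, "hours_per_week": 28, "docs_per_day": 3_000, "re_reviews": 2}

# Annual labor saved: difference between the two cost figures.
annual_savings = before["labor_cost"] - after["labor_cost"]

# Fractional reduction in weekly extraction hours and quarterly re-review cycles.
hours_reduction = 1 - after["hours_per_week"] / before["hours_per_week"]
re_review_reduction = 1 - after["re_reviews"] / before["re_reviews"]

# Throughput multiple (a lower bound, since the After figure is "3,000+").
throughput_gain = after["docs_per_day"] / before["docs_per_day"]

print(f"Annual labor saved:   ${annual_savings:,}")        # $340,000
print(f"Hours reduction:      {hours_reduction:.1%}")      # 91.8%
print(f"Re-review reduction:  {re_review_reduction:.1%}")  # 85.7%
print(f"Throughput gain:      {throughput_gain:.0f}x")     # 15x
```

The $340,000 result matches the "$340K Annual Labor Saved" headline figure.
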

Client Perspective

"We'd been blocked on AI adoption by compliance for nearly a year. Every vendor we evaluated either couldn't meet HIPAA requirements or couldn't demonstrate the audit trail our legal team needed. codetoday.io came in, understood the regulatory constraints immediately, and built the guardrails architecture first, before writing a single line of agent code. The result is a production system our compliance team actually trusts. The guardrails architecture alone was worth the entire engagement fee."

— CTO, Mid-Size Pharmaceutical Company (name withheld per NDA)
