200+ AI Agents for Clinical Data Extraction

The Challenge

Our client — a mid-size pharmaceutical company running drug discovery and clinical trials — had 20 separate manual workflows for extracting structured data from research documents: trial reports, lab results, regulatory filings, and clinical notes. Twelve full-time employees were spending 60% of their working hours on copy-paste extraction tasks.

The pain was compounding: document volume was growing 40% year-over-year as the pipeline expanded, while headcount was frozen. The team was falling further behind each quarter. Manual errors in extraction were also creating downstream issues in regulatory submissions, requiring expensive re-review cycles.

The constraints: every solution had to be HIPAA-compliant, with a full audit trail, no data leaving the AWS environment, and documented evidence for SOC 2 Type II review. Their previous automation attempt, using a SaaS NLP vendor, had stalled in compliance review for 11 months.

Our Approach

Architecture Scoping

1-week deep dive into the 20 workflow types. Categorized by complexity: 6 simple extraction, 9 structured reasoning, 5 multi-doc synthesis. Designed agent taxonomy and data flow.

Bedrock KB + Agent Design

Built the Bedrock Knowledge Base from the document corpus. Designed Orchestrator + 6 sub-agent roles. Implemented Step Functions as the backbone for visibility, retries, and auditability.

Guardrails + Audit Trail

Configured Bedrock Guardrails: PII redaction, denied topics, content filters, custom word policies. Logged every agent invocation to CloudTrail with hashes of its input and output as compliance evidence.
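
As a rough illustration of the hashing idea, the sketch below builds an audit entry that stores SHA-256 digests of an agent's input and output instead of the raw text. The helper names, field names, and sample values are assumptions made for this example, not the production schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(payload: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as hex."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_audit_record(agent_id: str, document_id: str,
                       agent_input: str, agent_output: str,
                       confidence: float, triggered_by: str) -> dict:
    """Build an audit entry that stores hashes, never the raw text."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "document_id": document_id,
        "input_sha256": sha256_hex(agent_input),
        "output_sha256": sha256_hex(agent_output),
        "confidence": confidence,
        "triggered_by": triggered_by,
    }


record = build_audit_record(
    agent_id="clinical-nlp",            # hypothetical agent name
    document_id="doc-0001",             # hypothetical document ID
    agent_input="example input text",
    agent_output='{"dose_mg": 50}',
    confidence=0.91,
    triggered_by="analyst@example.com",
)
print(json.dumps(record, indent=2))
```

Because only digests are stored, an auditor can confirm that a given input produced a given output by re-hashing the source material, without the log itself holding any patient data.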

Production Rollout

Phased rollout: 3 workflows in week 5, full 20 by week 8. Shadow mode comparison vs manual for 2 weeks before cutover. Human-in-the-loop review queue for low-confidence outputs (<85% score).
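
The routing rule and the shadow-mode comparison can be sketched in a few lines. This is an illustrative example only: the `Extraction` type, the exact-match agreement metric, and the sample fields are assumptions, and the real comparison may have been more lenient than strict equality.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.85  # outputs scoring below this go to human review


@dataclass
class Extraction:
    document_id: str
    fields: dict
    confidence: float  # 0.0 to 1.0


def route(extraction: Extraction) -> str:
    """Return the destination for one extraction result."""
    if extraction.confidence < REVIEW_THRESHOLD:
        return "human_review"
    return "auto_accept"


def shadow_agreement(automated: dict, manual: dict) -> float:
    """Fraction of manually extracted fields the agent reproduced exactly."""
    if not manual:
        return 1.0
    matches = sum(1 for key, value in manual.items() if automated.get(key) == value)
    return matches / len(manual)


auto = Extraction("doc-0001", {"dose_mg": 50, "arm": "placebo"}, confidence=0.82)
manual = {"dose_mg": 50, "arm": "treatment"}

print(route(auto))                            # human_review
print(shadow_agreement(auto.fields, manual))  # 0.5
```

Tracking the agreement rate per workflow during the shadow period gives a concrete cutover criterion instead of a judgment call.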

System Architecture

The full architecture runs entirely within the client's AWS VPC. No data egresses to external services. All model inference runs through Amazon Bedrock with VPC endpoints — no public internet traffic.


  ┌─────────────────────────────────────────────────────────────────┐
  │  DOCUMENT INGESTION                                             │
  │  S3 Bucket (encrypted, VPC-only) ──► Bedrock Knowledge Base    │
  └────────────────────────┬────────────────────────────────────────┘
                           │ Retrieval
  ┌────────────────────────▼────────────────────────────────────────┐
  │  ORCHESTRATOR AGENT (Claude 3.5 Sonnet via Bedrock)             │
  │  - Classifies document type                                     │
  │  - Routes to appropriate sub-agent                              │
  │  - Tracks extraction confidence                                 │
  └──┬──────────┬──────────┬──────────┬──────────┬─────────────────┘
     │          │          │          │          │
  ┌──▼──┐  ┌───▼───┐  ┌───▼───┐  ┌──▼───┐  ┌───▼──────────────┐
  │Doc  │  │Regex  │  │Clin-  │  │Valid-│  │Audit             │
  │Class│  │Extract│  │NLP    │  │ation │  │Logger            │
  │-ify │  │-or    │  │Agent  │  │Agent │  │(CloudTrail+DDB)  │
  └──┬──┘  └───┬───┘  └───┬───┘  └──┬───┘  └───────────────────┘
     └─────────┴──────────┴─────────┘
                           │
  ┌────────────────────────▼────────────────────────────────────────┐
  │  DynamoDB — Agent State + Extraction Results                    │
  └────────────────────────┬────────────────────────────────────────┘
                           │
  ┌────────────────────────▼────────────────────────────────────────┐
  │  REPORT GENERATOR AGENT                                         │
  │  - Assembles structured output                                  │
  │  - Confidence scoring                                           │
  │  - Routes <85% confidence to human review queue               │
  └────────────────────────┬────────────────────────────────────────┘
                           │
                   ┌───────▼──────────┐
                   │ S3 Output Bucket  │
                   │ (structured JSON) │
                   └──────────────────┘

  Step Functions Express Workflows orchestrate the entire pipeline.
  Each agent is invoked synchronously through Bedrock from a workflow step.

Architecture Decision

We chose Step Functions Express Workflows as the backbone rather than direct agent-to-agent chaining. This gave us full execution history in the AWS console, automatic retries on transient Bedrock API errors, and per-execution cost visibility. The roughly 50 ms of overhead per step is a small price for that operational benefit in a regulated environment.
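
To make the retry behavior concrete, here is a minimal sketch of a single Task state with a retrier, written as a Python dict that serializes to Amazon States Language. The Lambda ARN, the state name, and the `BedrockThrottled` error name are hypothetical; the sketch assumes each agent runs in a Lambda function that raises an exception with that name when Bedrock throttles.

```python
import json

# Hypothetical Lambda ARN; substitute your own.
AGENT_LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:clinical-nlp-agent"

state_machine = {
    "StartAt": "InvokeClinicalNlpAgent",
    "States": {
        "InvokeClinicalNlpAgent": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
                "FunctionName": AGENT_LAMBDA_ARN,
                "Payload.$": "$",
            },
            "Retry": [
                {
                    # Assumes the Lambda raises an exception class with this name
                    "ErrorEquals": ["BedrockThrottled"],
                    "IntervalSeconds": 2,
                    "BackoffRate": 2.0,
                    "MaxAttempts": 3,
                }
            ],
            "End": True,
        }
    },
}


def retry_delays(retrier: dict) -> list:
    """Seconds waited before each retry attempt under this retrier."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


retrier = state_machine["States"]["InvokeClinicalNlpAgent"]["Retry"][0]
print(retry_delays(retrier))  # [2.0, 4.0, 8.0]
print(json.dumps(state_machine, indent=2))
```

Declaring the backoff in the workflow definition keeps retry logic out of agent code, so every retry shows up as its own event in the execution record.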

What We Built

Bedrock Knowledge Base — 180K clinical documents ingested, chunked, and embedded using Amazon Titan Embeddings v2. OpenSearch Serverless as the vector store. Hybrid retrieval (semantic + keyword).
6-agent taxonomy — Document Classifier, Regex Extractor, Clinical NLP Agent, Validation Agent, Audit Logger, Report Generator. Each agent has a distinct system prompt, tool set, and temperature configuration.
Step Functions orchestration — Express Workflows with parallel branches for sub-agents, Wait states for human review, automatic retry with exponential backoff on Bedrock throttling.
Bedrock Guardrails — PII redaction (SSN, DOB, patient IDs), denied topics policy, content filters (medium sensitivity), custom word policies for regulatory terminology.
CloudTrail audit log — Every agent invocation captured: timestamp, agent ID, input hash (SHA-256), output hash, confidence score, user who triggered, document ID. Immutable S3 log bucket with Object Lock.
DynamoDB state store — Per-document extraction state, agent output storage, human review queue, confidence scoring history.
Human-in-the-loop queue — SQS queue + Lambda notifier for outputs with confidence <85%. Reviewers see agent output + source document snippet side-by-side in a minimal internal UI.
CloudWatch dashboards — Per-agent invocation count, latency, confidence score distribution, human review queue depth, daily cost per agent.
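
As one example from the list above, a hybrid (semantic plus keyword) query against a Bedrock Knowledge Base might look like the sketch below. It uses the boto3 `bedrock-agent-runtime` client's `retrieve` call; the knowledge base ID, the query text, and the helper names are placeholders invented for illustration.

```python
def build_retrieve_request(knowledge_base_id: str, query: str, top_k: int = 5) -> dict:
    """Keyword arguments for the bedrock-agent-runtime `retrieve` call."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": top_k,
                "overrideSearchType": "HYBRID",  # semantic + keyword
            }
        },
    }


def retrieve_chunks(knowledge_base_id: str, query: str) -> list:
    """Run the query and return the text of each retrieved chunk."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(**build_retrieve_request(knowledge_base_id, query))
    return [result["content"]["text"] for result in response["retrievalResults"]]


# Placeholder knowledge base ID and query
request = build_retrieve_request("KB1234567890", "primary endpoint for the phase II trial")
print(request["retrievalConfiguration"])
```

Separating the request builder from the network call makes the retrieval settings easy to unit test without touching AWS.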

Compliance & Governance

HIPAA Controls

All data remains within the client's AWS account and VPC — zero egress to third-party services
Bedrock VPC endpoints — model inference never traverses public internet
PII redaction via Bedrock Guardrails applied before any logging
S3 Object Lock on audit log bucket — WORM compliance, immutable for 7 years
AWS KMS customer-managed keys on all storage (S3, DynamoDB, CloudWatch Logs)
IAM least-privilege: each agent Lambda role has only the permissions it needs; no wildcard resource ARNs
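
To illustrate the least-privilege point, the sketch below shows what a narrowly scoped policy for one agent role could look like, plus a small check that flags any statement using a wildcard resource. The account ID, region, table name, and exact model version are assumptions for the example, not the client's actual configuration.

```python
import json

# Hypothetical account ID, region, table name, and model version.
agent_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeOneModel",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-5-sonnet-20240620-v1:0",
        },
        {
            "Sid": "WriteExtractionState",
            "Effect": "Allow",
            "Action": ["dynamodb:PutItem", "dynamodb:UpdateItem"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/extraction-state",
        },
    ],
}


def wildcard_resources(policy: dict) -> list:
    """Return the Sid of every statement whose Resource contains a wildcard."""
    flagged = []
    for statement in policy["Statement"]:
        resources = statement["Resource"]
        if isinstance(resources, str):
            resources = [resources]
        if any("*" in resource for resource in resources):
            flagged.append(statement["Sid"])
    return flagged


print(wildcard_resources(agent_policy))  # []
print(json.dumps(agent_policy, indent=2))
```

A check like this can run in CI so that a wildcard never reaches a deployed role unnoticed.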

SOC2 Evidence

CloudTrail logs for all API calls — forwarded to the client's SIEM
AWS Config rules for continuous compliance monitoring (encryption at rest, VPC isolation, public access blocks)
GuardDuty enabled on the account for anomalous API activity detection
Quarterly access reviews automated via IAM Access Analyzer

Lesson Learned

Configure Bedrock Guardrails PII redaction from day one — retrofitting it after the Knowledge Base is populated requires a full re-ingestion. We learned this the hard way during a pre-production audit where a test document containing synthetic patient IDs was indexed without redaction. Budget an extra sprint for compliance hardening.
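
One inexpensive safeguard against this failure mode is a pre-ingestion gate that refuses to index any document containing identifier-like strings. The sketch below is a plain regex check, not Bedrock Guardrails itself, and the patterns (including the `PT-` patient ID format) are hypothetical.

```python
import re

# Illustrative patterns only; real patient ID formats are client-specific.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "patient_id": re.compile(r"\bPT-\d{6}\b"),  # hypothetical ID format
}


def find_pii(text: str) -> dict:
    """Return the matches for each pattern that appears in the text."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


def safe_to_ingest(text: str) -> bool:
    """A document passes the gate only when no pattern matches."""
    return not find_pii(text)


sample = "Subject PT-004217 received 50 mg on day 1."
print(find_pii(sample))        # {'patient_id': ['PT-004217']}
print(safe_to_ingest(sample))  # False
```

Running a gate like this before the first ingestion job would have caught the synthetic patient IDs before they reached the index.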

Results

Metric	Before	After
Extraction accuracy	87% (manual, spot-checked)	94% (automated, every document)
Documents processed/day	~200 (12 FTE × 60%)	3,000+ (fully automated)
Average extraction time	8–25 minutes per document	4.7 seconds per document
FTE hours on extraction	~340 hrs/week	~28 hrs/week (review queue only)
Annual labor cost (extraction)	~$420K	~$80K (review + oversight)
Regulatory re-review cycles	14 per quarter	2 per quarter
HIPAA audit readiness	Manual evidence collection, 3 weeks	Automated, continuous, <1 day

Client Perspective

We'd been blocked on AI adoption by compliance for nearly a year. Every vendor we evaluated either couldn't meet HIPAA requirements or couldn't demonstrate the audit trail our legal team needed. codetoday.io came in, understood the regulatory constraints immediately, and built the guardrails architecture first — before writing a single line of agent code. The result is a production system our compliance team actually trusts. The guardrails architecture alone was worth the entire engagement fee.

— CTO, Mid-Size Pharmaceutical Company (name withheld per NDA)
