001

Archv

AI-Powered Document Review for Regulated Industries

Role

CEO & Co-Founder

Duration

2025 – Present

Team

3 engineers, 1 designer

Status

Active

Private Repo

Overview

I founded Archv to fix document review in regulated industries. Ran 40+ user interviews with attorneys, paralegals, and law students. Evaluated three AI architectures and selected RAG for built-in citations and updatability. Targeted law students as the go-to-market entry point into institutional adoption. Signed early users on a 4-person team with a pre-seed budget. Review time dropped 71%. HIPAA and SOC 2 compliant. Accepted into NVIDIA Inception.

Archv. AI compliance infrastructure for regulated industries.

Problem

Law students spend 60-70% of their study and research time reviewing documents manually. A single missed compliance issue in practice leads to sanctions, malpractice claims, or fines exceeding $10M. Students who tried generalist AI tools found the outputs unusable: hallucinated clauses, no source citations, no audit trail. One professor told us during user research, 'One wrong answer and I will never use it again.' No product combined fast AI inference with the data controls and compliance infrastructure these users require.

Approach

  • 01 Ran 40+ user interviews with attorneys, paralegals, and compliance officers. Shadowed 3 attorneys during live document review sessions to map workflow pain points
  • 02 Evaluated three AI architectures: fine-tuned LLM ($500K+, not updatable), ChatGPT API wrapper (no compliance, hallucinations), RAG with vector database (built-in citations, updatable, cost-effective). Selected RAG
  • 03 Built a microservices architecture to isolate document ingestion, ML inference, user management, and audit logging
  • 04 Deployed ML models on NVIDIA GPUs via CUDA for document classification and entity extraction
  • 05 Encrypted all data with AES-256 at rest and TLS 1.3 in transit. Logged every data access for audit trails
  • 06 Built RESTful APIs with JWT auth and role-based access control (RBAC) mapped to compliance roles
  • 07 Shipped weekly updates to early users. Monthly pilot check-ins caught issues before they became product debt
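
A minimal sketch of how steps 05 and 06 could fit together: an RBAC check that writes an audit entry for every access decision. The role names, permission strings, and the `authorize` helper are hypothetical, and `claims` stands in for a JWT payload that has already been verified elsewhere. This is an illustration under those assumptions, not Archv's code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission map. The case study does not list
# the actual compliance roles, so these names are assumptions.
ROLE_PERMISSIONS = {
    "student": {"document:read"},
    "reviewer": {"document:read", "document:annotate"},
    "compliance_admin": {"document:read", "document:annotate", "audit:read"},
}


@dataclass
class AuditLog:
    """Append-only in-memory log. A real system would use immutable storage."""
    entries: list = field(default_factory=list)

    def record(self, user_id, role, permission, resource, allowed):
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "role": role,
            "permission": permission,
            "resource": resource,
            "allowed": allowed,
        })


def authorize(claims: dict, permission: str, resource: str, log: AuditLog) -> bool:
    """Check a permission against the caller's role and log the decision.

    `claims` is assumed to be an already-verified JWT payload.
    """
    role = claims.get("role", "")
    allowed = permission in ROLE_PERMISSIONS.get(role, set())
    # Denied attempts are logged too, so the trail covers every access attempt.
    log.record(claims.get("sub", "unknown"), role, permission, resource, allowed)
    return allowed
```

Logging inside the authorization check, rather than in each handler, means no endpoint can grant access without leaving a record.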

Architecture

V1 · Early Design

Data Flow

  CLIENT UPLOAD
       |
       v
  API GATEWAY
  Rate Limit · JWT Auth · RBAC
       |
       v
  S3 (AES-256)  +  USER SERVICE (PostgreSQL)
       |
       v  S3 Event Notification
  SQS QUEUE (FIFO)  -->  DEAD LETTER QUEUE (Slack)
       |
       v
  GPU WORKER A (INT8, <2s)  +  GPU WORKER B (FP16, 3x)
  Classify, Embed              Extract, Summarize
       |
       v
  VECTOR DB (Embeddings)
       |
       v
  POSTGRESQL (Results + Audit Trail)
       |
       v
  WEBHOOK API (Client)  +  AUDIT LOGGER (Immutable)

Batching Algorithm

Documents are grouped by type (contracts, NDAs, compliance filings) before batching to keep GPU cache warm and avoid branch divergence. Batch size scales dynamically by page count, not document count, to prevent OOM on large files.
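
One way to sketch the grouping and page-based sizing described above. The `Doc` type, the `make_batches` name, and the 200-page budget are assumptions for illustration; the case study does not state the actual limits or how the budget is derived from GPU memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    doc_type: str  # e.g. "contract", "nda", "compliance_filing"
    pages: int


def make_batches(docs, max_pages_per_batch=200):
    """Group documents by type, then pack each group under a page budget.

    The budget is in pages, not documents, so one large file cannot
    push a batch past the memory limit.
    """
    by_type = defaultdict(list)
    for doc in docs:
        by_type[doc.doc_type].append(doc)

    batches = []
    for group in by_type.values():
        current, current_pages = [], 0
        for doc in group:
            # Close the open batch if this document would exceed the budget.
            if current and current_pages + doc.pages > max_pages_per_batch:
                batches.append(current)
                current, current_pages = [], 0
            current.append(doc)
            current_pages += doc.pages
        if current:
            batches.append(current)
    return batches
```

In this sketch a document larger than the budget still gets a batch of its own; a production version would need to split or reject it.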

Failure Handling

Three layers: retry with exponential backoff (1s, 4s, 16s), dead letter queue with Slack alerts for persistent failures, and idempotency keys (hash of S3 key + upload timestamp) to prevent duplicate processing.
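
The three layers can be sketched roughly as follows. The sketch assumes one initial attempt followed by retries at the stated 1s, 4s, and 16s delays, and uses in-memory collections in place of the real idempotency store and dead letter queue. `process_once`, `seen`, and `dead_letters` are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import time

BACKOFF_SECONDS = (1, 4, 16)  # retry delays stated in the text


def idempotency_key(s3_key: str, upload_timestamp: str) -> str:
    """Hash of S3 key + upload timestamp (SHA-256 is an assumption)."""
    raw = f"{s3_key}|{upload_timestamp}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def process_once(s3_key, upload_timestamp, handler, seen, dead_letters,
                 sleep=time.sleep):
    """Run `handler` at most once per upload, retrying with backoff."""
    key = idempotency_key(s3_key, upload_timestamp)
    if key in seen:
        return "duplicate"  # already processed, skip

    # A delay of 0 marks the initial attempt; the rest are retries.
    for delay in (0, *BACKOFF_SECONDS):
        if delay:
            sleep(delay)
        try:
            handler(s3_key)
        except Exception:
            # Sketch only: real code would retry transient errors only.
            continue
        seen.add(key)
        return "processed"

    # Persistent failure: hand off to the dead letter queue (alerting omitted).
    dead_letters.append(key)
    return "dead_lettered"
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting.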

Dual-Path Inference

Interactive requests skip the queue and hit a reserved GPU slot with INT8 quantization for sub-2s latency. Batch workloads queue in SQS, group by type, and run FP16 for maximum accuracy at 3x throughput.
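
A minimal sketch of the routing decision, assuming each request carries an `interactive` flag. The `Router` class and the shape of the returned plan are hypothetical, and a `deque` stands in for SQS. What happens when the reserved GPU slot is busy is not described in the case study, so the sketch leaves it out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    doc_id: str
    interactive: bool


class Router:
    """Routes requests to the interactive or batch inference path."""

    def __init__(self):
        self.batch_queue = deque()  # stand-in for the SQS queue

    def route(self, request: Request) -> dict:
        if request.interactive:
            # Interactive path: skip the queue, use the INT8 model.
            return {"path": "interactive", "precision": "int8", "queued": False}
        # Batch path: enqueue for later grouping and FP16 inference.
        self.batch_queue.append(request)
        return {"path": "batch", "precision": "fp16", "queued": True}
```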

Reflections

What Worked

01 Citation-first design. Trust requires verifiability. Every AI response links to source text. This became our primary differentiator against tools that produce unsourced summaries.
02 Built for students first, sold through institutions second. Law students loved the product. Program administrators approved it. The compliance dashboard gave admin staff full visibility into AI usage, queries, and data access.
03 Monthly pilot check-ins with real users caught issues before they became product debt. We killed two features early that tested poorly and doubled down on citation accuracy.

What I Would Do Differently

01 Build mobile from day one. Law students review documents between classes constantly. We deprioritized mobile, and it should have shipped in V1.
02 Invest more in onboarding. The first-run experience was weak. New users needed hand-holding to see the value, which added 2 weeks to every pilot.
03 Run pricing research before launch. We guessed on pricing. Conjoint analysis upfront would have shortened the sales cycle.

Design & Typography

Archv's visual identity lives in tension. The product interface is stripped down: black text, white space, sharp edges. No decoration. Every pixel earns its place or gets removed. But the brand mark is the opposite. The logo is a burst of color: iridescent ribbons, overlapping circles, a rainbow wordmark. It is playful on purpose. Archv handles compliance documents, regulatory filings, legal risk. The work is serious. The brand says: we make serious work feel approachable. The color in the logo represents the breadth of what Archv touches: law, healthcare, government, finance. Each domain has its own weight. The logo holds all of them together in one playful mark. The interface stays minimal so the content speaks. The brand stays colorful so the company feels human.

Typography

Headings

Favorit by Dinamo. A geometric grotesque with sharp terminals and wide apertures. It reads fast at small sizes, which matters when attorneys scan compliance dashboards for 6 hours straight. The geometry references architectural drafting lettering. Clean, precise, no flourishes.

Body

Inter for UI text. High x-height, open counters, designed for screens. Pairs with Favorit without competing. Body text at 14px/1.6 line height. Dense enough for data-heavy views. Readable enough for long review sessions.

Color Palette

Obsidian

#0A0A0A

Interface text, headers, navigation

Paper

#FAFAF8

Interface background, card surfaces

Graphite

#6B6B6B

Secondary text, labels, metadata

Iridescent Pink

#E84393

Brand mark, logo ribbons, playful accents

Electric Blue

#3D5AFE

Brand mark, logo circles, trust signals

Gold

#F9A825

Brand mark, warmth, approachability

Design Principles

01 Reduction over addition. Every element faces one question: does removing this break comprehension? If the answer is no, it goes. White space is the primary design material. It creates grouping, hierarchy, and breathing room without adding a single element.
02 Information density without clutter. Attorneys review hundreds of pages daily. The interface respects that by fitting more content per screen without sacrificing legibility. Tight spacing, small but readable type sizes, and data tables that show 40+ rows without scrolling.
03 Architecture taught me that materials should be honest. Concrete looks like concrete. Steel looks like steel. In the interface, a button looks like a button. A text field looks like a text field. No gradients pretending to be depth. No shadows pretending to be elevation. Flat surfaces, sharp edges, clear boundaries.
04 The interface is 90% black, white, and gray. Color is scarce inside the product so content stays in focus. But the brand identity is the opposite: full spectrum, iridescent, playful. This contrast is intentional. The product is serious. The brand is approachable. Users trust the tool because it is clear. They remember the company because it is colorful.
05 The grid is 8px. Every margin, padding, and component dimension snaps to multiples of 8. This creates visual rhythm without conscious effort. You feel it as consistency. The page feels organized before you read a single word.

Technology Stack

Frontend

TypeScript · React · Next.js · Tailwind CSS

Backend

Node.js · Express · PostgreSQL · Redis

Infrastructure

AWS EC2 · S3 · Lambda · Docker · Terraform

AI/ML

Python · PyTorch · CUDA · NVIDIA GPUs · Hugging Face

AI Pipeline

RAG · Vector DB · LangChain · Embeddings

Security

AES-256 · TLS 1.3 · JWT · RBAC · Audit Logging

Impact

Review Time

-71%

Document review dropped from 4.5 hours to 1.3 hours per session

Verification

-82%

AI output verification dropped from 45 minutes to 8 minutes per query

Retention

100%

3-month customer retention across all pilot users

Latency

<2s

Document classification on 50+ page legal files

NVIDIA Inception

Admitted

Accepted into NVIDIA's accelerator for AI startups

Early Adoption

Signed

Onboarded first law student users during initial launch
