Gübelin Slip Processing: Cloud Provider Cost Comparison

Project: Process 600,000 historical Warenstammkarte (inventory cards)
Date: December 2025
Author: Generated for cost analysis

Executive Summary

Provider	Total Estimated Cost	Processing Time (with batching)
GCP (Recommended)	~$1,060 - $1,090	6-12 days
AWS	~$1,060 - $1,085	6-12 days

⚠️ CRITICAL: Gemini API rate limits (RPD, Requests Per Day) are the primary constraint. Without batching, processing would take 60 days (Tier 2) to 600 days (Tier 1). Batching multiple slips per LLM call is mandatory.

Recommendation: GCP offers native integration with both Document AI OCR and Gemini 2.0 Flash, which reduces integration complexity. Costs are nearly identical between providers.

Architecture Overview

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Source PDFs    │────▶│  Split Images   │────▶│   OCR + LLM     │
│  (~240 GB)      │     │  (~240 GB JPG)  │     │   Processing    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                        │
                              ┌─────────────────────────┘
                              ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │  JSON Output    │────▶│  SQLite DB      │
                        │  (~2 GB)        │     │  + Logs         │
                        └─────────────────┘     └─────────────────┘

Processing Pipeline:

Upload source PDFs to cloud storage
Spin up compute instance
Split PDFs into individual slip images (PyMuPDF)
OCR all slips (can run in parallel - not rate limited)
Batch LLM extraction (10 slips per call to stay within rate limits)
Save results to SQLite DB
Download results, terminate instance

⚠️ CRITICAL: API Rate Limits

Rate limits are the primary constraint on processing time, not compute or API costs.

Document AI OCR Rate Limits

Limit	Default Value	Time for 600K Slips	Bottleneck?
Requests/minute	120	83 hours (~3.5 days)	❌ No
Pages/minute	120	83 hours (~3.5 days)	❌ No
Batch concurrent jobs	5	N/A	❌ No

Source: Document AI Quotas

Gemini API Rate Limits (THE BOTTLENECK)

Tier	RPM	TPM	RPD	Time for 600K Slips (no batching)	Time with 10x Batching
Free	5	32K	25	❌ 65 years	❌ 6.5 years
Tier 1 (billing enabled)	300	1M	1,000	❌ 600 days	⚠️ 60 days
Tier 2 ($250 spend + 30 days)	1,000	2M	10,000	⚠️ 60 days	✅ 6 days

Legend:

RPM = Requests Per Minute
TPM = Tokens Per Minute
RPD = Requests Per Day (the limiting factor!)

Source: Gemini API Rate Limits

Rate Limit Impact Analysis

Scenario	LLM Calls	At Tier 1 (1K RPD)	At Tier 2 (10K RPD)
No batching (1 slip/call)	600,000	600 days	60 days
5 slips/call	120,000	120 days	12 days
10 slips/call	60,000	60 days	6 days
20 slips/call	30,000	30 days	3 days

Recommendation: Use 10 slips per LLM call with Tier 2 for optimal balance of speed and reliability.
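The figures in the table above follow from simple division, sketched below. This is an illustrative calculation only: the function name `days_to_process` is hypothetical, and it assumes the daily request quota (RPD) is the only limit and is fully used every day.

```python
import math


def days_to_process(total_slips: int, slips_per_call: int, requests_per_day: int) -> int:
    """Whole days needed when the daily request quota (RPD) is the only limit."""
    llm_calls = math.ceil(total_slips / slips_per_call)
    return math.ceil(llm_calls / requests_per_day)


TOTAL_SLIPS = 600_000
TIER_1_RPD = 1_000
TIER_2_RPD = 10_000

for batch_size in (1, 5, 10, 20):
    tier1 = days_to_process(TOTAL_SLIPS, batch_size, TIER_1_RPD)
    tier2 = days_to_process(TOTAL_SLIPS, batch_size, TIER_2_RPD)
    print(f"{batch_size:>2} slips/call: Tier 1 = {tier1} days, Tier 2 = {tier2} days")
```

Retries and failed calls also count against the quota, so real runs would likely take somewhat longer than these lower bounds.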

How to Unlock Tier 2

Enable Cloud Billing on your GCP project
Spend $250+ on Google Cloud services (cumulative)
Wait 30 days from first billing
Tier 2 is automatically unlocked

Source: Gemini API Rate Limits Guide

AWS Textract Rate Limits (for comparison)

Limit	Default Value	Time for 600K Slips
Requests/second	150	~1.1 hours
Concurrent async jobs	200	N/A

Source: AWS Textract Limits

Note: Textract has no daily request limits, making it less constrained than Gemini for high-volume processing.
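The OCR durations quoted for both services can be reproduced with the short sketch below. It assumes one request per slip and sustained throughput exactly at the default limit; the helper name `hours_at_rate` is hypothetical.

```python
def hours_at_rate(total_requests: int, requests_per_second: float) -> float:
    """Hours needed to issue total_requests at a sustained request rate."""
    return total_requests / requests_per_second / 3600


TOTAL_SLIPS = 600_000

# Document AI default: 120 requests per minute
doc_ai_hours = hours_at_rate(TOTAL_SLIPS, 120 / 60)
# Textract default: 150 requests per second
textract_hours = hours_at_rate(TOTAL_SLIPS, 150)

print(f"Document AI: {doc_ai_hours:.1f} hours ({doc_ai_hours / 24:.1f} days)")
print(f"Textract:    {textract_hours:.1f} hours")
```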

Detailed Cost Breakdown

1. OCR API Costs

Both providers offer equivalent OCR capabilities with word-level confidence scores.

Service	Price per 1,000 pages	Cost for 600K Slips	Confidence Scores
GCP Document AI OCR	$1.50	$900	✅ Word-level
AWS Textract (Detect Text)	$1.50	$900	✅ Word-level (0-100)

Sources: GCP Document AI and AWS Textract pricing pages (see Appendix A)

2. LLM API Costs (Gemini 2.0 Flash)

Gemini 2.0 Flash is used regardless of cloud provider. Based on measured token usage from test runs:

Metric	Per Slip	600K Slips	Cost
Input tokens	980	588,000,000	$58.80
Output tokens	252	151,200,000	$60.48
Total LLM			$119.28

Pricing:

Input: $0.10 per 1M tokens
Output: $0.40 per 1M tokens

Note: Batching does not significantly change token costs (same content processed), but reduces API call overhead.

Source: Gemini API Pricing
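As a worked check of the table above, the sketch below recomputes the LLM cost from the measured per-slip token counts and the listed prices. The constant and function names are illustrative, not taken from the project code.

```python
SLIPS = 600_000
INPUT_TOKENS_PER_SLIP = 980
OUTPUT_TOKENS_PER_SLIP = 252
INPUT_PRICE_PER_M = 0.10   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.40  # USD per 1M output tokens


def llm_cost(slips: int, tokens_per_slip: int, price_per_million: float) -> float:
    """Cost in USD for one token direction (input or output)."""
    return slips * tokens_per_slip / 1_000_000 * price_per_million


input_cost = llm_cost(SLIPS, INPUT_TOKENS_PER_SLIP, INPUT_PRICE_PER_M)
output_cost = llm_cost(SLIPS, OUTPUT_TOKENS_PER_SLIP, OUTPUT_PRICE_PER_M)
total_llm_cost = input_cost + output_cost

print(f"Input:  ${input_cost:.2f}")
print(f"Output: ${output_cost:.2f}")
print(f"Total:  ${total_llm_cost:.2f}")
```

Output tokens are roughly a quarter of the volume but slightly more than half of the cost, so trimming the output schema would likely save more than trimming the prompt.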

3. Compute Costs

Comparable instances for running the Python processing pipeline:

Provider	Instance	vCPUs	RAM	Price/Hour	Est. Hours	Total
GCP	n2-standard-4	4	16 GB	$0.19	144-288	$27 - $55
AWS	c5.xlarge	4	8 GB	$0.17	144-288	$24 - $49

Updated Assumptions (with rate limits):

Processing is rate-limited by Gemini API, not compute
Instance runs 6-12 days continuously
Most time spent waiting for API responses

Sources: GCP Compute and AWS EC2 pricing pages (see Appendix A)

4. Storage Costs

Item	Size	GCP ($/GB/mo)	GCP Total	AWS ($/GB/mo)	AWS Total
Source PDFs	240 GB	$0.020	$4.80	$0.023	$5.52
Split Images (JPG)	240 GB	$0.020	$4.80	$0.023	$5.52
Output (JSON, TXT, DB)	3 GB	$0.020	$0.06	$0.023	$0.07
Total (1 month)			$9.66		$11.11

Notes:

Prices for US regions (us-central1 / us-east-1)
Delete split images after processing to reduce ongoing costs
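The storage totals, and the effect of deleting the split images, can be checked with a few lines. This is a sketch under the table's assumptions (flat per-GB monthly price, one full month of storage); the names are illustrative.

```python
SIZES_GB = {"source_pdfs": 240, "split_images": 240, "output": 3}
GCP_PRICE = 0.020  # USD per GB per month
AWS_PRICE = 0.023  # USD per GB per month


def monthly_storage_cost(sizes_gb: dict, price_per_gb_month: float) -> float:
    """Monthly cost in USD for all stored items at a flat per-GB price."""
    return sum(sizes_gb.values()) * price_per_gb_month


gcp_monthly = monthly_storage_cost(SIZES_GB, GCP_PRICE)
aws_monthly = monthly_storage_cost(SIZES_GB, AWS_PRICE)

# Same calculation after deleting the intermediate JPGs
without_images = {k: v for k, v in SIZES_GB.items() if k != "split_images"}
gcp_after_cleanup = monthly_storage_cost(without_images, GCP_PRICE)

print(f"GCP: ${gcp_monthly:.2f}/month, AWS: ${aws_monthly:.2f}/month")
print(f"GCP after deleting split images: ${gcp_after_cleanup:.2f}/month")
```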

Sources: GCP Storage and AWS S3 pricing pages (see Appendix A)

5. Storage API Operations

Operation	Count	GCP ($/10K ops)	GCP Total	AWS ($/1K ops)	AWS Total
PUT (uploads)	600,000	$0.05	$3.00	$0.005	$3.00
GET (reads)	1,200,000	$0.004	$0.48	$0.0004	$0.48
Total			$3.48		$3.48

6. Network Egress

Downloading final results (JSON files, SQLite DB, logs):

Provider	Data Out	Free Tier	Price/GB	Billable	Total
GCP	~5 GB	100 GB/mo	$0.12	0 GB	$0.00
AWS	~5 GB	100 GB/mo	$0.09	0 GB	$0.00

Note: Results are small (~3-5 GB). Both providers include 100 GB free egress monthly.

Sources: provider pricing pages (see Appendix A)

7. Logging Costs

Estimated log volume: ~50-100 MB (processing logs, errors, stats)

Provider	Service	Ingestion ($/GB)	Free Tier	Storage ($/GB/mo)	Total
GCP	Cloud Logging	$0.50	50 GB/mo	$0.01 (after 30d)	$0.00
AWS	CloudWatch Logs	$0.50	5 GB/mo	$0.03	$0.00

Notes:

Log volume is minimal (~100 MB), well under free tiers
Logs include: per-slip timing, token usage, errors, daily summaries

Sources: GCP Logging and AWS CloudWatch pricing pages (see Appendix A)

Total Cost Summary

GCP (Google Cloud Platform)

Category	Low Estimate	High Estimate
Document AI OCR	$900.00	$900.00
Gemini 2.0 Flash	$119.28	$119.28
Compute (n2-standard-4, 6-12 days)	$27.36	$54.72
Storage (1 month)	$9.66	$9.66
Storage Operations	$3.48	$3.48
Network Egress	$0.00	$0.00
Logging	$0.00	$0.00
TOTAL	$1,059.78	$1,087.14

AWS (Amazon Web Services)

Category	Low Estimate	High Estimate
Textract (Detect Text)	$900.00	$900.00
Gemini 2.0 Flash	$119.28	$119.28
Compute (c5.xlarge, 6-12 days)	$24.48	$48.96
Storage (1 month)	$11.11	$11.11
Storage Operations	$3.48	$3.48
Network Egress	$0.00	$0.00
Logging	$0.00	$0.00
TOTAL	$1,058.35	$1,082.83
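The two summary tables can be recomputed from the per-category figures in the breakdown sections. The sketch below does so; the function name `total_cost` is hypothetical, and the 144 and 288 hour bounds correspond to 6 and 12 days of continuous runtime.

```python
HOURS_LOW, HOURS_HIGH = 144, 288  # 6 and 12 days of continuous instance runtime


def total_cost(ocr: float, llm: float, hourly_rate: float, hours: int,
               storage: float, storage_ops: float) -> float:
    """Sum of all non-zero cost categories in USD (egress and logging are $0)."""
    return ocr + llm + hourly_rate * hours + storage + storage_ops


gcp_low, gcp_high = (
    total_cost(900.00, 119.28, 0.19, h, 9.66, 3.48) for h in (HOURS_LOW, HOURS_HIGH)
)
aws_low, aws_high = (
    total_cost(900.00, 119.28, 0.17, h, 11.11, 3.48) for h in (HOURS_LOW, HOURS_HIGH)
)

print(f"GCP: ${gcp_low:,.2f} - ${gcp_high:,.2f}")
print(f"AWS: ${aws_low:,.2f} - ${aws_high:,.2f}")
```

OCR accounts for roughly 83-85% of the total on either provider, which is why the compute and storage differences barely move the comparison.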

Processing Time Estimates

Realistic Timeline (with rate limits)

Gemini Tier	Batching	LLM Calls	Time (RPD limited)	Feasible?
Tier 1	None	600,000	600 days	❌
Tier 1	10x	60,000	60 days	⚠️ Slow
Tier 2	10x	60,000	6 days	✅ Recommended
Tier 2	20x	30,000	3 days	✅ Aggressive

Parallel Processing Breakdown

Phase	Constraint	Time Estimate
PDF Splitting	Compute (local)	~2-4 hours
OCR (Document AI)	120 RPM	~83 hours (~3.5 days)
LLM Extraction	10,000 RPD (Tier 2)	~6 days
Total	LLM is bottleneck	~6-12 days

Note: OCR and LLM can run in parallel (OCR ahead of LLM), but LLM rate limits dominate the timeline.
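To illustrate why overlapping OCR and LLM extraction matters, the sketch below compares a back-to-back schedule with an overlapped one. It uses a deliberately simplified model that is an assumption of this example: when overlapped, the slower of the two phases determines the duration, and the lead time before the first OCR results are available is ignored.

```python
SPLIT_HOURS = 4            # upper end of the PDF splitting estimate
OCR_HOURS = 600_000 / 120 / 60   # 120 requests/minute -> about 83.3 hours
LLM_HOURS = 6 * 24         # 6 days at 10,000 requests/day with 10x batching

# Phases run one after another
sequential_days = (SPLIT_HOURS + OCR_HOURS + LLM_HOURS) / 24

# OCR runs ahead of the LLM stage, so only the slower phase counts
overlapped_days = (SPLIT_HOURS + max(OCR_HOURS, LLM_HOURS)) / 24

print(f"Sequential: {sequential_days:.1f} days")
print(f"Overlapped: {overlapped_days:.1f} days")
```

Under these assumptions the overlapped schedule lands near the 6-day lower bound, while running the phases back to back falls in the upper part of the 6-12 day range.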

Provider Comparison Matrix

Criterion	GCP	AWS	Winner
OCR Cost	$900	$900	Tie
LLM Cost	$119	$119	Tie
Compute Cost (6-12 days)	$27-55	$24-49	AWS (marginal)
Storage Cost	$10	$11	GCP (marginal)
OCR Rate Limits	120 RPM	150 RPS	AWS
LLM Rate Limits	Same (Gemini)	Same (Gemini)	Tie
Native Gemini Integration	✅ Yes	❌ Cross-cloud	GCP
Native OCR Integration	✅ Document AI	❌ Textract	GCP
Setup Complexity	Lower	Higher	GCP
Existing Project Setup	✅ Already configured	❌ New setup needed	GCP

Recommendations

Primary Recommendation: GCP with Tier 2 + Batching

Prerequisites:

✅ Enable Cloud Billing (unlocks Tier 1)
⏳ Spend $250+ on GCP services (unlocks Tier 2 after 30 days)
✅ Implement 10-slip batching in LLM calls

Expected Timeline: 6-12 days
Expected Cost: ~$1,060-$1,090

Batching Strategy (Mandatory)

Without batching:  600,000 calls → 600 days (Tier 1) or 60 days (Tier 2)
With 10x batching:  60,000 calls →  60 days (Tier 1) or  6 days (Tier 2)

The gubelin_parse.py script already includes an extract_batch_with_llm() function for this purpose.
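A minimal sketch of how slips might be grouped into batches of 10 before each LLM call is shown below. The `chunked` helper and the slip identifiers are hypothetical; the signature of extract_batch_with_llm() is not shown in this report, so the call itself is left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `size` items; the last one may be shorter."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical slip identifiers, for illustration only
slip_ids = [f"slip_{i:06d}" for i in range(25)]
batches = list(chunked(slip_ids, 10))

for number, batch in enumerate(batches, start=1):
    # Each batch would be passed to the project's batch extraction function here
    print(f"batch {number}: {len(batch)} slips ({batch[0]} .. {batch[-1]})")
```

Because the helper works on any iterable, slips can be streamed from the OCR output without loading all 600,000 identifiers into memory first.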

Cost Optimization Opportunities

Optimization	Savings	Effort
Delete intermediate JPGs after processing	$4.80/month	Low
Use Spot/Preemptible instances	$15-30	Medium
Request Gemini quota increase	Faster processing	Medium
Use Vertex AI instead of Gemini API	Higher limits, enterprise SLA	High

Risk Mitigation

Implement checkpointing: Save progress to SQLite DB to resume after failures
Exponential backoff: Handle 429 rate limit errors gracefully
Error logging: Track failed slips for manual review
Cost alerts: Set up billing alerts at $500, $800, $1,000
Daily progress monitoring: Log daily throughput to detect issues early
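The first two mitigations above could look roughly like the following sketch. Everything here is illustrative: `RateLimitError` is a hypothetical stand-in for however the client library reports an HTTP 429, and the `processed` table schema is an assumption, not the schema used by the project.

```python
import random
import sqlite3
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the LLM API."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Delay doubles on each attempt; jitter avoids synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


def open_checkpoint(path=":memory:"):
    """Open (or create) the checkpoint database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed ("
        "slip_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def mark_done(conn, slip_id, status="ok"):
    """Record the outcome for one slip so a restart can skip it."""
    conn.execute(
        "INSERT OR REPLACE INTO processed (slip_id, status) VALUES (?, ?)",
        (slip_id, status),
    )
    conn.commit()


def pending(conn, slip_ids):
    """Return the slips that have not yet been processed successfully."""
    rows = conn.execute("SELECT slip_id FROM processed WHERE status = 'ok'")
    done = {row[0] for row in rows}
    return [slip_id for slip_id in slip_ids if slip_id not in done]


conn = open_checkpoint()  # in-memory here; a file path would be used in practice
mark_done(conn, "slip_000001")
mark_done(conn, "slip_000002", status="failed")
print(pending(conn, ["slip_000001", "slip_000002", "slip_000003"]))
```

Slips recorded as failed remain in the pending list, so they are retried on the next run and can also be queried separately for manual review.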

Appendix A: Pricing Sources

Item	Source	Date Accessed
GCP Document AI	cloud.google.com/document-ai/pricing	Dec 2025
AWS Textract	aws.amazon.com/textract/pricing	Dec 2025
Gemini API Pricing	ai.google.dev/gemini-api/docs/pricing	Dec 2025
GCP Compute	cloud.google.com/compute/vm-instance-pricing	Dec 2025
AWS EC2	aws.amazon.com/ec2/pricing/on-demand	Dec 2025
GCP Storage	cloud.google.com/storage/pricing	Dec 2025
AWS S3	aws.amazon.com/s3/pricing	Dec 2025
GCP Logging	cloud.google.com/stackdriver/pricing	Dec 2025
AWS CloudWatch	aws.amazon.com/cloudwatch/pricing	Dec 2025

Appendix B: Rate Limit Sources

Item	Source	Date Accessed
Document AI Quotas	cloud.google.com/document-ai/quotas	Dec 2025
Gemini API Rate Limits	ai.google.dev/gemini-api/docs/rate-limits	Dec 2025
Gemini Rate Limits Guide	blog.laozhang.ai/ai-tools/gemini-api-rate-limits-guide	Dec 2025
Gemini Tier Breakdown	aifreeapi.com/en/posts/gemini-api-rate-limit	Dec 2025
AWS Textract Limits	docs.aws.amazon.com/textract/latest/dg/limits.html	Dec 2025

Appendix C: Quick Reference

Timeline Summary

Phase	Duration	Constraint
Setup & Upload	1 day	Network speed
PDF Splitting	4 hours	Compute
OCR Processing	3.5 days	120 RPM
LLM Extraction	6 days	10,000 RPD
Total (phases run back to back)	~10-12 days	LLM rate limits

Cost Summary

Category	Cost
OCR	$900
LLM	$119
Compute	$27-55
Storage	$10
Other	$3-4
Total	~$1,060-$1,090

Report generated for Gübelin historical document processing project