This document tracks infrastructure improvements to implement as the project scales and generates revenue.
See Risk Assessment below.
None yet
- Current Cost: ~$30/month (1x t4g.medium)
- HA Cost: ~$60/month (2x t4g.medium)
- Additional Cost: +$30/month
Changes required:
- CodeDeployDefault.OneAtATime

Total HA upgrade: +$90/month
Priority: Medium (security + minor latency improvement)
| Endpoint | Type | Cost/month |
|---|---|---|
| S3 | Gateway | FREE |
| SQS | Interface (2 AZs) | ~$17.50 |
| Secrets Manager | Interface (2 AZs) | ~$17.50 |
| ECR API | Interface (2 AZs) | ~$17.50 |
| ECR Docker | Interface (2 AZs) | ~$17.50 |
| SSM (for Session Manager) | Interface (2 AZs) | ~$17.50 |
| SSM Messages | Interface (2 AZs) | ~$17.50 |
| EC2 Messages | Interface (2 AZs) | ~$17.50 |
Recommended first step: Add S3 Gateway Endpoint (free) immediately.
Full implementation: ~$122/month for all interface endpoints
Note: Data processing adds $0.01/GB through interface endpoints, but this is typically less than NAT Gateway costs for the same traffic.
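As a rough worked example of the figures above, the sketch below estimates the monthly bill for the interface endpoints and compares it with the NAT Gateway pricing quoted elsewhere in this document (~$38/month + $0.052/GB). The function names are illustrative assumptions, and the rates are the approximate ones from this document, not authoritative AWS pricing.

```python
# Approximate monthly rates taken from this document (not authoritative AWS pricing).
INTERFACE_ENDPOINT_MONTHLY = 17.50  # per interface endpoint deployed in 2 AZs
INTERFACE_ENDPOINT_PER_GB = 0.01    # data processing through interface endpoints
NAT_GATEWAY_MONTHLY = 38.00         # single NAT Gateway
NAT_GATEWAY_PER_GB = 0.052          # data processing through the NAT Gateway


def interface_endpoints_cost(endpoint_count: int, gb_processed: float) -> float:
    """Monthly cost of N interface endpoints plus their data processing."""
    fixed = endpoint_count * INTERFACE_ENDPOINT_MONTHLY
    return fixed + gb_processed * INTERFACE_ENDPOINT_PER_GB


def nat_gateway_cost(gb_processed: float) -> float:
    """Monthly cost of one NAT Gateway plus its data processing."""
    return NAT_GATEWAY_MONTHLY + gb_processed * NAT_GATEWAY_PER_GB


def break_even_gb(endpoint_count: int) -> float:
    """Monthly traffic (GB) above which the endpoints cost less than the NAT."""
    fixed_gap = endpoint_count * INTERFACE_ENDPOINT_MONTHLY - NAT_GATEWAY_MONTHLY
    per_gb_saving = NAT_GATEWAY_PER_GB - INTERFACE_ENDPOINT_PER_GB
    return max(fixed_gap, 0.0) / per_gb_saving


if __name__ == "__main__":
    for gb in (100, 1000, 5000):
        print(gb, interface_endpoints_cost(7, gb), nat_gateway_cost(gb))
    print("break-even GB:", break_even_gb(7))
```

Under these assumed rates the per-GB charge is lower through endpoints, but the fixed cost of all seven interface endpoints only pays for itself at roughly 2 TB of traffic per month, which is one reason to start with the free S3 Gateway Endpoint.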
Priority: Medium (once handling sensitive customer data)
| Component | Cost/month |
|---|---|
| Web ACL | $5.00 |
| AWS Managed Rules - Common (baseline) | $1.00 |
| AWS Managed Rules - SQLi | $1.00 |
| AWS Managed Rules - Known Bad Inputs | $1.00 |
| Request charges (5M requests) | $3.00 |
- Minimum WAF setup: ~$11/month
- Recommended setup (4-5 rule groups): ~$15/month
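The minimum figure can be reproduced from the table with a small calculation. This is a sketch only: the per-million request rate is inferred from the table row ($3.00 for 5M requests), and the function name is an assumption for illustration.

```python
# Rates taken from the table above; the request rate is inferred from
# "$3.00 for 5M requests".
WEB_ACL_MONTHLY = 5.00
RULE_GROUP_MONTHLY = 1.00
REQUESTS_PER_MILLION = 0.60


def waf_monthly_cost(rule_groups: int, million_requests: float) -> float:
    """Web ACL fee + flat-rate managed rule groups + request charges."""
    return (
        WEB_ACL_MONTHLY
        + rule_groups * RULE_GROUP_MONTHLY
        + million_requests * REQUESTS_PER_MILLION
    )


if __name__ == "__main__":
    # Minimum setup: Common, SQLi and Known Bad Inputs at 5M requests/month.
    print(round(waf_monthly_cost(rule_groups=3, million_requests=5), 2))
```

Rule groups that carry their own fees are priced differently from the flat $1.00 assumed here, so the result is best treated as a lower bound.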
Consider adding when:
- Current: 5 images retained
- Consideration: Increase to 10-15 for better rollback capability
Cost impact:
Recommendation: Increase to 10 images. Negligible cost, better rollback safety.
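One way to express the recommendation above is an ECR lifecycle policy that expires everything beyond the newest 10 images. The sketch below only builds the policy JSON; how the repository is actually managed in this project (for example through infrastructure code) is not stated here, so the helper name and the way the policy is applied are assumptions.

```python
import json


def lifecycle_policy(max_images: int = 10) -> str:
    """Build an ECR lifecycle policy that keeps only the newest `max_images`."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Keep only the {max_images} most recent images",
                "selection": {
                    "tagStatus": "any",
                    "countType": "imageCountMoreThan",
                    "countNumber": max_images,
                },
                "action": {"type": "expire"},
            }
        ]
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(lifecycle_policy(10))
```

The resulting JSON can be attached to a repository with `aws ecr put-lifecycle-policy`, or translated into the equivalent setting of whatever tool provisions the repository.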
- Current: No NAT Gateway (EC2 in public subnet)
- Cost if added: ~$38/month + $0.052/GB processed
See Risk Assessment for why this is acceptable for now.
EC2 instance in public subnet with:
| Risk | Severity | Mitigation | Residual Risk |
|---|---|---|---|
| Direct attack on EC2 | Low | SG blocks all ports except 8888 from ALB only | Minimal - no exposed ports |
| Instance metadata exposure | Low | IMDSv2 required (token-based) | Minimal |
| Outbound data exfiltration | Medium | Would need to compromise app first | Acceptable |
| AWS API credential theft | Low | Instance role with scoped permissions | Acceptable |
| Aspect | Public Subnet | Private + NAT |
|---|---|---|
| Monthly cost | $0 | ~$38 |
| Attack surface | SG-protected | Identical (SG still primary defense) |
| Compliance | May fail some audits | Preferred for SOC2/HIPAA |
| Operational complexity | Lower | Higher (NAT is SPOF unless HA) |
Keep public subnet for MVP. The security group is the primary defense in both architectures. Move to private subnet + NAT when:
Run migration tests only on schema-changing merges, not every deployment.
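A minimal sketch of how "schema-changing" could be detected, assuming the build has a list of changed file paths (for example from `git diff --name-only` between the merge base and the new commit). The function names are hypothetical; only the `resources/migrations/*.sql` location comes from this document.

```python
from pathlib import PurePosixPath

# Directory that holds the SQL migrations, as referenced in this document.
MIGRATIONS_DIR = PurePosixPath("resources/migrations")


def is_migration_file(path: str) -> bool:
    """True for .sql files located directly in the migrations directory."""
    p = PurePosixPath(path)
    return p.parent == MIGRATIONS_DIR and p.suffix == ".sql"


def needs_migration_tests(changed_files: list[str]) -> bool:
    """True if any changed file is a migration, i.e. the merge changes the schema."""
    return any(is_migration_file(f) for f in changed_files)


if __name__ == "__main__":
    changed = ["src/app.py", "resources/migrations/004_add_index.sql"]
    print(needs_migration_tests(changed))
```

A build step could exit early when this returns `False`, so that ordinary deployments skip the migration test stage.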
Add a CodeBuild step that:
- resources/migrations/*.sql files changed

NetworkStack → StorageStack → QueueStack → EcrStack → DatabaseStack
↓
DnsStack → ComputeStack → MonitoringStack → CicdStack
1. FoundationStack (replaces Network + Storage + Queue + ECR)
2. DataStack (replaces Database)
3. ComputeStack (replaces DNS + Compute)
4. OpsStack (replaces Monitoring + CICD)
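Before moving resources, it can help to write the mapping down as data and check that every current stack lands in exactly one target stack. The sketch below encodes the list above; the helper names are illustrative assumptions, not part of the existing codebase.

```python
from collections import Counter

# The nine stacks that exist today.
CURRENT_STACKS = [
    "NetworkStack", "StorageStack", "QueueStack", "EcrStack", "DatabaseStack",
    "DnsStack", "ComputeStack", "MonitoringStack", "CicdStack",
]

# Target stack -> current stacks it replaces, in planned deployment order.
CONSOLIDATION_PLAN = {
    "FoundationStack": ["NetworkStack", "StorageStack", "QueueStack", "EcrStack"],
    "DataStack": ["DatabaseStack"],
    "ComputeStack": ["DnsStack", "ComputeStack"],
    "OpsStack": ["MonitoringStack", "CicdStack"],
}


def consolidation_problems(plan: dict, current_stacks: list) -> list:
    """Return human-readable problems; an empty list means the plan is complete."""
    assigned = Counter(old for olds in plan.values() for old in olds)
    problems = []
    for stack in current_stacks:
        if assigned[stack] == 0:
            problems.append(f"{stack} is not covered by any target stack")
        elif assigned[stack] > 1:
            problems.append(f"{stack} is assigned to more than one target stack")
    for stack in assigned:
        if stack not in current_stacks:
            problems.append(f"{stack} is not a current stack")
    return problems


def target_stack_for(old_stack: str, plan: dict = CONSOLIDATION_PLAN) -> str:
    """Look up which consolidated stack takes over a current stack."""
    for target, olds in plan.items():
        if old_stack in olds:
            return target
    raise KeyError(old_stack)


if __name__ == "__main__":
    print(consolidation_problems(CONSOLIDATION_PLAN, CURRENT_STACKS))
    print(target_stack_for("DnsStack"))
```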
- Effort: ~4-6 hours
- Risk: Medium (resource importing can be tricky)
- Recommendation: Do this before first production deployment, not after
| Improvement | Monthly Cost | Priority | When |
|---|---|---|---|
| S3 Gateway Endpoint | FREE | High | Now |
| ECR retention 10 images | +$1.00 | Medium | Now |
| Stack consolidation | $0 | Medium | Before prod |
| WAF | +$15 | Medium | With customers |
| VPC Interface Endpoints | +$122 | Low | With revenue |
| HA (2 instances) | +$30 | Low | With SLA needs |
| NAT Gateway | +$38 | Low | With compliance |
| RDS Multi-AZ | +$60 | Low | With SLA needs |
Current: Manual rotation only. Database credentials are stored in Secrets Manager (/v1-orcha/db-credentials). After manual rotation, restart the application to pick up new credentials.
Future: Enable automatic rotation with application restart.
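One possible shape for the automated restart is a small event handler that starts an instance refresh once rotation has succeeded. This is a sketch under stated assumptions: it assumes the notification arrives as an EventBridge event whose `detail.eventName` is `RotationSucceeded` (verify the exact shape against real events in the account), and the Auto Scaling group name is a hypothetical placeholder. The client is passed in so the logic can be exercised without AWS access.

```python
def handle_rotation_event(event: dict, autoscaling_client, asg_name: str) -> bool:
    """Start a rolling instance refresh when a rotation-succeeded event arrives.

    Returns True if a refresh was requested, False if the event was ignored.
    """
    detail = event.get("detail", {})
    # Assumption: successful rotations are reported with this event name.
    if detail.get("eventName") != "RotationSucceeded":
        return False
    # Requires the autoscaling:StartInstanceRefresh permission.
    autoscaling_client.start_instance_refresh(
        AutoScalingGroupName=asg_name,
        Strategy="Rolling",
    )
    return True


class RecordingClient:
    """Stand-in for a boto3 Auto Scaling client, used for local dry runs."""

    def __init__(self):
        self.calls = []

    def start_instance_refresh(self, **kwargs):
        self.calls.append(kwargs)
        return {"InstanceRefreshId": "dry-run"}


if __name__ == "__main__":
    client = RecordingClient()
    sample = {"detail": {"eventName": "RotationSucceeded"}}
    # "example-asg" is a hypothetical Auto Scaling group name.
    print(handle_rotation_event(sample, client, "example-asg"), client.calls)
```

In a Lambda function the client would typically be `boto3.client("autoscaling")`, with the group name supplied through configuration.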
- SecretsManager Secret Rotation Successful event
- autoscaling:StartInstanceRefresh

If you need rotation without restart:
- DB_HOST env var + v1-orcha/db-credentials secret (username/password only)

When a large tenant requires dedicated resources to avoid noisy-neighbor issues:
SQS Queues:
- v1-orcha-{tenant-slug}-ingest + v1-orcha-{tenant-slug}-ingest-dlq
- v1-orcha-{tenant-slug}-email-acquire + v1-orcha-{tenant-slug}-email-acquire-dlq

S3 Bucket:
- v1-orcha-{tenant-slug}-storage-{account_id}

Dedicated Worker Instance(s):
Optional: Dedicated API Keys:
| Resource | Per-Tenant Cost |
|---|---|
| SQS queues (4x) | ~$0 (pay per message) |
| S3 bucket | ~$0 (pay per storage/request) |
| Worker instance (t4g.medium) | ~$30/month |
| Dedicated API keys | Depends on usage |
Recommendation: Only create dedicated resources when tenant volume justifies the operational overhead. Start with dedicated workers consuming from the global queues, and escalate to full isolation only if needed.
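If dedicated resources are created, the naming patterns above can be generated and validated in one place. The sketch below is illustrative: the helper and class names are assumptions, and so is the slug format (lowercase letters, digits, and hyphens). The length limits reflect the SQS queue name limit (80 characters) and the S3 bucket name limit (63 characters).

```python
import re
from dataclasses import dataclass

# Assumed slug format: lowercase alphanumerics separated by single hyphens.
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
SQS_NAME_MAX = 80  # SQS queue names are limited to 80 characters
S3_NAME_MAX = 63   # S3 bucket names are limited to 63 characters


@dataclass(frozen=True)
class TenantResources:
    queues: tuple
    bucket: str


def tenant_resource_names(tenant_slug: str, account_id: str) -> TenantResources:
    """Derive the per-tenant queue and bucket names from a tenant slug."""
    if not SLUG_PATTERN.match(tenant_slug):
        raise ValueError(f"invalid tenant slug: {tenant_slug!r}")
    # Each queue is paired with its dead-letter queue.
    queues = tuple(
        f"v1-orcha-{tenant_slug}-{kind}{suffix}"
        for kind in ("ingest", "email-acquire")
        for suffix in ("", "-dlq")
    )
    bucket = f"v1-orcha-{tenant_slug}-storage-{account_id}"
    too_long = [q for q in queues if len(q) > SQS_NAME_MAX]
    if too_long or len(bucket) > S3_NAME_MAX:
        raise ValueError(f"tenant slug too long for resource names: {tenant_slug!r}")
    return TenantResources(queues=queues, bucket=bucket)


if __name__ == "__main__":
    # "acme" and the account ID are placeholder values.
    print(tenant_resource_names("acme", "123456789012"))
```

Checking the slug up front keeps a long or oddly formatted tenant name from failing halfway through provisioning.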
| Date | Change |
|---|---|
| 2026-01-10 | Added automatic database credential rotation section |
| 2026-01-10 | Added per-tenant resource scaling section |
| 2026-01-09 | Initial document created |