Sparki Fiber Storage Adapters Integration Matrix
:::info overview
This document provides a comprehensive matrix of all supported Fiber storage adapters, their characteristics, use cases within Sparki, and implementation guidance. It serves as a reference for architecture decisions and adapter selection.
:::

Version: 1.0
Date: December 3, 2025
Document Owner: Infrastructure Team
Storage Adapter Selection Matrix
Primary Adapters (In Production)
| Adapter | Type | Latency | Throughput | Durability | Use Case in Sparki | Priority |
|---|---|---|---|---|---|---|
| PostgreSQL | Relational DB | 5-20ms | 10K+ qps | 99.99% | Primary data store (users, projects, pipelines, builds, deployments) | P0 |
| Redis | In-Memory Cache | <1ms | 100K+ ops/sec | 99.9% | Session caching, permission cache, build logs, rate limiting | P0 |
| S3 (AWS) | Object Storage | 50-200ms | Unlimited | 99.999999999% | Build artifacts, deployment logs, long-term storage | P0 |
| Badger | Embedded KV | 1-5ms | 100K+ ops/sec | 99% | Local build state, temporary caching (edge deployments) | P1 |
Secondary Adapters (Extended Support)
| Adapter | Type | Latency | Throughput | Use Case | Priority |
|---|---|---|---|---|---|
| MongoDB | Document DB | 5-15ms | 10K+ qps | Alternative: build metadata, pipeline configs | P2 |
| SurrealDB | Multi-model | 10-30ms | 5K+ qps | Time-series metrics, real-time collaboration | P2 |
| Memcache | Cache | <1ms | 10K+ ops/sec | High-performance distributed caching | P2 |
| Cassandra | Distributed DB | 10-50ms | 100K+ qps | Audit logs at massive scale | P3 |
| ArangoDB | Graph DB | 10-30ms | 5K+ qps | Workflow dependency graphs, relationship queries | P3 |
| MinIO | S3-Compatible | 50-100ms | Unlimited | Self-hosted S3 alternative | P1 |
Supported But Not Primary
| Adapter | Type | Rationale |
|---|---|---|
| MySQL | Relational | PostgreSQL preferred for performance |
| MSSQL | Relational | Enterprise option if required |
| DynamoDB | NoSQL | AWS-only, higher costs |
| Firestore | NoSQL | Google Cloud only |
| Couchbase | Document DB | Enterprise option |
| Neo4j | Graph DB | Specialized use case |
| Etcd | Key-Value | Kubernetes configuration only |
| LevelDB | Embedded KV | Single-instance limitation |
| Pebble | Embedded KV | Go-native alternative to Badger |
| Nats | Message Queue | Event streaming (future) |
| Clickhouse | Analytics DB | Time-series/analytics (future) |
| SurrealDB | Multi-model | Future real-time collaboration |
| Azure Blob | Object Storage | Multi-cloud option |
| Cloudflare KV | Distributed Cache | Edge caching (future) |
Detailed Adapter Specifications
PostgreSQL
Classification: P0 Primary - Relational Database
Fiber Adapter: storage/postgres
Characteristics:
- Latency: 5-20ms per query
- Throughput: 10,000+ queries per second (per instance)
- Concurrency: Connection pooling (20-100 connections)
- Durability: ACID transactions, replication support
- HA: Master-replica replication, automatic failover

Sparki Use Cases:
- User Accounts: users table with authentication data
- Workspaces & Teams: workspace/team management
- Projects: Git repository metadata
- Pipelines: Pipeline definitions and configurations
- Builds: Build execution records
- Deployments: Deployment history and status
- Audit Logs: Immutable event logs
- RBAC: Roles, permissions, memberships

Implementation Guidance:
- Use connection pooling and cap the pool at 100 connections
- Index frequently queried columns
- Vacuum tables regularly to prevent bloat
- Monitor query performance with pg_stat_statements
- Consider read replicas for read-heavy workloads
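
The pooling guidance above can be sketched with Go's standard `database/sql` package. This is a minimal illustration, not the Fiber adapter's own configuration, which this document does not show. `PoolConfig`, `DefaultPoolConfig`, and the idle and lifetime values are assumptions; only the cap of 100 connections comes from the text.

```go
package main

import (
	"database/sql"
	"time"
)

// PoolConfig holds connection pool limits. The type and field names are
// hypothetical and chosen for this sketch only.
type PoolConfig struct {
	MaxOpen     int           // upper bound on open connections
	MaxIdle     int           // idle connections kept ready for reuse
	MaxLifetime time.Duration // connections older than this are recycled
}

// DefaultPoolConfig caps the pool at 100 connections, the upper end of the
// 20-100 range listed above. MaxIdle and MaxLifetime are assumed values.
func DefaultPoolConfig() PoolConfig {
	return PoolConfig{MaxOpen: 100, MaxIdle: 20, MaxLifetime: 30 * time.Minute}
}

// Apply sets the limits on an existing database/sql handle.
func (c PoolConfig) Apply(db *sql.DB) {
	db.SetMaxOpenConns(c.MaxOpen)
	db.SetMaxIdleConns(c.MaxIdle)
	db.SetConnMaxLifetime(c.MaxLifetime)
}
```

After opening a handle with a PostgreSQL driver of your choice, `DefaultPoolConfig().Apply(db)` applies the limits in one call.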
Redis / Valkey
Classification: P0 Primary - In-Memory Cache
Fiber Adapter: storage/redis or storage/valkey
Characteristics:
- Latency: <1ms (sub-millisecond)
- Throughput: 100,000+ operations per second
- Data Types: Strings, Lists, Sets, Hashes, Sorted Sets, Streams
- Persistence: Optional (snapshots, AOF)
- HA: Sentinel or Cluster mode

Sparki Use Cases:
- Session Storage: User session tokens
- Permission Cache: Role-based permissions (<1ms validation)
- Build Logs: Real-time log streaming
- Pipeline Status: Real-time pipeline state
- Rate Limiting: Request throttling per user/IP
- WebSocket Sessions: Active connections
- Deployment Status: Real-time deployment progress
- Temporary State: In-flight build/deployment data

Implementation Guidance:
- Use pipelining to batch commands
- Implement proper TTL to prevent memory bloat
- Use Cluster mode for horizontal scaling
- Monitor memory usage and eviction policies
- Consider Valkey as an open-source alternative
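
The rate-limiting use case and the TTL guidance combine naturally into a fixed-window counter, the pattern usually built on Redis with `INCR` plus `EXPIRE`. The sketch below is an in-memory stand-in for that idea so it runs without a Redis server. `RateLimiter`, `NewRateLimiter`, and `Allow` are hypothetical names, and the clock is injected as Unix seconds to keep the example deterministic.

```go
package main

import "sync"

// window tracks the request count for one key until it expires.
type window struct {
	count     int
	expiresAt int64 // Unix seconds
}

// RateLimiter is an in-memory stand-in for a Redis counter with a TTL.
// All names are hypothetical.
type RateLimiter struct {
	mu         sync.Mutex
	limit      int
	ttlSeconds int64
	now        func() int64 // injectable clock returning Unix seconds
	windows    map[string]*window
}

// NewRateLimiter allows at most limit requests per key in each TTL window.
func NewRateLimiter(limit int, ttlSeconds int64, now func() int64) *RateLimiter {
	return &RateLimiter{
		limit:      limit,
		ttlSeconds: ttlSeconds,
		now:        now,
		windows:    make(map[string]*window),
	}
}

// Allow counts one request for key (for example a user ID or IP address)
// and reports whether it is within the limit for the current window.
func (r *RateLimiter) Allow(key string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	t := r.now()
	w, ok := r.windows[key]
	if !ok || t >= w.expiresAt {
		// First request for this key, or the TTL elapsed: start a new window.
		w = &window{expiresAt: t + r.ttlSeconds}
		r.windows[key] = w
	}
	w.count++
	return w.count <= r.limit
}
```

In a real deployment the clock would be `func() int64 { return time.Now().Unix() }`, and the counter would live in Redis so that all API instances share it. The TTL keeps stale counters from accumulating, which is the memory-bloat concern noted above.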
S3 / Object Storage
Classification: P0 Primary - Object Storage
Fiber Adapter: storage/minio (S3-compatible)
Characteristics:
- Latency: 50-200ms per request
- Throughput: Unlimited
- Durability: 99.999999999% (AWS S3 design target, 11 nines)
- Scalability: Infinite
- Cost: Pay-per-GB

Sparki Use Cases:
- Build Artifacts: Compiled binaries, packages
- Deployment Logs: Long-term log storage
- Test Reports: JUnit XML, Coverage reports
- Configuration Backups: Pipeline/deployment snapshots
- Audit Log Archives: Long-term compliance storage
- Build Cache: Container image layers, dependency caches

Implementation Guidance:
- Use multipart uploads for large files (>100MB)
- Implement retry logic for transient failures
- Consider S3 transfer acceleration
- Use CloudFront CDN for frequent downloads
- Archive old logs to Glacier for cost optimization
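
The first two items above, the multipart threshold and retrying transient failures, can be sketched as follows. This is an illustration under stated assumptions rather than Sparki's implementation: `UseMultipart`, `Retry`, and `ErrTransient` are hypothetical names, "100MB" is read as 100 MiB, and a real client would classify timeouts, throttling, and 5xx responses as transient instead of relying on a sentinel error.

```go
package main

import (
	"errors"
	"time"
)

// multipartThreshold mirrors the ">100MB" guidance (assumed to mean 100 MiB).
const multipartThreshold int64 = 100 * 1024 * 1024

// UseMultipart reports whether an object of size bytes should be uploaded
// in parts instead of with a single request.
func UseMultipart(size int64) bool {
	return size > multipartThreshold
}

// ErrTransient marks failures that are worth retrying (hypothetical sentinel).
var ErrTransient = errors.New("transient storage error")

// Retry calls op up to attempts times and doubles the delay after each
// transient failure. Any other error is returned immediately.
func Retry(attempts int, base time.Duration, op func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil || !errors.Is(err, ErrTransient) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}
```

A call such as `Retry(5, 200*time.Millisecond, upload)` would wait 200ms, 400ms, 800ms, and 1.6s between the five attempts. Adding random jitter to each delay is a common refinement when many workers retry at once.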
Badger (Embedded)
Classification: P1 Secondary - Embedded Key-Value Store
Fiber Adapter: storage/badger
Characteristics:
- Latency: 1-5ms
- Throughput: 100,000+ operations/second
- Data Structure: Key-value pairs
- Persistence: LSM tree based
- Limitations: Single-instance only

Sparki Use Cases:
- Edge Deployments: Local build state on deployment agents
- Build Cache: Temporary build artifacts during pipeline
- Fallback Storage: When Redis unavailable
- Development Mode: Local testing without external storage

Implementation Guidance:
- Not suitable for distributed systems
- Use for single-instance edge scenarios only
- Regular compaction needed to manage disk space
- Monitor LSM tree growth
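
The "Fallback Storage" use case can be sketched as a thin wrapper that tries a primary store and falls back to a local one when the primary returns an error. The `KV` interface below is a two-method subset modelled on Fiber's `Storage` interface (`Get` and `Set`). `Fallback` and `memStore` are hypothetical, and `memStore` is an in-memory stand-in that ignores expiry so the example runs without Redis or Badger.

```go
package main

import (
	"errors"
	"time"
)

// KV is a two-method subset modelled on Fiber's Storage interface.
type KV interface {
	Get(key string) ([]byte, error)
	Set(key string, val []byte, exp time.Duration) error
}

var errUnavailable = errors.New("storage unavailable")

// memStore is an in-memory stand-in for a real adapter. Down simulates an
// outage, and the expiry argument is ignored in this sketch.
type memStore struct {
	Down bool
	data map[string][]byte
}

func newMemStore() *memStore { return &memStore{data: map[string][]byte{}} }

func (m *memStore) Get(key string) ([]byte, error) {
	if m.Down {
		return nil, errUnavailable
	}
	return m.data[key], nil
}

func (m *memStore) Set(key string, val []byte, _ time.Duration) error {
	if m.Down {
		return errUnavailable
	}
	m.data[key] = val
	return nil
}

// Fallback uses Primary and switches to Secondary only when Primary errors.
type Fallback struct{ Primary, Secondary KV }

func (f Fallback) Get(key string) ([]byte, error) {
	if v, err := f.Primary.Get(key); err == nil {
		return v, nil
	}
	return f.Secondary.Get(key)
}

func (f Fallback) Set(key string, val []byte, exp time.Duration) error {
	if err := f.Primary.Set(key, val, exp); err == nil {
		return nil
	}
	return f.Secondary.Set(key, val, exp)
}
```

One caveat of this simple design: values written to the secondary during an outage are not visible through the primary once it recovers, so it suits short-lived state better than data that must survive the switch back.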
MongoDB (Optional)
Classification: P2 Secondary - Document Database

Sparki Use Cases (if selected):
- Pipeline configuration storage (schemaless)
- Build metadata (flexible structure)
- Analytics data (document model fits well)

Not Recommended For:
- Critical transactional data (use PostgreSQL)
- Real-time permission checks (use Redis)
- Audit logs (PostgreSQL is a better fit)
SurrealDB (Future)
Classification: P2 Secondary - Multi-Model Database

Potential Sparki Use Cases:
- Time-series pipeline metrics
- Real-time collaboration data
- Workflow dependency graphs
- Schema-flexible configuration
Adapter Selection Decision Matrix
Choosing Storage for New Feature
Flow Chart:
Technology Selection Examples
Build Artifacts:
Integration Patterns
Pattern 1: Multi-Tier Storage
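
One common reading of multi-tier storage is a read path that checks the fastest tier first (for example Redis), then slower ones (PostgreSQL, then S3), and copies a hit back into the faster tiers that missed. The sketch below illustrates that reading only. `Tier`, `mapTier`, and `ReadThrough` are hypothetical names, and plain maps stand in for the real adapters.

```go
package main

// Tier is a minimal key-value view of one storage layer (hypothetical).
type Tier interface {
	Get(key string) (string, bool)
	Set(key, val string)
}

// mapTier is an in-memory stand-in for a real adapter.
type mapTier map[string]string

func (m mapTier) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

func (m mapTier) Set(key, val string) { m[key] = val }

// ReadThrough checks tiers from fastest to slowest. On a hit it copies the
// value into every faster tier that missed, then returns it.
func ReadThrough(key string, tiers ...Tier) (string, bool) {
	for i, t := range tiers {
		if v, ok := t.Get(key); ok {
			for _, faster := range tiers[:i] {
				faster.Set(key, v)
			}
			return v, true
		}
	}
	return "", false
}
```

Whether a value found in the slowest tier should be copied into every faster tier, or only into the cache, is a design choice. This sketch backfills all of them for simplicity.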
Pattern 2: Event Sourcing with Archive
Pattern 3: Cache Invalidation
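
A widely used form of cache invalidation is to write the source of truth first and then delete the cached entry, so the next read reloads fresh data. The sketch below applies that to the permission cache described earlier. `PermissionStore` is a hypothetical type, and its two maps stand in for PostgreSQL and Redis.

```go
package main

// PermissionStore pairs a source of truth with a cache. The type is
// hypothetical; the maps stand in for PostgreSQL and Redis.
type PermissionStore struct {
	db    map[string]string
	cache map[string]string
}

func NewPermissionStore() *PermissionStore {
	return &PermissionStore{db: map[string]string{}, cache: map[string]string{}}
}

// Get reads the cache first and fills it from the database on a miss.
func (s *PermissionStore) Get(user string) (string, bool) {
	if role, ok := s.cache[user]; ok {
		return role, true
	}
	role, ok := s.db[user]
	if ok {
		s.cache[user] = role
	}
	return role, ok
}

// Update writes the database first, then deletes the cached entry so that
// the next Get reloads the new value.
func (s *PermissionStore) Update(user, role string) {
	s.db[user] = role
	delete(s.cache, user)
}
```

This sketch is not safe for concurrent use. With a shared cache, a short TTL on cached entries is a common safety net in case an invalidation is lost.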
Monitoring & Observability
Key Metrics per Storage Type
PostgreSQL:
Alerting Thresholds
| Metric | Adapter | Warning | Critical |
|---|---|---|---|
| Latency p95 | PostgreSQL | 50ms | 100ms |
| Latency p95 | Redis | 5ms | 10ms |
| Memory Usage | Redis | 80% | 95% |
| Connection Pool | PostgreSQL | 80% | 95% |
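
The thresholds in this table can be encoded as data and checked with a small classifier. The sketch below is one way to do so. The metric keys, `Severity`, and `Classify` are hypothetical names, and treating a value equal to a threshold as breaching it is an assumption the table does not settle.

```go
package main

// Severity is the alert level for a metric reading.
type Severity int

const (
	OK Severity = iota
	Warning
	Critical
)

type threshold struct{ warning, critical float64 }

// thresholds copies the table above: latencies in milliseconds, usage in
// percent. The key format is hypothetical.
var thresholds = map[string]threshold{
	"postgresql/latency_p95_ms":      {50, 100},
	"redis/latency_p95_ms":           {5, 10},
	"redis/memory_usage_pct":         {80, 95},
	"postgresql/connection_pool_pct": {80, 95},
}

// Classify returns the severity for a reading. The boolean is false when the
// metric has no configured thresholds. A value equal to a threshold counts
// as breaching it (an assumption).
func Classify(metric string, value float64) (Severity, bool) {
	t, ok := thresholds[metric]
	if !ok {
		return OK, false
	}
	switch {
	case value >= t.critical:
		return Critical, true
	case value >= t.warning:
		return Warning, true
	}
	return OK, true
}
```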
Conclusion
Sparki’s storage architecture leverages a polyglot persistence approach:
- PostgreSQL for relational, transactional data
- Redis for ultra-fast caching and real-time data
- S3 for scalable object storage
- Badger for edge/embedded deployments
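
The mapping above can be written down as a lookup from workload type to adapter. The sketch below does only that. `Workload` and `SelectAdapter` are hypothetical names, and the adapter strings are the ones listed in the adapter specifications.

```go
package main

// Workload is a coarse category of data, following the four bullets above.
type Workload int

const (
	Transactional Workload = iota // relational, transactional data
	RealTime                      // caching and real-time data
	Object                        // large, scalable object storage
	Edge                          // edge or embedded deployments
)

// SelectAdapter returns the Fiber adapter named in this document for the
// given workload, or an empty string for an unknown workload.
func SelectAdapter(w Workload) string {
	switch w {
	case Transactional:
		return "storage/postgres"
	case RealTime:
		return "storage/redis"
	case Object:
		return "storage/minio"
	case Edge:
		return "storage/badger"
	}
	return ""
}
```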
Document History:
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2025-12-03 | Sparki Engineering | Initial integration matrix |