TekTree Architecture Overview
Version: 1.0.0
Last Updated: 2025-12-16
Status: Foundation (Pre-Implementation)
Executive Summary
TekTree is an event-driven, real-time knowledge network designed as a gamified, monetizable open-source platform. The architecture prioritizes horizontal scalability, observability, and extensibility through asynchronous event-driven patterns, enabling a fun and engaging user experience while supporting tiered subscription models via Polar.
Core Architectural Principles
- Event-Driven First - All state mutations emit domain events for loose coupling
- Gamification as DNA - XP, achievements, and social proof embedded at the core
- Monetization-Ready - Tier-based access control and usage metering built-in
- Observable by Default - Structured logs, metrics, and traces at every layer
- Horizontally Scalable - Stateless services with event replay capability
- Secure by Design - Zero-trust architecture with defense in depth
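To make the "Event-Driven First" principle concrete, here is a minimal sketch of an event envelope that every state mutation could emit. The `DomainEvent` type, its field names, and the `NewEvent` helper are assumptions made for illustration; the document does not define an event schema here (see EVENT_CATALOG.md for the intended catalog).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// DomainEvent is a hypothetical envelope shared by all services.
// Field names are illustrative assumptions, not a schema from this document.
type DomainEvent struct {
	ID            string          `json:"id"`
	Type          string          `json:"type"` // e.g. "user.registered"
	AggregateID   string          `json:"aggregate_id"`
	CorrelationID string          `json:"correlation_id"`
	OccurredAt    time.Time       `json:"occurred_at"`
	Payload       json.RawMessage `json:"payload"`
}

// NewEvent serializes the payload and stamps the event with the current UTC time.
func NewEvent(id, eventType, aggregateID, correlationID string, payload any) (DomainEvent, error) {
	raw, err := json.Marshal(payload)
	if err != nil {
		return DomainEvent{}, fmt.Errorf("marshal payload: %w", err)
	}
	return DomainEvent{
		ID:            id,
		Type:          eventType,
		AggregateID:   aggregateID,
		CorrelationID: correlationID,
		OccurredAt:    time.Now().UTC(),
		Payload:       raw,
	}, nil
}

func main() {
	evt, err := NewEvent("evt-1", "user.registered", "user-42", "req-7",
		map[string]string{"email": "a@example.com"})
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(evt)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Carrying a correlation ID in the envelope would let the structured logs and traces named under "Observable by Default" be joined across services.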
C4 Model Architecture
Level 1: System Context Diagram
Level 2: Container Diagram
Level 3: Component Diagram - API Gateway
Service Boundaries
Core Services
1. API Gateway Service
Responsibility: Request routing, authentication, rate limiting, tier enforcement
Technology: Go + Gin framework
Port: 8080 (HTTP), 8443 (HTTPS)
Dependencies: Redis (sessions, rate limits), all backend services
Key Functions:
- JWT validation and session management
- Tier-based rate limiting (Free: 100 req/min, Pro: 1000 req/min, Team: 5000 req/min, Enterprise: unlimited)
- Feature flag evaluation
- Request/response logging with correlation IDs
- Circuit breaker for downstream services
- Prometheus metrics export
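The circuit breaker listed above could be sketched as follows. This is a simplified, dependency-free illustration, not the gateway's actual implementation: the `Breaker` type, the consecutive-failure threshold, and the cooldown are all assumptions. After `maxFailures` consecutive errors the breaker rejects calls until the cooldown has elapsed, then lets a trial call through.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrCircuitOpen is returned while the breaker is rejecting calls.
var ErrCircuitOpen = errors.New("circuit open")

// errUpstream is a stand-in for a failing downstream service.
var errUpstream = errors.New("upstream unavailable")

// Breaker is a hypothetical consecutive-failure circuit breaker.
type Breaker struct {
	mu          sync.Mutex
	maxFailures int
	cooldown    time.Duration
	failures    int
	openedAt    time.Time
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the breaker is open. Once the cooldown has passed,
// calls are allowed through again as trials; a success closes the breaker,
// a failure restarts the cooldown. Concurrent trial calls are not limited
// in this sketch.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	open := b.failures >= b.maxFailures && time.Since(b.openedAt) < b.cooldown
	b.mu.Unlock()
	if open {
		return ErrCircuitOpen
	}

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.maxFailures {
			b.openedAt = time.Now()
		}
		return err
	}
	b.failures = 0
	return nil
}

func main() {
	b := NewBreaker(2, 30*time.Second)
	failing := func() error { return errUpstream }
	for i := 1; i <= 3; i++ {
		fmt.Printf("call %d: %v\n", i, b.Call(failing))
	}
}
```

One breaker per downstream service would keep a failing Payment Service, for example, from tripping requests routed to the Knowledge Service.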
2. User Service
Responsibility: User identity, profiles, authentication, preferences
Technology: Go
Port: 8081
Dependencies: MongoDB (users collection), Redis (cache), Event Bus
Domain Events Emitted:
- user.registered
- user.profile.updated
- user.authenticated
- user.email.verified
Aggregates:
- User (root)
- UserProfile
- UserPreferences
3. Knowledge Service
Responsibility: Content management (areas, questions, discussions, insights, resources)
Technology: Go
Port: 8082
Dependencies: MongoDB (content collections), Redis (cache), Event Bus, Event Store
Domain Events Emitted:
- knowledge.area.created
- knowledge.question.posted
- knowledge.answer.submitted
- knowledge.discussion.started
- knowledge.insight.shared
- knowledge.resource.uploaded
- knowledge.content.liked
- knowledge.comment.added
Aggregates:
- Area (root)
- Question (root)
- Discussion (root)
- Insight (root)
- Resource (root)
4. Gamification Service
Responsibility: XP calculation, achievements, leaderboards, quests, streaks
Technology: Go
Port: 8083
Dependencies: MongoDB (gamification collections), Redis (leaderboards, cache), Event Bus, Event Store
Domain Events Subscribed:
- user.registered → Award welcome bonus
- knowledge.* → Calculate XP
- gamification.achievement.unlocked → Update user profile
Domain Events Emitted:
- gamification.xp.earned
- gamification.level.up
- gamification.achievement.unlocked
- gamification.quest.completed
- gamification.streak.milestone
Aggregates:
- UserGamificationProfile (root)
- Achievement
- Quest
- Leaderboard
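The Gamification Service subscribes to a wildcard pattern (`knowledge.*`) and turns matching events into XP. The sketch below shows one way that routing could work. The wildcard semantics (a trailing `.*` matches any event under the prefix) and every XP value are assumptions for illustration; the document does not specify either.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether eventType satisfies a subscription pattern.
// Assumed semantics: a trailing ".*" matches any non-empty suffix under
// the prefix; any other pattern must match exactly.
func matches(pattern, eventType string) bool {
	if strings.HasSuffix(pattern, ".*") {
		prefix := strings.TrimSuffix(pattern, "*") // keeps the trailing dot
		return strings.HasPrefix(eventType, prefix) && len(eventType) > len(prefix)
	}
	return pattern == eventType
}

// xpRules holds hypothetical XP awards per event type.
var xpRules = map[string]int{
	"knowledge.question.posted":  10,
	"knowledge.answer.submitted": 15,
	"knowledge.content.liked":    1,
}

// defaultKnowledgeXP is a hypothetical fallback for unlisted knowledge events.
const defaultKnowledgeXP = 2

// xpFor returns the XP to award for an event, or 0 if the event is not
// covered by the "knowledge.*" subscription.
func xpFor(eventType string) int {
	if !matches("knowledge.*", eventType) {
		return 0
	}
	if xp, ok := xpRules[eventType]; ok {
		return xp
	}
	return defaultKnowledgeXP
}

func main() {
	for _, t := range []string{
		"knowledge.answer.submitted",
		"knowledge.area.created",
		"user.registered",
	} {
		fmt.Printf("%s -> %d XP\n", t, xpFor(t))
	}
}
```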
5. Payment Service
Responsibility: Polar integration, subscription management, usage metering, tier enforcement
Technology: Go
Port: 8084
Dependencies: MongoDB (subscriptions collection), Redis (quota cache), Event Bus, Polar API
Domain Events Subscribed:
- user.registered → Create free tier subscription
- knowledge.* → Meter usage
- polar.webhook.* → Handle payment events
Domain Events Emitted:
- payment.subscription.created
- payment.subscription.upgraded
- payment.subscription.downgraded
- payment.subscription.cancelled
- payment.quota.exceeded
- payment.invoice.paid
Aggregates:
- Subscription (root)
- UsageMetrics
- Invoice
6. Real-Time Service
Responsibility: WebSocket connections, presence, live notifications, collaborative editing
Technology: Go + Gorilla WebSocket
Port: 8085
Dependencies: Redis (presence, pub/sub), Event Bus
Channels:
- /ws/notifications - User notifications
- /ws/presence - User presence tracking
- /ws/collaboration/:id - Live collaboration on content
- /ws/leaderboard - Real-time leaderboard updates
7. Background Worker Pool
Responsibility: Async job processing, scheduled tasks, email dispatch
Technology: Go with worker pool pattern
Dependencies: Event Bus, MongoDB, Email Service, Redis (job queue)
Job Types:
- Email notifications (transactional, digest)
- Analytics aggregation (daily, weekly)
- Content moderation queue processing
- Achievement calculation batch jobs
- Leaderboard recalculation
- Data exports
- Scheduled reminders
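The worker pool pattern named above can be illustrated with goroutines and channels. This is a minimal in-process sketch: the `Job` and `Result` types and the `RunPool` function are hypothetical, and a slice stands in for the Redis job queue.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of background work.
type Job struct {
	ID   int
	Kind string // e.g. "email", "export"
}

// Result records the outcome of one job.
type Result struct {
	JobID int
	Err   error
}

// RunPool processes jobs with a fixed number of workers and returns one
// Result per job. Result order is not guaranteed.
func RunPool(workers int, jobs []Job, handle func(Job) error) []Result {
	if workers < 1 {
		workers = 1
	}
	jobCh := make(chan Job)
	resCh := make(chan Result, len(jobs)) // buffered so workers never block on send

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobCh {
				resCh <- Result{JobID: j.ID, Err: handle(j)}
			}
		}()
	}

	for _, j := range jobs {
		jobCh <- j
	}
	close(jobCh) // signals workers to exit once the queue is drained
	wg.Wait()
	close(resCh)

	results := make([]Result, 0, len(jobs))
	for r := range resCh {
		results = append(results, r)
	}
	return results
}

func main() {
	jobs := []Job{{ID: 1, Kind: "email"}, {ID: 2, Kind: "digest"}, {ID: 3, Kind: "export"}}
	results := RunPool(2, jobs, func(j Job) error {
		fmt.Printf("processing job %d (%s)\n", j.ID, j.Kind)
		return nil
	})
	fmt.Println("completed:", len(results))
}
```

A fixed worker count bounds concurrency per instance, which fits the "Job queue > 500" autoscale trigger: backlog growth adds instances rather than unbounded goroutines.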
Data Flow Patterns
Write Path (Command Flow)
Read Path (Query Flow)
Event-Driven Gamification Flow
Deployment Architecture
Railway Deployment Strategy
Service Deployment Specifications
| Service | Instances | CPU | Memory | Autoscale Trigger |
|---|---|---|---|---|
| API Gateway | 2-10 | 0.5 | 512MB | CPU > 70% or RPS > 1000 |
| User Service | 2-5 | 0.25 | 256MB | CPU > 70% |
| Knowledge Service | 2-10 | 0.5 | 512MB | CPU > 70% or RPS > 500 |
| Gamification Service | 2-5 | 0.5 | 512MB | Event queue > 1000 |
| Payment Service | 2-3 | 0.25 | 256MB | CPU > 70% |
| Real-Time Service | 2-10 | 0.5 | 1GB | Active connections > 5000 |
| Background Workers | 2-5 | 0.5 | 512MB | Job queue > 500 |
| MongoDB | 1 (Replica Set) | 2 | 4GB | N/A (managed) |
| Redis | 1 (with persistence) | 1 | 2GB | N/A (managed) |
Technology Stack
Backend Services
- Language: Go 1.20+
- Web Framework: Gin (HTTP), Gorilla WebSocket (WSS)
- Service Communication: HTTP/REST initially, migrate to gRPC for inter-service calls
- Event Bus: Redis Streams (MVA), migrate to NATS JetStream for scale
Data Layer
- Primary Database: MongoDB 6.0+ (document store, flexible schema)
- Cache & Sessions: Redis 7.0+ (in-memory, pub/sub, streams)
- Event Store: MongoDB (separate collection for event sourcing)
Observability
- Structured Logging: Zap (high-performance JSON logs)
- Metrics: Prometheus client library
- Tracing: OpenTelemetry with Jaeger backend
- Dashboards: Grafana
- Alerting: Alertmanager
Infrastructure
- Deployment: Railway (PaaS)
- CI/CD: GitHub Actions
- Secrets: Railway environment variables
- Object Storage: Railway-integrated S3-compatible storage
External Integrations
- Payments: Polar (subscriptions, checkouts, webhooks)
- Email: SendGrid or Resend
- Analytics: PostHog or Mixpanel (optional)
Key Architectural Patterns
1. Event Sourcing (Selective)
Applied To: Gamification aggregates (XP, achievements), critical audit logs
Not Applied To: Operational CRUD (users, content)
Rationale: Full event sourcing adds complexity; a hybrid approach balances auditability with pragmatism.
2. CQRS (Command Query Responsibility Segregation)
Commands: Routed to service write models, emit events
Queries: Read from optimized MongoDB indexes and Redis cache
Rationale: Decouples write and read concerns, enables independent scaling.
3. Outbox Pattern
Implementation: MongoDB transactional outbox for reliable event publishing
Rationale: Ensures each event is recorded atomically with its state change and then published at least once.
4. Saga Pattern (Choreography)
Use Case: Multi-service workflows (e.g., subscription upgrade → update tier → recalculate quota)
Rationale: Distributed transactions via event choreography, avoiding tight coupling.
5. Strangler Fig Migration
Strategy: Gradually extract services from the monolith without a Big Bang rewrite
Order: Gamification → Payment → Real-Time → Knowledge decomposition
Scalability Strategy
Horizontal Scaling
- All services are stateless and can scale independently
- WebSocket connections use Redis pub/sub for cross-instance messaging
- MongoDB sharding by tenant/user for data partitioning
Caching Strategy
- L1 Cache: In-process LRU cache (10K entries, 5min TTL)
- L2 Cache: Redis (shared, 15min TTL)
- Cache Invalidation: Event-driven (clear on domain events)
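The two cache tiers and event-driven invalidation could fit together as in the sketch below. It is a simplified illustration under stated assumptions: both tiers are in-memory TTL maps (the second stands in for Redis), LRU eviction and the 10K entry limit are omitted, and the `TieredCache` type and its method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value   string
	expires time.Time
}

// ttlStore is a TTL-only map used for both tiers in this sketch.
type ttlStore struct {
	mu   sync.Mutex
	ttl  time.Duration
	data map[string]entry
}

func newTTLStore(ttl time.Duration) *ttlStore {
	return &ttlStore{ttl: ttl, data: make(map[string]entry)}
}

func (s *ttlStore) Get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.data[key]
	if !ok || time.Now().After(e.expires) {
		delete(s.data, key) // drop expired entries lazily
		return "", false
	}
	return e.value, true
}

func (s *ttlStore) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = entry{value: value, expires: time.Now().Add(s.ttl)}
}

func (s *ttlStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key)
}

// TieredCache reads through L1, then L2, then the loader (the database).
type TieredCache struct {
	l1, l2 *ttlStore
	load   func(key string) (string, error)
}

func NewTieredCache(load func(string) (string, error)) *TieredCache {
	return &TieredCache{
		l1:   newTTLStore(5 * time.Minute),  // L1 TTL from the list above
		l2:   newTTLStore(15 * time.Minute), // L2 TTL from the list above
		load: load,
	}
}

func (c *TieredCache) Get(key string) (string, error) {
	if v, ok := c.l1.Get(key); ok {
		return v, nil
	}
	if v, ok := c.l2.Get(key); ok {
		c.l1.Set(key, v) // promote to L1
		return v, nil
	}
	v, err := c.load(key)
	if err != nil {
		return "", err
	}
	c.l2.Set(key, v)
	c.l1.Set(key, v)
	return v, nil
}

// Invalidate is what a domain-event handler would call for an affected key.
func (c *TieredCache) Invalidate(key string) {
	c.l1.Delete(key)
	c.l2.Delete(key)
}

func main() {
	loads := 0
	c := NewTieredCache(func(key string) (string, error) {
		loads++
		return "value-for-" + key, nil
	})
	c.Get("area:1")
	c.Get("area:1")
	c.Invalidate("area:1") // e.g. on a content update event
	c.Get("area:1")
	fmt.Println("loader calls:", loads)
}
```

Because L1 lives inside each instance, invalidation in a multi-instance deployment would have to reach every instance, for example through the same event bus; this sketch covers a single process only.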
Rate Limiting
- Per-User: Token bucket algorithm in Redis
- Tier-Based: Free (100/min), Pro (1000/min), Team (5000/min), Enterprise (unlimited)
- Burst Allowance: 2x sustained rate for 10 seconds
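The token bucket and tier limits above can be sketched in-process as follows. This is an illustration, not the Redis-backed implementation the document calls for. The `Bucket` type and the burst sizing are assumptions: a capacity of 10 seconds' worth of tokens is one reading of "2x sustained rate for 10 seconds", since a client sending at twice the refill rate drains that capacity in 10 seconds. Time is passed in explicitly so the behaviour is deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// tierLimits lists sustained requests per minute; 0 means unlimited.
var tierLimits = map[string]int{
	"free":       100,
	"pro":        1000,
	"team":       5000,
	"enterprise": 0,
}

// burstSeconds sizes the bucket (an assumption, see the lead-in).
const burstSeconds = 10.0

// Bucket is a hypothetical per-user token bucket.
type Bucket struct {
	unlimited bool
	rate      float64       // tokens added per second
	capacity  float64       // maximum stored tokens
	tokens    float64       // currently available tokens
	last      time.Duration // time of last refill, as an offset from bucket creation
}

func NewBucket(tier string) (*Bucket, error) {
	perMin, ok := tierLimits[tier]
	if !ok {
		return nil, fmt.Errorf("unknown tier %q", tier)
	}
	if perMin == 0 {
		return &Bucket{unlimited: true}, nil
	}
	rate := float64(perMin) / 60.0
	capacity := rate * burstSeconds
	return &Bucket{rate: rate, capacity: capacity, tokens: capacity}, nil
}

// AllowAt refills the bucket up to time now and consumes one token if available.
func (b *Bucket) AllowAt(now time.Duration) bool {
	if b.unlimited {
		return true
	}
	if now > b.last {
		b.tokens += (now - b.last).Seconds() * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b, err := NewBucket("free")
	if err != nil {
		panic(err)
	}
	allowed := 0
	for i := 0; i < 20; i++ {
		if b.AllowAt(0) {
			allowed++
		}
	}
	fmt.Println("allowed in initial burst of 20:", allowed)
	fmt.Println("allowed one second later:", b.AllowAt(time.Second))
}
```

In a Redis-backed version the refill-and-consume step would need to run atomically per key, since multiple gateway instances share the same bucket.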
Security Architecture
Authentication
- Method: JWT with RS256 signing
- Token Lifetime: Access token (15min), Refresh token (7 days)
- Storage: Access token in memory, refresh token in HTTP-only cookie
Authorization
- Model: RBAC (Role-Based Access Control) + Tier-Based
- Roles: User, Moderator, Admin
- Tiers: Free, Pro, Team, Enterprise
- Enforcement: Middleware at API Gateway + service-level validation
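A combined role and tier check might look like the sketch below. It assumes that both roles and tiers are strictly ordered (Admin includes Moderator permissions, Enterprise includes Team features, and so on), which the document does not state. The `Principal`, `Requirement`, and `Authorize` names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Role values are ordered from least to most privileged (an assumption).
type Role int

const (
	RoleUser Role = iota
	RoleModerator
	RoleAdmin
)

// Tier values are ordered from lowest to highest plan (an assumption).
type Tier int

const (
	TierFree Tier = iota
	TierPro
	TierTeam
	TierEnterprise
)

var (
	ErrInsufficientRole = errors.New("insufficient role")
	ErrTierRequired     = errors.New("higher subscription tier required")
)

// Principal is the authenticated caller, e.g. derived from JWT claims.
type Principal struct {
	Role Role
	Tier Tier
}

// Requirement describes what an endpoint or operation demands.
type Requirement struct {
	MinRole Role
	MinTier Tier
}

// Authorize checks the role first, then the tier, so callers can tell a
// permission problem apart from an upgrade prompt.
func Authorize(p Principal, r Requirement) error {
	if p.Role < r.MinRole {
		return ErrInsufficientRole
	}
	if p.Tier < r.MinTier {
		return ErrTierRequired
	}
	return nil
}

func main() {
	export := Requirement{MinRole: RoleUser, MinTier: TierPro} // hypothetical feature gate
	fmt.Println(Authorize(Principal{Role: RoleUser, Tier: TierFree}, export))
	fmt.Println(Authorize(Principal{Role: RoleUser, Tier: TierTeam}, export))
}
```

The same function could run in gateway middleware and again inside a service, matching the two enforcement points listed above.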
Data Protection
- At Rest: MongoDB encryption at rest (AES-256)
- In Transit: TLS 1.3 for all external and inter-service communication
- Secrets: Railway environment variables, no secrets in code
API Security
- CORS: Whitelist configured domains only
- CSRF: SameSite cookies + CSRF tokens for state-changing operations
- Input Validation: JSON schema validation at API Gateway
- SQL/NoSQL Injection: Parameterized queries, MongoDB query sanitization
Migration Path from Current State
Phase 1: Foundation (TASKSET 1-2)
- ✅ Create documentation (current)
- Implement event bus infrastructure
- Add observability primitives
Phase 2: Event-Driven Refactor (TASKSET 3)
- Refactor existing services to emit domain events
- Implement outbox pattern
- Add event replay capability
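The outbox pattern planned for this phase can be illustrated with an in-memory stand-in. In the sketch below a mutex-protected critical section plays the role of a single MongoDB transaction, and a callback plays the role of the event bus. The `Store`, `OutboxEntry`, and `Relay` names are hypothetical. The state change and the outbox row are written together, and a separate relay step publishes pending rows, marking each as published only after the publish succeeds.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBusDown simulates an unavailable event bus.
var errBusDown = errors.New("event bus unavailable")

// OutboxEntry is a pending or published event row.
type OutboxEntry struct {
	ID        int
	EventType string
	UserID    string
	Amount    int
	Published bool
}

// Store holds state and the outbox side by side.
type Store struct {
	mu     sync.Mutex
	xp     map[string]int
	outbox []OutboxEntry
}

func NewStore() *Store {
	return &Store{xp: make(map[string]int)}
}

// AwardXP updates state and appends the event in one critical section,
// standing in for a single database transaction.
func (s *Store) AwardXP(userID string, amount int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.xp[userID] += amount
	s.outbox = append(s.outbox, OutboxEntry{
		ID:        len(s.outbox) + 1,
		EventType: "gamification.xp.earned",
		UserID:    userID,
		Amount:    amount,
	})
}

// XP returns the current total for a user.
func (s *Store) XP(userID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.xp[userID]
}

// Relay publishes pending entries in order and stops at the first failure,
// leaving the remainder for the next run. Delivery is at least once, so
// consumers should be idempotent. The lock is held while publishing to keep
// the sketch short.
func (s *Store) Relay(publish func(OutboxEntry) error) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	sent := 0
	for i := range s.outbox {
		if s.outbox[i].Published {
			continue
		}
		if err := publish(s.outbox[i]); err != nil {
			return sent, err
		}
		s.outbox[i].Published = true
		sent++
	}
	return sent, nil
}

func main() {
	s := NewStore()
	s.AwardXP("user-42", 10)
	s.AwardXP("user-42", 5)

	sent, err := s.Relay(func(OutboxEntry) error { return errBusDown })
	fmt.Println("first relay:", sent, err)

	sent, err = s.Relay(func(e OutboxEntry) error {
		fmt.Printf("published %s #%d\n", e.EventType, e.ID)
		return nil
	})
	fmt.Println("second relay:", sent, err)
}
```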
Phase 3: Service Extraction (TASKSET 4-5)
- Extract Gamification Service
- Extract Payment Service
- Integrate Polar
Phase 4: Real-Time Layer (TASKSET 6-7)
- Implement WebSocket service
- Add background worker pool
- Build notification system
Phase 5: Production Hardening (TASKSET 8-10)
- Complete observability
- Load testing and optimization
- Security hardening
- Launch readiness
Architectural Decision Records (ADRs)
See .claude/.ADR for the complete decision history. Key decisions:
- Redis Streams over NATS for MVA - Simpler setup, good enough for initial scale
- Hybrid Event Sourcing - Only for gamification aggregates and critical audit logs, not for operational CRUD
- MongoDB for Event Store - Leverage existing database, avoid operational complexity
- Strangler Fig Migration - Gradual service extraction, minimize risk
- Go for All Services - Type safety, performance, operational simplicity
- Railway for Deployment - Managed infrastructure, faster time to market
Risks and Mitigations
| Risk | Impact | Probability | Mitigation |
|---|---|---|---|
| Event bus bottleneck | High | Medium | Monitor latency, ready to migrate to NATS |
| MongoDB connection exhaustion | High | Low | Connection pooling, query optimization |
| WebSocket scaling challenges | Medium | Medium | Redis pub/sub, sticky sessions with consistent hashing |
| Polar API downtime | High | Low | Circuit breaker, retry with exponential backoff |
| Complex event choreography | Medium | High | Saga visualization tools, thorough testing |
| Over-engineering | Medium | Medium | Start simple, add complexity only when needed |
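The "retry with exponential backoff" mitigation listed for Polar API downtime could be sketched as below. The `Retry` and `backoffDelay` functions, the base delay, and the cap are assumptions for illustration; jitter, which is commonly added to avoid synchronized retries, is left out to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient simulates a temporary upstream failure.
var errTransient = errors.New("transient upstream error")

// backoffDelay returns base * 2^attempt, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max // return early so d never overflows
		}
	}
	if d > max {
		return max
	}
	return d
}

// Retry calls fn up to attempts times, sleeping with exponential backoff
// between failures. It returns nil on the first success, or the last error
// wrapped with the attempt count.
func Retry(attempts int, base, max time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(backoffDelay(i, base, max))
		}
	}
	return fmt.Errorf("after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := Retry(5, 10*time.Millisecond, 200*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Retries pair naturally with the circuit breaker in the same table row: the breaker stops retry storms once the upstream is clearly down.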
Next Steps
- Proceed to TASKSET 1 documents: Functional requirements, non-functional requirements, event catalog
- Review and validate architectural decisions with stakeholders
- Begin TASKSET 2: Infrastructure implementation (event bus, observability)
- Iterative refinement based on implementation learnings
Document Status: ✅ Complete
Review Required: Architecture review before proceeding to implementation
Related Documents: FUNCTIONAL_REQUIREMENTS.md, EVENT_CATALOG.md, DATA_MODELS.md