Multi-Platform Django Scraper Implementation Guide

Overview

This guide shows how to transform your single-platform Notion scraper into a multi-platform architecture that supports multiple data sources with tiered monetization.

Key Features

🏗️ Multi-Platform Architecture

Support for multiple platforms (Notion, Airtable, Monday.com, etc.)
Grouped datasets per platform
Unified API interface across platforms

💰 Tiered Monetization Strategy

Free Tier: 1,000 requests/month, 1 platform, 1 dataset
Basic Tier: 10,000 requests/month, 2 platforms, 3 datasets per platform
Pro Tier: 50,000 requests/month, 5 platforms, 10 datasets per platform
Enterprise Tier: 200,000 requests/month, unlimited platforms and datasets
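
The tier limits above can be captured in a single lookup table. The following is a minimal sketch rather than the project's actual model: the TierLimits dataclass, the TIER_LIMITS mapping, and the is_within_limit helper are hypothetical names, the Free tier's single dataset is treated as one dataset per platform, and -1 is assumed to mean "unlimited" (the same convention the subscription response uses for the Enterprise tier).

```python
from dataclasses import dataclass

UNLIMITED = -1  # assumed sentinel for "no limit"


@dataclass(frozen=True)
class TierLimits:
    monthly_requests: int
    platforms_limit: int
    datasets_per_platform: int


# Values copied from the tier list; the structure itself is an assumption.
TIER_LIMITS = {
    "free": TierLimits(1_000, 1, 1),
    "basic": TierLimits(10_000, 2, 3),
    "pro": TierLimits(50_000, 5, 10),
    "enterprise": TierLimits(200_000, UNLIMITED, UNLIMITED),
}


def is_within_limit(current: int, limit: int) -> bool:
    """Return True if one more item may be added without exceeding `limit`."""
    return limit == UNLIMITED or current < limit
```

Keeping the limits in one place means the enforcement code and the subscription endpoint can read the same numbers.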

🔒 Access Control & Usage Tracking

Per-user subscription management
Request counting and rate limiting
Feature-based access control
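
Request counting can be illustrated without any framework code. The sketch below is an assumption-laden example, not the guide's implementation: UsageTracker is a hypothetical in-memory class, and a real deployment would persist the counts in the database.

```python
from collections import defaultdict
from datetime import date


class UsageTracker:
    """In-memory monthly request counter (hypothetical sketch)."""

    def __init__(self, monthly_limit: int):
        self.monthly_limit = monthly_limit
        # Maps (user_id, "YYYY-MM") to the number of requests recorded.
        self._counts = defaultdict(int)

    def record_request(self, user_id: int, today: date) -> bool:
        """Count one request; return False if the monthly limit is already reached."""
        key = (user_id, today.strftime("%Y-%m"))
        if self._counts[key] >= self.monthly_limit:
            return False
        self._counts[key] += 1
        return True
```

Keying the counter by calendar month is one design choice; resetting a single counter on a schedule, as the cron job later in this guide does, is another.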

Implementation Steps

1. Database Setup

# Run migrations
python manage.py makemigrations scraper
python manage.py migrate

# Set up initial tiers and platforms
python manage.py setup_tiers

2. Settings Configuration

Add the following to your settings module:

# vault/settings/base.py

MIDDLEWARE = [
    # ... existing middleware
    # Keep this after AuthenticationMiddleware if it relies on request.user
    'scraper.middleware.tier_enforcement.TierEnforcementMiddleware',
]

# Add scraper settings
SCRAPER_SETTINGS = {
    'DEFAULT_CACHE_TIMEOUT': 3600,  # 1 hour
    'MAX_CONCURRENT_REQUESTS': 10,
    'ENABLE_USAGE_TRACKING': True,
    'ENABLE_TIER_ENFORCEMENT': True,
}

3. Platform Configuration Examples

Setting up a Notion Platform

from core.models import Platform, PlatformDataset

# Create Notion platform
notion_platform = Platform.objects.create(
    name="My Notion Workspace",
    platform_type="notion",
    api_config={
        "notion_token": "secret_xxx...",
        "notion_version": "2022-06-28",
        "timeout": 30
    }
)

# Create datasets for this platform
features_dataset = PlatformDataset.objects.create(
    platform=notion_platform,
    name="Product Features",
    description="Main product features database",
    database_config={
        "feature_db_id": "abc123...",
        "tier_db_id": "def456...",
        "gateway_db_id": "ghi789...",
        "tag_db_id": "jkl012...",
        "keyword_db_id": "mno345..."
    }
)

analytics_dataset = PlatformDataset.objects.create(
    platform=notion_platform,
    name="Analytics Data",
    description="Analytics and metrics database",
    database_config={
        "analytics_db_id": "pqr678...",
        "metrics_db_id": "stu901..."
    }
)

Setting up User Subscriptions

from scraper.services.subscription_manager import SubscriptionManager

# These methods are coroutines, so call them from async code
# (for example, inside an async view).

# Create free subscription for new user
subscription = await SubscriptionManager.create_free_subscription(user)

# Upgrade user to Pro tier
result = await SubscriptionManager.upgrade_subscription(user, 'pro')

4. API Usage Examples

Get User’s Available Platforms

GET /api/scraper/platforms/

Response:
{
    "success": true,
    "data": {
        "platforms": [
            {
                "id": 1,
                "name": "My Notion Workspace",
                "platform_type": "notion",
                "datasets": [
                    {
                        "id": 1,
                        "name": "Product Features",
                        "description": "Main product features database",
                        "is_active": true
                    }
                ]
            }
        ],
        "usage_info": {
            "tier": "pro",
            "requests_used": 1250,
            "requests_limit": 50000,
            "requests_remaining": 48750,
            "usage_percentage": 2.5
        }
    }
}
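
The usage_info fields in this response follow from two counters. A minimal sketch of the arithmetic, assuming a hypothetical helper named build_usage_info and a positive monthly limit (unlimited quotas are not handled here):

```python
from typing import Dict, Union


def build_usage_info(
    tier: str, requests_used: int, requests_limit: int
) -> Dict[str, Union[str, int, float]]:
    """Derive the remaining quota and percentage used from the raw counters."""
    # Clamp at zero so an over-limit account never reports a negative remainder.
    remaining = max(requests_limit - requests_used, 0)
    # Guard against division by zero for a misconfigured (zero) limit.
    if requests_limit > 0:
        percentage = round(requests_used * 100 / requests_limit, 2)
    else:
        percentage = 0.0
    return {
        "tier": tier,
        "requests_used": requests_used,
        "requests_limit": requests_limit,
        "requests_remaining": remaining,
        "usage_percentage": percentage,
    }
```

Calling build_usage_info("pro", 1250, 50000) reproduces the values shown in the response: 48750 requests remaining and 2.5 percent used.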

Get Features from Specific Dataset

GET /api/scraper/datasets/1/features/

Response (other feature fields omitted for brevity):
{
    "success": true,
    "data": {
        "dataset_info": {
            "id": 1,
            "name": "Product Features",
            "platform": "My Notion Workspace"
        },
        "features": [
            {
                "id": "feature_123",
                "name": "User Authentication",
                "status": "active",
                "tier": "basic"
            }
        ]
    }
}

Check Subscription Status

GET /api/scraper/subscription/

Response:
{
    "success": true,
    "data": {
        "tier": "pro",
        "is_active": true,
        "expires_at": null,
        "usage": {
            "requests_used": 1250,
            "requests_limit": 50000,
            "requests_remaining": 48750,
            "usage_percentage": 2.5
        },
        "platforms_count": 3,
        "tier_limits": {
            "monthly_requests": 50000,
            "platforms_limit": 5,
            "datasets_per_platform": 10,
            "can_use_analytics": true,
            "can_export": true
        },
        "upgrade_options": {
            "enterprise": {
                "monthly_requests": 200000,
                "platforms_limit": -1,
                "datasets_per_platform": -1
            }
        }
    }
}
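
The upgrade_options field lists only the tiers above the user's current one. One way to derive that list, sketched under the assumption that tiers are strictly ordered (TIER_ORDER and available_upgrades are hypothetical names):

```python
from typing import List

# Assumed ordering, lowest to highest, taken from the tier list in this guide.
TIER_ORDER = ["free", "basic", "pro", "enterprise"]


def available_upgrades(current_tier: str) -> List[str]:
    """Return every tier ranked above `current_tier`, lowest first."""
    # list.index raises ValueError for an unknown tier name.
    position = TIER_ORDER.index(current_tier)
    return TIER_ORDER[position + 1:]
```

For a Pro subscriber this yields only "enterprise", which matches the response above; an Enterprise subscriber gets an empty list.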

5. Adding New Platform Support

To add support for a new platform (e.g., Airtable):

Create Repository Class:

# v01t.io/api/scraper/repository/airtable.py

from .base import BaseRepository
from typing import Dict, List, Any

class AirtableRepository(BaseRepository):
    def __init__(self, api_key: str, base_id: str, **kwargs):
        self.api_key = api_key
        self.base_id = base_id

    async def get_features(self, **kwargs) -> List[Dict[str, Any]]:
        # Implement Airtable-specific feature fetching
        raise NotImplementedError

    async def get_gateways(self, **kwargs) -> List[Dict[str, Any]]:
        # Implement Airtable-specific gateway fetching
        raise NotImplementedError

    # ... implement other required methods

Register in Platform Manager:

# Update vault/api/scraper/services/platform_manager.py

PLATFORM_REPOSITORIES = {
    'notion': NotionRepository,
    'airtable': AirtableRepository,  # Add this line
    # 'monday': MondayRepository,
}
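
A registry like this is usually paired with a lookup that fails clearly for unregistered platform types. A minimal sketch, assuming a hypothetical get_repository_class helper and using empty placeholder classes in place of the real repositories:

```python
from typing import Dict


class NotionRepository:
    """Placeholder for the real Notion repository."""


class AirtableRepository:
    """Placeholder for the real Airtable repository."""


PLATFORM_REPOSITORIES: Dict[str, type] = {
    "notion": NotionRepository,
    "airtable": AirtableRepository,
}


def get_repository_class(platform_type: str) -> type:
    """Look up the repository class registered for `platform_type`."""
    try:
        return PLATFORM_REPOSITORIES[platform_type]
    except KeyError:
        supported = ", ".join(sorted(PLATFORM_REPOSITORIES))
        raise ValueError(
            f"Unsupported platform type {platform_type!r}; supported: {supported}"
        ) from None
```

Raising a ValueError that names the supported types gives a clearer failure than a bare KeyError when a platform type exists in the model choices but has no repository yet.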

Update Platform Type Choices:

# In vault/api/scraper/models/platform.py

from django.db import models

class PlatformType(models.TextChoices):
    NOTION = 'notion', 'Notion'
    AIRTABLE = 'airtable', 'Airtable'
    MONDAY = 'monday', 'Monday.com'
    ASANA = 'asana', 'Asana'

6. Revenue Analytics

# Generate revenue projections
from scraper.utils.monetization import calculate_monthly_revenue_projection

subscriptions_by_tier = {
    'free': 1000,
    'basic': 150,
    'pro': 75,
    'enterprise': 10
}

revenue_data = calculate_monthly_revenue_projection(subscriptions_by_tier)
# Returns projected monthly revenue breakdown
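
The guide does not show how calculate_monthly_revenue_projection works or what it returns. As an illustration only, a projection of this kind can be a per-tier multiplication. In the sketch below, project_monthly_revenue is a hypothetical function and the prices are placeholders invented for the example, not figures from this guide.

```python
from typing import Dict

# PLACEHOLDER monthly prices, invented for illustration; substitute real pricing.
HYPOTHETICAL_PRICES: Dict[str, int] = {
    "free": 0,
    "basic": 10,
    "pro": 50,
    "enterprise": 200,
}


def project_monthly_revenue(
    subscriptions_by_tier: Dict[str, int],
    prices: Dict[str, int] = HYPOTHETICAL_PRICES,
) -> Dict[str, object]:
    """Multiply subscriber counts by per-tier prices and total the result."""
    by_tier = {
        tier: count * prices[tier]
        for tier, count in subscriptions_by_tier.items()
    }
    return {"by_tier": by_tier, "total": sum(by_tier.values())}
```

Passing the prices as a parameter keeps the arithmetic testable and makes it easy to compare pricing scenarios.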

7. Monitoring and Maintenance

Reset Monthly Usage (Cron Job)

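# Add to crontab to run at 00:00 on the 1st of each month.
# Cron runs with a minimal environment, so change into the project
# directory first (replace the placeholder path with your own).
0 0 1 * * cd /path/to/project && python manage.py reset_monthly_usage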

Monitor Platform Health

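# Check the health of every active platform and dataset
import asyncio

from core.models import Platform
from scraper.services.platform_manager import PlatformManager

async def check_platform_health():
    manager = PlatformManager()
    # Async queryset iteration requires Django 4.1 or later
    async for platform in Platform.objects.filter(is_active=True):
        async for dataset in platform.datasets.filter(is_active=True):
            repo = await manager.get_platform_repository(platform, dataset)
            health = await repo.health_check()
            print(f"{platform.name} - {dataset.name}: {health['status']}")

asyncio.run(check_platform_health())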
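
Because each health check is an independent network call, the checks can also run concurrently. The following is a sketch under stated assumptions: StubRepository and check_all are hypothetical names, and the only thing taken from the guide is that health_check is a coroutine returning a dict with a "status" key.

```python
import asyncio
from typing import Dict


class StubRepository:
    """Stand-in repository whose health check succeeds or raises."""

    def __init__(self, healthy: bool):
        self._healthy = healthy

    async def health_check(self) -> Dict[str, str]:
        if not self._healthy:
            raise ConnectionError("upstream unavailable")
        return {"status": "healthy"}


async def check_all(repos: Dict[str, StubRepository]) -> Dict[str, str]:
    """Run every health check concurrently and map each name to a status."""
    names = list(repos)
    # return_exceptions=True keeps one failing platform from cancelling the rest.
    results = await asyncio.gather(
        *(repos[name].health_check() for name in names),
        return_exceptions=True,
    )
    return {
        name: "unhealthy" if isinstance(result, Exception) else result["status"]
        for name, result in zip(names, results)
    }


if __name__ == "__main__":
    print(asyncio.run(check_all({"notion": StubRepository(True)})))
```

With many platforms, total latency then approaches that of the slowest check instead of the sum of all checks.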

Migration Strategy

Phase 1: Backward Compatibility

Keep existing single-platform endpoints working
Add new multi-platform endpoints alongside
Gradually migrate users to new API

Phase 2: Data Migration

# Create migration script to convert existing setup
from django.conf import settings

from core.models import Platform, PlatformDataset

# Create default platform for existing users
default_platform = Platform.objects.create(
    name="Legacy Notion Workspace",
    platform_type="notion",
    api_config={
        "notion_token": settings.NOTION_TOKEN,
        # ... other existing config
    }
)

# Create default dataset
default_dataset = PlatformDataset.objects.create(
    platform=default_platform,
    name="Default Dataset",
    database_config={
        "feature_db_id": settings.FEATURE_DB_ID,
        # ... other existing database IDs
    }
)

Phase 3: Full Migration

Deprecate old endpoints
Force all users to new multi-platform system
Remove legacy code

Security Considerations

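API Token Storage: Encrypt platform API tokens at rest. Django has no built-in field encryption, so use a dedicated library (for example, Fernet from the cryptography package) or an external secrets manager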
Rate Limiting: Implement per-user rate limiting to prevent abuse
Access Control: Ensure users can only access their authorized datasets
Audit Logging: Log all API requests for monitoring and debugging
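
Per-user rate limiting can take several forms; the guide does not say which one it uses. The sketch below shows a simple fixed-window limiter as one option. FixedWindowRateLimiter is a hypothetical class, state is kept in memory for illustration, and the clock is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `max_requests` per user in each `window_seconds` window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        # Maps user id to (window start time, requests counted in that window).
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, user_id: str) -> bool:
        """Record a request and return True, or return False if over the limit."""
        now = self._clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired; start a new one
        if count >= self.max_requests:
            self._windows[user_id] = (start, count)
            return False
        self._windows[user_id] = (start, count + 1)
        return True
```

A fixed window is easy to reason about but allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.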

Performance Optimizations

Caching: Cache repository instances and frequently accessed data
Connection Pooling: Reuse HTTP connections for external API calls
Async Processing: Use async/await for all external API calls
Background Tasks: Use Celery for heavy data processing operations
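
Caching repository instances can be as simple as a keyed cache with an expiry. A minimal sketch, assuming a hypothetical TTLCache class and reusing the one-hour timeout from DEFAULT_CACHE_TIMEOUT as its default; it is single-process and not thread-safe.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache values per key and rebuild them once they are older than the TTL."""

    def __init__(
        self,
        ttl_seconds: float = 3600,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        # Maps key to (time stored, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_create(self, key: Hashable, factory: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `factory` if it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        value = factory()
        self._entries[key] = (now, value)
        return value
```

A key such as (platform id, dataset id) would let each platform and dataset pair share one repository instance until the entry expires.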

Testing Strategy

# v01t.io/api/scraper/tests/test_multiplatform.py

import pytest
from django.test import TestCase
from django.contrib.auth.models import User
from core.models import Platform, PlatformDataset, UserSubscription

class MultiPlatformTestCase(TestCase):
    def setUp(self):
        self.user = User.objects.create_user('testuser', 'test@example.com')
        self.platform = Platform.objects.create(
            name="Test Platform",
            platform_type="notion",
            api_config={"token": "test_token"}
        )

    async def test_user_can_access_authorized_dataset(self):
        # Test access control logic
        pass

    async def test_usage_tracking_increments_correctly(self):
        # Test usage tracking
        pass

This architecture provides a solid foundation for scaling your scraper to support multiple platforms while implementing a clear monetization strategy through tiered subscriptions.
