Developing Cloud Storage Providers

This guide explains how to implement new cloud storage providers for Fermentrack 2's backup system. The cloud storage module uses an abstract provider pattern that allows adding support for services like OneDrive, Google Drive, Box, or any OAuth2-compatible cloud storage service.

Architecture Overview

The cloud storage system consists of these components:

fermentrack/cloud_storage/
├── __init__.py
├── apps.py
├── models.py              # CloudStorageConnection, CloudBackupFile
├── admin.py
├── tasks.py               # Celery upload task
├── providers/
│   ├── __init__.py        # Provider factory & registry
│   ├── base.py            # AbstractCloudStorageProvider
│   └── dropbox.py         # Dropbox implementation
└── api/
    ├── __init__.py
    ├── serializers.py
    ├── views.py
    └── urls.py

Key Concepts:

  • AbstractCloudStorageProvider: Base class all providers must inherit from
  • CloudStorageConnection: Database model storing OAuth tokens (encrypted)
  • CloudBackupFile: Tracks individual uploads to cloud storage
  • Provider Factory: Maps provider IDs to implementation classes

The Abstract Provider Interface

All providers must inherit from AbstractCloudStorageProvider and implement the required abstract methods. Here's the interface:

class AbstractCloudStorageProvider(ABC):
    # Required class attributes
    PROVIDER_ID: str = NotImplemented   # e.g., 'onedrive'
    DISPLAY_NAME: str = NotImplemented  # e.g., 'OneDrive'

    def __init__(self, connection: Optional["CloudStorageConnection"] = None):
        """Initialize with optional existing connection."""
        self.connection = connection

    @classmethod
    @abstractmethod
    def is_enabled(cls) -> bool:
        """Return True if credentials are configured in settings."""
        pass

    @abstractmethod
    def get_authorization_url(self, state: str, redirect_uri: str) -> str:
        """Generate OAuth authorization URL for user to visit."""
        pass

    @abstractmethod
    def exchange_code_for_tokens(self, code: str, redirect_uri: str) -> dict:
        """Exchange authorization code for access/refresh tokens."""
        pass

    @abstractmethod
    def refresh_access_token(self, refresh_token: str) -> dict:
        """Refresh an expired access token."""
        pass

    @abstractmethod
    def upload_file(
        self,
        file_obj: BinaryIO,
        remote_path: str,
        file_size: Optional[int] = None
    ) -> UploadResult:
        """Upload a file to cloud storage."""
        pass

    @abstractmethod
    def get_account_info(self, access_token: str) -> dict:
        """Fetch user account information from provider."""
        pass

    # Optional override
    def get_default_upload_path(self, filename: str) -> str:
        """Generate default upload path for a backup file."""
        return f"/Fermentrack Backups/{filename}"

Exception Classes

The base module provides these exception classes for error handling:

class CloudStorageError(Exception):
    """Base exception for cloud storage operations."""
    pass

class OAuthError(CloudStorageError):
    """Raised when OAuth flow fails."""
    pass

class UploadError(CloudStorageError):
    """Raised when file upload fails."""
    pass

class TokenExpiredError(CloudStorageError):
    """Raised when access token is expired and cannot be refreshed."""
    pass

Use these exceptions in your provider implementation to ensure consistent error handling throughout the system.
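As a rough illustration of why the shared hierarchy helps, the sketch below shows a caller catching the most specific exception first and falling back to the base class. The exception classes are local stand-ins so the snippet runs on its own, and run_upload and its return strings are hypothetical names, not part of Fermentrack.

```python
class CloudStorageError(Exception):
    """Stand-in for the base exception described above."""


class OAuthError(CloudStorageError):
    """Stand-in: OAuth flow failed."""


class UploadError(CloudStorageError):
    """Stand-in: file upload failed."""


class TokenExpiredError(CloudStorageError):
    """Stand-in: token expired and could not be refreshed."""


def run_upload(upload) -> str:
    """Call upload() and translate failures into a short status string.

    Hypothetical helper: the status strings are illustrative only.
    """
    try:
        upload()
    except TokenExpiredError:
        # The user has to reconnect the account before retrying.
        return "reconnect-required"
    except UploadError:
        return "upload-failed"
    except CloudStorageError:
        # Catch-all for anything else raised by a provider.
        return "storage-error"
    return "ok"
```

Because every provider raises subclasses of CloudStorageError, a caller written this way should not need provider-specific except clauses.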

UploadResult Dataclass

The upload_file method must return an UploadResult instance:

@dataclass
class UploadResult:
    success: bool                 # True if upload succeeded
    remote_path: str              # Path where file was stored
    file_id: Optional[str]        # Provider-specific file identifier
    file_size: Optional[int]      # Size of uploaded file in bytes
    error_message: Optional[str]  # Error details if success=False
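A minimal sketch of how the two outcomes might be represented, assuming the fields exactly as listed above (no defaults, so every field is passed explicitly). The dataclass is re-declared locally so the snippet is self-contained, and the helper names successful_upload and failed_upload are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadResult:
    """Local copy of the dataclass shown above, for illustration only."""
    success: bool
    remote_path: str
    file_id: Optional[str]
    file_size: Optional[int]
    error_message: Optional[str]


def successful_upload(remote_path: str, file_id: str, size: int) -> UploadResult:
    """Build a result for a completed upload (hypothetical helper)."""
    return UploadResult(
        success=True,
        remote_path=remote_path,
        file_id=file_id,
        file_size=size,
        error_message=None,
    )


def failed_upload(remote_path: str, reason: str) -> UploadResult:
    """Build a result for a failed upload (hypothetical helper)."""
    return UploadResult(
        success=False,
        remote_path=remote_path,
        file_id=None,
        file_size=None,
        error_message=reason,
    )
```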

Step-by-Step: Creating a New Provider

This section walks through creating a hypothetical OneDrive provider.

Step 1: Create the Provider File

Create a new file fermentrack/cloud_storage/providers/onedrive.py:

"""
OneDrive cloud storage provider implementation.
"""

import logging
from typing import BinaryIO, Optional
from urllib.parse import urlencode

import requests
from django.conf import settings

from .base import (
    AbstractCloudStorageProvider,
    CloudStorageError,
    OAuthError,
    TokenExpiredError,
    UploadError,
    UploadResult,
)

logger = logging.getLogger(__name__)


class OneDriveProvider(AbstractCloudStorageProvider):
    """
    OneDrive provider implementation.

    Uses Microsoft Graph API for file operations.

    OAuth Documentation: https://docs.microsoft.com/en-us/onedrive/developer/rest-api/
    """

    PROVIDER_ID = 'onedrive'
    DISPLAY_NAME = 'OneDrive'

    # Microsoft OAuth endpoints
    AUTHORIZE_URL = 'https://login.microsoftonline.com/common/oauth2/v2.0/authorize'
    TOKEN_URL = 'https://login.microsoftonline.com/common/oauth2/v2.0/token'

    # Microsoft Graph API endpoints
    GRAPH_URL = 'https://graph.microsoft.com/v1.0'
    UPLOAD_URL = f'{GRAPH_URL}/me/drive/root:/Fermentrack Backups'
    USER_INFO_URL = f'{GRAPH_URL}/me'

    @classmethod
    def is_enabled(cls) -> bool:
        """Check if OneDrive credentials are configured."""
        return getattr(settings, 'ONEDRIVE_ENABLED', False)

    def get_authorization_url(self, state: str, redirect_uri: str) -> str:
        """Generate Microsoft OAuth authorization URL."""
        if not self.is_enabled():
            raise OAuthError("OneDrive provider is not enabled")

        params = {
            'client_id': settings.ONEDRIVE_CLIENT_ID,
            'redirect_uri': redirect_uri,
            'response_type': 'code',
            'state': state,
            # Files.ReadWrite is used because upload_file() writes under the
            # drive root. Files.ReadWrite.AppFolder only covers the app's own
            # folder (/me/drive/special/approot) and would need a different
            # upload URL.
            'scope': 'Files.ReadWrite offline_access User.Read',
            'response_mode': 'query',
        }

        return f"{self.AUTHORIZE_URL}?{urlencode(params)}"

    def exchange_code_for_tokens(self, code: str, redirect_uri: str) -> dict:
        """Exchange authorization code for Microsoft tokens."""
        if not self.is_enabled():
            raise OAuthError("OneDrive provider is not enabled")

        data = {
            'client_id': settings.ONEDRIVE_CLIENT_ID,
            'client_secret': settings.ONEDRIVE_CLIENT_SECRET,
            'code': code,
            'redirect_uri': redirect_uri,
            'grant_type': 'authorization_code',
        }

        try:
            response = requests.post(self.TOKEN_URL, data=data, timeout=30)
            response.raise_for_status()
            token_data = response.json()
        except requests.RequestException as e:
            logger.error(f"OneDrive token exchange failed: {e}")
            raise OAuthError(f"Failed to exchange code for tokens: {e}")

        access_token = token_data['access_token']

        # Get account info
        try:
            account_info = self.get_account_info(access_token)
        except CloudStorageError:
            account_info = {'user_id': '', 'email': '', 'display_name': ''}

        return {
            'access_token': access_token,
            'refresh_token': token_data.get('refresh_token'),
            'expires_in': token_data.get('expires_in'),
            'account_info': account_info,
        }

    def refresh_access_token(self, refresh_token: str) -> dict:
        """Refresh an expired OneDrive access token."""
        if not self.is_enabled():
            raise OAuthError("OneDrive provider is not enabled")

        if not refresh_token:
            raise OAuthError("No refresh token available")

        data = {
            'client_id': settings.ONEDRIVE_CLIENT_ID,
            'client_secret': settings.ONEDRIVE_CLIENT_SECRET,
            'refresh_token': refresh_token,
            'grant_type': 'refresh_token',
        }

        try:
            response = requests.post(self.TOKEN_URL, data=data, timeout=30)
            response.raise_for_status()
            token_data = response.json()
        except requests.RequestException as e:
            logger.error(f"OneDrive token refresh failed: {e}")
            raise OAuthError(f"Failed to refresh token: {e}")

        return {
            'access_token': token_data['access_token'],
            'expires_in': token_data.get('expires_in'),
        }

    def get_account_info(self, access_token: str) -> dict:
        """Fetch Microsoft account information."""
        headers = {'Authorization': f'Bearer {access_token}'}

        try:
            response = requests.get(
                self.USER_INFO_URL,
                headers=headers,
                timeout=30
            )
            response.raise_for_status()
            data = response.json()

            return {
                'user_id': data.get('id', ''),
                'email': data.get('mail') or data.get('userPrincipalName', ''),
                'display_name': data.get('displayName', ''),
            }
        except requests.RequestException as e:
            logger.warning(f"Failed to fetch OneDrive account info: {e}")
            raise CloudStorageError(f"Failed to fetch account info: {e}")

    def upload_file(
        self,
        file_obj: BinaryIO,
        remote_path: str,
        file_size: Optional[int] = None
    ) -> UploadResult:
        """Upload a file to OneDrive."""
        if not self.connection:
            raise UploadError("No connection configured for upload")

        access_token = self.connection.get_valid_access_token()
        if not access_token:
            raise TokenExpiredError("Unable to obtain valid access token")

        # OneDrive simple upload limit is 4MB
        # Larger files need upload sessions (not implemented)
        max_simple_upload = 4 * 1024 * 1024

        # If the caller supplied a size hint, reject oversized files
        # before reading them into memory
        if file_size is not None and file_size > max_simple_upload:
            raise UploadError(
                f"File too large ({file_size} bytes). "
                f"Maximum for simple upload is {max_simple_upload} bytes."
            )

        # Ensure path format
        if not remote_path.startswith('/'):
            remote_path = '/' + remote_path

        # Build upload URL
        # OneDrive uses path-based upload for small files
        upload_url = f"{self.GRAPH_URL}/me/drive/root:{remote_path}:/content"

        headers = {
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/octet-stream',
        }

        try:
            file_content = file_obj.read()
            actual_size = len(file_content)

            # Re-check with the real size in case no hint was given
            if actual_size > max_simple_upload:
                raise UploadError(
                    f"File too large ({actual_size} bytes). "
                    f"Maximum for simple upload is {max_simple_upload} bytes."
                )

            response = requests.put(
                upload_url,
                headers=headers,
                data=file_content,
                timeout=300
            )
            response.raise_for_status()
            result = response.json()

            logger.info(f"Uploaded {remote_path} to OneDrive ({actual_size} bytes)")

            return UploadResult(
                success=True,
                remote_path=result.get('parentReference', {}).get('path', '') + '/' + result.get('name', ''),
                file_id=result.get('id'),
                file_size=result.get('size', actual_size),
                error_message=None,  # passed explicitly; UploadResult lists it as a field
            )

        except requests.HTTPError as e:
            error_msg = f"OneDrive upload failed: {e}"
            logger.error(error_msg)
            raise UploadError(error_msg)
        except requests.RequestException as e:
            logger.error(f"OneDrive upload failed: {e}")
            raise UploadError(str(e))

Step 2: Register the Provider

Update fermentrack/cloud_storage/providers/__init__.py:

from .base import AbstractCloudStorageProvider, CloudStorageError
from .dropbox import DropboxProvider
from .onedrive import OneDriveProvider  # Add import


def get_provider(provider_type: str, connection=None):
    """Factory function to get a provider instance."""
    providers = {
        'dropbox': DropboxProvider,
        'onedrive': OneDriveProvider,  # Add to registry
    }

    provider_class = providers.get(provider_type)
    if not provider_class:
        raise ValueError(f"Unknown provider type: {provider_type}")

    return provider_class(connection=connection)


def get_enabled_providers():
    """Get list of providers that are enabled."""
    enabled = []

    if DropboxProvider.is_enabled():
        enabled.append(('dropbox', 'Dropbox'))

    if OneDriveProvider.is_enabled():  # Add check
        enabled.append(('onedrive', 'OneDrive'))

    return enabled


__all__ = [
    'AbstractCloudStorageProvider',
    'CloudStorageError',
    'DropboxProvider',
    'OneDriveProvider',  # Add to exports
    'get_provider',
    'get_enabled_providers',
]
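The registration above touches three places for each new provider. One possible alternative, sketched here with stub classes rather than the real providers, is to derive both lookups from a single registry keyed by PROVIDER_ID. The stub classes, the ENABLED attribute, and the PROVIDERS name are assumptions made for illustration; this is not how the existing module is written.

```python
from typing import Dict, List, Tuple, Type


class _StubProvider:
    """Stand-in for AbstractCloudStorageProvider (illustration only)."""
    PROVIDER_ID = ""
    DISPLAY_NAME = ""
    ENABLED = False  # hypothetical flag replacing the settings lookup

    @classmethod
    def is_enabled(cls) -> bool:
        return cls.ENABLED


class StubDropbox(_StubProvider):
    PROVIDER_ID = "dropbox"
    DISPLAY_NAME = "Dropbox"


class StubOneDrive(_StubProvider):
    PROVIDER_ID = "onedrive"
    DISPLAY_NAME = "OneDrive"


# Single source of truth: adding a provider means adding one class here.
PROVIDERS: Dict[str, Type[_StubProvider]] = {
    cls.PROVIDER_ID: cls for cls in (StubDropbox, StubOneDrive)
}


def get_enabled_providers() -> List[Tuple[str, str]]:
    """Return (id, display name) pairs for every enabled provider."""
    return [
        (provider_id, cls.DISPLAY_NAME)
        for provider_id, cls in PROVIDERS.items()
        if cls.is_enabled()
    ]
```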

Step 3: Add Provider Choice to Model

Update the ProviderChoices enum in fermentrack/cloud_storage/models.py:

class ProviderChoices(models.TextChoices):
    DROPBOX = 'dropbox', _('Dropbox')
    ONEDRIVE = 'onedrive', _('OneDrive')
    # Add more as needed

Step 4: Add Django Settings

Update config/settings/base.py to include the new provider's credentials:

# OneDrive Cloud Storage Configuration
ONEDRIVE_CLIENT_ID = env("ONEDRIVE_CLIENT_ID", default="")
ONEDRIVE_CLIENT_SECRET = env("ONEDRIVE_CLIENT_SECRET", default="")
ONEDRIVE_ENABLED = (ONEDRIVE_CLIENT_ID != "" and ONEDRIVE_CLIENT_SECRET != "")

Step 5: Create Migration

After updating the model choices, create a migration:

uv run python manage.py makemigrations cloud_storage
uv run python manage.py migrate

Step 6: Update Serializers (if needed)

The serializers should automatically pick up the new provider choice since they reference CloudStorageConnection.ProviderChoices.choices. However, verify that the InitiateOAuthSerializer and CompleteOAuthSerializer in fermentrack/cloud_storage/api/serializers.py work correctly.

Method Implementation Details

is_enabled()

This class method reports whether the provider's credentials are configured. It reads a flag from Django settings rather than checking environment variables directly:

@classmethod
def is_enabled(cls) -> bool:
    return getattr(settings, 'ONEDRIVE_ENABLED', False)

The settings module should compute the _ENABLED flag based on whether credentials are present. This allows the feature to be disabled simply by not setting the environment variables.
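The flag computation can be sketched as a small pure function, which also makes it easy to test without Django. The function name credentials_configured is hypothetical, and treating whitespace-only values as missing is an assumption that goes slightly beyond the `!= ""` check shown in Step 4.

```python
import os
from typing import Mapping


def credentials_configured(env: Mapping[str, str], *names: str) -> bool:
    """Return True only if every named variable is present and non-blank."""
    return all(env.get(name, "").strip() for name in names)


# Example: compute the flag from the process environment.
ONEDRIVE_ENABLED = credentials_configured(
    os.environ, "ONEDRIVE_CLIENT_ID", "ONEDRIVE_CLIENT_SECRET"
)
```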

get_authorization_url()

Parameters:

  • state: A random CSRF token generated by the view and stored in the session
  • redirect_uri: The URL where the OAuth provider should redirect after authorization

Returns: Full authorization URL with all query parameters

Key considerations:

  • Always request offline_access scope (or equivalent) to get refresh tokens
  • Include appropriate scopes for file operations
  • Some providers require specific response modes

Example (Dropbox):

def get_authorization_url(self, state: str, redirect_uri: str) -> str:
    params = {
        'client_id': settings.DROPBOX_APP_KEY,
        'redirect_uri': redirect_uri,
        'response_type': 'code',
        'state': state,
        'token_access_type': 'offline',  # Dropbox-specific
    }
    return f"{self.AUTHORIZE_URL}?{urlencode(params)}"
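The state parameter is generated and checked by the view, not the provider. As a rough sketch of what that could look like using only the standard library (the function names are hypothetical, and the session storage itself is omitted):

```python
import hmac
import secrets


def new_state_token() -> str:
    """Create an unguessable, URL-safe token to store in the session."""
    return secrets.token_urlsafe(32)


def state_matches(expected: str, received: str) -> bool:
    """Compare the stored and returned state in constant time."""
    return hmac.compare_digest(expected.encode(), received.encode())
```

On the OAuth callback, the view would compare the state query parameter against the stored value and reject the request if state_matches returns False.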

exchange_code_for_tokens()

Parameters:

  • code: Authorization code from the OAuth callback
  • redirect_uri: Must match the URI used in get_authorization_url()

Returns: Dictionary with these keys:

{
    'access_token': str,             # Required
    'refresh_token': Optional[str],  # For token refresh
    'expires_in': Optional[int],     # Seconds until expiry
    'account_info': {                # User information
        'user_id': str,
        'email': str,
        'display_name': str,
    }
}

Key considerations:

  • Always fetch account info after getting tokens (for display purposes)
  • Handle cases where account info fetch fails gracefully
  • Log errors but don't expose internal details to users
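The considerations above can be sketched as a helper that turns a raw token response into the documented return shape. This is a simplified illustration: build_token_result is a hypothetical name, ValueError stands in for OAuthError, and the broad except stands in for catching CloudStorageError.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

EMPTY_ACCOUNT = {"user_id": "", "email": "", "display_name": ""}


def build_token_result(
    token_data: dict,
    fetch_account_info: Callable[[str], dict],
) -> dict:
    """Assemble the return value of exchange_code_for_tokens()."""
    access_token = token_data.get("access_token")
    if not access_token:
        # In a real provider this would be OAuthError.
        raise ValueError("Token response did not include an access token")

    try:
        account_info = fetch_account_info(access_token)
    except Exception as exc:  # a real provider would catch CloudStorageError
        # Log the detail, but keep the connection usable without it.
        logger.warning("Account info lookup failed: %s", exc)
        account_info = dict(EMPTY_ACCOUNT)

    return {
        "access_token": access_token,
        "refresh_token": token_data.get("refresh_token"),
        "expires_in": token_data.get("expires_in"),
        "account_info": account_info,
    }
```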

refresh_access_token()

Parameters:

  • refresh_token: The stored refresh token

Returns: Dictionary with:

{
    'access_token': str,          # New access token
    'expires_in': Optional[int],  # Seconds until expiry
}

Key considerations:

  • Some providers return a new refresh token; most don't
  • Retry transient network errors (timeouts, dropped connections) a bounded number of times
  • Raise OAuthError for permanent failures
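One way to structure the retry logic is a small wrapper with exponential backoff, sketched below. The helper name, its parameters, and the default exception types are assumptions; with the requests library the retry_on tuple would more likely be (requests.ConnectionError, requests.Timeout), and the final failure would be re-raised as OAuthError by the provider.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: let the caller see the error
            # Wait base_delay, then 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))

    raise AssertionError("unreachable")
```

The sleep parameter exists only so tests can substitute a recorder instead of actually waiting.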

upload_file()

Parameters:

  • file_obj: File-like object (opened in binary mode)
  • remote_path: Destination path in cloud storage
  • file_size: Optional size hint for validation

Returns: UploadResult dataclass

Key considerations:

  • Get access token from self.connection.get_valid_access_token()
  • Check file size limits before reading entire file into memory
  • Handle provider-specific upload mechanisms:
    • Simple upload (small files)
    • Chunked/resumable upload (large files)
  • Raise TokenExpiredError if token cannot be refreshed
  • Raise UploadError for upload failures

Example token handling:

def upload_file(self, file_obj, remote_path, file_size=None):
    if not self.connection:
        raise UploadError("No connection configured")

    access_token = self.connection.get_valid_access_token()
    if not access_token:
        raise TokenExpiredError("Unable to obtain valid access token")

    # Proceed with upload using access_token
    ...
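Checking the size before reading the file into memory can be done with the size hint or, for seekable streams, by seeking to the end. The helper below is a sketch under that assumption; resolve_size is a hypothetical name and it will not work on non-seekable streams.

```python
import io
from typing import BinaryIO, Optional


def resolve_size(file_obj: BinaryIO, size_hint: Optional[int] = None) -> int:
    """Return the number of bytes left to read, without reading them."""
    if size_hint is not None:
        return size_hint

    position = file_obj.tell()              # remember where we are
    end = file_obj.seek(0, io.SEEK_END)     # seek() returns the new offset
    file_obj.seek(position)                 # restore the original position
    return end - position
```

A provider could call this at the top of upload_file() and raise UploadError when the result exceeds its simple-upload limit.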

get_account_info()

Parameters:

  • access_token: Valid access token

Returns: Dictionary with:

{
    'user_id': str,       # Provider-specific unique ID
    'email': str,         # User's email address
    'display_name': str,  # User's display name
}

This information is displayed in the UI to help users identify which account is connected.
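Provider responses use different field names, so each implementation has to map them onto these three keys. A generic mapping helper might look like the sketch below; normalize_account_info and its parameters are hypothetical and not part of the base class.

```python
from typing import Any, Mapping, Sequence


def normalize_account_info(
    raw: Mapping[str, Any],
    id_key: str,
    email_keys: Sequence[str],
    name_key: str,
) -> dict:
    """Map a provider's user payload onto the documented dictionary."""
    # Use the first email-like field that has a non-empty value.
    email = next((raw[key] for key in email_keys if raw.get(key)), "")
    return {
        "user_id": str(raw.get(id_key) or ""),
        "email": str(email),
        "display_name": str(raw.get(name_key) or ""),
    }
```

For the Microsoft Graph payload used in the OneDrive example, the call would be normalize_account_info(data, "id", ["mail", "userPrincipalName"], "displayName").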

Testing Your Provider

Unit Tests

Create tests in fermentrack/cloud_storage/tests/test_onedrive_provider.py:

import pytest
from unittest.mock import patch, MagicMock
from fermentrack.cloud_storage.providers.onedrive import OneDriveProvider


class TestOneDriveProvider:

    def test_is_enabled_returns_false_when_not_configured(self, settings):
        settings.ONEDRIVE_ENABLED = False
        assert OneDriveProvider.is_enabled() is False

    def test_is_enabled_returns_true_when_configured(self, settings):
        settings.ONEDRIVE_ENABLED = True
        assert OneDriveProvider.is_enabled() is True

    def test_get_authorization_url_includes_required_params(self, settings):
        settings.ONEDRIVE_ENABLED = True
        settings.ONEDRIVE_CLIENT_ID = 'test_client_id'

        provider = OneDriveProvider()
        url = provider.get_authorization_url(
            state='test_state',
            redirect_uri='http://localhost/callback'
        )

        assert 'client_id=test_client_id' in url
        assert 'state=test_state' in url
        assert 'redirect_uri=' in url

    @patch('requests.post')
    def test_exchange_code_for_tokens(self, mock_post, settings):
        settings.ONEDRIVE_ENABLED = True
        settings.ONEDRIVE_CLIENT_ID = 'test_id'
        settings.ONEDRIVE_CLIENT_SECRET = 'test_secret'

        mock_response = MagicMock()
        mock_response.json.return_value = {
            'access_token': 'test_access_token',
            'refresh_token': 'test_refresh_token',
            'expires_in': 3600,
        }
        mock_post.return_value = mock_response

        provider = OneDriveProvider()

        with patch.object(provider, 'get_account_info') as mock_account:
            mock_account.return_value = {
                'user_id': '123',
                'email': 'test@example.com',
                'display_name': 'Test User',
            }

            result = provider.exchange_code_for_tokens(
                code='auth_code',
                redirect_uri='http://localhost/callback'
            )

        assert result['access_token'] == 'test_access_token'
        assert result['refresh_token'] == 'test_refresh_token'

Integration Tests

For integration testing with real APIs, use environment variables for credentials and mark tests appropriately:

import pytest
import os

@pytest.mark.integration
@pytest.mark.skipif(
    not os.environ.get('ONEDRIVE_CLIENT_ID'),
    reason="OneDrive credentials not configured"
)
class TestOneDriveIntegration:

    def test_full_oauth_flow(self):
        # Test with real credentials
        pass
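Custom markers such as integration generally need to be registered, or pytest warns about an unknown mark. If the project does not already register it elsewhere (this guide does not say), a conftest.py hook along these lines is one way to do it; the marker description text is made up for the example.

```python
def pytest_configure(config):
    """Register the custom 'integration' marker with pytest."""
    config.addinivalue_line(
        "markers",
        "integration: tests that call real cloud provider APIs",
    )
```

With the marker registered, the integration tests can be excluded from a normal run with `pytest -m "not integration"`.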

Provider-Specific Considerations

Different providers have unique requirements:

Dropbox:

  • Uses token_access_type: offline for refresh tokens
  • App folder access recommended for security
  • 150MB simple upload limit

OneDrive/Microsoft Graph:

  • Uses offline_access scope for refresh tokens
  • 4MB simple upload limit (larger needs upload sessions)
  • Supports personal and business accounts

Google Drive:

  • Uses access_type: offline for refresh tokens
  • Requires enabling Drive API in Google Cloud Console
  • Has quotas and rate limits

Box:

  • Uses the same OAuth 2.0 authorization-code flow as the other providers
  • Has chunked upload API for large files
  • Supports enterprise features
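For the chunked uploads mentioned above, most of the bookkeeping is computing byte ranges. The sketch below generates Content-Range header values for a resumable upload. It assumes the Microsoft Graph upload-session guidance that chunk sizes be a multiple of 320 KiB; the function name and default chunk size are hypothetical, and the HTTP calls themselves are left out.

```python
from typing import Iterator, Tuple

# Microsoft Graph upload sessions expect chunk sizes in multiples of 320 KiB.
GRAPH_CHUNK_MULTIPLE = 320 * 1024


def content_ranges(
    total_size: int,
    chunk_size: int = 10 * GRAPH_CHUNK_MULTIPLE,
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, Content-Range value) for each chunk, end inclusive."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    if chunk_size <= 0 or chunk_size % GRAPH_CHUNK_MULTIPLE:
        raise ValueError("chunk_size must be a positive multiple of 320 KiB")

    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end, f"bytes {start}-{end}/{total_size}"
        start = end + 1
```

Each yielded range would correspond to one PUT of file_obj.read(end - start + 1) bytes against the session's upload URL.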

Checklist for New Providers

Before submitting a new provider, verify:

  • Provider class inherits from AbstractCloudStorageProvider
  • PROVIDER_ID and DISPLAY_NAME class attributes are set
  • All abstract methods are implemented
  • Provider is registered in providers/__init__.py
  • Provider choice added to CloudStorageConnection.ProviderChoices
  • Settings added to config/settings/base.py
  • Migration created for model changes
  • is_enabled() checks for required credentials
  • OAuth flow requests refresh tokens (offline access)
  • upload_file() uses self.connection.get_valid_access_token()
  • Appropriate exceptions raised (OAuthError, UploadError, etc.)
  • File size limits are validated before upload
  • Logging added for debugging
  • Unit tests written
  • User-facing documentation updated

Reference: Dropbox Implementation

The Dropbox provider is the reference implementation. See fermentrack/cloud_storage/providers/dropbox.py for:

  • OAuth flow implementation
  • Token exchange and refresh
  • Simple file upload
  • Error handling patterns
  • Logging best practices