Dine360 SaaS Multi-Tenant Platform: Production Scaling & Deployment Guide
This guide describes how to configure, deploy, secure, and scale the Dine360 SaaS platform to support 25 to 500+ concurrent restaurants.
1. High Availability Architecture
To ensure 99.9% uptime, the infrastructure must be divided into separate tiers with no single point of failure (SPOF).
graph LR
User[Clients / POS] -->|HTTPS| DNS[Route53 / Cloudflare DNS]
DNS -->|Anycast| ALB[Load Balancer Pool: Nginx / AWS ALB]
ALB -->|Proxy Pass| WebPool[Odoo App instances cluster]
WebPool -->|Session Cache| Redis[Redis Replication Group]
WebPool -->|DB Pool| Bouncer[PgBouncer Poolers]
Bouncer -->|Transactions| DBPrimary[(PostgreSQL Primary)]
DBPrimary -->|Streaming replication| DBReplica[(PostgreSQL Hot Standby)]
DBPrimary -->|Archived Logs| S3[(AWS S3 Backup Bucket)]
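To make the 99.9% uptime target from this section concrete, the following minimal sketch converts an availability figure into a downtime budget. The function name `downtime_minutes` is illustrative and not part of the platform.

```python
def downtime_minutes(availability: float, period_hours: float) -> float:
    """Return the minutes of downtime permitted over a period."""
    return (1.0 - availability) * period_hours * 60.0


# 99.9% is the target stated in this guide.
per_year = downtime_minutes(0.999, 365 * 24)
per_30_days = downtime_minutes(0.999, 30 * 24)

print(f"per year:    {per_year / 60:.2f} h")     # per year:    8.76 h
print(f"per 30 days: {per_30_days:.1f} min")     # per 30 days: 43.2 min
```

A budget of roughly 43 minutes per month leaves little room for manual failover, which is why each tier in the diagram is replicated.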
2. Server Sizing Recommendations
25 to 100 Restaurants (Starter Tier)
- Odoo Servers: 1x App Server (8 vCPU, 16GB RAM) running Odoo with --workers=17.
- Database Server: 1x PostgreSQL Server (8 vCPU, 32GB RAM).
- Cache/Session: Local Redis instance.
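The --workers=17 value for an 8 vCPU server matches the rule of thumb from the Odoo deployment documentation, (CPU count * 2) + 1. The sketch below reproduces that calculation together with the same documentation's rough RAM estimate (20% heavy requests at about 1 GB per worker, 80% light requests at about 150 MB). The function names are illustrative, and the ratios are estimates that should be checked against real workloads.

```python
def recommended_workers(cpu_count: int) -> int:
    """Rule of thumb from the Odoo deployment docs: (CPU * 2) + 1."""
    return cpu_count * 2 + 1


def estimated_ram_mb(workers: int,
                     heavy_ratio: float = 0.2,
                     heavy_mb: float = 1024.0,
                     light_mb: float = 150.0) -> float:
    """Rough RAM estimate: a weighted mix of heavy and light workers."""
    light_ratio = 1.0 - heavy_ratio
    return workers * (light_ratio * light_mb + heavy_ratio * heavy_mb)


workers = recommended_workers(8)       # 8 vCPU app server from the Starter Tier
ram_mb = estimated_ram_mb(workers)

print(workers)                         # 17
print(f"{ram_mb / 1024:.1f} GiB")      # 5.4 GiB
```

Under these assumptions, 17 workers fit within the 16GB app server with headroom left for cron workers and the operating system.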
100 to 500+ Restaurants (Enterprise / High-Scale Tier)
- App Instance Pool: 3x App Servers (each 8 vCPU, 16GB RAM) running behind a Load Balancer (ALB).
- PostgreSQL Database Cluster:
- 1x Primary Write Node (32 vCPU, 128GB RAM, NVMe storage).
- 1x Replica Read-Only Node (16 vCPU, 64GB RAM) for heavy reports, API queries, and read scaling.
- PgBouncer: Dedicated PgBouncer container on database nodes configured in transaction pooling mode.
- Cache/Session: Managed AWS ElastiCache for Redis (Replicated/Sharded cluster).
3. Database Connection Pooling (PgBouncer)
With 500+ databases, Odoo's default connection model opens up to 500 * (workers + 2) connections, easily overwhelming PostgreSQL connection limits and exhausting system file descriptors.
- Fix: Use PgBouncer in transaction pooling mode:
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 50
- Transaction Pooling Note: In transaction mode, a client may be handed a different server connection for each transaction, so session-scoped features (such as temporary tables kept across transactions or session-level locks) can fail. Odoo is compatible with transaction pooling from version 12 onwards, but long-lived locks should be avoided in custom modules.
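The connection estimate at the start of this section can be checked with a few lines of arithmetic. The sketch below uses the 500 * (workers + 2) bound from the text with --workers=17, and then computes a worst-case bound for the PgBouncer side. It relies on the fact that PgBouncer applies default_pool_size per (database, user) pair, and it assumes one database role per tenant database, which is a hypothetical simplification.

```python
def direct_connections(databases: int, workers: int) -> int:
    """Upper bound quoted in this guide: databases * (workers + 2)."""
    return databases * (workers + 2)


def pgbouncer_worst_case(databases: int, users_per_db: int,
                         default_pool_size: int) -> int:
    """Worst case if every (database, user) pool is completely full."""
    return databases * users_per_db * default_pool_size


direct = direct_connections(databases=500, workers=17)
pooled = pgbouncer_worst_case(databases=500, users_per_db=1,
                              default_pool_size=50)

print(direct)   # 9500
print(pooled)   # 25000
```

PgBouncer opens server connections on demand, so the second number is only reached if every tenant is busy at once. It still suggests pairing default_pool_size = 50 with a lower per-pool size or a cap such as max_user_connections when hundreds of tenant databases share one PostgreSQL node.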
4. Let's Encrypt Wildcard SSL Automation
For auto-onboarding subdomains (e.g. *.dine360.com), a Wildcard SSL certificate is required.
Setup using Certbot (DNS-01 Challenge)
Since the HTTP-01 challenge cannot validate wildcard names, you must use DNS-01 validation.
- Install Certbot with DNS plugin (e.g. Route53 plugin for AWS):
sudo apt-get install certbot python3-certbot-dns-route53
- Acquire Wildcard Certificate (quote the wildcard so the shell does not expand it):
certbot certonly --dns-route53 -d dine360.com -d '*.dine360.com'
- Cron for Automatic Renewal: Add this to /etc/crontab to check renewals daily and reload Nginx:
0 0 * * * root certbot renew --post-hook "systemctl reload nginx"
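A renewal cron job can fail silently, so it may be worth checking the remaining certificate lifetime independently. The following sketch uses only the Python standard library. `fetch_not_after` and `days_until_expiry` are illustrative helper names, and the example dates are made up for demonstration.

```python
import socket
import ssl


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the 'notAfter' field of the certificate served by host."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now_ts: float) -> float:
    """Days between now_ts (epoch seconds) and the certificate expiry."""
    expires_ts = ssl.cert_time_to_seconds(not_after)
    return (expires_ts - now_ts) / 86400.0


# Offline example with fixed, made-up dates in the certificate time format.
now = ssl.cert_time_to_seconds("Jan  1 00:00:00 2025 GMT")
remaining = days_until_expiry("Jan 31 00:00:00 2025 GMT", now)
print(remaining)   # 30.0
```

In practice the value would come from something like `days_until_expiry(fetch_not_after("dine360.com"), time.time())`, with an alert if the result drops below a threshold chosen by the operator.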
5. Monitoring Setup
To maintain system observability, implement the following stack:
- Node Exporter & Prometheus: Collect CPU, Memory, Disk IOPS, and Network metrics of all host servers.
- postgres_exporter: Track PostgreSQL active backends, transaction delays, lock waits, and cache hit rates.
- Grafana Dashboard:
- Create alerts for DB connection limits (>80%).
- Create alerts for disk usage (>85%).
- Monitor average query response time.
- Sentry Integration: Configure Odoo to send tracebacks and exceptions to Sentry so that tenant-specific issues can be debugged without reading server log files by hand.
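The alert thresholds listed above (DB connections above 80%, disk usage above 85%) can be expressed as a small check. This is a plain-Python sketch of the threshold logic only, not a Grafana or Prometheus rule. The `HostSample` fields and the alert names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostSample:
    """One metrics sample; field names are illustrative."""
    db_connections: int
    db_max_connections: int
    disk_used_bytes: int
    disk_total_bytes: int


def evaluate_alerts(sample: HostSample,
                    conn_threshold: float = 0.80,
                    disk_threshold: float = 0.85) -> List[str]:
    """Return the names of the alerts that the sample triggers."""
    alerts = []
    if sample.db_connections / sample.db_max_connections > conn_threshold:
        alerts.append("db_connections")
    if sample.disk_used_bytes / sample.disk_total_bytes > disk_threshold:
        alerts.append("disk_usage")
    return alerts


sample = HostSample(db_connections=170, db_max_connections=200,
                    disk_used_bytes=80, disk_total_bytes=100)
print(evaluate_alerts(sample))   # ['db_connections']
```

Here 170 of 200 connections is 85% and exceeds the 80% threshold, while 80% disk usage stays below the 85% threshold.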