Technical Guide

Designing Multi-Tenant Systems That Don't Break at Scale

How to design multi-tenant architectures with real isolation. Three enforcement layers, hierarchical scoping, RBAC patterns, and lessons from building three different multi-tenant systems.

February 28, 2026 | 20 min read | Oronts Engineering Team

Multi-Tenancy Is Not a Database Decision

Most articles about multi-tenancy start with the database question: shared database, shared schema, or dedicated database per tenant? That's the wrong place to start. The database model is an implementation detail. The architecture question is: how do you enforce isolation at every layer of the system so that tenant A can never see, modify, or affect tenant B's data?

We've built three different multi-tenant systems. Each solved isolation differently because each had different constraints. One uses hierarchical scoping with five identity levels. Another uses flat RBAC with four roles. A third uses organization-based scoping with channel isolation. The database model was the least interesting decision in all three.

This article covers the architectural patterns that make multi-tenancy safe at scale. For broader context, see our guide on how we approach system architecture, which covers our methodology. For specific examples of multi-tenant AI systems, see our guides on agentic commerce and AI governance.

The Three-Layer Enforcement Model

Tenant isolation must be enforced at three layers. If any layer is missing, data leaks.

┌─────────────────────────────────────────────────┐
│  Layer 1: API MIDDLEWARE                         │
│  Every request authenticated and scoped          │
│  tenant_id extracted from JWT/API key            │
│  Injected into request context                   │
│                                                  │
├─────────────────────────────────────────────────┤
│  Layer 2: QUERY FILTERS                          │
│  Every database query includes tenant_id         │
│  Every search query scoped by tenant             │
│  No query runs without tenant context            │
│                                                  │
├─────────────────────────────────────────────────┤
│  Layer 3: POLICY ENFORCEMENT                     │
│  Tool calls checked against tenant policies      │
│  Agent memory scoped per tenant + session        │
│  Output filtered by tenant visibility rules      │
│                                                  │
└─────────────────────────────────────────────────┘

Layer 1: API Middleware

Every incoming request must be authenticated and scoped to a tenant before it reaches any business logic.

// Middleware extracts tenant context from every request
async function tenantMiddleware(req: Request, res: Response, next: NextFunction) {
    const token = req.headers.authorization?.replace('Bearer ', '');
    if (!token) return res.status(401).json({ error: 'No token provided' });

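    // verifyJwt is expected to reject on an invalid or expired token;
    // answer with 401 instead of letting the rejection escape the middleware
    let decoded;
    try {
        decoded = await verifyJwt(token);
    } catch (err) {
        return res.status(401).json({ error: 'Invalid or expired token' });
    }
    const tenant = await tenantStore.getById(decoded.tenant_id);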

    if (!tenant || tenant.status !== 'active') {
        return res.status(403).json({ error: 'Tenant not found or suspended' });
    }

    // Inject tenant context into request
    req.tenantContext = {
        tenantId: tenant.id,
        channelId: decoded.channel_id,
        role: decoded.role,
        permissions: decoded.permissions,
    };

    next();
}

The tenant context is not optional. Every route handler, every service method, every database query receives it. If a function doesn't have tenant context, it cannot access tenant-scoped data.
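One way to make the "not optional" rule concrete is a small guard that service code calls before it touches tenant data. The following is a minimal sketch, not the implementation used in our systems: `requireTenantContext` and `MissingTenantContextError` are hypothetical names, and the context shape mirrors the `tenantContext` object set by the middleware above.

```typescript
// Shape of the context injected by the middleware (channelId assumed optional here).
interface RequestTenantContext {
    tenantId: string;
    channelId?: string;
    role: string;
    permissions: string[];
}

// Hypothetical error type so callers can map this failure to one response.
class MissingTenantContextError extends Error {
    constructor() {
        super('Tenant context is required for this operation');
        this.name = 'MissingTenantContextError';
    }
}

// Fails closed: no context, or an empty tenantId, means no data access.
function requireTenantContext(
    ctx: RequestTenantContext | null | undefined,
): RequestTenantContext {
    if (!ctx || !ctx.tenantId) {
        throw new MissingTenantContextError();
    }
    return ctx;
}
```

A service method would call `requireTenantContext(req.tenantContext)` as its first statement, so a route that was mounted without the middleware fails loudly instead of running unscoped.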

Layer 2: Query Filters

Every database query must include the tenant scope. This is not enforced by convention ("remember to add the WHERE clause"). It's enforced by architecture.

// Repository base class that enforces tenant scoping
class TenantScopedRepository<T> {
    async findMany(tenantId: string, filters: Partial<T>): Promise<T[]> {
        const result = await this.db.query({
            TableName: this.tableName,
            KeyConditionExpression: 'tenant_id = :tid',
            FilterExpression: this.buildFilterExpression(filters),
            ExpressionAttributeValues: {
                ':tid': tenantId,
                ...this.buildFilterValues(filters),
            },
        });
        // query() resolves to a result object; the rows are in Items
        return result.Items || [];
    }

    async findById(tenantId: string, id: string): Promise<T | null> {
        const result = await this.db.get({
            TableName: this.tableName,
            Key: { tenant_id: tenantId, id },
        });
        return result.Item || null;
    }

    // No method exists that queries without tenant_id
    // Cross-tenant queries are architecturally impossible
}

For search engines (OpenSearch, MeiliSearch, Elasticsearch), every query includes a tenant filter:

async function searchProducts(tenantId: string, channelId: string, query: string) {
    return opensearch.search({
        index: 'products',
        body: {
            query: {
                bool: {
                    must: [{ match: { searchText: query } }],
                    filter: [
                        { term: { tenant_id: tenantId } },
                        { term: { channel_ids: channelId } },
                    ],
                },
            },
        },
    });
}
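The same idea can be pushed one step further so that callers never assemble the filter themselves. This is a sketch under assumptions: `withTenantFilter` is a hypothetical helper, the types are minimal stand-ins for the query DSL, and the field names `tenant_id` and `channel_ids` follow the example above.

```typescript
// Minimal structural types for the parts of the query DSL used here.
type QueryClause = Record<string, unknown>;

interface BoolQuery {
    bool: {
        must: QueryClause[];
        filter: QueryClause[];
    };
}

// Wraps any caller-supplied clause so the tenant filter is always present.
function withTenantFilter(
    tenantId: string,
    clause: QueryClause,
    channelId?: string,
): BoolQuery {
    if (!tenantId) {
        throw new Error('tenantId is required to build a search query');
    }
    const filter: QueryClause[] = [{ term: { tenant_id: tenantId } }];
    if (channelId) {
        // Channel filter is optional and narrows results further.
        filter.push({ term: { channel_ids: channelId } });
    }
    return { bool: { must: [clause], filter } };
}
```

If every search function builds its body through a helper like this, a new endpoint cannot forget the tenant term, because there is no code path that produces a query without it.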

Layer 3: Policy Enforcement

Beyond data access, tenants have different permissions for what actions they can perform. The policy layer checks these before any action executes.

interface TenantPolicy {
    tenant_id: string;
    rules: PolicyRule[];
}

interface PolicyRule {
    action: string;          // "create_order", "export_data", "use_ai_agent"
    effect: "allow" | "deny";
    conditions?: {
        max_value?: number;
        allowed_channels?: string[];
        require_approval?: boolean;
    };
}

// Policy check before any action
async function checkPolicy(tenantId: string, action: string, params: any): Promise<boolean> {
    const policy = await policyStore.getForTenant(tenantId);
    const matchingRules = policy.rules.filter(r => r.action === action);

    // Deny rules take precedence
    if (matchingRules.some(r => r.effect === 'deny')) return false;

    // No matching allow rule = denied (default-deny)
    const allowRule = matchingRules.find(r => r.effect === 'allow');
    if (!allowRule) return false;

    // Check conditions (compare against undefined so a max_value of 0 is still enforced)
    const maxValue = allowRule.conditions?.max_value;
    if (maxValue !== undefined && params.value > maxValue) {
        return false;
    }

    return true;
}
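The `PolicyRule` interface also declares `allowed_channels` and `require_approval`, which the check above does not evaluate. One possible way to cover them is a pure function that returns a three-way decision. This is a sketch only: `evaluateRules`, `Decision`, and `ActionParams` are hypothetical names, and treating a missing `value` or `channelId` as a denial is an assumption made here to keep the function fail-closed.

```typescript
interface RuleConditions {
    max_value?: number;
    allowed_channels?: string[];
    require_approval?: boolean;
}

interface Rule {
    action: string;
    effect: 'allow' | 'deny';
    conditions?: RuleConditions;
}

interface ActionParams {
    value?: number;
    channelId?: string;
}

type Decision = 'allow' | 'deny' | 'needs_approval';

function evaluateRules(rules: Rule[], action: string, params: ActionParams): Decision {
    const matching = rules.filter(r => r.action === action);

    // Deny rules take precedence, and no matching allow rule means denied.
    if (matching.some(r => r.effect === 'deny')) return 'deny';
    const allowRule = matching.filter(r => r.effect === 'allow')[0];
    if (!allowRule) return 'deny';

    const c: RuleConditions = allowRule.conditions || {};

    // A value limit with no value supplied is treated as a denial (assumption).
    if (c.max_value !== undefined) {
        if (params.value === undefined || params.value > c.max_value) return 'deny';
    }

    // A channel allow-list with no matching channel is a denial.
    if (c.allowed_channels) {
        const channelId = params.channelId;
        if (channelId === undefined || c.allowed_channels.indexOf(channelId) === -1) {
            return 'deny';
        }
    }

    return c.require_approval ? 'needs_approval' : 'allow';
}
```

Returning `needs_approval` as its own outcome lets the caller queue the action for review instead of collapsing it into a plain allow or deny.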

Hierarchical vs Flat Scoping

Flat Scoping (Simple SaaS)

Every resource belongs to exactly one tenant. No sub-levels.

Tenant A
  ├── Users (owner, admin, member, viewer)
  ├── Products
  ├── Orders
  └── Settings

Tenant B
  ├── Users
  ├── Products
  ├── Orders
  └── Settings

Good for: SaaS products where each customer is an isolated workspace. Think project management tools, CRM systems, internal dashboards.

Flat scoping usually works with four role tiers:

Role	Permissions
Owner	Full access, billing, delete tenant
Admin	Manage users, settings, all data
Member	Create and edit own data, view shared data
Viewer	Read-only access to shared data
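If the four tiers are treated as strictly nested (each tier can do everything the tier below it can, which the table suggests but does not guarantee), a role check reduces to a rank comparison. A minimal sketch with hypothetical names `ROLE_RANK` and `hasAtLeastRole`:

```typescript
type Role = 'owner' | 'admin' | 'member' | 'viewer';

// Higher number means broader access. The ordering is an assumption
// based on the role table: viewer < member < admin < owner.
const ROLE_RANK: Record<Role, number> = {
    viewer: 0,
    member: 1,
    admin: 2,
    owner: 3,
};

// True when the actor's role is the required role or a broader one.
function hasAtLeastRole(actual: Role, required: Role): boolean {
    return ROLE_RANK[actual] >= ROLE_RANK[required];
}
```

For example, `hasAtLeastRole('admin', 'member')` is true and `hasAtLeastRole('viewer', 'member')` is false. Once roles stop being nested, a rank comparison no longer fits and explicit permissions are the safer tool.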

Hierarchical Scoping (Enterprise Platforms)

Resources are scoped through multiple levels. Each level narrows visibility.

Tenant (merchant organization)
  └── Channel (storefront, API, widget)
       └── Supplier Binding (which suppliers visible per channel)
            └── Customer (end user within a channel)
                 └── Session (browser/device session)
                      └── Agent Thread (single AI conversation)

Good for: marketplace platforms, multi-brand commerce, enterprise systems where one organization has multiple storefronts, sales channels, or subsidiary brands.

Each level adds a filter. A product visible in Channel A is not necessarily visible in Channel B, even within the same tenant. A customer in Channel A has no access to Channel B's data. An agent thread in one session cannot see conversations from another session.

// Hierarchical context passed through every operation
interface TenantContext {
    tenantId: string;        // organization
    channelId: string;       // storefront or sales channel
    customerId?: string;     // end user (if authenticated)
    sessionId?: string;      // browser session
    threadId?: string;       // AI conversation thread
}

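// Query scoped by tenant and channel (the channel decides which suppliers are visible)
async function getVisibleProducts(ctx: TenantContext) {
    const channel = await channelStore.get(ctx.tenantId, ctx.channelId);
    // Unknown channel for this tenant: return nothing rather than tenant-wide data
    if (!channel) return [];
    return productStore.findMany({
        tenant_id: ctx.tenantId,
        supplier_id: { $in: channel.visibleSupplierIds },
        status: 'active',
    });
}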
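The session and thread levels matter most for keyed storage such as agent memory. One way to express "each level adds a filter" there is to derive a storage key from the context. This is a sketch with assumptions: `buildScopeKey` and the `name:value/...` key layout are hypothetical, and the rule that a thread must belong to a session is inferred from the hierarchy diagram.

```typescript
interface ScopeContext {
    tenantId: string;
    channelId: string;
    customerId?: string;
    sessionId?: string;
    threadId?: string;
}

// Builds a key such as "tenant:t1/channel:web/session:s9/thread:th2".
// Values are URI-encoded so an ID containing "/" or ":" cannot forge a level.
function buildScopeKey(ctx: ScopeContext): string {
    if (!ctx.tenantId || !ctx.channelId) {
        throw new Error('tenantId and channelId are required');
    }
    if (ctx.threadId && !ctx.sessionId) {
        throw new Error('threadId requires sessionId');
    }
    const levels: Array<[string, string | undefined]> = [
        ['tenant', ctx.tenantId],
        ['channel', ctx.channelId],
        ['customer', ctx.customerId],
        ['session', ctx.sessionId],
        ['thread', ctx.threadId],
    ];
    return levels
        .filter(level => level[1] !== undefined && level[1] !== '')
        .map(level => `${level[0]}:${encodeURIComponent(level[1] as string)}`)
        .join('/');
}
```

Agent memory written under the key of one thread is then unreachable from a context that resolves to a different session or thread, because the keys differ.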

Hybrid Scoping

Some systems need flat scoping for most resources but hierarchical scoping for specific features. For example, a Vendure commerce installation might use flat scoping (one tenant per store) but channel-based scoping for product visibility and pricing.

// Vendure's channel scoping
async findByCustomer(ctx: RequestContext, customerId: number) {
    return this.connection.getRepository(ctx, CiWishlist).find({
        where: {
            customerId,
            channelId: ctx.channelId,  // Channel scoping within tenant
        },
    });
}

For more on how we implement channel scoping in Vendure, see our Vendure production architecture guide.

Auth and RBAC Patterns

JWT-Based Tenant Scoping

The JWT carries tenant identity. Every API request includes it.

// JWT payload structure
interface TenantJwtPayload {
    sub: string;              // user ID
    tenant_id: string;        // which tenant
    channel_id?: string;      // which channel (if applicable)
    role: string;             // owner | admin | member | viewer
    permissions: string[];    // fine-grained permissions
    iat: number;
    exp: number;
}

The tenant_id in the JWT is the primary scoping mechanism. It's set at login time and cannot be changed without re-authenticating. The backend extracts it from every request and uses it to scope all data access.
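Because everything downstream trusts these claims, it can help to validate their shape once, right after signature verification. The sketch below assumes the payload structure shown above; `parseTenantClaims` and `isNonEmptyString` are hypothetical helpers, and signature and expiry checks are assumed to have happened already in the JWT library.

```typescript
type Role = 'owner' | 'admin' | 'member' | 'viewer';

interface TenantClaims {
    sub: string;
    tenant_id: string;
    channel_id?: string;
    role: Role;
    permissions: string[];
}

const KNOWN_ROLES: string[] = ['owner', 'admin', 'member', 'viewer'];

function isNonEmptyString(value: unknown): value is string {
    return typeof value === 'string' && value.length > 0;
}

// Validates the decoded payload and rejects anything that is not a
// well-formed tenant claim set. Does NOT verify the signature.
function parseTenantClaims(payload: unknown): TenantClaims {
    if (typeof payload !== 'object' || payload === null) {
        throw new Error('JWT payload must be an object');
    }
    const { sub, tenant_id, channel_id, role, permissions } =
        payload as Record<string, unknown>;

    if (!isNonEmptyString(sub)) throw new Error('Missing sub claim');
    if (!isNonEmptyString(tenant_id)) throw new Error('Missing tenant_id claim');
    if (!isNonEmptyString(role) || KNOWN_ROLES.indexOf(role) === -1) {
        throw new Error('Unknown role claim');
    }
    if (!Array.isArray(permissions) || !permissions.every(isNonEmptyString)) {
        throw new Error('permissions must be an array of strings');
    }
    if (channel_id !== undefined && !isNonEmptyString(channel_id)) {
        throw new Error('channel_id must be a non-empty string when present');
    }

    return {
        sub,
        tenant_id,
        channel_id: channel_id as string | undefined,
        role: role as Role,
        permissions: permissions as string[],
    };
}
```

The tenant ID used for scoping then comes only from the parsed claims, never from a request body or query parameter.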

API Key Authentication

For machine-to-machine communication (ERP integrations, external services, webhooks), API keys map to tenants:

async function apiKeyMiddleware(req: Request, res: Response, next: NextFunction) {
    const apiKey = req.headers['x-api-key'];
    if (!apiKey) return next(); // fall through to JWT auth

    const keyRecord = await apiKeyStore.findByKey(apiKey);
    if (!keyRecord || keyRecord.status !== 'active') {
        return res.status(401).json({ error: 'Invalid API key' });
    }

    req.tenantContext = {
        tenantId: keyRecord.tenantId,
        channelId: keyRecord.channelId,
        role: keyRecord.role,
        permissions: keyRecord.permissions,
    };

    next();
}

API keys are tenant-scoped. Key rotation doesn't change the tenant binding. Rate limits and permission scopes are per-key, not per-tenant.
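A per-key rate limit can be sketched as a fixed-window counter keyed by the API key's ID. This is an illustration only: `PerKeyRateLimiter` is a hypothetical class, the state lives in process memory (a multi-instance deployment would need a shared store), and the clock is injectable so the behavior can be tested.

```typescript
interface WindowState {
    windowStart: number;
    count: number;
}

class PerKeyRateLimiter {
    private readonly windows = new Map<string, WindowState>();
    private readonly limit: number;
    private readonly windowMs: number;
    private readonly now: () => number;

    constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
        if (limit < 1 || windowMs <= 0) {
            throw new Error('limit must be >= 1 and windowMs must be > 0');
        }
        this.limit = limit;
        this.windowMs = windowMs;
        this.now = now;
    }

    // Returns true if the request is allowed, false if the key is over its limit.
    tryConsume(keyId: string): boolean {
        const t = this.now();
        const current = this.windows.get(keyId);

        // First request for this key, or the previous window has expired.
        if (!current || t - current.windowStart >= this.windowMs) {
            this.windows.set(keyId, { windowStart: t, count: 1 });
            return true;
        }
        if (current.count >= this.limit) return false;
        current.count += 1;
        return true;
    }
}
```

The middleware would call `tryConsume(keyRecord.id)` after the key lookup and answer with 429 when it returns false. The `id` field on the key record is an assumption here.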

Permission Granularity

Roles define broad access levels. Permissions define specific capabilities:

const PERMISSIONS = {
    // Product management
    PRODUCT_READ: 'product:read',
    PRODUCT_CREATE: 'product:create',
    PRODUCT_UPDATE: 'product:update',
    PRODUCT_DELETE: 'product:delete',

    // Order management
    ORDER_READ: 'order:read',
    ORDER_CREATE: 'order:create',
    ORDER_CANCEL: 'order:cancel',
    ORDER_REFUND: 'order:refund',

    // AI features
    AI_AGENT_USE: 'ai:agent:use',
    AI_AGENT_CONFIGURE: 'ai:agent:configure',
    AI_EXPORT: 'ai:export',

    // Admin
    USER_MANAGE: 'user:manage',
    SETTINGS_MANAGE: 'settings:manage',
    BILLING_MANAGE: 'billing:manage',
};

// Role-permission mapping
const ROLE_PERMISSIONS = {
    owner: Object.values(PERMISSIONS),
    admin: Object.values(PERMISSIONS).filter(p => p !== 'billing:manage'),
    member: ['product:read', 'product:create', 'product:update', 'order:read', 'order:create', 'ai:agent:use'],
    viewer: ['product:read', 'order:read'],
};
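Checking these permissions at the start of a handler can be as small as a set difference. The following sketch assumes the actor carries the `permissions` array from the tenant context; `missingPermissions` and `assertPermissions` are hypothetical helper names.

```typescript
interface Actor {
    role: string;
    permissions: string[];
}

// Returns the required permissions the actor does not hold.
function missingPermissions(actor: Actor, required: string[]): string[] {
    const granted = new Set(actor.permissions);
    return required.filter(p => !granted.has(p));
}

// Throws when any required permission is missing, so handlers fail closed.
function assertPermissions(actor: Actor, required: string[]): void {
    const missing = missingPermissions(actor, required);
    if (missing.length > 0) {
        throw new Error(`Missing permissions: ${missing.join(', ')}`);
    }
}
```

A refund handler, for example, would start with `assertPermissions(ctx, ['order:read', 'order:refund'])`. A member with the permission list shown above would be rejected because `order:refund` is not granted.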

What Happens When Scoping Fails

The most instructive way to understand why three-layer enforcement matters is to see what breaks when each layer is missing.

Missing Layer	What Happens	Real Example
No API middleware	Any request with a valid JWT can access any tenant's data by guessing tenant IDs	Competitor scrapes your customer's product catalog
No query filters	A developer forgets the WHERE clause in a new endpoint and cross-tenant data leaks	Admin dashboard shows all customers across all tenants
No policy enforcement	A tenant with "starter" plan accesses "enterprise" features through direct API calls	Free-tier tenant exports unlimited data, bypassing plan limits

The scariest version: all three layers work for reads but not for writes. Tenant A can't see tenant B's data, but a bug in the update endpoint lets tenant A overwrite tenant B's product prices. We caught this in testing. In production, it would have been catastrophic.
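The write-path failure is easiest to avoid when updates go through the same tenant-keyed lookup as reads. The in-memory store below is a sketch to illustrate the shape of that rule, not the code from the incident described above: `InMemoryProductStore` and its methods are hypothetical.

```typescript
interface Product {
    id: string;
    tenantId: string;
    name: string;
    price: number;
}

class InMemoryProductStore {
    private readonly rows = new Map<string, Product>();

    // The storage key always includes the tenant, for reads and writes alike.
    private key(tenantId: string, id: string): string {
        return JSON.stringify([tenantId, id]);
    }

    insert(product: Product): void {
        this.rows.set(this.key(product.tenantId, product.id), product);
    }

    findById(tenantId: string, id: string): Product | null {
        return this.rows.get(this.key(tenantId, id)) || null;
    }

    // Returns null when the product does not exist *for this tenant*,
    // so another tenant's row can never be the target of the write.
    update(
        tenantId: string,
        id: string,
        patch: Partial<Pick<Product, 'name' | 'price'>>,
    ): Product | null {
        const existing = this.findById(tenantId, id);
        if (!existing) return null;
        const updated: Product = { ...existing, ...patch };
        this.rows.set(this.key(tenantId, id), updated);
        return updated;
    }
}
```

In a SQL-backed system the equivalent is an `UPDATE ... WHERE tenant_id = ? AND id = ?` followed by a check of the affected row count.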

Testing Multi-Tenant Isolation

Testing multi-tenancy requires specific test patterns that most test suites don't cover.

The Cross-Tenant Access Test

For every endpoint, test that tenant A cannot access tenant B's data:

describe('Tenant isolation', () => {
    it('tenant A cannot read tenant B products', async () => {
        // Create product as tenant B
        const product = await createProduct(tenantB.token, { name: 'Secret Product' });

        // Try to read it as tenant A
        const response = await api.get(`/products/${product.id}`, {
            headers: { Authorization: `Bearer ${tenantA.token}` },
        });

        expect(response.status).toBe(404); // Not 403, not 200 with empty data
    });

    it('tenant A cannot update tenant B products', async () => {
        const product = await createProduct(tenantB.token, { name: 'Original' });

        const response = await api.patch(`/products/${product.id}`, {
            headers: { Authorization: `Bearer ${tenantA.token}` },
            body: { name: 'Hacked' },
        });

        expect(response.status).toBe(404);

        // Verify product wasn't modified
        const check = await api.get(`/products/${product.id}`, {
            headers: { Authorization: `Bearer ${tenantB.token}` },
        });
        expect(check.body.name).toBe('Original');
    });
});

Return 404 (not 403) for cross-tenant access attempts. A 403 confirms the resource exists, which is itself an information leak.
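Writing these tests by hand for every endpoint gets repetitive, so the cases can be generated from a list of endpoints. This is a sketch under assumptions: `buildIsolationCases`, `EndpointSpec`, and the `:id` placeholder convention are hypothetical, and the expected status of 404 follows the rule above.

```typescript
type HttpMethod = 'GET' | 'PATCH' | 'PUT' | 'DELETE';

interface EndpointSpec {
    method: HttpMethod;
    pathTemplate: string; // e.g. '/products/:id'
}

interface IsolationCase {
    name: string;
    method: HttpMethod;
    url: string;
    expectedStatus: number;
}

// Produces one cross-tenant case per endpoint for a resource owned by tenant B.
function buildIsolationCases(endpoints: EndpointSpec[], resourceId: string): IsolationCase[] {
    return endpoints.map(endpoint => ({
        name: `tenant A cannot ${endpoint.method} ${endpoint.pathTemplate} owned by tenant B`,
        method: endpoint.method,
        url: endpoint.pathTemplate.replace(':id', encodeURIComponent(resourceId)),
        expectedStatus: 404,
    }));
}
```

The resulting array can be fed to a table-driven runner such as Jest's `it.each`, with each case issuing its request using tenant A's token and asserting on `expectedStatus`.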

The Search Isolation Test

Verify that search results are tenant-scoped:

it('search results are tenant-scoped', async () => {
    await createProduct(tenantA.token, { name: 'Widget Alpha' });
    await createProduct(tenantB.token, { name: 'Widget Beta' });
    await waitForSearchIndex();

    const results = await api.get('/search?q=Widget', {
        headers: { Authorization: `Bearer ${tenantA.token}` },
    });

    expect(results.body.items).toHaveLength(1);
    expect(results.body.items[0].name).toBe('Widget Alpha');
    // Widget Beta must not appear
});

The Bulk Operation Test

Verify that bulk operations (exports, imports, batch updates) respect tenant boundaries:

it('export only includes own tenant data', async () => {
    const exportResult = await api.post('/export/products', {
        headers: { Authorization: `Bearer ${tenantA.token}` },
    });

    const exportedIds = exportResult.body.products.map(p => p.id);
    const tenantBProducts = await getAllProducts(tenantB.token);
    const tenantBIds = tenantBProducts.map(p => p.id);

    // No tenant B product IDs in tenant A's export
    const overlap = exportedIds.filter(id => tenantBIds.includes(id));
    expect(overlap).toHaveLength(0);
});

For how we approach testing more broadly, see our software engineering guide.

Shared vs Dedicated Infrastructure

Model	When to Use	Trade-offs
Shared everything (one DB, one schema)	SaaS with many small tenants	Cheapest. Hardest to isolate. Noisy neighbor risk.
Shared DB, separate schemas	Medium tenants needing logical isolation	Good isolation. More migration complexity.
Dedicated databases	Enterprise tenants with compliance requirements	Best isolation. Most expensive. Hardest to manage.
Dedicated clusters	Regulated industries (healthcare, finance)	Complete isolation. Highest cost. Separate ops per tenant.

For most SaaS applications, shared everything with application-level enforcement (the three-layer model above) is the right choice. It's simpler to operate, cheaper to run, and, provided the enforcement layers are correct and covered by isolation tests, comparably secure against cross-tenant data access.

Dedicated infrastructure becomes necessary when tenants have regulatory requirements that mandate physical isolation (e.g., data must reside in a specific country), or when one tenant's workload is so large that it affects others (the noisy neighbor problem).
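When a system mixes these models (most tenants shared, a few dedicated), the choice can be made per tenant at connection time. The sketch below is illustrative: `TenantPlacement`, `resolveDataSource`, and the default schema name `public` (a PostgreSQL convention) are assumptions, not part of the systems described in this article.

```typescript
type IsolationModel = 'shared' | 'schema' | 'database';

interface TenantPlacement {
    tenantId: string;
    model: IsolationModel;
    schemaName?: string;   // required when model is 'schema'
    databaseUrl?: string;  // required when model is 'database'
}

interface DataSource {
    url: string;
    schema: string;
}

// Picks the connection target for a tenant. Missing configuration is an
// error: falling back to the shared database would silently weaken isolation.
function resolveDataSource(placement: TenantPlacement, sharedUrl: string): DataSource {
    switch (placement.model) {
        case 'shared':
            return { url: sharedUrl, schema: 'public' };
        case 'schema':
            if (!placement.schemaName) {
                throw new Error(`Tenant ${placement.tenantId} has no schema configured`);
            }
            return { url: sharedUrl, schema: placement.schemaName };
        case 'database':
            if (!placement.databaseUrl) {
                throw new Error(`Tenant ${placement.tenantId} has no database configured`);
            }
            return { url: placement.databaseUrl, schema: 'public' };
    }
}
```

The three enforcement layers stay the same in every case; only the target of the scoped query changes.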

Common Pitfalls

Treating multi-tenancy as a database problem. The database model (shared vs dedicated) is the least important decision. The enforcement model (three layers) is the most important.
Enforcing isolation by convention. "Developers should always include tenant_id in queries" is not a strategy. Make it architecturally impossible to query without tenant context.
Returning 403 for cross-tenant access. Return 404. A 403 confirms the resource exists, which leaks information across tenants.
No cross-tenant access tests. Every endpoint needs a test that verifies tenant A cannot access tenant B's data. For both reads and writes.
Forgetting search index isolation. Database queries might be scoped, but if the search index isn't filtered by tenant, search results leak across tenants.
Shared caches without tenant key prefix. If your Redis cache key is product:123, it's shared across tenants. Use tenant_abc:product:123.
Background jobs without tenant context. A scheduled job that processes "all pending orders" without tenant scoping processes every tenant's orders in one batch. Pass tenant context through job payloads.
No rate limiting per tenant. One tenant's bulk import shouldn't degrade performance for all other tenants. Rate limit per tenant, not just per IP.
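
Two of the pitfalls above, cache keys and background jobs, come down to carrying the tenant ID in a string or a payload. A minimal sketch, with hypothetical helpers `tenantCacheKey` and `fanOutPerTenant` and the `tenant_<id>:` prefix format taken from the cache example above:

```typescript
// Builds a cache key that always starts with the tenant prefix,
// e.g. tenantCacheKey('abc', 'product', '123') gives "tenant_abc:product:123".
function tenantCacheKey(tenantId: string, ...parts: string[]): string {
    if (!tenantId) {
        throw new Error('tenantId is required for cache keys');
    }
    return [`tenant_${tenantId}`].concat(parts).join(':');
}

// Job payload that cannot be constructed without a tenant.
interface TenantJobPayload<T> {
    tenantId: string;
    jobName: string;
    data: T;
}

// Turns one "process everything" schedule into one job per tenant,
// so each run executes with a single tenant's context.
function fanOutPerTenant<T>(
    jobName: string,
    tenantIds: string[],
    data: T,
): TenantJobPayload<T>[] {
    return tenantIds.map(tenantId => ({ tenantId, jobName, data }));
}
```

The worker that consumes a `TenantJobPayload` rebuilds the tenant context from `tenantId` before calling any repository, exactly as the API middleware does for HTTP requests.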

Key Takeaways

Three-layer enforcement is non-negotiable. API middleware, query filters, and policy enforcement. All three. Every request, every query, every action.
Hierarchical scoping handles enterprise complexity. Flat scoping works for simple SaaS. Enterprise platforms need tenant, channel, customer, session, and thread-level scoping.
Make cross-tenant access architecturally impossible. Don't rely on developers remembering WHERE clauses. Use repository base classes that require tenant_id as a mandatory parameter.
Test isolation explicitly. Every endpoint needs a cross-tenant access test. For reads, writes, searches, exports, and bulk operations.
Return 404, not 403. Cross-tenant access attempts should look like the resource doesn't exist, not like the user doesn't have permission.
Shared infrastructure with application-level isolation works for most cases. Dedicated infrastructure is for regulatory requirements or noisy neighbor problems, not for security.

We apply these patterns across our AI services, custom software projects, and commerce platforms. If you're designing a multi-tenant system, talk to our team or request a quote. You can also explore our solutions page and our trust and compliance approach for how we handle tenant isolation guarantees.
