Technical Guide

E-Commerce Search Architecture: MeiliSearch, OpenSearch, and Real Migration Stories

How to design product search for commerce. MeiliSearch vs OpenSearch vs Elasticsearch, index design, faceted search, multilingual strategies, hybrid search, and real-time sync from PIM and commerce systems.

February 3, 2026 | 18 min read | Oronts Engineering Team

Why Product Search Is Not Text Search

Product search looks simple. A user types "blue running shoes size 42" and expects relevant results. But the implementation is fundamentally different from document search or web search. Products have structured attributes (size, color, price, brand), hierarchical categories, availability that changes in real time, localized names and descriptions, and facets that users expect to filter by.

A document search engine finds documents that match a query. A product search engine must find products, rank them by commercial relevance (not just text relevance), present filterable facets, handle typos and synonyms, support multiple languages, and update in real time when inventory changes.

We've built product search systems on both MeiliSearch and OpenSearch, migrating from Elasticsearch 7.4 in one case and building from scratch in another. This article covers the architecture decisions, not the configuration details. For vector search patterns specifically, see our vector search architecture guide. For the broader commerce context, see our ecommerce platforms guide.

MeiliSearch vs OpenSearch vs Elasticsearch

Criteria	MeiliSearch	OpenSearch	Elasticsearch
Language	Rust	Java	Java
Typo tolerance	Built-in, excellent	Built-in fuzzy queries (manual tuning)	Built-in fuzzy queries (manual tuning)
Faceted search	Built-in, fast	Built-in (aggregations)	Built-in (aggregations)
Vector search	Experimental	Built-in (k-NN)	Built-in (dense_vector)
Multilingual	Good (language-specific tokenizers)	Excellent (analyzers per field)	Excellent (analyzers per field)
Real-time indexing	Near instant (< 50ms)	Near real-time (1s refresh)	Near real-time (1s refresh)
Complexity	Low (single binary, REST API)	High (cluster, shards, replicas)	High (cluster, shards, replicas)
Memory usage	Low (Rust, efficient)	High (JVM heap)	High (JVM heap)
Operational cost	Low (runs on small instances)	Medium to high	Medium to high
Sorting	Built-in ranking rules	Flexible sort	Flexible sort
License	MIT	Apache 2.0	SSPL / Elastic License 2.0 (AGPLv3 option added in 2024)
Best for	Small to medium catalogs (< 500K products)	Large catalogs, complex queries, hybrid search	Same as OpenSearch (if already invested)

When to Choose MeiliSearch

Catalog under 500K products
Typo tolerance is critical (consumer-facing search)
Team has limited search infrastructure experience
Fast setup matters more than advanced query features
Budget is tight (runs on a single small instance)

When to Choose OpenSearch

Catalog over 100K products with complex facets
Need hybrid search (text + vector / k-NN)
Multiple consumer groups process the same index
Already on AWS (OpenSearch Serverless is managed)
Need advanced aggregations and analytics on search data

When to Choose Elasticsearch

Already running Elasticsearch and no reason to migrate
Need specific Elastic-only features (ML inference, security)
Enterprise support contract is required

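For most new commerce projects, we recommend MeiliSearch for simplicity or OpenSearch for power. Elasticsearch's licensing (SSPL and Elastic License, with AGPLv3 added as an option in 2024) makes it less attractive for new deployments than the permissively licensed alternatives.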

Index Design

The most common mistake: indexing your database schema directly. Product tables are normalized. Search indices must be denormalized.

// Database: normalized (relational)
// products table: id, name, category_id, brand_id
// categories table: id, name, parent_id
// variants table: id, product_id, sku, price, size, color
// translations table: id, product_id, locale, name, description

// Search index: denormalized (flat document)
interface ProductSearchDocument {
    id: string;
    name: string;                    // current locale
    description: string;             // current locale
    slug: string;
    sku: string[];                   // all variant SKUs
    brand: string;                   // denormalized from brand table
    categories: string[];            // full hierarchy: ["Shoes", "Running", "Trail"]
    categoryIds: string[];           // for facet filtering
    price: number;                   // lowest variant price (for sorting)
    priceRange: { min: number; max: number };
    sizes: string[];                 // all available sizes
    colors: string[];                // all available colors
    inStock: boolean;                // any variant in stock
    imageUrl: string;                // primary image
    rating: number;                  // average review rating
    reviewCount: number;
    tags: string[];                  // searchable tags
    createdAt: number;               // for "newest first" sorting
    popularity: number;              // sales count or view count
}

Rules for denormalization:

Flatten all relations into the document (brand name, not brand ID)
Include the full category hierarchy as an array (enables facet drill-down)
Include all variant attributes (sizes, colors) as arrays on the product
Use the lowest price for sorting, price range for display
Include computed fields (rating, reviewCount, popularity) for ranking
One document per product per locale (not one document with all locales)
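
The rules above can be sketched as a single mapping function. This is a minimal illustration, not the schema of any particular platform: the row types (`ProductRow`, `VariantRow`), the `FlatProductDoc` subset, and the `flattenProduct` name are all assumptions made for the example.

```typescript
// Hypothetical normalized rows; field names are illustrative only.
interface VariantRow { sku: string; price: number; size: string; color: string; stock: number; }
interface ProductRow {
    id: string;
    brandName: string;
    categoryPath: string[]; // root-to-leaf category names
    translations: { [locale: string]: { name: string; description: string } };
    variants: VariantRow[];
}

// Subset of the search document, enough to show the flattening rules.
interface FlatProductDoc {
    id: string;
    name: string;
    description: string;
    brand: string;
    categories: string[];
    sku: string[];
    price: number;
    priceRange: { min: number; max: number };
    sizes: string[];
    colors: string[];
    inStock: boolean;
}

// Deduplicate while keeping first-seen order.
function unique(values: string[]): string[] {
    return values.filter((v, i) => values.indexOf(v) === i);
}

// Builds one flat document for one product in one locale.
function flattenProduct(product: ProductRow, locale: string): FlatProductDoc {
    const t = product.translations[locale];
    if (!t) throw new Error(`No ${locale} translation for product ${product.id}`);
    if (product.variants.length === 0) throw new Error(`Product ${product.id} has no variants`);
    const prices = product.variants.map(v => v.price);
    const min = Math.min(...prices);
    const max = Math.max(...prices);
    return {
        id: product.id,
        name: t.name,
        description: t.description,
        brand: product.brandName,          // brand name, not brand ID
        categories: product.categoryPath,  // full hierarchy as an array
        sku: product.variants.map(v => v.sku),
        price: min,                        // lowest price, used for sorting
        priceRange: { min, max },          // used for display
        sizes: unique(product.variants.map(v => v.size)),
        colors: unique(product.variants.map(v => v.color)),
        inStock: product.variants.some(v => v.stock > 0),
    };
}
```

Calling `flattenProduct(product, 'de')` once per supported locale yields the one-document-per-product-per-locale layout described in the last rule.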

One Index Per Locale

For multilingual commerce, create one index per locale:

products_en
products_de
products_fr
products_ar
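
With this naming scheme, the storefront has to pick an index for each request. A minimal sketch, assuming the four locales listed above and a hypothetical `indexForLocale` helper that falls back to English for unsupported locales:

```typescript
// Locales that have their own index; mirrors the list above.
const SUPPORTED_LOCALES = ['en', 'de', 'fr', 'ar'];

// Hypothetical helper: maps a request locale to one of the per-locale indices.
function indexForLocale(requestLocale: string, fallback: string = 'en'): string {
    // "de-AT" and "de_AT" both reduce to the base language "de".
    const base = requestLocale.toLowerCase().split(/[-_]/)[0] || fallback;
    const locale = SUPPORTED_LOCALES.indexOf(base) >= 0 ? base : fallback;
    return `products_${locale}`;
}
```

Whether regional variants such as `de-AT` should share the `de` index or get their own is a catalog decision; the sketch assumes they share.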

Each index uses language-specific analyzers, tokenizers, and stop words. A German search for "Laufschuhe" uses German stemming. An Arabic search uses Arabic morphological analysis. Mixing locales in one index forces compromises on analysis that degrade quality for every language.

// MeiliSearch: one index per locale
await meili.createIndex('products_de', { primaryKey: 'id' });
await meili.index('products_de').updateSettings({
    searchableAttributes: ['name', 'description', 'brand', 'tags', 'categories'],
    filterableAttributes: ['categories', 'brand', 'sizes', 'colors', 'price', 'inStock'],
    sortableAttributes: ['price', 'createdAt', 'popularity', 'rating'],
});
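
On OpenSearch or Elasticsearch, per-locale analysis is usually expressed in the index mapping. The sketch below picks one of the engines' built-in language analyzers (`english`, `german`, `french`, `arabic`) per locale. The function name `indexBodyForLocale` and the choice of fields are assumptions for illustration; a production mapping would likely add custom synonym and stop-word filters.

```typescript
// Built-in language analyzers in OpenSearch/Elasticsearch, keyed by locale.
const ANALYZER_BY_LOCALE: { [locale: string]: string } = {
    en: 'english',
    de: 'german',
    fr: 'french',
    ar: 'arabic',
};

// Returns an index-creation body whose text fields use the locale's analyzer.
function indexBodyForLocale(locale: string) {
    const analyzer = ANALYZER_BY_LOCALE[locale] || 'standard';
    // Text field with a keyword sub-field, so it can be both searched and aggregated.
    const textWithKeyword = { type: 'text', analyzer, fields: { keyword: { type: 'keyword' } } };
    return {
        mappings: {
            properties: {
                name: { type: 'text', analyzer },
                description: { type: 'text', analyzer },
                brand: textWithKeyword,
                categories: textWithKeyword,
                sizes: { type: 'keyword' },
                colors: { type: 'keyword' },
                price: { type: 'float' },
                inStock: { type: 'boolean' },
            },
        },
    };
}
```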

Faceted Search Architecture

Facets are the filters on the left side of every commerce search page. They look simple but require careful design.

Facet Types

Type	Example	Implementation
Term facet	Brand: Nike (42), Adidas (38)	Term aggregation on `brand` field
Range facet	Price: 0-50 (15), 50-100 (28), 100+ (12)	Range aggregation on `price` field
Hierarchical facet	Category: Shoes > Running > Trail	Multi-level term aggregation on category hierarchy
Boolean facet	In Stock: Yes (89), No (11)	Term aggregation on `inStock` field
Color facet	Color swatches with counts	Term aggregation on `colors` array field
Size facet	Size: 40 (5), 41 (8), 42 (12)	Term aggregation on `sizes` array field
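
For the hierarchical facet, one common convention (used by InstantSearch-style hierarchical menus, and an assumption here rather than something the index design above prescribes) is to store one path value per depth, so each level can be faceted independently. The `lvl0`, `lvl1`, ... field names are hypothetical:

```typescript
// Expands a root-to-leaf category path into one cumulative path value per depth.
function categoryLevels(path: string[], separator: string = ' > '): { [level: string]: string } {
    const levels: { [level: string]: string } = {};
    path.forEach((_, depth) => {
        levels[`lvl${depth}`] = path.slice(0, depth + 1).join(separator);
    });
    return levels;
}

// categoryLevels(['Shoes', 'Running', 'Trail'])
// → { lvl0: 'Shoes', lvl1: 'Shoes > Running', lvl2: 'Shoes > Running > Trail' }
```

Faceting on `lvl1` after filtering on `lvl0 = "Shoes"` then gives the drill-down counts for the second level without mixing in identically named subcategories from other branches.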

Facet Interaction

When a user selects a facet, the other facets must update to reflect the filtered results. This is called "facet refinement" and is the most complex part of search UI.

// MeiliSearch: facet counts with active filters
const results = await meili.index('products_de').search('laufschuhe', {
    filter: ['brand = "Nike"', 'inStock = true'],
    facets: ['categories', 'brand', 'sizes', 'colors', 'price'],
});

// results.facetDistribution:
// {
//   categories: { "Running": 42, "Trail": 18, "Road": 24 },
//   brand: { "Nike": 42 },  // only Nike (because filtered)
//   sizes: { "40": 5, "41": 8, "42": 12, "43": 10, "44": 7 },
//   colors: { "Black": 20, "White": 15, "Blue": 7 },
// }

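The key UX decision: when a brand filter is active, should the brand facet show only the selected brand (with its count) or all brands (with counts reflecting the current query minus the brand filter)? The second approach ("disjunctive faceting") lets users compare counts across brands. Neither engine returns these counts by default: as the output above shows, facet counts reflect every active filter. In MeiliSearch, the usual approach is one additional query per filtered facet with that facet's own filter removed, batched through the multi-search endpoint. In OpenSearch, the same result can be obtained with separate queries, or in a single request by moving facet filters into post_filter and wrapping each aggregation in a filter aggregation.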
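
One engine-agnostic way to compute disjunctive counts is to plan several queries: a main query with every filter applied, plus one query per filtered facet with that facet's own filter left out. The sketch below only builds the query plan; `ActiveFilters`, `FacetQuery`, and `planDisjunctiveQueries` are hypothetical names, the filter strings follow MeiliSearch's filter expression syntax, and only string-valued facets are handled.

```typescript
type ActiveFilters = { [attribute: string]: string[] };

interface FacetQuery {
    filter: string[]; // filter expressions, ANDed together
    facets: string[]; // attributes to return counts for
}

// One expression per attribute; values within an attribute are ORed.
function toFilterExpressions(filters: ActiveFilters, skip?: string): string[] {
    return Object.keys(filters)
        .filter(attr => attr !== skip && (filters[attr] || []).length > 0)
        .map(attr => (filters[attr] || []).map(v => `${attr} = ${JSON.stringify(v)}`).join(' OR '));
}

function planDisjunctiveQueries(filters: ActiveFilters, facets: string[]): FacetQuery[] {
    // Facets that currently have at least one selected value.
    const refined = Object.keys(filters)
        .filter(attr => (filters[attr] || []).length > 0 && facets.indexOf(attr) >= 0);
    // Main query: all filters applied, counts for the facets that are not filtered.
    const main: FacetQuery = {
        filter: toFilterExpressions(filters),
        facets: facets.filter(f => refined.indexOf(f) < 0),
    };
    // One extra query per refined facet, with that facet's own filter removed.
    const extras: FacetQuery[] = refined.map(attr => ({
        filter: toFilterExpressions(filters, attr),
        facets: [attr],
    }));
    return [main].concat(extras);
}
```

The hits come from the first query; the facet counts are merged from all of them. The extra queries only need counts, so they can request zero hits to keep the responses small.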

Real-Time Sync from Source Systems

Search indices must stay in sync with the source of truth (PIM, commerce database, ERP). The sync architecture depends on the source system.

Event-Driven Sync (Recommended)

The source system emits events on data changes. A worker consumes events and updates the search index.

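// Vendure: sync on product events
// SearchIndexService is the application's own indexing service.
import { Injectable } from '@nestjs/common';
import { EventBus, ProductEvent, ProductVariantEvent } from '@vendure/core';

@Injectable()
export class SearchIndexSubscriber {
    constructor(
        private eventBus: EventBus,
        private searchService: SearchIndexService,
    ) {
        this.eventBus.ofType(ProductEvent).subscribe(async event => {
            if (event.type === 'updated' || event.type === 'created') {
                await this.searchService.indexProduct(event.ctx, event.product.id);
            }
            if (event.type === 'deleted') {
                await this.searchService.removeProduct(event.product.id);
            }
        });

        this.eventBus.ofType(ProductVariantEvent).subscribe(async event => {
            // A variant change affects the parent product's search document.
            // ProductVariantEvent carries an array of variants, so reindex each distinct parent once.
            const productIds = new Set(event.variants.map(variant => variant.productId));
            for (const productId of productIds) {
                await this.searchService.indexProduct(event.ctx, productId);
            }
        });
    }
}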
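
Bulk operations can emit many events for the same product in a short window, for example one event per variant during a price import (an assumption about typical load, not a measured figure). A small coalescing queue between the subscriber and the indexer avoids reindexing the same product repeatedly. The `ReindexQueue` class below is a hypothetical in-memory sketch; a production setup would more likely put a durable job queue in this position.

```typescript
// Collects product IDs from bursts of events and hands them out as deduplicated batches.
class ReindexQueue {
    private pending = new Set<string>();

    // Adding an ID that is already pending is a no-op.
    enqueue(productId: string): void {
        this.pending.add(productId);
    }

    // Returns up to `maxBatch` distinct IDs and removes them from the queue.
    drain(maxBatch: number = 500): string[] {
        const batch = Array.from(this.pending).slice(0, maxBatch);
        batch.forEach(id => this.pending.delete(id));
        return batch;
    }

    get size(): number {
        return this.pending.size;
    }
}
```

Event handlers call `enqueue`, and a worker on a short timer calls `drain` and indexes the returned IDs in one batch.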

Scheduled Full Reindex

Even with event-driven sync, run a scheduled full reindex as a safety net. Events can be lost (broker downtime, worker crash). A nightly full reindex catches anything that event-driven sync missed.

// Nightly full reindex job
async function fullReindex(locale: string) {
    const batchSize = 500;
    let offset = 0;
    let products = [];

    do {
        products = await productService.findAll({ take: batchSize, skip: offset });
        const documents = products.map(p => buildSearchDocument(p, locale));
        await meili.index(`products_${locale}`).addDocuments(documents);
        offset += batchSize;
    } while (products.length === batchSize);
}

Handling Deletions

Product deletions are tricky. If you delete a product from the database, the event-driven sync removes it from the index. But if the event is lost, the deleted product stays in search results.

Two solutions:

Track deletion timestamps and filter by "not deleted" in queries
Full reindex replaces the entire index atomically (swap alias)

// Atomic reindex with alias swap (OpenSearch/Elasticsearch)
async function atomicReindex(locale: string) {
    const newIndex = `products_${locale}_${Date.now()}`;
    await opensearch.indices.create({ index: newIndex, body: indexSettings });

    // Index all products into new index
    await bulkIndex(newIndex, locale);

    // Swap alias atomically
    await opensearch.indices.updateAliases({
        body: {
            actions: [
                { remove: { index: `products_${locale}_*`, alias: `products_${locale}` } },
                { add: { index: newIndex, alias: `products_${locale}` } },
            ],
        },
    });

    // Delete old indices
    await cleanupOldIndices(`products_${locale}_*`, { keepLast: 2 });
}
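
The cleanup step needs to decide which timestamped indices to drop. A minimal sketch of that selection logic, assuming index names follow the `products_<locale>_<timestamp>` pattern used above; `selectIndicesToDelete` is a hypothetical helper, and fetching the list of index names and issuing the deletes is left to the caller.

```typescript
// Given all index names, returns the timestamped indices for `alias` that can be deleted,
// keeping the `keepLast` most recent ones (useful for rollback).
function selectIndicesToDelete(allIndices: string[], alias: string, keepLast: number): string[] {
    const prefix = `${alias}_`;
    const timestamped = allIndices
        // Require an all-digit suffix so "products_de" does not also match "products_de_AT_...".
        .filter(n => n.indexOf(prefix) === 0 && /^\d+$/.test(n.slice(prefix.length)))
        // Newest first, by numeric timestamp suffix.
        .sort((a, b) => Number(b.slice(prefix.length)) - Number(a.slice(prefix.length)));
    return timestamped.slice(keepLast);
}
```

Matching on an all-digit suffix rather than a bare wildcard matters once regional locales exist, because a pattern like `products_de_*` would otherwise also cover the indices of `products_de_AT`.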

Our Vendure Data Hub Plugin implements all of these sync patterns at scale, with 7 different search sinks.

Relevance Tuning

Default search relevance is wrong for commerce. Text relevance (how well the query matches the document) is one signal. Commercial relevance (how likely the user is to buy) is equally important.

Ranking Signals

Signal	Weight	Source
Text match (title)	High	Search engine
Text match (description)	Medium	Search engine
In stock	Critical (boost or filter)	Inventory system
Popularity (sales count)	Medium	Order data
Review rating	Low-Medium	Reviews
Recency (new products)	Low	Product creation date
Margin (internal)	Optional	Business rules

// MeiliSearch: custom ranking rules
await meili.index('products_de').updateSettings({
    rankingRules: [
        'words',           // 1. Text match quality
        'typo',            // 2. Fewer typos rank higher
        'proximity',       // 3. Word proximity
        'attribute',       // 4. Which field matched (name > description, per searchableAttributes order)
        'sort',            // 5. User-requested sort
        'exactness',       // 6. Exact vs partial match
        'popularity:desc', // 7. Popular products rank higher
        'rating:desc',     // 8. Higher-rated products rank higher
    ],
});
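
MeiliSearch applies ranking rules in order, so in the configuration above `rating:desc` only breaks ties that remain after `popularity:desc`. One alternative is to precompute a single blended score at index time and rank on that field instead. The sketch below is an assumption-heavy illustration: `businessScore`, the `RankingSignals` shape, the weights, the 90-day recency window, and the rating prior are all hypothetical and would need tuning against real conversion data.

```typescript
interface RankingSignals {
    salesCount: number;  // e.g. units sold in a recent window
    rating: number;      // average rating, 0 to 5
    reviewCount: number;
    createdAt: number;   // Unix timestamp in milliseconds
}

// Illustrative weights only.
const WEIGHTS = { popularity: 0.6, rating: 0.3, recency: 0.1 };

// Returns a score between 0 and 1, rounded to three decimals.
function businessScore(s: RankingSignals, now: number, maxSalesCount: number): number {
    // Log scaling keeps a few bestsellers from dominating the score.
    const popularity = maxSalesCount > 0
        ? Math.min(1, Math.log(1 + s.salesCount) / Math.log(1 + maxSalesCount))
        : 0;
    // Pull ratings with few reviews toward a neutral 3.0 (weighted average with 5 prior reviews).
    const priorReviews = 5;
    const rating = ((s.rating * s.reviewCount + 3.0 * priorReviews) / (s.reviewCount + priorReviews)) / 5;
    // Linear decay to zero over 90 days.
    const ageDays = Math.max(0, (now - s.createdAt) / 86400000);
    const recency = Math.max(0, 1 - ageDays / 90);
    const score = WEIGHTS.popularity * popularity + WEIGHTS.rating * rating + WEIGHTS.recency * recency;
    return Math.round(score * 1000) / 1000;
}
```

With the result stored on each document (for example as a `businessScore` field), a single `businessScore:desc` rule could replace the separate `popularity:desc` and `rating:desc` entries.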

Boosting In-Stock Products

Out-of-stock products should appear lower in results, not disappear entirely. Users might want to see upcoming products or subscribe to back-in-stock notifications.

// OpenSearch: boost in-stock products
const query = {
    bool: {
        must: [{ match: { searchText: userQuery } }],
        should: [
            { term: { inStock: { value: true, boost: 5.0 } } }, // Strong boost for in-stock
        ],
        filter: [
            { term: { tenant_id: tenantId } },
        ],
    },
};

Hybrid Search for Commerce

Combining text search with vector search improves results for natural language queries while preserving exact match capability for SKUs and product codes.

// OpenSearch: hybrid search (text + vector)
const results = await opensearch.search({
    index: 'products_en',
    body: {
        query: {
            bool: {
                should: [
                    // Text search (handles SKUs, exact product names)
                    { multi_match: { query: userQuery, fields: ['name^3', 'description', 'sku^5', 'tags'], type: 'best_fields' } },
                    // Vector search (handles natural language, semantic similarity)
                    { knn: { embedding: { vector: queryEmbedding, k: 20 } } },
                ],
            },
        },
        // Facets
        aggs: {
            brands: { terms: { field: 'brand.keyword', size: 20 } },
            categories: { terms: { field: 'categories.keyword', size: 30 } },
            price_ranges: { range: { field: 'price', ranges: [{ to: 50 }, { from: 50, to: 100 }, { from: 100 }] } },
        },
    },
});

SKU queries ("ABC-12345") hit the text search path with high precision. Natural language queries ("comfortable shoes for long walks") hit the vector search path with semantic understanding. Both contribute to the final ranking.
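
A `bool`/`should` query adds the raw text and k-NN scores together, and those scores are on different scales. An alternative that is often used for hybrid search is to run the two searches separately and merge the ranked lists with reciprocal rank fusion (RRF), which depends only on rank positions. This is a swapped-in technique for illustration, not the query shown above; the function name and the conventional constant `k = 60` are assumptions.

```typescript
// Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank of d in that list).
// Each input list holds document IDs ordered best-first.
function reciprocalRankFusion(rankedLists: string[][], k: number = 60): string[] {
    const scores = new Map<string, number>();
    rankedLists.forEach(list => {
        list.forEach((id, index) => {
            const rank = index + 1; // ranks are 1-based
            scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
        });
    });
    // Highest fused score first.
    return Array.from(scores.keys())
        .sort((a, b) => (scores.get(b) || 0) - (scores.get(a) || 0));
}
```

Documents that appear in both lists accumulate two contributions and rise to the top, while a document that is first in only one list (such as an exact SKU hit) still ranks ahead of low-ranked results from either path.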

For more on vector search internals, see our vector search architecture guide.

Common Pitfalls

Indexing normalized data. Your search documents must be denormalized. Flatten all relations into the document. Don't reference IDs that require a second lookup.
One index for all locales. Create one index per locale. Mixed-locale indices can't use language-specific analyzers, and search quality degrades for every language.
No facet design. Facets are not an afterthought. Plan which attributes are filterable, how hierarchical categories work, and how facet counts update when filters are applied.
Sync only via scheduled reindex. Event-driven sync gives near-real-time updates. Scheduled reindex is a safety net, not the primary mechanism.
No relevance tuning. Default text relevance is wrong for commerce. Boost in-stock products, incorporate popularity and ratings, and weight title matches higher than description matches.
Ignoring out-of-stock products. Don't remove them from the index. Demote them in ranking. Users may want back-in-stock alerts or to browse upcoming products.
No atomic reindex. If your reindex process fails halfway, you have a partially updated index. Use alias swapping for atomic switchover.
Treating search as a feature, not infrastructure. Search is a core service. It needs its own cluster, its own monitoring, its own scaling strategy. Don't run it on the same server as your database.

Key Takeaways

Product search is not text search. Structured attributes, facets, commercial relevance, real-time inventory, and multilingual support make it fundamentally different.
Denormalize for search, normalize for storage. The search document is a flat, self-contained representation of everything needed to render a search result. No joins, no lookups.
One index per locale. Language-specific analyzers, tokenizers, and stop words produce dramatically better results than a single mixed-language index.
Event-driven sync with scheduled reindex as safety net. Real-time updates for normal operations. Full reindex nightly to catch anything events missed.
Relevance tuning is a business decision. Text match quality, in-stock status, popularity, ratings, and margin are all ranking signals. Default relevance is wrong for commerce.
MeiliSearch for simplicity, OpenSearch for power. MeiliSearch is a strong fit for catalogs under 500K products and brings built-in typo tolerance. OpenSearch handles complex aggregations, hybrid search, and large-scale deployments.

We build search infrastructure as part of our data engineering and ecommerce practice. If you're building or migrating product search, talk to our team or request a quote. Our Vendure Data Hub Plugin includes search sinks for MeiliSearch, OpenSearch, Elasticsearch, Algolia, and Typesense.

Topics covered

ecommerce search, MeiliSearch ecommerce, product search architecture, faceted search, hybrid search commerce, Elasticsearch migration, OpenSearch commerce, search indexing
