Quick Reference
Key Entity Tables
| Domain | Core Tables | ERD Section |
|---|---|---|
| Products | products, brands,
product_categories,
product_attributes,
product_images |
§9a |
| Suppliers | suppliers, x_supplier_brands,
x_supplier_products,
supplier_available_products |
§9a |
| Locations | locations,
location_relationships,
location_settings |
§9a |
| Pricing | pricing_lists, price_tiers,
currencies |
§9a + §9d |
| Customers | customers,
customer_types |
§9a |
| Sales | sales_orders,
sales_order_lines,
sales_order_line_updates |
§9b |
| Purchasing | purchase_orders,
purchase_order_lines |
§9b |
| Transfers | transfer_orders,
transfer_order_lines,
store_orders |
§9b |
| Behaviors | product_behaviors,
supplier_behaviors,
location_behaviors,
category_behaviors |
§9c |
| Demand | demand_forecasts,
demand_forecast_products |
§9c |
| Inventory | inventory_quantities,
inventory_strategies,
inventory_targets |
§9c |
| Recommendations | po_recommendations,
to_recommendations,
assortment_recommendations |
§9c |
| Data Quality | data_quality_runs,
data_quality_checks,
data_quality_check_reports,
data_health_reports |
§9c |
| Statistics | daily_*, weekly_*,
monthly_*, quarterly_*
(16 tables) |
§10 |
| Global | countries, currencies,
client_config, users,
roles |
§9d |
13-Step Pipeline Overview
| Step | Name | Input | Output |
|---|---|---|---|
| 1 | Data Ingestion | ERP APIs, Excel, Market Data | S3 raw-data/ (JSON, CSV, XLSX) |
| 2 | DataQualityGateway | S3 raw JSON | Staging DB checks → gate decision |
| 3 | Transform & Load | S3 raw JSON / Excel | DB core tables (§9a, §9b) |
| 4 | DataHealthMonitor | Freshly loaded DB tables | Warning logs + data_health_reports |
| 5 | BehaviorCalculator | Core tables | *_behaviors tables (§9c) |
| 6 | Assortment Optimizer | Behaviors + settings | assortment_recommendations |
| 7 | Demand Estimation | Behaviors + sales history | demand_forecasts, demand_forecast_products |
| 8 | Aggregation | Core + behaviors | _stats schema (16 tables) |
| 9 | AI/ML Preprocessor | DB behaviors + forecasts | S3 features.parquet |
| 10 | AI/ML Engine | S3 feature Parquet | S3 models (.pkl/.json) + config |
| 11 | Inventory Strategy (MEIO) | Forecasts + settings | inventory_targets |
| 12 | AI/ML Inference | S3 models + DB targets + inventory | forecasts + recommendations → DB |
| 13 | UpliftMonitor | Recommendations + actuals | *_recommendation_impacts |
Design Principles (abridged)
| # | Principle |
|---|---|
| 1 | Data recoverability & reproducibility (S3 = source of truth) |
| 2 | Idempotent transforms (INSERT ... ON CONFLICT DO UPDATE) |
| 3 | Multi-tenancy & client isolation (separate SQL Database schemas) |
| 4 | Per-client customization (subclass overrides) |
| 5 | Different input ERP data structures (client-specific transforms) |
| 6 | Strong cohesion, loose coupling |
| 7 | Maintainability & easy extensibility |
| 8 | Data scientists work with Parquet |
| 9 | DB-level domain constraints (lookup tables + FKs) |
| 10 | Audit-proof configuration (append-only, versioned) |
| 11 | Explicit naming (no abbreviations, x_ junctions) |
| 12 | No cyclical data dependencies (strict DAG) |
| 13 | Minimal nullable FKs (5 approved exceptions) |
| 14 | No polymorphic tables (per-entity typed tables) |
| 15 | 4NF: No bare string enums (FK to lookup tables) |
| 16 | Cached columns with documented authority + staleness detection |
| 17 | Structured over schemaless (JSONB only for dynamic data) |
| 18 | Aggressive DB-level constraints (UNIQUE, NOT NULL, CHECK) |
| 19 | Serialized pipeline execution (one run at a time per schema) |
| 20 | No hard deletes on ERP-sourced entities (soft-delete only) |
| 21 | Atomic pipeline steps (per-step transactions, rollback on failure) |
| 22 | Date partitioning for time-series tables (monthly/quarterly) |
| 23 | Dates client-local, timestamps UTC |
| 24 | Data quality gating (hard gate before DB load) |
Glossary
| Term | Definition |
|---|---|
| Echelon | A level in the supply chain hierarchy (Store → DC → Warehouse). Each echelon has its own inventory strategies. |
| MEIO | Multi-Echelon Inventory Optimization — coordinated inventory targets across the full supply chain. |
| Newsvendor | Single-period stochastic inventory model. Balances overage cost (Co) vs. underage cost (Cu) using critical ratio Cu/(Cu+Co). |
| FMR | Fast / Medium / Rare — segmentation by % days with sales. |
| ABC | Pareto classification by cumulative revenue — A (top 80%), B (80–95%), C (95–100%). |
| XYZ | Demand variability — X (CV ≤ 0.5), Y (CV ≤ 1.0), Z (CV > 1.0). |
| Hotness Score | Composite 0–100 product ranking: velocity (0.25) + trend (0.15) + shelf (0.15) + margin (0.15) + volume (0.30). |
| Phase-out Score | Algorithmic dead / declining product detection based on sales recency, velocity, and supplier availability. |
| Upsert | INSERT ... ON CONFLICT DO UPDATE —
idempotent write ensuring re-running a transform produces
identical DB state. |
| UK | UNIQUE constraint (single-column, marked in ERD column definitions). |
| Junction table | Many-to-many cross table, prefixed with x_
(e.g. x_supplier_products). |
| CASCADE | Impact severity: change propagates through 4+ pipeline steps, affects forecasts AND recommendations. |
| GATE | Impact severity: binary on/off — includes or excludes an entity from the entire pipeline. |
| _stats schema | Derivative statistics schema
(e.g. cw_200_stats). Droppable and
rebuildable from the core schema. |
| cw_global | Shared reference schema for cross-client data: geography, currencies, users, client configuration. |