Tensor Hierarchy Database Schema
Data dictionary for the six SQLite tables added in Feature 025 (tensor
hierarchy). These tables extend the existing 3NF schema in
src/babylon/data/reference/schema.py.
For architectural context, see Tensor Hierarchy Architecture. For the loader APIs that populate these tables, see BEA Input-Output Tables and BTS FAF5 Freight Analysis Data.
BEA I-O Tables
dim_bea_io_table_type
Classifies which BEA Input-Output table a coefficient was derived from. One row per table type (USE, MAKE, SUPPLY, TOTAL_REQ).
Column |
Type |
Null |
Description |
|---|---|---|---|
|
INTEGER |
NOT NULL |
Auto-increment primary key. |
|
VARCHAR(20) |
NOT NULL |
BEA table identifier. Unique. Check constraint:
|
|
TEXT |
NULL |
Human-readable description of the table type. |
Allowed values for table_type:
Value |
Description |
|---|---|
|
Use table at Producers’ prices (IOUse_Before_Redefinitions_PRO). Commodity use by industry. Source for direct requirements A[i,j]. This is the primary value loaded by BEAIOLoader. |
|
Make table (IOMake_Before_Redefinitions_PRO). Industry-by-commodity output structure. |
|
Supply table. Total supply of each commodity from all sources (domestic production + imports). |
|
Total requirements table (Leontief inverse) as published by BEA. Alternative to computing L = (I − A)⁻¹ internally. |
Populated by: BEAIOLoader.
fact_bea_io_coefficient
Sparse storage of BEA Input-Output direct requirements coefficients A[i,j] by year and industry pair. Zero entries are omitted (sparse storage).
One row per (year, table_type, source_industry, target_industry) combination. A[i,j] = dollar value of industry i’s output required per dollar of industry j’s output.
Column |
Type |
Null |
Description |
|---|---|---|---|
|
INTEGER |
NOT NULL |
Auto-increment primary key. |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
FLOAT |
NOT NULL |
Direct requirements coefficient A[i,j]. Dollar value of industry i output required per dollar of industry j output. Range: [0, 1). Zero values are not stored (sparse storage). |
Unique constraint: (time_id, table_type_id, source_industry_id, target_industry_id)
named uq_bea_io_coeff. Enables idempotent upsert on reload.
Indexes:
idx_bea_io_coeff_timeontime_ididx_bea_io_coeff_table_typeontable_type_ididx_bea_io_coeff_sourceonsource_industry_ididx_bea_io_coeff_targetontarget_industry_id
Note on matrix semantics: A[i,j] uses source=row, target=column convention.
source_industry_id corresponds to row i (the input commodity industry);
target_industry_id corresponds to column j (the using industry). To
reconstruct the full matrix for year Y:
coeffs = session.query(FactBEAIOCoefficient).filter(time_id=Y, table_type=USE)
matrix[src_idx, tgt_idx] = coeff.coefficient
Populated by: BEAIOLoader.
CFS Area and FAF Flow Tables
dim_cfs_area
Census Commodity Flow Survey (CFS) geographic areas — the ~130 zones used by the Bureau of Transportation Statistics for freight analysis.
CFS areas are aggregations of one or more counties. They provide a geographic resolution between individual counties (~3,100) and states (51). The exact count varies by FAF5 version; FAF5.7.1 uses approximately 132 zones.
Column |
Type |
Null |
Description |
|---|---|---|---|
|
INTEGER |
NOT NULL |
Auto-increment primary key. |
|
VARCHAR(10) |
NOT NULL |
CFS Area code as string (e.g., |
|
VARCHAR(200) |
NOT NULL |
Human-readable CFS area name (e.g., |
|
INTEGER |
NULL |
Foreign key → |
Index: idx_cfs_area_state on state_id.
Populated by: FAFLoader (on-demand,
one record per unique CFS code encountered in FAF5 CSV).
bridge_cfs_county
Many-to-many junction table mapping each CFS Area to the counties it contains, with allocation weights. Weights represent what fraction of the CFS area’s freight is attributable to each constituent county.
Column |
Type |
Null |
Description |
|---|---|---|---|
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
NUMERIC(10,8) |
NOT NULL |
Fraction of CFS area allocated to this county. For all counties in a CFS area, weights sum to 1.0. |
Primary key: Composite (cfs_area_id, county_id).
Allocation weight semantics: To disaggregate a CFS-area-level value (e.g., imperial rent φ[a]) to county level:
county_value = phi_cfs_area * allocation_weight
Weights are typically derived from population shares, employment shares, or employment-weighted population proxies. The bridge table is not populated by FAFLoader; it requires a separate county-mapping ETL step.
dim_sctg_commodity
Standard Classification of Transported Goods (SCTG) commodity codes. The SCTG classification is used by the Census Bureau’s Commodity Flow Survey and the BTS Freight Analysis Framework. FAF5 uses 2-digit SCTG codes (42 codes, numbered 01–43 with code 40 unused).
Column |
Type |
Null |
Description |
|---|---|---|---|
|
INTEGER |
NOT NULL |
Auto-increment primary key. |
|
VARCHAR(5) |
NOT NULL |
Zero-padded 2-digit SCTG code (e.g., |
|
VARCHAR(200) |
NOT NULL |
Commodity description (e.g., |
|
VARCHAR(50) |
NULL |
Broad commodity category: |
|
VARCHAR(20) |
NULL |
Strategic importance for simulation events.
Check constraint: |
Indexes: idx_sctg_category on category.
Populated by: FAFLoader (on-demand,
one record per unique sctg2 code encountered in FAF5 CSV).
fact_faf_commodity_flow
BTS FAF5 commodity flows at CFS Area geographic resolution. One row per (origin, destination, commodity, mode, year) combination.
Distinct from fact_commodity_flow (county-level Census CFS data).
This table preserves FAF5’s native ~130-area resolution rather than
disaggregating to counties.
Column |
Type |
Null |
Description |
|---|---|---|---|
|
INTEGER |
NOT NULL |
Auto-increment primary key. |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Foreign key → |
|
INTEGER |
NOT NULL |
Calendar year of the flow estimate (e.g., 2017, 2022). |
|
NUMERIC(14,2) |
NULL |
Estimated shipment value in millions of USD.
|
|
NUMERIC(14,2) |
NULL |
Estimated shipment weight in thousands of short tons.
|
|
NUMERIC(14,2) |
NULL |
Estimated ton-miles in millions.
|
|
VARCHAR(10) |
NULL |
Transport mode code as string: |
Indexes:
idx_faf_flow_originonorigin_cfs_area_ididx_faf_flow_destondest_cfs_area_ididx_faf_flow_sctgonsctg_ididx_faf_flow_yearonyear
Sparsity note: All three flow value fields use NULL for zero rather
than storing explicit zeros. This reduces storage significantly since FAF5
has many zero-flow origin-destination pairs for specific commodity codes and
modes. When reconstructing the O-D matrix, treat NULL as 0.
Scale (2022 data): ~2.49 million records, representing $18.7 trillion in domestic commodity shipments across ~130 CFS areas, 42 SCTG codes, and 5 transport modes.
Populated by: FAFLoader.