BEA Input-Output Tables
Reference for the Bureau of Economic Analysis (BEA) Input-Output data files
used in Feature 025, including file inventory, XLSX structure, the coefficient
computation formula, and the BEAIOLoader API.
For the theoretical background, see Input-Output Economics and the Leontief Inverse. For the schema tables these populate, see Tensor Hierarchy Database Schema.
Data Overview
The BEA publishes annual Input-Output accounts at three levels of industry detail: Detail (~400 industries), Summary (~70 industries), and Sector (~15). Babylon uses the Summary level because it aligns with the four-Marxian- department aggregation (see BEA Industry to Marxian Department Mapping).
Files are downloaded from the BEA I-O Accounts data page and stored in
data/input-output/ relative to the project root. The files are large
multi-sheet XLSX workbooks, one sheet per year.
File Inventory
File path |
Contents |
|---|---|
|
Primary source for BEAIOLoader. Use table at Producers’ prices, Summary level. Multi-sheet XLSX, one sheet per year (1997–2024). Contains commodity-by-industry intermediate use data. |
|
Use table at Purchasers’ prices, Summary level. Not used by default. |
|
Make table, Summary level (industry-by-commodity output structure). |
|
Use table at Producers’ prices, Sector level (~15 industries). |
|
Make table, Sector level. |
|
Supply table, Summary level. |
|
Use table (supply-use framework), Summary level. |
|
Industry-by-industry total requirements (Leontief inverse as published
by BEA). Useful for validation against |
|
Commodity-by-commodity total requirements, Summary level. |
|
Domestic-only total requirements (excludes imports), Summary level. |
Loader default: BEAIOLoader reads only
make-use/IOUse_Before_Redefinitions_PRO_Summary.xlsx.
XLSX Row Structure
Each sheet in the Use table XLSX (one sheet per year) follows this fixed row layout (0-indexed):
Row |
Content |
Example |
|---|---|---|
0 |
Title string |
|
1 |
Units |
|
2 |
Source |
|
3 |
Year |
|
4 |
Empty |
(blank row) |
5 |
Column codes |
|
6 |
Column names |
|
7+ |
Data rows |
|
Column extraction: Industry columns start at column index 2 of row 5.
Final-demand columns (prefixed F) are excluded. The parser identifies
industry columns as those whose row-5 code does not start with F.
Row codes: Each data row’s first cell (column 0) is the IOCode. Special rows are excluded from the industry list:
T001–T020: Total/subtotal rows (intermediate totals, final demand totals, gross output total T019)V001–V006: Value-added rows (compensation, gross operating surplus, etc.)
Only rows whose IOCode is not in the _SKIP_CODES set are treated as
industry rows.
Missing Data
BEA uses the string '...' to indicate suppressed or unavailable data.
The loader converts '...' to 0.0. This understates inter-industry
linkages in sectors with incomplete survey coverage but is the conservative
default.
Coefficient Computation
The Use table contains raw dollar values: Use[i,j] = millions USD of commodity i used by industry j as intermediate input.
The direct requirements coefficient is:
Gross output source: Row T019 ("T019" IOCode) contains gross output by
industry, if present. The loader reads this row into a gross_output dict
mapping industry code → gross output value (minimum 1.0 to avoid division by
zero).
Fallback when T019 is absent: If T019 is not found, gross output is approximated as column sum: total intermediate use plus value-added rows:
This fallback is less accurate but produces valid (if slightly off) coefficients.
BEAIOLoader API
- class BEAIOLoader
Loader for BEA I-O coefficients into the 3NF SQLite schema.
Reads
IOUse_Before_Redefinitions_PRO_Summary.xlsxfromdata/input-output/make-use/. Each sheet (named by year, e.g.,'1997','2021') is parsed and loaded independently.Prerequisite:
dim_bea_industrymust be pre-populated byBEANationalLoaderbefore BEAIOLoader can run. The industry lookup maps BEA codes tobea_industry_idvalues.- __init__(config=None, data_dir=None)
- Parameters:
config –
LoaderConfigfor batch size and operational settings. Default:LoaderConfig()withbatch_size=1000.data_dir – Base data directory. Default:
Path("data")relative to project working directory.
- load(session, reset=True, verbose=True, **kwargs) LoadStats
Load BEA I-O coefficients into 3NF schema.
- Parameters:
session – SQLAlchemy
Sessionconnected to the normalized DB.reset – If
True, delete existing records infact_bea_io_coefficientanddim_bea_io_table_typebefore loading. Default:True.verbose – If
True, log progress per sheet. Default:True.kwargs – Ignored (present for interface compatibility).
- Returns:
LoadStatswith fields:dimensions_loaded["dim_bea_io_table_type"]: 1 (USE record)facts_loaded["fact_bea_io_coefficient"]: total coefficient rowsfiles_processed: number of year sheets successfully parsederrors: list of error strings (empty on success)
- Raises:
Does not raise; errors are appended to
LoadStats.errors.
Tables populated:
dim_bea_io_table_type: Creates/retrieves theUSErecord.fact_bea_io_coefficient: Inserts non-zero coefficient rows for each year via upsert (conflict key:uq_bea_io_coeff).
Year range: Sheets named with valid years (1990–2030) are processed. Sheet names that do not parse as integers in this range are skipped.
Example:
from babylon.data.bea.io_loader import BEAIOLoader from babylon.data.reference.database import get_session_factory loader = BEAIOLoader() session_factory = get_session_factory() with session_factory() as session: stats = loader.load(session) print(f"Loaded {stats.facts_loaded['fact_bea_io_coefficient']} coefficients") print(f"Sheets processed: {stats.files_processed}")
IOMatrixParser API
- class IOMatrixParser
Internal parser for BEA I-O XLSX row data. Used by
BEAIOLoader; can be used independently for custom XLSX processing.- parse_rows(rows) tuple[list[str], np.ndarray]
Parse a list of row tuples (from
openpyxl iter_rows(values_only=True)) into industry codes and coefficient matrix.- Parameters:
rows – List of row tuples. Expected format: 7 header rows then data.
- Returns:
Tuple of
(industries, matrix)where:industries: Ordered list of BEA industry codes (columns).matrix: Direct requirements matrix A, shape(n, n),float64. Zero if insufficient data.
How to Obtain the Data
BEA I-O data files are already included in the repository under
data/input-output/. No manual download is required.
If refreshing to a newer BEA release:
Go to https://www.bea.gov/industry/input-output-accounts-data
Download “Use Tables/Before Redefinitions/Producers’ Prices/Summary” XLSX
Replace
data/input-output/make-use/IOUse_Before_Redefinitions_PRO_Summary.xlsxRe-run
BEAIOLoader.load()