QCEW Employment Data
The QCEW (Quarterly Census of Employment and Wages) module provides access to employment and wage data from the Bureau of Labor Statistics for Babylon’s simulation systems.
Overview
QCEW is integrated into Babylon to provide:
County-level employment: Detailed establishment and worker counts by NAICS industry
State-level aggregates: Broader geographic patterns for macro analysis
Metro area data: MSA, Micropolitan, and CSA aggregates for urban economic analysis
Location quotients: Industry concentration metrics for comparative analysis
Class composition: NAICS industry mapping to Marxian class categories
All data is stored in data/sqlite/marxist-data-3NF.sqlite using a normalized
star schema.
Setup
No setup is required for API loading. QCEW uses a hybrid loading strategy:
API (2021-2025): Fetches directly from BLS QCEW Open Data API
Files (2013-2020): Reads from local CSV files (optional, for historical data)
For file-based loading of historical years, download CSV files from: https://www.bls.gov/cew/downloadable-data-files.htm
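As a rough illustration of the hybrid strategy, the sketch below picks a source for each year. The helper choose_source and the constant API_FIRST_YEAR are hypothetical names, not part of babylon. The 2021 cutoff is taken from the year ranges listed above, and the two flags only loosely mirror --force-api and --force-files.

```python
from typing import Literal

# Assumption: the API serves 2021 onward, mirroring the ranges listed above.
API_FIRST_YEAR = 2021


def choose_source(
    year: int, force_api: bool = False, force_files: bool = False
) -> Literal["api", "files"]:
    """Pick a data source for one year (hypothetical helper, not babylon code)."""
    if force_api and force_files:
        raise ValueError("force_api and force_files are mutually exclusive")
    if force_api:
        return "api"
    if force_files:
        return "files"
    # Default hybrid rule: recent years from the API, older years from CSVs.
    return "api" if year >= API_FIRST_YEAR else "files"


plan = {year: choose_source(year) for year in range(2019, 2023)}
print(plan)
```

Under these assumptions, 2019 and 2020 resolve to local files while 2021 and 2022 resolve to the API.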
Loading Data
Command Line
# Default hybrid loading (API for 2021+, files for historical)
mise run data:qcew
# Specific year range
mise run data:qcew -- --years 2020-2023
# Force API for all years (may fail for old years not in API)
mise run data:qcew -- --force-api
# Force file-based loading (requires downloaded CSVs)
mise run data:qcew -- --force-files --data-path data/qcew
# Reset tables before loading
mise run data:qcew -- --reset
Python API
from babylon.data.qcew import QcewLoader, QcewAPIClient
from babylon.data.loader_base import LoaderConfig
from babylon.data.reference.database import get_normalized_session
# Hybrid loading (default)
config = LoaderConfig(qcew_years=list(range(2013, 2026)))
loader = QcewLoader(config)
with get_normalized_session() as session:
    stats = loader.load(session, reset=True)
print(f"County: {stats.facts_loaded.get('qcew_county', 0):,}")
print(f"State: {stats.facts_loaded.get('qcew_state', 0):,}")
print(f"Metro: {stats.facts_loaded.get('qcew_metro', 0):,}")
# Direct API access
with QcewAPIClient() as client:
    # Fetch California state-level data for 2023
    for record in client.get_area_annual_data(2023, "06000"):
        print(f"{record.industry_code}: {record.annual_avg_emplvl:,} workers")
Geographic Levels
QCEW data is loaded at three geographic levels (county, state, and metro), selected by aggregation level code. Metro data is further split into MSA, Micropolitan, and CSA areas:
| Level | agglvl_code | Fact Table | Description |
|---|---|---|---|
| County | 70-78 | fact_qcew_annual | 3,200+ US counties |
| State | 20-28 | fact_qcew_state_annual | 52 states/territories |
| MSA | 30-38 | fact_qcew_metro_annual | Metropolitan Statistical Areas |
| Micropolitan | 40-48 | fact_qcew_metro_annual | Micropolitan Statistical Areas |
| CSA | 50-58 | fact_qcew_metro_annual | Combined Statistical Areas |
Database Schema
Fact Tables
fact_qcew_annual (County Level)
| Column | Type | Description |
|---|---|---|
| county_id | FK | Reference to dim_county |
| industry_id | FK | Reference to dim_industry |
| ownership_id | FK | Reference to dim_ownership |
| time_id | FK | Reference to dim_time |
| establishments | INT | Number of business establishments |
| employment | INT | Annual average employment |
| total_wages_usd | DECIMAL | Total annual wages paid |
| avg_weekly_wage_usd | INT | Average weekly wage per worker |
| avg_annual_pay_usd | INT | Average annual pay per worker |
| lq_employment | DECIMAL | Location quotient for employment |
| lq_annual_pay | DECIMAL | Location quotient for pay |
| disclosure_code | VARCHAR | Data suppression indicator |
fact_qcew_state_annual and fact_qcew_metro_annual follow the same
structure, with FK references to dim_state and dim_metro_area respectively.
NAICS Hierarchy Levels
The dim_industry table includes a naics_level column indicating
each code’s position in the NAICS hierarchy:
| naics_level | Meaning | Example |
|---|---|---|
| 0 | Grand total (all industries) | |
| 2 | Sector (includes compound codes) | |
| 3 | Subsector | |
| 4 | Industry group | |
| 5 | NAICS industry | |
| 6 | National industry (leaf) | |
| 98 | BLS ownership variants | (special codes) |
| 99 | BLS supersectors | |
Warning
QCEW reports wages at every hierarchy level. Parent-level wages
include all children. Summing across levels without filtering
produces catastrophic double-counting (empirically 10.4x in Wayne
County). Always filter to naics_level = 6 when aggregating
wages. See The NAICS Hierarchy and Double-Counting for full explanation.
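The effect is easy to reproduce on a toy dataset. The sketch below builds a tiny in-memory SQLite database with the same column names as above but invented wage figures: one manufacturing branch whose parent rows each repeat the wages of their children. It is an illustration only, not the project's schema setup code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_industry (
        industry_id INTEGER PRIMARY KEY,
        naics_code  TEXT,
        naics_level INTEGER
    );
    CREATE TABLE fact_qcew_annual (
        industry_id     INTEGER REFERENCES dim_industry(industry_id),
        total_wages_usd REAL
    );
    """
)

# Invented figures: every parent row repeats the total of its children.
toy_rows = [
    (1, "31-33", 2, 300.0),   # sector
    (2, "311", 3, 300.0),     # subsector
    (3, "3111", 4, 300.0),    # industry group
    (4, "31111", 5, 300.0),   # NAICS industry
    (5, "311111", 6, 100.0),  # leaf
    (6, "311119", 6, 200.0),  # leaf
]
for industry_id, code, level, wages in toy_rows:
    conn.execute("INSERT INTO dim_industry VALUES (?, ?, ?)", (industry_id, code, level))
    conn.execute("INSERT INTO fact_qcew_annual VALUES (?, ?)", (industry_id, wages))

# Naive sum across every hierarchy level.
naive_total = conn.execute(
    "SELECT SUM(total_wages_usd) FROM fact_qcew_annual"
).fetchone()[0]

# Leaf-only sum, as the warning recommends.
leaf_total = conn.execute(
    """
    SELECT SUM(q.total_wages_usd)
    FROM fact_qcew_annual q
    JOIN dim_industry i ON q.industry_id = i.industry_id
    WHERE i.naics_level = 6
    """
).fetchone()[0]

print(f"naive: {naive_total}, leaf-only: {leaf_total}, ratio: {naive_total / leaf_total}")
```

In this toy hierarchy the naive sum is five times the leaf-only sum, because the same 300 in wages is counted once per level.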
National Wages Cache Table
InterpolatingBEASource maintains a materialized cache table
_cache_national_wages_bea to avoid repeated aggregation of the
43M-row fact_qcew_annual table. The cache stores pre-aggregated
national wages per BEA industry per year.
| Column | Type | Description |
|---|---|---|
| bea_industry_id | INT | FK to the BEA industry dimension |
| year | INT | Data year |
| national_wages_usd | REAL | Sum of total_wages_usd |
| cache_version | INT | Schema version (current: 2). Stale caches are auto-rebuilt. |
The cache is built on first use and persists across sessions. When the
cache_version column is missing or does not match the current version,
the table is dropped and rebuilt with leaf-only NAICS filtering.
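The staleness rule described above can be sketched with the standard sqlite3 module. This is not the InterpolatingBEASource implementation: cache_is_stale and rebuild_cache are hypothetical helpers, and the rebuild step takes pre-aggregated rows as an argument instead of querying fact_qcew_annual.

```python
import sqlite3

CACHE_TABLE = "_cache_national_wages_bea"
CURRENT_CACHE_VERSION = 2  # "current: 2" per the table above


def cache_is_stale(conn: sqlite3.Connection) -> bool:
    """True if the cache is missing, lacks cache_version, or has another version."""
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({CACHE_TABLE})")]
    if "cache_version" not in columns:
        return True  # table absent (no columns) or built before versioning
    versions = conn.execute(
        f"SELECT MIN(cache_version), MAX(cache_version) FROM {CACHE_TABLE}"
    ).fetchone()
    # An empty table yields (None, None), which also counts as stale.
    return versions != (CURRENT_CACHE_VERSION, CURRENT_CACHE_VERSION)


def rebuild_cache(conn: sqlite3.Connection, rows) -> None:
    """Drop and recreate the cache from (bea_industry_id, year, wages) tuples."""
    conn.execute(f"DROP TABLE IF EXISTS {CACHE_TABLE}")
    conn.execute(
        f"CREATE TABLE {CACHE_TABLE} ("
        "bea_industry_id INTEGER, year INTEGER, "
        "national_wages_usd REAL, cache_version INTEGER)"
    )
    conn.executemany(
        f"INSERT INTO {CACHE_TABLE} VALUES (?, ?, ?, ?)",
        [(bea_id, year, wages, CURRENT_CACHE_VERSION) for bea_id, year, wages in rows],
    )


conn = sqlite3.connect(":memory:")
# Simulate an old cache that predates the cache_version column.
conn.execute(
    f"CREATE TABLE {CACHE_TABLE} "
    "(bea_industry_id INTEGER, year INTEGER, national_wages_usd REAL)"
)
stale_before = cache_is_stale(conn)
if stale_before:
    rebuild_cache(conn, [(1, 2023, 1.5e9)])  # invented sample row
stale_after = cache_is_stale(conn)
print(stale_before, stale_after)
```

Checking the column list first means a cache created before versioning existed is treated the same way as one with an outdated version number.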
SQL Queries
Employment by Class Composition
SELECT
    sec.class_composition,
    SUM(q.employment) AS total_employment,
    ROUND(SUM(q.total_wages_usd) / 1e9, 2) AS wages_billions
FROM fact_qcew_annual q
JOIN dim_industry i ON q.industry_id = i.industry_id
JOIN dim_sector sec ON i.sector_code = sec.sector_code
JOIN dim_time t ON q.time_id = t.time_id
WHERE t.year = 2023
  AND sec.class_composition IS NOT NULL
  AND i.naics_level = 6 -- Leaf codes only, to avoid hierarchy double-counting
GROUP BY sec.class_composition
ORDER BY total_employment DESC;
State-Level Industry Concentration
SELECT
    s.state_name,
    i.naics_title,
    q.lq_employment AS concentration,
    q.employment
FROM fact_qcew_state_annual q
JOIN dim_state s ON q.state_id = s.state_id
JOIN dim_industry i ON q.industry_id = i.industry_id
JOIN dim_time t ON q.time_id = t.time_id
WHERE t.year = 2023
  AND q.lq_employment > 2.0 -- Highly concentrated
ORDER BY q.lq_employment DESC
LIMIT 20;
Metro Area Manufacturing Employment
SELECT
    m.metro_name,
    q.area_type,
    SUM(q.employment) AS manufacturing_jobs,
    ROUND(AVG(q.avg_annual_pay_usd), 0) AS avg_pay
FROM fact_qcew_metro_annual q
JOIN dim_metro_area m ON q.metro_area_id = m.metro_area_id
JOIN dim_industry i ON q.industry_id = i.industry_id
JOIN dim_time t ON q.time_id = t.time_id
WHERE (i.naics_code LIKE '31%'
    OR i.naics_code LIKE '32%'
    OR i.naics_code LIKE '33%') -- Parentheses needed: AND binds tighter than OR
  AND i.naics_level = 6 -- Leaf codes only, to avoid hierarchy double-counting
  AND t.year = 2023
GROUP BY m.metro_name, q.area_type
ORDER BY manufacturing_jobs DESC
LIMIT 20;
API Client
The QcewAPIClient class provides direct access to the BLS QCEW Open Data API:
from babylon.data.qcew import QcewAPIClient, QcewAPIError
with QcewAPIClient() as client:
    try:
        # Fetch by area (state/county FIPS code)
        records = list(client.get_area_annual_data(2023, "01001"))  # Autauga County, AL
        # Fetch by industry (NAICS code)
        records = list(client.get_industry_annual_data(2023, "31-33"))  # Manufacturing
    except QcewAPIError as e:
        print(f"API error {e.status_code}: {e.message}")
API Features:
Rate limiting: 0.5s delay between requests (polite to BLS servers)
Retry logic: Exponential backoff on 429/5xx errors
CSV parsing: Automatic parsing of BLS CSV responses
Error handling: Distinguishes 404 (missing data) from server errors
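The rate limiting and retry behaviour listed above can be sketched independently of the real client. The function below is not QcewAPIClient's implementation: fetch_with_retry, its parameters, and the 1-second base delay are assumptions for illustration, and the HTTP call is replaced by an injected callable so the example runs without network access.

```python
import time
from typing import Callable, Optional, Tuple


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,    # assumed starting backoff, in seconds
    min_interval: float = 0.5,  # fixed polite delay before every request
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the body on 200, None on 404, retry on 429/5xx, else raise."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(max_attempts):
        sleep(min_interval)
        status, body = fetch()
        if status == 200:
            return body
        if status == 404:
            return None  # missing data, not a server failure
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        sleep(base_delay * 2**attempt)  # exponential backoff: 1s, 2s, 4s, ...
    raise AssertionError("unreachable")


# Simulated responses: two transient failures, then success.
responses = iter([(503, ""), (429, ""), (200, "area_fips,own_code")])
delays = []
result = fetch_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Injecting the sleep function keeps the sketch fast to run and makes the delay schedule visible: a fixed 0.5s before each request, with a doubling backoff after each retryable failure.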
Marxian Analysis
QCEW data supports Marxian class analysis through:
Class Composition by Sector
Industries are mapped to class composition categories in dim_sector:
goods_producing - Manufacturing, construction (productive labor)
service_producing - Services, retail (mixed productive/unproductive)
circulation - Finance, real estate (unproductive, surplus redistribution)
government - Public administration (social reproduction)
extraction - Mining, resources (ground rent extraction)
Location Quotients for Labor Market Analysis
Location quotients (LQ) reveal industry concentration:
LQ > 1.25: Industry concentrated (potential labor aristocracy formation)
LQ < 0.75: Industry underrepresented (labor mobility constraints)
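For reference, an employment location quotient is conventionally the industry's share of local employment divided by its share of national employment. The sketch below computes it from raw counts and applies the two thresholds above. The function names, the label for the middle band, and the sample figures are illustrative assumptions; in the database the value is already available as lq_employment.

```python
def location_quotient(
    local_industry: float,
    local_total: float,
    national_industry: float,
    national_total: float,
) -> float:
    """Local industry employment share divided by the national share."""
    if min(local_total, national_industry, national_total) <= 0:
        raise ValueError("denominators must be positive")
    local_share = local_industry / local_total
    national_share = national_industry / national_total
    return local_share / national_share


def classify_lq(lq: float) -> str:
    """Apply the 1.25 / 0.75 thresholds listed above."""
    if lq > 1.25:
        return "concentrated"
    if lq < 0.75:
        return "underrepresented"
    return "near national share"  # hypothetical label for the middle band


# Invented figures: 15% of local jobs versus 6% of national jobs.
lq = location_quotient(3_000, 20_000, 9_000_000, 150_000_000)
print(f"LQ = {lq:.2f} ({classify_lq(lq)})")
```

An LQ of 1.0 means the industry's local share equals its national share, which is why both thresholds sit symmetrically around that value.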
Wage Differentials
Cross-geographic wage comparisons reveal unequal exchange dynamics:
Core metro areas: Higher wages, higher productivity
Periphery counties: Lower wages, extraction patterns
See Also
The NAICS Hierarchy and Double-Counting - Why NAICS hierarchy causes double-counting and the leaf-only filtering solution
FRED Economic Data - FRED macroeconomic data
Census Data Analysis for Babylon - Census ACS demographics
BEA Industry to Marxian Department Mapping - BEA industry to Marxian department mapping
babylon.data.qcew - API reference