The
purpose of the Cluster Mapping Project is to assemble a
detailed picture of the location and performance of industries
in the United States, with a special focus on the linkages or
externalities across industries that give rise to clusters.
The raw data for the project are County Business
Patterns data (excluding agriculture and government) on
employment, establishments, and wages by four-digit Standard
Industrial Classification (SIC) or North American Industry
Classification System (NAICS) code, by U.S. county. In
addition, U.S. patents, by location of inventor, are allocated to
industries and clusters using a concordance of technology
classifications with SIC codes. There are also confidentiality
limitations, which mean that the actual data are not disclosed
for every county and economic area in every industry. Various
techniques are used to compensate for missing data.
Economies are analyzed at various geographic levels,
including states, economic areas, metropolitan areas, and
counties.
All the industries in the economy are
separated into "traded" and "local" based on the degree of
industry dispersion across geographic areas. Local industries
are those that are present in most if not all geographic areas, are
evenly distributed, and hence sell primarily to local markets. Traded
industries are those that are concentrated in a subset of
geographic areas and sell to other regions and nations.
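The traded/local split can be illustrated with a minimal sketch. The text does not state which dispersion measure or cutoffs the Project uses, so the function name `classify_industry`, the two thresholds, and the sample employment figures below are all hypothetical assumptions made for illustration.

```python
from statistics import mean, pstdev

def classify_industry(industry_emp, total_emp, min_presence=0.9, max_cv=0.5):
    """Label an industry 'local' or 'traded' from its employment by region.

    industry_emp and total_emp map region -> employment. Both thresholds
    are illustrative assumptions, not values taken from the Project.
    """
    # Industry share of each region's total employment (0 where absent).
    shares = [industry_emp.get(r, 0) / total_emp[r] for r in total_emp]
    # Fraction of regions in which the industry is present at all.
    presence = sum(s > 0 for s in shares) / len(shares)
    # Coefficient of variation: low values mean an even distribution.
    avg = mean(shares)
    cv = pstdev(shares) / avg if avg > 0 else float("inf")
    return "local" if presence >= min_presence and cv <= max_cv else "traded"

# Hypothetical employment counts for four regions.
total = {"A": 1000, "B": 2000, "C": 500, "D": 1500}
retail = {"A": 100, "B": 210, "C": 48, "D": 155}
aircraft = {"A": 0, "B": 300, "C": 0, "D": 5}

labels = {
    "retail": classify_industry(retail, total),
    "aircraft": classify_industry(aircraft, total),
}
print(labels)
```

In this toy data, retail is present everywhere at a nearly constant share of employment, while aircraft is concentrated in one region, which is the contrast the definitions above describe.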
Among traded industries, clusters are identified using
the correlation of industry employment across geographic
areas. The principle is that industries normally located
together are those that are linked by some external economies.
These industries, then, constitute a cluster.
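As a rough illustration of a locational correlation, the sketch below computes a Pearson correlation between two industries' employment across the same set of states. Whether the Project correlates raw employment, employment shares, or some other transformation is not stated here, so the use of raw counts, the function name `locational_correlation`, and the sample figures are assumptions.

```python
from math import sqrt

def locational_correlation(emp_a, emp_b):
    """Pearson correlation of two industries' employment over shared regions.

    emp_a and emp_b map region -> employment. Returns 0.0 when either
    series is constant (an illustrative choice, since r is undefined there).
    """
    regions = sorted(set(emp_a) & set(emp_b))
    if len(regions) < 2:
        raise ValueError("need at least two shared regions")
    xs = [emp_a[r] for r in regions]
    ys = [emp_b[r] for r in regions]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

# Hypothetical employment in four states.
engines = {"S1": 10, "S2": 20, "S3": 30, "S4": 40}
gearboxes = {"S1": 1, "S2": 2, "S3": 3, "S4": 4}
fishing = {"S1": 40, "S2": 30, "S3": 20, "S4": 10}

r_linked = locational_correlation(engines, gearboxes)
r_unlinked = locational_correlation(engines, fishing)
print(round(r_linked, 3), round(r_unlinked, 3))
```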
Clusters
are defined initially using state-level data (n=50). The
robustness of cluster composition is verified using Economic
Area as the geographical unit.
Clusters are
constructed using two approaches, which are then reconciled.
First, select a prominent "core" industry in a field or part
of the economy. Calculate the locational correlations of all
other industries with the core. Those industries with
statistically significant correlations with the core define
the extent of the cluster. Second, calculate locational
correlations between all pairs of industries in a general
field and potentially related fields. The set of industries
with statistically significant and substantial
intercorrelations defines the cluster.
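The two approaches can be sketched side by side. This is a simplified illustration, not the Project's procedure: a fixed correlation cutoff `min_r` stands in for the significance test, and the helper names (`cluster_from_core`, `linked_pairs`, `is_intercorrelated`) and the employment figures are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def cluster_from_core(emp, core, min_r=0.8):
    """Approach 1: the core plus industries correlated with it above min_r."""
    return {core} | {name for name, series in emp.items()
                     if name != core and pearson(emp[core], series) >= min_r}

def linked_pairs(emp, min_r=0.8):
    """Approach 2: every pair of industries whose correlation clears min_r."""
    return {frozenset((a, b)) for a, b in combinations(sorted(emp), 2)
            if pearson(emp[a], emp[b]) >= min_r}

def is_intercorrelated(members, pairs):
    """True if every pair within members appears among the linked pairs."""
    return all(frozenset(p) in pairs
               for p in combinations(sorted(members), 2))

# Hypothetical employment by industry across five regions (same region order).
emp = {
    "autos":       [50, 5, 40, 2, 30],
    "auto_parts":  [45, 8, 42, 3, 25],
    "metal_stamp": [30, 4, 28, 5, 20],
    "software":    [2, 40, 5, 35, 3],
}

core_cluster = cluster_from_core(emp, "autos")
pairs = linked_pairs(emp)
# A simple reconciliation check: does the core-based cluster also hold up
# as a fully intercorrelated set under the pairwise approach?
agrees = is_intercorrelated(core_cluster, pairs)
print(sorted(core_cluster), agrees)
```

With real data, the cutoff would be replaced by a proper significance test for the chosen sample size, and the two results would rarely agree this cleanly.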
In both cases some industries may have spurious
correlations to a cluster because of the co-location of
several strong clusters in the same geographical area.
Spurious correlations are eliminated using Input-Output tables,
industry definitions, and industry knowledge.
Note
that a given industry can be part of more than one cluster.
This sometimes reflects overly broad industry definitions.
However, it is also the case that there are multiple forms of
externalities, and some industries are suppliers or customers
of many other industries. Thus, overlapping clusters are
expected and their overlaps are important economically.
For the purposes of this Project, cluster definitions have
been "narrowed" so that each industry is assigned to only one
cluster. This allows a more intuitive understanding of the
economic composition of the regions.
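One way to picture the narrowing step is shown below. The text does not say which rule the Project uses to pick a single cluster for an overlapping industry, so assigning each industry to the cluster with the strongest link (with alphabetical tie-breaking) is purely an assumption, as are the function name `narrow_clusters` and the sample strengths.

```python
def narrow_clusters(membership_strength):
    """Assign each industry to the one cluster where its link is strongest.

    membership_strength maps industry -> {cluster: strength}. Picking the
    strongest link, with ties broken alphabetically, is an illustrative
    rule and not necessarily the one used by the Project.
    """
    narrowed = {}
    for industry, strengths in membership_strength.items():
        # Sort key: highest strength first, then cluster name for ties.
        narrowed[industry] = min(strengths, key=lambda c: (-strengths[c], c))
    return narrowed

# Hypothetical overlapping memberships with link strengths.
overlapping = {
    "plastics": {"automotive": 0.62, "medical_devices": 0.71},
    "engines": {"automotive": 0.88},
    "precision_metal": {"aerospace": 0.55, "automotive": 0.55},
}

assignment = narrow_clusters(overlapping)
print(assignment)
```

The result gives every industry exactly one cluster, which is what makes regional totals by cluster add up without double counting.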
For further detail, please see the in-depth discussion of
the cluster methodology (pdf).