Title: | Tabulate P.L. 94-171 Redistricting Data Summary Files |
---|---|
Description: | Tools to process legacy format summary redistricting data files produced by the United States Census Bureau pursuant to P.L. 94-171. These files are generally available earlier but are difficult to work with as-is. |
Authors: | Cory McCartan [aut, cre], Christopher T. Kenny [aut] |
Maintainer: | Cory McCartan <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.2 |
Built: | 2024-10-25 04:02:10 UTC |
Source: | https://github.com/corymccartan/PL94171 |
Downloads crosswalks from https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.html. Adjusts land overlap area to ensure weights sum to 1.
pl_crosswalk(abbr, from_year = 2010L, to_year = from_year + 10L)
abbr | the state to download the crosswalk for. |
from_year | the year with the blocks that the data is currently tabulated with respect to. |
to_year | the year with the blocks that the data should be tabulated into. |
A tibble, with two sets of GEOIDs and overlap information.
## Not run:
# Takes a bit of time to run
pl_crosswalk("RI", 2010, 2020)
## End(Not run)
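The adjustment that makes weights sum to 1 can be illustrated with a small base-R sketch. The data frame and its column names (`from_block`, `to_block`, `overlap`, `weight`) are made up for illustration and are not the package's actual output; the sketch only shows one plausible way to rescale overlap areas within each source block.

```r
# Toy crosswalk with hypothetical column names (not the package's real output).
cw <- data.frame(
  from_block = c("A", "A", "B", "B", "B"),
  to_block   = c("X", "Y", "Y", "Z", "W"),
  overlap    = c(30, 60, 10, 10, 20),  # land area of each old/new intersection
  stringsAsFactors = FALSE
)

# Total overlap area recorded for each source block, repeated per row.
totals <- ave(cw$overlap, cw$from_block, FUN = sum)

# Rescale so that the weights of every source block sum to 1.
cw$weight <- cw$overlap / totals

# Check: one sum per source block.
weight_sums <- tapply(cw$weight, cw$from_block, sum)
print(weight_sums)
```

Rescaling within each source block means that no population is lost or double-counted when the weights are later used to split counts across new blocks.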
This dataset contains a subset of the 2020 prototype PL data.
data("pl_ex")
A list of tibbles containing the four PL files:
00001: Tables P1 and P2
00002: Tables P3, P4, and H1
00003: Table P5
geo: geographic header file
data(pl_ex)
This dataset is a tibble version of the descriptions of (potentially) available summary levels within the P.L. 94-171 data, as described in the 2018 Redistricting Data Prototype (Public Law 94-171) Summary File documentation.
pl_geog_levels
a tibble with two columns:
The three character summary level code
The summary level description
From the Census: "The Block Assignment Files (BAFs) are among the geographic products that the Census Bureau provides to states and other data users containing the small area census data necessary for legislative redistricting. The BAFs contain Census tabulation block codes and geographic area codes for a specific geographic entity type."
pl_get_baf(abbr, geographies = NULL, cache_to = NULL, refresh = FALSE)
abbr | the state abbreviation to get the BAF for |
geographies | the geographies to get. Defaults to all available. |
cache_to | the file name, if any, to cache the results to (as an RDS). If a file exists and |
refresh | if |
A list of tibbles, one for each available BAF geography.
pl_get_baf("RI")
pl_get_baf("RI", "VTD")
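Because a BAF maps each tabulation block to a larger geographic area, a typical use is to roll block-level counts up to that area. The following base-R sketch shows the idea on made-up data; the column names `BLOCKID` and `DISTRICT` and all values are assumptions for illustration, not a statement of what `pl_get_baf()` returns.

```r
# Hypothetical block-to-VTD assignment table (toy values).
baf_vtd <- data.frame(
  BLOCKID  = c("b1", "b2", "b3", "b4"),
  DISTRICT = c("v1", "v1", "v2", "v2"),
  stringsAsFactors = FALSE
)

# Hypothetical block-level population counts.
blocks <- data.frame(
  BLOCKID = c("b1", "b2", "b3", "b4"),
  pop     = c(10L, 25L, 5L, 40L),
  stringsAsFactors = FALSE
)

# Attach each block's district, then sum population within districts.
joined  <- merge(blocks, baf_vtd, by = "BLOCKID")
vtd_pop <- aggregate(pop ~ DISTRICT, data = joined, FUN = sum)
print(vtd_pop)
```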
These prototype shapefiles correspond to the Rhode Island end-to-end Census test and the accompanying prototype P.L. 94-171 data. This function is unlikely to be useful for working with any actual decennial Census data. The corresponding tinytiger or tigris functions should be used instead.
pl_get_prototype(
  geog,
  year = 2020,
  full_state = TRUE,
  cache_to = NULL,
  clean_names = TRUE,
  refresh = FALSE
)
geog | Geography to download data for. See details for full list. |
year | year, either 2010 or 2020 |
full_state | whether to return the full state (TRUE) or the single county subset (FALSE) |
cache_to | the file name, if any, to cache the results to (as an RDS). If a file exists and |
clean_names | whether to clean and rename columns |
refresh | if |
Current acceptable arguments to geog include:
block: block
block_group: block group
tract: tract
county: county
state: state
sld_low: state legislative district lower house
sld_up: state legislative district upper house
congressional_district: federal congressional district
place: Census place
voting_district: voting tabulation district
An sf object containing the blocks.
shp <- pl_get_prototype("block")
A (likely temporary) function to download TIGER shapefiles for 2020 voting tabulation districts (VTDs).
pl_get_vtd(abbr, cache_to = NULL, refresh = FALSE)
abbr | The state abbreviation to download VTDs for. |
cache_to | the file name, if any, to cache the results to (as an RDS). If a file exists and |
refresh | if |
An sf object containing the VTDs.
shp <- pl_get_vtd("RI")
PL files come in one of four types and are pipe-delimited with no header row. This function speedily reads in the files and assigns the appropriate column names and types.
pl_read(path, ...)
read_pl(path, ...)
path | a path to a folder containing PL files. Can also be a path or a URL for a ZIP file, which will be downloaded and unzipped. |
... | passed on to |
A list of data frames containing the four PL files.
pl_ex_path <- system.file('extdata/ri2018_2020Style.pl', package = 'PL94171')
pl <- pl_read(pl_ex_path)
# or try `pl_read(pl_url("RI", 2010))`
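To show what "pipe-delimited with no header row" means in practice, here is a minimal base-R sketch that parses two made-up records. The column names and types are a shortened, illustrative subset chosen for this example; `pl_read()` supplies the real ones, and its implementation may differ from this approach.

```r
# Two made-up records in pipe-delimited form, with no header line.
raw <- c(
  "PLST|RI|000|01|0000001|100|60",
  "PLST|RI|000|01|0000002|250|180"
)

# Illustrative column names and types (an assumption, not the full PL layout).
col_names   <- c("FILEID", "STUSAB", "CHARITER", "CIFSN", "LOGRECNO",
                 "P0010001", "P0010002")
col_classes <- c(rep("character", 5), rep("integer", 2))

# Identifier fields are read as character so leading zeros are preserved.
pl_part <- read.table(
  text = raw, sep = "|", header = FALSE,
  col.names = col_names, colClasses = col_classes
)
str(pl_part)
```

Declaring the column types up front avoids type guessing, which would otherwise turn codes such as `0000001` into the number 1.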
Applies a block crosswalk to a table of block data using areal interpolation. That is, the fraction of land area in the overlapping region between old and new blocks is used to divide the population of the old blocks into the new.
pl_retally(d_from, crosswalk)
d_from | The data frame to process. All numeric columns will be re-tallied. Integer columns will be re-tallied with rounding. Character columns will be preserved if constant across new block geometries. |
crosswalk | The crosswalk data frame, from |
All numeric columns will be re-tallied. Integer columns will be re-tallied with rounding. Character columns will be preserved if constant across new block geometries.
Blocks from other states will be ignored.
A new data frame, like d_from, except with the geometry column dropped, if one exists. New geometry should be loaded, perhaps with tinytiger::tt_blocks().
crosswalk <- pl_crosswalk("RI", 2010, 2020)
RI_2010 <- pl_tidy_shp("RI", pl_url("RI", 2010), 2010)
pl_retally(RI_2010, crosswalk)
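The areal interpolation described above can be sketched in base R on toy data. All names here (`block_old`, `block_new`, `weight`, `pop_part`) and all values are hypothetical, and the simple rounding step is only one possible choice; `pl_retally()` may handle rounding differently.

```r
# Old-block counts (toy data with hypothetical column names).
d_from <- data.frame(
  block_old = c("A", "B"),
  pop       = c(101L, 50L),
  stringsAsFactors = FALSE
)

# Crosswalk: share of each old block's land area falling in each new block.
crosswalk <- data.frame(
  block_old = c("A", "A", "B", "B"),
  block_new = c("X", "Y", "Y", "Z"),
  weight    = c(0.3, 0.7, 0.4, 0.6),
  stringsAsFactors = FALSE
)

# Split each old block's count by weight, then sum the pieces per new block.
pieces <- merge(crosswalk, d_from, by = "block_old")
pieces$pop_part <- pieces$pop * pieces$weight
d_to <- aggregate(pop_part ~ block_new, data = pieces, FUN = sum)

# The source column was an integer, so round the re-tallied value.
d_to$pop <- as.integer(round(d_to$pop_part))
print(d_to)
```

In this toy case the rounded counts still add up to the original total of 151, but independent rounding does not guarantee that in general.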
Selects the standard set of basic population groups and VAP groups. Optionally renames them from the PXXXYYYY naming convention (where XXX is the table and YYYY is the variable) to more human-readable names. pop_* is the total population, from tables P1 and P2, while vap_* is the 18+ population (voting age population).
pl_select_standard(pl, clean_names = TRUE)
pl | A list of PL tables, as read in by |
clean_names | whether to clean names |
If clean_names = TRUE, then the variables extracted are as follows:
\*_hisp: Hispanic or Latino (of any race)
\*_white: White alone, not Hispanic or Latino
\*_black: Black or African American alone, not Hispanic or Latino
\*_aian: American Indian and Alaska Native alone, not Hispanic or Latino
\*_asian: Asian alone, not Hispanic or Latino
\*_nhpi: Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
\*_other: Some Other Race alone, not Hispanic or Latino
\*_two: Population of two or more races, not Hispanic or Latino
where \* is pop or vap.
A tibble with the selected and optionally renamed columns
pl_ex_path <- system.file('extdata/ri2018_2020Style.pl', package = 'PL94171')
pl <- pl_read(pl_ex_path)
pl <- pl_select_standard(pl)
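The PXXXYYYY convention can be unpacked with plain string operations, as in the sketch below. The two-entry rename lookup is an illustrative assumption about which codes correspond to the Hispanic or Latino counts; it is not taken from the package and should be checked against the Census technical documentation before use.

```r
# Split PXXXYYYY codes into table (XXX) and variable (YYYY) parts.
vars <- c("P0010001", "P0020002", "P0040005")
parts <- data.frame(
  variable = vars,
  table    = substr(vars, 2, 4),
  item     = substr(vars, 5, 8),
  stringsAsFactors = FALSE
)
print(parts)

# Toy data using raw codes as column names.
d <- data.frame(P0020002 = c(12L, 7L), P0040002 = c(9L, 5L))

# Hypothetical lookup from raw code to readable name (an assumption).
lookup <- c(P0020002 = "pop_hisp", P0040002 = "vap_hisp")

# Rename only the columns that appear in the lookup.
hit <- names(d) %in% names(lookup)
names(d)[hit] <- unname(lookup[names(d)[hit]])
print(names(d))
```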
This subsets a pl table to a desired summary level. Typical choices include:
'750': block
'150': block group
'630': voting district
'050': county
pl_subset(pl, sumlev = "750")
pl | A list of PL tables, as read in by |
sumlev | the summary level to filter to. A 3 character SUMLEV code. Default is '750' for blocks. |
All summary levels are listed in pl_geog_levels.
A tibble.
pl_ex_path <- system.file('extdata/ri2018_2020Style.pl', package = 'PL94171')
pl <- pl_read(pl_ex_path)
pl <- pl_subset(pl)
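Conceptually, subsetting to a summary level is a row filter on the SUMLEV code. The base-R sketch below uses a made-up table (the `GEOID` values and counts are placeholders) to show the filter and to highlight that the code is a three-character string, not a number.

```r
# Toy table mixing several summary levels (placeholder IDs and counts).
geo <- data.frame(
  SUMLEV = c("040", "050", "150", "750", "750"),
  GEOID  = c("g1", "g2", "g3", "g4", "g5"),
  pop    = c(1000L, 400L, 120L, 30L, 45L),
  stringsAsFactors = FALSE
)

# Keep only block-level rows, matching the '750' default.
blocks <- geo[geo$SUMLEV == "750", , drop = FALSE]
print(blocks)

# Comparing against a number fails for codes with leading zeros:
# 50 is converted to the string "50", which does not equal "050".
numeric_match <- any(geo$SUMLEV == 50)
print(numeric_match)
```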
Downloads block geography and merges with the cleaned PL 94-171 file.
pl_tidy_shp(abbr, path, year = 2020, type = c("blocks", "vtds"), ...)
abbr | The state to make the shapefile for |
path | The path to the PL files, as in |
year | The year to download the block geography for. Should match the year of the PL files. |
type | If |
... | passed on to |
An sf object with demographic and shapefile information for the state.
pl_ex_path <- system.file("extdata/ri2018_2020Style.pl", package = "PL94171")
pl_tidy_shp("RI", pl_ex_path)
Gets the URL for the PL files for a particular state and year.
pl_url(abbr, year = 2010)
abbr | The state to download the PL files for |
year | The year of PL file to download. Supported years: 2000, 2010, 2020 (after release). 2000 files are in a different format. Earlier years available on tape or CD-ROM only. |
a character vector containing the URL to a ZIP containing the PL files.
pl_url("RI", 2010)