---
title: "NIH Enrollment Tables in HTML"
author: "Will Beasley and Peter Higgins"
date: '`r Sys.Date()`'
output:
html_document:
theme: simplex
toc: true
toc_depth: 2
toc_float: true
number_sections: true
vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{NIH Enrollment Tables in HTML}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r declare, echo = F}
```
Install `codified` Package
-----------------------------------------------------
First, install the `codified` package if necessary, and then [load it into memory](http://r-pkgs.had.co.nz/package.html#package).
```{r install-codified}
# if( !requireNamespace("codified", quietly = TRUE) )
# remotes::install_github(repo = "OuhscBbmc/codified")
library(codified)
```
Local Data Source
-----------------------------------------------------
### Establish Datasets
```{r local-establish}
path <- system.file("misc/example-data-1.csv", package = "codified")
col_types <- readr::cols_only(
record_id = readr::col_integer(),
name_last = readr::col_character(),
dob = readr::col_date(format = ""),
gender = readr::col_integer(),
race = readr::col_integer(),
ethnicity = readr::col_integer()
)
ds <- readr::read_csv(path, col_types = col_types) |>
dplyr::mutate(
gender = as.character(gender),
race = as.character(race),
ethnicity = as.character(ethnicity)
)
ds |>
head(10) |>
knitr::kable(caption = "Observed Dataset (first ten rows)")
ds_lu_gender <- tibble::tribble(
~input, ~displayed ,
"0" , "Female" ,
"1" , "Male" ,
"U" , "Unknown/Not Reported"
)
knitr::kable(ds_lu_gender, caption = "Gender Mapping")
ds_lu_race <- tibble::tribble(
~input , ~displayed ,
"1" , "American Indian/Alaska Native" ,
"2" , "Asian" ,
"3" , "Native Hawaiian or Other Pacific Islander" ,
"4" , "Black or African American" ,
"5" , "White" ,
"M" , "More than One Race" ,
"6" , "Unknown or Not Reported"
)
knitr::kable(ds_lu_race, caption = "Race Mapping")
ds_lu_ethnicity <- tibble::tribble(
~input, ~displayed ,
"2" , "Not Hispanic or Latino" ,
"1" , "Hispanic or Latino" ,
"0" , "Unknown/Not Reported Ethnicity"
)
knitr::kable(ds_lu_ethnicity, caption = "Ethnicity Mapping")
```
### Apply Map to Observed Dataset
```{r local-apply-map}
ds_summary_long <- codified::table_nih_enrollment(
d = ds,
d_lu_gender = ds_lu_gender,
d_lu_race = ds_lu_race,
d_lu_ethnicity = ds_lu_ethnicity
)
knitr::kable(ds_summary_long, caption = "Counts of Each Subgroup")
```
### Conform to NIH Cosmetics
```{r local-cosmetically-format}
codified::table_nih_enrollment_pretty(
d = ds,
d_lu_gender = ds_lu_gender,
d_lu_race = ds_lu_race,
d_lu_ethnicity = ds_lu_ethnicity
)
```
REDCap Data Source
-----------------------------------------------------
A hosted (fake) clinical trial dataset demonstrates how to extract demographic data from [REDCap](https://projectredcap.org/) and then present the demographic data in the NIH Inclusion Enrollment Report format.
### Establish Datasets
First, install the `REDCapR` package if necessary, and then [load it into memory](http://r-pkgs.had.co.nz/package.html#package).
```{r install-redcapr}
# if( !requireNamespace("REDCapR", quietly = TRUE) )
# remotes::install_github("OuhscBbmc/REDCapR")
library(REDCapR)
```
Next, download the data from the REDCap database into the `ds_2` data.frame. If you're running the most recent version of REDCapR (available on GitHub), the code will be:
```r
ds_2 <- REDCapR::redcap_read(
redcap_uri = "https://bbmc.ouhsc.edu/redcap/api/", # URL of REDCap Server.
token = "F304DEC3793FECC3B6DEEFF66302CAD3", # User-specific token/password.
guess_type = FALSE # Keep all variables as strings/characters.
)$data
```
However [CRAN policy](https://cran.r-project.org/web/packages/policies.html) understandably discourages vignettes from using "Internet resources" so this vignette mimics the code above with this local call. On your own computer, feel free to call that demonstration REDCap project.
```{r redcap-establish}
ds_2 <-
readr::read_csv(
file = system.file("misc/example-data-2.csv", package = "codified"),
col_types = readr::cols(.default = readr::col_character())
)
```
### Conform to NIH Cosmetics
Now, convert these demographic data into a properly formatted NIH enrollment table. Pass the `ds_lu_gender`, `ds_lu_race`, and `ds_lu_ethnicity` metadata, which was defined above. As a reminder, these translate values like `1` to `Male` and `3` to `Native Hawaiian or Other Pacific Islander`.
```{r redcap-local-cosmetically-format}
table_nih_enrollment_pretty(
d = ds_2,
d_lu_gender = ds_lu_gender,
d_lu_race = ds_lu_race,
d_lu_ethnicity = ds_lu_ethnicity
)
```
Collapsing Levels
-----------------------------------------------------
Many observed datasets may collect race with a different set of levels. For instance in `ds_3`, the `American Indian` level is separate from the `Alaska Native` level. In the first and second rows in the metadata below, the two levels are effectively combined into the`American Indian/Alaska Native` level, so it complies with the format of the NIH Enrollment table.
```{r collapsing-levels}
ds_lu_race_3 <- tibble::tribble(
~input , ~displayed ,
"American Indian" , "American Indian/Alaska Native" , # Combine w/ Alaska Native
"Alaska Native" , "American Indian/Alaska Native" , # Combine w/ American Indian
"Asian" , "Asian" ,
"Native Hawaiian" , "Native Hawaiian or Other Pacific Islander" , # Combine w/ Pacific Islanders
"Pacific Islander" , "Native Hawaiian or Other Pacific Islander" , # Combine w/ Hawaiian
"Black or African American" , "Black or African American" ,
"White" , "White" ,
"More than One Race" , "More than One Race" ,
"Unknown or Not Reported" , "Unknown or Not Reported"
)
ds_3 <- tibble::tribble(
~subject_id, ~gender , ~race , ~ethnicity ,
1L, "Female" , "American Indian", "Not Hispanic or Latino" ,
2L, "Male" , "American Indian", "Not Hispanic or Latino" ,
3L, "Male" , "American Indian", "Not Hispanic or Latino" ,
4L, "Female" , "Alaska Native" , "Not Hispanic or Latino" ,
5L, "Male" , "Alaska Native" , "Not Hispanic or Latino" ,
6L, "Male" , "Alaska Native" , "Not Hispanic or Latino" ,
7L, "Male" , "White" , "Not Hispanic or Latino" ,
8L, "Male" , "White" , "Not Hispanic or Latino"
)
table_nih_enrollment_pretty(
d = ds_3,
d_lu_race = ds_lu_race_3
)
```
Feedback Welcome
-----------------------------------------------------
Please share your opinions with us by creating an issue at https://github.com/OuhscBbmc/codified/issues.
### Upcoming Features
1. Create tables in a fresh docx file with [flextable](https://davidgohel.github.io/flextable/). [Issue #5](https://github.com/OuhscBbmc/codified/issues/5)
1. Create tables in a fresh pdf file [Issue #5](https://github.com/OuhscBbmc/codified/issues/5).
1. Populate tables in an existing pdf file with [staplr](https://CRAN.R-project.org/package=staplr)?
1. The 'totals' column & row are calculated. The NIH form does this already, so this is mostly for the purpose of checking. [Issue #4](https://github.com/OuhscBbmc/codified/issues/4).
1. Allow variable names to change (*e.g.*, `Gender` instead of `gender`). [Issue #3](https://github.com/OuhscBbmc/codified/issues/3).
1. [pkgdown site](https://pkgdown.r-lib.org/), like [REDCapR](https://ouhscbbmc.github.io/REDCapR/).
### Group discussion
* Any holes in the functionality or concept?
* What else (*e.g.*, features or explanation) do you think would be helpful?
* Are people able to run this package? Is the vignette clear?
* What other standardized tables/graphs/whatever could this be applied to?
* IRB continuation tables?
* Graphs?