--- title: "NIH Enrollment Tables in HTML" author: "Will Beasley and Peter Higgins" date: '`r Sys.Date()`' output: html_document: theme: simplex toc: true toc_depth: 2 toc_float: true number_sections: true vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{NIH Enrollment Tables in HTML} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ```{r declare, echo = F} ``` Install `codified` Package ----------------------------------------------------- First, install the `codified` package if necessary, and then [load it into memory](http://r-pkgs.had.co.nz/package.html#package). ```{r install-codified} # if( !requireNamespace("codified", quietly = TRUE) ) # remotes::install_github(repo = "OuhscBbmc/codified") library(codified) ``` Local Data Source ----------------------------------------------------- ### Establish Datasets ```{r local-establish} path <- system.file("misc/example-data-1.csv", package = "codified") col_types <- readr::cols_only( record_id = readr::col_integer(), name_last = readr::col_character(), dob = readr::col_date(format = ""), gender = readr::col_integer(), race = readr::col_integer(), ethnicity = readr::col_integer() ) ds <- readr::read_csv(path, col_types = col_types) |> dplyr::mutate( gender = as.character(gender), race = as.character(race), ethnicity = as.character(ethnicity) ) ds |> head(10) |> knitr::kable(caption = "Observed Dataset (first ten rows)") ds_lu_gender <- tibble::tribble( ~input, ~displayed , "0" , "Female" , "1" , "Male" , "U" , "Unknown/Not Reported" ) knitr::kable(ds_lu_gender, caption = "Gender Mapping") ds_lu_race <- tibble::tribble( ~input , ~displayed , "1" , "American Indian/Alaska Native" , "2" , "Asian" , "3" , "Native Hawaiian or Other Pacific Islander" , "4" , "Black or African American" , "5" , "White" , "M" , "More than One Race" , "6" , "Unknown or Not Reported" ) knitr::kable(ds_lu_race, caption = "Race Mapping") ds_lu_ethnicity <- tibble::tribble( ~input, ~displayed , "2" , "Not Hispanic or Latino" , "1" , "Hispanic or Latino" , "0" , "Unknown/Not Reported Ethnicity" ) knitr::kable(ds_lu_ethnicity, caption = "Ethnicity Mapping") ``` ### Apply Map to Observed Dataset ```{r local-apply-map} ds_summary_long <- codified::table_nih_enrollment( d = ds, d_lu_gender = ds_lu_gender, d_lu_race = ds_lu_race, d_lu_ethnicity = ds_lu_ethnicity ) knitr::kable(ds_summary_long, caption = "Counts of Each Subgroup") ``` ### Conform to NIH Cosmetics ```{r local-cosmetically-format} codified::table_nih_enrollment_pretty( d = ds, d_lu_gender = ds_lu_gender, d_lu_race = ds_lu_race, d_lu_ethnicity = ds_lu_ethnicity ) ``` REDCap Data Source ----------------------------------------------------- A hosted (fake) clinical trial dataset demonstrates how to extract demographic data from [REDCap](https://projectredcap.org/) and then present the demographic data in the NIH Inclusion Enrollment Report format. ### Establish Datasets First, install the `REDCapR` package if necessary, and then [load it into memory](http://r-pkgs.had.co.nz/package.html#package). ```{r install-redcapr} # if( !requireNamespace("REDCapR", quietly = TRUE) ) # remotes::install_github("OuhscBbmc/REDCapR") library(REDCapR) ``` Next, download the data from the REDCap database into the `ds_2` data.frame. If you're running the most recent version of REDCapR (available on GitHub), the code will be: ```r ds_2 <- REDCapR::redcap_read( redcap_uri = "https://bbmc.ouhsc.edu/redcap/api/", # URL of REDCap Server. token = "F304DEC3793FECC3B6DEEFF66302CAD3", # User-specific token/password. guess_type = FALSE # Keep all variables as strings/characters. )$data ``` However [CRAN policy](https://cran.r-project.org/web/packages/policies.html) understandably discourages vignettes from using "Internet resources" so this vignette mimics the code above with this local call. On your own computer, feel free to call that demonstration REDCap project. ```{r redcap-establish} ds_2 <- readr::read_csv( file = system.file("misc/example-data-2.csv", package = "codified"), col_types = readr::cols(.default = readr::col_character()) ) ``` ### Conform to NIH Cosmetics Now, convert these demographic data into a properly formatted NIH enrollment table. Pass the `ds_lu_gender`, `ds_lu_race`, and `ds_lu_ethnicity` metadata, which was defined above. As a reminder, these translate values like `1` to `Male` and `3` to `Native Hawaiian or Other Pacific Islander`. ```{r redcap-local-cosmetically-format} table_nih_enrollment_pretty( d = ds_2, d_lu_gender = ds_lu_gender, d_lu_race = ds_lu_race, d_lu_ethnicity = ds_lu_ethnicity ) ``` Collapsing Levels ----------------------------------------------------- Many observed datasets may collect race with a different set of levels. For instance in `ds_3`, the `American Indian` level is separate from the `Alaska Native` level. In the first and second rows in the metadata below, the two levels are effectively combined into the`American Indian/Alaska Native` level, so it complies with the format of the NIH Enrollment table. ```{r collapsing-levels} ds_lu_race_3 <- tibble::tribble( ~input , ~displayed , "American Indian" , "American Indian/Alaska Native" , # Combine w/ Alaska Native "Alaska Native" , "American Indian/Alaska Native" , # Combine w/ American Indian "Asian" , "Asian" , "Native Hawaiian" , "Native Hawaiian or Other Pacific Islander" , # Combine w/ Pacific Islanders "Pacific Islander" , "Native Hawaiian or Other Pacific Islander" , # Combine w/ Hawaiian "Black or African American" , "Black or African American" , "White" , "White" , "More than One Race" , "More than One Race" , "Unknown or Not Reported" , "Unknown or Not Reported" ) ds_3 <- tibble::tribble( ~subject_id, ~gender , ~race , ~ethnicity , 1L, "Female" , "American Indian", "Not Hispanic or Latino" , 2L, "Male" , "American Indian", "Not Hispanic or Latino" , 3L, "Male" , "American Indian", "Not Hispanic or Latino" , 4L, "Female" , "Alaska Native" , "Not Hispanic or Latino" , 5L, "Male" , "Alaska Native" , "Not Hispanic or Latino" , 6L, "Male" , "Alaska Native" , "Not Hispanic or Latino" , 7L, "Male" , "White" , "Not Hispanic or Latino" , 8L, "Male" , "White" , "Not Hispanic or Latino" ) table_nih_enrollment_pretty( d = ds_3, d_lu_race = ds_lu_race_3 ) ``` Feedback Welcome ----------------------------------------------------- Please share your opinions with us by creating an issue at https://github.com/OuhscBbmc/codified/issues. ### Upcoming Features 1. Create tables in a fresh docx file with [flextable](https://davidgohel.github.io/flextable/). [Issue #5](https://github.com/OuhscBbmc/codified/issues/5) 1. Create tables in a fresh pdf file [Issue #5](https://github.com/OuhscBbmc/codified/issues/5). 1. Populate tables in an existing pdf file with [staplr](https://CRAN.R-project.org/package=staplr)? 1. The 'totals' column & row are calculated. The NIH form does this already, so this is mostly for the purpose of checking. [Issue #4](https://github.com/OuhscBbmc/codified/issues/4). 1. Allow variable names to change (*e.g.*, `Gender` instead of `gender`). [Issue #3](https://github.com/OuhscBbmc/codified/issues/3). 1. [pkgdown site](https://pkgdown.r-lib.org/), like [REDCapR](https://ouhscbbmc.github.io/REDCapR/). ### Group discussion * Any holes in the functionality or concept? * What else (*e.g.*, features or explanation) do you think would be helpful? * Are people able to run this package? Is the vignette clear? * What other standardized tables/graphs/whatever could this be applied to? * IRB continuation tables? * Graphs?