This function checks that the first column of the data set is the primary participant ID and is labeled SUBJECT_ID, that the ID values contain no illegal characters or zero padding, and that every participant has an ID.

Usage

id_check(DS.data, verbose = TRUE)

Arguments

DS.data

Data set.

verbose

When TRUE, the function prints the Message along with more detailed diagnostic information.

Value

Tibble, returned invisibly, containing: (1) Time (Time stamp); (2) Function (Name of the function); (3) Status (Passed/Failed); (4) Message (A copy of the message the function printed out); (5) Information (Detailed information about the five ID checks that were performed).

Details

Subject IDs should be integer or string values. Integers should not have zero padding. IDs should not contain spaces. Specifically, only the following characters can be included in the ID: English letters, Arabic numerals, period (.), hyphen (-), underscore (_), at symbol (@), and the pound sign (#). All IDs should be filled in (i.e., no missing IDs are allowed).
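
The rules above can be approximated in base R as a rough illustration. This is only a sketch, not the implementation used by id_check: the helper name check_ids_sketch and the example IDs are hypothetical, and it assumes that "zero padding" means an all-digit ID of two or more digits that starts with 0 and that an empty string counts as a missing ID.

```r
# Hypothetical helper (not part of the package): flags IDs that break the
# rules described in Details.
check_ids_sketch <- function(ids) {
  ids_chr <- as.character(ids)

  # Missing: NA or (by assumption) an empty string.
  missing <- is.na(ids_chr) | ids_chr == ""

  # Illegal: anything outside letters, digits, . - _ @ #
  # (a space therefore counts as illegal).
  illegal <- !missing & !grepl("^[A-Za-z0-9._@#-]+$", ids_chr, perl = TRUE)

  # Zero padded (assumption): all digits, at least two, starting with 0.
  padded <- !missing & grepl("^0[0-9]+$", ids_chr, perl = TRUE)

  data.frame(id = ids_chr, missing = missing, illegal = illegal,
             padded = padded, stringsAsFactors = FALSE)
}

# Made-up IDs for illustration only.
example_ids <- c("1001", "0042", "A 17", "site-3_b@x#1", NA)
flags <- check_ids_sketch(example_ids)
print(flags)
```

Rows where any flag is TRUE would need to be corrected before the data set could be expected to pass the corresponding checks; the exact patterns id_check applies may differ from this sketch.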

Examples

# Example 1: Fail check, 'SUBJECT_ID' not present
data(ExampleO)
id_check(DS.data.O)
#> $Message
#> [1] "ERROR: not all ID variable requirements are met. See Information for more details."
#> 
#> $Information
#> # A tibble: 5 × 4
#>   check.name check.description                              check.status details
#>   <chr>      <chr>                                          <chr>        <chr>  
#> 1 Check 1    Column 1 is labeled as 'SUBJECT_ID'.           Failed       The fi…
#> 2 Check 2    'SUBJECT_ID' is a column name in the data set. Failed       'SUBJE…
#> 3 Check 3    'SUBJECT_ID' is a column name in the data set. Failed       Checks…
#> 4 Check 4    No leading zeros detected in 'SUBJECT_ID' col… Failed       Checks…
#> 5 Check 5    No missing values for 'SUBJECT_ID'.            Failed       Checks…
#> 
print(id_check(DS.data.O, verbose=FALSE))
#> # A tibble: 1 × 5
#>   Time                Function Status Message                        Information
#>   <dttm>              <chr>    <chr>  <chr>                          <named lis>
#> 1 2023-09-27 11:01:16 id_check Failed ERROR: not all ID variable re… <tibble>   

# Example 2: Fail check, 'SUBJECT_ID' includes illegal spaces
data(ExampleP)
id_check(DS.data.P)
#> $Message
#> [1] "ERROR: not all ID variable requirements are met. See Information for more details."
#> 
#> $Information
#> # A tibble: 5 × 4
#>   check.name check.description                              check.status details
#>   <chr>      <chr>                                          <chr>        <chr>  
#> 1 Check 1    Column 1 is labeled as 'SUBJECT_ID'.           Passed       The fi…
#> 2 Check 2    'SUBJECT_ID' is a column name in the data set. Passed       'SUBJE…
#> 3 Check 3    'SUBJECT_ID' is a column name in the data set. Failed       Illega…
#> 4 Check 4    No leading zeros detected in 'SUBJECT_ID' col… Passed       No lea…
#> 5 Check 5    No missing values for 'SUBJECT_ID'.            Passed       No mis…
#> 
results <- id_check(DS.data.P)
#> $Message
#> [1] "ERROR: not all ID variable requirements are met. See Information for more details."
#> 
#> $Information
#> # A tibble: 5 × 4
#>   check.name check.description                              check.status details
#>   <chr>      <chr>                                          <chr>        <chr>  
#> 1 Check 1    Column 1 is labeled as 'SUBJECT_ID'.           Passed       The fi…
#> 2 Check 2    'SUBJECT_ID' is a column name in the data set. Passed       'SUBJE…
#> 3 Check 3    'SUBJECT_ID' is a column name in the data set. Failed       Illega…
#> 4 Check 4    No leading zeros detected in 'SUBJECT_ID' col… Passed       No lea…
#> 5 Check 5    No missing values for 'SUBJECT_ID'.            Passed       No mis…
#> 
results$Information[[1]]$details
#> [1] "The first column name is SUBJECT_ID."                                                                                                                                                                                                                
#> [2] "'SUBJECT_ID' is the name of column 1."                                                                                                                                                                                                               
#> [3] "Illegal characters detected in 'SUBJECT_ID' for 100 row(s). SUBJECT_ID may contain only: English letters, Arabic numerals, period (.), hyphen (-), underscore (_), at symbol (@), and the pound sign (#). No spaces or other characters are allowed."
#> [4] "No leading zeros detected in 'SUBJECT_ID'."                                                                                                                                                                                                          
#> [5] "No missing values detected for 'SUBJECT_ID'."                                                                                                                                                                                                        
print(id_check(DS.data.P, verbose=FALSE))
#> # A tibble: 1 × 5
#>   Time                Function Status Message                        Information
#>   <dttm>              <chr>    <chr>  <chr>                          <named lis>
#> 1 2023-09-27 11:01:16 id_check Failed ERROR: not all ID variable re… <tibble>   

# Example 3: Pass check
data(ExampleA)
id_check(DS.data.A)
#> $Message
#> [1] "Passed: All ID variable checks passed."
#> 
#> $Information
#> # A tibble: 5 × 4
#>   check.name check.description                              check.status details
#>   <chr>      <chr>                                          <chr>        <chr>  
#> 1 Check 1    Column 1 is labeled as 'SUBJECT_ID'.           Passed       The fi…
#> 2 Check 2    'SUBJECT_ID' is a column name in the data set. Passed       'SUBJE…
#> 3 Check 3    'SUBJECT_ID' is a column name in the data set. Passed       No ill…
#> 4 Check 4    No leading zeros detected in 'SUBJECT_ID' col… Passed       No lea…
#> 5 Check 5    No missing values for 'SUBJECT_ID'.            Passed       No mis…
#> 
print(id_check(DS.data.A, verbose=FALSE))
#> # A tibble: 1 × 5
#>   Time                Function Status Message                        Information
#>   <dttm>              <chr>    <chr>  <chr>                          <named lis>
#> 1 2023-09-27 11:01:16 id_check Passed Passed: All ID variable check… <tibble>