This function scans both the data dictionary and the data set for characters that may conflict with dbGaP submission requirements. Specifically, it checks for: (1) non-ASCII characters (e.g., accented letters) and (2) newline and carriage return characters (i.e., line breaks). It reports the variable names (columns), row numbers, and values where these issues are detected.
Value
Tibble, returned invisibly, containing: (1) Time (time stamp); (2) Name (name of the function); (3) Status (Passed/Failed); (4) Message (a copy of the message the function printed out); (5) Information (file, column, row, value, and issue type for each detected non-ASCII, newline, or carriage return character).
Examples
# Passed example
data(ExampleA)
ascii_check(DD.dict.A, DS.data.A)
#> $Message
#> [1] "Passed: no non-ASCII characters detected in data dictionary or data set."
#>
# Failed example
data(ExampleT)
ascii_check(DD.dict.T, DS.data.T)
#> $Message
#> [1] "ERROR: non-ASCII characters detected. See Information for details."
#>
#> $Information
#>              file column row  value          issue_type
#> 1 Data dictionary VALUES   5 0=café Non-ASCII character
#>
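# Illustrative sketch
The two checks described above (non-ASCII characters and line breaks) can be approximated in base R as follows. This is only a minimal sketch and not the package's implementation: the helper `find_problem_chars()` and the toy data frames `dict` and `data` are hypothetical names made up for illustration. The output columns mirror those shown in the failed example.

```r
# Hypothetical helper (an assumption, not part of the package): returns one row
# per cell of `df` that contains a non-ASCII character or a line break.
find_problem_chars <- function(df, file_label) {
  rows <- list()
  for (col in names(df)) {
    values <- as.character(df[[col]])

    # A byte above 127 means the string is not pure ASCII.
    has_non_ascii <- vapply(
      values,
      function(v) !is.na(v) && any(as.integer(charToRaw(v)) > 127L),
      logical(1),
      USE.NAMES = FALSE
    )
    # Newline or carriage return anywhere in the value (NA counts as clean).
    has_line_break <- grepl("[\r\n]", values)

    flags <- list(
      "Non-ASCII character" = has_non_ascii,
      "Newline or carriage return" = has_line_break
    )
    for (issue in names(flags)) {
      for (i in which(flags[[issue]])) {
        rows[[length(rows) + 1L]] <- data.frame(
          file = file_label, column = col, row = i, value = values[i],
          issue_type = issue, stringsAsFactors = FALSE
        )
      }
    }
  }
  if (length(rows) == 0L) {
    # Nothing detected: return an empty table with the same columns.
    return(data.frame(
      file = character(), column = character(), row = integer(),
      value = character(), issue_type = character(),
      stringsAsFactors = FALSE
    ))
  }
  do.call(rbind, rows)
}

# Toy inputs (assumed for illustration only).
dict <- data.frame(
  VARNAME = c("SEX", "SMOKER"),
  VALUES = c("0=male", "0=caf\u00e9"),
  stringsAsFactors = FALSE
)
data <- data.frame(
  SEX = c(0, 1),
  NOTE = c("ok", "line one\nline two"),
  stringsAsFactors = FALSE
)

issues <- rbind(
  find_problem_chars(dict, "Data dictionary"),
  find_problem_chars(data, "Data set")
)
print(issues)
```

The byte-level test is used here because it does not depend on the session locale; a cell that contains both kinds of problem is reported once per issue type.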