Preface

R packages

This practical uses a number of R packages. The R and package versions that were used to generate the solutions are:

  • R version 3.6.1 (2019-07-05)
  • mice (version: 3.6.0)
  • RColorBrewer (version: 1.1.2)
  • reshape2 (version: 1.4.3)
  • ggplot2 (version: 3.2.1)
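To compare your own installation with the versions listed above, something along the following lines may help. It is a minimal sketch that uses only base R; the variable names `pkgs` and `is_installed` are made up for this illustration.

```r
# Packages named in the list above
pkgs <- c("mice", "RColorBrewer", "reshape2", "ggplot2")

# TRUE/FALSE per package: can its namespace be loaded?
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# R version, then the version of every package that was found
print(R.version.string)
for (p in pkgs[is_installed]) {
  cat(p, as.character(packageVersion(p)), "\n")
}

# Packages that still need to be installed (if any)
print(pkgs[!is_installed])
```

Newer versions than the ones listed will usually work, but output may differ slightly from the solutions shown here.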

Dataset

For this practical, we will again use the NHANES2 dataset that we saw in the previous practical.

To load this dataset, you can use load(file.choose()): file.choose() opens a file browser that allows you to navigate to the location of the file NHANES2_for_practicals.RData on your computer and returns its path, and load() then reads the file. If you know the path to the file, you can also use load("<path>/NHANES2_for_practicals.RData"). RStudio users can also just click on the file in the “Files” pane/tab to load it.

Imputed data

The imputed data are stored in a mids object called imp that we created in the previous practical.

If you are using RStudio, you can load it into your workspace by clicking on the file imps.RData in the “Files” pane/tab. Alternatively, you can load this workspace using load("<path>/imps.RData"). You then need to run:

imp <- savedimps_imp
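If you are unsure which objects an .RData file contains, you can inspect the return value of load(). The following is a minimal sketch on a temporary file. The file and the objects `dummy_a` and `dummy_b` are invented for this illustration and have nothing to do with imps.RData.

```r
# Create a small .RData file to experiment with (hypothetical content)
tmp <- tempfile(fileext = ".RData")
dummy_a <- 1:3
dummy_b <- "text"
save(dummy_a, dummy_b, file = tmp)

# Load into a separate environment so the workspace is not overwritten;
# load() invisibly returns the names of the objects it restored
e <- new.env()
loaded <- load(tmp, envir = e)
print(loaded)

# Retrieve one of the restored objects by name
a_copy <- get("dummy_a", envir = e)
print(a_copy)
```

The same pattern, applied to the path of a real file, shows the object names before anything in your workspace is replaced.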

Evaluate the imputation

Checking the settings

It is good practice to make sure that mice() has not done any processing of the data that was not planned or that you are not aware of. This means checking that the correct method, predictorMatrix and visitSequence were used.

Task

Do these checks for imp.

Solution

identical(imp$method, meth)
## [1] TRUE
identical(imp$predictorMatrix, pred)
## [1] TRUE
identical(imp$visitSequence, visSeq)
## [1] TRUE
# or:
imp$method
##             HDL            race            bili           smoke              DM          gender 
##          "norm"              ""          "norm"          "polr"              ""              "" 
##              WC            chol        HyperMed             alc             SBP             wgt 
##          "norm"          "norm"              ""          "polr"          "norm"          "norm" 
##          hypten          cohort           occup             age            educ            albu 
##        "logreg"              ""       "polyreg"              ""          "polr"          "norm" 
##           creat        uricacid             BMI         hypchol             hgt 
##           "pmm"          "norm" "~I(wgt/hgt^2)"        "logreg"          "norm"
imp$predictorMatrix
##          HDL race bili smoke DM gender WC chol HyperMed alc SBP wgt hypten cohort occup age educ
## HDL        0    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## race       1    0    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## bili       1    1    0     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## smoke      1    1    1     0  1      1  1    1        0   1   1   0      1      0     1   1    1
## DM         1    1    1     1  0      1  1    1        0   1   1   0      1      0     1   1    1
## gender     1    1    1     1  1      0  1    1        0   1   1   0      1      0     1   1    1
## WC         1    1    1     1  1      1  0    1        0   1   1   0      1      0     1   1    1
## chol       1    1    1     1  1      1  1    0        0   1   1   0      1      0     1   1    1
## HyperMed   1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## alc        1    1    1     1  1      1  1    1        0   0   1   0      1      0     1   1    1
## SBP        1    1    1     1  1      1  1    1        0   1   0   0      1      0     1   1    1
## wgt        1    1    1     1  1      1  0    1        0   1   1   0      1      0     1   1    1
## hypten     1    1    1     1  1      1  1    1        0   1   1   0      0      0     1   1    1
## cohort     1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## occup      1    1    1     1  1      1  1    1        0   1   1   0      1      0     0   1    1
## age        1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   0    1
## educ       1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    0
## albu       1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## creat      1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## uricacid   1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## BMI        1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## hypchol    1    1    1     1  1      1  1    1        0   1   1   0      1      0     1   1    1
## hgt        1    1    1     1  1      1  1    1        0   1   1   1      1      0     1   1    1
##          albu creat uricacid BMI hypchol hgt
## HDL         1     1        1   1       1   0
## race        1     1        1   1       1   0
## bili        1     1        1   1       1   0
## smoke       1     1        1   1       1   0
## DM          1     1        1   1       1   0
## gender      1     1        1   1       1   0
## WC          1     1        1   1       1   0
## chol        1     1        1   1       0   0
## HyperMed    1     1        1   1       1   0
## alc         1     1        1   1       1   0
## SBP         1     1        1   1       1   0
## wgt         1     1        1   0       1   1
## hypten      1     1        1   1       1   0
## cohort      1     1        1   1       1   0
## occup       1     1        1   1       1   0
## age         1     1        1   1       1   0
## educ        1     1        1   1       1   0
## albu        0     1        1   1       1   0
## creat       1     0        1   1       1   0
## uricacid    1     1        0   1       1   0
## BMI         1     1        1   0       1   0
## hypchol     1     1        1   1       0   0
## hgt         1     1        1   0       1   0
imp$visitSequence
##  [1] "HDL"      "race"     "bili"     "smoke"    "DM"       "gender"   "WC"       "chol"    
##  [9] "HyperMed" "alc"      "SBP"      "wgt"      "hypten"   "cohort"   "occup"    "age"     
## [17] "educ"     "albu"     "creat"    "uricacid" "hypchol"  "hgt"      "BMI"
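The checks above compare imp with the objects meth, pred and visSeq from the previous practical. As a self-contained illustration of the same idea, the sketch below runs the whole cycle on the small nhanes example data that ships with mice (an assumption made to keep the example runnable; it is not the NHANES2 data). All object names ending in `_demo`, and the two changes made to the settings, are arbitrary choices for this illustration.

```r
library(mice)

# Stand-in data: 'nhanes' is a small example data set shipped with mice
# (it is NOT the NHANES2 data of this practical).
imp0 <- mice(nhanes, maxit = 0)          # set-up run without iterations

# Planned settings, starting from the defaults of the set-up run
meth_demo <- imp0$method
pred_demo <- imp0$predictorMatrix
vis_demo  <- imp0$visitSequence
meth_demo["bmi"] <- "norm"               # arbitrary change for illustration
pred_demo["chl", "hyp"] <- 0             # arbitrary change for illustration

imp_demo <- mice(nhanes, method = meth_demo, predictorMatrix = pred_demo,
                 visitSequence = vis_demo, m = 2, maxit = 2,
                 seed = 2020, printFlag = FALSE)

# The same three checks as in the solution
checks <- c(
  method          = identical(imp_demo$method, meth_demo),
  predictorMatrix = identical(imp_demo$predictorMatrix, pred_demo),
  visitSequence   = identical(imp_demo$visitSequence, vis_demo)
)
print(checks)

# If a check fails, it helps to list the cells that differ.
# Here we compare against the untouched defaults to force a difference.
diff_cells <- which(imp_demo$predictorMatrix != imp0$predictorMatrix,
                    arr.ind = TRUE)
print(diff_cells)
```

Listing the differing cells with which(..., arr.ind = TRUE) is usually quicker than scanning a large printed predictorMatrix by eye.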

Logged events

Checking the loggedEvents shows us whether mice() detected any problems during the imputation.

Task 1

Check the loggedEvents for imp.

Solution 1

imp$loggedEvents
## NULL

There are no logged events, great!
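For comparison, it can be useful to see what a non-empty loggedEvents looks like. The sketch below provokes one on purpose by adding a constant column to the small nhanes example data from mice. Both the stand-in data and the column `const` are assumptions made only for this illustration.

```r
library(mice)

# Stand-in data: 'nhanes' from mice plus an artificial constant column
dat <- nhanes
dat$const <- 1   # hypothetical variable, added only to trigger a logged event

# mice() warns about logged events; the warning is suppressed here
# because we inspect the log directly
imp_log <- suppressWarnings(
  mice(dat, m = 2, maxit = 2, seed = 2020, printFlag = FALSE)
)

# One row per event: iteration, imputation number, variable being imputed,
# type of event and the variable(s) that were removed
print(imp_log$loggedEvents)

# Quick overview of the event types
print(table(imp_log$loggedEvents$meth))
```

The column meth gives the reason for the event and the column out names the affected variable.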

Task 2

Let’s see what would have happened if we had not prepared the predictorMatrix, method and visitSequence before imputation.

Run the imputation without setting any additional arguments:
impnaive <- mice(NHANES2, m = 5, maxit = 10)

Take a look at the loggedEvents of impnaive.

Solution 2

impnaive$loggedEvents