Re: using !MVINCLUDE with missing random effects

From: craig walling <asremlforum_at_VSNI.CO.UK>
Date: Fri, 29 Jan 2010 11:16:35 +0000

Dear Damian,

Thanks for the quick reply! I did indeed mean !MVINCLUDE, sorry! This does help to explain the difference I am seeing.

Since yesterday I have been playing around with more models with missing data in either the fixed or random effects structure, and I have another question. When a model is fitted with a fixed effect that has missing values and !MVREMOVE is specified, does this drop the whole record from the analysis, or only drop it when that factor is evaluated? I thought it removed all data for any record in which the factor was missing, but some analyses I ran yesterday suggest this is not the case. The analysis is again a bivariate analysis of male and female fitness, as in the model I posted yesterday, but this time I fit two additional fixed effects, reproductive status (stat) and where in the population an individual lives (nwownps), for females only. The model is:

femalesr malesr ~ Trait Trait.age @(Trait,1).stat @(Trait,1).nwownps !r Trait.id Trait.ide(id) Trait.byr Trait.mum

In this analysis, I have replaced all missing mum values with unique mothers, so there are no missing values in the random effects structure. However, there are missing values in the factor nwownps, 88 in total. To simplify things, although males never have a known "nwownps", I have dummy coded them as "MISS"; the effect is not fit for males, so this shouldn't make any difference. If I fit this model specifying !MVINCLUDE, I now understand that an extra "reference level" will be fit for this term which includes all the missing values. If I specify !MVREMOVE, the output states that "88 records were dropped from the analysis because of missing values in design variables and !MVREMOVE was specified". This is fine and exactly what I expected to see. However, when I go into the data myself, remove the entire record wherever the factor "nwownps" is missing, and then re-run the analysis, the output is not identical to that of the model specifying !MVREMOVE. In fact, the variance parameter estimates are closer to those from the model where !MVINCLUDE is specified with the missing values left in the factor "nwownps". Am I misunderstanding what !MVREMOVE does? Does it somehow remove a record only when evaluating the factor for which that record has missing data, but include the record again when evaluating other factors? The output from these three models is:

######################################################
Specifying !MVINCLUDE for a dataset including all individuals
Using 5749 records of 5749 read

  Model term Size #miss #zero MinNon0 Mean MaxNon0
   1 id !P 4051 0 0 462.0 1833. 3683.
   2 sex 2 0 0 1 1.3103 2
   3 byr 34 0 0 1 16.2922 34
   4 calfyr 34 0 0 1 15.1265 34
   5 age 13 0 0 1 5.0557 13
   6 sqage 13 0 0 1 5.0557 13
   7 mum 382 351 0 1 171.5067 382
   8 nwmum 496 0 0 1 189.3195 496
   9 stat 5 1784 0 1 2.7253 5
  10 nwstat 6 0 0 1 3.7415 6
  11 ownps 5 1872 0 1 3.2262 5
  12 nwownps 6 88 0 1 4.1003 6
  13 afwbar Variate 1784 74 0.4324 1.167 1.706
  14 amwbar Variate 3965 77 0.4102 1.352 7.295
  15 Trait 2
  16 Trait.age 26 15 Trait : 2 5 age : 13
  17 at(Trait,1) 1
  18 at(Trait,1).nwstat 6 17 at(Trait,1: 1 10 nwstat : 6
  19 at(Trait,1).nwownps 6 17 at(Trait,1: 1 12 nwownps : 6
  20 Trait.id 8102 15 Trait : 2 1 id : 4051
  21 ide(id) 4051 0 0 462 1832.5429 3683
  22 Trait.ide(id) 8102 15 Trait : 2 21 ide(id) : 4051
  23 Trait.byr 68 15 Trait : 2 3 byr : 34
  24 Trait.nwmum 992 15 Trait : 2 8 nwmum : 496
 Warning: 176 missing values were detected in the design variables
          Missing values are treated as zeros
          Consider deleting the records in which they occur
   5749 identity
      2 UnStructure 0.0303 0.0000 0.1935
   11498 records assumed pre-sorted 2 within 5749
      2 UnStructure 0.0061 0.0000 0.0387
   4051 Ainverse
 Structure for Trait.id has 8102 levels defined
      2 UnStructure 0.0012 0.0000 0.0077
   4051 identity
 Structure for Trait.ide(id) has 8102 levels defined
      2 UnStructure 0.0002 0.0000 0.0015
    496 identity
 Structure for Trait.nwmum has 992 levels defined
      2 UnStructure 0.0000 0.0000 0.0003
     34 identity
 Structure for Trait.byr has 68 levels defined
 Forming 17304 equations: 40 dense.
 Initial updates will be shrunk by factor 0.141
 Notice: 8 singularities detected in design matrix.
   1 LogL= 27.6609 S2= 1.0000 5717 df : 3 components constrained
   2 LogL= 433.644 S2= 1.0000 5717 df : 3 components constrained
   3 LogL= 1189.97 S2= 1.0000 5717 df : 3 components constrained
   4 LogL= 1746.33 S2= 1.0000 5717 df : 1 components constrained
   5 LogL= 2006.70 S2= 1.0000 5717 df
   6 LogL= 2140.41 S2= 1.0000 5717 df
   7 LogL= 2171.95 S2= 1.0000 5717 df
   8 LogL= 2172.64 S2= 1.0000 5717 df
   9 LogL= 2172.65 S2= 1.0000 5717 df
  10 LogL= 2172.65 S2= 1.0000 5717 df

 Source Model terms Gamma Component Comp/SE % C
 Residual UnStructured 1 1 0.887295E-01 0.887295E-01 40.96 0 P
 Residual UnStructured 2 1 0.00000 0.00000 0.00 0 F
 Residual UnStructured 2 2 0.486756 0.486756 26.55 0 P
 Trait.id UnStructured 1 1 0.283601E-02 0.283601E-02 1.49 0 U
 Trait.id UnStructured 2 1 -0.316927E-02 -0.316927E-02 -0.70 0 U
 Trait.id UnStructured 2 2 0.141520E-01 0.141520E-01 0.69 0 U
 Trait.ide(id) UnStructured 1 1 0.484497E-02 0.484497E-02 2.25 0 P
 Trait.ide(id) UnStructured 2 1 0.00000 0.00000 0.00 0 F
 Trait.ide(id) UnStructured 2 2 0.111596 0.111596 3.71 0 P
 Trait.nwmum UnStructured 1 1 0.995829E-03 0.995829E-03 0.82 0 U
 Trait.nwmum UnStructured 2 1 0.153878E-02 0.153878E-02 0.39 0 U
 Trait.nwmum UnStructured 2 2 0.109241E-01 0.109241E-01 0.58 0 U
 Trait.byr UnStructured 1 1 0.569147E-03 0.569147E-03 1.13 0 U
 Trait.byr UnStructured 2 1 0.157119E-02 0.157119E-02 1.24 0 U
 Trait.byr UnStructured 2 2 0.205659E-02 0.205659E-02 0.34 0 U
 Covariance/Variance/Correlation Matrix UnStructured Residual
  0.8873E-01 0.000
   0.000 0.4868
 Covariance/Variance/Correlation Matrix UnStructured Trait.id
  0.2836E-02 -0.5003
 -0.3169E-02 0.1415E-01
 Covariance/Variance/Correlation Matrix UnStructured Trait.ide(id)
  0.4845E-02 0.000
   0.000 0.1116
 Covariance/Variance/Correlation Matrix UnStructured Trait.nwmum
  0.9958E-03 0.4665
  0.1539E-02 0.1092E-01
 Covariance/Variance/Correlation Matrix UnStructured Trait.byr
  0.5691E-03 1.452
  0.1571E-02 0.2057E-02

 Analysis of Variance NumDF DenDF_con F_inc F_con M P_con
  15 Trait 2 24.0 7276.11 1332.60 . <.001
  16 Trait.age 21 4346.6 27.42 29.92 B <.001
  18 at(Trait,1).nwstat 4 3873.5 52.53 53.45 B <.001
  19 at(Trait,1).nwownps 5 373.0 2.95 2.95 B 0.015
 Notice: The DenDF values are calculated ignoring fixed/boundary/singular
             variance parameters using numerical derivatives.
  23 Trait.byr 68 effects fitted
  24 Trait.nwmum 992 effects fitted
  20 Trait.id 8102 effects fitted ( 1110 are zero)
  22 Trait.ide(id) 8102 effects fitted ( 7073 are zero)
 SLOPES FOR LOG(ABS(RES)) on LOG(PV) for Section 1
  -0.41 2.21
          71 possible outliers: see .res file
 Finished: 29 Jan 2010 11:05:16.117 LogL Converged
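
The warning in the output above, "Missing values are treated as zeros", and the alternative of deleting the record outright can be pictured with a toy example. This is a minimal Python sketch on made-up data: the records, the level names and the helper `indicator_row` are all hypothetical, and it only illustrates the two ideas, not how ASReml actually implements !MVINCLUDE or !MVREMOVE.

```python
# Toy illustration only: hypothetical data, not ASReml's internal algorithm.

# Each record is (id, response, factor level); a level of None means "missing".
records = [
    ("a", 1.2, "north"),
    ("b", 0.9, None),
    ("c", 1.5, "south"),
    ("d", 1.1, "north"),
]

# Observed (non-missing) levels of the factor, in a fixed order.
levels = sorted({lvl for _, _, lvl in records if lvl is not None})


def indicator_row(level):
    """Return one 0/1 column per observed level; a missing level gives all zeros."""
    return [1 if level == lvl else 0 for lvl in levels]


# Strategy A: keep every record; a missing level contributes a row of zeros,
# so the record still informs every other term in the model.
keep_all = [(rid, y, indicator_row(lvl)) for rid, y, lvl in records]

# Strategy B: drop the whole record whenever the factor is missing.
drop_missing = [
    (rid, y, indicator_row(lvl)) for rid, y, lvl in records if lvl is not None
]

print(len(keep_all), len(drop_missing))
print(keep_all[1])
```

Under strategy A the record with the missing level is retained and simply shares the baseline for this factor; under strategy B it disappears from the analysis altogether.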

####################################################
Specifying !MVREMOVE for a dataset including all individuals
Using 5749 records of 5749 read

  Model term Size #miss #zero MinNon0 Mean MaxNon0
   1 id !P 4051 0 0 462.0 1833. 3683.
   2 sex 2 0 0 1 1.3103 2
   3 byr 34 0 0 1 16.2922 34
   4 calfyr 34 0 0 1 15.1265 34
   5 age 13 0 0 1 5.0557 13
   6 sqage 13 0 0 1 5.0557 13
   7 mum 382 351 0 1 171.5067 382
   8 nwmum 496 0 0 1 189.3195 496
   9 stat 5 1784 0 1 2.7253 5
  10 nwstat 6 0 0 1 3.7415 6
  11 ownps 5 1872 0 1 3.2262 5
  12 nwownps 6 88 0 1 4.1003 6
  13 afwbar Variate 1784 74 0.4324 1.167 1.706
  14 amwbar Variate 3965 77 0.4102 1.352 7.295
  15 Trait 2
  16 Trait.age 26 15 Trait : 2 5 age : 13
  17 at(Trait,1) 1
  18 at(Trait,1).nwstat 6 17 at(Trait,1: 1 10 nwstat : 6
  19 at(Trait,1).nwownps 6 17 at(Trait,1: 1 12 nwownps : 6
  20 Trait.id 8102 15 Trait : 2 1 id : 4051
  21 ide(id) 4051 0 0 462 1832.5429 3683
  22 Trait.ide(id) 8102 15 Trait : 2 21 ide(id) : 4051
  23 Trait.byr 68 15 Trait : 2 3 byr : 34
  24 Trait.nwmum 992 15 Trait : 2 8 nwmum : 496
 Warning: 176 missing values were detected in the design variables
          and were treated as zeros. They were all associated with missing
          values in the response variable.
 Warning: 88 records were dropped from the analysis because
          of missing values in design variables and !MVREMOVE was specified.
   5705 identity
      2 UnStructure 0.0302 0.0000 0.1172
   11410 records assumed pre-sorted 2 within 5705
      2 UnStructure 0.0060 0.0000 0.0234
   4051 Ainverse
 Structure for Trait.id has 8102 levels defined
      2 UnStructure 0.0012 0.0000 0.0047
   4051 identity
 Structure for Trait.ide(id) has 8102 levels defined
      2 UnStructure 0.0002 0.0000 0.0009
    496 identity
 Structure for Trait.nwmum has 992 levels defined
      2 UnStructure 0.0000 0.0000 0.0002
     34 identity
 Structure for Trait.byr has 68 levels defined
 Forming 17304 equations: 40 dense.
 Initial updates will be shrunk by factor 0.141
 Notice: 9 singularities detected in design matrix.
   1 LogL=-385.873 S2= 1.0000 5630 df : 3 components constrained
   2 LogL=-5.91475 S2= 1.0000 5630 df : 3 components constrained
   3 LogL= 702.555 S2= 1.0000 5630 df : 3 components constrained
   4 LogL= 1226.37 S2= 1.0000 5630 df : 2 components constrained
   5 LogL= 1476.18 S2= 1.0000 5630 df : 1 components constrained
   6 LogL= 1584.01 S2= 1.0000 5630 df : 1 components constrained
   7 LogL= 1625.91 S2= 1.0000 5630 df : 1 components constrained
   8 LogL= 1640.49 S2= 1.0000 5630 df : 1 components constrained
   9 LogL= 1645.08 S2= 1.0000 5630 df
  10 LogL= 1646.69 S2= 1.0000 5630 df
  11 LogL= 1646.88 S2= 1.0000 5630 df
  12 LogL= 1646.89 S2= 1.0000 5630 df
  13 LogL= 1646.89 S2= 1.0000 5630 df

 Source Model terms Gamma Component Comp/SE % C
 Residual UnStructured 1 1 0.888824E-01 0.888824E-01 30.50 0 P
 Residual UnStructured 2 1 0.00000 0.00000 0.00 0 F
 Residual UnStructured 2 2 0.277180 0.277180 38.76 0 P
 Trait.id UnStructured 1 1 0.164406E-03 0.164406E-03 0.11 0 U
 Trait.id UnStructured 2 1 -0.378516E-02 -0.378516E-02 -1.00 0 U
 Trait.id UnStructured 2 2 0.123025E-01 0.123025E-01 0.62 0 U
 Trait.ide(id) UnStructured 1 1 0.269165E-02 0.269165E-02 1.35 0 P
 Trait.ide(id) UnStructured 2 1 0.00000 0.00000 0.00 0 F
 Trait.ide(id) UnStructured 2 2 0.147112 0.147112 5.03 0 P
 Trait.nwmum UnStructured 1 1 0.174412E-02 0.174412E-02 1.44 0 U
 Trait.nwmum UnStructured 2 1 0.298935E-02 0.298935E-02 0.74 0 U
 Trait.nwmum UnStructured 2 2 0.121297E-01 0.121297E-01 0.64 0 U
 Trait.byr UnStructured 1 1 -0.139263E-03 -0.139263E-03 -0.40 0 U
 Trait.byr UnStructured 2 1 0.956015E-03 0.956015E-03 0.94 0 U
 Trait.byr UnStructured 2 2 0.365272E-02 0.365272E-02 0.59 0 U
 Covariance/Variance/Correlation Matrix UnStructured Residual
  0.8888E-01 0.000
   0.000 0.2772
 Covariance/Variance/Correlation Matrix UnStructured Trait.id
  0.1644E-03 -2.662
 -0.3785E-02 0.1230E-01
 Covariance/Variance/Correlation Matrix UnStructured Trait.ide(id)
  0.2692E-02 0.000
   0.000 0.1471
 Covariance/Variance/Correlation Matrix UnStructured Trait.nwmum
  0.1744E-02 0.6499
  0.2989E-02 0.1213E-01
 Covariance/Variance/Correlation Matrix UnStructured Trait.byr
 -0.1393E-03 1.340
  0.9560E-03 0.3653E-02

 Analysis of Variance NumDF DenDF_con F_inc F_con M P_con
  15 Trait 2 7.7 14360.54 14360.54 . <.001
  16 Trait.age 21 4741.8 27.57 28.58 B <.001
  18 at(Trait,1).nwstat 4 3262.9 29.95 29.90 B <.001
  19 at(Trait,1).nwownps 4 164.0 0.86 0.86 B 0.491
 Notice: The DenDF values are calculated ignoring fixed/boundary/singular
             variance parameters using numerical derivatives.
  23 Trait.byr 68 effects fitted
  24 Trait.nwmum 992 effects fitted ( 2 are zero)
  20 Trait.id 8102 effects fitted ( 1112 are zero)
  22 Trait.ide(id) 8102 effects fitted ( 7077 are zero)
 SLOPES FOR LOG(ABS(RES)) on LOG(PV) for Section 1
  -0.66 1.73
          73 possible outliers: see .res file
 Finished: 29 Jan 2010 11:07:49.184 LogL Converged
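
The record counts printed in these outputs can be cross-checked with a little arithmetic. The sketch below only restates numbers copied from the listings (the variable names are made up for illustration). Reading the "88 records" in the warning as rows of the expanded two-trait data is an inference from the counts, not something the output states.

```python
# Figures copied from the ASReml listings; variable names are illustrative only.
rows_mvinclude = 5749       # "5749 identity" in the !MVINCLUDE run
rows_mvremove = 5705        # "5705 identity" in the !MVREMOVE run
rows_manual = 5661          # "Using 5661 records of 5661 read" in the third run
expanded_mvinclude = 11498  # "11498 records assumed pre-sorted 2 within 5749"
expanded_mvremove = 11410   # "11410 records assumed pre-sorted 2 within 5705"
dropped_reported = 88       # "88 records were dropped from the analysis ..."
traits = 2                  # bivariate model

# Data rows lost under !MVREMOVE versus rows deleted by hand.
rows_lost_mvremove = rows_mvinclude - rows_mvremove
rows_lost_manual = rows_mvinclude - rows_manual

# Rows lost on the expanded (trait within record) scale.
expanded_lost = expanded_mvinclude - expanded_mvremove

print("data rows lost with !MVREMOVE:", rows_lost_mvremove)
print("data rows deleted by hand:    ", rows_lost_manual)
print("expanded rows lost:           ", expanded_lost)
```

On these figures the !MVREMOVE run retains 44 more data rows than the hand-edited file, so the two analyses are not based on the same set of records.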

##################################################
Specifying !MVREMOVE for a dataset from which I have removed the individuals with missing fixed effects (the output is identical whether you specify !MVREMOVE or !MVINCLUDE, as you would expect)
Using 5661 records of 5661 read

  Model term Size #miss #zero MinNon0 Mean MaxNon0
   1 id !P 4051 0 0 462.0 1834. 3683.
   2 sex 2 0 0 1 1.3151 2
   3 byr 34 0 0 1 16.3323 34
   4 calfyr 34 0 0 1 15.0694 34
   5 age 13 0 0 1 5.0548 13
   6 sqage 13 0 0 1 5.0548 13
   7 mum 381 351 0 1 170.8411 381
   8 nwmum 495 0 0 1 188.8935 495
   9 stat 5 1784 0 1 2.7034 5
  10 nwstat 6 0 0 1 3.7423 6
  11 ownps 5 1784 0 1 3.2262 5
  12 nwownps 6 0 0 1 4.1003 6
  13 afwbar Variate 1784 71 0.4324 1.168 1.706
  14 amwbar Variate 3877 77 0.4102 1.352 7.295
  15 Trait 2
  16 Trait.age 26 15 Trait : 2 5 age : 13
  17 at(Trait,1) 1
  18 at(Trait,1).nwstat 6 17 at(Trait,1: 1 10 nwstat : 6
  19 at(Trait,1).nwownps 6 17 at(Trait,1: 1 12 nwownps : 6
  20 Trait.id 8102 15 Trait : 2 1 id : 4051
  21 ide(id) 4051 0 0 462 1833.9258 3683
  22 Trait.ide(id) 8102 15 Trait : 2 21 ide(id) : 4051
  23 Trait.byr 68 15 Trait : 2 3 byr : 34
  24 Trait.nwmum 990 15 Trait : 2 8 nwmum : 495
   5661 identity
      2 UnStructure 0.0301 0.0000 0.1935
   11322 records assumed pre-sorted 2 within 5661
      2 UnStructure 0.0060 0.0000 0.0387
   4051 Ainverse
 Structure for Trait.id has 8102 levels defined
      2 UnStructure 0.0012 0.0000 0.0077
   4051 identity
 Structure for Trait.ide(id) has 8102 levels defined
      2 UnStructure 0.0002 0.0000 0.0015
    495 identity
 Structure for Trait.nwmum has 990 levels defined
      2 UnStructure 0.0000 0.0000 0.0003
     34 identity
 Structure for Trait.byr has 68 levels defined
 Forming 17302 equations: 40 dense.
 Initial updates will be shrunk by factor 0.141
 Notice: 9 singularities detected in design matrix.
   1 LogL= 25.4263 S2= 1.0000 5630 df : 3 components constrained
   2 LogL= 424.075 S2= 1.0000 5630 df : 3 components constrained
   3 LogL= 1166.71 S2= 1.0000 5630 df : 3 components constrained
   4 LogL= 1712.87 S2= 1.0000 5630 df : 1 components constrained
   5 LogL= 1968.34 S2= 1.0000 5630 df
   6 LogL= 2099.44 S2= 1.0000 5630 df
   7 LogL= 2130.29 S2= 1.0000 5630 df
   8 LogL= 2130.95 S2= 1.0000 5630 df
   9 LogL= 2130.96 S2= 1.0000 5630 df
  10 LogL= 2130.96 S2= 1.0000 5630 df

 Source Model terms Gamma Component Comp/SE % C
 Residual UnStructured 1 1 0.879386E-01 0.879386E-01 40.46 0 P
 Residual UnStructured 2 1 0.00000 0.00000 0.00 0 F
 Residual UnStructured 2 2 0.486752 0.486752 26.55 0 P
 Trait.id UnStructured 1 1 0.217031E-02 0.217031E-02 1.18 0 U
 Trait.id UnStructured 2 1 -0.296384E-02 -0.296384E-02 -0.68 0 U
 Trait.id UnStructured 2 2 0.141078E-01 0.141078E-01 0.68 0 U
 Trait.ide(id) UnStructured 1 1 0.596537E-02 0.596537E-02 2.71 0 P
 Trait.ide(id) UnStructured 2 1 0.00000 0.00000 0.00 0 F
 Trait.ide(id) UnStructured 2 2 0.111939 0.111939 3.73 0 P
 Trait.nwmum UnStructured 1 1 0.652810E-03 0.652810E-03 0.54 0 U
 Trait.nwmum UnStructured 2 1 0.165685E-02 0.165685E-02 0.42 0 U
 Trait.nwmum UnStructured 2 2 0.107217E-01 0.107217E-01 0.57 0 U
 Trait.byr UnStructured 1 1 0.415484E-03 0.415484E-03 0.87 0 U
 Trait.byr UnStructured 2 1 0.145915E-02 0.145915E-02 1.20 0 U
 Trait.byr UnStructured 2 2 0.186592E-02 0.186592E-02 0.31 0 U
 Covariance/Variance/Correlation Matrix UnStructured Residual
  0.8794E-01 0.000
   0.000 0.4868
 Covariance/Variance/Correlation Matrix UnStructured Trait.id
  0.2170E-02 -0.5356
 -0.2964E-02 0.1411E-01
 Covariance/Variance/Correlation Matrix UnStructured Trait.ide(id)
  0.5965E-02 0.000
   0.000 0.1119
 Covariance/Variance/Correlation Matrix UnStructured Trait.nwmum
  0.6528E-03 0.6263
  0.1657E-02 0.1072E-01
 Covariance/Variance/Correlation Matrix UnStructured Trait.byr
  0.4155E-03 1.657
  0.1459E-02 0.1866E-02

 Analysis of Variance NumDF DenDF_con F_inc F_con M P_con
  15 Trait 2 21.0 8130.41 8130.41 . <.001
  16 Trait.age 21 4301.9 27.41 29.71 B <.001
  18 at(Trait,1).nwstat 4 3783.6 53.28 52.97 B <.001
  19 at(Trait,1).nwownps 4 236.4 1.01 1.01 B 0.401
 Notice: The DenDF values are calculated ignoring fixed/boundary/singular
             variance parameters using numerical derivatives.
  23 Trait.byr 68 effects fitted
  24 Trait.nwmum 990 effects fitted
  20 Trait.id 8102 effects fitted ( 1112 are zero)
  22 Trait.ide(id) 8102 effects fitted ( 7077 are zero)
 SLOPES FOR LOG(ABS(RES)) on LOG(PV) for Section 1
  -0.43 2.19
          72 possible outliers: see .res file
 Finished: 28 Jan 2010 19:21:03.735 LogL Converged
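
To make the comparison of variance components concrete, the sketch below tabulates the diagonal components for the residual, Trait.id and Trait.ide(id) terms from the three listings and reports, for each one, which full-data run lies closer to the hand-edited run. The numbers are copied from the outputs above; the dictionary layout and names are illustrative assumptions, and absolute difference is just one simple way of measuring "closer".

```python
# Components copied from the three listings, as (MVINCLUDE, MVREMOVE, manual).
components = {
    "Residual 1 1":      (0.887295E-01, 0.888824E-01, 0.879386E-01),
    "Residual 2 2":      (0.486756,     0.277180,     0.486752),
    "Trait.id 1 1":      (0.283601E-02, 0.164406E-03, 0.217031E-02),
    "Trait.id 2 2":      (0.141520E-01, 0.123025E-01, 0.141078E-01),
    "Trait.ide(id) 1 1": (0.484497E-02, 0.269165E-02, 0.596537E-02),
    "Trait.ide(id) 2 2": (0.111596,     0.147112,     0.111939),
}

closer = {}
for name, (mvinclude, mvremove, manual) in components.items():
    # Absolute distance of each full-data run from the hand-edited run.
    d_include = abs(mvinclude - manual)
    d_remove = abs(mvremove - manual)
    closer[name] = "MVINCLUDE" if d_include < d_remove else "MVREMOVE"
    print(f"{name:18s} |incl-manual|={d_include:.6f} "
          f"|remove-manual|={d_remove:.6f} -> {closer[name]}")
```

For all six components listed, the !MVINCLUDE estimate is the one nearer to the hand-edited run, with the largest gap in the second residual variance.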

Thanks again for the explanation of the previous problem. Sorry if this one is a similar misunderstanding of mine!

Craig

Received on Fri Jan 29 2010 - 11:16:35 EST
