A Prediction problem

From: <arthur.gilmour_at_DPI.NSW.GOV.AU>
Date: Fri, 24 Oct 2008 10:37:04 +1100

Dear ASReml users

I enclose my response to an ASReml user who had trouble predicting a
two-way table and its margins.
Since the example is instructive, I have taken the liberty of sharing it
for your interest.

Title: exampledata.
 A * # !I
 B * # !I
 C #!*10
 D #!*10
 E * # !I
 F * # !I
# Check/Correct these field definitions.
exampledata.csv !SKIP 1
tabulate C ~ B A F !stats
tabulate C ~ B A !stats
tabulate C ~ A B !stats
tabulate C ~ F B !stats

C ~ mu F B D F.B F.D -B.D -F.B.D, # Specify fixed model
      !r A # Specify random model

  A has 140 levels,
  B has 6 levels,
  D is a covariate
  E is not used so ignored in these notes
  F has 4 levels.
 The combinations of B A F define the individual observations.
 It appears levels of A are largely nested in levels of B, but not
completely, in that some levels of A appear in two different levels of B.
 So there are 140 levels of A and 157 levels in A.B.
 Use of !I says to treat the values in the data file as labels rather than
directly as codes.
 However, it seems that they should be taken directly as level codes, so I
have changed !I to *
 so that they appear in natural order.
 The analysis of the example gives

          - - - Results from analysis of C - - -

          Approximate stratum variance decomposition
 Stratum Degrees-Freedom Variance Component Coefficients

 Source      Model  terms         Gamma     Component  Comp/SE  %
 A             140    140  0.102574E-05  0.133539E-08     0.00  0
 Variance      280    248       1.00000  0.130188E-02    11.14  0
 Warning: Code B - fixed at a boundary (!GP) F - fixed by user
               ? - liable to change from P to B P - positive definite
               C - Constrained by user (!VCC) U - unbounded
               S - Singular Information matrix
 S means there is no information in the data for this parameter.
 Very small components with Comp/SE ratios of zero sometimes indicate poor
           scaling. Consider rescaling the design matrix in such cases.

                                   Wald F statistics
     Source of Variation   NumDF   DenDF_con      F_inc      F_con  M
   7 mu                        1       248.0    1281.91     283.34  .
   6 F                         3       248.0     201.57     192.33  A
   2 B                         5       248.0      11.44      11.72  A
   4 D                         1       248.0       3.31       0.06  A

   8 F.B                       9       248.0       3.89       3.40  b
   9 F.D                       3       248.0       1.06       0.98  B
  10 B.D                       4       248.0       1.18       1.18  B
  11 F.B.D                     6       248.0       1.26       1.26  C
 Notice: The DenDF values are calculated ignoring fixed/boundary/singular
             variance parameters using algebraic derivatives.
   1 A 140 effects fitted ( 2 are

   This shows no variance component associated with A,
   a big effect of F and B and of their interaction, and no effect of D.

   So you wanted to predict these tables.
   Tabulation shows 18 combinations of F and B
   F1   B1  B2  B3  B4  B5  B6
   F2   B1  B2  B3  -   -   -
   F3   B1  B2  B3  B4  B5  B6
   F4   -   -   -   B4  B5  B6

   Surprisingly, only 14 combinations are reported from
   predict F B
   The ones missing are F1B6, F3B6, F4B5 and F4B6,
   despite the fact that there is the correct DF (1 + 3 + 5 + 9 = 18).
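
   As a quick check on the counts above, the following Python sketch tallies
   the cells present in the F x B tabulation and the degrees of freedom they
   imply. The cell list is transcribed from the tabulation in this message;
   the variable names are illustrative and are not part of ASReml.

```python
# Cells present in the F x B tabulation, transcribed from this message.
present = {
    "F1": ["B1", "B2", "B3", "B4", "B5", "B6"],
    "F2": ["B1", "B2", "B3"],
    "F3": ["B1", "B2", "B3", "B4", "B5", "B6"],
    "F4": ["B4", "B5", "B6"],
}

b_levels = sorted({b for bs in present.values() for b in bs})
n_F, n_B = len(present), len(b_levels)

# Number of F x B cells that contain data, and the cells that are empty.
n_cells = sum(len(bs) for bs in present.values())
empty_cells = [(f, b) for f in present for b in b_levels if b not in present[f]]

# mu + F main effects + B main effects + F.B interaction must account for
# one parameter per cell with data, so the interaction DF is what is left.
df_mu, df_F, df_B = 1, n_F - 1, n_B - 1
df_FB = n_cells - (df_mu + df_F + df_B)

print("cells with data:", n_cells)
print("empty cells:", len(empty_cells))
print("DF:", df_mu, df_F, df_B, df_FB)
```

   The interaction DF obtained this way agrees with the NumDF reported for F.B
   in the Wald F table.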
   My first guess was that this might be a scaling effect, but multiplying
   by 10 did not solve the problem.
   My second was that it was associated with the non-significant (NS) D
   regressions. Dropping the F.B.D and B.D model terms resolved the problem.
   Looking at the ANOVA table again, we see that these terms were deficient
   in DF (B.D had 4 not 5, F.B.D had 6 not 9), so these singularities were
   sufficient to make some cells not estimable.
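
   The DF deficiency can be tallied in a few lines of Python. The fitted
   values are the NumDF entries from the Wald F table and the expected values
   are those quoted in the text; the names are illustrative only. The total
   shortfall happens to equal the number of unreported cells, which is at
   least consistent with the singularities being the cause, although this
   arithmetic alone does not prove it.

```python
# DF for the D regression terms: expected (as quoted in the text) versus
# fitted (NumDF column of the Wald F table).
expected = {"D": 1, "F.D": 3, "B.D": 5, "F.B.D": 9}
fitted = {"D": 1, "F.D": 3, "B.D": 4, "F.B.D": 6}

# Terms that lost DF to singularities, and by how much.
shortfall = {
    term: expected[term] - fitted[term]
    for term in expected
    if expected[term] != fitted[term]
}
total_shortfall = sum(shortfall.values())

# Cells that "predict F B" did not report, as listed in this message.
unreported = ["F1B6", "F3B6", "F4B5", "F4B6"]

print("deficient terms:", shortfall)
print("total shortfall:", total_shortfall)
print("unreported cells:", len(unreported))
```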
   Now concerning the F and B tables: given the 6 missing cells, there
   is no standard way to calculate the margins
   (except F1 and F3, which are complete).
   There are two possibilities in ASReml, but you must determine which, if
   either, is valid.
   I have added !FITMARGIN to PREDICT F B
   and this generates marginal means from the F B table assuming that
   interaction effects associated with missing cells are zero.
   I have added !PRESENT F B to the other two predict statements
   so that marginal means are calculated just from those cells in the
   row/column of F x B table which are present.
   Neither of these approaches is necessarily appropriate or reasonable.
   Given the large F effects, B means using the PRESENT strategy will be
   confounded with the F effects
   (at least comparisons between the B1 B2 B3 set and the B4 B5 B6 set).
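
   To see why margins from present cells can be confounded with F, here is a
   toy Python sketch. The effect sizes are made up for illustration (large F
   effects, no B effect, no interaction), and the two averaging rules are
   simplified stand-ins for the ideas described above, not ASReml's actual
   !PRESENT or !FITMARGIN computations.

```python
from statistics import mean

# Cells present in the F x B tabulation, transcribed from this message.
present = {
    "F1": ["B1", "B2", "B3", "B4", "B5", "B6"],
    "F2": ["B1", "B2", "B3"],
    "F3": ["B1", "B2", "B3", "B4", "B5", "B6"],
    "F4": ["B4", "B5", "B6"],
}
b_levels = ["B1", "B2", "B3", "B4", "B5", "B6"]

# HYPOTHETICAL effects, chosen only to illustrate the point:
# large F effects, no B effect at all, and no interaction.
f_effect = {"F1": 0.0, "F2": 10.0, "F3": 0.0, "F4": -10.0}
b_effect = {b: 0.0 for b in b_levels}


def cell_mean(f, b):
    """Additive cell mean under the assumed effects."""
    return f_effect[f] + b_effect[b]


# Rule 1: average each B column over only the F levels where data exist.
present_margin = {
    b: mean(cell_mean(f, b) for f in present if b in present[f])
    for b in b_levels
}

# Rule 2: average the additive fit over all F levels, i.e. treat the
# interaction in empty cells as zero.
fit_margin = {b: mean(cell_mean(f, b) for f in f_effect) for b in b_levels}

for b in b_levels:
    print(b, round(present_margin[b], 3), round(fit_margin[b], 3))
```

   Under these assumed effects the present-cell B means split into two groups
   (B1 to B3 versus B4 to B6) even though B has no effect, whereas averaging
   the additive fit over all four F levels gives equal B means.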
   I trust this helps.

May Jesus Christ continue to be gracious to you,

Arthur Gilmour, His servant .
Mixed model regression mapping for QTL detection in experimental crosses.
Computational Statistics and Data Analysis 51:3749-3764 at

Personal website: http://www.cargovale.com.au/

Skype: arthur.gilmour
mailto:Arthur.Gilmour_at_dpi.nsw.gov.au, arthur_at_cargovale.com.au
Principal Research Scientist (Biometrics)
NSW Department of Primary Industries
Orange Agricultural Institute, Forest Rd, ORANGE, 2800, AUSTRALIA
fax: 02 6391 3899; 02 6391 3922 Australia +61
telephone work: 02 6391 3815; home: 02 6364 3288; mobile: 0438 251 426

ASREML 2 is available from http://www.VSNi.co.uk/products/asreml
Register on the ASReml forum at http://www.vsni.co.uk/forum
ASReml-L Archives are at
Cookbook: http://uncronopio.org/ASReml

The ASReml3 alpha download registration page is:
There are asreml3alpha forums at: http://www.austatgen.org/asreml3forum

Proposed travel:
Coffs Harbour WS 14-17 October (Forestry)
PMADs Wollongong 21-24 Oct
Received on Sun Oct 24 2008 - 10:37:04 EST

This webpage is part of the ASReml-l discussion list archives 2004-2010. More information on ASReml can be found at the VSN website. This discussion list is now deprecated - please use the VSN forum for discussion on ASReml. (These online archives were generated using the hypermail package.)