# Re: why "Singularity in Average Information Matrix"? how to do?

From: <arthur.gilmour_at_DPI.NSW.GOV.AU>
Date: Mon, 27 Oct 2008 20:56:26 +1100

Dear Luan sheng

Singularity in AI matrix when fitting ANIMAL model
Question
I want to estimate the heritability of fish body weight but get the ASReml
error "Singularity in Average Information Matrix". The ASReml manual suggests
I need to modify the model, but the model is very simple. How can I get the
heritability of body weight? My ASReml job is

Data analysis for Flounder
animal !A !P
sire !A !P
dam !A !P
tank * !I
age
bl
wt
yaping.dat !ALPHA !SKIP 1 !MAKE
yaping.dat !SKIP 1 !MAXIT 500
wt ~ mu age !r animal

Note that !A is not required with !P since the fact that the fields are
alphanumeric is declared by the !ALPHA qualifier on the pedigree file
line.
This is a common problem which arises because of the nature of the animal
model.
What is happening?
Looking at the iteration summary we see

1 LogL=-3727.99 S2= 105.60 1298 df 0.1000 1.000
2 LogL=-3698.00 S2= 99.129 1298 df 0.1296 1.000
3 LogL=-3644.01 S2= 86.829 1298 df 0.2218 1.000
4 LogL=-3594.78 S2= 72.626 1298 df 0.4400 1.000
5 LogL=-3560.18 S2= 54.273 1298 df 1.062 1.000
6 LogL=-3545.76 S2= 35.421 1298 df 2.550 1.000
7 LogL=-3540.01 S2= 18.033 1298 df 6.809 1.000
8 LogL=-3537.88 S2= 5.9630 1298 df 24.47 1.000
9 LogL=-3537.25 S2= 0.84544 1298 df : 1 components restrained
10 LogL=-3537.16 S2= 0.53997E-01 1298 df : 1 components restrained
11 LogL=-3537.15 S2= 0.34172E-02 1298 df 0.4607E+05 1.000

Notice that the residual variance (S2) shrinks while the variance ratio
explodes. The job fails because the residual variance has become too small.
The singularity in the AI matrix did not appear in the first iteration, so
the problem is not structural (a common cause of this message) but data
dependent.
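
To make the pattern in the iteration summary concrete, here is a minimal Python sketch. It assumes the first parameter column is the ratio gamma = animal variance / residual variance, so that the animal component is gamma x S2 and the implied heritability is gamma / (1 + gamma). The numbers are copied from a few of the iterations listed; the variable names are only illustrative.

```python
# (iteration, S2 = residual variance, gamma = animal/residual variance ratio)
# Values copied from the iteration summary. Iterations 9 and 10 printed
# "1 components restrained" instead of a gamma, so they are left out.
iterations = [
    (1, 105.60, 0.1000),
    (5, 54.273, 1.062),
    (8, 5.9630, 24.47),
    (11, 0.34172e-02, 0.4607e05),
]

rows = []
for it, s2, gamma in iterations:
    animal = gamma * s2            # animal variance component (assumed gamma * S2)
    h2 = gamma / (1.0 + gamma)     # implied heritability under the animal model
    rows.append((it, s2, animal, h2))
    print(f"iter {it:2d}  residual={s2:10.4f}  animal={animal:8.2f}  h2={h2:.5f}")
```

Under these assumptions the implied heritability is pushed towards 1 as the residual variance goes to zero, which is the boundary the fit runs into.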
Why is it happening?
The summary of the structure of the pedigree (given in ASReml 3) is

1339 identities in the pedigree over 1 generations
Sires SiresofSire DamsofSire Dams SiresofDam DamsofDam
26 0 0 13 0 0

There is no pedigree on the parents, and it looks like there are 26
families. After defining sire and dam as

animal !P
sire !A
dam !A

the directive tabulate wt ~ sire dam confirms that there are 13 dams and
2 sires per dam.
The model wt ~ mu age !r sire dam converges to give

10 LogL=-3534.80 S2= 77.627 1298 df 0.6284 1.006 1.000

- - - Results from analysis of wt - - -

Approximate stratum variance decomposition
Stratum Degrees-Freedom Variance Component Coefficients
dam 11.17 11412.2 128.1 65.2 1.0
sire 12.88 3545.87 0.0 44.4 1.0
Residual Variance 1273.94 77.6266 0.0 0.0 1.0

Source     Model  terms  Gamma     Component  Comp/SE  % C
dam           13     13  0.628358  48.7773       1.19  0 P
sire          26     26  1.00558   78.0599       2.48  0 P
Variance    1300   1298  1.00000   77.6266      25.24  0 P

Wald F statistics
Source of Variation NumDF DenDF Fic Prob
8 mu 1 11.0 71.29 <.001
5 age 1 13.8 0.34 0.568

Fitting just sire gives

11 LogL=-3535.74 S2= 77.624 1298 df 1.586 1.000
Final parameter values 1.5864 1.0000

- - - Results from analysis of wt - - -

Source     Model  terms  Gamma     Component  Comp/SE  % C
sire          26     26  1.58636   123.139       3.42  0 P
Variance    1300   1298  1.00000   77.6240      25.24  0 P

which is almost the same LogL. In the sire + dam model, the actual sire
variance is 126.84 (48.78 + 78.06) and the covariance between families with
the same dam is 48.78. Assuming no covariance between families with the
same dam, the sire variance is 123.14.
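
As a rough check on "almost the same LogL", the sketch below recomputes the family variance from the two components and forms a likelihood-ratio statistic for dropping dam. The 50:50 mixture of chi-square(0) and chi-square(1) used for the p-value is a common convention for testing a single variance component on its boundary; it is my assumption here, not something stated in this message.

```python
import math

# Values copied from the two fits reported in this message.
logl_sire_dam = -3534.80   # wt ~ mu age !r sire dam
logl_sire = -3535.74       # wt ~ mu age !r sire
dam_component = 48.7773
sire_component = 78.0599

# Variance of a family effect and the correlation between the two
# families that share a dam, under the sire + dam model.
family_var = dam_component + sire_component
same_dam_corr = dam_component / family_var

# Likelihood-ratio statistic for the dam term (one extra parameter).
lrt = 2.0 * (logl_sire_dam - logl_sire)

# Upper tail of chi-square(1) is erfc(sqrt(x / 2)); halve it for the
# assumed 50:50 boundary mixture.
p_value = 0.5 * math.erfc(math.sqrt(lrt / 2.0))

print(f"family variance = {family_var:.2f}, same-dam correlation = {same_dam_corr:.3f}")
print(f"LRT = {lrt:.2f}, approximate p = {p_value:.3f}")
```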
The animal model is based on the genetic assumption that the sire variance
represents $0.25\,\sigma^2_A$ and the residual represents
$\sigma^2_E + 0.75\,\sigma^2_A$. This gives
$\sigma^2_A = 4 \times 123.14 = 492.56$, so
$0.75\,\sigma^2_A = 369.42$ and
$\sigma^2_E = 77.62 - 369.42 = -291.8$. The animal
model falls over because ASReml can't estimate a negative residual
variance directly.
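
The conversion from the sire model to the animal model can be written as a small Python sketch. The function name is hypothetical; the formulas are just the two assumptions stated in the paragraph (sire variance = one quarter of the additive variance, sire-model residual = environmental variance plus three quarters of the additive variance).

```python
def animal_model_from_sire_model(sire_var, residual_var):
    """Convert sire-model components to the animal-model components they imply.

    Assumes sire_var = 0.25 * sigma2_A and
    residual_var = sigma2_E + 0.75 * sigma2_A.
    """
    sigma2_a = 4.0 * sire_var
    sigma2_e = residual_var - 0.75 * sigma2_a
    return sigma2_a, sigma2_e


# Components from the sire-only fit reported in this message.
sigma2_a, sigma2_e = animal_model_from_sire_model(123.14, 77.62)
print(f"sigma2_A = {sigma2_a:.2f}, sigma2_E = {sigma2_e:.2f}")

# A negative environmental variance is outside the parameter space,
# which is why the animal model cannot converge for these data.
if sigma2_e < 0:
    print("implied residual variance is negative")
```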
The bottom line is that there appears to be family effects over and above
simple genetic effects. Maybe you need to replicate at the family level so
that you can partition the variance better.
What else is happening?
A plot of the residuals against fitted values shows a few (at least 5)
fish that are very large relative to their full sibs. Additionally, there
is a general fanning of the residuals (but age was not significant) so
that heavier families are more variable. This suggests a sqrt
transformation might be in order (after fixing outliers).

[Plot of Residuals [-24.7561, 63.3026] vs Fitted values [6.6769, 43.7587],
RE11: character plot from the ASReml output, layout not recoverable]

Work through the ASReml tutorial B2 and B3 on the sire and animal model
for further explanation.

May Jesus Christ continue to be gracious to you,

Arthur Gilmour, His servant .

Principal Research Scientist (Biometrics)
NSW Department of Primary Industries
Orange Agricultural Institute, Forest Rd, ORANGE, 2800, AUSTRALIA
