
- *To*: asreml@chiswick.anprod.csiro.au
- *Subject*: Re: non-normality
- *From*: southey@ux1.cso.uiuc.edu
- *Date*: Tue, 18 Jul 2000 10:42:43 -0500 (CDT)
- *In-Reply-To*: <vines.bmW7+pffQtA@vines2.wau.nl>
- *Sender*: asreml-owner@lamb.chiswick.anprod.csiro.au

Hi,

There are many issues involved here. Some of them pertain to the use of generalized linear mixed models in general. I hope the following clears certain aspects up, or at least points the discussion towards the main issues.

Haja - in many distributions (e.g. binomial and Poisson) the variance depends on the mean. That is why some people recommend variance-stabilizing transformations. However, I doubt that these are valid for GLMMs.

The potential lack of information in an animal model relative to a sire model is well known and not limited to non-normal distributions. Failure to get estimates from one method and not the other is not a problem of the model but of the algorithm and its implementation. The frequentist and most Bayesian approaches are approximations of varying degrees! Some of the MCMC problems relate to improper posteriors. See Hoeschele and Tier (1995 Genet Sel Evol 27:519) about MCMC 'blowing up'.

The basic definitions of the sire model, the sire-dam model and the animal model, assuming a purely additive model, mean that these are NOT linearly equivalent! This is the major error that occurs in Mayer's 1995 paper, where it was concluded that the so-called equivalent sire and animal models were in fact not equivalent. This should have been clear, since linearly equivalent models need the same first two moments. That is, you have to account for the 3/4 of the additive variance that the sire model leaves in the residual; otherwise it is overdispersed relative to the animal model. The 'residual' variance is the variance of some function - e.g. the logistic or probit functions in the binary case. The same 'residual' variance is used in both models, so you must get different answers unless there is no additive genetic variance.

In the sire-dam models, it was often observed that the estimated genetic variance differed between the sire and dam components.
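The moment-matching point above can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: a probit-type scale on which both models fix their 'residual' variance at 1, a purely additive model, and unrelated, non-inbred sires. The function names are hypothetical and this is not how ASREML parameterizes anything.

```python
def sire_variance_matching_animal(sigma2_a):
    """Sire variance, on a scale whose residual is fixed at 1, that matches
    an animal model with additive variance sigma2_a and residual fixed at 1.

    Assumption: sires carry 1/4 of the additive variance and the remaining
    3/4 sits in the sire-model residual together with the animal-model
    residual of 1.
    """
    sire = 0.25 * sigma2_a
    residual = 0.75 * sigma2_a + 1.0
    # Rescale so that the sire-model residual is again 1
    return sire / residual


def h2_animal(sigma2_a):
    """Heritability on the underlying scale from an animal model."""
    return sigma2_a / (sigma2_a + 1.0)


def h2_sire(sigma2_s):
    """Heritability on the underlying scale from a sire model."""
    return 4.0 * sigma2_s / (sigma2_s + 1.0)


sigma2_a = 1.0
matched = sire_variance_matching_animal(sigma2_a)
naive = 0.25 * sigma2_a  # ignores the 3/4 left in the residual

print("animal model h2:      ", h2_animal(sigma2_a))
print("sire model, matched:  ", h2_sire(matched))
print("sire model, unmatched:", h2_sire(naive))
```

With the rescaling, the sire model reproduces the animal-model heritability; plugging in one quarter of the additive variance without rescaling does not.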
Differences between the sire and dam variance estimates are often attributed to maternal effects, common environmental effects, dominance effects etc., which indicates that the initial assumptions do not hold. In addition, animal models use all the genetic links that are present, whereas the other models assume independence between all sires and dams.

> But first of all, it is not quite clear to me, if animal models applied to
> discrete data lead to proper estimates?

What do you mean by 'proper'? In the Bayesian sense the posteriors should be proper (in theory), implementation issues (such as the approximations made) aside. All frequentist methods rely on some approximation and usually involve maximizing a joint likelihood, following the approach of Henderson et al. (1959). Others end up with similar algorithms. Tempelman (1998) reviewed generalized linear mixed models in the Journal of Dairy Science and discussed some of the issues involved. McCulloch and Feng (1996 Technical report, Cornell University) showed that these methods have two undesirable properties: inconsistency and lack of invariance under equivalent specifications of the statistical problem. They also show why the joint maximization works for the normal distribution with identity link and hence when it will not work for other approaches. This would agree with Patrik's email.

With the options in ASREML, there is no excuse not to try potentially more correct assumptions. My experience is that it very much depends on the data, the algorithm and software, and the assumed model. Some of the failures reported have been due to incorrect use of the assumed distribution - particularly failure to correctly account for possible dispersion. It may also relate to inappropriate link functions. Just because the data are binary or counts, it does not mean that binomial with probit link or Poisson with log link, respectively, are the correct distributional assumptions.
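As a rough illustration of the dispersion point, here is a minimal sketch of a Pearson dispersion check for count data. It assumes the simplest possible case, a Poisson model with only an overall mean; with a real model the fitted means and the residual degrees of freedom would come from the fit. The function name is hypothetical.

```python
def pearson_dispersion(counts):
    """Pearson chi-square divided by its degrees of freedom for an
    intercept-only Poisson model.

    Values near 1 are consistent with Poisson variation; values well
    above 1 suggest overdispersion that the model should account for.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(counts) / n
    if mean <= 0:
        raise ValueError("mean count must be positive")
    # Under the Poisson assumption the variance equals the mean
    chi2 = sum((y - mean) ** 2 / mean for y in counts)
    return chi2 / (n - 1)


# Hypothetical counts, purely for illustration
print(pearson_dispersion([0, 4, 0, 4]))
```

A value far from 1 would be a reason to question the Poisson assumption or the link before blaming the software for a failed analysis.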
What clearly matters is the amount of information about the different parameters, rather than the size of the data set. Several authors, e.g. Hoeschele and Tempelman, have shown that if the data are binary or Poisson distributed, then methods based on these distributions are superior to assuming normality with identity link. I have also seen this with real data.

Bruce

--
Asreml mailinglist archive: http://www.chiswick.anprod.csiro.au/lists/asreml

**References**: **non-normality**, *From:* Bart.Ducro@alg.vf.wau.nl
