
- *To*: Kim Bunter <kbunter@metz.une.edu.au>
- *Subject*: Re: DF when there are missing factor levels
- *From*: <southey@ux1.cso.uiuc.edu>
- *Date*: Wed, 30 May 2001 10:10:42 -0500 (CDT)
- *cc*: Arthur Gilmour <gilmoua@agric.nsw.gov.au>, <asreml@chiswick.anprod.csiro.au>
- *In-Reply-To*: <3.0.6.32.20010530111232.00882af0@metz.une.edu.au>
- *Sender*: asreml-owner@lamb.arm.li.csiro.au

Hi,

> The problem: only a subsection of the full data has a particular effect
> recorded. We wish to estimate this effect without subsetting the data,
> given that other effects of significance are estimated from the complete
> data, and these estimates behave poorly in the subset. So, we wish to
> analyse a data set where factor levels are missing for all animals without
> this effect.
>
> Previously, Arthur indicated that missing factor levels are treated as a
> zero level. In our case we have four factor levels (+ the one that is
> missing). The DF reported are 4 instead of three, and all levels have a
> non-zero solution reported. No equations are generated for level zero,
> which probably explains why there is no zero solution for level zero.
>
> How are these results to be interpreted? Are the solutions meaningful, and
> what exactly do they represent a deviation from for the effect of interest?
> I can't really see that we should have four degrees of freedom for
> starters, or that you can generate a legitimate DF by having missing
> records. There is no equation fitting a dummy effect for all animals where
> the level is missing (unless this became one of the singularities
> reported?). We get identical solutions when we specify the effect is only
> to be fit for animals with records (eg using at(yesrecord,1).effect).

If the level is really missing, then this is just an analysis of a data set with missing cells (Type IV sums of squares in SAS parlance; I don't remember how Genstat does it). In that case there should be no reason to consider only the subset. You just have to make sure that you are dealing with estimable functions, especially with interactions.

From your description, you don't have a missing cell when you consider this term as a main effect (or it is not correctly coded in ASREML). All five levels, counting the zero level assigned to animals without a record, have some observations present.
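The 4-versus-3 DF count can be illustrated with a small rank calculation. This is only a sketch under stated assumptions: the data are made up, the `design` helper is hypothetical, and it mimics the described coding (an intercept, one indicator column per recorded level, and no column for the zero level) rather than ASREML's actual internals.

```python
import numpy as np

def design(levels, n_levels):
    """Intercept plus one indicator column per recorded level (1..n_levels).

    Level 0 (effect not recorded) gets no column, mirroring the statement
    that no equation is generated for the zero level. Hypothetical helper.
    """
    levels = np.asarray(levels)
    X = np.zeros((levels.size, n_levels + 1))
    X[:, 0] = 1.0                      # intercept
    for k in range(1, n_levels + 1):
        X[:, k] = (levels == k)        # indicator for level k
    return X

# Made-up data: 0 = effect not recorded, 1..4 = recorded levels.
full = [0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
subset = [lv for lv in full if lv != 0]

# DF for the factor = rank of [intercept, factor] minus 1 for the intercept.
df_full = np.linalg.matrix_rank(design(full, 4)) - 1
df_subset = np.linalg.matrix_rank(design(subset, 4)) - 1

print("DF with unrecorded animals as level zero:", df_full)
print("DF in the recorded subset only:", df_subset)
```

In the full data the four indicator columns no longer sum to the intercept column, so nothing is aliased and all four levels get a solution. Under this coding, those solutions would be read as deviations from the group of animals without a record.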
Consider the following structure, with the number of observations in each cell:

| Factor1 | Factor2 = A | Factor2 = B | Total1 |
|---------|-------------|-------------|--------|
| L       | 10          | 20          | 30     |
| M       | 10          | 20          | 30     |
| N       | 0           | 30          | 30     |
| Total2  | 20          | 70          |        |

In a main effects model there is no problem, since the missing combination doesn't appear in the calculation of the sums of squares. However, if you fitted the interaction, you would need to find the estimable contrasts. The interaction will have only 1 df, and generally you can find alternative ways to calculate the sums of squares associated with the main effects. The interpretation remains the same once you allow for the fact that you cannot say anything about how level N of Factor1 interacts with Factor2.

If your library has it, see Milliken and Johnson (1992), 'Analysis of Messy Data Volume 1: Designed experiments'. Note that they use the GLM procedure of SAS throughout, so it is a little dated, but the general concepts remain valid.

Regards
Bruce

--
Asreml mailinglist archive: http://www.chiswick.anprod.csiro.au/lists/asreml
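As a rough check of the interaction df quoted for the Factor1 by Factor2 table above, the ranks of the main-effects and cell-means design matrices can be compared. This is a sketch, not how SAS or ASREML compute it: the variable names are made up, and one row per filled cell is used because replicate observations do not change the rank.

```python
import numpy as np

# Cell counts from the table; (N, A) is the empty cell.
counts = {("L", "A"): 10, ("L", "B"): 20,
          ("M", "A"): 10, ("M", "B"): 20,
          ("N", "A"): 0,  ("N", "B"): 30}
f1_levels = ["L", "M", "N"]
f2_levels = ["A", "B"]

# Keep only the cells that actually contain observations.
filled = [cell for cell, n in counts.items() if n > 0]

def indicator(value, levels):
    """0/1 dummy coding of `value` against a list of levels."""
    return [1.0 if value == lv else 0.0 for lv in levels]

# Main-effects design: intercept + Factor1 dummies + Factor2 dummies.
main = np.array([[1.0] + indicator(f1, f1_levels) + indicator(f2, f2_levels)
                 for f1, f2 in filled])

# Adding one dummy per filled cell gives the main effects + interaction model.
cells = np.array([indicator(cell, filled) for cell in filled])
with_interaction = np.hstack([main, cells])

rank_main = np.linalg.matrix_rank(main)
rank_full = np.linalg.matrix_rank(with_interaction)
interaction_df = rank_full - rank_main

print("filled cells:", len(filled))
print("main-effects rank (incl. intercept):", rank_main)
print("interaction df:", interaction_df)
```

With all six cells filled the same calculation would give 2 df for the interaction; the empty cell removes one of them.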

**References**:

- **DF when there are missing factor levels**, *From:* Kim Bunter <kbunter@metz.une.edu.au>
