Re: editing outliers


Dear Kim

Best wishes for 1999
> 
> Hi all,
> 
> I am interested in using asreml's feature of identifying outliers to edit my
> data - rather than editing my data prior to analyses. So - I thought I
> might canvass people's ideas about what is the most appropriate strategy!
> 
> For example, when you are developing a model for analyses, can you use the
> number of outliers as an indication of whether your model is getting better
> or worse? (in the absence of R2 values and assuming the same data of course)

NO.  This would not work, because the criterion for detecting outliers
changes between models.  ASREML takes the average absolute residual and
calculates a criterion of 3 standard deviations based on that figure,
assuming independent normality.  Residuals, however, are not independent
and are often not normal.  Because changing the model changes the criterion,
the number of flagged outliers cannot be used to assess the fit of the model.
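
As a rough illustration of why the counts are not comparable, here is a
small Python sketch.  It assumes the criterion is 3 times a standard
deviation estimated from the mean absolute residual under normality
(sigma = mean|r| * sqrt(pi/2)); the exact ASREML formula is not spelled out
here, and the function names and residual values are hypothetical.

```python
import math

def outlier_threshold(residuals, k=3.0):
    """Cut-off of k standard deviations, with sigma estimated from the
    mean absolute residual assuming independent normal residuals."""
    mean_abs = sum(abs(r) for r in residuals) / len(residuals)
    sigma = mean_abs * math.sqrt(math.pi / 2.0)  # since E|r| = sigma*sqrt(2/pi)
    return k * sigma

def flag_outliers(residuals, k=3.0):
    """Indices of residuals whose absolute value exceeds the cut-off."""
    cut = outlier_threshold(residuals, k)
    return [i for i, r in enumerate(residuals) if abs(r) > cut]

# Hypothetical residuals for the same 10 records under two different models
model_a = [0.5, -0.4, 0.6, -0.5, 0.3, -0.6, 0.4, -0.3, 0.5, 4.0]
model_b = [0.1, -0.1, 0.2, -0.1, 0.1, -0.2, 0.1, -0.1, 0.1, 1.5]

for name, res in (("A", model_a), ("B", model_b)):
    print(name, round(outlier_threshold(res), 3), flag_outliers(res))
```

The cut-off moves with the residuals of each model, so a residual of a
given size can be flagged under one model and pass under another.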

> 
> If you know that your raw data values lie within a sensible distribution
> (assuming close to normal distribution), should you then remove outliers
> based on their residual solutions once you have the appropriate model
> established. (What came first - the best model or the identification of
> outliers?)

ASREML only identifies possible outliers.
Such points should not be deleted automatically.
They should be investigated, and deleted only if you conclude that the
data value is not a plausible value.

If you drop all the 'outliers' and repeat the analysis, you are likely
to get further 'outliers' identified, because the criterion is recalculated
from the remaining residuals.  Thus you need to establish that a value is
implausible before removing it.

In most cases that really matter, genuine outliers will be evident in the 
raw data as well as in the residuals.  Outliers are easier
to detect when there is good replication.  Data plots are often useful
in determining if a point is an outlier.


> 
> I know the usual approach is to edit your data before analyses based on raw
> values and perhaps within levels of fixed effects if things are getting
> hairy. However, this editing is usually done with no knowledge of animal
> (random) effects, and when you have unbalanced data it seems to me that
> using asreml to identify outliers (fitting both fixed and random effects
> simultaneously) may be a better option. Otherwise, I would use SAS
> facilities for the fixed effect model development, and asreml to include
> random effects.
>
I would just use ASREML but then I am a bit one-eyed.  SAS, Genstat and Splus
probably have tools like QQ plots to help investigate the normality
of the residuals.
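
If none of those packages is to hand, the points of a normal QQ plot are
easy to compute.  The sketch below is only an illustration in Python: the
function name and the (i - 0.5)/n plotting position are my own assumptions,
and the residuals would come from whatever the fitted model writes out.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Return (theoretical, observed) pairs for a normal QQ plot.

    Observed values are the sorted residuals standardised by their own
    mean and standard deviation; theoretical values are standard normal
    quantiles at the plotting positions (i - 0.5) / n.
    """
    n = len(residuals)
    std_normal = NormalDist()
    m, s = mean(residuals), stdev(residuals)
    pairs = []
    for i, r in enumerate(sorted(residuals), start=1):
        p = (i - 0.5) / n  # assumed plotting-position convention
        pairs.append((std_normal.inv_cdf(p), (r - m) / s))
    return pairs

# Hypothetical residuals; points far from the line y = x deserve a look
for theo, obs in qq_points([0.2, -0.5, 0.1, 3.9, -0.3, 0.4, -0.1, 0.0]):
    print(f"{theo:7.3f} {obs:7.3f}")
```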

From memory, Bev.Gogel@agric.nsw.gov.au  proposes an outlier test based
on fitting the 'outlier' as a random effect and testing the LogL change
from that component.  This has been developed in the context of correlated
data so is appropriate to animal data.
  
  
  Rdmout !=V0 !==123      would set a 'factor' coded '1'  for record '123'
  
  Including it in the model as a random effect would provide a test of
   whether the observation in record 123 was an outlier
    (had significantly extra variance).
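
  The LogL change can be turned into a test statistic by hand.  The Python
  sketch below is only an illustration: it assumes the two runs (without
  and with Rdmout as a random term) have the same fixed effects, and it
  assumes a 50:50 mixture of chi-square(0) and chi-square(1) as the
  reference distribution for a single variance tested on its boundary.
  The function name and the LogL values are hypothetical, not ASREML output.

```python
import math

def outlier_lrt(logl_without, logl_with):
    """Likelihood-ratio test for one extra variance component.

    Returns (statistic, p_value).  The p-value assumes a 50:50 mixture
    of chi-square(0) and chi-square(1) as the reference distribution.
    """
    stat = 2.0 * (logl_with - logl_without)
    if stat <= 0.0:
        return 0.0, 1.0
    # For 1 df: P(chi2 > x) = erfc(sqrt(x / 2)); halve it for the mixture
    p_value = 0.5 * math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical LogL values from runs without and with the Rdmout term
stat, p = outlier_lrt(-1234.56, -1230.12)
print(f"LRT = {stat:.2f}, p = {p:.4f}")
```

  A small p-value would suggest extra variance for that record; it still
  does not by itself establish that the value is implausible.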
    
> What do any of you think?
> 
> Thanks for any ideas.
> 
> Cheers
> 
> Kim
> Kim Bunter
> PhD Student
> Animal Genetics and Breeding Unit
> University of New England
> Armidale, NSW, 2351
> 
> Ph:  (02) 6773 3788
> Fax: (02) 6773 3266
> email: kbunter@metz.une.edu.au
> --
> Asreml mailinglist archive: http://www.chiswick.anprod.csiro.au/lists/asreml


<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Arthur Gilmour PhD                    email: Arthur.Gilmour@agric.nsw.gov.au
Senior Research Scientist (Biometrics)                 fax: <61> 2 6391 3899
NSW Agriculture                                             <61> 2 6391 3922
Orange Agricultural Institute               telephone work: <61> 2 6391 3815
Forest Rd, ORANGE, 2800, AUSTRALIA                    home: <61> 2 6362 0046

ASREML is still free by anonymous ftp from pub/aar on ftp.res.bbsrc.ac.uk
    or point your web browser at ftp://ftp.res.bbsrc.ac.uk/pub/aar/ 

To join the asreml discussion list, send the message  
     subscribe
to  asreml-request@chiswick.anprod.CSIRO.au

The address for messages to the list is asreml@chiswick.anprod.CSIRO.au

Asreml mailinglist archive: http://www.chiswick.anprod.csiro.au/lists/asreml

                        <> <> <> <> <> <> <>
"You (Father) have given Him (Your Son) authority over all flesh,
   that He should give eternal life to as many as You have given Him.
And this is eternal life,
   that they may know You, the only true God,
   and Jesus Christ whom You have sent."  John 17: 2-3.
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

