Re: Binary Data with missing values

From: Bruce Southey <bsouthey_at_GMAIL.COM>
Date: Wed, 8 Jul 2009 21:07:21 -0500

On Wed, Jul 8, 2009 at 5:16 AM, Clempson, Andrew
Martin<aclempson_at_rvc.ac.uk> wrote:
> Dear All
>
> I wonder if someone could provide advice on dealing with missing values in
> binary data models.

Avoid it! :-)

>
> I am in the process of performing association studies between SNPs and
> pregnancy status in cows. I have genotypic information on approximately 90%
> of animals and pregnancy status data on approximately 80% of animals (coded
> as 0 = pregnant, 1 = not pregnant).

First, some observations regarding missing records, without considering
the SNPs:
You should probably remove all animals with missing pregnancy status
from your data file, but not from the pedigree file. This just ensures
that the correct data are used.
If you have repeated records on the same animal, you must account for
that in the model, otherwise the genetic parameter estimates are biased.
If only a few animals have repeated records, you probably need to delete
the repeated records.
Since you are fitting an animal model, you need pedigree relationships
for all animals. If you have animals with no or limited genetic
relationships, you probably need to remove those.
Then fit your model without any SNPs to ensure that it runs and that
the results are sensible (including the solutions for the animal
effects). Beware that the extreme value problem may occur.
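
The data checks above can be sketched in a few lines of plain Python.
This is only an illustration under stated assumptions: the record
layout, the animal IDs and the helper names are all hypothetical, and
the status coding (0 = pregnant, 1 = not pregnant) is taken from the
original question. It is not ASReml syntax.

```python
from collections import Counter

# Hypothetical data file: None marks a missing pregnancy status
# (coding from the question: 0 = pregnant, 1 = not pregnant).
records = [
    {"animal": "A1", "status": 0},
    {"animal": "A2", "status": None},
    {"animal": "A3", "status": 1},
    {"animal": "A3", "status": 0},
]

# Hypothetical pedigree file (animal, sire, dam); it is never filtered.
pedigree = [
    ("A1", "S1", "D1"),
    ("A2", "S1", "D2"),
    ("A3", "S2", "D1"),
]


def drop_missing_status(rows):
    """Keep only the records with an observed pregnancy status."""
    return [r for r in rows if r["status"] is not None]


def animals_with_repeats(rows):
    """List animals that have more than one record."""
    counts = Counter(r["animal"] for r in rows)
    return sorted(a for a, n in counts.items() if n > 1)


def animals_missing_from_pedigree(rows, ped):
    """List animals in the data that have no pedigree entry."""
    known = {animal for animal, _sire, _dam in ped}
    return sorted({r["animal"] for r in rows} - known)


data = drop_missing_status(records)
repeats = animals_with_repeats(data)
unknown = animals_missing_from_pedigree(data, pedigree)
```

Here A2 is dropped from the data but stays in the pedigree, and A3 is
flagged as having repeated records, so a decision on how to handle it
is still needed before fitting the model.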

It is not a good idea to fit the actual SNPs; rather, you should do
association mapping (as has been addressed on this list). That way you
will not have missing data due to missing SNPs.

If you really, really must fit the actual SNPs, then remove any SNP
that is very rare, because of the extreme category problem. If an SNP
is missing, you can impute it by different methods, especially if
you know the location and the surrounding SNPs - there is a genome now!
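
As a rough sketch of those two steps, assuming genotypes are coded as
0/1/2 copies of one allele with None for a missing call: the SNP names,
the 0.1 frequency threshold and the helper names are all hypothetical.
The imputation shown is only the crudest stand-in (fill with the most
common observed genotype); it does not use location or neighbouring
SNPs, which a proper imputation method would.

```python
from collections import Counter

# Hypothetical genotypes: allele counts 0/1/2 per animal, None = missing.
genotypes = {
    "snp1": [0, 1, 2, 1, None, 0, 1, 2],
    "snp2": [0, 0, 0, 0, 0, 0, None, 0],  # no variation observed
    "snp3": [0, 0, 1, 0, 0, 0, 0, 0],     # very rare minor allele
}


def minor_allele_frequency(calls):
    """Frequency of the rarer allele among the observed calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def filter_rare(snps, min_maf):
    """Drop SNPs whose minor allele frequency is below min_maf."""
    return {
        name: calls
        for name, calls in snps.items()
        if minor_allele_frequency(calls) >= min_maf
    }


def impute_most_common(calls):
    """Naive placeholder: replace missing calls with the modal genotype."""
    observed = [g for g in calls if g is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if g is None else g for g in calls]


# The 0.1 threshold is an assumption for illustration only.
kept = filter_rare(genotypes, min_maf=0.1)
imputed = {name: impute_most_common(calls) for name, calls in kept.items()}
```

With these made-up values only snp1 survives the filter, and its single
missing call is filled with the heterozygote, the most common genotype
observed for that SNP.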

Bruce
Received on Thu Jul 08 2009 - 21:07:21 EST

This webpage is part of the ASReml-l discussion list archives 2004-2010.