Hi,

I wish to use ASREML to analyse a large dataset of horse race results. I have previously tried to use DFREML for this dataset with unsatisfactory results: it seems unable to handle the volume of data and often fails to find results or simply crashes.

I should add that I have virtually no comprehension of advanced statistics and I am far from confident in my ability to use ASREML appropriately. I apologise for the possible naivety of the questions below, but I don't wish to misuse the product. Accordingly, if anyone is prepared to offer me advice privately rather than through this list, I would be most grateful.

Below are the approximate volumes of data I possess.

Total horses: 100,000
Horses with race results: 60,000 (the others are sires, dams or damsires)
Individual race results: 500,000 (there is a "result" for each horse in each race)

For each race result I have the following (conceptual) data structure:

Horse, Sire, Dam, Damsire, Age, Sex, Race Distance, Track Condition, Year, Score1, Score2, Score3, Score4, Score5

where Race Distance is in metres, Track Condition consists of five categories describing how wet or dry the track was, and Year indicates whether the result falls in the first or second year for which I have data. Scores 1 through 5 are assessments of the horse's performance using different techniques such as time, earnings etc.

It should be noted that my race results are all within one generation: no sires or dams have results. Some sires do, however, also appear as damsires.

Desired analyses:

1) Heritability estimates for each technique of determining scores
2) Correlation / regression analyses between the techniques of determining scores
3) Estimation of any maternal effect

Questions:

1) How do I construct a data structure (and appropriate .as file) for the input which supports a highly variable number of race results per horse (1 to about 70) and also allows the multiple regression analyses? Do these need to be performed separately?

2) Will ASREML be able to handle this volume of data for these calculations? I am currently using a midrange PC running Windows.

3) Any suggestions on how to model Race Distance? In previous simple analysis of variance calculations I used distance ranges, but I have been instructed not to do this if possible. I don't consider Distance to be a fixed effect, as different horses excel at different distances, although for assessments such as race time or speed it clearly has a close to fixed effect.

4) What is the correct .pin file for calculating heritability using Sire, Dam and Damsire simultaneously?

Any assistance most appreciated.

regards
Stuart Williamson
PhD student
University of Melbourne

--
Asreml mailinglist archive: http://www.chiswick.anprod.csiro.au/lists/asreml

