Dear Nick,
We have a similar analysis in the SPLines paper we are presenting in
South Africa. It may be useful to read it and look at the attached .as
file that produced the analysis.
On Thu, 29 Oct 1998, N.W. Galwey wrote:
> I would be grateful for feedback on the ASREML command file and output
> embedded below. The objective is to determine the relationship between
> field establishment (plants/m2) and grain yield (t/ha) in three varieties of
> lupin.
>
> I'd like to know whether the output indicates that the model I am fitting is
> appropriate. In particular:
>
> i. I put two starting values after the codeword AR with the intention of
> getting first- and second-lag autocorrelations. Is this the correct method?
yes I believe so.
>
> ii. If so, does the output indicate that the inclusion of second-lag
> autocorrelation gives a significantly better fit? I am assuming that it
> does because the Compnt/StndErr values for both AR terms are quite large
> (>3).
The most reliable way to do this is to drop the ar(2) parameter and
examine the change in REMLlog lik.
>
> iii. Why are the gamma and component values for these two terms so
> similar? Is this just coincidence?
I presume you mean the coeffs for the row and column terms. this is no
coincidence. They are the same. These are not components, but correlations.
> > iv. I don't want to fit any column effects. So is
the codeword IDEN
> okay? How should I decide whether the identify matrix should be scaled?
>
No scaling is required we are fitting a separable model to the row.column
terms. This is usually the error term as it uniquely defines the plots.
The model is
var(vec(e)) = \sigma^2 C_c x C_r
where the C_i's are the correlation matrices for cols and rows.
> v. Is it okay to fit a linear effect of the variate ESTAB (and its
> interaction with Variety), to account for any overall trend, before
> including the spline in the model?
This must be done, otherwise you dont have a natural cubic smoothing
spline (NCS). The spl() term only includes the nonlinear component. You
dont need c(variety) in the random part, as this is only relevant for fixed
effects.
>
> vi. Should the spline (and its interaction with Variety) be put in the
> random effects model? If so, will the variance explained by the spline be
> included in the error against which the linear trend is tested?
>
Definitely! Yes. But my comment above stills holds. Dropping linear terms
implies you no longer are fitting a NCS.
> vii. Am I right in thinking that it makes no difference to the fitted value
> of a data point whether a term (e.g. spl(ESTAB)) is fitted as a fixed or a
> random effect?
>
No. it must be a random effect. I make a few more points further down in
the analysis which are relevant.
> The desired outcome is a curve relating yield to establishment in each
> variety, and it is therefore necessary to obtain predicted values. I would
> like to avoid the complexities of setting up a .pin file, and the ASREML
> manual suggests adding some "missing data" and fitting them with "missing
> values" (p 86, version of 2 October 1998). However, it seems to me that
> this approach won't work for a spatial analysis, as each observation must be
> allocated to a row and column. I had hoped that the row and column values
> would not influence the fitted value (just as the variogram is not
> influenced by whether AR terms are fitted), but it turns out that they do
> have an effect. Any suggestions?
>
see further down
> __________________
>
> 96wh14.as
>
> Restricted branching lupin - density trials
> Variety 3 !A
> TDENS 6
> Rep 5
> PLOT 90
> COL 2
> ROW 101
> THA
> ESTAB
>
> C:\DOCS\Dracup_M\96WH14\96WH14.dat !skip 1 !maxit 20
>
> THA ~ mu mv c(Variety) ESTAB c(Variety).ESTAB,
> !r Rep spl(ESTAB) c(Variety).spl(ESTAB)
> 1 2
> ROW ROW AR .1 .1
> COL COL IDEN
> ______________________
> 96wh14.asr
>
> ASREML [30 Sep 1998] Restricted branching lupin - density trials
>
> 10/29/98 12:09:43.22 8.00 Mbyte c:\DOCS\Dracup_M\96WH14\96wh14.as
> QUALIFIERS: !skip 1
> Reading C:\DOCS\Dracup_M\96WH14\96WH14.dat FREE FORMAT skipping 1
> lines
> Univariate analysis of THA
> Using 108 records [of 108 read from 108 lines of
> C:\DOCS\Dracup_M\96W]
> Model term Size Type COL Minimum Mean Maximum #zero #miss
> 1 Variety 3 Factor 1 1 2.0000 3 0 18
> 2 TDENS 6 Factor 2 25 71.6667 125 0 18
> WARNING - More levels in factor than expected
> 3 Rep 5 Factor 3 1 3.0000 5 0 18
> 4 PLOT 90 Factor 4 1 45.5000 90 0 18
> 5 COL 2 Factor 5 1 1.5000 2 0 0
> 6 ROW 54 Factor 6 1 27.5000 54 0 0
> 7 THA 1 Variate 7 0.7530 1.014 1.428 0 20
> 8 ESTAB 1 Covariat 8 19.60 71.19 137.6 0 18
> 9 mu 1 Constant Term
> 10 mv_estimates 20 Missing value
> 11 c(Variety) 2 Factor 1 1 2.0000 3 0 18
> 12 c(Variety).E 2 Interaction 11 c(Vari: 2 8 ESTAB : 1
> 13 spl(ESTAB) 47 Spline 8 19.60 71.19 137.6 0 18
> 14 c(Variety).s 94 Interaction 11 c(Vari: 2 13 spl(ESTAB) : 47
> 54 AR=AutoR 0.10 0.10
> 2 identity
> Forming 173 equations: 27 dense
> Initial updates will be shrunk by factor 0.548
> LogL= 80.6926 S2= 0.11703E-01 82 df 0.10000 0.10000 0.10000
> 1.0000 0.10000 0.10000
> LogL= 104.813 S2= 0.11911E-01 82 df 0.16662 0.10000E-010.10000E-01
> 1.0000 0.21445 0.27837
> LogL= 116.546 S2= 0.13897E-01 82 df 0.30356 0.10000E-020.10000E-02
> 1.0000 0.30181 0.35295
> LogL= 121.136 S2= 0.15483E-01 82 df 0.40011 0.10000E-030.10000E-03
> 1.0000 0.35163 0.34838
> LogL= 122.124 S2= 0.14720E-01 82 df 0.44461 0.58089E-040.10000E-04
> 1.0000 0.35465 0.31151
> LogL= 122.722 S2= 0.18048E-01 82 df 0.42452 0.49216E-040.10000E-06
> 1.0000 0.34659 0.41749
> LogL= 122.719 S2= 0.20917E-01 82 df 0.36671 0.45774E-040.10000E-06
> 1.0000 0.40451 0.39840
> Final parameter values 0.56309 0.65929E-040.10000E-06
> 1.0000 0.37353 0.37798
>
> Source Model terms Gamma Component Compnt/StndErr
> Rep 5 5 0.563094 0.117781E-01 1.77 P
> spl(ESTAB) 47 47 0.659291E-04 0.137902E-05 0.75 P
> c(Variety).spl(ESTAB 94 94 0.100000E-06 0.209167E-08 1.86 B
> Variance 108 82 1.00000 0.209167E-01 1.86 P
> Residual AR=AutoR 54 0.373534 0.373534 3.20 U
> Residual AR=AutoR 54 0.377980 0.377980 3.20 U
################################################
POINTS TO NOTE
###############################################
1. I wouldnt normally include Rep as a fixed effect in a spatial
analysis! This would need some reasonable diagnostic to convince me that
it was necessary. The gamma seems large so I am suspicious that there is
something strange with the field layout. Were the replicates adjacent. Is
the field layout correct. The data must be exhibiting large trend!!
2. the gamma for variety.spl() is zero. This could mean that ESTAB may
need scaling or it is zero. Try rerunning the job with ESTAB /100. Gammas
for spl() terms are scale dependant, just like regression coeff's so be
careful.
3. There are too many knots in ESTAB (ie 47!). I presume this trial was
designed
as a density by variety trial, with target densities. I would try the
target densities as input for the spline. This is an interesting, as yet
unsolved problem. My pragmatic approach is to choose knot points for
which there is coverage of the design space for the 'x' but only enough
to be sensible. Marx and Eilers have done some work in this area but I
haven't fully investigated the issue. I certainly dont think ASREML has
it right yet. We (ie Robin and I) have some notes on how to fit models
with more data than knots but it is not in ASREML yet. This data set is
an exmaple of one where you should nominate knots and then use the
results. The current strategy for these data would be to use the target
density as a sensible 'x'.
4. Getting fitted splines!!!!!! Well this is tricky. I dont use pin files
(sorry Arthur) I prefer to reconstruct the spline myself. The output in
the .asr file is all you need, along with some code to generate the
spline design matrix (Z_s) which we have, if you would like it. You can
turn it into a GENSTAT proc if you like ie pick up the output from .asr
and use GENSTAT to get the fitted spline. The .vrb file can be used to
get prediction intervals.
The output below gives the values of Z_su_s for spl() and variety.spl().
You simply add these to the linear part ie for variety 1
fitspl = 157.88 - .0355 - .001966*ESTAB +.000582*ESTAB + spl() +
var1.spl()
mu MERRIT ESTAB c(var).ESTAB spl() v1.spl()
Ie there are bits from the usual linear part and then add the two spl()
parts.
4. I get nervous about ROW and COL as you have it. I always do it as I
have it in the attached file. The data is sorted into rows within cols
and then I cant be worried about whether ASREML is fitting the model I
want.
Hope this helps
cheers
brian
> WARN ING: Code B - fixed at a boundary (!GP)
> C - Constrained by user (!CON)
> S - Singular Information matrix
> S means there is no information in the data for this parameter.
> Very small components with Comp/SE ratios of zero sometimes indicate poor
> scaling. Consider rescaling the design matrix in such cases
> Fitted Spline 49 (X) for spl(ESTAB)
> 19.600 21.800 24.400 27.050 28.200 29.800 32.800 37.400
> 41.200 42.900 44.000 45.200 46.600 48.400 50.350 52.400
> 54.200 56.000 57.700 59.000 60.600 63.350 64.733 69.900
> 71.400 72.800 76.000 79.500 80.800 82.400 85.000 87.267
> 89.400 91.600 93.400 95.600 98.600 100.000 103.800 105.800
> 111.000 115.267 116.900 118.800 121.000 122.700 128.400 130.800
> 137.600
> Fitted Spline 49 (Y) for spl(ESTAB) 1
> 0.028825 0.025985 0.022633 0.019228 0.017760 0.015734 0.011984 0.006362
> 0.001891 -0.000024 -0.001228 -0.002504 -0.003938 -0.005686 -0.007444 -0.009122
> -0.010445 -0.011620 -0.012588 -0.013231 -0.013901 -0.014735 -0.015003 -0.015187
> -0.015038 -0.014829 -0.014125 -0.013057 -0.012588 -0.011960 -0.010830 -0.009747
> -0.008657 -0.007466 -0.006443 -0.005138 -0.003267 -0.002361 0.000183 0.001561
> 0.005231 0.008293 0.009465 0.010819 0.012369 0.013552 0.017448 0.019085
> 0.023756
> ', , 0.03
> '',, ,' 0.02
> ' ,,,'' 0.01
> ',, ,,' 0.00
> ''',, ,,,'' 0.00
> ''',,,,,,,,,,,'''' -0.01
> -0.02
> -0.03
> Fitted Spline 49 (Y) for c(Variety).spl(ESTAB 1
> -0.74E-06 -0.58E-06 -0.40E-06 -0.23E-06 -0.16E-06 -0.58E-07 0.14E-06 0.50E-06
> 0.89E-06 0.11E-05 0.13E-05 0.15E-05 0.17E-05 0.20E-05 0.24E-05 0.29E-05
> 0.32E-05 0.35E-05 0.38E-05 0.39E-05 0.39E-05 0.37E-05 0.35E-05 0.20E-05
> 0.14E-05 0.73E-06 -0.97E-06 -0.30E-05 -0.37E-05 -0.45E-05 -0.57E-05 -0.65E-05
> -0.70E-05 -0.74E-05 -0.76E-05 -0.76E-05 -0.72E-05 -0.69E-05 -0.59E-05 -0.51E-05
> -0.26E-05 -0.24E-06 0.73E-06 0.19E-05 0.32E-05 0.42E-05 0.76E-05 0.91E-05
> 0.13E-04
> ''''''''''''''''''''''''''''''''''''''''''''''''' 0.00
> 0.00
> 0.00
> -0.01
> -0.01
> -0.01
> -0.01
> -0.01
> Fitted Spline 49 (Y) for c(Variety).spl(ESTAB 2
> -0.61E-05 -0.52E-05 -0.41E-05 -0.31E-05 -0.26E-05 -0.20E-05 -0.93E-06 0.40E-06
> 0.11E-05 0.13E-05 0.14E-05 0.16E-05 0.17E-05 0.17E-05 0.18E-05 0.17E-05
> 0.16E-05 0.15E-05 0.13E-05 0.12E-05 0.94E-06 0.53E-06 0.33E-06 -0.14E-06
> -0.18E-06 -0.17E-06 0.13E-07 0.44E-06 0.65E-06 0.93E-06 0.14E-05 0.19E-05
> 0.22E-05 0.26E-05 0.28E-05 0.29E-05 0.30E-05 0.30E-05 0.27E-05 0.24E-05
> 0.13E-05 0.12E-06 -0.38E-06 -0.10E-05 -0.18E-05 -0.24E-05 -0.46E-05 -0.55E-05
> -0.82E-05
> ''''''''''''''''''''''''''''''''''''''''''''''''' 0.00
> 0.00
> 0.00
> -0.01
> -0.01
> -0.01
> -0.01
> -0.01
> Solution Standard Error T-value T-prev
> 12 c(Variety).ESTAB 2 3.46 3.46 [DF
> F_inc F_all]
> 1 0.582442E-03 0.385741E-03 1.51
> 2 -0.101474E-02 0.388952E-03 -2.61 -2.41
> 8 ESTAB 1 23.84 24.87 [DF
> F_inc F_all]
> 3 -0.196641E-02 0.394323E-03 -4.99
> 11 c(Variety) 2 0.37 2.08 0.5136E-01 [DF F_i
> F_a SED]
> MERRIT -0.305525E-01 0.302952E-01 -1.01
> 85SO46-37 0.607833E-01 0.298523E-01 2.04 1.73
> 10 mv_estimates 20 13.49 14.01 [DF
> F_inc F_all]
> 6 -0.882628 0.916597E-01 -9.63
> 7 -0.647879 0.941482E-01 -6.88 1.92
> 8 -1.07879 0.116429 -9.27 -2.83
> 9 -1.11726 0.122686 -9.11 -0.34
> 10 -1.10920 0.133636 -8.30 0.07
> 11 -1.12127 0.138845 -8.08 -0.10
> 12 -1.12293 0.143739 -7.81 -0.01
> 13 -1.12842 0.147031 -7.67 -0.05
> 14 -1.13130 0.149708 -7.56 -0.02
> 15 -1.13465 0.151712 -7.48 -0.03
> 16 -1.13715 0.153298 -7.42 -0.02
> 17 -1.13950 0.154532 -7.37 -0.02
> 18 -1.14145 0.155512 -7.34 -0.02
> 19 -1.14317 0.156290 -7.31 -0.01
> 20 -1.14464 0.156913 -7.29 -0.01
> 21 -1.14592 0.157416 -7.28 -0.01
> 22 -1.14703 0.157823 -7.27 -0.01
> 23 -1.14799 0.158154 -7.26 -0.01
> 24 -1.14882 0.158426 -7.25 -0.01
> 25 -1.14953 0.158650 -7.25 -0.01
> 9 mu 1 157.88 287.57 [DF
> F_inc F_all]
> 26 1.15412 0.680586E-01 16.96
> 3 Rep 5 effects fitted
> 13 spl(ESTAB) 47 effects fitted
> 14 c(Variety).spl(ESTAB 94 effects fitted
> Finished: 12:10:59.39 LogL Converged
>
> _____________________________________________________________________
> N.W. Galwey,
> Faculty of Agriculture,
> University of Western Australia,
> Nedlands, WA 6709, Australia.
>
> Tel.: +61 9 380 1959 (direct line)
> +61 9 380 2554 (switchboard)
> Fax: +61 9 380 1108
>
>
...............................................................................
Brian Cullis Tel: 02 6938 1855
NSW Agriculture Fax: 02 6938 1809
Wagga Agricultural Institute email: brian.cullis@agric.nsw.gov.au
Pine Gully Rd
WAGGA WAGGA NSW 2650
met cowl
num 1
row 27
column 6
variety 9 !A
seedrate 7 !I
lincol 1
linrow 1
rcol 6
rrow 27
fixrow 4
x 1
site 11
yield 1
splrow 1
trt 62 !A
cowlred.asd !skip 1 !dense 99
yield ~ mu mv c(site) x site.x,
site|2/lincol site|5/lincol,
site|6/lincol site|8/lincol,
site|9/lincol site|10/lincol,
site|3/linrow site|5/linrow,
site|7/linrow site|8/linrow,
site|2/c(fixrow) site|3/c(fixrow),
site|5/c(fixrow) site|7/c(fixrow),
site|8/c(fixrow) site|9/c(fixrow) site|11/c(fixrow),
!r spl(x) .00349 ,
variety .002 va.x .00046 va.spl(x) .000757,
si.va .018 si.va.x .00021,
site.spl(x) .000450 -si.va.spl(x) .00002,
-seedrate .001 -va.se .001 -si.se .001 -si.va.se .000113,
site|3/rrow .001 site|6/rrow .0017,
site|7/rrow .00046,
site|9/rrow .0045,
site|1/rcol .01 site|2/rcol .025 site|3/rcol .082,
-site|7/rcol .0031 site|9/rcol .0042 site|10/rcol .00043,
-rcol -rrow -rcol.rrow,
si|5/rc.rr .0042,
si|7/rc.rr .013,
si|8/rc.rr .013,
-si|10/rc.rr .0024
11 2 2
6 column AR .15 !S2=.034
16 row AR .21
6 column AR .16 !S2=.039
16 row AR .31
4 column AR .35 !S2=.067
16 row AR .17
6 column AR .04 !S2=.035
16 row AR .08
6 column AR .07 !S2=.018
16 row AR .58
6 column AR .22 !S2=.024
27 row AR .34
6 column AR .51 !S2=.041
27 row AR .92
6 column AR .33 !S2=.020
27 row AR .74
6 column AR -.02 !S2=.018
27 row AR .24
6 column AR .18 !S2=.0088
27 row AR .53
6 column AR .082 !S2=.054
27 row AR .21
variety 2
2 0 CORR .67 .0053 .000409
9 0 0
si.va 2
2 0 CORR .995 .018 .000178
99 0 0