Data Structure for QTL Analysis

A data set used in QTL analysis consists of a set of molecular marker scores and a set of phenotypic trait values for each individual or line of the mapping population. Typically, a QTL data set includes 100 or more markers and five or more traits. A partial view of a data set for the Oregon Wolfe Barley population is shown in Table 1.

 

        Marker Scores                        Trait Values
Line    ABG704   Wx   MWG089   CDO475    plant height   days to heading
1       A        A    A        A         73.34          39.3
2       B        B    A        A         40.01          23.7
4       A        A    A        A         59.37          41.0
5       B        A    .        B         81.92          46.3
6       B        B    A        A         37.47          28.0
7       B        B    B        B         71.12          43.0
8       A        A    B        B         39.69          30.4
9       A        B    B        B         78.42          60.8
10      A        A    A        A         48.90          32.0

Table 1. A subset of marker scores and phenotypic traits for 10 lines of the Oregon Wolfe Barley population, http://barleyworld.org/oregonwolfe. A and B scores refer to the parental alleles at a marker locus, and a period indicates missing data.
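The layout of Table 1 can be sketched in code as one record per line, holding a set of marker scores and a set of trait values. The example below is a minimal illustration in Python using only the standard library; the names MARKERS, TRAITS, RAW_ROWS, parse_rows, and trait_by_genotype are hypothetical and are not part of any QTL package. It assumes a whitespace-separated text layout and stores the period used for missing data as None.

```python
from statistics import mean

# Column names taken from Table 1.
MARKERS = ["ABG704", "Wx", "MWG089", "CDO475"]
TRAITS = ["plant height", "days to heading"]

# Rows copied from Table 1: line, four marker scores, two trait values.
# A period marks a missing marker score.
RAW_ROWS = """\
1 A A A A 73.34 39.3
2 B B A A 40.01 23.7
4 A A A A 59.37 41.0
5 B A . B 81.92 46.3
6 B B A A 37.47 28.0
7 B B B B 71.12 43.0
8 A A B B 39.69 30.4
9 A B B B 78.42 60.8
10 A A A A 48.90 32.0
"""


def parse_rows(text):
    """Turn the text rows into one record (dict) per line of the population."""
    records = []
    for row in text.splitlines():
        fields = row.split()
        scores = fields[1:1 + len(MARKERS)]
        values = fields[1 + len(MARKERS):]
        records.append({
            "line": int(fields[0]),
            # Missing scores ('.') become None so they are easy to skip later.
            "markers": {m: (None if s == "." else s)
                        for m, s in zip(MARKERS, scores)},
            "traits": {t: float(v) for t, v in zip(TRAITS, values)},
        })
    return records


def trait_by_genotype(records, marker, trait):
    """Group trait values by the parental allele (A or B) at one marker."""
    groups = {}
    for rec in records:
        allele = rec["markers"][marker]
        if allele is None:  # skip lines with a missing score at this marker
            continue
        groups.setdefault(allele, []).append(rec["traits"][trait])
    return groups


records = parse_rows(RAW_ROWS)
groups = trait_by_genotype(records, "MWG089", "plant height")
for allele in sorted(groups):
    print(allele, len(groups[allele]), round(mean(groups[allele]), 2))
```

Grouping trait values by marker genotype, as in the last few lines, is the basic operation that marker-trait association tests build on. Line 5 is left out of the MWG089 groups because its score at that marker is missing.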

In addition to the marker and trait data, a marker linkage map is usually developed for each mapping population. A linkage map is required for simple and composite interval mapping; it is very useful, but not essential, for single-factor analysis of variance. An example of a linkage map is shown in Fig. 7.

Fig. 7. An example of a marker linkage map in wheat (after Gupta et al., 2002).
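As noted above, single-factor analysis of variance needs only the marker scores and trait values, not the linkage map. The sketch below illustrates this with a one-way F statistic for a single marker (Wx) and a single trait (plant height), using the values from Table 1. The names OBSERVATIONS and one_way_f are hypothetical, and the calculation is the generic textbook one-way analysis of variance rather than the procedure of any particular QTL program. With only nine lines the result carries no statistical weight; it only shows which data enter the test.

```python
from statistics import mean

# (Wx score, plant height) pairs copied from Table 1, one pair per line.
OBSERVATIONS = [
    ("A", 73.34), ("B", 40.01), ("A", 59.37),
    ("A", 81.92), ("B", 37.47), ("B", 71.12),
    ("A", 39.69), ("B", 78.42), ("A", 48.90),
]


def one_way_f(observations):
    """Return (F, df_between, df_within) for trait values grouped by genotype."""
    groups = {}
    for genotype, value in observations:
        if genotype is None:  # a missing marker score is excluded from the test
            continue
        groups.setdefault(genotype, []).append(value)

    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)

    # Variation between genotype classes and within genotype classes.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2
                    for g in groups.values() for v in g)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_f(OBSERVATIONS)
print(f"F = {f_stat:.3f} on {df_between} and {df_within} degrees of freedom")
```

The test treats each marker independently, so marker order and map distances never enter the calculation. Interval mapping methods, in contrast, use the positions of flanking markers and therefore require the map.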