Genetic Measurement Theory of Epistatic Effects

Author

  • Manfred D. Laubichler
Abstract

Epistasis is defined as the influence of the genotype at one locus on the effect of a mutation at another locus. As such it plays a crucial role in a variety of evolutionary phenomena such as speciation, population bottlenecks and the evolution of genetic architecture (i.e. the evolution of dominance, canalization and genetic correlations). In mathematical population genetics, however, epistasis is often represented as a mere noise term in an additive model of gene effects. In this paper it is argued that epistasis needs to be scaled in a way that is more directly related to the mechanisms of evolutionary change. A review of general measurement theory shows that the scaling of a quantitative concept has to reflect the empirical relationships among the objects. To apply these ideas to epistatic mutation effects it is proposed to scale AxA epistatic effects as the change in the magnitude of the additive effect of a mutation at one locus due to a mutation at a second locus. It is shown that the absolute change in the additive effect at locus A due to a substitution at B is always identical to the absolute change in B due to the substitution at the A locus. The absolute AxA epistatic effects of A on B and of B on A are identical, even if the relative effects can be different. The proposed scaling of AxA epistasis leads to particularly simple equations for the decomposition of genotypic variance. The Kacser-Burns model of metabolic flux is analyzed for the presence of epistatic effects on flux. It is shown that the non-linearity of the Kacser-Burns model is not sufficient to cause AxA epistasis among the genes coding for the enzymes. It is concluded that non-linearity of the genotype-phenotype map is not sufficient to cause epistasis. Finally it is shown that there exist correlations among the additive and epistatic effects among pairs of loci, caused by the inherent symmetries of Mendelian genetic systems.
For instance, it is shown that a mutation which has a larger than average additive effect will tend to decrease the additive effect of a second mutation, i.e. it will tend to have a negative (canalizing) interaction with a subsequent gene substitution. This is confirmed in a preliminary analysis of QTL data for adult body weight in mice.

Introduction

Classical population and quantitative genetic theory are largely theories of additive effects. This does not mean that linear effects are assumed to be the rule (they are not), but rather that the linear component of a genotype/character regression is of particular importance. The reason is that the response to natural and artificial selection depends on the level of heritability, which in turn is determined by the fraction of phenotypic variance that can be attributed to the statistically linear component of gene effects, that is, the additive genetic variance. Hence, breeding success and adaptation by natural selection both depend on these additive effects, and they are thus the center of much of mathematical evolutionary theory. Up to now epistatic effects have played only a secondary role in this context. They are mainly treated as a noise term, i.e. as any deviation from the additive model. There is very little mathematical theory that would highlight the importance of epistatic effects in evolutionary processes. A notable exception is the theory of local breeding values (Goodnight, 1995), which has been developed to predict the consequences of random drift on genetic architecture. This relative lack of epistasis theory is certainly not caused by a lack of interest in the subject, as witnessed by recent reviews (Moreno, 1994; Whitlock, et al., 1995). We suggest that the lack of progress in this area is in part due to the definition of epistasis as a noise term relative to the additive model, rather than as an explicitly defined mutational effect.
There are at least three areas in evolutionary biology in which epistasis plays more than just a secondary role. These are the genetics of speciation (Carson and Templeton, 1984; Coyne, 1992; Gavrilets and Gravner, 1997; Orr and Coyne, 1989; Templeton, 1982; Wagner, et al., 1994; Wu and Palopoli, 1994), the genetic consequences of population bottlenecks (Bryant, et al., 1986; Cheverud and Routman, 1996; Goodnight, 1987; Goodnight, 1988) and the evolution of genetic architecture or of development (Schmalhausen, 1986; Waddington, 1957; Wagner and Altenberg, 1996b). Species barriers are often associated with the development of genetic incompatibilities among the members of different species. They are always caused by epistatic interactions among mutations. Population bottlenecks cause the loss of genetic variation due to drift and inbreeding, but may lead to an increase of additive genetic variance of phenotypic characters. The role of epistatic effects in the evolution of genetic architecture and development is less explicitly acknowledged, but also quite obvious. Epistasis is the influence of one locus on the expression of genetic variation at another locus. Epistatic effects are thus by definition invoked in models of dominance modification (Mayo and Bürger, 1997) and in research on genetic canalization (Scharloo, 1962; Wagner, et al., 1997). Hence, epistasis modulates the effects of future mutations and may thus have a long-term effect on the patterns of natural variation (Wagner and Altenberg, 1996a). Other areas in which epistasis has turned out to be important are the unit of selection problem (Goodnight, et al., 1992; Laubichler, 1997) and the maintenance of genetic variation (Gavrilets and deJong, 1993).
The short overview in the last paragraph shows that epistasis is recognized as an important factor in the process of evolutionary change. Nevertheless, in mathematical models epistasis maintains a secondary role, and even the available methods for its modeling are relatively crude. It is the goal of this paper to suggest a scaling of epistatic effects that may facilitate research on the role of epistasis in evolution. The intent of the definition is not to change the meaning of epistasis but only to provide a different mathematical representation of the concept that is intended to lead to a better understanding of the role of epistasis in evolution.

For the approach in this paper it is important to distinguish clearly between the definition of a quantitative concept and the methods for its estimation. This distinction is not always clear-cut, however. In the pre-theoretical phase of conceptual development, a quantitative term is often tied to a specific method of estimation. For instance, temperature was originally defined via the expansion of a body as a consequence of heating, i.e. the definition was linked to one possible way to measure temperature. With a deeper understanding of the relationship between the structure of matter and temperature, the concept was redefined. It was then related to the kinetic energy of the particles in a physical system. Temperature is therefore no longer linked to a particular way of measuring it, but integrated into a larger theoretical context, in this case statistical thermodynamics. Similarly, epistasis was originally linked to the interaction terms in the analysis of variance, which is one way in which epistatic effects can be detected. As such, epistasis is not integrated into the conceptual core of mathematical population genetics and cannot easily play a positive role in its development.
In this paper we use ideas from measurement theory (Suppes and Zinnes, 1963) and from QTL-based genetic analysis (Cheverud and Routman, 1995) to propose a scaling of epistasis. It will be shown that this scaling leads to significant simplifications in the partitioning of genetic variance into additive and non-additive components, and to results in the analysis of gene effects that can be more easily interpreted, for instance in terms of their influence on canalization. Furthermore it will be shown that nonlinear metabolic models of the Kacser-Burns type predict no AxA epistasis in a linear metabolic pathway. Finally it is discussed how the symmetries of genetic systems lead to correlations among the additive and epistatic effects of pairs of genes.

Genetic Measurement Theory

Sound definitions of quantitative concepts are at the heart of every predictive scientific theory. Among all sciences physics has perhaps been most sophisticated in introducing quantitative concepts, and much of its success as a paradigm of science is based on the rigorous definitions of these concepts. To a lesser extent this is also true of quantitative genetics, where concepts like breeding values, heritability and additivity are at the center of both the empirical research and the mathematical theory. In fact any meaningful quantitative concept has this double role of setting the agenda for empirical investigations and defining the core of predictive theories. In this section the basic ideas of general measurement theory will be reviewed and applied to the definition of additive by additive epistatic effects. This review leads to a scaling of epistatic interaction which will be developed in the next section with reference to the basic two-locus two-allele model.

General Measurement Theory

The general logical structure of extensive quantitative concepts was worked out as early as the beginning of this century (Hölder, 1901).
Later in this century general measurement theory attracted a considerable amount of attention, mostly in connection with quantitative methods in psychology. The short summary of general measurement theory presented here is based on the reviews by Suppes and Zinnes (1963) and Luce and Krumhansl (1988). The basic idea of any measurement theory is that a quantitative scale is a map between some empirical objects and associated numerical values. The prototype of a scale is the mapping of physical bodies to a measure of their physical mass. This mapping, however, is not arbitrary but is supposed to meet some requirements. The most important of them is that the mapping also includes a map from empirical relations between the objects to algebraic relations between the numerical values. Again the simplest example is that of physical mass. In this case the quantitative measurements are constructed in such a way that, for instance, the operation of combining objects, $O_3$ = (object obtained by physically combining $O_1$ with $O_2$), corresponds to the addition of the mass of object $O_1$ and the mass of $O_2$ to obtain the mass of $O_3$:

$$m(O_3) = m(O_1) + m(O_2).$$

The physical operation of combining objects corresponds to the mathematical operation of summation. This is the most important aspect of defining a scale, since it implicitly defines the scientific meaning of the concept and determines how to use the measured values for predictions; for instance, predicting the mass of a filled container from the masses of the container and that of the cargo.

There are other, more technical aspects of general measurement theory that we will not review here. They concern the types of scales and the uniqueness of scales. Scales are for instance classified as fundamental or derived, depending on whether they are based on existing scales or not. For those interested in these aspects of measurement theory we recommend the excellent summary by Suppes and Zinnes (1963).
Genetic Measurement Theory of AxA Epistasis

The definitions of quantitative genetic concepts were largely introduced by Fisher in his seminal paper from 1918. If we look at the familiar definition of an average effect of an allele from the standpoint of general measurement theory, it turns out that the empirical relation system on which this measure is based is that of gene substitutions. The average effect of an allele roughly corresponds to the average deviation of the mean phenotypic value from the population mean if the allele at a particular locus is replaced by another allele (Falconer, 1981). For the moment the complications resulting from the differences between average effect and average excess (Templeton, 1987) will be disregarded. The exact nature of this scaling procedure in terms of general measurement theory is somewhat difficult to determine, since the effects are measured on the basis of the given scaling of the phenotypic character. It is thus in a sense a derived scaling, but on the other hand it also does not follow a strict definition of a derived scaling, since it introduces new empirical relations among objects, namely gene substitutions among genotypes. Furthermore the effects of the gene substitutions are expressed in terms of population or sample distributions, which makes the thus defined effects population dependent. However, this is a complication that need not be introduced at the outset, as shown recently by Cheverud and Routman (1995). Cheverud and Routman defined the effect of genes solely on the basis of the genotypic values of individual genotypes rather than in relation to a population average. This approach has been motivated by the availability of molecular markers, which makes it possible to assign genotypic values to individual marker genotypes.[1] The traditional population dependent values can be derived from the "physiological" effects and the genotype frequencies.
For the present purpose we follow the approach of Cheverud and Routman, looking at individual genotypes and their genotypic values. The starting point of a genetic measurement theory is the set of genotypes, and the empirical relations among them are the gene substitutions which turn one genotype into another. The scaling of gene effects is then based on the influence of such a substitution on the genotypic value of the genotype, leading to the definition of additive effects. This definition has a natural connection to the process of selection, which is also the replacement of one allele by another. However, in almost no case are all the genotypic values explained in terms of additive effects. So the question is how to measure deviations from additivity.

There are various ways in which one can deal with deviations from additivity. The most commonly employed models that measure epistatic effects are based on regression models or variance analysis. Not surprisingly then, the measures of epistasis are only of limited predictive and theoretical value, with the exception of the theory of local breeding values, which are particularly useful to predict the genetic consequences of random drift (Goodnight, 1995). In order to obtain a measure of epistasis which is based on empirical relations between genotypes rather than statistical relationships, the use of a straightforward extension of Fisher's approach is suggested, which is to consider the consequences of gene substitutions. In this case, however, we have to consider two successive gene substitutions to find the effect of gene interactions. This approach leads to the following preliminary definition of epistasis: AxA epistasis between locus A and locus B is the influence of a gene substitution at locus B on the additive effect of a subsequent substitution at locus A and vice versa. Of course this definition does not change the meaning of the concept, but only makes the relationship between the effects of gene substitutions and epistasis explicit.
This definition also does not capture all the deviations from linearity that are covered by the DxA, AxD and DxD interactions. These will be considered after a discussion of the AxA effects.

[1] Of course there is an implicit population dependence even in Cheverud's definition, since the genotypic values are the average over the genetic background. However, the definitions are independent of the gene frequencies of the focal loci.

Scaling AxA-Epistasis in the Two-Locus Two-Allele Model

Consider a two-locus two-allele model in which the influence of epistatic interaction on additive genetic effects is to be measured. The two loci are called A and B, with $A_1$ and $A_2$ being the two alleles at the A locus and analogously for the B locus. The genotypic value of the $A_iA_j/B_kB_h$ genotype is written as $G_{ijkh}$. We denote the genotype specific physiological effect (sensu Cheverud and Routman, 1995) of substituting $A_1$ by $A_2$ with $a_{A,kh}$, where the indices $k$ and $h$ denote the genotype at the B-locus. This value is defined as 1/2 of the difference in the genotypic values $G_{11kh}$ and $G_{22kh}$:

$$a_{A,kh} = \frac{G_{22kh} - G_{11kh}}{2}.$$

The genotypic value after gene substitution then is

$$G_{22kh} = G_{11kh} + 2a_{A,kh}.$$

Now let us consider how strongly the genetic background on the B-locus affects the additive effect of the gene substitution at the A-locus. For this purpose we declare the genotype $G_{1111}$ as the reference and compare the additive effect of $A_2$ in the $B_1B_1$ and $B_2B_2$ background, i.e. we compare $a_{A,11}$ with $a_{A,22}$. The relative change of these values is the epistatic influence of a gene substitution at the B-locus on the effect of a gene substitution at the A-locus,

$$e_{B \to A} = \frac{a_{A,22} - a_{A,11}}{2a_{A,11}},$$

which is a dimensionless number.
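The definitions above can be sketched numerically. The following minimal Python example assumes hypothetical genotypic values for the four double homozygotes; the numbers and the function names are illustrative and are not taken from the paper. It computes the genotype specific additive effects and the relative epistatic coefficients in both directions, and prints the absolute change of each additive effect, which under these assumptions is the same for both loci, as stated in the abstract.

```python
# Hypothetical genotypic values of the four double homozygotes,
# keyed by (A genotype, B genotype); the numbers are made up for illustration.
G = {
    ("11", "11"): 10.0,  # A1A1/B1B1, the reference genotype
    ("22", "11"): 14.0,  # A2A2/B1B1
    ("11", "22"): 12.0,  # A1A1/B2B2
    ("22", "22"): 19.0,  # A2A2/B2B2
}


def additive_effect_A(G, b_background):
    """Half the difference between A2A2 and A1A1 in a fixed B background."""
    return (G[("22", b_background)] - G[("11", b_background)]) / 2.0


def additive_effect_B(G, a_background):
    """Half the difference between B2B2 and B1B1 in a fixed A background."""
    return (G[(a_background, "22")] - G[(a_background, "11")]) / 2.0


def relative_epistasis(a_ref, a_new):
    """Change of an additive effect, scaled by twice the reference effect."""
    return (a_new - a_ref) / (2.0 * a_ref)


a_A11 = additive_effect_A(G, "11")  # effect of A in the B1B1 background
a_A22 = additive_effect_A(G, "22")  # effect of A in the B2B2 background
a_B11 = additive_effect_B(G, "11")  # effect of B in the A1A1 background
a_B22 = additive_effect_B(G, "22")  # effect of B in the A2A2 background

e_B_on_A = relative_epistasis(a_A11, a_A22)
e_A_on_B = relative_epistasis(a_B11, a_B22)

print("a_A,11 =", a_A11, " a_A,22 =", a_A22)
print("a_B,11 =", a_B11, " a_B,22 =", a_B22)
print("e_B->A =", e_B_on_A, " e_A->B =", e_A_on_B)
print("absolute changes:", a_A22 - a_A11, a_B22 - a_B11)
```

With these made-up values the two relative coefficients differ because the reference additive effects differ, while the two absolute changes coincide.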
The genotype specific effect of a substitution at the A-locus in the genetic background $B_2B_2$ can be written as

$$a_{A,22} = a_{A,11}(1 + 2e_{B \to A})$$

and the genotypic value of $A_2A_2/B_2B_2$ then is

$$G_{2222} = G_{1111} + 2a_{B,11} + 2a_{A,11}(1 + 2e_{B \to A}).$$

This equation can be read as accounting first for the gene substitution at the B-locus and then for the gene substitution at the A-locus, whose effect is now affected by the new genetic background at the B-locus:

$$G_{2222} = G_{1111} + 2a_{B,11} + 2a_{A,22}.$$

Substituting $a_{A,22}$ by $a_{A,11}(1 + 2e_{B \to A})$ then leads to the above equation. For symmetry reasons the genotypic value of $A_2A_2/B_2B_2$ can also be derived from the effect of a gene substitution at the A-locus first and then a substitution at the B-locus. Now we have to take into account the epistatic effects of the A-locus on the B-locus:

$$G_{2222} = G_{1111} + 2a_{A,11} + 2a_{B,11}(1 + 2e_{A \to B})$$

with

$$e_{A \to B} = \frac{a_{B,22} - a_{B,11}}{2a_{B,11}}.$$

Of course these two equations for $G_{2222}$ have to yield the same value, which implies a constraint on the epistatic coefficients:

$$a_{A,11}\,e_{B \to A} = a_{B,11}\,e_{A \to B} = E_{AB}.$$

It is easy to show that this equation is always fulfilled. $E_{AB}$ measures the absolute epistatic effect,

$$E_{AB} = \frac{a_{A,22} - a_{A,11}}{2} = \frac{a_{B,22} - a_{B,11}}{2},$$

which is identical for both loci. This means that AxA interaction is a strictly symmetrical relationship among loci; only the relative magnitudes of epistatic influence, $e_{A \to B}$ and $e_{B \to A}$, can be different because the additive effects can be different.

Dominance Effects

So far we have scaled additive by additive epistatic interactions. A complete description, however, has to account for the effects of dominance as well. Dominance effects include within-locus dominance effects and, in the case of epistasis, the potential epistatic influence of the second locus on the within-locus dominance effects of the first locus and vice versa. Again, a distinction between the physiological and the statistical interpretation of dominance has to be made.
Physiological dominance is defined as the deviation of the heterozygote phenotype value from the midway value of the two homozygote phenotype values at one locus (Cheverud and Routman, 1995). Statistical dominance deviations are deviations of single locus genotype values from a linear regression of the composing alleles on the genotypic values. As such, statistical dominance effects are dependent on the genotype frequencies in the population. The within-locus dominance value $d_{A,kl}$ at the A-locus can be defined as the deviation of the heterozygote genotype value from the additive combination of the physiological effects of the alleles and all additive-by-additive epistatic components. It describes the within-locus dominance effects at the A-locus in the genetic background $kl$ at the second locus and is given as

$$d_{A,kl} = G_{12,kl} - G^{(A \times A)}_{12,kl},$$

where $G^{(A \times A)}_{12,kl}$ is the genotypic value predicted from the additive and AxA epistatic effects. A similar equation holds for the within-locus genotype values at the B-locus. Next, the epistatic influence of the second locus on the dominance deviations of the first locus is defined as

$$ed_{B \to A} = \frac{d_{A,22} - d_{A,11}}{2d_{A,11}}.$$

$$\begin{array}{l|l|l|l}
 & A_1A_1 & A_1A_2 & A_2A_2 \\ \hline
B_1B_1 & G_{11,11} & G_{11,11} + a_{A,11} + d_{A,11} & G_{11,11} + 2a_{A,11} \\ \hline
B_1B_2 & G_{11,11} + a_{B,11} + d_{B,11} & G_{11,11} + a_{A,11} + a_{B,11} + E_{AB} + d_{A,11}(1 + ed_{B \to A}) + d_{B,11}(1 + ed_{A \to B}) + edd_{12,12} & G_{11,11} + 2a_{A,11} + a_{B,11} + 2E_{AB} + d_{B,11}(1 + 2ed_{A \to B}) \\ \hline
B_2B_2 & G_{11,11} + 2a_{B,11} & G_{11,11} + a_{A,11} + 2a_{B,11} + 2E_{AB} + d_{A,11}(1 + 2ed_{B \to A}) & G_{11,11} + 2a_{A,11} + 2a_{B,11} + 4E_{AB}
\end{array}$$

Table 1: Complete description of the genotype values in a two-locus two-allele system.
The genotype values of a two-locus two-allele system are expressed as a function of the reference genotype $G_{11,11}$, the genotype specific physiological effects of a gene substitution, $a_{A,11}$ and $a_{B,11}$, the AxA epistasis coefficient $E_{AB}$, the within-locus dominance values, $d_{A,kl}$, the values of the epistatic influence of one locus on the dominance deviations of the other locus, $ed$, and the "epistatically" induced deviation of dominance effects in the double heterozygote genotype, $edd_{12,12}$.

Similar equations hold for the other locus. The dominance deviation of a heterozygote genotype at the A-locus with a $B_2B_2$ homozygote genotype at the second locus is then given as

$$d_{A,22} = d_{A,11}(1 + 2ed_{B \to A}).$$

This form of epistasis describes the additive components of the epistatic influence of the second locus on the dominance effects of the first locus. It captures the additive by dominance form of epistasis. In the case of the double heterozygote genotype additional effects can exist. There can be a "dominance deviation of dominance effects", or second order dominance effect, as well as an epistatic interaction between dominance effects. Unfortunately the two-locus system is underdetermined for the double heterozygote genotype, and we have to describe all those effects as the "epistatically induced deviation of dominance effects" in the double heterozygote genotype. It is defined as

$$edd_{12,12} = G_{12,12} - G^{(A \times A)}_{12,12} - d_{A,11}(1 + ed_{B \to A}) - d_{B,11}(1 + ed_{A \to B}).$$

Taking the $G_{11,11}$ genotype as the starting point, all the genotype values in the two-locus two-allele system can then be described in terms of the values defined in Table 1.

Variance Decomposition

The matrix of genotype values presented in Table 1 is a complete description of the two-locus two-allele system. It includes all possible forms of epistasis and dominance based on the relationships of genotypic values. Using the values defined above, the total genetic variance ($V_G$) can be decomposed into its components.
Generally, three components are considered: the additive genetic variance, the variance of the dominance effects, and the epistatic or interaction variance. The contributions of epistatic effects to the additive genetic variance of a phenotypic character can best be illustrated with a special model. If we assume a two-locus two-allele model with epistasis but no within-locus dominance we get the following matrix of genotype values:

$$\begin{array}{l|ccc}
 & A_1A_1 & A_1A_2 & A_2A_2 \\ \hline
B_1B_1 & -b & 0 & b \\
B_1B_2 & 0 & b + E_{AB} & 2b + 2E_{AB} \\
B_2B_2 & b & 2b + 2E_{AB} & 3b + 4E_{AB}
\end{array}$$

The parameters needed to describe this system are

$$G_{11,11} = -b, \qquad a_{A,11} = a_{B,11} = b, \qquad E_{AB}.$$

If we allow for linkage disequilibrium ($Dis$) we can derive the equations for the total genetic variance ($V_G$) and the additive genetic variance ($V_A$) as

$$V_G = b^2\big(\ldots a_1 \ldots + Dis(\ldots a_2 \ldots)\big) + bE_{AB}\big(\ldots a_3 \ldots + Dis(\ldots a_4 \ldots)\big) + E_{AB}^2\big(\ldots a_5 \ldots + Dis(\ldots a_6 \ldots)\big)$$

and

$$\begin{aligned}
V_A ={}& b^2\big(\ldots a_1 \ldots + Dis(\ldots a_2 \ldots) + Dis^2(\ldots a_3 \ldots) + Dis^3(\ldots a_4 \ldots)\big) \\
&+ bE_{AB}\big(\ldots a_5 \ldots + Dis(\ldots a_6 \ldots) + Dis^2(\ldots a_7 \ldots) + Dis^3(\ldots a_8 \ldots)\big) \\
&+ E_{AB}^2\big(\ldots a_9 \ldots + Dis(\ldots a_{10} \ldots) + Dis^2(\ldots a_{11} \ldots) + Dis^3(\ldots a_{12} \ldots)\big),
\end{aligned}$$

where the $a_i$ terms stand as a shorthand for various combinations of gamete frequencies $p_i$ (see Laubichler, 1997 for details). The interaction variance in this model is then given by

$$V_I = E_{AB}^2\big(\ldots a_1 \ldots + Dis^2(\ldots a_2 \ldots)\big).$$

The above equations are very long and involved in their explicit form, but have a very symmetric inner structure. The formula for the additive genetic variance, for instance, can be decomposed into three terms: one proportional to the square of the additive physiological effect $b$, one proportional to $bE_{AB}$, and one proportional to $E_{AB}^2$.

Epistasis in Models of Metabolic Physiology

In this section a model of metabolic pathways is examined as an application of the epistasis coefficient defined above. This example also serves to illustrate the point that non-linear physiological relations per se do not imply epistasis.
A Fitness Function for Metabolic Pathways

A classic example for examining a metabolic pathway in its evolutionary context is Dean and co-workers' analysis of lactose catabolism in Escherichia coli (Dean, et al., 1986). This is a basic linear pathway involving the diffusion of lactose through the outer cell membrane followed by active transport through the periplasmic membrane by permease activity. The lactose is subsequently broken down to glucose and galactose by β-galactosidase. The flux rates giving rise to glucose and galactose can usually be correlated to bacterial growth rate by a linear relation. Thereby we can causally relate metabolic flux to relative growth rate. Relative growth rate is also the variable by which fitness is measured in bacterial population ecology. Consequently it is possible to place the metabolic pathway in its evolutionary context by relating the underlying physiological characteristics (in the form of enzyme activities and kinetic parameters) to a relevant phenotype (metabolic flux) and hence fitness. The relative fitness (i.e. relative growth rate) of a mutant strain can be expressed as

$$\omega' = \frac{YJ'}{YJ} = \frac{J'}{J} \qquad (1)$$

where $\omega'$ is the relative fitness of the mutant, $J'$ is the flux rate of the mutant, $J$ is the flux rate of the wild type strain and $Y$ is the linear yield coefficient for growth. Using the derivation of pathway flux developed by Kacser and Burns (1973) and independently by Heinrich and Rapoport (1974), the pathway flux can be expressed as

$$J = \frac{[L]}{\dfrac{1}{D} + \dfrac{K_{m(p)}}{V_{max(p)}} + \dfrac{K_{m(\beta)}}{V_{max(\beta)}K_{eq(p)}}} = \frac{[L]}{\Sigma} \qquad (2)$$

where $D$ is the diffusion constant across the outer membrane, $K_{eq(p)}$ is the equilibrium constant for the permease reaction, and $K_{m(i)}$ and $V_{max(i)}$ are the dissociation constants and maximum velocities respectively, with the subscripts $(p)$ or $(\beta)$ designating permease or β-galactosidase as the respective enzymes.
The symbol $\Sigma$ is an abbreviation for the summation of kinetic ratios in the denominator and, as a component of flux, can be considered a physiological phenotype. In effect equation (1) can be rewritten using (2) as

$$\omega' = \frac{[L]/\Sigma'}{[L]/\Sigma} = \frac{\Sigma}{\Sigma'} \qquad (3)$$

(Dykhuizen, et al., 1987). The kinetic parameters in this derivation are based on the Michaelis-Menten conception of enzymatic reactions represented by

$$E + S \underset{k_2}{\overset{k_1}{\rightleftharpoons}} ES \underset{k_4}{\overset{k_3}{\rightleftharpoons}} E + P$$

where $k_i$ is the linear rate coefficient for each reaction ($i = 1, \ldots, 4$), $E$ is the enzyme, $S$ is the substrate, $ES$ is the enzyme-substrate complex and $P$ the product. Consider a mutation affecting the maximal rate $V_{max}$ of an enzyme. Assuming that $k_4$ is negligible, the mutation is actually affecting the underlying kinetic parameter $k_3$, such that the mutant enzyme's maximal rate $V'_{max}$ is expressed by

$$V'_{max} = V_{max} + (\Delta k_3)[E], \qquad (4)$$

where $\Delta k_3$ is the change in $k_3$ caused by the mutation.[2] For a haploid organism eq. (4) can be substituted into (3) such that

$$\omega' = \frac{\Sigma}{\Sigma'} = \frac{\dfrac{1}{D} + \dfrac{K_{m(p)}}{V_{max(p)}} + \dfrac{K_{m(\beta)}}{V_{max(\beta)}K_{eq(p)}}}{\dfrac{1}{D} + \dfrac{K_{m(p)}}{V_{max(p)} + \Delta k_{3(p)}[E_{(p)}]} + \dfrac{K_{m(\beta)}}{\left(V_{max(\beta)} + \Delta k_{3(\beta)}[E_{(\beta)}]\right)K_{eq(p)}}}. \qquad (5)$$

This equation predicts a fitness surface as a function of $V_{max(p)}$ and $V_{max(\beta)}$.

Epistatic Effects of Mutations on Fitness

The effects of a substitution in a two-locus, two-allele haploid system can be redefined as

$$a_{AK} = G_{2K} - G_{1K} \quad \text{and} \quad a_{KB} = G_{K2} - G_{K1},$$

where the epistatic influence of locus B on locus A is defined as

$$E_{AB} = a_{A2} - a_{A1} = a_{B2} - a_{B1}.$$

If the value of $\Sigma$ is considered a physiological phenotype, it can be equated with the genotypic values delineated in the previous section such that $G_{xy} = \Sigma_{xy}$ and

$$\begin{cases}
x = 1 & \text{if permease is wild type} \\
x = 2 & \text{if permease is mutant} \\
y = 1 & \text{if } \beta\text{-galactosidase is wild type} \\
y = 2 & \text{if } \beta\text{-galactosidase is mutant.}
\end{cases}$$

Thereafter, by using the definition that $a_{A1} = G_{21} - G_{11}$, we have

$$a_{A1} = \Sigma_{21} - \Sigma_{11}. \qquad (6)$$

[2] Note that mutations affecting regulation of enzyme levels can be modeled as those changing $[E]$.
Written out with eq. (2) and eq. (4), this difference is

$$a_{A1} = \left(\frac{1}{D} + \frac{K_{m(p)}}{V_{max(p)} + \Delta k_{3(p)}[E_{(p)}]} + \frac{K_{m(\beta)}}{V_{max(\beta)}K_{eq(p)}}\right) - \left(\frac{1}{D} + \frac{K_{m(p)}}{V_{max(p)}} + \frac{K_{m(\beta)}}{V_{max(\beta)}K_{eq(p)}}\right)$$

$$\Rightarrow \quad a_{A1} = -\frac{K_{m(p)}}{V_{max(p)}} \cdot \frac{\Delta k_{3(p)}[E_{(p)}]}{V_{max(p)} + \Delta k_{3(p)}[E_{(p)}]}. \qquad (7)$$

Similarly,

$$a_{A2} = \Sigma_{22} - \Sigma_{12}, \qquad (8)$$

which with the appropriate substitutions and simplifications gives

$$a_{A2} = -\frac{K_{m(p)}}{V_{max(p)}} \cdot \frac{\Delta k_{3(p)}[E_{(p)}]}{V_{max(p)} + \Delta k_{3(p)}[E_{(p)}]}. \qquad (9)$$

From eq. (9) and (7) we see that

$$a_{A1} = a_{A2} \quad \Rightarrow \quad E_{AB} = a_{A2} - a_{A1} = 0. \qquad (10)$$

In other words the strength of epistatic interactions between locus A (i.e. $V_{max(p)}$) and B (i.e. $V_{max(\beta)}$) with respect to $\Sigma$ (the flux denominator) is zero. Furthermore, with appropriate substitutions and rearrangements, epistasis with respect to fitness can also be shown to be zero. This is an interesting result since it shows that a non-linear genotype-phenotype map per se does not necessarily imply epistasis.

In our example the explanation for zero epistasis can be found at two levels: first, as a technical property of the equation, and second, as a consequence of the assumptions that underlie this model of a metabolic pathway. The technical reason is apparent if the structure of $\Sigma$ in (2) is examined. Essentially there is no interaction term between the two enzymes, and consequently the effect of changing either of the two $V_{max}$ parameters is independent of the other. As long as temperature is constant, the condition of zero epistasis will hold, since according to thermodynamic principles $K_{eq(p)}$ will remain constant. For the more subtle representational explanation we have to refer back to the original derivation of the flux equations by Kacser and Burns (1973). Although the details of the derivation will not be treated here, an important assumption that went into these equations was that all enzymes in the pathway are far from saturation. It was also assumed that changes due to mutations were small enough that they would not lead to saturation.
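The zero-epistasis result for the flux denominator can be checked numerically. The sketch below assumes that the denominator has the additive form given in eq. (2); all parameter values and the function names `flux_denominator` and `sigma` are hypothetical and chosen only for illustration. It evaluates the denominator for the four haploid genotypes and compares the effect of the permease substitution in the two galactosidase backgrounds.

```python
import math


def flux_denominator(D, Km_p, Vmax_p, Km_b, Vmax_b, Keq_p):
    """Sum of kinetic ratios in the denominator of the flux equation (2)."""
    return 1.0 / D + Km_p / Vmax_p + Km_b / (Vmax_b * Keq_p)


# Hypothetical wild-type parameters (arbitrary units, not from the paper).
D, Keq_p = 2.0, 5.0
Km_p, Vmax_p = 0.4, 10.0
Km_b, Vmax_b = 1.5, 20.0

# Hypothetical mutational changes of Vmax, i.e. (delta k3)[E] in eq. (4).
dV_p, dV_b = 3.0, -6.0


def sigma(x, y):
    """Denominator for genotype xy: x = 2 mutant permease, y = 2 mutant galactosidase."""
    vp = Vmax_p + (dV_p if x == 2 else 0.0)
    vb = Vmax_b + (dV_b if y == 2 else 0.0)
    return flux_denominator(D, Km_p, vp, Km_b, vb, Keq_p)


a_A1 = sigma(2, 1) - sigma(1, 1)  # permease substitution, wild-type galactosidase
a_A2 = sigma(2, 2) - sigma(1, 2)  # permease substitution, mutant galactosidase
E_AB = a_A2 - a_A1                # zero up to floating-point rounding

# Closed form of eq. (7) for comparison.
a_A1_closed = -(Km_p / Vmax_p) * dV_p / (Vmax_p + dV_p)

print("a_A1 =", a_A1, " a_A2 =", a_A2, " E_AB =", E_AB)
print("matches eq. (7):", math.isclose(a_A1, a_A1_closed, rel_tol=1e-9))
```

Because the permease term and the galactosidase term enter the sum separately, the galactosidase contribution cancels in both differences, which is the technical reason given in the text.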
It is not too difficult to visualize the converse scenario in which at least one enzyme, such as β-galactosidase, would be at saturation and would thereby be rate limiting. Epistasis is a natural consequence in such a scenario.

Interpretation of Measured Epistatic Effects

One of the processes in which epistasis plays a crucial role is the evolution of canalization (Wagner and Altenberg, 1996; Wagner, et al., 1997). Genetic canalization is the insensitivity of a character to mutations. The evolution of canalization thus requires the selection of epistatic effects which decrease the additive effects. In this section we discuss how the epistasis coefficients can be interpreted in terms of the role epistasis plays in the evolution of canalization and point out some pitfalls in the interpretation of QTL data.

In the context of the evolution of canalization, one is interested in whether an epistatic effect increases (i.e. decanalizes) or decreases (i.e. canalizes) the effects at another locus. A simple way to determine whether a given epistatic interaction leads to canalization or decanalization is to transform the relative epistatic effect $e_{B \to A}$ into an epistasis factor: $f_{B \to A} = 1 + 2e_{B \to A}$. This factor directly relates to the change in the magnitude of the additive effects at the A-locus, $a_{A,22} = f_{B \to A}\,a_{A,11}$, and is easy to interpret. The locus B has a canalizing effect on A if and only if $f_{B \to A} \in (-1, 1)$, that is, if $|f_{B \to A}| \in [0, 1)$. There is no AxA epistasis if $f_{B \to A} = 1$, and the effect at the A-locus changes sign if $f_{B \to A} < 0$. But if $f_{B \to A} < -1$ then B changes the sign of the A-locus effect and also decanalizes, i.e. makes the A-locus effect bigger. Analogous coefficients $f_{A \to B}$ can be calculated for the epistatic effects of the A-locus on the B-locus. Note that in general $f_{B \to A} \neq f_{A \to B}$.

The interpretation of measured genetic effects is further complicated by the fact that the additive effects of genes and the epistatic effects among genes are not independent variables.
The main reason is that most (all?) genetic systems fulfill a fundamental symmetry: the gene substitutions are commutative, i.e. their order of execution is irrelevant to the genotypic value of the end product. Whether one substitutes an allele at locus A first and then at locus B or vice versa does not matter. All that counts is which genes are present in a genotype.[3] As a consequence additive and epistatic effects are to some degree correlated, even if there might be no physiological reason for them to "interact." In order to explore these complications we consider a simple statistical model and compare the results with real QTL data.

[3] Maternal effects are an interesting exception.

Let us consider two loci A and B with two alleles each, where the two alleles come from two lines that differ in a quantitative character, such that the alleles $A_1$ and $B_1$ come from the line with the lower value of the character and $A_2$ and $B_2$ from the other line. Hence the genotypic values will fulfill the relations $G_{1111} < G_{1122} = G_{2211} < G_{2222}$. Now let us consider an ensemble of genotypes in association with two mutations which fulfill these relationships for each genotype. This scenario mimics the situation in a typical QTL study, where one identifies genotype values based on recombinants from two lines which differ in a quantitative character, say adult body weight. Now we can ask how the additive and AxA epistatic effects are expected to be distributed.
From inspection of the definitions of the additive effects it is quite obvious that the four additive effects in the ensemble are correlated:

$$\langle \mathrm{Cov}(a_{A,11}, a_{B,11}) \rangle = \tfrac{1}{4} \mathrm{Var}(G_{1111}) > 0$$

$$\langle \mathrm{Cov}(a_{A,11}, a_{B,22}) \rangle = -\tfrac{1}{4} \mathrm{Var}(G_{2211}) < 0$$

$$\langle \mathrm{Cov}(a_{A,22}, a_{B,11}) \rangle = -\tfrac{1}{4} \mathrm{Var}(G_{1122}) < 0.$$

There are also correlations between some of the additive effects and the AxA epistatic effects, in particular

$$\langle \mathrm{Cov}(E_{B \to A}, a_{B,11}) \rangle = -\tfrac{1}{4} \left[ \mathrm{Var}(G_{1111}) + \mathrm{Var}(G_{1122}) \right].$$

Since the epistasis factors $f_{B \to A}$ are just a linear function of $E_{B \to A}$, a negative correlation has to be expected for them as well. In other words, the alleles which have larger additive effects in the original genotype have on average a canalizing effect on the second gene substitution, $f_{B \to A} < 1$. The situation is different if the two alleles have opposite effects, for instance if the A alleles have a positive effect and the B alleles have a negative effect on the character: $\langle G_{1111} \rangle < \langle G_{2211} \rangle$, $\langle G_{1111} \rangle > \langle G_{1122} \rangle$, $\langle G_{1111} \rangle = \langle G_{2222} \rangle$. Then genes with a large additive effect $a_{A,11}$ or $a_{B,11}$ have on average a de-canalizing effect on the second gene: $f_{B \to A} > 1$ and $f_{A \to B} > 1$.

In Figure 1A the epistasis factors are plotted against the additive effects for a random data set. The data consist of normally distributed random values for the four genotypes, where the mean values were $\langle G_{1111} \rangle = 1$, $\langle G_{1122} \rangle = \langle G_{2211} \rangle = 3$, $\langle G_{2222} \rangle = 5$, and the variance in all genotype value classes is 1. Note that there is a negative correlation between the epistasis factors and the additive effects (t = -0.335). The same pattern is found in the QTL data on adult body weight in mice (Cheverud et al., 1996) (Fig. 1B), and the pattern is maintained if the genotypic values are permuted within each genotype class (not shown). The pattern described above, however, is predicated on "weak" epistasis, i.e. cases where the within-genotype-class standard deviation is less than the differences between the mean values of the genotype classes.
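The covariance relations above can be checked with a small Monte Carlo sketch of the random ensemble just described (means 1, 3, 3, 5 and unit variance). The sketch rests on several assumptions that are not stated in this section: the four genotypic values are drawn independently, additive effects are half the homozygote difference, and the absolute epistatic effect is taken as $E_{B \to A} = a_{A,22} - a_{A,11}$, which is the scaling that reproduces the factor 1/4. All function names are hypothetical, and this is not the authors' simulation code.

```python
import random


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate_covariances(n=20000, seed=0):
    """Draw n independent genotype quadruples and return effect covariances."""
    rng = random.Random(seed)
    a_a11, a_a22, a_b11, a_b22, e_ba = [], [], [], [], []
    for _ in range(n):
        # Independent normal genotypic values, standard deviation 1 (assumption).
        g1111 = rng.gauss(1.0, 1.0)
        g2211 = rng.gauss(3.0, 1.0)
        g1122 = rng.gauss(3.0, 1.0)
        g2222 = rng.gauss(5.0, 1.0)
        # Additive effects: half the homozygote difference on each background.
        a_a11.append((g2211 - g1111) / 2)
        a_a22.append((g2222 - g1122) / 2)
        a_b11.append((g1122 - g1111) / 2)
        a_b22.append((g2222 - g2211) / 2)
        # Assumed scaling of the absolute epistatic effect of B on A.
        e_ba.append(a_a22[-1] - a_a11[-1])
    return {
        "aA11_aB11": sample_covariance(a_a11, a_b11),  # expected near +1/4
        "aA11_aB22": sample_covariance(a_a11, a_b22),  # expected near -1/4
        "aA22_aB11": sample_covariance(a_a22, a_b11),  # expected near -1/4
        "E_aB11": sample_covariance(e_ba, a_b11),      # expected near -1/2
    }


covariances = simulate_covariances()
```

With unit variances the predicted values are 1/4, -1/4, -1/4 and -1/4 (1 + 1) = -1/2. Covariances are used here rather than correlations of the epistasis factors themselves, because a ratio such as $a_{A,22} / a_{A,11}$ becomes very large whenever the denominator is close to zero, which makes moment-based summaries of the factors unstable in a sketch like this one.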
Another pattern in epistatic effects caused by the symmetries of the genotype space is a correlation between $f_{A \to B}$ and $f_{B \to A}$. The main reason is that if $f_{A \to B} = 1$, i.e. if there is no epistatic influence of A on B, then also $f_{B \to A} = 1$, even though in general $f_{A \to B} \neq f_{B \to A}$. Depending on the degree of epistasis in the data set, the epistasis factors remain similar in a neighborhood around $f = 1$, causing an overall positive correlation among them. This means that on average a gene that is canalizing on another gene is also being canalized by that respective gene. In Figure 2A this pattern is shown for the same random data set as in Figure 1A. Again, this pattern is also seen in the QTL data set, and it is maintained after permutation of the genotypic values (not shown).

It is important to stress that these patterns are not due to any physiological reasons, such as the laws of enzyme kinetics; rather, they are induced by the fundamental symmetry of the genotype space. These patterns are also not expected if one considers all mutations around a given reference genotype, where alleles will have both positive and negative effects on the character.

So the question, of course, is whether these patterns have any biological significance. One area where these patterns are potentially important is when selection acts to prevent certain alleles from entering the population. For instance, directional selection will tend to select mutations with the same directional effect in the population. Then one has to expect the statistical patterns described here (and perhaps others). For instance, a gene with a strong additive effect is likely to be selected by directional selection. Consequently, one has to expect that it will have a canalizing effect on a subsequent mutation, even if that mutation is also of large effect in the original genotype.
Under stabilizing selection one would expect genes with small opposing additive effects to segregate, while genes with large effects and sets of genes with effects in the same direction will not be maintained. Under these circumstances the mutations with small additive effects are expected to be mutually canalizing. These statistical features make epistatic models fundamentally different from models with only additive effects.

Discussion

In order to appreciate how the scaling of epistasis proposed here differs from previous approaches, one has to make some distinctions. First, we have to separate the verbal definition of a concept from its mathematical representation. Epistasis has always been defined as the influence of one locus on the effects of a gene substitution at the other locus. However, the ways in which epistasis has been introduced into specific models have usually been informed by methods of variance decomposition rather than by relevance to the mechanisms of evolutionary change. This leads us to the second distinction, the one between statistical and physiological genetic definitions of epistasis. The consequences of this distinction have been elaborated by Cheverud and Routman (1995).

Fisher (1918) introduced the concept of epistacy as a statistical feature, namely as a deviation from linearity in situations with more than one Mendelian factor, analogous to the deviation from linearity caused by dominance effects within each single Mendelian locus. Both cases account for deviations from strict additivity as defined by a least squares regression. Consequently, Fisher partitioned the phenotypic variance into an additive portion and two components resulting from the two forms of deviation from additivity, dominance and epistasis. This statistical approach to epistasis was extended by Cockerham (1954) and Kempthorne (1954).
They further partitioned the epistatic component of the total genetic variance into additive by additive, additive by dominance, dominance by additive, and dominance by dominance fractions. All these components of the total genetic variance account for deviations from linearity as defined by a least squares regression. Asoh and Mühlenbein (1995) have recently derived results similar to those of Kempthorne (1954) in the context of genetic algorithms. They derived the consequences of an orthogonal decomposition of the fitness function for the composition of the genetic variance and the estimation of heritability.

Crow and Kimura (1970) state that "any circumstance where a substitution at the A locus has a different effect depending on the genotype at the B locus is an example of epistasis." They also develop a mathematical representation of the different forms of epistasis: AxA, AxD, DxA, and DxD epistasis. But they do not give an interpretation of the mechanistic consequences of these forms of interaction.

Cheverud and Routman (1995) have proposed a mathematical representation of physiological epistasis. Their approach consists of defining epistasis values as the residuals of the un-weighted regression of genotypic values onto single-locus components. This approach is thus in spirit akin to the statistical approach, i.e. it considers epistasis as the deviation from the linear model, even if the values are based on un-weighted averages rather than population averages.

An obstacle to overcome in the theory of epistatic effects is that the additive and epistatic effects are not independent. This is due to the fundamental symmetry of gene substitutions, i.e. the fact that the order of gene substitutions does not matter. If the possible alleles are thought of as forming a continuum of possible effects, this problem can be overcome by using second derivatives as the measure of epistasis (Rice, submitted).
The discrete analogue of the second derivative is the so-called graph Laplacian, which has been used to measure the "ruggedness" of fitness landscapes in discrete configuration spaces (Stadler, 1996; Stadler and Wagner, 1997; Weinberger, 1991). It remains to be seen, however, how the information captured in these measures can be given a mechanistic meaning. So far these approaches are purely geometrical measures of the local curvature of the fitness or character value landscape.

By no means is the present treatment meant to be the only or the most useful representation for modeling the mechanistic consequences of epistasis. The usefulness of a mathematical representation of epistatic effects depends on the mathematical model in which it is applied. For instance, an alternative to the approach presented here is that of local breeding values (Goodnight, 1995), which is designed for applications in random drift theory. The present scaling of AxA interaction is inspired by our work on the evolution of canalization (Wagner et al., 1997) and genetic integration (Mezey and Wagner, in preparation). The novelty of the present approach lies in the goal of representing epistatic effects in such a way that the mechanistic consequences of epistasis are most obvious. For instance, AxA interaction is the influence of one gene substitution on the additive effect of a second gene substitution at another locus. The epistasis factor $f_{B \to A}$ directly shows how the substitution at the B locus changes the effect at the A-locus. If $f_{B \to A} = 1$ there is no AxA interaction and the effects at the A-locus remain unchanged. If $|f_{B \to A}| < 1$ the B locus decreases the effect at the A-locus, i.e. B has a canalizing effect on A. If $|f_{B \to A}| > 1$ the effect of the A-locus increases as a consequence of the substitution at the B locus, i.e. B de-canalizes the effects at the A-locus. As a consequence, the proposed formula for measuring AxA interaction is useful for estimating the potential for the evolution of canalization.
Similarly, AxD interactions play a role in the evolution of dominance (for a recent review see Mayo and Bürger, 1997). These facts are not obvious in a statistical decomposition of variance, but are immediately clear in a measurement theoretical approach using physiological gene effects.

Acknowledgment: The authors are grateful to Sean Rice for reading and discussing an earlier version of this manuscript, and to Jim Cheverud for sharing the QTL data with us as well as for many discussions on the subject of this paper. Discussions with Leo Buss, Ashley Carter, Robert Dorit, Junhyong Kim, Jason Mezey, Gavin Naylor, Christian Pazmandi, and Sean Rice on the subject of this paper are greatly appreciated. The financial support by NSF grant BIR-9400642 and the Yale Institute for Biospheric Studies is gratefully acknowledged. This is contribution # 48 of the Center for Computational Ecology.

References

Asoh, H. and H. Mühlenbein, 1995. Estimating the heritability by decomposing the genetic variance. In Parallel Problem Solving from Nature PPSN III, edited by Y. Davidor, H.-P. Schwefel and R. Männer. Springer Verlag, Berlin.

Bryant, E. H., S. A. McCommas and L. M. Combs, 1986. The effect of an experimental bottleneck upon genetic variation in the housefly. Genetics 114: 1191-1211.

Carson, H. L. and A. R. Templeton, 1984. Genetic revolutions in relation to speciation phenomena: The founding of new populations. Ann. Rev. Ecol. Syst. 15: 97-131.

Cheverud, J. and E. Routman, 1995. Epistasis and its contribution to genetic variance components. Genetics 139: 1455-1461.

Cheverud, J. M. and E. J. Routman, 1996. Epistasis as a source of increased additive genetic variance at population bottlenecks. Evolution 50: 1042-1051.

Cheverud, J., E. Routman, F. M. Duarte, B. v. Swinderen, K. Cothran and C. Perel, 1996. Quantitative trait loci for murine growth. Genetics 142: 1305-1319.

Cockerham, C. C., 1954. An extension of the concept of partitioning hereditary variance for analysis of covariance among relatives when epistasis is present. Genetics 39: 859-882.

Coyne, J. A., 1992. Genetics and speciation. Nature 355: 511-515.

Crow, J. F. and M. Kimura, 1970. An Introduction to Population Genetics Theory. Harper & Row, New York.

Dean, A. M., D. E. Dykhuizen and D. L. Hartl, 1986. Fitness as a function of β-galactosidase activity in Escherichia coli. Genet. Res., Camb. 48: 1-8.

Dykhuizen, D. E., A. M. Dean and D. L. Hartl, 1987. Metabolic flux and fitness. Genetics 115: 25-31.

Falconer, D. S., 198




Publication date: 1997