# Additive genetic variance with \$n\$ alleles

The genetic variance of a quantitative trait (here, fitness) can be expressed as the sum of two components, the dominance variance and the additive variance:

\$\$\sigma_D^2 + \sigma_A^2 = \sigma^2\$\$

where \$\sigma^2\$ is the genetic variance, \$\sigma_D^2\$ is the dominance variance and \$\sigma_A^2\$ is the additive variance. \$\sigma_D^2\$ and \$\sigma_A^2\$ are given by

\$\$\sigma_D^2 = x^2(1-x)^2(2W_{12} - W_{11} - W_{22})^2\$\$

\$\$\sigma_A^2 = 2x(1-x)\left(xW_{11}+(1-2x)W_{12} - (1-x)W_{22}\right)^2\$\$

where \$W_{11}\$, \$W_{12}\$ and \$W_{22}\$ are the fitnesses of the three possible genotypes, and \$x\$ and \$1-x\$ are the allele frequencies.

Question

The above definition makes sense for one bi-allelic locus.

• How are \$\sigma_D^2\$, \$\sigma_A^2\$ and \$\sigma^2\$ defined for a locus that has \$n\$ alleles?

Here is a related question

Well, the total genetic variance is, by the definition of variance, \$\$ \sigma^2 = \sum_{i,j} f_i f_j (w_{ij}-\bar{w})^2 \$\$ (using \$f_i\$ and \$w_{ij}\$ for frequency and fitness, respectively), where \$\$\bar{w} = \sum_{i,j} f_i f_j w_{ij}\$\$ is the average fitness.

You can calculate the additive genetic variance for different loci by simply assuming that there is no dominance effect, i.e. the alleles are independent. If it helps, think of it as a quantitative trait in a haploid organism. Thus,

\$\$ \sigma^2_A = \sum_{i,j} f_i f_j (w_i w_j-\bar{w}')^2 = \sum_i f_i(w_i-\bar{w}')^2, \$\$ with \$\$ f_i=\sum_j f_{ij}; \quad w_i=\sum_j f_{ij}w_{ij}; \quad \bar{w}' = \sum_i f_i w_i. \$\$
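As a cross-check on the question's two-allele formulas, here is a minimal sketch of one standard way to split the variance at a locus with \$n\$ alleles. It is the textbook least-squares decomposition for a diploid locus under random mating, not necessarily the exact expression given in the answer above. The function name `variance_components` and the symbols `alpha` (average excess of each allele) and `delta` (dominance deviation of each genotype) are illustrative and do not come from the original post.

```python
import numpy as np

def variance_components(f, W):
    """Total, additive and dominance variance in fitness at one locus.

    f : allele frequencies (length n, summing to 1)
    W : symmetric n x n matrix of genotype fitnesses w_ij
    Assumes random mating, so ordered genotype (i, j) has frequency f_i * f_j.
    """
    f = np.asarray(f, dtype=float)
    W = np.asarray(W, dtype=float)
    G = np.outer(f, f)                  # ordered-genotype frequencies
    w_bar = np.sum(G * W)               # mean fitness
    w_marg = W @ f                      # marginal fitness of each allele
    alpha = w_marg - w_bar              # average excess of each allele
    var_total = np.sum(G * (W - w_bar) ** 2)
    var_add = 2.0 * np.sum(f * alpha ** 2)   # factor 2: two alleles per diploid
    delta = W - w_bar - alpha[:, None] - alpha[None, :]  # dominance deviations
    var_dom = np.sum(G * delta ** 2)
    return var_total, var_add, var_dom

# Two-allele check against the closed-form expressions in the question
# (the numbers are arbitrary illustrative values).
x, W11, W12, W22 = 0.3, 1.0, 0.8, 0.5
total2, add2, dom2 = variance_components([x, 1 - x], [[W11, W12], [W12, W22]])
add_closed = 2 * x * (1 - x) * (x * W11 + (1 - 2 * x) * W12 - (1 - x) * W22) ** 2
dom_closed = x ** 2 * (1 - x) ** 2 * (2 * W12 - W11 - W22) ** 2

# Three-allele example with arbitrary fitnesses.
f3 = [0.5, 0.3, 0.2]
W3 = [[1.0, 0.9, 0.7],
      [0.9, 0.6, 0.8],
      [0.7, 0.8, 0.4]]
total3, add3, dom3 = variance_components(f3, W3)
print(total3, add3, dom3)
```

With two alleles the function should reproduce the \$\sigma_A^2\$ and \$\sigma_D^2\$ expressions from the question, and for any \$n\$ the additive and dominance parts should sum to the total variance.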

## Standing genetic variation as a major contributor to adaptation in the Virginia chicken lines selection experiment

Artificial selection provides a powerful approach to study the genetics of adaptation. Using selective-sweep mapping, it is possible to identify genomic regions where allele-frequencies have diverged during selection. To avoid false positive signatures of selection, it is necessary to show that a sweep affects a selected trait before it can be considered adaptive. Here, we confirm candidate, genome-wide distributed selective sweeps originating from the standing genetic variation in a long-term selection experiment on high and low body weight of chickens.

### Results

Using an intercross between the two divergent chicken lines, 16 adaptive selective sweeps were confirmed based on their association with the body weight at 56 days of age. Although individual additive effects were small, the fixation for alternative alleles across the loci contributed at least 40 % of the phenotypic difference for the selected trait between these lines. The sweeps contributed about half of the additive genetic variance present within and between the lines after 40 generations of selection, corresponding to a considerable portion of the additive genetic variance of the base population.

### Conclusions

Long-term, single-trait, bi-directional selection in the Virginia chicken lines has resulted in a gradual response to selection for extreme phenotypes without a drastic reduction in the genetic variation. We find that fixation of several standing genetic variants across a highly polygenic genetic architecture made a considerable contribution to long-term selection response. This provides new fundamental insights into the dynamics of standing genetic variation during long-term selection and adaptation.

## Background

Quantifying the (causal) relationships between genes and observed phenotypic traits is a central task of empirical studies of adaptive evolution [1, 2] and of plant and animal breeding [3]. The animal model [4,5,6] has become a popular statistical approach to disentangle genetic effects on a phenotype from other factors that may induce phenotypic similarities among relatives, such as shared environmental effects [7], inbreeding [8], or individual traits such as age or sex [9, 10]. Fundamental to the animal model is information on how animals are related to each other, information typically obtained from pedigree data [11, 12], from genomic data (e.g. [13, 14]), or a combination of both [15]. Pedigrees are still the most commonly used source of relatedness information in animal models (e.g. [16, 17]), in part because the use of pedigrees leads to models that are computationally efficient.

All pedigrees necessarily start with a founder generation of individuals with unknown parents, so-called ‘phantom parents’ [17]. The animal model assumes that all founder individuals stem from a single, genetically homogeneous baseline population, and that the model estimates the additive genetic variance (denoted as \$\sigma_A^2\$) of the respective base population. When the homogeneity assumption is violated, for example in the presence of immigrants from another population or in crossbred livestock breeds, estimates of \$\sigma_A^2\$ may be biased [18, 19]. To address this problem, animal breeders developed animal models with genetic groups, briefly denoted as genetic group models (e.g. [20]), and these are now also receiving attention in evolutionary ecology [17, 21, 22]. The main idea behind genetic group models is that accounting for differences in mean breeding values may reduce or eliminate the bias [17]. However, current genetic group models have a key limitation: genetic groups are allowed to differ in mean breeding value, but are assumed to have the same additive genetic variance. This homogeneity assumption is violated in some animal breeding applications [23,24,25], and is likely also violated in many natural populations, where source populations of immigrants may differ in additive genetic variance, for example due to differences in effective population size (genetic drift) and selection regimes (e.g. [26,27,28]). In fact, different populations or ecotypes within the same species have been found to differ in their additive genetic (co)variances in plants (e.g. [29]), invertebrates (e.g. [30]), and vertebrates (e.g. [31]).
Because additive genetic variances determine the evolutionary potential of phenotypic traits [1, 32], and because of the fundamental importance of understanding the processes that shape additive genetic variances, as well as the consequences that selection will have on the rate and direction of evolution within and across populations, it is essential to be able to estimate the additive genetic variance of each baseline population in the presence of interbreeding genetic groups.

Aiming for better predictions of breeding values in crossbred populations, animal breeders have suggested approaches that account for heterogeneous additive genetic variances across genetic groups [19, 24, 33]. One drawback of these models is that they rapidly become unfeasibly complex, because variability of the genetic values must now be split into components from the pure breeds plus components due to segregation terms when breeds are mixed, which requires that the respective segregation variance terms enter the model. Segregation variance refers to the increase in variance caused by differences in allele combinations, average allelic effects, and linkage disequilibrium at and between loci underlying the phenotype in the mixing breeds, as in \$F_1\$ and \$F_2\$ generations of line crosses [34,35,36]. The respective terms can be large in crossbreeding applications [1, p. 11]. However, as we explain in detail below, we expect the segregation variance from crossing different genetic groups in wild study populations to be small for many traits of interest, so that omitting it from the animal model does not lead to significant bias.

Thus, here we use the simplified models without segregation terms to derive genetic group models that allow for group-specific additive genetic variances. In order to properly consider each individual’s genetic contribution to the actual population, we additively split the breeding value of each individual into group-specific components, similar to the approach suggested by García-Cortes and Toro [33]. For each group, the components that stem from the same genetic group covary according to a group-specific relatedness matrix. The main challenge is to find these matrices. Instead of implementing a recursive procedure to calculate the inverse of the additive genetic covariance matrix [23, 33], we propose to derive group-specific relatedness matrices by first decomposing the full relatedness matrix (disregarding genetic groups) via a generalized Cholesky decomposition (as described by [12]), and then appropriately scaling the respective matrix components for each group. This procedure has the advantage that we can use the same mathematical approach that Henderson and Quaas developed to decompose a single population’s relationship matrix [12, 37]. Moreover, by incorporating multiple inverse relatedness matrices into a single mixed model, existing algorithms for the analysis of single populations can easily be extended to genetic group animal models with group-specific additive genetic variances.
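The decomposition step described above can be illustrated on a toy pedigree. The following is a minimal sketch, assuming the generalized Cholesky decomposition takes the form A = T D Tᵀ with T unit lower-triangular and D diagonal. The five-animal pedigree and the helper `relatedness_matrix` are hypothetical, and the group-specific scaling of the components is deliberately not shown.

```python
import numpy as np

def relatedness_matrix(sire, dam):
    """Additive relatedness matrix A via the tabular method.

    sire[i] and dam[i] are parent indices (or None for unknown parents).
    Assumes animals are ordered so that parents precede their offspring.
    """
    n = len(sire)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sire[i], dam[i]
        for j in range(i):
            a = 0.0
            if s is not None:
                a += 0.5 * A[j, s]
            if d is not None:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
        both_known = s is not None and d is not None
        A[i, i] = 1.0 + (0.5 * A[s, d] if both_known else 0.0)
    return A

# Hypothetical pedigree: animals 0 and 1 are founders, 2 and 3 are full
# siblings, and 4 is the offspring of a mating between 2 and 3.
sire = [None, None, 0, 0, 2]
dam = [None, None, 1, 1, 3]
A = relatedness_matrix(sire, dam)

# Generalized Cholesky (LDL^T) factors obtained from the ordinary Cholesky factor.
L = np.linalg.cholesky(A)      # A = L L^T, L lower triangular
d = np.diag(L) ** 2            # diagonal of D
T = L / np.diag(L)             # divide column j by L[j, j] to get a unit diagonal

print(np.round(A, 3))
print(d)
```

In this toy example the entries of `d` are 1 for the founders and 0.5 for non-founders whose parents are both known and not inbred.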

In the following, we first summarize the current state of genetic group models and then give a detailed description of the extension to heterogeneous group-specific additive genetic variances. We illustrate the performance of our method with a simulation study and an application to a meta-population of house sparrows (Passer domesticus) in Norway, where genetic groups are determined by geographical properties of the bird’s natal island population. By also fitting a model that includes a segregation term to the sparrow data, we illustrate that omitting segregation variances is unproblematic in such applications. We also provide a short tutorial including R code for the analysis, and discuss opportunities and limitations of our extended genetic group model.

## Results and discussion

The UK Biobank [12] has genotyped 500,000 participants for an array that contains 847,441 single nucleotide polymorphisms (SNPs). After employing stringent quality control (QC) criteria (see ‘Methods’), we extracted 13,068 self-reported and genetically inferred White-British (Additional file 1: Figure S1) male–female pairs that shared the same household address but were less related to each other than first cousins once removed, that is, with a coefficient of relationship (r) below 0.0625 (Additional file 1: Figure S2). Of these male–female pairs, 92 % reported that they lived with their spouses, which is consistent with our hypothesis that these pairs were couples. We kept relatives (i.e. individuals with r > 0.0625) in our dataset providing they lived in different households (Additional file 1: Figure S3). Rare variants (those with minor allele frequency < 0.05) were removed from the analysis because they are known to distort the estimates of relatedness [13]. After removing possible outliers (see ‘Methods’), we modelled two phenotypes for each individual: the person’s own measured height and their partner’s measured height. The couples’ phenotypic correlation was 0.26 (95 % confidence interval [CI] 0.24, 0.27) (Additional file 1: Figure S4). We then adjusted for social and genetic population structure, correcting for the first 20 principal components (PCs) derived from an LD-pruned genomic relationship matrix (see ‘Methods’), age, gender, and Townsend deprivation index. The phenotypic correlation between couples remained high, at 0.23 (95 % CI 0.22, 0.24).

To estimate the contribution of genetic and environmental factors to variation in choice of mate height, we estimated relationships (Additional file 1: Figure S3) between the 26,136 individuals available [14] using the 318,852 autosomal SNPs that passed our QC protocol. We used a mixed linear model to estimate variance components [15]. To account for population and social structure, the analyses included the first 20 PCs, gender, age at recruitment, and Townsend deprivation index as fixed effects, and a genetic and an environmental (residual) random effect.

First, we used a univariate analysis to estimate to what degree attraction to a mate of similar height was explained by a person’s genotype. To that purpose, we treated the height of the partner as the person’s own trait (i.e. the choice of mate height). We estimated that the heritability of choice of mate height was 0.041 (standard error 0.014), which indicates that there is a significant genetic component for choice of mate height in humans. This is consistent with a model where mate selection for height is driven by one’s own height (see ‘Methods’).

We then asked whether the genetic determinants of choice of mate height were shared with the genetic determinants of a person’s own height. To answer this question, we treated the height of the partner as a phenotype of an individual and used a bivariate analysis to estimate the genetic and environmental correlation between the two traits. A genetic correlation equal to zero would imply that one’s own height and the choice of mate by height are not affected by the same genetic variants or that there is no directional pleiotropy, whilst a genetic correlation of one would imply that the two traits share the same genetic determinants, working in the same direction. Similarly, a non-zero environmental correlation would imply that the factors that affect the environmental and non-additive genetic deviations are at least partly shared between the two traits. The bivariate analysis (Table 1) performed using all available autosomal SNPs revealed that additive genetic factors explained 60 % and 3.6 % of the phenotypic variation for height and choice of mate height, respectively. These estimates are consistent with the estimates obtained in the univariate analysis. By analysing both traits jointly, we also demonstrated that 89 % of the genetic variation that affects height and choice of mate height is shared. Overall, this indicates that there is an innate preference for partners of similar height. To investigate this further we removed all related individuals (r > 0.0625) and performed two genome-wide association studies, one for height and one for choice of mate height. The correlation among estimated SNP effects was 0.25 (Additional file 1: Figure S5), which supports the hypothesis that height and choice of mate height share a substantial number of contributing loci and that alleles that increase height also, on average, increase attraction for increased height.

To strengthen the evidence for this hypothesis, we estimated, using genetic marker information and a univariate mixed-linear model (see ‘Methods’), the additive genetic effect (also known as breeding value in the quantitative genetics literature) for the height of individuals whose partner had not been genotyped, but for whom we had information on height. We reasoned that if the genetic correlation between height and choice of mate height was high, then we would be able to predict the height of one of the partners from the additive genetic effect (i.e. breeding value) for the height of the other partner. The correlation between the additive genetic effect for one’s own height and one’s partner’s height phenotype (i.e. the accuracy of prediction) was 0.13 (P = 7.55 × 10^−59), that is, 64 % of the maximum expected correlation. The expected maximum correlation between the additive genetic effect for choice of mate height and the phenotype for choice of mate height is 0.2, the square root of the heritability of choice of mate height.
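The arithmetic behind the 64 % figure can be retraced in a few lines. This is only a sketch that uses the heritability (0.041) and prediction accuracy (0.13) reported in the text; the variable names are illustrative.

```python
import math

h2_mate_choice = 0.041     # heritability of choice of mate height (from the text)
observed_accuracy = 0.13   # correlation between breeding value and partner's height

# The correlation between a breeding value and its phenotype is bounded by
# the square root of the heritability.
max_accuracy = math.sqrt(h2_mate_choice)
fraction_of_max = observed_accuracy / max_accuracy

print(f"maximum expected correlation: {max_accuracy:.2f}")
print(f"observed / maximum: {fraction_of_max:.0%}")
```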

The genetic consequences of assortative mating depend on whether the primary cause of assortment among partners is phenotypic (e.g. tall people are attracted to tall people), genetic (e.g. matings are within differentiated ethnic groups) or environmental (e.g. matings are with socially homologous groups). Primary genetic or environmental correlations arise when mating occurs within groups that are either genetically or environmentally differentiated. We argue that for human height the primary source of partner similarity is phenotypic, rather than caused by genetic or environmental structure within the population. We believe that the observed correlation in height between partners is not an artefact of mating within groups or populations that are genetically differentiated, because our analyses were adjusted for the first 20 PCs and because, for mixed-origin couples (those for which a partner is classified as White-British and the other as non White-British), we observed similar heritabilities to those of White-British couples for both height and mate’s height (Additional file 1: Table S1). In addition, we performed an analysis following a permutation approach that, whilst maintaining a height-associated mating structure, removed any genetic (Fig. 1) and environmental (Fig. 2) within-pair structure due to assortment based on alternative factors like geography, age or socio-economic status (see ‘Methods’). Specifically, we swapped the male partners amongst pairs of couples with similar phenotypes for both individuals. The results of this analysis (Additional file 1: Table S2) were practically identical to the results obtained for the original data, indicating that the genetic or environmental structure of the population is not driving the correlation between mates (Additional file 1: Table S3 and Fig. 2).

Correlation between distance of birthplaces and relatedness. The regression coefficient of relatedness on distance (m) was −7.9 × 10^−10 (P = 0.026) and −4.9 × 10^−10 (P = 0.134), for the real husband and swapped husband, respectively.

## Heritable Variation, With Little or No Maternal Effect, Accounts for Recurrence Risk to Autism Spectrum Disorder in Sweden

Background: Autism spectrum disorder (ASD) has both genetic and environmental origins, including potentially maternal effects. Maternal effects describe the association of one or more maternal phenotypes with liability to ASD in progeny that are independent of maternally transmitted risk alleles. While maternal effects could play an important role, consistent with association to maternal traits such as immune status, no study has estimated maternal, additive genetic, and environmental effects in ASD.

Methods: Using a population-based sample consisting of all children born in Sweden from 1998 to 2007 and their relatives, we fitted statistical models to family data to estimate the variance in ASD liability originating from maternal, additive genetic, and shared environmental effects. We calculated sibling and cousin family recurrence risk ratio as a direct measure of familial, genetic, and environmental risk factors and repeated the calculations on diagnostic subgroups, specifically autistic disorder (AD) and spectrum disorder (SD), which included Asperger's syndrome and/or pervasive developmental disorder not otherwise specified.

Results: The sample consisted of 776,212 children, of whom 11,231 had a diagnosis of ASD: 4554 with AD and 6677 with SD. We found support for a large additive genetic contribution to liability; heritability was estimated at 84.8% (95% confidence interval [CI]: 73.1-87.3) for ASD, 79.6% (95% CI: 61.2-85.1) for AD, and 76.4% (95% CI: 63.0-82.5) for SD.

Conclusions: There was modest, if any, contribution of maternal effects to liability for ASD, including subtypes AD and SD, and there was no support for shared environmental effects. These results show liability to ASD arises largely from additive genetic variation.

Keywords: Autism; Epidemiology; Genetics; Heritability; Population-based; Psychiatry.

### Conflict of interest statement

The authors report no biomedical financial interests or potential conflicts of interest.

## Discussion

The fact that heterozygosity can be heritable was shown more than two decades ago, at least in the case of biallelic loci in outbred populations (Mitton et al., 1993). However, as this is generally not well known, we first reviewed this seminal work. Subsequently, we provided a quantitative genetic framework for the prediction of genetic (co)variance components and heritability of heterozygosity, which allows for any number of multiallelic loci and inbred populations. This provides a useful tool for explicit theoretical and empirical investigations of the importance of selection of heterozygosity for the maintenance of genetic variation in fitness. Indeed, irrespective of the conditions under which selection of heterozygosity might or might not occur, heritability of heterozygosity is an essential requirement for an evolutionary response. The quantitative genetic framework outlined here can be applied to (multiallelic) markers that have fitness effects themselves or to loci that are in linkage disequilibrium with loci that influence fitness (Slatkin, 1995). In addition, heritability of heterozygosity provides an alternative, biologically intuitive explanation for why the dominance coefficient d contributes to additive genetic variance of any trait whenever allele frequencies are unequal (Nietlisbach and Hadfield, 2015).

Whereas heritability of heterozygosity can be calculated for any type of genetic locus, an evolutionary response is not necessarily expected. In particular, allele frequencies at neutral marker loci are not expected to change when they are not linked to loci that are under selection, even if marker heterozygosity correlates with fitness (‘apparent selection’; Charlesworth, 1991). In other words, heterozygosity–fitness correlations are not evidence of selection at the marker loci under study (Szulkin et al., 2010). However, when applied to loci that influence fitness, or to loci linked to them, our framework will be useful to evaluate the possibility of an evolutionary response to selection of heterozygosity. Also, the expected response to selection, as predicted from the product of heritability and selection differential (i.e., the breeder’s equation), will often deviate from the observed response, for example, if there is no additive genetic covariance between trait and fitness (Merilä et al., 2001; Morrissey et al., 2010, 2012).
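For reference, the breeder’s equation mentioned above is usually written as follows. The symbols \$R\$, \$h^2\$ and \$S\$ are standard textbook notation and do not appear in the original text:

\$\$ R = h^2 S \$\$

Here \$R\$ is the expected response to selection (the change in mean trait value across one generation), \$h^2\$ is the narrow-sense heritability and \$S\$ is the selection differential. The alternative prediction based on additive genetic covariance, often called the Robertson-Price identity, is \$R = \sigma_A(z, w)\$, where \$z\$ is the trait and \$w\$ is relative fitness. The two predictions can disagree when the phenotypic association between trait and fitness is not genetic.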

Heritability of heterozygosity is highest for highly unequal allele frequencies and is reduced by inbreeding. Reductions in heritability with increasing inbreeding are typical for traits determined by additive gene action. However, with dominance effects, changes in heritability are difficult to predict and can go in any direction (Falconer and Mackay, 1996, p 266), as has been shown experimentally for various traits other than heterozygosity (e.g., Wade et al., 1996; Kristensen et al., 2005). Our framework can be used to assess how strong the effects of inbreeding are on the heritability of heterozygosity in a focal population. This is useful, because empirical estimation of all genetic (co)variance components relevant under inbreeding is very challenging (Wolak and Keller, 2014). Because empirical quantification may often be difficult or impossible, being able to predict these (co)variance components for given allele frequencies and a certain level of inbreeding is of practical relevance. In addition, this framework offers a way to describe the amount of (additive and total genetic) variance in heterozygosity introduced by inbreeding.

Although evaluating the conditions under which heterozygosity may be selected for is beyond the scope of this article, selection for heterozygosity is possible at loci displaying heterozygote advantage (Lehmann et al., 2007; Fromhage et al., 2009). Among the few known loci showing indications for a heterozygote advantage is the major histocompatibility complex (or human leukocyte antigen system in humans), where heterozygous individuals are more resistant against pathogens (reviewed in Hedrick, 2012). However, because most rare alleles occur in heterozygous form (Halliburton, 2004, p 78), it is often not possible to distinguish heterozygote advantage from selection for rare alleles (Spurgin and Richardson, 2010). Nevertheless, even if heterozygote advantage seems to be rare in nature, just a few overdominant loci would have a larger effect compared with many loci with directional dominance (Crow, 1952, p 291). Additionally, there may be a role for pseudo-overdominance (Charlesworth and Willis, 2009) (sometimes called associative overdominance; Frydenberg, 1963; Lynch and Walsh, 1998, p 288) and fluctuating selection (Charlesworth, 1988) in generating selection for heterozygosity.

## Discussion

Common approaches to dissect the genetics of complex traits in segregating populations are linkage mapping and association studies. These studies aim to identify the loci in the genome where genetic polymorphisms control the phenotypic variance in the studied populations. This is achieved by screening for significant genotype-phenotype associations across a large number of genotyped polymorphic markers in the genome. The most common statistical models used in such analyses aim to identify loci with significant mean phenotype differences between the genotypes at individual loci. Although such models are powerful for capturing much genetic variance in populations, they have limited power when challenged with more complex genetic architectures including multiple-alleles, variance-heterogeneity and genetic interactions [8,47]. It is therefore important to also develop, and test, methods that explore statistical genetic models reaching beyond additivity when aiming for a more complete dissection of the genetic architecture of complex traits.

The genetic architecture of variation in mean leaf molybdenum concentrations has earlier been explored using GWA analyses in a smaller set of 93 wild collected A. thaliana accessions [2]. No genome-wide significant associations were found for leaf molybdenum, which was surprising given that the trait has a high heritability [36,43] and that several polymorphisms in MOT1 are known to contribute to natural variation in this trait [36,37]. When we re-analyzed this data using a method to detect variance differences between genotypes, a strong genetic variance-heterogeneity was identified near the MOT1 gene [22]. Here, we studied a larger set of 340 A. thaliana accessions to replicate and fine-map the molecular determinant of this genetic variance-heterogeneity, and find that the strongest associations are to an extended region surrounding MOT1 (vBLOCK). This is the first successful fine-mapping and replication of a variance-heterogeneity locus on a genome-wide significance scale and in an independent dataset.

In this larger dataset we also identified four loci that independently alter the mean concentration of leaf molybdenum. The minor allele at one of these (DEL53) was a deletion in the promoter region of MOT1 previously identified using an F2 bi-parental mapping population. This deletion allele decreases the concentration of molybdenum in leaves by down-regulating MOT1 transcription [36]. Further, we also identified three previously unknown loci, and the minor alleles at these loci (DUP326, SNP1+ and SNP2+) increased the concentration of molybdenum in leaves. One allele (DUP326) was an insertion polymorphism in the promoter region of MOT1, and our analyses revealed that accessions carrying this polymorphism have higher expression of MOT1 compared to the Col-0 accession that does not carry this polymorphism. The other two associations were to SNPs in regions that were not in LD (r²) with the MOT1 gene or its promoter. One of these SNPs was found 25 kb downstream of MOT1 (SNP1) and the other 600 kb upstream of the MOT1 transcription start-site (SNP2). The regulation of molybdenum concentrations in the leaves is hence due to multiple alleles in a gene known to regulate molybdenum uptake, MOT1, but also alleles at other neighboring loci that have earlier not been found to contribute to molybdenum homeostasis in A. thaliana. These results support and refine earlier results from QTL and functional analyses of the MOT1 region that highlighted the central importance of the MOT1 region in the regulation of molybdenum homeostasis in natural populations and also suggested that the natural variation in this trait might have a multi-allelic background [36,37]. As it is well known that major loci affecting traits under selection often evolve multiple mutations affecting the phenotype, and that allelic heterogeneity is an important driver of evolution in natural A. thaliana populations [48], our finding of multiple polymorphisms in this key locus is not surprising. Striking examples of allelic heterogeneity in natural A. thaliana populations include the large number of different loss-of-function mutants in the GA5 locus leading to semidwarfs [49], the MUM2 locus leading to altered seed flotation [50] and the FRIGIDA locus leading to an altered flowering-time [51].

Multi-allelic loci are, however, a major challenge in traditional GWA analyses [48]. It is therefore valuable to note that such loci, under certain conditions, can lead to a genetic variance-heterogeneity (see e.g. [10]) that can be detected with a vGWA analysis. The following two examples illustrate how genetic variance-heterogeneity can arise under i) classic allelic heterogeneity where multiple loss-of-function alleles have evolved independently at a locus, and ii) general multi-allelic architectures where the alleles affect the phenotype to various degrees and hence either increase or decrease the phenotype relative to that of the major allele. To illustrate how a genetic variance-heterogeneity can emerge under these scenarios, let us consider an example when looking for associations to a bi-allelic SNP with alleles SNP_A and SNP_B and where the major SNP allele (SNP_A) is completely linked to the major allele at the functional gene M (M_WT). Below, we illustrate how the distribution of the minor alleles across the SNP genotypes will alter the differences in phenotypic mean and variances between the genotypes, and hence affect the power to detect them in GWA and vGWA analyses.

If gene M evolved via classic allelic heterogeneity, multiple loss-of-function alleles (M1−, …, Mn−) will exist in the population. The largest mean difference, and the smallest variance difference, between the genotype classes will occur when all n mutant alleles are linked to the SNP_B allele. As the proportion of the n M− alleles linked to the SNP_A allele increases, the mean difference between genotypes will decrease while the variance difference increases, reaching its maximum when only one of the M− alleles is linked with the SNP_B allele. In all these scenarios, however, there will be a difference in both the mean and the variance between the SNP genotype classes and, depending on the power of the study, the locus can be detected by either GWA or vGWA analyses.

If locus M evolved multiple alleles with distinct effects on the phenotype, the locus might display everything from a complete lack of both mean and variance effects (scenario (a) below), to both mean and variance effects (b) or variance effects only (c). Under the simplest scenario with two minor alleles, M− and M+, that respectively decrease and increase the trait value relative to that of M_WT, it is the linkage between the alleles at M and the tested marker that determines the mean and variance differences between the genotypes observed at this locus, as shown in the examples below.

(a) If the M− and M+ alleles are evenly distributed across the two SNP genotypes, there will be neither a mean nor a variance difference between the genotypes.

(b) If the SNP tags the M+ and M− alleles perfectly, i.e. SNP_A tags M+ and SNP_B tags M−, or vice versa, there will be both mean and variance differences between the genotypes.

(c) If the SNP_B allele tags both minor alleles perfectly, i.e. M+ and M− only occur with SNP_B, there will only be a difference in variance between the SNP genotype classes (S3 Fig).
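The three scenarios can be made concrete with a small numerical sketch. All allele effects and frequencies below are hypothetical values chosen for illustration, and each accession is treated as carrying a single haplotype (as in inbred lines), which is an assumption of this sketch rather than something stated in the text.

```python
import numpy as np

# Hypothetical effects of the alleles at gene M on the phenotype.
EFFECT = {"WT": 0.0, "M-": -1.0, "M+": 1.0}

def class_mean_var(freqs):
    """Phenotypic mean and variance within one SNP class, given the
    frequencies of the M alleles found together with that SNP allele."""
    effects = np.array([EFFECT[allele] for allele in freqs])
    p = np.array(list(freqs.values()), dtype=float)
    mean = float(np.sum(p * effects))
    var = float(np.sum(p * (effects - mean) ** 2))
    return mean, var

# For each scenario: (M-allele frequencies with SNP_A, with SNP_B).
scenarios = {
    "a": ({"WT": 0.8, "M-": 0.1, "M+": 0.1}, {"WT": 0.8, "M-": 0.1, "M+": 0.1}),
    "b": ({"WT": 0.8, "M+": 0.2}, {"WT": 0.5, "M-": 0.5}),
    "c": ({"WT": 1.0}, {"WT": 0.5, "M-": 0.25, "M+": 0.25}),
}

results = {}
for name, (snp_a, snp_b) in scenarios.items():
    mean_a, var_a = class_mean_var(snp_a)
    mean_b, var_b = class_mean_var(snp_b)
    results[name] = (mean_a - mean_b, var_a - var_b)
    print(f"scenario ({name}): mean difference = {mean_a - mean_b:+.3f}, "
          f"variance difference = {var_a - var_b:+.3f}")
```

With these values, scenario (a) shows no difference in either statistic, (b) shows a difference in both, and (c) shows a difference in variance only, which is the case a mean-based GWA test would miss.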

Hence, the vGWA analysis is likely to be useful for identifying loci under a set of different scenarios ranging from classic allelic heterogeneity to loci with multiple alleles having distinct effects on the phenotype. As shown here, vBLOCK was detected based on its genetic variance-heterogeneity, owing to its close resemblance to scenario (c) above (Fig 2A).

Here, we dissected a locus displaying a genetic variance-heterogeneity for the molybdenum concentration in A. thaliana leaves into an underlying multi-locus, multi-allelic genetic architecture. We find several alleles at MOT1 that contribute to this association, which is consistent with findings in earlier studies reporting that several functional variants of this gene alter the mean molybdenum concentrations in A. thaliana [36,37]. Such multi-allelic architectures, where the different genetic variants affect traits under selection to varying degrees, are not unique to this study but have also been described for other traits and species. For example, in A. thaliana a natural series of alleles with different effects on vernalization has been identified at the Flowering Locus C (FLC) [52]. Similar examples also exist in domestic animal populations for both Mendelian traits, such as coat color [53–55], and complex traits, such as muscularity [56] and meat quality [57]. As illustrated above, the vGWA analysis is a straightforward and computationally tractable analytical strategy that could be used to identify loci where multi-allelic genetic architectures reduce the additive genetic variance that can be detected by traditional GWA approaches. The examples above suggest that such genetic architectures are likely to be more common than what has been empirically shown to date. We therefore recommend that the vGWA approach be tested on more datasets to reveal how common this type of architecture might be for complex traits. This will also help reveal how large a contribution such multi-allelic genetic architectures make to the “missing heritability”.

Little is currently known about the genetic mechanisms contributing to variance-heterogeneity between genotypes in natural populations. Ayroles et al. [23] recently reported the first dissection of a locus displaying a genetic variance-heterogeneity in a segregating population and found that mutating a single gene (Ten-a) led to a genetic variance-heterogeneity for a behavioral phenotype in Drosophila melanogaster. A number of other, not mutually exclusive, hypotheses have been proposed to explain the origin of genetic variance-heterogeneity at a locus. These can broadly be divided into two categories: those due to the individual locus itself, such as multiple functional alleles, incomplete linkage disequilibrium and developmental instabilities [7,10,22], and those due to interactions between the locus and other genetic or environmental factors (i.e. epistasis or gene-by-environment interactions) [8,10,21]. Here, we present the first empirical evidence illustrating how population-wide genetic variance-heterogeneity in a natural population can result from a complex locus involving multiple loci and multiple alleles. We show that this genetic variance-heterogeneity originates from the LD (D’) between multiple functional polymorphisms and the SNP markers defining an LD block around MOT1 (vBLOCK). The high-variance associated version of this LD-block (vBLOCK hv) contains three independent polymorphisms (DEL 53, DUP 326 and SNP 1 +) altering the molybdenum concentration in leaves relative to the major alleles at these loci on the low-variance associated version (vBLOCK lv). Two of these polymorphisms increase molybdenum and one decreases it, leading to a highly significant genetically determined variance-heterogeneity amongst the accessions that share vBLOCK hv (Fig 2A; multi-allelic example (c) above). Our work also illustrates how the use of alternative genetic models in GWA analyses can provide novel insights into complex genetic architectures underlying adaptively important traits in natural populations.

The LD (D’) between multiple functional polymorphisms and vBLOCK in this collection of natural A. thaliana accessions is the key genomic feature that facilitated the discovery of this locus in the vGWA. Although the molecular basis for this LD-pattern, as well as the reasons for multiple independent polymorphisms being found almost exclusively with one of the variants of this LD-block, is unknown, it is interesting to note that they could have emerged via the processes discussed in relation to the appearance of synthetic LD in GWA studies [58]. It would therefore be interesting to explore in the future whether the same basic genomic processes might drive the emergence of both synthetic and vGWA associations in general, or whether the resemblance between the genetic architecture described here and the mechanism proposed by Dickson et al. [58] is a rare case where the two overlap.

Many GWA studies have found that the total additive genetic variance of associated loci is considerably less than that predicted based on estimates of the narrow-sense heritability, i.e. the ratio between the additive genetic and phenotypic variance in the population. This common discrepancy between the two is often called the curse of the “missing heritability” and is viewed as a major problem in past and current GWA studies [59]. Here, we provide an empirical example of how a vGWA is able to identify a locus [22] that remained undetected in a standard GWA [2] and that, when the underlying genetic architecture was revealed, was found to make a large contribution to the additive genetic variance and narrow-sense heritability. This illustrates the importance of utilizing multiple statistical modeling approaches in GWA studies to detect the loci contributing to the phenotypic variability of the trait, and then also continue to further dissect the underlying genetic architecture to uncover how the loci potentially contribute to the heritability that was “missing” in the original study [2].

By evaluating T-DNA insertional alleles of genes in LD with the SNPs associated with leaf molybdenum concentrations, we are able to suggest two novel functional candidate genes involved in molybdenum homeostasis in A. thaliana. Little is known about the function of one of these, AT2G27030, and further work is needed to explore the mechanisms by which it may alter molybdenum concentrations in the plant. The second gene (AT2G26975, Copper Transporter 6, COPT6), located 600 kb upstream of MOT1, is known from earlier studies to be involved in the connected regulation of copper and molybdenum homeostasis in plants. It was recently reported [46] that MOT1 and several copper transporters were up-regulated under copper deficiency in B. napus, suggesting a common regulatory mechanism for these groups of genes. Further experimental work is needed to explore the potential contributions of these genes to natural variation in molybdenum homeostasis, and the potential connection between copper and molybdenum homeostasis.

Here, we dissect a complex locus affecting molybdenum concentration in the A. thaliana leaf and find it likely that three closely linked genes contribute to this effect. Clustering of genes with similar function is well known for Resistance (R) genes [60], and close linkage between genes important for growth rate has also been evidenced [61] in A. thaliana. How common such functional clustering into complex loci will be for traits of importance for evolution is still largely unknown, as the resolution in most complex trait studies does not allow the separation of effects from closely linked loci. Our finding that not only the already known gene in this region, MOT1, but likely also other novel genes contribute to the diverse range of molybdenum concentrations in the leaf observed in this collection of natural A. thaliana accessions suggests that the clustering of loci has been of adaptive value for this ecologically relevant trait. This makes the locus a highly interesting candidate for future work to better understand the role of gene clustering for the evolution of adapted populations.

In summary, here we dissect a locus displaying a genetic variance-heterogeneity for leaf molybdenum concentration in A. thaliana [22] into the contributions from three independent alleles that are in high LD with the high-variance associated version of an extended LD-block surrounding the MOT1 gene. This is the first empirical example of how a multi-locus, multi-allelic genetic architecture can lead to genetic variance heterogeneity at a locus. The dissection of the genetic architecture underlying the vGWA signal allowed the transformation of non-additive genetic variance into additive genetic variance, and hence allowed the detection of a significant part of the “missing heritability” in the variation in leaf molybdenum concentrations in this species-wide collection of A. thaliana accessions. This study also delivers insights into how vGWA mapping facilitates the detection and genetic dissection of the genetic architecture of loci contributing to complex traits in natural populations. It thereby illustrates the value of using alternative statistical methods in genome-wide analyses. Further, it provides an approach to infer multi-allelic loci, which are likely to be both a common, and far too often ignored, complexity in the genetics of multifactorial traits that contributes to undiscovered additive genetic variance and consequently the curse of the “missing heritability”.

## Heritability and additive genetic variance

Most people have an intuitive notion of heritability being the genetic component of why close relatives tend to resemble each other more than strangers. More technically, heritability is the fraction of the variance of a trait within a population that is due to genetic factors. This is the pedagogical post on heritability that I promised in a previous post on estimating heritability from genome wide association studies (GWAS).

One of the most important facts about uncertainty and something that everyone should know but often doesn’t is that when you add two imprecise quantities together, while the average of the sum is the sum of the averages of the individual quantities, the total error (i.e. standard deviation) is not the sum of the standard deviations but the square root of the sum of the square of the standard deviations or variances. In other words, when you add two uncorrelated noisy variables, the variance of the sum is the sum of the variances. Hence, the error grows as the square root of the number of quantities you add and not linearly as it had been assumed for centuries. There is a great article in the American Scientist from 2007 called The Most Dangerous Equation giving a history of some calamities that resulted from not knowing about how variances sum. The variance of a trait can thus be expressed as the sum of the genetic variance and environmental variance, where environment just means everything that is not correlated to genetics. The heritability is the ratio of the genetic variance to the trait variance.
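As a minimal sketch of how variances add, the snippet below builds a toy trait as the sum of a genetic and an environmental value, with every combination equally likely so that the two parts are exactly uncorrelated. The particular numbers are made up for illustration and are not estimates of anything.

```python
from itertools import product
from math import isclose, sqrt
from statistics import pvariance

# Made-up deviations from the mean, chosen only for illustration.
genetic = [-2.0, 0.0, 2.0]
environment = [-1.0, 1.0]

# Every genetic value is paired with every environmental value,
# so the two components are uncorrelated by construction.
phenotypes = [g + e for g, e in product(genetic, environment)]

var_g = pvariance(genetic)
var_e = pvariance(environment)
var_p = pvariance(phenotypes)

# Variances add; standard deviations do not.
print("var_g + var_e =", var_g + var_e, " var_p =", var_p)
print("sd_g + sd_e =", sqrt(var_g) + sqrt(var_e), " sd_p =", sqrt(var_p))

# Heritability as the ratio of genetic variance to trait variance.
heritability = var_g / var_p
print("heritability =", round(heritability, 3))
```

The standard deviation of the trait comes out smaller than the sum of the two standard deviations, which is the square-root behaviour described above.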

Consider a trait that varies like height. If you plot a histogram of the heights of males or females, you will get a normal distribution. Heritability is about what determines the variance of the distribution and not the mean. That is not to say that the mean does not depend on genetics. Obviously, humans are taller than rhesus monkeys and that has everything to do with the genes. However, the mean is mostly determined by the genetic (and environmental) components that everyone shares. The variance is about what is different between people and that is what we can measure. For example, say one person is 178 cm and another is 176 cm and are genetically identical except for a handful of genetic factors. If they were subjected to the same identical environmental conditions then we could attribute the difference in height to those genetic differences. Obviously, there will be many other genetic factors that specify why the height is on average 177 cm and not say 100 cm, namely all the genetic factors that are identical. However, there is no way to figure out which of those identical genes are responsible for height as opposed to say kidney function with this information. The difference between individuals is also what natural selection can work on. That is why population genetics is so focused on variances.

In my previous post on population genetics, I introduced the concept of additive genetic effects. These are genes or more technically alleles whose contributions to the trait are independent of other genes or the environment. What this means is that if you want to know the difference from the mean, you simply add up the contributions of all the additive alleles that influence that trait. The genetic variance can thus be divided into additive genetic variance and non-additive genetic variance. The non-additive parts include everything that has a nonlinear effect such as dominance, where the presence of just one allele contributes as much as two of the same allele, or epistasis where alleles act differently depending on what other alleles are present, or gene-environment effects where the contribution of an allele changes depending on the environment. The fraction of the variance explained by the additive genetic effects is called the narrow-sense heritability as opposed to the broad-sense heritability, which includes all the genetic effects.

The classical way to measure narrow-sense heritability is to take a group of close relatives, say mothers and daughters, and plot the height of daughters versus the height of mothers. The best fit line picks up the additive genetic effects. If we standardize the heights of each generation, i.e. rescale the heights so that the mean is zero and the standard deviation is one, then the slope of the line is given by the correlation between the heights of the daughters and mothers. Note that the magnitude of a correlation is always less than one. Hence, on average daughters will be closer to the mean than their mothers. This is called regression to the mean. Mothers and daughters share exactly half of their genetic material. The heritability is thus twice the slope (i.e. slope divided by coefficient of relatedness). If you plot the height deviations of the daughters against the average of the height deviations of the two parents, then the slope is the heritability. What this means is that you can estimate the average height of your children or any other heritable trait by taking the average height of you and your spouse and multiplying by the narrow sense heritability. The narrow sense heritability of height is about 0.8, so if you and your wife are two standard deviations above the mean, then the average of your children will be 1.6 standard deviations above the mean. If the heritability of the trait is zero, then the average of your children will be the population mean. A recently developed method, as I described in a previous post, can estimate the heritability contained in a set of genetic markers for a population of strangers.
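The arithmetic in this paragraph is short enough to write down directly. The two function names below are hypothetical helpers introduced for this sketch, and all heights are assumed to be expressed in standard deviations from the population mean.

```python
def heritability_from_single_parent_slope(slope):
    """Narrow-sense heritability from a one-parent regression slope.

    A parent and child share half their genetic material, so the slope
    is divided by the coefficient of relatedness (1/2).
    """
    return slope / 0.5


def expected_offspring_deviation(h2, parent1_sd, parent2_sd):
    """Expected child deviation given both parents' deviations (in SD units)."""
    midparent = (parent1_sd + parent2_sd) / 2
    return h2 * midparent


# A mother-daughter slope of 0.4 would correspond to a heritability of 0.8.
print(heritability_from_single_parent_slope(0.4))

# Both parents two standard deviations above the mean, heritability 0.8.
print(expected_offspring_deviation(0.8, 2.0, 2.0))

# With zero heritability the expectation is the population mean.
print(expected_offspring_deviation(0.0, 2.0, 2.0))
```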

These days, most biologists seem to downplay the importance of additive genetic effects. To me this is a perfect example of discounting the obvious as I blogged about before. Most people seem to believe that the interaction of genes or epistasis must be more important. What I like to say is that epistasis is likely to be important for biology but additive genetic effects are most important for natural selection. The reason is that we inherit genes and not genotypes. Mozart may have been the genius he was because of the specific combination of genes that he possessed but that perfect combination would not be passed on to his children. Thus any allele that confers an advantage will likely only persist in the population if it confers an advantage additively. However, in cases where the population is small and there is some inbreeding, then it could be that combinations of genes that confer a large advantage together but little individually could become fixed in the population. Hence, the way I see evolution proceeding is that it takes small additive steps and then every once in a while it takes a big nonadditive step.

The genetic variation between people can be divided into common and rare variants. The human genome has about ten million common single nucleotide polymorphisms (SNPs) but each individual will also carry many rare mutations. However, it is possible that the variation in the common variants alone could lead to mind-boggling differences in phenotype. Consider an example due to Steve Hsu. Suppose a trait depends on 400 alleles and there is a 50% chance of getting each of these alleles. Then on average you will have 400*0.5 = 200 alleles and the variance around this mean will be 400*0.5*0.5 = 100. Hence, the standard deviation will be 10 alleles. That means 95% of the population will have between 180 and 220 alleles. This also means on average each allele contributes 0.1 standard deviations. A superoutlier who is four standard deviations away has 240 alleles. That still leaves a lot of room for improvement. If you happened to have all of the alleles, which has a probability of one half to the 400 power, you will be 20 standard deviations above the mean! Now, it could be that nonlinear effects could kick in if you have lots of alleles to saturate the effect. I wouldn’t expect any person could be 20 standard deviations above the mean but some traits could have great room for expansion and selective breeding on additive effects in animals has shown dramatic increases in phenotype.
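The numbers in this example follow from the binomial distribution and can be reproduced in a few lines. The sketch assumes, as the example does, that the 400 alleles are inherited independently and contribute equally; the variable and function names are mine.

```python
import math

n_alleles = 400   # alleles affecting the trait
p = 0.5           # chance of carrying each one

mean = n_alleles * p                 # expected number of alleles carried
variance = n_alleles * p * (1 - p)   # binomial variance
sd = math.sqrt(variance)


def z_score(count):
    """Distance from the mean, in standard deviations, for a given allele count."""
    return (count - mean) / sd


per_allele_effect = 1 / sd           # standard deviations gained per extra allele
prob_all = p ** n_alleles            # chance of carrying every allele

print(mean, variance, sd)            # 200.0 100.0 10.0
print(z_score(240), z_score(400))    # 4.0 20.0
print(per_allele_effect, prob_all)
```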

## Polygenic Traits: Introduction, Features and Analysis

This article discusses:

1. Introduction to Polygenic Traits
2. Features of Polygenic Traits
3. Similarities between Oligogenic and Polygenic Traits
4. Analysis of Polygenic Traits
5. Assumptions of Polygenic Traits
6. Examples of Polygenic Traits
7. Partitioning of Polygenic Variability
8. Significance of Polygenes

#### 1. Introduction to Polygenic Traits:

Character or trait refers to any property of an individual showing heritable variation. It includes morphological, physiological, biochemical and behavioural properties. Some characters are governed by one or few genes. Such traits are referred to as qualitative characters or oligogenic characters.

On the other hand, some characters are controlled by several genes. They are known as quantitative characters or polygenic characters. The mode of inheritance of polygenic characters is termed as polygenic inheritance or quantitative inheritance. Since in polygenic inheritance several genes (factors) are involved, it is also known as multiple factor inheritance.

#### 2. Features of Polygenic Traits:

The term polygene was introduced by Mather in 1941. This term has found wide usage in quantitative genetics replacing the older term multiple gene.

Main features of polygenic characters are briefly presented below:

1. Each polygenic character is controlled by several independent genes and each gene has cumulative effect.

2. Polygenic characters exhibit continuous variation rather than a discontinuous variation. Hence, they cannot be classified into clear-cut groups.

3. Effect of individual gene is not easily detectable in case of polygenic characters and, therefore, such traits are also known as minor gene characters.

4. The statistical analysis of polygenic variation is based on means, variances and co-variances, whereas the discontinuous variation is analysed with the help of frequencies and ratios. Thus, polygenic characters are studied in quantitative genetics and oligogenic characters in mendelian genetics.

5. Polygenic traits are highly sensitive to environmental changes, whereas oligogenic characters are little influenced by environmental variation.

6. Classification of polygenic characters into different clear-cut groups is not possible because of continuous variation from one extreme to the other. In case of qualitative characters, such grouping is possible because of discrete or discontinuous variation.

7. Generally the expression of polygenic characters is governed by additive gene action, but now cases are known where polygenic characters are governed by dominance and epistatic gene action. In case of oligogenic characters, the gene action is primarily of non-additive type (dominance and epistasis).

8. In case of polygenic characters, metric measurements like size, weight, duration, strength, etc. are possible, whereas in case of oligogenic characters only the counting of plants with regard to various kinds like colour and shape is possible. Thus, metric measurement is not possible in case of oligogenic characters.

9. Transgressive segregants are only possible from the crosses between two parents with mean values for a polygenic character. Such segregants are not possible in case of qualitative or oligogenic traits.

10. The transmission of polygenic characters is generally low because of high amount of environmental variation. On the other hand, oligogenic characters exhibit high transmission because there is little difference between the genotype and phenotype of such character. Thus, polygenic characters differ from oligogenic ones in several aspects (Table 12.1).

In plant breeding both types of characters showing qualitative and quantitative inheritance have equal economic importance.

#### 3. Similarities between Oligogenic and Polygenic Traits:

East (1916) demonstrated that polygenic characters were perfectly in agreement with Mendelian segregation and later on Fisher (1918) and Wright (1921, 1935) provided a mathematical basis for the genetic interpretation of such characters.

The quantitative characters do not differ in any essential feature from the qualitative characters, as discussed below:

1. Both quantitative and qualitative characters are governed by genes; the former are controlled by polygenes or minor genes and the latter by oligogenes or major genes.

2. Both major as well as minor genes are located on the chromosome in the nucleus.

3. The polygenic traits controlling continuous variation exhibit segregation like major genes controlling discontinuous Mendelian variation.

4. Polygenic characters show variable expression which is due to non-genetic causes i.e., environmental effects. Qualitative characters also exhibit variation in expression but to a lesser degree than polygenic traits.

5. The reciprocal crosses for both types of traits exhibit close agreement in expression of genes.

6. The phenomenon of transgression in polygenes can only be explained by Mendelian principles of inheritance.

7. Polygenes mutate like oligogenes.

8. Dominance and non-allelic interactions are common features of major genes. These features are also observed for polygenes, but are usually complete for major genes and only partial for minor genes.

9. Polygenes exhibit linkage like oligogenes. Many cases of linkage between major genes and polygenes controlling continuous variation have been reported.

Thus, quantitative genetics or biometrical genetics is an extension of Mendelian genetics firmly based on Mendelian principles of heredity.

#### 4. Analysis of Polygenic Traits:

The method of analysis of quantitative inheritance differs from that of qualitative inheritance in some aspects as given below:

1. It requires various measurements of characters like weight, length, width, height, duration, etc., rather than classification of individuals into groups based on colour or shape.

2. Observations are recorded on several individuals and the mean values are used for genetical studies. Segregation into distinct classes in the F2 generation is not obtained in the inheritance of quantitative characters. The segregants exhibit a continuous range of variation from one extreme (low) to the other (high) for such traits.

3. The inheritance is studied with the help of means, variances and covariances. These estimates can be worked out from data recorded in replicated experiments.

4. Fisher (1918) was the pioneer worker to interpret the quantitative characters in terms of Mendelian genetics. Now several biometrical techniques are available for the genetic analysis of quantitative characters. The science which deals with the genetic interpretations of quantitative characters has got separate entity as quantitative genetics or biometrical genetics.

#### 5. Assumptions of Polygenic Traits:

Polygenic inheritance is based on several assumptions.

The six important assumptions are given below:

1. Each of the contributing genes involved in the expression of a character produces an equal effect.

2. Each contributing allele has either cumulative or additive effect in the expression of a character.

3. The genes involved in the expression of characters lack dominance. They show intermediate expression between the two parents.

4. There is no epistasis among genes at different loci.

6. The environmental effects are absent or may be ignored.

However, the last three assumptions are seldom fulfilled.

There are two types of alleles or genes in the polygenic inheritance, viz:

(1) Contributing alleles and

(2) Non-contributing alleles.

Those alleles which contribute to continuous variation are known as contributing alleles and those which do not contribute to continuous variation are referred to as non-contributing alleles. Some scientists refer to these as effective and non-effective alleles, respectively.

#### 6. Examples of Polygenic Traits:

In plant genetics, examples of polygenic characters include yield per plant, days to flower, days to maturity, seed size, seed oil content, etc. Examples of qualitative characters are colour of stem, flower, pollen, etc. and their shapes.

Polygenic inheritance has been reported for various characters both in plants and animals. The most common examples include kernel colour in wheat, corolla length in tobacco, skin colour in man and ear size in maize.

These are briefly described as follows:

1. Kernel Colour in Wheat:

Nilsson-Ehle (1908) studied the inheritance of kernel colour in wheat. He found that kernel colour in wheat is governed by one, two or three gene pairs: in crosses between red and white kernel varieties, the F1 was intermediate between the parental values, and in F2 he observed ratios of 3 : 1, 15 : 1 and 63 : 1 of red to white seeds in different crosses.

The last two ratios suggested duplicate gene interaction. However, an in-depth study of the coloured seeds revealed different grades or shades of colour within the red seeds. The red seeds of the 15 : 1 ratio could easily be divided into four classes on the basis of shade of colour, viz., dark red, medium dark red, medium red and light red.

Together with white, these classes were observed in the ratio of 1 : 4 : 6 : 4 : 1. This suggested that seed colour in wheat is controlled by genes which lack dominance and have small cumulative effects.

Here, two types of alleles are involved in the expression of the character: those which contribute to continuous variation and those which do not. The first category of alleles is called effective and the second non-effective. Assume that red seed colour is controlled by two genes, R1 and R2, and white seed colour by r1 and r2.

From the cross between dark red and white seed parents, Nilsson-Ehle observed the following results (Fig. 12.1):

Where 4 effective alleles were present, the seed colour was dark red; where 3 such alleles were present, the seed colour was medium dark red; with 2 effective alleles, the colour was medium red; and with 1 effective allele, the seed colour was light red. White seed colour was produced when only non-effective alleles were present.
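The class ratios quoted above can be reproduced by counting effective alleles. The sketch below assumes that the genes are unlinked, that the F1 is heterozygous at every gene, and that every effective allele has the same additive effect; the function names are introduced here for illustration only.

```python
from math import comb


def f2_class_counts(n_genes):
    """Counts of F2 individuals carrying 0, 1, ..., 2n effective alleles.

    Each of the 2n allele positions receives an effective allele with
    probability 1/2, so the counts out of 4**n_genes are binomial
    coefficients.
    """
    total_alleles = 2 * n_genes
    return [comb(total_alleles, k) for k in range(total_alleles + 1)]


def red_to_white_ratio(n_genes):
    """Coloured : white ratio, where white carries no effective allele."""
    counts = f2_class_counts(n_genes)
    return sum(counts[1:]), counts[0]


# Two genes: white, light red, medium red, medium dark red, dark red.
print(f2_class_counts(2))

# One, two and three genes give the three red : white ratios.
for n in (1, 2, 3):
    print(n, red_to_white_ratio(n))
```

For two genes the counts are 1 : 4 : 6 : 4 : 1 out of 16, and lumping every coloured class together gives the 3 : 1, 15 : 1 and 63 : 1 ratios for one, two and three genes.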

2. Corolla Length in Tobacco:

Extreme differences exist in corolla length in Nicotiana longiflora. East (1916) studied the inheritance of corolla length in this species of tobacco. He crossed inbred lines of this species with average corolla lengths of 40 mm and 93 mm.

The F1 showed intermediate expression for corolla length, at 63 mm. In F2, wide variation for corolla length was observed. The results indicated that five or more genes were involved in the expression of corolla length.

3. Skin Colour Inheritance in Man:

The inheritance of skin colour in humans was studied by Davenport. The results of matings between black and white parents can be explained on the basis of a two-gene difference. Assume that dark skin colour is governed by the A and B genes and white skin colour by the a and b genes.

In such a cross, the children (F1) have medium skin colour. In the F2 generation, four distinct shades of dark colour were observed besides white (Fig. 12.2). Thus, the phenotypic ratio of 1 : 4 : 6 : 4 : 1 was observed. The individuals having 4, 3, 2, 1 and 0 effective alleles had black, dark, medium, light and white colour, respectively.

The results are presented below:

Subsequent studies on skin colour inheritance indicated that as many as six genes are involved in the expression of this character.

Transgressive Segregation:

Appearance of transgressive segregants in F2 is an important feature of polygenic inheritance. Segregants which fall outside the limits of both the parents are known as transgressive segregants. Transgressive segregation results due to fixation of dominant and recessive genes in separate individuals.

Such segregation occurs when the parents are intermediate to the extreme values of the segregating population. Plant breeders use this principle to obtain superior combinations in segregating material for polygenic characters.

An example of transgressive segregation is presented as follows:
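As a minimal sketch, consider two hypothetical intermediate parents, AAbb and aaBB, each carrying two effective alleles. Assuming two unlinked genes with equal additive effects, the F1 (AaBb) is also intermediate, but the F2 spans the full range from 0 to 4 effective alleles. The parental genotypes and the function name are assumptions made for this illustration.

```python
from math import comb


def f2_distribution(n_genes):
    """Frequency of each effective-allele count in the F2 of a full heterozygote."""
    total_alleles = 2 * n_genes
    return {k: comb(total_alleles, k) / 4 ** n_genes
            for k in range(total_alleles + 1)}


# Hypothetical parents AAbb and aaBB: two effective alleles each.
parent_scores = (2, 2)
low, high = min(parent_scores), max(parent_scores)

dist = f2_distribution(2)

# Segregants falling outside the parental range are transgressive.
transgressive = {k: f for k, f in dist.items() if k < low or k > high}
fraction_transgressive = sum(transgressive.values())

print(dist)
print(sorted(transgressive), fraction_transgressive)
```

Under these assumptions 10 of every 16 F2 individuals fall outside the parental range, including the two extreme classes with 0 and 4 effective alleles.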

Environmental Effect:

Polygenic characters are highly sensitive to environmental changes. In other words, they are more prone to genotype x environmental interactions. The main effect of environment is to mask the small differences among different genotypes resulting in continuous variation in the character.

When the contribution of environment is 50 per cent, the distribution becomes roughly similar to the normal curve and, with 75 per cent contribution, it tends to reach the normal distribution. For polygenic traits, the environmental variation generally ranges from 10 to 50 per cent and is even higher for some traits like yield. The high environmental variation results in overlapping of the various classes, producing continuous variation.

#### 7. Partitioning of Polygenic Variability:

The polygenic variation or variability present in a genetic population is measured in terms of variances.

The polygenic variation is of three types, viz., (1) phenotypic, (2) genotypic and (3) environmental.

These are briefly described below:

1. Phenotypic Variability:

It is the total variability which is observable. It includes both genotypic and environmental variation and hence changes under different environmental conditions. Such variation is measured in terms of phenotypic variance.

2. Genotypic Variability:

It is the inherent or genetic variability which remains unaltered by environmental conditions. This type of variability is more useful to a plant breeder for exploitation in selection or hybridization. Such variation is measured in terms of genotypic variance. The genotypic variance consists of additive, dominance and epistatic components.

3. Environmental Variability:

It refers to non-heritable variation which is entirely due to environmental effects and varies under different environmental conditions. This uncontrolled variation is measured in terms of error mean variance. The variation in true breeding parental lines and their F1 is non-heritable.

In 1918, Fisher was the first to divide the genetic variance into additive, dominance and epistatic components.

a. Additive Variance:

It refers to that portion of genetic variance which is produced by the deviations due to average effects of genes at all segregating loci. Thus, it is the component which arises from differences between the two homozygotes of a gene, i.e., AA and aa. Additive genes show lack of dominance, i.e., intermediate expression.

The additive genetic variance is associated with homozygosis and, therefore, it is expected to be maximum in self-pollinating crops and minimum in cross-pollinating crops. Additive variance is fixable and, therefore, selection for traits governed by such variance is very effective.

Additive genetic variance is important for the following major reasons:

1. It is required for estimation of heritability in the narrow sense, and response to selection is directly proportional to narrow sense heritability.

2. It is a pre-requisite for selection because this is the only variance which responds to selection.

3. Breeding value of an individual is measured directly by the additive gene effects. The general combining ability (gca) effect of a parent is a measure of additive gene effects.

4. Additive genetic variance is depleted in proportion to the improvement made by selection.

5. In natural plant breeding populations, additive variance is the predominant one closely followed by dominance variance.
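Points 1 and 2 above can be illustrated with a minimal sketch. It assumes the standard definition of narrow sense heritability (additive variance divided by phenotypic variance) and the breeder's equation, in which response equals heritability times the selection differential. The function names and the numbers are hypothetical and chosen only for illustration.

```python
def narrow_sense_heritability(v_a, v_p):
    """Narrow sense heritability: additive variance / phenotypic variance."""
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return v_a / v_p


def response_to_selection(h2, selection_differential):
    """Breeder's equation: expected response = h2 * selection differential."""
    return h2 * selection_differential


# Hypothetical values: VA = 20, VP = 50, selected parents exceed the mean by 10 units.
h2 = narrow_sense_heritability(v_a=20.0, v_p=50.0)
response = response_to_selection(h2, selection_differential=10.0)
print(h2, response)
```

With the same selection differential, doubling the additive variance (at constant phenotypic variance) doubles the expected response, which is the sense in which response is proportional to narrow sense heritability.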

b. Dominance Variance:

It arises due to the deviation from the additive scheme of gene action resulting from intra-allelic interaction i.e., interaction between alleles of the same gene or same locus. It is due to the deviation of heterozygote (Aa) from the average of two homozygotes (AA and aa).

Such genes show incomplete, complete or over-dominance. The dominance variance is associated with heterozygosis and, therefore, it is expected to be maximum in cross-pollinating crops and minimum in self-pollinating species.

Dominance variance is not fixable and, therefore, selection for traits controlled by such variance is not effective. Heterosis breeding may be rewarding in such a situation. Dominance variance differs from additive variance in several ways (Table 12.2).
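The distinction between additive and dominance variance can be checked numerically for a single locus with two alleles. The sketch below assumes a random mating population, allele frequencies p and q = 1 - p, and genotypic values +a (AA), d (Aa) and -a (aa); this parameterization and the function names are assumptions made for illustration, not notation from the text above.

```python
def one_locus_variances(p, a, d):
    """Additive and dominance variance at one bi-allelic locus under random mating."""
    q = 1.0 - p
    alpha = a + d * (q - p)          # average effect of a gene substitution
    v_a = 2.0 * p * q * alpha ** 2   # additive variance
    v_d = (2.0 * p * q * d) ** 2     # dominance variance
    return v_a, v_d


def total_genotypic_variance(p, a, d):
    """Variance of genotypic values computed directly from genotype frequencies."""
    q = 1.0 - p
    freqs = [p * p, 2.0 * p * q, q * q]   # AA, Aa, aa
    values = [a, d, -a]
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


# No dominance (d = 0): all genotypic variance is additive.
print(one_locus_variances(p=0.5, a=1.0, d=0.0))
# Complete dominance (d = a): part of the variance is dominance variance.
print(one_locus_variances(p=0.5, a=1.0, d=1.0))
```

When the heterozygote lies exactly midway between the homozygotes (d = 0), the dominance variance vanishes. In every case the two components add up to the total genotypic variance at the locus.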

c. Epistatic Variance:

It arises due to the deviation as a consequence of inter-allelic interaction, i.e., interaction between alleles of two or more different genes or loci. The epistatic variance is of three types, viz., (i) additive x additive, (ii) additive x dominance, and (iii) dominance x dominance. They differ from each other in several aspects (Table 12.3).

(i) Additive x Additive:

In this case both the interacting loci exhibit lack of dominance individually. It is denoted as A x A and is fixable.

(ii) Additive x Dominance:

It refers to interaction between two or more loci, one exhibiting lack of dominance and the other dominance individually. It is denoted as A x D and is non-fixable.

(iii) Dominance x Dominance:

In this type of epistasis both the interacting loci exhibit dominance individually. It is represented as D x D and is non-fixable.

The first type of epistasis is fixable and, therefore, selection is effective for traits governed by such variance. The last two types of epistatic variances are unfixable and, therefore, heterosis breeding may be rewarding for traits exhibiting such variance. In natural plant breeding populations, epistatic variance has the lowest magnitude. Epistatic variance differs in many aspects from dominance variance.

Wright (1935) suggested the partitioning of genetic variance into two components, viz., additive and non-additive (dominance and epistatic components), of which only the additive component contributes to genetic advance under selection.

Mather (1949) divided the phenotypic variance into three components, namely, (1) heritable fixable (additive variance), (2) heritable non-fixable (dominance and epistatic components), and (3) non-heritable non-fixable (Environmental fraction).

In fact, the heritable fixable component of phenotypic variance will include the additive x additive fraction of the epistatic variance as well. Further, the total phenotypic variance may be partitioned as (1) fixable (additive and additive x additive components) and (2) non-fixable (dominance, additive x dominance and dominance x dominance types of epistasis and environmental fraction) components.

The above discussion may be summarized as follows:

VP = VG + VE

VG = VA + VD + VI

VI = VAA + VAD + VDD

Where VP = phenotypic variance, VG = genotypic variance, VA = additive variance, VD = dominance variance, VI = epistatic variance, VAA = additive x additive variance, VAD = additive x dominance variance, and VDD = dominance x dominance variance.
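The partitioning summarized above can be written out as a short sketch. The component values are hypothetical and chosen only to show how the totals and the fixable and non-fixable fractions are assembled.

```python
# Hypothetical variance components, for illustration only.
components = {
    "VA": 30.0,   # additive
    "VD": 10.0,   # dominance
    "VAA": 4.0,   # additive x additive
    "VAD": 3.0,   # additive x dominance
    "VDD": 3.0,   # dominance x dominance
    "VE": 50.0,   # environmental
}

v_i = components["VAA"] + components["VAD"] + components["VDD"]  # epistatic variance
v_g = components["VA"] + components["VD"] + v_i                  # genotypic variance
v_p = v_g + components["VE"]                                     # phenotypic variance

# Fixable versus non-fixable parts of the phenotypic variance.
fixable = components["VA"] + components["VAA"]
non_fixable = (components["VD"] + components["VAD"]
               + components["VDD"] + components["VE"])

print(v_i, v_g, v_p, fixable, non_fixable)
```

The fixable and non-fixable parts together account for the whole phenotypic variance, so either grouping is just a different way of summing the same components.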

In homozygous genotypes, the genetic variance is of additive (A) and additive epistatic (AA) types, while in the segregating populations all the three types of genetic variances, viz., additive, dominance and epistasis are observed. In F2, the phenotypic variance has 1/2D (additive) and 1/4H (non-additive) components.
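The 1/2D and 1/4H coefficients for F2 can be verified for a single locus. The sketch assumes genotypic values +d (AA), h (Aa) and -d (aa) measured from the mid-parent, F2 frequencies of 1/4, 1/2 and 1/4, and that D and H stand for the sums of d squared and h squared over independent loci without epistasis; these conventions are assumptions stated here, not spelled out in the text above.

```python
from fractions import Fraction


def f2_genetic_variance(d, h):
    """Genetic variance in F2 for one locus with values +d (AA), h (Aa), -d (aa)."""
    freqs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
    values = [d, h, -d]
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


# Hypothetical effects for one locus.
d, h = Fraction(3), Fraction(2)
direct = f2_genetic_variance(d, h)
formula = Fraction(1, 2) * d ** 2 + Fraction(1, 4) * h ** 2
print(direct, formula)
```

Exact fractions are used so that the direct calculation and the formula can be compared without rounding error.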

In a random mating population with no epistasis and zero inbreeding, the covariance between a parent and its offspring is 1/2 VA, the covariance among half-sibs is 1/4 VA, and the covariance among full-sibs is 1/2 VA + 1/4 VD. These relationships change with the level of inbreeding in the population.
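These covariance relationships can be inverted to recover VA and VD from observed sib covariances. The sketch below is a minimal illustration under the same assumptions (random mating, no epistasis, zero inbreeding); the function name and the input values are hypothetical.

```python
def components_from_sib_covariances(cov_half_sibs, cov_full_sibs):
    """Solve cov(HS) = 1/4 VA and cov(FS) = 1/2 VA + 1/4 VD for VA and VD."""
    v_a = 4.0 * cov_half_sibs
    v_d = 4.0 * (cov_full_sibs - 0.5 * v_a)
    return v_a, v_d


# Hypothetical covariances, consistent with VA = 40 and VD = 8.
v_a, v_d = components_from_sib_covariances(cov_half_sibs=10.0, cov_full_sibs=22.0)
cov_parent_offspring = 0.5 * v_a
print(v_a, v_d, cov_parent_offspring)
```

In practice the covariances are themselves estimates with sampling error, so the derived VD in particular can be imprecise or even negative.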

Genetic variability for important agronomic traits in almost all the crops is mainly due to the additive genetic variance. The non-additive variance also exists in nearly all crops and for many important traits, but it is generally smaller in magnitude than the additive component.

The variability present in genetic populations can be assessed in four different ways: (1) using simple measures of variability, (2) by variance component analysis, (3) by D² statistics, and (4) by metroglyph analysis. For details of these procedures, refer to Singh and Narayanan (1993).

#### 8. Significance of Polygenes:

Polygenes are of prime importance to the plant breeder for evolving improved cultivars. Polygenes also have great evolutionary significance. They provide variation for fine adjustment and are systems of smooth adaptive change and of speciation.

The potential genetic variability is stored in the form of linked polygenic complexes. Such stores bear mixtures of plus and minus alleles. The potential or hidden variability is released, after inter-mating of such genotypes with other genotypes, due to segregation and recombination.

Mather (1943) has nicely explained the mechanism of storage and release of polygenic variability. It is believed that in natural populations, the best adapted or the fit individuals are those that are close to the population mean for various quantitative traits.

Mather recognized two types of variability, viz:

1. Free Variability:

It refers to phenotypic differences between homozygotes with extreme phenotypes. Such variability is expressed and exposed to selection. Natural selection acts against extreme phenotypes.

2. Potential Variability:

It refers to hidden or bound variability in the heterozygotes, or in the homozygotes which do not have the extreme phenotype, and which is therefore not exposed to selection.

It is of two types as given below:

Heterozygotic Potential Variability:

This type of variability is stored in heterozygote, e.g., AaBb. Such heterozygotes are phenotypically uniform and are very close to the population mean. However, they would produce extreme phenotypes in the next generation due to segregation and recombination. Thus, the heterozygotes function as stores of variability which is released slowly as free variability due to segregation and recombination.

Homozygotic Potential Variability:

Homozygotes also function as stores of variability. For example, two gene homozygotes AAbb and aaBB may be expected to cluster around the mean of the population. They would, therefore, be protected from natural selection and would be phenotypically uniform.

However, they would produce the extreme phenotypes AABB and aabb after crossing, i.e., AAbb x aaBB followed by segregation and recombination. The release of this type of variability is slow because it must first be converted into heterozygotic potential variability through hybridization and then it is released as free variability.
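The two-step release described above (hybridization, then segregation) can be traced by enumerating gametes. The sketch assumes two unlinked loci with independent assortment, which is a simplification made only for this illustration; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """All gametes of a genotype such as ("Aa", "Bb"), one allele per locus."""
    return ["".join(alleles) for alleles in product(*genotype)]


def offspring_distribution(parent1, parent2):
    """Offspring genotype frequencies, assuming independent assortment."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # Sort the allele pair at each locus so that "aA" and "Aa" are the same genotype.
        child = tuple("".join(sorted(x + y)) for x, y in zip(g1, g2))
        counts[child] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Step 1: hybridization converts homozygotic into heterozygotic potential variability.
f1 = offspring_distribution(("AA", "bb"), ("aa", "BB"))
# Step 2: segregation in F2 releases the extreme genotypes as free variability.
f2 = offspring_distribution(("Aa", "Bb"), ("Aa", "Bb"))
print(f1)
print(f2[("AA", "BB")], f2[("aa", "bb")])
```

Under these assumptions the F1 is uniformly AaBb, and each extreme genotype (AABB and aabb) appears in only 1/16 of the F2. Linkage in the repulsion phase, discussed next, would make the extreme genotypes rarer still.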

In case of polygenic traits, several genes governing a character may be present on the same chromosome. It would be advantageous to the population if these genes were linked in the repulsion phase, i.e., some dominant genes were linked with some recessive genes.

For example, out of the three schemes for the arrangement of four genes, A, B, C, and D, given below, scheme number three would be the most desirable, because in this scheme the full release of variability would require three crossovers at precise points (marked x).

It may be expected that natural populations would develop such complex and elaborate gene arrangement for storing variability. This would permit them to meet the opposing demand of immediate fitness and long term evolutionary requirements.

This mechanism of storage and release of genetic variability in the form of polygenic complexes permits response to selection in a new direction. The linkage among polygenes is useful: it reduces the immediate response to selection but prolongs the response, owing to the slow release of potential genetic variability in the segregating generations.

## Alpha-5/alpha-3 nicotinic receptor subunit alleles increase risk for heavy smoking

Twin studies indicate that additive genetic effects explain most of the variance in nicotine dependence (ND), a construct emphasizing habitual heavy smoking despite adverse consequences, tolerance and withdrawal. To detect ND alleles, we assessed cigarettes per day (CPD) regularly smoked in two European populations via whole genome association techniques. In these approximately 7500 persons, a common haplotype in the CHRNA3-CHRNA5 nicotinic receptor subunit gene cluster was associated with CPD (nominal P = 6.9 × 10⁻⁵). In a third set of European populations (n = approximately 7500), which had been genotyped for approximately 6000 SNPs in approximately 2000 genes, an allele in the same haplotype was associated with CPD (nominal P = 2.6 × 10⁻⁶). These results (in three independent populations of European origin, totaling approximately 15 000 individuals) suggest that a common haplotype in the CHRNA5/CHRNA3 gene cluster on chromosome 15 contains alleles which predispose to ND.

### Figures

A diagram of the CHRNA3 gene on Chromosome 15 is shown, with base…