Nina Stoletzki,* John Welch, Joachim Hermisson,* and Adam Eyre-Walker

*Section of Evolutionary Biology, Department Biology II, Ludwig-Maximilians-University Munich, Planegg-Martinsried, Germany; and Centre for the Study of Evolution, University of Sussex, Brighton, United Kingdom

It has been suggested that volatility, the proportion of mutations which change an amino acid, can be used to infer the level of natural selection acting upon a gene. This conjecture is supported by a correlation between volatility and the rate of nonsynonymous substitution (dN), or the ratio of nonsynonymous and synonymous substitution rates, in a variety of organisms. These organisms include yeast, in which the correlations are quite strong. Here we show that these correlations are a by-product of a correlation between synonymous codon bias toward translationally optimal codons and dN. Although this analysis suggests that volatility is not a good measure of selection, we suggest that it might be possible to infer something about the level of natural selection, from a single genome sequence, using translational codon bias.
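As a rough illustration of the volatility definition above, the following Python sketch computes the fraction of single-nucleotide changes, ignoring those that create a stop codon, that alter the encoded amino acid. It assumes the standard genetic code and gives every point mutation equal weight (no transition:transversion weighting). The function names are hypothetical, and pooling the counts over all codons of a gene is only one reading of "proportion of mutations"; it is not the authors' own implementation.

```python
from fractions import Fraction

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def mutation_counts(codon):
    """Return (amino-acid-changing, non-stop) counts over the nine point mutations."""
    original = CODON_TABLE[codon]
    changing = non_stop = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a mutation
            mutant = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if mutant == "*":
                continue  # mutations to stop codons are excluded
            non_stop += 1
            if mutant != original:
                changing += 1
    return changing, non_stop


def codon_volatility(codon):
    """Unweighted volatility of one sense codon (assumption: equal mutation weights)."""
    changing, non_stop = mutation_counts(codon)
    return Fraction(changing, non_stop)


def gene_volatility(sequence):
    """Pooled volatility over the sense codons of an in-frame coding sequence."""
    changing = non_stop = 0
    for start in range(0, len(sequence) - len(sequence) % 3, 3):
        codon = sequence[start:start + 3]
        if CODON_TABLE[codon] == "*":
            continue  # skip the terminal stop codon
        c, n = mutation_counts(codon)
        changing += c
        non_stop += n
    return Fraction(changing, non_stop)
```

Under these equal-weight assumptions the arginine codon AGA scores 3/4 while the synonymous codon CGA scores 1/2, so a gene's volatility shifts with its synonymous codon choice alone.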
Key words: volatility, codon bias, selection, nonsynonymous

E-mail: firstname.lastname@example.org; a.c.eyre-walker@

Mol. Biol. Evol. 22(10):2022–2026. 2005
doi:10.1093/molbev/msi192
Advance Access publication June 15, 2005

© The Author 2005. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: email@example.com

Understanding the nature of natural selection on DNA sequences is one of the central goals of molecular evolution. Plotkin, Dushoff, and Fraser (2004) and Plotkin et al. (2004) have recently suggested that it is possible to infer the level of natural selection, both positive and negative, acting upon a gene from a single genome sequence. They suggest that this can be achieved by measuring "volatility"; volatility is the proportion of point mutations in a gene, excluding those that yield a stop codon, that change an amino acid. They base their method on the prediction that genes which have recently undergone amino acid substitutions should be populated by codons with high volatility (Plotkin et al. 2004). In support of their thesis they show that in both Mycobacterium and Saccharomyces species, there is a correlation between volatility and the rate of nonsynonymous substitution (dN) or the ratio of nonsynonymous and synonymous substitution rates (dN/dS). This correlation is quite strong in yeast, which suggests that volatility might be a useful measure of selection.

The idea that volatility can measure the level of selection, either positive or negative, on a gene has been criticized on a number of grounds (Dagan and Graur 2004; Friedman and Hughes 2004; Sharp 2004; Chen, Emerson, and Martin 2005; Hahn et al. 2005; Nielsen and Hubisz 2005; Zhang 2005). Much of the debate has centered around the reasons why volatility is not expected to correlate to dN and dN/dS. For example, it has been suggested that volatility is unlikely to measure selection because (1) it only depends on four or five amino acids (Dagan and Graur 2004; Sharp 2004; Chen, Emerson, and Martin 2005), (2) it has low variance (Dagan and Graur 2004), and (3) simple models of evolution fail to yield a correlation between dN/dS and volatility (Dagan and Graur 2004; Nielsen and Hubisz 2005; Zhang 2005). However, volatility is correlated to dN/dS (and dN); so much of this discussion, while interesting, is slightly tangential. The crucial question is why.

Almost all of these critiques point out that volatility is a measure of codon usage bias. As such, the apparent correlation between volatility and dN/dS in yeast may, in fact, as Hahn et al. (2005) suggest, be due to a correlation between translational codon bias and dN/dS. Although Hahn et al. suggest that the correlation between volatility and dN/dS may be due to a correlation between translational codon bias and selective constraint, they do not resolve whether this is the case. They show that a measure of translational codon bias, the codon adaptation index (CAI), explains more of the variance in volatility than dN/dS in yeast, but they do not pursue the matter further. Plotkin, Dushoff, and Fraser (2005) investigate the partial correlation between dN/dS and volatility controlling for CAI and show it is significant, but they fail to give the magnitude of the effect.

It therefore remains very unclear whether the principal correlation is between dN/dS and translational codon bias, with the correlation between dN/dS and volatility a by-product of this, or whether the principal correlation is between dN/dS and volatility. Also, it might be that both translational codon bias and volatility separately correlate to dN/dS.

To investigate the matter further, we take advantage of the fact that in yeast there is a strong correlation between dN/dS (or dN) and both translational codon bias (Pal, Papp, and Hurst 2001) and volatility (Plotkin, Dushoff, and Fraser 2004) and that in yeast some of the translationally optimal codons have relatively high volatility while others have relatively low volatility (table 1). It is well established that codon bias and gene expression are correlated in yeast (see e.g., Coghlan and Wolfe 2000). So volatility per amino acid is expected to increase (Ile, Leu, and Ser) or decrease (Arg and Gly) with translational codon bias or expression level (table 1). For example, the most optimal codon in yeast for arginine is AGA, which has relatively high volatility. If the principal correlation is between dN/dS (or dN) and translational codon bias, then we expect AGA usage to be negatively correlated to dN/dS (or dN), but if the principal correlation is between dN/dS (or dN) and volatility, then we expect AGA usage to be positively correlated to dN/dS (or dN).

Table 1. Synonymous Codon Use of Volatility-Affecting Amino Acids
a Given Relative Synonymous Codon Usage values of Kliman, Naheelah, and Santiago (2003).
b Given Relative Synonymous Codon Usage values of Sharp and Cowe (1991).

Our results are unequivocal; in yeast dN/dS (and dN) is negatively correlated to the use of translationally optimal codons for all amino acids whose synonymous codons differ in their volatility, even in those whose optimal codons have high volatility. We further show that the correlation between dN/dS (or dN) and translationally optimal codon use is universal across all amino acids, including those whose synonymous codons do not differ in their volatility. The observed correlation between dN/dS (or dN) and volatility is a by-product of the correlation between dN (or dN/dS) and translational codon bias.

Materials and Methods

We downloaded the gene alignments from the four yeast species sequenced by Kellis et al. (2003). From these we excluded all genes which were not present in all the four yeast species (Saccharomyces cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, and Saccharomyces bayanus), which did not have start and stop codons in all species, which had premature stop codons, and which had frameshifting indels. This left 1,077 genes. This is smaller than the data set analyzed by Plotkin, Dushoff, and Fraser (2005) but has less chance of containing pseudogenes.

Plotkin, Dushoff, and Fraser (2004) suggest using a statistic, volatility P value, to measure volatility. This is the probability of a gene having the observed volatility given the average synonymous codon use of the genes in the genome. The volatility P measure of Plotkin, Dushoff, and Fraser (2004) is unlikely to be a very good statistic because it will depend to some extent on gene length (Sharp 2004) and amino acid composition (Dagan and Graur 2004; Zhang 2005): any statistic based on probability values depends on sample size, and the variance between synonymous codons for volatility differs between amino acids. To account for these shortcomings, we calculated an alternative statistic, the average volatility …, where Xi is the number of times codon i is used for the amino acid aa, Vi is the volatility of that codon, and n is the number of amino acids whose synonymous codons differ in their volatility. When considering amino acids separately we used Vaa, the average volatility per amino acid. Note that the volatility is only affected by five amino acids whose synonymous codons differ in their volatility: Arg, Gly, Ile, Leu, and Ser (the codons of Ile only differ when the transition:transversion ratio is different from unity). We assume that the transition:transversion ratio = 4.1, to calculate the volatilities of individual codons, as suggested by Plotkin et al. We compute Plotkin's volatility P values using …

We measured translational codon bias per gene and per amino acid separately. To measure translational codon bias, we computed the CAI according to Sharp and Li (1987) with the corrections suggested by Bulmer (1988). We also calculated the frequency of optimal codons (FOP) according to the list of optimal codons for S. cerevisiae given by Kliman, Naheelah, and Santiago (2003). Volatility values and codon bias statistics were calculated for the S. cerevisiae sequence because this is the best studied of the four species.

We used PAML (Yang 1997) to compute dN, dS, and dN/dS for each gene using the F3×4 model in which codon frequencies are estimated from the nucleotide frequencies at the three codon positions. Because a physical definition of a site is more appropriate for the measurement of the synonymous substitution rate (dS), we express dS per codon (Bierne and Eyre-Walker 2003). We performed all our analyses on both dN and dN/dS. Although dN/dS is often regarded as a better measure of the selection acting upon nonsynonymous sites, it may not be in organisms, like yeast, in which there is selection on synonymous codon use. Indeed we note that there is a strong correlation between dS per codon and codon usage bias in our data (tables …).

Results

Confirming the analysis of Plotkin, Dushoff, and Fraser (2004), we found a highly significant correlation between the volatility P value of Plotkin, Dushoff, and Fraser (2004), or average volatility, and dN/dS (or dN) per gene (table 2). We also confirm the result of Pal, Papp, and Hurst (2001) that there is a strong correlation between measures of translational codon bias (FOP and CAI) and dN/dS.

Table 2. Spearman's Rank Correlation Coefficients Between dN, dN/dS, dS Per Codon, or dS and Volatility or Translational Codon Usage Bias for Each Gene
a Note that Plotkin's volatility P value relates inversely to volatility.
*** P < 0.001.

So, is the observed correlation between volatility and dN/dS (or dN) due to the correlation between translational codon bias and dN/dS (or dN) or vice versa? To answer this, we look at the five volatility-affecting amino acids individually (table 3). We only observe a positive correlation between volatility and dN/dS (or dN) for three of the amino acids (Ile, Leu, and Ser). The two amino acids which show a negative correlation between volatility and dN/dS (or dN), opposing the expectation of Plotkin, Dushoff, and Fraser (2004), are those (Arg, Gly) for which high translational codon usage (in high expression genes) leads to low volatility (see table 1). There is also no indication that volatility affects the correlation; the correlation between translational codon bias and dN/dS (or dN) is as strong for Arg and Gly, …

Table 3. Spearman's Rank Correlation Coefficients Between dN, dN/dS, and dS Per Codon and Volatility or Translational Codon Usage Bias for the Five Amino Acids Affecting Volatility
** P < 0.01, *** P < 0.001, NS = not significant.

The correlation between translational codon bias and dN/dS (or dN) is very consistent across amino acids: for almost every amino acid the correlation is negative and often significant, and if it is positive, the correlation is small.

Table 4. Spearman's Rank Correlation Coefficients Between dN, dN/dS, and dS Per Codon and Translational Codon Usage Bias for the Individual Amino Acids Not Affecting Volatility
* P < 0.05, ** P < 0.01, *** P < 0.001, NS = not significant.

Discussion

We have shown that the observed correlation between dN/dS (or dN) and volatility is an incidental correlation caused by a correlation between dN/dS (or dN) and translational codon bias: dN/dS (or dN) correlates negatively with translational codon bias and volatility, for those amino acids in which the translationally optimal codons are high in volatility. This suggests that dN/dS (or dN) is not directly correlated to volatility and that volatility is therefore not the best, or even a good, predictor of dN/dS (or dN). This is not unexpected given recent theoretical work, which suggests that volatility will only be a measure of selection under rather specific conditions (Plotkin et al. 2004).

Our results may seem surprising given that Plotkin, Dushoff, and Fraser (2005) report a significant partial correlation between volatility P value and dN/dS in yeast using CAI to control for translational codon bias, a result we can confirm on our smaller data set (table 5). However, volatility P value is not normally distributed, so the probability of the partial correlation is not necessarily accurate, and the significance of the partial correlation depends critically on the volatility statistic used. If we use our average volatility instead of the volatility of Plotkin et al., which will depend to some extent on gene length and amino acid composition (see Materials and Methods), then the partial correlation between dN/dS (or dN) and average volatility, controlling for translational codon bias, becomes very small and nonsignificant, while the partial correlation between dN/dS (or dN) and translational codon bias remains (table 5). The strongest correlations, either simple or partial, that we observe are between translational codon bias and dN/dS (or dN), which suggests that these are the primary correlations (tables 2 and 3).

Table 5. Partial Correlations of Measures of Translational Codon Bias and Measures of Volatility for Each Gene with dN and dN/dS
* P < 0.05, *** P < 0.001, NS = not significant.

It is also interesting to note that the correlation between codon bias and dN is consistently stronger than the correlation between codon bias and dN/dS. This is probably due to the fact that dS is correlated to codon bias and that this correlation is due to selection on codon usage bias and not variation in the mutation rate.

Although volatility does not appear to be a good measure of selection, Plotkin, Dushoff, and Fraser (2004) may have been correct in asserting that it may be possible to infer something about dN in a gene from a single genome sequence. A negative correlation between translational codon bias and dN has now been described in three different organisms: enteric bacteria (Sharp 1991; Rocha and Danchin 2004), Drosophila (Akashi 1994; Betancourt and Presgraves 2002; Marais et al. 2004), and yeast (Pal, Papp, and Hurst 2001), and we have shown that the correlation is consistent for all amino acids in yeast. Furthermore, although the basis of this correlation is unknown and subject to much debate (Betancourt and Presgraves 2002; Marais et al. 2004), at least one of the explanations is likely to lead to the correlation being widespread. It has been suggested that the correlation between codon bias and dN arises through a correlation in the strength of selection acting upon synonymous and nonsynonymous mutations, probably as a consequence of selection for translational accuracy: important amino acid sites in a protein will be subject to strong selection to be conserved during evolution and to be accurately translated (Akashi 1994). Thus any genome in which selection for translational accuracy is effective should show the correlation, and it may therefore be possible to use codon bias, maybe in combination with other information, such as amino acid composition or structural data (Tourasse and Li 2000), to predict which genes are likely to be fast-evolving genes. So, although volatility has come in for much criticism, Plotkin and colleagues may have drawn our attention to an approach to an important …

Acknowledgments

We thank Daniel Jeffares for some initial work, Stephan Hutter and Pieter van Beek for help with Perl, and an anonymous referee for helpful comments.

Literature Cited

Akashi, H. 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics
Betancourt, A., and D. Presgraves. 2002. Linkage limits the power of natural selection in Drosophila. Proc. Natl. Acad. Sci. USA 99(21):13616–13620.
Bierne, N., and A. Eyre-Walker. 2003. The problem of counting sites in the estimation of the synonymous and nonsynonymous substitution rates: implications for the correlation between synonymous substitution rate and codon usage bias. Genetics
Bulmer, M. 1988. Are codon usage patterns in unicellular organisms determined by selection mutation balance? J. Evol. Biol.
Chen, W., J. J. Emerson, and T. M. Martin. 2005. Not detecting selection using a single genome. Nature 433:E6–E7.
Coghlan, A., and K. H. Wolfe. 2000. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces
Dagan, T., and D. Graur. 2004. The comparative method rules! Codon volatility cannot detect positive Darwinian selection using a single genome sequence. Mol. Biol. Evol. 22:1260–1272.
Friedman, R., and A. L. Hughes. 2004. Codon volatility as an indicator of positive selection: data from eukaryotic genome comparisons. Mol. Biol. Evol. 22:542–546.
Hahn, M., J. G. Mezey, D. J. Begun, J. H. Gillespie, A. D. Kern, C. H. Langley, and L. Moyle. 2005. Codon bias and selection on single genomes. Nature 433:E5.
Kellis, M., N. Patterson, M. Endrizzi, and E. S. Lander. 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423:241–254.
Kliman, R. M., I. Naheelah, and M. Santiago. 2003. Selection conflicts, gene expression, and codon usage trends in yeast.
Marais, G., T. Domazet-Loso, D. Tautz, and B. Charlesworth. 2004. Correlated evolution of synonymous and nonsynonymous sites in Drosophila. J. Mol. Evol. 59:771–779.
Nielsen, R., and M. J. Hubisz. 2005. Detecting selection needs
Pal, C., B. Papp, and L. D. Hurst. 2001. Highly expressed genes in yeast evolve slowly. Genetics 158:927–931.
Plotkin, J. B., J. Dushoff, M. M. Desai, and H. B. Fraser. 2004. Synonymous codon usage and selection on proteins.
Plotkin, J. B., J. Dushoff, and H. B. Fraser. 2004. Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum. Nature 428:942–945.
Plotkin, J. B., J. Dushoff, and H. B. Fraser. 2005. Reply. Nature 433:E7–E8.
Rocha, E. P. C., and A. Danchin. 2004. An analysis of determinants of amino acid substitution rates in bacterial proteins. Mol.
Sharp, P. M. 1991. Determinants of DNA sequence divergence between Escherichia coli and Salmonella typhimurium: codon usage, map position, and concerted evolution. J. Mol. Evol.
Sharp, P. M. 2004. Gene "volatility" is most unlikely to reveal adaptation. Mol. Biol. Evol. 22:807–809.
Sharp, P. M., and E. Cowe. 1991. Synonymous codon usage in Saccharomyces cerevisiae. Yeast 7:657–678.
Sharp, P. M., and W.-H. Li. 1987. The codon adaptation index: a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res.
Tourasse, N., and W.-H. Li. 2000. Selective constraints, amino acid composition and the rate of protein evolution. Mol. Biol.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci.
Zhang, J. 2005. On the evolution of codon volatility. Genetics
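The by-product argument made in the text, that a simple correlation between volatility and dN/dS can vanish once translational codon bias is controlled for, can be illustrated with the standard first-order partial correlation, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The Python sketch below is only an illustration under stated assumptions: it uses Pearson coefficients, the function names are hypothetical, and the six data points are made up so that both variables depend on codon bias alone. The source does not state how the partial correlations in table 5 were computed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up toy values (not data from the paper): volatility rises and dN/dS
# falls with codon bias, with noise terms unrelated to bias or to each other.
codon_bias = [1, 2, 3, 4, 5, 6]
volatility = [2, 1, 3, 4, 4, 7]
dn_ds = [10, 9, 5, 4, 6, 5]

simple = pearson(volatility, dn_ds)
partial = partial_correlation(volatility, dn_ds, codon_bias)
```

In this toy data set volatility and dN/dS are clearly negatively correlated, yet the partial correlation controlling for codon bias is zero up to rounding error, because both variables were constructed to depend on codon bias only.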