A review of the impact of genomic selection with SNPs on animal breeding
The purpose of this report was to assess the effect of genomic selection (GS) on the animal breeding industry. The report discusses the molecular theory behind the technology, the advances which facilitated its development and the key features which make it desirable to the animal breeding industry. The report focuses on the impact GS is currently having on three agricultural sectors and its likely implementation in the future. It concludes that the technology has great potential to facilitate genetic gains in the dairy and pig industries, which are well structured for its easy implementation, but that its use in the sheep industry is limited.
The traditional infinitesimal model of quantitative traits states that the phenotype is governed by an infinite number of genes, each with a small effect on the trait, as well as an external environmental effect. Under this model it is not possible to make selection decisions based on specific loci so, traditionally, selection has been based on a prediction of the overall effect of each animal’s genes, its estimated breeding value (EBV) (Goddard, 2009), derived from a combination of phenotypic and pedigree information (Luan et al., 2009).
In 2001, genomic selection (GS) was suggested as a means to accurately select superior breeding stock from values predicted by dense marker data (Verbyla et al., 2009). The method uses a reference population of genotyped and phenotyped individuals to develop a model that can be applied to the wider population. This model allows the production of accurate genomic estimated breeding values (GEBVs) for the rest of the untested population (Jannink et al., 2010) based on an estimation of the marker effects throughout the entire genome combined with marker genotypes (Luan et al., 2009). GS assumes that all the genetic variance is explained by markers (Luan et al., 2009) and that at least one marker is in linkage disequilibrium (LD) with each quantitative trait locus (QTL). This allows for estimation of breeding values without phenotypic information gathered from the wider population and represents, potentially, a significant reduction in costs and labour since performance recording is not required for entire populations (Luan et al., 2009) or subsequent generations (Hayes et al., 2009a).
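The workflow described above can be sketched in code. The example below is a minimal illustration only, assuming that marker effects are estimated by ridge regression on centred genotypes (one common choice, sometimes called SNP-BLUP, which the sources cited here do not prescribe). The function names, the 0/1/2 genotype coding, the ridge parameter and all data values are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_marker_effects(genotypes, phenotypes, ridge=1.0):
    """Estimate one effect per marker from a genotyped and phenotyped reference population.

    Assumption: ridge regression, b = (X'X + ridge * I)^-1 X'y, on centred data.
    """
    n, p = len(genotypes), len(genotypes[0])
    marker_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    mean_phenotype = sum(phenotypes) / n
    x = [[row[j] - marker_means[j] for j in range(p)] for row in genotypes]
    y = [value - mean_phenotype for value in phenotypes]
    xtx = [[sum(x[i][j] * x[i][k] for i in range(n)) + (ridge if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(x[i][j] * y[i] for i in range(n)) for j in range(p)]
    return solve(xtx, xty), marker_means, mean_phenotype


def gebv(genotype, effects, marker_means):
    """GEBV of a genotyped animal, expressed as a deviation from the reference mean."""
    return sum((g - m) * b for g, m, b in zip(genotype, marker_means, effects))


# Hypothetical reference population: 6 animals, 3 markers coded 0/1/2.
reference_genotypes = [[0, 1, 2], [1, 0, 2], [2, 1, 0], [0, 2, 1], [1, 1, 1], [2, 0, 0]]
reference_phenotypes = [10.4, 12.1, 13.2, 8.3, 11.6, 14.1]

effects, marker_means, mean_phenotype = estimate_marker_effects(
    reference_genotypes, reference_phenotypes, ridge=1.0)

# A selection candidate with a genotype but no phenotype record.
candidate = [2, 0, 1]
print("Estimated marker effects:", [round(b, 3) for b in effects])
print("Candidate GEBV:", round(gebv(candidate, effects, marker_means), 3))
```

The candidate animal needs only a genotype, which mirrors the point that performance recording is confined to the reference population. Real evaluations involve many thousands of markers and a ridge parameter derived from variance components rather than the arbitrary value used here.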
Initially, the calculation of GEBVs requires a prediction equation based on markers. As such, the genome is divided into small segments and the genotypic and phenotypic data from the reference population are used to determine the effect each locus contributes to overall genetic variation (Hayes et al., 2009a). Markers at low density are in low LD with QTL and consequently account for a lower proportion of the genetic variance (Verbyla et al., 2009). The proportion of variance for the QTL that is explained by the markers can be used to infer the genotype of each animal at each QTL, since it depends on the extent of LD, which is known to decline as the distance between two loci increases (Goddard & Hayes, 2009). The true breeding value then is the sum of all the effects of each QTL (Goddard, 2009) and subsequent generations need only be genotyped for the markers to determine which chromosome segments they carry (Hayes et al., 2009a).
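The relationship between LD and the variance explained by a marker can be made concrete with the standard two-locus statistics. The sketch below assumes the usual definitions D = p_AB - p_A * p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)), where r² is commonly interpreted as the proportion of variance at one locus explained by the other. It also assumes the textbook expectation that D falls by a factor (1 - c) each generation under random mating in a large population, where c is the recombination fraction. Drift, selection and mutation are ignored, and the allele frequencies and function names are hypothetical.

```python
def ld_coefficient(p_ab, p_a, p_b):
    """D: excess frequency of the A-B haplotype over that expected under independence."""
    return p_ab - p_a * p_b


def r_squared(d, p_a, p_b):
    """Squared correlation between alleles at a marker and a QTL."""
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def decayed_d(d0, recombination_fraction, generations):
    """Expected D after a number of generations (assumption: no drift or selection)."""
    return d0 * (1.0 - recombination_fraction) ** generations


# Hypothetical starting point: marker allele and QTL allele both at frequency 0.5,
# found together on 45% of haplotypes.
p_marker, p_qtl, p_haplotype = 0.5, 0.5, 0.45
d0 = ld_coefficient(p_haplotype, p_marker, p_qtl)
print("Initial r^2:", round(r_squared(d0, p_marker, p_qtl), 3))

# Allele frequencies stay constant under these assumptions, so only D changes.
for c in (0.001, 0.01, 0.1):
    d_t = decayed_d(d0, c, generations=20)
    print("c =", c, "r^2 after 20 generations:", round(r_squared(d_t, p_marker, p_qtl), 3))
```

Larger recombination fractions correspond to loci that are further apart, so the loop shows why widely spaced markers capture less of the QTL variance than dense ones.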
An alternative method to treating haplotypes of markers as though they were QTL is to assume each gamete carries a different QTL allele for which the effects can be estimated, based on the markers surrounding it. Here, a linkage analysis is used to determine the likelihood of any two alleles being IBD (identical by descent) by assuming an evolutionary model for LD between markers and QTL and applying it to an assessment of the similarity of the marker alleles surrounding the QTL. The result is an IBD matrix which can be further employed to estimate the effects of all the QTL alleles. It should be noted that with both approaches, the accuracy of GEBVs can be compromised by mistaken positioning of the markers on the genome (Goddard & Hayes, 2009).
Features of Genomic Selection
Genomic selection has a number of advantages over traditional methods for estimating breeding values. The correlation of predicted breeding values between relatives traditionally leads to increased selection of breeding stock from a few genetically superior families with a consequent decline in genetic diversity because information garnered from relatives fails to determine the values of the specific alleles progeny receive from their parents. GS overcomes this issue since specific alleles are in LD with markers that have estimated effects, so the correlation between predicted breeding values of relatives is lower. As such, GS reduces the loss of genetic diversity and facilitates greater long-term gains through increased accuracy of individual breeding values based on genomic data (Jannink et al., 2010).
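A small numerical illustration may help here. Without own records or progeny, a pedigree-based prediction for a young animal reduces to the average of its parents' breeding values, so full siblings receive identical predictions. A marker-based prediction differs between siblings because it reflects the alleles each one actually inherited. The marker effects, genotypes (coded 0/1/2) and function names below are hypothetical and serve only as a sketch.

```python
def parent_average(sire_bv, dam_bv):
    """Pedigree-only prediction for an animal with no records of its own."""
    return 0.5 * (sire_bv + dam_bv)


def genomic_value(genotype, marker_effects):
    """Sum of marker genotype (0/1/2) times estimated marker effect (uncentred)."""
    return sum(g * b for g, b in zip(genotype, marker_effects))


# Hypothetical estimated effects for four markers.
marker_effects = [0.4, -0.2, 0.1, 0.3]

# Both parents are heterozygous at every marker, so offspring may be 0, 1 or 2.
sire = [1, 1, 1, 1]
dam = [1, 1, 1, 1]

# Two full siblings that inherited different alleles from the same parents.
sib_a = [2, 1, 0, 2]
sib_b = [0, 1, 2, 1]

pa = parent_average(genomic_value(sire, marker_effects),
                    genomic_value(dam, marker_effects))
print("Parent average, identical for both siblings:", round(pa, 2))
print("Genomic value of sibling A:", round(genomic_value(sib_a, marker_effects), 2))
print("Genomic value of sibling B:", round(genomic_value(sib_b, marker_effects), 2))
```

Because the genomic values separate the two siblings, selection no longer has to treat a whole family as equally good or equally poor, which is the mechanism behind the lower correlation between relatives described above.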
The method also offers producers the possibility of achieving selection gains with a reduction in costs. Production systems which participate in GS schemes no longer need to keep detailed phenotypic records for the entire herd or flock, thereby reducing labour in enterprises that currently undertake performance recording. Similarly, the ability to determine an animal’s genetic potential at birth rather than waiting for expression of a trait at puberty or death, or after sufficient numbers of progeny have been assessed for expression of that trait, will reduce generation interval, increase the rate of genetic gain and facilitate sustainable improvement, all of which can be utilised to increase efficiency and provide producers with better returns.
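The link between generation interval and rate of gain can be illustrated with the standard expression for annual genetic gain: selection intensity multiplied by accuracy of selection and additive genetic standard deviation, divided by generation interval in years. This formula is not given in the sources cited here and is included as a sketch. The numbers are invented for illustration and are not estimates for any real breeding programme.

```python
def annual_genetic_gain(intensity, accuracy, genetic_sd, generation_interval):
    """Expected genetic gain per year: i * r * sigma_A / L."""
    if generation_interval <= 0:
        raise ValueError("generation interval must be positive")
    return intensity * accuracy * genetic_sd / generation_interval


# Hypothetical scenario 1: selection after progeny testing
# (high accuracy, long generation interval).
progeny_test = annual_genetic_gain(intensity=2.0, accuracy=0.9,
                                   genetic_sd=1.0, generation_interval=6.0)

# Hypothetical scenario 2: selection of young animals on genomic breeding values
# (somewhat lower accuracy, much shorter generation interval).
genomic = annual_genetic_gain(intensity=2.0, accuracy=0.7,
                              genetic_sd=1.0, generation_interval=2.5)

print("Progeny-test scheme, gain per year:", round(progeny_test, 3))
print("Genomic scheme, gain per year:", round(genomic, 3))
```

In this invented scenario the shorter generation interval outweighs the lower accuracy, which is the argument made above for selecting animals at birth rather than waiting for trait expression or progeny records.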
Opportunities for Genomic Selection
Initially, utilisation of GS within the dairy industry was hampered by the fact that production and health traits in dairy cattle are governed by a large number of loci, which meant that only small gains were accessible when the numbers of available markers were limited (Hayes et al., 2009a). The subsequent discovery of many SNPs across the bovine genome both rectified this problem and had the added advantage of reducing the costs associated with genotyping, thus increasing the availability of GS to commercial producers and ensuring it was trialled as early as 2009 in countries such as Ireland (Kearney et al., 2009).
The dairy industry, especially when compared to sheep and beef, has always been particularly progressive in its acceptance of breeding technologies. Indeed, artificial insemination (AI), which has the safety and financial advantages of not requiring a bull to be maintained on farm, has been widely utilised and has resulted in the industry being dominated by a few large and competitive breeding companies (Foote, 2001). Traditionally these companies marketed bulls with EBVs calculated from progeny testing. The competitiveness of the industry and the easy availability of superior bulls through AI mean that sires need high accuracies, gained from large numbers of progeny, in order to sell well. This means that by the time there is enough data on a bull to make him marketable, the bull is already more than five years old and the company has had the financial burden of maintaining him and other inferior breeding stock during that time (Hayes et al., 2009a). GS offers financial benefits to breeding companies that will ultimately be able to genotype calves at birth and determine their genetic worth before they reach maturity, reducing the need to raise and test relatively large numbers of bulls to identify the few with desirable genes. It has been estimated that this will equate to a saving of up to 92 per cent for these breeding companies (Hayes et al., 2009a).
The advantage of calculating breeding values early extends to the rest of the industry. Availability of accurate breeding values for young stock means that producers will have access to bulls at a younger age, thus reducing the generation interval and facilitating faster rates of genetic improvement. It has been reported that the accuracies gained through GS of animals at birth can be compared with those gained after extensive progeny testing (Hayes et al., 2009a). Indeed, a number of studies have determined that GEBVs are more accurate than EBVs based on pedigree data (Jannink et al., 2010).
Unlike the dairy industry, breeding technologies have not been widely utilised in the British sheep industry. This is partly because sheep are generally run in low-input, extensive systems, or as a second or third enterprise. Another reason is that the industry’s profitability fluctuates, so that large investments in breeding technologies are not justified.
Nevertheless, there are a number of advantages to be gained by GS. Traits such as female fertility cannot be measured in sires; GEBVs for these traits would facilitate faster and more sustainable improvement in the industry without the need for excessive and labour-intensive data collection. The ability to determine animals with superior traits soon after birth offers further advantages to an industry in which one of the key productive traits is carcass composition, a trait which cannot be easily measured until death (Van Der Werf, 2009).
The advantages of a shortened generation interval are not as pronounced as in the dairy industry since fast genetic gains are routinely acquired using both ewe lambs and ram lambs as breeding stock. However, the ability to determine early which individuals express the desired trait and to what extent their progeny will express it facilitates a faster increase in target traits than breeding young stock based on pedigree or phenotype alone.
Compared with cattle, however, the benefit of GS may be considerably less in sheep. Unlike the dairy industry, in which principal production traits such as milk yield are expressed only in females, production traits such as carcass composition are observed in both the male and female. This means that selection for productive traits has already been made on both sexes before mating. Similarly, unlike cattle, desirable traits have high heritabilities so a high rate of gain can already be achieved through artificial selection. The structure of the sheep industry also means that there are fewer progeny-tested sires to contribute to the reference population, which could potentially affect the accuracy of genomic predictions (Van Der Werf, 2009).
Like the dairy industry, the pig industry is relatively uniform and widely utilises AI technologies which are provided by a few established competing breeding companies. As such, it is relatively well suited to embrace GS. The advantages include an increased accuracy of calculated breeding values, particularly amongst young animals on which selection decisions are made. This both reduces the generation interval and facilitates fast, sustainable genetic improvement (Hayes et al., 2009b), especially in lowly heritable traits (Van Der Werf, 2009).
In contrast to the dairy sector, where sires can be used for many years, the pig industry uses sires for a very short period. The result is that GS predictions of breeding values are hampered in much the same way as they are in the sheep industry – there is reduced availability of progeny-tested animals to form the reference population compared with the number of dairy animals available. Nevertheless, a comparison of GEBVs with traditionally calculated EBVs showed a deviation of EBVs from what was considered to be the individual’s ‘true’ breeding value, suggesting that even with a reduced reference population, GS still provides more accurate breeding values than traditional methods based on pedigree data alone (Hayes et al., 2009b).
Genomic selection clearly has a number of advantages for commercial livestock farmers in all three industries. Both the dairy and pig industries, which operate with very few breeds and are dominated by large companies that already run nuclei, are well suited for an easy transition to GEBVs. The nuclei will provide the necessary reference populations and the companies which run them are well situated to collect phenotypic and genotypic information from them. In contrast, the sheep industry is stratified and, as such, more diverse in its breed choices. This means that it is not currently well positioned to utilise the technology. Differences in the LD between markers and QTL across breeds mean that a single reference population would not produce breeding values with equal accuracy for all sectors of the industry. In order to effectively utilise the technology, a number of reference populations would therefore need to be established and recorded. Theoretically this may be possible in the hills and lowlands, where breeds are more uniform – although still very diverse across regions – but upland producers typically run mules for which the construction of a reference population would require specific breed crosses. Such an undertaking is unlikely to be considered commercially viable, especially since the advantages of GS to breed improvement are minimal when compared with the gains it affords the dairy industry. Uptake within the sheep industry will therefore likely be restricted to breed societies and pedigree breeders who have livestock more suited to forming a reference population and higher returns to justify its expense. The impact of the technology on commercial flocks will likely be limited to the occasional purchase of a pedigree tup with calculated GEBVs.
Genomic selection clearly has incredible potential for reducing inbreeding depression, increasing production efficiency and combating current reproductive issues. However, its application is not universal and its uptake is currently limited to industries such as pigs and dairy, which are well practised in utilising breeding technologies and whose structures suit a relatively easy introduction. As such, genomic selection is perhaps best described as being poised to revolutionise the animal breeding industry rather than actually currently revolutionising it.
References
Foote, R. H. (2001) The history of artificial insemination: Selected notes and notables. Early History of AI.
Goddard, M. (2009) Genomic selection: prediction of accuracy and maximisation of long term response. Genetica 136, 245–57.
Goddard, M. E. & Hayes, B. J. (2007) Genomic selection. Journal of Animal Breeding and Genetics 124, 323–30.
Hayes, B. J., Bowman, P. J., Chamberlain, A. J. & Goddard, M. E. (2009a) Invited review: genomic selection in dairy cattle: progress and challenges. Journal of Dairy Science 92, 433–43.
Hayes, B. J., Daetwyler, H. D., Bowman, P., Moser, G., Tier, B., Crump, R., Khatkar, M., Raadsma, H. W. & Goddard, M. E. (2009b) Accuracy of genomic selection: comparing theory and results. AAABG 18th Conference Proceedings, 34–37.
Jannink, J. L., Lorenz, A. J. & Iwata, H. (2010) Genomic selection in plant breeding: from theory to practice. Briefings in Functional Genomics 9, 166–77.
Kearney, F., Cromie, A. & Berry, D. P. (2009) Implementation and uptake of genomic evaluations in Ireland.
Luan, T., Woolliams, J. A., Lien, S., Kent, M., Svendsen, M. & Meuwissen, T. H. (2009) The accuracy of genomic selection in Norwegian red cattle assessed by cross validation. Genetics 183, 1119–26.
Van Der Werf, J. H. J. (2009) Potential benefit of genomic selection in sheep. Proceedings of the Association for the Advancement of Animal Breeding and Genetics 18, 38–41.
Verbyla, K. L., Hayes, B. J., Bowman, P. J. & Goddard, M. E. (2009) Accuracy of genomic selection using stochastic search variable selection in Australian Holstein Friesian dairy cattle. Genetics Research 91, 307–11.