Prunus persica (Peach)



About the genome:


Overview

Early Online Access to the Assembled Peach Genome for Browsing and BLASTing

The International Peach Genome Initiative (IPGI) would like to welcome you to the draft assembled and annotated peach genome (peach v1.0).

Peach v1.0 represents an initial draft of the assembled genome. While we believe peach v1.0 is a very high-quality plant genome, we are aware that it contains both known and unknown errors and discrepancies that will be addressed in upcoming releases of the genome. For instance, we are aware of a few minor cases where sequences have been assigned to the correct location but their orientation is in question. We hope and believe that any problems arising from these discrepancies are outweighed by the benefit of releasing the data rapidly. If you believe that you have identified a discrepancy in the data, please contact IPGI and we will be sure to address your concerns in an upcoming release.

This genome has been published and is available from Nature Genetics:

The International Peach Genome Initiative et al. (2013 Apr 26). "The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution." Nature Genetics 45, 487-494. | doi:10.1038/ng.2586. Epub 2013 Mar 24.

History
At the Plant and Animal Genome XV Meeting on January 16, 2007, Jerry Tuskan of the Joint Genome Institute (JGI) announced plans to sequence the peach genome. An international consortium (IPGI) has since formed to carry out the work cooperatively. This consortium, under the direction of Drs Bryon Sosinski, Ignazio Verde and Daniel Rokhsar, includes numerous researchers from countries around the globe, including the US, Italy (Drupomics), Spain and Chile. The specific roles of the participants will be outlined in the publication of the peach genome.
Background
Peach (Prunus persica) is considered one of the best genetically characterized species in the Rosaceae, and it has distinct advantages that make it suitable as a model genome species for Prunus as well as for other species in the Rosaceae. While some Prunus species, such as cultivated plums and sour cherries, are polyploid, peach is a diploid with n = 8 and has a comparatively small genome, currently estimated at ~220-230 Mbp based on the peach v1.0 assembly. Peach has a relatively short juvenility period of 2-3 years, compared with 6-10 years for most other fruit tree species. In addition, a number of genes for fundamentally important traits have been genetically described in peach, including genes controlling flower and fruit development, tree growth habit, dormancy, cold hardiness, and disease and pest resistance.
Genome facts and statistics
Peach v1.0 was generated from DNA of the doubled haploid cultivar 'Lovell', which means that the genes and intervening DNA are "fixed", or identical, for all alleles and both chromosomal copies of the genome. This doubled haploid nature was confirmed by the evaluation of >200 SSRs, and has facilitated a highly accurate and consistent assembly of the peach genome.

Peach v1.0 currently consists of 8 pseudomolecules (scaffolds) representing the 8 chromosomes of peach, numbered according to their corresponding linkage groups. The genome was sequenced to approximately 7.7-fold coverage by whole genome shotgun sequencing using the accurate Sanger methodology, and was assembled using Arachne. The assembled peach scaffolds cover nearly 99% of the peach genome, with over 92% having confirmed orientation. To further validate the quality of the assembly, 74,757 Prunus ESTs were queried against the genome at 90% identity and 85% coverage, and we found that only ~2% were missing. This is truly a high-quality genome! Gene prediction and annotation is an ongoing process that may take years to complete, but current estimates indicate that peach has a typical plant gene repertoire of approximately 35,000 protein-coding sequences.
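The EST check described above amounts to a threshold filter on alignments. The following is a minimal sketch in Python, assuming each alignment has already been reduced to an EST identifier, a percent identity and a percent coverage of the EST. The Alignment record, the function names and the toy data are hypothetical illustrations and are not taken from the IPGI pipeline.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """One EST-to-genome alignment (hypothetical record layout)."""
    est_id: str
    identity: float  # percent identity of the aligned region
    coverage: float  # percent of the EST length that is aligned


def placed_ests(alignments, min_identity=90.0, min_coverage=85.0):
    """Return the IDs of ESTs with at least one alignment passing both cutoffs."""
    return {
        a.est_id
        for a in alignments
        if a.identity >= min_identity and a.coverage >= min_coverage
    }


def percent_missing(alignments, total_ests, min_identity=90.0, min_coverage=85.0):
    """Percentage of queried ESTs with no alignment passing the cutoffs."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    placed = placed_ests(alignments, min_identity, min_coverage)
    return 100.0 * (total_ests - len(placed)) / total_ests


# Toy data: four ESTs were queried; est4 produced no alignment at all.
toy_alignments = [
    Alignment("est1", identity=95.0, coverage=90.0),  # passes both cutoffs
    Alignment("est2", identity=89.0, coverage=99.0),  # identity too low
    Alignment("est3", identity=92.0, coverage=80.0),  # coverage too low
    Alignment("est3", identity=91.0, coverage=86.0),  # second hit for est3 passes
]

if __name__ == "__main__":
    print(percent_missing(toy_alignments, total_ests=4))
```

An EST counts as placed if any one of its alignments passes both cutoffs, so multiple hits for the same EST are collapsed by collecting IDs in a set.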

Peach genome browsers are available at JGI and the Genome Database for Rosaceae, while the Italian version is hosted at the Istituto di Genomica Applicata (IGA). Access to the raw sequence data is provided via the Download Data button at the top of this page.

Once again, welcome to peach v1.0!

On behalf of IPGI and its collaborators,

Bryon Sosinski, NC State University (sosinski AT ncsu.edu)
Ignazio Verde, Consiglio per la Ricerca e la Sperimentazione in Agricoltura (ignazio.verde AT entecra.it)
Daniel Rokhsar, DOE Joint Genome Institute (dsrokhsar AT gmail.com)

Annotation

Transcript assemblies were constructed using PASA from Prunus persica ESTs (~88K) and ESTs of related species (~424K). Loci were determined by BLAT alignments of these transcript assemblies and/or BLASTX alignments of proteins from Arabidopsis thaliana, rice, soybean, grape and poplar to the repeat-soft-masked P. persica genome. Gene models were predicted by homology-based predictors, mainly FGENESH+, with GenomeScan used where FGENESH+ produced no model at a locus. Predicted genes were UTR-extended and/or improved by PASA. The final gene set was selected on the basis of EST support or protein homology support, and was then filtered to remove repeats/transposable elements.

Statistics

This release of Phytozome includes the JGI v1.0 gene annotation of assembly v1.0.

Genome Size
Approximately 227.3 Mb arranged in 202 scaffolds
Approximately 224.6 Mb arranged in 2730 contigs (~ 1.2% gap)
Scaffold N50 (L50) = 4 (26.8 Mbp)
Contig N50 (L50) = 294 (214.2 Kbp)
21 scaffolds larger than 50 Kbp, with 99.4% of the genome in scaffolds larger than 50 Kbp
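The N50/L50 pairs and the gap percentage above can be reproduced from a list of sequence lengths. The sketch below assumes the convention these figures appear to follow, a sequence count followed by a length: the count is the smallest number of sequences, taken longest first, whose combined length reaches half of the total, and the length is that of the last sequence added. The function names and the toy lengths are hypothetical. Only the 227.3 Mb and 224.6 Mb totals come from this page.

```python
def n50_stats(lengths):
    """Return (count, length): the number of longest sequences needed to
    reach half of the total length, and the length of the last one added."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return count, length


def gap_percent(scaffold_total, contig_total):
    """Share of the scaffold length that is not covered by contig sequence."""
    return 100.0 * (scaffold_total - contig_total) / scaffold_total


if __name__ == "__main__":
    # Toy lengths (not peach data): total 30, half 15, reached after 10 + 8.
    print(n50_stats([2, 4, 6, 8, 10]))          # (2, 8)
    # Totals reported on this page, in Mb.
    print(round(gap_percent(227.3, 224.6), 1))  # 1.2
```

With the totals reported here, (227.3 - 224.6) / 227.3 gives roughly 1.2%, which matches the gap figure quoted alongside the contig total.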
Loci
27864 loci containing protein-coding genes
Transcripts
28702 protein-coding transcripts