Plant Genome Database Release 2.7: Entering the era of one tera base pairs of plant genomes

Jongsun Park*, Mangi Kim, Hong Xi, Suhyeon Park, and Yunho Yun
Owing to next-generation and third-generation sequencing technologies, a large number of plant genomes have been sequenced and publicly released. To manage them systematically and efficiently, the Plant Genome Database (PGD; http://www.plantgenome.info/) was established and has been updated for more than 4 years. Here, we announce the recently updated Release 2.7, which contains 2,174 plant genomes from 713 species. The total length of the 2,174 plant genomes is 1.065 Tbp, an increase of 126 Gbp over the previous release (Release 2.6), marking the entry into the era of 1 Tbp of plant genomes. The total numbers of plant genes and ORFs from 427 genomes are 15,815,804 and 19,545,617, respectively, which have also increased by 4,339,606 and 5,653,593, respectively. In addition, 78,784,484 simple sequence repeats (SSRs) were identified from the 2,174 genomes using the SSRDB pipeline (http://ssrdb.infoboss.co.kr/). The PGD web interface provides plant genome sequences, SSRs, and functional domains predicted by InterProScan. PGD also provides BLAST searches against the specific dataset of each plant genome deposited in PGD. GlobalScrap® was also implemented to analyze various kinds of information on plant genomes, such as genome statistics and the distribution of functional domains in plant-specific genes. Taken together, PGD can serve as a standardized repository for understanding the current status of plant genomes as well as an integrated platform for plant comparative genomics research.
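To make the notion of SSR identification concrete, the following is a minimal sketch of how perfect tandem repeats can be located in a nucleotide sequence. It is not the SSRDB pipeline, whose method and parameters are not described here. The function name `find_ssrs`, the unit-length range (1 to 6 bp), and the minimum repeat count (3) are assumptions chosen only for illustration.

```python
import re


def find_ssrs(sequence, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    Illustrative only: the thresholds are assumed values, not those of
    any published pipeline. Imperfect or compound SSRs are not handled.
    """
    seq = sequence.upper()
    # A lazily matched unit of min_unit..max_unit bases, followed by at
    # least (min_repeats - 1) further copies of the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        repeat_count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeat_count))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence containing one dinucleotide repeat.
    for start, unit, count in find_ssrs("GGCATATATATCCG"):
        print(start, unit, count)
```

Because the repeat unit is matched lazily, the shortest unit that satisfies the repeat threshold is reported, and matches are non-overlapping and use 0-based start positions.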