R&D CENTER

Utilization of Medicinal Plant Genomes by Identification of Secondary Metabolites from Genomes

Jongsun Park, Kiman Lee, Suhyeon Park, Woochan Kwon, Hong Xi, Janghyuk Son, Taejin Kang
Owing to the rapid development of sequencing technologies, at least 2,174 plant genomes from 713 species have been sequenced and are available in the recently released Plant Genome Database (PGD; Release 2.7; http://www.plantgenome.info/). According to the PGD, 27 of 162 widely used herbal plant species and 52 of 559 herbal plants in the ‘herbal medicine list’ presented by the Ministry of Food and Drug Safety have at least one sequenced genome. The genome sizes of these medicinal plants range from 145 Mb (Spirodela polyrhiza) to 10.6 Gb (Ginkgo biloba), a roughly 73-fold difference. Notably, Cannabis sativa, which contains various useful secondary metabolites including α-pinene, myrcene, and linalool, has eight genomes originating from different strains, making it a good example for understanding intraspecific differences in secondary metabolites at the genomic level. MetaPre-AI® is a bioinformatic pipeline that applies machine learning and artificial intelligence algorithms to whole genome sequences; its performance was demonstrated in a study that predicted acteoside from the Abeliophyllum distichum genome, a prediction that was confirmed by HPLC. MetaPre-AI® can be used to profile possible secondary metabolites from medicinal plant genomes, which can reduce the cost and uncertainty of screening experiments. Given the pace at which new plant genomes are released and the accumulation of available biochemical pathways, MetaPre-AI® is expected to become an increasingly efficient pipeline for investigating secondary metabolites of various plant species.
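The summary figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is only an illustrative check, not part of PGD or MetaPre-AI®: the function names are hypothetical, the input numbers are those stated in the abstract, and 1 Gb is assumed to equal 1,000 Mb.

```python
# Illustrative check of the figures quoted in the abstract.
# Function names are hypothetical; the numbers come from the text.

SMALLEST_GENOME_MB = 145.0          # Spirodela polyrhiza (145 Mb)
LARGEST_GENOME_MB = 10.6 * 1000.0   # Ginkgo biloba (10.6 Gb; assumes 1 Gb = 1,000 Mb)


def fold_difference(largest_mb: float, smallest_mb: float) -> float:
    """Ratio of the largest to the smallest genome size."""
    return largest_mb / smallest_mb


def coverage_percent(with_genome: int, total: int) -> float:
    """Percentage of listed species that have at least one genome in PGD."""
    return 100.0 * with_genome / total


fold = fold_difference(LARGEST_GENOME_MB, SMALLEST_GENOME_MB)
widely_used_pct = coverage_percent(27, 162)     # widely used herbal plants
medicine_list_pct = coverage_percent(52, 559)   # 'herbal medicine list' plants

print(f"Genome size range: {fold:.1f}-fold")
print(f"Widely used herbal plants with a genome: {widely_used_pct:.1f}%")
print(f"'Herbal medicine list' plants with a genome: {medicine_list_pct:.1f}%")
```

Under these assumptions, the size ratio comes to about 73-fold, and roughly 17% of the widely used herbal plant species and 9% of the plants in the ‘herbal medicine list’ currently have a sequenced genome.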