Pharmacognosy Magazine

Year : 2012  |  Volume : 8  |  Issue : 29  |  Page : 4-11

Application of deoxyribonucleic acid barcoding in Lauraceae plants

Zhen Liu1, Shi-Lin Chen2, Jing-Yuan Song2, Shou-Jun Zhang3, Ke-Li Chen4
1 Department of Pharmacy, The 309th Hospital of Chinese People's Liberation Army, Beijing; Key Laboratory of Traditional Chinese Medicine Resource and Compound Prescription, Ministry of Education, Hubei University of Chinese Medicine, Wuhan, Republic of China
2 Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, Republic of China
3 Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, Republic of China
4 Key Laboratory of Traditional Chinese Medicine Resource and Compound Prescription, Ministry of Education, Hubei University of Chinese Medicine, Wuhan, Republic of China

Correspondence Address:
Ke-Li Chen
Key Laboratory of Traditional Chinese Medicine Resource and Compound Prescription, Ministry of Education, Hubei University of Chinese Medicine, Wuhan 430065
Republic of China


Background: This study aimed to identify candidate markers that can serve as DNA barcodes in the Lauraceae family. Materials and Methods: Polymerase chain reaction (PCR) amplification efficiency, sequencing efficiency, intra- and interspecific divergence, the DNA barcoding gap, and identification efficiency were used to evaluate four DNA regions: psbA-trnH, matK, rbcL, and ITS2. We tested the discrimination ability of psbA-trnH in 68 plant samples belonging to 42 species from 11 genera and found that the rate of successful identification with psbA-trnH was 82.4% at the species level. In contrast, the rates of correct identification with matK and rbcL were only 30.9% and 25.0%, respectively, using BLAST1. The PCR amplification efficiency of the ITS2 region was poor; thus, ITS2 was not included in subsequent experiments. To verify the identification capacity of psbA-trnH in more samples, 175 samples belonging to 117 species of Lauraceae, drawn from the experimental data and from the GenBank database, were tested. Results: Using the BLAST1 method, the identification efficiency was 84.0% at the species level and 92.3% at the genus level. Conclusion: psbA-trnH is confirmed as a useful marker for differentiating closely related species within Lauraceae.

How to cite this article:
Liu Z, Chen SL, Song JY, Zhang SJ, Chen KL. Application of deoxyribonucleic acid barcoding in Lauraceae plants. Phcog Mag 2012;8:4-11.



Lauraceae is a large family of woody plants (except the herbaceous parasite Cassytha) with about 50 genera and 2500 to 3000 species distributed throughout tropical to subtropical latitudes. Lauraceae plants have considerable economic value: many are important sources of construction timber, spices, essential oils, and medicines. Because their crowns are broad, they also have great ecological value for greening and environmental protection. Diverse and widely distributed, Lauraceae plants have an ancient origin, with a fossil record dating back to the mid-Cretaceous period. [1] However, the evolution of these plants has been very slow. Because the boundaries of many species in the family are unclear, they are difficult to identify by traditional morphological methods. It is therefore important to develop a quick, simple, and effective method for identifying species in the Lauraceae family.

Deoxyribonucleic acid (DNA) barcoding has been a focus of biodiversity research worldwide in recent years. The core of this research is to choose a universal barcode that allows species to be identified quickly and accurately. In 2003, Hebert et al. analyzed sequences of the cytochrome c oxidase subunit 1 (CO1) gene from 13320 species belonging to 11 phyla. [2] Since then, most researchers have agreed that the mitochondrial gene encoding CO1 is a favorable region for use as the standard DNA barcode for animals. Compared with the progress in animal barcoding, work on plant barcodes has been relatively slow.

The Plant Working Group of the Consortium for the Barcode of Life recommended the two-locus combination of rbcL + matK for plant barcoding. [3] Chen et al. tested the discrimination ability of ITS2 in more than 6600 plant samples belonging to 4800 species from 753 genera and found that the ITS2 region has many advantages over plastid loci, including the rbcL and matK regions. They also recommended psbA-trnH as a complementary barcode to ITS2 for a broad range of plant taxa. [4]

Although some researchers have carried out DNA barcoding studies of related species and genera, [5],[6],[7],[8],[9] none has examined multiple samples from the Lauraceae family. In this study, four potential DNA regions (psbA-trnH, matK, rbcL, and ITS2) were tested for their suitability as DNA barcodes for the Lauraceae family (68 samples belonging to 42 species from 11 genera). The ability of the candidate sequences to identify Lauraceae species was assessed even though the samples included many closely related species.

Materials and Methods

Experimental materials (68 samples belonging to 42 species from 11 genera) were collected from the Chinese provinces of Hubei, Jiangxi, Guangdong, and Guangxi. The materials were authenticated by Prof. Panhong Lin of Hubei College of Traditional Chinese Medicine and Engr. Zhang Shoujun of Wuhan Botanical Garden, Chinese Academy of Sciences. All specimen and image vouchers are kept at the herbarium of Hubei College of Traditional Chinese Medicine. To further increase the number of species represented, psbA-trnH sequences from the taxonomy database of the National Center for Biotechnology Information (NCBI) were included in the reference database.

Leaf tissues were first dried in silica gel. A total of 10 mg of each dried tissue was ground for 1 min at a frequency of 30 times/second in a FastPrep bead mill (Retsch MM400, Germany). Total DNA was extracted according to the instructions of the Plant Genomic DNA Kit (Tiangen Biotech Co., China). The polymerase chain reaction (PCR) mixture consisted of 1 μL (~30 ng) of DNA, 2 μL of 25 mM MgCl2, 2.5 μL of 10× PCR buffer, 1.0 U of Taq DNA polymerase, 2 μL of 2.5 mM dNTP mix (Biocolor BioScience and Technology Co., China), and 1.0 μL of 2.5 μM primers (synthesized by Sangon Co., China), in a final volume of 25 μL. Sequences of the universal primers for the tested DNA barcodes (psbA-trnH, matK, rbcL, and ITS2) and the general PCR conditions were obtained from a previous study. [4] PCR products were purified using the Gel Band Purification Kit (Tiangen Biotech Co., China) and sequenced on an ABI 3730XL sequencer (Applied Biosystems, USA). The sequences were submitted to GenBank.

Sequence editing and contig assembly were performed with CodonCode Aligner (CodonCode Co., Germany). Sequences were aligned using CLUSTALW and analyzed with MEGA 4.0. Average interspecific distance, theta prime, and smallest interspecific distance were used to characterize interspecific divergence. [4],[10],[11] Average intraspecific distance, theta, and coalescent depth were calculated from Kimura 2-parameter (K2P) distances to characterize intraspecific variation. [10] Wilcoxon signed rank tests were performed as described previously. [12],[13] The barcoding gap was calculated with TAXON DNA. [14] To estimate the reliability of species identification by DNA barcoding, two methods (BLAST1 and nearest genetic distance) were used. [15]
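For readers unfamiliar with the K2P distance mentioned above, the following is a minimal sketch of the standard Kimura 2-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is an illustration only, not the MEGA 4.0 implementation used in the study; the function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made here.

```python
import math

# Purine-purine and pyrimidine-pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguity
    codes in either sequence are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Pairwise values from such a function would feed the intra- and interspecific metrics listed above; the sequences passed in must already be aligned, as CLUSTALW output would be.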


Results

PCR amplification and sequencing efficiency

The psbA-trnH, matK, and rbcL sequences were amplified and sequenced with a 100% success rate. However, in our pilot study, the PCR amplification efficiency of the ITS2 region was poor; thus, ITS2 was not included in subsequent experiments [Table 1].{Table 1}

Analysis of intraspecific variations and interspecific divergences

An ideal barcode should show low intraspecific variation and high interspecific divergence so that different species can be distinguished. First, comparison of interspecific genetic distances among congeneric species for the three candidate barcodes showed that the chloroplast noncoding region psbA-trnH exhibited the highest interspecific divergence for all three metrics, followed by rbcL, while matK provided the lowest divergence [Table 2]. Moreover, Wilcoxon signed rank tests confirmed that psbA-trnH provided the highest interspecific divergence among congeneric species [Table 3].

Second, matK showed the lowest level of intraspecific variation for all three parameters, followed by rbcL, while psbA-trnH showed the highest variation [Table 2]. Wilcoxon signed rank tests showed that rbcL and matK had the lowest variation between conspecific individuals, whereas psbA-trnH had the highest [Table 4].{Table 2}{Table 3}{Table 4}
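The comparison described in this subsection can be sketched as follows. The example separates pairwise distances into intraspecific and congeneric interspecific sets and reports the average intraspecific, average interspecific, and smallest interspecific distances. It does not compute theta, theta prime, or coalescent depth. The sample identifiers, the placeholder taxon names (`GenusA`, `sp1`, and so on), and the distance values are invented for illustration and are not data from this study.

```python
from itertools import combinations
from statistics import mean


def split_distances(samples, dist):
    """Split pairwise distances into intraspecific and congeneric interspecific lists.

    samples: list of (sample_id, genus, species) tuples.
    dist: dict mapping frozenset({id1, id2}) to a pairwise distance.
    Pairs from different genera are ignored, since only congeneric
    species are compared here.
    """
    intra, inter = [], []
    for (id1, g1, s1), (id2, g2, s2) in combinations(samples, 2):
        if g1 != g2:
            continue
        d = dist[frozenset((id1, id2))]
        (intra if s1 == s2 else inter).append(d)
    return intra, inter


# Invented toy data: two species in one genus, one species in another.
samples = [
    ("s1", "GenusA", "sp1"),
    ("s2", "GenusA", "sp1"),
    ("s3", "GenusA", "sp2"),
    ("s4", "GenusB", "sp3"),
]
dist = {
    frozenset(("s1", "s2")): 0.002,
    frozenset(("s1", "s3")): 0.010,
    frozenset(("s2", "s3")): 0.012,
    frozenset(("s1", "s4")): 0.050,
    frozenset(("s2", "s4")): 0.050,
    frozenset(("s3", "s4")): 0.060,
}

intra, inter = split_distances(samples, dist)
print("average intraspecific distance:", mean(intra))
print("average interspecific distance:", mean(inter))
print("smallest interspecific distance:", min(inter))
```

A marker suited to barcoding would give small values in the first list and clearly larger values in the second.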

Assessment of the barcoding gap

Ideally, a barcode shows separate, non-overlapping distributions of intra- and interspecific variation. [10],[16] In the present study, psbA-trnH showed only a slight gap, whereas matK and rbcL exhibited substantial overlap without any gap [Figure 1] and [Figure 2]. {Figure 1}{Figure 2}
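The idea of a barcoding gap can be illustrated with a small check: a gap exists when the largest intraspecific distance is smaller than the smallest interspecific distance. The sketch below is a simplified stand-in, not the TAXON DNA procedure used in the study; the function name `barcoding_gap` and the input values in the usage lines are hypothetical.

```python
def barcoding_gap(intra, inter):
    """Summarize the separation between two distance distributions.

    Returns (gap_exists, gap_width, overlap_fraction), where gap_width is
    min(inter) - max(intra) and overlap_fraction is the share of
    interspecific distances that do not exceed the largest intraspecific one.
    """
    max_intra = max(intra)
    min_inter = min(inter)
    width = min_inter - max_intra
    overlapping = sum(1 for d in inter if d <= max_intra)
    return width > 0, width, overlapping / len(inter)


# Invented values: the first marker overlaps, the second shows a clear gap.
overlap_case = barcoding_gap([0.0, 0.002, 0.004], [0.003, 0.010, 0.020])
gap_case = barcoding_gap([0.0, 0.001], [0.005, 0.010])
print(overlap_case[0], gap_case[0])
```

A negative width, as in the first call, corresponds to the overlapping distributions reported for matK and rbcL.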

Evaluation of identifying ability of barcodes

With the BLAST1 method, psbA-trnH correctly identified 82.4% of the samples at the species level and 88.1% at the genus level. By contrast, the rates of correct identification for matK and rbcL were much lower at the species level with both the BLAST1 and nearest genetic distance methods. At the species level, the rates of correct identification for the two-locus combinations rbcL + matK, matK + psbA-trnH, and rbcL + psbA-trnH were 38.2%, 82.4%, and 82.4%, respectively, using BLAST1 [Table 5]. To verify the identification capacity of psbA-trnH in more samples, 175 samples belonging to 117 species of Lauraceae, drawn from the experimental data and from the GenBank database, were tested [Table S1] and [Table S2]. Using the BLAST1 method, the identification efficiency was 84.0% at the species level and 92.3% at the genus level.{Table 5}
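As a rough illustration of the nearest genetic distance method, the sketch below assigns a query to the species of its closest reference sequence and treats the result as unidentified when two or more species tie for the smallest distance. This reading of the method, the use of a simple proportion of mismatches in place of K2P distances, and all names and sequences are assumptions made for this example; they are not taken from the study's pipeline.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences (stand-in metric)."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    return sum(x != y for x, y in pairs) / len(pairs)


def identify(query_seq, references, distance=p_distance):
    """Return the species of the nearest reference, or None if the match is ambiguous.

    references: list of (species, aligned_sequence) tuples.
    """
    scored = [(distance(query_seq, seq), species) for species, seq in references]
    best = min(d for d, _ in scored)
    nearest_species = {species for d, species in scored if d == best}
    return nearest_species.pop() if len(nearest_species) == 1 else None


def success_rate(queries, references):
    """Fraction of (true_species, sequence) queries assigned to their true species."""
    correct = sum(identify(seq, references) == species for species, seq in queries)
    return correct / len(queries)


# Invented toy data: sp2 and sp3 share an identical sequence, so queries
# matching it cannot be resolved to a single species.
references = [
    ("sp1", "ACGTACGTAC"),
    ("sp2", "ACGTTCGTAA"),
    ("sp3", "ACGTTCGTAA"),
]
queries = [("sp1", "ACGTACGTAC"), ("sp2", "ACGTTCGTAA")]
print(success_rate(queries, references))
```

In the toy data, the second query fails for the same reason many closely related species fail with a conserved marker: the reference sequences of different species are indistinguishable.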




Discussion

This work compared four popular candidate sequences (matK, rbcL, psbA-trnH, and nrDNA ITS2) in 68 samples representing 42 species from 11 genera of Lauraceae. The experiments showed that matK, rbcL, rbcL + matK, and ITS2 were not suitable as barcodes for the Lauraceae family. The psbA-trnH region was short, easy to sequence, and effective at identifying species of Lauraceae. Compared with matK, rbcL, and ITS2, the psbA-trnH region was the best marker for the identification of Lauraceae species.

Selection of the DNA barcode for the Lauraceae family

In the present study, psbA-trnH performed very well as a barcode sequence. First, the psbA-trnH region is short (195-423 base pairs) and can therefore be easily amplified and sequenced. The success rate of PCR amplification and sequencing of psbA-trnH for the 68 samples from 11 genera of Lauraceae was 100%. Second, the determination of genetic divergence using six metrics and statistical tests confirmed that the psbA-trnH region has sufficiently high interspecific variation, and that interspecific and intraspecific variation differed significantly. Third, according to BLAST1, the identification efficiency of the psbA-trnH region was 84.0% at the species level for the 175 samples from 117 species in 35 genera of Lauraceae. Moreover, the two-locus combinations matK + psbA-trnH and rbcL + psbA-trnH did not improve identification. psbA-trnH identified every species that was identified by matK, rbcL, or the two-locus combination rbcL + matK.

The rbcL sequence has the advantages of universality and easy amplification and alignment. However, variation in the rbcL region occurs mainly above the species level, and variation at the species level is insufficient to discriminate between species. [12],[13],[17],[18] The matK segment generally evolves faster than other coding regions, but Rohwer [19] reported that the matK sequence has a low evolutionary rate in Lauraceae (ie, only 9.7% of sites are informative). In this study, the two loci were easily amplified and sequenced, but they were too conserved for Lauraceae plants; their interspecific divergence was very low. Although matK and rbcL provided good PCR efficiency (both 100%) and satisfactory sequencing efficiency (both 100%), the rates of successful identification with matK and rbcL were 30.9% and 25.0%, respectively, according to BLAST1. The success rate was only 38.2% at the species level when the two loci were combined.

Many researchers have proposed ITS2 as a suitable marker for phylogenetic reconstruction and taxonomic classification. [4],[20],[21] In our study, the success rate of PCR amplification of ITS2 was poor in Lauraceae; thus, ITS2 was not included in subsequent experiments. We strictly followed the standard operating procedure for PCR and repeated the experiment three times. The success rates for ITS2 sequences were 32.35%, 32.35%, and 30.88%, respectively. We then compared the success rate of PCR amplification in Lauraceae and Caprifoliaceae using the same ITS2 primers and PCR conditions. ITS2 sequences were relatively easy to amplify in Caprifoliaceae, whereas the success rate in Lauraceae was much lower. Furthermore, in our experiments, ITS2 gave unsatisfactory PCR efficiency (32.35%) and poor sequencing efficiency (27.27%) because homologous sequences were present. Our work shows that direct PCR amplification and sequencing of ITS2 has a high success rate in some taxonomic groups but a low success rate in others. Because the ITS2 region gave a low success rate in direct PCR amplification and sequencing in Lauraceae species, it is unsuitable as a DNA barcode for Lauraceae.

Discussion on samples with unsuccessful identification

In our study, the psbA-trnH sequence was chosen as the DNA barcode for identifying species of the Lauraceae family. Among the 175 samples tested, 28 could not be identified. At present, there is no consensus on the taxonomy of Lauraceae, and the relationships among the species of the family are still poorly understood. [22] The present study found that ambiguous identification occurred mainly in five genera (Persea, Ocotea, Litsea, Machilus, and Cinnamomum) that have long been a source of taxonomic dispute. Species within these genera are difficult to distinguish because they differ little in morphology. The relationships among species of these genera are complex and the boundaries between groups are vague, which could result in improper classification. [23],[24],[25],[26],[27] The species that could not be identified by matK, rbcL, or the two-locus combination rbcL + matK could also not be identified by psbA-trnH in this study. Whole chloroplast genome sequencing may be a possible method for identifying species of these genera.
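The figure of 28 unidentified samples agrees with the 84.0% species-level efficiency reported for the 175-sample data set, as the short check below shows. The helper names are hypothetical. The per-marker counts for the 68-sample data set are back-calculated here from the published percentages and are not stated in the article.

```python
def rate(correct, total):
    """Identification rate as a percentage rounded to one decimal place."""
    return round(100 * correct / total, 1)


def implied_count(percent, total):
    """Nearest whole number of samples corresponding to a reported percentage."""
    return round(percent * total / 100)


# 175 samples tested with psbA-trnH, 28 of which could not be identified.
identified = 175 - 28
print(identified, rate(identified, 175))

# Back-calculated counts for the 68-sample data set (an inference, not reported data).
for marker, percent in [("psbA-trnH", 82.4), ("matK", 30.9),
                        ("rbcL", 25.0), ("rbcL + matK", 38.2)]:
    count = implied_count(percent, 68)
    print(marker, count, rate(count, 68))
```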

The present study explored a new application of DNA barcode technology and provided new approaches and evidence for the classification and phylogeny of Lauraceae plants. However, the study has limitations: sampling was constrained, some species were represented by a single individual, and some closely related species (ie, sister species) were not included in the analysis. With more material and further study, DNA barcode technology should provide more effective information and a more reliable method for the identification of Lauraceae plants.


References

1. Drinnan AN, Crane PR, Friis EM, Pedersen KR. Lauraceous flowers from the Potomac Group (Mid-Cretaceous) of Eastern North America. Bot Gaz 1990;151:370-84.
2. Hebert PD, Ratnasingham S, de Waard JR. Barcoding animal life: Cytochrome c oxidase subunit 1 divergences among closely related species. Proc Biol Sci 2003;270 Suppl 1:S96-9.
3. CBOL Plant Working Group. A DNA barcode for land plants. Proc Natl Acad Sci U S A 2009;106:12794-7.
4. Chen SL, Yao H, Han JP, Liu C, Song JY, Shi LC, et al. Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species. PLoS One 2010;5:e8613.
5. Gao T, Chen SL. Authentication of the medicinal plants in Fabaceae by DNA barcoding technique. Planta Med 2009;75:417.
6. Pang XH, Chen SL. Using DNA barcodes to identify Rosaceae. Planta Med 2009;75:417.
7. Song JY, Yao H, Li Y, Li XW, Liu C, Han JP, et al. Authentication of the family Polygonaceae in Chinese pharmacopoeia by DNA barcoding technique. J Ethnopharmacol 2009;124:434-9.
8. Yao H, Song JY, Ma XY, Liu C, Li Y, Xu HX, et al. Identification of Dendrobium species by a candidate DNA barcode sequence: The chloroplast psbA-trnH intergenic region. Planta Med 2009;75:667-9.
9. Zhu YJ, Chen SL, Yao H, Tan R, Song JY, Luo K, et al. DNA barcoding for the identification plants of the genus Paris. Yao Xue Xue Bao 2010;45:376-82.
10. Meyer CP, Paulay G. DNA barcoding: Error rates based on comprehensive sampling. PLoS Biol 2005;3:2229-38.
11. Meier R, Zhang GY, Ali F. The use of mean instead of smallest interspecific distances exaggerates the size of the "Barcoding Gap" and leads to misidentification. Syst Biol 2008;57:809-13.
12. Kress WJ, Erickson DL. A two-locus global DNA barcode for land plants: The coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS One 2007;2:e508.
13. Lahaye R, van der Bank M, Bogarin D, Warner J, Pupulin F, Gigot G, et al. DNA barcoding the floras of biodiversity hotspots. Proc Natl Acad Sci U S A 2008;105:2923-8.
14. Slabbinck B, Dawyndt P, Martens M, De Vos P, De Baets B. TaxonGap: A visualization tool for intra- and inter-species variation among individual biomarkers. Bioinformatics 2008;24:866-7.
15. Ross HA, Murugan S, Li WL. Testing the reliability of genetic methods of species identification via simulation. Syst Biol 2008;57:216-30.
16. Moritz C, Cicero C. DNA barcoding: Promise and pitfalls. PLoS Biol 2004;2:e354.
17. Fazekas AJ, Burgess KS, Kesanakurti PR, Graham SW, Newmaster SG, Husband BC, et al. Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PLoS One 2008;3:e2802.
18. Newmaster SG, Fazekas AJ, Steeves RA, Janovec J. Testing candidate plant barcode regions in the Myristicaceae. Mol Ecol Resour 2008;8:480-90.
19. Rohwer JG. Toward a phylogenetic classification of the Lauraceae: Evidence from matK sequences. Syst Bot 2000;25:60-71.
20. Schultz J, Maisel S, Gerlach D, Müller T, Wolf M. A common core of secondary structure of the internal transcribed spacer 2 (ITS2) throughout the Eukaryota. RNA 2005;11:361-4.
21. Miao M, Warren A, Song WB, Wang S, Shang HM, Chen ZG. Analysis of the internal transcribed spacer 2 (ITS2) region of scuticociliates and related taxa (Ciliophora, Oligohymenophorea) to infer their evolution and phylogeny. Protist 2008;159:519-33.
22. Li J, Li XW. Advances in Lauraceae systematic research on the world scale. Acta Bot Yunnan 2004;26:1-11.
23. Kojoma M, Kurihara K, Yamada K, Sekita S, Satake M, Iida O. Genetic identification of cinnamon (Cinnamomum spp.) based on the trnL-trnF chloroplast DNA. Planta Med 2002;68:94-6.
24. Van der Werff H. A synopsis of Persea (Lauraceae) in Central America. Novon St. Louis Mo. 2002;12:575-86.
25. Van der Werff H. A synopsis of Ocotea (Lauraceae) in Central America and Southern Mexico. Ann Mo Bot Gard 2002;89:429-51.
26. Li J, Christophel DC, Conran JG, Li HW. Phylogenetic relationships within the 'core' Laureae (Litsea complex, Lauraceae) inferred from sequences of the chloroplast gene matK and nuclear ribosomal DNA ITS regions. Plant Syst Evol 2004;246:19-34.
27. Wei FN, Tang SC. On the circumscription of Machilus and of Persea (Lauraceae). Acta Phytotaxon Sin 2006;44:437-42.