RLKdb: A comprehensively curated database of plant receptor-like kinase families

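Biology, Computational biology, Database, Bioinformatics, Computer science
Authors
Zhiyuan Yin, Jinding Liu, Daolong Dou
Source
Journal: Molecular Plant [Elsevier BV]
Volume (issue): 17 (4): 513-515 Citations: 1
Identifier
DOI: 10.1016/j.molp.2024.02.014
Abstract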

Since the first plant receptor-like kinase (RLK) gene, ZmPK1, was cloned from Zea mays in 1990 (Walker and Zhang, 1990), this large gene family has been extensively studied and shown to play crucial roles in growth, development, and immunity (Tang et al., 2017). RLKs are widespread in the plant kingdom. However, the biological functions of most RLKs remain largely elusive (Dievart et al., 2020). Given that RLKs share a conserved monophyletic RLK/Pelle kinase domain, RLKs in several model plants are classified into distinct families by their extracellular domains (Shiu and Bleecker, 2001). However, independent domain shuffling in specific lineages drives the origin of novel families, which raises a question: what is the landscape of RLKs across the entire plant kingdom? Previously, sequence-homology-based methods have been widely used for RLK identification and classification, but they might miss distantly related proteins with similar structures and potential novel families not mentioned in the literature. The academic community urgently requires a dedicated database that gives a systematic overview of the RLK gene family and provides data support for in-depth research on RLK genes.
Here, we used a topology-based method to accurately isolate the RLKomes from proteomes. The obtained RLKomes were further classified into (sub)families based on extracellular domains. We constructed a comprehensively curated plant RLK database (https://biotec.njau.edu.cn/rlkdb), which contains valuable resources for investigating the origin and evolution of the RLK family and multiple online tools for personalized analysis.
To obtain the landscape of RLKs in plants, we collected 300 plant genomes with chromosome-level assemblies for identification of RLKs. In addition to some significant model species, including Arabidopsis, rice, and maize, these plant genomes encompass representatives from 4 phyla, 12 classes, and 45 orders (Figure 1A; Supplemental Table 1). We adopted a previously described pipeline developed by our group to identify plant RLKs (Yin et al., 2023). In Arabidopsis thaliana, our pipeline identified 468 RLKs, 72% more than in the Ensembl annotation (Martin et al., 2023). We further examined the reliability of our pipeline with reference to the 610 putative RLKs reported by Shiu and Bleecker (2001). Among these, we observed that our pipeline missed 144 putative RLKs while predicting two novel RLKs.
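The Arabidopsis counts quoted above are mutually consistent, which a short check makes explicit. The sketch below uses only numbers given in the text (610 reference RLKs, 144 missed, 2 novel, 468 identified). The variable names are illustrative, and the Ensembl count is only implied by the stated 72% difference; it is not a figure reported here.

```python
# Counts taken directly from the text above.
reference_rlks = 610   # putative RLKs reported by Shiu and Bleecker (2001)
missed = 144           # reference RLKs not recovered by the pipeline
novel = 2              # RLKs predicted by the pipeline but absent from the reference set
identified = 468       # RLKs identified in Arabidopsis thaliana

# Reference RLKs that the pipeline did recover.
recovered = reference_rlks - missed

# Recovered plus novel predictions should equal the reported total.
assert recovered + novel == identified

# Fraction of the reference set that was recovered.
recovered_fraction = recovered / reference_rlks

# Derived estimate only: the count that a 72% increase would imply for Ensembl.
implied_ensembl = identified / 1.72

print(f"recovered {recovered} of {reference_rlks} reference RLKs ({recovered_fraction:.1%})")
print(f"Ensembl count implied by a 72% increase: about {implied_ensembl:.0f}")
```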
Of the missed RLKs, 16 putative RLK gene models have been removed from the current genome assembly, and 128 putative RLKs do not have a transmembrane domain. Several methods have also been used to identify leucine-rich repeat (LRR)-RLKs and some other families (Man et al., 2020, 2023; Ngou et al., 2022, 2024). By comparison, our pipeline has high accuracy and is suitable for systematic and high-throughput identification of RLKomes covering all the different families.
In total, 220 038 RLKs were identified from the 300 plant genomes. The RLKome size ranges from 1 to 2459, and RLKs account for 1.35% of the proteome on average (Figure 1B; Supplemental Table 1). In the past three decades, more than a dozen RLK families have been described (Dievart et al., 2020), but a systematic and automatic pipeline for the classification of RLKomes is still lacking.
PRGdb (http://prgdb.org/prgdb4/) is a database of pathogen receptor genes, but it only provides a whole list of RLKs and lacks detailed gene information and family classification. According to their distinct extracellular domain structures, RLKs were divided into 18 families. Among them, 15 families have known Pfam annotations. The remaining unannotated RLKs were clustered by protein sequence similarity, which further yielded the proline-rich extensin-like receptor kinase and unknown disordered 1 families. All RLKs that still could not be assigned were placed in the unclassified family. LRR (44.0%), G-LecRLK (13.9%), and wall-associated kinase (11.1%) are the largest families, which together make up 69% of the RLKdb (Figure 1C). The large and well-known families occur in almost all of the 300 plant genomes, while the thaumatin; glycoside hydrolase family 19; cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins; and proline-rich membrane anchor 1 families are found only in specific lineages.
RLKdb has a concise and user-friendly web interface. Through the home page or the navigation menu, users can open an RLK family page (Supplemental Figure 1) or an RLKome page (Supplemental Figure 2) to explore the database. In the RLK family page, the first section contains the family description, its lineage coverage, and a list box for switching to other families (Supplemental Figure 1A). The following section is an interactive table of genomes that possess the corresponding RLK family (Supplemental Figure 1B). Through the load button in the table, users can load an RLK family of interest into the third section (Supplemental Figure 1C).
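As a quick check of the family shares quoted above, the sketch below sums the three largest families and converts each share into an approximate member count. It assumes the percentages refer to the 220 038 RLKs in the database; the resulting counts are estimates derived from rounded percentages, not figures reported in the text.

```python
TOTAL_RLKS = 220_038  # total RLKs identified across the 300 genomes

# Shares of the three largest families, as percentages of RLKdb.
family_share_pct = {
    "LRR": 44.0,
    "G-LecRLK": 13.9,
    "wall-associated kinase": 11.1,
}

# Combined share of the three families.
combined_pct = sum(family_share_pct.values())

# Approximate member counts; the shares are rounded, so these are estimates only.
estimated_counts = {
    family: round(TOTAL_RLKS * pct / 100)
    for family, pct in family_share_pct.items()
}

print(f"combined share: {combined_pct:.1f}%")
for family, count in estimated_counts.items():
    print(f"{family}: ~{count} RLKs")
```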
The RLK members and the landscape of the family can be displayed in five panels: (1) the RLK table panel shows all RLK members, (2) the linkage map panel displays the positions of RLK members in the genome, (3) the length distribution panel exhibits the distribution of RLK protein lengths, (4) the domain topology panel presents the percentage of each functional domain topology and a domain word cloud, and (5) the phylogeny panel showcases the evolutionary relationships among RLK members.
The RLKome page has a similar layout. Its initial section provides information about the plant genome, including details on species, lineage, taxonomy, genome assembly, cultivar, and more (Supplemental Figure 2A). The second section is a column chart showing the number of RLKs in each family of the RLKome. By clicking on an RLK family name, the corresponding RLK family can be retrieved and displayed in five panels identical to those of the family page.
By clicking on the hyperlinks associated with RLK IDs in the RLK table panel, users can access a dedicated RLK page displaying detailed information on that RLK (Figure 1D). In the RLK page, the first section provides a snapshot of the RLK protein structure, along with essential details such as species, data source, and family information (Supplemental Figure 3A).
The second section contains six panels: (1) the gene model panel shows the gene exon-intron structure and the domain topology of the protein (Supplemental Figure 3B), (2) the transcription factor binding site panel provides a table of transcription factor binding sites upstream of the RLK gene (Supplemental Figure 3C), (3) the primer panel offers five pairs of qPCR primers (Supplemental Figure 3D), (4) the structure panel exhibits the 3D structure of the RLK protein and its ligand binding sites (Supplemental Figure 3E), (5) the interaction panel presents the RLK's potential interacting proteins based on the experimentally validated protein interactions collected in the STRING database (Supplemental Figure 3F), and (6) the phylogeny panel includes a Sankey diagram showing the distribution of the corresponding RLK subfamily across plant species, an interactive table of RLK subfamily members, and a phylogenetic tree containing the members of the RLK subfamily (Supplemental Figure 3G). Through the phylogenetic tree and the Sankey diagram, users can intuitively see the relatedness of a particular RLK of interest across the diversity of plant species in the database.
We also developed online tools that enable users to search for RLKs and classify them into different families (Figure 1E). The web-based tool allows a user to upload a proteome or transcriptome file in FASTA format (Supplemental Figure 4A). The sequences are processed by the pipeline on a multi-core, GPU-equipped Linux server. For a proteome file, the user will obtain an RLK annotation file containing information on signal peptide, transmembrane, kinase, and other domain regions, along with an RLK sequence file. For a transcriptome file, the user will receive an additional open reading frame annotation file that highlights coding regions in the transcript sequences.
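To make the annotation output more concrete, the following minimal sketch shows how signal peptide, transmembrane, kinase, and other domain regions could be combined into an RLK call and a family label based on the extracellular domain. It is not the authors' pipeline: the `Region` type, the rule that the kinase domain lies C-terminal to exactly one transmembrane helix, the domain names, and the domain-to-family mapping are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    """One annotated region of a protein (hypothetical record layout)."""
    kind: str   # "signal_peptide", "transmembrane", "kinase", or "domain"
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Hypothetical mapping from extracellular domain names to family labels.
FAMILY_BY_DOMAIN = {
    "LRR": "LRR",
    "B_lectin": "G-LecRLK",
    "WAK": "wall-associated kinase",
}


def classify_rlk(regions: List[Region]) -> Optional[str]:
    """Return a family label if the regions look like an RLK, else None."""
    tms = [r for r in regions if r.kind == "transmembrane"]
    kinases = [r for r in regions if r.kind == "kinase"]
    # Assumed rule: an RLK has exactly one transmembrane helix and a kinase domain.
    if len(tms) != 1 or not kinases:
        return None
    tm = tms[0]
    # Assumed topology: every kinase domain starts after the transmembrane helix.
    if any(k.start <= tm.end for k in kinases):
        return None
    # Domains that end before the helix are treated as extracellular.
    extracellular = sorted(
        (r for r in regions if r.kind == "domain" and r.end < tm.start),
        key=lambda r: r.start,
    )
    for region in extracellular:
        if region.name in FAMILY_BY_DOMAIN:
            return FAMILY_BY_DOMAIN[region.name]
    return "unclassified"


# Made-up example protein, not a record from RLKdb.
example = [
    Region("signal_peptide", "SP", 1, 24),
    Region("domain", "LRR", 60, 580),
    Region("transmembrane", "TM", 620, 642),
    Region("kinase", "kinase", 700, 980),
]
print(classify_rlk(example))
```

In this sketch a protein without a transmembrane region is never called an RLK, which mirrors the observation in the text that putative RLKs lacking a transmembrane domain are not reported by the pipeline.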
To enhance database accessibility, the BLAST and Foldseek programs have been integrated to support sequence similarity and structure similarity retrieval, respectively (Supplemental Figures 4B and 4C).
In summary, we have accurately annotated the RLKomes and classified the RLK families of 300 plant genomes with chromosome-level assemblies. The RLKdb provides comprehensive information on the RLKome, the RLK family, and individual RLKs. An online tool for genome- and transcriptome-wide identification and classification of RLKs was also developed. These resources and tools will aid evolutionary and functional studies of plant RLKs.
This study was supported by grants from the National Natural Science Foundation of China (32270208, 32202251, and 32230089), the Fundamental Research Funds for the Central Universities (KYCXJC2023001 and KYQN2023039), the Natural Science Foundation of Jiangsu Province (BK20221000), and the China Agricultural Research System (CARS-21).
References
Dievart A., Gottin C., Périn C., Ranwez V., Chantret N. (2020). Origin and diversity of plant receptor-like kinases. Annu. Rev. Plant Biol. 71: 131-156.
Man J., Gallagher J.P., Bartlett M. (2020). Structural evolution drives diversification of the large LRR-RLK gene family. New Phytol. 226: 1492-1505.
Man J., Harrington T., Lally K., Bartlett M. (2023). Asymmetric evolution of protein domains in the leucine-rich repeat receptor-like kinase (LRR-RLK) family of plant developmental coordinators. bioRxiv. Preprint at https://doi.org/10.1101/2023.03.13.532436
Martin F.J., Amode M.R., Aneja A., Austine-Orimoloye O., Azov A.G., Barnes I., Becker A., Bennett R., Berry A., Bhai J., et al. (2023). Ensembl 2023. Nucleic Acids Res. 51: D933-D941.
Ngou B.P.M., Heal R., Wyler M., Schmid M.W., Jones J.D.G. (2022). Concerted expansion and contraction of immune receptor gene repertoires in plant genomes. Nat. Plants 8: 1146-1152.
Ngou B.P.M., Wyler M., Schmid M.W., Kadota Y., Shirasu K. (2024). Evolutionary trajectory of pattern recognition receptors in plants. Nat. Commun. 15: 308.
Shiu S.H., Bleecker A.B. (2001). Receptor-like kinases from Arabidopsis form a monophyletic gene family related to animal receptor kinases. Proc. Natl. Acad. Sci. USA 98: 10763-10768.
Tang D., Wang G., Zhou J.M. (2017). Receptor Kinases in Plant-Pathogen Interactions: More Than Pattern Recognition. Plant Cell 29: 618-637.
Walker J.C., Zhang R. (1990). Relationship of a putative receptor protein kinase from maize to the S-locus glycoproteins of Brassica. Nature 345: 743-746.
Yin Z., Shen D., Zhao Y., Peng H., Liu J., Dou D. (2023). Cross-kingdom analyses of transmembrane protein kinases show their functional diversity and distinct origins in protists. Comput. Struct. Biotechnol. J. 21: 4070-4078.