
RLKdb: A comprehensively curated database of plant receptor-like kinase families

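Biology, Computational biology, Databases, Bioinformatics, Computer science
Authors
Zhiyuan Yin, Jinding Liu, Daolong Dou
Source
Journal: Molecular Plant [Elsevier BV]
Volume (issue): 17 (4): 513-515 Citations: 1
Identifier
DOI: 10.1016/j.molp.2024.02.014
Abstract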

Since the first plant receptor-like kinase (RLK) gene, ZmPK1, was cloned from Zea mays in 1990 (Walker and Zhang, 1990), this large gene family has been extensively studied and shown to play crucial roles in growth, development, and immunity (Tang et al., 2017). RLKs are widespread in the plant kingdom. However, the biological functions of most RLKs remain largely elusive (Dievart et al., 2020). Because RLKs share a conserved, monophyletic RLK/Pelle kinase domain, RLKs in several model plants have been classified into distinct families by their extracellular domains (Shiu and Bleecker, 2001). However, independent domain shuffling in specific lineages drives the origin of novel families, which raises the question: what is the landscape of RLKs across the entire plant kingdom? Previously, sequence-homology-based methods have been widely used for RLK identification and classification; such methods might miss distantly related proteins with similar structures, as well as potential novel families not mentioned in the literature. The academic community urgently requires a dedicated database that gives a systematic overview of the RLK gene family and provides data support for in-depth research on RLK genes.
Here, we used a topology-based method to accurately isolate the RLKomes from proteomes. The obtained RLKomes were further classified into (sub)families based on extracellular domains. We constructed a comprehensively curated plant RLK database (https://biotec.njau.edu.cn/rlkdb), which contains valuable resources for investigating the origin and evolution of the RLK family and multiple online tools for personalized analysis. To obtain the landscape of RLKs in plants, we collected 300 plant genomes with chromosome-level assemblies for identification of RLKs. In addition to important model species such as Arabidopsis, rice, and maize, these plant genomes encompass representatives from 4 phyla, 12 classes, and 45 orders (Figure 1A; Supplemental Table 1). We adopted a pipeline previously developed by our group to identify plant RLKs (Yin et al., 2023). In Arabidopsis thaliana, our pipeline identified 468 RLKs, a 72% increase compared with the Ensembl annotation (Martin et al., 2023). We further examined the reliability of our pipeline with reference to the 610 putative RLKs reported by Shiu and Bleecker (2001). Relative to this set, our pipeline missed 144 putative RLKs while predicting two novel RLKs.
Of the missed RLKs, 16 putative RLK gene models have been removed from the current genome assembly, and 128 putative RLKs do not have a transmembrane domain. Several methods have also been used to identify leucine-rich repeat (LRR)-RLKs and some other families (Man et al., 2020, 2023; Ngou et al., 2022, 2024). Comparatively, our pipeline has high accuracy and is suitable for systematic, high-throughput identification of RLKomes covering all the different families. In total, 220 038 RLKs were identified from 300 plant genomes. The RLKome size ranges from 1 to 2459, with RLKs accounting on average for 1.35% of each proteome (Figure 1B; Supplemental Table 1). In the past three decades, more than a dozen RLK families have been described (Dievart et al., 2020), but a systematic and automatic pipeline for the classification of RLKomes is still lacking.
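The 128 proteins excluded for lacking a transmembrane domain show what a topology-based filter does in practice. The following is a minimal sketch only, assuming that signal peptide, transmembrane, and kinase regions have already been predicted by external tools. The `Region` and `Protein` classes, the `has_rlk_topology` function, and the rule of exactly one transmembrane helix followed by the kinase domain are hypothetical illustrations, not the published pipeline of Yin et al. (2023).

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One annotated region on a protein (1-based, inclusive coordinates)."""
    kind: str   # e.g. "signal_peptide", "transmembrane", "kinase", "LRR"
    start: int
    end: int


@dataclass
class Protein:
    """A protein with hypothetical, pre-computed region annotations."""
    name: str
    length: int
    regions: list[Region] = field(default_factory=list)

    def of_kind(self, kind: str) -> list[Region]:
        """Regions of one kind, ordered from N- to C-terminus."""
        return sorted((r for r in self.regions if r.kind == kind),
                      key=lambda r: r.start)


def has_rlk_topology(protein: Protein) -> bool:
    """Assumed rule (not the paper's): exactly one transmembrane helix, with
    every kinase domain lying entirely on its C-terminal side."""
    helices = protein.of_kind("transmembrane")
    kinases = protein.of_kind("kinase")
    if len(helices) != 1 or not kinases:
        return False
    return all(k.start > helices[0].end for k in kinases)


# Toy records with invented coordinates.
receptor = Protein("toy_RLK", 900, [
    Region("signal_peptide", 1, 25),
    Region("LRR", 60, 550),
    Region("transmembrane", 600, 622),
    Region("kinase", 660, 880),
])
soluble = Protein("toy_cytoplasmic_kinase", 400, [Region("kinase", 80, 350)])

print(has_rlk_topology(receptor))  # True
print(has_rlk_topology(soluble))   # False
```

Under this assumed rule, a kinase without a transmembrane helix is rejected regardless of how similar its kinase domain is to known RLKs, which is the behavior the comparison with the Shiu and Bleecker set describes.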
PRGdb (http://prgdb.org/prgdb4/) is a database of pathogen receptor genes, but it provides only a whole list of RLKs and lacks detailed gene information and family classification. According to their distinct extracellular domain structures, RLKs were divided into 18 families. Among them, 15 families have known Pfam annotations. The remaining unannotated RLKs were clustered by protein sequence similarity, which further yielded the proline-rich extensin-like receptor kinase and unknown disordered 1 families. All RLKs that remained unclassified were assigned to the unclassified family. LRR (44.0%), G-LecRLK (13.9%), and wall-associated kinase (11.1%) are the largest families, together making up 69% of RLKdb (Figure 1C). The large and well-known families occur in almost all of the 300 plant genomes, whereas the thaumatin; glycoside hydrolase family 19; cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins; and proline-rich membrane anchor 1 families are found only in specific lineages. RLKdb has a concise and user-friendly web interface. Through the home page or the navigation menu, users can open an RLK family page (Supplemental Figure 1) or an RLKome page (Supplemental Figure 2) to explore the database. On the RLK family page, the first section contains the family description, its lineage coverage, and a list box for switching to other families (Supplemental Figure 1A). The following section is an interactive table of genomes that possess the corresponding RLK family (Supplemental Figure 1B). Through the load button in the table, users can load an RLK family of interest into the third section (Supplemental Figure 1C).
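The family assignment by extracellular domain described above can be illustrated with a small lookup. This is a sketch under stated assumptions: the `DOMAIN_TO_FAMILY` table, the `classify_rlk` and `family_shares` functions, and the toy input are hypothetical, the domain names are only examples of Pfam-style identifiers, and the sequence-similarity clustering step for unannotated RLKs is not modeled.

```python
from collections import Counter

# Hypothetical lookup from an extracellular domain name to an RLK family.
# These entries are illustrative assumptions, not the rule table used by RLKdb.
DOMAIN_TO_FAMILY = {
    "LRR_8": "LRR",
    "B_lectin": "G-LecRLK",
    "GUB_WAK_bind": "WAK",
    "LysM": "LysM",
}


def classify_rlk(extracellular_domains: list[str]) -> str:
    """Return the family of the first recognized extracellular domain,
    or "unclassified" when none is recognized."""
    for domain in extracellular_domains:
        family = DOMAIN_TO_FAMILY.get(domain)
        if family is not None:
            return family
    return "unclassified"


def family_shares(families: list[str]) -> dict[str, float]:
    """Percentage of RLKs per family, rounded to one decimal place."""
    counts = Counter(families)
    total = len(families)
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Toy RLKome: each entry lists the extracellular domains of one RLK.
toy_rlkome = [["LRR_8"], ["LRR_8", "Malectin"], ["B_lectin"], []]
families = [classify_rlk(domains) for domains in toy_rlkome]
print(families)                 # ['LRR', 'LRR', 'G-LecRLK', 'unclassified']
print(family_shares(families))  # {'LRR': 50.0, 'G-LecRLK': 25.0, 'unclassified': 25.0}
```

Taking the first recognized domain is a deliberate simplification; a real classifier would need an explicit policy for proteins that carry several distinct extracellular domains.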
The RLK members and landscape of the family are displayed in five panels: (1) the RLK table panel lists all RLK members, (2) the linkage map panel displays the positions of RLK members in the genome, (3) the length distribution panel shows the distribution of RLK protein lengths, (4) the domain topology panel presents the percentage of each functional domain topology and a domain word cloud, and (5) the phylogeny panel shows the evolutionary relationships among RLK members. The RLKome page has a similar layout. Its first section provides information about the plant genome, including species, lineage, taxonomy, genome assembly, cultivar, and more (Supplemental Figure 2A). The second section is a column chart showing the number of RLKs in each family of the RLKome. By clicking on an RLK family name, users can retrieve the corresponding RLK family and display it in the same five panels as on the family page. By clicking on the hyperlinks associated with RLK IDs in the RLK table panel, users can access a dedicated RLK page displaying detailed information on that RLK (Figure 1D). On the RLK page, the first section provides a snapshot of the RLK protein structure, along with essential details such as species, data source, and family information (Supplemental Figure 3A).
The second section contains six panels: (1) the gene model panel shows the gene exon-intron structure and the domain topology of the protein (Supplemental Figure 3B), (2) the transcription factor binding site panel provides a table of transcription factor binding sites upstream of the RLK gene (Supplemental Figure 3C), (3) the primer panel offers five pairs of qPCR primers (Supplemental Figure 3D), (4) the structure panel shows the 3D structure of the RLK protein and its ligand binding sites (Supplemental Figure 3E), (5) the interaction panel presents the RLK's potential interacting proteins based on the experimentally validated protein interactions collected in the STRING database (Supplemental Figure 3F), and (6) the phylogeny panel includes a Sankey diagram showing the distribution of the corresponding RLK subfamily across plant species, an interactive table of RLK subfamily members, and a phylogenetic tree containing the members of the RLK subfamily (Supplemental Figure 3G). Through the phylogenetic tree and the Sankey diagram, users can intuitively see the relatedness of a particular RLK of interest across the diversity of plant species in the database. We also developed online tools that enable users to search for RLKs and classify them into families (Figure 1E). The web-based tool allows a user to upload a proteome or transcriptome file in FASTA format (Supplemental Figure 4A). The sequences are processed through the pipeline on a multi-core, GPU-equipped Linux server. For a proteome file, the user obtains an RLK annotation file containing information on signal peptide, transmembrane, kinase, and other domain regions, along with an RLK sequence file. For a transcriptome file, the user receives an additional open reading frame annotation file that marks the coding regions in the transcript sequences.
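To make the transcriptome case concrete, the sketch below parses FASTA text and reports the longest open reading frame in a transcript. It is an illustration under simplifying assumptions, not the method used by the RLKdb tool, which the text does not specify: only the forward strand is scanned, an ORF runs from the first ATG to the next in-frame stop codon, and the `read_fasta` and `longest_orf` names are hypothetical.

```python
from typing import Dict, Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}. The id is the first
    whitespace-delimited token of each header line."""
    records: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else f"record_{len(records) + 1}"
            records[current] = ""
        elif current is not None:
            records[current] += line.upper()
    return records


def longest_orf(transcript: str) -> Optional[Tuple[int, int]]:
    """Return the 0-based, end-exclusive (start, end) of the longest
    ATG-to-stop ORF on the forward strand (stop codon included),
    or None if no complete ORF exists."""
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(transcript) - 2, 3):
            codon = transcript[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


# Toy transcript split over two sequence lines.
toy = ">tx1 toy transcript\nccatgaaa\nttttaggg\n"
records = read_fasta(toy)
orf = longest_orf(records["tx1"])
print(orf)                                # (2, 14)
print(records["tx1"][orf[0]:orf[1]])      # ATGAAATTTTAG
```

A production tool would also scan the reverse complement and would typically apply a minimum ORF length before translating the result for domain annotation.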
To enhance database accessibility, the BLAST and Foldseek programs have been integrated to support retrieval by sequence similarity and structure similarity, respectively (Supplemental Figures 4B and 4C). In summary, we have accurately annotated the RLKomes and classified the RLK families of 300 plant genomes with chromosome-level assemblies. RLKdb provides comprehensive information on the RLKome, the RLK family, and individual RLKs. An online tool for genome- and transcriptome-wide identification and classification of RLKs was also developed. These resources and tools will aid evolutionary and functional studies of plant RLKs.
This study was supported by grants from the National Natural Science Foundation of China (32270208, 32202251, and 32230089), the Fundamental Research Funds for the Central Universities (KYCXJC2023001 and KYQN2023039), the Natural Science Foundation of Jiangsu Province (BK20221000), and the China Agricultural Research System (CARS-21).