Once you have a genome sequence, the next step is to work out what each gene does and which biological processes it takes part in. Only by understanding gene function can you relate genotype to phenotype. High-throughput annotation of all genes in a genome is an active area of current functional genomics research.
Gene function analysis is the prediction, identification and verification of gene function using bioinformatics and different expression systems. Whole-genome sequencing produces a large amount of data, and the predicted coding genes are generally annotated by sequence comparison. In microbial gene function analysis, predicted protein sequences are compared against several function databases (KEGG, GO, MetaCyc, EggNOG, CAZy and CARD) to obtain functional information for each gene.
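As a rough illustration of comparison-based annotation, the sketch below keeps the best-scoring database hit for each predicted gene from BLAST-style tabular output (the standard 12-column layout). The function name `best_hits`, the thresholds, and the gene and database entry names are assumptions made for this example, not part of any specific pipeline.

```python
import csv
import io

# Standard 12-column BLAST tabular layout (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5, min_identity=30.0):
    """Return {query gene: best hit} after dropping weak alignments.

    The thresholds are illustrative defaults, not recommended values.
    """
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        identity = float(row["pident"])
        bitscore = float(row["bitscore"])
        # Skip hits that are too weak to support an annotation.
        if evalue > max_evalue or identity < min_identity:
            continue
        current = best.get(row["qseqid"])
        # Keep only the highest bit score seen so far for this gene.
        if current is None or bitscore > current["bitscore"]:
            best[row["qseqid"]] = {
                "subject": row["sseqid"],
                "identity": identity,
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best


# Hypothetical hits: gene and database entry names are made up.
EXAMPLE = "\n".join([
    "gene_001\tdb_entry_A\t85.2\t300\t44\t0\t1\t300\t1\t300\t1e-150\t540",
    "gene_001\tdb_entry_B\t62.0\t290\t110\t2\t5\t294\t3\t292\t1e-80\t310",
    "gene_002\tdb_entry_C\t28.5\t120\t86\t3\t1\t120\t10\t129\t1e-6\t55",
    "gene_003\tdb_entry_D\t45.0\t60\t33\t1\t1\t60\t1\t60\t0.5\t30",
])

if __name__ == "__main__":
    for gene, hit in best_hits(EXAMPLE).items():
        print(gene, hit["subject"], hit["bitscore"])
```

In this toy input only `gene_001` receives an annotation; the hits for `gene_002` and `gene_003` fall below the identity and e-value thresholds and are left unannotated rather than assigned a weak match.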
KEGG has developed into a comprehensive database. Using the KEGG pathway database, we can identify differences in the metabolic pathways of microbial community functional genes between samples from different groups. We also predict the number and composition of functional genes, and report analysis results with statistical significance and confidence intervals.
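One simple way to attach a confidence interval to a between-group pathway difference is a percentile bootstrap, sketched below. This is an assumed, generic approach and not necessarily the statistical method used in the service; the function name `bootstrap_diff_ci` and the abundance values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_diff_ci(group_a, group_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b).

    Each group holds the relative abundance of one pathway across
    the samples of that group.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(group_a) - mean(group_b)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its size.
        resample_a = [rng.choice(group_a) for _ in group_a]
        resample_b = [rng.choice(group_b) for _ in group_b]
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, lower, upper


# Hypothetical relative abundances of one pathway in two sample groups.
GROUP_A = [0.12, 0.15, 0.14, 0.13]
GROUP_B = [0.05, 0.06, 0.04, 0.07]

if __name__ == "__main__":
    observed, lower, upper = bootstrap_diff_ci(GROUP_A, GROUP_B)
    print(f"difference = {observed:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

If the interval excludes zero, the pathway is a candidate for a real difference between groups. With only a few samples per group, as here, the interval is a rough guide, and a real analysis would also correct for testing many pathways at once.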
The GO (Gene Ontology) database classifies and aggregates published knowledge about gene function using a shared vocabulary. GO annotation provides an overview of the functional classification of all gene products of a species. In microbial research, GO analysis can be used to classify differentially expressed proteins, predict the function of related proteins and identify some microbial genes.
MetaCyc is a non-redundant database of metabolic pathways. Its goal is to catalogue the range of metabolism by storing representative examples of experimentally elucidated pathways, and to collect all known metabolic pathways across life. MetaCyc can be used to predict the metabolic pathways present in a sequenced genome.
EggNOG is an extension of NCBI's COG database that covers more species and a larger amount of protein sequence data. EggNOG groups homologous genes into orthologous groups and, for each group, carries out multiple sequence alignment, phylogenetic tree construction, HMM profile construction and functional annotation.
CAZy is a specialist database of enzymes that synthesize or break down complex carbohydrates and glycoconjugates. Carbohydrate-active enzymes are classified into protein families based on amino acid sequence similarity in their protein domains. The database was established to relate the sequence, structure and catalytic mechanism of these enzymes.
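To show how family-level CAZy annotations can be summarized, the sketch below counts family assignments by enzyme class, using the class prefixes found in CAZy family names (GH, GT, PL, CE, AA, CBM). The function name `count_cazy_classes` and the gene-to-family mapping are hypothetical; in practice the mapping would come from an annotation tool's output.

```python
import re
from collections import Counter

# Class prefixes used in CAZy family names.
CAZY_CLASSES = {
    "GH": "Glycoside Hydrolases",
    "GT": "GlycosylTransferases",
    "PL": "Polysaccharide Lyases",
    "CE": "Carbohydrate Esterases",
    "AA": "Auxiliary Activities",
    "CBM": "Carbohydrate-Binding Modules",
}

# Matches a family such as "GH13" or a subfamily such as "GH5_2".
FAMILY_PATTERN = re.compile(r"^(GH|GT|PL|CE|AA|CBM)(\d+)(?:_\d+)?$")


def count_cazy_classes(annotations):
    """Count family assignments per CAZy class.

    `annotations` maps a gene ID to the list of families assigned to it.
    Labels that do not look like a CAZy family are tallied separately.
    """
    counts = Counter()
    for families in annotations.values():
        for family in families:
            match = FAMILY_PATTERN.match(family)
            if match:
                counts[CAZY_CLASSES[match.group(1)]] += 1
            else:
                counts["unrecognized"] += 1
    return counts


# Hypothetical annotation result: gene IDs are made up.
EXAMPLE_ANNOTATIONS = {
    "gene_001": ["GH13", "CBM48"],
    "gene_002": ["GT2"],
    "gene_003": ["GH5_2"],
    "gene_004": ["XYZ1"],
}

if __name__ == "__main__":
    for cazy_class, n in count_cazy_classes(EXAMPLE_ANNOTATIONS).most_common():
        print(cazy_class, n)
```

A gene can carry more than one family, for example a catalytic domain plus a carbohydrate-binding module, so the totals count family assignments rather than genes.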
CARD is an antibiotic resistance gene database. It contains all the resistance information in the ARDB database. CARD annotation is now widely used in microbial gene annotation analysis, especially for predicting resistance genes in metagenomic sequencing results. Resistance gene prediction can be run in either BLAST or RGI mode.
Creative Biogene has many years of experience in microbial bioinformatics analysis. Our team is made up of professional scientists, researchers and technicians. We combine extensive project experience, strict data quality control and a professional analysis workflow to ensure that each project is carried out accurately and quickly. We look forward to working with you.
If you are interested in our services, please contact us for more details.