Sequence, Structure, & Function Hub

Description:

The role of the Annotation Module is to integrate and disseminate the many known biological descriptions (annotations) discovered by the PSI and held in other biological databases. By combining data from over 150 resources, the Structural Biology Knowledgebase (SBKB) helps present amino acid sequences in their biological context, enabling a better understanding of living systems and disease.

The PSI Structural Biology Knowledgebase Search Engine

Existing identifiers and values from over 100 PSI and other biological resources can all be found through a single search. Annotations stored in the SBKB are accessible by all three search types: by amino acid or nucleotide sequence, by PDB ID, and by text. 
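
As a rough illustration of how one search box could route a query to the three search types named above, here is a minimal Python sketch. It is not the SBKB implementation: the function name `classify_query`, the length threshold, and the alphabets are assumptions made for this example. It relies only on the classic PDB ID layout (a digit followed by three alphanumeric characters) and the standard one-letter residue codes.

```python
import re

# Classic PDB IDs are four characters: a digit followed by three alphanumerics.
PDB_ID_RE = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

# One-letter codes for the 20 standard amino acids, plus X for unknown.
PROTEIN_LETTERS = set("ACDEFGHIKLMNPQRSTVWYX")
# Nucleotide letters, plus N for unknown.
NUCLEOTIDE_LETTERS = set("ACGTUN")

# Hypothetical threshold: shorter inputs are treated as text, because
# ordinary words such as "KINASE" are also valid amino acid strings.
MIN_SEQUENCE_LENGTH = 20


def classify_query(query: str) -> str:
    """Return 'pdb_id', 'sequence', or 'text' for a raw search string."""
    if PDB_ID_RE.match(query.strip()):
        return "pdb_id"
    # Join pasted multi-line sequences; spaces inside a line are kept,
    # so multi-word text queries never pass the alphabet check below.
    compact = "".join(line.strip() for line in query.splitlines())
    letters = set(compact.upper())
    if len(compact) >= MIN_SEQUENCE_LENGTH and (
        letters <= PROTEIN_LETTERS or letters <= NUCLEOTIDE_LETTERS
    ):
        return "sequence"
    return "text"
```

With these assumptions, `classify_query("2pe2")` is routed to the PDB ID search and `classify_query("protein kinase")` to the text search. The heuristic is deliberately conservative: a four-character string such as a year would also be read as a PDB ID, so a real interface would likely let the user override the guess.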

Annotations are organized into easy-to-follow biological categories within our Annotation Notebook view: Gene-level (genetic/genomics resources), Protein-level (amino acid, primary structure), Structure-level (secondary through quaternary structure), Functions, Localization, Pathways, Medicine, and Reference. An example of our Protein-level tab is shown below for PDB ID 2pe2.

 


Annotation Resources accessible through the PSI SBKB

Resources created by the PSI

Additional Annotation Resources (listed alphabetically)

  • Astral - Compendium for Sequence and Structure Analysis.
  • Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins.
  • BMRB (Biological Magnetic Resonance Data Bank) - A repository for data from NMR spectroscopy on proteins, peptides, nucleic acids, and other biomolecules.
  • BRENDA Enzyme Information System.
  • CATH Protein Structure Classification resource.
  • DIP - a database that catalogs experimentally determined interactions between proteins.
  • EC2PDB - Enzyme Structure Database at the European Bioinformatics Institute (EBI).
  • Ensembl is a joint project between EMBL/EBI and the Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes.
  • Entrez - Life Sciences Search Engine at NCBI.
  • Enzyme Nomenclature Committee of the IUBMB - Recommendations on Biochemical & Organic Nomenclature, Symbols & Terminology.
  • Evolutionary Trace Viewer (ETV) - a method to view and run Evolutionary Traces, elucidating evolutionarily conserved amino acids within protein families.
  • Evolutionary Trace Report Maker - creates an integrated report about the evolutionary propensity of individual residues.
  • Expert Protein Analysis System (ExPASy).
  • Gene Ontology (GO) functional assignment for the proteins and Gene Ontology Browser at the European Bioinformatics Institute (EBI).
  • Gene3D - Reliable Structural Markup of the Protein Universe.
  • GeneDB at the Wellcome Trust Sanger Institute Pathogen Sequencing Unit (PSU) provides access to the latest sequence data and annotation/curation for 37 organisms sequenced by the PSU.
  • Integrated relational Enzyme database (IntEnz).
  • InterPro - a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
  • iProClass contains links to over 90 biological databases, including databases for protein families, functions and pathways, interactions, structures and structural classifications, genes and genomes, ontologies, literature, and taxonomy.
  • Kyoto Encyclopedia of Genes and Genomes (KEGG).
  • LabelHash matches arbitrary user-defined substructural motifs against all structures in the PDB or NRPDB.
  • NCBI Taxonomy Browser.
  • PDBSUM provides a variety of structure and function annotations.
  • Pfam is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).
  • PlasmoDB hosts genomic and proteomic data for different species of the parasitic eukaryote Plasmodium, the cause of malaria.
  • PRINTS is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterise a protein family; its diagnostic power is refined by iterative scanning of a SWISS-PROT/TrEMBL composite.
  • ProDom - a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases.
  • PROFUNC protein annotation server.
  • ProLinks - A database of proteins having functional linkages to an input protein.
  • ProKnow - A server to suggest functions and annotations for a protein of known structure.
  • ProSite describes protein domains, families and functional sites as well as associated patterns and profiles to identify them.
  • ProtoNet provides a global classification of the proteins from the SWISS-PROT (UNIPROT) database into hierarchical clusters.
  • SAVES - Structure Analysis and Verification server.
  • SCOP Structural Classification of Proteins resource.
  • TIGR - The Institute for Genomic Research.
  • UniProt (Universal Protein Resource) is the world's most comprehensive catalog of information on proteins.
  • WormBase - The Biology and Genome of C. elegans.

 


The PSI Centers also develop tools and resources to help predict/determine annotations for public use.

PSI Interactive Services for Sequence, Structure and Functional Annotations

  • The Protein Sequence Comparative Analysis (PSCA) server at JCSG is an integrated web tool for comparative analysis of protein sequences. PSCA analyzes a protein sequence in multiple layers: domains and families, secondary structure features, similarity to PDB protein structures, protein structure prediction, and protein homologues.
  • The Open Protein Structure Annotation Network (TOPSAN) at JCSG is a wiki-based project where automated target annotations are integrated with structure determination summary reports extracted from the JCSG database. TOPSAN provides access to outside collaborators to comment on and/or annotate a structure through an open mechanism similar to Wikipedia.
  • The ProFunc server at EBI identifies likely biochemical function of a protein from its three-dimensional structure. This has been done for MCSG structures.
  • The Tempura server performs a reverse template analysis on the results from ProFunc.
  • PROCOGNATE is a database which provides a mapping between cognate ligands and protein structural domains. This database pulls together data from EBI/MSD, KEGG, CATH, SCOP and UniProt, and can be searched in a variety of ways, including PDB code, PDB ligand, EC number, KEGG reaction, KEGG compound, SCOP and CATH superfamilies and free-text searches on ligand and structure names.
  • The Global Protein Surface Survey (GPSS) server provides analysis of all MCSG functionally annotated surfaces. Annotation can be viewed through the GPSS web interface or using a plug-in to the PyMOL molecular visualization program.
  • The NESG Functional Annotation Server provides functional annotation of NESG structures including multiple sequence alignments (ClustalW), sequence homologs (PSI-Blast), domain assignments (InterPro, Pfam), structure alignment (Dali, SKAN), and cavity analysis (SCREEN).
  • Human cancer protein interaction network (BIONET) at NESG
  • The AUTOPUBLISH server at NYSGXRC takes a PDB code as input and generates a variety of outputs, including experimental details, structure images, and standard functional and structural analyses, in a publication-ready format.
  • XtalPred Server: Prediction of Protein Crystallizability at JCSG

PSI Galleries and Summaries of Sequence, Structure and Functional Annotations
