KASII


SHBI/ 01579/2014




bioinformatics tool of your choice, covering the given subtopics,


NCBI stands for the National Center for Biotechnology Information. It is a
division of the United States National Library of Medicine (NLM), a branch of
the National Institutes of Health (NIH). It was founded in 1988 through
legislation sponsored by Senator Claude Pepper and is located in Bethesda,
Maryland.
The tool can be accessed directly at www.ncbi.nlm.nih.gov.
Its mission is to contribute to the NIH mission of uncovering new knowledge. It
contains resources for genomic, genetic and biomedical data. Research is
conducted by its intramural investigators. NCBI creates a variety of
educational products, including courses, workshops, webinars, training materials
and documentation. NCBI educational events are free and open to everyone.
The materials are available for anyone to use, reuse and distribute.

and courses- in person courses, live webinars and webinar recordings

and presentations- both exhibits and workshops at scientific conferences

training materials in the form of HTML, PDF and videos

online manuals, handbooks and FAQs





NCBI uses the Entrez system. Entrez is the text-based search and retrieval
system used at the National Center for Biotechnology Information (NCBI) for all
of the major databases, including PubMed, Nucleotide and Protein Sequences,
Protein Structures, Complete Genomes, Taxonomy, and many others.
It integrates data from a large number of sources, formats, and databases into
a uniform information model and retrieval system. The actual databases from
which records are retrieved, and on which the Entrez indexes are based, have
different designs depending on the type of data, and reside on different
machines. These are referred to as the source databases. A common theme in the
implementation of Entrez is that some functions are unique to each source
database, whereas others are common to all Entrez databases.
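
The idea of one retrieval system in front of many source databases can be sketched with the NCBI E-utilities, the programmatic interface to Entrez. The sketch below only builds ESearch request URLs and does not contact the server; the helper name `build_esearch_url` and the example search terms are assumptions made for illustration, not part of the original essay.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for a text query against one Entrez database.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# The same function serves different source databases: only `db` changes.
pubmed_url = build_esearch_url("pubmed", "mitochondrial exonuclease")
protein_url = build_esearch_url("protein", "MGME1[Gene Name]")
print(pubmed_url)
print(protein_url)
```

Opening either URL in a browser, or fetching it with any HTTP client, should return the matching record identifiers for that database, which is the "common to all Entrez databases" part of the design.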

Nodes are intended for linking. An Entrez node enables linking to other Entrez
nodes in a useful and reliable way. For example, given a protein sequence, it
is very useful to quickly find the nucleotide sequence that encodes it. Or,
given a research article, it is useful to find the sequences it describes, if
any. Links between nodes have been achieved by putting all information into one
record; for example, many GenBank records contain pertinent article citations.
However, PubMed also contains the article abstract and additional index terms;
furthermore, the bibliographic information in PubMed is more carefully curated
than the citation in a GenBank entry. It therefore makes much more sense to
search for articles in PubMed rather than in GenBank.

Once a subset of articles has been retrieved from PubMed, it may be useful to
link to sequence information associated with the abstracts. The article
citation in the GenBank record can be used to establish the link to PubMed and,
conversely, to make the reciprocal link from the PubMed article back to the
GenBank record. Treating each Entrez node separately but enabling linking
between related data in different nodes means that the retrieval
characteristics for each node can be optimized for the characteristics and
strengths of that node, whereas related data can be reached in nodes with
different strengths.

This approach also means that new connections between data can be made. In the
example above, the GenBank record cited the published article, but there was no
link from that article in PubMed to the sequence until Entrez made the
reciprocal link. Now, when searching articles in PubMed, it is possible to find
this sequence, although no PubMed records have been changed. Because of this
design principle, the Entrez system is richly interconnected, although any
particular association may originate from only one record in one node.
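
The reciprocal linking described above amounts to inverting a one-directional index. The following is a minimal sketch of that idea, not how Entrez is actually implemented; the function name and the record identifiers (`SEQ_A`, `ART_1` and so on) are hypothetical.

```python
from collections import defaultdict


def reciprocal_links(forward: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert sequence -> article links into article -> sequence links."""
    reverse = defaultdict(list)
    for accession, article_ids in forward.items():
        for article_id in article_ids:
            reverse[article_id].append(accession)
    # Sort so the result does not depend on the input order.
    return {article_id: sorted(accs) for article_id, accs in reverse.items()}


# Hypothetical identifiers: each sequence record cites one or more articles.
sequence_to_articles = {
    "SEQ_A": ["ART_1"],
    "SEQ_B": ["ART_1", "ART_2"],
}
article_to_sequences = reciprocal_links(sequence_to_articles)
print(article_to_sequences)
```

Note that the article records themselves are never edited: the reverse index is derived entirely from the citations stored on the sequence side, which mirrors the point that an association may originate from only one record in one node.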

Another type of linking in Entrez is between records of the same type, often
called neighbors, in sequence and structure nodes. Most often these
associations are computed at NCBI. For example, in Entrez Proteins, all of the
protein sequences are compared against each other using BLAST, and the
highest-scoring hits are stored as indexes within the node. This means that
each protein record has associated with it a list of highly similar sequences,
or neighbors.
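
Precomputing neighbors can be illustrated with a toy example. The sketch below does not use BLAST: it swaps in a simple Jaccard similarity over 3-letter substrings as a crude stand-in for an alignment score, and the sequences, identifiers and function names are all hypothetical.

```python
def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-length substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of k-mer sets; a crude stand-in for a BLAST score."""
    ka, kb = kmers(a), kmers(b)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def precompute_neighbors(seqs: dict[str, str], top_n: int = 2) -> dict[str, list[str]]:
    """For every record, store the ids of its top_n most similar other records."""
    neighbors = {}
    for rid, seq in seqs.items():
        scored = [(similarity(seq, other), oid)
                  for oid, other in seqs.items() if oid != rid]
        # Highest score first; ties broken by identifier for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        neighbors[rid] = [oid for score, oid in scored[:top_n] if score > 0]
    return neighbors


# Hypothetical toy protein sequences.
proteins = {
    "P1": "MKTAYIAKQR",
    "P2": "MKTAYIAKQL",
    "P3": "GGSWWPLHHC",
}
index = precompute_neighbors(proteins)
print(index["P1"])
```

Because the comparison is done once and stored, looking up the neighbors of a record later is a dictionary access rather than a fresh all-against-all search.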


The databases found on NCBI are grouped into the following: Assembly,
BioCollections, BioProject, BioSample, BioSystems, Books, ClinVar, Clone,
Conserved Domains, dbGaP, dbVar, EST, Gene, Genome, GEO DataSets, GEO Profiles,
GSS, GTR, HomoloGene, Identical Protein Groups, MedGen, MeSH, NCBI Web Site,
NLM Catalog, Nucleotide, OMIM, PMC, PopSet, Probe, Protein, Protein Clusters,
PubChem BioAssay, PubChem Compound, PubChem Substance, PubMed, PubMed Health,
SNP, SPARCLE, SRA, Structure, Taxonomy, ToolKit, ToolKitAll, ToolKitBookgh and
UniGene. One can search within a specific category or search comprehensively
across all categories.

Assembly provides information on the structure of assembled genomes, assembly
names and other metadata, statistical reports, and links to genomic sequence
data. Researchers can also access a curated set of metadata for culture
collections, museums, herbaria and other natural history collections. The
records display collection codes, information about the collections’ home
institutions, and links to relevant data at NCBI. This can help in learning the
distribution and source of the specimens. In a specific example from
ViralProj14703, the strain was Mayinga; the submitter was Philipps University
Marburg, Institute of Virology, Marburg, Germany; the assembly level was full;
and the GenBank assembly accession was

The Genome Reference Consortium (GRC), which is one of the categories in the
NCBI database, maintains responsibility for the human and mouse reference genomes.
Members consist of The Genome Center at Washington University, the Wellcome
Trust Sanger Institute, the European Bioinformatics Institute (EBI) and the
National Center for Biotechnology Information (NCBI). The GRC works to correct
misrepresented loci and to close remaining assembly gaps. In addition, the GRC
seeks to provide alternate assemblies for complex or structurally variant
genomic loci. At the GRC website (http://www.genomereference.org), the public
can view genomic regions currently under review, report genome-related problems
and contact the GRC.

NCBI also offers access to OMIM, a database of human genes and genetic
disorders. NCBI maintains current content and continues to support its
searching and integration with other NCBI databases. For example, the entry for
mitochondrial genome maintenance exonuclease 1 (MGME1), one of many available
in the OMIM category, may help researchers who want to know more about its
allelic variants and associated topics.

In PubChem Compound, one can search validated chemical structures (small
molecules) using names, synonyms or keywords. A compound record may link to
more than one PubChem Substance record if different depositors supplied the
same structure. These Compound records reflect validated chemical depiction
information provided to describe substances in PubChem Substance. Structures
stored within PubChem Compound are pre-clustered and cross-referenced by
identity and similarity groups. Additionally, calculated properties and
descriptors are available for searching and filtering of chemical structures.
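
Searching by name and retrieving calculated properties can also be done programmatically through PubChem's PUG REST interface. The sketch below only builds the request URL and does not contact the server; the helper name `compound_property_url` and the choice of aspirin as the example compound are assumptions made for illustration.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST interface.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(name: str, properties: list[str]) -> str:
    """Build a PUG REST URL that looks up a compound by name and requests properties.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    prop_list = ",".join(properties)
    encoded_name = quote(name, safe="")  # escape spaces and slashes in names
    return f"{PUG_REST_BASE}/compound/name/{encoded_name}/property/{prop_list}/JSON"


url = compound_property_url("aspirin", ["MolecularFormula", "MolecularWeight"])
print(url)
```

Fetching the printed URL should return the requested calculated properties for the matching compound record in JSON form.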

NCBI can provide background information, such as past projects, books and
research, for scientists, scholars and aspiring researchers.


NCBI is an American-based bioinformatics tool, so most upcoming African
scientists and researchers may not get a chance to attend live conferences and
workshops or to exhibit their inventions. Although live webinars and webinar
recordings can bring conferences to a global audience, they limit viewers'
ability to build networks with fellow scientists, which is vital. African
countries, Kenya included, should organize workshops, exhibits and conferences
in conjunction with NCBI to promote, motivate, involve and advise upcoming
scientists and researchers.
The website should also create video tutorials to help researchers use the
search tools efficiently. Some researchers may not know how to use this
bioinformatics tool, and hence training may be vital.




