About SeqBag
Biology certainly did not begin with today's genome projects, but these
projects have greatly accelerated the accumulation of biological
knowledge. Making sense of this enormous amount of data is a huge
challenge. The first step is to parse, simplify, classify and organize
the immense richness of sequence data. This has largely been achieved,
and the result is the many databases now available at a single click.
Capturing and reducing complexity is not enough, however: integrating
and analyzing data from diverse sources remains a major task. Sequence
analysis therefore plays an important role in biological research today.
SeqBag is a comprehensive yet compact web-based application dedicated
to sequence analysis. It offers a collection of modular tools that
provide different ways to analyze nucleotide and protein sequences.
SeqBag reads sequences in raw format. It has seven modules: DNA
Properties, Protein Properties, Restriction Analysis, EST Analysis,
Sequence Alignment, Six Reading Frame and Pattern Finder.
TASKS AND PROGRAMS
In the following section we describe different tasks and programs of SeqBag,
which are useful for sequence analysis:
Sequence Properties: Sequence properties are calculated with the DNA
Properties and Protein Properties modules. These two modules let the
user retrieve the properties of nucleotide and protein sequences.
1) DNA Properties Module:
The DNA Properties module lets the user analyze a DNA sequence and
retrieve its GC% and AT%, length, composition, and its reverse,
complementary, reverse complementary and transcribed sequences. The
tool also calculates molecular weight under several assumptions about
the DNA, e.g. anhydrous molecular weight assuming no 5' monophosphate,
anhydrous molecular weight assuming a 5' monophosphate is present,
molecular weight assuming a 5' triphosphate, and molecular weight of
single-stranded DNA. It also calculates the basic melting temperature
of the query DNA sequence.
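The properties listed above can be illustrated with a short Python sketch. This is not SeqBag's own code: the function name `dna_properties` is hypothetical, the transcript is obtained by treating the input as the coding strand (T replaced by U), and the melting temperature uses two commonly quoted basic formulas (the Wallace rule for sequences shorter than 14 bases, and a GC-content formula otherwise), which may differ from what SeqBag uses. Molecular weights are left out because the source does not state the formulas.

```python
from collections import Counter

# Translation table for complementing A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def dna_properties(seq: str) -> dict:
    """Return basic properties of a raw DNA sequence (hypothetical helper)."""
    seq = "".join(seq.split()).upper()  # drop whitespace, normalise case
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G, T")

    counts = Counter(seq)
    n = len(seq)
    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"]

    # Assumed "basic" melting temperature formulas (not confirmed by SeqBag):
    # Wallace rule for short sequences, GC-content formula for longer ones.
    if n < 14:
        tm = 2 * at + 4 * gc
    else:
        tm = 64.9 + 41 * (gc - 16.4) / n

    complement = seq.translate(COMPLEMENT)
    return {
        "length": n,
        "composition": {base: counts[base] for base in "ACGT"},
        "gc_percent": 100 * gc / n,
        "at_percent": 100 * at / n,
        "reverse": seq[::-1],
        "complement": complement,
        "reverse_complement": complement[::-1],
        "transcript": seq.replace("T", "U"),  # input treated as coding strand
        "basic_tm_celsius": tm,
    }
```

For example, `dna_properties("ATGCGC")["reverse_complement"]` gives `"GCGCAT"`.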
2) Protein Properties Module:
The Protein Properties module analyzes a protein sequence. The user
can retrieve the composition, length, total molecular weight, exact
weight, approximate protein volume, extinction coefficient (note 1) [3],
optical density (note 2), and grand average of hydropathy (note 3) of
the given protein sequence.
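Two of these quantities can be sketched in a few lines of Python. The source does not say which formulas SeqBag applies, so the following are assumptions: the grand average of hydropathy uses the Kyte-Doolittle scale, and the extinction coefficient at 280 nm uses the widely used estimate of 5500 per tryptophan, 1490 per tyrosine and 125 per cystine (in M^-1 cm^-1). The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def extinction_coefficient_280(seq: str, cystines: int = 0) -> int:
    """Estimated molar extinction coefficient at 280 nm (assumed formula).

    `cystines` is the number of disulfide-bonded cysteine pairs; it cannot
    be read from the sequence alone, so the caller supplies it.
    """
    seq = seq.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines
```

A positive GRAVY value suggests an overall hydrophobic sequence and a negative one an overall hydrophilic sequence.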
3) Restriction Enzyme Analysis:
A restriction enzyme cuts double-stranded DNA at specific sequences
within it, known as restriction sites. The restriction enzyme analysis
tool performs the analysis with 30 of the most commonly used restriction
enzymes, which are listed below:
| AaatIII | AccIII | Acc65I | AccB7I | AgeI |
| AluI | Alw44I | ApaI | BalI | BamHI |
| BbuI | BclI | BglII | BglII | BsaMI |
| BsrBRI | BsrSI | BssHII | EcoRI | BamHI |
| HindIII | TaqI | NotI | HinfI | Sau3A |
| PvuII | SmaI | HaeIII | AluI | EcoRV |
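As an illustration of how such an analysis can work, the sketch below searches a linear DNA sequence for the recognition sites of a few of the listed enzymes and cuts at the known position within each site. It is an assumption-laden example rather than SeqBag's implementation: the `SITES` table covers only five enzymes, the names `find_cut_positions` and `digest` are hypothetical, and only the top strand is considered (the sites shown are palindromic, so both strands carry the same site).

```python
# Recognition site and cut offset (bases from the start of the site on the
# top strand) for a small, illustrative subset of enzymes.
SITES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "SmaI": ("CCCGGG", 3),     # CCC^GGG
    "EcoRV": ("GATATC", 3),    # GAT^ATC
}


def find_cut_positions(seq: str, enzyme: str) -> list[int]:
    """Return 0-based indices at which the enzyme cuts the top strand."""
    site, offset = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    return cuts


def digest(seq: str, enzyme: str) -> list[str]:
    """Split a linear sequence into the fragments produced by one enzyme."""
    seq = seq.upper()
    bounds = [0] + find_cut_positions(seq, enzyme) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

For instance, `digest("AAGAATTCTTGGATCCAA", "EcoRI")` yields the two fragments `"AAG"` and `"AATTCTTGGATCCAA"`.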
4) EST Analysis:
Expressed Sequence Tags (ESTs) are short (usually about 300-500 bp),
single-pass sequence reads from mRNA (cDNA). They are typically produced
in large batches and represent a snapshot of the genes expressed in a
given tissue and/or at a given developmental stage. They are tags (some
coding, others not) of expression for a given cDNA library. The EST
analysis tool lets the user concatenate any number of ESTs.
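Concatenation of raw ESTs can be sketched as follows. The function name `concatenate_ests` is hypothetical, and the cleaning steps (removing whitespace, upper-casing, skipping empty entries) are assumptions about reasonable handling of raw input, not a description of SeqBag's behaviour.

```python
def concatenate_ests(ests: list[str]) -> str:
    """Join any number of raw EST sequences into one sequence."""
    # Strip all whitespace (including line breaks) and normalise case.
    cleaned = ["".join(est.split()).upper() for est in ests]
    # Skip entries that are empty after cleaning.
    return "".join(est for est in cleaned if est)
```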
5) Sequence Alignment:
Sequence alignment is a way of arranging the primary sequences of DNA,
RNA, or protein to identify regions of similarity that may be a consequence
of functional, structural, or evolutionary relationships between the
sequences. Aligned sequences of nucleotide or amino acid residues are
typically represented as rows within a matrix. Gaps are inserted between
the residues so that identical or similar residues are aligned in
successive columns. The sequence alignment tool performs a pairwise
alignment of two sequences and reports the matches, mismatches and
similarity.
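The idea of pairwise alignment with gaps can be illustrated with a minimal Needleman-Wunsch global alignment. The source does not state which algorithm or scoring scheme SeqBag uses, so the algorithm choice, the scores (match +1, mismatch -1, gap -1), the function names, and the definition of similarity as matches divided by alignment length are all assumptions made for this sketch.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> tuple[str, str]:
    """Needleman-Wunsch global alignment; returns the two gapped rows."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def alignment_stats(top: str, bottom: str) -> dict:
    """Count matches, mismatches and gaps in two aligned rows."""
    pairs = list(zip(top, bottom))
    matches = sum(x == y for x, y in pairs if "-" not in (x, y))
    mismatches = sum(x != y for x, y in pairs if "-" not in (x, y))
    gaps = sum("-" in (x, y) for x, y in pairs)
    return {"matches": matches, "mismatches": mismatches, "gaps": gaps,
            "similarity_percent": 100 * matches / len(pairs)}
```

Aligning `"ACGT"` with `"AGT"` under these scores places a single gap opposite the `C`, giving three matches in four columns.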
6) Six Reading Frame:
A reading frame is a contiguous, non-overlapping set of three-nucleotide
codons in DNA or RNA. There are three possible reading frames in an mRNA
strand and six in a double-stranded DNA molecule, because transcription
is possible from either of the two strands. The Six Reading Frame tool
finds the six possible reading frames of a nucleotide sequence.
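The six frames can be derived by reading the sequence from offsets 0, 1 and 2, and doing the same on its reverse complement. The sketch below also translates each frame with the standard genetic code; whether SeqBag translates the frames or only lists the codons is not stated in the source, and the function names and frame labels (+1 to +3, -1 to -3) are hypothetical. Input is assumed to contain only A, C, G and T.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; trailing 1-2 bases are ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq: str) -> dict[str, str]:
    """Return translations of the three forward and three reverse frames."""
    seq = "".join(seq.split()).upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames
```

Stop codons appear as `*` in the translated frames.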
7) Pattern Finder:
A protein structure is not fully deciphered until all the domains and
motifs present in it are recognized. Specific sequence patterns are
known to be responsible for the formation of particular domains or
motifs. Pattern Finder identifies such patterns and thereby aids
comparative studies. It accepts a pattern (which can be a domain or
motif) and locates the sequence regions that match the pattern of
interest.
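Pattern location of this kind can be sketched with regular expressions. The pattern syntax SeqBag accepts is not described in the source, so this example assumes the pattern is given as a Python regular expression; the function name `find_pattern` is hypothetical. A lookahead is used so that overlapping occurrences are also reported.

```python
import re


def find_pattern(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for every match.

    `pattern` is assumed to be a Python regular expression written in
    upper case, e.g. "N[^P][ST][^P]" for an N-glycosylation-like motif.
    """
    sequence = "".join(sequence.split()).upper()
    # Wrapping the pattern in a lookahead finds overlapping matches too.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(1) + 1, m.group(1)) for m in regex.finditer(sequence)]
```

With the motif above, the sequence `"MNGTANASPNPT"` has one match, `"NGTA"`, starting at position 2.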
| 1. | Extinction Coefficient: The extinction coefficient of a substance is a measure of how well it absorbs electromagnetic radiation. |
| 2. | Optical Density: Optical density is the absorbance of an optical element at a given wavelength. |
| 3. | Hydropathy: The hydrophobic character of a sequence, which may be useful in predicting membrane-spanning domains, potential antigenic sites and regions that are likely to be exposed on the protein's surface. |