The SeqBag Project
  Bioinformatics Research Lab.
  IBI Biosolutions Pvt. Ltd.

EXTINCTION COEFFICIENT

The extinction coefficient for a particular substance is a measure of how well it absorbs electromagnetic radiation (EM waves).
If the EM wave can pass through very easily, the material has a low extinction coefficient. Conversely, if the radiation hardly penetrates the material, but rather quickly becomes "extinct" within it, the extinction coefficient is high.

A material can behave differently for different wavelengths of electromagnetic radiation. Glass is transparent to visible light, but many types of glass are opaque to ultra-violet wavelengths. In general, the extinction coefficient for any material is a function of the incident wavelength. The extinction coefficient is used widely in ultraviolet-visible spectroscopy.

It has been shown that it is possible to estimate the molar extinction coefficient of a protein from knowledge of its amino acid composition. From the molar extinction coefficient of tyrosine, tryptophan and cystine (cysteine does not absorb appreciably at wavelengths >260 nm, while cystine does) at a given wavelength, the extinction coefficient of the native protein in water can be computed using the following equation:

E=(yy*extY)+(ww*extW)+(cc*extC);
(for proteins in water measured at 280 nm)

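Where
E = extinction coefficient
yy = number of tyrosine residues
ww = number of tryptophan residues
cc = number of cystine residues (disulfide-linked cysteine pairs)
extY = 1280
extW = 5690
extC = 120
(molar extinction coefficients at 280 nm, in M^-1 cm^-1)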
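
As a minimal sketch of the equation above, the following Python function counts the relevant residues in a one-letter protein sequence and applies the three coefficients. The function name, the `cystines` parameter and the default rule that every two cysteines form one cystine are assumptions made for illustration; they are not taken from SeqBag itself.

```python
# Molar extinction coefficients at 280 nm in water, as listed above.
EXT_TYR = 1280
EXT_TRP = 5690
EXT_CYSTINE = 120


def extinction_coefficient(sequence, cystines=None):
    """Estimate E at 280 nm from a one-letter protein sequence.

    If `cystines` is not given, assume (hypothetically) that every pair
    of cysteine residues forms one cystine.
    """
    seq = sequence.upper()
    yy = seq.count("Y")  # tyrosine
    ww = seq.count("W")  # tryptophan
    cc = seq.count("C") // 2 if cystines is None else cystines
    return yy * EXT_TYR + ww * EXT_TRP + cc * EXT_CYSTINE


# One Tyr, one Trp and two Cys (counted as one cystine).
print(extinction_coefficient("MKWVTFYCCA"))  # 7090
```

Passing `cystines=0` gives the value for a fully reduced protein, in which no disulfide bonds are present.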

Physical Definitions
The parameter used to describe the interaction of electromagnetic radiation with matter is the complex index of refraction, ñ, which is a combination of a real part and an imaginary part:

ñ = n + i*k

(some texts use the opposite sign convention, ñ = n - i*k)

Here, n is the real part, which is itself also called the index of refraction; this sometimes leads to confusion. k is the extinction coefficient, which represents the damping of an EM wave inside the material. Both depend on the wavelength.

OPTICAL DENSITY

Optical density is the absorbance of an optical element for a given wavelength λ per unit distance:

Optical Density = E / Molecular weight

Where
E= Extinction coefficient
Molecular weight= of the given protein.

Although absorbance does not have true units, it is quite often reported in "Absorbance Units" or AU. Accordingly, optical density is measured in ODU, which are equivalent to AU cm^-1.
The higher the optical density, the lower the transmittance. Optical density times 10 is equal to a transmission loss rate expressed in decibels per cm; e.g., an optical density of 0.3 corresponds to a transmission loss of 3 dB per cm.
Optical density is sometimes defined without regard to the length of the sample; in this case it is a synonym for absorbance. Neutral density filters are typically quantified this way.
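
The two relations in this section can be sketched in Python as follows. The function names are hypothetical, and the numbers in the usage lines are illustrative inputs rather than measured values.

```python
def optical_density(extinction_coefficient, molecular_weight):
    """Optical Density = E / Molecular weight, as defined above."""
    if molecular_weight <= 0:
        raise ValueError("molecular weight must be positive")
    return extinction_coefficient / molecular_weight


def transmission_loss_db_per_cm(od):
    """Optical density times 10 gives the loss in decibels per cm."""
    return 10 * od


# Illustrative inputs: E = 7090 and a molecular weight of 14200.
od = optical_density(7090, 14200)
print(round(od, 3))
print(transmission_loss_db_per_cm(0.3))
```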

GRAND HYDROPATHY VALUE

The grand hydropathy value describes the overall hydrophobic character of a protein, which may be useful in predicting membrane-spanning domains, potential antigenic sites and regions that are likely to be exposed on the protein's surface. It is calculated as the sum of the hydropathy values of all the amino acids, divided by the number of residues in the sequence.
Amino acid HYDROPATHY VALUES (Kyte-Doolittle scale):
Ala: 1.800
Arg: -4.500
Asn: -3.500
Asp: -3.500
Cys: 2.500
Gln: -3.500
Glu: -3.500
Gly: -0.400
His: -3.200
Ile: 4.500
Leu: 3.800
Lys: -3.900
Met: 1.900
Phe: 2.800
Pro: -1.600
Ser: -0.800
Thr: -0.700
Trp: -0.900
Tyr: -1.300
Val: 4.200
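
A short Python sketch of the calculation described above, using the Kyte-Doolittle values keyed by one-letter code. The function name `grand_hydropathy` and the choice to reject unknown residue letters are assumptions made for this example.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def grand_hydropathy(sequence):
    """Sum of the hydropathy values divided by the number of residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(HYDROPATHY)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return sum(HYDROPATHY[aa] for aa in seq) / len(seq)


# A strongly hydrophobic peptide gives a positive value.
print(round(grand_hydropathy("AVIL"), 3))
```

A positive result indicates a predominantly hydrophobic sequence and a negative result a predominantly hydrophilic one.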


TRIPLET CODONS

The genetic code is the set of rules by which information encoded in genetic material (DNA or RNA sequences) is translated into proteins (amino acid sequences) by living cells. Specifically, the code defines a mapping between tri-nucleotide sequences called codons and amino acids; every triplet of nucleotides in a coding sequence specifies a single amino acid or a stop signal. Because the vast majority of genes are encoded with exactly the same code, it is often referred to as the standard genetic code.
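
The codon-to-amino-acid mapping can be sketched in Python as a lookup table. The table below is the standard genetic code written in the conventional T, C, A, G base order, with `*` marking stop codons; the function name `translate` and the decision to ignore an incomplete trailing codon are assumptions made for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a DNA or RNA sequence, read from its first base."""
    dna = sequence.upper().replace("U", "T")
    # Step through complete triplets only; a trailing partial codon is ignored.
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


print(translate("AUGGCCUAA"))  # MA*
```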


MELTING TEMPERATURE

DNA melting is the process by which deoxyribonucleic acid is heated until the helical structure unwinds because the hydrogen bonds between bases break. For a population of DNA molecules, the melting temperature (Tm) is defined as the temperature at which half of the DNA strands are in the double-helical state and half are in the "random-coil" state. The melting temperature depends on both the length of the molecule and its specific nucleotide sequence composition.

Basic Melting Temperature (Tm) Calculations:-
Two standard approximations are used. For sequences shorter than 14 nucleotides the formula is

Tm= (a+t) * 2 + (g+c) * 4

(WALLACE METHOD:- The reasoning behind the method is that, because cytosine-guanine pairs form three hydrogen bonds compared to the two hydrogen bonds between adenine and thymine, they contribute more to the stability of a double-helix.)

Where:- a, t, g, c are the numbers of the bases A, T, G, C in the sequence, respectively.
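
A minimal Python sketch of the Wallace formula above. The function name is hypothetical, and the length restriction (fewer than 14 nucleotides) is left to the caller.

```python
def wallace_tm(sequence):
    """Tm = (a+t)*2 + (g+c)*4, intended for sequences under 14 nt."""
    seq = sequence.upper()
    a, t = seq.count("A"), seq.count("T")
    g, c = seq.count("G"), seq.count("C")
    return (a + t) * 2 + (g + c) * 4


# 10-mer with 3 A, 2 T, 2 G and 3 C: (3+2)*2 + (2+3)*4
print(wallace_tm("ACGTACGTAC"))  # 30
```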

For sequences having 14 or more nucleotides, a commonly used equation is

Tm = 64.9 + 41 * (g + c - 16.4) / (a + t + g + c)

MOLECULAR WEIGHT

=> NUCLEOTIDE

DNA is composed of base pairs, and DNA from different life forms is made up of different numbers
of bases. A mole of one nucleotide weighs about 325 g. Humans have about 3 x 10^9 bp in each copy of their DNA
(two copies in total, one from each parent), which amounts to about 6 picograms of DNA per cell.

The molecular weight of a DNA molecule depends on the number of nitrogenous bases (the phosphates
and sugars contribute a constant amount per nucleotide) and therefore on the length of the DNA in question.

Anhydrous Molecular Weight = (a*313.21) + (t*304.2) + (c*289.18) + (g*329.21) - 61.96

Where:
a, t, g, c are the number of each respective nucleotide within the polynucleotide.
The subtraction of 61.96 gm/mole from the oligonucleotide molecular weight takes into account the removal of HPO2 (63.98) and the addition of two hydrogens (2.02).

Molecular Weight = (a*313.21) + (t*304.2) + (c*289.18) + (g*329.21) - 61.96 + 79.0

The addition of 79.0 gm/mole to the oligonucleotide molecular weight takes into account the 5' monophosphate left by most restriction enzymes.
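
Both DNA formulas can be sketched as one Python function with a flag for the 5' monophosphate. The function and parameter names are assumptions made for this example; the per-nucleotide weights and corrections are the ones given above.

```python
# Per-nucleotide weights (g/mole) from the formulas above.
DNA_WEIGHTS = {"A": 313.21, "T": 304.2, "C": 289.18, "G": 329.21}


def dna_molecular_weight(sequence, five_prime_phosphate=False):
    """Anhydrous weight of a DNA strand, optionally with a 5' monophosphate."""
    seq = sequence.upper()
    weight = sum(DNA_WEIGHTS[base] for base in seq) - 61.96
    if five_prime_phosphate:
        weight += 79.0  # 5' monophosphate left by most restriction enzymes
    return weight


print(round(dna_molecular_weight("ACGT"), 2))
print(round(dna_molecular_weight("ACGT", five_prime_phosphate=True), 2))
```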

RNA Molecular Weight (assuming that there is a 5' triphosphate on the molecule):-

Molecular Weight = (a*329.21) + (u*306.17) + (c*305.18) + (g*345.21) + 159.0

Where
a, u, c, and g are the number of each respective nucleotide within the polynucleotide.
Addition of 159.0 gm/mole to the molecular weight takes into account the 5' triphosphate.
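
The RNA formula follows the same pattern. This is a sketch under the same assumption as the formula above, namely that the molecule carries a 5' triphosphate; the function name is hypothetical.

```python
# Per-nucleotide weights (g/mole) from the RNA formula above.
RNA_WEIGHTS = {"A": 329.21, "U": 306.17, "C": 305.18, "G": 345.21}


def rna_molecular_weight(sequence):
    """Weight of an RNA strand, assuming a 5' triphosphate (+159.0)."""
    seq = sequence.upper()
    return sum(RNA_WEIGHTS[base] for base in seq) + 159.0


print(round(rna_molecular_weight("ACGU"), 2))
```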

=> PROTEIN

The molecular weight of a protein is the grand total weight of the amino acids in that particular sequence. When the amino acids bond together to form peptide bonds, one molecule of H2O is lost per bond. Hence the exact molecular weight of the protein requires subtracting the weight of these H2O molecules.

Water weight = (length of sequence - 1) * 18

Exact weight of protein = Total weight - Water weight
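
A minimal Python sketch of this calculation. The function takes the amino acid weights as a mapping so that any table can be supplied; the function name, the parameter names and the three-residue example table are assumptions made for illustration.

```python
WATER = 18  # weight of one H2O, as used in the formula above


def protein_molecular_weight(sequence, residue_weights):
    """Total amino acid weight minus one water per peptide bond."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    total_weight = sum(residue_weights[aa] for aa in seq)
    water_weight = (len(seq) - 1) * WATER
    return total_weight - water_weight


# Example table holding only the residues used below (Gly, Ala, Ser).
weights = {"G": 75.07, "A": 89.09, "S": 105.09}
print(round(protein_molecular_weight("GAS", weights), 2))
```

For the tripeptide above, the total weight is 269.25 and two peptide bonds remove 36, leaving 233.25.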

ORFs-Open Reading Frames

An open reading frame or ORF is a portion of an organism's genome which contains a sequence of bases that could potentially encode a protein. The start and stop ends of the ORF are not equivalent to the ends of the mRNA, but they are usually contained within the mRNA. In a gene, ORFs are located between the start-code sequence (initiation codon) and the stop-code sequence (termination codon). ORFs are usually encountered when sifting through pieces of DNA while trying to locate a gene. Since there exist variations in the start-code sequence of organisms with altered genetic code, the ORF will be identified differently.

For example, the sequence 5'-UCUAAAGGUCCA-3' is open in 2 of its 3 possible reading frames. It is one of the 2 possible mRNA sequences of the transcript (one per DNA strand), and it can be read in 3 possible ways:

1. UCU AAA GGU CCA

2. CUA AAG GUC etc

3. UAA AGG UCC etc

Frames 1 and 2 contain no stop codon and are therefore open; frame 3 begins with the stop codon UAA.
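
The three readings can be generated with a short Python sketch that splits the sequence into codons at each offset and checks each frame for stop codons. The function names are hypothetical, and only the given strand is examined.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def reading_frames(rna):
    """Return the complete codons of the three forward reading frames."""
    rna = rna.upper().replace("T", "U")
    return [
        [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]
        for offset in range(3)
    ]


def is_open(codons):
    """A frame is open if it contains no stop codon."""
    return not any(codon in STOP_CODONS for codon in codons)


for number, frame in enumerate(reading_frames("UCUAAAGGUCCA"), start=1):
    print(number, " ".join(frame), "open" if is_open(frame) else "closed")
```

For this sequence the first two frames come out open and the third is closed, because it starts with UAA.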

APPROXIMATE VOLUME OF PROTEIN

A peptide's volume can be estimated from the molecular weight of the peptide and an average protein partial specific volume.
The simple calculation starts from

(0.73 cm^3/g * 10^24 Å^3/cm^3 * molecular weight g/mole) / (6.02 * 10^23 molecules/mole)

and results in a protein volume of approximately:

(1.21 * MW) Å^3/molecule
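
The same arithmetic as a small Python sketch. The function name and the example molecular weight are assumptions; the partial specific volume defaults to the average value of 0.73 cm3/g used above.

```python
AVOGADRO = 6.02e23            # molecules per mole, as used above
CUBIC_ANGSTROM_PER_CM3 = 1e24


def protein_volume(molecular_weight, partial_specific_volume=0.73):
    """Approximate volume of one protein molecule in cubic angstroms."""
    return (partial_specific_volume * CUBIC_ANGSTROM_PER_CM3
            * molecular_weight / AVOGADRO)


# A hypothetical 10000 g/mole protein: roughly 1.21 * MW cubic angstroms.
print(round(protein_volume(10000)))
```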

Amino Acid Table

AMINO ACID        MOLECULAR WEIGHT
Alanine           89.09
Cysteine          121.16
Aspartate         133.10
Glutamate         147.13
Phenylalanine     165.19
Glycine           75.07
Histidine         155.16
Isoleucine        131.17
Lysine            146.19
Leucine           131.17
Methionine        149.21
Asparagine        132.12
Proline           115.13
Glutamine         146.15
Arginine          174.20
Serine            105.09
Threonine         119.12
Valine            117.15
Tyrosine          181.19
Tryptophan        204.23


ESTs-Expressed Sequence Tags

An expressed sequence tag or EST is a short sub-sequence of a transcribed spliced nucleotide sequence (either protein-coding or not). They are intended as a way to identify gene transcripts, and are instrumental in gene discovery and gene sequence determination. The identification of ESTs has proceeded rapidly, with approximately 42 million ESTs now available in public databases.

An EST is produced by one-shot sequencing of a cloned mRNA (i.e. sequencing several hundred base pairs from an end of a cDNA clone taken from a cDNA library). The resulting sequence is a relatively low quality fragment whose length is limited by current technology to approximately 500 to 800 nucleotides.