DNA Sequencing Help

By McGraw-Hill Professional
Updated on Aug 23, 2011

DNA Sequencing

Methods for determining the order of DNA nucleotides in a segment of DNA have been in existence since the 1970s. The primary method used today is called Sanger sequencing, after its inventor Fred Sanger. Several variations of this method have been developed over the years, but the most common method utilizes a modified polymerase chain reaction (PCR), called cycle sequencing, with four fluorescently labeled chain-terminating nucleotides.

The fragment of DNA to be sequenced (the target or template DNA) is most commonly contained within a plasmid or other type of cloning vector, or it can be a purified PCR amplicon. A DNA primer that will anneal specifically to a known sequence near the 3' end of the template strand is chemically synthesized. Such primers are relatively short (15–20 nucleotides long), single-stranded oligonucleotides. The two strands of the template DNA are denatured to single strands by heat, and the primer molecules bind to their complementary sequence on the desired template strand as the mixture cools (Figure 12-14).
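The annealing step can be illustrated with a minimal Python sketch. It assumes both sequences are written 5' to 3', so the primer binds wherever the template contains the primer's reverse complement. The template and the 8-nucleotide primer below are invented for illustration (real primers are longer, as noted above), and the function names are hypothetical.

```python
# Watson-Crick base pairing used to build the complementary strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def annealing_site(template, primer):
    """Return the 0-based index on the template (5'->3') where the primer
    can base-pair, or -1 if the template has no complementary site."""
    return template.find(reverse_complement(primer))


# Hypothetical template strand and a short primer designed against
# its 3' end (both written 5'->3'; sequences are made up).
template = "GATTACAGGCTTAACCGGTATCGA"
primer = "TCGATACC"

print(annealing_site(template, primer))  # 16, i.e. the last 8 bases
```

The sketch checks exact complementarity only; it ignores mismatches, melting temperature, and secondary structure, all of which matter when designing real primers.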

DNA polymerase is added together with a mixture of the four deoxyribonucleoside triphosphates [dATP, dCTP, dGTP, and dTTP; together referred to as dNTPs (deoxyribonucleotides), Fig. 12-15(a)] and a mix of the four dideoxynucleoside triphosphates [ddATP, ddCTP, ddGTP, and ddTTP; together referred to as ddNTPs (dideoxyribonucleotides), Fig. 12-15(b)]. Each of the four ddNTPs is labeled with a different fluorescent dye.

These labeled dideoxynucleotides are called chain-extension terminators (chain terminators) because they have no 3' hydroxyl group with which to form the next internucleotide 3' to 5' phosphodiester linkage. Thus, once a ddNTP is added (bold type in Fig. 12-14) to an extending chain, further extension ceases. A suitable ratio of ddNTPs to normal dNTPs is chosen so that only a relatively small fraction of the growing extension products is terminated at any given position. The primed complexes are extended by DNA polymerase toward their 3' ends by random incorporation of a nucleotide from either the dNTP or the ddNTP pool at each position, so that ideally every possible extension chain length is produced. These reactions are carried out in a PCR format in a process called cycle sequencing.

Polyacrylamide gel electrophoresis [Fig. 12-14(a)] is then applied under conditions that separate extension products differing in length by a single nucleotide, in a single column or lane on the gel. Because DNA molecules are negatively charged, they migrate toward the anode (positive electrode) during electrophoresis, and because the gel matrix retards larger molecules more than smaller ones, the smaller the extension product, the faster it migrates through the gel. The last nucleotide in the extension product (indicated by the specific fluorescent dye) in the fastest-migrating band thus represents the 5' nucleotide of one strand in the target DNA, and the last nucleotide in the extension product in the slowest-migrating band represents the 3' nucleotide of that strand. When each population of extension products of identical length passes a fluorescence scanner as a band, the fluorescent light emitted is recorded and analyzed by a computer to yield the entire sequence of that stretch of target DNA (an electropherogram).
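The logic of reading a sequence from chain-terminated products can be sketched in a few lines of Python. This is an idealized model, not a simulation of the chemistry: it assumes exactly one terminated product exists for every possible length, represents each product only by its length and the dye on its terminal ddNTP, and uses an invented primer. The function names are hypothetical.

```python
import random


def chain_terminated_products(primer, new_strand):
    """Idealized reaction: one product per possible termination point.

    Each product is recorded as (total length, terminal ddNTP base);
    the terminal base stands in for the fluorescent dye color.
    """
    products = [
        (len(primer) + i, new_strand[i - 1])
        for i in range(1, len(new_strand) + 1)
    ]
    random.shuffle(products)  # products are mixed together in the tube
    return products


def read_gel(products):
    """Shortest products migrate fastest, so sorting by length and reading
    the terminal dye of each band gives the new strand 5'->3'."""
    return "".join(dye for _, dye in sorted(products))


# Hypothetical primer; the strand being synthesized is 5' ACTCCCGATTCTA 3'.
products = chain_terminated_products("TCGATACC", "ACTCCCGATTCTA")
print(read_gel(products))  # ACTCCCGATTCTA
```

Sorting by length plays the role of electrophoresis here; in a real run, uneven termination frequencies and band compression make base calling less clean than this sketch suggests.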

A typical sequencing gel can now yield 500–600 bp of sequence information per lane. To sequence a longer stretch of target DNA (> 500–600 bp), multiple different sequencing reactions must be carried out, each beginning with a primer designed to anneal to and direct DNA synthesis from a different region of the DNA. Automated equipment can now accommodate almost 100 lanes in one electrophoresis run. That theoretically translates to approximately 50,000 bp of sequence data generated in about 4–6 h. This kind of productivity requires considerable technical and robotic assistance, but it has made possible the relatively rapid sequencing of large genomes (such as the human genome).
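The throughput figures above follow from simple arithmetic, sketched here in Python. The lane count and read length come from the text; the 3,000 bp target and the optional overlap between neighboring reads are assumptions added for illustration, and the function names are hypothetical.

```python
import math


def run_throughput(lanes, bp_per_lane):
    """Total bases read in one electrophoresis run."""
    return lanes * bp_per_lane


def min_reads(target_bp, read_bp, overlap_bp=0):
    """Minimum number of sequencing reactions needed to span a target.

    overlap_bp is a hypothetical overlap between neighboring reads; after
    the first read, each further read adds (read_bp - overlap_bp) new bases.
    """
    if target_bp <= read_bp:
        return 1
    return 1 + math.ceil((target_bp - read_bp) / (read_bp - overlap_bp))


print(run_throughput(100, 500))  # 50000 bp per run
print(min_reads(3000, 500))      # 6 reads with no overlap
print(min_reads(3000, 500, 50))  # 7 reads with 50 bp overlaps
```

These are lower bounds: they assume every lane yields a full-length, usable read.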

Sanger sequencing procedures are limited by the number of DNA nucleotides they can sequence in a given time period. Although this technique was used to sequence the human genome, new techniques, such as pyrosequencing, have been developed to increase sequencing throughput and lower costs. Like Sanger sequencing, pyrosequencing uses a DNA polymerase enzyme to add dNTPs to a growing DNA chain extending from a primer, but it differs in several respects. The dNTPs are regular nucleotides rather than chain terminators, and they are added to the sequencing reaction mixture one at a time in a sequential manner. When a dNTP is incorporated into the growing DNA chain by the DNA polymerase enzyme, a single molecule of pyrophosphate (PPi), from which the method takes its name, is given off. An enzyme, ATP sulfurylase, converts the pyrophosphate to ATP (adenosine triphosphate), which is used by the enzyme luciferase to convert luciferin to oxyluciferin. This conversion produces light in proportion to the amount of ATP. The light is detected using a device such as a charge-coupled device (CCD) camera and recorded in a pyrogram trace that is similar to the electropherogram output from cycle sequencing. When two dNTPs of the same type are incorporated next to each other in the growing DNA chain, twice the amount of light is produced, so a double signal is recorded. If no dNTP is incorporated, no light is produced and no signal is detected. An enzyme called apyrase is always present in the reaction to continually degrade the unincorporated dNTPs and the unused ATP. The luciferase/luciferin light-generating system comes from the light-emitting abdominal organs of the firefly (Photinus pyralis).

Unique DNA molecules from a library are immobilized onto tiny beads that are placed into individual wells of a picotiter plate, allowing over 1 million DNA fragments to be sequenced in 10 hours. Each fragment generates up to 500 bp of sequence, so up to about 500 million bp of sequencing information can be captured in a short period of time. This allows entire genomes or large environmental genomic libraries to be sequenced in a few days (as opposed to months or years for very large genomes). The individual sequences are "assembled" into a larger continuous genome sequence using bioinformatics software that identifies overlapping nucleotide sequence regions. This increased capacity is known as high throughput sequencing (HTS) or high throughput genomics (HTG).
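The idea of assembling reads by their overlapping regions can be illustrated with a toy Python sketch. It merges two reads by finding the longest exact match between the end of one and the start of the other. The reads, the minimum overlap of 3 bases, and the function names are all assumptions for illustration; real assembly software must also handle sequencing errors, repeats, and reads from either strand.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that exactly matches a
    prefix of `right`, or 0 if it is shorter than min_overlap."""
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_reads(left, right, min_overlap=3):
    """Join two reads through their overlap; return None if they
    do not overlap by at least min_overlap bases."""
    k = overlap_length(left, right, min_overlap)
    if k == 0:
        return None
    return left + right[k:]


# Two hypothetical reads that share the bases CGAT.
print(merge_reads("ACTCCCGAT", "CGATTCTA"))  # ACTCCCGATTCTA
```

Requiring a minimum overlap reduces the chance of joining reads that match only by coincidence; the threshold of 3 is far too small for real data and is used here only to keep the example short.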

EXAMPLE 12.10 Below is an electropherogram trace from a cycle sequencing reaction (a) and a pyrogram trace from a pyrosequencing reaction (b). In the electropherogram, each nucleotide is shown as a different color that corresponds to the wavelength of light emitted by the fluorescent tag on the ddNTP (shown here as different types of lines). In the pyrogram, the dNTPs are added to the reaction wells in a sequential flow, so a signal is detected only when one or more molecules of the added dNTP are incorporated. Note also that when the sequence contains adjacent nucleotides of the same type, they are shown as a single peak with 2–3 times the signal strength of a single dNTP addition. The base sequence of the target DNA strand in both samples is 5' ACTCCCGATTCTA 3'.
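The relationship between the sequence and the pyrogram peaks can be sketched in Python. This model treats 5' ACTCCCGATTCTA 3' as the order in which nucleotides are incorporated and assumes the dNTPs are dispensed in a repeating A, C, G, T cycle. The flow order is an assumption, since the example does not state it, and the function names are hypothetical. Signal strength is modeled as the number of identical nucleotides incorporated in one flow.

```python
from itertools import cycle


def pyrogram(incorporated, flow_order="ACGT"):
    """Return (dNTP, signal) for each flow until the strand is complete.

    Signal is the number of consecutive matching bases incorporated
    during that flow; 0 means no incorporation and no light.
    """
    if set(incorporated) - set(flow_order):
        raise ValueError("sequence contains a base missing from the flow order")
    signals = []
    pos = 0
    for base in cycle(flow_order):
        if pos >= len(incorporated):
            break
        run = 0
        while pos < len(incorporated) and incorporated[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


def decode(signals):
    """Rebuild the sequence by repeating each base by its signal strength."""
    return "".join(base * strength for base, strength in signals)


trace = pyrogram("ACTCCCGATTCTA")
print(trace[:6])      # [('A', 1), ('C', 1), ('G', 0), ('T', 1), ('A', 0), ('C', 3)]
print(decode(trace))  # ACTCCCGATTCTA
```

The run of three C residues gives a signal of 3 and the TT pair gives a signal of 2, matching the taller peaks described in the example. Real signals are only approximately proportional to run length, which makes long homopolymers hard to call.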
