Polymerase chain reaction
Polymerase chain reaction (PCR) is a molecular biology technique for enzymatically replicating DNA without using a living organism, such as E. coli or yeast. The technique allows a small amount of the DNA molecule to be amplified many times, in an exponential manner. With more DNA available, analysis is made much easier. PCR is commonly used in medical and biological research labs for a variety of tasks, such as the detection of hereditary diseases, the identification of genetic fingerprints, the diagnosis of infectious diseases, the cloning of genes, paternity testing, and DNA computing.
PCR in practice
PCR is used to amplify a short, well-defined part of a DNA strand. This can be a single gene, or just a part of a gene. As opposed to living organisms, the PCR process can copy only short DNA fragments, usually up to 10 kb (kb stands for kilo base pairs). Certain methods can copy fragments up to 40 kb in size, which is still much less than the chromosomal DNA of a eukaryotic cell--for example, a human cell contains about three billion base pairs.
PCR, as currently practiced, requires several basic components. These components are:
- DNA template (or cDNA), which contains the region of DNA to be amplified
- Two primers, which determine the beginning and end of the region to be amplified (see following section on primers)
- DNA polymerase (commonly Taq polymerase), which copies the region to be amplified
- Nucleotides, from which the DNA-Polymerase builds the new DNA
- Buffer, which provides a suitable chemical environment for the DNA-Polymerase
The PCR reaction is carried out in a thermal cycler. This is a machine that heats and cools the reaction tubes within it to the precise temperature required for each step of the reaction. To prevent evaporation of the reaction mixture (typically 15-100 µl per tube), a heated lid is placed on top of the reaction tubes or a layer of oil is put on the surface of the reaction mixture. In 2004, these machines cost more than USD 2,500.
Primers
The DNA fragment to be amplified is determined by selecting primers. Primers are short, artificial DNA strands of not more than fifty nucleotides (usually 18-25) that are complementary to the beginning and end of the DNA fragment to be amplified. They anneal (adhere) to the DNA template at these starting and ending points, where the DNA-Polymerase binds and begins the synthesis of the new DNA strand.
The choice of the length of the primers and their melting temperature (Tm) depends on a number of considerations. The melting temperature of a primer--not to be confused with the melting temperature of the DNA in the first step of the PCR process--is defined as the temperature at which half of the primer binding sites are occupied. The melting temperature increases with the length of the primer. Primers that are too short would anneal at several positions on a long DNA template, which would result in non-specific copies. On the other hand, the length of a primer is limited by the temperature required to melt it. Melting temperatures that are too high, i.e., above 80°C, can cause problems since the DNA-Polymerase is less active at such temperatures. The optimum length of a primer is generally from fifteen to forty nucleotides with a melting temperature between 55°C and 65°C.
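The relationship between primer length and melting temperature can be illustrated with a rough estimate. The sketch below uses the Wallace rule (2°C for each A or T, 4°C for each G or C), a common approximation for short primers that is not part of the description above. The function name and the example sequence are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Estimate a primer's melting temperature in °C with the Wallace rule.

    Assumption: 2 °C per A/T and 4 °C per G/C is only a rough
    approximation, reasonable for short primers.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-nucleotide primer, for illustration only.
example = "AGCGTACCTGATCGATTGCA"
print(wallace_tm(example))       # estimate for the full primer
print(wallace_tm(example[:15]))  # a shorter primer gives a lower estimate
```

More accurate estimates take salt concentration and neighbouring bases into account, which this sketch deliberately ignores.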
Sometimes degenerate primers are used. These are actually mixtures of similar, but not identical, primers. They may be convenient if the same gene is to be amplified from different organisms, as the genes themselves are probably similar but not identical. The other use for degenerate primers is when primer design is based on protein sequence. As several different codons can code for one amino acid, it is often difficult to deduce which codon is used in a particular case. Therefore the primer sequence corresponding to the amino acid isoleucine might be "ATH", where A stands for adenine, T for thymine, and H for adenine, thymine, or cytosine. (See genetic code for further details about codons.) Use of degenerate primers can greatly reduce the specificity of the PCR amplification. This problem can be partly solved by using touchdown PCR.
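To make the idea of a primer mixture concrete, the following sketch expands a degenerate primer into the individual sequences it stands for, using the standard IUPAC nucleotide codes. The function name expand_degenerate is hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes: each symbol maps to the bases it may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[symbol] for symbol in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


# The isoleucine example from the text: ATH covers the three codons.
print(expand_degenerate("ATH"))  # ['ATA', 'ATC', 'ATT']
```

The number of sequences in the mixture is the product of the choices at each position, which is why heavily degenerate primers lose specificity quickly.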
The considerations mentioned above make primer design an exacting process on which product yield depends:
- GC content should be between 40 and 60%.
- The calculated Tm values of the two primers used in a reaction should not differ by more than 5°C, and the Tm of the amplification product should not differ from that of the primers by more than 10°C.
- The annealing temperature is usually 5°C below the lower calculated Tm. However, it should be chosen empirically for individual conditions.
- Internal self-complementary hairpins of more than 4 bases and dimers of more than 8 bases should be avoided.
- The 3' terminus is extremely sensitive: it must not be complementary to any region of the other primer used in the reaction and must provide correct base matching to the template.
There are programs to help design primers (see External links).
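Some of the design rules listed above lend themselves to simple automated checks. The sketch below is a minimal illustration rather than a replacement for a primer design program: it checks GC content, the Tm difference, and 3' end complementarity, and suggests a starting annealing temperature. The function names are hypothetical, the Tm values are assumed to be supplied by the caller, and the 4-base window used for the 3' end check is an arbitrary assumption.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def suggest_annealing_temp(tm_fwd: float, tm_rev: float) -> float:
    """Starting point only: 5 °C below the lower Tm, to be refined empirically."""
    return min(tm_fwd, tm_rev) - 5.0


def check_primer_pair(fwd, rev, tm_fwd, tm_rev, end_len=4):
    """Return a list of rule violations (empty if none were found).

    end_len is an assumed window size for the 3' end check.
    """
    problems = []
    for name, primer in (("forward", fwd), ("reverse", rev)):
        gc = gc_percent(primer)
        if not 40.0 <= gc <= 60.0:
            problems.append(f"{name} primer GC content {gc:.0f}% is outside 40-60%")
    if abs(tm_fwd - tm_rev) > 5.0:
        problems.append("primer Tm values differ by more than 5°C")
    # The 3' end of one primer can pair with the other primer if the
    # reverse complement of that end occurs anywhere in the other primer.
    for name, primer, other in (("forward", fwd, rev), ("reverse", rev, fwd)):
        if reverse_complement(primer[-end_len:]) in other.upper():
            problems.append(f"3' end of {name} primer is complementary to the other primer")
    return problems


# Hypothetical primers and Tm values, for illustration only.
print(check_primer_pair("ACGTACGTAC", "TTGGCCAATT", 58.0, 60.0))
print(suggest_annealing_temp(58.0, 60.0))
```

Hairpin and dimer detection within a single primer is left out here, since it needs a proper alignment step.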
Procedure
The PCR process usually consists of a series of twenty to thirty-five cycles. Each cycle consists of three steps (Fig. 2).
(1) The double-stranded DNA has to be heated to 94-96°C in order to separate the strands. This step is called denaturing; it breaks apart the hydrogen bonds that connect the two DNA strands. Prior to the first cycle, the DNA is often denatured for an extended time to ensure that both the template DNA and the primers have completely separated and are single-stranded. Time: usually 1-2 minutes, up to 5 minutes. Taq polymerase is also activated by this step.
(2) After separating the DNA strands, the temperature is lowered so the primers can attach themselves to the single DNA strands. This step is called annealing. The temperature of this stage depends on the primers and is usually 5°C below their melting temperature (45-60°C). A wrong temperature during the annealing step can result in primers not binding to the template DNA at all, or binding at random. Time: 1-2 minutes.
(3) Finally, the DNA-Polymerase has to copy the DNA strands. It starts at the annealed primer and works its way along the DNA strand. This step is called extension. The extension temperature depends on the DNA-Polymerase. The time for this step depends both on the DNA-Polymerase itself and on the length of the DNA fragment to be amplified. As a rule of thumb, allow 1 minute per 1 kbp. A final extension step is frequently used after the last cycle to ensure that any remaining single-stranded DNA is completely copied. This differs from all other extension steps only in that it is longer, typically 10-15 minutes.
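The arithmetic behind the cycling procedure can be sketched in a few lines. The example below applies the rule of thumb of 1 minute per 1 kbp and computes the theoretical yield under the idealized assumption that every cycle exactly doubles the number of copies; real reactions fall short of this. Both function names are hypothetical.

```python
def extension_seconds(fragment_bp: int, seconds_per_kbp: float = 60.0) -> float:
    """Extension time from the rule of thumb of 1 minute per 1 kbp."""
    return fragment_bp / 1000.0 * seconds_per_kbp


def ideal_copies(start_copies: int, cycles: int) -> int:
    """Upper bound on copy number, assuming perfect doubling in every cycle."""
    return start_copies * 2 ** cycles


print(extension_seconds(250))  # 15.0
print(ideal_copies(1, 30))     # 1073741824
```

Even a single template molecule could in principle yield about a billion copies after thirty cycles, which is why twenty to thirty-five cycles are usually enough.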
Example
The times and temperatures given in this example are taken from a PCR program that was successfully used on a 250 bp fragment of the C-terminus of the insulin-like growth factor (IGF).
The reaction mixture consists of:
- 1.0 µl DNA template (100 ng/µl)
- 2.5 µl of primer, 1.25 µl per primer (100 ng/µl)
- 1.0 µl Pfu-Polymerase
- 1.0 µl nucleotides
- 5.0 µl buffer
- 89.5 µl H2O
A 200 µl reaction tube containing the 100 µl mixture is inserted into the thermocycler.
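The volumes listed above can be checked, and rescaled for a different reaction volume, with simple arithmetic. The sketch below is only an illustration; the dictionary and the function scale_mixture are hypothetical names, and proportional scaling is an assumption that may not suit every reagent.

```python
# Component volumes in µl, taken from the example mixture.
mixture_ul = {
    "DNA template": 1.0,
    "primers (1.25 µl each)": 2.5,
    "Pfu polymerase": 1.0,
    "nucleotides": 1.0,
    "buffer": 5.0,
    "H2O": 89.5,
}

total_ul = sum(mixture_ul.values())


def scale_mixture(mixture: dict, target_total_ul: float) -> dict:
    """Scale every component proportionally to a new total volume."""
    factor = target_total_ul / sum(mixture.values())
    return {name: volume * factor for name, volume in mixture.items()}


print(total_ul)                         # 100.0
print(scale_mixture(mixture_ul, 50.0))  # every volume halved
```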
The PCR process consists of the following steps:
- Step 1: Initialization. Heat the mixture at 96°C for 5 minutes to ensure that the DNA strands as well as the primers have melted. The DNA-Polymerase can be present at initialization, or it can be added after this step.
- Step 2: Melting. Heat at 96°C for 30 seconds. For each cycle, this is usually enough time for the DNA to denature.
- Step 3: Annealing. Heat at 68°C for 30 seconds.
- Step 4: Elongation. Heat at 72°C for 45 seconds.
- Step 5: Steps 2-4 are repeated 25 times, but with good primers and fresh polymerase, 15 to 20 cycles are sufficient.
- Step 6: Hold mixture at 7°C. This is useful if one starts the PCR in the evening just before leaving the lab, so it can run overnight. The DNA will not be damaged at 7°C after just one night.
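The program described in these steps can be written down as a small data structure, which makes it easy to work out the programmed run time. The sketch below is illustrative only: the class and variable names are hypothetical, and the time the thermal cycler needs to move between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Values taken from the example program above.
INITIALIZATION = Step("initialization", 96.0, 5 * 60)
CYCLE = (
    Step("melting", 96.0, 30),
    Step("annealing", 68.0, 30),
    Step("elongation", 72.0, 45),
)
CYCLES = 25
HOLD_TEMP_C = 7.0  # final hold, open-ended


def programmed_seconds(init: Step, cycle: tuple, cycles: int) -> int:
    """Total programmed time, excluding heating and cooling ramps."""
    return init.seconds + cycles * sum(step.seconds for step in cycle)


total = programmed_seconds(INITIALIZATION, CYCLE, CYCLES)
print(f"{total} s = {total / 60:.2f} min")  # 2925 s = 48.75 min
```

The actual run takes longer than this figure, because heating and cooling between steps add time in every cycle.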
The PCR product can be identified by its size using agarose gel electrophoresis. Agarose gel electrophoresis is a procedure that consists of loading DNA into an agarose gel and then applying an electric current to the gel. As a result, the smaller DNA strands move faster than the larger strands through the gel toward the positive electrode. The size of the PCR product can be determined by comparing it with a DNA ladder, which contains DNA fragments of known size, run in the same gel (Fig. 3).
PCR optimization
Since PCR is very sensitive, adequate measures should be taken to avoid contamination from other DNA present in the lab environment (bacteria, viruses, the experimenter's own DNA, etc.). Thus DNA sample preparation, reaction mixture assembly and the PCR process, in addition to the subsequent reaction product analysis, should be performed in separate areas. For the preparation of the reaction mixture, a laminar flow cabinet with a UV lamp is recommended. Fresh gloves should be used for each PCR step, as well as displacement pipettes with aerosol filters. The reagents for PCR should be prepared separately and used solely for this purpose. Aliquots should be stored separately from other DNA samples. A control reaction (inner control), omitting template DNA, should always be performed to confirm the absence of contamination or primer multimer formation.
Difficulties with polymerase chain reaction
Polymerase chain reaction is not perfect. Some common errors and problems are described below.
Polymerase errors
Taq polymerase lacks a 3' to 5' exonuclease activity. This makes it impossible for it to check the base it has inserted and remove it if it is incorrect, a process common in higher organisms. This in turn results in a high error rate of approximately 1 in 10,000 bases, which, if an error occurs early, can alter large proportions of the final product. For this reason, other polymerases are used where accuracy is vital, such as amplification for sequencing.
Examples of polymerases with 3' to 5' exonuclease activity include Vent, which is extracted from Thermococcus litoralis, Pfu, which is extracted from Pyrococcus furiosus, and Pwo, which is extracted from Pyrococcus woesii.
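The effect of an error rate of about 1 in 10,000 bases can be illustrated with a rough calculation. The sketch below estimates the probability that a single copying pass over a fragment introduces no error, under the simplifying assumption that errors occur independently at each base. The function name is hypothetical, and the model ignores how errors accumulate over successive cycles.

```python
def error_free_fraction(fragment_bp: int, error_rate: float = 1e-4) -> float:
    """Probability that one copying pass of the fragment contains no error.

    Assumption: each base is miscopied independently with probability
    error_rate (1e-4 corresponds to 1 error in 10,000 bases).
    """
    return (1.0 - error_rate) ** fragment_bp


for length in (250, 2000):
    print(length, round(error_free_fraction(length), 4))
```

Longer fragments are correspondingly more likely to pick up at least one error in each pass, and an error made in an early cycle is inherited by every copy derived from it.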
Size limitations
PCR works readily with DNA of lengths two to three thousand basepairs, but above this length the polymerase tends to fall off and the typical heating cycle does not leave enough time for polymerisation to complete. It is possible to amplify larger pieces of up to 20,000 base pairs, with a slower heating cycle and special polymerases. It is often necessary to "restock" the reaction with polymerase part way through due to the limited half life of the polymerase.
Non-specific priming
Non-specific binding of primers is always a possibility due to sequence duplications and partial primer binding that leaves the 5' end unattached. It is increased by the use of degenerate sequences or bases in the primer. Manipulation of the annealing temperature and of the magnesium ion concentration (magnesium ions stabilise DNA and RNA interactions) can increase specificity. Non-specific priming can be prevented during the low temperatures of reaction preparation by use of hot-start polymerase enzymes, in which the active site is blocked by an antibody or chemical that only dislodges once the reaction is heated to 95°C during the denaturation step of the first cycle.
Other methods to increase specificity include Nested PCR and Touchdown PCR.
Practical modifications to the PCR technique
- Nested PCR - Nested PCR is intended to reduce contamination in products due to the amplification of unexpected primer binding sites. Two sets of primers are used in two successive PCR runs, the second set intended to amplify a secondary target within the first run product. This is very successful, but requires more detailed knowledge of the sequences involved.
- Inverse PCR - Inverse PCR is a method used to allow PCR when only one internal sequence is known. This is especially useful in identifying flanking sequences to various genomic inserts. This involves a series of digestions and self-ligation before cutting by an endonuclease, resulting in known sequences at either end of the unknown sequence.
- RT-PCR - RT-PCR (Reverse Transcription PCR) is the method used to amplify, isolate or identify a known sequence from a cell's or tissue's RNA library. It is essentially normal PCR preceded by transcription by reverse transcriptase (to convert the RNA to cDNA). It is widely used in expression mapping, determining when and where certain genes are expressed.
- Asymmetric PCR - Asymmetric PCR is used to preferentially amplify one strand of the original DNA more than the other. It finds use in some types of sequencing and hybridization probing where having only one of the two complementary strands is ideal. PCR is carried out as usual, but with a great excess of the primers for the chosen strand. Due to the slow (arithmetic) amplification later in the reaction after the limiting primer has been used up, extra cycles of PCR are required. A recent modification of this process, known as Linear-After-The-Exponential-PCR (LATE-PCR), uses a limiting primer with a higher melting temperature (Tm) than the excess primer to maintain reaction efficiency as the limiting primer concentration decreases mid-reaction.
- Quantitative PCR - Q-PCR (Quantitative PCR) is used to rapidly measure the quantity of PCR product (preferably in real time), and is thus an indirect method for quantitatively measuring starting amounts of DNA, cDNA or RNA. It is commonly used to determine whether a sequence is present and, if so, the number of copies in the sample. There are three main methods, which vary in difficulty and detail.
- Quantitative real-time PCR is often confusingly known as RT-PCR (Real Time PCR). QRT-PCR or RTQ-PCR are more appropriate contractions. RT-PCR can also refer to reverse transcription PCR, which, even more confusingly, is often used in conjunction with Q-PCR. This method uses fluorescent dyes and probes to measure the amount of amplified product in real time.
- Touchdown PCR - Touchdown PCR is a variant of PCR that reduces nonspecific primer annealing by lowering the annealing temperature between cycles.
- Colony PCR - Bacterial clones (E. coli) can be screened for the correct ligation products. Selected colonies are picked with a sterile toothpick from an agar plate and dabbed into the master mix or sterile water. Primers (and the master mix) are added. The PCR protocol has to be started with an extended time at 95°C.
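The touchdown PCR variant listed above can be illustrated by generating the annealing temperature for each cycle. The sketch below assumes one common arrangement, in which the temperature drops by a fixed step per cycle until it reaches a final value and then stays there. The function name and all numbers in the example are hypothetical.

```python
def touchdown_schedule(start_c: float, final_c: float,
                       step_c: float, total_cycles: int) -> list:
    """Annealing temperature for each cycle of a touchdown program.

    Assumption: the temperature drops by step_c after every cycle
    until final_c is reached, then is held at final_c.
    """
    if step_c <= 0 or start_c < final_c:
        raise ValueError("need step_c > 0 and start_c >= final_c")
    temps = []
    current = start_c
    for _ in range(total_cycles):
        temps.append(current)
        current = max(final_c, current - step_c)
    return temps


# Hypothetical program: start at 65 °C, drop 1 °C per cycle down to 55 °C.
print(touchdown_schedule(65.0, 55.0, 1.0, 15))
```

The early, stringent cycles favour the intended binding site, so the specific product dominates by the time the temperature becomes more permissive.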
Recent developments in PCR techniques
- A more recent method which excludes a temperature cycle, but uses enzymes, is helicase-dependent amplification.
- TAIL-PCR (thermal asymmetric interlaced PCR) was developed by Liu et al. in 1995.
- Meta-PCR, developed by Andrew Wallace, allows amplification and direct sequence analysis of complex genes to be optimized. Details are available from the National Genetic Reference Laboratory, Manchester, UK.
Uses of PCR
PCR can be used for a broad variety of experiments and analyses. Some examples are discussed below.
Genetic fingerprinting
Genetic fingerprinting is a forensic technique used to identify a person by comparing his or her DNA with a given sample; for example, blood from a crime scene can be genetically compared to blood from a suspect. The sample may contain only a tiny amount of DNA, obtained from a source such as blood, semen, saliva, hair, or other organic material. Theoretically, just a single strand is needed. First, one breaks the DNA sample into fragments, then amplifies them using PCR. The amplified fragments are then separated using gel electrophoresis. The overall layout of the DNA fragments is called a DNA fingerprint. Since there is a very small possibility that two individuals may have the same sequences, the technique is more effective at acquitting a suspect than proving the suspect guilty. This small possibility was exploited by defense lawyers in the controversial O.J. Simpson case. A match, however, usually remains a very strong indicator in the question of guilt as well.
Paternity testing
Although these resulting 'fingerprints' are unique (except for identical twins), genetic relationships, for example, parent-child or siblings, can be determined from two or more genetic fingerprints, which can be used for paternity tests (Fig. 4). A variation of this technique can also be used to determine evolutionary relationships between organisms.
Detection of hereditary diseases
The detection of hereditary diseases in a given genome is a long and difficult process, which can be shortened significantly by using PCR. Each gene in question can easily be amplified through PCR by using the appropriate primers and then sequenced to detect mutations.
Viral diseases, too, can be detected using PCR through amplification of the viral DNA. This analysis is possible right after infection, which can be from several days to several months before actual symptoms occur. Such early diagnoses give physicians a significant lead in treatment.
Cloning genes
Cloning a gene, not to be confused with cloning a whole organism, describes the process of isolating a gene from one organism and then inserting it into another organism, which is then termed a genetically modified organism (GMO). PCR is often used to amplify the gene, which can then be inserted into a vector (a piece of DNA that 'carries' the gene into the GMO) such as a plasmid (a circular DNA molecule) (Fig. 5). The DNA can then be transferred into an organism (the GMO) where the gene and its product can be studied more closely. Expressing a cloned gene, that is, having the GMO produce the gene product (usually protein or RNA), can also be a way of mass-producing useful proteins, for example medicines or the enzymes in biological washing powders. The incorporation of an affinity tag on a recombinant protein will generate a fusion protein which can be more easily purified by affinity chromatography.
Mutagenesis
Mutagenesis is a way of making changes to the sequence of nucleotides in the DNA. There are situations in which one is interested in mutated (changed) copies of a given DNA strand, for example, when trying to assess the function of a gene or in in-vitro protein evolution. Mutations can be introduced into copied DNA sequences in two fundamentally different ways in the PCR process. Site-directed mutagenesis allows the experimenter to introduce a mutation at a specific location on the DNA strand. Usually, the desired mutation is incorporated in the primers used for the PCR program. Random mutagenesis, on the other hand, is based on the use of error-prone polymerases in the PCR process. In the case of random mutagenesis, the location and nature of the mutations cannot be controlled. One application of random mutagenesis is to analyze structure-function relationships of a protein. By randomly altering a DNA sequence, one can compare the resulting protein with the original and determine the function of each part of the protein.
Analysis of ancient DNA
Using PCR, it becomes possible to analyze DNA that is thousands of years old. PCR techniques have been successfully used on animals, such as a forty-thousand-year-old mammoth, and also on human DNA, in applications ranging from the analysis of Egyptian mummies to the identification of a Russian Tsar.
Genotyping of specific mutations
Through the use of allele-specific PCR, one can easily determine which allele of a mutation or polymorphism an individual has. Here, one of the two primers is common, and would anneal a short distance away from the mutation, while the other anneals right on the variation. The 3' end of the allele-specific primer is modified so that it anneals only if it matches one of the alleles. If the mutation of interest is a T or C single nucleotide polymorphism (T/C SNP), one would use two reactions, one containing a primer ending in T, and the other a primer ending in C. The common primer would be the same. Following PCR, these two sets of reactions would be run out on an agarose gel, and the band pattern shows whether the individual is homozygous T, homozygous C, or heterozygous T/C. This methodology has several applications, such as amplifying certain haplotypes (when certain alleles at two or more SNPs occur together on the same chromosome, i.e. linkage disequilibrium) or detection of recombinant chromosomes and the study of meiotic recombination.
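The way the band pattern is read for a T/C SNP can be written as a small decision function. The sketch below is only an illustration of the logic described above; the function name and the wording of the "no call" result are hypothetical.

```python
def call_genotype(band_with_t_primer: bool, band_with_c_primer: bool) -> str:
    """Interpret the two allele-specific reactions for a T/C SNP.

    Each argument states whether a band appeared on the gel for the
    reaction using the primer ending in T or in C, respectively.
    """
    if band_with_t_primer and band_with_c_primer:
        return "heterozygous T/C"
    if band_with_t_primer:
        return "homozygous T"
    if band_with_c_primer:
        return "homozygous C"
    # No band in either reaction suggests that the PCR itself failed.
    return "no call"


print(call_genotype(True, False))  # homozygous T
print(call_genotype(True, True))   # heterozygous T/C
```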
Comparison of gene expression
Researchers have used traditional PCR as a way to estimate changes in the amount of a gene's expression. Ribonucleic acid (RNA) is the molecule into which DNA is transcribed prior to making a protein, and those strands of RNA that hold the instructions for protein sequence are known as messenger RNA (mRNA). Once RNA is isolated, it can be reverse transcribed back into DNA (complementary DNA to be precise, known as cDNA), at which point traditional PCR can be applied to amplify the gene. This methodology is called RT-PCR. In most cases, if there is more starting material (mRNA) of a gene, then more copies of the gene will be generated during PCR. When the products of the PCR reaction are run on an agarose gel (see Figure 3 above), the band corresponding to the gene will appear larger on the gel (note that the band remains in the same location relative to the ladder; it will just appear fatter or brighter). By running samples of amplified cDNA from differently treated organisms, one can get a general idea of which sample expressed more of the gene of interest. A quantitative RT-PCR method, called real-time PCR, has been developed; details are available at http://www.gene-quantification.info.
History
PCR was invented by Kary Mullis while working for Cetus in December 1985. He was awarded the Nobel Prize in Chemistry in 1993 for this achievement, only seven years after his colleagues at Cetus first reduced his proposal to practice. Mullis's idea was to develop a process by which DNA could be artificially multiplied through repeated cycles of duplication driven by an enzyme called DNA polymerase.
DNA polymerase occurs naturally in living organisms, where it functions to duplicate DNA when cells divide in mitosis and meiosis. Polymerase works by binding to a single DNA strand and creating the complementary strand. In Mullis's original process, the enzyme was used in vitro (in a controlled environment outside an organism). The double-stranded DNA was separated into two single strands by heating it to 94°C (201.2°F). At this temperature, however, the DNA polymerase used at the time was destroyed, so the enzyme had to be replenished after the heating stage of each cycle. Mullis's original procedure was very inefficient, since it required a great deal of time, large amounts of DNA polymerase, and continual attention throughout the process.
Later, this original PCR process was greatly improved by the use of DNA polymerase taken from thermophilic bacteria grown in geysers at a temperature of over 110°C (230°F). The DNA polymerase taken from these organisms is stable at high temperatures and, when used in PCR, does not break down when the mixture is heated to separate the DNA strands. Since there was no longer a need to add new DNA polymerase for each cycle, the process of copying a given DNA strand could be simplified and automated.
One of the first thermostable DNA polymerases was obtained from Thermus aquaticus and was called "Taq." Taq polymerase is widely used in current PCR practice. A disadvantage of Taq is that it sometimes makes mistakes when copying DNA, leading to mutations (errors) in the DNA sequence, since it lacks 3'→5' proofreading exonuclease activity. Polymerases such as Pwo or Pfu, obtained from Archaea, have proofreading mechanisms (mechanisms that check for errors) and can significantly reduce the number of mutations that occur in the copied DNA sequence. However these enzymes polymerize DNA at a much slower rate than Taq. Combinations of both Taq and Pfu are available nowadays that provide both high processivity (fast polymerization) and high fidelity (accurate duplication of DNA).
Patent wars
The PCR technique was patented by Cetus Corporation, where Mullis worked when he invented the technique. The Taq polymerase enzyme is also covered by patents. There have been several high-profile lawsuits related to the technique, including, most famously, a lawsuit brought by DuPont. The pharmaceutical company Hoffmann-La Roche purchased the rights to the patents in 1992 and currently holds them.