Deoxyribonucleic Acid (DNA)
Deoxyribonucleic acid (DNA) is a natural polymer that encodes the genetic information required for the growth, development, and reproduction of an organism. Found in all cells, it consists of chains of units called nucleotides. Each nucleotide unit contains three components: the sugar deoxyribose, a phosphate group, and a nitrogen-containing ring structure called a base. There are four different bases in DNA: adenine, cytosine, guanine, and thymine.
DNA molecules are very long and threadlike. They consist of two polymeric strands twisted about each other into a spiral shape known as a double helix, which resembles a twisted ladder. In eukaryotic cells, DNA is found within the cell nucleus in the chromosomes, which are extremely condensed structures in which DNA is associated with proteins. Each species contains a characteristic number of chromosomes in its cells. In humans, every cell contains 46 chromosomes (except for egg and sperm cells, which contain only 23). The total genetic information in a cell is called its genome. In prokaryotic cells such as bacteria, DNA is not contained within a specialized nuclear membrane but is instead dispersed in the interior substance of the cell (the cytoplasm).
The fundamental units of heredity are genes. A gene is a segment of a DNA molecule that encodes the information necessary to make a specific protein. The many proteins encoded by DNA contribute to a cell’s structure and chemical activities.
DNA not only encodes the “blueprints” for cellular proteins but also the instructions for when and where they will be made. For example, the oxygen carrier hemoglobin is made in red blood cells but not in nerve cells, though both contain the same total genetic content. Thus, DNA also contains the information necessary for regulating how its genetic messages are used.
The sequencing of the human genome has determined that a human cell contains approximately 30,000 genes, far fewer than the previously estimated 50,000–100,000. Except in the case of identical twins, a comparison of the genes from different individuals always reveals a number of differences. Therefore, each person is genetically unique. This is the basis of DNA fingerprinting, a forensic procedure used to match DNA collected from a crime scene with that of a suspect, and of the use of DNA to establish who is the biological parent of a child.
Genes direct the function of all organs and systems in the body. In some cases, a defect in the DNA of just one gene can cause a genetic disorder that results in disease because the protein encoded by the defective gene is abnormal. The abnormal hemoglobin produced by people afflicted with sickle cell anemia is an example. Defects in certain genes called oncogenes, which regulate growth and development, give rise to cancer. Therefore, defects in DNA can affect the two kinds of genetic information it carries: messages directing the manufacture of proteins and information regulating the expression, or carrying out, of these messages.
History
Prior to the discovery of the nucleic acids, the Austrian monk Gregor Mendel (1822-1884) worked out the laws of inheritance by the selective breeding of pea plants. As early as 1865 he proposed that some then-undefined factors from each parent were responsible for the inheritance of certain characteristics in plants. The Swiss biochemist Friedrich Miescher (1844-1895) discovered the nucleic acids in 1868 in nuclei isolated from pus cells scraped from surgical bandages. However, research on the chemical structure of nucleic acids lagged until new analytical techniques became available in the mid twentieth century.
Despite knowledge of the chemical structure of nucleotides and how they were linked together to form DNA, the possibility that DNA was the genetic material was regarded as unlikely. As late as the mid-twentieth century, proteins were thought to be the molecules of heredity because they appeared to be the only cellular components diverse enough to account for the large variety of genes. In 1944, Oswald Avery (1877-1955) and his colleagues showed that non-pathogenic strains of pneumococcus, the bacterium that causes pneumonia, could become pathogenic (disease-causing) if treated with a DNA-containing extract from heat-killed pathogenic strains. Based on this evidence, Avery concluded that DNA was the genetic material. However, widespread acceptance of DNA as the bearer of genetic information did not come until a report by other workers in 1952 that DNA, not protein, enters a bacterial cell infected by a virus. This showed that the genetic material of the virus was contained in its DNA, confirming Avery’s hypothesis.
In 1953, James Watson (1928-) and Francis Crick (1916-2004) proposed their double helix model for the three-dimensional structure of DNA. They correctly deduced that the genetic information was encoded in the form of the sequence of nucleotides in the molecule. With their landmark discovery began an era of molecular genetics in biology. Eight years later investigators cracked the genetic code. They found that specific trinucleotide sequences—sequences of three nucleotides—are codes for each of 20 amino acids, the building blocks of proteins.
In 1970 scientists found that bacteria contained enzymes that recognize a particular sequence of 4-8 nucleotides and will always cut DNA at or near that sequence to yield specific (rather than random), consistently reproducible DNA fragments. These enzymes were dubbed restriction enzymes. Two years later it was found that the bacterial enzyme DNA ligase could be used to rejoin these fragments. This permitted scientists to construct what was termed recombinant DNA: DNA composed of segments from two different sources, even from different organisms. With the availability of these tools, genetic engineering became possible and biotechnology began.
By 1984 the development of DNA fingerprinting allowed forensic chemists to compare DNA samples from a crime scene with those of suspects. The first conviction using this technique came in 1987. Three years later doctors first attempted to use gene therapy to treat a patient unable to produce a vital immune protein. This technique involves inserting a portion of DNA into a patient’s cells to correct a deficiency in a particular function. The Human Genome Project also began in 1990. The aim of this project was to determine the nucleotide sequence in DNA of the entire human genome, which consists of about three billion nucleotide pairs. In 2001, researchers announced the completion of the sequencing of a human genome.
Structure
Deoxyribose, the sugar component in each nucleotide, is so named because it has one fewer oxygen atom than ribose, which is present in ribonucleic acid (RNA). Deoxyribose contains five carbon atoms, four of which lie in a ring along with one oxygen atom. The fifth carbon atom is linked to a specific carbon atom in the ring. A phosphate group is always linked to deoxyribose via a chemical bond between an oxygen atom in the phosphate group and a carbon atom in deoxyribose. The base is linked to deoxyribose by a chemical bond between a nitrogen atom in the base and a specific carbon atom in the deoxyribose ring.
The nucleotide components of DNA are connected to form a linear polymer in a very specific way. A phosphate group always connects the sugar component of a nucleotide with the sugar component of the next nucleotide in the chain. Consequently, the first nucleotide bears an unattached phosphate group, and the last nucleotide has a free hydroxyl group. Therefore, DNA is not the same at both ends. This directionality plays an important role in the replication of DNA.
DNA molecules contain two polymer chains or strands of nucleotides and so are said to be double-stranded. (In contrast, RNA is typically single-stranded.) Their shape resembles two intertwined spiral staircases in which the alternating sugar and phosphate groups of the nucleotides compose the sidepieces. The steps consist of pairs of bases, each attached to the sugars on their respective strands. The bases are held together by weak attractive forces called hydrogen bonds. The two strands in DNA are antiparallel, which means that one strand goes in one direction (first to last nucleotide from top to bottom) and the other strand goes in the opposite direction (first to last nucleotide from bottom to top).
Because the sugar and phosphate components which make up the sidepieces are always attached in the same way, the same alternating phosphate-sugar sequence repeats over and over again. The bases attached to each sugar may be one of four possible types. Because of the geometry of the DNA molecule, the only possible base pairs that will fit are adenine (A) paired with thymine (T), and cytosine (C) paired with guanine (G).
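The pairing rules and the antiparallel arrangement of the two strands can be sketched in a few lines of Python. This is only an illustration: the function names `complement` and `reverse_complement` and the sample sequence are assumptions made for this example, not anything defined in the text.

```python
# Pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that faces each position of `strand`, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own first-to-last direction.

    The two strands are antiparallel, so the partner is read in the
    opposite direction and the complement has to be reversed.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "ATGCGT"  # hypothetical example sequence
    print(top)
    print(complement(top))          # bases facing each position of `top`
    print(reverse_complement(top))  # partner strand, read first to last
```

Applying `reverse_complement` twice returns the original sequence, which reflects the fact that each strand fully determines the other.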
The DNA in our cells is a masterpiece of packing. The double helix coils itself around protein cores to form nucleosomes. These DNA-protein structures resemble beads on a string. Flexible regions between nucleosomes allow these structures to be wound around themselves to produce an even more compact fiber. The fibers can then be coiled for even further compactness. Ultimately, DNA is packed into the highly condensed chromosomes. If the DNA in a human cell is stretched, it is approximately 6 ft (1.83 m) long. If all 46 chromosomes are laid end-to-end, their total length is still only about eight-thousandths of an inch. This means that DNA in chromosomes is condensed about 10,000 times more than that in the double helix. Why all this packing? The likely answer is that the fragile DNA molecule would get broken in its extended form. Also, if not for this painstaking compression, the cell might be mired in its own DNA.
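The condensation factor quoted above can be checked with a short calculation. The sketch below uses only the two lengths given in the text; the constant and function names are chosen here for illustration. The result comes out near 9,000, which is the same order of magnitude as the rounded figure of 10,000.

```python
# Lengths quoted in the text, in the units given there.
STRETCHED_DNA_FT = 6.0        # DNA of one human cell, fully extended
CHROMOSOMES_TOTAL_IN = 0.008  # all 46 chromosomes laid end to end

INCHES_PER_FOOT = 12


def packing_ratio(stretched_ft: float, condensed_in: float) -> float:
    """Ratio of extended length to condensed length, both taken in inches."""
    return (stretched_ft * INCHES_PER_FOOT) / condensed_in


if __name__ == "__main__":
    ratio = packing_ratio(STRETCHED_DNA_FT, CHROMOSOMES_TOTAL_IN)
    print(f"condensation factor: roughly {ratio:,.0f}-fold")
```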
Function
DNA directs a cell’s activities by specifying the structures of its proteins and by regulating which proteins and how much are produced, and where. In so doing, it never leaves the nucleus. Each human cell contains about 6 ft (2 m) of highly condensed DNA, which encodes some 30,000 genes. If a particular protein is to be made, the DNA segment corresponding to the gene for that protein acts as a template (pattern) for the synthesis of an RNA molecule in a process known as transcription. This messenger RNA molecule travels from the nucleus to the cytoplasm where it in turn acts as the template for the construction of the protein by the protein assembly apparatus of the cell. This latter process is known as translation and requires an adaptor molecule, transfer RNA, which translates the genetic code of DNA into the language of proteins.
Eventually, when a cell divides, its DNA must be copied so that each daughter cell will have a complete set of genetic instructions. The structure of DNA is perfectly suited to this process. The two intertwined strands unwind, exposing their bases, which then pair with bases on free nucleotides present in the cell. Because of the base-pairing rules, the sequence of bases along one strand of DNA determines the sequence of bases in the newly forming complementary strand. An enzyme then joins the free nucleotides to complete the new strand. Since the two new DNA strands that result are identical to the two originals, the cell can pass along an exact copy of its DNA to each daughter cell.
Sex cells, the eggs and sperm, contain half as many chromosomes as other cells. When the egg and sperm fuse during fertilization, they form the first cell of a new individual with the complete complement of DNA, 46 chromosomes. Each cell (except the sex cells) in the new person carries DNA identical to that in the fertilized egg cell. In this way the DNA of both parents is passed from one generation to the next. Thus, DNA plays a crucial role in the propagation of life.
Replication of DNA
DNA replication, the process by which the double-stranded DNA molecule reproduces itself, is a complicated process, even in the simplest organisms. DNA synthesis—making new DNA from old—is complex because it requires the interaction of a number of cellular components and is rigidly controlled to ensure the accuracy of the copy, upon which the very life of the organism depends. This adds several verification steps to the procedure. Though the details vary from organism to organism, DNA replication follows certain rules that are universal to all.
DNA replication (duplication, or copying) is always semi-conservative. This means that during DNA replication the two strands of the parent molecule unwind and each becomes a template for the synthesis of the complementary strand of the daughter molecule. As a result both daughter molecules contain one new strand and one old strand (from the parent molecule). The replication of DNA always requires a template, an intact strand from the parent molecule. This strand determines the sequence of nucleotides on the new strand, because of the A-with-T and C-with-G base pairing requirement.
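Semi-conservative replication can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: strands are plain strings written so that position i of one strand faces position i of its partner (the antiparallel orientation is ignored), and the `Strand` class, the `age` label, and the function names are hypothetical helpers introduced only for this example.

```python
from dataclasses import dataclass

PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


@dataclass(frozen=True)
class Strand:
    seq: str  # bases, aligned position by position with the partner strand
    age: str  # "old" (inherited from the parent) or "new" (just synthesized)


def copy_from(template: Strand) -> Strand:
    """Synthesize the strand dictated by `template` through A-T and C-G pairing."""
    return Strand("".join(PAIRS[base] for base in template.seq), "new")


def replicate(parent):
    """Separate a two-strand duplex and return the two daughter duplexes."""
    first, second = parent
    old_first = Strand(first.seq, "old")
    old_second = Strand(second.seq, "old")
    # Each parental strand is kept and paired with a freshly built partner.
    return (old_first, copy_from(old_first)), (copy_from(old_second), old_second)


if __name__ == "__main__":
    parent = (Strand("ATGCGT", "old"), Strand("TACGCA", "old"))
    for daughter in replicate(parent):
        print([(strand.seq, strand.age) for strand in daughter])
```

Both daughters carry the same pair of sequences as the parent, and each contains exactly one old strand and one new strand.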
Replication begins at a specific site called the replication origin when the enzyme DNA helicase binds to a portion of the double stranded helix and “melts” the bonds between base pairs. This unwinds the helix to form a replication fork consisting of two separated strands, each serving as a template. Specific proteins then bind to these single strands to prevent them from re-pairing. Another enzyme called DNA polymerase proceeds to assemble the daughter strands using a pool of free nucleotide units which are present in the cell in an “activated” form.
High fidelity in the copying of DNA is vital to the organism and, incredibly, only about one error per one trillion replications ever occurs. This high fidelity results largely because DNA polymerase is a “self-editing” enzyme. If a nucleotide added to the end of the chain mismatches the complementary nucleotide on the template, pairing does not occur. DNA polymerase then clips off the unpaired nucleotide and replaces it with the correct one.
Occasionally errors are made during DNA replication and passed along to daughter cells. Such errors are called mutations. They can have serious consequences because they can cause the insertion of the wrong amino acid into a protein. For example, the substitution of a T for an A in the gene encoding hemoglobin causes an amino acid substitution that results in sickle cell anemia. To understand the significance of such mutations requires knowledge of the genetic code.
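The sickle cell example can be made concrete with a two-entry lookup. The specific codons are an added detail not stated in the text: in the usual textbook description, the affected codon of the hemoglobin gene reads GAG (glutamic acid) on the coding strand and becomes GTG (valine). The function name `substitute` is chosen here for illustration.

```python
# Only the two codons needed for this illustration (coding-strand DNA letters).
CODON_TO_AMINO_ACID = {"GAG": "glutamic acid", "GTG": "valine"}


def substitute(codon: str, position: int, new_base: str) -> str:
    """Return `codon` with the base at zero-based `position` replaced."""
    return codon[:position] + new_base + codon[position + 1:]


if __name__ == "__main__":
    normal = "GAG"
    mutant = substitute(normal, 1, "T")  # the middle A becomes T
    print(normal, "->", CODON_TO_AMINO_ACID[normal])
    print(mutant, "->", CODON_TO_AMINO_ACID[mutant])
```

A change of one base out of three is enough to swap one amino acid for another in the finished protein.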
The genetic code
Genetic information is stored as nucleotide sequences in DNA (or RNA) molecules. This sequence specifies the identity and position of the amino acids in a particular protein. Amino acids are the building blocks of proteins in the same way that nucleotides are the building blocks of DNA. However, though there are only four possible bases in DNA (or RNA), there are 20 possible amino acids in proteins. The genetic code is a sort of “bilingual dictionary” which translates the language of DNA into the language of proteins. In the genetic code the letters are the four bases A, C, G, and T (or U instead of T in RNA). Obviously, the four bases of DNA are not enough to code for 20 amino acids. A sequence of two bases is also insufficient, because this permits coding for only 16 of the 20 amino acids in proteins. Therefore, a sequence of three bases is required to ensure enough combinations to code for all 20 amino acids. Since all the combinations in this DNA language, called codons, consist of three letters, the genetic code is often referred to as the triplet code.
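The counting argument above is easy to verify by enumeration. The short sketch below lists every possible "word" of one, two, and three bases; the function name `count_words` is an assumption made for this example.

```python
from itertools import product

BASES = "ACGT"
AMINO_ACIDS = 20


def count_words(length: int) -> int:
    """Number of distinct base sequences of the given length."""
    return sum(1 for _ in product(BASES, repeat=length))


if __name__ == "__main__":
    for length in (1, 2, 3):
        total = count_words(length)
        verdict = "enough" if total >= AMINO_ACIDS else "not enough"
        print(f"{length}-base words: {total} ({verdict} for {AMINO_ACIDS} amino acids)")
```

The totals are 4, 16, and 64, so three bases is the shortest word length that can cover all 20 amino acids.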
Each codon specifies a particular amino acid. Because there are 64 possible codons and only 20 amino acids, several different codons specify the same amino acid, so the genetic code is said to be degenerate. However, the code is unambiguous because each codon specifies only one amino acid.
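Degeneracy and unambiguity can both be seen by building the codon table and grouping it by amino acid. The table below is the standard genetic code written with DNA letters and one-letter amino acid abbreviations in a common compact layout; it is an addition for illustration and does not appear in the text. A Python dictionary maps each codon to exactly one value, which mirrors the point that the code is unambiguous.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in the order TTT, TTC, TTA,
# TTG, TCT, ... ; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_by_amino_acid(table):
    """Group codons by the amino acid (or stop signal) they specify."""
    grouped = defaultdict(list)
    for codon, amino_acid in table.items():
        grouped[amino_acid].append(codon)
    return dict(grouped)


if __name__ == "__main__":
    grouped = codons_by_amino_acid(CODON_TABLE)
    print(len(CODON_TABLE), "codons in total")
    print(len(grouped) - 1, "amino acids, not counting the stop signal")
    print("codons for leucine (L):", grouped["L"])
    print("codons for tryptophan (W):", grouped["W"])
```

Leucine is specified by several codons while tryptophan has only one, which shows that the degree of degeneracy differs from one amino acid to another.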
Since in eukaryotes DNA never leaves the nucleus, the information it stores is not transferred to the cell directly. Instead, a DNA sequence must first be copied into a messenger RNA molecule, which carries the genetic information from the nucleus to protein assembly sites in the cytoplasm. There it serves as the template for protein construction. The sequences of nucleotide triplets in messenger RNA are also referred to as codons.
Expression of genetic information
Genetic information flows from DNA to RNA to protein. Ultimately, the linear sequence of nucleotides in DNA directs the production of a protein molecule with a characteristic three-dimensional structure essential to its proper function. Initially, information is transcribed from DNA to RNA. The information in the resulting messenger RNA is then translated from RNA into protein by small transfer RNA molecules.
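The two steps, transcription and translation, can be sketched as a pair of small functions. This is a simplified model under stated assumptions: the template strand is written so that position i pairs with position i of the RNA, only a four-entry excerpt of the standard genetic code is included, and the names `transcribe` and `translate` and the sample template are hypothetical.

```python
# DNA template base -> RNA base placed opposite it (RNA uses U instead of T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A small excerpt of the standard genetic code, enough for this example.
CODONS = {"AUG": "Met", "GAG": "Glu", "GUG": "Val", "UAA": "STOP"}


def transcribe(template: str) -> str:
    """Build the messenger RNA dictated by a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[start:start + 3]]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein


if __name__ == "__main__":
    template = "TACCTCATT"  # hypothetical DNA template strand
    mrna = transcribe(template)
    print(mrna)
    print(translate(mrna))
```

In the cell the reading of each codon is carried out by transfer RNA molecules; the dictionary lookup stands in for that adaptor role.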
In some exceptional cases the flow of genetic information from DNA to RNA is reversed. In retroviruses, such as HIV, the virus that causes AIDS, RNA is the hereditary material. An enzyme known as reverse transcriptase makes a DNA copy using the virus’s RNA as a template. In still other viruses that use RNA as the hereditary material, DNA is not involved in the flow of information at all.
KEY TERMS
Codon— The base sequence of three consecutive nucleotides on DNA (or RNA) that codes for a particular amino acid or signals the beginning or end of protein synthesis.
Cytoplasm— All the protoplasm in a living cell that is located outside of the nucleus, as distinguished from nucleoplasm, which is the protoplasm in the nucleus.
Gene— A discrete unit of inheritance, represented by a portion of DNA located on a chromosome. The gene is a code for the production of a specific kind of protein or RNA molecule, and therefore for a specific inherited characteristic.
Genetic code— The blueprint for all structures and functions in a cell as encoded in DNA.
Genetic engineering— The manipulation of the genetic content of an organism for the sake of genetic analysis or to produce or improve a product.
Genome— The complete set of genes an organism carries.
Nucleotide— The basic unit of DNA. It consists of deoxyribose, phosphate, and a ring-like, nitrogen-containing base.
Nucleus— A compartment in the cell which is enclosed by a membrane and which contains its genetic information.
Replication— The synthesis of a new DNA molecule from a pre-existing one.
Transcription— The process of synthesizing RNA from DNA.
Translation— The process of protein synthesis.
Most cells in the body contain the same DNA as that in the fertilized egg. (Some exceptions to this are the sex cells, which contain only half of the normal complement of DNA, as well as red blood cells, which lose their nucleus when fully developed.) Some so-called housekeeping genes are expressed in all cells because they are involved in the fundamental processes required for normal function. (A gene is said to be expressed when its product, the protein it codes for, is actively produced in a cell.) For example, since all cells require ribosomes, structures that function as protein assembly lines, the genes for ribosomal proteins and ribosomal RNA are expressed in all cells. Other genes are only expressed in certain cell types, such as genes for antibodies in certain cells of the immune system. Some are expressed only during certain times in development. How is it that some cells express certain genes while others do not, even though all contain the same DNA? A complete answer to this question is still in the works. However, the main way is by controlling the start of transcription. This is accomplished by the interaction of proteins called transcription factors with DNA sequences near the gene. By binding to these sequences transcription factors may turn a gene on or off.
Another way is to change the rate of messenger RNA synthesis. Sometimes the stability of the messenger RNA is altered. The protein product itself may be altered, as well as its transport or stability. Finally, gene expression can be altered by DNA rearrangements. Such programmed reshuffling of DNA is the means of generating the huge assortment of antibody proteins found in immune cells.
Genetic engineering and recombinant DNA
Cells that contain the same recombinant DNA fragment are clones. A clone harboring a recombinant DNA molecule that contains a specific gene can be isolated and identified by a number of techniques, depending upon the particular experiment. Thus, recombinant DNA molecules can be introduced into rapidly growing microorganisms, such as bacteria or yeast, to produce large quantities of medically or commercially important proteins normally present only in scant amounts in the cell. For example, human insulin and interferon have been produced in this manner.
A technique developed in the 1980s permits analysis of very small samples of DNA without repeated cloning, which is laborious. Known as the polymerase chain reaction (PCR), this technique involves “amplifying” a particular fragment of DNA by repeated synthesis using the enzyme DNA polymerase. This method can increase the amount of the desired DNA fragment a millionfold or more.
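The millionfold figure follows from repeated doubling. The sketch below assumes an idealized reaction in which every cycle exactly doubles the number of copies, which real reactions only approximate; the function names are chosen here for illustration.

```python
import math


def copies_after(cycles: int, start: int = 1) -> int:
    """Copies of the target fragment after `cycles` rounds of perfect doubling."""
    return start * 2 ** cycles


def cycles_needed(fold: float) -> int:
    """Smallest number of ideal cycles that gives at least `fold` amplification."""
    return math.ceil(math.log2(fold))


if __name__ == "__main__":
    print("cycles for a millionfold increase:", cycles_needed(1_000_000))
    print("copies from one fragment after 20 cycles:", copies_after(20))
```

Under this assumption about 20 cycles suffice for a millionfold increase, and each additional cycle doubles the yield again.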
Patricia V. Racenis
Deoxyribonucleic Acid (DNA)
Deoxyribonucleic acid (DNA), "the master molecule," is a natural polymer that encodes the genetic information required for the growth, development, and reproduction of an organism. Found in all cells, it consists of chains of units called nucleotides. Each nucleotide unit contains three components: the sugar deoxyribose, a phosphate group, and a nitrogen-containing amine or base with a ring-type structure. The base component can be any of four types: adenine, cytosine, guanine, or thymine.
DNA molecules are very long and threadlike. They consist of two polymeric strands twisted about each other into a spiral shape known as a double helix, which resembles two intertwined circular staircases. DNA is found within the cell nucleus in the chromosomes, which are extremely condensed structures in which DNA is associated with proteins. Each species contains a characteristic number of chromosomes in its cells. In humans, every cell contains 46 chromosomes (except for egg and sperm cells, which contain only 23). The total genetic information in a cell is called its genome.
The fundamental units of heredity are genes. A gene is a segment of a DNA molecule that encodes the information necessary to make a specific protein. Proteins are the "workhorses" of the cell. These large, versatile molecules serve as structural components, transport molecules in and out of cells, catalyze cellular reactions, and recognize and eliminate invaders. Imagine a community in which the trash collectors, goods distributors, manufacturers, and police are all on strike, and you get an idea of the importance of proteins in the life of a cell.
DNA not only encodes the "blueprints" for cellular proteins but also the instructions for when and where they will be made. For example, the oxygen carrier hemoglobin is made in red blood cells but not in nerve cells, though both contain the same total genetic content. Thus, DNA also contains the information necessary for regulating how its genetic messages are used.
Human cells were long thought to contain between 50,000 and 100,000 genes, although sequencing of the human genome has put the figure closer to 30,000. Except in the case of identical twins, a comparison of the genes from different individuals always reveals a number of differences. Therefore, each person is genetically unique. This is the basis of DNA "fingerprinting," a forensic procedure used to match DNA collected from a crime scene with that of a suspect.
Through the sum of their effects, genes direct the function of all organs and systems in the body. Defects in the DNA of just one gene can cause a genetic disorder which results in disease because the protein encoded by the defective gene is abnormal. The abnormal hemoglobin produced by people afflicted with sickle cell anemia is an example. Defects in certain genes called oncogenes, which regulate growth and development, give rise to cancer. Only about 100 genes are thought to be oncogenes. Therefore, defects in DNA can affect the two kinds of genetic information it carries: messages directing the manufacture of proteins and information regulating the expression, or carrying out, of these messages.
History
Prior to the discovery of the nucleic acids, the Austrian monk Gregor Mendel (1822-1884) worked out the laws of inheritance by the selective breeding of pea plants. As early as 1865 he proposed that "factors" from each parent were responsible for the inheritance of certain characteristics in plants. The Swiss biochemist Friedrich Miescher (1844-1895) discovered the nucleic acids in 1868 in nuclei isolated from pus cells scraped from surgical bandages. However, research on the chemical structure of nucleic acids lagged until new analytical techniques became available in the mid-twentieth century. With the advent of these new methods came evidence that the nucleic acid we now know as DNA was present in the nuclei of all cells, along with evidence about the chemical structure of its nucleotide components.
Despite knowledge of the chemical structure of nucleotides and how they were linked together to form DNA, the possibility that DNA was the genetic material was regarded as unlikely. As late as the mid-twentieth century, proteins were thought to be the molecules of heredity because they appeared to be the only cellular components diverse enough to account for the large variety of genes. In 1944, Oswald Avery (1877-1955) and his colleagues showed that non-pathogenic strains of pneumococcus, the bacterium that causes pneumonia, could become pathogenic (disease-causing) if treated with a DNA-containing extract from heat-killed pathogenic strains. Based on this evidence, Avery concluded that DNA was the genetic material. However, widespread acceptance of DNA as the bearer of genetic information did not come until a report by other workers in 1952 that DNA, not protein, enters a bacterial cell infected by a virus. This showed that the genetic material of the virus was contained in its DNA, confirming Avery's hypothesis.
Shortly afterwards, in 1953, James Watson (1928-) and Francis Crick (1916-2004) proposed their double helix model for the three-dimensional structure of DNA. They correctly deduced that the genetic information was encoded in the form of the sequence of nucleotides in the molecule. With their landmark discovery began an era of molecular genetics in biology. Eight years later investigators cracked the genetic code. They found that specific trinucleotide sequences (sequences of three nucleotides) are codes for each of 20 amino acids, the building blocks of proteins.
In 1970 scientists found that bacteria contained restriction enzymes, molecular "scissors" that recognize a particular sequence of 4-8 nucleotides and will always cut DNA at or near that sequence to yield specific (rather than random), consistently reproducible DNA fragments. Two years later it was found that the bacterial enzyme DNA ligase could be used to rejoin these fragments. This permitted scientists to construct "recombinant" DNA molecules; that is, DNA molecules composed of segments from two different sources, even from different organisms. With the availability of these tools, genetic engineering became possible and biotechnology began.
By 1984 the development of DNA fingerprinting allowed forensic chemists to compare DNA samples from a crime scene with those of suspects. The first conviction using this technique came in 1987. Three years later doctors first attempted to use gene therapy to treat a patient unable to produce a vital immune protein. This technique involves inserting a portion of DNA into a patient's cells to correct a deficiency in a particular function. The Human Genome Project also began in 1990. The aim of this project is to determine the nucleotide sequence in DNA of the entire human genome, which consists of about three billion nucleotide pairs. In 2001, researchers announced the completion of the sequencing of a human genome, promising refinement by 2003.
Structure
Deoxyribose, the sugar component in each nucleotide, is so called because it has one fewer oxygen atom than ribose, which is present in ribonucleic acid (RNA). Deoxyribose contains five carbon atoms, four of which lie in a ring along with one oxygen atom. The fifth carbon atom is linked to a specific carbon atom in the ring. A phosphate group is always linked to deoxyribose via a chemical bond between an oxygen atom in the phosphate group and a carbon atom in deoxyribose. The base is linked to deoxyribose by a chemical bond between a nitrogen atom in the base and a specific carbon atom in the deoxyribose ring.
The nucleotide components of DNA are connected to form a linear polymer in a very specific way. A phosphate group always connects the sugar component of a nucleotide with the sugar component of the next nucleotide in the chain. Consequently, the first nucleotide bears an unattached phosphate group, and the last nucleotide has a free hydroxyl group. Therefore, DNA is not the same at both ends. This directionality plays an important role in the replication of DNA.
DNA molecules contain two polymer chains or strands of nucleotides and so are said to be double-stranded. (In contrast, RNA is typically single-stranded.) Their shape resembles two intertwined spiral staircases in which the alternating sugar and phosphate groups of the nucleotides compose the sidepieces. The steps consist of pairs of bases, each attached to the sugars on their respective strands. The bases are held together by weak attractive forces called hydrogen bonds. The two strands in DNA are antiparallel, which means that one strand goes in one direction (first to last nucleotide from top to bottom) and the other strand goes in the opposite direction (first to last nucleotide from bottom to top).
Because the sugar and phosphate components which make up the sidepieces are always attached in the same way, the same alternating phosphate-sugar sequence repeats over and over again. The bases attached to each sugar may be one of four possible types. Because of the geometry of the DNA molecule, the only possible base pairs that will fit are adenine (A) paired with thymine (T), and cytosine (C) paired with guanine (G).
The DNA in our cells is a masterpiece of packing. The double helix coils itself around protein cores to form nucleosomes. These DNA-protein structures resemble beads on a string. Flexible regions between nucleosomes allow these structures to be wound around themselves to produce an even more compact fiber. The fibers can then be coiled for even further compactness. Ultimately, DNA is packed into the highly condensed chromosomes. If the DNA in a human cell is stretched, it is approximately 6 ft (1.83 m) long. If all 46 chromosomes are laid end-to-end, their total length is still only about eight-thousandths of an inch. This means that DNA in chromosomes is condensed about 10,000 times more than that in the double helix. Why all this packing? The likely answer is that the fragile DNA molecule would get broken in its extended form. Also, if not for this painstaking compression, the cell might be mired in its own DNA.
Function
DNA directs a cell's activities by specifying the structures of its proteins and by regulating which proteins and how much are produced, and where. In so doing, it never leaves the nucleus. Each human cell contains about 6 ft (2 m) of highly condensed DNA which encodes some 50,000–100,000 genes. If a particular protein is to be made, the DNA segment corresponding to the gene for that protein acts as a template, a pattern, for the synthesis of an RNA molecule in a process known as transcription. This messenger RNA molecule travels from the nucleus to the cytoplasm where it in turn acts as the template for the construction of the protein by the protein assembly apparatus of the cell. This latter process is known as translation and requires an adaptor molecule, transfer RNA, which translates the genetic code of DNA into the language of proteins.
Eventually, when a cell divides, its DNA must be copied so that each daughter cell will have a complete set of genetic instructions. The structure of DNA is perfectly suited to this process. The two intertwined strands unwind, exposing their bases, which then pair with bases on free nucleotides present in the cell. The bases pair only in a certain combination; adenine (A) always pairs with thymine (T) and cytosine (C) always pairs with guanine (G). The sequence of bases along one strand of DNA therefore determines the sequence of bases in the newly forming complementary strand. An enzyme then joins the free nucleotides to complete the new strand. Since the two new DNA strands that result are identical to the two originals, the cell can pass along an exact copy of its DNA to each daughter cell.
Sex cells, the eggs and sperm, contain half as many chromosomes as other cells. When the egg and sperm fuse during fertilization, they form the first cell of a new individual with the complete complement of DNA: 46 chromosomes. Each cell (except the sex cells) in the new person carries DNA identical to that in the fertilized egg cell. In this way the DNA of both parents is passed from one generation to the next. Thus, DNA plays a crucial role in the propagation of life.
Replication of DNA
DNA replication, the process by which the double-stranded DNA molecule reproduces itself, is a complicated process, even in the simplest organisms. DNA synthesis (making new DNA from old) is complex because it requires the interaction of a number of cellular components and is rigidly controlled to ensure the accuracy of the copy, upon which the very life of the organism depends. This adds several verification steps to the procedure. Though the details vary from organism to organism, DNA replication follows certain rules that are universal to all.
DNA replication (duplication, or copying) is always semi-conservative. During DNA replication the two strands of the parent molecule unwind and each becomes a template for the synthesis of the complementary strand of the daughter molecule. As a result both daughter molecules contain one new strand and one old strand (from the parent molecule), hence the term semi-conservative. The replication of DNA always requires a template, an intact strand from the parent molecule. This strand determines the sequence of nucleotides on the new strand. Wherever the nucleotide on the template strand contains the base A, then the nucleotide to be added to the daughter strand at that location must contain the base T. Conversely, every T must find an A to pair with. In the same way, Gs and Cs will pair with each other and with no other bases.
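The template rule described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of the enzymatic process: strands are written base-by-base in matching positions, strand direction is ignored, and the function names are hypothetical.

```python
# Pairing rules from the text: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesize_complement(template):
    """Return the new strand that pairs base-by-base with the template."""
    return "".join(PAIR[base] for base in template)

def replicate(duplex):
    """Semi-conservative replication of a (strand, partner) pair.

    Each parental strand is kept and receives a newly made partner,
    so every daughter molecule holds one old and one new strand.
    """
    old_a, old_b = duplex
    daughter_1 = (old_a, synthesize_complement(old_a))   # old strand first
    daughter_2 = (synthesize_complement(old_b), old_b)   # old strand second
    return [daughter_1, daughter_2]

parent = ("ATGCCGTA", synthesize_complement("ATGCCGTA"))
daughters = replicate(parent)
for daughter in daughters:
    print(daughter)
```

Both daughter molecules come out identical to the parent, which is the point of the pairing rule: one strand fully determines the other.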
Replication begins at a specific site called the replication origin when the enzyme DNA helicase binds to a portion of the double-stranded helix and "melts" the bonds between base pairs. This unwinds the helix to form a replication fork consisting of two separated strands, each serving as a template. Specific proteins then bind to these single strands to prevent them from re-pairing. Another enzyme, DNA polymerase, proceeds to assemble the daughter strands using a pool of free nucleotide units which are present in the cell in an "activated" form.
High fidelity in the copying of DNA is vital to the organism and, incredibly, only about one error per one trillion replications ever occurs. This high fidelity results largely because DNA polymerase is a "self-editing" enzyme. If a nucleotide added to the end of the chain mismatches the complementary nucleotide on the template, pairing does not occur. DNA polymerase then clips off the unpaired nucleotide and replaces it with the correct one.
Occasionally errors are made during DNA replication and passed along to daughter cells. Such errors are called mutations. They can have serious consequences because they can cause the insertion of the wrong amino acid into a protein. For example, the substitution of a T for an A in the gene encoding hemoglobin causes an amino acid substitution which results in sickle cell anemia. To understand the significance of such mutations requires knowledge of the genetic code.
The genetic code
Genetic information is stored as nucleotide sequences in DNA (or RNA) molecules. This sequence specifies the identity and position of the amino acids in a particular protein. Amino acids are the building blocks of proteins in the same way that nucleotides are the building blocks of DNA. However, though there are only four possible bases in DNA (or RNA), there are 20 possible amino acids in proteins. The genetic code is a sort of "bilingual dictionary" which translates the language of DNA into the language of proteins. In the genetic code the letters are the four bases A, C, G, and T (or U instead of T in RNA). Obviously, the four bases of DNA are not enough to code for 20 amino acids one-to-one. A sequence of two bases is also insufficient, because this permits coding for only 16 of the 20 amino acids in proteins. Therefore, a sequence of three bases is required to ensure enough combinations or "words" to code for all 20 amino acids. Since all words in this DNA language, called codons, consist of three letters, the genetic code is often referred to as the triplet code.
Each codon specifies a particular amino acid. Because there are 64 possible codons (4^3 = 64 different 3-letter "words" can be generated from a 4-letter "alphabet") and only 20 amino acids, several different codons specify the same amino acid, so the genetic code is said to be degenerate. However, the code is unambiguous because each codon specifies only one amino acid. The sequence of codons is not interrupted by "commas" and is always read in the same frame of reference, starting with the same base every time. So the "words" never overlap.
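The counting argument for the triplet code is easy to reproduce. The following sketch enumerates every possible "word" of one, two, and three bases; the helper name count_words is illustrative.

```python
from itertools import product

BASES = "ACGT"
AMINO_ACIDS = 20   # number of amino acids the code must cover

def count_words(length):
    """Number of distinct words of the given length over a 4-letter alphabet."""
    return len(BASES) ** length

# Enumerate the three-letter words explicitly to confirm the count.
codons = ["".join(letters) for letters in product(BASES, repeat=3)]

for length in (1, 2, 3):
    enough = count_words(length) >= AMINO_ACIDS
    print(f"{length}-base words: {count_words(length)} (enough for 20? {enough})")
```

Only the three-base words provide at least 20 combinations, and the surplus (64 codons for 20 amino acids) is what makes the code degenerate.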
Since DNA never leaves the nucleus, the information it stores is not transferred to the cell directly. Instead, a DNA sequence must first be copied into a messenger RNA molecule, which carries the genetic information from the nucleus to protein assembly sites in the cytoplasm. There it serves as the template for protein construction. The sequences of nucleotide triplets in messenger RNA are also referred to as codons.
Four codons serve special functions. Three are stop codons that signal the end of protein synthesis. The fourth is a start codon which establishes the "reading frame" in which the message is to be read. For example, suppose the message is PAT SAW THE FAT RAT. If we overshoot the reading frame by one "nucleotide," we obtain ATS AWT HEF ATR AT, which is meaningless.
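The reading-frame example above can be reproduced directly. In this sketch the function read_in_frame is a hypothetical helper that strips the spaces from the message and regroups the letters in threes, starting at a chosen offset.

```python
def read_in_frame(message, offset=0, word_size=3):
    """Group the letters of a message into fixed-size words from an offset."""
    letters = message.replace(" ", "")[offset:]
    return [letters[i:i + word_size] for i in range(0, len(letters), word_size)]

message = "PAT SAW THE FAT RAT"

# Correct frame: reading starts at the first letter.
print(" ".join(read_in_frame(message, offset=0)))

# Shifted frame: reading starts one letter too late.
print(" ".join(read_in_frame(message, offset=1)))
```

The first line prints the original sentence and the second prints ATS AWT HEF ATR AT, the meaningless message given in the text.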
The genetic code is essentially universal. This means that a codon which specifies the amino acid tryptophan in bacteria also codes for it in humans. The only exceptions occur in mitochondria and chloroplasts and in some protozoa. (Mitochondria and chloroplasts are sub-cellular compartments which are the sites of cellular respiration and, in plants, photosynthesis, respectively, and contain some DNA.) The structure of the genetic code has evolved to minimize the effect of mutations. Changes in the third base of a codon do not necessarily result in a change in the specified amino acid during protein synthesis. Furthermore, changes in the first base in a codon generally result in the same or at least a similar amino acid. Studies of amino acid changes resulting from mutations have shown that they are consistent with the genetic code. That is, amino acid changes resulting from mutations are consistent with expected base changes in the corresponding codon. These studies have confirmed that the genetic code has been deduced correctly by demonstrating its relevance in actual living organisms.
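The tolerance of the code to third-base changes can be explored with the standard codon table. The sketch below builds the table from the conventional compact listing (bases in the order T, C, A, G, amino acids as one-letter symbols, * for stop) and measures how often a single-base change at each codon position leaves the amino acid unchanged. The function names are illustrative, and the hemoglobin codon change shown (GAG to GTG) is the commonly cited sickle cell substitution rather than a detail given in this article.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def point_mutants(codon, position):
    """All codons that differ from `codon` only at the given position."""
    return [codon[:position] + b + codon[position + 1:]
            for b in BASES if b != codon[position]]

def silent_fraction(position):
    """Share of single-base changes at a position that keep the same meaning."""
    total = silent = 0
    for codon, amino_acid in CODON_TABLE.items():
        for mutant in point_mutants(codon, position):
            total += 1
            silent += CODON_TABLE[mutant] == amino_acid
    return silent / total

for position in range(3):
    print(f"base {position + 1}: {silent_fraction(position):.0%} of changes are silent")

# A second-base change that does alter the protein: glutamate (E) to valine (V).
print(CODON_TABLE["GAG"], "->", CODON_TABLE["GTG"])
```

Changes at the third base are silent far more often than changes at the first or second base, which is the pattern the paragraph above describes.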
Expression of genetic information
Genetic information flows from DNA to RNA to protein. Ultimately, the linear sequence of nucleotides in DNA directs the production of a protein molecule with a characteristic three dimensional structure essential to its proper function. Initially, information is transcribed from DNA to RNA. The information in the resulting messenger RNA is then translated from RNA into protein by small transfer RNA molecules.
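The two-step flow from DNA to RNA to protein can be illustrated with a toy example. This is a minimal sketch under stated assumptions: the template strand is a made-up sequence rather than a real gene, the codon table is deliberately partial (only the four codons the example needs), and strand direction is ignored.

```python
# Base pairing used when RNA is built on a DNA template (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Partial codon table: only the codons used in this example.
CODONS = {"AUG": "Met", "GAG": "Glu", "UGG": "Trp", "UAA": "STOP"}

def transcribe(template_strand):
    """Build the messenger RNA that pairs with a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

template = "TACCTCACCATT"   # hypothetical template strand, not a real gene
mrna = transcribe(template)
print(mrna)                  # messenger RNA
print(translate(mrna))       # amino acids in order
```

Here the dictionary lookup plays the role that transfer RNA plays in the cell: it matches each codon to its amino acid.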
In some exceptional cases the flow of genetic information from DNA to RNA is reversed. In retroviruses, such as the AIDS virus, RNA is the hereditary material. An enzyme known as reverse transcriptase makes a copy of DNA using the virus' RNA as a template. In still other viruses which use RNA as the hereditary material, DNA is not involved in the flow of information at all.
Most cells in the body contain the same DNA as that in the fertilized egg. (Some exceptions to this are the sex cells, which contain only half of the normal complement of DNA, as well as red blood cells which lose their nucleus when fully developed.) Some "housekeeping" genes are expressed in all cells because they are involved in the fundamental processes required for normal function. (A gene is said to be expressed when its product, the protein it codes for, is actively produced in a cell.) For example, since all cells require ribosomes, structures which function as protein assembly lines, the genes for ribosomal proteins and ribosomal RNA are expressed in all cells. Other genes are only expressed in certain cell types, such as genes for antibodies in certain cells of the immune system. Some are expressed only during certain times in development. How is it that some cells express certain genes while others do not, even though all contain the same DNA? A complete answer to this question is still being worked out. However, the main mechanism is control of the start of transcription. This is accomplished by the interaction of proteins called transcription factors with DNA sequences near the gene. By binding to these sequences transcription factors may turn a gene on or off.
Another way is to change the rate of messenger RNA synthesis. Sometimes the stability of the messenger RNA is altered. The protein product itself may be altered, as well as its transport or stability. Finally, gene expression can be altered by DNA rearrangements. Such programmed reshuffling of DNA is the means of generating the huge assortment of antibody proteins found in immune cells.
Genetic engineering and recombinant DNA
Restriction enzymes come from microorganisms. Recall that they recognize and cut DNA at specific base pair sequences. They cleave large DNA molecules into an assortment of smaller fragments ranging in size from a few to thousands of base pairs long, depending on how often and where the cleavage sequence appears in the original DNA molecule. The resulting fragments can be separated by their size using a technique known as electrophoresis. The fragments are placed at the top of a porous gel surrounded by a solution which conducts electricity. When a voltage is applied, the DNA fragments move towards the bottom of the gel due to the negative charge on their phosphate groups. Because it is more difficult for the large fragments to pass through the pores in the gel, they move more slowly than the smaller fragments.
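Cutting at a recognition sequence and sorting the fragments by size can be sketched as follows. The example assumes the recognition site GAATTC with the cut placed after the first base (the pattern of the well-known enzyme EcoRI); the DNA string is invented for illustration, only one strand is modeled, and the function name digest is hypothetical.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a DNA string at every occurrence of a recognition site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 means the cut falls between G and AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])   # whatever remains after the last cut
    return fragments

dna = "AAGAATTCGGGGGAATTCTT"        # made-up sequence containing two sites
fragments = digest(dna)

# On a gel the largest fragments stay near the top and the smallest run farthest.
for fragment in sorted(fragments, key=len, reverse=True):
    print(len(fragment), fragment)
```

Two recognition sites yield three fragments, and listing them from longest to shortest mimics their order from the top of the gel to the bottom.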
DNA fragments isolated from a gel in this way can be joined with DNA from another source, either of the same or a different species, into a new, recombinant DNA molecule by enzymes. Usually, such DNA fragments are joined with DNA from subcellular organisms—"parasites" that live inside another organism but have their own DNA. Plasmids and viruses are two such examples. Viruses consist only of nucleic acids encapsulated in a protein coat. Though they can exist outside the cell, they are inactive. Inside the cell, they take over its metabolic machinery to manufacture more virus particles, eventually destroying their host. Plasmids are simpler than viruses in that they never exist outside the cell and have no protein coat. They consist only of circular double-stranded DNA. Plasmids replicate their DNA independently of their hosts. They are passed on to daughter cells in a controlled way as the host cell divides.
Cells that contain the same recombinant DNA fragment are clones. A clone harboring a recombinant DNA molecule that contains a specific gene can be isolated and identified by a number of techniques, depending upon the particular experiment. Thus, recombinant DNA molecules can be introduced into rapidly growing microorganisms, such as bacteria or yeast, to produce large quantities of medically or commercially important proteins normally present only in scant amounts in the cell. For example, human insulin and interferon have been produced in this manner.
In recent years a technique has been developed which permits analysis of very small samples of DNA without repeated cloning, which is laborious. Known as the polymerase chain reaction, this technique involves "amplifying" a particular fragment of DNA by repeated synthesis using the enzyme DNA polymerase. This method can increase the amount of the desired DNA fragment by a million-fold or more.
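The million-fold figure follows from repeated doubling. The sketch below assumes an idealized reaction in which every cycle copies every fragment (an efficiency of 1.0); real reactions fall short of this, and the function names and the efficiency parameter are illustrative.

```python
import math

def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Copies present after a number of cycles; efficiency 1.0 means doubling."""
    return starting_copies * (1 + efficiency) ** cycles

def cycles_for_fold(fold, efficiency=1.0):
    """Smallest number of cycles giving at least the requested amplification."""
    return math.ceil(math.log(fold) / math.log(1 + efficiency))

print(copies_after(20))              # copies from one fragment after 20 cycles
print(cycles_for_fold(1_000_000))    # cycles needed for a million-fold increase
```

Under perfect doubling, 20 cycles already exceed a million copies per starting fragment; at lower efficiency more cycles are needed.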
See also Chromosome; Enzyme; Genetics; Meiosis; Mitosis; Mutation; Nucleic acid.
Patricia V. Racenis
KEY TERMS
- Codon: The base sequence of three consecutive nucleotides on DNA (or RNA) that codes for a particular amino acid or signals the beginning or end of a messenger RNA molecule.
- Cytoplasm: All the protoplasm in a living cell that is located outside of the nucleus, as distinguished from nucleoplasm, which is the protoplasm in the nucleus.
- Gene: A discrete unit of inheritance, represented by a portion of DNA located on a chromosome. The gene is a code for the production of a specific kind of protein or RNA molecule, and therefore for a specific inherited characteristic.
- Genetic code: The blueprint for all structures and functions in a cell as encoded in DNA.
- Genetic engineering: The manipulation of the genetic content of an organism for the sake of genetic analysis or to produce or improve a product.
- Genome: The complete set of genes an organism carries.
- Nucleotide: The basic unit of DNA. It consists of deoxyribose, phosphate, and a ring-like, nitrogen-containing base.
- Nucleus: A compartment in the cell which is enclosed by a membrane and which contains its genetic information.
- Replication: The synthesis of a new DNA molecule from a pre-existing one.
- Transcription: The process of synthesizing RNA from DNA.
- Translation: The process of protein synthesis.
DNA
DNA (deoxyribonucleic acid) is the molecule that stores genetic information in living systems. Like other organic molecules, DNA mostly consists of carbon, along with hydrogen, oxygen, nitrogen, and phosphorus. The fundamental structural unit of DNA is the nucleotide, which has two parts: an unvarying portion composed of sugar and phosphate, attached to one of four nitrogen-containing bases named adenine, cytosine, guanine, or thymine (abbreviated A, C, G, T).
The Double Helix
The structure of DNA, deduced in 1953 by James Watson, Francis Crick, and Rosalind Franklin, resembles that of a twisted ladder or spiral staircase composed of two long chains of nucleotides that are coiled around each other to form a double helix. The DNA ladder's two sidepieces (its double-stranded backbone) are made of alternating units of sugar and phosphate. The sugar is deoxyribose, which contains a ring of four carbons and one oxygen. A phosphate is an atom of phosphorus bonded to four oxygens. Bases attached to opposing sugars project inward toward each other to form rungs or steps, called base pairs. In contrast to the strong covalent (electron-sharing) bonds between nucleotides in a strand, the two bases in a base pair are held together only by much weaker hydrogen bonds. However, the cumulative attractive force of the hydrogen bonds in a chain of base pairs maintains DNA as a double-stranded molecule under physiological conditions. In the cell nucleus, DNA is bound to proteins to form chromosomes, and is coated with a layer of water molecules.
To make a sturdy rung, the two bases in a base pair have to interlock like pieces of a jigsaw puzzle, which only happens if their shapes and hydrogen-bonding characteristics are compatible. Only two combinations fulfill these requirements in DNA: G–C and A–T. This rule makes the two strands of a DNA molecule complementary, so that if the bases of one strand are ordered GGTACAT, the bases of the opposite strand must be ordered CCATGTA. The order of the bases on a strand (mirrored in the complementary strand) is called the sequence of the DNA, and embodies coded instructions for making new biomolecules: proteins, ribonucleic acid (RNA), and DNA itself.
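The complementarity rule amounts to a simple letter-for-letter substitution. The sketch below applies it to the example sequence from the text; the function names are illustrative.

```python
# Translation table implementing the two allowed pairs: A-T and G-C.
PAIRING = str.maketrans("ACGT", "TGCA")

def complement(strand):
    """Return the opposite strand, written in the same left-to-right positions."""
    return strand.translate(PAIRING)

def are_complementary(strand_a, strand_b):
    """True if every base in one strand pairs with the base facing it."""
    return len(strand_a) == len(strand_b) and complement(strand_a) == strand_b

print(complement("GGTACAT"))
print(are_complementary("GGTACAT", "CCATGTA"))
```

Applying the substitution twice returns the original strand, which is why either strand can serve as the record of the other.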
Complementarity and Replication
Each strand of DNA has a direction in which it can be read by the cellular machinery, arising from the arrangement of phosphates and sugars in the backbone. The two strands of DNA are oriented antiparallel to each other, that is, they lie parallel to each other but are decoded in opposite directions. Because of the numbering convention for the carbons in the sugar, the directions along the backbone are called 5′ → 3′ ("five-prime to three-prime") or 3′ → 5′. The complementary nature of the two strands means that instructions for making new DNA can be read from both strands.
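Because the strands are antiparallel, reading the partner strand in its own 5′ → 3′ direction means both swapping each base for its partner and reversing the order. A minimal sketch, assuming sequences are written 5′ → 3′ as is conventional (the function name is illustrative):

```python
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand_5_to_3):
    """Return the partner strand, also written in the 5' -> 3' direction."""
    # Walk the input backwards so the output starts at the partner's 5' end.
    return "".join(PARTNER[base] for base in reversed(strand_5_to_3))

top = "GGTACAT"
bottom = reverse_complement(top)
print("5'-" + top + "-3'")
print("3'-" + bottom[::-1] + "-5'")   # drawn underneath, base facing base
print("partner read 5'->3':", bottom)
```

The partner of GGTACAT is CCATGTA when drawn base facing base, but the same strand reads ATGTACC when decoded in its own 5′ → 3′ direction.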
When DNA replicates, the weak hydrogen bonds of base pairs are broken and the two strands separate. Each strand acts as a template for the synthesis of a new complementary strand. Since the resulting new double-stranded molecule always contains one "old" (template) strand and one newly made strand, DNA replication is said to be semiconservative; it would be termed conservative if the two original template strands rejoined. By a similar mechanism (transcription), a DNA strand can be a template for the synthesis of RNA, which is a single-stranded nucleic acid that carries coded information from the DNA to the protein synthesizing machinery of the cell. During protein synthesis, the genetic code is used to translate the order of bases originally found in the DNA sequence into the order of amino acid building blocks in a protein.
Genes, Noncoding Sequences, and Methylation
DNA exists in nature as a macromolecule millions of base pairs long. In multicelled organisms, the complete set of genetic information, the genome, is divided among several DNA macromolecules (called chromosomes) in the cell nucleus. In contrast, the genomes of many one-celled organisms consist of a single, often circular, chromosome. The human genome contains 3.2 billion base pairs distributed among twenty-three chromosomes. Laid end to end, these would make a macromolecule 1.7 meters (5.5 feet) long; printed out, they would fill one thousand one-thousand-page telephone books. Furthermore, two copies of the genome are in almost every cell of humans and other diploid organisms. This vast amount of DNA packs into a cell nucleus, whose diameter is only a few millionths of a meter, by first spooling around globular proteins called histones. The DNA/histone complex then coils and curls up into even denser configurations, like a rubber band does when one holds one end and rolls the other end between one's fingers. Yet the human genome isn't nearly nature's biggest: the genome of a lily is just over ten times larger than a human's, although its nuclei are not significantly larger.
WILKINS, MAURICE (1916– )
New Zealand–born British biologist who helped James Watson and Francis Crick deduce the structure of deoxyribonucleic acid (DNA), for which the three men received a 1962 Nobel Prize. Wilkins secretly showed Watson an x-ray diffraction photo of DNA taken by researcher Rosalind Franklin. Watson and Crick later used Franklin's extensive unpublished data to build a model of DNA.
The information storage capacity of DNA is vast; a microgram (one-millionth of a gram) of DNA theoretically could store as much information as 1 million compact discs. The "useful" information contained in genomes consists of the coded instructions for making proteins and RNA. These information-containing regions of a genome are called genes. However, genes comprise less than 5 percent of the human genome. Most genomes consist largely of repetitive, noncoding DNA (sometimes called junk DNA) that is interspersed with genes and whose only apparent function is to replicate itself. Perhaps it helps to hold the chromosome together. The tenfold greater size of the lily genome compared to humans' is due to the presence of enormous amounts of repetitive DNA of unknown function.
While most cells of higher organisms contain all the genes in the genome, specialized cells such as neurons or muscle require expression from only some of the genes. One strategy for silencing unneeded genes is methylation. A methyl group (–CH3) is added to cytosine nucleotides, but only if they are followed by a guanine in the sequence, that is, CG. Adding methyl groups to a region of DNA attracts repressive DNA-binding proteins to it and may also cause the region to compact even further, making it inaccessible to proteins that make RNA from DNA (the first step of protein synthesis). During DNA replication the pattern of methylation is preserved by specific proteins that add methyl groups to the new strand based on the location of CG methyl groups in the template strand. The most extreme case of repression by methylation is X-inactivation, in which one of the two X chromosomes in cells of a female mammal is entirely shut down, presumably because expression from one X provides enough protein in females, as it does in males (who have only one X chromosome).
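The CG rule and the copying of a methylation pattern onto a new strand can be sketched as follows. This is a simplified illustration: the sequence and the set of methylated positions are invented, positions are zero-based indices along the template, and the function names are hypothetical. It relies on the fact that a CG on one strand faces a CG on the partner strand, so the cytosine to be marked on the new strand sits opposite the guanine of the template's CG.

```python
def cpg_sites(sequence):
    """Positions of every cytosine that is immediately followed by a guanine."""
    return [i for i in range(len(sequence) - 1) if sequence[i:i + 2] == "CG"]

def maintain_methylation(template, methylated_positions):
    """Positions on the new strand (template coordinates) that receive a methyl group.

    For each methylated C at position i of a CG in the template, the new
    strand carries a C opposite the template's G at position i + 1.
    """
    sites = set(cpg_sites(template))
    return sorted(i + 1 for i in methylated_positions if i in sites)

template = "ACGTTCGCGA"          # made-up sequence
print(cpg_sites(template))       # cytosines eligible for methylation

methylated = {1, 7}              # hypothetical marks on the template strand
print(maintain_methylation(template, methylated))
```

Only cytosines in a CG context are eligible, and each mark on the old strand determines exactly one mark on the new strand, which is how the pattern survives replication.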
see also Chromosome, Eukaryotic; Control of Gene Expression; Crick, Francis; Gene; Mutation; Nucleotides; Replication; RNA; Watson, James
Steven A. Sullivan
A defect in the gene for a protein that binds methylated DNA (MeCP2) causes Rett syndrome, a disorder responsible for intellectual disability and movement disorders in young girls.
DNA
DNA (deoxyribonucleic acid) carries design information between generations, and thus accounts for inherited biological traits (phenotypes). At conception, a father's sperm injects a set of DNA molecules into a mother's egg, which already contains a nearly matching set. Those molecules contain the designs for all the material components their child needs for growth, development, and daily living.
Structure of DNA
The designs are called genes. Some genes play a role in regulating other genes, and some design ribonucleic acid, a close relative of DNA. But mostly, the designs in DNA are for the class of chemicals called proteins. The human body contains tens of thousands of kinds of proteins, which do all the body's work. Interactions among those proteins, and interactions between them and environmental factors, account for the processes and structures of the body. Those processes and structures are manifested as inherited traits. DNA is composed of chains of chemical subunits called nucleotides, each of which contains one nitrogenous base: adenine (A), thymine (T), cytosine (C), or guanine (G). The design instructions in DNA are spelled out as particular sequences of these four bases. This is analogous to conveying instructions in printed books by particular arrangements of the twenty-six letters of the alphabet. In the case of genes, however, there are only four letters in the alphabet. Hundreds of nucleotides are linked in a DNA chain in a sequence that spells out instructions for a single gene.
There are two complementary chains in the structure of DNA. Each nucleotide in DNA has a sugar component joined to a phosphate group at one point on the sugar, and to a nitrogen-containing base attached at another point. The chains in DNA have the phosphate of one nucleotide linked to the sugar of the next nucleotide to form a strand of alternating sugars and phosphates with dangling nitrogenous bases. DNA contains two such chains, twisted around each other to form a double-stranded helix with the bases on the inside. Every A on one chain forms weak bonds with a T on the other strand, and every C on a strand bonds weakly to a G on the opposite chain. The two strands, held together weakly by the pairing of A with T, and G with C, are thus complementary, and the sequence in one can be deduced from the other's sequence.
Design information is transmitted as new DNA to new cells during development and growth. The complementarity of the two DNA strands allows their information to be copied. Each old strand is used as a template in synthesizing a new complementary one. Intricate cellular machinery makes new copies of the DNA when a fertilized egg divides into two progeny cells. When each of the progeny divides again, the new progeny all receive complete copies of the parental DNA. As the fertilized egg grows to become successively an embryo, a fetus, a child, and finally an adult, cells go through many rounds of division with replication of the DNA in each round. Finally, adult humans have trillions of cells, each one (except sperm and ovum) containing complete copies of the DNA initially contributed by the parents.
On rare occasions mutations (changes) are made in nucleotides by chemicals, radiation, or errors in copying DNA. In a nucleotide chain, one nucleotide may be substituted for another, or one or more nucleotides might be inserted or deleted. Sometimes the change in DNA structure has little or no effect on the function of the gene's product, but it frequently harms the function to some degree, or very rarely enhances it. Harmful mutations cause gene-based diseases, but enhancing mutations allow organisms to evolve new or more effective functions. Like normal phenotypes, disease phenotypes usually require the products of multiple genes, so most defective genes predispose an organism to disease rather than directly causing it. The accumulation of mutations within the human species accounts for such phenotypic differences as eye color, stature, or skin pigmentation. The number of mutations among human genes is so large that no two persons, except for identical twins, have exactly the same nucleotide sequence in the three billion bases of their DNA.
Control of gene expression
DNA information is expressed as proteins and their feedback networks. The information resident in nucleotide sequences is used not only for replicating DNA, but also for synthesizing proteins. Proteins are chains of a few hundred subunits called amino acids, of which there are twenty kinds. The amino acids in a protein are arranged in a specific sequence by cellular machinery that translates the genetic information coded in DNA. The sequence of nucleotides, read three at a time, corresponds to the sequence of amino acids in a protein. The amino acids differ among themselves in chemical character so that every kind of protein differs in chemical character from others. For the work of the human body many thousands of proteins are needed, each having a highly specific function like catalyzing a chemical reaction or transporting oxygen. Observable phenotypes are the result of protein action, usually the coordinated action of many proteins. The functions of many proteins are integrated into large networks, and these webs of chemical processes act as feedback control systems allowing organisms to shift the balance of their activities to adapt to changes in the demand for the system's output. Often the networks possess alternate pathways for achieving a desired output.
Differentiation into specialized cells requires the control of gene expression. The development of a human being starts with a single-celled, fertilized egg. As the egg divides into two cells, and as successive rounds of cell division occur, every progeny cell receives a complete copy of parental DNA. In the first few divisions, the cells produced are identical in all observable characteristics, but as cell division continues, cells are produced that differ in phenotype even though all the cells continue to have identical DNA. In this differentiation, particular genes are controlled by blocking their expression, not by changing nucleotide sequence. Regulatory molecules block particular sites in DNA, preventing expression of the corresponding genes as their products. Specific blocking thus generates different patterns of gene expression. Changing patterns of gene expression produce distinct populations of cells, diverging in phenotype as differentiation progresses. Eventually, differentiation in humans produces more than two hundred cell types, organized into different tissues and organs. In any one cell type the majority of its approximately 35,000 genes is repressed, leaving a small subset of expressed genes that differs from the subsets expressed in other cell types. Phenotypic differences between progeny in a given cell generation depend on the location of the cells in different microenvironments. During differentiation cells adapt to a succession of environmental changes produced by changes in their neighboring cells and extracellular fluids. Each successive adaptation is superimposed on its predecessor so that each terminally differentiated cell manifests the entire history of its lineage and not merely its immediate state. Since differentiation is irreversible in animals (except in special cases), history as well as DNA designs a person, even in the material sense.
Feedback networks and regulation of genes allow individual organisms to adapt to changing conditions throughout life. When environment increases the need for the product of a network of chemical reactions, the overall process will be accelerated, and when need decreases the process will be inhibited. Obviously, adaptation to environment is induced by contact with physical and chemical forces, but adaptation can be evoked even without physical contact, as in the adaptation of the brain through learning, and emotional reaction. Many of these adaptive responses affect patterns of gene expression, and therefore environment, as well as history, joins with DNA in designing persons.
At the level of populations, long-term adaptation to environment occurs more by changes in gene structure than by changes in the expression of genes. The mechanism for this adaptation is the natural selection that underlies evolution. For example, skin pigmentation may be an adaptation that protects against exposure to the sun, and the genes that design the pigment systems would be naturally selected in successive generations that are exposed to much sunlight. Similarly, sickle-cell hemoglobin seems to have evolved in Africa because it offers resistance to malaria that is prevalent there.
Long-term adaptation through natural selection is most obvious in the case of physical and chemical aspects of human beings. Less obvious is the adaptation of behaviors through natural selection of genes, a possibility actively studied under the title "sociobiology." Although the mechanisms producing material phenotypes may seem more obvious than those producing social behaviors, a mechanism giving rise to a certain behavior may be thoroughly materialistic, although far more complex. Behavior modification by psychoactive drugs reveals a material mechanism for behavior. A mechanism can be pictured, for example, in the courting and mating behaviors that are correlated with the release of hormones from the brain, when an animal or human senses that a potential mate is near. Those released hormones induce particular chemical reactions at many sites throughout the body, giving rise to an appropriate pattern of bodily actions. Moreover, feedback responses between the mates guide further behavioral interactions between them. The hormonal system that links brain functions to bodily functions is, of course, designed by genes, and the mechanism just sketched is clearly materialistic. The frequent association of natural selection with notions of "survival of the fittest," makes altruism an especially challenging kind of behavior to study in testing the validity of sociobiology theory, and much of the research of sociobiologists is focused on the evolution of a gene for altruism.
Genes affect behavior, but as is the case with most human phenotypes, genes act in combinations and their expression is modulated by the histories and environments of individuals, as already described. Across the variability of individual histories and environments, natural selection must be able to distinguish organisms that possess a particular behavioral gene from those that do not. For a behavioral gene to evolve through natural selection, it must determine the behavior powerfully enough to avoid substantial compromise by variable non-genetic factors. Sociobiology, then, tends to favor a strongly deterministic and materialistic view of behavior.
Human nature and genetic determinism
Choosing is part of human nature, but its degree of autonomy is debated. All agree that choice is constrained by genes, history, and environment, but does any degree of freedom remain? Science describes material brain mechanisms as chains of causes and effects, but every cause is an effect having a prior cause. Since the initial cause is not recognized by science, some say thought initiation is due to chance. Others look for initiation outside the material realm of science by distinguishing between mind and brain, or even spirit and brain.
Some degree of genetic determinism is necessary in describing human nature. All the possible scenarios of a person's life must conform to the designs in DNA, and thus genes set rigid, though spacious boundaries on what a person can be and do. But genes are insufficient for explaining what actually happens. What actually happens within the boundaries set by genes, depends on factors that control genes, including environment, history, and mental state. The question arises whether spiritual forces can be added to the list of controlling factors. Material determinism argues that a complete physicochemical description of the history and state of a person would explain everything without including a spiritual component. Some, however, argue that human spirituality is a capacity that emerged as gene-based human biology evolved, and that its activity cannot be fully comprehended at the molecular level. Still others add spirit as a control factor in human nature in accepting a dualism where body and spirit are distinct, though coexistent, in a person. The disparity in these views of human nature has theological consequences.
A view of human nature according to material determinism fits atheism and deism. It provides no locus for personal interaction with God, although deists might suppose that God influences humans through environment. Belief in human spirituality, either as an emerged capacity or as a distinct part of human nature does provide such a locus. Scientific understanding of gene-based human biology does not perceive a spiritual component in human nature, but it might not be expected that a physicochemico-molecular description of humans would be capable of such discernment in the first place.
See also Gene Patenting; Genetic Defect; Genetic Determinism; Genetics; Human Genome Project; Mutation; Nature versus Nurture
Bibliography
Avise, John C. "Evolving Genomic Metaphors: A New Look at the Language of DNA." Science 294 (2001): 86-87.
Barbour, Ian. Religion in an Age of Science. New York: Harper Collins, 1990.
Dawkins, Richard. The Selfish Gene. Oxford: Oxford University Press, 1989.
Dennis, Carina; Gallagher, Richard; and Campbell, Philip, eds. "The Human Genome." Nature 409 (2001): 813-958.
Edelman, Gerald M. Bright Air, Brilliant Fire: On the Matter of the Mind. New York: Basic Books, 1992.
Goldsmith, Timothy H. The Biological Roots of Human Nature: Forging Links Between Evolution and Behavior. New York: Oxford University Press, 1991.
Hefner, Philip. "Determinism, Freedom, and Moral Failure." In Genetics: Issues of Social Justice, ed. Ted Peters. Cleveland, Ohio: Pilgrim Press, 1998.
Kitcher, Philip. The Lives to Come: The Genetic Revolution and Human Possibilities. New York: Simon and Schuster, 1996.
Kotulak, Ronald. Inside the Brain: Revolutionary Discoveries of How the Mind Works. Kansas City, Mo.: Andrews McMeel, 1996.
Peters, Ted. "Genes, Theology, and Social Ethics." In Genetics: Issues of Social Justice, ed. Ted Peters. Cleveland, Ohio: Pilgrim Press, 1998.
Raven, Peter H., and Johnson, George B. Biology, 6th edition. New York: McGraw Hill, 2002.
Stevens, Raymond C.; Shigeyuki, Yokoyama; and Wilson, Ian A. "Global Efforts in Structural Genomics." Science 294 (2001): 89-92.
Wilson, Edward O. Sociobiology: The New Synthesis. Cambridge, Mass.: Harvard University Press, 1975.
R. David Cole
DNA
DNA (deoxyribonucleic acid) is a nucleic acid that carries genetic information. The study of DNA launched the science of molecular biology, transformed the study of genetics, and led to the cracking of the biochemical code of life. Understanding DNA has facilitated genetic engineering, the genetic manipulation of various organisms; has enabled cloning, the asexual reproduction of identical copies of genes and organisms; has allowed for genetic fingerprinting, the identification of an individual by the distinctive patterns of his or her DNA; and made possible the use of genetics to predict, diagnose, prevent, and treat disease.
Discovering DNA
In the late nineteenth century, biologists noticed structural differences between the two main cellular regions, the nucleus and the cytoplasm. The nucleus attracted attention because short, stringy objects appeared, doubled, then disappeared during the process of cell division. Scientists began to suspect that these objects, dubbed chromosomes, might govern heredity. To understand the operation of the nucleus and the chromosomes, scientists needed to determine their chemical composition.
Swiss physiologist Friedrich Miescher first isolated "nuclein"—DNA—from the nuclei of human pus cells in 1869. Although he recognized nuclein as distinct from other well-known organic compounds like fats, proteins, and carbohydrates, Miescher remained unsure about its hereditary potential. Nuclein was renamed nucleic acid in 1889, and for the next forty years, biologists debated the purpose of the compound.
In 1929, Phoebus Aaron Levene, working with yeast at New York's Rockefeller Institute, described the basic chemistry of DNA. Levene noted that phosphorus bonded to a sugar (either ribose or deoxyribose, giving rise to the two major nucleic acids, RNA and DNA), and supported one of four chemical "bases" in a structure he called a nucleotide. Levene insisted that nucleotides only joined in four-unit-long chains, molecules too simple to transmit hereditary information.
Levene's conclusions remained axiomatic until 1944, when Oswald Avery, a scientist at the Rockefeller Institute, laid the groundwork for the field of molecular genetics. Avery continued the 1920s-era research of British biologist Frederick Griffith, who worked with pneumococci, the bacteria responsible for pneumonia. Griffith had found that pneumococci occurred in two forms, the disease-causing S-pneumococci and the harmless R-pneumococci. Griffith mixed dead S-type bacteria with live R-type bacteria. When mice were inoculated with the mixture, they developed pneumonia. Apparently, Griffith concluded, something had transformed the harmless R-type bacteria into their virulent cousin. Avery surmised that the transforming agent must be a molecule that contained genetic information. Avery shocked himself, and the scientific community, when he isolated the transforming agent and found that it was DNA, thereby establishing the molecular basis of heredity.
DNA's Molecular Structure
Erwin Chargaff, a biochemist at Columbia University, confirmed and refined Avery's conclusion that DNA was complex enough to carry genetic information. In 1950, Chargaff reported that DNA exhibited a phenomenon he dubbed a complementary relationship. The four DNA bases—adenine, cytosine, guanine, and thymine (A, C, G, T, identified earlier by Levene)—appeared to be paired. That is, any given sample of DNA contained equal amounts of G and C, and equal amounts of A and T; guanine was the complement to cytosine, as adenine was to thymine. Chargaff also discovered that the ratio of GC to AT differed widely among different organisms. Rather than Levene's short molecules, DNA could now be reconceived as a gigantic macromolecule, composed of varying ratios of the base complements strung together. Thus, the length of DNA differed between organisms.
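Chargaff's equal-amounts observation can be checked on any double-stranded sequence. The following is a minimal Python sketch rather than anything from the original research; the function names and the example sequence are hypothetical and chosen only for illustration. It builds the partner strand from the complement rule, counts bases over both strands, and computes the G+C fraction, the quantity that varies between organisms.

```python
from collections import Counter

# Complement of each base: A with T, G with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the duplex implied by one strand."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)


def gc_fraction(strand: str) -> float:
    """Fraction of positions occupied by G or C."""
    return (strand.count("G") + strand.count("C")) / len(strand)


example = "AAATTGCAAA"  # hypothetical sequence, for illustration only
counts = duplex_base_counts(example)
print(counts["A"], counts["T"], counts["G"], counts["C"])
print(gc_fraction(example))
```

In the duplex the A and T counts always match, as do the G and C counts, whatever the single-strand composition; only the G+C fraction is free to vary.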
Even as biochemists described DNA's chemistry, molecular physicists attempted to determine DNA's shape. Using a process called X-ray crystallography, chemist Rosalind Franklin and physicist Maurice Wilkins, working together at King's College London in the early 1950s, debated whether DNA had a helical shape. Initial measurements indicated a single helix, but later experiments left Franklin and Wilkins undecided between a double and a triple helix. Both Chargaff and Franklin were one step away from solving the riddle of DNA's structure. Chargaff understood base complementarity but not its relation to molecular structure; Franklin understood general structure but not how complementarity necessitated a double helix.
In 1952, an iconoclastic research team composed of an American geneticist, James Watson, and a British physicist, Francis Crick, resolved the debate and unlocked DNA's secret. The men used scale-model atoms to construct a model of the DNA molecule. Watson and Crick initially posited a helical structure, but with the bases radiating outward from a dense central helix. After meeting with Chargaff, Watson and Crick learned that the GC and AT ratios could indicate chemical bonds; hydrogen atoms could bond the guanine and cytosine, but could not bond either base to adenine or thymine. The inverse also proved true, since hydrogen could bond adenine to thymine. Watson and Crick assumed these weak chemical links and made models of the nucleotide base pairs GC and AT. They then stacked the base-pair models one atop the other, and saw that the phosphate and sugar components of each nucleotide bonded to form two chains with one chain spinning "up" the molecule, the other spinning "down" the opposite side. The resulting DNA model resembled a spiral staircase: the famous double helix.
Watson and Crick described their findings in an epochal 1953 paper published in the journal Nature. Watson and Crick had actually solved two knotty problems simultaneously: the structure of DNA and how DNA replicated itself in cell division, an idea they elaborated in a second pathbreaking paper in Nature. If one split the long DNA molecule at the hydrogen bonds between the bases, then each half provided a framework for assembling its counterpart, creating two complete molecules and so accounting for the doubling of chromosomes during cell division. Although it would take another thirty years for crystallographic confirmation of the double helix, Crick, Watson, and Rosalind Franklin's collaborator Maurice Wilkins shared the 1962 Nobel Prize in physiology or medicine (Franklin had died in 1958). The study of molecular genetics exploded in the wake of Watson and Crick's discovery.
Once scientists understood the structure of DNA molecules, they focused on decoding the DNA in chromosomes: determining which base combinations created structural genes (those genes responsible for manufacturing proteins, which are built from amino acids) and which combinations created regulator genes (those that trigger the operation of structural genes). Between 1961 and 1966, Marshall Nirenberg and Heinrich Matthaei, working at the National Institutes of Health, cracked the genetic code. By 1967, scientists had a complete listing of the sixty-four three-base variations that controlled the production of life's essential twenty amino acids. Researchers, however, still lacked a genetic map precisely locating specific genes on individual chromosomes. Using enzymes to break apart or splice together nucleic acids, American scientists, like David Baltimore, helped develop recombinant DNA or genetic engineering technology in the 1970s and 1980s.
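The figure of sixty-four follows from simple counting: with four bases available at each of three positions, there are 4 × 4 × 4 = 64 possible triplets. A short Python sketch makes the enumeration explicit. The variable names are illustrative, and the triplets are written in the DNA alphabet (in messenger RNA, U takes the place of T).

```python
from itertools import product

BASES = "ACGT"

# Every ordered choice of three bases, e.g. "AAA", "AAC", ..., "TTT".
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))              # number of distinct triplets
print(codons[:4], codons[-1])   # a few examples from the listing
```

Because sixty-four triplets are available for only twenty amino acids, most amino acids can be specified by more than one triplet.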
Genetic engineering paved the way for genetic mapping and increased genetic control, raising a host of political and ethical concerns. The contours of this debate have shifted with the expansion of genetic knowledge. In the 1970s, activists protested genetic engineering and scientists decried for-profit science; thirty years later, protesters organized to fight the marketing of genetically modified foods as scientists bickered over the ethics of cloning humans. Further knowledge about DNA offers both promises and problems that will only be resolved by the cooperative effort of people in many fields (medicine, law, ethics, social policy, and the humanities), not just molecular biology.
DNA and American Culture
Like atomic technology, increased understanding of DNA and genetics has had both intended and unintended consequences, and it has captured the public imagination. The popular media readily communicated the simplicity and elegance of DNA's structure and action to nonscientists. Unfortunately, media coverage of advances in DNA technology has often obscured the biological complexity of these developments. Oversimplifications in the media, left uncorrected by scientists, have allowed DNA to be invoked as a symbol for everything from inanimate objects to the absolute essence of human potential.
DNA's biological power has translated into great cultural power as the image of the double helix entered the iconography of America after 1953. As Dorothy Nelkin and M. Susan Lindee have shown, references to DNA and the power of genetics are ubiquitous in modern culture. Inanimate objects like cars are advertised as having "a genetic advantage." Movies and television dramas have plots that revolve around DNA, genetic technology, and the power of genetics to shape lives. Humorists use DNA as the punch line of jokes to explain the source of human foibles. Consumer and popular culture's appropriation of DNA to signify fine or poor quality has merged with media oversimplifications to give rise to a new wave of hereditarian thinking in American culture.
The DNA technology that revolutionized criminology, genealogy, and medicine convinced many Americans that DNA governed not only people's physical development, but also their psychological and social behavior. Genetic "fingerprints" that allow forensics experts to discern identity from genetic traces left at a crime scene, or that determine ancestral ties by sampling tissue from long-dead individuals, have been erroneously touted as foolproof and seem to equate people's identities and behavior with their DNA. Genomic research allows scientists to identify genetic markers that indicate increased risk for certain diseases. This development offers hope for preventive medicine, even as it raises the specter of genetic discrimination and renewed attempts to engineer a eugenic master race. In the beginning of the twenty-first century, more scientists began to remind Americans that DNA operates within a nested series of environments (nuclear, cellular, organismic, ecological, and social) and that these conditions affect DNA's operation and its expression. While DNA remains a powerful cultural symbol, people invoke it in increasingly complex ways that more accurately reflect how DNA actually influences life.
Without question, in the 131 years spanning Miescher's isolation of nuclein, Crick and Watson's discovery of DNA's structure, and the completion of the human genome, biologists have revolutionized humanity's understanding of, and control over, life itself. American contributions to molecular biology rank with the harnessing of atomic fission and the landing of men on the moon as signal scientific and technological achievements.
BIBLIOGRAPHY
Chargaff, Erwin. Heraclitean Fire: Sketches from a Life before Nature. New York: Rockefeller University Press, 1978. Bitter but provocative.
Judson, Horace Freeland. The Eighth Day of Creation: Makers of the Revolution in Biology. New York: Simon and Schuster, 1979. Readable history of molecular biology.
Kay, Lily E. Who Wrote the Book of Life?: A History of the Genetic Code. Stanford, Calif.: Stanford University Press, 2000.
Kevles, Daniel J., and Leroy Hood, eds. The Code of Codes: Scientific and Social Issues in the Human Genome Project. Cambridge, Mass.: Harvard University Press, 1992.
Lagerkvist, Ulf. DNA Pioneers and Their Legacy. New Haven, Conn.: Yale University Press, 1998.
Nelkin, Dorothy, and M. Susan Lindee. The DNA Mystique: The Gene as Cultural Icon. New York: W. H. Freeman, 1995. Excellent cultural interpretation of DNA in the 1990s.
Watson, James D. The Double Helix. New York: Atheneum, 1968. Crotchety account of discovery.
Watson, James D., and F. H. C. Crick. "Molecular Structure of Nucleic Acid: A Structure for Deoxyribonucleic Acid." Nature 171 (1953): 737–738.
Gregory Michael Dorr
DNA
DNA (deoxyribonucleic acid) was discovered in the late 1800s, but its role as the material of heredity was not elucidated for fifty years after that. It occupies a central and critical role in the cell as the genetic material that holds all the information required to duplicate and maintain the organism. All information necessary to maintain and propagate life is contained within a linear array of four simple bases: adenine, guanine, thymine, and cytosine.
DNA was first described as a monotonously uniform helix, generally called B-DNA. However, we now know that DNA can adopt many different shapes and conformations. Moreover, many of these alternative shapes have biological importance. Thus, the DNA is not simply an informational repository, from which information flows through RNA into proteins. Rather, structural information exists within the specific sequence patterns of the bases. This structural information dictates the interaction of DNA with proteins to carry out processes of DNA replication, transcription into RNA, and repair of errors or damage to the DNA.
The Components of DNA
DNA is composed of purine (adenine and guanine) and pyrimidine (cytosine and thymine) bases, each connected through a deoxyribose sugar to a phosphate backbone. Many variations are possible in the chemical structure of the bases and the sugar, and in the structural relationship of the base to the sugar, and these result in differences in helical shape and form. The most common DNA helix, B-DNA, is a double helix of two DNA strands with about 10.5 base pairs per helical turn.
Bases and Base Pairs.
The four bases found in DNA are shown in Figures 1 and 2. The purines and pyrimidines are the informational molecules of the genetic blueprint for the cell. The two sides of the helix are held together by hydrogen bonds between base pairs. Hydrogen bonds are weak attractions between a hydrogen atom on one side and an oxygen or nitrogen atom on the other. Hydrogen atoms of amino groups serve as the hydrogen bond donor while the carbonyl oxygens and ring nitrogens serve as hydrogen bond acceptors. The specific location of hydrogen bond donor and acceptor groups gives the bases their specificity for hydrogen bonding in unique pairs. Thymine (T) pairs with adenine (A) through two hydrogen bonds, and cytosine (C) pairs with guanine (G) through three hydrogen bonds (Figure 2). T does not normally pair with G, nor does C normally pair with A.
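The pairing rules above give a simple way to tally hydrogen bonds along a duplex: two for every A·T pair and three for every C·G pair. The Python sketch below is illustrative only. The function name is hypothetical, and the count covers Watson-Crick base-pair hydrogen bonds, not the other forces that stabilize the helix.

```python
# Hydrogen bonds contributed by the base pair that each base forms:
# A·T pairs have two, C·G pairs have three.
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def duplex_hydrogen_bonds(strand: str) -> int:
    """Total base-pair hydrogen bonds in the duplex formed by this strand."""
    return sum(H_BONDS[base] for base in strand.upper())


print(duplex_hydrogen_bonds("AT"))            # two A·T pairs
print(duplex_hydrogen_bonds("GC"))            # two C·G pairs
print(duplex_hydrogen_bonds("GGAATTAATTCC"))  # a mixed twelve-base-pair duplex
```

For duplexes of equal length, the one richer in C·G pairs has more hydrogen bonds between its strands.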
Deoxyribose Sugar.
In DNA the bases are connected to a β-D-2-deoxyribose sugar with a hydrogen atom at the 2′ ("two prime") position. The sugar is a very dynamic part of the DNA molecule. Unlike the nucleotide bases, which are planar and rigid, the sugar ring is easily bent and twisted into various conformations (which exist in different structural forms of DNA). In canonical B-DNA, the accepted and most common form of DNA, the sugar configuration is known as C2′ endo.
Nucleosides and Nucleotides.
The term "nucleoside" refers to a base and sugar. "Nucleotide," on the other hand, refers to the base, sugar, and phosphate group (Figure 1). A bond, called the glycosidic bond, holds the base to the sugar and the 3′-5′ ("three prime-five prime") phosphodiester bond holds the individual nucleotides together. Nucleotides are joined from the 3′ carbon of the sugar in one nucleotide to the 5′ carbon of the sugar of the adjacent nucleotide. The 3′ and the 5′ ends are chemically very distinct and have different reactive properties. During DNA replication, new nucleotides are added only to the 3′ OH end of a DNA strand. This fact has important implications for replication.
The Structure of Double-Stranded DNA
As mentioned above, the two individual strands are held together by hydrogen bonds between individual T·A and C·G base pairs. In DNA, the distance between the atoms involved is 2.8 to 2.95 angstroms (one angstrom is 10^-10 meters). While individually weak, the large number of hydrogen bonds along a DNA chain provides sufficient stability to hold the two strands together.
The stabilization of duplex (double-stranded) DNA is also dependent on base stacking. The planar, rigid bases stack on top of one another, much like a stack of coins. Since the two purine·pyrimidine pairs (A·T and C·G) have the same width, the bases stack in a rather uniform fashion. Stacking near the center of the helix affords protection from chemical and environmental attack. Both hydrophobic interactions and van der Waals forces hold bases together in stacking interactions. About half the stability of the DNA helix comes from hydrogen bonding, while base stacking provides much of the rest.
Double-stranded DNA in its canonical B-form is a right-handed helix formed by two individual DNA strands aligned in an antiparallel fashion (a right-handed helix, when viewed on end, twists clockwise going away from the viewer). Antiparallel DNA has the two strands organized in the opposite polarity, with one strand oriented in the 5′-3′ direction and the other oriented in the 3′-5′ direction.
In the right-handed B-DNA double helix, the stacked base pairs are separated by about 3.4 angstroms, with 10.5 base pairs forming one helical turn (360°), which is 35.7 angstroms in length. Two successive base pairs, therefore, are rotated about 34.3° with respect to each other. The width of the helix is 20 angstroms. An idealized model of the double helix is shown in Figure 3. As can be seen, the organization of the bases creates a major groove and a minor groove.
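The helix dimensions are tied together by simple arithmetic, which the Python sketch below works through. It assumes a rise of 3.4 angstroms per base pair, the value consistent with a 35.7-angstrom turn of 10.5 base pairs; the constant and function names are illustrative.

```python
RISE_PER_BP = 3.4    # angstroms between stacked base pairs (assumed value)
BP_PER_TURN = 10.5   # base pairs per 360-degree helical turn


def helical_pitch() -> float:
    """Length of one full helical turn, in angstroms."""
    return RISE_PER_BP * BP_PER_TURN


def twist_per_base_pair() -> float:
    """Rotation between successive base pairs, in degrees."""
    return 360.0 / BP_PER_TURN


def contour_length(n_base_pairs: int) -> float:
    """End-to-end length of an idealized straight B-DNA duplex, in angstroms."""
    return n_base_pairs * RISE_PER_BP


print(f"pitch: {helical_pitch():.1f} angstroms")
print(f"twist: {twist_per_base_pair():.1f} degrees per base pair")
print(f"1000 base pairs: {contour_length(1000):.0f} angstroms")
```

On these idealized figures, a duplex of 1,000 base pairs would be about 3,400 angstroms long while remaining only 20 angstroms wide.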
Adenine and thymine are said to be complementary, as are cytosine and guanine. Complementary means "matching opposite." The shapes and charges of adenine and thymine complement each other, so that they attract one another and link up (as do cytosine and guanine). Indeed, one entire strand of duplex DNA is complementary to the opposing strand. During replication, the two strands unwind, and each serves as a template for formation of a new complementary strand, so that replication ends with two exact double-stranded copies.
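Complementarity and template-directed copying can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, both strands are written 5′ to 3′, and the enzymes and mechanics of real replication are ignored.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the opposing strand, read 5' to 3' (antiparallel to the input)."""
    return strand.translate(COMPLEMENT)[::-1]


def replicate(top: str):
    """Unwind a duplex and use each parental strand as a template.

    Returns two daughter duplexes, each a (top, bottom) pair of strands.
    """
    bottom = reverse_complement(top)
    daughter_1 = (top, reverse_complement(top))        # old top, new bottom
    daughter_2 = (reverse_complement(bottom), bottom)  # new top, old bottom
    return daughter_1, daughter_2


first, second = replicate("ATGC")
print(first)
print(second)
```

Both daughter duplexes come out identical to the parent, each carrying one original strand and one newly built strand.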
Alternative DNA Conformations
While the vast majority of the DNA exists in the canonical B-DNA form, DNA can adopt an amazing array of alternative structures. This is the result of certain particular sequence arrangements of DNA and, in many cases, energy in the DNA double helix from DNA supercoiling, the property of DNA in which the double helix, in a high-energy state, becomes twisted around itself. Some alternative DNA conformations identified are shown in Figure 4.
Unwound DNA.
Since A·T base pairs contain two hydrogen bonds and C·G base pairs contain three, A+T-rich tracts are less thermally stable than C+G-rich tracts in DNA. Under denaturing conditions (heat or alkali), the DNA begins to "melt" (separate) and unwound regions form, with the A+T-rich sequences melting first. In addition, in the presence of superhelical energy (a high-energy state of DNA resulting from its supercoiling, which is the natural form of DNA in the chromosomes of most organisms), A+T-rich regions can unwind and remain unwound under conditions normally found in the cell. Such sites often provide places for DNA replication proteins to enter DNA to begin the process of chromosome duplication.
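One rough way to locate candidate early-melting regions is to scan a sequence for the window with the highest A+T content. The Python sketch below does only that. It is a crude proxy and not a thermodynamic melting model, and the function names, window width, and example sequence are all hypothetical.

```python
def at_fraction(seq: str) -> float:
    """Fraction of positions occupied by A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def most_at_rich_window(seq: str, width: int):
    """Return (start, window) for the first window with the highest A+T fraction."""
    starts = range(len(seq) - width + 1)
    best = max(starts, key=lambda i: at_fraction(seq[i:i + width]))
    return best, seq[best:best + width]


example = "GCGCGCATATATATGCGCGC"  # hypothetical sequence
start, window = most_at_rich_window(example, 6)
print(start, window, at_fraction(window))
```

Here the scan picks out the central A+T tract and passes over the flanking C+G-rich sequence.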
Cruciform Structures.
DNA sequences are said to be palindromic when they contain inverted repeat symmetry, as in the sequence GGAATTAATTCC, reading from the 5′ to the 3′ end. Palindromic sequences can form intramolecular hydrogen bonds (within a single strand), rather than the normal intermolecular hydrogen bonds (between the two complementary strands). To form cruciforms ("cross-shaped"), the DNA must form a small unwound structure, and then base pairs must begin to form within each individual strand, thus forming the four-stranded cruciform structure.
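Inverted repeat symmetry has a compact test: a sequence is palindromic in this sense when it reads the same, 5′ to 3′, on both strands, that is, when it equals its own reverse complement. A minimal Python sketch follows; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Opposing strand, read 5' to 3'."""
    return strand.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


print(is_inverted_repeat("GGAATTAATTCC"))  # the example sequence from the text
print(is_inverted_repeat("GGAATTC"))       # a sequence without this symmetry
```

This self-complementarity is what allows each strand of such a sequence to fold back and pair with itself.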
Slipped-Strand DNA.
Slipped-strand DNA structures can form within direct repeat DNA sequences, such as (CTG)n·(CAG)n and (CGG)n·(CCG)n (where "n" denotes a variable number of repetitions). They form following denaturation, after the strands become unwound, and during renaturation, when the strands come back together. To form slipped-strand DNA, the opposite strands come together out of alignment during renaturation. Expansion of such triplet repeats is a feature of certain neurological diseases.
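The value of n in a direct repeat such as (CTG)n can be read off a sequence by finding the longest uninterrupted run of the repeat unit. The Python sketch below is illustrative only: the function name and example sequence are hypothetical, and it counts exact, in-frame copies of the unit on a single strand.

```python
import re


def longest_repeat_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = (match.group(0) for match in pattern.finditer(seq))
    return max((len(run) // len(unit) for run in runs), default=0)


example = "AA" + "CTG" * 4 + "TT"  # hypothetical sequence with a (CTG)4 tract
print(longest_repeat_run(example, "CTG"))
print(longest_repeat_run(example, "CGG"))
```

Comparing this count between samples is one simple way to describe how far a repeat tract has expanded.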
Intermolecular Triplex DNA.
Three-stranded, or triplex DNA, can form within tracts of polypurine·polypyrimidine sequence, such as (GAA)n·(TTC)n. Purines, with their two-ring structures, have the potential to form hydrogen bonds with a second base, even while base paired in the canonical A·T and G·C configurations. This second type of base pair is called a Hoogsteen base pair, and it can form in the major groove (the top of the base pair representations in Figure 2). Pyrimidines can only pair with a single other base, and thus a long Pu·Py tract must be present for triplex DNA formation. The important factor for triplex DNA formation is the presence of an extended purine tract in a single DNA strand. The third-strand base-pairing code is as follows: A can pair with A or T; G can pair with a protonated C (C+) or G.
Intramolecular Triplex DNA.
When a Pu·Py tract exists that has mirror repeat symmetry (5′ GAAGAG-GAGAAG 3′), an intramolecular triplex can form, in which half of the Pu·Py tract unwinds and one strand wraps into the major groove, forming a triplex. The structure in Figure 4 shows the pyrimidine strand (CTT) pairing with the purine strand (GAA) of a canonical DNA duplex. In an intramolecular triplex, one strand of the unwound region remains unpaired, as shown.
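Mirror repeat symmetry differs from the inverted repeat symmetry of palindromes: the sequence reads the same forward and backward along a single strand, with no complementing involved. The Python sketch below checks both conditions named in the text, mirror symmetry and an all-purine strand. The function names are illustrative, and the hyphen in the printed sequence is assumed to mark the center of symmetry, so it is dropped.

```python
PURINES = set("AG")


def is_mirror_repeat(seq: str) -> bool:
    """True when the strand reads the same in both directions."""
    return seq == seq[::-1]


def is_purine_tract(seq: str) -> bool:
    """True when every base in the strand is a purine (A or G)."""
    return set(seq) <= PURINES


tract = "GAAGAG-GAGAAG".replace("-", "")  # example from the text, hyphen removed
print(is_mirror_repeat(tract), is_purine_tract(tract))
print(is_mirror_repeat("GGAATTAATTCC"))   # inverted repeat, not a mirror repeat
```

The cruciform example GGAATTAATTCC fails the mirror test, which shows that the two symmetries are distinct.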
Quadruplex DNA.
DNA sequences containing runs of G·C base pairs can form quadruplex, or four-stranded DNA, in which the four DNA strands are held together by Hoogsteen hydrogen bonds between all four guanines. The four guanines are aligned in a plane, and the successive rings of guanines are stacked one upon another.
Left-handed Z-DNA.
Alternating runs of (CG)n·(CG)n or (TG)n·(CA)n dinucleotides in DNA, under superhelical tension or high salt (more than 3 M NaCl, where M denotes moles per liter), can adopt a left-handed helix called Z-DNA. In this form, the two DNA strands become wrapped in a left-handed helix, which is the opposite sense to that of canonical B-DNA. This can occur within a small region of a larger right-handed B-DNA molecule, creating two junctions at the B-Z transition region.
Curved DNA.
DNA containing tracts of (A)3-4·(T)3-4 (that is, runs of three or four bases of A in one strand and a similar run of T in the other) spaced at 10-base pair intervals can adopt a curved helix structure.
In summary, DNA can exist as a right-handed double helix, a form in which the genetic information is very stable. Certain DNA sequences can also adopt alternative conformations, some of which are important regulatory signals involved in the genetic expression or replication of the DNA.
see also Chromosome, Eukaryotic; Chromosome, Prokaryotic; DNA Microarrays; Gene; Genome; Nucleotide; Sequencing DNA; Triplet Repeat Disease.
Richard R. Sinden
Bibliography
Sinden, Richard R. DNA Structure and Function. San Diego: Academic Press, 1994.
DNA (Deoxyribonucleic Acid)
DNA, or deoxyribonucleic acid, is the genetic material that codes for the components that make life possible. Both prokaryotic and eukaryotic organisms contain DNA. The exceptions are a few viruses that contain ribonucleic acid, although even these viruses have the means for producing DNA.
The DNA of bacteria is much different from the DNA of eukaryotic cells such as human cells. Bacterial DNA is dispersed throughout the cell, while in eukaryotic cells the DNA is segregated in the nucleus, a membrane-bound region. In eukaryotes, structures called mitochondria also contain DNA. The dispersed bacterial DNA is much shorter than eukaryotic DNA. Hence the information is packaged more tightly in bacterial DNA. Indeed, in DNA of microorganisms such as viruses, several genes can overlap with each other, providing information for several proteins in the same stretch of nucleic acid. Eukaryotic DNA contains large intervening regions between genes.
The DNA of both prokaryotes and eukaryotes is the basis for the transfer of genetic traits from one generation to the next. Also, alterations in the genetic material (mutations) can produce changes in structure, biochemistry, or behavior that might also be passed on to subsequent generations.
Genetics is the science of heredity that involves the study of the structure and function of genes and the methods by which genetic information contained in genes is passed from one generation to the next. The modern science of genetics can be traced to the research of Gregor Mendel (1822–1884), who was able to develop a series of laws that described mathematically the way hereditary characteristics pass from parents to offspring. These laws assume that hereditary characteristics are contained in discrete units of genetic material now known as genes.
The story of genetics during the twentieth century is, in one sense, an effort to discover the gene itself. An important breakthrough came in the early 1900s with the work of the American geneticist Thomas Hunt Morgan (1866–1945). Working with fruit flies, Morgan was able to show that genes are somehow associated with the chromosomes that occur in the nuclei of cells. By 1912, Morgan's colleague, American geneticist A. H. Sturtevant (1891–1970), was able to construct the first chromosome map showing the relative positions of different genes on a chromosome. The gene then had a concrete, physical referent; it was a portion of a chromosome.
During the 1920s and 1930s, a small group of scientists looked for a more specific description of the gene by focusing their research on the gene's molecular composition. Most researchers of the day assumed that genes were some kind of protein molecule. Protein molecules are large and complex. They can occur in an almost infinite variety of structures. This quality is expected for a class of molecules that must be able to carry the enormous variety of genetic traits.
A smaller group of researchers looked to a second family of compounds as potential candidates as the molecules of heredity. These were the nucleic acids. The nucleic acids were first discovered in 1869 by the Swiss physician Johann Miescher (1844–1895). Miescher originally called these compounds "nuclein" because they were first obtained from the nuclei of cells. One of Miescher's students, Richard Altmann, later suggested a new name for the compounds, a name that better reflected their chemical nature: nucleic acids.
Nucleic acids seemed unlikely candidates as molecules of heredity in the 1930s. What was then known about their structure suggested that they were too simple to carry the vast array of complex information needed in a molecule of heredity. Each nucleic acid molecule consists of a long chain of alternating sugar and phosphate fragments to which are attached some sequence of four of five different nitrogen bases: adenine, cytosine, guanine, uracil and thymine (the exact bases found in a molecule depend slightly on the type of nucleic acid).
It was not clear how this relatively simple structure could assume enough different conformations to "code" for hundreds of thousands of genetic traits. In comparison, a single protein molecule contains various arrangements of twenty fundamental units (amino acids) making it a much better candidate as a carrier of genetic information.
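The arithmetic behind this comparison can be sketched in a few lines. The snippet below is only an illustration, and the function name `count_sequences` is made up for this example. It counts how many distinct linear sequences of a given length can be written with a 4-letter alphabet (the DNA bases) and with a 20-letter alphabet (the amino acids). A protein offers more possibilities per position, but the count for four bases also grows very quickly with length.

```python
def count_sequences(alphabet_size: int, length: int) -> int:
    """Number of distinct linear sequences of `length` units
    drawn from an alphabet of `alphabet_size` unit types."""
    return alphabet_size ** length


if __name__ == "__main__":
    # Compare 4 bases with 20 amino acids at a few chain lengths.
    for n in (1, 3, 10):
        print(n, count_sequences(4, n), count_sequences(20, n))
```

A chain of only ten bases already allows more than a million different sequences, which is why sequence order, rather than the number of unit types, turned out to be what matters.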
Yet, experimental evidence began to point to a possible role for nucleic acids in the transmission of hereditary characteristics. That evidence implicated a specific sub-family of the nucleic acids known as the deoxyribonucleic acids, or DNA. DNA is characterized by the presence of the sugar deoxyribose in the sugar-phosphate backbone of the molecule and by the presence of adenine, cytosine, guanine, and thymine, but not uracil.
As far back as the 1890s, the German biochemist Albrecht Kossel (1853–1927) obtained results that pointed to the role of DNA in heredity. In fact, historian John Gribbin has suggested that the evidence was so clear that it "ought to have been enough alone to show that the hereditary information ... must be carried by the DNA." Yet, somehow, Kossel himself did not see this point, nor did most of his colleagues for half a century.
As more and more experiments showed the connection between DNA and genetics, a small group of researchers in the 1940s and 1950s began to ask how a DNA molecule could code for genetic information. The two who finally resolved this question were James Watson, a 24-year-old American trained in genetics, and Francis Crick, a 36-year-old Englishman, trained in physics and self-taught in chemistry. The two met at the Cavendish Laboratory of Cambridge University in 1951. They shared the view that the structure of DNA held the key to understanding how genetic information is stored in a cell and how it is transmitted from one cell to its daughter cells.
The key to answering that question lay in a technique known as x-ray crystallography. When x rays are directed at a crystal of some material, such as DNA, they are scattered (diffracted) by the atoms that make up the crystal. The diffraction pattern thus produced consists of a collection of spots and arcs. A skilled observer can determine from the diffraction pattern the arrangement of atoms in the crystal.
Watson and Crick were fortunate in having access to some of the best x-ray diffraction patterns that then existed. These "photographs" were the result of work being done by Maurice Wilkins and Rosalind Elsie Franklin at King's College in London. Although Wilkins and Franklin were also working on the structure of DNA, they did not recognize the information their photographs contained. Indeed, it was only when Watson accidentally saw one of Franklin's photographs that he suddenly saw the solution to the DNA puzzle.
Watson and Crick experimented with tinker-toy-like models of the DNA molecule, shifting atoms around into various positions. They were looking for an arrangement that would give the kind of x-ray photograph that Watson had seen in Franklin's laboratory. On March 7, 1953, the two scientists found the answer. They built a model consisting of two helices (corkscrew-like spirals), wrapped around each other. Each helix consisted of a backbone of alternating sugar and phosphate groups. To each sugar was attached one of the four nitrogen bases, adenine, cytosine, guanine, or thymine. The sugar-phosphate backbone formed the outside of the DNA molecule, with the nitrogen bases tucked inside. Each nitrogen base on one strand of the molecule faced another nitrogen base on the opposite strand of the molecule. The base pairs were not arranged at random, however, but in such a way that each adenine was paired with a thymine, and each cytosine with a guanine.
The Watson-Crick model was a remarkable achievement, for which the two scientists shared the 1962 Nobel Prize in Physiology or Medicine with Maurice Wilkins. The molecule had exactly the shape and dimensions needed to produce an x-ray photograph like Franklin's. Furthermore, Watson and Crick immediately saw how the molecule could "carry" genetic information. The sequence of nitrogen bases along the molecule, they said, could act as a genetic code. A sequence, such as A-T-T-C-G-C-T . . . etc., might tell a cell to make one kind of protein (such as that for red hair), while another sequence, such as G-C-T-C-T-C-G . . . etc., might code for a different kind of protein (such as that for blonde hair). Watson and Crick themselves contributed to the deciphering of this genetic code, although that process was long and difficult and involved the efforts of dozens of researchers over the next decade.
Watson and Crick had also considered, even before their March 7th discovery, what the role of DNA might be in the manufacture of proteins in a cell. The sequence that they outlined was that DNA in the nucleus of a cell might act as a template for the formation of a second type of nucleic acid, RNA (ribonucleic acid). RNA would then leave the nucleus, migrate to the cytoplasm, and then itself act as a template for the production of protein. That theory, now known as the Central Dogma, has since been largely confirmed and has become a critical guiding principle of much research in molecular biology.
Scientists continue to advance their understanding of DNA. Even before the Watson-Crick discovery, they knew that DNA molecules could exist in two configurations, known as the "A" form and the "B" form. After the Watson-Crick discovery, two other forms, known as the "C" and "D" configurations, were also discovered. All four of these forms of DNA are right-handed double helices that differ from each other in relatively modest ways.
In 1979, however, a fifth form of DNA known as the "Z" form was discovered by Alexander Rich and his colleagues at the Massachusetts Institute of Technology. The "Z" form was given its name partly because of its zigzag shape and partly because it is different from the more common A and B forms. Although Z-DNA was first recognized in synthetic DNA prepared in the laboratory, it has since been found in natural cells whose environment is unusual in some respect or another. The presence of certain types of proteins in the nucleus, for example, can cause DNA to shift from the B to the Z conformation. The significance and role of this most recently discovered form of DNA remain a subject of research among molecular biologists.
See also Chemical mutagenesis; Genetic regulation of eukaryotic cells; Genetic regulation of prokaryotic cells; Mitochondrial DNA
DNA (Deoxyribonucleic Acid)
Deoxyribonucleic acid, or DNA, is the genetic material that carries the code for all living things. This code determines the form, development, and behavior patterns of an organism and is part of the chromosomes that exist within the nucleus of cells. DNA consists of two long chains joined together by chemicals called bases and coiled together into a twisted-ladder shape.
DNA is a large molecule found in almost all organisms and contains codes for the making and using of proteins. Since proteins carry out the work of all cells, it is DNA that ultimately controls and directs all the activities of a living cell. Biologists have known about DNA for a very long time. Even before they discovered that genes control heredity, they were aware that the cell's chromosomes were made up of protein and a special chemical they called deoxyribonucleic acid (DNA). Although DNA was discovered in 1869, more than fifty years passed before biologists believed that genes were composed of DNA. As a nucleic acid, DNA was considered too simple a chemical to contain the huge amount of complex information needed to determine heredity. Since it is built from repeating units called nucleotides that carry only four different chemical bases, no one thought that DNA was complex enough. However, as more experimental evidence began to point toward DNA as key in the transmission of hereditary characteristics, more scientists began to turn their attention to DNA.
WATSON AND CRICK DISCOVER THE STRUCTURE OF DNA
By the early 1950s, an unusual pair of scientists teamed their efforts in the passionate belief that the structure of DNA held the key to understanding how genetic information is stored and transmitted. In 1951, the twenty-four-year-old American geneticist James Watson met the thirty-six-year-old English physicist (and self-trained chemist) Francis Crick and the two decided to try to solve the puzzle of how such a simple material as DNA could store so much complicated information. Both knew they did not have to make any new discoveries, but instead, had to solve what might be called the "molecular architecture" of DNA.
The key to solving that problem lay in a technique known as x-ray crystallography. When x rays are directed at a crystal of some material, such as DNA, they are scattered (diffracted) by the atoms that make up the crystal. A diffraction pattern is produced from which a skilled observer can determine how the atoms of the crystal are arranged. However, this is more difficult than it sounds, and Watson and Crick were trying to solve a very difficult puzzle. It was the chemical composition of DNA itself that led them to the correct model. They knew that DNA contained four bases, adenine (A), guanine (G), cytosine (C), and thymine (T), and that A was always paired with T, and C always with G. Knowing this and studying the x-ray patterns led them to realize that this alignment could only happen if DNA was made up of two strands that were twisted together to form what is called a "double helix." (A helix is a corkscrew-like spiral; a double helix is two such spirals wound around each other.) In March 1953 Watson and Crick built a wire model showing that the DNA molecule could be thought of as a ladder where the nucleotide bases (A, T, G, C) form the "rungs" connecting the two side rails. The sides were then twisted to make the double helix. In Watson and Crick's Nobel Prize-winning work, the base pairs were the critical part that allowed them to explain how nature stores and uses a genetic code. Each DNA base is like a letter of the alphabet, and a sequence of nucleotide bases can be thought of as forming a message. Put another way, each "rung" (base pair) of the twisted ladder consists of some combination of the four chemicals (A, T, G, C) that form the coded message.
JAMES DEWEY WATSON AND FRANCIS HARRY COMPTON CRICK
The team of Watson and Crick discovered the structure of deoxyribonucleic acid (DNA), one of the most important discoveries of modern science. Their model explained how genetic information is coded and how DNA makes copies of itself. This discovery formed the basis for all the genetic developments that have followed.
English molecular biologist Francis Crick (1916– ) was born in Northampton, England. As a boy, he was very interested in chemistry, although he eventually obtained a degree in physics from University College in London. During World War II (1939–45) he worked on the development of radar (an instrument used to determine an object's position, speed, or other characteristics) and new weapon design. After the war, Crick decided he wanted to concentrate on biology and did so on his own, eventually taking a job at the Cavendish Laboratory where he would meet the younger Watson.
American molecular biologist James Watson (1928– ) was a former "Quiz Kid," a contestant on a popular radio show of the 1940s. Born in Chicago, Illinois, Watson was a child prodigy (an exceptionally smart person) who graduated from the University of Chicago when he was nineteen and who obtained his Ph.D. from Indiana University at twenty-two. After receiving a fellowship to study in Copenhagen, Denmark, he joined the Cavendish Laboratory at Cambridge, England, where he met Crick.
Although Crick was twelve years older than Watson, both shared what was described as "youthful arrogance" and found that they got along extremely well both personally and professionally. Crick had been studying protein structure, and Watson came to Cambridge highly interested in discovering the basic substance of genes. With these complementary goals, they teamed up to try to unravel the structure of DNA, the carrier of genetic information at the molecular level.
Beginning in 1951, they worked to create a DNA model that would explain how it could copy and pass on its instructions to every new cell in a living thing. After a great deal of experimentation, they recognized the importance of the x-ray studies of DNA being done by the English biochemist Rosalind Franklin (1920–1958), and soon built an accurate spiral-shaped model, called a double helix, in which they said two chains of alternating sugar and phosphate groups were linked by pairs of organic bases. Their model looked like a twisted, spiral staircase. They then theorized that replication (the process by which DNA molecules copy themselves) occurs by a parting, or unwinding, of the two strands of the staircase, each of which then unites with a newly created strand to form a new DNA molecule (made up of one old strand and one new strand). Watson and Crick then published their findings in the journal Nature, which appeared on April 25, 1953 (with Watson's name appearing first due to a coin toss). Other researchers soon confirmed their hypothesis and the Watson-Crick model was accepted as correct.
Watson went on to write several highly regarded books, one of which became the first widely used textbook on molecular biology. He taught at Harvard University, became director of Cold Spring Harbor Laboratory in New York, and served as the first director of the United States Human Genome Project. Crick joined the Salk Institute for Biological Studies in San Diego, California, in 1977. In 1962, Watson and Crick shared the Nobel Prize in Physiology or Medicine with their colleague, the English physicist Maurice H. F. Wilkins, for their discovery of the structure of DNA. Rosalind Franklin, who had worked alongside Wilkins, died in 1958 and so could not share the award, since Nobel Prizes are not given posthumously. The Watson-Crick discovery of DNA structure ushered in the modern era of molecular biology and made possible all that has happened since. There would be no understanding of human genetics without their discovery.
THE CELL'S INSTRUCTION MANUAL
DNA has been called the instruction manual for the cell. It has also been called the chemical language in which "gene recipes" are written. This is because genes can be considered recipes for making proteins, and proteins control the characteristics of all organisms. These codes or recipes are written with the four nucleotide building blocks (the bases A, T, G, and C). Each gene has several thousand bases joined together in a precise code that is different from the code for any other gene. For example, a sequence such as A-T-T-C-G-C-T… etc. might tell a cell to make one type of protein (for red hair), while another sequence such as G-C-T-C-T-C-G… etc. might code for a different type of protein (for blonde hair). When cells reproduce by division (a process called mitosis), each parent cell must make sure that each daughter cell (its exact duplicate) gets a complete copy of its DNA. This is accomplished by a process called "replication" in which the two strands or rails of the DNA ladder "unzip" themselves down the middle of the bases (or rungs of the ladder). Since A always links to T and G always to C, each separate rail of the ladder becomes a template or model for free-floating bases to link up to. The result is the existence of two identical double helices where there was just one.
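The "unzip and re-pair" idea can be illustrated with a short sketch. This is a simplified model and not a description of the cell's actual machinery: strands are plain strings, strand direction is ignored, and the names `PAIR`, `complement`, and `replicate` are invented for this example.

```python
# Base-pairing rule described above: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Build the partner strand by pairing each base with its match."""
    return "".join(PAIR[base] for base in strand)


def replicate(duplex):
    """Unzip a two-strand duplex and use each old strand as a template.

    Returns two duplexes, each made of one old strand and one new strand.
    """
    old_left, old_right = duplex
    daughter_one = (old_left, complement(old_left))
    daughter_two = (complement(old_right), old_right)
    return daughter_one, daughter_two


if __name__ == "__main__":
    parent = ("ATTCGCT", complement("ATTCGCT"))
    first, second = replicate(parent)
    print(parent, first, second)
```

Because each rail fully determines its partner, both daughter duplexes come out identical to the parent.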
The other important processes to understand are called "transcription" and "translation," which are the two stages of making a protein. In transcription, DNA "unzips" again and another type of nucleic acid called ribonucleic acid (here, messenger RNA) is built using one strand of DNA as a template, producing a complementary, single-strand copy. The RNA then leaves the nucleus with its message and becomes a template for the production of protein (by a ribosome in the cell) in what is known as translation. This process is going on all the time in our bodies, since cells are constantly called upon to make protein molecules.
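The two stages can be sketched as follows. This is a toy illustration under stated assumptions: the template strand is a made-up example, strand direction is ignored, and `CODONS` holds only four entries from the standard genetic code, in which each group of three RNA bases (a codon) specifies one amino acid or a stop signal. The codon detail and the names `transcribe` and `translate` are additions for this example and are not part of the description above.

```python
# Pairing used when RNA is built on a DNA template (RNA has U instead of T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Tiny excerpt of the standard genetic code, for illustration only.
CODONS = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "STOP"}


def transcribe(template: str) -> str:
    """Transcription: build the complementary messenger RNA strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


def translate(mrna: str) -> list:
    """Translation: read the RNA three bases at a time into amino acids."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS.get(mrna[start:start + 3], "?")  # "?" = not in excerpt
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    message = transcribe("TACAAACCGATT")
    print(message, translate(message))
```

A full translation table has 64 codons; the excerpt here covers only the codons that occur in the example.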
Breakthroughs in understanding DNA have also led to the use of DNA in forensic science where "DNA fingerprinting" or "DNA profiling" is conducted. This use of DNA is based on the fact that repetitive sequences of DNA vary greatly among individuals, since each person has his or her own unique code. It has also led to the beginnings of treatment for hereditary disease by genetic engineering. In theory, doctors will be able to replace a defective (inherited) gene that causes a certain disease with a normal gene, thus preventing the patient from getting the disease in the first place.
[See also Cell; Double Helix; Gene; Nucleic Acid]
DNA (deoxyribonucleic acid)
Genetics is the science of heredity that involves the study of the structure and function of genes and the methods by which genetic information contained in genes is passed from one generation to the next. The modern science of genetics can be traced to the research of Gregor Mendel (1822–1884), who was able to develop a series of laws that described mathematically the way hereditary characteristics pass from parents to offspring. These laws assume that hereditary characteristics are contained in discrete units of genetic material now known as genes.
The story of genetics during the twentieth century is, in one sense, an effort to discover the gene itself. An important breakthrough came in the early 1900s with the work of the American geneticist Thomas Hunt Morgan (1866–1945). Working with fruit flies, Morgan was able to show that genes are somehow associated with the chromosomes that occur in the nuclei of cells. By 1912, Morgan's colleague, American geneticist A. H. Sturtevant (1891–1970), was able to construct the first chromosome map showing the relative positions of different genes on a chromosome. The gene then had a concrete, physical referent; it was a portion of a chromosome.
During the 1920s and 1930s, a small group of scientists looked for a more specific description of the gene by focusing their research on the gene's molecular composition. Most researchers of the day assumed that genes were some kind of protein molecule. Protein molecules are large and complex. They can occur in an almost infinite variety of structures. This quality is expected for a class of molecules that must be able to carry the enormous variety of genetic traits.
A smaller group of researchers looked to a second family of compounds as potential candidates for the molecules of heredity. These were the nucleic acids. The nucleic acids were first discovered in 1869 by the Swiss physician Johann Miescher (1844–1895). Miescher originally called these compounds "nuclein" because they were first obtained from the nuclei of cells. One of Miescher's students, Richard Altmann, later suggested a new name for the compounds, a name that better reflected their chemical nature: nucleic acids.
Nucleic acids seemed unlikely candidates as molecules of heredity in the 1930s. What was then known about their structure suggested that they were too simple to carry the vast array of complex information needed in a molecule of heredity. Each nucleic acid molecule consists of a long chain of alternating sugar and phosphate fragments to which is attached some sequence of four or five different nitrogen bases: adenine, cytosine, guanine, uracil, and thymine (the exact bases found in a molecule depend slightly on the type of nucleic acid).
It was not clear how this relatively simple structure could assume enough different conformations to "code" for hundreds of thousands of genetic traits. In comparison, a single protein molecule contains various arrangements of twenty fundamental units (amino acids) making it a much better candidate as a carrier of genetic information.
Yet, experimental evidence began to point to a possible role for nucleic acids in the transmission of hereditary characteristics. That evidence implicated a specific sub-family of the nucleic acids known as the deoxyribonucleic acids, or DNA. DNA is characterized by the presence of the sugar deoxyribose in the sugar-phosphate backbone of the molecule and by the presence of adenine, cytosine, guanine, and thymine, but not uracil.
As far back as the 1890s, the German biochemist Albrecht Kossel (1853–1927) obtained results that pointed to the role of DNA in heredity. In fact, historian John Gribbin has suggested that the evidence was so clear that it "ought to have been enough alone to show that the hereditary information ... must be carried by the DNA." Yet, somehow, Kossel himself did not see this point, nor did most of his colleagues for half a century.
As more and more experiments showed the connection between DNA and genetics, a small group of researchers in the 1940s and 1950s began to ask how a DNA molecule could code for genetic information. The two who finally resolved this question were a somewhat unusual pair, James Watson, a 24-year-old American trained in genetics, and Francis Crick, a 36-year-old Englishman, trained in physics and self-taught in chemistry. The two met at the Cavendish Laboratory of Cambridge University in 1951, and became instant friends. They were united by a common passionate belief that the structure of DNA held the key to understanding how genetic information is stored in a cell and how it is transmitted from one cell to its daughter cells.
In one sense, the challenge facing Watson and Crick was a relatively simple one. A great deal was already known about the DNA molecule. Few new discoveries were needed, but those few discoveries were crucial to solving the DNA-heredity puzzle. Primarily the question was one of molecular architecture. How were the various parts of a DNA molecule oriented in space such that the molecule could hold genetic information?
The key to answering that question lay in a technique known as x-ray crystallography. When x rays are directed at a crystal of some material, such as DNA, they are scattered (diffracted) by the atoms that make up the crystal. The diffraction pattern thus produced consists of a collection of spots and arcs. A skilled observer can determine from the diffraction pattern the arrangement of atoms in the crystal.
The technique is actually more complex than described here. For one thing, obtaining satisfactory x-ray patterns from crystals is often difficult. Also, interpreting x-ray patterns—especially for complex molecules like DNA—can be extremely difficult.
Watson and Crick were fortunate in having access to some of the best x-ray diffraction patterns that then existed. These "photographs" were the result of work being done by Maurice Wilkins and Rosalind Elsie Franklin at King's College in London. Although Wilkins and Franklin were also working on the structure of DNA, they did not recognize the information their photographs contained. Indeed, it was only when Watson accidentally saw one of Franklin's photographs that he suddenly saw the solution to the DNA puzzle.
Racing back to Cambridge after seeing this photograph, Watson convinced Crick to make an all-out attack on the DNA problem. They worked continuously for almost a week. Their approach was to construct tinker-toy-like models of the DNA molecule, shifting atoms around into various positions. They were looking for an arrangement that would give the kind of x-ray photograph that Watson had seen in Franklin's laboratory.
Finally, on March 7, 1953, the two scientists found the answer. They built a model consisting of two helices (corkscrew-like spirals), wrapped around each other. Each helix consisted of a backbone of alternating sugar and phosphate groups. To each sugar was attached one of the four nitrogen bases, adenine, cytosine, guanine, or thymine. The sugar-phosphate backbone formed the outside of the DNA molecule, with the nitrogen bases tucked inside. Each nitrogen base on one strand of the molecule faced another nitrogen base on the opposite strand of the molecule. The base pairs were not arranged at random, however, but in such a way that each adenine was paired with a thymine, and each cytosine with a guanine.
The Watson-Crick model was a remarkable achievement, for which the two scientists shared the 1962 Nobel Prize in Physiology or Medicine with Maurice Wilkins. The molecule had exactly the shape and dimensions needed to produce an x-ray photograph like Franklin's. Furthermore, Watson and Crick immediately saw how the molecule could "carry" genetic information. The sequence of nitrogen bases along the molecule, they said, could act as a genetic code. A sequence, such as A-T-T-C-G-C-T . . . etc., might tell a cell to make one kind of protein (such as that for red hair), while another sequence, such as G-C-T-C-T-C-G . . . etc., might code for a different kind of protein (such as that for blonde hair). Watson and Crick themselves contributed to the deciphering of this genetic code, although that process was long and difficult and involved the efforts of dozens of researchers over the next decade.
Watson and Crick had also considered, even before their March 7th discovery, what the role of DNA might be in the manufacture of proteins in a cell. The sequence that they outlined was that DNA in the nucleus of a cell might act as a template for the formation of a second type of nucleic acid, RNA (ribonucleic acid). RNA would then leave the nucleus, migrate to the cytoplasm, and then itself act as a template for the production of protein. That theory, now known as the Central Dogma, has since been largely confirmed and has become a critical guiding principle of much research in molecular biology.
Scientists continue to advance their understanding of DNA. Even before the Watson-Crick discovery, they knew that DNA molecules could exist in two configurations, known as the "A" form and the "B" form. After the Watson-Crick discovery, two other forms, known as the "C" and "D" configurations, were also discovered. All four of these forms of DNA are right-handed double helices that differ from each other in relatively modest ways.
In 1979, however, a fifth form of DNA known as the "Z" form was discovered by Alexander Rich and his colleagues at the Massachusetts Institute of Technology. The "Z" form was given its name partly because of its zigzag shape and partly because it is different from the more common A and B forms. Although Z-DNA was first recognized in synthetic DNA prepared in the laboratory, it has since been found in natural cells whose environment is unusual in some respect or another. The presence of certain types of proteins in the nucleus, for example, can cause DNA to shift from the B to the Z conformation. The significance and role of this most recently discovered form of DNA remain a subject of research among molecular biologists.
Judyth Sassoon, ARCS, PhD
DNA (Deoxyribonucleic Acid)
Genetics is the science of heredity that involves the study of the structure and function of genes and the methods by which genetic information contained in genes is passed from one generation to the next. The modern science of genetics can be traced to the research of Gregor Mendel (1822–1884), who was able to develop a series of laws that described mathematically the way hereditary characteristics pass from parents to offspring. These laws assume that hereditary characteristics are contained in discrete units of genetic material now known as genes.
The story of genetics during the twentieth century is, in one sense, an effort to discover the gene itself. An important breakthrough came in the early 1900s with the work of the American geneticist Thomas Hunt Morgan (1866–1945). Working with fruit flies, Morgan was able to show that genes are somehow associated with the chromosomes that occur in the nuclei of cells. By 1912, Morgan's colleague, American geneticist A. H. Sturtevant (1891–1970), was able to construct the first chromosome map showing the relative positions of different genes on a chromosome. The gene then had a concrete, physical referent; it was a portion of a chromosome.
During the 1920s and 1930s, a small group of scientists looked for a more specific description of the gene by focusing their research on the gene's molecular composition. Most researchers of the day assumed that genes were some kind of protein molecule. Protein molecules are large and complex. They can occur in an almost infinite variety of structures. This quality is expected for a class of molecules that must be able to carry the enormous variety of genetic traits.
A smaller group of researchers looked to a second family of compounds as potential candidates for the molecules of heredity. These were the nucleic acids. The nucleic acids were first discovered in 1869 by the Swiss physician Johann Miescher (1844–1895). Miescher originally called these compounds "nuclein" because they were first obtained from the nuclei of cells. One of Miescher's students, Richard Altmann, later suggested a new name for the compounds, a name that better reflected their chemical nature: nucleic acids.
Nucleic acids seemed unlikely candidates as molecules of heredity in the 1930s. What was then known about their structure suggested that they were too simple to carry the vast array of complex information needed in a molecule of heredity. Each nucleic acid molecule consists of a long chain of alternating sugar and phosphate fragments to which is attached some sequence of four or five different nitrogen bases: adenine, cytosine, guanine, uracil, and thymine (the exact bases found in a molecule depend slightly on the type of nucleic acid).
It was not clear how this relatively simple structure could assume enough different conformations to "code" for hundreds of thousands of genetic traits. In comparison, a single protein molecule contains various arrangements of twenty fundamental units (amino acids) making it a much better candidate as a carrier of genetic information.
Yet, experimental evidence began to point to a possible role for nucleic acids in the transmission of hereditary characteristics. That evidence implicated a specific sub-family of the nucleic acids known as the deoxyribonucleic acids, or DNA. DNA is characterized by the presence of the sugar deoxyribose in the sugar-phosphate backbone of the molecule and by the presence of adenine, cytosine, guanine, and thymine, but not uracil.
As far back as the 1890s, the German biochemist Albrecht Kossel (1853–1927) obtained results that pointed to the role of DNA in heredity. In fact, historian John Gribbin has suggested that the evidence was so clear that it "ought to have been enough alone to show that the hereditary information… must be carried by the DNA." Yet, somehow, Kossel himself did not see this point, nor did most of his colleagues for half a century.
As more and more experiments showed the connection between DNA and genetics, a small group of researchers in the 1940s and 1950s began to ask how a DNA molecule could code for genetic information. The two who finally resolved this question were a somewhat unusual pair, James Watson, a 24-year-old American trained in genetics, and Francis Crick, a 36-year-old Englishman, trained in physics and self-taught in chemistry. The two met at the Cavendish Laboratory of Cambridge University in 1951, and became instant friends. They were united by a common passionate belief that the structure of DNA held the key to understanding how genetic information is stored in a cell and how it is transmitted from one cell to its daughter cells.
In one sense, the challenge facing Watson and Crick was a relatively simple one. A great deal was already known about the DNA molecule. Few new discoveries were needed, but those few discoveries were crucial to solving the DNA-heredity puzzle. Primarily the question was one of molecular architecture. How were the various parts of a DNA molecule oriented in space such that the molecule could hold genetic information?
The key to answering that question lay in a technique known as x-ray crystallography. When x rays are directed at a crystal of some material, such as DNA, they are scattered (diffracted) by the atoms that make up the crystal. The diffraction pattern thus produced consists of a collection of spots and arcs. A skilled observer can determine from the diffraction pattern the arrangement of atoms in the crystal.
The technique is actually more complex than described here. For one thing, obtaining satisfactory x-ray patterns from crystals is often difficult. Also, interpreting x-ray patterns—especially for complex molecules like DNA—can be extremely difficult.
Watson and Crick were fortunate in having access to some of the best x-ray diffraction patterns that then existed. These "photographs" were the result of work being done by Maurice Wilkins and Rosalind Elsie Franklin at King's College in London. Although Wilkins and Franklin were also working on the structure of DNA, they did not recognize the information their photographs contained. Indeed, it was only when Watson accidentally saw one of Franklin's photographs that he suddenly recognized the solution to the DNA puzzle.
Racing back to Cambridge after seeing this photograph, Watson convinced Crick to make an all-out attack on the DNA problem. They worked continuously for almost a week. Their approach was to construct tinker-toy-like models of the DNA molecule, shifting atoms around into various positions. They were looking for an arrangement that would give the kind of x-ray photograph that Watson had seen in Franklin's laboratory.
Finally, on March 7, 1953, the two scientists found the answer. They built a model consisting of two helices (corkscrew-like spirals), wrapped around each other. Each helix consisted of a backbone of alternating sugar and phosphate groups. To each sugar was attached one of the four nitrogen bases, adenine, cytosine, guanine, or thymine. The sugar-phosphate backbone formed the outside of the DNA molecule, with the nitrogen bases tucked inside. Each nitrogen base on one strand of the molecule faced another nitrogen base on the opposite strand of the molecule. The base pairs were not arranged at random, however, but in such a way that each adenine was paired with a thymine, and each cytosine with a guanine.
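The pairing rule in the model means that the base sequence of one strand fixes the sequence of the other. The short Python sketch below illustrates only that rule; the function name and the use of single-letter base codes are conventions chosen for this illustration, not part of the original account.

```python
# Pairing rule from the Watson-Crick model:
# adenine (A) pairs with thymine (T), cytosine (C) pairs with guanine (G).
BASE_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return, position by position, the base facing each base of `strand`.

    Hypothetical helper for illustration: it ignores strand direction and
    simply looks up the partner of every base.
    """
    partners = []
    for base in strand.upper():
        if base not in BASE_PAIRS:
            raise ValueError(f"not a DNA base: {base!r}")
        partners.append(BASE_PAIRS[base])
    return "".join(partners)


if __name__ == "__main__":
    print(complementary_strand("ATTCGCT"))  # prints TAAGCGA
```

Because each base has exactly one partner, applying the function twice returns the original sequence.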
The Watson-Crick model was a remarkable achievement, for which the two scientists, together with Maurice Wilkins, shared the 1962 Nobel Prize in Physiology or Medicine. The molecule had exactly the shape and dimensions needed to produce an x-ray photograph like that of Franklin's. Furthermore, Watson and Crick immediately saw how the molecule could "carry" genetic information. The sequence of nitrogen bases along the molecule, they said, could act as a genetic code. A sequence such as A-T-T-C-G-C-T…etc. might tell a cell to make one kind of protein (such as that for red hair), while another sequence, such as G-C-T-C-T-C-G…etc., might code for a different kind of protein (such as that for blonde hair). Watson and Crick themselves contributed to the deciphering of this genetic code, although that process was long and difficult and involved the efforts of dozens of researchers over the next decade.
Watson and Crick had also considered, even before their March 7 discovery, what the role of DNA might be in the manufacture of proteins in a cell. The sequence that they outlined was that DNA in the nucleus of a cell might act as a template for the formation of a second type of nucleic acid, RNA (ribonucleic acid). RNA would then leave the nucleus, migrate to the cytoplasm, and there itself act as a template for the production of protein. That theory, now known as the Central Dogma, has since been largely confirmed and has become a critical guiding principle of much research in molecular biology.
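The flow of information from DNA to RNA to protein can be sketched in a few lines of Python. This is a simplified illustration, not a description of the cellular machinery: the function names are invented here, the example template sequence is made up, and the codon table holds only four entries from the standard genetic code (which was worked out after the events described above).

```python
# DNA template base -> RNA base built against it (RNA uses uracil, U, not thymine).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# A small excerpt of the standard genetic code; None marks a stop signal.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": None}


def transcribe(template: str) -> str:
    """Build the RNA sequence that pairs with a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


def translate(rna: str) -> list[str]:
    """Read the RNA three bases at a time and list the amino acids encoded."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid is None:  # stop codon: the protein chain ends here
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    rna = transcribe("TACAAACCGATT")  # made-up template sequence
    print(rna)                        # prints AUGUUUGGCUAA
    print(translate(rna))             # prints ['Met', 'Phe', 'Gly']
```

A codon outside the four-entry table raises a KeyError; a full table of 64 codons would be needed for real sequences.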
Scientists continue to advance their understanding of DNA. Even before the Watson-Crick discovery, they knew that DNA molecules could exist in two configurations, known as the "A" form and the "B" form. After the Watson-Crick discovery, two other forms, known as the "C" and "D" configurations, were also discovered. All four of these forms of DNA are right-handed double helices that differ from each other in relatively modest ways.
In 1979, however, a fifth form of DNA known as the "Z" form was discovered by Alexander Rich and his colleagues at the Massachusetts Institute of Technology. The "Z" form was given its name partly because of its zig-zag shape and partly because it is different from the more common A and B forms. Although Z-DNA was first recognized in synthetic DNA prepared in the laboratory, it has since been found in natural cells whose environment is unusual in one respect or another. The presence of certain types of proteins in the nucleus, for example, can cause DNA to shift from the B to the Z conformation. The significance and role of this most recently discovered form of DNA remain a subject of research among molecular biologists.