Kenneth Travis LaPensee. Scientific Thought: In Context. Editor: K Lee Lerner & Brenda Wilmoth Lerner. Volume 1. Detroit: Gale, 2009.
Essentially completed in April 2003, the Human Genome Project was a massive 13-year research effort to map and sequence each of humanity’s 25,000 genes, known collectively as the genome. Researchers predict that it will enable physicians to determine individual genetic strengths and vulnerability to disease, information that can be used to minimize known medical risks.
While there is ample reason for optimism that the Human Genome Project will improve public and personal health, there is also a disquieting potential that it could be abused for commercial or political ends. The young and growing field of bioethics will face a flood of new challenges and conundrums posed by scientists’ increasing ability to ascertain the genetic status of each individual. The emerging science of biotechnology, which studies the human genome and its application to medicine, commerce, and government, may have the power to alter the course of human history.
Historical Background and Scientific Foundations
Articulating the human genome essentially began with the discovery of the role and structure of deoxyribonucleic acid (DNA), the molecule that contains the genetic codes of all living organisms, by American geneticist James Watson (1928-), and British biophysicists Francis Crick (1916-2004), Maurice Wilkins (1916-2004), and Rosalind Franklin (1920-1958). Their groundbreaking research at Cambridge University and King’s College London in the 1950s led to a shared Nobel Prize for Watson, Crick, and Wilkins in 1962. (Franklin was deceased and thus ineligible for a Nobel Prize.) Later work by American biochemists Marshall Nirenberg (1927-), Robert Holley (1922-1993), and Har Gobind Khorana (1922-) deciphered the genetic code and its role in protein synthesis. The three shared a Nobel Prize in 1968.
Using x-ray diffraction data, Watson and Crick found that DNA had the structure of a double helix, a “ladder” with spines composed of alternating units of the sugar deoxyribose and phosphate, and rungs composed of paired nucleotide bases: adenine, guanine, cytosine, and thymine. Sequences of three nucleotides, called triplets, form the codons that specify the 20 amino acids found in all life forms; codon sequences make up the genes that dictate the structure of the proteins that are the building blocks of cells, tissues, and organ systems of all living organisms.
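The way triplets of nucleotides specify amino acids can be illustrated with a short Python sketch. It is only a toy: the table lists a handful of the 64 codons of the standard genetic code, and the function names are invented for this illustration.

```python
# Illustrative subset of the standard genetic code (DNA codons -> amino acids).
# Only a few of the 64 codons are listed; this is not a complete table.
CODON_TABLE = {
    "ATG": "Met",   # methionine, also the usual start signal
    "TTT": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "GCA": "Ala",   # alanine
    "AAA": "Lys",   # lysine
    "TGG": "Trp",   # tryptophan
    "TAA": "Stop",
    "TAG": "Stop",
    "TGA": "Stop",
}


def split_into_codons(dna):
    """Split a DNA string into consecutive, non-overlapping triplets."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial triplet
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Read codon by codon, stopping at a stop codon or an unlisted triplet."""
    protein = []
    for codon in split_into_codons(dna.upper()):
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None or amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    print(translate("ATGTTTGGCTAA"))
```

Reading the twelve-letter example three letters at a time gives the codons ATG, TTT, GGC, and TAA, that is, methionine, phenylalanine, glycine, and a stop signal.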
Development of the Human Genome Project
In 1982 Dr. Leroy Hood (1938-), cofounder of the Institute for Systems Biology in Seattle, began to develop an automated gene sequencer. During the 1990s this device fueled the rapid progress of the Human Genome Project, an enterprise that scientists had predicted would take a century to complete. The sequencer cut that time to 13 years, and was soon employed at major national laboratories for exploration of the human genome.
The next year the Los Alamos National Laboratory (LANL) and the Lawrence Livermore National Laboratory (LLNL) began to produce DNA clone (cosmid) libraries, collections of cloned chromosomal DNA representing single human chromosomes. This project received critical backing from the Congressional Office of Technology Assessment (OTA), giving attention to the increasing scientific consensus that a human genome project would be of great value.
An important meeting held in 1985 at the University of California, Santa Cruz, gathered scientists interested in working on human genome sequencing. In that same year, the Office of Health and Environmental Research within the U.S. Department of Energy (OHER, now the Office of Biological and Environmental Research), commissioned the first annual Santa Fe conference to evaluate the viability of a human genome program. After the next Santa Fe conference, OHER announced the Human Genome Initiative (HGI, now known as the Human Genome Project or HGP) with $5.3 million in startup funding; pilot projects began at the Department of Energy (DOE) national laboratories to build up vital assets and technologies for the project.
In 1987 the DOE Health and Environmental Research Advisory Committee (HERAC) recommended a 15-year multidisciplinary project to map and sequence the human genome, designating certain research sites as “multidisciplinary human genome centers.” In addition, the National Institute of General Medical Sciences (NIGMS) at the National Institutes of Health (NIH) began to fund genome projects. One of the first breakthroughs from these initiatives was the discovery of the significance of telomere (chromosome end) sequences, when scientists at LANL discovered that shorter telomere length is correlated with increased age and greater cancer risk.
So many different initiatives were going on simultaneously at that time that Congress recommended a more coordinated effort. Scientists founded the Human Genome Organization (HUGO) in 1989 to coordinate research internationally, and annual meetings at Cold Spring Harbor Laboratory began that year. The DOE and NIH signed a memorandum of understanding that outlined cooperation on genome research and later jointly published guidelines for data release and resource sharing.
In 1988 the first fruits of all of these efforts began to emerge. The human chromosome mapping data repository (GDB) was established, and in 1989 a “low-resolution” genetic linkage map of the entire human genome was published. Low-resolution maps show only the location of genes on chromosomes, while high-resolution maps determine the actual variants or alleles of these genes in an individual’s genetic sequence. In 1990 some projects began to identify gene locations on chromosome maps as sites of messenger RNA (mRNA) expression. These discoveries were significant because mRNA transfers protein-manufacturing information from the DNA in the cell nucleus to protein-manufacturing sites in the cytoplasm, called ribosomes. Another type of RNA known as transfer RNA (tRNA) carries amino acids for the proteins to the ribosome. Each tRNA has a nucleotide triplet (anticodon) that binds to the complementary mRNA codon on the ribosome. In finding mRNA sites, researchers began to increase the resolution of the gene map. At the same time researchers began to increase efficiency in the production of stable large-insert bacterial artificial chromosomes (BACs), which are artificial gene sequences that are inserted into bacteria for cloning experiments.
Sociopolitical Implications of Human Genome Knowledge
The first Internet sequencing service was the Gene Recognition and Analysis Internet Link (GRAIL), established at the Life Sciences Division of the Oak Ridge National Laboratory (ORNL) in 1993. This service is currently provided by the Genome Analysis and System Modeling Group at ORNL, which conducts genetics research and system development in genomic sequencing, computational genome analysis, and computational protein structure analysis. The group also provides bioinformatics and analytic services and resources to research collaborators, predicts prospective gene and protein models for analysis, and provides user services for the general community, including computer-annotated genomes. This service is increasingly being used to map individuals’ genetic risk for various diseases.
The Institute of Medicine released the HGP-funded report “Assessing Genetic Risks” in 1993. The rapid accumulation of genetic data had created growing concern about the misuse of such information by insurance companies, prompting the DOE-NIH Ethical, Legal, and Social Issues (ELSI) Working Group’s Task Force on Genetic and Insurance Information to publish recommendations for safeguarding individual privacy and prohibiting the use of such information for insurance underwriting.
Soon after ELSI made its recommendations, the Genetic Privacy Act, the first American HGP legislation, was proposed to govern “collection, analysis, storage, and use of DNA samples and the genetic information obtained from them.” The legislation was endorsed by the ELSI Working Group. Within a few years Equal Employment Opportunity Commission guidelines extended the Americans with Disabilities Act (ADA) employment protection to cover discrimination based on genetic information related to illness, disease, or other conditions. In 1996 the Health Insurance Portability and Accountability Act (HIPAA) prohibited the use of genetic information in some determinations of health insurance eligibility, and required the Department of Health and Human Services to enforce privacy provisions for individual health information. DOE and the new National Center for Human Genome Research (NCHGR) also issued guidelines on use of human subjects for large-scale sequencing projects. HGP participants agreed on sequencing data release policies and the United Nations Educational, Scientific, and Cultural Organization (UNESCO) adopted the Universal Declaration on the Human Genome and Human Rights. In 2000 President Bill Clinton (1946-) signed an executive order forbidding federal departments and agencies from using genetic information in hiring or promoting workers.
Commercialization of the Human Genome Project
While scientists and politicians tried to put sociopolitical safeguards in place, the movement to commercialize the genome findings began to gain steam. In 1993 the National Center for Human Genome Research (NCHGR) established a Division of Intramural Research, charged with developing genome technology for research on specific diseases. The chromosome paints developed at LLNL to contrast various regions of the chromosomes and the hybridization sequencing technology developed at Argonne National Laboratory were commercialized in 1994. At the same time it also became clear that HGP research goals would be met ahead of projections. National laboratories completed second-generation DNA clone libraries that represented each human chromosome.
J. Craig Venter (1946-) was an NIH scientist who decided to focus his gene-sequencing efforts on gene products, such as the messenger RNAs manufactured within cells, rather than on the genome itself. Beyond the genes themselves, the genome also consists of a much larger quantity of DNA, the function of which is not yet known, referred to as “junk DNA.” Venter launched a challenging project to sequence genome sections, attempting to identify new gene products and to estimate how diverse these sections were in terms of gene structure and function.
Venter ultimately left NIH to set up a private, non-profit institute, The Institute for Genomic Research (TIGR), which aimed to collect and interpret enormous numbers of expressed sequence tags (random portions of complementary DNAs whose structure was deduced from mRNA information; the RNAs hold all the information that is actually “phenotypically expressed” in a given cell type). TIGR published the first sequenced genome of a bacterium, Haemophilus influenzae, in July 1995.
Venter changed direction when he announced in May 1998 that he was forming a new company, later named Celera Genomics Group, a privately funded genomics effort seeking to sequence much of the human genome using HGP-developed resources. It would compete directly with the public effort to sequence the complete human genome by 2001. Venter adopted an approach to mapping the genome that differed from the one followed by the HGP. Rather than mapping each gene on the chromosomes, his private commercial effort intended to section the genome into random portions, and then sequence and piece them back together. Although this technique saved time by doing sequencing without gene mapping, it needed powerful computing capabilities for the reassembly phase because of the many recurring sequences. Venter’s method used a supercomputer and 300 high-speed automatic sequencers manufactured by the Perkin-Elmer Corporation. This method has undergone refinement and is now standard at the national laboratories.
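The idea of cutting a genome into random portions and piecing them back together by their overlaps can be sketched in a few lines of Python. This is a toy illustration only, not Celera's software: it uses a simple greedy strategy, the function names and the minimum overlap of three bases are assumptions made for the example, and real assemblers must also handle sequencing errors and the recurring sequences mentioned above, which a greedy merge like this one can join incorrectly.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if none)."""
    max_len = min(len(a), len(b))
    for size in range(max_len, min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining pair overlaps; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_size:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCC", "GCCTTA", "TTAGGA"]))
```

In the example, the three six-letter fragments overlap by three letters each and are rebuilt into the single sequence ATGGCCTTAGGA. Every round compares all pairs of fragments, which hints at why reassembling millions of fragments demanded so much computing power.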
When Venter’s intentions were made public, the Wellcome Trust in the United Kingdom decided to allocate extra resources to its own Sanger Centre to speed up genome sequencing. Thus began a determined race between public and private initiatives, with the added dimension that the private initiative was based in the United States, while the public initiative was based in Cambridge, U.K. With increased funding, the Sanger Centre doubled its sequencing goal from one sixth up to a full one-third of the complete genome. With the race between the public and private projects on in earnest, sequencing milestones were achieved ever more quickly.
Actually, it was overly simplistic to view the competition in terms of U.K. public sector versus U.S. private sector. Venter’s bold initiative also energized the United States’ publicly funded project (HGP) and stimulated strenuous efforts. HGP scientists wanted to avoid the public perception that their work was unhurried and inefficient, which could have resulted in the HGP losing congressional support and funding. Furthermore, the possibility that the human genome resources could end up in private hands was unacceptable to the HGP investigators. Thus Celera’s entrance into the race to sequence the human genome revolutionized the field of genomics. Celera decided to make its genome sequences available to subscribing customers only, and the company planned to patent some sequences with particularly promising commercial value before releasing them. This commercial approach was emulated by other private companies whose genetic research focused on individual genes rather than the entire genome. The commercialization of the genome sequence thus raised a possibility that was alarming to many scientists and laymen: that the genetic material of the human body could become patented and “owned” for the financial benefit of corporations.
To stymie the effort to patent large portions of the genome, the publicly funded HGP published gene sequence information as fast as it could be produced, providing immediately usable and timely data to the scientific community. The HGP leadership maintained that patenting the human genome was unethical and delayed the quick application of genomic information to curing medical illnesses. International genome research project partners had convened in Bermuda in February 1996, where they formulated the “Bermuda principles” to govern access to genome data, particularly the rule that sequence information should be released into public databases within twenty-four hours. Adherence to these rules meant that participating scientists would enter newly discovered base sequences into a public database within one day of completing the sequence. Data contained in the databases were merged daily, and access to the stored sequences was totally unrestricted and free of charge. This agreement was extended to encompass the genomes of other organisms later that year.
Overall, Celera’s entry into the genome sequencing field and its plan to patent genes had mixed consequences. The company’s efforts were excoriated in the U.K. press since, unlike the United States, the United Kingdom provides health care to all citizens through its National Health Service (NHS). Thus the NHS would bear the financial brunt of paying for the medical application of patented genome discoveries. In general there was alarm and disapproval of U.S. biotechnology companies that aimed for profits rather than allowing public access to genome sequences to advance scientific knowledge and enhance public health.
On the other hand, some of the prominent HGP researchers believed that Celera’s entry into the field ultimately had a positive influence on the government-funded projects because it inspired healthy competition and more creative and vigorous thinking. The news media, however, portrayed the race between the public and private sectors as acrimonious and neglectful of the overarching goals of the project to increase scientific knowledge and to improve human health.
In 1999 major drug firms, which also stood to lose financially if large parts of the human genome were patented, created the SNP (single nucleotide polymorphism) Consortium to create an inventory of all the variations in the genome that differ by a single nucleotide; this was to prevent individual human genes from being patented and to preserve the growing knowledge of the human genome as a public resource.
The billionth base pair of the human genome was entered into the public databases on November 26, 1999, and the publication of the sequence of chromosome 22 on December 1 was heralded in the press. In March 2000 Prime Minister Tony Blair and President Bill Clinton called for all data about human genes to be made freely available to scientists worldwide. The stock prices of Celera and the other biotechnology firms involved in genome research dropped precipitously. In the U.S. Congress, Republicans denounced Democrats for failing to support the U.S. biotechnology industry. Pressure on the HGP leadership mounted to patch up their differences with Venter, although earlier efforts to resolve differences had failed because Venter had refused to release his company’s genome sequence data unconditionally. Finally, in June 2000, a joint announcement was made that both the public and private efforts had completed working drafts of the human genome. These drafts were published simultaneously in February 2001: the public project’s draft in the journal Nature and Celera’s draft in Science.
Milestones in the Continuing Public Sector Genome Effort
By 1995 National Laboratory scientists had achieved several critical breakthroughs. LANL and LLNL announced the construction of high-resolution physical maps of chromosomes 16 and 19, while moderate-resolution maps of chromosomes 3, 11, 12, and 22 were published. The first non-viral gene map (for the bacterium Haemophilus influenzae) was constructed by The Institute for Genomic Research (TIGR), and the gene sequence of the smallest bacterium, Mycoplasma genitalium, was completed; the latter achievement showed the minimum number of genes needed for independent existence. (Viruses, simpler than bacteria, do not live independently, but must use the genetic machinery of host organisms to replicate.)
By 1996 eight NIH institutes and centers had also collaborated to create the Center for Inherited Disease Research to study the genetics of complex diseases. In 1997 the NCHGR gained full institute status at NIH, becoming the National Human Genome Research Institute (NHGRI). Francis Collins remained as the director of the new institute. A third five-year plan was announced in 1998 in Science. The sequence of the human T-cell receptor region was completed in 1996; this was an important step toward understanding how the immune system’s biochemical signaling (e.g., spreading information about invading organisms) works. In that same year an unexpected and dramatic discovery was made when the genome of the organism Methanococcus jannaschii was sequenced. This work confirmed the existence of a third major branch of life, revealing an ancient metabolic world shared largely by Eubacteria (bacteria that are familiar to us) and Archaea (represented by M. jannaschii) prior to the emergence of eukaryotes (“true” cells such as those in protozoa and the bodies of multicellular organisms).
Although components of information processing and excretory systems are present in all three domains, their apparent refinement over time, especially in transcription and translation, indicates that Archaea and Eukaryotes share a common evolutionary trajectory independent of the Eubacteria lineage. In 1998 the genome sequence of Escherichia coli, the most commonly used bacterium in biotechnology, was completed.
In 1999 the first human chromosome (chromosome 22) was sequenced completely, as was the smallest of the chromosomes (chromosome 21). In 2001 human chromosome 20 was finished. Many of the remaining chromosomes were sequenced by the end of 2004. In that year the estimated number of human genes was revised downward, to between 20,000 and 25,000. The gene sequences of other animals often used in research, such as the mouse and the fruit fly Drosophila, were also completed in the early years of the twenty-first century.
Modern Cultural Connections
While genome projects have given scientists an inventory of genes and information about some of the basic purposes they serve, little is known about how cells use genetic information to function as living organisms. Researchers still do not know the functions of most genes, or how genes and the proteins they code for act together and with the external world.
Gene sequences and the technology used to produce them have at the very least revolutionized the way that molecular biology research is conducted. Prior to the development of these techniques, researchers could study only a few genes or proteins at a time, producing an artificial and unrealistic understanding of the way that organisms function. Now scientists can use a much grander approach, investigating all genes relevant to a particular process, tissue, organ, or tumor. A new field known as systems biology models the interactions of thousands of genes, proteins, and biochemicals to explain how they produce the phenomena that bring organisms to life.
An overarching purpose of genome science is to chart variations in DNA sequences that can increase or reduce the risk of disease, and determine how individuals respond to infections, toxins, and drugs. One of the more common types of sequence variation is the single nucleotide polymorphism (SNP), in which individuals differ in their DNA sequences by a single base (e.g., having adenine at a particular location instead of cytosine). Researchers estimate that the human genome has at least 10 million SNPs, and maps of these sites are being generated. Ultimately this variation will be correlated with the risk of disease and response to the environment. Scientists hope that building an inventory of individual SNPs will provide a shortcut for identifying DNA regions linked to diseases such as cancer, heart disease, diabetes, and even certain types of mental illness. A new SNP map also might help ascertain how genetic variation produces individual traits and responses to the environment.
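A single-base difference of this kind is easy to picture in code. The following Python sketch is a simplified illustration under strong assumptions: the two sequences are invented, already aligned, and of equal length, and the function name is hypothetical. Real SNP discovery compares many individuals and must separate true variation from sequencing error.

```python
def find_snps(reference, sample):
    """List (position, reference base, sample base) where two aligned
    sequences of equal length differ; positions are counted from 1."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base)
        in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


if __name__ == "__main__":
    # Made-up sequences: the second differs only at position 2 (A -> C).
    print(find_snps("GATTACA", "GCTTACA"))
```

Here the only difference is at the second position, where one sequence has adenine and the other cytosine, matching the kind of substitution described above.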
Primary Source Connection
The following article was written by Peter N. Spotts, a science and technology writer for The Christian Science Monitor. Founded in 1908, The Christian Science Monitor is an international newspaper based in Boston, Massachusetts. The article describes the National Human Genome Research Institute’s announcement of its completion of a map that sequences the human genome and its implications for the future.
In what many are hailing as a historic milestone in the annals of modern science, researchers have announced the successful completion of a project to sequence the human genome.
The 13-year, $2.7 billion undertaking drew to a close Monday, 50 years to the month after scientists Francis Crick and James Watson published their discovery of the structure of DNA, the biochemical instruction book for organic life archived in the centers of cells.
Now, biologists are unrolling a fresh research blueprint for genome-related research, drawn for what National Human Genome Research Institute Director Francis Collins and colleagues have termed “the true dawning of the genomic era.”
They’ve assembled the parts list in the right order. Now they hope to accelerate efforts to understand how the genetic information they’ve uncovered yields the complexities and diversity of living organisms.
“We have opened the door into a vast and complex new biological landscape,” says Aristides Patrinos, director of the US Department of Energy’s Office of Biological and Environmental Research.
Even before the project ended, it was having a measurable impact on areas ranging from medicine to the war against bioterrorism. Researchers say information from the genome project has allowed them to develop genetic tests that can help identify broad classes of cancer. Gene therapies, in which defective genes are identified and replaced, remain in their infancy. But scientists claim some success in treating mice with sickle-cell disease.
Microcircuits that can quickly analyze DNA samples placed on them are being used in equipment designed to test for many of the microbes thought to be the most likely weapons in a bioterrorism attack.
Meanwhile, researchers using sequencing and computational techniques developed for the Human Genome Project are looking for microbes that could help clean up nuclear waste, refine gasoline more efficiently and with less energy, or act as a source of hydrogen for fuel.
The praise and predictions surrounding Monday’s announcement of the Human Genome Project’s end has a familiar ring. In February 2001, with fanfare that included capturing the covers of the world’s two leading general-science journals, researchers with the Human Genome Project and a private human-genome effort published rough drafts of the sequence.
Yet the drafts were laced with errors and contained vast gaps in the sequence of pairings among the four chemical “bases” that combine to form the “rungs” of the now-iconic twisted-ladder structure of DNA.
This time, it’s complete
The version announced Monday is as complete as today’s technology can make it, researchers say. The error rate has been cut from one mistake in every 1,000 base pairs to one in every 10,000—an accuracy that applies to 99 percent of the genome’s 3 billion base pairs.
Just as important, researchers add, are the finished product’s vast stretches of uninterrupted genetic information, which is expected to radically shorten the time it takes scientists to hunt for genes.
“It’s a bit like moving on from a first-attempt demo music tape to a classic CD,” says Jane Rogers, director of sequencing at Britain’s Wellcome Trust Sanger Institute, a key player in the sequencing effort.
The information and technologies the project has generated already are profoundly affecting fields ranging from biomedicine and hazardous-waste clean-up to the study of the origins and evolution of organic life itself. The project also has laid at society’s doorstep challenging ethical and legal questions about the use of human genetic information.
Now scientists are moving into a new generation of global research to build on the Human Genome Project’s results. Writing in a forthcoming issue of the journal Nature, Dr. Collins and several colleagues outline what they see as the opportunities the completed genome offers for improving medical care, dealing with environmental issues, and assessing the effect genetic information can have on “concepts of race, ethnicity, kinship, individual and group identity, health, disease, and ‘normality’ for traits and behaviors.”
Several projects already are under way. Last fall, the National Human Genome Research Institute and collaborators began the International HapMap Project, a three-year effort to pinpoint genetic variation within the human genome. Another project aims to build an encyclopedia covering all of the genes that code for proteins, and other important biochemicals, or that perform other functions. This would allow scientists to quickly distinguish useful genes from the junk genes the genome carries.
Applications for energy
Meanwhile, the DOE is focusing efforts on one-celled organisms and the roles they may be able to play in meeting US energy needs and cleaning the environment. Over the next 10 to 20 years, researchers want to know how microbes, which make up an estimated 50 percent of Earth’s biomass, function at such a level of detail that they can accurately simulate how organisms will respond to changes in their environment.
“We’re just beginning to understand how to work with multiple influences” instead of single determining factors, notes Alta Charo, a professor of law and bioethics at the University of Wisconsin at Madison. Those multiple influences can be found in the interplay between genes and environment or the interplay of many genes within a genome required to trigger a particular set of biological processes.
The idea of multiple influences “does not work well within a medical system whose paradigm is to cure diseases that are presented to it. It requires thinking from a more preventative point of view,” she says. “The Human Genome Project may push us toward a different organization of the healthcare system, if we use the information creatively.”
As researchers probe the “sheer number of genetic variances and mutations, we’re going to slowly realize that any individual has genetic variance”, that variance is the norm, and consequently that, in biological terms at least, there is no idealized “normal.”
The challenge for society as it continues to grapple with advances in genetics information “is not to say: we won’t go there,” Charo says. “The challenge is to say we will go there, and this is how.”