Gary S Stein, Janet L Stein, Andre J van Wijnen, Jane B Lian. American Scientist. Volume 84, Issue 1. January 1996.
Many of the cells in a human body are disposable. That is, many are meant to be used briefly, and when they become used up, to be discarded. Consider skin cells, which are abraded in vast numbers daily, only to be replaced by other skin cells. Or blood cells, which reside in the body for only a few weeks before being replaced. Bone cells are also being used and replaced, either as the bone is remodeled in response to the changing demands it receives through a person’s lifetime, or when it is broken and must be repaired.
Skin, blood and bone cells represent the mature or differentiated stages of each cell type. In order for the body to replenish these mature cells when necessary, it must manufacture an ample supply of precursor cells. Precursors are the immature varieties of the fully functional cells that have the capability to divide and make more precursors or differentiate into the mature and functional cell type.
These two stages—proliferation and differentiation—are almost always mutually exclusive. At some point in a person’s life, almost every cell will reach an important crossroad where it must “decide” whether to do one or the other.
Of course, cells do not go through a rational decision process. Rather, the “decision” is mediated by the signaling molecules in the cell’s environment. These give information to the cell regarding the need for either more precursors or more differentiated cells. The external signals in turn activate a particular set of genes that will prepare the cell for the activity to follow. If the cell is to divide, the genes activated are those that help manufacture the apparatus needed to replicate DNA, the genetic material, and apportion it equally to the daughter cells. If the cell is to differentiate, a specific set of genes is activated that help the cell manufacture the products of its mature type.
During the past two decades in particular, molecular biologists have made tremendous progress in identifying the genes involved in cell growth and maturation for a range of cell types. They have also come a long way in enumerating the specific molecules, usually proteins, needed to activate particular sets of genes. Yet with this understanding, other problems present themselves.
Genes are made of DNA, but not all of the DNA found in a cell is genes. In fact, genes constitute a small fraction of the total DNA in any given cell, possibly less than ten percent. Furthermore, the regulatory regions of any gene represent a very small part of the total length of any gene. If we assume that genes and their regulatory proteins must find each other at random, the chances that the proteins needed to activate any particular gene at any particular time will find their way to the proper regulatory sequence at the right time become vanishingly small. Added to those odds is the problem that the proteins that activate genes do not act singly. Rather, they often form large complexes of many different proteins, which means that large numbers of proteins have to find a small portion of the right gene at the right time. And as if that weren’t bad enough, one must consider that the regulatory proteins themselves exist in very small quantities at any given time. When scientists consider the facts of gene regulation, the challenges of proliferation and differentiation, with their absolute requirement for the proper activation and deactivation of sets of genes through time, seem almost impossible. Yet if one looks at any healthy individual, it is clear that genes are being properly regulated, and that cells are differentiating on schedule. So how does the cell manage to do it?
The task seems impossible if we assume that genes and their regulatory proteins are floating freely within the nucleus of the cell and that the two groups will meet by chance when they need to. That would be a bit like trying to find a particular item in a shopping mall where goods are randomly distributed among its stores. You would have to visit a good number of stores before finding the item you were looking for. But the mall is not organized that way, and neither, it seems, is the cell’s nucleus. The merchandise in a store is dictated by a single theme, and when you need a particular item, you go to the store that carries that general kind of item. It is becoming increasingly clear that a cell’s nucleus and its DNA are also organized in such a way that regulatory proteins and their target genes are concentrated in specific regions. However, since different sets of genes and proteins are needed at different times, this organization can change with the needs and life stages of the cell. This is somewhat akin to the appearance and disappearance of seasonal merchandise in a shopping mall.
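The mall analogy can be made concrete with a little arithmetic. The sketch below is only an illustration of the analogy, not a model of nuclear biochemistry. It assumes a shopper who checks stores one at a time without repeats and an item equally likely to be in any store; the function names and the mall size are hypothetical.

```python
from fractions import Fraction


def expected_visits_random(n_stores: int) -> Fraction:
    """Average number of stores checked when goods are placed at random.

    Assumes the item is equally likely to be in any one store and that
    no store is visited twice. The item is then found on visit k with
    probability 1/n for k = 1..n, so the mean is (n + 1) / 2.
    """
    if n_stores < 1:
        raise ValueError("need at least one store")
    return Fraction(n_stores + 1, 2)


def expected_visits_organized() -> int:
    """With stores organized by theme, the shopper goes straight to the right one."""
    return 1


if __name__ == "__main__":
    n = 99  # hypothetical mall size, chosen only for illustration
    print("random placement:", expected_visits_random(n))
    print("organized by theme:", expected_visits_organized())
```

Under these assumptions the cost of a random search grows in step with the number of stores, while an organized search stays constant. That is the intuition behind concentrating regulatory proteins and their target genes in the same regions of the nucleus.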
In our laboratory, we study the development of bone-forming cells called osteoblasts and the genes that are expressed maximally either during proliferation or differentiation, with the hope of understanding how the three-dimensional configuration of the cell’s nucleus, DNA and regulatory proteins contributes to the proper expression of these genes.
Bone, like many human tissues, is self-renewing. Increased pressure on a particular bone will cause it to grow and thicken, whereas decreased use has the opposite effect. In addition, broken bones can be repaired. This is possible because the osteoblast cells that form bone are capable of regenerating themselves. Regeneration, however, requires that two distinct pools of cells be maintained. One pool is of immature osteoblasts. These cells are found near the outer bone surfaces. They can divide to produce more immature osteoblasts so that ample numbers of these cells are available whenever the bone needs to be repaired or enlarged.
When bone tissue must be made, some of the immature osteoblasts start to migrate into the interior of the bone and differentiate. The differentiated cells lose the ability to divide—a mature cell cannot therefore be derived from the division of other mature cells. But they gain other capabilities. The mature cells start to manufacture proteins required for their specialized function. In the case of mature osteoblasts, many of these products are fibrous proteins that are secreted into the space between cells. The fibrous material forms the framework on which phosphate and calcium salts from body fluids accumulate to give bone its hardness.
Studies of osteoblasts grown in culture and in vivo support the idea that a specific set of genes is expressed during the proliferation stages but is repressed during differentiation. Then, at a key transition point that marks the completion of the initial developmental period, proliferation is shut down, and differentiation begins. At that point, genes are activated that contribute to the formation and maintenance of bone tissue. In our studies we have followed the expression of two genes in particular whose expression is correlated with proliferation or differentiation. The first is a gene for the protein called histone H4, which is maximally active during cell division. The second is the gene for osteocalcin, one of the specialized, calcium-binding bone proteins secreted by mature osteoblasts that associate with the mineralized fibrous framework of bone. In both cases, we have found that the overall architecture of the cell’s nucleus and the folding pattern of the chromosome in the area of the gene make important contributions to the proper and timely expression of these genes.
Almost all of the components of the cell are made up of proteins, and almost all of the work in the cell is carried out by proteins. But the cell has a complex system for manufacturing these proteins. The instructions for making proteins are encoded in the form of DNA, which is chemically distinct from protein. The set of instructions for making one protein is a gene, and genes are arrayed along the chromosomes. Genes, however, account for a small fraction of the total DNA in the chromosomes. The function of the rest of this DNA is the subject of intense debate.
The ultimate goal of expressing a gene is to produce the protein encoded within it. But this is not a straightforward process. The DNA of animal and plant cells is housed in a membrane-bound compartment within the cell called the nucleus. Protein synthesis, on the other hand, takes place outside of the nucleus, in the cellular compartment known as the cytoplasm. The DNA never leaves the nucleus, so the code contained within the genes has to be exported somehow from the nucleus to the protein-synthesizing machinery in the cytoplasm. This is achieved by making a temporary copy of the gene, which carries the DNA message to the cytoplasm. The messenger is not itself DNA, but a chemically related molecule called RNA. In our laboratory, we concern ourselves with the first steps of gene expression—namely, the copying of DNA into RNA, a process known as transcription.
For more than 30 years, since DNA was discovered to be the genetic material, scientists have been investigating the process of gene expression and trying to learn how transcription is initiated. Biologists have learned that transcription requires the collaboration of two classes of molecules—DNA and proteins.
If we look at a piece of a human gene, we see that it is a double strand of DNA coiled in the shape of a helix. One of these strands contains the instructional code for making the protein (the other contains the instructions for making its partner DNA strand). The major part of the gene is the coding region, but a portion of the gene codes for nothing. Instead, this portion constitutes an on/off switch. This regulatory region, called a promoter, is very much like the switch of a light. When the switch is in the “on” position, transcription can proceed, and an RNA copy of the coding region of the gene will be made. If the switch is turned “off,” transcription will not proceed.
The promoter can be turned on by proteins (and occasionally other molecules) specifically designed to recognize and bind to the promoter. These proteins are called transcription factors. Often, recognizing the promoter requires a whole complex of transcription factors. Sometimes the same transcription factor can be found in combination with different partners at different promoters. In this way, a single factor might be able to activate different kinds of promoters, depending on the partners with which it combines. In addition to transcription factors, other proteins are required that make an RNA copy of the DNA. It is generally believed that the transcription factors attach onto the promoter of the gene they recognize and help untwist the two strands to expose the coding region of the gene. The transcription factors may also help secure the RNA-copying machinery to the gene, and in this way initiate transcription of that specific gene. A different code at the end of the gene tells the RNA-copying protein, called a polymerase, to stop, whereupon the polymerase is released from the gene. Transcription is likely to proceed as long as all of the correct transcription factors are attached to the promoter, but will stop when one or more of the factors is removed.
In some cases, a gene has only one promoter. But many genes have more than one, each recognized by a different set of transcription factors. In this way, it is possible to construct a scenario for how genes are expressed in specific types of cells and during particular times in the life of a cell.
Different transcription factors may be available under different circumstances, depending on the external environment of the cell, its nutritional state or even the environmental temperature. External signals include hormones, transmitters and other small molecules secreted by other tissues or cells. Sometimes molecules on the cell’s exterior can penetrate both the cell’s membrane and the nuclear membrane and bind directly to a promoter and activate that gene. More often, however, external molecular and environmental signals trigger an internal biochemical response, which will result in the dispatch of certain transcription factors to the cell’s nucleus and to a gene’s promoter. Clearly, the external environments of proliferating and differentiating cells provide different external cues, which cause a different set of transcription factors to activate the promoters of different sets of genes. These genes encode products appropriate for the activities of the cell in a particular phase of its life cycle. Our study of the histone H4 gene, for example, has allowed us to look in detail at the activation of a gene during the proliferative phase.
Proliferation and Differentiation
There is an unwieldy amount of DNA packaged into the cramped space of the cell’s nucleus. Approximately 2-1/2 yards of DNA are squeezed into a volume on the order of a trillionth of a liter, a space visible only with the aid of a microscope. The secret to getting all of the material into so small a space is in the packaging. The double strands of DNA that make up the chromosomes are wrapped around a group of histone proteins (the group contains two molecules each of histone proteins H3, H4, H2A and H2B) in a complex called a nucleosome. Winding the DNA around the histone proteins shortens its linear length about sevenfold. Higher-order folding of DNA further reduces linear distances along the DNA.
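The packaging figures above lend themselves to a quick calculation. The sketch below takes the two numbers given in the text (about 2-1/2 yards of DNA and a sevenfold contraction from nucleosome winding) and converts them to metric lengths. The function and constant names are hypothetical, and the only outside fact used is the definition of the yard as 0.9144 meters.

```python
YARD_IN_METERS = 0.9144  # exact, by definition of the international yard


def packaged_length_m(dna_yards: float, compaction: float) -> float:
    """Length in meters that remains after a given linear compaction."""
    if compaction <= 0:
        raise ValueError("compaction factor must be positive")
    return dna_yards * YARD_IN_METERS / compaction


if __name__ == "__main__":
    dna_yards = 2.5          # figure from the text
    nucleosome_factor = 7.0  # sevenfold contraction from the text

    total_m = dna_yards * YARD_IN_METERS
    wound_m = packaged_length_m(dna_yards, nucleosome_factor)

    print(f"unpackaged DNA: {total_m:.3f} m")
    print(f"after nucleosome winding: {wound_m:.3f} m")
```

Even after the sevenfold contraction, roughly a third of a meter of material remains, which is why the higher-order folding mentioned above is needed to fit the DNA into a microscopic nucleus.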
Since histone proteins are crucial to maintaining the overall structure of DNA, one would expect then that a supply of histone proteins would be required during cell division. At this time, the cell is doubling its DNA content in order to impart a full set of chromosomes to each of the resulting progeny cells. In fact, our own studies of the histone H4 proteins show they are being manufactured at a maximal pace (as, presumably, are the other histone proteins), during the phase of the cell-division cycle where DNA is being replicated, the so-called synthesis, or S, phase. It therefore stands to reason that histone H4 gene transcription (that is, the manufacture of an RNA copy of histone H4 DNA) would also be maximally active during this phase. That is what we see.
We have also found that the gene’s regulatory region is somewhat complex. The gene’s “on” switch contains not just one, but four principal promoter regions. All of these are occupied by complexes of transcription factors during the S phase of the cell cycle, when transcription is maximal. The transcription factors occupy the same regions even when transcription continues at lowered levels during other phases of the cell cycle, but some of these factors become chemically modified in a way that also seems to lower, but not shut off, transcription.
However, when we look at the same gene at the close of the cell cycle, when the cell starts to differentiate, we see a very different situation. Three of the four promoter regions are completely unoccupied. Transcription of H4 is also completely halted.
At the close of proliferation, differentiation begins. To understand this stage of the cell’s life cycle, we have studied the regulation of the bone-restricted osteocalcin gene that is turned on when mature osteoblasts secrete the matrix upon which bone is formed. At this point, the differentiating cell receives signals in the form of steroid hormones and growth factors, which activate the expression of the set of genes coding for the products made and secreted by the mature cell.
Osteoblasts respond to these factors in part because promoter regions of genes in these bone cells are sensitive to them. The molecular signaling molecules can either bind directly to the promoters, combine with other molecules to bind directly, or cause some internal cellular changes that will stimulate some other molecule to bind to the promoter. We have been able to identify several regulatory regions of the osteocalcin gene and determine their sensitivity to external signals.
We have found that two promoter regions in particular must be activated in order for the osteocalcin gene to be transcribed. These are called the TATA box and the osteocalcin (OC) box, and they must be occupied by the appropriate transcription factors before the osteocalcin gene can be even minimally transcriptionally active. Additionally, several other regulatory segments, called enhancers, are present in the osteocalcin gene. If promoters are the on/off switch for the gene, then enhancers act like a dimmer switch, regulating the intensity of the response once the switch has been turned on. Binding to the enhancer segments of the osteocalcin gene causes transcription to proceed at a greater rate than does transcription factor-binding to the promoter regions alone. Specifically, the osteocalcin enhancers are responsive to vitamin D, which is a steroid hormone. Other regions of the osteocalcin gene bind factors that suppress transcription. The presence of enhancers, suppressors and promoters on the gene provides opportunities for the expression of the gene in bone under diverse biological circumstances. Interactions of various transcription factors with the regulatory elements also provide subtle changes in the strength of the response (stimulating greater or lesser transcriptional activity) as the situation requires.
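The switch-and-dimmer picture can be written down as a toy model. The sketch below is a deliberately simplified illustration, not a description of the measured behavior of the gene. It assumes that both promoter boxes must be occupied for any transcription, that a bound enhancer multiplies the rate, and that a bound suppressor reduces it. The class name, field names and numerical factors are all hypothetical and were chosen only to show the direction of each effect.

```python
from dataclasses import dataclass

# Hypothetical multipliers: illustrative values, not measurements.
BASAL_RATE = 1.0         # relative rate with promoters occupied and nothing else bound
ENHANCER_FACTOR = 4.0    # "dimmer turned up" by the vitamin D-responsive enhancer
SUPPRESSOR_FACTOR = 0.25 # "dimmer turned down" by a bound suppressor


@dataclass(frozen=True)
class OsteocalcinGeneState:
    """Which regulatory regions currently have their factors bound."""
    tata_box_occupied: bool
    oc_box_occupied: bool
    vitamin_d_enhancer_bound: bool = False
    suppressor_bound: bool = False


def relative_transcription(state: OsteocalcinGeneState) -> float:
    """Relative transcription rate under the toy switch-and-dimmer rules."""
    # The on/off switch: both promoter boxes must be occupied.
    if not (state.tata_box_occupied and state.oc_box_occupied):
        return 0.0
    rate = BASAL_RATE
    # The dimmer: enhancers and suppressors scale an already active gene.
    if state.vitamin_d_enhancer_bound:
        rate *= ENHANCER_FACTOR
    if state.suppressor_bound:
        rate *= SUPPRESSOR_FACTOR
    return rate


if __name__ == "__main__":
    off = OsteocalcinGeneState(tata_box_occupied=True, oc_box_occupied=False,
                               vitamin_d_enhancer_bound=True)
    basal = OsteocalcinGeneState(tata_box_occupied=True, oc_box_occupied=True)
    boosted = OsteocalcinGeneState(tata_box_occupied=True, oc_box_occupied=True,
                                   vitamin_d_enhancer_bound=True)
    for name, state in [("off", off), ("basal", basal), ("boosted", boosted)]:
        print(name, relative_transcription(state))
```

In this sketch an enhancer has no effect unless the promoter switch is already on, which mirrors the requirement that the TATA and OC boxes be occupied before the gene is even minimally active.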
Much of molecular biology has concerned itself with studies such as ours that seek to describe the regulatory areas of genes and the transcription factors that bind to them. But as biologists answer these questions, they also have found other problems that need to be addressed. For example, promoters are grouped together on the gene in an area that lies ahead of the coding sequence, but they are often spaced apart. We have found that each promoter sequence must be occupied by the appropriate transcription factors before transcription can proceed, but it has been difficult to understand how the various regions of promoters interact, separated as they are from each other. Furthermore, enhancers can often be found anywhere in the gene (before or in the coding sequence), often at a great genetic distance from the promoter. So it has been difficult to understand how promoters and enhancers can work together to modulate the level of gene expression.
Equally difficult has been the problem of understanding how the many transcription factors can find the relatively small promoter sequences of the exact gene they activate in the morass of total nuclear DNA, especially when one considers that these transcription factors exist in fairly low abundance themselves.
These questions are particularly difficult to answer if one assumes, as molecular biologists have traditionally assumed, that all of these activities take place somewhat randomly in the cell nucleus.
More recently, however, cell and molecular biologists are coming to understand that more sense can be made of transcriptional regulation if we make more realistic assumptions about the conditions under which these events are happening. After all, even though we tend to draw genes as linear, they, like all of the DNA in the cell’s nucleus, actually are folded and twisted. We are now coming to appreciate that understanding the real-life configuration of DNA in the cell is crucial to understanding transcriptional regulation. For it is the way in which DNA is folded that makes it accessible or inaccessible to regulatory proteins such as transcription factors at any given time. Consistent with that, we have learned that the folding pattern of a gene varies with the particular phase of the cell’s life cycle.
In addition to understanding the role of the three-dimensional configuration of DNA in transcriptional activation, cell and molecular biologists are also learning more about the architecture of the cell’s nucleus. The pioneering studies of Sheldon Penman of the Massachusetts Institute of Technology, Donald Coffey of the Johns Hopkins University and their collaborators have shown that underlying the nucleus is a scaffold of anastomosing fibrous proteins called a nuclear matrix. The matrix gives the nucleus its integrity, yet we believe it is capable of being modified as the cellular shape and function change from proliferation through differentiation. The matrix helps to localize particular genes to particular locations within the nucleus. It imposes constraints on the folding patterns of DNA, and it may help to concentrate and localize various transcription factors.
It becomes much easier, then, to understand how all of the proper transcription factors can interact with their respective promoter sequences if they are all concentrated at various sites within the nuclear matrix, presumably in the same area where the gene of interest is found.
Biologists are just beginning to appreciate the significance of nuclear domains in the control of gene expression. However, it is already apparent that local nuclear environments generated by all levels of nuclear structure are intimately tied to developmental expression of cell growth and specialization. There appears to be a reciprocally functional relationship between nuclear structure and gene expression. Nuclear structure is a primary determinant of transcriptional control, and the expressed genes modulate the regulatory components of nuclear architecture.
The power of addressing gene expression within the three-dimensional context of nuclear structure would be difficult to overestimate. It is already understood that the mechanisms that sense, amplify, dampen and/or integrate regulatory signals involve structural as well as functional components of cellular membranes. Extending the structure-regulation paradigm to nuclear architecture expands the relationship between cell structure and gene expression.
In our research, we have started to investigate transcriptional regulation of our genes of interest in terms of their three-dimensional folding patterns and their relation to the nuclear matrix.
Expression in Three Dimensions
After we deciphered the various regulatory regions of the histone H4 promoter and the transcription factors that acted on them, we tried to determine the three-dimensional structure of the transcribed gene. Remember that the promoter region contained several seemingly independent regulatory regions separated from each other. Our studies helped us understand how signals at these apparently independent sites could be integrated.
We found that histone proteins play an important role in the three-dimensional configuration of the transcriptionally active gene. The DNA between regulatory regions was wrapped around histone proteins. This seems to shorten the space between regulatory regions and bring them into close proximity. We and others believe that bringing the regulatory regions together allows the various transcription factors bound to each region to interact with each other. Possibly this interaction further alters the shape of the gene and ultimately makes it more accessible to the RNA polymerase.
We have also found that the three-dimensional conformation of the DNA and the placement of histone proteins vary as a function of the cell cycle. These changes may enhance or restrict accessibility of the DNA to the transcription factors and modulate the extent to which DNA-bound factors are active.
In addition to the strategic placement of histone proteins in the spaces between regulatory regions, we found evidence in our laboratory that the nuclear matrix also plays a role in modulating the transcriptional activity of the histone H4 gene. We have identified a site in the regulatory region of the gene capable of binding proteins that attach to the nuclear matrix. We have also found that different DNA-binding proteins are present at the attachment site when the histone gene is on and off. Specifically, we have found the transcription factor known as YY1 to be present at the gene’s matrix attachment site when the histone gene is transcriptionally active. It is possible that YY1 helps to hold the gene to the matrix and so alter its conformation to make it accessible for transcription. Likewise, the absence of YY1 at that site in differentiated cells may mean that the gene is not attached to the matrix and therefore not available for transcription. The conformation of DNA and the involvement of the nuclear matrix may restrict the mobility of the promoter and impose physical constraints that reduce distances between the various promoter elements. This has not yet been determined experimentally, but transcriptional control unquestionably involves a complex series of spatial interactions that are responsive to a broad spectrum of biological signals.
Just as the histone gene undergoes conformational changes, and possibly changes in its relation with the nuclear matrix, so does the osteocalcin gene. Modifications in the overall conformation of the DNA and in the organization of nucleosomes parallel both the gene’s competency for transcription and the extent to which the osteocalcin gene is transcribed. We have found that when the osteocalcin gene is quiescent during cell proliferation, some of the regulatory regions are wrapped around histones, forming a nucleosome. Nucleosome placement in these regions, specifically in the region sensitive to vitamin D and in the so-called OC box, makes this important promoter inaccessible to transcription factors.
In contrast, when the osteocalcin gene becomes transcriptionally active at the end of proliferation, these regulatory regions no longer contain nucleosomes. The absence of nucleosomes means that the regulatory regions are free to bind with the appropriate transcription factors.
In addition to regulation through changes in DNA conformation, the osteocalcin gene may also depend on interactions with the nuclear matrix for differential transcriptional control. We believe we have evidence for three matrix-binding regions within the osteocalcin gene. We have tentatively proposed a model where the inactive gene is not attached to the matrix, whereas the transcriptionally active gene is.
We speculate that when the gene binds to the matrix, particular regulatory regions are brought in close proximity with each other in a way that supports transcription. Subtle changes in matrix interactions in the presence of vitamin D may account for the increased transcriptional activity of the gene caused by this vitamin. Our study of the osteocalcin gene therefore supports the notion that both the overall structure of the chromosome in the region of the gene and interactions with the nuclear matrix contribute to making the promoter accessible to the appropriate transcription factors, and to integrating activities at multiple regulatory regions on the gene.
The cell receives numerous physiological regulatory signals. It is becoming increasingly evident that the various structural and functional changes the cell undergoes through its life cycle are mediated by the integration and interaction of these signals. It is also clear that the cell’s ability to respond to these signals in an appropriate way depends on activities that are mediated through multiple regulatory elements of gene promoters. The current knowledge of promoter organization and the repertoire of transcription factors that mediates these activities provides a one-dimensional map of options for biological control. Molecular and cell biologists are beginning to appreciate the additional structural and functional dimensions provided by the folding patterns of DNA, nucleosome organization and subnuclear localization of both genes and transcription factors.
Particularly exciting is the increasing evidence for dynamic modification in nuclear structure that parallels developmental expression of genes. The extent to which nuclear structure regulates and is regulated by modifications in gene expression remains to be experimentally established.
In the future, molecular and cell biologists will need to define how genes and transcription factors are specifically localized within or by the nuclear matrix. We hope that resolving these and other questions will yield additional insight into the relationship between specific components of nuclear architecture and gene expression. We anticipate that nuclear architecture will be highly changeable and variable with the developmental and functional demands of the particular cell.
As knowledge of specific components becomes more sophisticated, scientists will want to know how the nuclear architecture is remodeled with changing cellular circumstances. Many of the answers will no doubt come from studies of cells simpler than mammalian cells, where the genetic content is lower and the genes are more extensively mapped out. However, we also advocate caution in extrapolating results from the cells of yeast and fruit flies to mammalian cells. The nuclear organization and proteins that package DNA are different in each group of cells. These variations may reflect the increased regulatory components in mammalian cells that support both structure and function.
The relevance of nuclear structure to diagnosis and treatment of disease should not be overlooked. Nuclear matrix proteins characterize prostate cancer, breast cancer, colon cancer and bone cancer cells. Additionally, nuclear matrix proteins provide the potential for targeting drugs and other therapeutic agents to specific regions of the cell nucleus.