Grant is a regular commenter here & occasionally I've twisted his arm & persuaded him to write a guest post for me. The following item is one of these - it was too good (& too long!) for the comment thread. Grant begins:
Alison recently put up an article about epigenetics. Since I was in a bit of a writing mood, and this topic was close enough to some of my own work, I wrote a bit of my own about epigenetics in reply. Alison invited me to put this in a guest blog. It is a bit long for a comment...! I wanted to add an aspect of epigenetics that interests me: specifying the use of genes through forming different chromatin loops depending on which parent the copy of the gene came from. I am going to introduce a few terms as I go along; I am writing with those in mind who already know a little about biology and might like to learn some of the new things we are learning about chromosomes.
Humans are diploid: we have two copies of each chromosome, one from each parent, except in males there is usually only one X and one Y chromosome (but two of all the others). Ignoring the “sex” chromosomes in males, having two of each chromosome also means that we have two copies of each gene. Each of the two genes making up a pair of corresponding genes, one from each parent, is called an allele. The two alleles of a gene make up the genotype of that person for that gene.
For most genes, when the gene is needed, both alleles are expressed and roughly the same amount of the RNA each allele encodes is made. But in some cases, evolution has selected that one of the two alleles should be switched off.
Alison described one example in her article: dosage compensation in females “corrects” for having twice the number of X chromosome genes as needed by switching one copy off. To recap what she was saying: when the “extra” copy of the genes on the “second” X chromosome in females is switched off, the choice of whether the copy from the father (paternal allele) or from the mother (maternal allele) is inactivated is random. Once that choice is made, it is inherited in each cell line. Because there are many cells, each making a separate random choice of which allele to switch off, most female mammals are mosaics, with a mixture of cells with active paternal X chromosome genes and cells with active maternal X chromosome genes. (I believe rodents and marsupials are exceptions to this rule.)
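For readers who like to think in code, here is a minimal sketch of how a random choice that is then inherited gives a mosaic. It is a toy model only: the function name `simulate_mosaic`, the number of founder cells and the number of divisions are all made up for illustration, and real tissues are of course far messier.

```python
import random

def simulate_mosaic(n_founders=8, divisions=3, seed=0):
    """Toy model: each founder cell picks an active X at random;
    every descendant simply inherits its parent's choice."""
    rng = random.Random(seed)
    # Each founder cell makes its own independent random choice.
    cells = [rng.choice(["paternal", "maternal"]) for _ in range(n_founders)]
    for _ in range(divisions):
        # Each cell divides in two; both daughters copy the parent's choice.
        cells = [active for active in cells for _ in range(2)]
    return cells

tissue = simulate_mosaic()
print(len(tissue), "cells,", tissue.count("paternal"), "with an active paternal X")
```

Each founder ends up as a patch of identical descendants, and the tissue as a whole is a patchwork of the two kinds of cell.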
X chromosome inactivation doesn’t “care” which parent the inactivated allele came from. In allele-specific gene expression, by contrast, the alleles are “imprinted” with an epigenetic parent-of-origin mark. This mark records which parent the allele came from, and it determines how each allele is used.
One region in our genomes that has been studied in detail is the region around the IGF2 (insulin-like growth factor 2) gene. Near the IGF2 gene is the H19 gene, which codes for a non-coding RNA, that is, an RNA that is not translated to a protein, but functions as an RNA. For each of these two neighbouring genes only one allele is used, and the allele used comes from a different parent in each case. Normally, only the paternal allele of IGF2 is used, and the maternal counterpart is “silent”; conversely, only the maternal allele of H19 is used, and the paternal H19 allele is silent.
CTCF, a protein I study, binds to regions near these genes called “imprinting control regions” or ICRs. Genomic imprinting is the “imprinting” of the origin of that region of a chromosome, specifying which parent it came from. Usually the imprinting is DNA methylation, which Alison introduced in her previous articles. CTCF binds DNA with the sequence CCCTC. If the DNA bases of the binding site are methylated, CTCF cannot bind the DNA. If the binding site has no methyl groups, CTCF can bind it. Thus, DNA methylation can control which binding sites CTCF is able to bind to.
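The rule in the paragraph above is simple enough to write down as a few lines of code. This is only a sketch: `BindingSite`, `ctcf_can_bind` and the two example sites are hypothetical names invented here, and the model ignores everything about a real binding site except whether it is methylated.

```python
from dataclasses import dataclass

@dataclass
class BindingSite:
    name: str          # hypothetical label, for illustration only
    methylated: bool   # True if the bases of the site carry methyl groups

def ctcf_can_bind(site: BindingSite) -> bool:
    # The rule from the text: methylation of the site blocks CTCF.
    return not site.methylated

sites = [
    BindingSite("site A", methylated=False),
    BindingSite("site B", methylated=True),
]
for site in sites:
    state = "can bind" if ctcf_can_bind(site) else "cannot bind"
    print(f"CTCF {state} {site.name}")
```

The DNA sequence is the same in both cases; only the epigenetic mark differs, and that alone decides whether the protein can bind.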
So epigenetic modification of the DNA can control what DNA sites a protein can bind. It turns out that this protein can form chromatin loops that control how genes are used.
When CTCF is bound to DNA, it has two properties: (1) it prevents the spreading of histone modifications that mark a gene as inactive (heterochromatin), and (2) it limits the ability of enhancers to encourage (or “enhance”) the expression of a gene via the promoter immediately before the gene (I’ll come back to this). This figure from a review article summarises this graphically.
These two properties are thought to be a result of pairs of DNA-bound CTCF joining together to form loops of chromatin. (Chromatin simply means ‘DNA wrapped around histone proteins’, like the DNA to the left of the words ‘Histone modification’ in the illustration in Alison’s article. Almost all the DNA in the nucleus is packaged into chromatin.) One CTCF molecule bound to one ICR can interact (join together) with another CTCF molecule bound to another ICR, so that the chromatin between the two CTCF molecules forms a loop. It’s a bit like putting a bit of Blu-Tack on a piece of string, and another bit of Blu-Tack somewhere else on the same piece of string, then pushing the two pieces of Blu-Tack together: the string in between the two bits of Blu-Tack will form a loop.
Research scientists have worked out what the “loop structure” of the DNA in the IGF2-H19 region looks like and it’s pretty complicated! It looks a bit as if we had stuck several pairs of Blu-Tack on our string (chromosome), and then joined the Blu-Tack pairs together to form a “hub”, with the ends of the loops near the middle and several loops coming out from it.
Enhancers are (small) regions of DNA that, when proteins that regulate gene expression (gene regulatory proteins) bind to them, “encourage” the promoter region immediately before a gene to express that gene. Enhancers usually seem able to “encourage” only promoters that are in the same chromatin loop. (There also seem to be exceptions to this rule.) Thus, which loops form can determine which genes enhancers can “enhance”. Furthermore, different enhancers can respond to different gene regulatory proteins, so different chromatin loops may make genes responsive to different regulatory “signals”.
Pulling all this together, the maternal and paternal copies of the IGF2-H19 region have different DNA methylation of the ICR regions, so that they present different binding sites for CTCF. This causes the maternal and paternal copies of the IGF2-H19 region to have different loop structures. In the maternal copy, the IGF2 gene is in an inactive loop and the H19 gene in an active one. The reverse holds for the paternal copy of the IGF2-H19 region, with the IGF2 gene in an active loop and the H19 gene in an inactive one. An older idea of the structure of the maternal copy of the IGF2-H19 region can be seen in this figure from Kurukuti et al., PNAS 103(28):10684-10689 (2006). (The more recent view is a little messier!)
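The chain of reasoning (methylation decides CTCF binding, CTCF binding decides the loops, the loops decide which gene an enhancer can reach) can be sketched as a toy program. Everything here is heavily simplified and the names `loops_for_copy` and `expressed_genes` are invented for illustration. It assumes a single ICR and a single enhancer, just two loops per copy, and that the ICR is methylated on the paternal copy and unmethylated on the maternal copy, as in the commonly described model of this region.

```python
def loops_for_copy(icr_methylated):
    """Return the toy 'loops' for one copy of the IGF2-H19 region.
    Each loop is a set of the elements it contains."""
    ctcf_bound = not icr_methylated  # methylation blocks CTCF binding
    if ctcf_bound:
        # CTCF-anchored loops keep the enhancer with H19, away from IGF2.
        return [{"IGF2"}, {"H19", "enhancer"}]
    # Without CTCF at the ICR the loops differ: the enhancer is with IGF2.
    return [{"IGF2", "enhancer"}, {"H19"}]

def expressed_genes(loops):
    # Toy rule: a gene is used only if it shares a loop with the enhancer.
    return sorted(
        element
        for loop in loops if "enhancer" in loop
        for element in loop if element != "enhancer"
    )

# Assumed ICR methylation state for each parental copy (see lead-in).
copies = {"maternal": False, "paternal": True}
for parent, methylated in copies.items():
    print(parent, "copy expresses", expressed_genes(loops_for_copy(methylated)))
```

Run as written, the maternal copy uses only H19 and the paternal copy only IGF2, matching the pattern described above, even though both copies carry the same genes.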
So... epigenetics, like the DNA methylation of CTCF binding sites, can control the “loop structure” of regions of a chromosome, affecting which allele is used and how, and in ways that can depend on which parent a gene came from. (Phew!)
(I’m simplifying parts of this: a lot more proteins and other types of special locations in the DNA are involved in structuring the chromatin/DNA into loops, and there is a lot more to how the genes are regulated, but this gives a starting point in a way that I hope shows that DNA in the nucleus is not just a linear sequence, but is organised in domains by forming different loops that affect how the genes in those loops are used.)
You can see a little of this in an indirect way in high-school science, or at least the science I did at high school. You might have an experiment where you look at the “polytene puffs” of <i>Drosophila</i> (or from other species) under a light microscope. Polytene chromosomes are essentially many copies of chromosomes side-by-side. Looked at under a light microscope with appropriate staining, they show stripes of light and dark regions. The dark regions (bands) are the condensed (tightly packed) chromatin of inactive genes (heterochromatin) and the light regions (interbands or “puffs”) are regions with active genes, which are being transcribed to make RNA. The many side-by-side chromosomes of polytene chromosomes let us see differences in chromosome structure that we would otherwise not be able to see using a light microscope. By staining polytene chromosomes for the <i>Drosophila</i> CTCF protein, some researchers have shown that the CTCF proteins are most often found at the boundary between bands and interbands, seemingly marking out the ends of the regions of active and inactive chromatin. This is consistent with the description of how it works that I gave earlier.
All of this, in turn, relates to responses to hormone signals and to development, but I had better stop somewhere. And get some sleep!
No science article should be without references... For those wanting more depth, there are many excellent review articles explaining epigenetics. Alison gives one in her earlier article. “Serious” readers should bear in mind that this field is moving so fast that reviews from even two years ago are somewhat dated. (High-throughput genomics methods are partially to blame; these methods allow scientists to scan the entire genome for particular features in the time it previously took to study one region of the genome.) Volume 128, issue 4 of Cell (pages 627-802, 23 February 2007) is devoted to epigenetics and has many excellent review articles. Most review issues tend to cover some aspects at the expense of others, and this issue of Cell does better than most at covering the complete range of issues (at that time, two years ago). A more up-to-date review issue is Current Opinion in Genetics & Development Volume 18, Issue 2, Pages 107-226 (April 2008). Unfortunately you’ll need to visit a university library to get these. The paper my first link points to is available free on-line: Wei et al., Chromatin domain boundaries: insulators and beyond, Cell Research (2005) 15, 292–300. It’s a little older, but covers the earlier findings well.