During transcription, a gene encoded in the DNA of an organism is read by RNA polymerase, and a strand of RNA is made based on the DNA template. This is much like the DNA replication process, with the difference that only one strand of the DNA is read to form a single strand of RNA, and the nucleotides are ribonucleotides. During replication, both strands of DNA are copied and the copies are also DNA, made up of deoxyribonucleotides.
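To make the difference concrete, here is a minimal Python sketch of the base-pairing rules behind both processes. The function names and the convention of writing the template strand 3' to 5' (so the product reads 5' to 3') are assumptions made for this illustration, and real polymerases are of course far more involved than a lookup table.

```python
# Pairing rules: transcription pairs template A with U, replication pairs it with T.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA made from a DNA template strand (ribonucleotides).

    Raises KeyError on any character other than A, C, G, or T.
    """
    return "".join(DNA_TO_RNA[base] for base in template.upper())


def replicate(template: str) -> str:
    """Return the complementary DNA strand (deoxyribonucleotides)."""
    return "".join(DNA_TO_DNA[base] for base in template.upper())


if __name__ == "__main__":
    template = "TACGGT"
    print("RNA copy:", transcribe(template))
    print("DNA copy:", replicate(template))
```

The only difference between the two tables is that RNA uses U where DNA would use T, which mirrors the ribonucleotide versus deoxyribonucleotide distinction described above.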
A DNA molecule has many genes on it, so how does the transcription machinery know where to start and where to stop? Encoded within the DNA sequence are signals that help the transcription machinery find the right spots. These signals sit in the regions called upstream (just in front of the gene) and downstream (just after the gene). The stretch of DNA immediately upstream of the gene is often referred to as the "promoter" of the gene.
The DNA "tells" the transcription machinery where to start, with a certain stretch of nucleotides, so that for example (and this actual sequence is totally made up for this purpose) a stretch of nucleotides "AGCTAGCCGACAT" means: this is where you should start making RNA. How can that work?
The sequence of DNA that serves as a signal is actually bound by a particular protein that can recognize and bind that sequence and sequences that are very similar to it, but not sequences that are different. The protein that binds this particular stretch of DNA is called a transcription factor, and it in turn binds other proteins that are also transcription factors, which in turn "recruit" RNA polymerase to the complex of DNA-stretch and proteins. Now the RNA polymerase knows where to start and so it does. Once it is a little further down the DNA strand, very busy making mRNA, another polymerase can start at the beginning and make a second strand of mRNA. In this way it is possible that many polymerases are making the same strand of mRNA at any one time.
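The idea of "binds that sequence and very similar ones, but not different ones" can be sketched as a sliding-window search that tolerates a few mismatches. This is only a toy model: real transcription factors recognize DNA through chemistry and shape, not by counting mismatches, and the function names and the `max_mismatches` threshold are invented for this example. The motif is the made-up signal sequence from above.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_binding_sites(dna: str, motif: str, max_mismatches: int = 1) -> list[int]:
    """Return start positions where `dna` matches `motif` closely enough."""
    hits = []
    for start in range(len(dna) - len(motif) + 1):
        window = dna[start:start + len(motif)]
        if count_mismatches(window, motif) <= max_mismatches:
            hits.append(start)
    return hits


if __name__ == "__main__":
    motif = "AGCTAGCCGACAT"          # the made-up signal sequence
    near_match = "AGCTAGCAGACAT"     # differs from the motif at one position
    dna = "GG" + motif + "CC" + near_match + "TT"
    print("exact only:   ", find_binding_sites(dna, motif, max_mismatches=0))
    print("one mismatch: ", find_binding_sites(dna, motif, max_mismatches=1))
```

Raising `max_mismatches` makes the pretend transcription factor less picky, which is a crude stand-in for the fact that binding strength falls off gradually as a sequence drifts away from the preferred one.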
An example of such a transcription factor is called TBP, which stands for TATA-binding protein. This is a protein which can recognize and bind a sequence of DNA called a TATA-box, or Goldberg-Hogness box. The TATA-box has the sequence "TATAAA" at its core and a bunch of more variable sequences around it. TBP, which is itself part of a larger complex called TFIID, finds and binds the TATA-box. Then a protein called TFIIA (transcription factor IIA) binds TBP, TFIIB joins in, and RNA polymerase can now bind these proteins and start transcription.
The stretch of DNA that is recognized by a transcription factor is called a cis-element, while the proteins that bind cis-elements are called trans-factors.
At the end of the gene there are again signals encoded in the DNA sequence that tell RNA polymerase it is time to stop, because the end of the gene has been reached. This signal is called the transcriptional terminator. In bacteria, there are two ways in which transcriptional terminators work. One is called Rho-dependent transcriptional termination, and depends on the presence of a particular protein called the Rho factor to stop transcription. The second is called intrinsic transcriptional termination. This is really cool, because the signal is a palindromic sequence at the end of the gene that, once it has been transcribed into RNA, can fold over and bind tightly to itself, forming a hairpin structure.
It looks something like this:
Credit for this picture: http://en.wikipedia.org/wiki/Image:Stem-loop.svg
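To see why a palindromic sequence can fold back on itself, here is a small Python sketch that looks for a stretch of RNA followed, after a short loop, by its own reverse complement. It is a simplified illustration rather than a real terminator predictor: it only allows exact A-U and G-C pairs, and the function names and the stem and loop size parameters are assumptions chosen for the example.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with `rna`, read 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_hairpins(rna: str, stem_len: int = 4,
                  min_loop: int = 3, max_loop: int = 8) -> list[tuple[int, int]]:
    """Return (stem_start, loop_length) pairs for candidate hairpins."""
    hits = []
    for start in range(len(rna) - stem_len + 1):
        left_arm = rna[start:start + stem_len]
        wanted_right_arm = reverse_complement(left_arm)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            if right_start + stem_len > len(rna):
                break  # the right arm would run off the end of the sequence
            if rna[right_start:right_start + stem_len] == wanted_right_arm:
                hits.append((start, loop))
    return hits


if __name__ == "__main__":
    # Made-up transcript: left arm, 4-base loop, right arm, then a run of U's.
    transcript = "AA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU"
    print(find_hairpins(transcript, stem_len=6))
```

The left and right arms are the two halves of the palindrome; when they pair up they form the stem of the hairpin, and the bases in between form the loop.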