Henry Levin heads the Section on Eukaryotic Transposable Elements, which analyzes LTR retrotransposons and the integration of their cDNA into the chromosomes of host cells. Recently, the laboratory demonstrated that Tf1 integration occurs primarily at promoters and that this mechanism is mediated by transcription factors. In the case of the fbp1 promoter, the group found that integration occurs at the binding site for the transcription factor Atf1p and that the factor is required for integration. This year, the group used ultra-high throughput sequencing to generate a saturating map of sites targeted by Tf1. Importantly, they found that integration occurred preferentially at promoters that are induced by environmental stress.
- Website: http://sete.nichd.nih.gov/bio_levin.htm
- Annual Report: http://annualreport.nichd.nih.gov/sete.html
Because diseases such as AIDS and leukemia are caused by retroviruses, there is an intense need to understand the mechanisms of retrovirus replication. One of our objectives is to understand how retroviral cDNAs are integrated into the genome of infected cells. Because of their similarities to retroviruses, long terminal repeat (LTR)-retrotransposons are important models for retrovirus replication. The retrotransposon under study in our laboratory is the Tf1 element of the fission yeast Schizosaccharomyces pombe. We are particularly interested in Tf1 because its integration exhibits a strong preference for pol II promoters. This choice of target sites is similar to the integration preferences that human immunodeficiency virus 1 (HIV-1) and murine leukemia virus (MLV) have for pol II transcription units. Currently, it is not clear how these viruses recognize their target sites and perform integration. We therefore study the integration of Tf1 as a model system with which we hope to uncover mechanisms general to the selection of integration sites. An understanding of the mechanisms responsible for targeted integration could lead to new approaches for antiviral therapies.
A specific goal of our research is to identify the mechanism that directs integration to regions containing pol II promoters. To study insertion patterns in specific genes, a target plasmid assay was developed. Integration into the promoter of fbp1 clustered within 10 bp of a transcription enhancer called upstream activating sequence 1 (UAS1). Integration into the promoter of fbp1 depended on the UAS1 sequence and on Atf1p, a transcription activator that binds UAS1. To identify the key determinants responsible for targeting integration in the fbp1 promoter, we conducted an extensive study of the promoter sequences. We found that two discrete target windows close to UAS1 were the only sequences in the promoter required for the pattern of integration. These two target windows functioned independently of each other, and each one was sufficient to function as an efficient target of integration. Although Atf1p is necessary for directing integration to UAS1, it may be that by activating transcription, Atf1p induces subsequent steps of transcription that are more directly responsible for directing integration. If the role of Atf1p in integration were indirect, other factors that promote fbp1 transcription would also influence integration at this promoter. However, the other known factors that mediate fbp1 transcription, Pcr1p, Rst2p, and Tup11p/Tup12p, were found not to contribute to integration. UAS2 is an independent enhancer in the promoter of fbp1 and was not a target of integration. Nevertheless, we found that UAS2 did promote efficient transcription of fbp1. In addition, we found that a synthetic promoter induced by lexA fused to an activator, VP16, was not a target of Tf1 integration. These data indicate that the transcription activity of a promoter is not sufficient to mediate integration. Instead, the data indicate that Atf1p plays a direct and specific role in targeting integration to UAS1 of the fbp1 promoter.
The role of Atf1p in integration may be to bind to and recruit integrase to UAS1. To test integrase for direct interactions with Atf1p, pull-down experiments were conducted. Various domains of integrase and Atf1p were fused to epitope tags and the recombinant proteins were purified from bacteria. These experiments demonstrated that the catalytic core of integrase interacted with the b-ZIP domain of Atf1p. While these in vitro results with recombinant proteins indicated integrase and Atf1p are capable of direct interaction, the experiment did not address whether the interactions can occur within the cell. We therefore used the yeast two-hybrid assay and tested the domains of integrase and Atf1p for interactions. The two-hybrid assays detected the same interaction identified with the recombinant proteins, namely the binding of the b-ZIP domain of Atf1p to the catalytic core of integrase. These results suggested that integration was directed to the promoter of fbp1 by the binding of integrase to Atf1p anchored at UAS1. Working with recombinant proteins and DNA we identified a three-component complex. Gel retardation assays detected a complex that contained integrase, the b-ZIP domain, and a 100 bp DNA from fbp1 that included UAS1. We also conducted experiments to test whether this complex was capable of directing integration. Integration products were detected within the 100 bp DNA that corresponded to the same positions of insertion that are selected in vivo. These data demonstrated that integration targeted to specific sites in the promoter of fbp1 was reconstituted with purified integrase, the b-ZIP domain of Atf1p, and a 100 bp DNA.
Ultra high throughput sequencing of transposon integration provides a saturated profile of target activity in Schizosaccharomyces pombe
The result that integration in the genome of S. pombe is directed to the promoters of genes raises several key questions about the biology of Tf1 integration. Are all promoters recognized equally, or is integration directed to specific sets of promoters? If specific sets of promoters are preferred targets, what distinguishes the preferred promoters from those not recognized by Tf1? To address these questions, large numbers of integrations throughout the genome of S. pombe were sequenced. New methods for ultra high throughput sequencing made it possible to characterize extraordinarily large numbers of integration events.
Cells were induced for the expression of Tf1 containing neo (Tf1-neo) to select for the cells with integration events. We applied ligation mediated PCR to generate libraries of Tf1-neo associated with the downstream flanking DNA. In this study, we performed four independent transposition experiments (Hap_Mse_1, Hap_Mse_2, Dip_Mse, and Dip_Hpy), which were named according to the strains (haploid or diploid) and restriction enzymes (Mse I or Hpy CH4 IV) used to digest the genomic DNA from the cells with integration events. The cut libraries of DNA were ligated to linkers and subjected to barcoded PCR. The amplified products, consisting of the downstream LTRs and their flanking DNA, were size selected and submitted to 454 Life Sciences for sequencing.
Altogether, we obtained 599,760 high quality sequence reads that were then analyzed with BLAST to determine the chromosomal location of the insertions. In all, there were 73,125 independent Tf1 integration events in unique positions of the S. pombe genome. The BLAST results of sequences from our first integration library identified 21,848 independent insertions in the experiment termed Hap_Mse_2. The insertions were broadly distributed across all three chromosomes. To examine the insertion data for preferences, all 21,848 insertion sites from the Hap_Mse_2 experiment were mapped relative to ORFs, and the distance from each insertion to the closest ORF was determined. The integration from Hap_Mse_2 showed a clear preference for the first 500 nt upstream of ORFs.
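The mapping of insertions relative to ORFs can be illustrated with a minimal sketch. It assumes that each insertion has already been reduced to a single coordinate on one chromosome and that each ORF is described by its start, end, and strand. The `Orf` type, the function names, and the sign convention (negative for upstream, positive for downstream) are illustrative assumptions, not the laboratory's actual pipeline.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    """Hypothetical ORF record; coordinates are on a single chromosome."""
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def relative_position(pos, orf):
    """Signed distance from an insertion to an ORF.

    Negative: upstream of the ORF (before the start codon).
    Zero: inside the ORF.
    Positive: downstream of the ORF (after the stop codon).
    """
    if orf.start <= pos <= orf.end:
        return 0
    if pos < orf.start:
        distance = orf.start - pos
        return -distance if orf.strand == "+" else distance
    distance = pos - orf.end
    return distance if orf.strand == "+" else -distance


def closest_orf(pos, orfs):
    """Return (orf, signed_distance) for the ORF nearest to the insertion."""
    pairs = ((orf, relative_position(pos, orf)) for orf in orfs)
    return min(pairs, key=lambda pair: abs(pair[1]))


def fraction_within_upstream(insertions, orfs, window=500):
    """Fraction of insertions that fall within `window` nt upstream of their closest ORF."""
    hits = sum(1 for pos in insertions if -window <= closest_orf(pos, orfs)[1] < 0)
    return hits / len(insertions)
```

A linear scan over all ORFs is used here for readability. With tens of thousands of insertions, a sorted coordinate index would be a more practical choice.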
The profile of integration across the genome revealed substantial variation, with some intervals containing 35 to 40 insertions per kb while many others had zero to five insertions per kb. An analysis of integration density for intervals of 10 kb also showed high levels of bias that were incompatible with random selection. The key question about this variation in integration is whether it was due to intrinsic differences in integration efficiency between different sequences in the genome or whether the size of our cultures and the PCR amplification limited our ability to sample the integration potential of each sequence. To distinguish between these two possibilities, we tested whether the levels of integration in individual intergenic sequences were reproducible between multiple independent experiments. We compared the numbers of integration events in the intergenic regions of the Hap_Mse_2 experiment to the numbers of integration events from the Dip_Mse experiment. Each intergenic region was plotted using the number of integration events identified in the Hap_Mse_2 experiment as the X coordinate and the number of inserts recorded in the Dip_Mse experiment as the Y coordinate. Because each of the 5,045 intergenic regions was plotted, and many intergenic regions had the same X,Y coordinates, we used the Z coordinate to indicate the number of intergenic regions that had the same X,Y coordinates. The planar distribution of the data points shows that the amount of integration in each intergenic region is similar between the two independent experiments. The R value for the data is 0.95 (R² = 0.91), indicating a strong correlation of the integration levels between the two experiments. This comparison was performed between all pairs of the four experiments, and the plots showed similar correlations.
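The comparison between two experiments can be sketched as follows, assuming the R value reported above is a Pearson correlation coefficient (the text does not name the statistic) and that the per-region insertion counts from the two experiments are held in two lists of equal length, in the same region order. The function names are illustrative.

```python
from collections import Counter
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired per-region insertion counts."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


def point_multiplicity(xs, ys):
    """Count how many regions share each (X, Y) pair; this count is the Z coordinate."""
    return Counter(zip(xs, ys))
```

Squaring the value returned by `pearson_r` gives the corresponding R² for the pair of experiments.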
In the Hap_Mse_2 experiment, 76% of all the insertion events occurred in just 20% of the intergenic sequences. This strong bias is a consequence of the integration preference for a specific set of promoters. One possibility was that Tf1 integrated into the promoters with the highest transcription activity. We tested this hypothesis but found no correlation between transcription and integration. In another effort to determine what distinguishes promoters that had high levels of insertions from the promoters that did not, we asked whether the genes associated with the targeted promoters contributed to specific classes of biological function. The results of the gene ontology analysis suggested that genes regulated by environmental stress were among the strongest targets of integration. To examine this further we sorted all the intergenic sequences from highest number of insertions to the lowest using the Hap_Mse_2 data. Using this order, the intergenic regions were placed into bins of 500 each. We then used published microarray data to tabulate how many of the intergenic regions in each bin contained promoters that are induced at least three-fold by conditions of stress. The bin containing the 500 intergenic regions with the most integration contained the highest number of genes induced by cadmium. The bins with successively lower amounts of integration contained fewer promoters that are induced by cadmium. This relationship indicates that integration has a preference for promoters that are induced by cadmium. Similar preferences were observed for genes induced when cells are treated with hydrogen peroxide or by heat. Particularly strong preferences for integration into promoters induced by MMS or sorbitol were observed for the first bin of 500 intergenic regions. The targeting of Tf1 to stress induced promoters represents a unique response that may function to specifically alter expression levels of stress response genes. 
Although there are no systematic data, integration of Tf1 into the promoters of ade6 and bub1 does stimulate transcription.
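The two calculations behind the stress-promoter analysis, the share of insertions captured by the most targeted intergenic regions and the count of stress-induced promoters per ranked bin, can be sketched as below. The sketch assumes a dictionary that maps a region identifier to its insertion count and a set of region identifiers whose promoters are induced at least three-fold under a given stress. Both data structures and the function names are hypothetical stand-ins for the published microarray and integration data.

```python
def top_fraction_share(counts, fraction=0.2):
    """Share of all insertions found in the most heavily targeted `fraction` of regions."""
    ordered = sorted(counts, reverse=True)
    k = max(1, round(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


def induced_per_bin(counts_by_region, induced_regions, bin_size=500):
    """Rank regions from most to fewest insertions, split the ranking into bins,
    and count the stress-induced regions in each bin."""
    ranked = sorted(counts_by_region, key=counts_by_region.get, reverse=True)
    bins = [ranked[i:i + bin_size] for i in range(0, len(ranked), bin_size)]
    return [sum(1 for region in b if region in induced_regions) for b in bins]
```

A preference of the kind described above would appear as counts that decline from the first bin to the last.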
The size and number of the integration experiments reported here resulted in reproducible measures of integration for each intergenic region and ORF in the S. pombe genome. The reproducibility of the integration activity of each intergenic and ORF sequence from experiment to experiment demonstrates that we have saturated the full set of insertion sites that are actively targeted by Tf1. To our knowledge, this is the first time such a profile of integration data has been assembled.
Integration profiling: A genome-wide method of measuring gene function.
With the introduction of new deep sequencing technology, it is now possible to sequence many millions of transposon insertions in a single experiment. We tested whether Illumina sequencing could be used to generate a dense profile of transposon insertions that would reveal which genes are required for cell growth. For this experiment we used a haploid strain of S. pombe and Hermes, a DNA transposon from the housefly. In previous work we found that the Hermes transposon was highly active in S. pombe and that the insertions did not discriminate against ORFs. We predicted that in actively growing cultures, Hermes insertions would not be tolerated in essential ORFs. This year we induced Hermes transposition in a large culture of S. pombe that was grown for 80 generations. With ligation mediated PCR and Illumina sequencing we were able to sequence 360,513 independent insertion events. On average, this represented one insertion for every 29 bp of the S. pombe genome. An analysis of integration density revealed that the ORFs largely separated into two classes, one with high numbers of insertions and another with much lower numbers. In collaboration with a group that deleted each of the genes of S. pombe, we found that the ORFs with low numbers of Hermes insertions corresponded to the essential genes. The ORFs with higher integration densities were in genes classified as nonessential. These results validated transposon profiling as a new method for identifying genes with essential function. Importantly, by applying specific conditions of selection during growth, this method can be adapted to identify genes that contribute to a wide variety of functions.
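The density comparison described above can be illustrated with a short sketch. It assumes that each ORF has an insertion count and a length in bp, and it separates the two classes with a single cutoff. The cutoff value, the labels, and the function names are hypothetical; the text does not state what statistic or threshold was used to separate the classes.

```python
def insertions_per_kb(n_insertions, length_bp):
    """Insertion density of one ORF, normalized to 1 kb of sequence."""
    return 1000 * n_insertions / length_bp


def classify_orfs(orf_stats, threshold_per_kb):
    """Label each ORF by comparing its density with a cutoff.

    orf_stats maps an ORF name to (n_insertions, length_bp).
    threshold_per_kb is a hypothetical cutoff chosen by the analyst.
    """
    labels = {}
    for name, (n_insertions, length_bp) in orf_stats.items():
        density = insertions_per_kb(n_insertions, length_bp)
        if density < threshold_per_kb:
            labels[name] = "candidate essential"
        else:
            labels[name] = "likely nonessential"
    return labels
```

For scale, one insertion per 29 bp corresponds to roughly 34 insertions per kb on average, so an ORF with only a few insertions per kb would stand out against that background.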
Esnault, C., Levin, H. L. (in press), The Long Terminal Repeat retrotransposons Tf1 and Tf2 of Schizosaccharomyces pombe, Mobile DNA III. Editor Nancy Craig, American Society of Microbiology, Washington, DC.
Singh, P., Bourque, G., Craig, N., Dubnau, J., Feschotte, C., Flasch, D., Gunderson, K., Malik, H., Moran, J., Peters, J., Slotkin, R., and Levin, H. L. (2015), Mobile genetic elements and genome evolution 2014, Mobile DNA.
Matreyek K. A., Wang W., Serrao E., Singh P., Levin H. L., Engelman A. (2014), Host and viral determinants for MxB restriction of HIV-1 infection, Retrovirology, 11:90.
Chatterjee, A. G., Esnault, C., Guo, Y., Hung, S., McQueen, P. G., Levin, H. L., (2014), Serial number tagging reveals a prominent sequence preference of retrotransposon integration, Nucleic Acids Res., 42:8449-60.
Guo Y., Park J.M., Cui B., Humes E., Gangadharan S., Hung S., Fitzgerald P.C., Hoe K.L., Grewal S.I., Craig N.L., Levin H.L., (2013), Integration profiling of gene function with dense maps of transposon integration, Genetics 195:599-609.
Feng, G., Leem, Y.E., and Levin, H. L., (2013), Transposon integration enhances expression of stress response genes, Nucleic Acids Res., 41:775-89.
Tanaka, A., Tanizawa, H., Sriswasdi, S., Iwasaki, O., Chatterjee, A., Speicher, D., Levin, H., Noguchi, E., and Noma, K. (2012) Epigenetic regulation of condensin-mediated genome organization during the cell cycle and upon DNA damage through histone H3 lysine 56 acetylation, Molecular Cell, Vol. 48:1-15.
Levin, H., and Moran, J., (2011) Transposons and their hosts: the dynamics of conflict, Nature Reviews Genetics 12(9):615-27.
Rhind N, Chen Z, Yassour M, Thompson DA, Haas BJ, Habib N, Wapinski I, Roy S, Lin MF, Heiman DI, Young SK, Furuya K, Guo Y, Pidoux A, Chen HM, Robbertse B, Goldberg JM, Aoki K, Bayne EH, Berlin AM, Desjardins CA, Dobbs E, Dukaj L, Fan L, FitzGerald MG, French C, Gujja S, Hansen K, Keifenheim D, Levin JZ, Mosher RA, Müller CA, Pfiffner J, Priest M, Russ C, Smialowska A, Swoboda P, Sykes SM, Vaughn M, Vengrova S, Yoder R, Zeng Q, Allshire R, Baulcombe D, Birren BW, Brown W, Ekwall K, Kellis M, Leatherwood J, Levin H, Margalit H, Martienssen R, Nieduszynski CA, Spatafora JW, Friedman N, Dalgaard JZ, Baumann P, Niki H, Regev A, Nusbaum C., (2011), Comparative functional genomics of the fission yeasts, Science, Vol. 332:930-6.
Majumdar, A., Chatterjee, A., Ripmaster, T., and Levin, H., (2011), The determinants that specify the integration pattern of retrotransposon Tf1 in the fbp1 promoter of Schizosaccharomyces pombe, Journal of Virology, Vol. 85:519-29 (Featured in the Spotlight section by the editors).
Chaconas G, Craig N, Curcio MJ, Deininger P, Feschotte C, Levin H, Rice PA, Voytas DF., Meeting report for mobile DNA 2010, Mob DNA. 2010 Aug 24;1(1):20.
Guo, Y. and Levin, H. (2010), High throughput sequencing of retrotransposon integration provides a saturated profile of target activity in Schizosaccharomyces pombe, Genome Research, 20:239-248.
Chatterjee, A. G., Leem, Y., Kelly, F., and Levin, H., 2009, The Chromodomain of Tf1 Integrase Promotes Binding to cDNA and Mediates Target Site Selection, Journal of Virology, 83:2675-2685.
Park, J., Evertts, A. and Levin, H. (2009), The Hermes transposon of Musca domestica and its use as a mutagen of Schizosaccharomyces pombe. Methods, 49:243-247.
Levin, H. (2008) Metaviruses, in Encyclopedia of Virology, Third Edition, Eds Mahy and Van Regenmortel, Elsevier Limited Oxford, UK, Vol. 3:301-311.
Cam, H., Noma, K., Ebina, H., Levin, H., and Grewal, S., (2008), Host genome surveillance for retrotransposons by transposon-derived proteins, Nature 451:431-436.
Gao, X., Hou, Y., Ebina, H., Levin, H. and Voytas, D. (2008), Chromodomains direct integration of retrotransposons to heterochromatin, Genome Research 18:359-369.
Ebina, H., Chatterjee, A., Judson, R., and Levin, H. (2008) The GP(Y/F) domain of Tf1 integrase multimerizes when present in a fragment, and substitutions in this domain reduce enzymatic activity of the full-length protein, Journal of Biological Chemistry 283:15965-74.
Leem, Y., Ripmaster, T., Kelly, F., Ebina, H., Heincelman, M., Zhang, K., Grewal, S., Hoffman, C., and Levin, H. (2008) Retrotransposon Tf1 is targeted to pol II promoters by transcription activators, Mol. Cell 30:98-107.
Evertts, A., Plymire, C., Craig, N., and Levin, H. (2007) The Hermes transposon of Musca domestica has robust activity in Schizosaccharomyces pombe that disrupts open reading frames, Genetics 177:2519-2523.
Ebina, H. and Levin, H. (2007) Stress Management: How cells take control of their transposons, Mol. Cell, 27: 180-181.
Atwood-Moore A, Yan, K, Judson, R., Levin H. (2006) The self primer of the long terminal repeat retrotransposon Tf1 is not removed during reverse transcription, Journal of Virology, 80:8267-8270.
Atwood-Moore A, Ejebe K, Levin H. (2005) Specific recognition and cleavage of the plus-strand primer by reverse transcriptase, Journal of Virology, 79:14863-14875.
Hizi, A., and Levin, H. (2005) The integrase of the LTR-retrotransposon Tf1 has a chromodomain that modulates integrase activities, Journal of Biological Chemistry, 280:39086-39094.
Kim, M., Claiborn, K., and Levin, H., (2005) The long terminal repeat-containing retrotransposon Tf1 possesses amino acids in Gag that regulate nuclear localization and particle formation. Journal of Virology, 79:9540-9555.
Kelly, F., and Levin, H. L., (2005) The evolution of retrotransposons in Schizosaccharomyces pombe, Cytogenetic and Genome Research, 110:566-574.