View a 508-compliant PDF of this issue here: NICHD_Connection_2016_02.pdf

Viruses are simple. Get in, replicate, get out. Human immunodeficiency virus type 1 (HIV-1), on the other hand, prefers a more complex approach. HIV-1 genetic material hitches a ride into the nucleus. There, it inserts into the host genome, where it can lurk quietly in the unsuspecting cell until it’s transcribed and sent off anew. But HIV-1 is picky: it prefers certain spots in the DNA. New research from the Levin laboratory at the NICHD shows just how discerning HIV-1 can be. According to their study, HIV-1 integrates preferentially into intron-dense genes by taking advantage of the cell’s own RNA splicing mechanisms.

The first work to map HIV-1 insertion occurred over a decade ago, with a few hundred sites mapped in 2002. When deep sequencing technology became available, researchers were quick to identify nearly 40,000 different HIV-1 integration sites in the human genome. As large as that number seemed, it was only a tiny fraction of the possible sites, and therefore an incomplete sampling. Fast forward six or seven years, and deep sequencing technology had become much more powerful. Dr. Henry Levin, of the NICHD Section on Eukaryotic Transposable Elements, saw this as an opportunity to generate the largest collection of HIV-1 integration data for humans to date.

With several ideas about how to approach the problem, a postdoctoral fellow in the lab, Dr. Parmit Kumar Singh, took the lead. Rather than using a single-tube approach to sequencing, Singh ran a large number of ligation reactions with an even larger number of polymerase chain reactions. His approach worked. Singh amassed nearly one million HIV-1 integration sites, replicated across nine integration libraries. “We can now say it goes into this gene but not that gene,” Levin said, referring to HIV-1 integration. About 4,000 genes contained most of the integration sites, with the top hits showing a high enrichment for cancer-associated genes.

The implications of HIV-1’s preference for cancer-associated genes are large, and not only for those infected with HIV-1. During the first successful gene therapy trials in the 1990s, researchers used vectors derived from gammaretroviruses to introduce DNA into patients with Severe Combined Immunodeficiency (SCID), a genetic disorder that leaves the body unable to fight even the smallest infection. While the therapy was successful, some of the treated individuals developed a fatal leukemia, likely due to the propensity of gammaretroviruses to activate proto-oncogenes. In an attempt to avoid this complication, researchers switched to lentivirus-based vectors.

Figure: Cartoon model for interactions that help direct HIV-1 to transcription units with a large number of introns. Abbreviations: Splicing Factors (SF), HIV-1 Integrase (IN), RNA Polymerase II (RNA Pol II).

But that was the problem: HIV-1 itself is a lentivirus. Singh’s work suggested that patients undergoing gene therapy with lentivirus vectors may need long-term monitoring for clonal proliferation of cells. Singh needed to understand HIV-1 integration at the gene level, and he became interested in the cellular factors that draw HIV-1 to specific genes. At a time when research questions in biomedical science are increasingly focused and narrow in scope, Singh’s study took on complex elements of HIV-1 biology, gene therapy, and the beauty of developmental biology.

In a gene-level analysis, Singh identified a striking pattern: HIV-1 integration heavily favored the five prime (5’) end of the gene. On a wild hunch, he hypothesized that introns may be involved. In an elegant set of experiments, Singh showed that a higher intron number led to a greater likelihood of HIV-1 integration into the gene, and that the effect was not due to gene length. It was his “AHA!” moment. In collaboration with Dr. Brian T. Luke of NCI, NIH, Singh determined that intron number was the strongest predictor of HIV-1 integration into a gene.

In search of a cellular factor that linked HIV-1 integration to intron number, Singh turned his attention to LEDGF, a chromatin-binding factor previously associated with HIV-1 integration. In animal studies, knocking out LEDGF caused mice to die, which of course made it terribly difficult to study. Nothing had been known about LEDGF’s endogenous role in the cell, but HIV-1 integration offered a surprisingly robust model for examining LEDGF function.

Using previously published HIV-1 integration data from mouse embryonic fibroblast cell lines with and without LEDGF, Singh showed that highly spliced genes had greater rates of HIV-1 integration, but only when LEDGF was present. The absence of LEDGF also lessened the trend toward integration at the 5’ end of the gene. In a final clincher, mass spectrometry analysis revealed that LEDGF interacted with a long list of known splicing factors. LEDGF was the link Singh needed. It provided a mechanism for HIV-1 to preferentially integrate into highly spliced genes.

The ability to predict how HIV-1 will act in the cell, or how a lentivirus vector will introduce life-saving DNA into the genome, relies on an understanding of the players involved. “The larger picture for me is LEDGF function,” Singh stressed. “This will open a new door.”


Singh et al. (2015). “LEDGF/p75 interacts with mRNA splicing factors and targets HIV-1 integration to highly spliced genes.” Genes & Development 29:2287–2297. (PMID: 26545813)