
2018 Vaccine Development Conference Session #10: 3D and Single-Cell Epigenome Technologies for Precision Immune Profiling [2018-06-27. Ansuman Satpathy, Peggy Hamburg. HVP/USC]



Organized in conjunction with the Human Vaccines Project, the 1st Annual Conference on the Future of Vaccine Development was a one-day event that took place at the USC Michelson Center for Convergent Bioscience on June 27, 2018.

By bringing together some of the world’s leading scientists in the fields of immunology, genomics, bioinformatics, and bioengineering, the Future of Vaccine Development annual conference aims to explore how the convergence of new technologies across disciplines is impacting the future of vaccine development. The conference will also honor the three inaugural winners of the Michelson Prizes for Human Immunology and Vaccine Research, both via their respective presentations and the remittance of their prizes during the Awards dinner ceremony following the conference itself.

About the Presenter: An instructor in pathology at Stanford University School of Medicine, Dr. Ansuman Satpathy is focused on combining disciplines of genomics and human immunology. His research will identify key gene regulatory mechanisms that trigger protective immunity following vaccination using novel epigenomic sequencing technologies applied directly to patient samples. The Prize will allow him to greatly accelerate his work, advancing both 3D and single-cell epigenetic technologies to human immunology and vaccine research.


  • Ansuman Satpathy, MD, PhD, Instructor, Department of Pathology. Postdoctoral Fellow, Department of Genetics, Stanford University School of Medicine.
  • Peggy Hamburg, MD, Foreign Secretary of the National Academy of Medicine (NAM), former Commissioner of the Food and Drug Administration (FDA), former Vice-President and Senior Scientist at the Nuclear Threat Initiative, former New York City Health Commissioner.

Peggy Hamburg: We are very excited to have one of our Michelson Award winners, Ansuman Satpathy, who is instructor in the Department of Pathology and postdoctoral fellow in the Department of Genetics at Stanford University, and his research focuses on developing new genome sequencing technologies to study immune system function and dysfunction in patients, and he has been working hard in this arena and we expect great things from you going forward, but he’s going to take just a little bit of time to talk to us about his work.

Ansuman Satpathy: So thanks very much and thanks to the foundation and the Human Vaccines Project. I’m really delighted to be here. So as Peggy said, my name is Ansu Satpathy. I’m an instructor in pathology and also a postdoctoral fellow in the Department of Genetics. I’m working with Howard Chang and Will Greenleaf at Stanford, and I’ll talk about something a little bit different: what we’ve been doing over the last few years to develop new genome technologies to study primary human immune cells, and I’ll tell you what I mean by that.

So what our group is generally interested in understanding is how regions of non-protein coding DNA in the genome impact the expression of disease-associated genes, so this is the 98 percent of the genome that we often don’t hear about and we often don’t study, and so when we talk about this non-coding genome, we’re primarily talking about two things. We’re talking about cis-regulatory elements which are typically thought of as enhancer regions that control gene expression and we’re talking about trans inputs, which are typically thought of as transcription factors that bind to these elements and bring them in close proximity with the genes that they regulate.

Okay, so why do we care about this? Other than the fact that it’s the majority of the genome, studies over the last ten years or so, for example genome-wide association studies, have shown that the majority of disease risk actually falls into this area, across many, many diseases. In this case, I’m showing an example from autoimmune diseases where about 90 or 95 percent of the risk actually falls into this non-coding space. Only 5 percent falls into the protein-coding exome, so clearly there’s something important to understand here and it’s important to measure that directly in patients and directly in the setting of disease.

Okay, so how do we do this? There are essentially two challenges here. The first is a linear challenge, which is, as you walk down the chromosome in the genome, where precisely are the bases that are open, active, and regulating genes, so where are the sites that are regulatory? The second challenge is a more complicated challenge. It’s a three-dimensional challenge, so even if you know where those enhancer regions are in the genome, they could be hundreds of kilobases or megabases away from the genes that they regulate, so how do you understand, in three-dimensional space, in the folding of the nuclear chromatin, which cis-regulatory element binds to which gene?

So the problem has been for immunologists that prior gold standard methods in the field have really required a large number of cells to do this, so 10, 100 million cells, even billions of cells to do this, so you really could never take these measurements in specific immune cell subtypes, certainly not from humans and certainly not from patient samples in a serial fashion, so you really never had access to this 98 percent of the genome that’s controlling genes.

So what our group has done in the past few years is develop two new technologies to address these two challenges. The first is a method called ATAC-seq, or Assay for Transposase-Accessible Chromatin with sequencing. This was a method that was developed by a former graduate student in the group, Jason Buenrostro, and what Jason figured out is that you could take this naturally occurring bacterial transposase called Tn5, whose function is just to cut and paste DNA across bacterial strains, so it mediates the hopping of transposable elements. What Jason did very cleverly is that he just incubated this enzyme with Illumina sequencing adapters. Then you take that complex and you just dump it onto cells or nuclei, and it naturally integrates these sequencing adapters into these open regions of chromatin, which are active regulatory elements.

Then you can amplify those regions by PCR and then really identify, with base pair resolution as you walk down the genome, where the regulatory sites in the genome are, so you can find them with really high precision. I think this has really been a revolution for immunologists because you can now do this in the primary cells that you care about, that we study every day. So initially, you could do this in 50,000 cells. As I’ll show you later, you can now do it in single cells.
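Illustrative sketch (not part of the talk): the idea of locating open regions from transposase insertion sites can be shown with a toy window count. The positions, window size, threshold, and the function name `call_open_windows` are all hypothetical, and real analyses rely on statistical peak callers rather than a fixed cutoff.

```python
from collections import Counter


def call_open_windows(insertions, window, min_count):
    """Return start coordinates of windows holding at least min_count insertions.

    insertions: iterable of insertion positions on one chromosome (toy data).
    window: window width in base pairs (hypothetical choice).
    """
    counts = Counter(pos // window for pos in insertions)
    return sorted(w * window for w, c in counts.items() if c >= min_count)


# Made-up insertion positions: two dense clusters and one isolated site.
toy_insertions = [105, 130, 160, 190, 1020, 2510, 2540, 2580]
open_windows = call_open_windows(toy_insertions, window=100, min_count=3)
print(open_windows)
```

The isolated insertion at position 1020 is dropped because its window does not reach the threshold, which is the crude analogue of separating signal from background.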

The problem, however, is that while it’s really good at finding these sites, it can’t link them to the genes that they regulate, so what I’ll tell you about in this first part of the talk is a new method that we’ve developed called H3K27 acetyl HiChIP. This method now allows you in primary cells to identify functional enhancer promoter connections genome-wide using a single assay.

I’ll just briefly walk you through the details of the method. Essentially what we do is, in situ, we crosslink the chromatin architecture, so we crosslink the 3D contacts and then physically ligate them together using a DNA ligation step, and then we pull down, using a ChIP protocol, just the regions that are decorated by this active chromatin mark, called H3K27 acetyl, and then we make a library of these hybrid fragments. So the fragments are made up of two pieces of DNA which could have been close together in the linear genome or they could be far apart, but they’re all acting together in the 3D nucleus. Then we sequence to infer the contacts between those sites.

Okay, so let me walk you through some of the primary data. So this is a HiChIP experiment done in K562 acute myeloid leukemia cells. This is a cell line. What we’re going to do is focus now on one locus, one position on the chromosome, and that’s the promoter for the gene MYC, a common oncogene, and so what I’m doing here is just setting our anchor point of analysis at that MYC promoter and then we’re just walking down the genome and walking down the chromosome and asking where do we see a HiChIP signal, so what regions are making contact with this promoter?

So you can see we get five sites, three that are relatively close by, within 200 kilobases or so, and then two that are actually quite far away, 2 megabases away from the promoter, and that actually hop over two other genes to contact MYC. So this is nice, but you get a lot of noise with these types of assays, so how do we actually know the signal that we’re getting is some type of functional enhancer interaction?
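Illustrative sketch (not part of the talk): the anchor-point view described above, where one promoter is fixed and the rest of the chromosome is scanned for contacts, can be written as a simple filter over binned read pairs. The bin size, coordinates, and the function name `anchored_profile` are hypothetical; the actual analysis pipeline is not described in the talk.

```python
from collections import Counter

BIN_SIZE = 10_000  # hypothetical bin width in base pairs


def anchored_profile(contacts, anchor_pos, bin_size=BIN_SIZE):
    """Count read pairs linking the anchor bin to every other bin.

    contacts: iterable of (pos_a, pos_b) pairs on one chromosome.
    Returns a Counter mapping partner-bin start coordinate -> pair count.
    Pairs that fall entirely inside the anchor bin are ignored.
    """
    anchor_bin = anchor_pos // bin_size
    profile = Counter()
    for a, b in contacts:
        bin_a, bin_b = a // bin_size, b // bin_size
        if bin_a == anchor_bin and bin_b != anchor_bin:
            profile[bin_b * bin_size] += 1
        elif bin_b == anchor_bin and bin_a != anchor_bin:
            profile[bin_a * bin_size] += 1
    return profile


# Made-up read pairs; the anchor sits at position 1,000,000.
toy_contacts = [
    (1_000_500, 1_150_200),
    (1_152_000, 1_003_000),
    (1_001_000, 3_000_100),
    (2_000_000, 2_500_000),  # does not involve the anchor
    (1_002_000, 1_004_000),  # both ends inside the anchor bin
]
profile = anchored_profile(toy_contacts, anchor_pos=1_000_000)
print(dict(profile))
```

Bins with high counts in `profile` are the candidate contact sites; deciding which of them are real rather than noise is the separate validation problem raised above.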

So what we did in this case and several other cases is to compare our data to high-resolution CRISPRi screens that were really coming out almost every day, at that time, and here I’m showing a comparison to a CRISPRi screen done by Eric Lander and Jesse Engreitz at the Broad Institute. So what Jesse and Eric did is they tiled 100,000 CRISPR guides along this exact same locus and then asked which of those CRISPR guides impacted MYC, so this is a beautiful and elegant experiment that took a lot of time to do, and they find five sites which perfectly overlap with the five sites that we find, so we can recapitulate this 100,000 CRISPR guide experiment using a single assay. So I’m showing you of course just one gene locus here. We get the same information for every gene genome-wide using a single assay.

Okay, so someone could say do you actually need to do this type of high-resolution precision three-dimensional analysis in every cell type? Immune cells, maybe they’re relatively similar and they have an invariant chromatin structure, so the enhancers, you could map them once and predict from that. So now what I’m showing you is the exact same MYC locus but now done in two different cell lines, a T-cell line, My-La cells, and a B-cell line, GM12878 cells, and what you can see is a dramatically different enhancer landscape in these relatively similar cells.

Of course they’re not similar to this audience, but they’re both lymphocytes, so compared to a skin cell or a heart cell they’re relatively similar. The landscapes differ even down to the direction in which the enhancers go, so T-cell enhancers go off to the right here, B-cell enhancers off to the left. We can then validate these results using our own CRISPRi experiments and show that if we knock down B-cell enhancers in B cells, you lose MYC expression, not so if you do it with the T-cell enhancers.

Okay, so can we now actually do these measurements in primary cells? This was the whole point of developing this assay, and so just to set up the challenge for you, the challenge is to improve the signal to noise of this type of assay by about three orders of magnitude, so from 50 million cells, which was the prior gold standard, to 50,000 cells. So what I’m showing you now is three HiChIP experiments and three starting cell pools, 25 million cells, 1 million cells, and 50,000 cells, and we use this as a benchmark for what’s getting in the realm of primary cells, and the plots are these 2D interaction maps, so this is the raw HiChIP data, and what you’re looking at is you’re walking down the chromosome and asking what part of the chromosome does each piece make contact with.

So the large diagonal is each piece of DNA interacting with itself. Anything off the diagonal, like this dot here, is two pieces of DNA that are far apart coming together to make a chromatin loop. So you can see we get nice signal with 25 million cells and then maintain that in 50,000 cells.
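Illustrative sketch (not part of the talk): the 2D interaction map described above is a symmetric matrix of binned read pairs, where the diagonal holds self-contacts and off-diagonal entries hold long-range contacts. The bin size, thresholds, and function names here are hypothetical choices for a toy example.

```python
def contact_matrix(contacts, n_bins, bin_size):
    """Build a symmetric n_bins x n_bins matrix of read-pair counts."""
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for a, b in contacts:
        i, j = a // bin_size, b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror so the map is symmetric
    return matrix


def off_diagonal_hits(matrix, min_count, min_separation):
    """List (bin_i, bin_j, count) for entries far from the diagonal."""
    hits = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + min_separation, n):
            if matrix[i][j] >= min_count:
                hits.append((i, j, matrix[i][j]))
    return hits


# Made-up read pairs on a 50 bp toy chromosome split into 10 bp bins.
toy_pairs = [(1, 3), (5, 8), (2, 41), (3, 44), (12, 15), (7, 45)]
matrix = contact_matrix(toy_pairs, n_bins=5, bin_size=10)
hits = off_diagonal_hits(matrix, min_count=2, min_separation=2)
print(hits)
```

The single hit between bin 0 and bin 4 plays the role of the off-diagonal dot on the slide: two distant pieces of DNA that are repeatedly captured together.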

Okay, so at that point, we moved to primary immune cells, and we moved to our favorite cell type, which is CD4 T-cells. We recruited three healthy volunteers and then, from a single blood draw, sorted out naïve T-cells, T-helper 17 cells, and T-regulatory cells, and then did HiChIP. And so what you can see is these really beautiful chromosome conformation maps, and what you’re looking at as you go left to right is essentially zooming into a piece of chromosome, okay? So you can see higher order chromosome structures on the left and then zoom in to chromatin domains and then individual interactions between a gene promoter and a distal enhancer.

So the first question we asked when we had this primary data is how do the interactions look? Do we get complex interactions or could we have just predicted it from the nearest enhancer goes to the nearest gene and so forth, but what we actually see is a lot of complex interactions. I’m showing you four examples of that, so overall, about 80 percent of the interactions that we see are not just simply nearest enhancer to nearest promoter. So we see examples of enhancer-promoter skipping, which means that a distal enhancer will skip over intervening genes to target the expression of a distal gene. We see several enhancers working together often to control gene expression. We see promoter-promoter interactions, so promoters acting as enhancers for other genes and we see enhancer-promoter switching, which is essentially dynamic enhancer usage for a single gene.
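Illustrative sketch (not part of the talk): the comparison against a "nearest enhancer goes to nearest gene" rule can be expressed as the fraction of observed enhancer-gene links whose target is not the closest promoter. Gene names, coordinates, and function names are placeholders, and the toy result is not meant to reproduce the figure quoted above.

```python
def nearest_gene(position, promoters):
    """Return the gene whose promoter lies closest to position."""
    return min(promoters, key=lambda gene: abs(promoters[gene] - position))


def fraction_non_nearest(links, promoters):
    """Fraction of (enhancer_pos, target_gene) links not explained by proximity."""
    non_nearest = sum(
        1 for enhancer_pos, gene in links
        if nearest_gene(enhancer_pos, promoters) != gene
    )
    return non_nearest / len(links)


# Placeholder promoter coordinates and enhancer-gene links.
toy_promoters = {"GENE_A": 100_000, "GENE_B": 250_000, "GENE_C": 900_000}
toy_links = [
    (120_000, "GENE_A"),  # nearest-gene link
    (130_000, "GENE_C"),  # skips over GENE_A and GENE_B
    (260_000, "GENE_A"),  # skips over GENE_B
    (880_000, "GENE_C"),  # nearest-gene link
]
fraction = fraction_non_nearest(toy_links, toy_promoters)
print(fraction)
```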

And I don’t have to tell this audience, as we’ve heard several times today, that the information that you get from doing it in a primary cell type or sorted cell subset is much more useful than what you can get from reference maps or books online. For the genes that were T-reg specific, we can see nice chromatin structure. You totally lose it if you do total T-cells, you lose the sensitivity, and especially if you do cell lines, so you really need to do the assay in the primary cells that you care about, in the human cells, to get the right information.

So I’ll just end this part by returning to the original question that I posed. Can we use this type of 3D chromatin structure to predict or potentially nominate target genes of intergenic SNPs? So we focused on autoimmune disease because we have T-cell maps and I’ll show you two specific examples first and then we’ll zoom out to more global analysis. So if you just focus on this bottom left plot here, what I’m doing is setting the analysis now instead of at a gene, at a SNP of interest, so this is a SNP that’s associated with Type 1 diabetes and rheumatoid arthritis.

It’s been validated several times in multiple studies, and that SNP is actually closest to the gene SMIM20, which, when I saw that, and I’m sure for most of you, doesn’t really ring any bells. It doesn’t really have any known immune cell function, but of course that was the nearest gene, so that’s what the studies have predicted as the target. You can see that we really don’t find any HiChIP signal at all at that gene, but instead we see really nice HiChIP signal at RBPJ, which is a Notch regulator known to function in T-cells, and STIM2, which is a T-cell activation gene.

Same thing if we look at a SNP associated with MS: it is closest to the gene RMI2 but actually has very nice interactions with SOCS1 and CLEC16A, which have also been shown to be involved in autoimmune disease.

Okay, so if we zoom out now and do this for about 21 autoimmune diseases where we have really nice GWAS data, we can expand the list of nominated target genes by about fourfold, and only about 50 percent of the nearest genes to those SNPs had any signal at all in our 3D chromatin conformation map. We can go a step further and ask how the risk SNP actually affects the chromatin loop. Does it strengthen the loop, making gene expression stronger, or break the loop? You can see we get dots on either side, meaning that it does both things.
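Illustrative sketch (not part of the talk): nominating target genes for a non-coding SNP from loop data amounts to finding the bins that are looped to the SNP's bin and listing the promoters inside them. Gene names, coordinates, bin size, and the function name `nominate_targets` are placeholders, not values from the study.

```python
def nominate_targets(snp_pos, loops, promoters, bin_size):
    """Return genes whose promoter bin is looped to the bin containing the SNP.

    loops: iterable of (bin_a, bin_b) index pairs called as contacts.
    promoters: dict mapping gene name -> promoter position.
    """
    snp_bin = snp_pos // bin_size
    partner_bins = set()
    for a, b in loops:
        if a == snp_bin:
            partner_bins.add(b)
        elif b == snp_bin:
            partner_bins.add(a)
    return sorted(
        gene for gene, pos in promoters.items()
        if pos // bin_size in partner_bins
    )


# Placeholder data: the SNP sits next to NEAREST_GENE but loops elsewhere.
toy_promoters = {
    "NEAREST_GENE": 1_020_000,
    "LINKED_GENE_1": 1_400_000,
    "LINKED_GENE_2": 600_000,
}
toy_loops = [(100, 140), (60, 100), (30, 40)]
targets = nominate_targets(1_005_000, toy_loops, toy_promoters, bin_size=10_000)
print(targets)
```

In this toy case the nearest gene receives no loop support, mirroring the situation described for the nearest genes that showed no signal in the map.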

So as we move forward now into thinking about how we can use this information to target genes and target cellular therapies, we’re taking both approaches, so we’re trying to strengthen enhancer-promoter contacts using CRISPR or break them. So on the left here now I’m showing you the same two SNPs associated with disease that we just talked about, but now what we’re doing is targeting that enhancer with a CRISPR interference complex containing dead Cas9 and a KRAB domain. We’re bringing a repression domain to that enhancer, and what I’m showing you is that we can now dial down the expression of the genes that we predict by HiChIP, so this validates the approach. This validates the data that I just showed you but also maybe nominates ways that we could use that information to target cell therapies.

And conversely, if we target CRISPR activation domains to these enhancers, so these are four enhancers for the T-cell gene CD69, you can now dial up the expression of the genes that you predict by HiChIP, at both the RNA and protein level.

Okay, so in the last few slides, I want to tell you sort of the next level of that. So it’s very nice to have this epigenomic information, to understand what the molecular switches in the noncoding genome are, but of course everybody here’s probably thinking it’s not enough to do it in 50,000 cells. You need to go down to single cells, and we know a lot of things about T-cells already, so why lose all that information just to gain epigenetic information? So we’d really like to be able to pair these pieces of information at single-cell resolution, so get the epigenomic information, and particularly for T-cells, get the TCR sequence, and then maybe even add additional layers: transcriptomes, cell surface proteins, and other things.

So in the same way that maybe Google Maps is useful because you have many layers of information. You know where the streets are, where the people are, where the stores are, you can use that to navigate the map. In the same way, epigenetic information is important. It tells you a lot about the chromatin state but we want to integrate that with all these other pieces of information.

So just in a few slides I’ll show you this, and this is all work in collaboration with Mark’s group and Naresha Saligrama, who is a really fantastic postdoc in his group. So what we’ve been able to do is exactly that, so take now this epigenetic profiling to single-cell resolution and then pair that, in this case, with TCR alpha and beta sequence from every single cell, and I’ll just briefly walk you through this. So essentially what we’ve done is miniaturized the whole protocol for ATAC-seq in a microfluidic device, so we’ve engineered the device to do specific chemical processes in sequence. The cells are individually captured and then go through a series of ATAC-seq steps by just serial addition of different chemicals, and then we do targeted amplification of the TCR alpha and beta locus RNA from each cell, and then pool everything together, amplify together, and then barcode the libraries and sequence, and then computationally integrate TCR and ATAC-seq profiles for every cell.

Okay, so here’s what the data looks like, so what you’re looking at here is chromatin accessibility in a piece of a chromosome. This contains the CD4 gene, and on top, you have reference data from the prior gold standard method, called DNase hypersensitivity sequencing, from ten million cells. The red line is ATAC-seq data from 50,000 cells and the blue line is now single-cell ATAC-seq data from about 200 cells aggregated, so you can see we can recapitulate the information in ten million cells now in 200 cells, but of course now we have that information for every single cell. The rows in this heat map are individual cells and the columns are individual enhancers and promoters, and so you can see things like enhancers that are very constant across T-cells and some that are variable. And then of course for every single cell, we also have TCR alpha and beta sequence.
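Illustrative sketch (not part of the talk): aggregating single cells into a pseudo-bulk track, and spotting constant versus variable elements, both come down to column sums over a cells-by-elements matrix. The matrix values and function names are made up for illustration.

```python
def aggregate(cells_by_elements):
    """Sum accessibility over cells for each element (pseudo-bulk profile)."""
    return [sum(column) for column in zip(*cells_by_elements)]


def accessible_fraction(cells_by_elements):
    """Fraction of cells in which each element is accessible."""
    n_cells = len(cells_by_elements)
    return [total / n_cells for total in aggregate(cells_by_elements)]


# Rows are cells, columns are regulatory elements; 1 means accessible.
toy_matrix = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]
pseudo_bulk = aggregate(toy_matrix)
fractions = accessible_fraction(toy_matrix)
print(pseudo_bulk, fractions)
```

An element open in every cell, like the first column here, corresponds to a constant enhancer, while elements open in only half of the cells are the variable ones.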

Okay, so what does this data look like? This is a reduced dimensionality plot, a tSNE plot, on primary T-cells now, sorted in the same way as in the HiChIP experiment. We sorted either naïve T-cells, bulk sorted memory CD4 T-cells, or T-helper 17 cells, and they largely cluster by how we sorted them, with some important differences. There are some naïve cells that actually cluster like memory cells and then there’s some variation in the T-memory population.

So how do we get to this plot? For every single cell, so here’s a naïve cell, a cell that was sorted as a naïve cell but actually clusters with memory or T-helper 17 cells, we can zoom in and look at the cis-regulatory elements, but we can also infer the activity of transcription factors by diving deep into the sequence of each enhancer. That’s because you can think of it as spray painting a wall, so if you’re spray painting a wall and you put your hand in front of it, when you’re done, there’s an impression of your hand.

In the same way, when you’re spray painting the genome with transposase in ATAC-seq, if there’s a transcription factor sitting down on the piece of DNA, it will leave an imprint there, and so we can use that imprint to then infer what transcription factor was sitting at a particular site, do that over and over and over for every peak, to get general activities of transcription factors genome-wide for every single cell. In these bottom plots now what we’ve done is paint the same tSNE plot, but now from blue to red with the activity of a number of transcription factors, so you see things that make sense, like ROR gamma t being very active in a lot of T-helper 17 cells but not all of them, and then you can find new transcription factors by doing this over about 3,000 different motifs.
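Illustrative sketch (not part of the talk): one simple way to turn per-peak motif annotations into a per-cell transcription factor score is to compare the share of a cell's reads that fall in motif-containing peaks against the share expected across all cells. This is a deliberately simplified deviation score, an assumption for illustration and not the statistical method used by the speaker; all data and names are made up.

```python
def motif_deviation(cells_by_peaks, peak_has_motif):
    """Per-cell observed minus expected fraction of reads in motif peaks.

    cells_by_peaks: rows are cells, columns are peaks, values are read counts.
    peak_has_motif: one boolean per peak for a single motif.
    Assumes every cell has at least one read.
    """
    total_reads = sum(sum(row) for row in cells_by_peaks)
    motif_reads = sum(
        count
        for row in cells_by_peaks
        for count, has_motif in zip(row, peak_has_motif)
        if has_motif
    )
    expected = motif_reads / total_reads
    scores = []
    for row in cells_by_peaks:
        depth = sum(row)
        in_motif = sum(c for c, has_motif in zip(row, peak_has_motif) if has_motif)
        scores.append(in_motif / depth - expected)
    return scores


# Three toy cells, four peaks; the motif occurs in the first two peaks.
toy_counts = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
toy_motif = [True, True, False, False]
scores = motif_deviation(toy_counts, toy_motif)
print(scores)
```

Positive scores mark cells where motif-containing peaks are more open than average, which is the kind of value that could be used to color each cell on a reduced dimensionality plot.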

Okay, so this is the last data slide, so I just want to show you one example of how we’ve been able to do this in a different setting. This is a patient who had metastatic basal cell carcinoma and was being treated with immunotherapy, which was working really, really well for the first six months. Something happened at six months and the patient stopped responding. So we thought why don’t we go dive deep into that person’s T-cells and see if we can figure out some chromatin state changes that may have happened that reflect that change in response.

So this is now a reduced dimensionality plot from about 6,000 cells. This is not single-cell RNA-seq. This is now single-cell ATAC-seq data, and so for the T-cell cluster, you see a large number of tumor cells and these are some stromal cells here, but you see a ton of T-cells, and the dark dots represent the biggest clone in that T-cell population. So now what we can do is zoom in to that clone and compare the clonal cells versus all the bystander cells, and you can look at the activity of transcription factors overall in the clonal cells versus bystander cells and then zoom in to individual enhancers to identify potentially dysfunction-associated enhancers, for example, next to CTLA4.
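Illustrative sketch (not part of the talk): once each cell carries both a TCR pair and an accessibility profile, finding the biggest clone and comparing it with bystander cells is a grouping step followed by a difference of means. Cell names, TCR labels, and signal values are placeholders, not patient data.

```python
from collections import Counter


def largest_clone(tcr_by_cell):
    """Return the most frequent (alpha, beta) pair and its cell count."""
    counts = Counter(tcr_by_cell.values())
    return counts.most_common(1)[0]


def mean_signal(cells, signal):
    """Average accessibility at one element over the given cells."""
    return sum(signal[cell] for cell in cells) / len(cells)


# Placeholder TCR labels per cell and accessibility at one toy enhancer.
toy_tcr = {
    "cell1": ("alphaA", "betaA"),
    "cell2": ("alphaA", "betaA"),
    "cell3": ("alphaB", "betaB"),
    "cell4": ("alphaA", "betaA"),
    "cell5": ("alphaC", "betaC"),
}
toy_enhancer = {"cell1": 1, "cell2": 1, "cell3": 0, "cell4": 0, "cell5": 0}

clonotype, clone_size = largest_clone(toy_tcr)
clonal = sorted(c for c, tcr in toy_tcr.items() if tcr == clonotype)
bystander = sorted(c for c, tcr in toy_tcr.items() if tcr != clonotype)
difference = mean_signal(clonal, toy_enhancer) - mean_signal(bystander, toy_enhancer)
print(clonotype, clone_size, difference)
```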

Okay, so what I told you today, I’m sorry that was very quick, so I apologize for that, but hopefully what I’ve been able to convince you of is that we’re really working hard to develop new technologies to study this non-coding space in primary immune cells. We really want to be able to take these measurements in human cells. So we’ve done it in T-cells because that’s what we’re most interested in, but of course this should be applicable to any cell type or disease process that you’re interested in, and one thing that we’ve learned is that it’s really important to do it in the primary cell type. You don’t get the same information if you use reference maps, certainly not from cell lines, not from ensemble populations, and for the specific sequences, you also don’t get the information from mice.

As we move forward, we’re thinking about, okay, we can read this information really well now but can we actually use that to engineer cells to do new functions, make them more durable, long-lasting, things like that.

So again, thanks to the Human Vaccines Project. It’s really great to meet you all and become part of this community. We weren’t working on vaccines before and I think this will now allow us to move into that area and we’d like to use all of these approaches to really study why vaccines work well in some patients and not in others, and hopefully the next time I see you all, I’ll have some preliminary results for that, but we’re really working on the whole gamut of epigenomics, 3D measurement tools, single cell epigenomics, pairing that data with orthogonal information and TCR, BCR, even CRISPR guide for high throughput perturbation screens, and we’re also working very hard on the computational and statistical analysis side.

So thanks very much. I’ll just thank a bunch of people here, in particular, Max Mumbach and Naresha, who were really the trainees that worked very closely with me on this. Thank you.


Peggy Hamburg: Well, thank you very much, very impressive and sophisticated work.

Transcript curation: Alison Deshong

Dr. Ansuman Satpathy, PhD, Stanford University, 2018 Michelson Prize Winner for Human Immunology and Vaccine Research.

Dr. Ansuman Satpathy explains the details of a slide excerpted from his presentation ‘3D and Single-Cell Epigenome Technologies for Precision Immune Profiling’ during the 1st Annual Conference on the Future of Vaccine Development. [2018-06-27] {#0291} (Credit: Marv Steindler / Steve Cohn Photography)

Dr. Mark M. Davis of the Department of Microbiology and Immunology at Stanford University poses with his mentee Dr. Ansuman Satpathy, 2018 Michelson Prize for Human Immunology and Vaccine Research winner, during the dinner award ceremony following the 1st Annual Conference on Vaccine Development held at the USC Michelson Center for Convergent Bioscience. [2018-06-27] {#0722} (Credit: Marv Steindler / Steve Cohn Photography)

Wayne Koff, PhD, President & CEO of the Human Vaccines Project, presents the Michelson Prize for Human Immunology and Vaccine Research to awardee Dr. Ansuman Satpathy, one of the three inaugural winners of the Michelson Prize. [2018-06-27] {#0681} (Credit: Marv Steindler / Steve Cohn Photography)

(Left to Right) The 3 winners of the 2018 Michelson Prize for Human Immunology and Vaccine Research: Dr. Patricia Therese Illing, Dr. Ansuman Satpathy and Dr. Laura Kate Mackay pose with Dr. Gary K. Michelson, Founder of the Michelson Medical Research Foundation, Wayne Koff PhD, Ian Gust AO of the Human Vaccines Project and Steve A. Kay, Director of the USC Michelson Center for Convergent Bioscience, for a group picture ahead of the 1st Annual Conference on the Future of Vaccine Development held at the USC Michelson Center for Convergent Bioscience. [2018-06-27] {#0005} (Credit: Marv Steindler / Steve Cohn Photography)

Ian Gust, AO (Human Vaccines Project), Dr. Ansuman Satpathy, Dr. Laura Kate Mackay (2018 Michelson Prize winners of the Human Immunology and Vaccine Research Prize), Dr. Gary K. Michelson (Michelson Medical Research Foundation), Wayne Koff, PhD (Human Vaccines Project), Steve A. Kay, PhD (USC Michelson Center for Convergent Bioscience) and Dr. Patricia Therese Illing (2018 Michelson Prize winner of the Human Immunology and Vaccine Research Prize) pose outside the USC Michelson Center for Convergent Bioscience ahead of the 1st Annual Conference on the ‘Future of Vaccine Development’. [2018-06-27] {#0011} (Credit: Marv Steindler / Steve Cohn Photography)
