MIC-Drop and Perturb-seq Explained: Revolutionizing In Vivo Functional Genomics Screening

Allison Howard, Feb 02, 2026

Abstract

This article provides a comprehensive guide to MIC-Drop and Perturb-seq, two transformative pooled screening technologies for in vivo functional genomics. It explores their foundational principles, from the encapsulation of CRISPR guides in MIC-Drop to single-cell transcriptomic readouts in Perturb-seq. We detail step-by-step methodological workflows for designing and executing in vivo screens, covering model system selection and viral delivery. Practical troubleshooting sections address common challenges like screen depth and off-target effects. Finally, the article offers a critical comparison of these techniques against each other and traditional methods, validating their power for uncovering gene function and genetic interactions in complex physiological contexts. This resource is essential for researchers and drug developers aiming to implement these cutting-edge approaches to accelerate target discovery and mechanistic understanding of disease.

What Are MIC-Drop and Perturb-seq? Core Principles of In Vivo Pooled Screening

This application note, framed within a thesis on in vivo functional genomics screening, compares two transformative technologies: MIC-Drop and Perturb-seq. Both integrate genetic perturbations with single-cell RNA sequencing (scRNA-seq) to decode genotype-phenotype relationships at scale. MIC-Drop specializes in high-multiplex in vivo screening via lipid-encapsulated guide RNA (gRNA) barcodes, while Perturb-seq, typically used in vitro and ex vivo, links gRNAs to cellular barcodes within a pooled viral library. This document provides detailed protocols and comparative analysis to guide researchers in selecting the appropriate methodology for their in vivo screening research.

Technology Comparison: Core Specifications

Table 1: Core Technology Comparison

Feature | MIC-Drop | Perturb-seq (Pooled CRISPR Screens with scRNA-seq)
Perturbation Format | Lipid-coated droplets containing gRNA-DNA barcodes. | Pooled lentiviral library with transcribed gRNA barcodes.
Delivery Method | Direct microinjection into model organisms (e.g., zebrafish embryo). | Viral transduction in vitro; in vivo requires specialized delivery (e.g., tail vein, transplantation).
Multiplexing Capacity | Very High (Theoretically millions of unique barcodes). | High (Limited by viral library diversity, typically 10^2-10^5).
Key Innovation | Separation of gRNA synthesis from delivery; scalable barcoding. | Direct capture of gRNA transcript alongside cellular transcriptome.
Typical Scale | 100s of perturbations in a single animal. | 10s-1000s of perturbations across a cell population.
Primary Screening Context | In vivo (whole organism, early development). | In vitro / Ex vivo (cell cultures, organoids).
Perturbation Readout | scRNA-seq detects DNA barcode from droplet. | scRNA-seq detects transcribed gRNA sequence.

Table 2: Quantitative Performance Metrics (Representative Data)

Metric | MIC-Drop | Perturb-seq
Perturbation Efficiency | ~70-80% (injected cells) | ~20-60% (varies by cell type & viral titer)
Multiplexing Demonstrated | >1000 gRNAs in a single zebrafish embryo | >200,000 cells profiled with 100+ gRNAs in a pooled culture
Cell Throughput | 10,000-50,000 cells per experiment | 100,000-1,000,000+ cells per experiment
Cost per Perturbed Cell | Higher (microinjection, droplet prep) | Lower (pooled viral production, bulk transduction)
Temporal Control | High (injection at precise developmental time). | Lower (depends on viral expression kinetics).

Detailed Protocols

Protocol 1: MIC-Drop for In Vivo Zebrafish Screening

Objective: To perform multiplexed CRISPR knockout screening in a living zebrafish embryo using MIC-Drop.

I. Materials & Reagent Preparation

  • MIC-Drop Library: DNA oligonucleotides encoding gRNA sequence + unique 20bp barcode, prepared by array synthesis.
  • Droplet Generation Oil & Reagents: (e.g., Bio-Rad Droplet Generation Oil for Probes).
  • Lipid Mix: DOPE, DOTAP, Cholesterol in chloroform.
  • Cas9 Protein: High-purity, nuclease-active S. pyogenes Cas9.
  • Microinjection System: Pneumatic picopump, micromanipulator, and pulled glass capillary needles.
  • Zebrafish Embryos: At the 1-4 cell stage.
  • 10X Genomics Chromium Controller & Single Cell 3' Reagent Kits.
  • Custom Primer for Barcode Amplification: Designed against the constant region flanking the unique DNA barcode.

II. Step-by-Step Methodology

  • Droplet Encapsulation:
    • Mix the MIC-Drop DNA library with Cas9 protein in an aqueous buffer.
    • Use a microfluidic droplet generator to emulsify the aqueous mix with the oil-surfactant blend, creating monodisperse droplets (~100 µm diameter). Each droplet encapsulates, on average, one DNA barcode and multiple Cas9 proteins.
    • Formulate the lipid mix in oil and fuse with the primary droplets to create a lipid monolayer shell, stabilizing the droplet for injection.
  • Microinjection:

    • Backload the MIC-Drop droplets into a glass microneedle.
    • Align the needle with the yolk or cell of a 1-4 cell stage zebrafish embryo.
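    • Inject ~1 nL into each embryo. A ~100 µm droplet has a volume of roughly 0.5 nL, so a 1 nL bolus holds only one or two droplets of that size; calibrate the injection volume against the measured droplet diameter rather than assuming a fixed droplet count.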
    • Incubate embryos at 28.5°C until desired developmental stage for analysis.
  • Single-Cell Dissociation & Library Prep:

    • Dissociate pooled, injected embryos into a single-cell suspension.
    • Load cells onto the 10X Chromium Controller to generate Gel Bead-In-Emulsions (GEMs).
    • Perform reverse transcription. The custom primer included in the master mix amplifies the DNA barcode from the MIC-Drop droplet, while the standard primers capture poly-adenylated cellular mRNA.
    • Process libraries following the standard 10X protocol, with an additional PCR step to enrich the barcode library.
  • Sequencing & Analysis:

    • Sequence on an Illumina platform. Read 1 captures the cell barcode and UMI, Read 2 captures the cDNA insert (or, in the barcode library, the MIC-Drop DNA barcode), and the i7/i5 index reads carry the sample indices.
    • Align cellular reads to the zebrafish genome and barcode reads to the library manifest.
    • Construct a cell x gene matrix and a cell x barcode matrix, merging them to assign perturbation identities to each cell's transcriptome.
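
The step of matching barcode reads to the library manifest can be sketched as a lookup that tolerates a single sequencing error. This is a minimal illustration, not the protocol's actual pipeline: the two-entry manifest, the guide names, and the one-mismatch tolerance are all assumptions made for the example.

```python
from typing import Dict, Optional


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def match_barcode(read: str, manifest: Dict[str, str],
                  max_mismatch: int = 1) -> Optional[str]:
    """Return the perturbation name for a barcode read.

    Exact matches win; otherwise the read is accepted only if exactly one
    manifest barcode lies within `max_mismatch` substitutions. Ambiguous or
    unmatched reads return None.
    """
    if read in manifest:
        return manifest[read]
    hits = [name for bc, name in manifest.items()
            if len(bc) == len(read) and hamming(bc, read) <= max_mismatch]
    return hits[0] if len(hits) == 1 else None


# Hypothetical manifest of 20 bp barcodes (names are illustrative only).
manifest = {
    "ACGTACGTACGTACGTACGT": "tbx5a_g1",
    "TTGACCATGGCATCAGGTCA": "gata1a_g1",
}

exact = match_barcode("TTGACCATGGCATCAGGTCA", manifest)
corrected = match_barcode("ACGTACGTACGTACGTACGA", manifest)  # one substitution
unmatched = match_barcode("G" * 20, manifest)
```

Rejecting ambiguous reads rather than guessing keeps the cell x barcode matrix conservative; a real library would also need barcodes designed with enough pairwise distance for this correction to be safe.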

Protocol 2: Pooled Perturb-seq in Cultured Cells

Objective: To perform a pooled CRISPRi Perturb-seq screen in human cell lines to identify transcriptional phenotypes.

I. Materials & Reagent Preparation

  • Lentiviral Perturb-seq Library: Pooled plasmids containing gRNA, MS2 stem-loops, and a constant U6 promoter.
  • Target Cells: HEK293T or K562 cells expressing dCas9-KRAB (for CRISPRi) or dCas9-VP64 (for CRISPRa).
  • Lentiviral Packaging Plasmids: psPAX2 and pMD2.G.
  • Polybrene: To enhance viral transduction.
  • Puromycin: For selection of transduced cells.
  • 10X Genomics Chromium Single Cell 3' Reagent Kit v3.1.
  • Custom PCR Primer for gRNA Capture.

II. Step-by-Step Methodology

  • Virus Production & Transduction:
    • Co-transfect the Perturb-seq library plasmids with psPAX2 and pMD2.G into HEK293T cells using PEI.
    • Harvest lentiviral supernatant at 48 and 72 hours post-transfection.
    • Transduce target cells at a low MOI (<0.3) to ensure most cells receive a single gRNA, in the presence of 8 µg/mL Polybrene.
    • At 48 hours post-transduction, begin puromycin selection (e.g., 1-2 µg/mL) for 3-5 days to enrich for transduced cells.
  • Cell Culture & Harvest:

    • Culture the selected cell pool for a sufficient time for transcriptional perturbations to stabilize (e.g., 7-10 days for CRISPRi/KRAB).
    • Harvest cells, ensure >90% viability, and resuspend at 700-1200 cells/µL in PBS + 0.04% BSA.
  • Single-Cell Library Preparation:

    • Load cells onto the 10X Chromium Controller. The Gel Bead contains a custom capture sequence complementary to the MS2 loops on the gRNA transcript.
    • During GEM-RT, both cellular mRNA and the gRNA transcript are reverse-transcribed.
    • Follow the standard 10X protocol. The gRNA-derived cDNA is amplified in a separate PCR reaction using custom primers.
  • Sequencing & Analysis:

    • Sequence the gene expression library and the gRNA library separately.
    • Align mRNA reads to the reference genome and gRNA reads to the library manifest.
    • Use cell barcodes to pair each cell's transcriptome with its assigned gRNA perturbation. Perform differential expression analysis between cells with different gRNAs.
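
The rationale for transducing at an MOI below 0.3 can be checked with a simple Poisson model of viral integration. The sketch below assumes integrations per cell are Poisson-distributed with mean equal to the MOI, which is a common approximation rather than something this protocol measures; the function names are illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def transduction_summary(moi: float) -> dict:
    """Expected fractions of uninfected, infected, and singly infected cells."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    infected = 1.0 - p0
    return {
        "uninfected": p0,
        "infected": infected,
        # Fraction of infected cells that carry exactly one integration.
        "single_among_infected": p1 / infected,
    }


summary = transduction_summary(0.3)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under this model an MOI of 0.3 leaves roughly three quarters of cells uninfected, which is why puromycin selection follows, and about 86% of the surviving cells carry a single gRNA.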

Visualized Workflows & Pathways

Figure: Comparative High-Level Workflow of MIC-Drop vs Perturb-seq

Figure: Structure of a Single MIC-Drop Particle

The Scientist's Toolkit: Key Research Reagent Solutions

Table 3: Essential Materials for Perturbation Screening

Item | Function | Example/Supplier
Array-Synthesized Oligo Library | Source of gRNA and barcode sequences for library construction. | Twist Bioscience, Agilent.
Lipid Components (DOPE, DOTAP) | Form stable monolayer around aqueous droplet for in vivo delivery (MIC-Drop). | Avanti Polar Lipids.
Microfluidic Droplet Generator | Creates monodisperse emulsions for MIC-Drop encapsulation. | Dolomite Bio, Bio-Rad (QX200).
High-Activity Cas9 Nuclease | Efficiently executes DNA cleavage upon gRNA delivery. | IDT Alt-R S.p. Cas9, Thermo Fisher TrueCut.
Lentiviral Packaging System | Produces high-titer, replication-incompetent virus for Perturb-seq. | psPAX2/pMD2.G plasmids (Addgene).
dCas9-KRAB/dCas9-VP64 Cell Line | Enables transcriptional repression/activation for Perturb-seq phenotype modulation. | Available from ATCC or generated via stable transduction.
10X Genomics Chromium Controller & Kits | Gold-standard platform for generating single-cell RNA-seq libraries. | 10X Genomics (Single Cell 3' Gene Expression).
Custom PCR Primer Cocktail | Specifically amplifies the gRNA or DNA barcode during library prep. | IDT, Thermo Fisher.
Microinjection System | Precisely delivers MIC-Drop droplets into model organisms. | Warner Instruments, Narishige.

Within the broader thesis investigating scalable in vivo functional genomics, this document details the application of MIC-Drop and Perturb-seq technologies. The core challenge is establishing a causal, high-resolution link between a targeted genetic perturbation and the resulting molecular and cellular phenotype within the complex tissue environment of a living, multicellular organism. This bridges the gap between pooled screening scalability and single-cell phenotypic resolution.

Application Notes

Integrating MIC-Drop for In Vivo Perturbation

MIC-Drop (Multiplexed Intermixed CRISPR Droplets) enables the delivery of multiple, uniquely barcoded genetic perturbations (e.g., CRISPR guide RNAs) into a single complex organism, such as a zebrafish or mouse embryo. Each perturbation is encapsulated within a unique droplet alongside a DNA barcode. This allows for the in vivo generation of a mosaic of genetically distinct cells, where the identity of the perturbation in any given cell is recorded.

Key Quantitative Data: Table 1: MIC-Drop Performance Metrics

Parameter | Typical Specification | Notes
Perturbation Library Size | 10² - 10⁴ constructs | Limited by droplet barcode diversity & delivery efficiency.
Delivery Efficiency (In Vivo) | 20-60% (cell transfection/transduction) | Highly organism & tissue dependent.
Co-perturbation Capability | 2-5 gRNAs per droplet | Enables combinatorial knockout studies.
Barcode Recovery Rate | >70% | From sorted cells for single-cell RNA-seq.

Coupling to Perturb-seq for Phenotypic Readout

Perturb-seq refers to the combination of genetic perturbations with single-cell RNA sequencing (scRNA-seq). Cells from the MIC-Drop-perturbed organism are dissociated, and their transcriptomes are captured alongside the gRNA barcodes. This generates a unified dataset where each cell's gene expression profile (phenotype) is linked to its genetic perturbation (genotype).

Key Quantitative Data: Table 2: Perturb-seq Output Specifications

Parameter | Typical Value/Range
Target Cell Recovery per Perturbation | 100-500 cells (for robust statistical power)
Median Genes Detected per Cell | 1,500 - 3,000 (10x Genomics platform)
Critical Min. Cells per gRNA | ~50 cells (for differential expression analysis)
Differential Expression Sensitivity | |log2FC| > 0.25, adjusted p-value < 0.05
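
The per-perturbation cell targets above translate directly into a total number of cells to profile. The following is a rough planning sketch; the `assigned_fraction` parameter (the share of profiled cells that end up with a confident single-gRNA assignment) is an assumption introduced for illustration, not a value taken from this note.

```python
import math


def cells_to_profile(n_perturbations: int,
                     cells_per_perturbation: int,
                     assigned_fraction: float) -> int:
    """Total cells to profile so each perturbation reaches its target coverage.

    assigned_fraction: hypothetical share of profiled cells that receive a
    confident single-gRNA assignment (0 < value <= 1).
    """
    if not 0 < assigned_fraction <= 1:
        raise ValueError("assigned_fraction must be in (0, 1]")
    needed = n_perturbations * cells_per_perturbation / assigned_fraction
    return math.ceil(needed)


# 200 perturbations, assuming half of profiled cells are assignable.
minimum = cells_to_profile(200, 50, 0.5)    # ~50 cells per gRNA (minimum)
target = cells_to_profile(200, 100, 0.5)    # 100 cells per perturbation
print(minimum, target)
```

Because the estimate assumes perturbations are evenly represented, an uneven library or tissue-specific delivery would push the real requirement higher.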

Detailed Protocols

Protocol 1: MIC-Drop Library Preparation & In Vivo Delivery (Zebrafish Model)

Aim: To encapsulate and deliver a barcoded CRISPR-Cas9 gRNA library into zebrafish embryos.

Materials: See "The Scientist's Toolkit" below.

Procedure:

  • Library Cloning: Clone your pooled gRNA library (e.g., targeting 200 key signaling genes) into the MIC-Drop vector backbone containing a unique molecular identifier (UMI) and a T7 promoter.
  • In Vitro Transcription (IVT): Linearize the plasmid library and perform IVT to generate gRNA library pools.
  • Droplet Generation: Use a microfluidic droplet generator to encapsulate individual gRNA molecules from the library, along with Cas9 protein and a barcoded primer bead, into picoliter-scale water-in-oil droplets. Each bead's barcode tags the co-encapsulated gRNA.
  • Emulsion PCR (ePCR): Perform ePCR within the droplets to amplify the barcoded gRNA construct.
  • Droplet Pooling & Breaking: Pool all droplets, break the emulsion, and purify the barcoded gRNA library.
  • Microinjection: Co-inject the purified barcoded gRNA library + Cas9 protein complex into the yolk of 1-cell stage zebrafish embryos.
  • Embryo Rearing: Raise injected embryos under standard conditions until the desired developmental stage (e.g., 3 dpf for early development screens).

Protocol 2: Single-Cell Dissociation & Perturb-seq Library Preparation

Aim: To recover perturbed cells, prepare barcoded scRNA-seq libraries, and link gRNA identities to cell transcriptomes.

Procedure:

  • Tissue Dissociation: At the chosen timepoint, pool 20-30 mosaic zebrafish embryos. Dissociate the whole embryo or micro-dissected tissue of interest into a single-cell suspension using enzymatic digestion (e.g., Liberase TM) in conjunction with gentle mechanical trituration.
  • Cell Viability & Concentration: Pass the suspension through a 40 µm cell strainer. Assess viability (>80% via Trypan Blue) and adjust concentration to ~1,000 cells/µL.
  • Single-Cell Partitioning & RT: Load cells onto the 10x Genomics Chromium Controller using the Single Cell 3' Reagent Kit v3.1. Within each Gel Bead-in-Emulsion (GEM), poly-adenylated mRNA from a single cell is reverse-transcribed. The reverse transcription primer on the Gel Bead contains the Cell Barcode and a Unique Molecular Identifier (UMI).
  • gRNA Capture Amplification: In parallel, within the same GEM, the poly-A-tailed gRNA transcript is also reverse-transcribed using the same bead-bound primer, linking the same Cell Barcode to the gRNA.
  • Library Construction & Sequencing: Following standard 10x Genomics protocol, generate two separate libraries: (a) the Gene Expression Library (from cDNA) and (b) the Feature (gRNA) Barcode Library (from the gRNA amplicon). Sequence the Gene Expression library deeply (~50,000 reads/cell) and the Feature Barcode library with sufficient depth (~5,000 reads/cell) to confidently assign gRNAs.
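
The per-cell depth targets in the sequencing step give a quick estimate of the total read budget. This is a small arithmetic sketch using the ~50,000 and ~5,000 reads/cell figures quoted above; the function name and the 10,000-cell example are illustrative.

```python
def reads_required(n_cells: int,
                   gex_reads_per_cell: int = 50_000,
                   feature_reads_per_cell: int = 5_000) -> dict:
    """Read budget for the gene expression and feature barcode libraries."""
    gex = n_cells * gex_reads_per_cell
    feature = n_cells * feature_reads_per_cell
    return {
        "gene_expression": gex,
        "feature_barcode": feature,
        "total": gex + feature,
    }


budget = reads_required(10_000)
print(f"{budget['total'] / 1e6:.0f} million reads in total")
```

For 10,000 recovered cells this comes to 500 million gene expression reads plus 50 million feature barcode reads, which helps decide how many samples can share a flow cell.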

Protocol 3: Computational Analysis for Genotype-Phenotype Linking

Aim: To process sequencing data and associate perturbations with transcriptional phenotypes.

Procedure:

  • Alignment & Quantification: Use Cell Ranger (10x Genomics) to align gene expression reads to the zebrafish reference genome (GRCz11) and count UMIs per gene per cell. Align feature barcode reads to the gRNA reference library.
  • Cell-Guide Association: Assign gRNAs to each cell based on the shared Cell Barcode. Filter cells for high-confidence assignments (≥10 UMI counts for a single gRNA, minimal secondary gRNA signal).
  • Data Integration & QC: Integrate the gene expression matrix and gRNA assignment table using a single-cell analysis toolkit (e.g., Scanpy in Python). Filter out low-quality cells (low gene counts, high mitochondrial read percentage).
  • Differential Expression & Pathway Analysis: For each targeted gene, perform differential expression analysis comparing cells containing its targeting gRNA vs. cells containing non-targeting control gRNAs. Use methods like MAST or Wilcoxon rank-sum test. Input significant differentially expressed genes into pathway analysis tools (e.g., GSEA, Enrichr).
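
The cell-guide association rule (at least 10 UMIs for one gRNA, minimal secondary signal) can be written as a small filter. This is a sketch under stated assumptions: the protocol does not quantify "minimal secondary gRNA signal", so the 20% cutoff used for `max_secondary_fraction` is a hypothetical placeholder to be tuned on real data.

```python
from typing import Dict, Optional


def assign_guide(umi_counts: Dict[str, int],
                 min_umis: int = 10,
                 max_secondary_fraction: float = 0.2) -> Optional[str]:
    """Return the confidently assigned gRNA for one cell, or None.

    umi_counts maps gRNA name -> UMI count observed in this cell.
    A cell is assigned only if its top gRNA has >= min_umis UMIs and the
    runner-up has at most max_secondary_fraction of the top count.
    """
    if not umi_counts:
        return None
    ranked = sorted(umi_counts.items(), key=lambda kv: kv[1], reverse=True)
    top_guide, top_count = ranked[0]
    second_count = ranked[1][1] if len(ranked) > 1 else 0
    if top_count < min_umis:
        return None
    if second_count > max_secondary_fraction * top_count:
        return None
    return top_guide


clean = assign_guide({"g1": 25, "g2": 2})       # dominant guide, weak second
low_depth = assign_guide({"g1": 8})             # below the UMI threshold
doublet_like = assign_guide({"g1": 20, "g2": 15})  # strong secondary signal
```

Cells returning `None` would be excluded before differential expression, since both low-depth and mixed-guide cells blur the genotype-phenotype link.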

Visualizations

Figure: MIC-Drop to Perturb-seq Integrated Workflow

Figure: Logical Chain from Perturbation to Phenotype

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials for MIC-Drop/Perturb-seq Screens

Item | Function & Critical Features
MIC-Drop Vector Backbone | Plasmid for gRNA cloning; contains UMI, T7 promoter, and poly-A tail for in vivo transcription and scRNA-seq capture.
Pooled gRNA Library | Defined or genome-scale set of target sequences. Must be cloned, amplified, and quality-controlled to maintain diversity.
Microfluidic Droplet Generator (e.g., Bio-Rad QX200) | Device for generating monodisperse water-in-oil emulsions for barcoding.
Cas9 Protein, NLS-tagged | High-activity, purified Cas9 for direct RNP complex formation with gRNA; improves editing speed and reduces off-target effects.
10x Genomics Chromium Controller & 3' Kit | Standardized platform for partitioning thousands of single cells into GEMs and constructing barcoded sequencing libraries.
Liberase TM Research Grade | Blend of collagenase I/II used for gentle, high-viability dissociation of complex tissues (e.g., zebrafish embryo).
Cell Ranger Suite (10x Genomics) | Primary analysis pipeline for demultiplexing, alignment, barcode processing, and UMI counting from raw sequencing data.
Scanpy (Python) / Seurat (R) | Open-source toolkits for comprehensive single-cell data analysis, including QC, clustering, visualization, and differential expression.

Application Notes

MIC-Drop (Multiplexed Intermixed CRISPR Droplets) represents a transformative approach for large-scale, in vivo functional genomics screening. It integrates CRISPR-based perturbation with single-cell RNA sequencing (Perturb-seq) within living organisms. The core innovation lies in the microfluidic encapsulation of uniquely barcoded guide RNAs (gRNAs) into degradable hydrogel microspheres, which are then delivered en masse into a model organism for pooled, yet traceable, screening.

Key Principles

  • Barcoded gRNA Encapsulation: Individual gRNA expression cassettes, each paired with a unique DNA barcode, are co-encapsulated with hydrogel precursors (e.g., PEG-DA) into picoliter-scale droplets using a microfluidic device. UV polymerization creates solid, biocompatible microspheres.
  • Multiplexed Delivery: Millions of these microspheres, each representing a single genetic perturbation, are co-injected into the circulatory system (e.g., of a zebrafish embryo). Cells phagocytose the microspheres.
  • Intracellular Release & Perturbation: The intracellular environment degrades the hydrogel, releasing the gRNA cassette for transcription. The CRISPR machinery enacts the genetic knockout.
  • Perturb-seq Readout: The animal is dissociated into single cells for sequencing. The expressed gRNA barcode (from the cassette) and the cell's whole transcriptome are captured together, linking each perturbation to its transcriptional outcome.

Table 1: MIC-Drop Encapsulation and Delivery Efficiency Metrics

Parameter | Typical Performance Range | Measurement Method
Microdroplet Diameter | 20 - 50 µm | Microscopy with size calibration
gRNA Cassette Encapsulation Efficiency | ~70% | Digital PCR on sorted droplets
Single-Encapsulation Rate (Poisson Loading) | >90% of occupied droplets | Fluorescence co-encapsulation assay
In Vivo Delivery Efficiency (Zebrafish) | 10-30% of cells receive a bead | Flow cytometry for bead-positive cells
Barcode Detection Sensitivity (scRNA-seq) | >80% bead-positive cells yield barcode | Single-cell RNA sequencing analysis

Table 2: Comparison of In Vivo Screening Platforms

Platform | Perturbation Scale | Single-Cell Readout | In Vivo Tracing | Major Advantage
MIC-Drop | High (10^4-10^5) | Yes (Perturb-seq) | Yes (Barcoded beads) | Direct, traceable in vivo delivery
Bulk Viral Delivery | High | No (Bulk RNA-seq) | Limited (Complex barcode deconvolution) | Established, high transduction efficiency
Electroporation | Low to Medium | Possible but challenging | No | Suitable for early embryos
Transgenesis | Low | Yes | Yes (Germline stable) | Stable, heritable lines

Experimental Protocols

Protocol 1: Microfluidic Encapsulation of Barcoded gRNA Cassettes

Objective: To generate PEG-based hydrogel microspheres containing single gRNA expression cassettes.

Materials (Research Reagent Toolkit):

  • Microfluidic Device: PDMS-based flow-focusing droplet generator.
  • gRNA Cassette Library: PCR-amplified templates with U6 promoter, gRNA scaffold, and unique 20bp barcode.
  • Hydrogel Precursor: 20% (w/v) Polyethylene glycol diacrylate (PEG-DA, MW 700) in nuclease-free water.
  • Photoinitiator: 2-Hydroxy-2-methylpropiophenone (0.5% v/v).
  • Oil Phase: HFE-7500 fluorinated oil with 2% (w/w) PEG-PFPE block copolymer surfactant.
  • UV Light Source: 365 nm, 100 mW/cm².

Procedure:

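  • Prepare Aqueous Phase: Mix the gRNA cassette library with PEG-DA and photoinitiator, diluting the library to a limiting concentration for single-cassette loading. For ~30 µm (~14 pL) droplets, ~10^6 molecules/µL would place ~14 cassettes in each droplet, so a final concentration on the order of 10^4 molecules/µL is needed to keep mean occupancy well below one. Keep on ice, protected from light.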
  • Prime Microfluidic System: Load the aqueous phase and oil phase into separate syringes. Mount onto syringe pumps and connect to device inlets.
  • Generate Droplets: Set oil phase flow rate to 600 µL/hr and aqueous phase to 200 µL/hr. Monitor droplet formation (~30 µm diameter) under microscope. Collect effluent in a chilled tube.
  • Polymerize Hydrogel: Transfer collected emulsion to a shallow dish. Expose to UV light (365 nm) for 15 seconds under gentle agitation.
  • Break Emulsion: Add 1 volume of perfluorooctanol to the polymerized emulsion. Vortex gently for 30 seconds. Centrifuge at 500 x g for 1 minute. Remove the oil and interface. Wash the pelleted microspheres 3x with PBS + 0.1% BSA.
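
Droplet diameter and library concentration together determine how many cassettes each droplet holds. The sketch below assumes ideal spherical droplets and Poisson loading, which are modelling assumptions rather than measurements from this protocol, and the function names are illustrative.

```python
import math


def droplet_volume_pl(diameter_um: float) -> float:
    """Volume of a spherical droplet in picolitres (1 µm^3 = 1e-3 pL)."""
    return (math.pi / 6.0) * diameter_um ** 3 * 1e-3


def mean_occupancy(conc_per_ul: float, diameter_um: float) -> float:
    """Expected cassettes per droplet (1 pL = 1e-6 µL)."""
    return conc_per_ul * droplet_volume_pl(diameter_um) * 1e-6


def occupied_fraction(lam: float) -> float:
    """Fraction of droplets holding at least one cassette: 1 - exp(-lam)."""
    return -math.expm1(-lam)


def single_fraction_of_occupied(lam: float) -> float:
    """Among occupied droplets, the fraction holding exactly one cassette."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return lam / math.expm1(lam)


volume = droplet_volume_pl(30)          # ~14.1 pL for a 30 µm droplet
lam = mean_occupancy(1.4e4, 30)         # ~0.2 cassettes per droplet
print(f"volume = {volume:.1f} pL, mean occupancy = {lam:.2f}")
print(f"occupied = {occupied_fraction(lam):.2f}, "
      f"single among occupied = {single_fraction_of_occupied(lam):.2f}")
```

Under this model, more than 90% single-cassette droplets among the occupied ones requires a mean occupancy of about 0.2 or lower, at which point only around 18% of droplets carry any cassette. High occupancy and high single-loading rates therefore trade off against each other.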

Protocol 2: In Vivo Delivery and Screening in Zebrafish Embryos

Objective: To deliver MIC-Drop microspheres systemically and prepare single-cell suspensions for Perturb-seq.

Procedure:

  • Micro-injection: At the 1-2 cell stage, co-inject ~1-2 nL of concentrated microspheres and 100 pg of Cas9 mRNA into the yolk of zebrafish embryos.
  • Embryo Rearing: Raise injected embryos at 28.5°C in E3 embryo medium. Monitor development.
  • Tissue Dissociation: At desired stage (e.g., 3 dpf), pool 20-30 dechorionated embryos. Dissociate in 1 mL of Leibovitz's L-15 medium containing 1 mg/mL collagenase and 10 U/mL papain for 45 minutes at 28°C with gentle trituration every 15 minutes.
  • Single-Cell Preparation: Quench with PBS + 10% FBS. Filter through a 40 µm strainer. Centrifuge at 300 x g for 5 min. Resuspend in PBS + 0.04% BSA. Count and assess viability (>85% required).
  • Single-Cell RNA Sequencing: Process 10,000-20,000 cells per sample on the 10x Genomics Chromium Controller using the Single Cell 3' Gene Expression v3.1 kit. Include a custom PCR step in library preparation to amplify the expressed gRNA barcode from the captured cDNA.

Diagrams

Figure: MIC-Drop Microsphere Fabrication Workflow

Figure: In Vivo Delivery and Screening Pathway

Figure: MIC-Drop Integrated Screening Pipeline

The Scientist's Toolkit: Essential Research Reagents

Table 3: Key Reagents for MIC-Drop Experiments

Reagent | Function & Role in Experiment
PEG-Diacrylate (PEG-DA) | Forms the biodegradable hydrogel matrix for encapsulating and protecting the gRNA cassette.
Fluorinated Oil (HFE-7500) with Surfactant | Creates the immiscible oil phase for generating stable, monodisperse water-in-oil emulsion droplets.
Barcoded gRNA Cassette Library | The core payload; a pooled library of PCR amplicons encoding the gRNA and its unique barcode.
Cas9 mRNA / Protein | The CRISPR effector. Co-delivered to enable immediate gRNA activity upon intracellular release.
10x Genomics Chromium Kit | Enables high-throughput single-cell RNA sequencing and gRNA barcode capture from dissociated tissues.
Microfluidic Droplet Generator (Chip) | The core hardware for precision encapsulation of single DNA molecules into picoliter droplets.

Application Notes

Perturb-seq is a high-throughput, single-cell functional genomics platform that combines pooled CRISPR-based genetic perturbations with single-cell RNA sequencing. This enables the systematic mapping of gene function to transcriptional phenotypes at scale.

Core Advantages in In Vivo Screening Research

Within the context of MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and in vivo Perturb-seq, this technology allows for the dissection of complex biological systems within a living organism. Key applications include:

  • Gene Regulatory Network Mapping: Uncovering direct and indirect targets of genetic perturbations.
  • Genetic Interaction Screening: Identifying synthetic lethality and epistatic relationships in complex tissues.
  • Disease Mechanism Elucidation: Characterizing cell-type-specific transcriptional responses to disease-associated mutations.
  • Drug Mode-of-Action Studies: Profiling transcriptional signatures of drug treatments in tandem with genetic perturbations.

Recent Quantitative Performance Data

Table 1: Benchmarking Data for Perturb-seq Throughput and Efficiency

Metric | 2022-2023 Standard Protocol | 2024 High-Efficiency Protocol (Example) | Notes
Cells Profiled per Experiment | 100,000 - 1,000,000+ | 2,000,000+ | Enabled by advancements in droplet microfluidics.
Perturbations Screened in Parallel | 100 - 1,000 | 5,000+ | Using highly complex sgRNA libraries.
Single-Cell Capture Efficiency | 10-50% (varies by platform) | Up to 70% | Improvements in cell loading and barcoding.
Multiplexing (Cells per Perturbation) | 100 - 1,000 cells | 500 - 2,000+ cells | Critical for robust statistical power.
Linkage Efficiency (sgRNA to transcriptome) | >90% | >95% | Via improved viral barcoding and capture.

Table 2: Key Outcomes from Recent In Vivo Perturb-seq Studies

Study Focus (Year) | Model System | Perturbations Tested | Key Quantitative Finding
Tumor Suppressor Networks (2023) | In vivo mouse cancer model | 50+ tumor suppressors | Identified 3 distinct transcriptional clusters of tumor suppressor loss, correlating with metastatic potential.
Neuronal Diversity (2024) | Mouse brain (primary cells) | ~200 transcription factors | Mapped 12 neuronal subtypes to specific TF-regulated gene modules; quantified effect size (log2FC >1) for 45 key regulators.
Immune Cell Activation (2023) | PBMCs ex vivo | 120 immune-related genes | 20% of perturbations caused significant shifts in cell state proportions (p<0.001).

Experimental Protocols

Integrated Protocol: From Library Design to Single-Cell Analysis

A. sgRNA Library and Perturbation Vector Design

  • Library Design: Select 3-5 sgRNAs per target gene plus non-targeting controls. For in vivo compatibility (e.g., MIC-Drop), include a unique barcode for each sgRNA.
  • Cloning: Clone the pooled sgRNA library into a lentiviral vector containing the Cas9 gene (for stable expression) or a guide-only vector for use with transgenic Cas9 models.
  • Quality Control: Sequence the plasmid library to confirm representation and lack of bias.
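
The quality-control step of confirming representation can be summarised with two numbers: the count of dropped-out guides and a skew ratio between well- and poorly-represented guides. This is a minimal sketch; the 90th/10th percentile ratio is a commonly used summary rather than a metric specified by this protocol, and the guide names and read counts are made up for illustration.

```python
import math
from typing import Dict


def representation_qc(counts: Dict[str, int]) -> dict:
    """Summarise sgRNA representation from plasmid-library read counts.

    Uses nearest-rank 10th and 90th percentiles; a skew ratio of infinity
    means the 10th-percentile guide has zero reads.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    ordered = sorted(counts.values())
    n = len(ordered)
    p10 = ordered[round(0.10 * (n - 1))]
    p90 = ordered[round(0.90 * (n - 1))]
    missing = [guide for guide, c in counts.items() if c == 0]
    skew = p90 / p10 if p10 > 0 else math.inf
    return {"p10": p10, "p90": p90, "skew_ratio": skew, "missing": missing}


# Hypothetical read counts for an 11-guide toy library.
toy_counts = dict(zip(
    [f"guide_{i}" for i in range(11)],
    [50, 80, 90, 100, 100, 110, 120, 130, 150, 200, 400],
))
qc = representation_qc(toy_counts)
print(qc)
```

A low skew ratio and an empty `missing` list indicate that the cloned pool still reflects the designed library before any virus is made.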

B. Viral Production & Cell Perturbation (In Vitro or for Ex Vivo Transplantation)

  • Produce lentivirus from the pooled sgRNA library.
  • Transduce target cells (e.g., primary cells, cell lines) at a low MOI (<0.3) so that most transduced cells carry a single integration. Apply selection (e.g., puromycin) for 3-7 days to enrich for infected cells.
  • For in vivo MIC-Drop applications: Prepare a mixed suspension of cells, each receiving a single sgRNA, for microinjection or transplantation into the host organism.

C. Single-Cell RNA-Seq Library Preparation (10x Genomics Platform Example)

  • Harvest Cells: After perturbation (typically 5-14 days), harvest cells to create a single-cell suspension. For in vivo studies, dissociate target tissue(s).
  • Cell Viability & Count: Assess viability (>80%) and count cells. Target recovery of 10,000-20,000 cells per sample channel.
  • Partitioning & Barcoding: Load cells onto the Chromium Chip along with Gel Beads and RT reagents. Each cell is co-encapsulated with a uniquely barcoded bead in a droplet.
  • Reverse Transcription: Inside the droplet, poly-dT primers on the beads capture mRNA and attach the cell barcode and UMI. Perform RT to create cDNA.
  • Library Construction: Break droplets, amplify cDNA, and construct sequencing libraries. Include a separate PCR amplification step to enrich for the sgRNA barcode from the integrated vector, using specific primers.
  • Sequencing: Sequence the gene expression library (Read 1: Cell Barcode + UMI; Read 2: cDNA insert) and the sgRNA barcode library on an Illumina platform.
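
The Read 1 layout described in the sequencing step can be illustrated by splitting a read into its cell barcode and UMI. The sketch assumes the 10x Genomics Single Cell 3' v3/v3.1 chemistry, where Read 1 starts with a 16 bp cell barcode followed by a 12 bp UMI; other chemistries use different lengths, and the example sequence is invented.

```python
from typing import Tuple

CELL_BARCODE_LEN = 16  # assumes 10x 3' v3/v3.1 chemistry
UMI_LEN = 12           # assumes 10x 3' v3/v3.1 chemistry


def split_read1(seq: str) -> Tuple[str, str]:
    """Split a Read 1 sequence into (cell_barcode, umi)."""
    needed = CELL_BARCODE_LEN + UMI_LEN
    if len(seq) < needed:
        raise ValueError(f"Read 1 must be at least {needed} bases long")
    cell_barcode = seq[:CELL_BARCODE_LEN]
    umi = seq[CELL_BARCODE_LEN:needed]
    return cell_barcode, umi


# Invented 28-base Read 1 for illustration.
cell_barcode, umi = split_read1("AAACCCAAGAAACACTGGTTCCAAGGTT")
print(cell_barcode, umi)
```

Because the gene expression and sgRNA libraries share the same Read 1 structure, the cell barcode extracted here is what later joins a cell's transcriptome to its sgRNA.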

D. Computational Analysis Pipeline

  • Alignment & Quantification: Align cDNA reads to the reference genome (e.g., with STARsolo) and count UMIs per gene per cell. Align sgRNA reads to the library manifest.
  • Cell Calling & Filtering: Use cell calling algorithms (e.g., Cell Ranger). Filter out low-quality cells (high mitochondrial %, low gene counts).
  • Perturbation Assignment: Assign each cell to its sgRNA based on the barcode read. Filter cells with multiple perturbations.
  • Differential Expression: For each perturbation, aggregate cells and compare against control cells using methods like MAST or DESeq2 to identify differentially expressed genes.
  • Advanced Analysis: Perform clustering, trajectory inference, and gene regulatory network reconstruction on perturbation-altered populations.
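
The perturbation-assignment and differential-expression steps can be sketched on toy data as: drop cells with multiple sgRNAs, group the rest by target, and compare each group with non-targeting controls. This is only a mean-based log2 fold change with a pseudocount, not a substitute for MAST or DESeq2; the `NTC` prefix, the `gene_gN` naming scheme, and the cell records are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List

CONTROL_LABEL = "NTC"  # hypothetical label for non-targeting control guides


def target_of(guide: str) -> str:
    """Map a guide name such as 'geneA_g1' to its target label ('geneA')."""
    return guide.rsplit("_", 1)[0]


def log2_fold_change(perturbed: List[float], control: List[float],
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of mean expression, stabilised with a pseudocount."""
    return math.log2((mean(perturbed) + pseudocount)
                     / (mean(control) + pseudocount))


def perturbation_effects(cells: List[dict], gene: str) -> Dict[str, float]:
    """log2FC of `gene` for each target versus control cells."""
    singles = [c for c in cells if len(c["guides"]) == 1]  # drop multiplets
    groups: Dict[str, List[float]] = {}
    for cell in singles:
        target = target_of(cell["guides"][0])
        groups.setdefault(target, []).append(cell["expr"].get(gene, 0))
    control = groups.pop(CONTROL_LABEL)
    return {t: log2_fold_change(v, control) for t, v in groups.items()}


cells = [
    {"guides": ["NTC_1"], "expr": {"geneA": 3}},
    {"guides": ["NTC_2"], "expr": {"geneA": 3}},
    {"guides": ["geneA_g1"], "expr": {"geneA": 1}},
    {"guides": ["geneA_g2"], "expr": {"geneA": 1}},
    {"guides": ["geneA_g1", "geneB_g1"], "expr": {"geneA": 9}},  # excluded
]
effects = perturbation_effects(cells, "geneA")
print(effects)
```

In a real screen the same grouping logic would feed a proper statistical test with multiple-testing correction.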

Visualizations

Figure: Perturb-seq Core Workflow

Figure: MIC-Drop to In Vivo Perturb-seq Pipeline

The Scientist's Toolkit

Table 3: Essential Research Reagent Solutions for Perturb-seq

Item | Function/Description | Example Product/Format
Pooled sgRNA Library | Contains thousands of unique sgRNA sequences targeting genes of interest and controls. Cloned into a lentiviral backbone. | Custom-designed library (e.g., from Twist Bioscience), Brunello or Sabatini genome-wide libraries.
Lentiviral Packaging System | Produces the viral particles for efficient delivery of the CRISPR/Cas9 and sgRNA components into target cells. | 2nd/3rd generation systems (psPAX2, pMD2.G).
Single-Cell Partitioning System | Creates oil droplets or nanowells to isolate single cells with unique barcodes for RNA capture. | 10x Genomics Chromium Controller, Parse Biosciences Evercode kits.
scRNA-seq Kit | Reagents for reverse transcription, cDNA amplification, and library construction from single cells. | 10x Genomics Chromium Next GEM kits, SMART-seq kits.
High-Fidelity Polymerase | For accurate amplification of cDNA and sgRNA barcode libraries prior to sequencing. | Q5 (NEB), KAPA HiFi.
Dual-Indexed Sequencing Primers | Allows for multiplexing of multiple Perturb-seq libraries in a single sequencing run. | 10x Dual Index kits, Illumina index sets.
Cell Dissociation Reagents | For creating high-viability single-cell suspensions from complex tissues (in vivo applications). | Miltenyi Biotec GentleMACS, Worthington collagenase blends.
Dead Cell Removal Kit | Critical for removing apoptotic cells from post-perturbation samples to improve data quality. | Magnetic bead-based kits (e.g., from Miltenyi, STEMCELL).
Cas9-Expressing Cell Line or Animal Model | Provides the Cas9 nuclease in trans. Enables use of sgRNA-only libraries. | Custom cell lines, Rosa26-Cas9 knock-in mice and other Cas9-KI lines.
Bioinformatics Pipelines | Software to demultiplex, align, assign perturbations, and perform differential expression. | Cell Ranger, Seurat, Scanpy, mixscape.

The transition from in vitro to in vivo functional genomics screening represents a pivotal advancement for understanding gene function in physiologically relevant contexts. This application note, framed within ongoing research on MIC-Drop and Perturb-seq platforms, details protocols and considerations for scaling pooled CRISPR screens to complex in vivo models, offering a direct path from genetic perturbation to phenotypic readout in a living organism.

Key Comparative Data: In Vitro vs. In Vivo Screening

Table 1: Comparison of Screening Modalities

Parameter | In Vitro Perturb-seq | In Vivo Perturb-seq (e.g., in mouse)
Physiological Relevance | Limited; lacks tissue architecture, systemic signals, immune context. | High; includes native microenvironment, cell-cell interactions, and systemic physiology.
Throughput (Cells) | Very high (10^5 - 10^6 cells per experiment). | Moderate to high (10^4 - 10^5 recoverable cells per tissue).
Perturbation Complexity | High (can screen 1000s of gRNAs in a single experiment). | Moderate (limited by delivery efficiency and animal number).
Cost per Perturbation | Low. | High (includes animal husbandry, processing).
Major Technical Hurdle | Single-cell sequencing efficiency. | In vivo delivery, tissue dissociation, target cell recovery.
Key Readout | Single-cell RNA-seq profiles. | Single-cell RNA-seq profiles from cells that experienced the native tissue environment (spatial position is lost on dissociation).

Table 2: Quantitative Outcomes from Recent In Vivo Perturb-seq Studies

Study Focus (Year) | Model System | Perturbations Tested | Key Metric: Cell Recovery | Major Finding
Tumor Immunology (2023) | Mouse melanoma (anti-PD-1 treated) | ~200 gene knockouts | ~5,000 T cells recovered per tumor | Identified Ppp2r2d KO as enhancing T-cell expansion & function.
Brain Development (2022) | Mouse embryonic brain | 35 neurodevelopmental genes | ~100,000 cells total from pooled embryos | Mapped gene perturbation effects on neural lineage trajectories.
Lung Cancer (2024) | Mouse KP model | 100+ tumor suppressor genes | ~10,000 tumor cells per lung | Quantified in vivo fitness scores distinct from in vitro scores.

Detailed Protocols

Protocol 1: In Vivo Pooled CRISPR Screening with Perturb-seq

Objective: To perform a pooled CRISPR knockout screen in a mouse model and assess transcriptomic phenotypes via single-cell RNA sequencing.

Materials: See "The Scientist's Toolkit" below.

Procedure:

  • gRNA Library & Virus Production:
    • Design a lentiviral sgRNA library targeting genes of interest, including non-targeting controls. A typical complexity is 3-5 sgRNAs per gene.
    • Produce high-titer lentivirus (>10^8 IU/mL) in HEK293T cells using standard packaging plasmids (psPAX2, pMD2.G). Concentrate via ultracentrifugation.
  • In Vitro Transduction & Cell Preparation:

    • Transduce your target cells (e.g., cancer cell line, primary T cells) at a low MOI (~0.3) to ensure most cells receive one viral construct. Include a puromycin selection marker.
    • Culture cells for 5-7 days post-selection to allow for gene knockout and protein depletion.
  • In Vivo Implantation/Engraftment:

    • For tumor studies, implant 1-5x10^6 transduced cells subcutaneously or intravenously into immunodeficient or immunocompetent mice (n=3-5 per group).
    • For in situ editing, directly inject CRISPR delivery vectors (e.g., AAV-packaged sgRNA) into the target tissue.
  • In Vivo Perturbation & Development:

    • Allow disease or development to proceed for a defined period (e.g., tumor growth for 3 weeks, embryonic development for 10 days).
  • Tissue Harvest and Single-Cell Suspension:

    • Euthanize mice and harvest target tissues. Mechanically dissociate and enzymatically digest (e.g., with collagenase IV/DNase I) to create a single-cell suspension.
    • Pass cells through a 40μm strainer. Perform RBC lysis if needed. Count live cells via trypan blue exclusion.
  • Single-Cell RNA-seq Library Preparation:

    • Use a droplet-based platform (10x Genomics Chromium) according to the "Feature Barcoding" protocol for CRISPR screening.
    • Load up to 10,000 cells per sample. The protocol captures both the cellular transcriptome and the expressed sgRNA barcode from the same cell.
  • Sequencing & Data Analysis:

    • Sequence libraries on an Illumina NovaSeq to a depth of ~50,000 reads per cell.
    • Process data using the Cell Ranger pipeline (10x Genomics), supplying the host genome reference together with a feature reference that lists the sgRNA library.
    • Use specialized tools (Mosaic from the Broad Institute, CITE-seq-Count) to demultiplex cells by their sgRNA barcode.
    • Perform differential expression analysis (e.g., with Seurat and MAST) comparing cells with a specific perturbation to cells with non-targeting controls.
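
The perturbation-assignment step above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm used by Cell Ranger or any published tool: the function name `assign_guide`, the thresholds `min_umis` and `min_fraction`, and the toy UMI counts are all hypothetical.

```python
from typing import Dict, Optional


def assign_guide(guide_umis: Dict[str, int],
                 min_umis: int = 3,
                 min_fraction: float = 0.8) -> Optional[str]:
    """Return the dominant sgRNA for one cell, or None if the call is ambiguous.

    The thresholds are illustrative assumptions, not values from the protocol.
    """
    total = sum(guide_umis.values())
    if total == 0:
        return None  # no sgRNA detected in this cell
    top_guide, top_count = max(guide_umis.items(), key=lambda kv: kv[1])
    if top_count < min_umis:
        return None  # too few UMIs to trust the call
    if top_count / total < min_fraction:
        return None  # several guides present (doublet or multiple integration)
    return top_guide


# Hypothetical per-cell sgRNA UMI counts
cells = {
    "cell_1": {"sgGeneA_1": 12, "sgNTC_3": 1},
    "cell_2": {"sgGeneB_2": 6, "sgGeneA_1": 5},
    "cell_3": {"sgNTC_3": 2},
    "cell_4": {},
}
assignments = {cell: assign_guide(counts) for cell, counts in cells.items()}
print(assignments)
```

Cells that return `None` would typically be excluded before differential expression, so that each comparison group contains only confidently assigned cells.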

Protocol 2: MIC-Drop for In Vivo Multiplexed Targeting

Objective: To co-deliver multiple CRISPR components (e.g., Cas9 + gRNA) in a single, traceable droplet for in vivo mosaic analysis.

Materials: MIC-Drop reagent kit, microfluidic droplet generator, Cas9 protein, sgRNA complexes.

Procedure:

  • Assembly of MIC-Drop Components:
    • Formulate MIC-Drop droplets (Multiplexed Intermixed CRISPR Droplets) by combining purified Cas9 protein, in vitro transcribed sgRNA, and a unique DNA barcode in an aqueous microdroplet suspended in oil.
  • In Vivo Delivery:

    • Load the emulsion containing thousands of uniquely barcoded CRISPR perturbations into a microinjection system.
    • Inject droplets directly into the target organ of a model organism (e.g., zebrafish embryo, mouse liver) at early developmental or regenerative stages.
  • Phenotypic Analysis and Cell Sorting:

    • Allow phenotypic manifestation (e.g., 2-5 days post-injection).
    • Dissociate tissue, and use FACS to sort cells based on phenotypic markers (e.g., GFP expression, surface markers).
  • Barcode Recovery and Perturbation Deconvolution:

    • Isolate genomic DNA from sorted cell populations.
    • Amplify the MIC barcodes via PCR and sequence them on a MiSeq.
    • The frequency of each barcode in different phenotypic bins reveals the functional impact of its associated perturbation.
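
The deconvolution step above can be illustrated with a short sketch that compares barcode frequencies across phenotypic bins. The bin names, barcode names, and read counts are hypothetical, and `bin_distribution` is an illustrative helper rather than part of any MIC-Drop software.

```python
from typing import Dict


def bin_distribution(counts_by_bin: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """For each barcode, return the share of its depth-normalized signal in each bin."""
    # Convert counts to within-bin frequencies so sequencing depth differences cancel out
    freqs = {}
    for bin_name, counts in counts_by_bin.items():
        total = sum(counts.values())
        freqs[bin_name] = {bc: n / total for bc, n in counts.items()}

    barcodes = set().union(*[set(f) for f in freqs.values()])
    result = {}
    for bc in sorted(barcodes):
        per_bin = {b: freqs[b].get(bc, 0.0) for b in freqs}
        denom = sum(per_bin.values())
        result[bc] = {b: v / denom for b, v in per_bin.items()}
    return result


# Hypothetical barcode read counts from two FACS-sorted bins
sorted_counts = {
    "marker_high": {"bc_geneA": 900, "bc_geneB": 50, "bc_ntc": 50},
    "marker_low": {"bc_geneA": 100, "bc_geneB": 450, "bc_ntc": 450},
}
dist = bin_distribution(sorted_counts)
for barcode, shares in dist.items():
    print(barcode, {b: round(s, 2) for b, s in shares.items()})
```

In this toy data, `bc_geneA` is concentrated in the `marker_high` bin while `bc_geneB` tracks the non-targeting control, which is the pattern one would look for when calling a hit.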

Visualizations

Workflow: In Vivo Perturb-seq Pipeline

Pathway: In Vivo Perturbation Effects

The Scientist's Toolkit

Table 3: Essential Reagents & Materials for In Vivo Functional Genomics

Item Function & Rationale Example Product/Supplier
Lentiviral sgRNA Library Delivers heritable genetic perturbations to target cells. Enables pooled screening. Custom library (Addgene, Twist Bioscience)
High-Titer Lentivirus Efficient delivery of CRISPR constructs in vitro prior to in vivo engraftment. Lenti-X Concentrator (Takara)
Cas9-Expressing Cell Line Provides the CRISPR nuclease machinery. Enables knockout screens. LentiCas9-Blast (Addgene #52962)
Single-Cell 3' Kit with Feature Barcoding Captures transcriptome and sgRNA barcode from the same cell. Chromium Next GEM Single Cell 3' v3.1 (10x Genomics)
Tissue Dissociation Enzyme Generates high-viability single-cell suspensions from complex in vivo tissues. Tumor Dissociation Kit (Miltenyi Biotec)
Cell Recovery Microcentrifuge Tubes Maximizes recovery of low-abundance cell populations after FACS. Protein LoBind Tubes (Eppendorf)
Nuclease-Free Water Critical for all molecular biology steps to prevent RNA/DNA degradation. UltraPure DNase/RNase-Free Water (Invitrogen)
Next-Generation Sequencing Platform High-depth sequencing of single-cell libraries. Illumina NovaSeq 6000

Key Historical Context and Pioneering Studies (e.g., Dixit et al., 2016; Jaitin et al., 2016)

Historical Context and Thesis Integration

The advent of single-cell RNA sequencing (scRNA-seq) transformed phenotypic screening by enabling high-resolution, unbiased readouts of cellular states. The pioneering studies of Dixit et al. (2016) and Jaitin et al. (2016) laid the foundational logic for integrating pooled genetic perturbations with scRNA-seq. Dixit et al. introduced Perturb-seq by coupling CRISPR-mediated gene knockdown with a droplet-based scRNA-seq platform, using expressed guide RNAs as barcodes. Concurrently, Jaitin et al. demonstrated a similar principle with CRISP-seq. These studies proved that complex transcriptional phenotypes from hundreds of perturbations could be deconvoluted in a single, pooled experiment.

This conceptual breakthrough directly informs the current thesis on MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and in vivo Perturb-seq. The thesis posits that by combining the pooled, scalable screening framework of Perturb-seq with novel in vivo delivery and barcoding strategies, such as MIC-Drop's water-in-oil droplet encapsulation of sgRNAs, one can overcome key limitations of earlier in vitro work. The goal is to enable systematic functional genomics directly within the native tissue microenvironment of a living organism.

Core Findings from Pioneering Studies

Table 1: Key Parameters from Seminal Studies (2016)

Study | Technology Name | Perturbation System | scRNA-seq Platform | Key Scale Demonstrated | Primary Model System
Dixit et al. | Perturb-seq | CRISPRi (dCas9-KRAB) | inDrop / 10x Genomics | 13 sgRNAs targeting 10 genes across ~60,000 cells | K562 leukemia cell line
Jaitin et al. | CRISP-seq | CRISPR-Cas9 (knockout) | MARS-seq | 58 sgRNAs targeting 21 genes across ~8,000 cells | Dendritic cells (in vitro, LPS-stimulated)

Table 2: Quantitative Outcomes and Impact

Study | Key Quantitative Result | Conceptual Advancement
Dixit et al. | Clustering of single-cell profiles grouped cells by targeted gene, not sgRNA sequence. Recovered known and novel gene signatures (e.g., RELA knockdown induced TNFα-response signature). | Established a direct, high-dimensional link between genotype and transcriptional phenotype at scale in a pooled format.
Jaitin et al. | Identified known and novel regulators of LPS response (e.g., Cebpb, Rel). Quantified heterogeneity in perturbation responses. | Demonstrated the method's power in primary, immunologically stimulated cells. Introduced combinatorial perturbations.

Detailed Experimental Protocols

Protocol 1: Perturb-seq (Adapted from Dixit et al., 2016)

Objective: To generate a pooled library of CRISPRi-perturbed cells and profile their transcriptional phenotypes using droplet-based scRNA-seq.

A. Library Generation & Cell Transduction

  • Design & Clone: Synthesize an sgRNA library targeting genes of interest. Clone into a lentiviral vector containing the sgRNA expression cassette and a selectable marker (e.g., puromycin resistance).
  • Produce Virus: Generate high-titer lentivirus for the pooled sgRNA library in HEK293T cells.
  • Stable Line Generation: Infect target cells (e.g., K562) expressing dCas9-KRAB at a low MOI (<0.3) to ensure most cells receive one sgRNA. Select with puromycin for 5-7 days.
  • Library Validation: Harvest a sample of cells, extract genomic DNA, amplify the sgRNA region via PCR, and sequence to confirm library representation.
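
The rationale for a low MOI can be checked with a short calculation. The sketch below assumes that the number of viral integrations per cell follows a Poisson distribution with mean equal to the MOI, which is a common simplification rather than something stated in the protocol; the function names are illustrative.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k integrations at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_stats(moi: float) -> dict:
    """Summarize uninfected, singly and multiply infected fractions (Poisson assumption)."""
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    infected = 1.0 - p0
    return {
        "uninfected": p0,
        "single": p1,
        "multiple": infected - p1,
        "single_among_infected": p1 / infected,
    }


stats = infection_stats(0.3)
for key, value in stats.items():
    print(f"{key}: {value:.3f}")
```

Under this assumption, roughly 74% of cells stay uninfected at MOI 0.3 and about 86% of the transduced cells that survive selection carry a single sgRNA; lowering the MOI further raises that fraction at the cost of needing more starting cells.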

B. Single-Cell RNA-Sequencing (inDrop Platform)

  • Cell Preparation: Harvest ~500,000 perturbed cells, wash, and resuspend in PBS with 0.04% BSA at a target concentration of 100-200 cells/µL.
  • inDrop Encapsulation: Load cells, along with inDrop hydrogel beads (containing barcoded primers with unique molecular identifiers (UMIs) and poly(dT)), and enzymatic mix into a microfluidic device. Generate nanoliter droplets co-encapsulating a single cell and a single bead.
  • On-Bead Reverse Transcription: Lyse the cell inside the droplet. Poly-adenylated mRNA hybridizes to the bead's poly(dT) primers and is reverse-transcribed into barcoded cDNA.
  • Library Prep: Break droplets, pool barcoded cDNA, and perform exonuclease I digestion to remove unused primers. Amplify cDNA via PCR with Illumina adapters. Construct final sequencing library with sample indices.

C. Data Analysis

  • Demultiplexing: Assign reads to individual cells based on the cell barcode and to the originating molecule via the UMI. Identify the expressed sgRNA from the CRISPR transcript reads.
  • Expression Matrix: Generate a gene expression (UMI count) matrix, annotated with the perturbed gene for each cell.
  • Differential Analysis: Use statistical models (e.g., MAST, DESeq2 adapted for single-cell) to compare gene expression between cells targeting a specific gene vs. non-targeting control sgRNA cells.
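
The demultiplexing and expression-matrix steps above amount to collapsing reads that share a cell barcode, UMI, and gene into one molecule. The sketch below shows that idea on hypothetical reads; it deliberately omits the barcode and UMI error correction that production pipelines perform, and the tuple layout is an assumption made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (cell_barcode, umi, gene_or_sgRNA)


def umi_count_matrix(reads: Iterable[Read]) -> Dict[str, Dict[str, int]]:
    """Collapse PCR duplicates and count unique molecules per cell and feature."""
    unique_molecules = set(reads)  # duplicates share all three fields
    matrix: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for cell, _umi, feature in unique_molecules:
        matrix[cell][feature] += 1
    return {cell: dict(features) for cell, features in matrix.items()}


# Hypothetical aligned reads; the second read is a PCR duplicate of the first
reads = [
    ("AAAC", "UMI1", "GATA1"),
    ("AAAC", "UMI1", "GATA1"),
    ("AAAC", "UMI2", "GATA1"),
    ("AAAC", "UMI3", "sgRELA_1"),
    ("TTTG", "UMI1", "GATA1"),
]
matrix = umi_count_matrix(reads)
print(matrix)
```

Treating the sgRNA transcript as one more feature in the same matrix is what lets each cell's expression profile be annotated with its perturbation.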

Protocol 2: MIC-Drop Workflow for In Vivo Screening (Current Thesis Context)

Objective: To perform pooled CRISPR perturbation and single-cell profiling directly in a living mouse model.

A. sgRNA Droplet Library Preparation (MIC-Drop)

  • Reagent Setup: Prepare an aqueous mix containing: Cas9 protein (or mRNA), sgRNA library, cell transfection reagent (e.g., Lipofectamine CRISPRMAX), and a fluorescent dye.
  • Droplet Generation: Use a microfluidic droplet generator to encapsulate the aqueous mix into monodisperse, water-in-oil droplets (~100 pL volume). Each droplet contains components for a single perturbation.
  • Quality Control: Analyze droplet size and uniformity via microscopy. Count and concentrate droplets to a defined injection volume.

B. In Vivo Delivery and Harvest

  • Animal Model: Use an immunocompromised or humanized mouse model with a targetable tumor xenograft or engrafted primary cells.
  • Localized Injection: Inject the concentrated droplet emulsion directly into the target tissue (e.g., intratumorally) using a fine-gauge syringe. A control animal receives droplets with non-targeting sgRNAs.
  • Incubation: Allow 5-14 days for perturbation effects (e.g., gene knockout) and phenotypic changes to manifest.
  • Tissue Processing: Harvest the target tissue, dissociate into a single-cell suspension, and filter to remove debris.

C. Single-Cell Capture & Sequencing (10x Genomics)

  • Cell Preparation: Count live cells and resuspend in PBS + 0.04% BSA at 700-1,200 cells/µL.
  • Gel Bead-in-Emulsion (GEM) Generation: Use the 10x Chromium Controller to co-encapsulate single cells, barcoded gel beads, and reaction reagents in droplets.
  • Library Construction: Follow the 10x Single Cell 3' Reagent Kit v3.1 protocol for GEM-RT, cDNA amplification, and library construction. Include a custom PCR step to enrich for sgRNA sequences from the cDNA.
  • Sequencing: Pool libraries and sequence on an Illumina platform (e.g., ~50,000 read pairs/cell for gene expression; deep sequencing for sgRNA amplicon).
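
A quick read-budget estimate helps when planning the sequencing run. In the sketch below, the 50,000 read pairs per cell for gene expression comes from the step above, while the 5,000 read pairs per cell for the sgRNA library is an assumed placeholder, since the protocol only asks for "deep sequencing" of the amplicon.

```python
def sequencing_budget(n_cells: int,
                      gex_read_pairs_per_cell: int = 50_000,
                      guide_read_pairs_per_cell: int = 5_000) -> dict:
    """Estimate total read pairs for gene expression and sgRNA libraries.

    guide_read_pairs_per_cell is an assumption for illustration only.
    """
    gex = n_cells * gex_read_pairs_per_cell
    guide = n_cells * guide_read_pairs_per_cell
    return {
        "gex_read_pairs": gex,
        "guide_read_pairs": guide,
        "total_read_pairs": gex + guide,
    }


budget = sequencing_budget(n_cells=10_000)
for library, read_pairs in budget.items():
    print(f"{library}: {read_pairs:,}")
```

For 10,000 recovered cells this comes to about 550 million read pairs, which can then be compared against the output of the chosen flow cell.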

Diagrams

Title: Evolution from In Vitro Perturb-seq to In Vivo MIC-Drop

Title: MIC-Drop Perturb-seq In Vivo Screening Protocol

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials for In Vivo Perturb-seq Screening

Item Function & Role in Protocol Example Product/Source
Pooled sgRNA Library Defines the genetic perturbations screened; cloned into vector for viral production or used directly for encapsulation. Custom synthesis (Twist Bioscience, IDT).
Cas9 Protein (or mRNA) The effector enzyme for CRISPR-mediated gene knockout. High-quality, RNase-free material is critical for MIC-Drop. Alt-R S.p. Cas9 Nuclease V3 (IDT); Trilink CleanCap Cas9 mRNA.
Microfluidic Droplet Generator Creates monodisperse water-in-oil droplets for MIC-Drop reagent encapsulation. Dolomite Microfluidic System; Bio-Rad QX200 Droplet Generator.
Fluorinated Oil & Surfactant Forms the immiscible oil phase for droplet generation and stabilizes droplets during storage/handling. Dolomite Droplet Generation Oil; 3M Novec 7500 with 2% PEG-PFPE surfactant.
In Vivo Transfection Reagent Enhances delivery and uptake of Cas9/sgRNA RNP complexes from droplets into target cells in vivo. InvivoJetPEI (Polyplus); Lipofectamine CRISPRMAX.
Single-Cell Dissociation Kit Generates high-viability single-cell suspensions from complex in vivo tissues for scRNA-seq. Miltenyi Biotec GentleMACS Dissociator & tumor dissociation kits.
scRNA-seq Kit with Feature Barcoding Enables simultaneous capture of transcriptome and sgRNA barcode (Feature Barcode) from single cells. 10x Genomics Single Cell 3' Kit v3.1 with Feature Barcode technology.
Cell Ranger (CRISPR Guide Capture analysis) Primary analysis software for demultiplexing cells, aligning reads, counting UMIs, and assigning sgRNAs. 10x Genomics Cell Ranger (cellranger count with --feature-ref).

Application Note: This document details the integrated application of MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq for in vivo functional genomics screening. These high-throughput CRISPR screening platforms, coupled to single-cell RNA sequencing (scRNA-seq), enable the systematic deconvolution of gene function within complex biological systems, directly addressing the core challenges of modern therapeutic development.

Application Notes

1. Target Discovery: MIC-Drop/Perturb-seq facilitates unbiased identification of novel therapeutic targets by screening hundreds of gene perturbations in vivo and quantifying their phenotypic impact via single-cell transcriptomes. Hits are prioritized based on their ability to shift cell states toward a therapeutic outcome (e.g., reduction of a pathogenic cell population, reversal of disease signatures).

Table 1: Representative Quantitative Output from an In Vivo Target Discovery Screen

Perturbed Gene | Cell Population of Interest (%) in Control | Cell Population of Interest (%) Post-Perturbation | p-value | Disease Signature Score Change
Gene A | 12.5 | 3.2 | <0.001 | -0.78
Gene B | 12.7 | 11.9 | 0.45 | -0.05
Gene C | 13.1 | 20.5 | <0.001 | +0.65

2. Gene Network Mapping: By clustering cells based on their transcriptional profiles post-perturbation, these methods allow for the construction of causal gene regulatory networks. Genes with similar transcriptomic consequences are inferred to be in the same pathway or regulatory module.

3. Disease Mechanism Elucidation: Perturbing genes in disease models and comparing single-cell trajectories to human disease atlas data reveals how genetic perturbations alter disease progression, identifies key driver cell states, and maps the molecular pathways responsible.
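
The inference behind Gene Network Mapping (perturbations with similar transcriptomic consequences are placed in the same module) is often approached by correlating perturbation signatures. The sketch below uses hypothetical log2 fold-change vectors and a hand-written Pearson correlation; it illustrates the idea only and is not a full network-inference method.

```python
import math
from itertools import combinations
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long signature vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical signatures: log2 fold change vs. control across five readout genes
signatures = {
    "GeneA_KO": [1.2, -0.8, 0.1, 0.9, -1.1],
    "GeneB_KO": [1.0, -0.9, 0.0, 1.1, -0.9],
    "GeneC_KO": [-0.7, 0.6, 0.2, -1.0, 0.8],
}
similarity = {
    (a, b): pearson(signatures[a], signatures[b])
    for a, b in combinations(signatures, 2)
}
for pair, r in similarity.items():
    print(pair, round(r, 2))
```

Here `GeneA_KO` and `GeneB_KO` correlate strongly and would be grouped together, whereas `GeneC_KO` shows an opposing signature. Correlation alone does not establish direction or causality, so such groupings are hypotheses for follow-up.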

Experimental Protocols

Protocol 1: In Vivo MIC-Drop/Perturb-seq Screening Workflow

I. Library Preparation and MIC-Drop Assembly

  • Design & Cloning: Design a sgRNA library targeting genes of interest. Clone into the MIC-Drop vector backbone containing the sgRNA expression cassette and a unique barcode sequence for each guide.
  • sgRNA Synthesis: In vitro transcribe the plasmid library to generate the sgRNA transcripts.
  • Droplet Generation: Use a microfluidic device to encapsulate individual Cas9-expressing cells, a single sgRNA transcript, and a uniquely barcoded primer bead into picoliter-scale droplets (MIC-Drop).

II. In Vivo Delivery and Recovery

  • Pooled Injection: Pool all MIC-Drop droplets and inject intravenously or directly into the tissue of interest in an animal model.
  • Incubation: Allow 7-14 days for gene editing and phenotypic manifestation in vivo.
  • Tissue Processing: Harvest and dissociate the target tissue into a single-cell suspension.

III. Single-Cell RNA Sequencing & Analysis

  • Library Prep: Use a standard scRNA-seq platform (e.g., 10x Genomics) to capture cells, lysing each cell within its droplet so that its transcriptome and sgRNA barcode receive the same cell barcode.
  • Sequencing: Perform paired-end sequencing to capture both cellular transcripts and sgRNA barcodes.
  • Bioinformatics Analysis:
    • Align reads to the reference genome and sgRNA barcode library.
    • Assign each cell to its perturbed gene via the recovered barcode.
    • Perform differential expression analysis comparing cells carrying each perturbation against cells carrying non-targeting control guides.
    • Use dimensionality reduction (UMAP/t-SNE) and clustering to identify perturbation-driven cell states.
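
The comparison between perturbed and control cells can be sketched as a per-gene log2 fold change of mean expression. This is a deliberately naive illustration with hypothetical counts: it skips normalization and statistical testing, which dedicated packages handle, and the names `log2_fold_changes` and `NTC` are assumptions for the example.

```python
import math
from statistics import mean
from typing import Dict


def log2_fold_changes(expr_by_cell: Dict[str, Dict[str, float]],
                      guide_by_cell: Dict[str, str],
                      target: str,
                      control: str = "NTC",
                      pseudocount: float = 1.0) -> Dict[str, float]:
    """Log2 ratio of mean expression in target-perturbed cells vs. control cells."""
    target_cells = [c for c, g in guide_by_cell.items() if g == target]
    control_cells = [c for c, g in guide_by_cell.items() if g == control]
    genes = sorted({gene for profile in expr_by_cell.values() for gene in profile})
    result = {}
    for gene in genes:
        mean_target = mean(expr_by_cell[c].get(gene, 0) for c in target_cells)
        mean_control = mean(expr_by_cell[c].get(gene, 0) for c in control_cells)
        result[gene] = math.log2((mean_target + pseudocount) / (mean_control + pseudocount))
    return result


# Hypothetical UMI counts and perturbation assignments
expression = {
    "c1": {"Il2": 6, "Actb": 14},
    "c2": {"Il2": 8, "Actb": 16},
    "c3": {"Il2": 1, "Actb": 15},
    "c4": {"Il2": 1, "Actb": 15},
}
guides = {"c1": "GeneA", "c2": "GeneA", "c3": "NTC", "c4": "NTC"}
lfc = log2_fold_changes(expression, guides, target="GeneA")
print(lfc)
```

In the toy data, `Il2` rises fourfold (log2 fold change of 2) in GeneA-perturbed cells while `Actb` is unchanged.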

Protocol 2: Validation of Candidate Hits via Focused Perturb-seq

  • Hit Confirmation: Select 20-30 top candidate genes from the primary screen.
  • Focused Library: Generate a new MIC-Drop library containing 5-10 sgRNAs per target gene.
  • Repeat In Vivo Screening: Repeat Protocol 1 with the focused library, increasing coverage per perturbation.
  • Deep Phenotyping: Incorporate feature barcoding (CITE-seq) for surface proteins or use a scRNA-seq assay that captures chromatin accessibility (multiome) to gain a multimodal view of the perturbation effect.

Visualizations

In Vivo MIC-Drop/Perturb-seq Screening Workflow

Causal Gene Network to Phenotype Elucidation

The Scientist's Toolkit

Table 2: Essential Research Reagent Solutions

Item Function in MIC-Drop/Perturb-seq
Pooled sgRNA Library Contains uniquely barcoded guides for multiplexed gene targeting. The fundamental perturbation reagent.
Cas9-Expressing Cell Line or Animal Model Provides the genomic editing machinery. Enables in vivo screening in a relevant physiological context.
MIC-Drop Vector Backbone Plasmid for sgRNA in vitro transcription, containing the essential barcode for downstream deconvolution.
Single-Cell 3' RNA Seq Kit (w/ Feature Barcoding) Standardized reagents for generating barcoded scRNA-seq libraries from recovered cells.
Bioinformatics Pipeline (e.g., CellRanger, Seurat, Scanpy) Software suites for demultiplexing cells, aligning reads, assigning perturbations, and performing differential expression.
Validated sgRNA/Cas9 Delivery Vehicle (e.g., AAV, Lentivirus) An alternative delivery method for specific tissues where MIC-Drop injection is not optimal.

How to Design and Execute an In Vivo MIC-Drop or Perturb-seq Screen: A Step-by-Step Protocol

In vivo functional genomics screening is essential for understanding gene function in physiological contexts. Two prominent technologies, MIC-Drop and Perturb-seq, offer distinct approaches. The choice between them depends on the specific biological question, scale, and experimental constraints. This note guides researchers in selecting the appropriate platform.

Defining the Biological Question

The selection process begins with a precise biological question.

Question Aspect Considerations for Technology Choice
Phenotypic Readout Transcriptome-wide (Perturb-seq) vs. focused, imaging-based or survival-based (MIC-Drop).
Scale of Perturbation Number of genes/conditions to test (100s-1000s vs. 10s-100s).
In Vivo Model Suitability for delivery (viral vs. lipid nanoparticle vs. direct injection).
Spatial Resolution Need for single-cell resolution within a tissue.
Temporal Resolution Need to track phenotypes over time in the same organism.
Cost & Throughput Budget constraints and number of samples required for statistical power.

Technology Comparison Table

Feature | MIC-Drop | Perturb-seq
Core Principle | Pooled in vivo screening using barcoded, slow-release CRISPR-Cas9 mRNA/gRNA droplets. | Single-cell RNA sequencing (scRNA-seq) readout of CRISPR-mediated perturbations.
Primary Readout | Binary or quantitative phenotypic selection (e.g., survival, tumor size, fluorescence). | Whole-transcriptome profiling at single-cell resolution.
Perturbation Scale | Moderate (10s to 100s of targets per pool). | High (1000s of targets across a population).
In Vivo Delivery | Direct injection into tissue or cavity (e.g., zebrafish yolk, mouse tumor). | Often requires explant & dissociation; in vivo via viral/barcode delivery possible.
Key Advantage | Longitudinal tracking in same animal; cost-effective for in vivo positive selection screens. | Reveals mechanistic state changes and heterogeneous responses without prior phenotype bias.
Major Limitation | Limited mechanistic insight without separate downstream assays. | Loss of spatial context; higher cost per cell; more complex computational analysis.
Ideal Use Case | In vivo positive/negative selection screens (e.g., essential genes in cancer, developmental genetics). | Decoding gene regulatory networks, characterizing cell states post-perturbation in complex tissues.

Decision Framework Protocol

Objective

To systematically choose between MIC-Drop and Perturb-seq based on the experimental goals.

Materials

  • Defined biological hypothesis and required endpoints.
  • Relevant animal model (zebrafish, mouse, etc.).
  • Budget and computational resource assessment.

Procedure

  • Phenotype Prioritization:

    • If the primary need is to identify genes causing a specific, pre-defined macroscopic phenotype (e.g., altered morphology, survival), MIC-Drop is favorable.
    • If the primary need is to understand the transcriptional consequences and cellular states resulting from perturbations, Perturb-seq is necessary.
  • Scale Assessment:

    • For focused libraries (<500 genes) where in vivo delivery efficiency is paramount, use MIC-Drop.
    • For genome-scale or pathway-scale libraries where mechanistic depth is critical, use Perturb-seq.
  • Logistical Evaluation:

    • Evaluate single-cell dissociation compatibility of your tissue. If not feasible, MIC-Drop may be the only viable in vivo option.
    • Assess sequencing budget. Perturb-seq requires deep sequencing (the protocols in this article use ~50,000 reads/cell for gene expression, plus additional reads for the sgRNA library).
    • Confirm computational pipelines for scRNA-seq analysis are available for Perturb-seq.
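
The decision framework can be written down as a small rule set, which makes the order of the checks explicit. The sketch below is one possible encoding: the precedence of the rules (dissociation feasibility first, then readout, then scale) is an assumption, and `choose_platform` is a hypothetical helper.

```python
def choose_platform(needs_transcriptome: bool,
                    n_genes: int,
                    tissue_dissociable: bool) -> str:
    """Suggest a screening platform from the three criteria in the procedure.

    Rule precedence is an illustrative assumption, not part of the protocol.
    """
    if not tissue_dissociable:
        # Without a viable single-cell suspension, scRNA-seq readouts are not feasible
        return "MIC-Drop"
    if needs_transcriptome:
        # Transcriptional consequences and cell states require a single-cell readout
        return "Perturb-seq"
    if n_genes < 500:
        # Focused library with a pre-defined macroscopic phenotype
        return "MIC-Drop"
    return "Perturb-seq"


examples = {
    "survival phenotype, 80 genes": choose_platform(False, 80, True),
    "cell-state mapping, 300 genes": choose_platform(True, 300, True),
    "non-dissociable tissue": choose_platform(True, 300, False),
}
for scenario, platform in examples.items():
    print(f"{scenario}: {platform}")
```

Budget and computational capacity are left out of the function on purpose; they act as constraints on whichever platform the rules suggest.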

Detailed Experimental Protocols

Protocol 1: MIC-Drop for In Vivo Negative Selection (Dropout) Screen in Zebrafish

Objective: Identify genes essential for embryonic survival.

Materials:

  • MIC-Drop library (barcoded gRNA/Cas9 mRNA droplets).
  • Wild-type zebrafish embryos (1-cell stage).
  • Microinjection apparatus.
  • PCR reagents, NGS sequencer.

Procedure:

  • Library Preparation: Obtain or synthesize a MIC-Drop droplet library, each droplet containing a unique barcode linked to a specific gRNA and Cas9 mRNA.
  • Embryo Injection: At the 1-cell stage, inject a pool of droplets into the yolk of multiple embryos. Aim for ~1 droplet per embryo.
  • Phenotypic Incubation: Raise embryos. At 24 hours post-fertilization (hpf), collect genomic DNA from a representative subset of embryos (Input sample).
  • Selection: Continue raising the remaining embryos. At 120 hpf, collect genomic DNA only from surviving, normally developing larvae (Output sample).
  • Barcode Amplification & Sequencing: Amplify barcode regions from Input and Output genomic DNA samples via PCR. Perform next-generation sequencing.
  • Analysis: Depletion of a specific barcode in the Output vs. Input indicates that its target gene is essential for survival.
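
The depletion analysis can be sketched as a log2 fold change of barcode frequency between Output and Input, centered on non-targeting controls so that the loss of some barcodes does not make neutral ones look enriched. The counts, barcode names, and pseudocount below are hypothetical.

```python
import math
from statistics import median
from typing import Dict, Iterable


def barcode_log2fc(input_counts: Dict[str, int],
                   output_counts: Dict[str, int],
                   control_barcodes: Iterable[str],
                   pseudocount: float = 0.5) -> Dict[str, float]:
    """Control-centered log2 fold change of barcode frequency (Output vs. Input)."""
    in_total = sum(input_counts.values())
    out_total = sum(output_counts.values())
    raw = {}
    for barcode, n_in in input_counts.items():
        n_out = output_counts.get(barcode, 0)
        freq_in = (n_in + pseudocount) / in_total
        freq_out = (n_out + pseudocount) / out_total
        raw[barcode] = math.log2(freq_out / freq_in)
    # Center on the median of non-targeting controls
    baseline = median(raw[bc] for bc in control_barcodes)
    return {bc: value - baseline for bc, value in raw.items()}


# Hypothetical barcode read counts
input_reads = {"bc_essential_gene": 1000, "bc_neutral_gene": 1000,
               "bc_ntc_1": 1000, "bc_ntc_2": 1000}
output_reads = {"bc_essential_gene": 50, "bc_neutral_gene": 950,
                "bc_ntc_1": 1000, "bc_ntc_2": 1000}
dropout = barcode_log2fc(input_reads, output_reads, ["bc_ntc_1", "bc_ntc_2"])
for barcode, value in dropout.items():
    print(f"{barcode}: {value:+.2f}")
```

A strongly negative value, as for `bc_essential_gene` here, marks a barcode that dropped out of the surviving population; replicate clutches would be needed before treating it as a hit.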

Protocol 2: Perturb-seq for In Vivo Immune Cell Profiling in Mouse

Objective: Characterize the impact of cytokine gene knockouts on tumor-infiltrating lymphocyte states.

Materials:

  • AAV-based pooled CRISPR library targeting immune modulators.
  • Cas9-expressing mouse model.
  • Tumor cell line for implantation.
  • Single-cell dissociation kit for tumors.
  • 10x Chromium Controller & scRNA-seq reagents.

Procedure:

  • In Vivo Perturbation: Generate a pool of AAVs, each carrying a unique barcoded gRNA. Infect tumor cells with this pool ex vivo, or inject AAV pool directly into established tumors in Cas9+ mice.
  • Tissue Harvest & Dissociation: After 7-14 days, harvest tumors. Dissociate into single-cell suspensions using a gentleMACS Dissociator and an enzymatic dissociation kit.
  • Single-Cell Library Prep: Process cells through the 10x Chromium platform using the 5' Gene Expression with Feature Barcoding kit to capture both transcriptomes and gRNA barcodes.
  • Sequencing: Sequence libraries on an Illumina NovaSeq.
  • Computational Analysis:
    • Align reads and generate gene expression (GEX) and CRISPR Guide Capture (CGC) count matrices.
    • Use Cell Ranger and Seurat for initial processing.
    • Assign cells to perturbations using the CGC data (e.g., with MUSIC or CITE-seq-Count).
    • Compare transcriptional profiles between cells with different gRNA assignments to identify differentially expressed genes and altered cell states.

The Scientist's Toolkit

Research Reagent / Solution Function
MIC-Drop Droplet Library Pre-formatted, barcoded microdroplets containing gRNA and Cas9 mRNA for pooled in vivo delivery.
AAV-pgk-sgRNA (Serotype) Adeno-associated virus vector for in vivo delivery of single guide RNAs to specific tissues (e.g., AAV9 for liver).
10x Chromium Controller & Next GEM Kits Platform for partitioning single cells and generating barcoded scRNA-seq libraries.
GentleMACS Dissociator Instrument for standardized, gentle tissue dissociation to viable single cells.
Hash Tag Oligonucleotides (HTOs) Antibody-conjugated oligonucleotides for multiplexing samples in a single Perturb-seq run.
CRISPRko Library (e.g., Brunello) Genome-wide human CRISPR knockout sgRNA library for loss-of-function screens.
Cell Ranger (Software) 10x Genomics' pipeline for processing scRNA-seq data to generate count matrices.
Seurat / Scanpy R/Python packages for comprehensive scRNA-seq data analysis and visualization.

Visualizations

Decision Flow: MIC-Drop vs Perturb-seq

MIC-Drop In Vivo Screening Workflow

Perturb-Seq In Vivo Screening Workflow

1. Introduction and Thesis Context

Within the broader thesis on advancing in vivo functional genomics, this protocol details the critical second step: constructing a pooled CRISPR guide RNA (gRNA) library. This library is the foundational reagent for coupling MIC-Drop (Multiplexed Intermixed CRISPR Droplets), a multiplexed delivery system, with downstream Perturb-seq (single-cell RNA sequencing readout of CRISPR perturbations) for high-throughput, in vivo screening. A meticulously designed and cloned library ensures specific, efficient, and interpretable genetic perturbations across complex cell populations in living organisms.

2. gRNA Library Design Principles

The design focuses on specificity, efficiency, and compatibility with high-throughput cloning and sequencing.

  • Target Selection: For genome-wide screens, target all coding genes using 4-6 gRNAs per gene. For focused screens (e.g., kinase family), include all relevant isoforms and non-coding regulatory elements.
  • gRNA Sequence Rules:
    • Length: 20-nt spacer sequence matching the genomic site immediately 5' of an NGG Protospacer Adjacent Motif (PAM) for S. pyogenes Cas9 (the PAM lies 3' of the protospacer and is not part of the spacer).
    • On-Target Efficiency: Predict using algorithms like Doench '16 or CHOPCHOP. Select gRNAs with high predicted scores (>0.6).
    • Off-Target Minimization: Use algorithms (e.g., MIT specificity score) to exclude spacers that have close matches (≤3 mismatches) at unintended genomic loci, paying particular attention to sites that match perfectly in the seed region (PAM-proximal 12 bases).
    • Uniqueness: Ensure each 20-nt spacer is unique within the genome and the library itself to maintain unambiguous target assignment.
    • Cloning Compatibility: Avoid BsmBI restriction sites within the spacer sequence.
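
Several of these rules are simple sequence checks that can be automated before ordering an oligo pool. The sketch below covers spacer length, the NGG PAM, BsmBI sites (CGTCTC and its reverse complement GAGACG), and uniqueness within the library; on-target and off-target scoring need dedicated tools and are not attempted. The candidate sequences and function names are hypothetical.

```python
import re
from typing import Dict

BSMBI_SITES = ("CGTCTC", "GAGACG")  # recognition site and its reverse complement


def check_site(site: str) -> str:
    """Classify a 23-nt target site: 20-nt protospacer followed by its 3' PAM."""
    site = site.upper()
    if not re.fullmatch(r"[ACGT]{23}", site):
        return "bad_length_or_base"
    spacer, pam = site[:20], site[20:]
    if pam[1:] != "GG":
        return "no_NGG_pam"
    if any(s in spacer for s in BSMBI_SITES):
        return "bsmbi_site"
    return "ok"


def filter_library(candidates: Dict[str, str]) -> Dict[str, str]:
    """Keep spacers that pass all checks and are unique within the library."""
    accepted, seen = {}, set()
    for name, site in candidates.items():
        spacer = site[:20].upper()
        if check_site(site) == "ok" and spacer not in seen:
            accepted[name] = spacer
            seen.add(spacer)
    return accepted


# Hypothetical candidate sites (spacer + PAM)
candidates = {
    "kinase1_g1": "GACCTGAAGCTGACGTACCA" + "TGG",
    "kinase1_g2": "ATCGTCTCAGGATCCATGCA" + "AGG",  # contains a BsmBI site
    "kinase2_g1": "TTGCACCATGGATCAGCTAA" + "TAC",  # no NGG PAM
    "kinase2_g2": "GACCTGAAGCTGACGTACCA" + "CGG",  # duplicate spacer
}
library = filter_library(candidates)
print(library)
```

Only `kinase1_g1` survives in this toy set; the other three fail on the BsmBI site, the missing PAM, and the duplicate spacer respectively.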

Table 1: Quantitative Design Parameters for a Focused Kinase Library

Design Parameter | Target Value/Range | Rationale
gRNAs per gene | 5 | Balances statistical confidence with library size.
Predicted On-Target Score (Doench '16) | ≥ 0.65 | Ensures high activity.
Max. Off-Target Sites (≤3 mismatches) | ≤ 5 | Minimizes confounding phenotypes.
Spacer Length | 20 nucleotides | Standard for SpCas9.
Genomic Coverage | 500 human kinase genes | Focused, hypothesis-driven library.
Total Library Size | 2,500 gRNAs | Manageable for in vivo delivery and sequencing.
Non-Targeting Controls | 100 gRNAs (4% of library) | Controls for non-specific effects.
Positive Controls (e.g., essential genes) | 50 gRNAs (2% of library) | Controls for knockout efficacy.

3. Detailed Protocol: Oligo Pool to Cloned Plasmid Library

A. Materials: Oligo Pool Synthesis and Preparation

  • Designed Oligo Pool: Commercially synthesized as an oligonucleotide library (e.g., Twist Bioscience). Format: 5'-ACCG-[20nt spacer]-GTTTT-3' (forward) and 5'-AAAC-[reverse complement of 20nt spacer]-C-3' (reverse).
  • Cloning Vector: Lentiviral backbone (e.g., lentiGuide-Puro, Addgene #52963) pre-digested with BsmBI-v2.
  • Enzymes: BsmBI-v2 restriction enzyme, T4 DNA Ligase, T7 DNA Polymerase.
  • Kits: PCR Purification Kit, Gel Extraction Kit, DNA Clean & Concentrator Kit.
  • Bacteria: Endura electrocompetent cells (or similar high-efficiency, recA- strain).

B. Step-by-Step Methodology

Part 1: Amplification of the Oligo Pool

  • PCR Amplification: Set up 4 x 100 µL PCR reactions to amplify the oligo pool.
    • Reaction Mix: 10 ng oligo pool, 0.5 µM forward/reverse primers (containing overhangs for Golden Gate assembly), 1x Q5 Hot Start Master Mix.
    • Cycling Conditions: 98°C 30s; (98°C 10s, 63°C 20s, 72°C 20s) x 14 cycles; 72°C 2 min.
  • Purification: Pool PCR reactions and purify using a PCR Purification Kit. Elute in 30 µL nuclease-free water. Quantify by UV spectrophotometry.

Part 2: Golden Gate Assembly

  • Assembly Reaction: Set up a 20 µL Golden Gate assembly.
    • Reaction Mix: 100 ng BsmBI-digested vector, 20 ng purified PCR product (3:1 insert:vector molar ratio), 1 µL BsmBI-v2, 1 µL T4 DNA Ligase, 1x T4 Ligase Buffer.
    • Cycling Conditions: (42°C 5 min, 20°C 5 min) x 30 cycles; 80°C 5 min; hold at 4°C. BsmBI-v2 is cycled at 42°C in Golden Gate reactions; 37°C applies to its isoschizomer Esp3I.
  • Clean-up: Treat reaction with 1 µL Proteinase K (10 mg/mL) at 37°C for 15 min. Purify using a DNA Clean & Concentrator Kit. Elute in 12 µL.

Part 3: Bacterial Transformation and Library Amplification

  • Electroporation: Transform 2 µL of purified assembly into 50 µL Endura electrocompetent cells per manufacturer's protocol. Plate 1% of the transformation on selective agar to assess colony count. Plate the remainder on large, low-salt LB agar plates (245 x 245 mm) with appropriate antibiotic (e.g., ampicillin).
  • Colony Collection: Incubate at 32°C for 18-24 hours. Ensure colony count is >500x library size (e.g., >1.25 million colonies for a 2,500-gRNA library). Flood plates with LB medium, scrape colonies, and pool into a single culture for maxiprep.
  • Plasmid Library Recovery: Perform a Maxiprep (e.g., using a Qiagen Plasmid Plus Maxi Kit) on the pooled bacterial culture. Quantify the final plasmid library. The expected yield is 200-500 µg.
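
The colony coverage check in Part 3 is simple arithmetic, sketched below in Python. The function name, the counted colony number, and the dilution factor are illustrative assumptions; in practice the 1% aliquot usually needs further serial dilution before colonies can be counted.

```python
def estimate_coverage(colonies_counted, dilution_factor, library_size):
    """Estimate total transformants and fold coverage of the library.

    dilution_factor is how many times smaller the counted plate's input was
    than the whole transformation (e.g., 1% aliquot diluted 100-fold = 10,000).
    """
    total_colonies = colonies_counted * dilution_factor
    return total_colonies, total_colonies / library_size


# Hypothetical count: 150 colonies on a plate that received 1/10,000 of the transformation.
total, fold = estimate_coverage(colonies_counted=150, dilution_factor=10_000, library_size=2_500)
meets_target = fold >= 500  # the >500x threshold stated in the protocol
print(total, fold, meets_target)
```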

Part 4: Quality Control (QC) by Next-Generation Sequencing (NGS)

  • Library Preparation for NGS: Amplify the gRNA inserts from 100 ng of the final plasmid library using primers containing Illumina adapters and sample indexes.
  • Sequencing and Analysis: Sequence on an Illumina MiSeq (or equivalent) with a minimum of 100 reads per gRNA. Analyze the resulting FASTQ files to confirm:
    • Library Complexity: >90% of designed gRNAs are present.
    • Uniformity: No single gRNA constitutes >0.1% of total reads.

Table 2: Expected QC Metrics for the Cloned Library

QC Metric | Acceptance Criteria | Purpose
Plasmid Yield | > 200 µg | Sufficient for lentivirus production.
A260/A280 Ratio | 1.8 - 2.0 | Indicates pure DNA.
gRNA Representation | > 90% of designed gRNAs detected | Ensures library completeness.
Read Distribution Evenness | Gini Coefficient < 0.2 | Confirms lack of strong amplification bias.
Non-Targeting Control Presence | 100% detected | Validates cloning success.
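
The representation and evenness metrics above can be computed from a table of per-gRNA read counts. The following is a minimal Python sketch, assuming the counts have already been extracted from the FASTQ files into a dictionary; the function names and the four-guide toy input are hypothetical, and the acceptance thresholds only become meaningful for a full-size library.

```python
def gini(counts):
    """Gini coefficient of read counts (0 = perfectly even)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def library_qc(counts_by_guide, designed_guides):
    """Summarize representation and uniformity for a cloned library."""
    counts = [counts_by_guide.get(g, 0) for g in designed_guides]
    total = sum(counts)
    return {
        "fraction_detected": sum(1 for c in counts if c > 0) / len(counts),
        "max_read_fraction": max(counts) / total if total else 0.0,
        "gini": gini(counts),
    }


# Toy input: four designed guides, one of which dropped out.
designed = ["g1", "g2", "g3", "g4"]
observed = {"g1": 120, "g2": 80, "g3": 100}
qc = library_qc(observed, designed)
print(qc)
```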

4. The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Materials for Library Construction

Item | Function | Example Product/Catalog #
Custom Oligo Pool | Source of all gRNA spacer sequences. | Twist Bioscience Custom Oligo Pools
BsmBI-v2 Restriction Enzyme | Type IIS enzyme for Golden Gate assembly; cuts outside its recognition site. | NEB #R0739S
High-Efficiency Cloning Vector | Lentiviral backbone for mammalian expression of gRNA and selection marker. | lentiGuide-Puro (Addgene #52963)
Electrocompetent E. coli | High-transformation-efficiency bacteria for library propagation. | Lucigen Endura Electrocompetent Cells (#60242-2)
PCR Purification Kit | For cleaning up enzymatic reactions. | Zymo Research DNA Clean & Concentrator Kit (#D4033)
Maxiprep Kit | For high-yield, high-quality plasmid DNA isolation from bacterial cultures. | Qiagen Plasmid Plus Maxi Kit (#12963)
Next-Generation Sequencer | For quality control of gRNA representation and uniformity. | Illumina MiSeq System

5. Visualization: Library Construction Workflow

Pooled gRNA Library Construction Workflow

Within the broader thesis investigating scalable in vivo functional genomics via MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq integration, this application note details the core wet-lab workflow. This step is critical for transitioning from pooled library construction to the delivery of multiplexed perturbations into a live animal model, enabling high-resolution in vivo screening.

The MIC-Drop workflow involves three consecutive, integrated phases: (1) Generating a monodisperse water-in-oil emulsion, (2) Co-encapsulating barcoded perturbation vectors (e.g., CRISPR guide RNA plasmids) with individual cells, and (3) Precisely injecting microdroplets into the target organism (e.g., zebrafish embryo).

Detailed Protocols

Protocol: Microdroplet Generation via Flow-Focusing

Objective: To produce monodisperse aqueous microdroplets in a fluorinated oil carrier phase.

Materials:

  • Microfluidic droplet generator chip (e.g., 30-50 µm channel width).
  • Programmable syringe pumps (2).
  • Gas-tight glass syringes (1 mL).
  • Aqueous Phase: 1% PFPE-PEG surfactant in nuclease-free water.
  • Oil Phase: Fluorinated oil (e.g., 3M Novec 7500) with 2% (w/w) fluorosurfactant.
  • Collection tube (PCR strip tube or 0.5 mL Eppendorf).

Method:

  • Prime System: Load the oil phase into a 1 mL syringe and connect it to the oil inlet of the chip via tubing (in a flow-focusing geometry the continuous oil phase enters through the side channels). Load the aqueous phase into a 1 mL syringe for the aqueous inlet (center channel). Place syringes on pumps.
  • Flush: Run the oil pump at 500 µL/hr for 5 minutes to fill all channels and remove air bubbles.
  • Generate Droplets: Set flow rates. A typical ratio is Oil:Aqueous = 3:1 (e.g., Oil at 1500 µL/hr, Aqueous at 500 µL/hr). Start pumps simultaneously and collect effluent into a tube on ice for 10-15 minutes. Droplets should be visually uniform under a microscope.
  • Store: Collected droplets can be stored at 4°C for several hours before encapsulation.

Protocol: Cell & Perturbation Library Co-Encapsulation

Objective: To encapsulate single cells and single barcoded perturbation vectors within individual microdroplets.

Materials:

  • Prepared microdroplets (from the microdroplet generation protocol above).
  • Target cells in single-cell suspension (e.g., dissociated zebrafish cells, cultured cells) at high viability (>90%).
  • MIC-Drop perturbation library (e.g., plasmid library at characterized concentration).
  • Cell staining dye (e.g., Calcein AM for viability).
  • Centrifuge with swing-bucket rotor for droplet handling.
  • Modified aqueous phase: 1X PBS, 1% PFPE-PEG surfactant, cells, and library.

Method:

  • Prepare Aqueous Mix: Pellet 1x10^6 cells, resuspend in 1 mL of modified aqueous phase containing the perturbation library at a limiting dilution concentration (e.g., 50 pM) to ensure a high probability of single-guide, single-cell encapsulation. Add viability dye per manufacturer's protocol.
  • Re-Generate Droplets: Use the aqueous mix from Step 1 as the new aqueous phase input. Repeat the microdroplet generation protocol above to generate "loaded" droplets.
  • Incubate for Lysis: Transfer collected droplets to a thermal cycler. Incubate at 65°C for 15 minutes to lyse encapsulated cells and release cellular mRNA.
  • Merge with RT Mix Droplets (Optional for Perturb-seq): If performing direct mRNA capture, generate a separate batch of droplets containing reverse transcription (RT) master mix. Use a droplet pairing chip to pairwise merge cell lysate droplets with RT mix droplets via an electric field or passive coalescence.
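
Cell loading at limiting dilution follows Poisson statistics, which can be checked with a short calculation. The Python sketch below assumes spherical ~50 µm droplets and the 1x10^6 cells/mL suspension from Step 1; the function names are illustrative. The same Poisson model can be applied to library copies per droplet once the mean copy number is known.

```python
import math


def droplet_volume_ml(diameter_um):
    """Volume of a spherical droplet in mL (1 cm^3 = 1 mL)."""
    radius_cm = diameter_um * 1e-4 / 2
    return (4 / 3) * math.pi * radius_cm ** 3


def poisson_pmf(k, lam):
    """Probability of exactly k items per droplet for a mean of lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


cells_per_ml = 1e6
lam_cells = cells_per_ml * droplet_volume_ml(50)   # mean cells per droplet
p_occupied = 1 - poisson_pmf(0, lam_cells)          # droplets with >=1 cell
p_single_given_occupied = poisson_pmf(1, lam_cells) / p_occupied

print(f"mean cells/droplet: {lam_cells:.3f}")
print(f"droplets with a cell: {p_occupied:.1%}")
print(f"occupied droplets holding exactly one cell: {p_single_given_occupied:.1%}")
```

Under these assumptions roughly 6% of droplets contain a cell, and nearly all occupied droplets contain exactly one, which is the rationale for keeping the cell-positive fraction low.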

Protocol: High-Throughput Microinjection into Zebrafish Embryos

Objective: To deliver thousands of encapsulated perturbations into developing zebrafish embryos at the 1-4 cell stage.

Materials:

  • Zebrafish embryos at 1-4 cell stage.
  • Loaded microdroplets (from the co-encapsulation protocol above).
  • Microinjection rig: Micromanipulator, pneumatic picopump, and pulled glass capillary needles (~10 µm tip opening).
  • Injection plate with agarose grooves.
  • Fluorinated oil to backfill needle.

Method:

  • Prepare Needle: Backfill the injection needle with fluorinated oil using a fine gel-loading tip. Front-load droplets by carefully aspirating ~2 µL of the droplet emulsion into the needle tip.
  • Prepare Embryos: Align dechorionated embryos along the groove of an agarose plate.
  • Calibrate Injection: Using a blank droplet emulsion, calibrate the injection pressure and pulse duration to deliver a consistent droplet volume (~1 nL, containing ~5-10 droplets) into the yolk or cell cytoplasm.
  • High-Throughput Injection: Systematically inject each aligned embryo. Target 500-2000 injected embryos per experiment.
  • Recovery: Post-injection, carefully transfer embryos to egg water and incubate at 28°C. Screen for normal development at 24 hours post-fertilization (hpf) before sorting for downstream analysis.

Data Presentation

Table 1: Optimized Parameters for MIC-Drop Workflow Steps

Workflow Step | Key Parameter | Optimal Value / Range | Impact on Outcome
Droplet Generation | Oil:Aqueous Flow Rate Ratio | 3:1 | Determines droplet size (~50 µm) & monodispersity.
Droplet Generation | Total Flow Rate | 2000 µL/hr | Affects throughput and stability of droplet formation.
Encapsulation | Cell Concentration | 1x10^6 cells/mL | Targets <10% of droplets containing a cell (Poisson distribution).
Encapsulation | Library Plasmid Concentration | 50 pM | Limiting dilution intended to favor a single plasmid per cell-containing droplet; verify empirically.
Encapsulation | Lysis Temperature/Time | 65°C for 15 min | Ensures complete cell lysis and mRNA release without damaging nucleic acids.
Microinjection | Injection Volume | ~1 nL | Balances perturbation delivery with embryo viability.
Microinjection | Droplets per Injection | 5-10 | Raises the chance that each injection delivers at least one loaded droplet.

Table 2: Critical Quality Control Checkpoints

Checkpoint | Measurement Method | Target Metric | Required Action if Out of Spec
Droplet Uniformity | Microscopy + ImageJ | CV of diameter < 5% | Adjust flow rates or check chip/channel cleanliness.
Encapsulation Efficiency | Flow Cytometry (droplet stream) | <10% cell-positive droplets | Adjust cell concentration in aqueous phase.
Cell Viability Post-Encapsulation | Fluorescence (Calcein AM+) | >80% in droplets | Check surfactant biocompatibility; reduce lysis time.
Embryo Viability (24 hpf) | Stereomicroscope observation | >70% normal development | Reduce injection volume; check needle sharpness.
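
The droplet uniformity checkpoint reduces to a coefficient of variation (CV) on measured diameters. A minimal Python sketch follows, assuming diameters have already been exported from ImageJ as a list of values in µm; the measurements shown are made up for illustration.

```python
import statistics


def diameter_cv(diameters_um):
    """Coefficient of variation (sample SD / mean) of droplet diameters."""
    return statistics.stdev(diameters_um) / statistics.mean(diameters_um)


# Hypothetical measurements in µm.
measured = [49.0, 50.5, 51.0, 50.0, 49.5]
cv = diameter_cv(measured)
print(f"CV = {cv:.2%}, within spec: {cv < 0.05}")
```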

Visualizations

Diagram 1: MIC-Drop Workflow Overview

Diagram 2: Single-Cell, Single-Guide Co-Encapsulation

The Scientist's Toolkit: Essential Research Reagents & Materials

Item | Function in MIC-Drop Workflow | Key Consideration
Fluorinated Oil (Novec 7500) | Continuous phase for droplet generation; immiscible with water, biocompatible. | Low viscosity and high oxygen permeability are crucial for cell health.
PFPE-PEG Block Copolymer Surfactant | Stabilizes water-in-oil droplets, prevents coalescence. | Critical for maintaining droplet integrity during thermal lysis and injection.
Microfluidic Chips (Flow-Focusing) | Generates highly uniform monodisperse droplets via precise fluidic control. | Channel diameter (30-50 µm) determines final droplet size and payload capacity.
Barcoded sgRNA Plasmid Library | The multiplexed perturbation vector (e.g., for CRISPR knockout). | Must be purified to high quality and quantified accurately for limiting dilution.
Pneumatic Picopump Microinjector | Delivers precise, repeatable nanoliter volumes of droplet emulsion. | Allows high-throughput injection of hundreds of embryos with consistent payload.
Pulled Glass Capillary Needles | Fine tip for embryo injection without significant damage. | Tip aperture (~10 µm) must be large enough to pass droplets but small enough to preserve viability.

Within the broader thesis framework integrating MIC-Drop with Perturb-seq for in vivo functional genomics, this protocol details the critical step of generating a high-complexity lentiviral perturbation library and delivering it into a live animal model. This enables single-cell RNA sequencing (scRNA-seq) to read out both the genetic perturbation and the consequent transcriptional state of thousands of cells in situ.

Key Quantitative Benchmarks for Library Production and Delivery

Table 1: Quantitative Benchmarks for Lentiviral Library Production

Parameter | Target Specification | Typical Range / Value | Measurement Method
Library Representation | >95% of designed constructs | 90-99% | NGS of plasmid library vs. packaged virus
Viral Titer | >1 x 10^8 TU/mL (concentrated) | 1x10^8 - 5x10^8 TU/mL | qPCR or fluorescence-based transduction
MOI (Multiplicity of Infection) in vitro | 0.3 - 0.5 | 0.2 - 0.8 | Fluorescence/functional assay + cell counting
Transduction Efficiency in vivo | Cell-type dependent (e.g., >30% for target population) | 10-70% | scRNA-seq perturbation detection rate
Insert Size (sgRNA + barcode) | ~200-350 bp | 200-400 bp | Post-packaging NGS amplicon sequencing
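
The low MOI target reflects a trade-off between transduction rate and single-integration purity, which can be approximated with a Poisson model. The Python sketch below is illustrative only: the function names are hypothetical, and the coverage of 500 transduced cells per guide is an assumed planning value, not a figure from this protocol.

```python
import math


def transduction_stats(moi):
    """Return (fraction transduced, fraction of transduced cells with one integration)."""
    p_transduced = 1 - math.exp(-moi)
    p_single = moi * math.exp(-moi)
    return p_transduced, p_single / p_transduced


def cells_to_expose(library_size, cells_per_guide, moi):
    """Cells that must be exposed to virus to reach the desired transduced coverage."""
    p_transduced, _ = transduction_stats(moi)
    return math.ceil(library_size * cells_per_guide / p_transduced)


p_transduced, single_fraction = transduction_stats(0.3)
n_cells = cells_to_expose(library_size=2_500, cells_per_guide=500, moi=0.3)
print(f"transduced: {p_transduced:.1%}, single-integration among transduced: {single_fraction:.1%}")
print(f"cells to expose: {n_cells:,}")
```

At an MOI of 0.3, about a quarter of cells are transduced and most of those carry a single perturbation; raising the MOI increases yield at the cost of more multiply-perturbed cells.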

Table 2: In Vivo Delivery Parameters for Common Models

Animal Model | Target Tissue | Delivery Method | Typical Volume & Titer | Key Efficiency Consideration
Mouse (Adult) | Brain (Cortex) | Stereotactic Injection | 1-2 µL, >1e8 TU/mL | Limited diffusion, local transduction.
Mouse | Immune System in vivo | Tail Vein Injection (systemic) | 100-200 µL, >1e8 TU/mL | Lower effective MOI, broad distribution.
Mouse | Liver in vivo | Hydrodynamic Tail Vein Injection | 1-2 mL, >1e7 TU/mL | High hepatocyte transduction, acute stress.
Mouse (P0-P2) | Brain (Developing) | Intraventricular Injection | 1-3 µL, >5e7 TU/mL | Widespread progenitor cell transduction.
Organoid ex vivo | Cerebral Organoids | Microinjection / Soaking | 0.5-2 µL, >1e8 TU/mL | Penetration depth vs. organoid size.

Detailed Experimental Protocols

Protocol 3.1: High-Titer Lentiviral Library Production (Lenti-X 293T System)

Objective: To produce a replication-incompetent lentiviral library from a pooled sgRNA plasmid library while maintaining complexity.

Materials: See The Scientist's Toolkit (Table 3).

Method:

  • Day 0: Plate Cells: Seed Lenti-X 293T cells in poly-L-lysine coated 15-cm dishes at ~4x10^6 cells/dish in 20 mL antibiotic-free DMEM+10% FBS. Target 70-80% confluency for transfection next day.
  • Day 1: Transfection (Calcium Phosphate Method):
    a. For each dish, prepare the DNA mix in 1.5 mL sterile water:
      - 18 µg sgRNA expression plasmid library (e.g., lentiGuide-Puro, with barcodes).
      - 12 µg psPAX2 (packaging plasmid).
      - 6 µg pMD2.G (VSV-G envelope plasmid).
    b. Add 216 µL of 2M CaCl₂ to the DNA mix. Vortex briefly.
    c. In a separate tube, add 1.8 mL of 2X HEPES Buffered Saline (HBS). Using a bubbler, slowly add the DNA-CaCl₂ mixture dropwise to the HBS while bubbling. A fine precipitate should form.
    d. Incubate the mixture at room temperature for 1-2 minutes, then distribute it evenly dropwise over the 293T cell medium. Swirl gently.
    e. Incubate cells at 37°C, 5% CO₂.
  • Day 2: Medium Change: ~16 hours post-transfection, carefully aspirate medium containing transfection complex and replace with 20 mL fresh, pre-warmed DMEM+10% FBS + 1% BSA.
  • Day 3 & 4: Viral Harvest: ~48 and 72 hours post-transfection, collect the virus-containing supernatant. Pass through a 0.45 µm PES filter to remove cell debris. Pool harvests.
  • Virus Concentration (Ultracentrifugation):
    a. Transfer filtered supernatant to ultracentrifuge tubes. Balance precisely.
    b. Centrifuge at 50,000-70,000 x g for 2 hours at 4°C.
    c. Carefully decant supernatant. Add 200-500 µL of cold HBSS + 1% BSA per original 15-cm dish and let the pellet soften on ice for 2+ hours (or overnight at 4°C), then resuspend by gentle pipetting.
    d. Aliquot, snap-freeze in liquid nitrogen, and store at -80°C.
  • Titer Determination (qPCR):
    a. Treat 1 µL of concentrated virus with DNase I to remove residual plasmid DNA.
    b. Perform viral RNA extraction, reverse transcription, and qPCR targeting the WPRE region or psi packaging signal, alongside a lentivirus standard of known titer.
    c. Calculate the titer from the standard curve. qPCR measures viral genome copies, so report transducing units per mL (TU/mL) only when the standard itself was calibrated by a functional transduction assay.
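
The standard-curve step of the qPCR titration can be sketched as a linear fit of Ct against log10 copy number. The Python example below uses made-up standards and Ct values purely for illustration; the function names are hypothetical, and converting copies per reaction to a per-mL titer additionally requires the sample's dilution and input volume.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Interpolate copies per reaction from a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards (copies per reaction) and their Ct values.
standards = [1e4, 1e5, 1e6, 1e7]
standard_cts = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards, standard_cts)
efficiency = 10 ** (-1 / slope) - 1            # ~1.0 corresponds to 100% efficiency
sample_copies = copies_from_ct(25.0, slope, intercept)
print(f"slope={slope:.2f}, efficiency={efficiency:.0%}, sample={sample_copies:.3g} copies/reaction")
```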

Protocol 3.2: In Vivo Stereotactic Intracranial Delivery

Objective: To deliver lentiviral library into a specific brain region of an adult mouse for in vivo Perturb-seq.

Materials: Stereotactic frame, microsyringe pump, Hamilton syringe, disinfectants, analgesics, heating pad.

Method:

  • Pre-surgery:
    a. Thaw an aliquot of concentrated viral library on ice.
    b. Anesthetize mouse (e.g., using isoflurane 3-5% for induction, 1-3% for maintenance). Confirm depth of anesthesia via toe pinch.
    c. Administer pre-operative analgesic (e.g., buprenorphine SR).
    d. Place mouse in stereotactic frame with heating pad. Apply ophthalmic ointment.
    e. Shave scalp and disinfect with alternating scrubs of betadine and 70% ethanol (3x each).
  • Surgery:
    a. Make a midline sagittal incision (~1 cm) to expose the skull.
    b. Level the skull such that Bregma and Lambda are at the same dorsal-ventral coordinate.
    c. Using coordinates from a mouse brain atlas, mark the injection site(s) relative to Bregma. Create a small burr hole with a dental drill.
    d. Load virus into a sterile Hamilton syringe. Mount syringe onto the microinjector.
    e. Lower the needle slowly to the target depth (e.g., cortex: -0.5 mm DV).
    f. Inject virus at a slow, constant rate (e.g., 100 nL/min). For 1 µL total volume, inject over 10 minutes.
    g. After injection, wait 5-10 minutes to allow for diffusion before slowly retracting the needle.
  • Post-surgery:
    a. Suture the incision. Administer post-operative fluids and analgesia.
    b. Monitor animal until fully recovered.
  • Incubation: Allow 7-21 days for robust transgene expression and phenotypic manifestation before tissue harvest for scRNA-seq.

Visualized Workflows and Pathways

Diagram 1: Lentiviral Library Production Workflow

Title: Lentiviral Library Production Steps

Diagram 2: In Vivo Perturb-seq Delivery & Analysis Logic

Title: In Vivo Delivery to scRNA-seq Pipeline

The Scientist's Toolkit

Table 3: Essential Research Reagent Solutions

Item | Function & Rationale | Example Product/Catalog
Lenti-X 293T Cells | HEK 293T derivative optimized for high-titer lentivirus production with minimal splicing. | Takara Bio, 632180
psPAX2 Packaging Plasmid | 2nd generation packaging plasmid providing Gag, Pol, Rev, Tat. Essential for virus particle formation. | Addgene, 12260
pMD2.G Envelope Plasmid | Encodes VSV-G glycoprotein, providing broad tropism and enabling virus concentration by ultracentrifugation. | Addgene, 12259
Polybrene (Hexadimethrine bromide) | Cationic polymer that enhances viral transduction efficiency by neutralizing charge repulsion. | Sigma-Aldrich, H9268
DNase I (RNase-free) | Critical for titering to remove unpackaged plasmid DNA from viral preps, preventing false positives. | Thermo Fisher, EN0521
Lentivirus qPCR Titer Kit | Quantitative measurement of physical viral particles via detection of conserved genomic RNA region (e.g., WPRE). | Takara Bio, 631235
Ultracentrifuge & Rotor | Equipment for high-speed pelleting and concentration of viral particles from large-volume supernatants. | Beckman Coulter, Optima XE-90
Stereotactic Instrument | Precision apparatus for targeting specific brain coordinates in rodent models for viral delivery. | Kopf Instruments, Model 940
Hamilton Syringe (10 µL) | Precision glass syringe for nanoliter-scale viral delivery in stereotactic surgery. | Hamilton, 80300
Single-Cell Dissociation Kit | Enzyme-based tissue dissociation reagents optimized for live cell yield and viability for scRNA-seq. | Miltenyi Biotec, Neural Tissue Dissociation Kit

1. Introduction

Within a thesis focused on integrating MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq for high-throughput in vivo functional genomics, the selection of an appropriate model system is paramount. This step determines the biological relevance, scalability, and mechanistic depth of the screening data. Mice, zebrafish, and organoids represent three pivotal models, each with distinct advantages and limitations for in vivo perturbation screening.

2. Comparative Analysis: Mice, Zebrafish, and Organoids

The following table synthesizes key quantitative and qualitative parameters critical for model selection in MIC-Drop/Perturb-seq studies.

Table 1: Model System Comparison for In Vivo Screening

Parameter | Mouse (Mus musculus) | Zebrafish (Danio rerio) | Organoids (e.g., Intestinal, Cerebral)
Genetic Tractability | High (complex transgenesis, Cre-lox) | Very High (efficient CRISPR, Tol2 transgenesis) | Moderate-High (CRISPR feasible, clonal derivation)
Throughput (Scale) | Low-Medium (cost/time intensive) | Very High (100s of embryos/day) | High (scalable in 96/384-well plates)
In Vivo Complexity | High (immune system, circulation, physiology) | Medium (transparent, simpler physiology) | Low (simplified tissue microanatomy)
Imaging Accessibility | Low (requires invasive window) | Very High (optical clarity of embryos) | High (3D confocal, live-cell)
Cost per Perturbation | High ($50-$500+) | Low (<$10 per embryo) | Medium ($20-$100 per well)
Time to Result | Months | Weeks | Weeks
Suitability for MIC-Drop | High for pooled barcoded sgRNA delivery | Excellent for direct embryo injection of barcoded constructs | Excellent for lentiviral transduction in culture
Perturb-seq Compatibility | Challenging (cell recovery from tissues) | Good (dissociation of whole embryos) | Excellent (easy single-cell suspension)
Key Application | Systemic disease, immunology, cancer | Developmental biology, toxicology, rapid phenotype screening | Disease modeling, host-pathogen, epithelial biology

3. Detailed Experimental Protocols

Protocol 3.1: MIC-Drop sgRNA Library Delivery in Zebrafish Embryos

Objective: To introduce a barcoded MIC-Drop sgRNA library into zebrafish embryos for large-scale in vivo knockout screening.

Materials: MIC-Drop sgRNA library (lyophilized), phenol red, zebrafish injection rig, fine-glass needle, one-cell stage zebrafish embryos.

  • Library Resuspension: Reconstitute the lyophilized MIC-Drop library in nuclease-free water to a final sgRNA concentration of 25 ng/µL. Add 0.1% phenol red for visualization.
  • Needle Preparation: Pull fine-glass capillaries to create injection needles. Back-load 2 µL of the library mixture.
  • Embryo Injection: Align one-cell stage embryos on an agarose plate. Using a micromanipulator, inject approximately 1 nL of the library mixture directly into the cell cytoplasm or yolk. Target 500-1000 embryos per library screen.
  • Post-Injection Care: Incubate embryos in E3 embryo medium at 28.5°C. Screen for perturbation phenotypes at desired developmental stages (e.g., 24-72 hours post-fertilization).
  • Tissue Dissociation & Sequencing: For Perturb-seq, pool phenotypically similar embryos, dechorionate, and dissociate tissues using gentle enzymatic treatment (e.g., Liberase TM). Process the single-cell suspension for 10x Genomics library preparation and single-cell RNA sequencing.

Protocol 3.2: Lentiviral Transduction of Organoids for Pooled Perturb-seq

Objective: To generate a genetically perturbed organoid culture for single-cell transcriptomic phenotyping.

Materials: Matrigel, intestinal stem cell organoids, lentiviral MIC-Drop sgRNA pool (MOI~0.3), Polybrene (8 µg/mL), Y-27632 (ROCK inhibitor).

  • Organoid Dissociation: Mechanically and enzymatically dissociate mature organoids to single cells or small clumps using TrypLE. Quench with complete media containing BSA.
  • Lentiviral Transduction: Resuspend 1x10^5 cells in 100 µL of medium containing Polybrene and the lentiviral sgRNA pool. Seed in a 96-well plate. Spinoculate by centrifugation at 600 x g for 60 minutes at 32°C. Incubate for 6 hours.
  • Embedding & Culture: Post-transduction, mix cells with Matrigel and plate as domes. Overlay with complete organoid growth medium supplemented with Y-27632 for 48 hours.
  • Selection & Expansion: If using vectors with antibiotic resistance (e.g., puromycin), apply selection for 5-7 days. Allow organoids to expand for 10-14 days, enabling phenotype manifestation.
  • Single-Cell Harvest for Perturb-seq: Dissociate organoids to single cells. Pass through a 40 µm strainer. Count cells and check viability. Proceed with the 10x Genomics Chromium Next GEM Single Cell 3' Reagent Kit v3.1 according to the manufacturer's instructions, targeting 10,000 cells per condition.

4. Visualization of Workflow and Pathway

Title: In Vivo Screening Workflow with Model Selection

Title: From Genetic Perturbation to scRNA-seq Readout

5. The Scientist's Toolkit: Key Reagent Solutions

Table 2: Essential Research Reagents for MIC-Drop/Perturb-seq Screening

Reagent/Material | Function & Application | Example Product/Catalog
MIC-Drop sgRNA Library | Pre-barcoded, pooled sgRNAs for multiplexed knockout screening. Enables direct linkage of phenotype to genetic perturbation. | Custom synthesized (e.g., Twist Bioscience).
10x Genomics Chromium Next GEM Kit | Enables high-throughput single-cell RNA-seq library construction from dissociated tissues or organoids. Essential for Perturb-seq. | 10x Genomics, 1000121.
Matrigel (Growth Factor Reduced) | Basement membrane matrix for 3D organoid culture and embedding post-transduction. | Corning, 356231.
Liberase TM | Gentle, purified enzyme blend for high-viability dissociation of complex tissues (zebrafish, mouse) and organoids. | Sigma-Aldrich, 5401119001.
Y-27632 (ROCK Inhibitor) | Improves survival of dissociated stem cells and organoids post-transduction/plating by inhibiting apoptosis. | Tocris, 1254.
Polybrene | Cationic polymer that enhances lentiviral transduction efficiency in organoid and primary cell cultures. | Sigma-Aldrich, TR-1003-G.
Cell Ranger | Analysis software for aligning single-cell RNA-seq data and calling CRISPR sgRNA barcodes from the same library (Feature Barcode analysis). Cell Ranger ARC is a separate pipeline for multiome ATAC + gene expression data. | 10x Genomics (Software).
CITE-seq Antibodies (Optional) | Antibody-derived tags for surface protein measurement alongside transcriptome, enabling multimodal phenotyping. | BioLegend TotalSeq.

Within the thesis framework of combining MIC-Drop (Multiplexed Intermixed CRISPR Droplets) with Perturb-seq for in vivo genetic screening, Step 6 is the critical transition from a live, perturbed organism to a digital gene expression matrix. The goal is to capture the single-cell transcriptional consequences of in vivo perturbations (delivered via MIC-Drop) with high fidelity, minimizing technical artifacts that could confound the identification of phenotype-genotype linkages. This protocol details the standardized workflow from tissue processing to ready-to-sequence libraries.


Key Research Reagent Solutions

Reagent/Material | Function in Perturb-seq Workflow
Cold PBS + 1% BSA | Wash and collection buffer; maintains cell viability, reduces enzymatic activity, and prevents cell clumping.
Collagenase IV/Dispase/DNase I Mix | Enzymatic cocktail for gentle tissue dissociation, breaking down extracellular matrix while preserving cell integrity and minimizing RNA degradation.
ACK Lysing Buffer | (For immune-rich tissues) Lyses red blood cells without damaging nucleated cells of interest.
Live/Dead Cell Stain (e.g., DAPI, Propidium Iodide) | Fluorescent viability indicator for downstream fluorescence-activated cell sorting (FACS).
Anti-mouse/rat IgG Magnetic Beads | (For some protocols) Depletion of non-target cells (e.g., immune cells from an epithelial tumor) to enrich for the perturbed cell population.
Chromium Next GEM Chip G (10x Genomics) | Partitions single cells, gel beads with barcoded oligonucleotides, and RT master mix into nanoliter-scale droplets.
Dual Index Kit TT Set A (10x Genomics) | Provides unique sample indexes for multiplexing libraries from different experiments or conditions during sequencing.
SPRIselect Beads (Beckman Coulter) | Size-selects and purifies cDNA and final libraries, removing primers, adapter dimers, and other contaminants.

Detailed Protocols

Tissue Harvesting & Single-Cell Dissociation

This protocol is optimized for a solid tissue (e.g., tumor, liver) from a mouse model subjected to prior MIC-Drop perturbation.

Materials:

  • Pre-chilled dissection tools, PBS/1% BSA on ice
  • GentleMACS Octo Dissociator (Miltenyi) or similar orbital shaker in 37°C incubator
  • 70µm cell strainer
  • Refrigerated centrifuge
  • Hemocytometer or automated cell counter

Method:

  • Rapid Harvest: Euthanize the animal per IACUC protocol. Excise the target tissue immediately and place it in a petri dish with 5 mL of cold PBS/1% BSA on ice.
  • Mechanical Disruption: Mince the tissue into ~1-2 mm³ pieces using sterile scalpels or razor blades.
  • Enzymatic Digestion: Transfer the minced tissue and buffer to a C-tube (Miltenyi). Add the appropriate pre-warmed enzymatic cocktail (e.g., 2 mg/mL Collagenase IV, 1 mg/mL Dispase, 20 µg/mL DNase I in PBS). Cap tightly.
  • Run Dissociation Program: Attach the C-tube to the gentleMACS Octo Dissociator and run the pre-programmed "37C_m_TDK_1" protocol or equivalent (typically 30-45 min at 37°C with intermittent mechanical agitation).
  • Quench & Filter: Add 10 mL of cold PBS/BSA to quench enzymes. Pass the cell suspension through a 70µm cell strainer into a 50mL conical tube. Rinse the strainer with an additional 5-10 mL of buffer.
  • Pellet & Lyse RBCs: Centrifuge at 300-500 x g for 5 min at 4°C. Aspirate supernatant. For tissues with high RBC content, resuspend the pellet in 1-5 mL of ACK buffer for 2 min on ice, then quench with excess PBS/BSA.
  • Wash & Count: Pellet cells again (300 x g, 5 min, 4°C). Resuspend in 1-5 mL of PBS/BSA. Count cells using a hemocytometer with a live/dead stain (e.g., Trypan Blue). Assess viability (target >80%).
  • Viability Enrichment (Optional but Recommended): If viability is suboptimal (<70%), perform dead cell removal using a magnetic bead-based kit or sort for live cells using FACS (DAPI-negative/PI-negative population).
  • Final Resuspension: Adjust viable cell concentration to the target range for your scRNA-seq platform (e.g., 700-1,200 cells/µL in PBS/BSA for 10x Genomics, targeting 10,000 cells per run). Keep on ice until loading.
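
The final resuspension step is a standard C1V1 = C2V2 dilution. The short Python sketch below shows the arithmetic; the function name and the example stock concentration and volume are hypothetical.

```python
def buffer_to_add_ul(stock_cells_per_ul, stock_volume_ul, target_cells_per_ul):
    """Volume of PBS/BSA to add so the suspension reaches the target concentration."""
    if stock_cells_per_ul < target_cells_per_ul:
        raise ValueError("Stock is too dilute: pellet and resuspend in a smaller volume.")
    final_volume_ul = stock_cells_per_ul * stock_volume_ul / target_cells_per_ul
    return final_volume_ul - stock_volume_ul


# Hypothetical count: 2,500 viable cells/µL in 100 µL, diluted to 1,000 cells/µL.
extra = buffer_to_add_ul(stock_cells_per_ul=2500, stock_volume_ul=100, target_cells_per_ul=1000)
print(f"Add {extra:.0f} µL of PBS/BSA")
```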

scRNA-seq Library Preparation (10x Genomics 3’ v3.1/v4 Chemistry)

Follow manufacturer's guidelines precisely. This is an abbreviated overview.

Materials:

  • Chromium Controller & Thermal Cycler
  • Chromium Next GEM Single Cell 3' Reagent Kits v3.1 or v4
  • Agilent Bioanalyzer/TapeStation

Method:

  • Gel Bead-In-Emulsions (GEMs) Generation: Load the Chromium Next GEM Chip G with:
    • 70 µL of RT master mix plus cell suspension (targeting ~10,000 cells),
    • Gel Beads, and
    • Partitioning Oil.
    Run on the Chromium Controller. Single cells, barcoded gel beads, and RT reagents are co-partitioned into ~100,000 oil droplets.
  • Post-GEM-RT Cleanup & cDNA Amplification: Transfer the GEMs to a PCR tube. Perform Reverse Transcription in a thermal cycler to add cell and molecular barcodes. Break droplets, recover barcoded cDNA, and amplify via PCR (12-14 cycles).
  • Library Construction: Fragment the amplified cDNA, attach Illumina adapters, and amplify via a second PCR to add the sample indexes (i7 and i5 when using the Dual Index Kit) and P5/P7 flow cell binding sites. This creates the final Perturb-seq library containing both gene expression reads and the integrated sgRNA barcode from the original perturbation.
  • Quality Control & Quantification: Assess library size distribution (~550 bp peak) on a Bioanalyzer High Sensitivity DNA chip. Quantify via qPCR (Kapa Library Quant Kit) for accurate pooling and sequencing.

Parameter | Target Range / Typical Value | Purpose & Impact
Final Cell Viability (Post-Dissociation) | >80% | Low viability increases background noise from ambient RNA and reduces cell recovery.
Cell Concentration for Loading | 700 - 1,200 cells/µL | Optimizes capture efficiency and minimizes doublet rate.
Target Cell Recovery | 5,000 - 10,000 cells per channel | Ensures sufficient statistical power for perturbation analysis.
cDNA Amplification PCR Cycles | 12 - 14 cycles | Minimizes amplification bias; cycle number depends on input cell count and tissue type.
Final Library Concentration | 2 - 10 nM (by qPCR) | Ensures adequate clustering on sequencer.
Library Fragment Size | ~550 bp (major peak) | Confirms successful adapter ligation and cleanup.
Estimated Sequencing Depth | 20,000 - 50,000 reads/cell | Sufficient for detecting expressed sgRNAs and transcriptome profiling.
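
The recovery and depth targets above translate directly into a sequencing budget and an expected number of cells per perturbation. The Python sketch below is a planning aid under stated assumptions: the function names are illustrative, the library size of 2,500 guides is taken from the library design example, and `assigned_fraction` (the share of recovered cells with a confident guide call) is a hypothetical parameter.

```python
def reads_required(cells, reads_per_cell):
    """Total read pairs needed for one channel at the chosen depth."""
    return cells * reads_per_cell


def mean_cells_per_guide(cells, library_size, assigned_fraction=1.0):
    """Average number of recovered cells carrying each guide."""
    return cells * assigned_fraction / library_size


low = reads_required(10_000, 20_000)
high = reads_required(10_000, 50_000)
per_guide = mean_cells_per_guide(10_000, 2_500)
print(f"{low:,} to {high:,} reads per channel; ~{per_guide:.0f} cells per guide")
```

With these numbers a single channel yields only a handful of cells per guide, so large libraries generally need multiple channels or a smaller guide set to reach useful per-perturbation cell counts.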

Workflow & Pathway Diagrams

Diagram Title: Perturb-seq Tissue to Library Workflow

Diagram Title: Perturb-seq Read Deconvolution Logic

Within the broader thesis investigating the in vivo application of MIC-Drop and Perturb-seq for functional genomics and therapeutic target discovery, this protocol details the critical computational step. Following the generation of single-cell RNA sequencing (scRNA-seq) data from pooled in vivo screens—where single-guide RNAs (sgRNAs) or molecular barcodes identify genetic perturbations within individual cells—this pipeline transforms raw sequencing data into interpretable biological insights. It enables the identification of differentially expressed genes (DEGs), perturbed pathways, and the derivation of gene expression signatures specific to each genetic perturbation, ultimately linking genotype to phenotype in a complex tissue environment.

Application Notes and Core Principles

  • Input Specificity: This pipeline is designed for Perturb-seq (CRISPR-based) or MIC-Drop-like (barcoded molecule-based) data, where cell barcodes, unique molecular identifiers (UMIs), gene identifiers, and perturbation barcodes are embedded in the sequencing reads.
  • Single-Cell Aware: All steps account for the inherent sparsity, noise, and count-based nature of scRNA-seq data (e.g., UMI counts). Normalization and statistical models are specifically chosen for single-cell data.
  • Perturbation-Centric Analysis: The core task is to compare gene expression profiles between cells containing a target perturbation (e.g., a specific sgRNA) and appropriate control cells (e.g., non-targeting sgRNAs or untreated controls) within the same experimental batch.
  • Scalability: The pipeline is built using modular, scriptable tools (e.g., command-line based, R/Python packages) to handle datasets ranging from thousands to millions of cells.

Detailed Computational Protocol

A. Raw Data Processing & Alignment

  • Objective: Convert BCL or FASTQ files into a cell-by-gene count matrix with annotated perturbations.
  • Protocol:
    • Demultiplexing: Use bcl2fastq or mkfastq (Cell Ranger) to generate FASTQ files for each sample lane using the sample index sequences.
    • Read Alignment & Feature Counting: For Perturb-seq data, use Cell Ranger (10x Genomics) or STARsolo to align reads to a combined reference (host genome + sgRNA/barcode sequences). For custom barcode schemes (e.g., MIC-Drop), tools like UMI-tools and kallisto | bustools with a custom index are recommended.
    • Perturbation Call Assignment: Extract perturbation barcodes from reads (in 10x Feature Barcode libraries the guide sequence is read in Read 2, while Read 1 carries the cell barcode and UMI; custom designs may use a dedicated barcode read). Match these to a whitelist of expected barcodes (sgRNA library). Assign each cell barcode to its detected perturbation(s) using a tool like cellranger count (for feature barcoding) or a custom script based on UMI thresholds.
    • Output: A digital gene expression matrix (DGE, genes x cells) in MTX/H5AD/Seurat object format, with associated cell metadata containing the assigned perturbation per cell.
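
A custom UMI-threshold script for perturbation calling can be quite small. The following is a minimal sketch, not the logic used by Cell Ranger: the function name, the 3-UMI minimum, and the 80% dominance cutoff are all assumptions chosen for illustration and should be tuned per dataset.

```python
from collections import defaultdict


def assign_perturbations(guide_umis, min_umis=3, min_fraction=0.8):
    """Call one guide per cell from (cell_barcode, guide_id, umi_count) records.

    A cell is 'single' when its top guide passes min_umis and holds at least
    min_fraction of that cell's guide UMIs, 'ambiguous' when guides pass the
    UMI threshold but none dominates, and 'unassigned' otherwise.
    """
    per_cell = defaultdict(dict)
    for cell, guide, n in guide_umis:
        per_cell[cell][guide] = per_cell[cell].get(guide, 0) + n

    calls = {}
    for cell, counts in per_cell.items():
        passing = {g: n for g, n in counts.items() if n >= min_umis}
        if not passing:
            calls[cell] = ("unassigned", None)
            continue
        total = sum(counts.values())
        top_guide, top_n = max(passing.items(), key=lambda kv: kv[1])
        if top_n / total >= min_fraction:
            calls[cell] = ("single", top_guide)
        else:
            calls[cell] = ("ambiguous", tuple(sorted(passing)))
    return calls


records = [
    ("cell1", "sgA", 10), ("cell1", "sgB", 1),   # sgA dominates
    ("cell2", "sgA", 5), ("cell2", "sgC", 5),    # two guides, no winner
    ("cell3", "sgB", 1),                         # below UMI threshold
]
calls = assign_perturbations(records)
```

Cells called "ambiguous" are usually excluded from single-perturbation analyses or analyzed separately as multi-guide cells.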

B. Quality Control (QC) & Filtering

  • Objective: Remove low-quality cells and uninformative genes to reduce noise.
  • Protocol:
    • Calculate per-cell metrics: total UMI counts (library size), number of detected genes, and percentage of mitochondrial/ribosomal RNA reads.
    • Visualize metrics and set thresholds (see Table 1). Filter out cells that are outliers (likely dead, damaged, or doublets).
    • Filter genes detected in fewer than a minimum number of cells (e.g., <10 cells).
  • Data Table 1: Typical QC Thresholds for Murine In Vivo scRNA-seq
Metric | Low-Quality Threshold | Likely-Multiplet Threshold | Rationale
Total UMIs/Cell | < 1,000 | > 50,000 | Low counts indicate dying cells or empty droplets; very high counts may indicate multiplets.
Genes Detected/Cell | < 500 | > 6,000 | Similar rationale to UMI counts.
% Mitochondrial Reads | > 20% | N/A | High percentage indicates cellular stress or apoptosis.
% Ribosomal Reads | > 50% | N/A | Extremely high percentage may indicate low mRNA content.
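
The thresholds in Table 1 can be applied as a simple per-cell predicate. This is a sketch under the assumption that per-cell metrics have already been computed and stored under the hypothetical keys n_umis, n_genes, pct_mito, and pct_ribo; the defaults mirror the table and are starting points, not fixed rules.

```python
def passes_qc(cell, min_umis=1_000, max_umis=50_000,
              min_genes=500, max_genes=6_000,
              max_pct_mito=20.0, max_pct_ribo=50.0):
    """Return True when a cell falls inside every Table 1 threshold."""
    return (
        min_umis <= cell["n_umis"] <= max_umis
        and min_genes <= cell["n_genes"] <= max_genes
        and cell["pct_mito"] <= max_pct_mito
        and cell["pct_ribo"] <= max_pct_ribo
    )


cells = [
    {"id": "good", "n_umis": 8_000, "n_genes": 2_500, "pct_mito": 4.0, "pct_ribo": 12.0},
    {"id": "dying", "n_umis": 600, "n_genes": 300, "pct_mito": 35.0, "pct_ribo": 10.0},
    {"id": "multiplet", "n_umis": 80_000, "n_genes": 7_500, "pct_mito": 3.0, "pct_ribo": 15.0},
]
kept = [c["id"] for c in cells if passes_qc(c)]
```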

C. Normalization, Integration, and Clustering

  • Objective: Correct for technical variation and identify cell states/types present in the data.
  • Protocol:
    • Normalization: Apply a global scaling normalization (e.g., LogNormalize in Seurat: each gene's count is divided by the cell's total counts, multiplied by 10,000, then log1p-transformed) or a variance-stabilizing method (e.g., SCTransform).
    • Variable Feature Selection: Identify 2,000-5,000 highly variable genes (HVGs) that drive biological heterogeneity.
    • Integration (if multiple samples/batches): Use Harmony, Seurat's CCA, or Scanorama to correct for batch effects while preserving biological variation.
    • Dimensionality Reduction & Clustering: Perform PCA on HVGs. Construct a shared nearest neighbor (SNN) graph. Cluster cells using the Louvain or Leiden algorithm on the SNN graph (resolution ~0.2-1.0). Visualize with UMAP or t-SNE.
    • Cell Type Annotation: Use known marker genes to manually annotate clusters or employ a reference-based annotation tool (e.g., SingleR).
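
To make the global scaling step concrete, the sketch below reproduces the LogNormalize arithmetic for a single cell in plain Python. It is for illustration only (the function name is hypothetical); in practice Seurat or Scanpy applies the same transformation to the whole matrix.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Divide each gene count by the cell total, scale, then apply log1p."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}


cell_counts = {"Actb": 30, "Gapdh": 10, "Cd19": 0}
normalized = log_normalize(cell_counts)
```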

D. Differential Expression (DE) & Signature Generation

  • Objective: For each target perturbation, identify DEGs relative to control cells within the same cell type or cluster.
  • Protocol:
    • Subsetting: For each perturbation of interest, subset the dataset to include: a) cells with that perturbation and b) control cells, ideally from the same cell type/cluster and batch.
    • Statistical Testing: Use a single-cell-appropriate DE test. Recommended models include:
      • MAST: (Model-based Analysis of Single-cell Transcriptomics) A two-part hurdle model that accounts for the bimodality (detected vs. not detected) of scRNA-seq data and is typically run on log-normalized expression.
      • Wilcoxon Rank-Sum Test: A non-parametric test robust to outliers, often used on normalized (log) data.
      • DESeq2 (pseudobulk): Aggregate counts per perturbation group within a cell type to create a "pseudobulk" sample, then apply the bulk RNA-seq DE tool DESeq2. This is highly recommended for its robustness and control of false positives.
    • Multiple Testing Correction: Apply the Benjamini-Hochberg procedure to adjust p-values, controlling the False Discovery Rate (FDR). A common significance threshold is adjusted p-value (FDR) < 0.05 and |log2(fold change)| > 0.25.
    • Gene Signature Creation: For each perturbation, the signature is defined as the ranked list of significant DEGs. The top N up- and down-regulated genes (e.g., top 50 each, by log2FC) constitute a concise signature for downstream pathway analysis or comparison.
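
The multiple-testing and thresholding steps can be sketched as follows. This is a plain-Python illustration of the Benjamini-Hochberg adjustment followed by the FDR and fold-change cutoffs quoted above; the function names and the (gene, log2FC, p-value) input layout are assumptions, and the p-values themselves would come from MAST, Wilcoxon, or DESeq2.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(results, fdr=0.05, min_abs_log2fc=0.25):
    """results: list of (gene, log2fc, pvalue). Returns hits sorted by log2FC."""
    padj = benjamini_hochberg([p for _, _, p in results])
    hits = [
        (gene, lfc, q)
        for (gene, lfc, _), q in zip(results, padj)
        if q < fdr and abs(lfc) > min_abs_log2fc
    ]
    return sorted(hits, key=lambda h: h[1], reverse=True)


results = [("A", 1.0, 0.005), ("B", -0.5, 0.01), ("C", 0.1, 0.03), ("D", 2.0, 0.5)]
signature = call_degs(results)
```

The head and tail of the sorted list give the top up- and down-regulated genes for a concise signature.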

E. Downstream Analysis & Visualization

  • Pathway Enrichment Analysis: Input the full gene list, ranked by log2FC or test statistic, into tools like GSEA, or perform over-representation analysis (ORA) on significant genes using databases like GO, KEGG, or Reactome (via clusterProfiler).
  • Signature Scoring: Use methods like AUCell, singscore, or Seurat's AddModuleScore to project perturbation-specific gene signatures onto other datasets (e.g., disease atlases) to find phenotypic matches.
  • Visualization: Generate volcano plots, heatmaps of top DEGs, and UMAPs overlaid with perturbation identity or signature scores.
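
Over-representation analysis reduces to a one-sided hypergeometric test. The sketch below computes that p-value with the standard library only; it illustrates the statistic and is not the clusterProfiler implementation. The argument names are hypothetical.

```python
from math import comb


def ora_pvalue(n_universe, n_pathway, n_hits, n_overlap):
    """P(overlap >= n_overlap) when n_hits genes are drawn from n_universe,
    of which n_pathway belong to the pathway (hypergeometric upper tail)."""
    upper = min(n_pathway, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_universe, n_hits)


# Example: 12 of 150 significant genes fall in a 200-gene pathway,
# with 15,000 genes tested in total.
p = ora_pvalue(n_universe=15_000, n_pathway=200, n_hits=150, n_overlap=12)
```

P-values from many pathways should then be adjusted for multiple testing, as for the DEGs themselves.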

Experimental Workflow Diagram

From FASTQ to Biological Insights in Perturb-seq Data

Differential Expression Analysis Logic Diagram

Workflow for Perturbation-Specific DEG Identification

The Scientist's Toolkit: Key Research Reagent Solutions

Tool / Resource | Category | Function in Pipeline
Cell Ranger (10x Genomics) | Commercial Software Suite | End-to-end processing of 10x scRNA-seq data, including alignment, filtering, counting, and feature barcode analysis for CRISPR guide capture. Essential for standard Perturb-seq.
Seurat (R) / Scanpy (Python) | Open-Source Analysis Suite | Comprehensive toolkits for QC, normalization, clustering, integration, visualization, and basic differential expression analysis of scRNA-seq data. The core environment for most analyses.
DESeq2 (R) | Statistical Package | Industry-standard pseudobulk differential expression analysis. Provides robust dispersion estimation and FDR control when applied to aggregated single-cell counts per perturbation group.
MAST (R) | Statistical Package | Generalized linear model framework designed specifically for single-cell DE testing, modeling the bimodal distribution and including cellular detection rate as a covariate.
Harmony (R/Python) | Integration Algorithm | Rapid and effective tool for integrating multiple scRNA-seq datasets (e.g., different mice, treatment batches) by removing technical batch effects while preserving biological structure.
clusterProfiler (R) | Pathway Analysis Package | Performs over-representation and gene set enrichment analysis (GSEA) on DEG lists using up-to-date annotations from GO, KEGG, Reactome, and other databases.
UMI-tools (Python) | NGS Processing Toolkit | Handles barcode/UMI extraction, deduplication, and counting for custom sequencing schemes, useful for non-standard perturbation barcode designs.

Common Challenges in In Vivo Pooled Screens: Troubleshooting MIC-Drop and Perturb-seq

Application Notes

Achieving comprehensive screen coverage and high perturbation diversity in vivo is the foundational challenge for leveraging pooled CRISPR screening technologies like MIC-Drop and Perturb-seq in whole organisms. Unlike in vitro systems, in vivo delivery, biodistribution, immune clearance, and cellular turnover impose severe bottlenecks on library complexity. The primary goal is to maximize the number of distinct genetic perturbations that reach and are expressed in the target cell population at sufficient representation for robust statistical analysis.

Key Quantitative Hurdles:

Parameter | In Vitro Ideal | In Vivo Challenge | Target for Sufficient Coverage
Initial Library Diversity | 10^6 - 10^8 clones | Limited by delivery vehicle capacity | >500x guide representation
Delivery Efficiency to Target Tissue | ~100% (in media) | 1-20% (varies by route & vehicle) | Maximize via optimized route
Cell Type Specificity | N/A (homogeneous) | Off-target transduction major concern | Use cell-specific promoters
Minimum Cell Coverage per Guide | 200-500 cells | Drastically reduced by bottlenecks | Aim for >50 cells/guide in target
Perturbation Diversity Recovered | Near-input levels | Often <10% of input library | >1,000 unique perturbations analyzed

Failure to address these constraints results in "bottlenecked" screens where only the most fit or efficiently delivered perturbations are recovered, biasing biological conclusions.
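
A back-of-the-envelope calculation shows how quickly these bottlenecks compound. The sketch below is a simplified planning aid, assuming uniform guide representation and independent losses at delivery and recovery; the function name and the recovery_fraction parameter are hypothetical.

```python
import math


def cells_to_expose(n_guides, cells_per_guide, delivery_efficiency, recovery_fraction=1.0):
    """Target cells that must be exposed to the library so that, on average,
    cells_per_guide perturbed cells per guide survive delivery and recovery."""
    perturbed_needed = n_guides * cells_per_guide
    return math.ceil(perturbed_needed / (delivery_efficiency * recovery_fraction))


# 1,000 guides at 50 cells/guide, 25% delivery efficiency,
# and half of the perturbed cells recovered after dissociation and sorting.
needed = cells_to_expose(1_000, 50, delivery_efficiency=0.25, recovery_fraction=0.5)
```

Because real libraries are not uniformly represented, the true requirement is higher for the least abundant guides.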

Protocols

Protocol 1: In Vivo MIC-Drop Library Preparation & Complexity Preservation Objective: To package a high-diversity MIC-Drop sgRNA library into lipid nanoparticles (LNPs) for systemic delivery while preserving complexity.

  • Library Cloning & Amplification: Clone your pooled sgRNA library (e.g., 10,000-element library) into the MIC-Drop vector backbone. Perform low-cycle PCR amplification (≤10 cycles) of the pooled plasmid library using barcoded primers to add unique molecular identifiers (UMIs) and sequencing adapters.
  • LNP Formulation: Formulate the purified plasmid DNA library into targeted LNPs using a microfluidic mixer. Use ionizable lipids with high in vivo transfection efficiency (e.g., SM-102) and, if needed, incorporate a targeting ligand into the lipid mix (e.g., an anti-CD22 antibody for B cells).
  • Titration & Quality Control: Inject escalating doses of LNP library into a cohort of 3-5 mice. Sacrifice animals 48 hours post-injection. Isolate genomic DNA (gDNA) from target organs and amplify the sgRNA cassette for NGS. Use the data to determine the dose that yields >500x coverage of your library in the target tissue.
  • Scale-Up & Administration: Based on titration results, scale up LNP production. Inject the optimal dose into your experimental cohort via the tail vein (systemic) or local route (e.g., intracranial). Use a minimum cohort size (n) to account for animal-to-animal variation in delivery.

Protocol 2: Multiplexed Perturbation Recovery & Single-Cell Sequencing for In Vivo Perturb-seq Objective: To recover a diverse set of perturbed cells from tissue and prepare them for single-cell RNA sequencing (scRNA-seq).

  • Tissue Harvest & Single-Cell Dissociation: Euthanize animals at the experimental endpoint (e.g., 7-14 days post-LNP administration). Perfuse with cold PBS to clear blood. Harvest target tissue, mince, and dissociate using a gentle, optimized enzymatic cocktail (e.g., Liberase TL + DNase I) to maximize viable single-cell yield.
  • Cell Enrichment & Viability Sort: Filter cells through a 40 µm strainer. Perform FACS to isolate live (DAPI-), singlet cells. If applicable, use fluorescence from a reporter in the vector (e.g., GFP) to enrich for successfully transduced cells.
  • Single-Cell Library Preparation: Process ~20,000 sorted cells per condition through the 10x Genomics Chromium Next GEM platform using the Single Cell 3’ Gene Expression v3.1 kit with Feature Barcoding for CRISPR guide capture.
  • Sequencing & Analysis: Sequence libraries on an Illumina NovaSeq to a minimum depth of 50,000 reads per cell. Process data with cellranger count (10x Genomics) to align transcripts and detect feature barcodes (sgRNAs). Use Seurat or Scanpy pipelines for downstream analysis, linking each cell's transcriptional profile to its assigned sgRNA perturbation.

Diagrams

Title: In Vivo CRISPR Screen Workflow & Critical Bottlenecks

Title: Strategies to Maximize In Vivo Screen Coverage

The Scientist's Toolkit

Research Reagent Solution | Function in In Vivo Screening
Targeted Lipid Nanoparticles (LNPs) | Efficient, systemically deliverable vehicles for encapsulating CRISPR plasmid or RNP libraries to specific tissues (e.g., liver, spleen).
AAV Serotypes (e.g., AAV9, PHP.eB) | Viral vectors with tropism for specific cell types (neurons, muscle) for long-term perturbation expression.
10x Genomics Single Cell 3' Kit w/ Feature Barcoding | Enables simultaneous capture of single-cell transcriptomes and associated sgRNA barcodes in Perturb-seq.
gentleMACS Dissociation Kits | Enzyme mixes optimized for specific tissues (brain, tumor) to maximize yield of viable, single cells for sequencing.
UMI (Unique Molecular Identifier) Oligos | Random sequences built into capture or amplification primers that tag each original RNA molecule or guide before amplification, controlling for amplification bias and improving quantification.
Cell Surface Marker Antibody Panels (for FACS) | Used to pre-enrich specific cell populations from dissociated tissue prior to scRNA-seq, reducing sequencing cost on irrelevant cells.
Next-Generation Ionizable Lipids (e.g., SM-102) | Key components of modern LNPs providing high in vivo transfection efficiency and reduced immunogenicity.

Within the broader thesis of applying high-throughput, in vivo functional genomics via MIC-Drop and Perturb-seq, a fundamental technical challenge is the inconsistency of viral or droplet-based delivery efficiencies across diverse cell types in a complex tissue. This variability introduces significant noise and bias, confounding the accurate measurement of phenotypic outcomes (e.g., gene expression changes) and limiting the quantitative power of pooled screens. This application note details strategies and protocols to diagnose, mitigate, and account for this challenge, ensuring more robust and interpretable in vivo screening data.

Quantitative Data on Delivery Variability

Table 1: Factors Contributing to Variable Delivery Efficiencies

Factor | Impact on Efficiency | Typical Range (Relative) | Notes
Cell Surface Receptor Abundance | Primary determinant for viral entry (e.g., LDLR for VSV-G-pseudotyped lentivirus; serotype-specific receptors for AAV). | 10-100x | Crucial for in vivo tropism.
Cell Size & Membrane Properties | Affects droplet fusion or electroporation efficiency. | 2-10x | Larger cells often show higher droplet incorporation.
Cell Cycle State | Lentiviral vectors can transduce non-dividing cells, but efficiency is often higher in cycling cells. | 5-50x | A major confounder in vivo.
Phagocytic/Autophagic Activity | Can degrade delivered vectors/particles. | 0.1-5x | High in macrophages, microglia.
Tissue Architecture & Accessibility | Physical barriers limit vector/droplet penetration in vivo. | 100-1000x | Major hurdle for solid tissues.
Viral/Vector Serotype | Tropism defined by capsid-receptor interactions. | 100-10000x | AAV1 vs. AAV9 in CNS, for example.

Table 2: Common Methods to Assess Delivery Efficiency

Method | Measures | Throughput | Key Advantage
Flow Cytometry (GFP/RFP) | Transduction/transfection percentage. | Medium-High | Quantitative, single-cell.
qPCR for Vector Genome | Vector copies per cell. | Low-Medium | Absolute quantification.
Droplet Barcode Sequencing | Proportion of cells with detectable guide/oligo. | Very High | Directly measures screen delivery.
Spike-in Control Cells | Relative efficiency vs. a reference line. | Medium | Contextualizes in vivo data.

Experimental Protocols

Protocol 3.1: In Vivo Titration and Tropism Profiling for Lentiviral Vectors

Objective: Determine the optimal viral titer and characterize cell-type-specific transduction in target tissue.

  • Prepare a dilution series of concentrated lentivirus (e.g., 10^6 to 10^9 TU/mL) encoding a ubiquitous fluorescent reporter (e.g., GFP).
  • Stereotactically inject (or use relevant route) each titer into separate cohorts of animals (n=3 per group). Include a vehicle control.
  • After 7-14 days, perfuse and harvest the target tissue.
  • Process tissue for single-cell suspension and perform flow cytometry to determine the percentage of GFP+ cells.
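  • Sort GFP+ and GFP- populations for cell-type-specific analysis (e.g., 10x Genomics scRNA-seq), or stain for lineage markers (e.g., NeuN for neurons, GFAP for astrocytes) and analyze by flow cytometry. Both markers are intracellular, so fix and permeabilize cells before staining.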
  • Calculate transduction efficiency per cell type: (Number of GFP+ cells within a marker-positive gate) / (Total number of marker-positive cells) * 100.
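
The per-cell-type calculation in the last step can be sketched as below, assuming each gated event has been exported as a (cell type, GFP-positive) pair; the function name and input layout are hypothetical.

```python
from collections import Counter


def transduction_efficiency(events):
    """Percent GFP+ cells within each marker-defined cell type.

    events: iterable of (cell_type, is_gfp_positive) tuples.
    """
    totals, positives = Counter(), Counter()
    for cell_type, is_gfp_positive in events:
        totals[cell_type] += 1
        if is_gfp_positive:
            positives[cell_type] += 1
    return {ct: 100.0 * positives[ct] / totals[ct] for ct in totals}


events = (
    [("neuron", True)] * 2 + [("neuron", False)] * 2
    + [("astrocyte", True)] * 1 + [("astrocyte", False)] * 3
)
efficiency = transduction_efficiency(events)
```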

Protocol 3.2: Incorporating Multimodal Barcodes for Normalization in Pooled Screens

Objective: Decouple delivery efficiency from phenotypic readout using dual barcoding.

  • Clone your sgRNA library into a MIC-Drop or Perturb-seq vector containing both a guide barcode (unique to the perturbation) and a delivery barcode (unique to each viral particle or droplet).
  • Package the library and deliver it at low MOI (<0.3) so that most transduced cells receive a single delivery barcode.
  • Deliver library in vivo and harvest cells after phenotypic development period.
  • Perform single-cell RNA sequencing (e.g., 10x Genomics).
  • Bioinformatic Analysis:
    • Extract both guide and delivery barcodes from cDNA reads.
    • Group cells by delivery barcode. Cells sharing a delivery barcode are sibling clones originating from a single transduction event.
    • Normalize gene expression phenotypes (e.g., differential expression) within each delivery barcode group first, then aggregate across groups sharing the same guide barcode.
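
One way to read the two-stage normalization above is to collapse each delivery-barcode group (clone) to a single value before averaging across clones that share a guide, so that a large clone does not dominate the guide-level estimate. The sketch below implements that interpretation; it is an assumption about how the aggregation could be done, and the function name and per-cell phenotype score are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def guide_effects(cells):
    """cells: iterable of (guide_barcode, delivery_barcode, phenotype_score).

    Stage 1: average scores within each (guide, delivery barcode) clone.
    Stage 2: average the clone means for each guide.
    """
    by_clone = defaultdict(list)
    for guide, delivery_bc, score in cells:
        by_clone[(guide, delivery_bc)].append(score)

    clone_means = defaultdict(list)
    for (guide, _), scores in by_clone.items():
        clone_means[guide].append(mean(scores))

    return {guide: mean(values) for guide, values in clone_means.items()}


cells = [
    ("sgA", "d1", 1.0), ("sgA", "d1", 1.0), ("sgA", "d1", 1.0),  # one large clone
    ("sgA", "d2", 3.0),                                           # one small clone
]
effects = guide_effects(cells)
```

A naive per-cell mean for sgA would be 1.5, weighted toward the large clone; the clone-level estimate treats the two transduction events equally.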

Protocol 3.3: Using Spike-in Reference Cells for Normalization

Objective: Control for microenvironment variability by co-injecting a standardized cell population.

  • Generate a stable reference cell line (e.g., HEK293T) expressing a constitutive fluorescent protein (e.g., mCherry) and a known, inert sgRNA.
  • Mix these reference cells with your primary in vivo cell population before transplantation or injection, or co-inject them as a separate, localized bolus.
  • After in vivo incubation, sort mCherry+ (spike-in) and mCherry- (target) populations.
  • Sequence both populations separately. Use the perturbation signatures measured in the spike-in cells—which experienced the same delivery and microenvironment—as a baseline to normalize signals from the target cells.

Visualization of Workflows and Relationships

Title: Normalizing Screens with Dual Barcoding

Title: Delivery Challenge Causes & Solutions

The Scientist's Toolkit: Research Reagent Solutions

Table 3: Essential Reagents and Materials

Item | Function | Example/Supplier Notes
High-Diversity Barcode Libraries | Provides unique delivery/guide barcodes for normalization. | Custom cloned oligo pools (Twist Bioscience), ready-to-use libraries (Cellecta).
Tropism-Optimized Viral Packaging Systems | Enhances delivery to specific cell types in vivo. | AAV serotypes (AAV1, AAV9, AAV-PHP.eB), alternative envelope pseudotypes (VSV-G, Rabies-G, LCMV-G).
Fluorescent Reporter Constructs | Visual and FACS-based quantification of delivery efficiency. | pLenti-PGK-GFP, pAAV-CAG-tdTomato. Ubiquitous promoters are key.
Cell Lineage-Specific Antibodies | For post-hoc analysis of cell-type-specific delivery via FACS. | Anti-NeuN (neurons), Anti-GFAP (astrocytes), Anti-CD31 (endothelial).
Single-Cell RNA-Seq Kits | Captures transcriptome, guide barcode, and delivery barcode simultaneously. | 10x Genomics Chromium Next GEM, Parse Biosciences kit.
Spike-in Control Cell Lines | Genetically defined reference cells for normalization. | HEK293T-mCherry, NIH/3T3-GFP. Must be non-proliferative in vivo.
High-Titer Viral Concentration Kits | Enables precise, low-volume in vivo deliveries. | Lentivirus concentration via PEG-it (System Biosciences) or ultracentrifugation.
Stereotactic Injection Apparatus | Precise, reproducible delivery to deep brain structures or tissues. | Hamilton syringes, Kopf Instruments stereotaxic frame.

High-throughput in vivo functional genomics using platforms like MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq (CRISPR screens with single-cell RNA sequencing readout) enables the systematic dissection of gene function in complex organisms. However, the fidelity of these screens is compromised by off-target effects and false-positive signals. Off-target effects arise from unintended guide RNA (gRNA) activity, while false positives stem from technical noise, batch effects, or cellular stress responses to delivery. Mitigating these issues is critical for deriving biologically actionable insights, especially in therapeutic development.

2.1. CRISPR-Specific Artifacts:

  • Guide RNA Off-Targeting: gRNAs can induce indels at genomic sites with partial complementarity, especially when the nuclease (e.g., Streptococcus pyogenes Cas9) and gRNA are expressed at high levels or for prolonged periods.
  • Perturbation-Induced Stress: Cellular responses to DNA damage (p53 activation) or high CRISPR machinery load can create confounding transcriptional states.
  • MOI (Multiplicity of Infection) Variance: In pooled screens, uneven viral transduction leads to cells with 0, 1, or >1 perturbations, confounding phenotypes.
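
The MOI effect can be quantified under the common simplifying assumption that the number of perturbations per cell follows a Poisson distribution with mean equal to the MOI. The sketch below uses hypothetical function names; real in vivo delivery is often more variable than Poisson.

```python
import math


def poisson_moi_fractions(moi):
    """Expected fraction of cells with 0, exactly 1, or >1 perturbations."""
    p0 = math.exp(-moi)
    p1 = moi * math.exp(-moi)
    return {"zero": p0, "one": p1, "multiple": 1.0 - p0 - p1}


def single_among_perturbed(moi):
    """Fraction of perturbed cells that carry exactly one perturbation."""
    f = poisson_moi_fractions(moi)
    return f["one"] / (1.0 - f["zero"])


fractions = poisson_moi_fractions(0.3)
purity = single_among_perturbed(0.3)
```

At an MOI of 0.3, roughly 74% of cells are unperturbed and about 86% of the perturbed cells carry a single perturbation, which is why low MOI is paired with enrichment for transduced cells.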

2.2. Platform-Specific Noise (MIC-Drop & Perturb-seq):

  • MIC-Drop: Encapsulation efficiency variances lead to empty droplets or droplets with multiple gRNAs, generating false compound phenotypes.
  • Perturb-seq: Low capture efficiency of mRNAs from perturbed cells can cause dropout events, misassigning cell states. PCR amplification biases during library prep further distort expression counts.

2.3. Biological Confounders:

  • Genetic Background Heterogeneity in animal models.
  • Immune & Inflammatory Responses to viral or lipid nanoparticle (LNP) delivery in vivo.
  • Cellular Multiplicity: In in vivo screens, the same perturbation in different cell types can yield divergent outcomes, mistaken as inconsistent signals.

Application Notes & Mitigation Strategies

Computational Correction & Analytical Frameworks

Table 1: Key Analytical Tools for Signal Deconvolution

Tool Name | Primary Function | Application to MIC-Drop/Perturb-seq | Key Metric for Fidelity
Cell Ranger ARC | Processes single-cell multiome (ATAC + Gene Exp.) data. | Links perturbation-associated expression changes to chromatin accessibility changes in multiome data; it does not itself map gRNA integration sites. | Confirms on-target chromatin remodeling.
CITE-seq | Simultaneous protein & transcriptome measurement (an assay rather than a software tool). | Measures phenotypic protein markers post-perturbation, cross-validating RNA signals. | Correlation between transcript & protein change.
MAGeCK-VISPR | Comprehensive QC and analysis pipeline for CRISPR screens. | Models guide-level variability and identifies high-confidence hits. | False Discovery Rate (FDR) < 0.05.
CrispR (R package) | Analyzes pooled screen data with mixed MOI. | Corrects for multiple integrations per cell in MIC-Drop data. | MOI-adjusted phenotype effect size.

Protocol 3.1.1: In Silico Off-Target Prediction & Filtering

  • Input: Candidate gRNA sequences (20-mer + NGG PAM for SpCas9).
  • Tool: Use Cas-OFFinder or CRISPOR to scan reference genome (e.g., mm10, hg38) for potential off-target sites with up to 4 mismatches.
  • Filter: Discard any gRNA with a predicted off-target site in:
    • An exon of any gene (excluding the target).
    • A known regulatory element (enhancer, promoter) from ENCODE.
  • Prioritize: Select 5-10 gRNAs per target gene with the highest predicted on-target activity scores (e.g., Doench '16 score > 0.6) and zero high-confidence exonic off-targets.
  • Validate: For final hit genes, design and synthesize two additional, independent gRNAs with distinct seed sequences for phenotypic confirmation.
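
Once off-target predictions have been exported, the Filter and Prioritize steps amount to a small selection routine. The sketch below assumes a hypothetical record layout (gene, guide, on_target_score, and a list of off-target sites annotated with a region label); it does not call Cas-OFFinder or CRISPOR, whose outputs would need to be parsed into this form first.

```python
BLOCKED_REGIONS = {"exon", "regulatory"}  # off-target locations that disqualify a guide


def select_guides(candidates, min_on_target=0.6, max_per_gene=10):
    """Drop guides with exonic/regulatory off-targets or low on-target score,
    then keep the best-scoring guides per gene."""
    kept = {}
    for cand in candidates:
        if cand["on_target_score"] <= min_on_target:
            continue
        if any(site["region"] in BLOCKED_REGIONS for site in cand["off_targets"]):
            continue
        kept.setdefault(cand["gene"], []).append(cand)

    return {
        gene: [g["guide"] for g in sorted(
            guides, key=lambda g: g["on_target_score"], reverse=True)[:max_per_gene]]
        for gene, guides in kept.items()
    }


candidates = [
    {"gene": "Tp53", "guide": "g1", "on_target_score": 0.72, "off_targets": []},
    {"gene": "Tp53", "guide": "g2", "on_target_score": 0.81,
     "off_targets": [{"region": "intergenic"}]},
    {"gene": "Tp53", "guide": "g3", "on_target_score": 0.90,
     "off_targets": [{"region": "exon"}]},            # exonic off-target: discarded
    {"gene": "Tp53", "guide": "g4", "on_target_score": 0.45, "off_targets": []},  # low score
]
selected = select_guides(candidates)
```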

Experimental Design & Wet-Lab Protocols

Protocol 3.2.1: Dual-gRNA Knockout for MIC-Drop Specificity Control

  • Objective: To distinguish on-target from off-target effects by requiring two independent gRNAs against the same gene to produce the same phenotype.
  • Workflow:
    • Design: For each target gene, design 2 gRNAs targeting distinct exons.
    • MIC-Drop Assembly: Co-encapsulate both gRNAs + Cas9 protein in the same droplet targeting a single cell.
    • In Vivo Delivery: Inject droplets into model organism (e.g., zebrafish embryo).
    • Phenotyping: Use high-content imaging (for morphology) or FACS (for cell composition).
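    • Analysis: Call a "high-confidence" hit only when the phenotype is reproduced by both independent gRNAs against the gene, for example by comparing embryos that received both gRNAs with those that received a single gRNA (from droplet variance). A phenotype that tracks with only one of the two gRNAs points to an off-target effect.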

Protocol 3.2.2: Perturb-seq with Smash-and-Grab Genotyping

  • Objective: To directly link gRNA presence to the transcriptional profile of the same cell, reducing false associations.
  • Workflow:
    • Perturb & Harvest: Perform in vivo screen. Harvest and dissociate cells into a single-cell suspension.
    • Split-Sample Approach:
      • Aliquot 90% of cells for 10x Genomics 3’ RNA-seq library preparation.
      • Aliquot 10% of cells for targeted gRNA amplification via PCR from genomic DNA.
    • Linked Readouts: Use cellular hashing (e.g., MULTI-seq) on all cells so that the gRNA PCR product from the 10% aliquot can be matched retrospectively to the transcriptional profiles from the 90% aliquot of the same original cell population. This link operates at the level of the hashed sample rather than the individual cell; per-cell gRNA assignment still relies on guide capture in the scRNA-seq library.
    • Bioinformatics: Use tools like Seurat and MULTI-seq demuxing to create a final cell x gene expression matrix with verified gRNA assignments.

The Scientist's Toolkit

Table 2: Essential Research Reagent Solutions for Mitigation

Item | Function & Role in Mitigation | Example Product/Catalog
High-Fidelity Cas9 | Engineered variant (e.g., SpCas9-HF1 or the Alt-R HiFi variant) with reduced non-specific DNA binding, lowering off-target cleavage. | IDT Alt-R S.p. HiFi Cas9 Nuclease V3
Chemically Modified sgRNA | 2'-O-methyl 3' phosphorothioate modifications increase stability and can enhance specificity. | Synthego sgRNA EZ Kit
Perturb-seq Kit | Optimized reverse transcription & amplification reagents for capturing low-abundance transcripts in perturbed cells. | 10x Genomics Single Cell 3' Kit v3.1
Cellular Hashtag Antibodies | For sample multiplexing, allowing pooling of conditions to minimize batch effects. | BioLegend TotalSeq-A Antibodies
Drop-Seq Microfluidic Chips | For generating monodisperse emulsions in MIC-Drop, ensuring single-cell, single-guide encapsulation. | Chemyx Inc. PDMS Microfluidic Chips
Next-Gen Sequencing Spike-Ins | External RNA controls (ERCC) to quantify technical noise and normalize dropout effects. | Thermo Fisher ERCC RNA Spike-In Mix

Validation & Hit Confirmation Workflow

A tiered validation strategy is non-negotiable.

  • Primary Screen: MIC-Drop/Perturb-seq in vivo.
  • Computational Triaging: Apply tools from Table 1. Filter hits by: (a) agreement across ≥2 independent gRNAs, (b) absence in negative control (non-targeting gRNA) population, (c) pathway coherence.
  • Orthogonal Validation: For candidate hits, use an independent perturbation method (e.g., small-molecule inhibitor, RNAi) in the same in vivo model. Measure phenotype congruence.
  • Mechanistic Deconvolution: Use in situ hybridization (transcript level) or immunofluorescence (protein level) to validate expression changes in the intact tissue or organism.
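
Criterion (a) of the triaging step, agreement across independent gRNAs, can be expressed as a short filter. The sketch below is illustrative: the function name and the (gene, guide, effect, significant) record layout are assumptions, and criteria (b) and (c) are not modeled.

```python
from collections import defaultdict


def concordant_hits(guide_results, min_guides=2):
    """Genes for which at least min_guides significant guides agree in direction.

    guide_results: iterable of (gene, guide_id, effect_size, is_significant).
    """
    support = defaultdict(lambda: {1: set(), -1: set()})
    for gene, guide, effect, significant in guide_results:
        if significant and effect != 0:
            support[gene][1 if effect > 0 else -1].add(guide)
    return sorted(
        gene for gene, s in support.items()
        if max(len(s[1]), len(s[-1])) >= min_guides
    )


guide_results = [
    ("GeneA", "a1", 1.2, True), ("GeneA", "a2", 0.8, True),    # two guides agree
    ("GeneB", "b1", 1.0, True), ("GeneB", "b2", -1.0, True),   # opposite directions
    ("GeneC", "c1", 0.5, True), ("GeneC", "c2", 0.4, False),   # only one significant
]
hits = concordant_hits(guide_results)
```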

Visual Summaries

Diagram 1: Mitigation Strategy Overview

Diagram 2: Integrated Mitigation Workflow

Application Notes

Single-cell RNA sequencing (scRNA-seq) enables high-resolution dissection of cellular states in complex samples, such as those generated by in vivo MIC-Drop and Perturb-seq screens. However, technical noise—including amplification bias, dropout events, and ambient RNA—compromises data quality. A more pernicious challenge is batch effect, where systematic technical differences between experimental runs (e.g., different library preparations, sequencing lanes, or animal cohorts) can confound biological signals. In a thesis focused on scaling MIC-Drop for in vivo perturbation screening, robust correction of these artifacts is non-negotiable for accurate identification of genotype-phenotype linkages and compound-mode-of-action.

Key Quantitative Challenges in scRNA-seq Data:

Challenge | Description | Typical Impact (Quantitative Range)
Dropout Events | Gene transcripts not detected due to low capture efficiency. | 50-90% of expressed genes can show zero counts per cell.
Library Size Variation | Technical differences in total counts per cell. | Can vary by an order of magnitude (e.g., 1,000 to 50,000 UMI/cell).
Batch Effect | Systematic non-biological variation between experimental batches. | Can explain >50% of variance in PCA space if uncorrected.
Ambient RNA | Background RNA from lysed cells contaminating cell barcodes. | Can contribute 1-20% of counts in a cell, skewing cluster identity.

Core Correction Strategies & Tools:

Strategy | Tool/Algorithm | Primary Function | Best Applied When
Normalization | SCTransform (Seurat) | Models technical noise, variance stabilizes. | Prior to integration for within-batch normalization.
Data Imputation | MAGIC, SAVER | Infers missing values, smooths data. | After QC, for visualizing gene-gene relationships.
Batch Integration | Harmony, Seurat CCA, BBKNN | Aligns cells across datasets in low-dimensional space. | Before clustering and trajectory analysis on pooled data.
Ambient RNA Removal | SoupX, DecontX | Estimates and subtracts background contamination. | As a pre-processing step immediately after cell calling.

Experimental Protocol: A Standardized Workflow for Batch-Effect-Corrected Analysis of In Vivo Perturb-seq Data

1. Sample Preparation & QC (Wet Lab)

  • Input: Dissociated tissue from in vivo MIC-Drop/Perturb-seq experiment across multiple batches.
  • Protocol: Use a standardized single-cell suspension protocol (e.g., gentleMACS dissociation). Perform viability staining (Trypan Blue, ~85% viability required). Aim for consistent cell concentration across batches. Use the same library preparation kit (e.g., 10x Genomics 3’ v3.1) for all batches. Include spike-in RNAs (e.g., Sequins, ERCC) at fixed ratios in each batch to monitor technical performance.

2. Computational Pre-processing & Quality Control

  • Tools: Cell Ranger (mkfastq, count), SoupX, Scrublet.
  • Protocol:
    • Demultiplexing & Alignment: Use cellranger mkfastq and cellranger count with a consistent reference genome.
    • Ambient RNA Correction: Run SoupX using the autoEstCont function to estimate the contamination fraction and correct the count matrix.
    • Doublet Detection: Use Scrublet to predict and filter out likely doublets (e.g., doublet score >0.25; check the cutoff against each sample's score histogram).
    • Cell QC Filtering: Filter cells using thresholds (e.g., nFeature_RNA > 500 & < 6000; percent.mt < 15%). Filter out genes detected in <10 cells.

3. Normalization, Integration, and Downstream Analysis

  • Tools: Seurat (v5), Harmony.
  • Protocol:
    • Within-Batch Normalization: For each batch separately, perform SCTransform normalization, regressing out percent.mt and cell cycle score (if relevant).
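    • Feature Selection: Select 3,000 highly variable genes (HVGs) shared across batches with SelectIntegrationFeatures, then run PrepSCTIntegration on the object list so that SCT residuals are available for the selected features.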
    • Batch Integration:
      • Create a Seurat object list of SCT-transformed batches.
      • Find integration anchors: FindIntegrationAnchors(object.list, normalization.method = "SCT", anchor.features = hvgs).
      • Integrate data: IntegrateData(anchorset, normalization.method = "SCT").
      • Alternative: Run Harmony on PCA embeddings: RunHarmony(seurat_object, group.by.vars = "batch").
    • Clustering & Visualization: Run PCA on integrated data, find neighbors, cluster (Louvain/Leiden), and generate UMAPs.
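    • Differential Expression: Use the integrated data for clustering and cell type annotation, but test perturbation effects on the uncorrected counts with models like MAST or pseudobulk DESeq2, including batch as a covariate. Integrated (batch-corrected) values are not suitable input for count-based DE models.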
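
For a pseudobulk test with batch as a covariate, raw UMI counts are first summed per batch and perturbation so that each batch contributes its own replicate. The sketch below shows only that aggregation step in plain Python; the function name and input layout are hypothetical, and the resulting matrix would be passed to DESeq2 (or a similar tool) with a design that includes batch.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum raw UMI counts per (batch, perturbation) group.

    cells: iterable of (batch, perturbation, {gene: umi_count}).
    Returns {(batch, perturbation): {gene: summed_count}}.
    """
    agg = defaultdict(lambda: defaultdict(int))
    for batch, perturbation, counts in cells:
        for gene, n in counts.items():
            agg[(batch, perturbation)][gene] += n
    return {group: dict(genes) for group, genes in agg.items()}


cells = [
    ("batch1", "sgA", {"Gene1": 2, "Gene2": 0}),
    ("batch1", "sgA", {"Gene1": 3, "Gene2": 1}),
    ("batch1", "NTC", {"Gene1": 1, "Gene2": 4}),
    ("batch2", "sgA", {"Gene1": 5}),
]
bulk = pseudobulk(cells)
```

Aggregating within a single cell type, as recommended earlier in the DE protocol, keeps composition differences from being mistaken for perturbation effects.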

Visualizations

Title: scRNA-seq Batch Correction Workflow for In Vivo Screens

Title: Conceptual Impact of Batch Effect Correction

The Scientist's Toolkit: Key Reagents & Resources

Item | Function/Description | Example Product/Code
Viability Stain | Distinguish live/dead cells during QC. | Trypan Blue, DAPI, Propidium Iodide.
Spike-in RNA | Exogenous RNA added to monitor technical variation. | ERCC RNA Spike-In Mix, Sequins.
Single-Cell 3' Kit | Library preparation for barcoded scRNA-seq. | 10x Genomics Chromium Next GEM 3' v3.1.
Cell Ranger | Primary analysis suite for demux, alignment, counting. | 10x Genomics Cell Ranger (v7+).
SoupX R Package | Estimates and subtracts ambient RNA contamination. | SoupX::autoEstCont()
Scrublet Python Tool | Predicts and flags transcriptional doublets. | scrublet.Scrublet()
Seurat R Toolkit | Comprehensive suite for scRNA-seq analysis. | Seurat::SCTransform(), IntegrateData()
Harmony R/Python | Fast, sensitive batch integration algorithm. | harmony::RunHarmony()
MIC-Drop Vector Library | For in vivo delivery of genetic perturbations. | Custom-designed sgRNA/miRNA library.

Application Notes

Within the thesis framework of applying MIC-Drop and Perturb-seq for in vivo pooled screening, library design is the foundational determinant of experimental robustness. Deconvolution—the accurate assignment of a phenotype (RNA-seq profile) to a specific genetic perturbation—relies entirely on the quality and complexity of the molecular barcodes linked to each guide RNA (gRNA) or molecular payload. Insufficient barcode diversity or poor design leads to index collisions, misassignment, and data loss, which is catastrophic in complex in vivo environments.

Key principles for robust deconvolution include:

  • Maximal Barcode Diversity: A large theoretical barcode space minimizes the probability of two cells receiving the same barcode (collision), which is critical when profiling hundreds of thousands of cells from an entire organism.
  • Sequence Constraints: Barcodes must be designed to avoid homopolymers, secondary structure, and sequences that interfere with oligo synthesis, PCR amplification, or sequencing library preparation.
  • Error Correction: Enforcing a minimum pairwise Hamming distance between barcodes (a distance of 3 is enough to correct any single-base substitution) allows errors introduced during PCR or sequencing to be detected and corrected.
  • Multi-Modal Identification: Combining a fixed perturbation barcode (e.g., a 10bp sequence read in R1) with a unique molecular identifier (UMI) and the gRNA/payload sequence itself creates a multi-layered, deconvolvable identity for each perturbation event.
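
To make the error-correction principle concrete, here is a minimal Python sketch of whitelist-based barcode correction. It assumes a whitelist whose barcodes are at least 3 substitutions apart, so any read with a single substitution has exactly one nearest barcode. The whitelist sequences and function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(read, whitelist, max_dist=1):
    """Return the unique whitelist barcode within max_dist of the read, else None."""
    hits = [bc for bc in whitelist if hamming(read, bc) <= max_dist]
    return hits[0] if len(hits) == 1 else None


# Hypothetical whitelist; every pair differs at 4 or more positions.
whitelist = ["ACGTACGTAC", "TGCATGCATG", "ACGTTGCAAC"]

print(correct_barcode("ACGTACGTAC", whitelist))  # exact match
print(correct_barcode("ACGTACGTAA", whitelist))  # one substitution, corrected
print(correct_barcode("ACGTACGTTA", whitelist))  # two substitutions, rejected
```

Reads that cannot be assigned unambiguously are discarded rather than guessed, which trades a small loss of data for fewer misassigned perturbations. This sketch handles substitutions only; insertions and deletions would need an edit-distance approach.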

Protocol: Design and Cloning of a High-Complexity Barcoded gRNA Library

Objective: To synthesize and clone a pooled gRNA library with complexity >10^5, incorporating error-detecting barcodes suitable for in vivo Perturb-seq via MIC-Drop.

Materials:

  • Oligo pool (Twist Bioscience or equivalent) containing designed library sequences.
  • Restriction enzymes (e.g., BsmBI) and T4 DNA ligase.
  • Lentiviral backbone plasmid (e.g., lentiGuide-Puro with mCherry reporter).
  • Electrocompetent E. coli (e.g., Endura ElectroCompetent Cells).
  • Ampicillin/LB agar plates.
  • QIAprep Spin Miniprep and Maxiprep Kits.

Procedure:

  • Library Design In Silico:
    a. For each gRNA target sequence, generate 10-20 associated barcode sequences (minimum length 10bp) using a constrained random algorithm.
    b. Filter barcodes to ensure a minimum Hamming distance of 3 between any two barcodes in the pool.
    c. Avoid homopolymers >3bp and sequences with extreme GC content (>70% or <30%).
    d. Assemble the final oligo sequence: 5'- [Adapter] - [Barcode] - [gRNA Scaffold] - [Adapter] -3'.
  • Pooled Oligo Synthesis: Order the final library as a pooled, array-synthesized oligonucleotide pool.
  • PCR Amplification: Amplify the oligo pool using primers that add appropriate restriction overhangs (compatible with BsmBI sites in the backbone).
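  • Golden Gate Assembly:
    a. Digest the PCR-amplified oligo pool and the lentiviral backbone plasmid with BsmBI.
    b. Perform a Golden Gate assembly reaction using T4 DNA ligase and the digested products.
    c. Incubate in a thermocycler: (37°C for 5 min, 16°C for 5 min) x 30 cycles, then 50°C for 5 min, 80°C for 10 min. Note that 37°C suits Esp3I, the BsmBI isoschizomer; BsmBI itself is typically run at a higher temperature, so follow the enzyme supplier's recommended cycling conditions.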
  • Library Transformation and Amplification:
    a. Transform the entire assembly reaction into a large volume of electrocompetent E. coli using a high-efficiency protocol. Critical: Aim for a transformation yield that recovers at least 1000x the library diversity (e.g., >10^8 colonies for a 10^5 library).
    b. Plate serial dilutions to assess colony count. Scrape all colonies from the plates for maxiprep culture.
    c. Culture the pooled bacteria in 500mL LB + Amp overnight.
    d. Perform a maxiprep to harvest the final plasmid library. Sequence a sample (MiSeq) to validate barcode distribution and gRNA representation.
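
The in silico design constraints from the first step (length, homopolymer limit, GC window, minimum Hamming distance) can be sketched as a greedy rejection sampler. This is a minimal illustration under stated assumptions, not the procedure used to build any published library: function names and defaults are hypothetical, and the pairwise check is quadratic, so a real library with more than 10^5 barcodes would need a more efficient scheme.

```python
import random
from itertools import groupby


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def gc_fraction(seq):
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq):
    return max(sum(1 for _ in run) for _, run in groupby(seq))


def design_barcodes(n, length=10, min_dist=3, max_homopolymer=3,
                    gc_range=(0.30, 0.70), seed=0, max_tries=100_000):
    """Greedily collect up to n random barcodes that satisfy all constraints."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if longest_homopolymer(candidate) > max_homopolymer:
            continue
        if not gc_range[0] <= gc_fraction(candidate) <= gc_range[1]:
            continue
        # Keep the candidate only if it is far enough from every accepted barcode.
        if all(hamming(candidate, bc) >= min_dist for bc in accepted):
            accepted.append(candidate)
    return accepted


barcodes = design_barcodes(50)
print(len(barcodes), barcodes[:3])
```

The fixed seed makes the design reproducible, which helps when the same barcode whitelist must be regenerated for demultiplexing.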

Table 1: Quantitative Comparison of Barcode Design Strategies

Design Parameter | Low-Complexity Design (Prone to Collision) | High-Complexity, Optimized Design (This Protocol) | Impact on In Vivo Deconvolution
Theoretical Barcode Diversity | 1,000 - 10,000 | 1 x 10^6 | Enables screening of >10^5 cells with negligible collision probability.
Minimum Hamming Distance | 1 (None enforced) | 3 | Allows correction of single-base sequencing errors, reducing data loss.
Barcode Length | 8 bp | 10-12 bp | Increases diversity and leaves room to enforce Hamming-distance spacing.
UMI Integration | No | Yes (8bp UMI on read 2) | Distinguishes between PCR duplicates and unique transcriptional events.
Estimated Collision Rate at 100k Cells | >20% | <0.1% | Preserves unique phenotype-assignment fidelity in a complex tissue sample.
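
Collision estimates like those in Table 1 depend heavily on the model behind them. As a sanity check, the sketch below uses the simplest possible assumption: each cell draws one identifier uniformly at random from the pool. This may differ from how the table's figures were derived; both function names are illustrative.

```python
import math


def expected_collision_fraction(n_cells, n_barcodes):
    """Expected fraction of cells sharing their identifier with at least one other cell."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


def barcodes_needed(n_cells, max_fraction):
    """Smallest pool size that keeps the expected collision fraction at or below max_fraction."""
    per_cell = math.log1p(-max_fraction) / (n_cells - 1)
    return math.ceil(1.0 / -math.expm1(per_cell))


for pool in (10_000, 1_000_000, 100_000_000):
    frac = expected_collision_fraction(100_000, pool)
    print(f"{pool:>11,d} identifiers -> {frac:.4%} of 100k cells collide")

print(barcodes_needed(100_000, 0.001))
```

Under this uniform model, a pool of 10^6 identifiers gives a collision fraction of about 9.5% at 100,000 cells, and a rate below 0.1% needs roughly 10^8 distinct identifiers. A sub-0.1% figure is therefore plausible only when the effective identity combines the barcode with further layers such as the UMI or gRNA sequence.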

Table 2: Research Reagent Solutions Toolkit

Item | Function/Application in MIC-Drop/Perturb-seq
Array-Synthesized Oligo Pools (Twist Bioscience) | Source for highly complex, customized gRNA/barcode libraries.
High-Efficiency Electrocompetent Cells (Endura, Lucigen) | Essential for maintaining library complexity during cloning without bottlenecking.
Lentiviral Packaging Mix (psPAX2, pMD2.G) | Production of lentiviral particles for in vitro or in vivo delivery of the barcoded library.
Single-Cell RNA-seq Kit (10x Genomics Chromium Next GEM) | Standard downstream platform for capturing barcodes and transcriptomes from pooled in vivo samples.
Droplet Generation Oil & Chips (Bio-Rad, Dolomite) | For creating MIC-Drop or similar water-in-oil emulsions encapsulating single cells and barcoded beads.
Barcode Demultiplexing Software (Cell Ranger, Bartender, zUMIs) | Computational pipeline to extract, correct, and count barcodes/UMIs from raw sequencing data.

Barcode Library Construction and Screening Workflow

Anatomy of a Deconvolvable Sequencing Read

Within the expanding toolkit for in vivo functional genomics, integrated platforms like MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq (CRISPR-based single-cell RNA sequencing screens) enable high-throughput, pooled screening in model organisms. A critical, often under-optimized, parameter for successful in vivo screening is the precise delivery and analysis timeline of genetic perturbations. This Application Note details a systematic strategy for titrating viral or droplet dose and determining the optimal post-infusion analysis window, which is paramount for achieving robust signal-to-noise, minimizing confounding adaptive responses, and ensuring interpretable phenotyping within the complex milieu of a living animal.

Core Principles and Rationale

The efficacy of an in vivo Perturb-seq screen hinges on balancing several competing factors:

  • Saturation vs. Toxicity: The delivered dose of lentivirus (for Perturb-seq) or packaged guide RNA complexes (for MIC-Drop) must be high enough to ensure a high percentage of target cells are perturbed (high "infection/transduction efficiency") but low enough to avoid acute toxicity, immune clearance, or overwhelming the system's capacity, which can induce non-specific stress responses.
  • Phenotype Penetrance vs. Homeostatic Compensation: The timing of tissue harvest and single-cell analysis must allow the molecular and cellular phenotype of the perturbation to fully manifest but precede the onset of compensatory mechanisms that can mask the primary effect. This window is highly dependent on the target cell type's turnover rate and the biological process being studied.
  • Multiplexing Capacity vs. Clone Detection: Higher perturbation diversity per animal increases screening throughput but requires sufficient library coverage and cell yield to reliably detect each clone within the sampled population.

Failure to optimize these parameters can lead to false negatives, high variability, and uninterpretable data.

Table 1: Exemplar Titration Parameters for Murine Brain In Vivo Perturb-seq

Parameter | Low Dose | Medium Dose (Recommended Start) | High Dose | Measurement Goal
Lentiviral Titer | 1 x 10^7 TU/mL | 5 x 10^7 TU/mL | 2 x 10^8 TU/mL | Transduction Efficiency >30%
Injection Volume | 1 µL | 2 µL | 3-4 µL | Minimal reflux, target area coverage
MOI (Estimated in vivo) | ~0.3 | ~1.5 | ~6.0 | Balance between multiplicity and toxicity
Analysis Timepoints | 3 days post-injection (dpi) | 7 dpi, 14 dpi | 21 dpi, 28 dpi | Capture early vs. stable phenotypes
Target Cell Recovery | 50-200 cells/perturbation | 500-1000 cells/perturbation | 1000+ cells/perturbation | Statistical power for differential expression
Library Complexity | 50 guides/animal | 200 guides/animal | 500+ guides/animal | Maximize throughput with confident clone ID
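
The MOI row can be read against a simple Poisson model of integration events per cell. This is an idealization: in vivo, local virus concentration and cell accessibility vary, so real distributions are usually broader. The sketch below shows what each example MOI would imply under that assumption; the function names are illustrative.

```python
import math


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k integrations at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def transduction_summary(moi):
    p0 = poisson_pmf(0, moi)
    p1 = poisson_pmf(1, moi)
    transduced = 1.0 - p0
    return {
        "transduced": transduced,                    # cells with at least one guide
        "single_among_transduced": p1 / transduced,  # of those, exactly one guide
        "multiple": 1.0 - p0 - p1,                   # cells with two or more guides
    }


for moi in (0.3, 1.5, 6.0):
    s = transduction_summary(moi)
    print(f"MOI {moi}: {s['transduced']:.1%} transduced, "
          f"{s['single_among_transduced']:.1%} of those carry a single guide")
```

Under this model, an MOI of about 0.3 transduces roughly a quarter of cells, mostly with a single guide. At an MOI of about 6, nearly all cells are transduced but almost none carry exactly one guide, which complicates single-perturbation assignment.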

Table 2: Impact of Dose & Timing on Key Screen Quality Metrics

Optimization State | Perturbation Efficiency | Cell Viability Post-Transduction | Phenotype Strength (Avg. DE Genes) | Inter-Animal Variability (CV) | Notes
Under-dosed / Too Early | <20% | High (>90%) | Low (<100) | High (>30%) | Insufficient signal, high noise.
Optimized | 30-60% | Acceptable (70-85%) | High (100-300) | Low (<20%) | Robust, reproducible signatures.
Over-dosed / Too Late | >80% | Low (<60%) | High but Confounded | Medium | Toxicity & compensation dominate.

Detailed Experimental Protocols

Protocol 1: Pilot Viral/Droplet Dose Titration and Validation

Objective: To determine the maximal sub-toxic dose that achieves robust perturbation efficiency in the target tissue.

Materials:

  • Purified lentiviral library (for Perturb-seq) or packaged MIC-Drop guide RNA complexes.
  • Sterile PBS or appropriate infusion vehicle.
  • Animal model (e.g., adult mouse) with defined injection coordinates.
  • Stereotaxic injection system.
  • Analytical tools: Flow cytometer (if using a fluorescent reporter), genomic DNA extraction kit, PCR reagents for barcode amplification.

Procedure:

  • Prepare Dose Cohorts: Prepare at least three dose levels from the viral/library stock (e.g., a 1:10 dilution, a 1:2 dilution, and neat stock). Maintain consistent infusion volume across cohorts.
  • Stereotaxic Infusion: For each dose cohort (n=3-4 animals), perform targeted intracranial (or tissue-specific) infusion using standard aseptic surgical procedures. Include a vehicle-only control cohort.
  • Harvest and Process: At a fixed, early timepoint (e.g., 7 days post-infusion), euthanize animals and dissect the target tissue region.
  • Split Sample Analysis: For each animal, divide the dissected tissue into two portions:
    • Portion A (Efficiency): Dissociate into a single-cell suspension. Analyze via flow cytometry for reporter expression (if applicable) or extract genomic DNA for quantification of vector copies per cell (qPCR of barcode or viral backbone).
    • Portion B (Toxicity): Process for histology (H&E staining) or conduct a viability assay (e.g., Trypan Blue exclusion) on the dissociated cells.
  • Data Integration: Plot dose against transduction efficiency and cell viability. The optimal dose is the highest titer that does not significantly reduce viability compared to the vehicle control while achieving >30% efficiency.
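
The decision rule in the final step can be sketched as follows. This minimal illustration replaces "does not significantly reduce viability" with a fixed tolerance on the drop in mean viability, which is a simplifying assumption; a real analysis would use an appropriate statistical test on the per-animal values. All data and names are hypothetical.

```python
from statistics import mean


def select_dose(cohorts, vehicle_viability, min_efficiency=0.30, max_viability_drop=0.10):
    """Return the highest dose that meets the efficiency target without a large viability loss."""
    baseline = mean(vehicle_viability)
    acceptable = [
        c for c in cohorts
        if mean(c["efficiency"]) > min_efficiency
        and baseline - mean(c["viability"]) <= max_viability_drop
    ]
    if not acceptable:
        return None
    return max(acceptable, key=lambda c: c["dose"])["dose"]


# Hypothetical per-animal measurements (fractions), dose relative to neat stock.
cohorts = [
    {"dose": 0.1, "efficiency": [0.12, 0.15, 0.10], "viability": [0.92, 0.90, 0.91]},
    {"dose": 0.5, "efficiency": [0.38, 0.42, 0.35], "viability": [0.86, 0.88, 0.84]},
    {"dose": 1.0, "efficiency": [0.80, 0.85, 0.78], "viability": [0.55, 0.60, 0.58]},
]
vehicle = [0.93, 0.91, 0.92]

best = select_dose(cohorts, vehicle)
print(best)
```

Returning `None` when no cohort qualifies signals that the titration range needs to be extended rather than forcing a choice.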

Protocol 2: Longitudinal Time-Course Analysis for Phenotype Stabilization

Objective: To identify the time window where perturbation-induced transcriptional phenotypes are fully penetrant but not yet masked by systemic compensation.

Materials:

  • Optimized viral/library dose from Protocol 1.
  • Animal cohorts for each planned timepoint.
  • Single-cell RNA-sequencing platform (e.g., 10x Genomics).
  • Bioinformatics pipeline for Perturb-seq analysis (e.g., Cell Ranger, Seurat, Mixscape).

Procedure:

  • Cohort Establishment: Infuse a large cohort of animals using the optimized dose. Randomly assign animals to pre-defined harvest timepoints (e.g., 3, 7, 14, 21, and 28 days post-infusion).
  • Tissue Harvest and scRNA-seq Library Prep: At each timepoint, harvest the target tissue from n=3-4 animals. Pool cells from animals within the same timepoint cohort to minimize individual variation. Proceed immediately with single-cell suspension preparation and barcoded scRNA-seq library generation according to platform-specific protocols (e.g., 10x Genomics 3’ Gene Expression with Feature Barcoding technology for guide RNA capture).
  • Sequencing and Primary Analysis: Sequence libraries to sufficient depth. Align reads, quantify gene expression, and assign cell-specific perturbation identities (guide barcodes).
  • Time-Course Phenotype Assessment: For a set of positive control perturbations (e.g., essential genes, known pathway activators), perform the following at each timepoint:
    • Calculate the number of differentially expressed genes for each control perturbation (perturbed cells vs. control-guide cells).
    • Assess the magnitude and significance of expression changes in known pathway markers.
    • Measure the within-population consistency of the signature (e.g., by correlation across cells).
  • Determine Optimal Window: The ideal analysis timepoint is characterized by: (i) a peak or plateau in the number and magnitude of differential expression events for control perturbations, (ii) high signature consistency, and (iii) minimal activation of non-specific stress or immune pathways in negative control cells.
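
The three criteria for the optimal window can be combined into a simple selection rule. The sketch below assumes each timepoint has already been summarized into a DE gene count, a signature consistency value, and a stress score for negative-control cells. The field names, cutoffs, and numbers are hypothetical placeholders, not recommended values.

```python
def pick_window(timepoints, min_consistency=0.5, max_control_stress=0.2):
    """Among timepoints passing the consistency and stress filters, pick the one with most DE genes."""
    eligible = [
        t for t in timepoints
        if t["consistency"] >= min_consistency
        and t["control_stress"] <= max_control_stress
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda t: t["n_de_genes"])["dpi"]


# Hypothetical summaries for positive-control perturbations at each harvest day.
timepoints = [
    {"dpi": 3,  "n_de_genes": 40,  "consistency": 0.30, "control_stress": 0.05},
    {"dpi": 7,  "n_de_genes": 180, "consistency": 0.70, "control_stress": 0.10},
    {"dpi": 14, "n_de_genes": 220, "consistency": 0.75, "control_stress": 0.12},
    {"dpi": 21, "n_de_genes": 260, "consistency": 0.60, "control_stress": 0.35},
    {"dpi": 28, "n_de_genes": 150, "consistency": 0.50, "control_stress": 0.40},
]

print(pick_window(timepoints))
```

In this toy dataset the 21 dpi timepoint has the most DE genes but is excluded because control cells already show elevated stress, illustrating why the DE count alone is not a sufficient criterion.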

Visualizations

Diagram 1: Optimization Strategy Decision Workflow

Diagram 2: Phenotype Dynamics Dictating Analysis Timing

The Scientist's Toolkit: Research Reagent Solutions

Item | Function in Optimization | Example Product/Type
High-Titer Lentiviral Prep | Enables delivery of complex sgRNA libraries at low volumes, reducing surgical trauma and increasing local MOI. | Lenti-X Concentrator (Takara), 2nd-generation packaging plasmids (psPAX2, pMD2.G).
Fluorescent Reporter Virus | Allows rapid, quantitative assessment of transduction efficiency and spatial distribution in tissue slices via microscopy or flow cytometry prior to sequencing. | pLenti-EF1a-EGFP, pLKO.1-puro-CMV-tGFP.
Viability/Cytotoxicity Assay | Quantifies acute toxicity of the delivery formulation or perturbation in vivo. Critical for dose-limiting determinations. | Lactate Dehydrogenase (LDH) Assay, TUNEL Staining Kits.
sgRNA Barcode Amplification Primers | Specific primers to amplify integrated guide barcodes from genomic DNA for qPCR or NGS library prep to measure clone abundance and distribution. | Custom oligonucleotides targeting library backbone constant regions.
Single-Cell Partitioning Reagents | Essential for converting optimized tissue samples into scRNA-seq libraries. | Chromium Next GEM Chip G (10x Genomics), Partitioning Oil.
Feature Barcoding Kit | Captures perturbation identity (sgRNA) alongside transcriptional profile in each cell during scRNA-seq. | CellPlex Kit, CRISPR Guide Capture (10x Genomics).
Bioinformatics Software | Analyzes time-course scRNA-seq data to quantify perturbation strength, specificity, and compensation over time. | Seurat, Scanpy, Mixscape, MUSIC.

Within the broader thesis on advancing in vivo pooled CRISPR screening using MIC-Drop and Perturb-seq, rigorous experimental and analytical controls are paramount. This Application Note details the implementation of Control Guides and the Mixscape computational framework to enhance the specificity and interpretability of in vivo genetic perturbation data.

The Critical Role of Control Guides

Control guides are non-targeting sgRNAs or sgRNAs targeting safe-harbor loci (e.g., AAVS1, ROSA26) that induce no specific phenotypic change. In MIC-Drop/Perturb-seq workflows, they are essential for:

  • Establishing baseline gene expression profiles.
  • Distinguishing perturbation-specific effects from non-specific delivery/vector effects.
  • Accounting for batch effects and inter-animal variability in in vivo studies.
  • Empowering advanced analytical tools like Mixscape to model and remove technical noise.

Mixscape: A Primer for In Vivo Perturb-seq Analysis

Mixscape (Papalexi et al., Nature Genetics, 2021) is a computational method designed to enhance the signal-to-noise ratio in pooled Perturb-seq data. Its application to in vivo screens is critical due to increased biological and technical noise.

Core Principle: Mixscape uses cells carrying control (non-targeting) guides as a "perturbation-negative" reference population. For each cell, it computes a perturbation signature by subtracting the average expression of that cell's nearest control-guide neighbors, which removes shared variation such as cell cycle, stress, or batch. For each target gene, cells are then scored along the direction that separates perturbed from control cells, and a mixture model fitted to this "perturbation score" separates cells that have responded to the genetic perturbation from non-responders that resemble controls.
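
In the published Mixscape procedure (available in Seurat through CalcPerturbSig() and RunMixscape()), each cell's perturbation signature is obtained by subtracting the mean profile of its nearest control-guide neighbors. The sketch below illustrates only that local subtraction step, in plain Python on toy vectors. The neighbor search here runs on the raw toy values for brevity, whereas the real method searches in a reduced-dimension space, and the mixture-model classification step is omitted.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def perturbation_signature(cell, control_cells, k=3):
    """Subtract the mean expression of the k nearest control cells from `cell`."""
    nearest = sorted(control_cells, key=lambda c: euclidean(cell, c))[:k]
    local_mean = [sum(values) / len(nearest) for values in zip(*nearest)]
    return [x - m for x, m in zip(cell, local_mean)]


# Toy expression vectors over three genes (hypothetical values).
controls = [
    [1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0],
    [1.0, 1.0, 3.0],
    [9.0, 9.0, 9.0],   # a distant control cell, e.g. a different cell state
]
perturbed_cell = [5.0, 1.0, 1.0]

signature = perturbation_signature(perturbed_cell, controls, k=3)
print(signature)
```

Because the reference is built from local neighbors rather than from all controls, the distant control cell does not distort the signature; only the first gene, which differs from matched controls, remains non-zero.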

Table 1: Impact of Mixscape Analysis on Perturbation Detection Sensitivity

Metric | Raw Perturb-seq Data (No Mixscape) | Mixscape-Processed Data | Notes
Median DE Genes per KO (p<0.05) | 45 | 118 | In a model in vivo T cell screen (targeting 20 kinases).
Signal-to-Noise Ratio* | 1.0 (ref) | 2.8 | *Calculated as (mean signature score of KO cells) / (SD of control guide cells).
Percentage of KO Cells Classified as "Responsive" | ~40-60% | ~75-90% | Varies by gene essentiality and cell type.
False Positive Rate (DE Genes) | 12% | <5% | At nominal p-value threshold of 0.05.
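
The signal-to-noise definition in the table footnote is straightforward to compute. A minimal sketch follows, using hypothetical scores and the sample standard deviation; reporting the raw data as "1.0 (ref)" corresponds to dividing each ratio by the raw-data ratio.

```python
from statistics import mean, stdev


def signal_to_noise(ko_scores, control_scores):
    """(mean signature score of KO cells) / (SD of control-guide cells)."""
    return mean(ko_scores) / stdev(control_scores)


# Hypothetical signature scores before and after processing.
raw = signal_to_noise([1.0, 2.0, 3.0], [-2.0, 0.0, 2.0])
processed = signal_to_noise([2.0, 3.0, 4.0], [-1.0, 0.0, 1.0])

print(raw, processed, processed / raw)  # the last value is relative to the raw reference
```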

Table 2: Recommended Control Guide Ratios for In Vivo MIC-Drop/Perturb-seq

Screening Scale | Total Guide Number | Minimum Control Guides | Recommended % of Library | Purpose
Focused Screen | 50-200 | 20-30 | 15-20% | Robust per-animal normalization.
Genome-wide (Mouse) | ~10,000 | 500-1000 | 5-10% | Modeling complex noise, batch correction.

Detailed Protocols

Protocol 1: Designing and Incorporating Control Guides for In Vivo MIC-Drop

  • Design: Select 20-30 non-targeting sgRNA sequences from established libraries (e.g., Brunello, horCRISPRi-v2). Ensure no significant off-target homology (BLAST against relevant genome).
  • MIC-Drop Emulsion Preparation: Pool control guide plasmids with targeting guides at the ratio defined in Table 2. Co-encapsulate with Cas9 mRNA/protein in droplets.
  • Animal Injection & Harvest: Follow standard MIC-Drop protocol for your model system (e.g., intravenous or intratumoral injection). Harvest target tissue at endpoint.
  • Single-Cell Processing: Prepare a single-cell suspension for 10x Genomics Chromium Next GEM 3' or 5' gene expression with CRISPR guide capture.
  • Sequencing & Alignment: Sequence libraries and align using Cell Ranger (v7.0+). Use cellranger count with the --feature-ref flag to assign guide identities from the Feature Barcode library to cell barcodes.
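
The file passed to --feature-ref is a CSV that lists each guide. The sketch below generates such a file in Python. The column names follow the Cell Ranger Feature Reference format as commonly documented, but the capture pattern depends on the sgRNA scaffold and kit chemistry, and the guide IDs, protospacers, and gene ID shown are hypothetical placeholders. Verify all of these against the Cell Ranger documentation for your version.

```python
import csv
import io

FIELDS = ["id", "name", "read", "pattern", "sequence", "feature_type",
          "target_gene_id", "target_gene_name"]


def build_feature_reference(guides, pattern="(BC)GTTTAAGAGCTAAGCTGGAA", read="R2"):
    """Return feature-reference CSV text for (guide_id, protospacer, gene_id, gene_name) tuples."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for guide_id, protospacer, gene_id, gene_name in guides:
        writer.writerow({
            "id": guide_id,
            "name": guide_id,
            "read": read,
            "pattern": pattern,          # assumption: scaffold-specific, check your vector
            "sequence": protospacer,
            "feature_type": "CRISPR Guide Capture",
            "target_gene_id": gene_id,
            "target_gene_name": gene_name,
        })
    return buffer.getvalue()


# Hypothetical guides: two targeting guides and one non-targeting control.
guides = [
    ("GeneA_sg1", "ACGTACGTACGTACGTACGT", "ENSMUSG_PLACEHOLDER", "GeneA"),
    ("GeneA_sg2", "TGCATGCATGCATGCATGCA", "ENSMUSG_PLACEHOLDER", "GeneA"),
    ("NTC_sg1", "GATCGATCGATCGATCGATC", "Non-Targeting", "Non-Targeting"),
]
csv_text = build_feature_reference(guides)
print(csv_text)
```

Generating the file from the same guide table used for library cloning reduces the risk of mismatches between the sequenced constructs and the reference used for assignment.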

Protocol 2: Implementing Mixscape Analysis for In Vivo Perturb-seq Data

Prerequisite: A Seurat object containing UMI counts, guide calls per cell, and preliminary clustering.

Visualization of Workflows & Signaling

Diagram Title: Integrated Experimental and Computational Workflow

Diagram Title: Mixscape Classification Logic

The Scientist's Toolkit

Table 3: Essential Research Reagent Solutions for Controlled In Vivo Screens

Item | Function & Rationale | Example/Supplier
Validated Non-Targeting sgRNA Pool | Provides the essential negative control population for normalizing expression and running Mixscape. Reduces bias from any single sequence. | Horizon Discovery (horCRISPRi); Addgene #127243 (Brunello NT).
MIC-Drop Assembly Reagents | For reproducible co-encapsulation of guides, Cas9, and barcodes in droplets for in vivo delivery. | Bio-Rad Droplet Generation Oil; custom PEG surfactants.
10x Genomics Chromium Next GEM Kit | Enables coupled gene expression and guide capture from single cells recovered from in vivo tissue. | 10x Genomics (Cat #1000268 / #1000269).
Cell Ranger Feature Barcoding Software | Essential pipeline for aligning sequencing data and associating specific sgRNA barcodes with cell transcriptomes. | 10x Genomics (Cell Ranger v7.0+).
Mixscape (in the Seurat R Package) | Core computational toolkit for noise modeling, perturbation signature scoring, and responder classification. | Seurat: CalcPerturbSig(), RunMixscape().
High-Fidelity Cas9 Nuclease | Ensures efficient and specific cleavage in vivo, minimizing off-target effects that confound control comparisons. | IDT Alt-R S.p. Cas9; Thermo Fisher TrueCut Cas9.

MIC-Drop vs. Perturb-seq vs. Traditional Methods: Validation, Strengths, and Limitations

Within the rapidly advancing field of in vivo functional genomics, two technologies—MIC-Drop and Perturb-seq—represent paradigm-shifting approaches for pooled CRISPR screening with single-cell transcriptomic readouts. The broader thesis posits that the choice between these platforms is not a matter of simple superiority but a strategic decision dictated by the specific biological question, scale, and resource constraints of the research. This application note provides a detailed, data-driven comparison to guide researchers and drug development professionals in selecting and implementing the optimal tool for their in vivo screening campaigns.

MIC-Drop (Multiplexed Intermixed CRISPR Droplets) combines droplet microfluidics with a unique molecular barcoding strategy. Each Cas9 ribonucleoprotein (RNP) complex and its corresponding single-guide RNA (sgRNA) are encapsulated within a water-in-oil droplet alongside a unique DNA barcode. Cells are subsequently merged with these RNP-loaded droplets for transfection. This physical co-encapsulation allows for the delivery of multiple RNPs per cell and directly links the perturbation barcode to the cell prior to sequencing.

Perturb-seq (CRISPR-based Perturbation sequencing) utilizes viral delivery (typically lentivirus) to stably integrate a Cas9 (or dCas9) expression cassette and a library of sgRNAs into a cell population. The sgRNA sequence itself acts as the barcode, which is captured during single-cell RNA sequencing (scRNA-seq) library preparation from the polyadenylated transcript. The perturbation identity is inferred by sequencing the sgRNA from the cellular cDNA.

Head-to-Head Quantitative Comparison

The following table summarizes the key performance and logistical parameters of each technology based on current literature and implementation reports.

Table 1: Core Comparison of MIC-Drop and Perturb-seq

Parameter | MIC-Drop | Perturb-seq (Pooled Lentiviral)
Primary Throughput (Cells) | Moderate (~10⁴ - 10⁵ per run) | Very High (10⁵ - 10⁷+)
Perturbations per Cell | High (Typically 5-10) | Low (Typically 1, some doublets)
Delivery Method | Droplet Microfluidics (RNP) | Lentiviral Transduction (DNA)
Perturbation Timing | Acute, transient (RNP) | Chronic, stable (Integrated)
In Vivo Compatibility | High (Direct RNP delivery to tissues/organisms) | Moderate (Requires pre-engineered cells or in vivo viral delivery)
Multimodal Readouts | Compatible with CITE-seq, ATAC-seq | Well-established for CITE-seq, ATAC-seq
Setup Cost | High (Microfluidics device, custom reagents) | Lower (Leverages standard scRNA-seq workflows)
Reagent Cost per Cell | Higher | Lower (at very large scale)
Operational Complexity | High (Microfluidics expertise required) | Moderate (Standard molecular biology)
Primary Advantage | Multiplexing in vivo, acute perturbations | Unmatched scale, stable lineage tracing

Detailed Experimental Protocols

Protocol A: MIC-Drop for In Vivo Liver Screening in Zebrafish

Objective: To perform multiplexed gene knockout in zebrafish hepatocytes and assess transcriptional outcomes. Key Reagents: See Scientist's Toolkit below.

  • sgRNA and Barcode Complex Preparation:

    • Synthesize sgRNAs with T7 promoter. Transcribe and purify using a commercial kit.
    • Generate a barcode oligo pool (80-100bp) containing a universal primer site, a unique 12-16nt barcode, and an overhang complementary to the sgRNA constant region.
    • Assemble Cas9 RNP by incubating purified Cas9 protein with each sgRNA (3:1 molar ratio) at 25°C for 10 minutes.
    • Ligate the barcode oligo to the sgRNA constant region using T4 DNA ligase. Purify the RNP-Barcode complex.
  • Droplet Generation and Cell Loading:

    • Load the RNP-Barcode complexes into a microfluidic droplet generator (e.g., based on FlowJEM or Drop-seq chips).
    • Generate droplets with an oil phase (e.g., HFE-7500 with 2% PEG-PFPE surfactant).
    • Prepare a single-cell suspension from dissociated zebrafish liver (or inject whole embryos at 1-cell stage).
    • Merge the cell stream with the RNP-droplet stream on-chip to co-encapsulate individual cells with RNP complexes.
  • In Vivo Delivery and Recovery:

    • Break the emulsion and inject the pooled cells/RNPs into the zebrafish circulation or target tissue.
    • Allow phenotypic development for 3-7 days.
    • Re-dissociate the target tissue (e.g., liver) into a single-cell suspension.
  • Single-Cell RNA Sequencing:

    • Process the cell suspension through a standard scRNA-seq platform (10x Genomics Chromium).
    • Prepare libraries following standard protocols, ensuring primers capture the perturbation barcode (added as a custom feature).
    • Sequence on an Illumina NovaSeq (28:10:90 for Read1:Index:Read2).
  • Data Analysis:

    • Align gene expression reads (Read2) to the zebrafish reference genome.
    • Extract barcode reads (Read1) and demultiplex to assign 1-10 perturbations per cell.
    • Perform differential expression analysis (e.g., with MAST) between cells with a target perturbation versus control sgRNAs.
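
The barcode demultiplexing step can be sketched as a thresholded assignment of perturbation barcodes to cells. This minimal illustration assumes barcode reads have already been collapsed to UMI counts per (cell, perturbation) pair. The minimum-UMI cutoff and the rule that drops cells with more than 10 calls are assumptions for illustration, not part of a published pipeline.

```python
from collections import defaultdict


def assign_perturbations(umi_counts, min_umis=3, max_per_cell=10):
    """Map each cell barcode to the sorted perturbation barcodes supported by enough UMIs.

    umi_counts: iterable of (cell_barcode, perturbation_barcode, n_umis) tuples.
    Cells with no confident call, or with more than max_per_cell calls, are omitted.
    """
    calls = defaultdict(set)
    for cell, perturbation, n_umis in umi_counts:
        if n_umis >= min_umis:
            calls[cell].add(perturbation)
    return {
        cell: sorted(perts)
        for cell, perts in calls.items()
        if len(perts) <= max_per_cell
    }


# Hypothetical UMI counts.
records = [
    ("cell1", "bcA", 12),
    ("cell1", "bcB", 5),
    ("cell1", "bcC", 1),   # below threshold, treated as ambient or error
    ("cell2", "bcA", 2),   # below threshold, cell left unassigned
    ("cell3", "bcD", 9),
]
assignments = assign_perturbations(records)
print(assignments)
```

The resulting per-cell lists can then be used to define the perturbed and control groups for the differential expression step.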

Protocol B: Pooled In Vitro Perturb-seq Screen in a Cell Line

Objective: Genome-scale CRISPRi screen in human K562 cells to identify regulators of differentiation.

  • Library Cloning and Virus Production:

    • Clone a pooled sgRNA library (e.g., human CRISPRi v2) into a lentiviral vector containing the sgRNA scaffold and U6 promoter.
    • Co-transfect HEK293T cells with the lentiviral vector, psPAX2, and pMD2.G using polyethylenimine (PEI).
    • Harvest virus supernatant at 48 and 72 hours, concentrate by ultracentrifugation, and titer.
  • Cell Transduction and Selection:

    • Transduce K562 cells stably expressing dCas9-KRAB at a low MOI (<0.3) to ensure single integrations. Include a spinfection step.
    • After 48 hours, select transduced cells with puromycin (2 µg/mL) for 5-7 days.
  • Single-Cell Library Preparation:

    • Harvest 1-2 million cells, ensuring >500x coverage of the sgRNA library.
    • Load cells onto a 10x Genomics Chromium Controller to generate single-cell Gel Bead-in-Emulsions (GEMs).
    • Construct libraries using the Chromium Single Cell 3’ Reagent Kit v3.1, incorporating a custom step during cDNA amplification to enrich for sgRNA-derived cDNA (using a primer specific to the sgRNA constant region).
  • Sequencing and Analysis:

    • Sequence libraries: 28 bp for cell barcode/UMI (Read1, 16 bp barcode plus 12 bp UMI for v3.1 chemistry), 8 bp for sample index (i7), and 98 bp for transcript/sgRNA (Read2).
    • Process using Cell Ranger with a custom reference genome that includes the sgRNA library sequences.
    • Use cellranger count to generate a feature-barcode matrix encompassing both gene expression and sgRNA counts.
    • Analyze with dedicated tools (e.g., Crispy or Perturb-seq pipeline) to associate sgRNA identities with transcriptomic clusters and perform differential expression.
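
The coverage requirement in the single-cell library preparation step is simple arithmetic, and checking it before the experiment is worthwhile. The sketch below uses hypothetical library sizes: the 100,000-guide figure is an illustrative genome-scale size, not the exact size of any named library.

```python
def library_coverage(n_cells, n_guides):
    """Average number of cells per guide."""
    return n_cells / n_guides


def cells_required(n_guides, coverage):
    """Cells needed to reach a target average coverage per guide."""
    return n_guides * coverage


print(library_coverage(2_000_000, 4_000))    # coverage with 2 million cells, 4,000 guides
print(cells_required(100_000, 500))          # cells for 500x on a 100,000-guide library
```

With 1-2 million cells, 500x coverage is reached only for libraries of roughly 2,000-4,000 guides; a library of 100,000 guides would need 5 x 10^7 cells for the same coverage. For single-cell readouts, the number of cells actually captured per guide, rather than the number harvested, sets the statistical power.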

Visualization of Workflows and Pathways

Diagram 1: Comparative experimental workflows for MIC-Drop and Perturb-seq.

Diagram 2: Logical pathway from perturbation to sequencing readout.

The Scientist's Toolkit: Essential Research Reagent Solutions

Table 2: Key Reagents and Materials

Item | Function in MIC-Drop | Function in Perturb-seq
Purified Cas9 Nuclease | Core component of pre-assembled RNP complexes. | Not directly used; replaced by stable cell line expression.
Custom Barcode Oligo Pool | Provides unique, ligatable molecular identifier for each RNP. | Not typically used; sgRNA sequence is the barcode.
Microfluidic Chips & Surfactant | Generates stable water-in-oil emulsions for RNP/cell encapsulation. | Not required for standard workflows.
Lentiviral sgRNA Library | Not used in standard protocol. | Core reagent. Delivers stable, integrated genetic perturbations at scale.
Packaging Plasmids (psPAX2, pMD2.G) | Not used. | Essential for producing lentiviral particles.
dCas9-KRAB Expressing Cell Line | Optional; can be used as starting material. | Critical starting material for CRISPRi/a screens.
10x Genomics Chromium Controller & Kits | Used for final single-cell RNA-seq library prep. | Standard platform for high-throughput scRNA-seq readout.
Custom PCR Primers for Barcode/sgRNA Enrichment | Designed to amplify the attached barcode during library prep. | Designed to amplify the sgRNA from polyadenylated transcript.
Polyethylenimine (PEI) | Not typically used. | Standard transfection reagent for lentivirus production in HEK293T cells.

Application Notes

The advent of scalable in vivo functional genomics has been pivotal for target discovery and validation. While RNA interference (RNAi) has been the cornerstone for loss-of-function screening for decades, CRISPR/Cas9-mediated knockout now offers a compelling alternative. This analysis contrasts these two pillars within the framework of next-generation screening technologies like MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq, which are shifting paradigms from bulk phenotypic readouts to single-cell, multi-omics resolution in vivo.

Key Comparative Parameters:

Table 1: Core Technological Comparison

Parameter | CRISPR/Cas9 Knockout | RNAi (shRNA)
Mechanism of Action | Creates double-strand breaks leading to frameshift indels and gene disruption. | Triggers mRNA degradation or translational inhibition via the RNA-induced silencing complex (RISC).
Effect on Target | Permanent; complete knockout when both alleles are disrupted (editing outcomes can be mosaic). | Transient, partial knockdown (typically 70-90% reduction).
Specificity & Off-Targets | High specificity; off-targets are sequence-dependent and can be minimized with high-fidelity Cas9. | High risk of seed-sequence-mediated off-target mRNA degradation.
Kinetics | Permanent effect post-DNA repair; phenotype may manifest over days as protein degrades. | Rapid mRNA depletion (hours to days); phenotype is reversible.
Screening Library Design | Requires sgRNA design in early coding exons; ~4-5 guides/gene recommended. | Requires shRNA design against mature mRNA; ~5-10 shRNAs/gene recommended to overcome inefficacy.
In Vivo Delivery | AAV, lentivirus for sgRNA + Cas9 (constitutive/inducible). Commonly uses Cre-dependent Cas9 mouse lines. | Lentivirus for shRNA expression; can use inducible Pol II/III systems.
Phenotypic Readout Integration | Highly compatible with single-cell RNA-seq (Perturb-seq) for direct transcriptome capture. | Compatible, but shRNA barcode must be captured separately from transcriptome, adding complexity.

Integration with Advanced Screening Platforms

For in vivo applications, CRISPR/Cas9 is inherently synergistic with techniques like MIC-Drop and Perturb-seq. MIC-Drop encapsulates individual sgRNAs with Cas9 protein and a unique barcode into droplets, enabling highly multiplexed direct in vivo injection and lineage tracing. Perturb-seq links a CRISPR perturbation to the whole-transcriptome profile of the same single cell. RNAi struggles to match this integrated workflow due to its cytoplasmic mechanism and the less straightforward coupling of the shRNA barcode to the transcriptional outcome within a single-cell sequencing workflow.

Table 2: Performance in Advanced Screening Contexts

Screening Context | CRISPR/Cas9 Advantage | RNAi Consideration
Essential Gene Identification | Identifies strong, consistent lethal phenotypes. | May miss essential genes due to incomplete knockdown, revealing hypomorphic phenotypes.
Complex Phenotype (e.g., Tumor Metastasis) | Reveals genes where complete loss is required for phenotype. | Can identify genes sensitive to dosage effects.
Single-Cell Omics Readout (Perturb-seq) | Native compatibility; sgRNA transcript is captured alongside mRNA during scRNA-seq library preparation. | Requires custom barcoding strategies to link shRNA to cell transcriptome.
High-Throughput In Vivo Delivery (MIC-Drop) | Ideal for encapsulated RNP delivery and in situ mutagenesis. | Less suited for protein encapsulation; relies on viral transduction for shRNA expression.

Experimental Protocols

Protocol 1: In Vivo Pooled CRISPR Knockout Screen Using Lentiviral sgRNA Delivery

Objective: To perform a positive selection screen for tumor suppressor genes in a murine hepatocellular carcinoma model.

Materials:

  • Murine sgRNA library (e.g., Brie, Mouse GeCKO v2)
  • Lentiviral packaging plasmids (psPAX2, pMD2.G)
  • HEK293T cells for virus production
  • Cas9-expressing mouse model (e.g., Rosa26-LSL-Cas9)
  • Hepatocyte-specific Cre-AAV8
  • NGS platform for barcode sequencing

Procedure:

  • Library Amplification & Virus Production: Amplify the sgRNA plasmid library in E. coli and prepare high-titer lentivirus in HEK293T cells using standard calcium phosphate transfection.
  • Ex Vivo Transduction & Tumor Initiation: Harvest primary hepatocytes from Rosa26-LSL-Cas9 mice. Cas9 expression from the LSL allele requires Cre, supplied here by the hepatocyte-specific Cre-AAV8. Transduce cells ex vivo with the sgRNA lentivirus at a low MOI (<0.3) so that most transduced cells carry a single guide. Re-implant transduced cells into the liver of immunodeficient recipient mice.
  • Tumor Formation & Selection: Allow tumors to develop over 8-12 weeks.
  • Sample Collection & Genomic DNA Extraction: Harvest tumor tissue and matched normal liver control. Extract genomic DNA using a column-based kit.
  • sgRNA Amplification & Sequencing: Perform a two-step PCR to amplify the integrated sgRNA cassette from genomic DNA and attach Illumina sequencing adapters and sample barcodes.
  • Data Analysis: Sequence on an Illumina MiSeq/NextSeq. Align reads to the sgRNA library reference. Calculate sgRNA enrichment/depletion in tumors vs. input using MAGeCK or PinAPL-Py.
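
The enrichment calculation in the final step can be illustrated with a minimal sketch. It is not MAGeCK's or PinAPL-Py's statistical model; it only shows the underlying idea of comparing depth-normalized sgRNA counts between tumor and input. The guide names, read counts, and pseudocount are hypothetical.

```python
import math

def log2_fold_changes(tumor_counts, input_counts, pseudocount=1.0):
    """Return {sgRNA: log2(tumor CPM / input CPM)} for every guide in the input."""
    tumor_total = sum(tumor_counts.values())
    input_total = sum(input_counts.values())
    lfc = {}
    for guide, n_input in input_counts.items():
        n_tumor = tumor_counts.get(guide, 0)  # guides lost in the tumor count as 0
        tumor_cpm = n_tumor / tumor_total * 1e6
        input_cpm = n_input / input_total * 1e6
        # The pseudocount keeps the ratio finite when a guide drops out completely.
        lfc[guide] = math.log2((tumor_cpm + pseudocount) / (input_cpm + pseudocount))
    return lfc

# Hypothetical read counts: sgA is enriched, sgB depleted, controls unchanged.
input_counts = {"sgA_1": 500, "sgB_1": 500, "sgCtrl_1": 500, "sgCtrl_2": 500}
tumor_counts = {"sgA_1": 950, "sgB_1": 50, "sgCtrl_1": 500, "sgCtrl_2": 500}

for guide, value in sorted(log2_fold_changes(tumor_counts, input_counts).items()):
    print(f"{guide}\t{value:+.2f}")
```

In a positive selection screen for tumor suppressors, guides with a large positive log2 fold change are the candidates; dedicated tools add replicate-aware statistics and gene-level aggregation on top of this ratio.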

Protocol 2: In Vivo RNAi Knockdown Screen Using Inducible shRNA

Objective: To identify kinase genes required for the maintenance of an established lymphoma in vivo.

Materials:

  • Doxycycline-inducible shRNAmir library (e.g., TRMPV-Neo)
  • Lentiviral packaging system
  • Target lymphoma cell line
  • NSG mice
  • Doxycycline chow (625 mg/kg)

Procedure:

  • Stable Cell Line Generation: Transduce the target cell line with the inducible shRNA library at a low MOI (<0.3). Select for 7 days with the antibiotic matching the vector's resistance marker (G418 for TRMPV-Neo) to generate a stable, polyclonal library pool.
  • Tumor Establishment: Inject 1x10^6 library-representative cells subcutaneously into NSG mice. Allow tumors to reach ~200 mm³.
  • Induction of Knockdown: Randomize mice into two groups. Switch the experimental group to doxycycline chow to induce shRNA expression; maintain the control group on standard chow.
  • Phenotypic Monitoring: Measure tumor volume every 3 days for 3 weeks.
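  • Sample Processing & Barcode Recovery: Harvest tumors at endpoint. Extract genomic DNA and PCR-amplify the integrated shRNA cassette (the shRNA sequence itself serves as the barcode within the shRNAmir backbone). Genomic DNA, unlike RNA, also reports shRNAs in uninduced control tumors, where the cassette is not expressed.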
  • NGS & Analysis: Sequence barcode amplicons. Compare shRNA barcode abundance in doxycycline-induced tumors versus pre-induction input or control tumors to identify depleted shRNAs.

Visualizations

Gene Editing Screening Selection Logic

In Vivo Screening Workflow Comparison

The Scientist's Toolkit: Research Reagent Solutions

Item | Function in Screening | Example/Notes
High-Complexity sgRNA/shRNA Library | Provides genome-scale coverage for unbiased gene discovery. | Brie (mouse CRISPR), Brunello (human CRISPR), TRC (RNAi). Must maintain >500x coverage.
Cas9-Expressing Mouse Model | Enables in vivo CRISPR screening without Cas9 delivery. | Rosa26-LSL-Cas9, CAG-Cas9 transgenic lines.
Inducible shRNA System | Allows controlled gene knockdown after tumor establishment. | Tet-On systems (e.g., TRMPV, pINDUCER).
Next-Gen Barcoded Delivery | For high-multiplex, traceable in vivo editing. | MIC-Drop: encapsulates sgRNA+Cas9 RNP with a heritable DNA barcode.
Single-Cell Multi-omics Platform | Links genetic perturbation to transcriptional outcome. | Perturb-seq: combines CRISPR screening with droplet-based scRNA-seq (10x Genomics).
sgRNA/shRNA Amplicon Seq Kit | Recovers perturbation identity from bulk tissue for NGS. | Illumina Nextera-based custom PCR protocols.
Bioinformatics Pipeline | Quantifies guide/barcode abundance and statistical significance. | MAGeCK (CRISPR), RIGER (RNAi), PinAPL-Py.

Application Notes

This application note validates the MIC-Drop (Multiplexed Intermixed CRISPR Droplets) platform for large-scale, in vivo genetic screening. The work sits within a broader thesis that positions MIC-Drop, coupled with Perturb-seq principles, as a transformative methodology for functional genomics in whole organisms. The case study describes a pooled, in vivo CRISPR screen targeting 133 candidate genes in zebrafish to systematically identify novel regulators of cardiac development and function.

Key Quantitative Results: The primary screen quantified cardiac morphology and function at 3 days post-fertilization (dpf). Key hits were validated through secondary assays.

Table 1: Primary Screen Hit Identification (Phenotypic Classes)

Phenotypic Class | Number of Genes Identified | Example Phenotype | Statistical Threshold
Severe Dysmorphology | 8 | Incomplete looping, severe edema | p < 0.001, effect size > 2 SD
Cardiomyopathy | 12 | Reduced fractional shortening | p < 0.005, effect size > 1.5 SD
Heart Rate Defects | 9 | Bradycardia or arrhythmia | p < 0.01
No Observable Defect | 104 | Wild-type-like morphology & function | Not significant

Table 2: Validation Rates from Secondary Analysis

Validation Method | Genes Tested | Confirmation Rate | Key Metric
Individual MIC-Drop Re-injection | 20 | 85% (17/20) | Phenotype recapitulation
Whole-mount in situ Hybridization | 15 | 80% (12/15) | Altered cardiac chamber markers (e.g., vmhc, amhc)
High-Speed Videomicroscopy | 10 | 90% (9/10) | Quantified fractional shortening & heart rate

Detailed Experimental Protocols

Protocol 1: Pooled MIC-Drop Library Preparation and Embryo Injection

  • gRNA Design & Library Cloning: Design four gRNAs per target gene (133 genes + 10 non-targeting controls) using optimized zebrafish CRISPR rules. Clone into the MIC-Drop vector backbone containing the U6:gRNA scaffold and a unique barcode sequence for each gRNA.
  • Water-in-Oil Emulsion Encapsulation: Dilute the pooled plasmid library. Use a microfluidic droplet generator to encapsulate single plasmid molecules, PCR reagents, and barcoded beads into ~50 µm diameter water-in-oil droplets. Perform emulsion PCR to amplify and link each gRNA sequence to its unique bead barcode.
  • Break Emulsion & Bead Recovery: Break the emulsion using perfluoro-octanol. Wash and quantify the barcoded beads. Load beads into the MIC-Drop device cartridge.
  • Zebrafish Embryo Microinjection: Align one-cell stage Tg(myl7:EGFP) zebrafish embryos on an injection mold. Using the MIC-Drop device, co-inject a single barcoded bead (carrying one gRNA) and Cas9 protein into the yolk of each embryo. Target ~500 embryos per target gene for statistical power.

Protocol 2: Phenotypic Screening & Image Analysis at 3 dpf

  • Sample Preparation: Anesthetize 3 dpf larvae with tricaine. Mount laterally in 1% low-melt agarose on a glass-bottom dish.
  • High-Throughput Imaging: Use an automated fluorescent microscope with a temperature-controlled stage. Acquire z-stacks of the heart for each larva using GFP channel (cardiac muscle). Capture brightfield images for overall morphology.
  • Automated Image Analysis Pipeline:
    • Segmentation: Use convolutional neural network (U-Net) to segment the heart ventricle and atrium from GFP stacks.
    • Morphometry: Extract metrics: heart area, perimeter, circularity, and chamber dimensions.
    • Function Analysis: From a time-series, calculate heart rate (beats/min) and fractional shortening: ((End-diastolic diameter - End-systolic diameter) / End-diastolic diameter) * 100.
  • Genotype-Phenotype Linking: Isolate genomic DNA from each phenotyped larva. Amplify and sequence the bead barcode region to link the observed cardiac phenotype to the specific gRNA/gene perturbation.
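
The two functional metrics in the Function Analysis step reduce to simple arithmetic, sketched below. The input values are hypothetical, and beat timestamps are assumed to have already been extracted from the time series by an upstream peak-detection step that is not shown.

```python
def fractional_shortening(edd, esd):
    """Percent fractional shortening from end-diastolic (edd) and end-systolic (esd) diameters."""
    if edd <= 0:
        raise ValueError("end-diastolic diameter must be positive")
    return (edd - esd) * 100.0 / edd

def heart_rate_bpm(beat_times_s):
    """Mean heart rate (beats/min) from timestamps, in seconds, of successive beats."""
    if len(beat_times_s) < 2:
        raise ValueError("need at least two beats")
    duration = beat_times_s[-1] - beat_times_s[0]
    # n timestamps span n - 1 complete beat intervals.
    return (len(beat_times_s) - 1) / duration * 60.0

# Hypothetical larva: diameters in micrometers, one beat every 0.5 s.
print(fractional_shortening(100.0, 70.0))          # 30.0
print(heart_rate_bpm([0.0, 0.5, 1.0, 1.5, 2.0]))   # 120.0
```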

Protocol 3: Secondary Validation by Whole-mount In Situ Hybridization (WISH)

  • Probe Synthesis: Generate digoxigenin (DIG)-labeled antisense RNA probes for cardiac markers (vmhc for ventricle, amhc for atrium).
  • Fixation & Hybridization: Fix control and mutant larvae (identified by barcode sequencing) at 48-72 hpf in 4% PFA. Permeabilize with proteinase K. Pre-hybridize, then incubate with DIG-probe overnight at 65°C.
  • Detection: Wash stringently. Incubate with anti-DIG-AP antibody. Develop color reaction using NBT/BCIP substrate. Image larvae and assess chamber-specific gene expression patterns.

Visualizations

Title: MIC-Drop In Vivo Screening Workflow

Title: Cardiac Gene Network with MIC-Drop Hits

The Scientist's Toolkit: Key Research Reagent Solutions

Table 3: Essential Materials for MIC-Drop Cardiac Screening

Item / Reagent | Function / Purpose | Key Feature
MIC-Drop Vector Library | Backbone for gRNA cloning and barcode association. | Contains U6 promoter, gRNA scaffold, and unique molecular identifier (UMI) sites.
Barcoded Beads (MIC-Drop) | Solid support for single gRNA delivery. | Hydrogel beads with covalently linked, unique DNA barcode sequences.
Recombinant Cas9 Protein | CRISPR effector for immediate genome editing post-injection. | High specific activity, nuclease-grade, zebrafish-tested.
Tg(myl7:EGFP) Zebrafish Line | Transgenic line with GFP-labeled cardiomyocytes. | Enables live, high-contrast imaging of heart morphology and function.
Automated Fluorescence Microscope | High-throughput phenotypic imaging. | Motorized stage, z-stack capability, temperature control, and software automation.
Cardiac Phenotype Analysis Software | Quantifies heart morphology and function from images. | Custom or commercial (e.g., HeartJ) pipelines for segmentation and motion analysis.
NBT/BCIP Stock Solution | Chromogenic substrate for in situ hybridization validation. | Alkaline phosphatase substrate yielding purple precipitate.

Within the broader thesis on advancing in vivo functional genomics, this case study examines the synergistic potential of integrating MIC-Drop (Multiplexed Intermixed CRISPR Droplets) with Perturb-seq. While MIC-Drop enables scalable, in vivo combinatorial genetic perturbation, Perturb-seq provides the high-throughput transcriptional phenotyping needed to decode the resulting cellular states. This application note details a foundational ex vivo Perturb-seq study in mouse cells, which established critical methodology and biological insights into immune cell gene regulatory networks, paving the way for future combined in vivo MIC-Drop/Perturb-seq screens.

A pivotal study (Dixit et al., Cell, 2016) demonstrated the power of Perturb-seq by systematically mapping gene regulatory networks in immune cells. Researchers used CRISPR-Cas9 to perturb transcription factors (TFs) in mouse bone marrow-derived macrophages (BMDMs) and sequenced single-cell RNA to capture the effects.

Key Quantitative Findings:

Table 1: Key Experimental Parameters and Outcomes

Parameter | Detail
Perturbation Library | 24 transcription factor genes + 6 non-targeting controls
Screening Model | Immortalized BMDMs from Cas9-expressing mice
Cells Analyzed | ~80,000 single cells post-quality control
Key Deliverable | A directed gene regulatory network linking TFs to target genes
Validation Rate | ~90% of predicted regulatory interactions confirmed

Table 2: Example Network Interactions Identified

Perturbed Master Regulator | Key Downstream Target Gene(s) | Effect on Expression
Stat1 | Irf1, Cxcl10 | Strong activation
Cebpb | Mmp13, Il6ra | Context-dependent regulation
Irf8 | Ccl22, Cish | Repression

Detailed Experimental Protocols

Protocol 1: Library Cloning & Lentiviral Production

  • Design & Synthesis: Design 3 sgRNAs per target TF gene and 6 non-targeting control sgRNAs. Clone into a lentiviral sgRNA expression vector containing a GFP marker.
  • Lentivirus Production: Produce lentiviral particles in HEK293T cells using a standard second-generation packaging system (psPAX2, pMD2.G).
  • Titer Determination: Transduce HEK293T cells with serial dilutions of virus; measure GFP+ percentage via flow cytometry 72h post-transduction to calculate TU/mL.
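
The titer calculation in the last step follows the usual functional-titer relation, TU/mL = (cells at transduction × fraction GFP+) / volume of virus added. A minimal sketch follows; the cell number, GFP+ fraction, and virus volume are hypothetical, and the relation is only a good approximation for dilutions with a low GFP+ percentage, where most transduced cells carry a single integration.

```python
def titer_tu_per_ml(cells_at_transduction, fraction_gfp_pos, virus_volume_ml):
    """Functional titer in transducing units (TU) per mL of undiluted virus stock."""
    if not 0.0 <= fraction_gfp_pos <= 1.0:
        raise ValueError("fraction_gfp_pos must be between 0 and 1")
    if virus_volume_ml <= 0:
        raise ValueError("virus_volume_ml must be positive")
    return cells_at_transduction * fraction_gfp_pos / virus_volume_ml

# Hypothetical well: 1e5 cells at transduction, 10% GFP+, 1 µL (0.001 mL) of stock added.
titer = titer_tu_per_ml(1e5, 0.10, 0.001)
print(f"{titer:.2e} TU/mL")
```

When virus was pre-diluted, `virus_volume_ml` should be the volume of undiluted stock that the well actually received.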

Protocol 2: Cell Line Generation & Perturbation

  • Isolate BMDMs: Flush bone marrow from femurs/tibias of Cas9-expressing transgenic mice. Differentiate in DMEM + 10% FBS + 20% L929-conditioned medium (source of M-CSF) for 7 days.
  • Transduction at Low MOI: Transduce immortalized BMDMs at an MOI of ~0.3-0.4 to ensure most cells receive a single sgRNA. Use spinfection (1000g, 90 min, 32°C).
  • Selection & Expansion: Culture transduced cells for 7-10 days to allow for target protein turnover and transcriptional effects.
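
The reasoning behind the low MOI can be made concrete with Poisson statistics, a common simplifying assumption in which the number of integrations per cell is Poisson-distributed with mean equal to the MOI. The sketch below reports, for a given MOI, what fraction of transduced cells carry exactly one sgRNA.

```python
import math

def poisson_guide_fractions(moi):
    """Fractions of cells with 0, 1, or >1 integrations, assuming Poisson(moi)."""
    p0 = math.exp(-moi)
    p1 = moi * math.exp(-moi)
    transduced = 1.0 - p0
    return {
        "uninfected": p0,
        "single": p1,
        "multiple": 1.0 - p0 - p1,
        "single_among_transduced": p1 / transduced,
    }

for moi in (0.3, 0.4, 1.0):
    f = poisson_guide_fractions(moi)
    print(f"MOI {moi}: {f['single_among_transduced']:.1%} of transduced cells carry one sgRNA")
```

Under this model roughly 81-86% of transduced cells carry a single guide at MOI 0.3-0.4, which is why most, but not all, cells can be assigned to one perturbation; the remainder should be filtered out during sgRNA assignment.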

Protocol 3: Single-Cell RNA-Seq Library Preparation (Perturb-seq)

  • Single-Cell Capture: Use the Chromium Controller (10x Genomics) to partition single, perturbed cells into droplets with gel beads carrying unique barcodes.
  • Reverse Transcription: Perform RT within droplets to produce barcoded cDNA.
  • Library Construction: Amplify cDNA, enzymatically fragment, and add sample indexes via PCR to generate sequencing-ready libraries. Generate a separate amplicon library for sgRNA detection by PCR-enriching the guide (or guide-barcode) transcripts from the same cell-barcoded cDNA, so that each sgRNA read retains its cell barcode.
  • Sequencing: Sequence on an Illumina HiSeq 2500/4000 (or equivalent). Aim for >50,000 reads/cell for gene expression and sufficient coverage for sgRNA detection.

Protocol 4: Computational Analysis Pipeline

  • Alignment & Quantification: Align reads to the mouse genome (mm10) using Cell Ranger (10x Genomics). Count unique molecular identifiers (UMIs) per gene per cell.
  • Cell & sgRNA Assignment: Assign each cell to its perturbed gene by matching the cell barcode to the sgRNA sequence from the amplicon library.
  • Differential Expression: For cells perturbing Gene X vs. all control cells, perform differential expression (e.g., using MAST) to identify significantly up- or down-regulated genes.
  • Network Inference: Construct a directed network where edges are drawn from perturbed TFs to significantly altered genes. Use graph-based visualization tools.
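
The Cell & sgRNA Assignment step can be sketched as a simple dominance rule: a cell is assigned to a guide only if that guide has enough UMIs and accounts for most of the cell's guide UMIs. This is an illustrative heuristic, not the procedure used by Cell Ranger or by the original study; the thresholds, cell barcodes, and guide names are hypothetical.

```python
def assign_guides(guide_umis, min_umis=3, min_fraction=0.8):
    """Map each cell barcode to its dominant sgRNA, or None if ambiguous or undetected.

    guide_umis: {cell_barcode: {sgRNA_name: UMI_count}}
    """
    assignments = {}
    for cell, counts in guide_umis.items():
        total = sum(counts.values())
        if total == 0:
            assignments[cell] = None  # no guide detected
            continue
        top_guide, top_count = max(counts.items(), key=lambda kv: kv[1])
        confident = top_count >= min_umis and top_count / total >= min_fraction
        assignments[cell] = top_guide if confident else None
    return assignments

# Hypothetical guide UMI counts for three cells.
guide_umis = {
    "AAAC": {"Stat1_g1": 12, "Irf8_g2": 1},   # clear single perturbation
    "AAAG": {"Stat1_g1": 5, "Cebpb_g1": 4},   # likely two guides, left unassigned
    "AAAT": {},                               # no guide reads
}
print(assign_guides(guide_umis))
```

Cells left unassigned are excluded before differential expression, so that each perturbation group is compared against non-targeting controls without contamination from multi-guide cells.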

Signaling Pathway & Workflow Visualizations

Perturb-seq Core Mechanism

Mouse BMDM Perturb-seq Workflow

The Scientist's Toolkit: Key Research Reagent Solutions

Table 3: Essential Materials and Reagents

Item | Function in Experiment | Specific Example / Note
Cas9-Expressing Mouse Line | Provides the endogenous nuclease for CRISPR perturbations in primary immune cells. | B6J.129(Cg)-Gt(ROSA)26Sor/J (JAX: 026179)
Lentiviral sgRNA Vector | Delivers the sgRNA together with a marker for selection or sorting (puromycin resistance or GFP). | lentiGuide-Puro (Addgene #52963) or a similar GFP-marked vector.
L929 Cell Line | Source of M-CSF for the differentiation of mouse bone marrow progenitors into macrophages. | ATCC CCL-1; conditioned medium is critical.
Chromium Single Cell 3' Kit | Integrated solution for barcoding, RT, and library prep of single-cell transcripts. | 10x Genomics, Cat. No. 1000268.
Cell Ranger Software | Primary analysis pipeline for demultiplexing, alignment, barcode counting, and UMI aggregation. | 10x Genomics (requires reference genome).
MAST R Package | Statistical model for differential expression analysis in single-cell RNA-seq data; handles bimodality. | Commonly used for Perturb-seq DE (Finak et al., Genome Biology, 2015).

Assessing Reproducibility and Robustness of In Vivo Screening Hits

Introduction and Context

The integration of high-throughput in vivo screening technologies, such as MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq (CRISPR screening with single-cell RNA sequencing readout), is transforming functional genomics and early drug discovery. A core thesis in modern screening research holds that while these technologies enable unprecedented scale in identifying genetic hits in vivo, validating the reproducibility and robustness of those hits is the critical bottleneck. This application note provides detailed protocols and frameworks for systematically assessing screening hits, moving them from initial discovery toward robust therapeutic targets.

Key Research Reagent Solutions

Reagent / Material | Function in Validation
MIC-Drop barcoded sgRNA libraries | Enables pooled, in vivo delivery of multiplexed CRISPR perturbations with unique molecular identifiers (UMIs) for tracing.
Perturb-seq compatible sgRNA libraries | Allows linking genetic perturbations to single-cell transcriptomic profiles in complex tissues.
Next-generation sequencing (NGS) reagents | For quantifying sgRNA abundance from bulk tissue (hit enrichment) and for single-cell RNA-seq library preparation.
Viral vectors (AAV, Lentivirus) | For efficient in vivo delivery of CRISPR components.
Cell-type specific Cre drivers | Enables conditional, cell-type-restricted perturbation in mouse models for cell-autonomy tests.
Multiplexed fluorescent reporters | Validates phenotypic consequences and co-localization of hits in tissue sections.
Statistical analysis software (e.g., Seurat, Scanpy) | For processing single-cell Perturb-seq data and assessing transcriptional robustness.

Protocol 1: Primary In Vivo MIC-Drop/Perturb-seq Screening and Initial Hit Calling

Objective: To execute a pooled in vivo screen and generate a primary hit list.

  • Library Design & Packaging: Design a barcoded sgRNA library targeting genes of interest. Package sgRNAs into lentiviral or AAV vectors alongside Cas9 (or dCas9-effector) elements.
  • In Vivo Delivery: Systemically or locally deliver the packaged library to adult animal models (e.g., mouse) or inject into embryos. Include a non-targeting sgRNA control pool.
  • Phenotypic Selection: Apply a phenotypic filter (e.g., tumor size, survival, imaging-based readout) or harvest target tissue for unbiased analysis.
  • Sample Processing for Hit Identification:
    • For MIC-Drop (Bulk Phenotype): Isolate genomic DNA from sorted cells or bulk tissue of interest. Amplify sgRNA barcodes via PCR and sequence. Calculate enrichment/depletion of sgRNAs relative to the input library using MAGeCK or similar tools.
    • For Perturb-seq (Single-Cell Phenotype): Generate a single-cell suspension from the tissue. Prepare libraries using 10x Genomics Chromium or similar platform with protocols that capture sgRNA barcodes. Sequence and align to reference genome.
  • Primary Hit Calling: For MIC-Drop, hits are sgRNAs significantly enriched/depleted (FDR < 0.1). For Perturb-seq, hits are perturbations causing significant transcriptional changes (differential expression) relative to non-targeting controls.
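
The FDR threshold in the hit-calling step is typically applied to multiple-testing-adjusted p-values. The sketch below implements the Benjamini-Hochberg adjustment in plain Python as one common way to obtain such values; whether a given tool (MAGeCK included) uses exactly this procedure at each stage is not stated in the protocol, and the p-values shown are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices from smallest p to largest
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-sgRNA p-values from a primary screen.
pvals = [0.01, 0.04, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
hits = [q < 0.1 for q in qvals]
print(qvals)
print(hits)
```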

Quantitative Data from Primary Screen

Table 1: Representative Primary Screening Metrics (Hypothetical Data)

Metric | MIC-Drop (Bulk Readout) | Perturb-seq (scRNA-seq Readout)
Library Size | 5,000 sgRNAs | 1,000 sgRNAs
Animal Replicates | n=8 per condition | n=4 per condition
Cells/Units Analyzed | N/A (Bulk tissue) | 50,000 cells total
Primary Hits (FDR<10%) | 250 sgRNAs | 75 perturbations
False Discovery Rate (FDR) | 8% | 9%

Protocol 2: Assessing Hit Reproducibility

Objective: To determine if primary hits are consistent across experimental replicates and independent cohorts.

  • Inter-Replicate Correlation Analysis: Quantify the correlation (e.g., Pearson's r) of sgRNA log2 fold-changes or perturbation gene signatures between all pairs of biological replicates from the primary screen.
  • Independent Validation Cohort: Initiate a new, independent in vivo screen using the same parameters but with animals from a separate cohort. Target only the primary hit list (e.g., the 250 hit sgRNAs) plus controls.
  • Reproducibility Scoring: Calculate a reproducibility score for each hit. For example: Score = -log10(P_primary) * (1 - |log2FC_primary - log2FC_validation| / (|log2FC_primary| + |log2FC_validation|))
  • Threshold Setting: Establish a pass/fail threshold (e.g., top 60% of scores) to generate a reproducible hit list.
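
The example reproducibility score above can be written out directly. The sketch below implements that formula as given; the input values are hypothetical, and the handling of the degenerate case where both fold changes are zero (score set to 0) is an added assumption, since the formula is undefined there.

```python
import math

def reproducibility_score(p_primary, lfc_primary, lfc_validation):
    """Score = -log10(P_primary) * (1 - |a - b| / (|a| + |b|)), with a, b the two log2FCs."""
    denom = abs(lfc_primary) + abs(lfc_validation)
    if denom == 0:
        return 0.0  # assumption: no effect in either screen scores zero
    agreement = 1.0 - abs(lfc_primary - lfc_validation) / denom
    return -math.log10(p_primary) * agreement

# Hypothetical hits: (primary p-value, primary log2FC, validation log2FC)
examples = {
    "concordant": (0.001, 2.0, 2.0),
    "weaker_in_validation": (0.001, 2.0, 1.0),
    "opposite_direction": (0.001, 2.0, -2.0),
}
for name, args in examples.items():
    print(f"{name}: {reproducibility_score(*args):.2f}")
```

The agreement term equals 1 when the two fold changes match and falls to 0 when they have opposite signs, so the score rewards hits that are both significant in the primary screen and consistent in the validation cohort.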

Quantitative Reproducibility Data

Table 2: Hit Reproducibility Metrics

Analysis | Metric | Value
Inter-Replicate Concordance | Average Pearson's r (log2FC) | 0.78
Validation Screen Success | % of Primary Hits Replicated (p<0.05) | 65%
High-Confidence Hits | Hits passing reproducibility score threshold | 120 sgRNAs

Protocol 3: Evaluating Hit Robustness

Objective: To test if the hit phenotype is robust to experimental variance and is cell-autonomous.

  • Multi-sgRNA Concordance Test: In vivo, test 2-3 independent sgRNAs per hit gene. Compare phenotype direction and severity across sgRNAs targeting the same gene. A robust hit shows concordant phenotypes, which argues against guide-specific off-target effects.
  • Cell-Autonomy Test (Using Perturb-seq/Cre drivers): Perform a cell-type specific Perturb-seq experiment. Isolate the transcriptional signature of the perturbed cell type versus bystander cells. A cell-autonomous hit shows strong differential expression in perturbed cells only.
  • Phenotypic Orthogonal Validation: For a subset of top hits, perform an orthogonal assay (e.g., immunohistochemistry, flow cytometry, behavioral assay) to confirm the phenotype using an independent readout.

Quantitative Robustness Data

Table 3: Hit Robustness Assessment

Robustness Test | Criterion for Passing | % of Reproducible Hits Passing
Multi-sgRNA Concordance | ≥2/3 sgRNAs show same effect direction (p<0.1) | 85%
Cell-Autonomy (from Perturb-seq) | >90% of DEGs are in perturbed cell type | 70%
Orthogonal Validation | Significant effect in independent assay (p<0.05) | 90% (of tested subset)
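
The multi-sgRNA concordance criterion in Table 3 can be expressed as a short check. This is one reading of the criterion (at least two thirds of a gene's sgRNAs are individually significant at p<0.1 and share an effect direction); the per-guide effects are hypothetical.

```python
def passes_concordance(effects, alpha=0.1):
    """effects: list of (log2FC, p_value), one entry per sgRNA targeting the same gene.

    Passes when at least 2/3 of all tested sgRNAs are significant (p < alpha)
    and point in the same direction.
    """
    if not effects:
        return False
    up = sum(1 for lfc, p in effects if p < alpha and lfc > 0)
    down = sum(1 for lfc, p in effects if p < alpha and lfc < 0)
    # Integer arithmetic avoids floating-point issues with the 2/3 cutoff.
    return 3 * max(up, down) >= 2 * len(effects)

# Hypothetical genes with three sgRNAs each.
print(passes_concordance([(-1.8, 0.01), (-1.2, 0.04), (-0.1, 0.60)]))  # two of three agree
print(passes_concordance([(-1.8, 0.01), (1.5, 0.02), (-0.1, 0.60)]))   # conflicting directions
```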

Workflow for Hit Assessment

Title: Hit Assessment Workflow from Screen to Validation

Signaling Pathway for a Hypothetical Robust Hit

Title: Cell-Autonomous Pathway of a Validated Hit

Conclusion

Systematic application of these protocols for assessing reproducibility and robustness is essential for advancing hits from in vivo MIC-Drop and Perturb-seq screens. This tiered validation framework mitigates the high false-positive rates inherent in complex in vivo models and builds confidence for downstream investment in mechanistic studies and drug development.

Application Notes

In large-scale in vivo screening with MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq, a key limitation is the loss of native spatial context once tissue is dissociated. Integrating these pooled perturbation screens with spatial transcriptomics or in situ sequencing (ISS) enables the direct mapping of genetic perturbation effects onto tissue architecture, revealing cell-cell communication networks and microenvironment-specific phenotypes. This follow-up is critical for advancing drug development, particularly for oncology and neurobiology targets, where spatial organization dictates function.

Application 1: Spatial Transcriptomics Follow-up (e.g., 10x Genomics Visium, Slide-seqV2)

  • Purpose: To profile the transcriptomes of perturbed cells within intact tissue sections, identifying zonally restricted gene expression programs and neighborhood effects.
  • Workflow: After in vivo MIC-Drop/Perturb-seq screening, target organs are harvested, fresh-frozen, and cryosectioned. Sections are placed on spatially barcoded oligonucleotide arrays for whole-transcriptome capture. Perturbed cells are identified in silico by matching the expressed sgRNA/barcode from the spatial data to the perturbation library manifest.
  • Key Insight: Enables the discovery of perturbation effects that are only manifest in specific tissue niches (e.g., a tumor-killing phenotype active only at the invasive margin).

Application 2: In Situ Sequencing Follow-up (e.g., STARmap, HybISS by Cartana/Resolve Biosciences)

  • Purpose: To directly visualize and quantify the spatial distribution of both perturbation barcodes (sgRNAs) and key downstream transcriptional markers at subcellular resolution.
  • Workflow: Fixed tissue sections undergo targeted in situ amplification and sequencing-by-ligation cycles. Custom panels are designed to include readouts for the MIC-Drop barcode region and 50-100 key response genes.
  • Key Insight: Provides the highest-resolution validation of cell-autonomous vs. non-autonomous signaling, directly visualizing how a perturbation in one cell alters gene expression in its immediate neighbors.

Quantitative Comparison of Follow-up Modalities

Feature | Spatial Transcriptomics (Visium) | In Situ Sequencing (HybISS)
Resolution | 55 µm spots (multiple cells) | Subcellular (~0.5 µm)
Transcript Coverage | Whole transcriptome (~10,000 genes) | Targeted panel (50-500 genes)
Perturbation Detection | Indirect, via sgRNA transcript | Direct, via barcode sequence
Multiplexing Capacity | High (genome-wide) | Medium (hundreds of targets)
Primary Output | Spatial expression maps linked to perturbations | Direct co-localization of barcode + RNA targets
Best For | Discovery of novel spatial phenotypes & niches | High-resolution mechanistic validation

Experimental Protocols

Protocol 1: Spatial Transcriptomics Follow-up for Perturb-seq Tumors

Materials:

  • Fresh-frozen tissue block from MIC-Drop/Perturb-seq mouse model.
  • 10x Genomics Visium Spatial Tissue Optimization Slide & Kit.
  • 10x Genomics Visium Spatial Gene Expression Slide & Kit.
  • Standard RNA-seq library preparation reagents.
  • Cryostat.

Method:

  • Tissue Preparation: Embed harvested tissue in OCT. Cut 10 µm sections on a cryostat and mount onto the Visium slide.
  • Fixation & Staining: Fix sections in pre-chilled methanol, stain with H&E, and image.
  • Permeabilization Optimization: Determine optimal tissue permeabilization time using the Tissue Optimization Slide.
  • Spatial Gene Expression: For the experimental slide, permeabilize tissue to release mRNA, which binds to spatially barcoded primers on the slide.
  • On-Slide cDNA Synthesis: Perform reverse transcription to create cDNA with spatial barcodes.
  • Library Construction: Harvest cDNA, amplify, and construct sequencing libraries according to the Visium protocol. Include an additional PCR step with custom primers to enrich for the MIC-Drop/perturbation barcode region.
  • Sequencing & Analysis: Sequence both libraries (spatial gene expression and barcode enrichment). Use Space Ranger (10x) for alignment. Align barcode reads to the perturbation manifest. Use Seurat or Squidpy for integrated spatial analysis of perturbation groups.

Protocol 2: Direct In Situ Sequencing of Perturbation Barcodes and Marker Genes

Materials:

  • Formalin-fixed, paraffin-embedded (FFPE) tissue sections on glass slides.
  • Custom ISS probe pool (targeting perturbation barcodes and key genes).
  • Resolve Biosciences Molecular Cartography Kit or STARmap reagents.
  • Fluorescently labeled nucleotides/detection probes.
  • Epifluorescence microscope with motorized stage.

Method:

  • Probe Design & Hybridization: Design padlock probes for targeted genes and for the constant region flanking the MIC-Drop variable barcode. Deparaffinize, rehydrate, and permeabilize FFPE sections. Hybridize probes to target RNA.
  • Rolling Circle Amplification (RCA): For bound padlock probes, perform ligation and RCA to create a rolling circle product (RCP), an amplified copy of the target sequence.
  • Sequencing-by-Ligation: Incubate RCPs with a pool of fluorescently labeled, degenerate interrogator probes. Each cycle uses probes that correspond to a specific base (e.g., cycle 1: detect base position 1). Image fluorescence at each cycle.
  • Decoding & Barcode Calling: Repeat for 4-7 cycles to read the short, targeted sequence. Decode fluorescence sequences per RCP to identify the gene or the specific perturbation barcode.
  • Spatial Mapping: Register images to generate a spatial map of every detected RNA molecule and perturbation barcode, enabling single-cell, multiplexed analysis.
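
The Decoding & Barcode Calling step amounts to matching each rolling circle product's per-cycle base calls against a codebook. The sketch below shows that lookup with an optional mismatch tolerance; the codebook entries, the four-cycle read length, and the tolerance rule are hypothetical and do not represent any vendor's decoding software.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def decode_rcp(calls, codebook, max_mismatches=0):
    """Return the codebook label for one RCP's base calls, or None if absent or ambiguous.

    calls: string of per-cycle base calls, e.g. "ACGT" for four cycles
    codebook: {code_sequence: label}, label being a gene or perturbation barcode
    """
    matches = [
        label
        for code, label in codebook.items()
        if len(code) == len(calls) and hamming(code, calls) <= max_mismatches
    ]
    # Require a unique match so that ambiguous reads are discarded, not guessed.
    return matches[0] if len(matches) == 1 else None

# Hypothetical four-cycle codebook mixing perturbation barcodes and a marker gene.
codebook = {"ACGT": "sgRNA_geneA", "TTGA": "sgRNA_geneB", "GCAT": "vmhc"}

print(decode_rcp("ACGT", codebook))                    # exact match
print(decode_rcp("ACGA", codebook))                    # no exact match
print(decode_rcp("ACGA", codebook, max_mismatches=1))  # rescued with one mismatch
```

Allowing mismatches only makes sense when codebook entries are separated by a sufficient Hamming distance, which is a design consideration for the probe panel.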

Visualizations

Title: Follow-up Workflow from In Vivo Screen to Spatial Analysis

Title: Decision Logic for Spatial Follow-up Technique Selection

The Scientist's Toolkit

Research Reagent / Solution | Function in Protocol
10x Genomics Visium Slides | Glass slides with spatially barcoded oligo arrays for capturing mRNA from tissue sections.
Custom Padlock Probes | Oligonucleotides designed to hybridize to target RNA (e.g., sgRNA barcode, marker gene) for in situ amplification and sequencing.
Rolling Circle Amplification (RCA) Enzymes | Phi29 polymerase and ligase to amplify hybridized padlock probes into detectable DNA concatemers.
Fluorescently Labeled Nucleotides (Cy3, Cy5, Alexa Fluor) | Used as detection probes in sequencing-by-ligation cycles to decode the amplified sequence.
Methanol (100%, -20°C) | Preferred fixative for spatial transcriptomics protocols, preserving RNA integrity better than paraformaldehyde for this application.
Cryostat | Instrument for cutting thin (5-20 µm), high-quality frozen tissue sections for spatial analysis.
Custom Barcode Enrichment Primers | PCR primers to specifically amplify the perturbation barcode region from spatial cDNA libraries for confident sgRNA assignment.
DAPI (4',6-diamidino-2-phenylindole) | Nuclear stain used during in situ sequencing imaging to locate nuclei, which anchor cell segmentation for single-cell analysis.

Application Notes: Integrating Base Editing, Multi-omics, and Phenotyping for In Vivo Screening

The convergence of pooled CRISPR screening with single-cell multi-omics is revolutionizing functional genomics. Within the context of MIC-Drop (Multiplexed Intermixed CRISPR Droplets) and Perturb-seq for in vivo research, the field is advancing along three interconnected axes: (1) the shift from CRISPR knockouts to precise base editing for modeling genetic variants, (2) the expansion of multi-modal readouts beyond transcriptomics, and (3) the integration of high-content spatial and morphological phenotyping. Together, these advances enable systematic dissection of genotype-phenotype relationships in complex tissue environments.

Key Quantitative Advances: The table below summarizes recent benchmark data highlighting the scaling and multi-omic capabilities of next-generation screening platforms.

Table 1: Benchmarking Data for Advanced In Vivo Screening Modalities

Screening Modality | Scale (Max Guide/Variant #) | Perturbation Type | Primary Readout | Key Metric (Reported Performance)
Traditional Perturb-seq | ~1,000 guides | CRISPRko/a/i | Single-cell RNA-seq (scRNA-seq) | ~10,000 cells per guide (in pooled screen)
Base Editing Screens | ~10,000 sgRNAs | A•T to G•C or C•G to T•A | Target amplicon sequencing + scRNA-seq | >90% on-target editing efficiency; <0.5% indels
Multi-omic Perturb-seq | ~500-1,000 guides | CRISPRko | scRNA-seq + surface protein (CITE-seq) or scRNA-seq + scATAC-seq (Multiome) | Paired transcriptome & surface protein (100+ antibodies) or chromatin accessibility from same cell
High-Content Phenotyping | ~100-200 clones (in situ) | CRISPRko | Multiplexed immunofluorescence (10-60 plex) + Spatial transcriptomics | Subcellular segmentation of 5+ morphological features per cell

Experimental Protocols

Protocol 1: In Vivo Base Editing Screen with MIC-Drop Encapsulation

Objective: To model a spectrum of single-nucleotide variants (SNVs) in a target gene and assess their functional impact in a pooled, in vivo context.

Materials: MIC-Drop vector system, ABE8e or BE4max base editor plasmid, sgRNA library targeting SNV sites, lentiviral packaging mix, target cells (e.g., primary T cells or progenitor cells), recipient mice.

Procedure:

  • Library Design & Cloning: Design an sgRNA library (110-nt oligo pool) in which each guide directs the base editor to a specific genomic coordinate for a desired transition. Clone the library into the MIC-Drop barcoded sgRNA expression vector via Golden Gate assembly.
  • Virus Production & Cell Transduction: Produce high-titer lentivirus. Transduce target cells at a low MOI (<0.3) so that most transduced cells carry a single guide. Select with puromycin for 72 hours.
  • In Vivo Delivery & Expansion: Adoptively transfer the transduced cells or inject lentivirus directly into the tissue or organ of interest. Allow 2-4 weeks for in vivo expansion and phenotypic selection.
  • Sample Processing & Sequencing: Harvest the target tissue and dissociate it into a single-cell suspension. Split the sample for two analyses:
    • Genomic DNA: Extract gDNA. PCR-amplify the sgRNA barcode region and the targeted genomic loci for NGS. Together, these amplicons relate each barcode to the editing outcome at its target locus.
    • Single-Cell Multi-omics: Process cells for 10x Genomics Multiome (ATAC + Gene Expression) or CITE-seq. The sgRNA barcode is captured in the cDNA library.
  • Data Analysis: Align NGS reads to compute base editing efficiency per guide. Map sgRNA barcodes to single-cell profiles. Perform differential expression (DE) and differential chromatin accessibility (DA) analysis between cells bearing functional vs. non-functional variants.
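
Two quantities in the procedure above reduce to simple arithmetic, sketched here in Python. The first function assumes that lentiviral integrations per cell follow a Poisson distribution with mean equal to the MOI, which is a common simplification rather than something the protocol specifies. The second computes a per-guide editing efficiency from allele read counts. The function names, the allele labels, and the read counts are all hypothetical and serve only as illustration.

```python
import math


def single_guide_fraction(moi: float) -> float:
    """Fraction of transduced cells that carry exactly one integration.

    Assumes integrations per cell are Poisson-distributed with mean = MOI.
    """
    if moi <= 0:
        raise ValueError("MOI must be positive")
    p_one = moi * math.exp(-moi)      # P(exactly one integration)
    p_any = 1.0 - math.exp(-moi)      # P(at least one integration)
    return p_one / p_any


def editing_efficiency(allele_counts: dict[str, int], edited_allele: str = "edited") -> float:
    """Share of reads at a target locus that carry the intended base edit."""
    total = sum(allele_counts.values())
    if total == 0:
        return float("nan")           # no coverage at this locus
    return allele_counts.get(edited_allele, 0) / total


# Illustrative (made-up) read counts for one guide's target locus.
example_counts = {"edited": 450, "unedited": 500, "indel": 50}

print(f"Single-guide fraction at MOI 0.3: {single_guide_fraction(0.3):.3f}")
print(f"Editing efficiency: {editing_efficiency(example_counts):.2f}")
```

Under the Poisson assumption, an MOI of 0.3 leaves roughly 86% of transduced cells with a single integration, so a low MOI reduces multi-guide cells but does not eliminate them.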

Protocol 2: Multi-omic Readout from a Perturb-seq In Vivo Screen

Objective: To obtain coupled gene expression and chromatin accessibility profiles from single cells harvested from an in vivo pooled screen (Perturb-seq or MIC-Drop).

Materials: Cells from in vivo screen, Chromium Next GEM Chip J, 10x Multiome ATAC + Gene Expression Kit, recommended buffers and enzymes.

Procedure:

  • Nuclei Isolation: Lyse cells with a mild detergent to isolate intact nuclei. Centrifuge and resuspend nuclei in chilled PBS with BSA.
  • Transposition & Barcoding: Follow the 10x Multiome protocol. Use the loaded transposase to fragment and tag accessible chromatin. Load nuclei onto the chip for GEM generation, where transposed DNA and mRNA from each nucleus are co-encapsulated with a unique barcode.
  • Library Construction: Generate two libraries from the same GEM reaction: (i) a gene expression library (from poly-adenylated mRNA) and (ii) an ATAC library (from transposed DNA fragments).
  • Sequencing & Alignment: Sequence the libraries on an Illumina platform. Align reads to the reference genome (ATAC) and transcriptome (GEX). The Cell Ranger ARC pipeline assigns both ATAC fragments and mRNA reads to individual cell barcodes.
  • Data Integration: Use the cell barcode to link the perturbation identity (from the captured sgRNA) to the paired GEX and ATAC profiles. Tools like Signac or ArchR can be used for joint analysis to identify perturbation-induced changes in both transcriptional networks and regulatory element activity.
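
The data integration step can be sketched as a barcode join. The Python below is a minimal illustration, not the Cell Ranger ARC, Signac, or ArchR implementation: it assigns each cell its dominant sgRNA from guide UMI counts and keeps only cells that have a confident guide call plus both modalities. The thresholds (`min_umis`, `min_fraction`), the function names, and the input layout are assumptions made for this example, and it presumes that GEX and ATAC profiles are already keyed by the same cell barcode string.

```python
from collections import defaultdict


def assign_guides(guide_umis, min_umis=3, min_fraction=0.8):
    """Call one guide per cell barcode from (barcode, guide, umi_count) records.

    A cell gets a call only if its top guide has at least `min_umis` UMIs and
    accounts for at least `min_fraction` of that cell's guide UMIs.
    Both thresholds are illustrative assumptions, not published defaults.
    """
    per_cell = defaultdict(dict)
    for barcode, guide, count in guide_umis:
        per_cell[barcode][guide] = per_cell[barcode].get(guide, 0) + count

    calls = {}
    for barcode, counts in per_cell.items():
        top_guide, top_count = max(counts.items(), key=lambda kv: kv[1])
        total = sum(counts.values())
        if top_count >= min_umis and top_count / total >= min_fraction:
            calls[barcode] = top_guide
    return calls


def link_profiles(calls, gex_cells, atac_cells):
    """Join guide calls with paired GEX and ATAC profiles on the cell barcode."""
    shared = set(calls) & set(gex_cells) & set(atac_cells)
    return {
        bc: {"guide": calls[bc], "gex": gex_cells[bc], "atac": atac_cells[bc]}
        for bc in sorted(shared)
    }


# Made-up records: (cell barcode, guide ID, guide UMI count).
umis = [
    ("AAAC", "sgTP53_1", 12), ("AAAC", "sgNTC_1", 1),   # clear single guide
    ("CCGT", "sgMYC_2", 2),                              # too few UMIs
    ("GGTA", "sgTP53_1", 5), ("GGTA", "sgMYC_2", 5),     # ambiguous, likely multiplet
]
calls = assign_guides(umis)
linked = link_profiles(
    calls,
    gex_cells={"AAAC": {"n_genes": 2100}, "GGTA": {"n_genes": 1800}},
    atac_cells={"AAAC": {"n_fragments": 9500}},
)
print(calls)
print(linked)
```

Cells with ambiguous or low-count guide assignments are dropped rather than guessed, which trades cell number for cleaner perturbation labels in the downstream joint analysis.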

Protocol 3: High-Content Phenotypic Analysis of In Vivo Derived Samples

Objective: To quantify spatial and morphological phenotypes in tissue sections from perturbed cell clones.

Materials: FFPE or frozen tissue sections, CODEX/PhenoCycler or MIBI-TOF system, panel of 30-40 metal-tagged or dye-labeled antibodies, imaging platform.

Procedure:

  • Tissue Staining: For multiplexed ion beam imaging (MIBI-TOF), label antibodies with isotopically pure metal tags and stain tissue sections with the antibody panel. For cyclic staining methods (CODEX), stain with oligonucleotide-conjugated antibodies.
  • Image Acquisition: Acquire images using the specialized platform (TOF mass spectrometer for MIBI, cyclic fluorescence imaging for CODEX). Generate high-resolution, multi-channel images in which each channel represents a specific protein marker.
  • Image & Data Analysis: Use tools like CellProfiler or DeepCell for automated cell segmentation and feature extraction. Quantify single-cell metrics: marker intensity, cell size/shape, nuclear morphology, and neighborhood metrics (e.g., number of contacting immune cells).
  • Perturbation Correlation: Correlate extracted morphological features with the perturbation identity (determined by in situ sequencing of the sgRNA barcode or via pre-defined spatial coordinates if clones are isolated). Identify morphological signatures specific to genetic perturbations.
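
The neighborhood metric and the perturbation correlation step can be sketched as follows. This Python example is a simplified illustration that assumes segmentation has already produced one centroid per cell, and it treats any immune cell whose centroid lies within a fixed radius as "contacting". That radius (15 units here, notionally micrometres), the field names, and the function names are assumptions for this sketch; real analyses often use segmentation-mask adjacency instead of centroid distance.

```python
import math


def count_immune_neighbors(cells, radius=15.0):
    """Count immune cells within `radius` of each cell's centroid.

    `cells` is a list of dicts with keys 'id', 'x', 'y', 'is_immune'.
    Uses a simple all-pairs comparison, which is fine for small fields of view.
    """
    counts = {}
    for cell in cells:
        n = 0
        for other in cells:
            if other["id"] == cell["id"] or not other["is_immune"]:
                continue
            distance = math.hypot(cell["x"] - other["x"], cell["y"] - other["y"])
            if distance <= radius:
                n += 1
        counts[cell["id"]] = n
    return counts


def mean_by_perturbation(counts, perturbation):
    """Average a per-cell metric over the cells carrying each sgRNA."""
    grouped = {}
    for cell_id, guide in perturbation.items():
        grouped.setdefault(guide, []).append(counts[cell_id])
    return {guide: sum(values) / len(values) for guide, values in grouped.items()}


# Made-up centroids; sgRNA identities would come from in situ barcode readout.
cells = [
    {"id": "c1", "x": 0.0, "y": 0.0, "is_immune": False},
    {"id": "c2", "x": 10.0, "y": 0.0, "is_immune": True},
    {"id": "c3", "x": 0.0, "y": 12.0, "is_immune": True},
    {"id": "c4", "x": 100.0, "y": 100.0, "is_immune": False},
    {"id": "c5", "x": 105.0, "y": 100.0, "is_immune": True},
]
neighbor_counts = count_immune_neighbors(cells)
summary = mean_by_perturbation(neighbor_counts, {"c1": "sgA", "c4": "sgB"})
print(neighbor_counts)
print(summary)
```

For whole-slide data with many thousands of cells, a spatial index such as a k-d tree would replace the all-pairs loop; the per-guide averaging step stays the same.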

Visualizations

[Figure: In Vivo Base Editing Screen with Multi-omic Analysis]

[Figure: Multi-omic Signaling Cascade from Perturbation]

The Scientist's Toolkit: Research Reagent Solutions

Table 2: Essential Materials for Advanced In Vivo Functional Genomics Screens

Item | Function | Example Product/Kit
MIC-Drop Vector System | Enables pooled, barcoded sgRNA delivery with a DNA-barcoded molecular identifier for high-specificity in vivo screening. | Addgene Kit #1000000092
Base Editor Plasmids | Express an adenine or cytosine base editor protein for precise single-nucleotide conversion without double-strand breaks. | ABE8e (for A•T>G•C), BE4max (for C•G>T•A)
10x Genomics Multiome Kit | Provides reagents for simultaneous profiling of gene expression and chromatin accessibility from the same single nucleus. | Chromium Next GEM Single Cell Multiome ATAC + GEX
CITE-seq Antibody Panel | Oligo-tagged antibodies allow measurement of surface protein abundance alongside the transcriptome in single cells. | BioLegend TotalSeq Antibodies
Multiplexed Imaging Antibodies | Metal-tagged (for MIBI-TOF) or oligonucleotide-conjugated (for CODEX) antibodies for high-plex spatial protein detection. | Standard BioTools Maxpar Antibodies
Cell Segmentation Software | Automated tools (classical and deep-learning-based) for identifying individual cell boundaries in complex tissue images for feature extraction. | DeepCell, CellProfiler
Perturb-seq Data Analysis Suite | Computational pipelines for aligning sequencing data and linking genetic perturbations to single-cell readouts. | Cell Ranger ARC, Seurat, Signac, ArchR

Conclusion

MIC-Drop and Perturb-seq move high-throughput genetic screening into the physiologically relevant context of a living organism. As this guide has detailed, from foundational principles through troubleshooting, both methods map gene function and regulatory networks in vivo at a scale that traditional approaches do not reach. Perturb-seq provides deep, single-cell transcriptional phenotyping, while MIC-Drop offers advantages in delivery and in model systems such as zebrafish. The choice between them depends on the specific biological question, model, and desired readout. The comparisons discussed in this guide indicate that both identify causal genes and mechanisms more effectively than traditional in vitro screens. Looking forward, integrating these platforms with spatial omics, epigenetic profiling, and more sophisticated perturbation tools (e.g., base editors) promises to further resolve the genotype-to-phenotype map in health and disease. For drug developers and researchers, these techniques are no longer a niche pursuit but a practical route to discovering novel therapeutic targets and understanding their mechanisms of action within intact biological systems.