Revised Workflow — ONT CRISPR Amplicon Sequencing

Step-by-Step Commands

All commands assume the following directory structure and variables:

# --- Set paths ---
OUTDIR="/Volumes/KINGSTON/ONTsequencing/ONT_testing"
FASTQ_BASE="/Volumes/KINGSTON/ONTsequencing/NSP_500bp_trial/NSP_Pnapi_500bp/20260305_1613_MD-103872_FBF81064_b0e09933/fastq_pass"
REF="$OUTDIR/NSP_genomic_ref.fasta"
REF_MMI="$OUTDIR/NSP_genomic_ref.mmi"

1 MinION Sequencing

Data was generated on an Oxford Nanopore MinION with an R10.4.1 flow cell. Basecalling used the HAC model (dna_r10.4.1_e8.2_400bps_hac@v5.2.0) with built-in demultiplexing into 24 barcode directories.

Raw reads are located at:

$FASTQ_BASE/barcode01/*.fastq.gz
$FASTQ_BASE/barcode02/*.fastq.gz
...
$FASTQ_BASE/barcode24/*.fastq.gz

2 Raw Read QC (FastQC)

Run FastQC on the raw reads to assess base quality, read length distributions, and adapter content before any trimming.

# Run FastQC on raw reads for a single barcode
fastqc "$FASTQ_BASE/barcode01"/*.fastq.gz -o "$OUTDIR/fastqc_raw" --threads 2

# Loop over all barcodes
mkdir -p "$OUTDIR/fastqc_raw"
for i in $(seq -w 1 24); do
  fastqc "$FASTQ_BASE/barcode${i}"/*.fastq.gz -o "$OUTDIR/fastqc_raw" --threads 2
done

Check the HTML reports for per-base quality scores, read length distributions, and any adapter contamination flags.

3 Read Trimming & Cleaning (fastp) New

Use fastp to trim adapters, remove low-quality bases, and filter out short reads. fastp auto-detects adapters and provides its own HTML/JSON quality reports.

Install

# Install via Homebrew (macOS)
brew install fastp

# Or via conda
conda install -c bioconda fastp

Single barcode

# Concatenate multiple fastq files per barcode if needed, then run fastp
mkdir -p "$OUTDIR/trimmed"

# For a single barcode with one or more input files:
cat "$FASTQ_BASE/barcode01"/*.fastq.gz > "$OUTDIR/trimmed/barcode01_raw_merged.fastq.gz"

fastp \
  --in1 "$OUTDIR/trimmed/barcode01_raw_merged.fastq.gz" \
  --out1 "$OUTDIR/trimmed/barcode01_trimmed.fastq.gz" \
  --html "$OUTDIR/trimmed/barcode01_fastp.html" \
  --json "$OUTDIR/trimmed/barcode01_fastp.json" \
  --qualified_quality_phred 10 \
  --length_required 200 \
  --thread 4

All barcodes (loop)

mkdir -p "$OUTDIR/trimmed"

for i in $(seq -w 1 24); do
  BC="barcode${i}"

  # Merge raw fastqs per barcode
  cat "$FASTQ_BASE/$BC"/*.fastq.gz > "$OUTDIR/trimmed/${BC}_raw_merged.fastq.gz"

  # Run fastp
  fastp \
    --in1 "$OUTDIR/trimmed/${BC}_raw_merged.fastq.gz" \
    --out1 "$OUTDIR/trimmed/${BC}_trimmed.fastq.gz" \
    --html "$OUTDIR/trimmed/${BC}_fastp.html" \
    --json "$OUTDIR/trimmed/${BC}_fastp.json" \
    --qualified_quality_phred 10 \
    --length_required 200 \
    --thread 4

  # Clean up merged raw file to save space
  rm "$OUTDIR/trimmed/${BC}_raw_merged.fastq.gz"
done

Key fastp parameters

Flag	Value	Purpose
`--qualified_quality_phred`	10	Min base quality to keep (Q10; suitable for ONT data)
`--length_required`	200	Discard reads shorter than 200 bp (amplicon is ~500 bp)
`--thread`	4	Parallel processing threads
`--html / --json`	—	Per-barcode QC reports from fastp

Adapter detection is automatic. For ONT data, consider also using --cut_front and --cut_tail to trim low-quality stretches at read ends, or adjust --qualified_quality_phred depending on your basecaller accuracy.

4 Post-Trim QC (FastQC) New

Run FastQC again on the trimmed reads to confirm that quality has improved and problematic reads were removed. Compare these reports to the raw QC from Step 2.

mkdir -p "$OUTDIR/fastqc_trimmed"

for i in $(seq -w 1 24); do
  fastqc "$OUTDIR/trimmed/barcode${i}_trimmed.fastq.gz" \
    -o "$OUTDIR/fastqc_trimmed" --threads 2
done

Things to check:

Per-base quality scores should be more uniform after trimming
Read length distribution should center around ~500 bp with short reads removed
Adapter content warnings should be resolved
Compare total read counts before/after to assess filtering loss

5 Alignment (minimap2)

Align the trimmed reads to the 699 bp genomic reference using minimap2 with the ONT preset. The --MD flag adds MD tags for accurate variant display in IGV.

# Build minimap2 index (if not already done)
minimap2 -d "$REF_MMI" "$REF"

# Align a single barcode
minimap2 -a -x map-ont --MD "$REF_MMI" \
  "$OUTDIR/trimmed/barcode01_trimmed.fastq.gz" \
  > "$OUTDIR/barcode01_aligned.sam"

We write SAM first here to keep alignment and BAM processing as separate steps, but they can be piped together (see Step 6).

6 BAM Sorting & Indexing (samtools)

Convert SAM to sorted BAM and create an index for fast random access in IGV.

# Sort and convert to BAM
samtools sort -o "$OUTDIR/barcode01_aligned.bam" "$OUTDIR/barcode01_aligned.sam"

# Index the BAM
samtools index "$OUTDIR/barcode01_aligned.bam"

# Clean up SAM
rm "$OUTDIR/barcode01_aligned.sam"

Combined pipeline (all barcodes)

For efficiency, pipe alignment directly into sorting:

for i in $(seq -w 1 24); do
  BC="barcode${i}"

  minimap2 -a -x map-ont --MD "$REF_MMI" \
    "$OUTDIR/trimmed/${BC}_trimmed.fastq.gz" 2>/dev/null | \
    samtools sort -o "$OUTDIR/${BC}_aligned.bam" -

  samtools index "$OUTDIR/${BC}_aligned.bam"
done

Quick alignment stats

# Check mapping rate and read counts per barcode
for i in $(seq -w 1 24); do
  echo -n "barcode${i}: "
  samtools flagstat "$OUTDIR/barcode${i}_aligned.bam" 2>/dev/null | head -1
done

7 Indel Analysis (IGV)

Open the aligned BAMs in IGV to visually inspect CRISPR-induced indels at the sgRNA cut site.

Loading files in IGV

Genomes > Load Genome from File → select NSP_genomic_ref.fasta
File > Load from File → select one or more barcodeXX_aligned.bam files
Navigate to position ~230 in the reference — this is the sgRNA cut site

IGV settings for indel visualization

Color alignments by: → insert size and pair orientation or read strand
Sort alignments by: → base (at the cut site position) to group indel alleles
Show soft-clipped bases to see if reads extend beyond expected amplicon boundaries
Right-click track → Collapsed or Squished view for high-coverage samples
Look for consistent insertion/deletion patterns clustered at position ~230

What to look for

Deletions appear as horizontal black lines spanning the cut site
Insertions appear as purple marks (I) at the cut position
Wild-type reads align cleanly through the cut site with no gaps
Editing efficiency = proportion of reads with indels vs. wild-type at the sgRNA target

Tool	Version	Purpose
minimap2	2.30	ONT read alignment (`-x map-ont`)
samtools	1.23	BAM sorting, indexing, stats
FastQC	0.12.1	Read quality assessment (pre- and post-trim)
fastp	0.24+	Read trimming, adapter removal, quality filtering
IGV	—	Visual inspection of alignments and indels

Revised Analysis Workflow

What changed?

Analysis Workflow

MinION Sequencing

Raw QC

Trim (fastp)

Post-trim QC

Align

BAM Sort & Index

Indel Analysis