Model Organism Sequencing Report

Project name: MOSS_Project_example-BDGP6
Samples (excluding reference genome): 38
Report generated: 02/02/2026 02:24 PM
Species: Drosophila_melanogaster
Genome Assembly: BDGP6

Quality metrics

Alignment

Alignment Rate

The percentage of reads aligned to the reference genome is shown. These metrics are calculated by Picard CollectAlignmentSummaryMetrics.

Coverage (mean)

Mean Coverage

Mean coverage is shown. These metrics are calculated by Picard CollectWgsMetrics.

Coverage (standard deviation)

Standard Deviation of Coverage

The standard deviation of coverage is shown. These metrics are calculated by Picard CollectWgsMetrics.
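
As a rough illustration of where these two values come from, the sketch below pulls the MEAN_COVERAGE and SD_COVERAGE columns out of a CollectWgsMetrics output file with awk. The file names toy_wgs_metrics.txt and toy_coverage.txt and all numbers in them are made up for the example; only the column names and the "## METRICS CLASS" marker follow Picard's metrics file layout. It assumes the raw Picard metrics file is available, which this report does not state.

```shell
# Build a toy Picard-style metrics file (hypothetical values, tab-separated).
{
    printf '## htsjdk.samtools.metrics.StringHeader\n'
    printf '# CollectWgsMetrics (toy example)\n'
    printf '\n'
    printf '## METRICS CLASS\tpicard.analysis.WgsMetrics\n'
    printf 'GENOME_TERRITORY\tMEAN_COVERAGE\tSD_COVERAGE\tMEDIAN_COVERAGE\n'
    printf '1000000\t31.2\t9.8\t31\n'
} > toy_wgs_metrics.txt

# The header row follows the "## METRICS CLASS" line; the values row follows the header.
awk -F'\t' '
/^## METRICS CLASS/ { want_header = 1; next }
want_header {
    # Map each column name to its position so columns are looked up by name.
    for (i = 1; i <= NF; i++) col[$i] = i
    want_header = 0
    want_values = 1
    next
}
want_values {
    print $(col["MEAN_COVERAGE"]), $(col["SD_COVERAGE"])
    exit
}' toy_wgs_metrics.txt > toy_coverage.txt

cat toy_coverage.txt
```

Looking columns up by name rather than by position keeps the sketch working if the column order differs between Picard versions.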

Consequences

Consequences of variants as determined by the bcftools consequence caller. Double-click a category in the legend to isolate and magnify it. Double-click the legend again to restore all categories.

Consequences

High Impact

Moderate Impact

Other Impact

Integration elements

Integration elements with fewer than 4 detected reads are shown as not detected in the following table.
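
The detection rule above can be sketched as a simple read count per sample. This is an assumption-laden illustration, not the pipeline's own code: it assumes that "detected reads" corresponds to the number of lines in a filtered .bed file, and the file names toyA_filtered.bed, toyB_filtered.bed, and toy_detection.tsv and their contents are hypothetical.

```shell
# Toy filtered BED files (hypothetical names and contents, for illustration only).
printf '2L\t100\t150\tr1\t42\t+\n2L\t101\t151\tr2\t40\t+\n2L\t102\t152\tr3\t38\t-\n2L\t105\t155\tr4\t44\t+\n2L\t110\t160\tr5\t30\t-\n' > toyA_filtered.bed
printf '3R\t500\t550\tr6\t25\t+\n3R\t505\t555\tr7\t21\t-\n' > toyB_filtered.bed

# Fewer than min_reads reads is reported as "not detected".
min_reads=4

for bed in toyA_filtered.bed toyB_filtered.bed; do
    # Count lines with awk so an empty file yields 0 without extra whitespace.
    n=$(awk 'END { print NR }' "$bed")
    if [ "$n" -ge "$min_reads" ]; then
        status="detected"
    else
        status="not detected"
    fi
    printf '%s\t%s\t%s\n' "$bed" "$n" "$status"
done > toy_detection.tsv

cat toy_detection.tsv
```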

Comparative visualizations

Principal coordinates analysis

Principal Coordinates Analysis Plot

Principal Coordinates Analysis (PCoA) plot is shown. Genetic distance is calculated using plink --bfile PROJECT --distance 1-ibs --out PROJECT --allow-extra-chr

Dendrogram

Dendrogram is shown based on IBS distance. Genetic distance is calculated using plink --bfile PROJECT --distance 1-ibs --out PROJECT --allow-extra-chr
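
To make the 1-IBS distance concrete, the sketch below computes it for a toy genotype matrix with awk. It illustrates the quantity only and is not PLINK's implementation: it assumes biallelic genotypes coded as alternate-allele counts (0, 1, or 2) and no missing data, and the sample names, genotypes, and file names are hypothetical. For each pair of samples, the number of alleles shared at a variant is 2 minus the absolute difference in allele counts, and the distance is 1 minus the shared fraction over all variants.

```shell
# Toy genotype matrix: one row per sample, allele counts (0, 1, or 2) per variant.
cat > toy_genotypes.txt <<'EOF'
s1 0 1 2 0
s2 0 1 2 0
s3 2 1 0 0
EOF

awk '
{
    name[NR] = $1
    nvar = NF - 1
    for (v = 1; v <= nvar; v++) g[NR, v] = $(v + 1)
}
END {
    for (i = 1; i <= NR; i++) {
        for (j = i + 1; j <= NR; j++) {
            shared = 0
            for (v = 1; v <= nvar; v++) {
                d = g[i, v] - g[j, v]
                if (d < 0) d = -d
                shared += 2 - d          # alleles shared identical by state at this variant
            }
            # 1 - IBS, where IBS = shared alleles / total alleles compared
            printf "%s %s %.4f\n", name[i], name[j], 1 - shared / (2 * nvar)
        }
    }
}' toy_genotypes.txt > toy_distances.txt

cat toy_distances.txt
# s1 s2 0.0000
# s1 s3 0.5000
# s2 s3 0.5000
```

Samples s1 and s2 have identical genotypes and so sit at distance 0, while s3 differs by two alleles at two of the four variants, giving 4 shared alleles out of 8 and a distance of 0.5.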

Data

The output folder generated by this analysis pipeline contains the following folders and files:

integration_elements: see Methods for file descriptions
- <integration element>
  - <sample>
    - bam
    - bed
    - filtered.bed
    - matched_reads
quality_control
- picardalignmentplot.dat: alignment rate for each sample
- picardcoverageplot.dat: mean coverage for each sample
strain_identification
- consequences
- dendrogram
- MOSS_Project_example-BDGP6_csq.txt: consequences of variants, limited to one consequence per gene per sample
- IBS_distance_matrix.csv: distance matrix based on identity by state
- pcoa
bcf: project-specific multi-sample BCF file
vcfs: sample-wise VCF files
gvcfs: sample-wise gVCF files

Methods

Reads were aligned to the reference genome using Ultima aligner (ua) version 2.2.1. Variants were called using Ultima make_examples version 3.1.10 (make_examples_3.1.10.sif) and Ultima call_variants version 2.2.4 (call_variants_2.2.4.sif).

Quality of whole genome sequencing and alignment data was assessed with Picard v. 2.25.6 using CollectWgsMetrics and CollectMultipleMetrics tools, respectively.

Integration elements were identified by matching integration element sequences to reads using seqkit v. 2.9.0 and aligning the matched reads to the reference genome with bowtie2 v. 2.5.4. Output .sam files were converted to .bam files using samtools v. 1.20, and the .bam files were converted to .bed files using bedtools v. 2.31.1. Filtered .bed files were created by excluding alignments with MAPQ < 20.


Integration elements commands:

# find matching reads
seqkit grep -s -P -f {insertion}.txt {sample}.fastq > {sample}_{insertion}_matched_reads.txt

# align matching reads to reference genome
bowtie2 --very-sensitive-local -x {REFERENCE_INDEX} -U {sample}_{insertion}_matched_reads.txt -S {sample}_{insertion}_aligned_reads.sam

# convert sam file to bam file
samtools view -bS {sample}_{insertion}_aligned_reads.sam > {sample}_{insertion}_aligned_reads.bam

# convert bam file to bed file
bedtools bamtobed -i {sample}_{insertion}_aligned_reads.bam > {sample}_{insertion}_insertion_sites.bed

# filter bed file
awk -v threshold=20 '$5 >= threshold' {sample}_{insertion}_insertion_sites.bed > {sample}_{insertion}_insertion_sites_filtered.bed
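
As a small worked example of the final filtering step, the sketch below applies the same awk expression to a toy .bed file. The file names and read records are hypothetical. In the BED6 output of bedtools bamtobed, column 5 holds the mapping quality, so the filter keeps alignments with MAPQ of 20 or higher.

```shell
# Toy BED6 file: chrom, start, end, read name, MAPQ, strand (hypothetical records).
printf '2L\t100\t150\tread1\t42\t+\n2L\t300\t350\tread2\t7\t-\n3R\t900\t950\tread3\t20\t+\n' > toy_insertion_sites.bed

# Keep rows whose fifth column (MAPQ) is at least the threshold.
awk -v threshold=20 '$5 >= threshold' toy_insertion_sites.bed > toy_insertion_sites_filtered.bed

# Show read name and MAPQ of the surviving alignments.
cut -f4,5 toy_insertion_sites_filtered.bed
```

Here read2 (MAPQ 7) is dropped, while read1 (MAPQ 42) and read3 (MAPQ exactly 20) are kept because the comparison is inclusive.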

Consequences of variants were annotated using bcftools version 1.6 consequence caller in haplotype-aware mode. Results were deduplicated to give one consequence per gene rather than one consequence per transcript. The complete command used is: bcftools csq -f <input.fasta> -g <input.gff3> -O t --phase s | awk '!seen[$1,$2,$3,$4,$5]++' > /_csq.txt
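
The deduplication step can be illustrated on a few toy lines. The sketch below assumes the tab-delimited output of bcftools csq -O t has six columns (the literal CSQ tag, sample, haplotype, chromosome, position, and consequence string); the sample, gene, and transcript names and the file names are hypothetical. The awk expression prints a line only the first time its combination of the first five columns is seen, so later lines that differ only in the consequence column are dropped.

```shell
# Toy consequence lines: two transcripts at the same site, plus one other site.
{
    printf 'CSQ\tsampleA\t1\t2L\t1000\tmissense|geneX|transcript1\n'
    printf 'CSQ\tsampleA\t1\t2L\t1000\tmissense|geneX|transcript2\n'
    printf 'CSQ\tsampleA\t1\t2L\t2500\tsynonymous|geneY|transcript1\n'
} > toy_csq.txt

# seen[...]++ is 0 (false) on first sight of a key, so !seen[...]++ is true only once per key.
awk '!seen[$1,$2,$3,$4,$5]++' toy_csq.txt > toy_csq_dedup.txt

cat toy_csq_dedup.txt
```

Of the two lines at position 1000, only the first (transcript1) is kept, so which consequence survives depends on the order in which lines appear in the input.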