Project name: Gohl3_Project_002-BDGP6
Samples (excluding reference genome): 12
Report generated: 09/29/2025 10:38 PM
Species: Drosophila_melanogaster
Genome Assembly: BDGP6

Alignment Rate

The percentage of reads aligned to the reference genome is shown. These metrics are calculated by Picard CollectAlignmentSummaryMetrics.

Mean Coverage

Mean coverage is shown. These metrics are calculated by Picard CollectWgsMetrics.

Standard Deviation of Coverage

The standard deviation of coverage is shown. These metrics are calculated by Picard CollectWgsMetrics.

Consequences

Consequences of variants, as determined by the bcftools consequence caller, are shown. Double-click a category in the legend to isolate and magnify it. Double-click the legend again to restore all categories.

Integration elements

Integration elements with fewer than 4 detected reads are shown as not detected in the following table.
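The detection threshold can be illustrated with a short sketch. It assumes a hypothetical two-column, tab-separated table of element name and read count; the function name, the element names, and the table layout are illustrative and are not taken from the pipeline.

```shell
# Label each element as "detected" or "not detected" based on its read count.
# Input (assumed layout): element name <TAB> number of detected reads.
label_detection() {
    # $1: minimum number of reads required for detection (defaults to 4)
    awk -F'\t' -v min="${1:-4}" '
        BEGIN { OFS = "\t" }
        { print $1, $2, ($2 + 0 >= min + 0 ? "detected" : "not detected") }
    '
}

# Hypothetical counts: element_B falls below the threshold of 4 reads.
printf 'element_A\t12\nelement_B\t3\nelement_C\t4\n' | label_detection 4
```

With these made-up counts, element_A and element_C are labeled detected, and element_B is labeled not detected.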



Principal Coordinates Analysis Plot

A Principal Coordinates Analysis (PCoA) plot is shown. Genetic distance is calculated using: plink --bfile PROJECT --distance 1-ibs --out PROJECT --allow-extra-chr

Dendrogram

A dendrogram based on IBS distance is shown. Genetic distance is calculated using: plink --bfile PROJECT --distance 1-ibs --out PROJECT --allow-extra-chr

Data

The output folder generated by this analysis pipeline contains the following folders and files:

Methods

Reads were aligned to the reference genome using Ultima aligner (ua) version 2.2.1. Variants were called using Ultima make_examples version 2.2.6 and Ultima call_variants version 2.2.2.

Quality of whole genome sequencing and alignment data was assessed with Picard v. 2.25.6 using CollectWgsMetrics and CollectMultipleMetrics tools, respectively.

Integration elements were identified by matching sequences to reads using seqkit v. 2.9.0 and aligning reads to the reference genome with bowtie2 v. 2.3.4.1. Output .sam files were converted to .bam files using samtools v. 1.20, and the .bam files were converted to .bed files using bedtools v. 2.29.2. The .bed files were then filtered to exclude sites with MAPQ < 20.
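The report does not give the exact filtering command. A minimal sketch of a MAPQ filter on a .bed file follows, assuming the files come from bedtools bamtobed with default settings, which writes the mapping quality to the fifth (score) column. The function name and the example records are hypothetical.

```shell
# Keep only BED records whose score column (column 5, assumed to hold MAPQ)
# is at or above a minimum value, which excludes sites with MAPQ < 20.
filter_bed_mapq() {
    # $1: minimum MAPQ to keep (defaults to 20)
    awk -F'\t' -v min="${1:-20}" '$5 + 0 >= min + 0'
}

# Hypothetical records: read2 has MAPQ 3 and is removed.
printf '2L\t100\t150\tread1\t42\t+\n2L\t200\t250\tread2\t3\t-\n3R\t10\t60\tread3\t20\t+\n' \
    | filter_bed_mapq 20
```

A record with MAPQ exactly 20 is kept, since only sites with MAPQ < 20 are excluded.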

Consequences of variants were annotated using the bcftools version 1.6 consequence caller in haplotype-aware mode. Results were deduplicated to give one consequence per gene rather than one consequence per transcript. The complete command used is: bcftools csq -f <input.fasta> -g <input.gff3> -O t --phase s | awk '!seen[$1,$2,$3,$4,$5]++' > /_csq.txt
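The awk step in the command above keeps the first line seen for each unique combination of the first five fields and drops later lines that repeat it. The sketch below shows that behavior on placeholder rows; the function name and the input values are illustrative and do not represent real bcftools output.

```shell
# Print a line only the first time its first five fields are seen.
# seen[...]++ evaluates to 0 (false) on first sight, so !seen[...]++ is true once.
dedup_first_five() {
    awk '!seen[$1,$2,$3,$4,$5]++'
}

# Placeholder rows: the first two share fields 1-5 and differ only in field 6.
printf 'a b c d e x\na b c d e y\na b c d f z\n' | dedup_first_five
```

Only the rows ending in x and z are printed. The row ending in y repeats the first five fields of the first row, so it is dropped.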