Project name: Gohl3_Project_002-BDGP6
Samples (excluding reference genome): 12
Report generated: 09/29/2025 10:38 PM
Species: Drosophila_melanogaster
Genome Assembly: BDGP6
Percent of reads aligned to the reference genome is shown. These metrics are calculated by Picard CollectAlignmentSummaryMetrics.
Coverage is shown. These metrics are calculated by Picard CollectWgsMetrics.
Standard deviation of coverage is shown. These metrics are calculated by Picard CollectWgsMetrics.
Consequences of variants as determined by the bcftools consequence caller. Double-click a category in the legend to isolate and magnify it. Double-click the legend again to restore all categories.
Integration elements with fewer than 4 detected reads are shown as not detected in the following table.
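The detection rule above can be sketched as a simple threshold on read counts. This is a minimal illustration, not the pipeline's own code: the function name, the element names, and the counts are hypothetical, and only the threshold of 4 reads comes from the report.

```python
MIN_READS = 4  # threshold stated in the report

def detection_status(read_counts, min_reads=MIN_READS):
    """Label each integration element as 'detected' or 'not detected'.

    read_counts: dict mapping element name -> number of matched reads.
    Elements with fewer than min_reads reads are reported as not detected.
    """
    return {
        element: "detected" if count >= min_reads else "not detected"
        for element, count in read_counts.items()
    }

# Hypothetical counts for one sample (illustrative values only).
example_counts = {"element_A": 12, "element_B": 3, "element_C": 4, "element_D": 0}
status = detection_status(example_counts)
for element, label in status.items():
    print(f"{element}\t{example_counts[element]}\t{label}")
```

Note that an element with exactly 4 reads counts as detected, since only elements with fewer than 4 reads are suppressed.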
Principal Coordinates Analysis (PCoA) plot is shown. Genetic distance is calculated using plink --bfile PROJECT --distance 1-ibs --out PROJECT --allow-extra-chr
Dendrogram is shown based on IBS distance. Genetic distance is calculated using plink --bfile PROJECT --distance 1-ibs --out PROJECT --allow-extra-chr
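To make the distance measure concrete, the sketch below computes a 1-IBS distance between two samples from alternate-allele counts. It is a simplified illustration under stated assumptions, not plink's implementation: genotypes are assumed to be biallelic and coded 0, 1, or 2, missing calls are coded as None and skipped, and the function name and example genotypes are hypothetical.

```python
def one_minus_ibs(genotypes_a, genotypes_b):
    """Return 1 - IBS for two samples.

    Each genotype is the alternate-allele count (0, 1, or 2), or None if
    the call is missing. At each variant the samples share
    2 - |a - b| alleles identical by state, out of 2 possible.
    """
    shared = 0
    compared = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip variants without a call in both samples
        shared += 2 - abs(a - b)
        compared += 2
    if compared == 0:
        raise ValueError("no variants with calls in both samples")
    return 1.0 - shared / compared

# Hypothetical genotypes at four variants (illustrative values only).
sample_1 = [0, 1, 2, 2]
sample_2 = [0, 1, 0, None]
distance = one_minus_ibs(sample_1, sample_2)
print(round(distance, 4))
```

Identical samples give a distance of 0, and samples that are opposite homozygotes at every variant give a distance of 1. A matrix of such pairwise distances is the kind of input that a PCoA or a dendrogram is built from.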
The output folder generated by this analysis pipeline contains the following folders and files:
integration_elements: see Methods for file descriptions
    <integration element>
        <sample>
            bam
            bed
            filtered.bed
            matched_reads
quality_control
    picardalignmentplot.dat: alignment rate for each sample
    picardcoverageplot.dat: mean coverage for each sample
strain_identification
    consequences
    dendrogram
    Gohl3_Project_002-BDGP6_csq.txt: consequences of variants, limited to one consequence per gene per sample
    IBS_distance_matrix.csv: distance matrix based on identity by state
    pcoa
bcf: project-specific multi-sample BCF file
vcfs: sample-wise VCF files
gvcfs: sample-wise gVCF files

Reads were aligned to the reference genome using Ultima aligner (ua) version 2.2.1. Variants were called using Ultima make_examples version 2.2.6 and Ultima call_variants version 2.2.2.
Quality of whole genome sequencing and alignment data was assessed with Picard v. 2.25.6 using CollectWgsMetrics and CollectMultipleMetrics tools, respectively.
Integration elements were identified by matching sequences to reads using seqkit v. 2.9.0 and aligning reads to the reference genome with bowtie2 v. 2.3.4.1. Output .sam files were converted to .bam files using samtools v. 1.20, and the .bam files were converted to .bed files using bedtools v. 2.29.2. Filtered .bed files were created by excluding sites with MAPQ < 20 from the .bed files.
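The MAPQ filtering step can be sketched as follows. This is an illustration rather than the pipeline's own code: it assumes the .bed files carry MAPQ in the fifth (score) column, which is the default output of bedtools bamtobed, and the function name and example records are hypothetical.

```python
MIN_MAPQ = 20  # threshold stated in the Methods

def filter_bed_lines(lines, min_mapq=MIN_MAPQ):
    """Keep BED records whose score column (assumed to hold MAPQ) is >= min_mapq."""
    kept = []
    for line in lines:
        record = line.rstrip("\n")
        # Skip blank lines and optional BED header lines.
        if not record.strip() or record.startswith(("#", "track", "browser")):
            continue
        fields = record.split("\t")
        if int(fields[4]) >= min_mapq:
            kept.append(record)
    return kept

# Hypothetical BED records: chrom, start, end, read name, MAPQ, strand.
example_bed = [
    "2L\t1000\t1150\tread_1\t42\t+",
    "2L\t5000\t5150\tread_2\t3\t-",
    "X\t20000\t20150\tread_3\t20\t+",
]
filtered = filter_bed_lines(example_bed)
for record in filtered:
    print(record)
```

Records with MAPQ of exactly 20 are retained here, because only sites with MAPQ < 20 are excluded.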
Consequences of variants were annotated using the bcftools version 1.6 consequence caller in haplotype-aware mode. Results were deduplicated to give one consequence per gene rather than one consequence per transcript. The complete command used is: bcftools csq -f <input.fasta> -g <input.gff3>
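The deduplication step might look like the sketch below. The report does not state which transcript's consequence is retained or how duplicates are keyed, so this example makes two labeled assumptions: duplicates are rows that share a sample, variant, and gene, and the first row encountered is kept. The record fields, gene names, and transcript names are hypothetical and do not reflect the pipeline's actual file layout.

```python
def one_consequence_per_gene(records):
    """Collapse per-transcript consequences to one per (sample, variant, gene).

    Assumption: the first record seen for each key is kept. The report does
    not say which transcript's consequence the pipeline retains.
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec["sample"], rec["variant"], rec["gene"])
        if key in seen:
            continue  # another transcript of the same gene; drop it
        seen.add(key)
        kept.append(rec)
    return kept

# Hypothetical per-transcript consequences (illustrative values only).
example_records = [
    {"sample": "S1", "variant": "2L:1000", "gene": "geneA", "transcript": "tA-RA", "consequence": "missense"},
    {"sample": "S1", "variant": "2L:1000", "gene": "geneA", "transcript": "tA-RB", "consequence": "missense"},
    {"sample": "S1", "variant": "X:500", "gene": "geneB", "transcript": "tB-RA", "consequence": "synonymous"},
    {"sample": "S2", "variant": "2L:1000", "gene": "geneA", "transcript": "tA-RA", "consequence": "missense"},
]
deduplicated = one_consequence_per_gene(example_records)
for rec in deduplicated:
    print(rec["sample"], rec["variant"], rec["gene"], rec["consequence"])
```

A different retention rule, such as keeping the most severe consequence per gene, would need only a change to how the kept record is chosen for each key.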