Main statistics¶

metric value
MIXED_read_mean_coverage 5.01
PCT_MIXED_both_tags 32.41
PCT_MIXED_both_tags_where_endreached 32.71
PCT_MINUS_both_tags_where_endreached 13.65
PCT_PLUS_both_tags_where_endreached 29.38
PCT_UNDETERMINED_either_tag 13.38
PCT_DISCORDANT 10.88
PCT_read_end_unreached 0.91
Mean_cvg 15.45
Indel_Rate 0.20
Mean_Read_Length 76.34
PF_Barcode_reads 658215149.00
PCT_PF_Reads_aligned 98.74
PCT_Chimeras 0.41
PCT_duplicates 8.29
PCT_Failed_QC_reads 0.00
PCT_failed_adapter_dimers 0.14
PCT_failed_unrecognized_start_stem 0.18
PCT_failed_unrecognized_start_loop 0.07




  • MIXED_read_mean_coverage is the mean coverage from reads in which both the start-loop and end-loop tags were detected as MIXED
  • PCT_MIXED_both_tags is the percentage of all reads in which both loops were detected as MIXED
  • PCT_MIXED_both_tags_where_endreached is the percentage of reads in which both loops were detected as MIXED, out of the reads that reached the read end so that the end loop could be measured
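The relation between these three metrics can be checked by hand against the numbers in this report. The sketch below is an illustration, not the pipeline's code: the values are copied from the strand_ratio_category_concordance and sorter_stats tables, and the idea that MIXED_read_mean_coverage equals Mean_cvg scaled by the MIXED fraction is an assumption that happens to reproduce the reported value.

```python
# Fractions copied from the strand_ratio_category_concordance table of this report.
mixed_both_tags = 0.324140  # start loop MIXED and end loop MIXED
# Reads whose end loop is END_UNREACHED, summed over all start-loop categories.
end_unreached = 0.006204 + 0.001169 + 0.001279 + 0.000464
mean_cvg = 15.45  # Mean_cvg from sorter_stats

# Percentage out of all reads.
pct_mixed_both_tags = 100 * mixed_both_tags

# Percentage out of the reads that reached the end loop only.
pct_mixed_where_endreached = 100 * mixed_both_tags / (1 - end_unreached)

# Assumption: coverage from MIXED reads is the mean coverage times the MIXED fraction.
mixed_read_mean_coverage = mean_cvg * mixed_both_tags

print(round(pct_mixed_both_tags, 2))         # 32.41
print(round(pct_mixed_where_endreached, 2))  # 32.71
print(round(mixed_read_mean_coverage, 2))    # 5.01
```

Because fewer than 1% of the reads fail to reach the end loop in this sample, the two percentages differ only in the first decimal place.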

QC plots¶

[Figure: barplot of strand ratio categories for the start and end loops]
This barplot shows the fraction of reads in each category, using the category definitions given in the "ppmSeq adapter version" section of this report.
The categories are reported separately for the start and end loops.
The end-loop breakdown is shown only for reads that reached the end loop.





[Figure: concordance between the start-loop and end-loop strand ratio categories]
These plots show the concordance between the strand ratio categories of the start loop and the end loop. Each loop is assigned a category separately, and the concordance is plotted. The top plot includes all reads, including those annotated as END_UNREACHED, while the bottom plot includes only reads where the end was reached.





[Figure: homopolymer calls in the start loop (left) and end loop (right)]
This plot shows the homopolymer lengths called for the A, T, G and C hmers in the start loop (left) and for the T, G, C and A hmers in the end loop (right). The loops are expected to yield:
- A signal of [1 1 1 1], AGCT and GCAT for the start and end loops, for MIXED reads
- A signal of [0 2 0 2], TTCC and CCTT for the start and end loops, for MINUS-only reads
- A signal of [2 0 2 0], AAGG and GGAA for the start and end loops, for PLUS-only reads






About ppmSeq¶

Identifying single nucleotide variants (SNVs) is fundamental to genomics. Consensus mutation calling, which requires multiple variant-containing reads to call genetic variation, is often used, but it is unsuitable for calling rare SNVs, such as in circulating tumor DNA or somatic mosaicism, where often only a single supporting read is available. Paired Plus and Minus strand Sequencing (ppmSeq), a PCR-free library preparation technology that uniquely leverages the Ultima Genomics clonal amplification process, overcomes this challenge. Here, DNA denaturation is not required prior to clonal amplification, so both native strands are clonally amplified on many sequencing beads, allowing for a linear increase in duplex recovery and scalable duplex coverage without requiring unique molecular identifiers or redundant sequencing.

In ppmSeq, modified Ultima Genomics adapters containing mismatched homopolymers are used to detect reads that are the result of the mixture of the two native DNA strands. While some reads are amplicons of only the Plus or Minus strands and are generally of typical UG read SNV accuracy, the so-called Mixed reads exhibit much lower error rates, well below 1E-6, facilitating the accurate detection of rare SNVs. Artifactual mutations manifesting on one strand only are common sources of error in SNV detection from NGS. While beads that are amplicons of Plus or Minus strand only are exposed to these artifacts that would appear as high-quality reads, in Mixed beads they create an inconsistent signal that translates into a low quality base or read, preventing them from being read as false positive SNVs.

This report is generated from preprocessing of the ppmSeq sequencing data, and is intended to be used as a QC report for the library prep and sequencing run. The distribution of the MINUS/PLUS ratio and the assignment of reads to categories (MIXED/MINUS/PLUS/UNDETERMINED) are shown, along with the raw calls.

ppmSeq adapter version¶

The ppmSeq_v1 adapter is used in this sample. It is composed of an AAGG-AAGG loop in the start and a GGAA-GGAA loop in the end of the read, so that reads are expected to ideally yield in each loop:
- TTCC and CCTT for MINUS-only reads
- AAGG and GGAA for PLUS-only reads
- AGCT and GCAT for 50% MINUS - 50% PLUS reads

Up to 2 homopolymer errors are allowed, as long as the distance from the second best fit is at least 4.

Additionally, since the end loop is at the end of the reads it is not necessarily reached, in which case the loop is annotated as END_UNREACHED.





Detailed statistics¶


Statistics table: keys_to_convert
0                      stats_shortlist
1                         sorter_stats
2         strand_ratio_category_counts
3           strand_ratio_category_norm
4    strand_ratio_category_concordance
5      strand_ratio_category_consensus
6                trimmer_failure_codes
dtype: object

Statistics table: sorter_stats
metric
Mean_cvg                1.545000e+01
Indel_Rate              2.000000e-01
Mean_Read_Length        7.634000e+01
PF_Barcode_reads        6.582151e+08
PCT_PF_Reads_aligned    9.874000e+01
PCT_Chimeras            4.100000e-01
PCT_duplicates          8.290000e+00
PCT_Failed_QC_reads     0.000000e+00
Name: value, dtype: float64

Statistics table: stats_shortlist
metric
MIXED_read_mean_coverage                5.007969e+00
PCT_MIXED_both_tags                     3.241404e+01
PCT_MIXED_both_tags_where_endreached    3.271226e+01
PCT_MINUS_both_tags_where_endreached    1.365068e+01
PCT_PLUS_both_tags_where_endreached     2.938198e+01
PCT_UNDETERMINED_either_tag             1.337989e+01
PCT_DISCORDANT                          1.087520e+01
PCT_read_end_unreached                  9.116373e-01
Mean_cvg                                1.545000e+01
Indel_Rate                              2.000000e-01
Mean_Read_Length                        7.634000e+01
PF_Barcode_reads                        6.582151e+08
PCT_PF_Reads_aligned                    9.874000e+01
PCT_Chimeras                            4.100000e-01
PCT_duplicates                          8.290000e+00
PCT_Failed_QC_reads                     0.000000e+00
PCT_failed_adapter_dimers               1.359672e-01
PCT_failed_unrecognized_start_stem      1.774658e-01
PCT_failed_unrecognized_start_loop      6.728910e-02
Name: value, dtype: float64

Statistics table: strand_ratio_category_concordance
strand_ratio_category_start  strand_ratio_category_end
MIXED                        MIXED                        0.324140
                             MINUS                        0.004714
                             PLUS                         0.054290
                             END_UNREACHED                0.006204
                             UNDETERMINED                 0.062611
MINUS                        MIXED                        0.010230
                             MINUS                        0.135262
                             PLUS                         0.026237
                             END_UNREACHED                0.001169
                             UNDETERMINED                 0.026921
PLUS                         MIXED                        0.007893
                             MINUS                        0.004397
                             PLUS                         0.291141
                             END_UNREACHED                0.001279
                             UNDETERMINED                 0.010217
UNDETERMINED                 MIXED                        0.015450
                             MINUS                        0.000791
                             PLUS                         0.009439
                             END_UNREACHED                0.000464
                             UNDETERMINED                 0.007149
Name: count_norm, dtype: float64

Statistics table: strand_ratio_category_consensus
strand_ratio_category_consensus
MIXED           0.327123
MINUS           0.136507
PLUS            0.293820
UNDETERMINED    0.133799
DISCORDANT      0.108752
Name: count_norm, dtype: float64
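The consensus fractions above can be reproduced from the strand_ratio_category_concordance table. The sketch below is a reconstruction under assumptions, not the pipeline's implementation: the function name `consensus_fractions` is hypothetical, and the collapsing rules (drop END_UNREACHED reads, label a read UNDETERMINED if either loop is UNDETERMINED, DISCORDANT if the two loops disagree) are inferred from the metric names because they match the reported numbers.

```python
# (start category, end category) -> fraction of reads, copied from
# the strand_ratio_category_concordance table of this report.
concordance = {
    ("MIXED", "MIXED"): 0.324140, ("MIXED", "MINUS"): 0.004714,
    ("MIXED", "PLUS"): 0.054290, ("MIXED", "END_UNREACHED"): 0.006204,
    ("MIXED", "UNDETERMINED"): 0.062611,
    ("MINUS", "MIXED"): 0.010230, ("MINUS", "MINUS"): 0.135262,
    ("MINUS", "PLUS"): 0.026237, ("MINUS", "END_UNREACHED"): 0.001169,
    ("MINUS", "UNDETERMINED"): 0.026921,
    ("PLUS", "MIXED"): 0.007893, ("PLUS", "MINUS"): 0.004397,
    ("PLUS", "PLUS"): 0.291141, ("PLUS", "END_UNREACHED"): 0.001279,
    ("PLUS", "UNDETERMINED"): 0.010217,
    ("UNDETERMINED", "MIXED"): 0.015450, ("UNDETERMINED", "MINUS"): 0.000791,
    ("UNDETERMINED", "PLUS"): 0.009439, ("UNDETERMINED", "END_UNREACHED"): 0.000464,
    ("UNDETERMINED", "UNDETERMINED"): 0.007149,
}


def consensus_fractions(table):
    """Collapse (start, end) fractions into consensus categories.

    Assumed rules: only reads that reached the end loop are counted,
    and the result is renormalized to those reads.
    """
    reached = {k: v for k, v in table.items() if k[1] != "END_UNREACHED"}
    total = sum(reached.values())
    out = {"MIXED": 0.0, "MINUS": 0.0, "PLUS": 0.0,
           "UNDETERMINED": 0.0, "DISCORDANT": 0.0}
    for (start, end), frac in reached.items():
        if "UNDETERMINED" in (start, end):
            label = "UNDETERMINED"   # either tag could not be determined
        elif start == end:
            label = start            # both loops agree
        else:
            label = "DISCORDANT"     # both determined, but different
        out[label] += frac / total
    return out


consensus = consensus_fractions(concordance)
for name, value in consensus.items():
    print(f"{name:<12} {value:.4f}")
```

Under these rules the five fractions sum to 1 and agree with the table above to within the rounding of the concordance values.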

Statistics table: strand_ratio_category_counts
               strand_ratio_category_start  strand_ratio_category_end  strand_ratio_category_end_no_unreached
MIXED                            293891320                  232607051                               232607051
MINUS                            129935084                   94394295                                94394295
PLUS                             204785073                  247819205                               247819205
END_UNREACHED                            0                    5928018                                       0
UNDETERMINED                      21649100                   69512008                                69512008

Statistics table: strand_ratio_category_norm
               strand_ratio_category_start  strand_ratio_category_end  strand_ratio_category_end_no_unreached
MIXED                             0.451959                   0.357714                                0.361005
MINUS                             0.199820                   0.145164                                0.146499
PLUS                              0.314928                   0.381108                                0.384614
END_UNREACHED                     0.000000                   0.009116                                0.000000
UNDETERMINED                      0.033293                   0.106899                                0.107882
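The normalized table follows from the counts table by dividing each column by its total. A minimal sketch, assuming plain column-wise normalization and that the "no_unreached" column simply zeroes END_UNREACHED before normalizing; the helper name `normalize` is hypothetical and the counts are copied from this report.

```python
# Counts copied from the strand_ratio_category_counts table of this report.
counts = {
    "start": {"MIXED": 293891320, "MINUS": 129935084, "PLUS": 204785073,
              "END_UNREACHED": 0, "UNDETERMINED": 21649100},
    "end": {"MIXED": 232607051, "MINUS": 94394295, "PLUS": 247819205,
            "END_UNREACHED": 5928018, "UNDETERMINED": 69512008},
}


def normalize(column, drop=()):
    """Return each category's share of the column total.

    Categories listed in `drop` are set to zero before normalizing,
    which mimics the *_end_no_unreached column (an assumption).
    """
    kept = {k: (0 if k in drop else v) for k, v in column.items()}
    total = sum(kept.values())
    return {k: v / total for k, v in kept.items()}


norm_start = normalize(counts["start"])
norm_end = normalize(counts["end"])
norm_end_no_unreached = normalize(counts["end"], drop=("END_UNREACHED",))

for name in counts["start"]:
    print(f"{name:<13} {norm_start[name]:.6f} "
          f"{norm_end[name]:.6f} {norm_end_no_unreached[name]:.6f}")
```

Both columns sum to the same number of classified reads (650260577), so the start and end fractions are directly comparable.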

Statistics table: trimmer_failure_codes
                                                 failed_read_count  total_read_count  PCT_failure
segment                  reason
First_C                  no match                              120         873345008     0.000014
                         sequence was too short               1252         873345008     0.000143
Stem_start               no match                          1549889         873345008     0.177466
Unrecognized_End_loop    sequence was too long             2877504         873345008     0.329481
Unrecognized_Start_loop  sequence was too long              587666         873345008     0.067289
insert                   sequence was too short            1187463         873345008     0.135967
start                    rsq file                        215129859         873345008    24.632861
                         sequence was too long             1750678         873345008     0.200457
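PCT_failure in this table is a percentage of total_read_count, and three of its rows reappear among the main statistics. The sketch below recomputes a few rows from counts copied from this report. It also notes an observation rather than a documented definition: in this sample, total_read_count minus the "rsq file" failures equals PF_Barcode_reads.

```python
total_read_count = 873345008

# (segment, reason) -> failed_read_count, copied from this report.
failed = {
    ("Stem_start", "no match"): 1549889,
    ("Unrecognized_Start_loop", "sequence was too long"): 587666,
    ("insert", "sequence was too short"): 1187463,
    ("start", "rsq file"): 215129859,
}

# Percentage of all reads that failed for each reason.
pct_failure = {key: 100 * n / total_read_count for key, n in failed.items()}

for (segment, reason), pct in pct_failure.items():
    print(f"{segment:<24} {reason:<23} {pct:.6f}")

# Observation for this sample only (not a documented definition):
# removing the "rsq file" failures leaves exactly PF_Barcode_reads.
remaining_reads = total_read_count - failed[("start", "rsq file")]
print(remaining_reads)  # 658215149
```

The Stem_start, Unrecognized_Start_loop and insert rows correspond to PCT_failed_unrecognized_start_stem, PCT_failed_unrecognized_start_loop and PCT_failed_adapter_dimers in the main statistics table.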

Statistics table: trimmer_histogram
strand_ratio_category_start loop_sequence_start strand_ratio_category_end loop_sequence_end native_adapter_length count count_norm
0 PLUS AAGGA PLUS GGAAC 1.0 154076692 0.236946
1 MINUS TTCCA MINUS CCTTC 1.0 83668521 0.128669
2 MIXED TGCA MIXED GCATTC 1.0 31514816 0.048465
3 PLUS AAGGA PLUS GGAC 1.0 30588563 0.047040
4 MIXED ATGCA MIXED GCATTC 1.0 29180882 0.044876
5 MIXED ATGCA UNDETERMINED NaN 1.0 23441782 0.036050
6 MIXED ATGCA MIXED GCTC 1.0 15754095 0.024227
7 MIXED ATGCA MIXED GGCATTC 1.0 15556446 0.023923
8 MINUS TTCCA UNDETERMINED NaN 1.0 15400600 0.023684
9 MIXED ATGCA MIXED GTC 1.0 14576619 0.022417
10 MIXED TGCA MIXED GGCATTC 1.0 13750335 0.021146
11 MINUS TTCCA PLUS GGAAC 1.0 13502706 0.020765
12 MIXED TGCA MIXED GCTC 1.0 13103811 0.020152
13 MIXED ATGCA PLUS GGAAC 1.0 12806975 0.019695
14 MIXED ATGCA MIXED GGCTC 1.0 10770725 0.016564
15 MIXED TGCA UNDETERMINED NaN 1.0 10331640 0.015888
16 MIXED TGCA PLUS GGAAC 1.0 9920185 0.015256
17 MIXED TGCA MIXED GTC 1.0 8930061 0.013733
18 MIXED AGCA MIXED GCATTC 1.0 8025845 0.012343
19 MIXED TGCA MIXED GGCTC 1.0 6721537 0.010337
20 PLUS AAGGA UNDETERMINED NaN 1.0 6404236 0.009849
21 UNDETERMINED NaN MIXED GCATTC 1.0 6015340 0.009251
22 MIXED ATCA MIXED GCATTC 1.0 5429406 0.008350
23 MIXED ACA MIXED GCATTC 1.0 4754587 0.007312
24 UNDETERMINED NaN UNDETERMINED NaN 1.0 4648468 0.007149
25 UNDETERMINED NaN PLUS GGAAC 1.0 3242453 0.004986
26 MIXED ATGCA PLUS GGAC 1.0 2807897 0.004318
27 UNDETERMINED NaN PLUS GGAC 1.0 2747446 0.004225
28 PLUS AAGGA MINUS CCTTC 1.0 2743214 0.004219
29 MIXED ATTGCA MIXED GCATTC 1.0 2734818 0.004206
30 MINUS TTCCA MIXED GCATTC 1.0 2577225 0.003963
31 MINUS TTCCA PLUS GGAC 1.0 2519408 0.003874
32 MIXED ATGCA MIXED GGCATC 1.0 2288898 0.003520
33 MIXED ATGCA MIXED GCATC 1.0 2217096 0.003410
34 MIXED ATGCA END_UNREACHED NaN NaN 2199246 0.003382
35 PLUS AAGGA MIXED GCATTC 1.0 2082861 0.003203
36 MIXED ATGA UNDETERMINED NaN 1.0 1821550 0.002801
37 MIXED TGCA MIXED GCATC 1.0 1790798 0.002754
38 MIXED TGCA PLUS GGAC 1.0 1713303 0.002635
39 MINUS TTCCA MINUS CTTC 1.0 1700989 0.002616
40 MIXED ATGCA MIXED GATTC 1.0 1617865 0.002488
41 MIXED AGCA PLUS GGAAC 1.0 1584756 0.002437
42 MIXED AATGCA UNDETERMINED NaN 1.0 1494533 0.002298
43 MIXED ATGCA MIXED GATC 1.0 1466413 0.002255
44 MIXED TGCA MIXED GATTC 1.0 1424505 0.002191
45 MINUS ATTCCA UNDETERMINED NaN 1.0 1406821 0.002163
46 MIXED ATGCA MINUS CCTTC 1.0 1327435 0.002041
47 MIXED ATCA PLUS GGAAC 1.0 1221513 0.001878
48 PLUS AAGGA PLUS GGAATC 1.0 1209439 0.001860
49 UNDETERMINED NaN MIXED GCTC 1.0 1175749 0.001808