Vaccines Track Settings
 
COVID Vaccines BioNTech/Pfizer BNT-162b2 and mRNA-1273

Data last updated at UCSC: 2021-04-21 16:45:45

Description

This track shows the alignment of three different mRNA vaccine sequences to the SARS-CoV-2 genome:

  1. The BioNTech/Pfizer BNT-162b2 sequence as published by the World Health Organization
  2. The reconstructed BioNTech/Pfizer BNT-162b2 RNA as sequenced by the Andrew Fire lab, Stanford University School of Medicine
  3. The Moderna mRNA-1273 sequence as sequenced by the Andrew Fire lab, Stanford University School of Medicine

Display Conventions and Configuration

The psl output from blat was converted to bigPsl format for display in this track. When a large section of the genome is in view, the track draws black where the vaccine and SARS-CoV-2 nucleotides are identical, and red lines where they differ. When a smaller section of the genome is in view, setting Color track by codons or bases: to different mRNA bases shows the vaccine nucleotides that differ from the SARS-CoV-2 sequence.

Methods

The mRNA sequences were obtained from the MS Word documents listed in the references below. The Andrew Fire lab GitHub repository supplied the FASTA sequencing results for the BioNTech/Pfizer BNT-162b2 and Moderna mRNA-1273 samples.

The PSL alignment file was obtained via the UCSC Genome Browser blat service with parameters -t=dnax -q=rnax, then filtered with a minimum score of 1000 to remove the polyA match:

  gfClient -maxIntron=10 -t=dnax -q=rnax <host> <port> \
     /gbdb/wuhCor1 threeVaccines.fa stdout \
        | pslFilter -minScore=1000 stdin wuhCor1.vaccines.psl

  pslScore wuhCor1.vaccines.psl

  #tName          tStart  tEnd    qName:qStart-qEnd       score   percentIdent
  NC_045512v2     21559   25384   ModernaMrna1273:54-3879  1419    68.60
  NC_045512v2     21559   25384   ReconstructedBNT162b2:51-3876 1701    72.30
  NC_045512v2     21559   25384   WHO_BNT162b2:51-3876     1701    72.30

  faCount threeVaccines.fa | tawk '{print $1,"1.."$2+1}' \
     | head -4 | tail -3 > threeVaccines.cds
  pslToBigPsl -cds=threeVaccines.cds -fa=threeVaccines.fa wuhCor1.vaccines.psl stdout \
     | sort -k1,1 -k2,2n > wuhCor1.vaccines.bigPsl

  bedToBigBed -type=bed12+13 -tab -as=$HOME/kent/src/hg/lib/bigPsl.as \
    wuhCor1.vaccines.bigPsl wuhCor1.chrom.sizes wuhCor1.vaccines.bb

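As a quick sanity check on the pslScore table above, the aligned span on the genome and on each query can be computed from the coordinates. The following is a minimal sketch, assuming the pslScore output has been saved with the column layout shown above and that query names contain no ":" or "-" characters. The function name psl_span_check is hypothetical and is not part of the kent tools.

```shell
# Hypothetical helper (not a kent tool): summarize pslScore output.
# Expected columns: tName tStart tEnd qName:qStart-qEnd score percentIdent
psl_span_check() {
    awk '
        /^[ \t]*#/ || NF < 6 { next }        # skip the header and blank lines
        {
            tSpan = $3 - $2                   # aligned span on the genome
            n = split($4, q, "[:-]")          # qName, qStart, qEnd
            qSpan = q[n] - q[n-1]             # aligned span on the query
            printf "%s\ttSpan=%d\tqSpan=%d\tcodons=%d\n", q[1], tSpan, qSpan, qSpan / 3
        }
    ' "$@"
}
```

For example, `pslScore wuhCor1.vaccines.psl | psl_span_check` reads the table from standard input. For the three rows shown above, each alignment spans 3825 bases on both the genome and the query, which corresponds to 1275 codons.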
Data Access

The fasta file sequences and psl alignment file can be obtained from our download server at: https://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/vaccines/.

The bigPsl alignment file used for the display of this track in the genome browser can be accessed from https://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/bbi/wuhCor1.vaccines.bb. The data can be extracted with the kent command-line tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here.
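Extracting the alignments with bigBedToBed might look like the following minimal sketch, assuming bigBedToBed is already installed and on your PATH. The wrapper name vaccines_to_bed is hypothetical; the URL is the one given above, and any extra arguments are passed straight through to bigBedToBed.

```shell
# Hypothetical wrapper around the kent tool bigBedToBed.
# Usage: vaccines_to_bed [bigBedToBed options] > out.bed
vaccines_to_bed() {
    url=https://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/bbi/wuhCor1.vaccines.bb
    if ! command -v bigBedToBed >/dev/null 2>&1; then
        echo "bigBedToBed not found on PATH" >&2
        return 127
    fi
    # kent tools accept "stdout" in place of an output file name.
    bigBedToBed "$@" "$url" stdout
}
```

For example, `vaccines_to_bed -chrom=NC_045512v2 -start=21559 -end=25384 > spike.bed` would restrict the output to the aligned region reported by pslScore; the output file name spike.bed is only an illustration.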

The protein encoded by the three sequences has two amino acid substitutions compared to the SARS-CoV-2 S glycoprotein: S:K986P and S:V987P. See also: The tiny tweak behind COVID-19 vaccines.

>BNT162b2
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFD
NPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY
SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT
LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRV
QPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF
VIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC
NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGS
NVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTI
SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGF
NFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAG
TITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN
TLVKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRV
DFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL
QELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYTZZ
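
The two substituted positions can be inspected directly in a protein listing like the one above. The following is a minimal sketch, assuming the sequence has been saved as a FASTA file; the helper name fasta_residues and the file name used in the example are hypothetical.

```shell
# Hypothetical helper: print residues START..END (1-based, inclusive)
# from the first record of a protein FASTA file.
# Usage: fasta_residues FILE START END
fasta_residues() {
    awk -v s="$2" -v e="$3" '
        /^>/ { if (seen++) exit; next }          # stop at a second record
        { gsub(/[ \t\r]/, ""); seq = seq $0 }    # join wrapped sequence lines
        END { print substr(seq, s, e - s + 1) }
    ' "$1"
}
```

With the BNT162b2 listing saved as BNT162b2.fa, `fasta_residues BNT162b2.fa 986 987` should print the two proline residues (PP) introduced by the K986P and V987P substitutions.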

References

Dae Eun Jeong, Matthew McCoy, Karen Artiles, Orkan Ilbay, Andrew Fire, Kari Nadeau, Helen Park, Brooke Betts, Scott Boyd, Ramona Hoh, and Massa Shoura. Assemblies of putative SARS-CoV2-spike-encoding mRNA sequences for vaccines BNT-162b2 and mRNA-1273. Obtained from GitHub.

Bert Hubert. Reverse Engineering the source code of the BioNTech/Pfizer SARS-CoV-2 Vaccine. 25 Dec 2020.

Wikipedia. Pfizer-BioNTech COVID-19 vaccine.

World Health Organization MedNet. Messenger RNA encoding the full-length SARS-CoV-2 spike glycoprotein. Sept. 2020, document 11889.

Cyril Le Nouën, Peter L. Collins, and Ursula J. Buchholz. Attenuation of Human Respiratory Viruses by Synonymous Genome Recoding. Frontiers in Immunology 2019; 10: 1250. PMID: 31231383.

Ryan Cross. The tiny tweak behind COVID-19 vaccines. Chemical & Engineering News, 29 September 2020, Vol. 98, issue 38.

Credits

Thank you to the Andrew Fire lab, Stanford University School of Medicine, for providing the sequencing data of these vaccines.

The presentation of this track was prepared by Hiram Clawson (hclawson@ucsc.edu).