selenoprotein N, 1 (SEPN1) - coding DNA reference sequence

(used for variant description)

(last modified May 12, 2017)


This file was created to facilitate the description of sequence variants on transcript NM_020451.2 in the SEPN1 gene based on a coding DNA reference sequence following the HGVS recommendations. The sequence was taken from NG_009930.1, covering SEPN1 transcript NM_020451.2. Exon 3 can be differentially spliced (see transcript NM_206926.1).

NOTE: SEPN1 is a selenoprotein, meaning that selenium is incorporated as selenocysteine (U or Sec) at a UGA codon, which normally acts as a termination codon (here at codons 127 and 462, shown as U in the protein sequence). Recognition of UGA as a selenocysteine codon requires a secondary structure called SECIS (selenocysteine insertion sequence), located in the 3' UTR of the transcript.
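
The UGA recoding described in the note can be illustrated with a minimal Python sketch. It is not part of the LOVD file. The function name `translate` and the parameter `sec_codons` are hypothetical, the codon table is the standard genetic code, and the sketch does not model the SECIS element itself: the caller simply states which in-frame TGA codons are read as selenocysteine.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (third base varies fastest)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds, sec_codons=()):
    """Translate a coding DNA string into one-letter amino acids.

    sec_codons (an assumption of this sketch) lists 1-based codon numbers,
    relative to the start of `cds`, where an in-frame TGA is read as
    selenocysteine (U) instead of a stop. Translation ends at the first
    codon that is still read as a stop.
    """
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        number = i // 3 + 1
        if codon == "TGA" and number in sec_codons:
            protein.append("U")
            continue
        aa = CODON_TABLE[codon]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Fragment c.1381_1398 of the sequence in this file (codons 461 to 466).
fragment = "TGCTGAGGTTCAGGGCGG"
print(translate(fragment, sec_codons={2}))  # CUGSGR
print(translate(fragment))                  # C*
```

Codon numbers here are relative to the fragment passed in, so codon 2 of the fragment corresponds to codon 462 of the full coding sequence.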



 (upstream sequence)
           .         .         .         .         .                g.5055
      ccccgccccgctctttcgcttcccgggccgccggcagccgccgccagccgcagcc       c.-1

          .         .         .         .         .         .       g.5115
 ATGGGCCGGGCCCGGCCGGGCCAACGCGGGCCGCCCAGCCCCGGCCCCGCCGCGCAGCCT       c.60
 M  G  R  A  R  P  G  Q  R  G  P  P  S  P  G  P  A  A  Q  P         p.20

          .         .         .         .         .         .       g.5175
 CCCGCGCCACCGCGCCGCCGCGCCCGTTCCCTGGCGCTGCTCGGAGCCCTGCTGGCCGCC       c.120
 P  A  P  P  R  R  R  A  R  S  L  A  L  L  G  A  L  L  A  A         p.40

          .         .         .         .         .         .       g.5235
 GCCGCTGCCGCCGCCGTCCGGGTCTGCGCCCGCCACGCCGAGGCCCAGGCGGCCGCGCGG       c.180
 A  A  A  A  A  V  R  V  C  A  R  H  A  E  A  Q  A  A  A  R         p.60

     | 02    .         .         .         .         .         .    g.5924
 CAG | GAACTGGCGCTGAAGACCCTGGGGACAGATGGCCTTTTTCTCTTTTCCTCCTTGGAC    c.240
 Q   | E  L  A  L  K  T  L  G  T  D  G  L  F  L  F  S  S  L  D      p.80

          .         .         .         .         .         .       g.5984
 ACTGACGGGGATATGTACATCAGCCCTGAGGAGTTCAAACCCATTGCTGAGAAGCTAACA       c.300
 T  D  G  D  M  Y  I  S  P  E  E  F  K  P  I  A  E  K  L  T         p.100

   | 03      .         .         .         .         .         .    g.6899
 G | GGTCTTGTTCTGTCACCCAGACTGGAGTGCAGTGGTGCAGTCACAGCTCACTGCAGCCT    c.360
 G |   S  C  S  V  T  Q  T  G  V  Q  W  C  S  H  S  S  L  Q  P      p.120
     ^ differentially spliced exon

          .         .         .         .   | 04     .         .    g.10004
 CAACTTCCCTGGCTCAATTGATCCTCCTGCCTCAGCCTCCTGA | GGTCAACTCCCGCGGCC    c.420
 Q  L  P  W  L  N  U  S  S  C  L  S  L  L  R |    S  T  P  A  A      p.140

          .         .         .         .         .         .       g.10043
 AGCTGCGAGGAGGAGGAGTTGCCCCCTGACCCTAGCGAGGAGACGCTCACCATAGAAGCC       c.480
 S  C  E  E  E  E  L  P  P  D  P  S  E  E  T  L  T  I  E  A         p.160

          .         .         .         .         .        | 05.    g.13407
 CGATTCCAGCCTCTGCTCCCGGAGACCATGACCAAGAGCAAAGATGGCTTCCTAGGG | GTC    c.540
 R  F  Q  P  L  L  P  E  T  M  T  K  S  K  D  G  F  L  G   | V      p.180

          .         .         .         .         .         .       g.13467
 TCCCGCCTCGCCCTGTCCGGCCTCCGAAACTGGACAGCCGCCGCCTCACCAAGTGCAGTG       c.600
 S  R  L  A  L  S  G  L  R  N  W  T  A  A  A  S  P  S  A  V         p.200

          .         .         .         .         .         .       g.13527
 TTTGCCACCCGCCACTTCCAGCCCTTCCTTCCCCCGCCAGGCCAGGAGCTGGGTGAGCCC       c.660
 F  A  T  R  H  F  Q  P  F  L  P  P  P  G  Q  E  L  G  E  P         p.220

          .         .         .         .         .         .       g.13587
 TGGTGGATCATCCCCAGTGAGCTGAGCATGTTCACTGGCTACCTGTCCAACAACCGCTTC       c.720
 W  W  I  I  P  S  E  L  S  M  F  T  G  Y  L  S  N  N  R  F         p.240

          .         .        | 06.         .         .         .    g.13883
 TATCCACCGCCGCCCAAGGGCAAGGAG | GTCATCATCCACCGGCTCCTGAGCATGTTCCAC    c.780
 Y  P  P  P  P  K  G  K  E   | V  I  I  H  R  L  L  S  M  F  H      p.260

          .         .         .         .         .         .       g.13943
 CCTCGGCCCTTTGTGAAGACCCGCTTTGCCCCTCAGGGAGCTGTGGCCTGCCTGACTGCC       c.840
 P  R  P  F  V  K  T  R  F  A  P  Q  G  A  V  A  C  L  T  A         p.280

          .         .         .   | 07     .         .         .    g.14535
 ATCAGCGACTTCTACTACACTGTGATGTTCCG | GATCCATGCCGAGTTCCAGCTCAGTGAG    c.900
 I  S  D  F  Y  Y  T  V  M  F  R  |  I  H  A  E  F  Q  L  S  E      p.300

          .         .         .         .         .         .       g.14595
 CCGCCCGACTTCCCCTTTTGGTTCTCCCCTGCTCAGTTCACCGGCCACATCATCCTCTCC       c.960
 P  P  D  F  P  F  W  F  S  P  A  Q  F  T  G  H  I  I  L  S         p.320

          .         .         .         .         . | 08       .    g.16288
 AAAGACGCCACCCACGTCCGCGACTTCCGGCTCTTCGTGCCCAACCACAG | GTCTCTGAAT    c.1020
 K  D  A  T  H  V  R  D  F  R  L  F  V  P  N  H  R  |  S  L  N      p.340

          .         .         .         .         .         .       g.16348
 GTGGACATGGAGTGGCTTTACGGGGCCAGTGAAAGCAGCAACATGGAGGTGGACATCGGC       c.1080
 V  D  M  E  W  L  Y  G  A  S  E  S  S  N  M  E  V  D  I  G         p.360

          .   | 09     .         .         .         .         .    g.16563
 TACATACCCCAG | ATGGAGCTGGAGGCCACGGGCCCCTCTGTGCCCTCCGTGATCCTGGAT    c.1140
 Y  I  P  Q   | M  E  L  E  A  T  G  P  S  V  P  S  V  I  L  D      p.380

          .         .         .         .         .         .       g.16623
 GAGGATGGCAGCATGATCGACAGCCACCTGCCTTCAGGGGAGCCCCTGCAGTTTGTGTTT       c.1200
 E  D  G  S  M  I  D  S  H  L  P  S  G  E  P  L  Q  F  V  F         p.400

          .         .         .         .         .         .       g.16683
 GAGGAGATCAAGTGGCAGCAGGAGCTGAGCTGGGAGGAGGCTGCCCGGCGCCTGGAGGTG       c.1260
 E  E  I  K  W  Q  Q  E  L  S  W  E  E  A  A  R  R  L  E  V         p.420

          .         .  | 10      .         .         .         .    g.17550
 GCCATGTACCCCTTCAAGAAG | GTCTCCTACTTGCCGTTCACTGAGGCCTTCGACCGAGCC    c.1320
 A  M  Y  P  F  K  K   | V  S  Y  L  P  F  T  E  A  F  D  R  A      p.440

          .         .         .         .         .         .       g.17610
 AAGGCTGAGAACAAGCTGGTGCACTCAATCCTGCTGTGGGGGGCCCTGGATGACCAGTCC       c.1380
 K  A  E  N  K  L  V  H  S  I  L  L  W  G  A  L  D  D  Q  S         p.460

         | 11.         .         .         .         .         .    g.18758
 TGCTGAG | GTTCAGGGCGGACTCTCCGGGAGACTGTCCTGGAAAGTTCGCCCATCCTCACC    c.1440
 C  U  G |   S  G  R  T  L  R  E  T  V  L  E  S  S  P  I  L  T      p.480
     SRE (Sec redefinition element)

          .         .         .         .         .         .       g.18818
 CTGCTCAACGAGAGCTTCATCAGCACCTGGTCCCTGGTGAAGGAGCTGGAGGAACTGCAG       c.1500
 L  L  N  E  S  F  I  S  T  W  S  L  V  K  E  L  E  E  L  Q         p.500

  | 12       .         .         .         .         .         .    g.18961
  | AACAACCAGGAGAACTCGTCCCACCAGAAGCTGGCTGGCCTGCACCTGGAGAAGTACAGC    c.1560
  | N  N  Q  E  N  S  S  H  Q  K  L  A  G  L  H  L  E  K  Y  S      p.520

          .         .         .         .   | 13     .         .    g.20390
 TTCCCCGTGGAGATGATGATCTGCCTGCCCAATGGCACCGTG | GTCCATCACATCAATGCC    c.1620
 F  P  V  E  M  M  I  C  L  P  N  G  T  V   | V  H  H  I  N  A      p.540

          .         .         .         .         .         .       g.20450
 AACTACTTCTTGGACATCACCTCCGTGAAGCCCGAGGAAATCGAGAGCAATCTCTTCAGC       c.1680
 N  Y  F  L  D  I  T  S  V  K  P  E  E  I  E  S  N  L  F  S         p.560

          .         .         .         .         .         .       g.20510
 TTCTCATCCACCTTTGAAGACCCGTCCACGGCCACCTACATGCAGTTCCTGAAGGAGGGA       c.1740
 F  S  S  T  F  E  D  P  S  T  A  T  Y  M  Q  F  L  K  E  G         p.580

          .         .         .                                     g.20543
 CTCCGGCGTGGCCTGCCCCTCCTCCAGCCCTAG                                  c.1773
 L  R  R  G  L  P  L  L  Q  P  *                                    p.590

          .         .         .         .         .         .       g.20603
 agtgcctggacgggatctgatgcacaggcccccacgcctcagagccagagtggtcctcag       c.*60

          .         .         .         .         .         .       g.20663
 cccatttcagactgcagatgccgcccactcccaccccactcctaggctgccttggagggt       c.*120

          .         .         .         .         .         .       g.20723
 acaagatccactgagggtggccaccacagccttggctccatggtggcgggtagacaaggg       c.*180

          .         .         .         .         .         .       g.20783
 atgcctgggctgactgggcagaggaacctctagctctgactgtcactcggctctccctac       c.*240

          .         .         .         .         .         .       g.20843
 ccatttggctctggaagctgcttggcccccccagatcagggcctgggtgaactccctgga       c.*300

          .         .         .         .         .         .       g.20903
 cctttcctagccagccgcacagtctaggcccttgtggggtgaagaatggagggaggagca       c.*360

          .         .         .         .         .         .       g.20963
 ggctaggaagacggggccaccaccctctccttgctttcagcccttcccacaggaaacatc       c.*420

          .         .         .         .         .         .       g.21023
 aagaagccccagccaggaggggccaggctgccaaggcggctcccctgtttatctagagcc       c.*480

          .         .         .         .         .         .       g.21083
 ttcgttcctggccataccccggactgccctcctgtgcctgatgtccccagctggggtcag       c.*540

          .         .         .         .         .         .       g.21143
 tctcaacaggagccagtcttctggagcctctgggcagaaccctccatcagagtggaaatc       c.*600

          .         .         .         .         .         .       g.21203
 agacgggaccccctgcagcttccctgaccacgccactgaccagctatctggggaagttta       c.*660

          .         .         .         .         .         .       g.21263
 ctgtgaaggggtttctgcctttagcaatggggttcactaagggggttcccgaggcccagg       c.*720

          .         .         .         .         .         .       g.21323
 gccaaggcactcccaccgcctaccttagcacagggtctctgcaggactgcgggagccagc       c.*780

          .         .         .         .         .         .       g.21383
 gctcctgccgcccctcttgcccctcagaccttgcatccacagaagcacaacccagccaaa       c.*840

          .         .         .         .         .         .       g.21443
 caccacagccttctccagagccggcactgtcccggcaaccaggggtgccccaggctagct       c.*900

          .         .         .         .         .         .       g.21503
 cttctacctctggggcaccacggactccccttggccactcttgggactttggtccacgtc       c.*960

          .         .         .         .         .         .       g.21563
 ctgagccactgaccacggccagtctctctttttatatgtgcagaaaagtgtttttacaca       c.*1020

          .         .         .         .         .         .       g.21623
 aactttctcatggtttgtaggtatttttttataaccccagtgctgaggagaaaggagggg       c.*1080

          .         .         .         .         .         .       g.21683
 cagtggcttccccggcagcagccccatgatggctgaatccgaaatcctcgatgggtccag       c.*1140

          .         .         .         .         .         .       g.21743
 cttgatgtctttgcagctgcacctatgggaagaagtagtcctctcttccttctcctcttc       c.*1200

          .         .         .         .         .         .       g.21803
 agctttttaaaaacagtcctcagaggatccatgatccccagcactgtcccatcctccaca       c.*1260

          .         .         .         .         .         .       g.21863
 aaggcccacaggcatgcctgtactctctttcattaaggtcttgaagtcaggctgccccct       c.*1320

          .         .         .         .         .         .       g.21923
 ccccagcccccagttctctccccaccccctcaccccacccggggctcactcagcctggca       c.*1380

          .         .         .         .         .         .       g.21983
 gaggaagaaggaaggcagacatctccgcagccactcctgggccttttatgtgccgagtta       c.*1440

          .         .         .         .         .         .       g.22043
 ccccacttgccttgggcgtgtccactgagccttccccagccagtcttgttctcaattttg       c.*1500

          .         .         .         .         .         .       g.22103
 ttttgttttgttttgagacggagtcttgctctgtcacccaggctggagtgctatggctcg       c.*1560

          .         .         .         .         .         .       g.22163
 atcttggctcactgcaacctccacctcccaggttcaagcaattctcttgcctcagcctcc       c.*1620

          .         .         .         .         .         .       g.22223
 cgagtagctgggattacaggtgcatgccaccatggctggctaatttttgtatttttagta       c.*1680

          .         .         .         .         .         .       g.22283
 gagatggggtttcaccatattggtcaggctgatctggaacttctgacctcaggtgatcca       c.*1740

          .         .         .         .         .         .       g.22343
 cctgcctcagcctcccaaagtgctgggattacaggcgtgagcaatcgtgcccagccttgt       c.*1800

          .         .         .         .         .         .       g.22403
 tcttaattttgtatcatccagtcatcgctaatattacacgcaccttctcacttaatcctc       c.*1860

          .         .         .         .         .         .       g.22463
 acgacaagcctgtgaggcagatgctcattgttcccatcttgatgaaacttgagtctcagg       c.*1920

          .         .         .         .         .         .       g.22523
 gaagtgaagtgacttgcccagggtcactcaggtagagttgagattcaaacccacatgtgg       c.*1980

          .         .         .         .         .         .       g.22583
 ctccaaagtctgcatctggatttgggggtgttttttggcatggcaccctcacctctctcc       c.*2040

          .         .         .         .         .         .       g.22643
 ctgcctgttttccccaaagtggaaaggaaggcctttcaaaccagagtgtctcactcccct       c.*2100

          .         .         .         .         .         .       g.22703
 ctgacctccagaccagatggggcatgagccagccagctcagccaggctccctgtgtcctg       c.*2160

          .         .         .         .         .         .       g.22763
 ggaggaagtgtccccatcccccatgccccttatggggagggagggcgtctgatgctctct       c.*2220

          .         .         .         .         .         .       g.22823
 ctctgcctccccccccatcctgtcaggcacaggtgacgggggcagcccatgcgagccctt       c.*2280

          .         .         .         .         .         .       g.22883
 ctcctgctgctctgggagggccagttccacattgagccagcctggtcccatggaaaatga       c.*2340

          .         .         .         .         .         .       g.22943
 tggcctgggctttctgaggccttatctgatgcctctgcagttcatgtcccccaccaggcc       c.*2400

          .         .         .         .         .         .       g.23003
 tcgaggctcagggtgggagagggccccgggctgccctgtcactcctctaacacttccctc       c.*2460

          .         .         .         .                           g.23047
 ccctgtccccaacatgccctgtaataaaattagagaagactaac                       c.*2504

 (downstream sequence)

Legend:
Nucleotide numbering (following the HGVS rules for a 'Coding DNA Reference Sequence') is indicated at the right of the sequence, counting the A of the ATG translation-initiating methionine codon as 1. Every 10th nucleotide is indicated by a "." above the sequence. The selenoprotein N, 1 protein sequence is shown below the coding DNA sequence, with numbering indicated at the right, starting with 1 for the translation-initiating methionine. Every 10th amino acid is shown in bold. The position of each intron is indicated by a vertical line separating the two flanking exons. The start of the first exon (transcription initiation site) is indicated by a '\', the end of the last exon (poly-A addition site) by a '/'. The exon number is indicated above the first nucleotide(s) of the exon. To aid the description of frameshift variants, all stop codons in the +1 frame are shown in bold, while all stop codons in the +2 frame are underlined.
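
As an illustration of the numbering rules in the legend, the following Python sketch maps a 0-based index in a transcript string to a coding DNA position. It is not part of the LOVD output. The function names `hgvs_c_position` and `codon_number` are hypothetical, and the example values assume the layout shown in this file: 55 upstream nucleotides displayed before the ATG and a coding sequence of 1773 nucleotides including the stop codon.

```python
def hgvs_c_position(index, cds_start, cds_end):
    """Return the coding DNA position for a 0-based sequence index.

    cds_start: 0-based index of the A of the ATG start codon (c.1).
    cds_end:   0-based index of the last nucleotide of the stop codon.
    Upstream positions count backwards from c.-1 and positions after the
    stop codon count forwards from c.*1; there is no position c.0.
    """
    if index < cds_start:
        return f"c.-{cds_start - index}"
    if index > cds_end:
        return f"c.*{index - cds_end}"
    return f"c.{index - cds_start + 1}"


def codon_number(c_position):
    """Return the 1-based codon (amino acid) number for a coding position."""
    return (c_position - 1) // 3 + 1


# Assumed layout of this file: 55 upstream nucleotides, then 1773 coding nucleotides.
CDS_START = 55
CDS_END = CDS_START + 1773 - 1

print(hgvs_c_position(0, CDS_START, CDS_END))            # c.-55
print(hgvs_c_position(CDS_START, CDS_START, CDS_END))    # c.1
print(hgvs_c_position(CDS_END, CDS_START, CDS_END))      # c.1773
print(hgvs_c_position(CDS_END + 1, CDS_START, CDS_END))  # c.*1
print(codon_number(1384))                                # 462
```

The last call shows that c.1384, the first nucleotide of the second in-frame TGA in this sequence, falls in codon 462.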

©2004-2017 Leiden University Medical Center