mutS homolog 3 (E. coli) (MSH3) - coding DNA reference sequence

(used for variant description)

(last modified October 14, 2016)


This file was created to facilitate the description of sequence variants on transcript NM_002439.4 in the MSH3 gene based on a coding DNA reference sequence following the HGVS recommendations.

The sequence was taken from NG_016607.1, covering MSH3 transcript NM_002439.4.


 (upstream sequence)
                                         .         .                g.5193
                                         ccgcagacgcctgggaactg       c.-61

 .         .         .         .         .         .                g.5253
 cggccgcgggctcgcgctcctcgccaggccctgccgccgggctgccatccttgccctgcc       c.-1

          .         .         .         .         .         .       g.5313
 ATGTCTCGCCGGAAGCCTGCGTCGGGCGGCCTCGCTGCCTCCAGCTCAGCCCCTGCGAGG       c.60
 M  S  R  R  K  P  A  S  G  G  L  A  A  S  S  S  A  P  A  R         p.20

          .         .         .         .         .         .       g.5373
 CAAGCGGTTTTGAGCCGATTCTTCCAGTCTACGGGAAGCCTGAAATCCACCTCCTCCTCC       c.120
 Q  A  V  L  S  R  F  F  Q  S  T  G  S  L  K  S  T  S  S  S         p.40

          .         .         .         .         .         .       g.5433
 ACAGGTGCAGCCGACCAGGTGGACCCTGGCGCTGCAGCGGCTGCAGCGGCCGCAGCGGCC       c.180
 T  G  A  A  D  Q  V  D  P  G  A  A  A  A  A  A  A  A  A  A         p.60

          .         .         .         .         .        | 02.    g.6939
 GCAGCGCCCCCAGCGCCCCCAGCTCCCGCCTTCCCGCCCCAGCTGCCGCCGCACATA | GCT    c.240
 A  A  P  P  A  P  P  A  P  A  F  P  P  Q  L  P  P  H  I   | A      p.80

          .         .         .         .         .         .       g.6999
 ACAGAAATTGACAGAAGAAAGAAGAGACCATTGGAAAATGATGGGCCTGTTAAAAAGAAA       c.300
 T  E  I  D  R  R  K  K  R  P  L  E  N  D  G  P  V  K  K  K         p.100

          .         .         .         .         .         | 03    g.15670
 GTAAAGAAAGTCCAACAAAAGGAAGGAGGAAGTGATCTGGGAATGTCTGGCAACTCTG | AG    c.360
 V  K  K  V  Q  Q  K  E  G  G  S  D  L  G  M  S  G  N  S  E |       p.120

          .         .         .         .         .         .       g.15730
 CCAAAGAAATGTCTGAGGACCAGGAATGTTTCAAAGTCTCTGGAAAAATTGAAAGAATTC       c.420
 P  K  K  C  L  R  T  R  N  V  S  K  S  L  E  K  L  K  E  F         p.140

          .         .         .         .         .         .       g.15790
 TGCTGCGATTCTGCCCTTCCTCAAAGTAGAGTCCAGACAGAATCTCTGCAGGAGAGATTT       c.480
 C  C  D  S  A  L  P  Q  S  R  V  Q  T  E  S  L  Q  E  R  F         p.160

          .         .         .         .         .         .       g.15850
 GCAGTTCTGCCAAAATGTACTGATTTTGATGATATCAGTCTTCTACACGCAAAGAATGCA       c.540
 A  V  L  P  K  C  T  D  F  D  D  I  S  L  L  H  A  K  N  A         p.180

          .         .         .          | 04        .         .    g.20643
 GTTTCTTCTGAAGATTCGAAACGTCAAATTAATCAAAAG | GACACAACACTTTTTGATCTC    c.600
 V  S  S  E  D  S  K  R  Q  I  N  Q  K   | D  T  T  L  F  D  L      p.200

          .         .         .         .         .         .       g.20703
 AGTCAGTTTGGATCATCAAATACAAGTCATGAAAATTTACAGAAAACTGCTTCCAAATCA       c.660
 S  Q  F  G  S  S  N  T  S  H  E  N  L  Q  K  T  A  S  K  S         p.220

          .         .         .         .         .         .       g.20763
 GCTAACAAACGGTCCAAAAGCATCTATACGCCGCTAGAATTACAATACATAGAAATGAAG       c.720
 A  N  K  R  S  K  S  I  Y  T  P  L  E  L  Q  Y  I  E  M  K         p.240

          .         .         .         .         .         .       g.20823
 CAGCAGCACAAAGATGCAGTTTTGTGTGTGGAATGTGGATATAAGTATAGATTCTTTGGG       c.780
 Q  Q  H  K  D  A  V  L  C  V  E  C  G  Y  K  Y  R  F  F  G         p.260

          .   | 05     .         .         .         .         .    g.22817
 GAAGATGCAGAG | ATTGCAGCCCGAGAGCTCAATATTTATTGCCATTTAGATCACAACTTT    c.840
 E  D  A  E   | I  A  A  R  E  L  N  I  Y  C  H  L  D  H  N  F      p.280

          .         .         .         .         .         .       g.22877
 ATGACAGCAAGTATACCTACTCACAGACTGTTTGTTCATGTACGCCGCCTGGTGGCAAAA       c.900
 M  T  A  S  I  P  T  H  R  L  F  V  H  V  R  R  L  V  A  K         p.300

           | 06        .         .         .         .         .    g.23317
 GGATATAAG | GTGGGAGTTGTGAAGCAAACTGAAACTGCAGCATTAAAGGCCATTGGAGAC    c.960
 G  Y  K   | V  G  V  V  K  Q  T  E  T  A  A  L  K  A  I  G  D      p.320

          .         .         .         .         .         .       g.23377
 AACAGAAGTTCACTCTTTTCCCGGAAATTGACTGCCCTTTATACAAAATCTACACTTATT       c.1020
 N  R  S  S  L  F  S  R  K  L  T  A  L  Y  T  K  S  T  L  I         p.340

         | 07.         .         .         .         .         .    g.25561
 GGAGAAG | ATGTGAATCCCCTAATCAAGCTGGATGATGCTGTAAATGTTGATGAGATAATG    c.1080
 G  E  D |   V  N  P  L  I  K  L  D  D  A  V  N  V  D  E  I  M      p.360

          .         .         .         .         .         .       g.25621
 ACTGATACTTCTACCAGCTATCTTCTGTGCATCTCTGAAAATAAGGAAAATGTTAGGGAC       c.1140
 T  D  T  S  T  S  Y  L  L  C  I  S  E  N  K  E  N  V  R  D         p.380

          .         .         .    | 08    .         .         .    g.29479
 AAAAAAAAGGGCAACATTTTTATTGGCATTGTG | GGAGTGCAGCCTGCCACAGGCGAGGTT    c.1200
 K  K  K  G  N  I  F  I  G  I  V   | G  V  Q  P  A  T  G  E  V      p.400

          .         .         .         .         .         .       g.29539
 GTGTTTGATAGTTTCCAGGACTCTGCTTCTCGTTCAGAGCTAGAAACCCGGATGTCAAGC       c.1260
 V  F  D  S  F  Q  D  S  A  S  R  S  E  L  E  T  R  M  S  S         p.420

          .         .         .         .         .         .       g.29599
 CTGCAGCCAGTAGAGCTGCTGCTTCCTTCGGCCTTGTCCGAGCAAACAGAGGCGCTCATC       c.1320
 L  Q  P  V  E  L  L  L  P  S  A  L  S  E  Q  T  E  A  L  I         p.440

          .         . | 09       .         .         .         .    g.76018
 CACAGAGCCACATCTGTTAG | TGTGCAGGATGACAGAATTCGAGTCGAAAGGATGGATAAC    c.1380
 H  R  A  T  S  V  S  |  V  Q  D  D  R  I  R  V  E  R  M  D  N      p.460

          .         .         .         .         .         .       g.76078
 ATTTATTTTGAATACAGCCATGCTTTCCAGGCAGTTACAGAGTTTTATGCAAAAGATACA       c.1440
 I  Y  F  E  Y  S  H  A  F  Q  A  V  T  E  F  Y  A  K  D  T         p.480

          .    | 10    .         .         .         .         .    g.79423
 GTTGACATCAAAG | GTTCTCAAATTATTTCTGGCATTGTTAACTTAGAGAAGCCTGTGATT    c.1500
 V  D  I  K  G |   S  Q  I  I  S  G  I  V  N  L  E  K  P  V  I      p.500

          .         .         .         .         .         .       g.79483
 TGCTCTTTGGCTGCCATCATAAAATACCTCAAAGAATTCAACTTGGAAAAGATGCTCTCC       c.1560
 C  S  L  A  A  I  I  K  Y  L  K  E  F  N  L  E  K  M  L  S         p.520

          | 11         .         .         .         .         .    g.92041
 AAACCTGA | GAATTTTAAACAGCTATCAAGTAAAATGGAATTTATGACAATTAATGGAACA    c.1620
 K  P  E  |  N  F  K  Q  L  S  S  K  M  E  F  M  T  I  N  G  T      p.540

          .         .         .    | 12    .         .         .    g.95058
 ACATTAAGGAATCTGGAAATCCTACAGAATCAG | ACTGATATGAAAACCAAAGGAAGTTTG    c.1680
 T  L  R  N  L  E  I  L  Q  N  Q   | T  D  M  K  T  K  G  S  L      p.560

          .         .         .         .         .         .       g.95118
 CTGTGGGTTTTAGACCACACTAAAACTTCATTTGGGAGACGGAAGTTAAAGAAGTGGGTG       c.1740
 L  W  V  L  D  H  T  K  T  S  F  G  R  R  K  L  K  K  W  V         p.580

          .         .    | 13    .         .         .         .    g.112108
 ACCCAGCCACTCCTTAAATTAAG | GGAAATAAATGCCCGGCTTGATGCTGTATCGGAAGTT    c.1800
 T  Q  P  L  L  K  L  R  |  E  I  N  A  R  L  D  A  V  S  E  V      p.600

          .         .         .         .         .         .       g.112168
 CTCCATTCAGAATCTAGTGTGTTTGGTCAGATAGAAAATCATCTACGTAAATTGCCCGAC       c.1860
 L  H  S  E  S  S  V  F  G  Q  I  E  N  H  L  R  K  L  P  D         p.620

          .         .         .       | 14 .         .         .    g.118482
 ATAGAGAGGGGACTCTGTAGCATTTATCACAAAAAA | TGTTCTACCCAAGAGTTCTTCTTG    c.1920
 I  E  R  G  L  C  S  I  Y  H  K  K   | C  S  T  Q  E  F  F  L      p.640

          .         .         .         .         .         .       g.118542
 ATTGTCAAAACTTTATATCACCTAAAGTCAGAATTTCAAGCAATAATACCTGCTGTTAAT       c.1980
 I  V  K  T  L  Y  H  L  K  S  E  F  Q  A  I  I  P  A  V  N         p.660

          .         .         .         .         .         .       g.118602
 TCCCACATTCAGTCAGACTTGCTCCGGACCGTTATTTTAGAAATTCCTGAACTCCTCAGT       c.2040
 S  H  I  Q  S  D  L  L  R  T  V  I  L  E  I  P  E  L  L  S         p.680

          .         .         .         .     | 15   .         .    g.119376
 CCAGTGGAGCATTACTTAAAGATACTCAATGAACAAGCTGCCAA | AGTTGGGGATAAAACT    c.2100
 P  V  E  H  Y  L  K  I  L  N  E  Q  A  A  K  |  V  G  D  K  T      p.700

          .         .         .         .         .         .       g.119436
 GAATTATTTAAAGACCTTTCTGACTTCCCTTTAATAAAAAAGAGGAAGGATGAAATTCAA       c.2160
 E  L  F  K  D  L  S  D  F  P  L  I  K  K  R  K  D  E  I  Q         p.720

          .         .         .         .         .         .       g.119496
 GGTGTTATTGACGAGATCCGAATGCATTTGCAAGAAATACGAAAAATACTAAAAAATCCT       c.2220
 G  V  I  D  E  I  R  M  H  L  Q  E  I  R  K  I  L  K  N  P         p.740

          .         .         .    | 16    .         .         .    g.126246
 TCTGCACAATATGTGACAGTATCAGGACAGGAG | TTTATGATAGAAATAAAGAACTCTGCT    c.2280
 S  A  Q  Y  V  T  V  S  G  Q  E   | F  M  I  E  I  K  N  S  A      p.760

          .         .         .         | 17         .         .    g.129267
 GTATCTTGTATACCAACTGATTGGGTAAAGGTTGGAAG | CACAAAAGCTGTGAGCCGCTTT    c.2340
 V  S  C  I  P  T  D  W  V  K  V  G  S  |  T  K  A  V  S  R  F      p.780

          .         .         .         .         .         .       g.129327
 CACTCTCCTTTTATTGTAGAAAATTACAGACATCTGAATCAGCTCCGGGAGCAGCTAGTC       c.2400
 H  S  P  F  I  V  E  N  Y  R  H  L  N  Q  L  R  E  Q  L  V         p.800

          .         .         .      | 18  .         .         .    g.138115
 CTTGACTGCAGTGCTGAATGGCTTGATTTTCTAGA | GAAATTCAGTGAACATTATCACTCC    c.2460
 L  D  C  S  A  E  W  L  D  F  L  E  |  K  F  S  E  H  Y  H  S      p.820

          .         .         .         .         .         .       g.138175
 TTGTGTAAAGCAGTGCATCACCTAGCAACTGTTGACTGCATTTTCTCCCTGGCCAAGGTC       c.2520
 L  C  K  A  V  H  H  L  A  T  V  D  C  I  F  S  L  A  K  V         p.840

          .         .    | 19    .         .         .         .    g.143295
 GCTAAGCAAGGAGATTACTGCAG | ACCAACTGTACAAGAAGAAAGAAAAATTGTAATAAAA    c.2580
 A  K  Q  G  D  Y  C  R  |  P  T  V  Q  E  E  R  K  I  V  I  K      p.860

          .         .         .         .         .         .       g.143355
 AATGGAAGGCACCCTGTGATTGATGTGTTGCTGGGAGAACAGGATCAATATGTCCCAAAT       c.2640
 N  G  R  H  P  V  I  D  V  L  L  G  E  Q  D  Q  Y  V  P  N         p.880

          .      | 20  .         .         .         .         .    g.164154
 AATACAGATTTATCA | GAGGACTCAGAGAGAGTAATGATAATTACCGGACCAAACATGGGT    c.2700
 N  T  D  L  S   | E  D  S  E  R  V  M  I  I  T  G  P  N  M  G      p.900

          .         .         .         .         .         .       g.164214
 GGAAAGAGCTCCTACATAAAACAAGTTGCATTGATTACCATCATGGCTCAGATTGGCTCC       c.2760
 G  K  S  S  Y  I  K  Q  V  A  L  I  T  I  M  A  Q  I  G  S         p.920

          .         .         .         .         .    | 21    .    g.204662
 TATGTTCCTGCAGAAGAAGCGACAATTGGGATTGTGGATGGCATTTTCACAAG | GATGGGT    c.2820
 Y  V  P  A  E  E  A  T  I  G  I  V  D  G  I  F  T  R  |  M  G      p.940

          .         .         .         .         .         .       g.204722
 GCTGCAGACAATATATATAAAGGACAGAGTACATTTATGGAAGAACTGACTGACACAGCA       c.2880
 A  A  D  N  I  Y  K  G  Q  S  T  F  M  E  E  L  T  D  T  A         p.960

          .         .         .         .         .         .       g.204782
 GAAATAATCAGAAAAGCAACATCACAGTCCTTGGTTATCTTGGATGAACTAGGAAGAGGG       c.2940
 E  I  I  R  K  A  T  S  Q  S  L  V  I  L  D  E  L  G  R  G         p.980

          .         .         .         .         .         .       g.204842
 ACGAGCACTCATGATGGAATTGCCATTGCCTATGCTACACTTGAGTATTTCATCAGAGAT       c.3000
 T  S  T  H  D  G  I  A  I  A  Y  A  T  L  E  Y  F  I  R  D         p.1000

  | 22       .         .         .         .         .         .    g.215398
  | GTGAAATCCTTAACCCTGTTTGTCACCCATTATCCGCCAGTTTGTGAACTAGAAAAAAAT    c.3060
  | V  K  S  L  T  L  F  V  T  H  Y  P  P  V  C  E  L  E  K  N      p.1020

          .         .         .         .         .         .       g.215458
 TACTCACACCAGGTGGGGAATTACCACATGGGATTCTTGGTCAGTGAGGATGAAAGCAAA       c.3120
 Y  S  H  Q  V  G  N  Y  H  M  G  F  L  V  S  E  D  E  S  K         p.1040

          . | 23       .         .         .         .         .    g.223691
 CTGGATCCAG | GCGCAGCAGAACAAGTCCCTGATTTTGTCACCTTCCTTTACCAAATAACT    c.3180
 L  D  P  G |   A  A  E  Q  V  P  D  F  V  T  F  L  Y  Q  I  T      p.1060

          .         .         .         .         .         .       g.223751
 AGAGGAATTGCAGCAAGGAGTTATGGATTAAATGTGGCTAAACTAGCAGATGTTCCTGGA       c.3240
 R  G  I  A  A  R  S  Y  G  L  N  V  A  K  L  A  D  V  P  G         p.1080

          .         .         .         .         .         .       g.223811
 GAAATTTTGAAGAAAGCAGCTCACAAGTCAAAAGAGCTGGAAGGATTAATAAATACGAAA       c.3300
 E  I  L  K  K  A  A  H  K  S  K  E  L  E  G  L  I  N  T  K         p.1100

    | 24     .         .         .         .         .         .    g.226334
 AG | AAAGAGACTCAAGTATTTTGCAAAGTTATGGACGATGCATAATGCACAAGACCTGCAG    c.3360
 R  |  K  R  L  K  Y  F  A  K  L  W  T  M  H  N  A  Q  D  L  Q      p.1120

          .         .         .         .         .                 g.226388
 AAGTGGACAGAGGAGTTCAACATGGAAGAAACACAGACTTCTCTTCTTCATTAA             c.3414
 K  W  T  E  E  F  N  M  E  E  T  Q  T  S  L  L  H  X               p.1137

          .         .         .         .         .         .       g.226448
 aatgaagactacatttgtgaacaaaaaatggagaattaaaaataccaactgtacaaaata       c.*60

          .         .         .         .         .         .       g.226508
 actctccagtaacagcctatctttgtgtgacatgtgagcataaaattatgaccatggtat       c.*120

          .         .         .         .         .         .       g.226568
 attcctattggaaacagagaggtttttctgaagacagtctttttcaagtttctgtcttcc       c.*180

          .         .         .         .         .         .       g.226628
 taacttttctacgtataaacactcttgaatagacttccactttgtaattagaaaatttta       c.*240

          .         .         .         .         .         .       g.226688
 tggacagtaagtccagtaaagccttaagtggcagaatataattcccaagcttttggaggg       c.*300

          .         .         .         .         .         .       g.226748
 tgatataaaaatttacttgatatttttatttgtttcagttcagataattggcaactgggt       c.*360

          .         .         .         .         .         .       g.226808
 gaatctggcaggaatctatccattgaactaaaataattttattatgcaaccagtttatcc       c.*420

          .         .         .         .         .         .       g.226868
 accaagaacataagaattttttataagtagaaagaattggccaggcatggtggctcatgc       c.*480

          .         .         .         .         .         .       g.226928
 ctgtaatcccagcactttgggaggccaaggtaggcagatcacctgaggtcaggagttcaa       c.*540

          .         .         .         .         .         .       g.226988
 gaccagcctggccaacatggcaaaaccccatctttactaaaaatataaagtacatctcta       c.*600

          .         .         .         .         .         .       g.227048
 ctaaaaatacgaaaaaattagctgggcatggtggcgcacacctgtagtcccagctactcc       c.*660

          .         .         .         .         .         .       g.227108
 ggaggctgaggcaggagaatctcttgaacctgggaggcggaggttgcaatgagccgagat       c.*720

          .         .         .         .         .         .       g.227168
 cacgtcactgcactccagcttgggcaacagagcaagactccatctcaaaaaaaaaaaaag       c.*780

          .         .         .         .         .         .       g.227228
 aaaaaagaaaagaaatagaattatcaagcttttaaaaactagagcacagaaggaataagg       c.*840

          .         .         .         .         .         .       g.227288
 tcatgaaatttaaaaggttaaatattgtcataggattaagcagtttaaagattgttggat       c.*900

          .         .         .         .         .                 g.227341
 gaaattatttgtcattcattcaagtaataaatatttaatgaatacttgctata              c.*953

 (downstream sequence)

Legend:
Nucleotide numbering (following the HGVS rules for a 'Coding DNA Reference Sequence') is indicated at the right of the sequence, counting the A of the ATG translation-initiating Methionine as 1. Every 10th nucleotide is indicated by a "." above the sequence.

The MutS homolog 3 (E. coli) protein sequence is shown below the coding DNA sequence, with numbering indicated at the right, starting with 1 for the translation-initiating Methionine. Every 10th amino acid is shown in bold.

The position of each intron is indicated by a vertical line separating the two flanking exons. The start of the first exon (transcription initiation site) is indicated by a '\', the end of the last exon (poly-A addition site) by a '/'. The exon number is indicated above the first nucleotide(s) of the exon.

To aid the description of frameshift variants, all stop codons in the +1 frame are shown in bold, while all stop codons in the +2 frame are underlined.
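
The numbering rules in the legend can be sketched in a few lines of Python. This is a minimal illustration, not part of LOVD or any HGVS tool: the function names are hypothetical, and the lengths (80 nt of 5' UTR, 3414 nt of coding sequence including the stop codon, 953 nt of 3' UTR) and the exon 1 genomic offset are simply read off the right-hand labels of the listing above. Positions before the ATG are numbered c.-1, c.-2, ... going upstream, positions after the stop codon are numbered c.*1, c.*2, ..., and there is no c.0.

```python
# Lengths read off the listing (assumptions of this sketch, not an API).
UTR5_LEN = 80    # c.-80 .. c.-1, the lowercase bases before the ATG
CDS_LEN = 3414   # c.1 .. c.3414, stop codon included
UTR3_LEN = 953   # c.*1 .. c.*953, the lowercase bases after the stop codon


def transcript_index_to_c(index, utr5_len=UTR5_LEN, cds_len=CDS_LEN,
                          utr3_len=UTR3_LEN):
    """Map a 1-based position in the listed transcript to a c. label."""
    total = utr5_len + cds_len + utr3_len
    if not 1 <= index <= total:
        raise ValueError(f"index must be in 1..{total}")
    if index <= utr5_len:
        # 5' UTR: the base just before the A of ATG is c.-1 (no c.0).
        return f"c.{index - utr5_len - 1}"
    if index <= utr5_len + cds_len:
        # Coding sequence: the A of ATG is c.1.
        return f"c.{index - utr5_len}"
    # 3' UTR: the base just after the stop codon is c.*1.
    return f"c.*{index - utr5_len - cds_len}"


def exon1_c_to_g(c_pos):
    """Genomic (g.) coordinate for a c. position inside exon 1 only.

    Valid for c.-80 .. c.237 as shown in the listing; later exons need
    their own offsets because of the intervening introns.
    """
    if c_pos == 0 or not -80 <= c_pos <= 237:
        raise ValueError("position is outside exon 1 (c.-80 .. c.237)")
    # c.1 sits at g.5254; skipping the non-existent c.0 shifts the
    # offset by one for upstream positions.
    return 5253 + c_pos if c_pos > 0 else 5254 + c_pos


if __name__ == "__main__":
    for i in (1, 80, 81, 3494, 3495, 4447):
        print(i, transcript_index_to_c(i))
    print(exon1_c_to_g(-61), exon1_c_to_g(60))
```

The exon 1 helper reproduces the row labels of the listing (for example c.-61 at g.5193 and c.60 at g.5313); it deliberately refuses positions beyond c.237, where the first intron interrupts the genomic numbering.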

Powered by LOVD v.3.0 Build 17
©2004-2016 Leiden University Medical Center