mutS homolog 4 (E. coli) (MSH4) - coding DNA reference sequence

(used for variant description)

(last modified June 9, 2020)


This file was created to facilitate the description of sequence variants on transcript NM_002440.3 in the MSH4 gene based on a coding DNA reference sequence following the HGVS recommendations.

The sequence was taken from NG_029861.1, covering MSH4 transcript NM_002440.3.
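As an illustration of the kind of description this file supports, the following minimal Python sketch formats a substitution on NM_002440.3 using the c. numbering. The helper name `describe_substitution` is hypothetical, the reference string is only the c.1 to c.60 row of the sequence in this file, and the variants produced are formatting examples only, not reported variants.

```python
# First 60 coding nucleotides (c.1 to c.60), copied from the sequence in this file.
REF_C1_TO_C60 = "ATGCTGAGGCCTGAGATCTCATCAACCTCGCCTTCTGCCCCGGCGGTTTCCCCGTCGTCG"
TRANSCRIPT = "NM_002440.3"


def describe_substitution(c_pos: int, alt: str) -> str:
    """Hypothetical helper: format a substitution at coding position c_pos.

    Only positions c.1 to c.60 are covered, because this sketch holds
    just that excerpt of the reference sequence.
    """
    if not 1 <= c_pos <= len(REF_C1_TO_C60):
        raise ValueError("position outside the excerpt used in this sketch")
    alt = alt.upper()
    if alt not in {"A", "C", "G", "T"}:
        raise ValueError("alt must be a single nucleotide (A, C, G or T)")
    ref = REF_C1_TO_C60[c_pos - 1]  # c. numbering is 1-based
    if alt == ref:
        raise ValueError("alt equals the reference nucleotide")
    return f"{TRANSCRIPT}:c.{c_pos}{ref}>{alt}"


# Illustrative output only; these are not reported variants.
print(describe_substitution(60, "A"))
```

Looking the reference nucleotide up in the sequence, rather than asking the caller for it, avoids descriptions whose reference base disagrees with the file.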


 (upstream sequence)
           .         .         .         .         .                g.5055
      agtgcgtcggcgcgcagttctcccgcccgtttcagcggcgcagcttctgtagttg       c.-61

 .         .         .         .         .         .                g.5115
 ggctactggaggggtcgctcagaaacctcatacttctcgggtcagggaaggtttgggagg       c.-1

          .         .         .         .         .         .       g.5175
 ATGCTGAGGCCTGAGATCTCATCAACCTCGCCTTCTGCCCCGGCGGTTTCCCCGTCGTCG       c.60
 M  L  R  P  E  I  S  S  T  S  P  S  A  P  A  V  S  P  S  S         p.20

          .         .         .         .         .         .       g.5235
 GGAGAAACCCGCTCACCTCAGGGTCCCCGCTACAATTTCGGACTCCAGGAGACTCCACAG       c.120
 G  E  T  R  S  P  Q  G  P  R  Y  N  F  G  L  Q  E  T  P  Q         p.40

          .         .         .         .         .         .       g.5295
 AGCCGCCCTTCGGTCCAGGTGGTCTCTGCATCCACCTGTCCTGGCACGTCAGGAGCTGCG       c.180
 S  R  P  S  V  Q  V  V  S  A  S  T  C  P  G  T  S  G  A  A         p.60

          .         .         .         .         .         .       g.5355
 GGCGACCGGAGCAGCAGCAGCAGCAGCCTTCCCTGCCCCGCGCCAAACTCCCGGCCAGCT       c.240
 G  D  R  S  S  S  S  S  S  L  P  C  P  A  P  N  S  R  P  A         p.80

      | 02   .         .         .         .         .         .    g.11916
 CAAG | GTTCATACTTTGGAAACAAAAGAGCTTATGCAGAAAACACAGTTGCATCAAATTTT    c.300
 Q  G |   S  Y  F  G  N  K  R  A  Y  A  E  N  T  V  A  S  N  F      p.100

          .         .         .         .         .         .       g.11976
 ACTTTTGGTGCAAGCTCATCTTCTGCACGAGATACTAATTATCCTCAAACACTTAAAACT       c.360
 T  F  G  A  S  S  S  S  A  R  D  T  N  Y  P  Q  T  L  K  T         p.120

          .         .         .         .         .         .       g.12036
 CCATTGTCTACTGGAAATCCTCAGAGATCAGGTTATAAGAGCTGGACACCACAAGTGGGA       c.420
 P  L  S  T  G  N  P  Q  R  S  G  Y  K  S  W  T  P  Q  V  G         p.140

         | 03.         .         .         .         .         .    g.15163
 TATTCAG | CTTCATCCTCATCTGCGATTTCTGCACACTCCCCATCAGTTATTGTAGCTGTT    c.480
 Y  S  A |   S  S  S  S  A  I  S  A  H  S  P  S  V  I  V  A  V      p.160

          .         .         .         .         .         .       g.15223
 GTAGAAGGGAGAGGACTTGCCAGAGGTGAAATAGGAATGGCAAGTATTGATTTAAAAAAC       c.540
 V  E  G  R  G  L  A  R  G  E  I  G  M  A  S  I  D  L  K  N         p.180

          .         .         .         .         | 04         .    g.18838
 CCCCAAATTATACTATCCCAGTTTGCAGACAACACAACATATGCAAAG | GTGATCACTAAA    c.600
 P  Q  I  I  L  S  Q  F  A  D  N  T  T  Y  A  K   | V  I  T  K      p.200

          .         .         .         .         .         .       g.18898
 CTTAAAATTTTATCACCTTTGGAAATAATAATGTCAAATACTGCTTGTGCTGTGGGGAAT       c.660
 L  K  I  L  S  P  L  E  I  I  M  S  N  T  A  C  A  V  G  N         p.220

          .         .         .          | 05        .         .    g.23171
 TCCACCAAGTTGTTCACTCTGATCACAGAAAATTTCAAG | AATGTTAATTTCACTACTATC    c.720
 S  T  K  L  F  T  L  I  T  E  N  F  K   | N  V  N  F  T  T  I      p.240

          .         .         .         .         .         .       g.23231
 CAAAGGAAATACTTCAATGAAACAAAAGGATTAGAGTACATTGAACAGTTATGCATAGCA       c.780
 Q  R  K  Y  F  N  E  T  K  G  L  E  Y  I  E  Q  L  C  I  A         p.260

          .         .         .      | 06  .         .         .    g.24527
 GAATTCAGCACTGTCCTAATGGAGGTTCAGTCCAA | GTATTACTGCCTTGCAGCTGTTGCA    c.840
 E  F  S  T  V  L  M  E  V  Q  S  K  |  Y  Y  C  L  A  A  V  A      p.280

          .         .         .         .         .         .       g.24587
 GCTTTGTTAAAATATGTTGAATTTATTCAAAATTCAGTTTATGCACCAAAATCACTGAAG       c.900
 A  L  L  K  Y  V  E  F  I  Q  N  S  V  Y  A  P  K  S  L  K         p.300

          .         .         .         .         .         .       g.24647
 ATTTGTTTCCAGGGTAGTGAACAGACAGCCATGATAGATTCATCATCAGCCCAAAACCTT       c.960
 I  C  F  Q  G  S  E  Q  T  A  M  I  D  S  S  S  A  Q  N  L         p.320

          .         .          | 07        .         .         .    g.30569
 GAATTGTTAATTAATAATCAAGACTATAG | GAATAATCACACTCTCTTTGGTGTTCTAAAT    c.1020
 E  L  L  I  N  N  Q  D  Y  R  |  N  N  H  T  L  F  G  V  L  N      p.340

          .         .         .         .         .         .       g.30629
 TATACTAAGACTCCTGGAGGGAGTAGACGACTTCGTTCTAATATATTAGAGCCTCTAGTT       c.1080
 Y  T  K  T  P  G  G  S  R  R  L  R  S  N  I  L  E  P  L  V         p.360

          .         .         .         .         .         .       g.30689
 GATATTGAAACCATTAACATGAGATTAGATTGTGTTCAAGAACTACTTCAAGATGAGGAA       c.1140
 D  I  E  T  I  N  M  R  L  D  C  V  Q  E  L  L  Q  D  E  E         p.380

          .         .   | 08     .         .         .         .    g.56376
 CTATTTTTTGGACTTCAATCAG | TTATATCAAGATTTCTTGATACAGAGCAGCTTCTTTCT    c.1200
 L  F  F  G  L  Q  S  V |   I  S  R  F  L  D  T  E  Q  L  L  S      p.400

          .         .         . | 09       .         .         .    g.75673
 GTTTTAGTCCAAATTCCAAAGCAAGACACG | GTCAATGCTGCTGAATCAAAGATAACAAAT    c.1260
 V  L  V  Q  I  P  K  Q  D  T   | V  N  A  A  E  S  K  I  T  N      p.420

          .         .         .         .      | 10  .         .    g.85080
 TTAATATACTTAAAACATACCTTGGAACTTGTGGATCCTTTAAAG | ATTGCTATGAAGAAC    c.1320
 L  I  Y  L  K  H  T  L  E  L  V  D  P  L  K   | I  A  M  K  N      p.440

          .         .         .         .         . | 11       .    g.86288
 TGTAACACACCTTTATTAAGAGCTTACTATGGTTCCTTGGAAGACAAGAG | GTTTGGAATC    c.1380
 C  N  T  P  L  L  R  A  Y  Y  G  S  L  E  D  K  R  |  F  G  I      p.460

          .         .         .         .         .         .       g.86348
 ATACTTGAAAAGATTAAAACAGTAATTAATGATGATGCAAGATACATGAAAGGATGCCTA       c.1440
 I  L  E  K  I  K  T  V  I  N  D  D  A  R  Y  M  K  G  C  L         p.480

          .         .         .         .         .         .       g.86408
 AACATGAGGACTCAGAAGTGCTATGCAGTGAGGTCTAACATAAATGAATTTCTTGACATA       c.1500
 N  M  R  T  Q  K  C  Y  A  V  R  S  N  I  N  E  F  L  D  I         p.500

          .         .         .         . | 12       .         .    g.87141
 GCAAGAAGAACATACACAGAGATTGTAGATGACATAGCAG | GAATGATATCACAACTTGGA    c.1560
 A  R  R  T  Y  T  E  I  V  D  D  I  A  G |   M  I  S  Q  L  G      p.520

          .         .         .         .         .         .       g.87201
 GAAAAATATAGTCTACCTTTAAGGACAAGTTTTAGCTCTGCTCGAGGATTTTTCATCCAG       c.1620
 E  K  Y  S  L  P  L  R  T  S  F  S  S  A  R  G  F  F  I  Q         p.540

          .         .         .         .         .        | 13.    g.88182
 ATGACTACAGATTGTATAGCCCTACCTAGTGATCAACTTCCTTCAGAATTTATTAAG | ATT    c.1680
 M  T  T  D  C  I  A  L  P  S  D  Q  L  P  S  E  F  I  K   | I      p.560

          .         .         .         .         .         .       g.88242
 TCTAAAGTGAAAAATTCTTACAGCTTTACATCAGCAGATTTAATTAAAATGAATGAAAGA       c.1740
 S  K  V  K  N  S  Y  S  F  T  S  A  D  L  I  K  M  N  E  R         p.580

          .         .         .         .  | 14      .         .    g.89394
 TGCCAAGAATCTTTGAGAGAAATCTATCACATGACTTATAT | GATAGTGTGCAAACTGCTT    c.1800
 C  Q  E  S  L  R  E  I  Y  H  M  T  Y  M  |  I  V  C  K  L  L      p.600

          .         .         .         .         .         .       g.89454
 AGTGAGATTTATGAACATATTCATTGCTTATATAAACTATCTGACACTGTGTCAATGCTG       c.1860
 S  E  I  Y  E  H  I  H  C  L  Y  K  L  S  D  T  V  S  M  L         p.620

          .         .         .         .       | 15 .         .    g.91764
 GATATGCTACTGTCATTTGCTCATGCCTGCACTCTTTCTGACTATG | TTCGACCAGAATTT    c.1920
 D  M  L  L  S  F  A  H  A  C  T  L  S  D  Y  V |   R  P  E  F      p.640

          .         .         .         .         .         .       g.91824
 ACTGATACTTTAGCAATCAAACAGGGATGGCATCCTATTCTTGAAAAAATATCTGCGGAA       c.1980
 T  D  T  L  A  I  K  Q  G  W  H  P  I  L  E  K  I  S  A  E         p.660

          .         .         .         .         .         .       g.91884
 AAACCTATTGCCAACAATACCTATGTTACAGAAGGGAGTAATTTTTTGATCATAACTGGA       c.2040
 K  P  I  A  N  N  T  Y  V  T  E  G  S  N  F  L  I  I  T  G         p.680

          .         .         .         .         .         .       g.91944
 CCAAACATGAGTGGAAAATCCACATATTTAAAACAGATTGCTCTTTGTCAGATTATGGCC       c.2100
 P  N  M  S  G  K  S  T  Y  L  K  Q  I  A  L  C  Q  I  M  A         p.700

         | 16.         .         .         .         .         .    g.97433
 CAGATTG | GATCATATGTTCCAGCAGAATATTCTTCCTTTAGAATTGCTAAACAGATTTTT    c.2160
 Q  I  G |   S  Y  V  P  A  E  Y  S  S  F  R  I  A  K  Q  I  F      p.720

          .         .         .         .         .         .       g.97493
 ACAAGAATTAGTACTGATGATGATATCGAAACAAATTCATCAACATTTATGAAAGAAATG       c.2220
 T  R  I  S  T  D  D  D  I  E  T  N  S  S  T  F  M  K  E  M         p.740

        | 17 .         .         .         .         .         .    g.98879
 AAAGAG | ATAGCATATATTCTACATAATGCTAATGACAAATCGCTCATATTAATTGATGAA    c.2280
 K  E   | I  A  Y  I  L  H  N  A  N  D  K  S  L  I  L  I  D  E      p.760

          .         .         .         .         .         .       g.98939
 CTTGGCAGAGGTACTAATACGGAAGAAGGTATTGGCATTTGTTATGCTGTTTGTGAATAT       c.2340
 L  G  R  G  T  N  T  E  E  G  I  G  I  C  Y  A  V  C  E  Y         p.780

          .      | 18  .         .         .         .         .    g.106081
 CTACTGAGCTTAAAG | GCATTTACACTGTTTGCTACACATTTCCTGGAACTATGCCATATT    c.2400
 L  L  S  L  K   | A  F  T  L  F  A  T  H  F  L  E  L  C  H  I      p.800

          .         .         .         .         .         .       g.106141
 GATGCCCTGTATCCTAATGTAGAAAACATGCATTTTGAAGTTCAACATGTAAAGAATACC       c.2460
 D  A  L  Y  P  N  V  E  N  M  H  F  E  V  Q  H  V  K  N  T         p.820

          .         .         .         .         .         .       g.106201
 TCAAGAAATAAAGAAGCAATTTTGTATACCTACAAACTTTCTAAGGGACTCACAGAAGAG       c.2520
 S  R  N  K  E  A  I  L  Y  T  Y  K  L  S  K  G  L  T  E  E         p.840

          . | 19       .         .         .         .         .    g.107797
 AAAAATTATG | GATTAAAAGCTGCAGAGGTGTCATCACTTCCACCATCAATTGTCTTGGAT    c.2580
 K  N  Y  G |   L  K  A  A  E  V  S  S  L  P  P  S  I  V  L  D      p.860

          .         .         .          | 20        .         .    g.120846
 GCCAAGGAAATCACAACTCAAATTACGAGACAAATTTTG | CAAAACCAAAGGAGTACCCCT    c.2640
 A  K  E  I  T  T  Q  I  T  R  Q  I  L   | Q  N  Q  R  S  T  P      p.880

          .         .         .         .         .         .       g.120906
 GAGATGGAAAGACAGAGAGCTGTGTACCATCTAGCCACTAGGCTTGTTCAAACTGCTCGA       c.2700
 E  M  E  R  Q  R  A  V  Y  H  L  A  T  R  L  V  Q  T  A  R         p.900

          .         .         .         .         .         .       g.120966
 AACTCTCAATTGGATCCAGACAGTTTACGAATATATTTAAGTAACCTCAAGAAGAAGTAC       c.2760
 N  S  Q  L  D  P  D  S  L  R  I  Y  L  S  N  L  K  K  K  Y         p.920

          .         .         .         .         .                 g.121017
 AAAGAAGATTTTCCCAGGACTGAACAAGTTCCAGAAAAGACTGAAGAATAA                c.2811
 K  E  D  F  P  R  T  E  Q  V  P  E  K  T  E  E  X                  p.936

          .         .         .         .         .         .       g.121077
 tcacaattctaatgtaataatatatcttaattcaaggaacctagaatttatttttctcct       c.*60

          .         .         .         .         .         .       g.121137
 tagagataaggaaaataacatttgccaaatttcatattttaattgaaaattacattatat       c.*120

          .         .         .         .         .         .       g.121197
 taacatcacaattgtcatctatatattctatatgaaaaatatttattataacttaacaaa       c.*180

          .         .         .         .         .         .       g.121257
 tgagaactacttaaaggaatggtttttatgttaggagaaaatacaatacaccactttttt       c.*240

          .         .         .         .         .         .       g.121317
 ctcacaaaaattcacattattaagacctaagttcatttatacttacaagttatagtttaa       c.*300

          .         .         .         .         .                 g.121368
 aatattattgtaaataccctagtttaataaacacctttctttttaggtctg                c.*351

 (downstream sequence)

Legend:
Nucleotide numbering (following the HGVS rules for a 'Coding DNA Reference Sequence') is indicated at the right of the sequence, counting the A of the ATG translation-initiating methionine as 1. Every 10th nucleotide is indicated by a "." above the sequence.

The MutS homolog 4 (E. coli) protein sequence is shown below the coding DNA sequence, with numbering indicated at the right, starting with 1 for the translation-initiating methionine. Every 10th amino acid is shown in bold.

The position of each intron is indicated by a vertical line separating the two flanking exons. The start of the first exon (transcription initiation site) is indicated by a '\', the end of the last exon (poly-A addition site) by a '/'. The exon number is indicated above the first nucleotide(s) of the exon.

To aid the description of frameshift variants, all stop codons in the +1 frame are shown in bold, while all stop codons in the +2 frame are underlined.
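
The numbering rules in the legend can be sketched in Python as follows. This is a minimal illustration, not part of LOVD: the function names `cdna_label`, `codon_position` and `translate` are hypothetical, the coding length of 2811 nucleotides (stop codon included) is read from the c.2811 label in this file, and the standard genetic code is assumed, with 'X' marking a stop codon as in the protein rows.

```python
CDS_LENGTH = 2811  # c.1 to c.2811, stop codon included (read from this file)


def cdna_label(offset: int, cds_length: int = CDS_LENGTH) -> str:
    """Label a position given as a 0-based offset from the A of the ATG.

    Negative offsets lie upstream (c.-N), offsets inside the coding
    sequence are c.N, and offsets past the stop codon are c.*N.
    There is no c.0.
    """
    if offset < 0:
        return f"c.{offset}"
    if offset < cds_length:
        return f"c.{offset + 1}"
    return f"c.*{offset - cds_length + 1}"


def codon_position(c_pos: int) -> tuple[int, int]:
    """Return (codon number, position within the codon) for c.1 and above."""
    if c_pos < 1:
        raise ValueError("only coding positions (c.1 and above) map to codons")
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


# Standard genetic code, bases ordered T, C, A, G; 'X' marks a stop codon.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYYXXCCXW"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate complete codons of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# The c.1 to c.60 row of this file should reproduce the p.1 to p.20 row.
first_row = "ATGCTGAGGCCTGAGATCTCATCAACCTCGCCTTCTGCCCCGGCGGTTTCCCCGTCGTCG"
print(translate(first_row))
print(codon_position(60), cdna_label(-1), cdna_label(CDS_LENGTH))
```

Keeping the offset-to-label conversion separate from the codon arithmetic means the untranslated regions (c.-N and c.*N) never reach `codon_position`, which is only defined for coding positions.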

Powered by LOVD v.3.0 Build 23
©2004-2020 Leiden University Medical Center