msh homeobox 1 (MSX1) - coding DNA reference sequence

(used for variant description)

(last modified April 8, 2014)


This file was created to facilitate the description of sequence variants in MSX1 transcript NM_002448.3, using a coding DNA reference sequence that follows the HGVS recommendations.

The sequence was taken from NC_000004.11, covering MSX1 transcript NM_002448.3.


 (upstream sequence)
           .         .         .         .         .                g.5055
      agggcccggagccggcgagtgctcccgggaactctgcctgcgcggcggcagcgac       c.-181

 .         .         .         .         .         .                g.5115
 cggaggccaggcccagcacgccggagctggcctgctggggaggggcgggaggcgcgcgcg       c.-121

 .         .         .         .         .         .                g.5175
 ggagggtccgcccggccagggccccgggcgctcgcagaggccggccgcgctcccagcccg       c.-61

 .         .         .         .         .         .                g.5235
 cccggagcccatgcccggcggctggccagtgctgcggcagaagggggggcccggctctgc       c.-1

          .         .         .         .         .         .       g.5295
 ATGGCCCCGGCTGCTGACATGACTTCTTTGCCACTCGGTGTCAAAGTGGAGGACTCCGCC       c.60
 M  A  P  A  A  D  M  T  S  L  P  L  G  V  K  V  E  D  S  A         p.20

          .         .         .         .         .         .       g.5355
 TTCGGCAAGCCGGCGGGGGGAGGCGCGGGCCAGGCCCCCAGCGCCGCCGCGGCCACGGCA       c.120
 F  G  K  P  A  G  G  G  A  G  Q  A  P  S  A  A  A  A  T  A         p.40

          .         .         .         .         .         .       g.5415
 GCCGCCATGGGCGCGGACGAGGAGGGGGCCAAGCCCAAAGTGTCCCCTTCGCTCCTGCCC       c.180
 A  A  M  G  A  D  E  E  G  A  K  P  K  V  S  P  S  L  L  P         p.60

          .         .         .         .         .         .       g.5475
 TTCAGCGTGGAGGCGCTCATGGCCGACCACAGGAAGCCGGGGGCCAAGGAGAGCGCCCTG       c.240
 F  S  V  E  A  L  M  A  D  H  R  K  P  G  A  K  E  S  A  L         p.80

          .         .         .         .         .         .       g.5535
 GCGCCCTCCGAGGGCGTGCAGGCGGCGGGTGGCTCGGCGCAGCCACTGGGCGTCCCGCCG       c.300
 A  P  S  E  G  V  Q  A  A  G  G  S  A  Q  P  L  G  V  P  P         p.100

          .         .         .         .         .         .       g.5595
 GGGTCGCTGGGAGCCCCGGACGCGCCCTCTTCGCCGCGGCCGCTCGGCCATTTCTCGGTG       c.360
 G  S  L  G  A  P  D  A  P  S  S  P  R  P  L  G  H  F  S  V         p.120

          .         .         .         .         .         .       g.5655
 GGGGGACTCCTCAAGCTGCCAGAAGATGCGCTCGTCAAAGCCGAGAGCCCCGAGAAGCCC       c.420
 G  G  L  L  K  L  P  E  D  A  L  V  K  A  E  S  P  E  K  P         p.140

          .         .         .         .          | 02        .    g.8047
 GAGAGGACCCCGTGGATGCAGAGCCCCCGCTTCTCCCCGCCGCCGGCCA | GGCGGCTGAGC    c.480
 E  R  T  P  W  M  Q  S  P  R  F  S  P  P  P  A  R |   R  L  S      p.160

          .         .         .         .         .         .       g.8107
 CCCCCAGCCTGCACCCTCCGCAAACACAAGACGAACCGTAAGCCGCGGACGCCCTTCACC       c.540
 P  P  A  C  T  L  R  K  H  K  T  N  R  K  P  R  T  P  F  T         p.180

          .         .         .         .         .         .       g.8167
 ACCGCGCAGCTGCTGGCGCTGGAGCGCAAGTTCCGCCAGAAGCAGTACCTGTCCATCGCC       c.600
 T  A  Q  L  L  A  L  E  R  K  F  R  Q  K  Q  Y  L  S  I  A         p.200

          .         .         .         .         .         .       g.8227
 GAGCGCGCGGAGTTCTCCAGCTCGCTCAGCCTCACTGAGACGCAGGTGAAGATATGGTTC       c.660
 E  R  A  E  F  S  S  S  L  S  L  T  E  T  Q  V  K  I  W  F         p.220

          .         .         .         .         .         .       g.8287
 CAGAACCGCCGCGCCAAGGCAAAGAGACTACAAGAGGCAGAGCTGGAGAAGCTGAAGATG       c.720
 Q  N  R  R  A  K  A  K  R  L  Q  E  A  E  L  E  K  L  K  M         p.240

          .         .         .         .         .         .       g.8347
 GCCGCCAAGCCCATGCTGCCACCGGCTGCCTTCGGCCTCTCCTTCCCTCTCGGCGGCCCC       c.780
 A  A  K  P  M  L  P  P  A  A  F  G  L  S  F  P  L  G  G  P         p.260

          .         .         .         .         .         .       g.8407
 GCAGCTGTAGCGGCCGCGGCGGGTGCCTCGCTCTACGGTGCCTCTGGCCCCTTCCAGCGC       c.840
 A  A  V  A  A  A  A  G  A  S  L  Y  G  A  S  G  P  F  Q  R         p.280

          .         .         .         .         .         .       g.8467
 GCCGCGCTGCCTGTGGCGCCCGTGGGACTCTACACGGCCCATGTGGGCTACAGCATGTAC       c.900
 A  A  L  P  V  A  P  V  G  L  Y  T  A  H  V  G  Y  S  M  Y         p.300

          .                                                         g.8479
 CACCTGACATAG                                                       c.912
 H  L  T  X                                                         p.303

          .         .         .         .         .         .       g.8539
 agggtcccaggtcgcccacctgtgggccagccgattcctccagccctggtgctgtacccc       c.*60

          .         .         .         .         .         .       g.8599
 cgacgtgctcccctgctcggcaccgccagccgccttccctttaaccctcacactgctcca       c.*120

          .         .         .         .         .         .       g.8659
 gtttcacctctttgctccctgagttcactctccgaagtctgatccctgccaaaaagtggc       c.*180

          .         .         .         .         .         .       g.8719
 tggaagagtcccttagtactcttctagcatttagatctacactctcgagttaaagatggg       c.*240

          .         .         .         .         .         .       g.8779
 gaaactgagggcagagaggttaacagatttatctaaggtccccagcagaattgacagttg       c.*300

          .         .         .         .         .         .       g.8839
 aacagagctagaggccatgtctcctgcatagcttttccctgtcctgacaccaggcaagaa       c.*360

          .         .         .         .         .         .       g.8899
 aagcgcagagaaatcggtgtctgacgattttggaaatgagaacaatctcaaaaaaaaaaa       c.*420

          .         .         .         .         .         .       g.8959
 aaaaaaaaaaaaaaaaaaaaaaaaagaaaagagaaaaaaaagactagccagccaggaaga       c.*480

          .         .         .         .         .         .       g.9019
 tgaatcctagcttcttccattggaaaatttaagacaagttcaacaacaaaacatttgctc       c.*540

          .         .         .         .         .         .       g.9079
 tggggggcagggaaaacacagatgtgttgcaaaggtaggttgaagggacctctctcttac       c.*600

          .         .         .         .         .         .       g.9139
 cagtaccagaaacacaattgtaaaattaaaaaaaaaaaaaaactctttctatttaacagt       c.*660

          .         .         .         .         .         .       g.9199
 acatttgtgtggctctcaaacatccctttggaagggattgtgtgtactatgtaatatact       c.*720

          .         .         .         .         .         .       g.9259
 gtatatttgaaattttattatcatttatattatagctatatttgttaaataaattaattt       c.*780

          .                                                         g.9272
 taagctacaaaaa                                                      c.*793

 (downstream sequence)

Legend:

Nucleotide numbering (following the rules of the HGVS for a 'Coding DNA Reference Sequence') is indicated at the right of the sequence, counting the A of the ATG translation initiating Methionine as 1.
Every 10th nucleotide is indicated by a "." above the sequence.

The Msh homeobox 1 protein sequence is shown below the coding DNA sequence, with numbering indicated at the right, starting with 1 for the translation initiating Methionine.
Every 10th amino acid is shown in bold.

The position of the intron is indicated by a vertical line, splitting the two exons.
The start of the first exon (transcription initiation site) is indicated by a '\', the end of the last exon (poly-A addition site) by a '/'.
The exon number is indicated above the first nucleotide(s) of the exon.

To aid the description of frameshift variants, all stop codons in the +1 frame are shown in bold, while all stop codons in the +2 frame are underlined.
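
The numbering rules described in the legend can be illustrated with a short Python sketch that maps a genomic (g.) position in this file to its coding DNA (c.) position. This is only a minimal illustration and not part of LOVD: the function name g_to_c and the constant names are hypothetical, and the anchor positions are read off the right-hand labels of the listing above (c.-1 at g.5235, the exon 1 / exon 2 boundary after c.469, c.480 at g.8047, c.912 at g.8479, c.*793 at g.9272). It assumes the displayed sequence is the full range of interest and that intronic positions are written relative to the nearest exon boundary (c.469+n or c.470-n).

```python
# Anchor positions derived from the labels in this file (assumptions of this sketch).
FIRST_G = 5001        # first displayed nucleotide (c.-235)
ATG_G = 5236          # A of the ATG start codon (c.1); c.-1 is g.5235
EXON1_END_G = 5704    # last nucleotide of exon 1 (c.469)
EXON2_START_G = 8037  # first nucleotide of exon 2 (c.470)
STOP_END_G = 8479     # last nucleotide of the stop codon (c.912)
LAST_G = 9272         # last displayed nucleotide (c.*793)
EXON1_END_C = 469
EXON2_START_C = 470


def g_to_c(g: int) -> str:
    """Convert a g. position from this file into a c. description."""
    if g < FIRST_G or g > LAST_G:
        raise ValueError(f"g.{g} lies outside the displayed sequence")
    if g < ATG_G:
        # 5' of the ATG: negative numbers, and there is no c.0
        return f"c.{g - ATG_G}"
    if g <= EXON1_END_G:
        # coding part of exon 1
        return f"c.{g - ATG_G + 1}"
    if g < EXON2_START_G:
        # intron: count from the nearest exon boundary
        from_donor = g - EXON1_END_G
        to_acceptor = EXON2_START_G - g
        if from_donor <= to_acceptor:
            return f"c.{EXON1_END_C}+{from_donor}"
        return f"c.{EXON2_START_C}-{to_acceptor}"
    if g <= STOP_END_G:
        # coding part of exon 2, including the stop codon
        return f"c.{EXON2_START_C + (g - EXON2_START_G)}"
    # 3' of the stop codon: numbered with a leading asterisk
    return f"c.*{g - STOP_END_G}"


if __name__ == "__main__":
    for g in (5055, 5235, 5236, 5705, 8036, 8047, 8479, 8539):
        print(f"g.{g} = {g_to_c(g)}")
```

The values produced for the labelled positions (for example g.5295, g.8047 and g.8539) can be checked directly against the g. and c. labels printed at the right of the sequence lines.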

Powered by LOVD v.3.0 Build 09
©2004-2014 Leiden University Medical Center