RemedySpot.com

Whole-Genome Sequencing in a Patient with Charcot–Marie–Tooth Neuropathy


Guest guest

http://content.nejm.org/cgi/content/full/NEJMoa0908094

ABSTRACT

Background Whole-genome sequencing may revolutionize medical diagnostics through

rapid identification of alleles that cause disease. However, even in cases with

simple patterns of inheritance and unambiguous diagnoses, the relationship

between disease phenotypes and their corresponding genetic changes can be

complicated. Comprehensive diagnostic assays must therefore identify all

possible DNA changes in each haplotype and determine which are responsible for

the underlying disorder. The high number of rare, heterogeneous mutations

present in all humans and the paucity of known functional variants in more than

90% of annotated genes make this challenge particularly difficult. Thus, the

identification of the molecular basis of a genetic disease by means of

whole-genome sequencing has remained elusive. We therefore aimed to assess the

usefulness of human whole-genome sequencing for genetic diagnosis in a patient

with Charcot–Marie–Tooth disease.

Methods We identified a family with a recessive form of Charcot–Marie–Tooth

disease for which the genetic basis had not been identified. We sequenced the

whole genome of the proband, identified all potential functional variants in

genes likely to be related to the disease, and genotyped these variants in the

affected family members.

Results We identified and validated compound, heterozygous, causative alleles in

SH3TC2 (the SH3 domain and tetratricopeptide repeats 2 gene), involving two

mutations, in the proband and in family members affected by

Charcot–Marie–Tooth disease. Separate subclinical phenotypes segregated

independently with each of the two mutations; heterozygous mutations confer

susceptibility to neuropathy, including the carpal tunnel syndrome.

Conclusions As shown in this study of a family with Charcot–Marie–Tooth

disease, whole-genome sequencing can identify clinically relevant variants and

provide diagnostic information to inform the care of patients.

--------------------------------------------------------------------------------

The practice of medical genetics requires gene-specific analyses of DNA

sequences and mutations to definitively diagnose disease, provide prognostic

information, and guide genetic counseling regarding the risk of recurrence.

Studies of autosomal recessive traits such as cystic fibrosis1 and some dominant

traits such as neurofibromatosis type 12 revealed the role of single "disease

genes" in conveying traits. However, many phenotypes of mendelian diseases (see

the Glossary) are genetically heterogeneous: causative mutations have been

identified in more than 100 genes for deafness and retinitis pigmentosa, for

instance. Moreover, specific mutations may confer phenotypes that segregate as

dominant, recessive, or even digenic3 or triallelic4 traits. There is also ample

evidence of modifying loci in mendelian disorders.5,6 Thus, even when there are

simple patterns of inheritance in syndromes with a well-characterized pathologic

course, the underlying mutational events, which need to be resolved for precise

molecular diagnosis, within individual families may be complex.

Charcot–Marie–Tooth disease is an inherited peripheral neuropathy with two

forms: a demyelinating form (type 1) affecting the glia-derived myelin and an

axonal form (type 2) affecting the nerve axon. The two forms can be

distinguished by means of electrophysiological or neuropathological studies.

Charcot–Marie–Tooth disease has been used as a model disease to describe

genetic heterogeneity, posit the relation of hereditary pattern to clinical

severity, and investigate the relative importance of principal and modifying

genes in determining human diseases.7,8 Mutant alleles underlying

Charcot–Marie–Tooth disease can segregate in an autosomal dominant, recessive,

or X-linked manner (Figure 1). Both single-base variants (single-nucleotide

polymorphisms [SNPs]) and copy-number variants,10 at 39 separate loci, confer

susceptibility to Charcot–Marie–Tooth disease. Most of these susceptibility

variants cause dominant forms of the disease, although mutations in genes at 14

of the loci cause recessive disease.

Figure 1. Charcot–Marie–Tooth (CMT) Disease Phenotypes, Their Genetic Forms

of Inheritance, and Their Mapped Genes and Loci.

CMT is divided into two major phenotypic types, glial myelinopathy (CMT type 1)

and neuronal axonopathy (CMT type 2), according to electrophysiological,

clinical, and nerve-biopsy evaluations. Each type can be inherited in a

dominant, recessive, or X-linked fashion. There are also autosomal dominant

intermediate forms of CMT that can have features of both axonal and

demyelinating neuropathies. Several genes have been associated with CMT disease

to date, and other loci have been associated and mapped but their genes not yet

identified. MPZ, GDAP1, and GJB1 are known to be associated with CMT type 1, but

select mutations in these genes can also cause CMT type 2; NEFL is known to be

associated with CMT type 2, but select mutations convey a CMT type 1 phenotype.

Dominant intermediate forms of CMT have been reported to be associated with MPZ

mutations. Specific recessive alleles related to CMT have also been reported for

EGR2 and PMP22. Of the 31 genes in 39 known CMT loci, only 15 genes are

currently available for clinical testing. Current evidence-based clinical

guidelines for distal symmetric polyneuropathy recommend genetic testing

consisting of screening for common mutations, including the CMT1A duplication

copy-number variant and point mutations of the X-linked GJB1 gene.9

Adult-onset Charcot–Marie–Tooth disease is highly variable in presentation but

is characterized by distal symmetric polyneuropathy,9 with slowly progressive

distal muscle weakness and atrophy (particularly peroneal muscular atrophy)

resulting in foot dorsiflexor weakness, foot drop, and secondary steppage gait.

Pes cavus (highly arched feet) or pes planus (flat feet) occurs in most

patients.

We applied next-generation-sequencing methods to identify the cause of disease

in a family with inherited neuropathy that had been previously screened, with

negative results, for alterations of some common Charcot–Marie–Tooth genes,

including PMP22,11 MPZ, PRX, GDAP1, and EGR2.

Methods

Study Participants

The study family consisted of four affected siblings, four unaffected siblings,

and an unaffected mother and father, all of whom provided written informed

consent for participation in the study. The study was approved by the

institutional review board at Baylor College of Medicine. The diagnosis of

Charcot–Marie–Tooth type 1 disease in the proband and the three affected

siblings was based on the results of physical examination (distal muscle

weakness and wasting, pes cavus, and absence of deep-tendon reflexes) and

electrophysiological studies.

Neurophysiological Assessments

Neurophysiological studies consisted of a standard battery of nerve-conduction

studies, including motor responses of the median, ulnar, tibial, and peroneal

nerves with F-wave latencies; orthodromic median-, ulnar-, and sural-nerve

sensory potentials; and bilateral tibial H-reflexes. When these studies revealed

demyelinating features, tests of blink reflexes were generally performed. Limbs

were warmed to a temperature of at least 32°C in all instances. Demyelination

was judged to be present if conduction velocities were significantly slowed and

the late-response latencies were substantially delayed. Median-nerve

mononeuropathy at the wrist was judged to be present when there was prolonged

motor terminal latency or slowed median-nerve sensory velocity with

disproportionate slowing in the palm-to-wrist segment, or both. The four

affected subjects, all of whom had diffuse slowing of conduction, were also

thought to have a median-nerve mononeuropathy at the wrist, since the

median-nerve motor terminal latency was much more prolonged than the ulnar-nerve

motor terminal latency (14.9 vs. 8.1, 10.2 vs. 7.5, 11.6 vs. 6.2, and 9.2 vs.

6.2 msec) (Table 1).

Table 1. Neurophysiological Findings in the Study Family.

DNA Sequence Analysis

DNA sequencing was performed with the use of the SOLiD (Sequencing by

Oligonucleotide Ligation and Detection) system (Applied Biosystems), a

next-generation-sequencing platform that involves ligation-based sequencing and

a two-base encoding method in which four fluorescent dyes are used to tag

various combinations of dinucleotides. Its accuracy in sequencing 50-base reads

is estimated at approximately 99.94%.12 Multiple sequences can be read

simultaneously, and when the sequence reads overlap, the overall accuracy

increases further, reducing the risk of false positive determinations and the

need for additional data validation. We determined bases from the primary

sequencing data, using the standard SOLiD analysis software. (For details, see

the Supplementary Appendix, available with the full text of this article at

NEJM.org.)

Array-Based Comparative Genomic Hybridization

For array-based comparative genomic hybridization and analysis of the

copy-number variants in the proband as compared with those in a male control, we

used a 1-million-probe high-resolution oligonucleotide whole-genome array

(Agilent), a 2.1-million-oligonucleotide whole-genome array (NimbleGen), and a

44,000-oligonucleotide array (Agilent) that was custom-designed to assay genes

previously implicated in inherited neuropathy. Analysis of the copy-number

variants was performed according to the manufacturer's instructions and

software.

Bioinformatic Analysis of SNP Variants

Analysis of SNP variants and cross-referencing of them with the Human Gene

Mutation Database (www.hgmd.cf.ac.uk), the Online Mendelian Inheritance in Man

database (www.ncbi.nlm.nih.gov/omim), and the PolyPhen database

(http://genetics.bwh.harvard.edu/pph/data, based on the National Center for

Biotechnology Information [NCBI] dbSNP, build 126) were performed with the use

of Perl scripts. Alignment of the orthologous SH3TC2 (SH3 domain and

tetratricopeptide repeats 2) proteins was performed with the use of the ClustalW

program and reference SH3TC2 proteins from the following organisms: human

(accession number, NP_078853 [GenBank] ), chimpanzee (XP_527069), macaque

(XP_001104761), dog (XP_546315), horse (XP_001501607), cow (XP_616288), mouse

(NP_766216 [GenBank] ), rat (XP_225887), opossum (XP_001380773), and chicken

(XP_424256).
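
The cross-referencing step described above can be illustrated with a minimal sketch. The article states that Perl scripts were used; Python is substituted here purely for illustration. The `Variant` type, the `cross_reference` helper, and the contents of the lookup table are hypothetical and do not reflect the actual schema of any of the databases named above. The two coordinates are the ones reported in the article.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """A single-base substitution keyed by position and alleles (hypothetical layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def cross_reference(variants, database):
    """Return {variant: annotation} for every variant found in the lookup table."""
    return {v: database[v] for v in variants if v in database}


# Hypothetical lookup table standing in for a disease-mutation database.
known = {Variant("5", 148386628, "G", "A"): "SH3TC2 R954X"}

# The two SH3TC2 substitutions reported in the article (build 36.2 coordinates).
calls = [
    Variant("5", 148386628, "G", "A"),
    Variant("5", 148402474, "A", "G"),
]

hits = cross_reference(calls, known)
novel = [v for v in calls if v not in known]
```

Keying on chromosome, position, and both alleles avoids matching a different substitution at the same site; a real pipeline would also need to reconcile genome builds and strand conventions before comparing.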

Segregation Analysis

Exons 5 and 11 of the SH3TC2 gene were amplified by means of a

polymerase-chain-reaction (PCR) assay and directly sequenced in all members of

the study family. To verify the Arg954ter amino acid mutation (R954X),

corresponding to a G→A mutation in the genomic DNA in exon 11 of SH3TC2 on

chromosome 5 at nucleotide 148,386,628, we also generated a 312-bp PCR fragment

and incubated it with the restriction enzyme TaqI; the nucleotide mutation

results in elimination of the restriction site for TaqI.
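
The logic of the restriction-digest check can be sketched as follows. TaqI recognizes the sequence TCGA and cuts after the first T. The 20-base sequence below is made up for illustration and is not the 312-bp PCR fragment from the study, and the exact position of the substitution within the recognition site is assumed.

```python
TAQI_SITE = "TCGA"  # TaqI recognition sequence; the enzyme cuts T^CGA


def find_sites(seq, site=TAQI_SITE):
    """Return the 0-based start index of every occurrence of the site."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def fragment_lengths(seq, site=TAQI_SITE, cut_offset=1):
    """Return fragment lengths after complete digestion of a linear sequence."""
    cuts = [i + cut_offset for i in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Made-up sequence with a single TaqI site (not the real amplicon).
wild_type = "GGATCCAATCGAGGCTTAAC"
# Assumed G->A change inside the site: TCGA becomes TCAA and is no longer cut.
mutant = wild_type.replace("TCGA", "TCAA")

wt_bands = fragment_lengths(wild_type)   # two fragments
mut_bands = fragment_lengths(mutant)     # one uncut fragment
het_bands = sorted(set(wt_bands + mut_bands))  # heterozygote shows all three
```

A heterozygous sample contains both templates, which is why three bands are expected on the gel: the uncut product plus the two cleavage fragments.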

Results

Nerve-Conduction Studies

In addition to the Charcot–Marie–Tooth type 1 phenotype that segregates as a

recessive trait, we identified through electrophysiological means an axonal

neuropathy in one parent and one grandparent of the proband. A further subtle

phenotype, marked at a minimum by median-nerve mononeuropathy at

the wrist, was also observed among all the proband's grandparents and both

parents but had an unclear pattern of inheritance. Its variable presentation

(Table 1) included three neurophysiologically defined phenotypes: a normal

phenotype with superimposed severe median-nerve mononeuropathy at the wrist,

thought to be an incidental finding in an 80-year-old man who had been a

carpenter for more than 50 years (Subject I-1), a mild median-nerve

mononeuropathy at the wrist (the proband's maternal grandmother and mother

[Subject II-1]), and a more severe median-nerve mononeuropathy at the wrist

associated with evidence of a more widespread axonal polyneuropathy (Subjects

I-2 and II-2). The latter phenotype is similar to that of patients with

hereditary neuropathy with liability to pressure palsies (Online Mendelian

Inheritance in Man number, 162500 [OMIM] ), a disorder pathologically

characterized by patchy myelin abnormalities and attributed to

haploinsufficiency of PMP22 (as a consequence of genomic deletion)13;

duplication of PMP22 causes Charcot–Marie–Tooth type 1A disease, the most

common form.14

Genome Variation

The sequencing of DNA samples obtained from the proband produced a mappable

yield of 89.6 Gb of sequence data, representing an average depth of coverage of

approximately 30 times per base. The data from sequential machine runs consisted

of 8.3 Gb of 35-bp fragment sequence reads (one run), 30.3 Gb of 25-bp mate-pair

sequence reads (two runs), and 51.0 Gb of 50-bp mate-pair sequence reads (one

run).
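
As a rough check on these figures, the average depth of coverage can be approximated as mappable yield divided by genome size. The run yields below are taken from the text; the haploid genome size of 3.1 Gb is an assumption introduced for this sketch, not a value given in the article.

```python
# Mappable yield per run type, in gigabases (values from the text).
runs_gb = {
    "35-bp fragment reads": 8.3,
    "25-bp mate-pair reads": 30.3,
    "50-bp mate-pair reads": 51.0,
}

GENOME_SIZE_GB = 3.1  # assumed haploid reference size; not stated in the article

total_gb = sum(runs_gb.values())
average_depth = total_gb / GENOME_SIZE_GB

print(f"total mappable yield: {total_gb:.1f} Gb")
print(f"approximate average depth: {average_depth:.1f}x")
```

Under that assumption the three run totals sum to the reported 89.6 Gb and the depth comes out slightly below 30, in line with the "approximately 30 times per base" quoted above.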

We identified the differences between the consensus sequence of the proband and

the human genome reference sequence. These were used to produce a list of

putative single-base DNA substitutions, small insertions, and deletions and

potential changes in DNA copy number. This list of variants included 3,420,306

SNPs. A total of 2,255,102 of the SNPs were in extragenic regions and 1,165,204

SNPs were within gene regions, including introns, promoters, 3' and 5'

untranslated regions, and splice sites (Table 2). Of the intragenic SNPs, 9069

were nonredundant SNPs predicted to result in nonsynonymous codon changes, and

121 of the 9069 were nonsense mutations. The approximately 3.4 million SNPs

identified represent about 0.1% of the reference haploid human genome,15 and

both the total number of SNPs and the number of novel SNPs are similar to those

discovered in other diploid genome sequences for individual subjects (Table

3).12,16,17,18,19,20,21 Of the more than 3.4 million SNPs, 2,858,587 were

present in public databases and 561,719 were novel (Table 3). Data on the

sequence reads, quality, and mapping have been deposited in the NCBI Sequence

Read Archive (www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?) (accession number,

SRP001734); variant data have been deposited in the dbSNP database.
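
The counts in this paragraph can be checked against one another with simple arithmetic. All SNP counts below are taken from the text; the 3.1-Gb genome size is an assumption used only to reproduce the "about 0.1%" figure.

```python
# Counts reported in the text.
total_snps = 3_420_306
extragenic, intragenic = 2_255_102, 1_165_204
in_databases, novel = 2_858_587, 561_719
nonsynonymous, nonsense = 9069, 121

GENOME_SIZE_BP = 3_100_000_000  # assumed haploid genome size

# Both partitions should account for every SNP.
assert extragenic + intragenic == total_snps
assert in_databases + novel == total_snps

novel_fraction = novel / total_snps
nonsense_fraction = nonsense / nonsynonymous
genome_fraction = total_snps / GENOME_SIZE_BP

print(f"novel SNPs: {novel_fraction:.1%}")
print(f"nonsense among nonsynonymous: {nonsense_fraction:.2%}")
print(f"SNPs as share of genome: {genome_fraction:.2%}")
```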

Table 2. SNPs Identified through Whole-Genome Sequencing of DNA from the

Proband.

Table 3. Individual Human Genomes Sequenced to Date.

We used two approaches to identifying copy-number variation: array-based

comparative genomic hybridization and mate-pair sequencing. We identified 234

copy-number variants ranging in size from 1690 bp to 1,627,813 bp. Of these 234

variants, 132 were confirmed by at least one other method (Table 1 in the

Supplementary Appendix); 220 of the 234 (94%) overlap with reported regions of

copy-number variants in the Database of Genomic Variants

(http://projects.tcag.ca/variation). We found no copy-number variants affecting

genes known to be involved in Charcot–Marie–Tooth disease or other

neuropathies.

We cross-referenced the nonsynonymous SNPs that we detected by using

whole-genome sequencing with a database of previously observed mutations

implicated in human disease (the Human Gene Mutation Database) (Table 4, and

Table 2 in the Supplementary Appendix). Of the 174 nonsynonymous database SNPs

identified in the proband, 159 had a clear association with a heritable trait

(i.e., the database entry was not annotated with a question mark). Of these, 21

(13%) were described as causing mendelian disease; 16 were heterozygous in the

proband, a finding that is consistent with the expected load of autosomal

recessive mutations. The other five SNPs might have been erroneously assigned as

disease mutations, which would explain why four of them were homozygous in the

proband and have been found to be homozygous in unaffected persons. It would

also explain why the sequence for the proband, who did not have

adrenoleukodystrophy, contained a SNP in ABCD1 previously described as a

dominant mutation that causes the X-linked disorder adrenoleukodystrophy.22 An

alternative to the interpretation that the five SNPs might have been erroneously

assigned as disease mutations is that these alleles might have reduced

penetrance.

Table 4. Disease and Trait Associations of Nonsynonymous SNPs Identified in

the Proband, According to the Human Gene Mutation Database.

We examined the putative mutations in 40 genes known to cause or be linked to

neuropathic or related conditions (Table 3 in the Supplementary Appendix). This

exercise led to closer examination of 3148 putative SNPs, including 54 coding

SNPs. Of these 54, 2 were at the SH3TC2 locus: 1 missense mutation (identified

at 7.7 average depth coverage) and 1 nonsense mutation (identified at 29.9

average depth coverage) (Fig. 1 in the Supplementary Appendix). Mutations in

this locus have previously been found to be associated with

Charcot–Marie–Tooth type 4C disease, described in families of eastern

European, Turkish or Spanish Gypsy origin.23,24,25 The R954X nonsense mutation

has previously been implicated in Charcot–Marie–Tooth disease; the missense

mutation (A→G, occurring on chromosome 5 at nucleotide 148,402,474 and

corresponding to the amino acid mutation Tyr169His [Y169H]) is novel.
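
The narrowing from all putative SNPs in the candidate genes to the two SH3TC2 coding changes can be pictured as a filtering funnel. The sketch below is illustrative only: the `Call` record, the effect labels, the three-gene panel, and every call other than the two SH3TC2 variants (with their reported depths of 7.7 and 29.9) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One putative SNP call (hypothetical record layout)."""
    gene: str
    effect: str   # e.g. "missense", "nonsense", "synonymous", "intronic"
    depth: float  # average depth of coverage at the site


CODING = {"synonymous", "missense", "nonsense"}
PROTEIN_ALTERING = {"missense", "nonsense"}


def funnel(calls, panel):
    """Filter calls to panel genes, then coding SNPs, then protein-altering SNPs."""
    in_panel = [c for c in calls if c.gene in panel]
    coding = [c for c in in_panel if c.effect in CODING]
    altering = [c for c in coding if c.effect in PROTEIN_ALTERING]
    return in_panel, coding, altering


# A small stand-in for the 40-gene neuropathy panel.
panel = {"SH3TC2", "PMP22", "MPZ"}

calls = [
    Call("SH3TC2", "missense", 7.7),     # Y169H, depth from the text
    Call("SH3TC2", "nonsense", 29.9),    # R954X, depth from the text
    Call("SH3TC2", "intronic", 31.0),    # hypothetical
    Call("PMP22", "synonymous", 28.0),   # hypothetical
    Call("TTN", "missense", 30.0),       # hypothetical, outside the panel
]

in_panel, coding, altering = funnel(calls, panel)
```

Keeping each stage of the funnel as a separate list makes it easy to report counts at every step, as the article does for its 3148 putative SNPs and 54 coding SNPs.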

Correlation between Genotype and Phenotype

Segregation analyses verified independent maternal and paternal origins of the

mutations (Figure 2). The nonsense mutation (R954X) appeared in one parent of

the proband and in two siblings who did not have Charcot–Marie–Tooth type 1

disease. The missense mutation (Y169H) was found in one parent and one

grandparent, neither of whom had Charcot–Marie–Tooth disease. Only the proband

(Subject III-4) and three of his siblings (Subjects III-2, III-6, and III-8) who

had inherited both mutant alleles had the Charcot–Marie–Tooth type 1 phenotype

(Figure 2).

Figure 2. Pedigree of the Study Family and Segregation and Conservation of

SH3TC2 Mutations.

Panel A shows the pedigree of the proband (arrow) and his family and their

SH3TC2 genotypes: plus signs indicate the wild-type allele; Y169H indicates the

A→G mutation on chromosome 5 at nucleotide 148,402,474 and corresponding to the

amino acid missense mutation Tyr169His, and R954X indicates the G→A mutation in

the genomic DNA in exon 11 of SH3TC2 on chromosome 5 at nucleotide 148,386,628,

leading to the amino acid nonsense mutation Arg954ter. (Genomic coordinates for

the mutations in the proband are based on the human genome reference sequence,

build 36.2.) Squares indicate male subjects, and circles female subjects;

slashes indicate deceased subjects. Subjects in generations I and II had three

phenotypes. The paternal grandfather (Subject I-1) was studied 20 years ago, at

80 years of age, and had normal results, with the sole exception of a

median-nerve mononeuropathy at the wrist, thought to be caused by his occupation

as a carpenter. The paternal grandmother (Subject I-2, 77 years of age at the

time of evaluation) and the father (Subject II-2) had evidence of a patchy

axonal polyneuropathy, with definite median-nerve mononeuropathy at the wrist.

The maternal grandmother (evaluated at 90 years of age; data not shown) and the

mother (Subject II-1) had normal findings except for very mild median-nerve

mononeuropathy at the wrist. Two of the proband's sisters (Subjects III-3 and

III-7) had this same phenotype. Two members of this generation had completely

normal findings (Subjects III-1 and III-5). The other four siblings had diffuse,

disproportionate conduction slowing in the distal median nerve, without evidence

of conduction block, findings that are suggestive of a superimposed median

mononeuropathy at the wrist. Subjects III-2, III-4, III-6, and III-8 had

Charcot–Marie–Tooth type 1 (CMT1) disease. Panel B shows the results of TaqI

restriction digestion of the SH3TC2 exon 11 polymerase-chain-reaction product on

which the G→A mutation, corresponding to the R954X allele, occurs. This mutation

was present in the proband's mother and six of the eight siblings, as well as in

the maternal grandmother (not shown). The mutation destroys the restriction site

for TaqI; the wild type yields two small bands and the heterozygous mutant

yields three bands, the upper of which is the uncut DNA. Panel C shows sequence

alignment of the SH3TC2 protein among various species. The downward arrowhead

indicates the location of the highly conserved Tyr169 amino acid that, in

persons with the novel missense mutation Y169H, is changed to His. Sequences

were obtained from the National Center for Biotechnology Information.

The subjects with the heterozygous missense mutation (Y169H) (Subjects I-2 and

II-2) (Figure 2) also had the apparently dominant axonal neuropathy phenotype,

as detected by electrophysiological studies. These findings of axonal neuropathy

(Table 1) suggest a gain of function (i.e., a toxic effect) of this mutation. In

contrast, the presumed loss-of-function nonsense variant (R954X) was associated

with electrophysiological evidence of the carpal tunnel syndrome, regardless of

whether it was the sole mutation present (i.e., heterozygous genotype) or was

accompanied by the missense variant (Y169H) (i.e., compound heterozygous

genotype) (Table 1 and Figure 2).
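
The genotype-phenotype correlation reported in this section can be summarized as a small rule table. This is a simplified sketch of the associations described in the text, not a clinical classifier; the function name and phenotype labels are hypothetical, and only subjects whose genotypes are stated explicitly in the article are included.

```python
def expected_phenotype(alleles):
    """Map a set of SH3TC2 mutant alleles to the phenotype reported in the text."""
    alleles = frozenset(alleles)
    has_missense = "Y169H" in alleles
    has_nonsense = "R954X" in alleles
    if has_missense and has_nonsense:
        return "CMT type 1 (compound heterozygote)"
    if has_missense:
        return "axonal neuropathy"
    if has_nonsense:
        return "median-nerve mononeuropathy at the wrist"
    return "no SH3TC2-associated finding"


# Genotypes stated in the article for these subjects.
family = {
    "I-2": {"Y169H"},
    "II-2": {"Y169H"},
    "II-1": {"R954X"},
    "III-2": {"Y169H", "R954X"},
    "III-4": {"Y169H", "R954X"},  # proband
    "III-6": {"Y169H", "R954X"},
    "III-8": {"Y169H", "R954X"},
}

for subject, alleles in family.items():
    print(subject, "->", expected_phenotype(alleles))
```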

Discussion

We ascertained the molecular basis of an inherited disease by using

next-generation-sequencing methods. We chose whole-genome sequencing over

targeted, exon-capture approaches26,27 because we did not know whether the

"causative" mutations would reside in known coding elements, and targeted

approaches are ill suited to capturing copy-number variants. The heterogeneity

of our sequence data is emblematic of the current rapid progress of sequencing

technology: over the 6-month course of this study, sequence read lengths doubled

(from 25 bp to 50 bp), the density of samples on the sequencing slide increased,

and mapping technology improved. Overall, the sequence yield increased by a

factor of three, with no appreciable increase in expense. This rapid pace of

technological improvement makes it difficult to accurately determine the expense

of repeating this experiment, but given that the expense of sequencing reagents

for a single run on the SOLiD instrument was $25,000 in April 2009, we estimate

that the entire effort would currently cost less than $50,000.

The whole-genome sequencing approach used in this proband enabled us to identify

the cause of his disease as compound heterozygous mutations in the SH3TC2 gene

and thus to delineate the specific biologic basis of disease in his family. The

SH3TC2 protein contains both SH3 and TPR motifs; SH3 motifs mediate the assembly

of protein complexes binding to proline-rich proteins, and TPR motifs are

involved in protein–protein interactions.

The mouse orthologue of SH3TC2 is specifically expressed in Schwann cells, and

the SH3TC2 protein localizes to the plasma membrane and to the perinuclear

endocytic recycling compartment, which is consistent with a role in myelination

or in axon–glia interactions.28 Mice lacking Sh3tc2 have abnormal organization

of the node of Ranvier.28 Consistent with a role of SH3TC2 in endocytic

processes29 is the finding that SH3TC2 mutations result in disruption of the

endocytic and membrane recycling pathways.30

We observed that both of the SH3TC2 mutations, when heterozygous, have

phenotypic consequences that can be detected by electrophysiological means. The

Y169H missense variant segregates with an axonal neuropathy, whereas the

nonsense R954X mutation is associated with subclinical evidence of the carpal

tunnel syndrome; therefore, haploinsufficiency of SH3TC2 may cause

susceptibility to the carpal tunnel syndrome. This susceptibility may also

result from mutations in other genes related to Charcot–Marie–Tooth disease in

addition to PMP22 and SH3TC2. Whole-genome sequencing of other members of the

proband's family might help clarify whether the additional 69 SNPs at the SH3TC2

locus and 3146 SNPs at the other 39 neuropathy-associated gene loci examined

(including many rare variants, 466 of which have not previously been described

[Table 3 in the Supplementary Appendix]) can modify the highly penetrant Y169H

and R954X mutations and thereby influence the neuropathy phenotype.

The whole-genome sequencing approach that we describe here contrasts with other

diagnostic approaches. A clinical-testing panel that screens for a copy-number

variant that commonly causes Charcot–Marie–Tooth disease14 and

nucleotide-sequence variants in 15 of the genes known to be mutated in patients

with the disease can cost more than $15,000.31 Mutations in two or more genes

related to Charcot–Marie–Tooth disease have been described as causing a

phenotype more severe than that of our proband or other patients affected by the

disease.32,33,34 Such groups of mutations include a combination of two SNPs at

the ABCD1 locus and a copy-number variant affecting PMP22, as well as the

combination of a SNP and a copy-number variant at the same locus.35,36 There is

also a report of mutations in two genes related to Charcot–Marie–Tooth disease

segregating in the same family as either a recessive trait or a sporadic trait,

the latter of which was attributed to a de novo copy-number variant.37 Given

this locus heterogeneity, with evidence of a mutational load that has clinical

consequences, as well as the ease of use and accuracy of the whole-genome

sequencing methods we applied, clinical and genetics experts struggling to

explain poorly understood high-penetrance genetic diseases can now seriously

consider this approach for illuminating the molecular causes of these diseases.

The approach may ultimately contribute to the care of patients and families

living with such diseases.

Our results suggest that haploinsufficiency of SH3TC2 confers predisposition to

a mild polyneuropathy with particular susceptibility to the carpal tunnel

syndrome. More generally, they demonstrate the diagnostic power of whole-genome

sequencing in the context of genetically heterogeneous mendelian disease and

inform efforts to decipher the genetic bases of complex traits. As new, rare

alleles at other gene loci are implicated in conditions such as diabetes,

obesity, heart disease, and cancer and as the patterns of interaction of the

alleles with a patient's phenotype are delineated, genetic susceptibility to

such diseases may become clearer. As a practical matter, the identification of

rare, heterogeneous alleles by means of whole-genome sequencing may be the only

way to definitively determine genetic contributions to the associated clinical

phenotypes.

Glossary

Array-based comparative genomic hybridization: A hybridization method for

detecting copy-number variations in DNA samples from a patient as compared with

a control sample. The method provides higher resolution than cytogenetic methods

but lower resolution than sequencing methods.

Average depth of coverage: The average number of times each base in the genome

was sequenced, as a function of the distribution and number of sequence reads

that map to the reference genome.

Coding single-nucleotide polymorphisms: Single DNA-base changes that occur in

the coding regions of genes.

Copy-number variation: DNA changes that involve sequences of more than 100 bp,

larger than single-nucleotide changes or microsatellites, and that vary in their

number of copies among individual persons. These variants can be benign and

polymorphic, but some can cause disease.

DNA template: An individual fragment of DNA that is available for sequencing.

Exon capture: Methods for isolating and sequencing gene exons, to the exclusion

of the remainder of the genome. The DNA templates from exons are "captured" with

the use of probes complementary to the targeted exon sequences. After capture,

the targeted DNA is eluted and sequenced. The cost of exon capture can be 10 to

50% lower than that of whole-genome sequencing, although the method is

insensitive to copy-number variations and mutations that are outside the

targeted regions.

Fragment-sequence read: The contiguous nucleotide sequence from one end of a DNA

template (as opposed to a mate-pair read).

Haploinsufficiency: The state that occurs when a diploid organism has only a

single functional copy of a gene, which does not produce enough protein to

support normal function.

Mapping: The computational process of identifying the specific region of a

reference genome from which an individual sequenced DNA template originated.

Mappable yield: The number of bases generated by a DNA-sequencing instrument

that can be mapped to the reference genome.

Mate-pair sequencing: A sequencing strategy that permits the inference of

structural changes in a genome by sequencing at both the 5' and 3' ends of each

DNA template (as opposed to the fragment-sequencing approach).

Mendelian disease: Human disease caused by mutations in a single gene.

Missense mutations: Single DNA-base changes that occur in the coding regions of

genes and alter the resulting encoded amino acid sequence.

Next-generation sequencing: DNA-sequencing methods that involve chemical assays

other than the traditional Sanger dideoxy-chain-termination method.

Next-generation-sequencing methods produce much larger quantities of data at

less expense, but the individual raw sequence reads that are generated from

individual amplified DNA-template sequences are shorter and have lower quality.

Nonsense mutations: DNA-base changes that introduce termination codons in the

coding sequences of genes, resulting in truncated proteins.

Sequence read: The sequence generated from a single DNA template.

Single-base error rate: The total number of mismatched bases found in mapped

sequence reads from a sequencing run, divided by the mappable yield. This rate

estimates the probability that any given mappable base is an error.

Two-base encoding: A method used in the SOLiD (Sequencing by Oligonucleotide

Ligation and Detection) DNA-sequencing platform that represents a DNA sequence

as a chain of overlapping dimers encoded as single-base "colors." This allows

for sequencing of the 16 unique sequence dimers with the use of only four unique

dye colors and provides a method for improving the overall accuracy of the

sequence reads (reducing the single-base error rate).
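
The two-base encoding entry above can be made concrete with a short sketch. It uses the commonly described scheme in which each base is assigned a 2-bit value and the color of a dinucleotide is the XOR of its two bases, so that 16 dimers map onto four colors. The numeric labels 0 to 3 stand in for the four dyes; the function names are hypothetical, and nothing here is taken from the SOLiD analysis software.

```python
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_BASE = {bits: base for base, bits in BASE_BITS.items()}


def encode(seq):
    """Return (first_base, colors), one color per overlapping dinucleotide."""
    colors = [BASE_BITS[a] ^ BASE_BITS[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence from the known first base and the color chain."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_BASE[BASE_BITS[bases[-1]] ^ color])
    return "".join(bases)


reference = "ATCGA"
first, ref_colors = encode(reference)

# A true single-base substitution (C->T at index 2) changes two adjacent colors.
_, snp_colors = encode("ATTGA")
changed = [i for i, (r, s) in enumerate(zip(ref_colors, snp_colors)) if r != s]

# A single miscalled color, by contrast, alters every downstream base on decoding.
error_colors = list(ref_colors)
error_colors[1] ^= 1
miscalled = decode(first, error_colors)
```

Because a genuine substitution always changes two adjacent colors while an isolated color change does not correspond to any single-base substitution, isolated color mismatches against the reference can be flagged as likely measurement errors, which is the basis for the accuracy gain mentioned in the definition.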

Supported in part by grants from the National Human Genome Research Institute (5

U54 HG003273, to Dr. Gibbs) and the National Institute of Neurological Disorders

and Stroke (R01 NS058529, to Dr. Lupski).

Disclosure forms provided by the authors are available with the full text of

this article at NEJM.org.

We thank McKernan, , Francisco de la Vega, Quynh Doan, and

Fiona Hyland for extensive discussion and support and Cristian Coarfa for

structural-variation analysis and insights.
