Assembly of the Working Draft of the Human Genome with GigAssembler

  1. W. James Kent (1,3) and
  2. David Haussler (2)
  1. (1) Department of Biology, University of California at Santa Cruz, Santa Cruz, California 95064, USA; (2) Howard Hughes Medical Institute, Department of Computer Science, University of California at Santa Cruz, Santa Cruz, California 95064, USA

Abstract

The data for the public working draft of the human genome contains roughly 400,000 initial sequence contigs in ∼30,000 large insert clones. Many of these initial sequence contigs overlap. A program, GigAssembler, was built to merge them and to order and orient the resulting larger sequence contigs based on mRNA, paired plasmid ends, EST, BAC end pairs, and other information. This program produced the first publicly available assembly of the human genome, a working draft containing roughly 2.7 billion base pairs and covering an estimated 88% of the genome. This working draft has been used for several recent studies of the genome. Here we describe the algorithm used by GigAssembler.

Footnotes

  • 3 Corresponding author.

  • E-MAIL kent{at}biology.ucsc.edu; FAX (831) 459-4829.

  • Article published on-line before print: Genome Res., 10.1101/gr.183201.

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.183201.

    • Received February 7, 2001.
    • Accepted June 14, 2001.