Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA extractions

A culture of DSM 23566T was grown in DSMZ medium 514 (Bacto Marine Broth) [23] at 20 °C. gDNA was purified using the Jetflex Genomic DNA Purification Kit (GENOMED 600100) following the directions provided by the supplier, modified by the addition of 20 µl Proteinase K for cell lysis. The purity, quality and size of the bulk gDNA preparation were assessed by JGI according to DOE-JGI guidelines. DNA is available through the DNA Bank Network [24].

Genome sequencing and assembly

The draft genome sequence was generated using Illumina data [25].

For this genome, we constructed and sequenced an Illumina short-insert paired-end library with an average insert size of 247 ± 59 bp, which generated 16,028,960 reads, and an Illumina long-insert paired-end library with an average insert size of 8,186 ± 3,263 bp, which generated 9,112,084 reads, totaling 3,771 Mbp of data (Feng Chen, unpublished). All general aspects of library construction and sequencing can be found at the JGI web site [26]. The initial draft assembly contained 20 contigs in 12 scaffolds. The initial draft data were assembled with Allpaths [27], version 39750, and the consensus was computationally shredded into 10 Kbp overlapping fake reads (shreds). The Illumina draft data were also assembled with Velvet [28], and the consensus sequences were computationally shredded into 1.5 Kbp overlapping fake reads (shreds). The Illumina draft data were assembled again with Velvet using the shreds from the first Velvet assembly to guide the next assembly. The consensus from the second Velvet assembly was shredded into 1.5 Kbp overlapping fake reads. The fake reads from the Allpaths assembly and both Velvet assemblies and a subset of the Illumina CLIP paired-end reads were assembled using parallel phrap (High Performance Software, LLC).

Possible mis-assemblies were corrected with manual editing in Consed [29-31]. Gap closure was accomplished using repeat resolution software (Wei Gu, unpublished) and sequencing of bridging PCR fragments with Sanger and/or PacBio (Cliff Han, unpublished) technologies. A total of 13 PCR PacBio consensus sequences were completed to close gaps and to raise the quality of the final sequence.
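The shredding step mentioned above, in which an assembly consensus is cut into overlapping fake reads, can be illustrated with a minimal Python sketch. This is not the JGI pipeline code: the function name `shred` and the `overlap` parameter are hypothetical, and the text does not state how much overlap was used between adjacent shreds, so the overlap values here are assumptions chosen only for illustration.

```python
from typing import Iterator, Tuple


def shred(consensus: str, shred_len: int, overlap: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) pairs that tile `consensus` with overlapping windows.

    `shred_len` is the fake-read length (e.g. 10_000 or 1_500 bp in the text).
    `overlap` is the number of bases shared by adjacent shreds; the value used
    in the original work is not given, so callers must pick one (assumption).
    """
    if shred_len <= 0:
        raise ValueError("shred_len must be positive")
    if not 0 <= overlap < shred_len:
        raise ValueError("overlap must satisfy 0 <= overlap < shred_len")
    if not consensus:
        return

    step = shred_len - overlap
    start = 0
    while True:
        # The final fragment may be shorter than shred_len.
        yield start, consensus[start:start + shred_len]
        if start + shred_len >= len(consensus):
            break
        start += step


if __name__ == "__main__":
    # Toy consensus; a real one would be millions of bases long.
    toy_consensus = "ACGTACGTAC"
    for offset, fragment in shred(toy_consensus, shred_len=4, overlap=2):
        print(offset, fragment)
```

Because adjacent shreds share bases, an overlap-based assembler such as phrap can rejoin them, which is what allows consensus sequences from different assemblers to be merged as if they were ordinary reads.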

The final assembly is based on 3,771 Mbp of Illumina draft data, which provides an average 739× coverage of the genome.

Genome annotation

Genes were identified using Prodigal [32] as part of the JGI genome annotation pipeline [33], followed by a round of manual curation using the JGI GenePRIMP pipeline [34]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases.
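The sequencing figures reported above can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the text (read counts, total data volume, and average coverage). The genome size and mean read length it derives are implied values, not figures stated in this section, and the function names are hypothetical.

```python
# Figures quoted in the text.
SHORT_INSERT_READS = 16_028_960
LONG_INSERT_READS = 9_112_084
TOTAL_DATA_MBP = 3_771
REPORTED_COVERAGE = 739


def implied_genome_size_mbp(total_data_mbp: float, coverage: float) -> float:
    """Genome size implied by coverage = total sequenced bases / genome size."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_data_mbp / coverage


def mean_read_length_bp(total_data_mbp: float, n_reads: int) -> float:
    """Average read length implied by the total data volume and read count."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return total_data_mbp * 1_000_000 / n_reads


total_reads = SHORT_INSERT_READS + LONG_INSERT_READS
genome_mbp = implied_genome_size_mbp(TOTAL_DATA_MBP, REPORTED_COVERAGE)
read_len = mean_read_length_bp(TOTAL_DATA_MBP, total_reads)

print(f"total reads: {total_reads:,}")
print(f"implied genome size: {genome_mbp:.2f} Mbp")
print(f"implied mean read length: {read_len:.0f} bp")
```

Under these assumptions the numbers are mutually consistent: the two libraries together imply reads of roughly 150 bp, and 739× coverage from 3,771 Mbp implies a genome of roughly 5.1 Mbp.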
