
Data for PhD Thesis on Next Generation Nematode Genomes

dataset
posted on 2012-09-29, 15:04 authored by Sujai Kumar

Data for PhD thesis on "Next-generation Nematode Genomes" by Sujai Kumar

 

(Note: The thesis itself will be made publicly available after the viva/oral examination is complete).

Update: Thesis available at http://hdl.handle.net/1842/7609 (https://www.era.lib.ed.ac.uk/handle/1842/7609) 

--------------------------------------------------------------------

Species Abbreviations:

Trichinella spiralis (ts)

Ascaris suum (as)

Dirofilaria immitis (di)

Brugia malayi (bm)

Litomosoides sigmodontis (ls)

Acanthocheilonema viteae (av)

Strongyloides ratti (sr)

Bursaphelenchus xylophilus (bx)

Meloidogyne hapla (mh)

Meloidogyne incognita (mi)

Meloidogyne floridensis (mf)

Pristionchus pacificus (pp)

Caenorhabditis angaria (ca)

Caenorhabditis japonica (cj)

Caenorhabditis elegans (ce)

Caenorhabditis brenneri (cbn)

Caenorhabditis sp. 11 (csp11)

Caenorhabditis remanei (cr)

Caenorhabditis briggsae (cbg)

Caenorhabditis sp. 5 (csp5)

--------------------------------------------------------------------

File descriptions:

--------------------------------------------------------------------

Chapter 3: Annotating nematode genomes

- 20_nematode_protein_files.tgz - This tgz file has 20 Nematode protein fasta files used in Chapter 3 "Annotating nematode genomes". The original files were obtained from WormBase (WS230), http://nematod.es, and www.inra.fr/meloidogyne_incognita/genomic_resources . The fasta files have been cleaned up: a) all whitespace converted to spaces in sequence headers (otherwise NCBI's makeblastdb fails) b) multi-line sequences have been converted to single line c) sequence IDs have been prefixed with a species abbreviation.

- 20_nematode_genome_files_part{1,2,3}.tgz - These three tgz files are Nematode genome nucleotide fasta files. The original files were obtained from WormBase (WS230), http://nematod.es, and www.inra.fr/meloidogyne_incognita/genomic_resources . The fasta files have been cleaned up: a) multi-line sequences have been converted to single line b) sequence IDs have been prefixed with a species abbreviation.

- 20_nematode_blast2go.annot.goslim.tgz - Blast2GO annotation files, one for each of the 20 nematode proteomes

- 20_nematode_iprscan.tgz - 20 proteomes with InterProScan annotations

- 20_nematode_tRNA_counts.xls - tRNA counts for 20 nematode genomes

- 20_nematode_tRNAscan_gff.tgz - tRNA locations for 20 nematode genomes (GFF format)

- 20_nematode_rfamscan_gff.tgz - Rfamscan output for 20 nematode genomes (GFF format)
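
The fasta cleanup steps listed for the protein and genome files above (whitespace in headers converted to spaces, multi-line sequences joined into a single line, sequence IDs prefixed with a species abbreviation) could be reproduced roughly as follows. This is a minimal sketch and not the script used for the thesis: the function names and the "_" separator between the abbreviation and the original ID are assumptions made for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, joining multi-line sequences."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def clean_fasta(lines: Iterable[str], prefix: str, sep: str = "_") -> Iterator[str]:
    """Yield cleaned fasta lines: one header line, then one sequence line.

    `prefix` is a species abbreviation such as "ce" or "bm".
    `sep` is a hypothetical separator; the real files may use another one.
    """
    for header, seq in read_fasta(lines):
        # Replace every whitespace character (e.g. tabs) with a plain space.
        header = re.sub(r"\s", " ", header)
        yield f">{prefix}{sep}{header}"
        yield seq


raw = [">seq1\tsome protein\n", "MKV\n", "LLA\n", ">seq2\n", "MAA\n"]
cleaned = list(clean_fasta(raw, "ce"))
```

To process a real file, pass an open file handle as `lines` and write each yielded line followed by a newline.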

--------------------------------------------------------------------

Chapter 4: Lack of deeply conserved non-coding elements in nematodes

- tba.alignments.tar - Whole-genome multiple alignment files for specific nodes in the nematode phylogeny: Clade III, Onchocercidae, Clade IV, Meloidogyne, Clade V, Caenorhabditis, Elegans group

- tba.alignments.CNEs.tar - CNE multiple alignment files for specific nodes in the nematode phylogeny (whole-genome multiple alignments with coding regions removed)

- tba.alignments.CNEs.stats.tgz - Tab-delimited files with length and relative identity for each CNE

- pairwise.megablast.tar - Pairwise MegaBLAST alignments for all 20 genomes

- megablast.cluster.tgz - MegaBLAST-based clusters of CNEs

--------------------------------------------------------------------

Chapter 5: The Meloidogyne floridensis genome reveals complex hybrid origins of the root-knot nematodes

- protein.faa.tgz - Protein sets used for M. hapla, M. incognita, and M. floridensis after truncating at stop codons and filtering short proteins (protein fasta files)

- cds.fna.tgz - CDS transcript files corresponding to proteins in M. hapla, M. incognita, and M. floridensis (nucleotide fasta files)

- mhmimf.98.self.id - Tab-delimited file with self-identity scores for each CDS in each species

- InParanoid-mh-mi-mf.tgz - InParanoid results (pairwise clustering)

- QuickParanoid-mh-mi-mf.tgz - QuickParanoid results (orthologous clusters across three species)

- raxml-mh-mi-mf.tgz - Phylogenetic trees for each QuickParanoid cluster
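
The filtering described for protein.faa.tgz (truncating at stop codons and dropping short proteins) might look something like the sketch below. It rests on assumptions not stated in the description: that stop codons appear as "*" in the protein sequences, that truncation means keeping everything before the first stop, and that a minimum-length cutoff was used. The cutoff value of 50 is a placeholder, not the thesis value.

```python
from typing import Iterable, Iterator, Tuple


def truncate_and_filter(
    records: Iterable[Tuple[str, str]],
    min_length: int = 50,  # placeholder cutoff; the real threshold is not given
    stop: str = "*",       # assumed stop-codon symbol in the protein fasta
) -> Iterator[Tuple[str, str]]:
    """Truncate each protein at its first stop symbol and drop short ones."""
    for header, seq in records:
        truncated = seq.split(stop, 1)[0]
        if len(truncated) >= min_length:
            yield header, truncated


records = [("mh_a", "MKV*LLA"), ("mh_b", "MK"), ("mh_c", "MKVLL*")]
kept = list(truncate_and_filter(records, min_length=3))
```

Here `records` is any iterable of (header, sequence) pairs, and the example IDs are made up.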
