Re: Help improve Aspergillus annotation and microarrays!

April 22, 2005

Help improve Aspergillus annotation and microarrays!

In response to community input at the Second Aspergillus Meeting, the

National Institute of Allergy and Infectious Diseases will fund the Pathogen

Functional Genome Resource Center (http://pfgrc.tigr.org/) in a 3-month

effort to incorporate existing expression data into the A. nidulans

annotation before microarrays are designed. EST, cDNA and microarray data

from all Aspergilli can potentially improve gene calls.

and Wortman will be heading the Aspergillus

Annotation Blitz at TIGR. They have requested that all data be sent through

me to make it easier for them to handle. For cDNAs and ESTs we need

sequences in FASTA format with the gene identifier (if known) in the

description line. For arrays we need sequences of oligos or amplicons in

FASTA format with gene identifiers (if known) and whether a signal has been

confirmed in the description line (see example below). Alternatively, the gene

identifiers and whether the signal has been confirmed can be sent in an

Excel spreadsheet or any tab-delimited text file. Data must be received by

May 5, 2005 to be sure they get incorporated into the annotation.

Data will only be used for annotation improvement and will not be released

to others. We are interested in 5’, 3’ and intron calls, not conditions

under which the messages are expressed. The improved annotation will be

made available to the CADRE and Broad genome databases, but will not include

expression profiles. This effort will result in the elimination of many of

the fused genes and improper calls in the current annotation and will allow

PFGRC to design better microarrays.

If you have questions or data to contribute, email me before May 5 at

momany@....

FASTA FORMAT (Adapted from http://ngfnblast.gbf.de/docs/fasta.html):

A sequence in FASTA format begins with a single-line description, followed

by lines of sequence data.

The description line starts with a greater-than symbol (">").

The word immediately following the greater-than symbol (">") is the "ID"

(name) of the sequence; the rest of the line is the description.

The "ID" and the description are optional.

All lines of text should be shorter than 80 characters.

A sequence ends when another greater-than symbol (">") appears at

the beginning of a line, which starts the next sequence.
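
For contributors who want to check a file before sending it, the rules above can be sketched as a small Python check. This is only an illustration under stated assumptions: the function names check_fasta and split_header are hypothetical, blank lines are ignored, and it is not a tool provided by TIGR or PFGRC.

```python
def split_header(line):
    """Split a description line into (ID, description).

    The ID is the first word after ">", the rest is the description.
    Either part may be empty, since both are optional.
    """
    body = line[1:].strip()
    parts = body.split(None, 1)
    seq_id = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return seq_id, description


def check_fasta(text, max_len=80):
    """Return a list of (line_number, message) problems found in FASTA text."""
    problems = []
    seen_header = False
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are harmless and skipped
        if len(line) >= max_len:
            problems.append(
                (number, f"line is not shorter than {max_len} characters")
            )
        if line.startswith(">"):
            seen_header = True
        elif not seen_header:
            problems.append(
                (number, "sequence data before any description line")
            )
    return problems
```

An empty list from check_fasta means none of these particular problems were found; it does not prove the sequences themselves are correct.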

The following example contains two amplicon sequences (Amplicon1,

Amplicon2), their gene designations and whether amplicon hybridization has

been confirmed:

>Amplicon1|envelope protein|AN4417.1|confirmed

ACGTACCCTTGGGCAAATTTGGGCCCTCTCGTGTCTCTCTAAACCCCTTTGGGGGGGGGGG

CCCCGGGTTTATATATTAGGCGCGCGCGCGAATATATATTATATTATATTATATTATTAT

>Amplicon2|hypothetical|AN4370.1|unconfirmed

CCGGCGCGAATTATACGCGCAGCGACGACGACCCCCGGGGTCTCTCTCTCTCGGGGGGCC

AATTTGTTGTGTGACCATCTACTCAGACTTCATACTACTACTACTACTCTCTCTCTCTCTCTCT
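
If it helps when preparing a submission, the example above can be read back with a short Python sketch. It assumes the description line uses the "|" separated layout shown in the example (name, description, gene identifier, confirmation status); that layout is inferred from the example only, and the function names parse_amplicons and to_tsv are hypothetical. The second function writes the gene identifiers and confirmation status as tab-delimited text, the alternative format mentioned earlier in this message.

```python
import csv
import io


def parse_amplicons(text):
    """Parse FASTA text whose description lines look like
    >name|description|gene_id|status (layout assumed from the example)."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            fields = line[1:].split("|")
            # Pad so that missing trailing fields become empty strings.
            fields += [""] * (4 - len(fields))
            current = {
                "name": fields[0],
                "description": fields[1],
                "gene_id": fields[2],
                "status": fields[3],
                "sequence": "",
            }
            records.append(current)
        elif current is not None:
            # Sequence data may span several lines; join them.
            current["sequence"] += line
    return records


def to_tsv(records):
    """Return name, gene identifier and status as tab-delimited text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["name", "gene_id", "status"])
    for record in records:
        writer.writerow([record["name"], record["gene_id"], record["status"]])
    return out.getvalue()
```

Calling to_tsv(parse_amplicons(text)) on the two example records gives a header row followed by one row per amplicon, which can be saved as a text file or opened in Excel.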

___________________________________________

Momany, PhD

Chair, Aspergillus Genome Research Policy Committee

Associate Professor

Department of Plant Biology

University of Georgia

Athens, GA 30602

Phone: (1) 706-542-2014

FAX: (1) 706-542-1805

Email: momany@...

Webpage: http://www.plantbio.uga.edu/~momany/momany.html

http://www.aspergillus.man.ac.uk
