Guest guest Posted April 22, 2005

Help improve Aspergillus annotation and microarrays!

In response to community input at the Second Aspergillus Meeting, the National Institute of Allergy and Infectious Diseases will fund the Pathogen Functional Genome Resource Center (http://pfgrc.tigr.org/) in a 3-month effort to incorporate existing expression data into the A. nidulans annotation before microarrays are designed. EST, cDNA and microarray data from all Aspergilli can potentially improve gene calls.

and Wortman will be heading the Aspergillus Annotation Blitz at TIGR. They have requested that all data be sent through me to make it easier for them to handle.

For cDNAs and ESTs, we need sequences in FASTA format with the gene identifier (if known) in the description line. For arrays, we need sequences of oligos or amplicons in FASTA format, with the gene identifier (if known) and whether a signal has been confirmed in the description line (see example below). Alternatively, the gene identifiers and whether the signal has been confirmed can be sent in an Excel spreadsheet or any tab-delimited text file.

Data must be received by May 5, 2005 to be sure they are incorporated into the annotation. Data will be used only for annotation improvement and will not be released to others. We are interested in 5’, 3’ and intron calls, not the conditions under which the messages are expressed. The improved annotation will be made available to the CADRE and Broad genome databases, but will not include expression profiles.

This effort will eliminate many of the fused genes and improper calls in the current annotation and will allow PFGRC to design better microarrays.

If you have questions or data to contribute, email me before May 5 at momany@....

FASTA FORMAT (Adapted from http://ngfnblast.gbf.de/docs/fasta.html):

A sequence in FASTA format begins with a single-line description, followed by lines of sequence data.
The description line starts with a greater-than symbol (">"). The word immediately following the greater-than symbol is the "ID" (name) of the sequence; the rest of the line is the description. The "ID" and the description are optional. All lines of text should be shorter than 80 characters. A sequence ends when another greater-than symbol (">") appears at the beginning of a line, which starts the next sequence.

The following example contains two amplicon sequences (Amplicon1, Amplicon2), their gene designations and whether amplicon hybridization has been confirmed:

>Amplicon1|envelope protein|AN4417.1|confirmed
ACGTACCCTTGGGCAAATTTGGGCCCTCTCGTGTCTCTCTAAACCCCTTTGGGGGGGGGGG
CCCCGGGTTTATATATTAGGCGCGCGCGCGAATATATATTATATTATATTATATTATTAT
>Amplicon2|hypothetical|AN4370.1|unconfirmed
CCGGCGCGAATTATACGCGCAGCGACGACGACCCCCGGGGTCTCTCTCTCTCGGGGGGCC
AATTTGTTGTGTGACCATCTACTCAGACTTCATACTACTACTACTACTCTCTCTCTCTCTCTCT

___________________________________________
Momany, PhD
Chair, Aspergillus Genome Research Policy Committee
Associate Professor
Department of Plant Biology
University of Georgia
Athens, GA 30602
Phone: (1) 706-542-2014
FAX: (1) 706-542-1805
Email: momany@...
Webpage: http://www.plantbio.uga.edu/~momany/momany.html
http://www.aspergillus.man.ac.uk
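For anyone who wants to check a file before sending it, here is a minimal Python sketch of reading the FASTA layout described above. It is only an illustration, not a tool used by TIGR or PFGRC. The names `Record`, `make_record` and `parse_fasta` are hypothetical, and the field order (ID, product, gene identifier, confirmation status, separated by "|") is assumed from the example in the post.

```python
from typing import Iterator, List, NamedTuple


class Record(NamedTuple):
    """One FASTA entry; field names are illustrative, not an official schema."""
    seq_id: str    # e.g. "Amplicon1"
    product: str   # e.g. "envelope protein"
    gene_id: str   # e.g. "AN4417.1"
    status: str    # e.g. "confirmed" or "unconfirmed"
    sequence: str  # sequence lines joined into one string


def make_record(header: str, chunks: List[str]) -> Record:
    # Assumption: up to four "|"-separated fields, as in the post's example.
    # Missing fields become empty strings; extra fields are ignored.
    fields = [f.strip() for f in header.split("|")]
    fields += [""] * (4 - len(fields))
    return Record(fields[0], fields[1], fields[2], fields[3], "".join(chunks))


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield one Record per '>' description line found in the text."""
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous sequence.
            if header is not None:
                yield make_record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield make_record(header, chunks)


# The two amplicons from the example in the post.
EXAMPLE = """\
>Amplicon1|envelope protein|AN4417.1|confirmed
ACGTACCCTTGGGCAAATTTGGGCCCTCTCGTGTCTCTCTAAACCCCTTTGGGGGGGGGGG
CCCCGGGTTTATATATTAGGCGCGCGCGCGAATATATATTATATTATATTATATTATTAT
>Amplicon2|hypothetical|AN4370.1|unconfirmed
CCGGCGCGAATTATACGCGCAGCGACGACGACCCCCGGGGTCTCTCTCTCTCGGGGGGCC
AATTTGTTGTGTGACCATCTACTCAGACTTCATACTACTACTACTACTCTCTCTCTCTCTCTCT
"""

records = list(parse_fasta(EXAMPLE))
for rec in records:
    # Tab-delimited summary, similar in spirit to the spreadsheet alternative.
    print("\t".join([rec.seq_id, rec.gene_id, rec.status, str(len(rec.sequence))]))
```

Printing one tab-separated row per record makes it easy to spot a description line with a missing gene identifier or confirmation status before the file is submitted.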