ID-Align

Go to IdAlign on the Web or install IdAlign on my computer. See also here for links to installable executables for Windows or Mac.

Expected file format

The uploaded file is expected to be tab-separated. The first line contains the column names, and each subsequent line is either blank or contains exactly the same number of columns as the first line. The column-names line must contain the headers "Name", which indicates the column of metabolites, and "FileName" (case-sensitive), which indicates the name of the "file" from which the row's data was drawn. All other column names are arbitrary, although two or more columns with the same name will lead to undefined results.

Except for the Name and FileName columns, all columns are scanned to see whether their values can be converted into numbers (either floating point or integers). Currently no attempt is made to infer the meaning of column values from header names. Spaces are stripped; a number can be prefixed by '<' or '>' and can carry a "units" suffix, one of (m/z | scans | %). If a value does not fit this pattern, the entire value is taken to be a text string.
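The parsing rule above can be sketched in Python as follows. This is a minimal illustration, not the application's actual code: the function name parse_cell, the regular expression, and the choice to return every number as a float are assumptions.

```python
import re

# Optional '<' or '>' prefix, a decimal number, then an optional units suffix.
# The accepted number syntax is an assumption; the text only names the
# prefixes and the three unit suffixes.
_NUMBER = re.compile(
    r"^[<>]?([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:m/z|scans|%)?$"
)


def parse_cell(raw):
    """Return a float if the cell looks numeric, otherwise the text itself."""
    compact = raw.replace(" ", "")  # spaces are stripped before matching
    match = _NUMBER.match(compact)
    if match is None:
        return raw.strip()  # the entire value is kept as a text string
    return float(match.group(1))
```

For example, parse_cell("< 12.5 m/z") gives 12.5, while parse_cell("12 apples") is returned unchanged as text because "apples" is not one of the recognised unit suffixes.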

Computation

The file is read and a list of metabolites is created. Each metabolite references a map (dictionary) of files, keyed on the filename supplied in the FileName column, which in turn contains a hash map of named values. Again, the names are supplied by the column header under which the value appeared. A normalizing metabolite is then found (initially the first in the list that matches ‘rutenol’).
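The nested structure described above (metabolite, then file, then named value) might look like the sketch below. The function names read_table and find_normalizer are hypothetical, and two details are assumptions: rows whose cells are all empty are treated as blank lines, and "matches" is taken to mean a case-insensitive substring match.

```python
import csv
import io
from collections import defaultdict


def read_table(handle):
    """Build {metabolite: {filename: {column: raw value}}} from a TSV file."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    name_col = header.index("Name")
    file_col = header.index("FileName")
    metabolites = defaultdict(lambda: defaultdict(dict))
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # blank lines are allowed by the file format
        if len(row) != len(header):
            raise ValueError("row has %d columns, expected %d"
                             % (len(row), len(header)))
        values = metabolites[row[name_col]][row[file_col]]
        for index, (column, cell) in enumerate(zip(header, row)):
            if index not in (name_col, file_col):
                values[column] = cell
    return metabolites


def find_normalizer(metabolites, pattern="rutenol"):
    """Return the first metabolite name containing the pattern, or None."""
    return next((m for m in metabolites if pattern in m.lower()), None)


SAMPLE = (
    "Name\tFileName\tArea\tRT\n"
    "rutenol\ta.txt\t100\t5.1\n"
    "\n"
    "Alanine\ta.txt\t50\t3.2\n"
    "Alanine\tb.txt\t60\t3.3\n"
)
table = read_table(io.StringIO(SAMPLE))
```

With this sample, table["Alanine"]["b.txt"]["Area"] holds the raw string "60"; converting cells to numbers is a separate step.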

The user can then supply a data name (‘selected Data’) to display in the table. Values that fall below a user-defined minimum are highlighted. Missing values for each column are calculated as half the smallest value found in that file/column (or 0.0 if no values exist).
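As a small illustration of the missing-value rule, the sketch below computes the fill value for one file/column and flags values under the user's minimum. The function names and the use of None to mark a missing entry are assumptions made for the example.

```python
def missing_value(values):
    """Half the smallest value present in the column, or 0.0 if it is empty."""
    present = [v for v in values if v is not None]
    return min(present) / 2.0 if present else 0.0


def below_minimum(values, minimum):
    """Return, per entry, whether a present value falls below the minimum."""
    return [v is not None and v < minimum for v in values]
```

For a column holding 4.0, a missing entry, and 10.0, the fill value would be 2.0.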

XLS Output

The output is a table containing filenames as columns and metabolite values as rows. The output uses Excel's Formula support to scale entire columns by the input of a single value. Multiple worksheets are created. The first sheet presents the table of values selected by the user. Those values that are missing are replaced by a formula referencing a missing value cell that appears above each column. This value is initially equal to half the smallest value found for that column or zero if no values exist.

The next worksheet is a table of formulas viz: normalized!A3 = rawdata!A3/normalized!A1 where A1 is a cell containing the normalization value specified by the user at the web interface. The final worksheet permits whole table scaling.
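The formula pattern quoted above could be generated as in the following sketch. It assumes, based only on the single A3/A1 example, that the normalization value sits in row 1 of the same column; the helper names are hypothetical and the real worksheets may be laid out differently.

```python
def column_letters(index):
    """Convert a zero-based column index to Excel letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def normalized_formula(row, col):
    """Formula dividing a rawdata cell by the value in row 1 of its column."""
    column = column_letters(col)
    return "=rawdata!%s%d/normalized!%s1" % (column, row, column)
```

Calling normalized_formula(3, 0) reproduces the example from the text, =rawdata!A3/normalized!A1.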

Technology

The software is written entirely in Python. It is hosted as a webapp in an Apache Tomcat (http://tomcat.apache.org) 6.0 server using Jython (www.jython.org) and the Apache Commons FileUpload library (http://commons.apache.org/fileupload/).

Known Issues

If two different values exist for the same (metabolite Name, FileName) pair, they are averaged.
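The averaging behaviour can be pictured with a short sketch. The input shape, a sequence of (name, filename, value) tuples, and the function name are assumptions for illustration only.

```python
from collections import defaultdict


def average_duplicates(rows):
    """Average all values that share the same (Name, FileName) pair."""
    buckets = defaultdict(list)
    for name, filename, value in rows:
        buckets[(name, filename)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


rows = [
    ("Alanine", "a.txt", 10.0),
    ("Alanine", "a.txt", 20.0),  # same pair as the row above
    ("Alanine", "b.txt", 7.0),
]
averaged = average_duplicates(rows)
```

Here the two values for ("Alanine", "a.txt") collapse into their mean, 15.0, while the single b.txt value is left as it is.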

The webapp stores the uploaded and parsed files server-side in the servlet’s session object. Currently up to ten files are stored per session, with the oldest files being “lost”, and the entire session expires after an idle time of 4 hours. The files are keyed on the filename sent by the browser during upload. Browsers differ on this point: for example, Firefox (http://www.mozilla.org/firefox) sends only the filename, whereas IE7 (http://www.microsoft.com/windows/products/winfamily/ie/default.mspx) sends the entire path name. Computational parameters (Data Value to display) specified by the user are stored in the session and applied to each computation and each file.
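A bounded per-session store of this kind can be sketched with an ordered dictionary. This is not the servlet code: the class name SessionFiles is hypothetical, "oldest" is assumed to mean the earliest upload, and re-uploading a filename is assumed to make it the newest entry. Session expiry is left to the servlet container and is not modelled.

```python
from collections import OrderedDict

MAX_FILES = 10  # per-session limit stated in the text


class SessionFiles:
    """Keep at most `limit` parsed files, dropping the oldest upload first."""

    def __init__(self, limit=MAX_FILES):
        self._files = OrderedDict()
        self._limit = limit

    def store(self, filename, parsed):
        # The key is whatever name the browser sent, which may be a bare
        # filename or a full path depending on the browser.
        self._files.pop(filename, None)
        self._files[filename] = parsed
        while len(self._files) > self._limit:
            self._files.popitem(last=False)  # discard the oldest entry

    def get(self, filename):
        return self._files.get(filename)

    def names(self):
        return list(self._files)
```

Because the key is the browser-supplied name, the same file uploaded from two browsers could occupy two slots in such a store.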

There are still unresolved issues about file character encoding. These seem to mainly affect the presentation of metabolite names under Firefox and not the computations.

When the software fails - such as when an incorrect file type is uploaded - it fails ungracefully and possibly confusingly. This is a UI issue that the author will improve if time permits.

Output of the XLS file uses a Python library written by the authors and based on the Perl library SpreadSheet::WriteExcel (http://homepage.tinet.ie/~jmcnamara/perl/WriteExcel.html). It has the benefit of being usable with CPython, Jython and IronPython, but it currently outputs only Excel97 formats. For maintainability reasons, future implementations may move to the Apache POI library (http://poi.apache.org).

Update: the webstart, Windows and Mac versions now use Apache POI to generate the Excel spreadsheets.

Undefined results will occur if two or more columns have the same name. Currently the last such column in the row will "overwrite" the earlier ones.
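The overwrite effect is what one would expect if each row is stored in a dictionary keyed on the header, as the sketch below shows. The duplicate check is a hypothetical helper for validating a file before upload, not a feature of the application.

```python
from collections import Counter


def duplicate_headers(header):
    """Return the column names that appear more than once, in first-seen order."""
    counts = Counter(header)
    return [name for name in counts if counts[name] > 1]


header = ["Name", "FileName", "Area", "Area"]
row = ["Alanine", "a.txt", "10", "20"]

# Building a dictionary from header/value pairs keeps only the last "Area".
values = dict(zip(header, row))
```

In this example values["Area"] ends up as "20", and duplicate_headers(header) reports ["Area"].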

It is assumed that each column has a homogeneous format: either all values parse to a number or all parse to text.

