The Graphical Query Language: |
The papers below describe the methods implemented in GQL.
A. Schliep, A. Schönhuth, C. Steinhoff. Using Hidden Markov Models to Analyze Gene Expression Time Course Data. Proceedings of the ISMB 2003. Bioinformatics. 2003 Jul; 19 Suppl 1: I255-I263.
A. Schliep, C. Steinhoff, A. Schönhuth. Robust inference of groups in gene expression time-courses using mixtures of HMM. Proceedings of the ISMB 2004. Bioinformatics. 2004 Aug; 20 Suppl 1: I283-I289.
A. Schliep, I. G. Costa, C. Steinhoff, A. Schönhuth. Analysing gene expression time-courses. IEEE Transactions on Computational Biology and Bioinformatics, to appear.
I. G. Costa, A. Schönhuth, A. Schliep. The Graphical Query Language: a tool for analysis of gene expression time-courses. Bioinformatics. 2005; 21(10): 2544-2545.
Both tools support the GHMM file formats for input data and model descriptions (see GHMM). They also read input data from standard tab-separated files, like those used by most gene expression analysis tools. In this format, each line represents a gene and the columns hold the measured time points. The first column holds the gene identifier and the second column any type of annotation for the gene. Missing values should be encoded either as 'NaN' or by leaving the field empty. Sample files in all formats are provided in examples.
YHR124W	meiosis	-0.377685	-0.427071	-0.479749	0.175438
YGR072W	mRNA decay, nonsense-mediated unknown	-0.067600	-0.664033	-0.412644	0.090134
YGR145W	unknown	0.266238	-0.854138	-0.103595	0.371387
YIR031C	allantoin utilization	-0.017010	0.650807	0.461851	-0.146432
YJR010W	methionine biosynthesis	NaN	0.847968	0.078140	-0.137952
YMR172W	osmotic stress response	-0.734039	-0.258823	-0.135069	0.127290
YIR032C	ureidoglycolate hydrolase	-0.287924	0.701009	0.464117	-0.160077
YHR053C	metallothionein	-0.263116	0.780098	-0.363840	-0.396216
Example of a gene expression file with 4 time points. The second column holds the functional annotation of the genes.
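The tab-separated format described above can be read with a few lines of Python. The following is only a minimal sketch, not GQL's own reader: the function name parse_expression_lines is hypothetical, and it assumes every row has a gene identifier, an annotation, and then one field per time point. The third sample row is a hypothetical variant of the YGR145W row with one value blanked out, to show the empty-field convention for missing values.

```python
import math

def parse_expression_lines(lines):
    """Parse tab-separated rows: gene id, annotation, then one value per time point.

    Hypothetical helper for illustration; it is not part of GQL.
    """
    profiles = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        gene_id, annotation = fields[0], fields[1]
        values = []
        for cell in fields[2:]:
            cell = cell.strip()
            # An empty field or any spelling of 'NaN' marks a missing value.
            if cell == "" or cell.lower() == "nan":
                values.append(math.nan)
            else:
                values.append(float(cell))
        profiles[gene_id] = (annotation, values)
    return profiles

sample = [
    "YHR124W\tmeiosis\t-0.377685\t-0.427071\t-0.479749\t0.175438",
    "YJR010W\tmethionine biosynthesis\tNaN\t0.847968\t0.078140\t-0.137952",
    # Hypothetical variant: second time point left empty to mark it as missing.
    "YGR145W\tunknown\t0.266238\t\t-0.103595\t0.371387",
]

profiles = parse_expression_lines(sample)
for gene_id, (annotation, values) in profiles.items():
    missing = sum(1 for v in values if math.isnan(v))
    print(gene_id, annotation, len(values), "time points,", missing, "missing")
```

Splitting on tabs rather than on arbitrary whitespace matters here, because annotations such as "methionine biosynthesis" contain spaces.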
GQL also uses tab-separated files for partial labels. These files have only two columns: the first contains the gene identifier and the second a numerical label (from 1 to n).
YHR124W	1
YGR072W	1
YGR145W	1
YIR031C	2
YJR010W	2
YMR172W	2
YIR032C	2
YHR053C	2
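A partial-label file can be read in the same spirit. This is a sketch under the assumption that each non-empty line holds exactly two tab-separated fields; the names parse_partial_labels and group_by_label are hypothetical and do not come from GQL.

```python
from collections import defaultdict

def parse_partial_labels(lines):
    """Map each gene id to its integer label (hypothetical helper, not GQL's API)."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        gene_id, label = line.split("\t")
        labels[gene_id] = int(label)
    return labels

def group_by_label(labels):
    """Collect the gene ids that share a label, keeping file order within a group."""
    groups = defaultdict(list)
    for gene_id, label in labels.items():
        groups[label].append(gene_id)
    return dict(groups)

sample_labels = [
    "YHR124W\t1", "YGR072W\t1", "YGR145W\t1",
    "YIR031C\t2", "YJR010W\t2", "YMR172W\t2", "YIR032C\t2", "YHR053C\t2",
]

groups = group_by_label(parse_partial_labels(sample_labels))
for label in sorted(groups):
    print(label, groups[label])
```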
Version 1.0:
This version has been in heavy use for the last few months. There are still some missing features and bugs.