|Getting started with Wormpath|
The purpose of Wormpath is to search a user-provided list of
genetic markers of Caenorhabditis elegans for genetic
interactions and to find the genetic networks formed by these
interactions. It is organized as a web service hosted by the
Bioinformatics Core Facility at CECAD Cologne, Germany. Typically,
the uploaded list originates from an RNA-Seq study or from a genome-wide scan using microarray
hybridization of cDNAs from two different conditions, one with and one
without treatment. The results of the search are provided to the user in an XML-based
summary report for easy navigation and exploration.|
To start a run of the Wormpath software, you need a list of C. elegans genes given as Wormbase identifiers or as microarray markers using Affymetrix IDs. These need to be arranged in the first column of a two-column table. The second column should contain a measure of evidence for each gene, e.g. a fold change or a p-value from a comparison of the expression of that gene between two entities.
You can start the run by uploading this file on the start page and choosing the format in which the table is given, the method by which evidence for each gene is quantified (p-value or fold change), and the core options for the run, namely whether indirect interactions (see below) should be included in the search and how many iterations should be used. Finally, specify a level of significance to limit the resulting networks to those that are statistically significant. For a first try, it is best to set this to a value of 1, which reports all networks. To submit your input and start the analysis, click the button and wait. The duration of the analysis depends on the size and complexity of your gene list. The logger in your browser window will keep you up to date on the progress. When the analysis finishes, click one of the two links to view the results or to download them to your computer.
The input list for the Wormpath software needs to contain 2 columns:
one for the genes to search and one with a level of evidence for
each gene. For details, please refer to the sections below. Accepted
file formats are Microsoft Excel spreadsheets (*.xls, not *.xlsx) and plain text files
(ASCII). If a text file is provided, a tab character
has to be used as the field delimiter. For both formats, the file may
contain only 2 columns and no header line. If given as an Excel
file, only the first worksheet should contain data. Apart from
the two columns needed, the file should contain no other data,
because additional content can confuse the software.|
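The layout rules above can be checked locally before uploading a text file. The following is a minimal sketch and not part of Wormpath itself; the function name read_gene_table and the example values are hypothetical and serve only as an illustration.

```python
import csv
import io


def read_gene_table(text):
    """Parse tab-delimited text into (gene, evidence) pairs.

    Hypothetical helper: it enforces exactly two columns and a numeric
    second column, so a header line is rejected as well.
    """
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_no, fields in enumerate(reader, start=1):
        if not fields:
            continue  # ignore completely empty lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 columns, found {len(fields)}"
            )
        gene, evidence = fields
        # float() raises ValueError for non-numeric text such as a header
        rows.append((gene.strip(), float(evidence)))
    return rows


# Illustrative input: identifiers as shown in this guide, made-up values.
example = "WBGene00123456\t2.5\nF52B5.5\t-1.8\n"
table = read_gene_table(example)
print(table)
```

A file that passes this check has the two-column, tab-delimited, header-free shape described above; whether the identifiers themselves are recognized is decided by the web service.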
The first column may contain the genes either as Affymetrix or Agilent IDs from the respective C. elegans gene expression microarray, as the corresponding Wormbase IDs formatted like WBGene00123456, or as the sequence names used as accession numbers for the sequence data of the respective gene in the Wormbase. Sequence names typically look like F52B5.5.
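To illustrate the identifier shapes mentioned above, the following sketch classifies an identifier by its pattern. The regular expressions are assumptions derived only from the two examples in this guide (WBGene00123456 and F52B5.5); they are not the rules Wormpath applies, and microarray probe IDs are simply reported as "other".

```python
import re

# Assumed pattern: "WBGene" followed by eight digits, as in WBGene00123456.
WORMBASE_ID = re.compile(r"^WBGene\d{8}$")
# Assumed pattern: clone name, a dot, a number, optional isoform letter,
# as in F52B5.5.
SEQUENCE_NAME = re.compile(r"^[A-Z0-9]+\.\d+[a-z]?$")


def classify_identifier(identifier):
    """Return a rough label for the identifier type (hypothetical helper)."""
    if WORMBASE_ID.match(identifier):
        return "wormbase_id"
    if SEQUENCE_NAME.match(identifier):
        return "sequence_name"
    return "other"  # e.g. a microarray probe ID


for name in ["WBGene00123456", "F52B5.5", "some_probe_id"]:
    print(name, classify_identifier(name))
```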
Experimental evidence for each gene has to be included in the second column of the input file. It can be given either as the ratio of the expression levels of the two entities being compared or as a fold change value, where the ratio is replaced by its negative reciprocal for downregulated genes. Alternatively, a p-value from a standard t-test may be used; note that a p-value does not provide information on the direction of the regulation.
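The relation between ratio and fold change described above can be written down in a few lines. This is a sketch of the stated convention only; the function names are hypothetical.

```python
def ratio_to_fold_change(ratio):
    """Convert an expression ratio to a signed fold change.

    Ratios of 1 or more are kept; ratios below 1 (downregulated genes)
    are replaced by their negative reciprocal.
    """
    if ratio <= 0:
        raise ValueError("an expression ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def fold_change_to_ratio(fold_change):
    """Inverse conversion: recover the ratio from a signed fold change."""
    if -1 < fold_change < 1:
        raise ValueError("fold changes between -1 and 1 are not defined")
    return fold_change if fold_change >= 1 else -1.0 / fold_change


print(ratio_to_fold_change(2.0))   # upregulated: stays 2.0
print(ratio_to_fold_change(0.5))   # downregulated: becomes -2.0
print(fold_change_to_ratio(-4.0))  # back to the ratio 0.25
```

A ratio of 0.5 and a fold change of -2 therefore describe the same halving of expression.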
|Predicted and indirect interactions|
Interactions in the Wormbase
are classified as "Genetic", "Suppressing", "Predicted" etc.,
according to the scientific evidence for that
interaction. Counting the interactions in each
class shows that most
interactions in the Wormbase are "Predicted", although there is
only very weak evidence for these. Depending on the length of the
uploaded gene list, the presence of predicted
interactions can confuse the analysis and
output to a greater or lesser degree. We have therefore decided to
exclude predicted interactions from the
analysis in general; they are still listed in the output where all interactions of a
certain gene are shown.|
If two genes in the gene list do not have a direct interaction in the Wormbase but both interact with a third gene that is not differentially regulated, we call this an indirect interaction. Indirect interactions matter because the linking gene might have been missed in the analysis yet play an important role in the network of the others. There is an option to specify whether or not the (unregulated) genes inducing such indirect interactions should be included in the graph and in the search for networks.
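The definition of an indirect interaction can be illustrated with a small sketch. It is not the Wormpath implementation; the function name, the gene names and the edge list are hypothetical, and interactions are treated as undirected pairs.

```python
from itertools import combinations


def find_indirect_interactions(gene_list, interactions):
    """Map each indirectly linked pair of listed genes to its linking genes.

    A pair is indirect if the two genes do not interact directly but
    share at least one neighbor that is not in the gene list.
    """
    genes = set(gene_list)
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    indirect = {}
    for a, b in combinations(sorted(genes), 2):
        if b in neighbors.get(a, set()):
            continue  # direct interaction, not an indirect one
        shared = neighbors.get(a, set()) & neighbors.get(b, set())
        linkers = shared - genes  # keep only unregulated linking genes
        if linkers:
            indirect[(a, b)] = sorted(linkers)
    return indirect


# Made-up example: geneX is not in the uploaded list.
gene_list = ["geneA", "geneB", "geneC"]
interactions = [("geneA", "geneX"), ("geneB", "geneX"), ("geneA", "geneC")]
result = find_indirect_interactions(gene_list, interactions)
print(result)
```

Here geneA and geneB are linked only through geneX, so the pair counts as an indirect interaction, while geneA and geneC interact directly.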
The iteration depth is the core parameter for modifying the behaviour
of the analysis algorithm. In the basic graph
underlying the analysis, it gives the maximum number of
steps that are performed when searching for a network surrounding a
specific gene. That is, if the iteration depth is 3,
networks surrounding a particular gene are formed by neighbors that
are 1, 2 or at most 3 steps apart. The software also searches for
smaller networks because this can improve visibility of the closer environment.|
To this end, the whole analysis is always repeated for every smaller maximum iteration depth (for 3, 2, and 1 if you choose 4), which gives you the chance to review the shorter and potentially handier result lists for these values.
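The effect of the iteration depth can be sketched as a depth-limited breadth-first search. This is an illustration under the assumption that the graph is given as a simple neighbor dictionary; the function names and the example graph are hypothetical and do not describe the actual Wormpath algorithm.

```python
from collections import deque


def network_around(gene, neighbors, max_depth):
    """Return all genes within max_depth steps of gene, with distances."""
    distance = {gene: 0}
    queue = deque([gene])
    while queue:
        current = queue.popleft()
        if distance[current] == max_depth:
            continue  # do not expand beyond the iteration depth
        for nxt in neighbors.get(current, ()):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


def networks_for_all_depths(gene, neighbors, max_depth):
    """Collect the network for the chosen depth and every smaller depth."""
    distance = network_around(gene, neighbors, max_depth)
    return {
        depth: sorted(g for g, d in distance.items() if d <= depth)
        for depth in range(1, max_depth + 1)
    }


# Made-up chain of genes: a - b - c - d - e
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
         "d": ["c", "e"], "e": ["d"]}
by_depth = networks_for_all_depths("a", chain, 3)
print(by_depth)
```

With an iteration depth of 3, gene e (four steps from a) is not reached, and the entries for depths 1 and 2 correspond to the smaller networks mentioned above.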
The results can either be accessed directly in your web browser or
be downloaded as a gzipped archive. If your iteration depth was 3,
you should first review the analysis results provided in the XML
file summary3.xml. The files summary1.xml and summary2.xml contain
the lists for iteration depths of only 1 or 2. In the summary3.xml report, you can access these by
clicking the Show smaller networks link on top of the
This gives you a summary of the genetic networks contained in the uploaded list. You can show the interactions of a network by following the respective Show Interactions link. This also gives you the citation of the paper in which the interaction was established. Furthermore, you can show a graphical representation of the respective network by clicking the Show Graph link in the summary report. To reduce the number of neighbors in the figure, you can click the Exclude nodes link, which will show you a sub-network with a smaller number of genes. By following the links in the Show neighbours of ... box for any gene in that network, you can show a new network centered on this gene that contains all genes within a controlled number of steps. You can also show details on any interaction involved in the network by clicking the links in the Show all interactions of ... box. At any point while browsing the summary report, links connect you directly to the corresponding information on the Wormbase and Pubmed websites.
For each network in the summary report, statistical evidence is
given by a score and two p-values. The score simply represents the average number
of papers describing interactions of the respective network,
whereas the score-based p-value reflects the probability of obtaining this or a larger score for a network of this size by chance alone. The list-based p-value reflects the probability that, within a larger environment of the network, the number of differentially expressed genes in the network is at least as large as the one observed by chance alone.|
The Significance level switch may be used to reduce the list of results reported to only those genetic networks which show significance at the given level. To report all the resulting networks, this switch can simply be left at a value of 1.
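As an illustration of the score and of the significance filter, consider the following sketch. It assumes that the score is the mean number of papers per interaction and that a network is kept when its p-value does not exceed the chosen level; the function names, the dictionary keys and all numbers are hypothetical, and the guide does not state which of the two p-values the switch applies to.

```python
def network_score(papers_per_interaction):
    """Average number of papers per interaction (assumed reading of the score)."""
    return sum(papers_per_interaction) / len(papers_per_interaction)


def filter_networks(networks, significance_level, p_value_key="score_p_value"):
    """Keep networks whose chosen p-value is at most the significance level."""
    return [n for n in networks if n[p_value_key] <= significance_level]


# Made-up networks with made-up p-values.
networks = [
    {"name": "network1", "score_p_value": 0.01},
    {"name": "network2", "score_p_value": 0.30},
]

print(network_score([1, 2, 3]))
print([n["name"] for n in filter_networks(networks, 0.05)])
print([n["name"] for n in filter_networks(networks, 1)])
```

Because no p-value can exceed 1, a level of 1 keeps every network, which matches the advice to leave the switch at 1 to see all results.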
|Modify graphical output|
To edit the graphics available from the Show Graph
reports by hand after the analysis has finished, you can click
Open image as vector graphic and edit the SVG-format
graphics with a tool suitable for vector graphic modification,
e.g. CorelDRAW or the free software
Inkscape. You should set the tool of your choice as the default
application for SVG vector graphics.|
Furthermore, you can save any graph in the XGMML format using the Save in XGMML format links. This is an XML-based format accepted by most software tools dedicated to graph visualization and interpretation. For example, you can import the XGMML files into Cytoscape and use the full range of graph layout functionality in this software.
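Because XGMML is plain XML, a saved graph can also be inspected with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree module on a hand-written sample; it assumes that the file uses the common XGMML namespace and the usual node and edge attributes (id, label, source, target). The exact content of the files written by Wormpath may differ, and the sample graph is made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of XGMML documents.
XGMML_NS = {"x": "http://www.cs.rpi.edu/XGMML"}

# Hand-written sample, not actual Wormpath output.
SAMPLE = """<graph label="example network" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="1" label="geneA"/>
  <node id="2" label="geneB"/>
  <node id="3" label="geneC"/>
  <edge source="1" target="2" label="geneA - geneB"/>
  <edge source="2" target="3" label="geneB - geneC"/>
</graph>"""


def read_xgmml(text):
    """Return (node labels, edges as label pairs) from XGMML text."""
    root = ET.fromstring(text)
    labels = {
        node.get("id"): node.get("label")
        for node in root.findall("x:node", XGMML_NS)
    }
    edges = [
        (labels[edge.get("source")], labels[edge.get("target")])
        for edge in root.findall("x:edge", XGMML_NS)
    ]
    return sorted(labels.values()), edges


nodes, edges = read_xgmml(SAMPLE)
print(nodes)
print(edges)
```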