MetaNetwork {MetaNetwork}    R Documentation

genetic study of metabolites and network reconstruction

Description

An integrated computational protocol to run a complete genetic analysis workflow on metabolites using diverse MetaNetwork methods for quantitative trait analysis, network reconstruction and Cytoscape network visualization.

Usage

MetaNetwork( markers, genotypes, traits, spike, qtlProfiles = NULL, 
             qtlThres = NULL, qtlSumm = NULL, corrZeroOrder = NULL, 
             corrSecondOrder = NULL, corrMethod = "qtl", corrThres = 0, 
             cytoFiles = T, peaks = NULL, outputdir = "./MetaNetwork")

Arguments

markers matrix of markers (rownames) with their chromosome numbers (column 1) and centimorgan positions (cM, column 2), ordered by position.
See markers example data.
genotypes matrix of genotypes for each marker (rownames) and individual (columnnames), as numeric values 1, 2 or NA when missing.
See genotypes example data.
traits matrix of phenotypes for each trait (rownames) and individual (columnnames), as numeric or NA when missing.
See traits or traits2 example data.
spike numeric cut-off value to separate absent (qualitative) from available (quantitative) trait abundance.
qtlProfiles (optional) matrix of QTL mapping of traits (rownames) to markers (columnnames), as -log_{10}(p) values.
If qtlProfiles is NULL (default), MetaNetwork will call function qtlMapTwoPart to generate the qtlProfiles. Otherwise, MetaNetwork will use the provided QTL results.
See qtlProfiles example data.
qtlThres (optional) numeric -log_{10}(p) threshold value for significant QTLs.
If qtlThres is NULL (default), the QTL significance threshold will be estimated by simulation using qtlThreshold at alpha = 0.05 and n.simulations = 1000. The threshold is also estimated by controlling the false discovery rate at fdrThres = 0.05 using qtlFDR. The more stringent of the two outcomes (qtlThreshold or qtlFDR) is then used. Otherwise, MetaNetwork will use the provided threshold.
qtlSumm (optional) data.frame with the summary of each QTL.
If qtlSumm is NULL (default), MetaNetwork will call function qtlSummary to summarize QTL effects.
See qtlSumm example data.
corrZeroOrder (optional) matrix of zero-order correlation coefficients between metabolites.
If corrZeroOrder is NULL (default), zero-order correlation coefficients will be calculated for QTL profiles using function qtlCorrZeroOrder.
See corrZeroOrder example data.
corrSecondOrder (optional) matrix of second-order partial correlation coefficients between metabolites.
If corrSecondOrder is NULL (default), second-order partial correlation coefficients will be calculated for QTL profiles using function qtlCorrSecondOrder.
See corrSecondOrder example data.
corrMethod (optional) character string indicating which correlation method to use, either "qtl" or "abundance".
If corrMethod is "qtl" (default), MetaNetwork will call function qtlCorrZeroOrder to calculate the correlation between QTL profiles. Otherwise, when corrMethod is "abundance", MetaNetwork will use Spearman correlation via function cor to calculate the correlation between metabolite abundance profiles.
corrThres (optional) numeric threshold for significant partial correlation coefficients.
If corrThres is NULL, the empirical threshold is estimated by permutation using function qtlCorrThreshold with n.permutations = 10000. Otherwise, the provided threshold is used. Default is 0.
peaks (optional) matrix of mass/charge peaks (column 1) for each trait (rownames).
If peaks is set, MetaNetwork will call findMultiplePeaks to relate multiple mass peaks for correlated traits.
See the peaks2 example data, which holds the peaks for the unidentified metabolites in the traits2 example data.
cytoFiles (optional) boolean value that indicates if files for network visualization in Cytoscape should be created.
If TRUE (default) MetaNetwork will call function createCytoFiles to create two network files in outputdir for the significant correlations amongst metabolites: 'network.sif' and 'network.eda'.
outputdir (optional) output directory where generated data files will be stored. Default is "./MetaNetwork"
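
The input layout described above can be illustrated with a small sketch in base R. All object names and values below (toyMarkers, toyGenotypes, toyTraits, the RIL labels) are invented for illustration and are not part of the package; the example data sets shipped with the package remain the reference for the exact format.

```r
# Toy inputs in the layout described in the Arguments section.
# All names and values here are made up for illustration.
set.seed(1)
ind <- paste0("RIL", 1:6)            # hypothetical individual names

# markers: chromosome (column 1) and cM position (column 2), ordered by position
toyMarkers <- matrix(c(1, 1, 2,
                       0, 12.5, 3.1),
                     ncol = 2,
                     dimnames = list(c("m1", "m2", "m3"), c("chr", "cM")))

# genotypes: markers in rows, individuals in columns, coded 1, 2 or NA
toyGenotypes <- matrix(sample(c(1, 2), 18, replace = TRUE), nrow = 3,
                       dimnames = list(rownames(toyMarkers), ind))
toyGenotypes["m2", "RIL4"] <- NA     # a missing genotype

# traits: traits in rows, individuals in columns, numeric or NA
toyTraits <- matrix(round(runif(12, 2, 10), 2), nrow = 2,
                    dimnames = list(c("met1", "met2"), ind))

str(toyMarkers); str(toyGenotypes); str(toyTraits)
```

Marker names are shared between the markers and genotypes matrices, and individual names are shared between genotypes and traits, which is what the Note section asks for.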

Details

First, MetaNetwork maps metabolite quantitative trait loci (mQTLs) underlying variation in metabolite abundance in individuals of a segregating population using a two-part model to account for the nature of metabolite data (step A, qtlMapTwoPart). This model combines the analysis of the binary traits (positive/not-available) with conditional analysis of the quantitative trait (numeric) among individuals with a positive binary phenotype. Simulation procedures are used to assess statistical significance (step B, qtlThreshold, qtlFDR). MetaNetwork will summarize the information about significant mQTLs (step C, qtlSummary).
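
The idea of a two-part test can be sketched for a single trait and a single marker as follows. This is not the code of qtlMapTwoPart: the function name twoPartSketch, the use of a logistic regression for the binary part, a linear model for the quantitative part, the strict comparison against spike, and the chi-square reference distribution with 2 degrees of freedom are all assumptions made for illustration.

```r
# Sketch of a two-part test for one trait at one marker (not the package's code).
# y: trait abundances; g: genotypes coded 1/2; spike: cut-off for "absent".
twoPartSketch <- function(y, g, spike) {
  ok <- !is.na(y) & !is.na(g)
  y <- y[ok]
  g <- factor(g[ok], levels = c(1, 2))
  present <- as.integer(y > spike)   # assumption: values above spike count as present

  # Part 1: binary trait (present/absent) against genotype,
  # likelihood-ratio statistic from a logistic regression
  lrBinary <- 0
  if (length(unique(present)) > 1) {
    fit <- glm(present ~ g, family = binomial)
    lrBinary <- fit$null.deviance - fit$deviance
  }

  # Part 2: quantitative trait among individuals with a positive binary phenotype,
  # likelihood-ratio statistic from a normal linear model
  yq <- y[present == 1]
  gq <- droplevels(g[present == 1])
  lrQuant <- 0
  direction <- 1
  if (nlevels(gq) > 1) {
    rssFull <- sum(residuals(lm(yq ~ gq))^2)
    rssNull <- sum((yq - mean(yq))^2)
    lrQuant <- length(yq) * log(rssNull / rssFull)
    direction <- sign(mean(yq[gq == "2"]) - mean(yq[gq == "1"]))
  }

  # Combine both parts and return a signed -log10(p) score
  p <- pchisq(lrBinary + lrQuant, df = 2, lower.tail = FALSE)
  direction * -log10(p)
}

# Simulated data: genotype 2 has higher abundance; some values fall below the spike
set.seed(42)
g <- rep(c(1, 2), each = 20)
y <- c(rnorm(20, 5, 1), rnorm(20, 8, 1))
y[c(1:6, 21:22)] <- 0
score <- twoPartSketch(y, g, spike = 4)
score
```

The sign convention follows the description of qtlProfiles in the Value section: the score is positive when individuals carrying genotype 2 show higher abundance.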

Then, MetaNetwork predicts the network of potential associations between metabolites using correlations of mQTL profiles or abundance profiles (step D, qtlCorrZeroOrder; step E, qtlCorrSecondOrder). Optionally, permutation procedures can be used to assess statistical significance (step F, qtlCorrThreshold).
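
To make the difference between zero-order and second-order correlation concrete, the sketch below computes the partial correlation of two variables given two others from the inverse of their correlation matrix. The function partialCor2 is a hypothetical helper written for this illustration; how qtlCorrSecondOrder selects the conditioning variables is not described here and is not reproduced.

```r
# Second-order partial correlation of x and y given z1 and z2,
# read off the inverse of the 4 x 4 correlation matrix.
partialCor2 <- function(x, y, z1, z2, method = "pearson") {
  R <- cor(cbind(x, y, z1, z2), method = method)
  P <- solve(R)                          # inverse correlation matrix
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

# Simulated data: x and y are correlated only because both depend on z1 and z2
set.seed(7)
z1 <- rnorm(200)
z2 <- rnorm(200)
x <- z1 + z2 + rnorm(200, sd = 0.3)
y <- z1 + z2 + rnorm(200, sd = 0.3)

zeroOrder   <- cor(x, y)                  # strong, driven by z1 and z2
secondOrder <- partialCor2(x, y, z1, z2)  # close to zero once z1 and z2 are removed
c(zeroOrder = zeroOrder, secondOrder = secondOrder)
```

A high zero-order correlation that vanishes after conditioning suggests an indirect association, which is why partial correlations are useful when predicting network edges.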

Finally, MetaNetwork generates files of predicted networks, which can be visualized using Cytoscape (step G, createCytoFiles), and optionally relates multiple mass peaks per metabolite that may be a consequence of isotopes or charge differences (step H, findMultiplePeaks).
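
As a rough illustration of what Cytoscape network files contain, the sketch below writes a SIF file (one "source relation target" line per edge) and an edge-attribute file from a small correlation matrix. It is not the code of createCytoFiles: the function name writeNetworkSketch, the relation label "cor", the attribute name "Correlation" and the strict comparison against the threshold are assumptions, and the files that MetaNetwork actually writes may differ in detail.

```r
# Sketch: write a SIF file and an edge-attribute file from a correlation matrix.
# The relation label "cor" and attribute name "Correlation" are illustrative choices.
writeNetworkSketch <- function(corr, thres, dir = tempdir()) {
  # keep each pair once (upper triangle) when it exceeds the threshold
  idx <- which(upper.tri(corr) & abs(corr) > thres, arr.ind = TRUE)
  stopifnot(nrow(idx) > 0)
  from <- rownames(corr)[idx[, 1]]
  to   <- colnames(corr)[idx[, 2]]
  sif <- file.path(dir, "network.sif")
  eda <- file.path(dir, "network.eda")
  writeLines(paste(from, "cor", to), sif)
  writeLines(c("Correlation",
               paste0(from, " (cor) ", to, " = ", signif(corr[idx], 3))), eda)
  c(sif = sif, eda = eda)
}

# Toy correlation matrix with invented values
mets <- c("met1", "met2", "met3")
corr <- matrix(c(1,   0.8,  0.1,
                 0.8, 1,   -0.6,
                 0.1, -0.6, 1),
               nrow = 3, dimnames = list(mets, mets))
files <- writeNetworkSketch(corr, thres = 0.5)
readLines(files[["sif"]])
readLines(files[["eda"]])
```

The sketch writes to a temporary directory so that it does not overwrite files in outputdir.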

Analysis of about 24 metabolites takes a few minutes on a desktop computer (Pentium 4). Analysis of a metabolome of about 2000 metabolites will take around four days. In addition, MetaNetwork is able to integrate high-throughput data from future metabolomics, transcriptomics and proteomics experiments in conjunction with phenotypic data.

After running MetaNetwork with a user-provided qtlThres and default values for the remaining optional arguments, the R console will show:

>MetaNetwork (markers=markers, genotypes=genotypes, traits=traits, spike=4,
              qtlThres=3.79)        
Step A: QTL mapping.... 
         result in R object 'qtlProfiles' 
         result in ./MetaNetwork/qtlProfiles.csv 
         process time 29.25 sec 

Step B: Simulation test for QTL significance threshold....skipped 
         using user-provided QTL threshold: 3.79 

Step C: QTL summary.... 
         result in R object: 'qtlSumm' 
         result in ./MetaNetwork/qtlSumm.csv 
         process time 1.66 sec 

Step D: Zero-order correlation .... 
         result in R object: 'corrZeroOrder' 
         result in ./MetaNetwork/corrZeroOrder.csv 
         process time 2.97 sec 

Step E: 2nd-order correlation .... 
         result in R object: 'corrSecondOrder' 
         result in ./MetaNetwork/corrSecondOrder.csv 
         process time 9.58 sec 

Step F: Permutation test for 2nd-order correlation significance threshold...skipped 
         using user-provided correlation threshold: 0 

Step G: Create Cytoscape network files... 
         SIF file is: ./MetaNetwork/network.sif 
         EDA file is: ./MetaNetwork/network.eda 

Step H: Find Multiple Peaks....skipped

Value

qtlProfiles matrix of QTL mapping of traits (rownames) to markers (columnnames) as log-transformed "p values" [ -log_{10}(p)], see qtlMapTwoPart. A +/- sign is added to indicate the direction of the additive effect: values are positive if the QTL has higher metabolite abundance for individuals carrying the genotype 2 than those carrying the genotype 1; values are negative otherwise.
See qtlProfiles example data.
qtlThres estimated QTL significance threshold.
See function qtlThreshold.
qtlSumm data frame with QTL summary.
See qtlSumm example data.
corrZeroOrder matrix of zero-order correlation of QTL profiles.
See corrZeroOrder example data.
corrSecondOrder matrix of second-order correlation of QTL profiles.
See corrSecondOrder example data.
corrPermutations vector of the maximum absolute correlation values obtained from the permutations.
See function qtlCorrThreshold.
corrThres numeric correlation threshold.
See function qtlCorrThreshold.
cytoFiles network files "network.sif" and "network.eda" for Cytoscape are produced in outputdir.
See function createCytoFiles.
multiplePeaks If peaks is not NULL, data frame with Multiple Peak summary.
See multiplePeaks example data.
resultFiles If outputdir is not NULL, the above outputs will also be saved in files "qtlProfiles.csv", "qtlSumm.csv", "corrZeroOrder.csv", "corrSecondOrder.csv", "corrPermutations.csv" and "multiplePeaks.csv", respectively. A summary of analysis processing, result objects and output files is shown in the R console and saved in file "output.txt".
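
The signed scores in qtlProfiles can be read as in the following sketch, which uses an invented matrix (toyProfiles) rather than real output. The magnitude is compared with the threshold and the sign gives the direction of the additive effect; using >= for the comparison is an assumption.

```r
# Toy signed -log10(p) matrix: traits in rows, markers in columns (values invented)
toyProfiles <- matrix(c(0.4, -5.2,  1.1,
                        4.3,  0.2, -0.9),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("met1", "met2"), c("m1", "m2", "m3")))
thres <- 3.79

# The magnitude carries the evidence; the sign carries the direction
significant <- abs(toyProfiles) >= thres
hits <- which(significant, arr.ind = TRUE)

hitTable <- data.frame(
  trait  = rownames(toyProfiles)[hits[, "row"]],
  marker = colnames(toyProfiles)[hits[, "col"]],
  score  = toyProfiles[hits],
  # positive score: genotype 2 has the higher abundance; negative: genotype 1
  higherGenotype = ifelse(toyProfiles[hits] > 0, 2, 1)
)
hitTable
```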

Note

The names of individuals (columnnames) must be consistent over genotypes and traits. The names of peaks (rownames) must be consistent over peaks and traits.
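
A quick check of these requirements before a long run might look like the sketch below. The helper checkNamesSketch is hypothetical and not part of the package; it assumes that "consistent" means identical names in identical order, which is stricter than the package may require.

```r
# Hypothetical helper: stop early if names do not line up across inputs.
checkNamesSketch <- function(genotypes, traits, peaks = NULL) {
  if (!identical(colnames(genotypes), colnames(traits)))
    stop("individual names differ between genotypes and traits")
  if (!is.null(peaks) && !identical(rownames(peaks), rownames(traits)))
    stop("trait names differ between peaks and traits")
  invisible(TRUE)
}

# Toy matrices with invented names and values
geno <- matrix(1, 2, 3, dimnames = list(c("m1", "m2"), c("a", "b", "c")))
trts <- matrix(5, 2, 3, dimnames = list(c("met1", "met2"), c("a", "b", "c")))
pks  <- matrix(c(101.1, 202.2), ncol = 1, dimnames = list(c("met1", "met2"), "mz"))

checkNamesSketch(geno, trts, pks)   # passes silently
```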

Author(s)

Jingyuan Fu <j.fu@rug.nl>, Morris Swertz <m.a.swertz@rug.nl>, Ritsert Jansen <r.c.jansen@rug.nl>

References

Fu J, Swertz MA, Keurentjes JJB, Jansen RC. MetaNetwork: a computational tool for the genetic study of metabolism. Nature Protocols (2007).

http://gbic.biol.rug.nl/supplementary/2007/MetaNetwork

See Also

Use markers, genotypes and traits as example data sets or use loadData to load your own data.
Use qtlMapTwoPart for the calculation of qtlProfiles.
Use qtlThreshold and qtlFDR for the estimation of qtlThres QTL significance threshold.
Use qtlCorrZeroOrder and qtlCorrSecondOrder for the calculation of zero order and second order correlation for corrZeroOrder and corrSecondOrder respectively.
Use qtlCorrThreshold for the estimation of corrThres correlation significance threshold.
Use qtlSummary for the generation of qtlSumm QTL summary.
Use createCytoFiles for the generation of Cytoscape network files.
Use findMultiplePeaks for the relation of isotopic or differentially charged metabolites.

Examples

## load the example data provided with this package
data(genotypes)
data(traits)
data(markers)

#set qtlThres
qtlThres    <- 3.79

#run metanetwork with predefined thresholds
MetaNetwork (markers=markers, genotypes=genotypes, traits=traits, spike=4, 
             qtlThres=qtlThres)

##OR: load data from csv
#genotypes <- loadData("genotypes.csv")
#traits    <- loadData("traits.csv")
#markers   <- loadData("markers.csv")
#MetaNetwork (markers=markers, genotypes=genotypes, traits=traits, 
#             qtlThres=qtlThres, spike=4) 
             
##OR: let MetaNetwork estimate qtlThres and identify multiple peaks
#data(genotypes)
#data(traits2)
#data(markers)
#data(peaks2)
#MetaNetwork (markers=markers, genotypes=genotypes, traits=traits2, 
#             peaks=peaks2, spike=4)                           
  
##show part of the qtlProfiles
qtlProfiles[1:5,1:5]

##show part of the qtl summary
qtlSumm[1:5,]

##show part of the zero order correlation
corrZeroOrder[1:5,1:5]

##show part of the second order correlation
corrSecondOrder[1:5,1:5]

##plot the qtlProfiles
qtlPlot(markers, qtlProfiles, qtlThres)  

##load network.sif and network.eda into Cytoscape 

[Package MetaNetwork version 1.0-0 Index]