xQTL workbench file format reference

This is documentation on the data exchange format for the 'xQTL workbench' system.

To ease data exchange, this system comes with a simple 'tab separated values' file format. In such text files the data is formatted in tables with the columns separated using tabs, colons, or semi-colons. The advantage is that these files can easily be created and parsed using common spreadsheet tools like Excel. An example of such a tab-delimited file is shown below:

name	description	date
Experiment1	This is my first experiment	2010-01-19
Experiment2	This is my second experiment	2010-01-20
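
As a minimal sketch of how such a file could be read outside the xQTL workbench, the Python snippet below parses the example above with the standard csv module. The helper name read_rows is hypothetical and is not part of the system; the snippet assumes tabs as the separator.

```python
import csv
import io

# Example content copied from the table above; columns are separated by tabs.
EXAMPLE = (
    "name\tdescription\tdate\n"
    "Experiment1\tThis is my first experiment\t2010-01-19\n"
    "Experiment2\tThis is my second experiment\t2010-01-20\n"
)


def read_rows(text, delimiter="\t"):
    """Return a list of dicts, one per data row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


rows = read_rows(EXAMPLE)
print(rows[0]["name"], rows[0]["date"])
```

To read a file from disk instead, open it with newline="" and pass the file object to csv.DictReader directly.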

This document describes which file types and columns are defined for the 'xQTL workbench' system. Data in this format can be uploaded to the database via the user interface using the 'File' menu. Alternatively, a whole directory of such files can be loaded in batch using the CsvImport program. The following files are currently recognized by this program (grouped by topic):

Below, the columns for each of these file types are detailed, and example data is shown where available.

org.molgenis.auth file types

File: molgenisrole.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
Constraint: values in column name should be unique.

File: molgenisgroup.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
Constraint: values in column name should be unique.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
group__name xref YES   group_. This xref uses {group__name} to find related elements in file molgenisGroup.txt based on unique column {name}.
role__name xref YES   role_. This xref uses {role__name} to find related elements in file molgenisRole.txt based on unique column {name}.
Constraint: values in the combined columns (group_, role_) should be unique.
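
The xref mechanism described above can be sketched as a lookup by the unique column of the referenced file. This is only an illustration under assumptions: the helper names build_index and resolve_xref and the sample values are hypothetical, and the actual CsvImport program may resolve references differently.

```python
def build_index(rows, key="name"):
    """Map each value of the unique column to its row; fail on duplicates."""
    index = {}
    for row in rows:
        value = row[key]
        if value in index:
            raise ValueError(f"duplicate value in unique column {key!r}: {value!r}")
        index[value] = row
    return index


def resolve_xref(row, column, index):
    """Return the row that `column` refers to, or raise if there is none."""
    value = row[column]
    if value not in index:
        raise KeyError(f"{column}={value!r} has no matching row")
    return index[value]


# Hypothetical example data, not taken from a real installation.
groups = [{"name": "admins"}, {"name": "biologists"}]
links = [{"group__name": "biologists", "role__name": "curator"}]

group_index = build_index(groups)
print(resolve_xref(links[0], "group__name", group_index))
```

Building the index once per referenced file keeps each lookup cheap and also surfaces violations of the unique-name constraint early.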

File: person.txt

Contents:
Person represents one or more people involved with an Investigation. This may include authors on a paper, lab personnel or PIs. Person has last name, first name, mid initials, address, contact and email. A Person role is included to represent how a Person is involved with an investigation. For submission to a repository, an allowed value is 'submitter', a term that is present in the MGED Ontology; an alternative use could represent job title. An example from ArrayExpress is E-MTAB-506 ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt.
The FuGE equivalent to Person is FuGE::Person.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
firstname string     First Name.
midinitials string     Mid Initials.
lastname string     Last Name.
title string     An academic title, e.g. Prof.dr, PhD.
affiliation_name xref     Affiliation. This xref uses {affiliation_name} to find related elements in file institute.txt based on unique column {name}.
department string     Added from the old definition of MolgenisUser. Department of this contact.
roles_name xref     Indicate role of the contact, e.g. lab worker or PI. Changed from mref to xref in Oct 2011. This xref uses {roles_name} to find related elements in file personRole.txt based on unique column {name}.
Constraint: values in column name should be unique.
Constraint: values in the combined columns (firstname, midinitials, lastname) should be unique.
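
A combined-column constraint like the one above can be checked before upload with a short script. The sketch below is an assumption-laden illustration: find_duplicates is a hypothetical helper, the sample people are invented, and how the real importer treats empty values in a combined key is not stated in this document (here an empty cell is simply compared as an empty string).

```python
from collections import Counter


def find_duplicates(rows, columns):
    """Return the value combinations of `columns` that occur more than once."""
    counts = Counter(tuple(row.get(col, "") for col in columns) for row in rows)
    return [combo for combo, n in counts.items() if n > 1]


# Hypothetical example rows for person.txt.
people = [
    {"name": "jdoe", "firstname": "John", "midinitials": "", "lastname": "Doe"},
    {"name": "jdoe2", "firstname": "John", "midinitials": "", "lastname": "Doe"},
    {"name": "asmith", "firstname": "Ann", "midinitials": "B", "lastname": "Smith"},
]

print(find_duplicates(people, ["firstname", "midinitials", "lastname"]))
```

The same helper covers single-column constraints by passing a one-element column list, for example ["name"].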

File: personrole.txt

Contents:
Separate type of OntologyTerm used to administer roles.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: institute.txt

Contents:
A contact is either a person or an organization. Copied from FuGE::Contact.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
name string YES   name.
Constraint: values in column name should be unique.

File: molgenisuser.txt

Contents:
Anyone who can log in.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
firstname string     First Name.
midinitials string     Mid Initials.
lastname string     Last Name.
title string     An academic title, e.g. Prof.dr, PhD.
affiliation_name xref     Affiliation. This xref uses {affiliation_name} to find related elements in file institute.txt based on unique column {name}.
department string     Added from the old definition of MolgenisUser. Department of this contact.
roles_name xref     Indicate role of the contact, e.g. lab worker or PI. Changed from mref to xref in Oct 2011. This xref uses {roles_name} to find related elements in file personRole.txt based on unique column {name}.
password_ string   secret big fixme: password type.
activationcode string     Used as alternative authentication mechanism to verify user email and/or if user has lost password.
active bool   false Boolean to indicate if this account can be used to log in.
superuser bool   false superuser.
Constraint: values in column name should be unique.
Constraint: values in the combined columns (firstname, midinitials, lastname) should be unique.

File: molgenispermission.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
role__name xref YES   role_. This xref uses {role__name} to find related elements in file molgenisRole.txt based on unique column {name}.
entity_className xref YES   entity. This xref uses {entity_classname} to find related elements in file molgenisEntity.txt based on unique column {classname}.
permission enum YES   permission.
Constraint: values in the combined columns (role_, entity, permission) should be unique.

org.molgenis.core file types

Generic entities you can use as the starting point of your model.

File: ontologyterm.txt

Contents:
OntologyTerm defines a single entry (term) from an ontology or a controlled vocabulary (defined by Ontology). The name is the ontology term which is unique within an ontology source, such as [examples here]. Other data entities can refer to this OntologyTerm to harmonize naming of concepts. Each term should have a local, unique label within the Investigation. If no suitable ontology term exists then one can define new terms locally (in which case there is no formal accession for the term, limiting its use for cross-Investigation queries). In those cases the local name should be repeated in both term and termAccession. Maps to FuGE::OntologyIndividual; in MAGE-TAB there is no separate entity to model terms.
Optionally a local controlled vocabulary or ontology can be defined, for example to represent 'Codelists' often used in questionnaires. Note: this is not an InvestigationElement because of the additional xref_label and unique constraint. This class defines a single entry from an ontology or a controlled vocabulary.
If it is a simple controlled vocabulary, there may be no formal accession for the term. In these cases the local name should be repeated in both term and termAccession. If the term has a value, the OntologyTerm will have a single DataProperty whose value is the value for the property. For instance, for an OntologyIndividual based on the MO ontology the attributes might be: the term would be what is usually called the local name in the Ontology, for instance 'Age'; the termAccession could be 'http://mged.sourceforge.net/ontologies/MGEDOntology.owl#Age' or an arbitrary accession if one exists; the identifier is a unique identifier for individuals in the scope of the FuGE instance; the inherited name attribute should not be used; the ontologyURI of OntologySource could be 'http://mged.sourceforge.net/ontologies/MGEDOntology.owl'. The OntologyTerm subclasses are instances of Ontology classes and properties, not the actual terms themselves. An OntologyIndividual, if based on an existing Ontology, can be considered a statement that can be validated against the referenced ontology. The subclasses and their associations are based on the Ontology Definition Model, ad/2005-04-13, submitted to the OMG as a response to RFP ad/2003-03-40, Copyright 2005 DSTC Pty Ltd. Copyright 2005 IBM Copyright 2005 Sandpiper Software, Inc under the standard OMG license terms.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: ontology.txt

Contents:
Ontology defines a reference to an ontology or controlled vocabulary from which well-defined and stable (ontology) terms can be obtained. Each Ontology should have a unique name, for instance: Gene Ontology, Mammalian Phenotype, Human Phenotype Ontology, Unified Medical Language System, Medical Subject Headings, etc. An abbreviation is also required, for instance: GO, MP, HPO, UMLS, MeSH, etc. Use of existing ontologies/vocabularies is recommended to harmonize phenotypic feature and value descriptions, but one can also create a 'local' Ontology. The Ontology class maps to FuGE::Ontology, MAGE-TAB::TermSourceREF.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontologyaccession string     An identifier that uniquely identifies the ontology (typically an acronym). E.g. GO, MeSH, HPO.
ontologyuri hyperlink     (Optional) A URI that references the location of the ontology.
Constraint: values in column name should be unique.

File: molgenisfile.txt

Contents:
Helper entity to deal with files. Has a decorator to regulate storage and coupling to an Entity. Do not make abstract because of subtyping. This means the names of the subclasses will be used to distinguish MolgenisFiles and place them in the correct folders.
MS: make it use the <field type="file" property under the hood.
MS: where do the mimetypes go? I mean, I don't see the added value now.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
Constraint: values in column name should be unique.
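
The extension-to-MIME-type mapping mentioned for the extension column can be illustrated with Python's standard mimetypes module. This is a sketch of the idea only; it is an assumption that the xQTL workbench uses a comparable lookup at runtime, and mime_type_for is a hypothetical helper name.

```python
import mimetypes


def mime_type_for(extension):
    """Guess a MIME type from a bare extension such as 'png'."""
    # guess_type works on file names, so a dummy base name is prepended.
    guessed, _encoding = mimetypes.guess_type("file." + extension)
    return guessed


print(mime_type_for("png"))  # image/png
```

Extensions unknown to the module yield None, so a caller would need to choose a fallback type in that case.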

File: runtimeproperty.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
value string YES   Value.
Constraint: values in column name should be unique.

File: publication.txt

Contents:
Publication is part of the Investigation package and is used to represent information about one or more publications related to an Investigation. The publication need not be only the primary publication for an Investigation but may also represent other related information, though this use is less common. Publications have attributes for the publication's authors and also DOI and PubMed identifiers (when these are available). These are represented as OntologyTerms because in the MAGE-TAB model all 'xrefs' (cross references) for ontologies and accession numbers are handled generically. An example of a publication is available in the IDF file of ArrayExpress experiment E-MTAB-506: ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt.
The FuGE equivalent to Publication is FuGE::Bibliographic Reference.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
pubmedid_name xref     Pubmed ID. This xref uses {pubmedid_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
doi_name xref     Publication DOI. This xref uses {doi_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
authorlist text     The names of the authors of the publication.
title string YES   The title of the Publication.
status_name xref     The status of the Publication. This xref uses {status_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
year string     The year of the Publication.
journal string     The title of the Journal.
Constraint: values in column name should be unique.

File: usecase.txt

Contents:
All the use cases sent to the server are stored in this entity.

Structure:
column name type required? auto/default description
usecaseid int   n+1 UseCaseId.
usecasename string YES   UseCaseName.
searchtype string YES   SearchType.
Constraint: values in column usecasename should be unique.

File: molgenisentity.txt

Contents:
Referenceable catalog of entity names, menus, forms and plugins.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   Name of the entity.
type_ string YES   Type of the entity.
classname string YES   Full name of the entity.
Constraint: values in column classname should be unique.
Constraint: values in the combined columns (name, type_) should be unique.

org.molgenis.data file types

This package enables data models to treat part of their data model as a data matrix. This is essential for Entity-Attribute-Value modeling such as used in xgap and pheno.

File: datafile.txt

Contents:
ObservedFile stores observations that result in a file. Mapping to other models: MAGE-TAB 1.1 has the columns ArrayDataFile and DerivedArrayDataFile. To make the MAGE-TAB 1.1 model more generic we have generalized these to DataFile and provided named associations to the respective types via Scan and Assay. TODO: make this link to MolgenisFile? Or distinguish between links and data?

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
uri string YES   reference to the location of the file.
format_name xref YES   format of the file. Discussion: is this not already solved in MolgenisFile. This xref uses {format_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
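
An mref cell holds several references in one field, separated by '|'. A minimal sketch of splitting such a cell is given below; split_mref is a hypothetical helper, and stripping surrounding whitespace and ignoring empty parts are assumptions, not documented importer behaviour.

```python
def split_mref(cell, separator="|"):
    """Split an mref cell into individual reference names.

    An empty cell yields an empty list.
    """
    return [part.strip() for part in cell.split(separator) if part.strip()]


print(split_mref("ref1|ref2|ref3"))  # ['ref1', 'ref2', 'ref3']
```

Each resulting name would then be looked up in the referenced file by its unique name column, in the same way as a single xref.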

File: data.txt

Contents:
Data is a data structure to store a homogeneous matrix of observed values as one unit, that is, all data elements in the set have the same type of feature, target and value. For example: an expression qtlProfile (observation.feature) for a Panel of mice (observation.target) that consists of a matrix of Probe X marker (featureType and targetType respectively). In the user interface we expect that this observation can be shown as a bigger set of observations but click-able so the user can drill down to the underlying matrix.
Data is also an observationTarget: this allows Data to be referred to in an ObservedValue.relation. TODO: describe how this can be used to define inputs/outputs for a protocolApplication. This would allow us to use it to link 'pheno' to 'cluster' package so that the whole provenance can be administered as part of the observation models.
This class maps to XGAP.DataMatrix and MAGE-TAB.Data.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
time datetime   today time when the protocol was applied.
protocol_name xref     Reference to the protocol that is being used. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
performer_name mref     Performer. This mref uses {performer_name} to find related elements in file person.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
featuretype enum YES   Defines the type of the columns of this data set. Each column refers to a Feature or Subject.
targettype enum YES   Defines the type of the rows of this matrix. Each row refers to a Feature or Subject.
valuetype enum YES   Type of the values of this matrix, either text strings or decimal numbers.
storage enum   Binary Tells you how the data elements are stored or should be stored. For example, 'Binary'.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: binarydatamatrix.txt

Contents:
Binary file backend for a datamatrix. This extension is used to deal with the actual source file. Coupled to a matrix with source type 'BinaryFile'. This entity is not shown in the interface. Discussion: I am not so happy with the need of alternative subclasses. Instead you just need a driver.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
data_name xref YES   Reference to the datamatrix this binary file belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in column name should be unique.

File: csvdatamatrix.txt

Contents:
CSV file backend for a datamatrix. Convenient to deal with the actual source file. Coupled to a matrix with source type 'CSVFile'. This entity is not shown in the interface.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
data_name xref YES   Reference to the datamatrix this CSV file belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in column name should be unique.

File: decimaldataelement.txt

Contents:
A DataElement for storing decimal data.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
investigation_name xref     Investigation. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
protocolapplication_name xref     Reference to the protocol application that was used to produce this observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
feature_name xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. Can be omitted for 1D data (i.e., a data list). This xref uses {feature_name} to find related elements in file observationElement.txt based on unique column {name}.
target_name xref YES   References the ObservationTarget that this observation was made on. For example 'individual1'. In a correlation matrix this could also be 'probe123'. This xref uses {target_name} to find related elements in file observationElement.txt based on unique column {name}.
data_name xref YES   Reference to the data set this entity belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
featureindex int YES   Row position in the matrix.
targetindex int YES   Column position in the matrix.
value decimal     The value, e.g., correlation.
Constraint: values in the combined columns (featureindex, targetindex, data) should be unique.
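
To illustrate how rows of this file relate to a matrix, the sketch below arranges data elements into a list-of-lists using featureindex as the row position and targetindex as the column position, as in the column descriptions above. The helper build_matrix and the sample values are hypothetical, and zero-based indices are an assumption, since this document does not state the index base.

```python
def build_matrix(elements):
    """Arrange data elements into a matrix (list of row lists).

    featureindex is used as the row position and targetindex as the column
    position. Cells without a data element stay None.
    Assumption: indices are zero-based.
    """
    n_rows = max(int(e["featureindex"]) for e in elements) + 1
    n_cols = max(int(e["targetindex"]) for e in elements) + 1
    matrix = [[None] * n_cols for _ in range(n_rows)]
    for e in elements:
        r, c = int(e["featureindex"]), int(e["targetindex"])
        if matrix[r][c] is not None:
            # Mirrors the uniqueness constraint on (featureindex, targetindex, data).
            raise ValueError(f"duplicate cell ({r}, {c})")
        matrix[r][c] = float(e["value"])
    return matrix


# Hypothetical rows, all belonging to the same data set.
elements = [
    {"featureindex": "0", "targetindex": "0", "value": "0.5"},
    {"featureindex": "0", "targetindex": "1", "value": "1.5"},
    {"featureindex": "1", "targetindex": "1", "value": "2.0"},
]

print(build_matrix(elements))
```

Rows belonging to different data sets would first have to be grouped by their data_name value before calling the helper.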

File: textdataelement.txt

Contents:
Stores text data.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
investigation_name xref     Investigation. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
protocolapplication_name xref     Reference to the protocol application that was used to produce this observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
feature_name xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. Can be omitted for 1D data (i.e., a data list). This xref uses {feature_name} to find related elements in file observationElement.txt based on unique column {name}.
target_name xref YES   References the ObservationTarget that this observation was made on. For example 'individual1'. In a correlation matrix this could also be 'probe123'. This xref uses {target_name} to find related elements in file observationElement.txt based on unique column {name}.
data_name xref YES   Reference to the data set this entity belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
featureindex int YES   Row position in the matrix.
targetindex int YES   Column position in the matrix.
value string     The value, e.g., genotype strings like AA, BA, BB.
Constraint: values in the combined columns (featureindex, targetindex, data) should be unique.

File: originalfile.txt

Contents:
An unmodified original file that belongs to this datamatrix.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
data_name xref YES   Reference to the datamatrix this file belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in column name should be unique.

org.molgenis.organization file types

Generic entities you can use as the starting point of your model.

File: investigation.txt

Contents:
Investigation defines self-contained units of study. For example: Framingham study. Optionally a description and an accession to a data source can be provided. Each Investigation has a unique name and a group of subjects of observation (ObservableTarget), traits of observation (ObservableFeature), results (in ObservedValues), and optionally actions (Protocols, ProtocolApplications). 'Investigation' maps to standard XGAP/FuGE Investigation, MAGE-TAB Experiment and METABASE:Study.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
startdate datetime   today The start point of the study.
enddate datetime     The end point of the study.
contacts_name mref     Contact persons for this study. This mref uses {contacts_name} to find related elements in file person.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
accession hyperlink     (Optional) URI or accession number to indicate source of Study. E.g. arrayexpress:M-EXP-2345.
Constraint: values in column name should be unique.
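
As an illustration of producing such a file, the sketch below writes a tab-separated investigation.txt with the standard csv module and joins multiple contacts with '|'. The helper write_investigations, the chosen subset of columns, the sample values and the date notation (borrowed from the example at the top of this document) are all assumptions; the exact datetime format expected by the importer is not stated here.

```python
import csv
import io

# Subset of the columns listed above; optional columns may be left out.
COLUMNS = ["name", "description", "startdate", "contacts_name"]


def write_investigations(rows, out):
    """Write rows as tab-separated text; contacts_name is a list of names."""
    writer = csv.DictWriter(
        out, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        record = dict(row)
        # An mref cell holds all references in one field, separated by '|'.
        record["contacts_name"] = "|".join(row.get("contacts_name", []))
        writer.writerow(record)


buffer = io.StringIO()
write_investigations(
    [
        {
            "name": "Framingham study",
            "description": "Example",
            "startdate": "2010-01-19",
            "contacts_name": ["jdoe", "asmith"],  # hypothetical person names
        }
    ],
    buffer,
)
print(buffer.getvalue())
```

The names in contacts_name must match the unique name column of person.txt for the references to resolve.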

org.molgenis.pheno file types

Pheno is a generic version of XGAP developed in close collaboration within GEN2PHEN, EBI, UMC Groningen, U Groningen, FIMM, U Leicester. Todo: add docs again from pheno model. The pheno core needs to be preserved! Add changelog special section.

File: species.txt

Contents:
Ontology terms for species. E.g. Arabidopsis thaliana. DISCUSSION: should we avoid subclasses of OntologyTerm and instead make a 'tag' filter on terms so we can make pulldowns context dependent (e.g. to only show particular subqueries of ontologies).

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: alternateid.txt

Contents:
An external identifier for an annotation. For example: name='R13H8.1', ontology='ensembl' or name='WBgene00000912', ontology='wormbase'.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: observationelement.txt

Contents:
Elements that are the targets or features studied in our research.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: observationtarget.txt

Contents:
An ObservationTarget class defines the subjects of observation. For instance: individual 1 from Investigation x. The ObservationTarget class maps to XGAP:Subject, METABASE:Patient and PaGE:Abstract_Observation_Target. The name of observationTargets is unique.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
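
The two uniqueness constraints can be checked before upload. The sketch below is a minimal pre-flight check in Python, not part of the xQTL workbench itself; the function name find_duplicates and the sample data are hypothetical, and it assumes the 'investigation' part of the constraint corresponds to the investigation_name column in the file.

```python
import csv
import io
from collections import Counter


def find_duplicates(rows, columns):
    """Return the sorted key tuples that occur more than once for the given columns."""
    keys = [tuple(row.get(col, "") for col in columns) for row in rows]
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical content of an observationtarget.txt file.
SAMPLE = (
    "name\tinvestigation_name\tlabel\n"
    "ind1\tstudyA\tmouse 1\n"
    "ind2\tstudyA\tmouse 2\n"
    "ind1\tstudyB\tmouse 1 again\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
dup_name = find_duplicates(rows, ["name"])
dup_combined = find_duplicates(rows, ["name", "investigation_name"])
print(dup_name, dup_combined)
```

In this sample the combined (name, investigation) constraint holds, but the stricter constraint on name alone is violated by 'ind1'.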

File: observablefeature.txt

Contents:
ObservableFeature defines anything that can be observed in a phenotypic Investigation. For instance: Height, Systolic blood pressure, Diastolic blood pressure, and Treatment for hypertension are observable features. The name of ObservableFeature is unique within one Investigation. It is recommended that each ObservableFeature is named according to a well-defined ontology term which can be specified via ontologyReference. Note that in some instances an observableFeature can also be an observationTarget, for example in the case of correlation matrices. The ObservableFeature class maps to XGAP:Trait, METABASE:Question, FuGE:DimensionElement, and PaGE:ObservableFeature. Multi-value features can be grouped by Protocol. For instance: high blood pressure can be inferred from observations for features systolic and diastolic blood pressure. There may be many alternative protocols to measure a feature. See Protocol section.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: measurement.txt

Contents:
Generic observable feature to flexibly define a measurement.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
unit_name
xref     (Optional) Reference to the well-defined measurement unit used to observe this feature (if feature is that concrete). E.g. mmHg. This xref uses {unit_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
datatype enum   string (Optional) Reference to the technical data type. E.g. 'int'.
temporal bool   false Whether this feature is time dependent and can have different values when measured at different times (e.g. weight, temporal=true) or is generally only measured once (e.g. birth date, temporal=false).
categories_name
mref     Translation of codes into categories if applicable. This mref uses {categories_name} to find related elements in file category.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
targettypeallowedforrelation_className
xref     Subclass of ObservationTarget (Individual, Panel or Location) that can be linked to (through the 'relation' field in ObservedValue) when using this Measurement (example: a Measurement 'Species' can only result in ObservedValues that have relations to Panels). This xref uses {targettypeallowedforrelation_classname} to find related elements in file molgenisEntity.txt based on unique column {classname}.
panellabelallowedforrelation string     Label that must have been applied to the Panel that can be linked to (through the 'relation' field in ObservedValue) when using this Measurement (example: a Measurement 'Species' can only result in ObservedValues that have relations to Panels labeled as 'Species').
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: category.txt

Contents:
Special kind of ObservationElement to define categorical answer codes such as are often used in questionnaires. A list of categories can be attached to a Measurement using Measurement.categories. For example the Measurement 'sex' has {code_string = 1, label=male} and {code_string = 2, label=female}. Categories can be linked to well-defined ontology terms via the ontologyReference. Category extends ObservationElement such that it can be referenced by ObservedValue.value. The Category class maps to METABASE::Category.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text YES   Description of the code. Use of ontology term references to establish unambiguous descriptions is recommended.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
code_string string YES   The code used to represent this category. For example: { '1' codes for 'male', '2'-'female'}.
ismissing bool   false whether this code should be treated as missing value.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
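
To make the role of code_string, label and ismissing concrete, the sketch below decodes observed codes for the 'sex' example given above. It is an illustration only: the category names, the extra 'unknown' code '9' and the decode function are hypothetical and not defined by this format.

```python
# Hypothetical rows from category.txt for the Measurement 'sex'.
CATEGORIES = [
    {"name": "sex_male", "code_string": "1", "label": "male", "ismissing": False},
    {"name": "sex_female", "code_string": "2", "label": "female", "ismissing": False},
    # Assumed extra code, marked as a missing value.
    {"name": "sex_unknown", "code_string": "9", "label": "unknown", "ismissing": True},
]


def decode(code, categories):
    """Translate a code into its label; codes flagged ismissing become None."""
    for category in categories:
        if category["code_string"] == code:
            return None if category["ismissing"] else category["label"]
    raise KeyError("no category defined for code %r" % code)


decoded = [decode(code, CATEGORIES) for code in ["1", "2", "9"]]
print(decoded)
```

Codes are compared as strings because code_string is a string column, so '1' and '01' would be different categories.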

File: individual.txt

Contents:
The Individuals class defines human cases that are used as observation target. The Individual class maps to XGAP:Individual and PaGE:Individual. Note that minimal information like 'sex' can be defined as ObservedValue, and that basic relationships like 'father' and 'mother' can also be defined via ObservedRelationship, using the 'relation' field. Groups of individuals can be defined via Panel.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
mother_name
xref     Refers to the mother of the individual. This xref uses {mother_name} to find related elements in file individual.txt based on unique column {name}.
father_name
xref     Refers to the father of the individual. This xref uses {father_name} to find related elements in file individual.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
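
Because mother_name and father_name refer back to individual.txt itself, it can be useful to verify that every parent is also listed as an individual. The following Python sketch shows one way to do that; the sample data and the function name unresolved_parents are hypothetical, and how CsvImport itself reports a dangling xref is not described here.

```python
import csv
import io

# Hypothetical individual.txt content; 'ghost' is deliberately not defined.
SAMPLE = (
    "name\tmother_name\tfather_name\n"
    "mum\t\t\n"
    "dad\t\t\n"
    "child\tmum\tdad\n"
    "orphan\tghost\t\n"
)


def unresolved_parents(rows):
    """List (individual, column, reference) for parent xrefs with no matching name."""
    known = {row["name"] for row in rows}
    missing = []
    for row in rows:
        for column in ("mother_name", "father_name"):
            reference = row[column]
            if reference and reference not in known:
                missing.append((row["name"], column, reference))
    return missing


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
problems = unresolved_parents(rows)
print(problems)
```

Empty parent cells are skipped, since both xref columns are optional.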

File: location.txt

Contents:
This class defines physical locations such as buildings, departments, rooms, freezers and cages. Use ObservedValues to link locations to each other, to build a location hierarchy.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: panel.txt

Contents:
The Panel class defines groups of individuals based on cohort design, case/controls, families, etc. For instance: LifeLines cohort, 'middle aged man', 'recombinant mouse inbred Line dba x b6' or 'Smith family'. A Panel can act as a single ObservationTarget. For example: average height (ObservedValue) in the LifeLines cohort (Panel) is 174cm. The Panel class maps to XGAP:Strain and PaGE:Panel classes. In METABASE it is assumed that there is one panel per study.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
individuals_name
mref     The list of individuals in this panel. This mref uses {individuals_name} to find related elements in file individual.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
species_name
xref     The species this panel is an instance of/part of/extracted from. This xref uses {species_name} to find related elements in file species.txt based on unique column {name}.
paneltype_name
xref     Indicate the type of Panel (example: Natural=wild type, Parental=parents of a cross, F1=First generation of cross, RCC=Recombinant congenic, CSS=chromosome substitution). This xref uses {paneltype_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
founderpanels_name
mref     The panel(s) that were used to create this panel. This mref uses {founderpanels_name} to find related elements in file panel.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: observedvalue.txt

Contents:
Generic storage of values, relationships and optional ontology mapping of the value/relation. Values can be atomic observations, e.g., length (feature) of individual 1 (target) = 179cm (value). Values can also be relationship values, e.g., extract (feature) of sample 1 (target) = derived sample (relation).
Discussion: how to model sample pooling in this model?
More Discussion: do we want to have type specific subclasses? No, because you can solve this by casting during querying?

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
investigation_name
xref     Investigation. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
protocolapplication_name
xref     Reference to the protocol application that was used to produce this observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
feature_name
xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. Can be omitted for 1D data (i.e., a data list). This xref uses {feature_name} to find related elements in file observationElement.txt based on unique column {name}.
target_name
xref YES   References the ObservationTarget that this observation was made on. For example 'individual1'. In a correlation matrix this could be also 'probe123'. This xref uses {target_name} to find related elements in file observationElement.txt based on unique column {name}.
ontologyreference_name
xref     (Optional) Reference to the ontology definition or 'code' for this value (recommended for non-numeric values such as codes). This xref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
value string     The value observed.
relation_name
xref     Reference to other end of the relationship, if any. For example to a 'brother' or from 'sample' to 'derivedSample'. This xref uses {relation_name} to find related elements in file observationElement.txt based on unique column {name}.
time datetime     (Optional) Time when the value was observed. For example in time series or if feature is time-dependent like 'age'.
endtime datetime     (Optional) Time when the value's validity ended.
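
The two kinds of observed values described above (an atomic value and a relationship) can be written to a tab-separated observedvalue.txt as in the following Python sketch. The investigation, sample and feature names are made-up illustrations, only a subset of the columns is written, and storing the length as a bare number with the unit defined elsewhere is an assumption.

```python
import csv
import io

# Subset of the observedvalue.txt columns used in this illustration.
COLUMNS = ["investigation_name", "feature_name", "target_name", "value", "relation_name"]

observations = [
    # Atomic observation: length (feature) of individual1 (target) = 179 (value).
    {"investigation_name": "study1", "feature_name": "length",
     "target_name": "individual1", "value": "179"},
    # Relationship: extract (feature) of sample1 (target) -> derivedSample1 (relation).
    {"investigation_name": "study1", "feature_name": "extract",
     "target_name": "sample1", "relation_name": "derivedSample1"},
]


def to_tsv(rows, columns):
    """Serialize rows as tab-separated text; columns missing from a row stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


text = to_tsv(observations, COLUMNS)
print(text)
```

Each row fills either value or relation_name and leaves the other empty, mirroring the distinction between observed values and observed relationships.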

org.molgenis.protocol file types

Molgenis compute framework that extends the molgenis protocol framework adding the computational details

File: protocol.txt

Contents:
The Protocol class defines parameterizable descriptions of methods; each protocol has a unique name within a Study. Each ProtocolApplication can define the ObservableFeatures it can observe. Also the protocol parameters can be modeled using ObservableFeatures (users are expected to 'tag' the observable feature by setting the ObservableFeature type to 'ProtocolParameter'). Examples of protocols are: SOP for blood pressure measurement used by UK biobank, or 'R/qtl' as protocol for statistical analysis. Protocol is a high level object that represents the details of protocols used during the investigation. The uses of Protocols to process BioMaterials and Data are referenced by ProtocolApplication (in the SDRF part of the format). Protocol has an association to OntologyTerm to represent the type of protocol. Protocols are associated with Hardware, Software and Parameters used in the Protocol. An example from ArrayExpress is E-TABM-506 ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt.
The FUGE equivalent to Protocol is FuGE::Protocol.
The Protocol class maps to FuGE/XGAP/MageTab Protocol, but in contrast to FuGE it is not required to extend protocol before use. The Protocol class also maps to METABASE:Form (note that components are solved during METABASE:Visit which can be nested). Has no equivalent in PaGE.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description richtext     Description, or reference to a description, of the protocol.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
protocoltype_name
xref     Annotation of the protocol to a well-defined ontological class. This xref uses {protocoltype_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
features_name
mref     The features that can be observed using this protocol. For example 'length' or 'rs123534' or 'probe123'. Also protocol parameters are considered observable features as they are important to the interpretation of the observed values. This mref uses {features_name} to find related elements in file observableFeature.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
targetfilter string     Expression that filters the InvestigationElements that can be targeted using this protocol. This helps the user to only select from targets that matter when setting observedvalues. For example: type='individual' AND species = 'human'.
contact_name
xref     TODO Check if there can be multiple contacts. This xref uses {contact_name} to find related elements in file person.txt based on unique column {name}.
subprotocols_name
mref     Subprotocols of this protocol. This mref uses {subprotocols_name} to find related elements in file protocol.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.

File: protocolapplication.txt

Contents:
A ProtocolApplication class defines the actual action of observation by referring to a protocol and optional ParameterValues. The name field can be used to label applications with a human understandable tag. For example: the action of blood pressure measurement on 1000 individuals, using a particular protocol, resulting in 1000 associated observed values. If desired, protocols can be shared between Studies; in those cases one should simply refer to a protocol in another Study.
ProtocolApplications are used in MAGE-TAB format to refer to the protocols used, optionally with certain protocol parameter values. For example, a Source may be transformed into a Labeled Extract by the subsequent application of an Extraction and a Labeling protocol. ProtocolApplication is associated with an Edge that links input/output, e.g. Source to Labeled Extract. The order of the application of protocols can be set in order to be able to reconstruct the left-to-right order of protocol references in MAGE-TAB format. The FuGE equivalent to ProtocolApplication is FuGE:ProtocolApplication, however input/output is modeled using Edge.
The ProtocolApplication class maps to FuGE/XGAP ProtocolApplication, but in FuGE ProtocolApplications can take Material or Data (or both) as input and produce Material or Data (or both) as output. Similar to PaGE.ObservationMethod. Maps to METABASE:Visit (also note that METABASE:PlannedVisit allows for planning of protocol applications; this is outside scope for this model?).

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
time datetime   today time when the protocol was applied.
protocol_name
xref     Reference to the protocol that is being used. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
performer_name
mref     Performer. This mref uses {performer_name} to find related elements in file person.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.

File: protocoldocument.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
protocol_name
xref YES   protocol. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
document file YES   document.
Constraint: values in column name should be unique.
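
The extension column is mapped to a MIME type at runtime. How the workbench performs that mapping is not specified here; the sketch below merely illustrates the idea with Python's standard mimetypes module. The function name mime_for_extension and the fallback to 'application/octet-stream' are assumptions.

```python
import mimetypes


def mime_for_extension(extension):
    """Look up a MIME type for a bare extension such as 'png'."""
    # guess_type works on file names, so a dummy base name is prepended.
    mime, _encoding = mimetypes.guess_type("document." + extension.lstrip("."))
    # Assumed fallback for extensions the lookup table does not know.
    return mime or "application/octet-stream"


print(mime_for_extension("png"))
```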

File: workflow.txt

Contents:
A workflow is a plan to execute a series of subprotocols in a particular order. Each workflow element is another protocol, as referred to via WorkflowElement. Because Workflow extends Protocol, workflows can be nested just as any other protocol.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description richtext     Description, or reference to a description, of the protocol.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
protocoltype_name
xref     Annotation of the protocol to a well-defined ontological class. This xref uses {protocoltype_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
features_name
mref     The features that can be observed using this protocol. For example 'length' or 'rs123534' or 'probe123'. Also protocol parameters are considered observable features as they are important to the interpretation of the observed values. This mref uses {features_name} to find related elements in file observableFeature.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
targetfilter string     Expression that filters the InvestigationElements that can be targeted using this protocol. This helps the user to only select from targets that matter when setting observedvalues. For example: type='individual' AND species = 'human'.
contact_name
xref     TODO Check if there can be multiple contacts. This xref uses {contact_name} to find related elements in file person.txt based on unique column {name}.
subprotocols_name
mref     Subprotocols of this protocol. This mref uses {subprotocols_name} to find related elements in file protocol.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: workflowelement.txt

Contents:
Elements of a workflow are references to protocols. The whole workflow is a directed graph with each element pointing to the previousSteps that the current workflow element depends on.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
workflow_name
xref YES   Workflow this element is part of. This xref uses {workflow_name} to find related elements in file workflow.txt based on unique column {name}.
protocol_name
xref YES   Protocol to be used at this workflow step. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
previoussteps_name
mref     Previous steps that need to be done before this protocol can be executed. This mref uses {previoussteps_name} to find related elements in file workflowElement.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in column name should be unique.
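
Since every workflow element points to the previous steps it depends on, a valid execution order can be derived with a topological sort. The sketch below uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later); the element names are hypothetical, and this is an illustration of the data structure, not the scheduling logic of the workbench.

```python
from graphlib import TopologicalSorter

# Hypothetical workflowelement.txt content: element name -> previoussteps_name cell.
ELEMENTS = {
    "align": "",
    "call_variants": "align",
    "qc": "align",
    "report": "call_variants|qc",
}

# Map each element to the set of elements that must run before it.
graph = {
    name: {step for step in previous.split("|") if step}
    for name, previous in ELEMENTS.items()
}

# static_order() raises graphlib.CycleError if the steps depend on each other in a loop.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The relative order of independent steps such as 'call_variants' and 'qc' is not fixed; only the dependencies are guaranteed to be respected.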

File: workflowelementparameter.txt

Contents:
Element parameters are the way to link workflow elements together. It allows override of the parameters from the previous step.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
workflowelement_name
xref YES   To attach a parameter to a WorkflowElement. This xref uses {workflowelement_name} to find related elements in file workflowElement.txt based on unique column {name}.
parameter_name
xref YES   Parameter definition. This xref uses {parameter_name} to find related elements in file observableFeature.txt based on unique column {name}.
value string YES   Value of this parameter. Can be a template of form ${other} referring to previous values in context.
Constraint: values in the combined columns (workflowelement, parameter) should be unique.
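
A parameter value of the form ${other} is filled in from values already present in the context. The template engine actually used by the workbench is not stated here; as a stand-in, the Python sketch below uses string.Template, which happens to share the ${name} syntax. The parameter names and paths are hypothetical.

```python
from string import Template

# Hypothetical context holding values produced by previous workflow steps.
context = {"inputdir": "/data/run1", "sample": "sample1"}

# Hypothetical parameter value as it could appear in the 'value' column.
raw_value = "${inputdir}/${sample}.bam"

# substitute() raises KeyError when a referenced name is missing from the context.
resolved = Template(raw_value).substitute(context)
print(resolved)
```

Failing loudly on an unknown name is a design choice of this sketch; a real engine might instead leave the placeholder untouched or report the problem differently.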

org.molgenis.xgap file types

XGAP

File: chromosome.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
ordernr int YES   orderNr.
isautosomal bool YES   Is 'yes' when number of chromosomes is equal in male and female individuals, i.e., if not a sex chromosome.
bplength int     Length of the chromosome in base pairs.
species_name
xref     Reference to the species this chromosome belongs to. This xref uses {species_name} to find related elements in file species.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: nmrbin.txt

Contents:
Shift of the NMR frequency due to the chemical environment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
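
Example (illustrative sketch):
The mref columns above hold several references in one cell, separated by '|'. The snippet below shows one way such a cell could be split into individual names. It is a minimal sketch, not the importer's actual code: the function name split_mref and the choice to trim whitespace and skip empty fragments are assumptions.

```python
def split_mref(cell):
    """Split an mref cell such as 'ref1|ref2|ref3' into a list of names."""
    if cell is None:
        return []
    # Assumption: surrounding whitespace is not significant and
    # empty fragments (e.g. from a trailing '|') are ignored.
    return [part.strip() for part in cell.split("|") if part.strip()]


names = split_mref("ref1|ref2|ref3")
```

Each resulting name would then be looked up in the referenced file (for example alternateId.txt) by its unique name column.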

File: clone.txt

Contents:
BAC clone fragment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: derivedtrait.txt

Contents:
Any meta trait, e.g. false discovery rates, P-values, thresholds.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: environmentalfactor.txt

Contents:
Experimental conditions, such as temperature differences, batch effects etc.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: gene.txt

Contents:
Trait annotations specific for genes.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     Main symbol this gene is known by (not necessarily unique, in contrast to 'name').
orientation enum     Orientation of the gene on the genome (F=forward, R=reverse).
control bool     Indicates whether this is a 'housekeeping' gene that can be used as control.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
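
Example (illustrative sketch):
To show how the typed columns of gene.txt might be read, the snippet below parses tab-separated text and converts cm to a decimal, bpstart and bpend to integers, and checks orientation against the F/R values listed above. It is a sketch under assumptions: the function name parse_gene_rows, the sample rows, and the treatment of empty cells as missing values are all hypothetical and not taken from the xQTL workbench importer.

```python
import csv
import io
from decimal import Decimal

VALID_ORIENTATIONS = {"F", "R"}  # F=forward, R=reverse, as listed in the table


def parse_gene_rows(text):
    """Parse tab-separated gene.txt content into dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    genes = []
    for row in reader:
        if not row.get("name"):
            raise ValueError("column 'name' is required")
        orientation = row.get("orientation") or None
        if orientation is not None and orientation not in VALID_ORIENTATIONS:
            raise ValueError(f"unknown orientation: {orientation!r}")
        genes.append({
            "name": row["name"],
            "chromosome_name": row.get("chromosome_name") or None,
            # Assumption: an empty cell means the optional value is absent.
            "cm": Decimal(row["cm"]) if row.get("cm") else None,
            "bpstart": int(row["bpstart"]) if row.get("bpstart") else None,
            "bpend": int(row["bpend"]) if row.get("bpend") else None,
            "orientation": orientation,
        })
    return genes


# Hypothetical sample data, not taken from a real study.
SAMPLE = (
    "name\tchromosome_name\tcm\tbpstart\tbpend\torientation\n"
    "geneA\tchr1\t12.5\t1000\t2500\tF\n"
    "geneB\tchr1\t\t\t\tR\n"
)

genes = parse_gene_rows(SAMPLE)
```

Python integers are unbounded, so the 'long' columns need no special handling here.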

File: transcript.txt

Contents:
Trait annotations specific for transcripts.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
gene_name
xref     The gene that produces this transcript. This xref uses {gene_name} to find related elements in file gene.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
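
Example (illustrative sketch):
An xref column such as gene_name points at a row in another file through that file's unique name column. The snippet below sketches how such a lookup could work for transcript.txt against gene.txt. The function resolve_xref and the sample rows are hypothetical; how the actual importer reports unmatched references is not documented here.

```python
def resolve_xref(rows, xref_column, target_rows, target_key="name"):
    """Pair each row with the target row its xref names.

    Returns (resolved, missing): a list of (row, target) pairs and a list
    of xref values that match no target row.
    """
    index = {target[target_key]: target for target in target_rows}
    resolved, missing = [], []
    for row in rows:
        value = row.get(xref_column)
        if not value:
            continue  # gene_name is not marked as required, so empty is allowed
        if value in index:
            resolved.append((row, index[value]))
        else:
            missing.append(value)
    return resolved, missing


# Hypothetical rows standing in for gene.txt and transcript.txt.
genes = [{"name": "geneA"}, {"name": "geneB"}]
transcripts = [
    {"name": "tx1", "gene_name": "geneA"},
    {"name": "tx2", "gene_name": "geneX"},
    {"name": "tx3", "gene_name": ""},
]
resolved, missing = resolve_xref(transcripts, "gene_name", genes)
```

Building the index once keeps the lookup cheap even when the referenced file is large.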

File: protein.txt

Contents:
Trait annotations specific for proteins.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
gene_name
xref     The gene that produces this protein. This xref uses {gene_name} to find related elements in file gene.txt based on unique column {name}.
transcript_name
xref     The transcript variant that produces this protein. This xref uses {transcript_name} to find related elements in file transcript.txt based on unique column {name}.
aminosequence text     The amino acid sequence.
mass decimal     The mass of this protein.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: metabolite.txt

Contents:
Trait annotations specific for metabolites.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
formula string     The chemical formula of a metabolite.
mass decimal     The mass of this metabolite.
structure text     The chemical structure of a metabolite (in SMILES representation).
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: marker.txt

Contents:
Trait annotations specific for markers.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
reportsfor_name
mref     The marker (or a subclass like 'SNP') that this marker (or a subclass like 'SNP') reports for. This mref uses {reportsfor_name} to find related elements in file marker.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: snp.txt

Contents:
A SNP is a special kind of Marker, but can also be seen as a phenotype to map against in some cases. A single-nucleotide polymorphism is a DNA sequence variation occurring when a single nucleotide in the genome (or other shared sequence) differs between members of a biological species or paired chromosomes in an individual.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
reportsfor_name
mref     The marker (or a subclass like 'SNP') that this marker (or a subclass like 'SNP') reports for. This mref uses {reportsfor_name} to find related elements in file marker.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
status string     The status of this SNP, e.g. 'confirmed'.
polymorphism_name
mref     The polymorphism that belongs to this SNP. This mref uses {polymorphism_name} to find related elements in file polymorphism.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: polymorphism.txt

Contents:
The difference of a single base discovered between two sequenced individuals.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
base enum YES   The affected DNA base. Note that you can select the reference base here.
value string     The strain/genotype for which this polymorphism was discovered. E.g. 'N2' or 'CB4856'.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: probe.txt

Contents:
A piece of sequence that reports for the expression of a gene, typically spotted onto a microarray.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
mismatch bool   false Indicating whether the probe is a match.
probeset_name
xref     Optional: probeset this probe belongs to (e.g., in Affymetrix assays). This xref uses {probeset_name} to find related elements in file probeSet.txt based on unique column {name}.
reportsfor_name
xref     The gene this probe reports for. This xref uses {reportsfor_name} to find related elements in file gene.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: spot.txt

Contents:
This is the spot on a microarray.
Note: We don't distinguish between probes (the sequence) and spots (the sequence as spotted on the array).

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
mismatch bool   false Indicating whether the probe is a match.
probeset_name
xref     Optional: probeset this probe belongs to (e.g., in Affymetrix assays). This xref uses {probeset_name} to find related elements in file probeSet.txt based on unique column {name}.
reportsfor_name
xref     The gene this probe reports for. This xref uses {reportsfor_name} to find related elements in file gene.txt based on unique column {name}.
x int YES   Row.
y int YES   Column.
gridx int     Meta Row.
gridy int     Meta Column.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
Constraint: values in the combined columns (x, y, gridx, gridy) should be unique.
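
Example (illustrative sketch):
The uniqueness constraints listed above can be checked before upload by counting how often each combination of column values occurs. The snippet below is a minimal sketch of such a check. The function find_duplicates and the sample rows are hypothetical, and the way the real importer treats empty gridx/gridy cells in this comparison is an assumption (here an empty cell is compared like any other value).

```python
from collections import Counter


def find_duplicates(rows, columns):
    """Return the value combinations that occur more than once."""
    counts = Counter(tuple(row.get(col) for col in columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical rows standing in for spot.txt; s1 and s3 share a position.
spots = [
    {"name": "s1", "investigation_name": "inv1",
     "x": "1", "y": "1", "gridx": "1", "gridy": "1"},
    {"name": "s2", "investigation_name": "inv1",
     "x": "1", "y": "2", "gridx": "1", "gridy": "1"},
    {"name": "s3", "investigation_name": "inv1",
     "x": "1", "y": "1", "gridx": "1", "gridy": "1"},
]

name_duplicates = find_duplicates(spots, ["name"])
position_duplicates = find_duplicates(spots, ["x", "y", "gridx", "gridy"])
```

The same helper covers the single-column constraint on name and the combined (name, investigation) constraint by passing the relevant column list.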

File: probeset.txt

Contents:
A set of probes. For example, an Affymetrix probeset has multiple probes. It implements locus because sometimes you want to give the complete set of probes a range, for example to indicate that this set of probes spans basepair 0 through 10,000,000 on chromosome 3. The same information could arguably also be queried from the probes themselves, but if you have 40k probes, retrieving it from the ProbeSet alone (if annotated) would be much faster.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: masspeak.txt

Contents:
A peak that has been selected within a mass spectrometry experiment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
mz decimal     Mass over charge ratio of this peak.
retentiontime decimal     The retention-time of this peak in minutes.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: investigationfile.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
description text     description field.
investigation_name
xref YES   Reference to the Study. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
Constraint: values in column name should be unique.
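
Example (illustrative sketch):
The extension column is mapped to a MIME type at runtime, so that 'png' is served as 'image/png'. How the xQTL workbench performs this mapping is not described here; the snippet below only illustrates the idea with Python's standard mimetypes module. The function name mime_type_for_extension and the fallback type are assumptions.

```python
import mimetypes


def mime_type_for_extension(extension, default="application/octet-stream"):
    """Look up a MIME type for a bare file extension such as 'png'."""
    # mimetypes.guess_type works on file names, so a placeholder base
    # name is prepended; a leading dot and upper case are tolerated.
    filename = "file." + extension.lstrip(".").lower()
    guessed, _encoding = mimetypes.guess_type(filename)
    # Assumption: unknown extensions fall back to a generic binary type.
    return guessed or default


png_type = mime_type_for_extension("png")
```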

File: tissue.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty, it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.
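
Example (illustrative sketch):
A term with an empty termaccession is assumed to be locally defined. The snippet below sketches how rows could be separated on that rule. The function is_locally_defined and the sample rows, including the ontology name and accession, are made up for illustration; treating whitespace-only cells as empty is an assumption.

```python
def is_locally_defined(term):
    """Return True when the term has no accession in a source ontology."""
    accession = term.get("termaccession") or ""
    # Assumption: a cell holding only whitespace counts as empty.
    return accession.strip() == ""


# Hypothetical rows standing in for tissue.txt.
tissues = [
    {"name": "liver", "ontology_name": "ExampleOntology",
     "termaccession": "EX:0001"},
    {"name": "my custom tissue", "ontology_name": "", "termaccession": ""},
]
local_terms = [t["name"] for t in tissues if is_locally_defined(t)]
```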

File: samplelabel.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty, it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: sample.txt

Contents:
.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User-friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. The label allows for a human-readable name that is potentially not unique.
individual_name
xref     The individual from which this sample was taken. This xref uses {individual_name} to find related elements in file individual.txt based on unique column {name}.
tissue_name
xref     The tissue from which this sample was taken. This xref uses {tissue_name} to find related elements in file tissue.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
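
As a rough illustration of the mref convention used by ontologyreference_name and alternateid_name above, the following Python sketch reads a small tab-separated sample table and splits the alternateid_name cell on '|'. The example rows, the identifiers in them, and the helper names split_mref and read_samples are hypothetical; they are not part of the xQTL workbench.

```python
import csv
import io

# Hypothetical sample.txt content: tabs separate columns, '|' separates mref values.
SAMPLE_TXT = (
    "name\tinvestigation_name\talternateid_name\tindividual_name\n"
    "sample1\tstudyA\tS-001|LAB-17\tmouse1\n"
    "sample2\tstudyA\t\tmouse2\n"
)


def split_mref(cell):
    """Split an mref cell on '|' and drop empty entries."""
    return [ref.strip() for ref in cell.split("|") if ref.strip()]


def read_samples(text):
    """Parse tab-separated text and turn the mref column into a list."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        row["alternateid_name"] = split_mref(row["alternateid_name"] or "")
        rows.append(row)
    return rows


samples = read_samples(SAMPLE_TXT)
for sample in samples:
    print(sample["name"], sample["alternateid_name"])
```

An empty mref cell yields an empty list, so optional references need no special handling downstream.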

File: pairedsample.txt

Contents:
A pair of samples labeled for a two-color microarray experiment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
subject1_name
xref YES   The first subject. This xref uses {subject1_name} to find related elements in file individual.txt based on unique column {name}.
label1_name
xref     Which channel or Fluorescent labeling is associated with the first subject. This xref uses {label1_name} to find related elements in file sampleLabel.txt based on unique column {name}.
subject2_name
xref YES   The second subject. This xref uses {subject2_name} to find related elements in file individual.txt based on unique column {name}.
label2_name
xref     Which channel or Fluorescent labeling is associated with the second subject. This xref uses {label2_name} to find related elements in file sampleLabel.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
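
The 'required?' column can be checked before upload. The sketch below, in Python, assumes that a required column must be present and non-empty; it lists the columns marked YES for pairedsample.txt and reports which ones a row is missing. The function name missing_required and the example rows are hypothetical.

```python
# Columns marked YES in the pairedsample.txt structure table above.
REQUIRED_COLUMNS = ("name", "subject1_name", "subject2_name")


def missing_required(row, required=REQUIRED_COLUMNS):
    """Return the required columns that are absent or blank in this row."""
    return [col for col in required if not (row.get(col) or "").strip()]


# Hypothetical rows: the second one lacks the second subject.
rows = [
    {"name": "pair1", "subject1_name": "mouse1", "subject2_name": "mouse2"},
    {"name": "pair2", "subject1_name": "mouse3", "subject2_name": ""},
]

problems = {row["name"]: missing_required(row) for row in rows}
print(problems)
```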

org.molgenis.cluster file types

Cluster calculation tables.

File: job.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
outputdataname string YES   Name of the matrix that will be written.
timestamp string YES   Date and time when the job was started.
analysis_name
xref YES   Analysis. This xref uses {analysis_name} to find related elements in file analysis.txt based on unique column {name}.
computeresource enum   local ComputeResource.
Constraint: values in column outputdataname should be unique.
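
The 'auto/default' column can be read as: if the cell is left empty, the listed value is used. The Python sketch below illustrates that reading for job.txt, where id is generated as n+1 and computeresource defaults to 'local'. Interpreting n+1 as "previous id plus one" is an assumption, and the function name apply_defaults, the timestamps, and the 'cluster' value are made up for the example.

```python
def apply_defaults(rows, defaults, last_id=0):
    """Return copies of rows with missing ids and default values filled in."""
    filled = []
    for row in rows:
        # Start from the defaults, then overlay every non-empty supplied value.
        new_row = dict(defaults)
        new_row.update({k: v for k, v in row.items() if v not in (None, "")})
        if "id" not in new_row:
            new_row["id"] = last_id + 1  # assumed meaning of 'n+1'
        last_id = int(new_row["id"])
        filled.append(new_row)
    return filled


jobs = apply_defaults(
    [
        {"outputdataname": "run1", "timestamp": "2010-01-19", "analysis_name": "scan"},
        {"outputdataname": "run2", "timestamp": "2010-01-20", "analysis_name": "scan",
         "computeresource": "cluster"},  # 'cluster' is a hypothetical enum value
    ],
    defaults={"computeresource": "local"},
)
for job in jobs:
    print(job["id"], job["outputdataname"], job["computeresource"])
```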

File: subjob.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
job_OutputDataName
xref YES   Reference to the job this subjob belongs to. This xref uses {job_outputdataname} to find related elements in file job.txt based on unique column {outputdataname}.
statuscode int YES   Status code of this subjob.
statustext string YES   Status text of this subjob.
statusprogress int     Percentage done.
nr int YES   Number of this subjob within the job.
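
An xref such as job_outputdataname only resolves if the referenced value exists in the unique column of the target file. A minimal Python sketch of such a check is shown below; the file contents, the timestamp format, and the helper names read_tsv and unresolved_xrefs are hypothetical, and the real importer may behave differently.

```python
import csv
import io

# Hypothetical job.txt and subjob.txt contents.
JOB_TXT = (
    "outputdataname\ttimestamp\tanalysis_name\n"
    "qtl_run1\t2010-01-19 10:00\tqtlscan\n"
)
SUBJOB_TXT = (
    "job_outputdataname\tstatuscode\tstatustext\tnr\n"
    "qtl_run1\t0\tqueued\t1\n"
    "qtl_run9\t0\tqueued\t1\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def unresolved_xrefs(rows, xref_column, target_rows, target_column):
    """Return xref values that have no match in the target's unique column."""
    known = {row[target_column] for row in target_rows}
    return [row[xref_column] for row in rows if row[xref_column] not in known]


missing = unresolved_xrefs(
    read_tsv(SUBJOB_TXT), "job_outputdataname", read_tsv(JOB_TXT), "outputdataname"
)
print(missing)
```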

File: analysis.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     Optional description of this type of analysis.
parameterset_name
xref YES   ParameterSet. This xref uses {parameterset_name} to find related elements in file parameterSet.txt based on unique column {name}.
dataset_name
xref YES   DataSet. This xref uses {dataset_name} to find related elements in file dataSet.txt based on unique column {name}.
targetfunctionname string YES   The function used to start a specific type of analysis on the cluster.
Constraint: values in column name should be unique.

File: parameterset.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
Constraint: values in column name should be unique.

File: parametername.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
parameterset_name
xref YES   ParameterSet. This xref uses {parameterset_name} to find related elements in file parameterSet.txt based on unique column {name}.
description text     Optional description of this parameter.
Constraint: values in the combined columns (name, parameterset) should be unique.

File: parametervalue.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
parametername_name
xref YES   ParameterName. This xref uses {parametername_name} to find related elements in file parameterName.txt based on unique column {name}.
value string YES   Possible value of this parameter.
Constraint: values in the combined columns (name, parametername) should be unique.
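
A combined-column constraint means that the tuple of values must be unique, not each column on its own. The Python sketch below finds tuples that occur more than once; the rows and the function name duplicate_keys are hypothetical, and it assumes the xref column parametername_name carries the 'parametername' part of the constraint.

```python
from collections import Counter


def duplicate_keys(rows, columns):
    """Return the value tuples that occur more than once for the given columns."""
    counts = Counter(tuple(row[col] for col in columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical parametervalue.txt rows.
rows = [
    {"name": "low", "parametername_name": "stepsize", "value": "1"},
    {"name": "high", "parametername_name": "stepsize", "value": "5"},
    {"name": "low", "parametername_name": "threshold", "value": "2"},
    {"name": "low", "parametername_name": "stepsize", "value": "3"},
]

dups = duplicate_keys(rows, ["name", "parametername_name"])
print(dups)
```

Here 'low' may appear under both 'stepsize' and 'threshold', but not twice under 'stepsize'.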

File: dataset.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
Constraint: values in column name should be unique.

File: dataname.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
dataset_name
xref YES   DataSet. This xref uses {dataset_name} to find related elements in file dataSet.txt based on unique column {name}.
Constraint: values in the combined columns (name, dataset) should be unique.

File: datavalue.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
dataname_name
xref YES   DataName. This xref uses {dataname_name} to find related elements in file dataName.txt based on unique column {name}.
value_name
xref YES   Possible reference of this Data. This xref uses {value_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in the combined columns (name, dataname) should be unique.

File: selectedparameter.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
job_OutputDataName
xref YES   Job. This xref uses {job_outputdataname} to find related elements in file job.txt based on unique column {outputdataname}.
parametername string YES   Copied name of this parameter.
parametervalue string YES   Copied value of this parameter.

File: selecteddata.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
job_OutputDataName
xref YES   Job. This xref uses {job_outputdataname} to find related elements in file job.txt based on unique column {outputdataname}.
dataname string YES   Copied name of this Data.
datavalue string YES   Copied referenced name of this Data.

File: rscript.txt

Contents:
Proof of concept to show users can add scripts to database, to be replaced later with more generic version from compute model.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
description text     description field.
investigation_name
xref YES   Reference to the Study. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
Constraint: values in column name should be unique.

xqtl file types

File: molgenisrole.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
Constraint: values in column name should be unique.

File: molgenisgroup.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
Constraint: values in column name should be unique.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
group__name
xref YES   group_. This xref uses {group__name} to find related elements in file molgenisGroup.txt based on unique column {name}.
role__name
xref YES   role_. This xref uses {role__name} to find related elements in file molgenisRole.txt based on unique column {name}.
Constraint: values in the combined columns (group_, role_) should be unique.

File: person.txt

Contents:
Person represents one or more people involved with an Investigation. This may include authors on a paper, lab personnel or PIs. Person has a last name, first name, middle initials, address, contact and email. A Person role is included to represent how a Person is involved with an investigation. For repository submission purposes an allowed value is 'submitter', a term that is present in the MGED Ontology; an alternative use could be to represent a job title. An example from ArrayExpress is E-TABM-506: ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt.
The FuGE equivalent to Person is FuGE::Person.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
firstname string     First Name.
midinitials string     Mid Initials.
lastname string     Last Name.
title string     An academic title, e.g. Prof.dr, PhD.
affiliation_name
xref     Affiliation. This xref uses {affiliation_name} to find related elements in file institute.txt based on unique column {name}.
department string     Added from the old definition of MolgenisUser. Department of this contact.
roles_name
xref     Indicates the role of the contact, e.g. lab worker or PI. Changed from mref to xref in Oct 2011. This xref uses {roles_name} to find related elements in file personRole.txt based on unique column {name}.
Constraint: values in column name should be unique.
Constraint: values in the combined columns (firstname, midinitials, lastname) should be unique.

File: personrole.txt

Contents:
Separate type of ontologyTerm used to administer roles.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: institute.txt

Contents:
A contact is either a person or an organization. Copied from FuGE::Contact.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
name string YES   name.
Constraint: values in column name should be unique.

File: molgenisuser.txt

Contents:
Anyone who can log in.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   name.
address text     The address of the Contact.
phone string     The telephone number of the Contact including the suitable area codes.
email string     The email address of the Contact.
fax string     The fax number of the Contact.
tollfreephone string     A toll free phone number for the Contact, including suitable area codes.
city string     Added from the old definition of MolgenisUser. City of this contact.
country string     Added from the old definition of MolgenisUser. Country of this contact.
firstname string     First Name.
midinitials string     Mid Initials.
lastname string     Last Name.
title string     An academic title, e.g. Prof.dr, PhD.
affiliation_name
xref     Affiliation. This xref uses {affiliation_name} to find related elements in file institute.txt based on unique column {name}.
department string     Added from the old definition of MolgenisUser. Department of this contact.
roles_name
xref     Indicates the role of the contact, e.g. lab worker or PI. Changed from mref to xref in Oct 2011. This xref uses {roles_name} to find related elements in file personRole.txt based on unique column {name}.
password_ string   secret big fixme: password type.
activationcode string     Used as alternative authentication mechanism to verify user email and/or if user has lost password.
active bool   false Boolean to indicate if this account can be used to login.
superuser bool   false superuser.
Constraint: values in column name should be unique.
Constraint: values in the combined columns (firstname, midinitials, lastname) should be unique.

File: molgenispermission.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
role__name
xref YES   role_. This xref uses {role__name} to find related elements in file molgenisRole.txt based on unique column {name}.
entity_className
xref YES   entity. This xref uses {entity_classname} to find related elements in file molgenisEntity.txt based on unique column {classname}.
permission enum YES   permission.
Constraint: values in the combined columns (role_, entity, permission) should be unique.

File: ontologyterm.txt

Contents:
OntologyTerm defines a single entry (term) from an ontology or a controlled vocabulary (defined by Ontology). The name is the ontology term, which is unique within an ontology source, such as [examples here]. Other data entities can refer to this OntologyTerm to harmonize naming of concepts. Each term should have a local, unique label within the Investigation. If no suitable ontology term exists then one can define new terms locally (in which case there is no formal accession for the term, which limits its use for cross-Investigation queries). In those cases the local name should be repeated in both term and termAccession. Maps to FuGE::OntologyIndividual; in MAGE-TAB there is no separate entity to model terms.
Optionally a local controlled vocabulary or ontology can be defined, for example to represent 'Codelists' often used in questionnaires. Note: this is not an InvestigationElement because of the additional xref_label and unique constraint. This class defines a single entry from an ontology or a controlled vocabulary.
If it is a simple controlled vocabulary, there may be no formal accession for the term. In these cases the local name should be repeated in both term and termAccession. If the term has a value, the OntologyTerm will have a single DataProperty whose value was the value for the property. For instance, for an OntologyIndividual based on the MO ontology the attributes might be: the term would be what is usually called the local name in the Ontology, for instance 'Age'; the termAccession could be 'http://mged.sourceforge.net/ontologies/MGEDOntology.owl#Age' or an arbitrary accession if one exists; the identifier is a unique identifier for individuals in the scope of the FuGE instance; the inherited name attribute should not be used; the ontologyURI of OntologySource could be 'http://mged.sourceforge.net/ontologies/MGEDOntology.owl'. The OntologyTerm subclasses are instances of Ontology classes and properties, not the actual terms themselves. An OntologyIndividual, if based on an existing Ontology, can be considered a statement that can be validated against the referenced ontology. The subclasses and their associations are based on the Ontology Definition Model, ad/2005-04-13, submitted to the OMG as a response to RFP ad/2003-03-40, Copyright 2005 DSTC Pty Ltd., Copyright 2005 IBM, Copyright 2005 Sandpiper Software, Inc., under the standard OMG license terms.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.
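
The termaccession description states that an empty value marks a locally defined term. The Python sketch below applies that rule to a few rows; the ontology name, accession, and term names are invented for the example, and is_local_term is a hypothetical helper.

```python
def is_local_term(row):
    """A term with an empty termaccession is assumed to be locally defined."""
    return not (row.get("termaccession") or "").strip()


# Hypothetical ontologyterm.txt rows; 'MyOntology' and 'ONT:0001' are made up.
terms = [
    {"name": "Age", "ontology_name": "MyOntology", "termaccession": "ONT:0001"},
    {"name": "CageNumber", "ontology_name": "", "termaccession": ""},
    {"name": "FeedBatch", "ontology_name": ""},
]

local_names = [term["name"] for term in terms if is_local_term(term)]
print(local_names)
```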

File: ontology.txt

Contents:
Ontology defines a reference to an ontology or controlled vocabulary from which well-defined and stable (ontology) terms can be obtained. Each Ontology should have a unique name, for instance: Gene Ontology, Mammalian Phenotype, Human Phenotype Ontology, Unified Medical Language System, Medical Subject Headings, etc. An abbreviation is also required, for instance: GO, MP, HPO, UMLS, MeSH, etc. Use of existing ontologies/vocabularies is recommended to harmonize phenotypic feature and value descriptions, but one can also create a 'local' Ontology. The Ontology class maps to FuGE::Ontology, MAGE-TAB::TermSourceREF.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontologyaccession string     An identifier that uniquely identifies the ontology (typically an acronym). E.g. GO, MeSH, HPO.
ontologyuri hyperlink     (Optional) A URI that references the location of the ontology.
Constraint: values in column name should be unique.

File: molgenisfile.txt

Contents:
Helper entity to deal with files. Has a decorator to regulate storage and coupling to an Entity. Do not make abstract because of subtyping. This means the names of the subclasses will be used to distinguish MolgenisFiles and place them in the correct folders.
MS: make it use the <field type="file" property under the hood.
MS: where do the mimetypes go? I mean, I don't see the added value now.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
Constraint: values in column name should be unique.
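
The extension column is described as being mapped to a MIME type at runtime. How the workbench does this is not specified here; as an illustration only, the Python sketch below uses the standard library's mimetypes module to perform a comparable lookup. The function name mime_for_extension and the fallback type are assumptions.

```python
import mimetypes


def mime_for_extension(extension, fallback="application/octet-stream"):
    """Look up a MIME type for a bare file extension such as 'png'."""
    # guess_type works on file names, so build a dummy name from the extension.
    guessed, _encoding = mimetypes.guess_type("file." + extension.lstrip("."))
    return guessed or fallback


print(mime_for_extension("png"))
```

With this lookup, the 'png' example from the table resolves to 'image/png'; extensions unknown to the local MIME database fall back to the generic binary type.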

File: runtimeproperty.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
value string YES   Value.
Constraint: values in column name should be unique.

File: publication.txt

Contents:
Publication is part of the Investigation package and is used to represent information about one or more publications related to an Investigation. The publication need not be the primary publication for an Investigation but may also represent other related information, though this use is less common. Publications have attributes for the publication's authors and also DOI and PubMed identifiers (when these are available). These are represented as OntologyTerms, because in the MAGE-TAB model all 'xrefs' (cross references) for ontologies and accession numbers are handled generically. An example of a publication is available in an IDF file from ArrayExpress, experiment E-TABM-506: ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt.
The FuGE equivalent to Publication is FuGE::Bibliographic Reference.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
pubmedid_name
xref     Pubmed ID. This xref uses {pubmedid_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
doi_name
xref     Publication DOI. This xref uses {doi_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
authorlist text     The names of the authors of the publication.
title string YES   The title of the Publication.
status_name
xref     The status of the Publication. This xref uses {status_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
year string     The year of the Publication.
journal string     The title of the Journal.
Constraint: values in column name should be unique.

File: usecase.txt

Contents:
All the use cases sent to the server are stored in this entity.

Structure:
column name type required? auto/default description
usecaseid int   n+1 UseCaseId.
usecasename string YES   UseCaseName.
searchtype string YES   SearchType.
Constraint: values in column usecasename should be unique.

File: molgenisentity.txt

Contents:
Referenceable catalog of entity names, menus, forms and plugins.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   Name of the entity.
type_ string YES   Type of the entity.
classname string YES   Full name of the entity.
Constraint: values in column classname should be unique.
Constraint: values in the combined columns (name, type_) should be unique.

File: datafile.txt

Contents:
ObservedFile is used to store observations that result in a file. Mapping to other models: MAGE-TAB 1.1 has the columns ArrayDataFile and DerivedArrayDataFile. In order to make the MAGE-TAB 1.1 model more generic we have generalized these to DataFile and provided named associations to the respective types via Scan and Assay. TODO: make this link to MolgenisFile? Or distinguish between links and data?

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
uri string YES   reference to the location of the file.
format_name
xref YES   Format of the file. Discussion: is this not already solved in MolgenisFile? This xref uses {format_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: data.txt

Contents:
Data is a data structure to store a homogeneous matrix of observedvalues as one unit, that is, all data elements in the set have the same type of feature, target and value. For example: an expression qtlProfile (observation.feature) for a Panel of mouse (observation.target) that consists of a matrix of Probe X marker (featureType and targetType respectively). In the user interface we expect that this observation can be shown as a bigger set of observations but click-able so the user can drill down to the underlying matrix.
Data is also an observationTarget: this allows Data to be referred to in an ObservedValue.relation. TODO: describe how this can be used to define inputs/outputs for a protocolApplication. This would allow us to use it to link 'pheno' to the 'cluster' package so that the whole provenance can be administered as part of the observation models.
This class maps to XGAP.DataMatrix and MAGE-TAB.Data.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
time datetime   today time when the protocol was applied.
protocol_name
xref     Reference to the protocol that is being used. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
performer_name
mref     Performer. This mref uses {performer_name} to find related elements in file person.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
featuretype enum YES   Defines the type of the columns of this data set. Each column refers to a Feature or Subject.
targettype enum YES   Defines the type of the rows of this matrix. Each row refers to a Feature or Subject.
valuetype enum YES   Type of the values of this matrix, either text strings or decimal numbers.
storage enum   Binary Tells you how the data elements are stored or should be stored. For example, 'Binary'.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: binarydatamatrix.txt

Contents:
Binary file backend for a datamatrix. This extension is used to deal with the actual source file. Coupled to a matrix with source type 'BinaryFile'. This entity is not shown in the interface. Discussion: I am not so happy with the need of alternative subclasses. Instead you just need a driver.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
data_name
xref YES   Reference to the datamatrix this binary file belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in column name should be unique.

File: csvdatamatrix.txt

Contents:
CSV file backend for a datamatrix. Convenient to deal with the actual source file. Coupled to a matrix with source type 'CSVFile'. This entity is not shown in the interface.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
data_name
xref YES   Reference to the datamatrix this CSV file belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in column name should be unique.

File: decimaldataelement.txt

Contents:
A DataElement for storing decimal data.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
investigation_name
xref     Investigation. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
protocolapplication_name
xref     Reference to the protocol application that was used to produce this observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
feature_name
xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. Can be omitted for 1D data (i.e., a data list). This xref uses {feature_name} to find related elements in file observationElement.txt based on unique column {name}.
target_name
xref YES   References the ObservationTarget that this observation was made on. For example 'individual1'. In a correlation matrix this could also be 'probe123'. This xref uses {target_name} to find related elements in file observationElement.txt based on unique column {name}.
data_name
xref YES   Reference to the data set this entity belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
featureindex int YES   Row position in the matrix.
targetindex int YES   Column position in the matrix.
value decimal     The value, e.g., correlation.
Constraint: values in the combined columns (featureindex, targetindex, data) should be unique.
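
Because each element carries its own row and column position, a matrix can be rebuilt from the individual rows. The Python sketch below does this for a handful of decimal elements. It assumes zero-based indices, which this document does not state, and build_matrix is a hypothetical helper.

```python
def build_matrix(elements):
    """Place each element's value at [featureindex][targetindex]."""
    n_rows = max(e["featureindex"] for e in elements) + 1
    n_cols = max(e["targetindex"] for e in elements) + 1
    # Cells without an element stay None (value is not a required column).
    matrix = [[None] * n_cols for _ in range(n_rows)]
    for e in elements:
        matrix[e["featureindex"]][e["targetindex"]] = e["value"]
    return matrix


# Hypothetical elements of one data set; indices are assumed to start at 0.
elements = [
    {"featureindex": 0, "targetindex": 0, "value": 0.9},
    {"featureindex": 0, "targetindex": 1, "value": 0.1},
    {"featureindex": 1, "targetindex": 1, "value": 0.7},
]

matrix = build_matrix(elements)
print(matrix)
```

The uniqueness constraint on (featureindex, targetindex, data) is what guarantees that no cell is written twice within one data set.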

File: textdataelement.txt

Contents:
A DataElement for storing text data.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
investigation_name
xref     Investigation. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
protocolapplication_name
xref     Reference to the protocol application that was used to produce this observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
feature_name
xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. Can be omitted for 1D data (i.e., a data list). This xref uses {feature_name} to find related elements in file observationElement.txt based on unique column {name}.
target_name
xref YES   References the ObservationTarget that this observation was made on. For example 'individual1'. In a correlation matrix this could also be 'probe123'. This xref uses {target_name} to find related elements in file observationElement.txt based on unique column {name}.
data_name
xref YES   Reference to the data set this entity belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
featureindex int YES   Row position in the matrix.
targetindex int YES   Column position in the matrix.
value string     The value, e.g., genotype strings like AA, BA, BB.
Constraint: values in the combined columns (featureindex, targetindex, data) should be unique.
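
As an illustration of how a genotype matrix maps onto this one-row-per-cell layout, the sketch below flattens a small matrix into tab-separated rows. The helper matrix_to_rows, the marker and individual names, the data set name 'genotypes1' and the zero-based indices are all assumptions for the example; features are placed on rows because featureindex is described as the row position.

```python
import csv
import io


def matrix_to_rows(data_name, features, targets, matrix):
    """Yield one dict per matrix cell, keyed by the column names from the table above."""
    for fi, feature in enumerate(features):
        for ti, target in enumerate(targets):
            yield {
                "feature_name": feature,
                "target_name": target,
                "data_name": data_name,
                "featureindex": fi,  # row position (zero-based by assumption)
                "targetindex": ti,   # column position (zero-based by assumption)
                "value": matrix[fi][ti],
            }


features = ["marker1", "marker2"]
targets = ["individual1", "individual2", "individual3"]
genotypes = [["AA", "BA", "BB"], ["BB", "AA", "BA"]]

rows = list(matrix_to_rows("genotypes1", features, targets, genotypes))

# Serialize as tab-separated text with a header line.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
text = buffer.getvalue()
print(text, end="")
```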

File: originalfile.txt

Contents:
An unmodified original file that belongs to this datamatrix.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
data_name
xref YES   Reference to the datamatrix this file belongs to. This xref uses {data_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in column name should be unique.
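
The extension-to-MIME mapping mentioned for the extension column can be approximated with Python's standard mimetypes module. This is only a sketch of the idea; which lookup table the system itself uses at runtime is not documented here, and the function name and the fallback type are assumptions.

```python
import mimetypes


def mime_for_extension(extension, default="application/octet-stream"):
    """Guess a MIME type for a bare file extension such as 'png'."""
    # guess_type expects a file name, so a dummy base name is prepended.
    guessed, _encoding = mimetypes.guess_type("file." + extension.lstrip(".").lower())
    return guessed or default


print(mime_for_extension("png"))  # image/png
```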

File: investigation.txt

Contents:
Investigation defines self-contained units of study. For example: Framingham study. Optionally a description and an accession to a data source can be provided. Each Investigation has a unique name and a group of subjects of observation (ObservationTarget), traits of observation (ObservableFeature), results (in ObservedValues), and optionally actions (Protocols, ProtocolApplications). 'Investigation' maps to standard XGAP/FuGE Investigation, MAGE-TAB Experiment and METABASE:Study.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
startdate datetime   today The start point of the study.
enddate datetime     The end point of the study.
contacts_name
mref     Contact persons for this study. This mref uses {contacts_name} to find related elements in file person.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
accession hyperlink     (Optional) URI or accession number to indicate source of Study. E.g. arrayexpress:M-EXP-2345.
Constraint: values in column name should be unique.
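
An mref column such as contacts_name holds several references in one cell, separated by '|'. The sketch below splits such a cell and reports names that cannot be found among a set of known person names. The function names and sample names are hypothetical, and trimming whitespace around each reference is an assumption rather than documented importer behaviour.

```python
def split_mref(cell):
    """Split an mref cell such as 'ref1|ref2|ref3' into a list of names."""
    return [part.strip() for part in cell.split("|") if part.strip()]


def unresolved(cell, known_names):
    """Return the references in the cell that are not among the known names."""
    return [name for name in split_mref(cell) if name not in known_names]


persons = {"Alice", "Bob"}  # hypothetical values of the name column in person.txt
print(split_mref("Alice|Bob|Carol"))           # ['Alice', 'Bob', 'Carol']
print(unresolved("Alice|Bob|Carol", persons))  # ['Carol']
```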

File: species.txt

Contents:
Ontology terms for species. E.g. Arabidopsis thaliana. DISCUSSION: should we avoid subclasses of OntologyTerm and instead make a 'tag' filter on terms so we can make pulldowns context dependent (e.g. to only show particular subqueries of ontologies).

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: alternateid.txt

Contents:
An external identifier for an annotation. For example: name='R13H8.1', ontology='ensembl' or name='WBgene00000912', ontology='wormbase'.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: observationelement.txt

Contents:
Elements that are the targets or features of observation in our research.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: observationtarget.txt

Contents:
An ObservationTarget class defines the subjects of observation. For instance: individual 1 from Investigation x. The ObservationTarget class maps to XGAP:Subject, METABASE:Patient and PaGE:Abstract_Observation_Target. The name of an ObservationTarget is unique.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: observablefeature.txt

Contents:
ObservableFeature defines anything that can be observed in a phenotypic Investigation. For instance: Height, Systolic blood pressure, Diastolic blood pressure, and Treatment for hypertension are observable features. The name of ObservableFeature is unique within one Investigation. It is recommended that each ObservableFeature is named according to a well-defined ontology term which can be specified via ontologyReference. Note that in some instances an observableFeature can also be an observationTarget, for example in the case of correlation matrices. The ObservableFeature class maps to XGAP:Trait, METABASE:Question, FuGE:DimensionElement, and PaGE:ObservableFeature. Multi-value features can be grouped by Protocol. For instance: high blood pressure can be inferred from observations for features systolic and diastolic blood pressure. There may be many alternative protocols to measure a feature. See Protocol section.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: measurement.txt

Contents:
Generic observable feature to flexibly define a measurement.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     (Optional) Rudimentary meta data about the observable feature. Use of ontology term references to establish unambiguous descriptions is recommended.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
unit_name
xref     (Optional) Reference to the well-defined measurement unit used to observe this feature (if feature is that concrete). E.g. mmHg. This xref uses {unit_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
datatype enum   string (Optional) Reference to the technical data type. E.g. 'int'.
temporal bool   false Whether this feature is time dependent and can have different values when measured on different times (e.g. weight, temporal=true) or generally only measured once (e.g. birth date, temporal=false).
categories_name
mref     Translation of codes into categories if applicable. This mref uses {categories_name} to find related elements in file category.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
targettypeallowedforrelation_className
xref     Subclass of ObservationTarget (Individual, Panel or Location) that can be linked to (through the 'relation' field in ObservedValue) when using this Measurement (example: a Measurement 'Species' can only result in ObservedValues that have relations to Panels). This xref uses {targettypeallowedforrelation_classname} to find related elements in file molgenisEntity.txt based on unique column {classname}.
panellabelallowedforrelation string     Label that must have been applied to the Panel that can be linked to (through the 'relation' field in ObservedValue) when using this Measurement (example: a Measurement 'Species' can only result in ObservedValues that have relations to Panels labeled as 'Species').
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
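
Observed values are stored as strings, so the datatype column of a Measurement can guide conversion when values are read back. The sketch below is hypothetical: only 'int' and the default 'string' are named in this documentation, so the CASTERS mapping covers just those two and falls back to plain strings for anything else. The measurement names and values are invented for the example.

```python
# Only 'int' and 'string' appear in the table above; the full enum is not
# listed here, so unknown datatypes fall back to str (an assumption).
CASTERS = {"int": int, "string": str}


def cast_value(raw, datatype="string"):
    """Convert a raw string value according to a Measurement datatype."""
    caster = CASTERS.get(datatype, str)
    return caster(raw)


# Hypothetical measurements (name -> datatype) and raw observed values.
measurements = {"age": "int", "remarks": "string"}
observed = [("age", "42"), ("remarks", "fasting sample")]

typed = [(feature, cast_value(raw, measurements[feature])) for feature, raw in observed]
print(typed)  # [('age', 42), ('remarks', 'fasting sample')]
```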

File: category.txt

Contents:
Special kind of ObservationElement to define categorical answer codes such as are often used in questionnaires. A list of categories can be attached to a Measurement using Measurement.categories. For example the Measurement 'sex' has {code_string = 1, label=male} and {code_string = 2, label=female}. Categories can be linked to well-defined ontology terms via the ontologyReference. Category extends ObservationElement such that it can be referenced by ObservedValue.value. The Category class maps to METABASE:Category.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text YES   Description of the code. Use of ontology term references to establish unambiguous descriptions is recommended.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
code_string string YES   The code used to represent this category. For example: { '1' codes for 'male', '2'-'female'}.
ismissing bool   false whether this code should be treated as missing value.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
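
To show how code_string, label and ismissing work together, the sketch below builds a small codebook from category rows and decodes a list of raw codes. The sample rows (including the extra code '9'), the function name load_codebook and the 'true'/'false' spelling of booleans are assumptions for illustration.

```python
import csv
import io

# Hypothetical category.txt content following the 'sex' example above.
CATEGORY_TXT = (
    "name\tdescription\tlabel\tcode_string\tismissing\n"
    "sex_male\tMale participant\tmale\t1\tfalse\n"
    "sex_female\tFemale participant\tfemale\t2\tfalse\n"
    "sex_unknown\tSex not recorded\tunknown\t9\ttrue\n"
)


def load_codebook(text):
    """Map each code_string to its label, or to None when the code marks a missing value."""
    codebook = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        is_missing = row["ismissing"].strip().lower() == "true"
        codebook[row["code_string"]] = None if is_missing else row["label"]
    return codebook


codebook = load_codebook(CATEGORY_TXT)
decoded = [codebook[code] for code in ["1", "2", "9", "1"]]
print(decoded)  # ['male', 'female', None, 'male']
```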

File: individual.txt

Contents:
The Individual class defines human cases that are used as observation targets. The Individual class maps to XGAP:Individual and PaGE:Individual. Note that minimal information like 'sex' can be defined as ObservedValue, and that basic relationships like 'father' and 'mother' can also be defined via ObservedRelationship, using the 'relation' field. Groups of individuals can be defined via Panel.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
mother_name
xref     Refers to the mother of the individual. This xref uses {mother_name} to find related elements in file individual.txt based on unique column {name}.
father_name
xref     Refers to the father of the individual. This xref uses {father_name} to find related elements in file individual.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
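
Because mother_name and father_name point back into individual.txt itself, it can help to check that every parent reference resolves to a name in the same file. The sketch below does this for a hypothetical four-row file; the function name is invented, and parents that already exist in the database but not in the file would also be reported, so treat the result as a hint rather than a definitive error list.

```python
import csv
import io

# Hypothetical individual.txt content; only the columns needed here are shown.
INDIVIDUAL_TXT = (
    "name\tmother_name\tfather_name\n"
    "ind1\t\t\n"
    "ind2\t\t\n"
    "ind3\tind1\tind2\n"
    "ind4\tind1\tind9\n"
)


def dangling_parents(text):
    """Return (individual, column, parent) for parent references not found in the file."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    names = {row["name"] for row in rows}
    problems = []
    for row in rows:
        for column in ("mother_name", "father_name"):
            parent = row[column]
            if parent and parent not in names:
                problems.append((row["name"], column, parent))
    return problems


problems = dangling_parents(INDIVIDUAL_TXT)
print(problems)  # [('ind4', 'father_name', 'ind9')]
```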

File: location.txt

Contents:
This class defines physical locations such as buildings, departments, rooms, freezers and cages. Use ObservedValues to link locations to each other, building a location hierarchy.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: panel.txt

Contents:
The Panel class defines groups of individuals based on cohort design, case/controls, families, etc. For instance: LifeLines cohort, 'middle aged man', 'recombinant mouse inbred Line dba x b6' or 'Smith family'. A Panel can act as a single ObservationTarget. For example: average height (ObservedValue) in the LifeLines cohort (Panel) is 174cm. The Panel class maps to XGAP:Strain and PaGE:Panel classes. In METABASE it is assumed that there is one panel per study.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
individuals_name
mref     The list of individuals in this panel. This mref uses {individuals_name} to find related elements in file individual.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
species_name
xref     The species this panel is an instance of/part of/extracted from. This xref uses {species_name} to find related elements in file species.txt based on unique column {name}.
paneltype_name
xref     Indicate the type of Panel (example: Natural=wild type, Parental=parents of a cross, F1=First generation of cross, RCC=Recombinant congenic, CSS=chromosome substitution). This xref uses {paneltype_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
founderpanels_name
mref     The panel(s) that were used to create this panel. This mref uses {founderpanels_name} to find related elements in file panel.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: observedvalue.txt

Contents:
Generic storage of values, relationships and optional ontology mapping of the value/relation. Values can be atomic observations, e.g., length (feature) of individual 1 (target) = 179cm (value). Values can also be relationship values, e.g., extract (feature) of sample 1 (target) = derived sample (relation).
Discussion: how to model sample pooling in this model?
More Discussion: do we want to have type specific subclasses? No, because you can solve this by casting during querying?

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
investigation_name
xref     Investigation. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
protocolapplication_name
xref     Reference to the protocol application that was used to produce this observation. For example a particular patient visit or the application of a microarray or the calculation of a QTL model. This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
feature_name
xref YES   References the ObservableFeature that this observation was made on. For example 'probe123'. Can be omitted for 1D data (i.e., a data list). This xref uses {feature_name} to find related elements in file observationElement.txt based on unique column {name}.
target_name
xref YES   References the ObservationTarget that this observation was made on. For example 'individual1'. In a correlation matrix this could also be 'probe123'. This xref uses {target_name} to find related elements in file observationElement.txt based on unique column {name}.
ontologyreference_name
xref     (Optional) Reference to the ontology definition or 'code' for this value (recommended for non-numeric values such as codes). This xref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
value string     The value observed.
relation_name
xref     Reference to other end of the relationship, if any. For example to a 'brother' or from 'sample' to 'derivedSample'. This xref uses {relation_name} to find related elements in file observationElement.txt based on unique column {name}.
time datetime     (Optional) Time when the value was observed. For example in time series or if feature is time-dependent like 'age'.
endtime datetime     (Optional) Time when the value's validity ended.
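
The two kinds of observed value described above, a plain value and a relationship, can be written as rows that fill either the value column or the relation_name column. The sketch below follows the examples in the Contents paragraph; the investigation name 'study1', the element names and the choice to leave unused cells empty are assumptions for illustration.

```python
import csv
import io

# A subset of the columns from the table above, in a hypothetical order.
COLUMNS = ["investigation_name", "feature_name", "target_name", "value", "relation_name"]

rows = [
    # Atomic observation: length (feature) of individual1 (target) = 179 (value).
    {"investigation_name": "study1", "feature_name": "length",
     "target_name": "individual1", "value": "179"},
    # Relationship: extract (feature) of sample1 (target) -> derivedSample1 (relation).
    {"investigation_name": "study1", "feature_name": "extract",
     "target_name": "sample1", "relation_name": "derivedSample1"},
]

buffer = io.StringIO()
# Keys missing from a row are written as empty cells (DictWriter's default restval).
writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
text = buffer.getvalue()
print(text, end="")
```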

File: protocol.txt

Contents:
The Protocol class defines parameterizable descriptions of methods; each protocol has a unique name within a Study. Each ProtocolApplication can define the ObservableFeatures it can observe. Also the protocol parameters can be modeled using ObservableFeatures (users are expected to 'tag' the observable feature by setting the ObservableFeature type to 'ProtocolParameter'). Examples of protocols are: SOP for blood pressure measurement used by UK biobank, or 'R/qtl' as protocol for statistical analysis. Protocol is a high level object that represents the details of protocols used during the investigation. The uses of Protocols to process BioMaterials and Data are referenced by ProtocolApplication (in the SDRF part of the format). Protocol has an association to OntologyTerm to represent the type of protocol. Protocols are associated with Hardware, Software and Parameters used in the Protocol. An example from ArrayExpress is E-TABM-506 ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/TABM/E-TABM-506/E-TABM-506.idf.txt.
The FUGE equivalent to Protocol is FuGE::Protocol.
The Protocol class maps to FuGE/XGAP/MageTab Protocol, but in contrast to FuGE it is not required to extend protocol before use. The Protocol class also maps to METABASE:Form (note that components are solved during METABASE:Visit which can be nested). Has no equivalent in PaGE.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description richtext     Description, or reference to a description, of the protocol.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
protocoltype_name
xref     Annotation of the protocol to a well-defined ontological class. This xref uses {protocoltype_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
features_name
mref     The features that can be observed using this protocol. For example 'length' or 'rs123534' or 'probe123'. Also protocol parameters are considered observable features as they are important to the interpretation of the observed values. This mref uses {features_name} to find related elements in file observableFeature.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
targetfilter string     Expression that filters the InvestigationElements that can be targeted using this protocol. This helps the user to only select from targets that matter when setting observedvalues. For example: type='individual' AND species = 'human'.
contact_name
xref     TODO Check if there can be multiple contacts. This xref uses {contact_name} to find related elements in file person.txt based on unique column {name}.
subprotocols_name
mref     Subprotocols of this protocol. This mref uses {subprotocols_name} to find related elements in file protocol.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
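
Since subprotocols_name refers back to protocol.txt, protocols can nest. The sketch below expands a protocol into itself plus all nested subprotocols, depth first, and raises an error on a circular reference. The protocol names and the function expand_subprotocols are hypothetical; whether the importer itself rejects cycles is not stated in this documentation.

```python
def expand_subprotocols(name, table, _path=()):
    """Return the protocol and all nested subprotocols in depth-first order.

    `table` maps a protocol name to its subprotocols_name cell ('a|b|c' or '').
    """
    if name in _path:
        raise ValueError("circular reference: " + " -> ".join(_path + (name,)))
    result = [name]
    cell = table.get(name, "")
    for child in filter(None, cell.split("|")):
        result.extend(expand_subprotocols(child, table, _path + (name,)))
    return result


# Hypothetical name -> subprotocols_name values taken from a protocol.txt file.
table = {
    "qtl_analysis": "normalize|map_qtl",
    "normalize": "log_transform",
    "map_qtl": "",
    "log_transform": "",
}

expanded = expand_subprotocols("qtl_analysis", table)
print(expanded)  # ['qtl_analysis', 'normalize', 'log_transform', 'map_qtl']
```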

File: protocolapplication.txt

Contents:
A ProtocolApplication class defines the actual action of observation by referring to a protocol and optional ParameterValues. The name field can be used to label applications with a human-understandable tag. For example: the action of blood pressure measurement on 1000 individuals, using a particular protocol, resulting in 1000 associated observed values. If desired, protocols can be shared between Studies; in those cases one should simply refer to a protocol in another Study.
ProtocolApplications are used in MAGE-TAB format to reference the protocols used, optionally with certain protocol parameter values. For example, a Source may be transformed into a Labeled Extract by the subsequent application of an Extraction and a Labeling protocol. ProtocolApplication is associated with an Edge that links input/output, e.g. Source to Labeled Extract. The order of the application of protocols can be set in order to be able to reconstruct the left-to-right order of protocol references in MAGE-TAB format. The FuGE equivalent to ProtocolApplication is FuGE:ProtocolApplication, however input/output is modeled using Edge.
The ProtocolApplication class maps to FuGE/XGAP ProtocolApplication, but in FuGE ProtocolApplications can take Material or Data (or both) as input and produce Material or Data (or both) as output. Similar to PaGE.ObservationMethod. Maps to METABASE:Visit (also note that METABASE:PlannedVisit allows for planning of protocol applications; this is outside scope for this model?).

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
time datetime   today time when the protocol was applied.
protocol_name
xref     Reference to the protocol that is being used. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
performer_name
mref     Performer. This mref uses {performer_name} to find related elements in file person.txt based on unique column {name}. More than one reference can be added separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.

File: protocoldocument.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
protocol_name
xref YES   protocol. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
document file YES   document.
Constraint: values in column name should be unique.
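
The extension-to-MIME-type mapping described for the extension column can be sketched with Python's standard mimetypes module. This only illustrates the idea and is not how the workbench performs the mapping. The helper name mime_for_extension and the fallback type are assumptions.

```python
import mimetypes


def mime_for_extension(extension: str) -> str:
    """Look up a MIME type for a bare file extension such as 'png'."""
    # guess_type expects a file name, so a dummy stem is prepended.
    mime, _encoding = mimetypes.guess_type("document." + extension.lstrip("."))
    # Assumption: unknown extensions fall back to a generic binary type.
    return mime or "application/octet-stream"


print(mime_for_extension("png"))
```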

File: workflow.txt

Contents:
A workflow is a plan to execute a series of subprotocols in a particular order. Each workflow element is another protocol, referred to via WorkflowElement. Because Workflow extends Protocol, workflows can be nested just like any other protocol.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description richtext     Description, or reference to a description, of the protocol.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
protocoltype_name
xref     Annotation of the protocol to a well-defined ontological class. This xref uses {protocoltype_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
features_name
mref     The features that can be observed using this protocol. For example 'length' or 'rs123534' or 'probe123'. Protocol parameters are also considered observable features, as they are important to the interpretation of the observed values. This mref uses {features_name} to find related elements in file observableFeature.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
targetfilter string     Expression that filters the InvestigationElements that can be targeted using this protocol. This helps the user to select only from targets that matter when setting observed values. For example: type='individual' AND species = 'human'.
contact_name
xref     TODO Check if there can be multiple contacts. This xref uses {contact_name} to find related elements in file person.txt based on unique column {name}.
subprotocols_name
mref     Subprotocols of this protocol. This mref uses {subprotocols_name} to find related elements in file protocol.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: workflowelement.txt

Contents:
Elements of a workflow are references to protocols. The whole workflow is a directed graph with each element pointing to the previousSteps that the current workflow element depends on.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
workflow_name
xref YES   Workflow this element is part of. This xref uses {workflow_name} to find related elements in file workflow.txt based on unique column {name}.
protocol_name
xref YES   Protocol to be used at this workflow step. This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
previoussteps_name
mref     Previous steps that need to be done before this protocol can be executed. This mref uses {previoussteps_name} to find related elements in file workflowElement.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in column name should be unique.
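
Because the previoussteps_name column turns the workflow elements into a directed graph, a valid execution order can be derived with a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9 or later). The function name execution_order and the element names are hypothetical, and the workbench's own scheduling may work differently.

```python
from graphlib import TopologicalSorter  # available from Python 3.9


def execution_order(rows):
    """rows: dicts with 'name' and 'previoussteps_name', as read from workflowelement.txt."""
    graph = {}
    for row in rows:
        cell = row.get("previoussteps_name") or ""
        # Each element maps to the set of elements that must run before it.
        graph[row["name"]] = {ref for ref in cell.split("|") if ref}
    # static_order() yields every element after all of its previous steps.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical workflow elements, deliberately listed out of order.
rows = [
    {"name": "map_qtl", "previoussteps_name": "normalize"},
    {"name": "normalize", "previoussteps_name": "load"},
    {"name": "load", "previoussteps_name": ""},
]
print(execution_order(rows))
```

If the previous steps form a cycle, TopologicalSorter raises graphlib.CycleError, which is a convenient way to detect an invalid workflow before loading it.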

File: workflowelementparameter.txt

Contents:
Element parameters are the way to link workflow elements together. They allow the parameters from the previous step to be overridden.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
workflowelement_name
xref YES   To attach a parameter to a WorkflowElement. This xref uses {workflowelement_name} to find related elements in file workflowElement.txt based on unique column {name}.
parameter_name
xref YES   Parameter definition. This xref uses {parameter_name} to find related elements in file observableFeature.txt based on unique column {name}.
value string YES   Value of this parameter. Can be a template of the form ${other}, referring to previous values in context.
Constraint: values in the combined columns (workflowelement, parameter) should be unique.
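
The ${other} template form of the value column can be illustrated with Python's string.Template, which happens to use the same placeholder syntax. This sketches the substitution idea only. The workbench's actual template engine and its handling of unknown placeholders are not documented here, and the name resolve_value and the context contents are hypothetical.

```python
from string import Template


def resolve_value(value: str, context: dict) -> str:
    """Fill ${name} placeholders in a parameter value from earlier values."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(value).safe_substitute(context)


# Hypothetical context: a value produced by a previous workflow step.
context = {"other": "0.05"}
print(resolve_value("${other}", context))
```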

File: chromosome.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
ordernr int YES   orderNr.
isautosomal bool YES   Is 'yes' when number of chromosomes is equal in male and female individuals, i.e., if not a sex chromosome.
bplength int     Length of the chromosome in base pairs.
species_name
xref     Reference to the species this chromosome belongs to. This xref uses {species_name} to find related elements in file species.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: nmrbin.txt

Contents:
Shift of the NMR frequency due to the chemical environment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: clone.txt

Contents:
BAC clone fragment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric base-pair position (5') on the chromosome.
bpend long     numeric base-pair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: derivedtrait.txt

Contents:
Any meta trait, e.g. false discovery rates, P-values, thresholds.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: environmentalfactor.txt

Contents:
Experimental conditions, such as temperature differences, batch effects etc.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: gene.txt

Contents:
Trait annotations specific for genes.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric base-pair position (5') on the chromosome.
bpend long     numeric base-pair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     Main symbol this gene is known by (not necessarily unique, in contrast to 'name').
orientation enum     Orientation of the gene on the genome (F=forward, R=reverse).
control bool     Indicating whether this is a 'housekeeping' gene that can be used as control.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
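
Before uploading a gene file, the rules listed above (required name, orientation limited to F or R, whole-number base-pair positions) can be checked with a few lines of Python. This is a minimal sketch, not the importer's own validation. The function name check_gene_row and the wording of the messages are hypothetical.

```python
def check_gene_row(row: dict) -> list:
    """Return a list of problems found in one gene.txt row (a dict of strings)."""
    problems = []
    if not row.get("name"):
        problems.append("name is required")
    orientation = row.get("orientation") or ""
    # The enum allows F (forward) or R (reverse); the column itself is optional.
    if orientation not in ("", "F", "R"):
        problems.append("orientation must be F or R")
    for column in ("bpstart", "bpend"):
        cell = row.get(column) or ""
        # Both columns are of type long, so only whole numbers are accepted.
        if cell and not cell.lstrip("-").isdigit():
            problems.append(column + " must be a whole number")
    return problems


# Hypothetical example row.
print(check_gene_row({"name": "gene1", "orientation": "F", "bpstart": "100", "bpend": "2500"}))
```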

File: transcript.txt

Contents:
Trait annotations specific for transcripts.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
gene_name
xref     The gene that produces this transcript. This xref uses {gene_name} to find related elements in file gene.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: protein.txt

Contents:
Trait annotations specific for proteins.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
gene_name
xref     The gene that produces this protein. This xref uses {gene_name} to find related elements in file gene.txt based on unique column {name}.
transcript_name
xref     The transcript variant that produces this protein. This xref uses {transcript_name} to find related elements in file transcript.txt based on unique column {name}.
aminosequence text     The amino acid sequence.
mass decimal     The mass of this protein.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: metabolite.txt

Contents:
Trait annotations specific for metabolites.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
formula string     The chemical formula of a metabolite.
mass decimal     The mass of this metabolite.
structure text     The chemical structure of a metabolite (in SMILES representation).
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: marker.txt

Contents:
Trait annotations specific for markers.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric base-pair position (5') on the chromosome.
bpend long     numeric base-pair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
reportsfor_name
mref     The marker (or a subclass like 'SNP') this marker (or a subclass like 'SNP') reports for. This mref uses {reportsfor_name} to find related elements in file marker.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: snp.txt

Contents:
A SNP is a special kind of Marker, but can also be seen as a phenotype to map against in some cases. A single-nucleotide polymorphism is a DNA sequence variation occurring when a single nucleotide in the genome (or other shared sequence) differs between members of a biological species or paired chromosomes in an individual.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric base-pair position (5') on the chromosome.
bpend long     numeric base-pair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
reportsfor_name
mref     The marker (or a subclass like 'SNP') this marker (or a subclass like 'SNP') reports for. This mref uses {reportsfor_name} to find related elements in file marker.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
status string     The status of this SNP, e.g. 'confirmed'.
polymorphism_name
mref     The polymorphism that belongs to this SNP. This mref uses {polymorphism_name} to find related elements in file polymorphism.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
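
Since xref and mref columns are resolved by name against another file, it can be useful to check for dangling references before upload. The following Python sketch reads two tab-separated tables from strings and reports referenced names that do not occur in the target file's name column. The function name unresolved_refs and all sample values (including the base values) are hypothetical, and the importer itself may report such errors differently.

```python
import csv
import io


def unresolved_refs(source_tsv, column, target_tsv, is_mref=False):
    """Return names referenced in `column` that are missing from the target's 'name' column."""
    target_rows = csv.DictReader(io.StringIO(target_tsv), delimiter="\t")
    known = {row["name"] for row in target_rows}
    missing = []
    for row in csv.DictReader(io.StringIO(source_tsv), delimiter="\t"):
        cell = row.get(column) or ""
        # An mref cell may hold several names separated by '|'.
        names = cell.split("|") if is_mref else [cell]
        missing.extend(n for n in names if n and n not in known)
    return missing


# Hypothetical contents of snp.txt and polymorphism.txt.
snp_tsv = "name\tpolymorphism_name\nsnp1\tpoly1|poly2\nsnp2\tpoly3\n"
polymorphism_tsv = "name\tbase\npoly1\tA\npoly2\tC\n"
print(unresolved_refs(snp_tsv, "polymorphism_name", polymorphism_tsv, is_mref=True))
```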

File: polymorphism.txt

Contents:
The difference of a single base discovered between two sequenced individuals.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
base enum YES   The affected DNA base. Note that you can select the reference base here.
value string     The strain/genotype for which this polymorphism was discovered. E.g. 'N2' or 'CB4856'.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: probe.txt

Contents:
A piece of sequence that reports for the expression of a gene, typically spotted onto a microarray.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric base-pair position (5') on the chromosome.
bpend long     numeric base-pair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
mismatch bool   false Indicating whether the probe is a match.
probeset_name
xref     Optional: probeset this probe belongs to (e.g., in Affymetrix assays). This xref uses {probeset_name} to find related elements in file probeSet.txt based on unique column {name}.
reportsfor_name
xref     The gene this probe reports for. This xref uses {reportsfor_name} to find related elements in file gene.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.

File: spot.txt

Contents:
This is the spot on a microarray.
Note: We don't distinguish between probes (the sequence) and spots (the sequence as spotted on the array).

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgans (cM).
bpstart long     numeric base-pair position (5') on the chromosome.
bpend long     numeric base-pair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
mismatch bool   false Indicating whether the probe is a match.
probeset_name
xref     Optional: probeset this probe belongs to (e.g., in Affymetrix assays). This xref uses {probeset_name} to find related elements in file probeSet.txt based on unique column {name}.
reportsfor_name
xref     The gene this probe reports for. This xref uses {reportsfor_name} to find related elements in file gene.txt based on unique column {name}.
x int YES   Row.
y int YES   Column.
gridx int     Meta Row.
gridy int     Meta Column.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
Constraint: values in the combined columns (x, y, gridx, gridy) should be unique.
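
A combined-column uniqueness constraint such as (x, y, gridx, gridy) can be checked ahead of import by counting key tuples. The Python sketch below assumes the rows have already been read into dictionaries. The function name duplicate_keys and the sample spot rows are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, columns):
    """Return the combined-column keys that occur in more than one row."""
    counts = Counter(tuple(row.get(c, "") for c in columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical spot rows; the first and third share the same position.
spots = [
    {"name": "spot1", "x": "1", "y": "1", "gridx": "1", "gridy": "1"},
    {"name": "spot2", "x": "1", "y": "2", "gridx": "1", "gridy": "1"},
    {"name": "spot3", "x": "1", "y": "1", "gridx": "1", "gridy": "1"},
]
print(duplicate_keys(spots, ["x", "y", "gridx", "gridy"]))
```

The same helper covers the single-column case, for example duplicate_keys(spots, ["name"]).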

File: probeset.txt

Contents:
A set of Probes. E.g. an Affymetrix probeset has multiple probes. It implements locus because sometimes you want to give the complete set of probes a range, for example to indicate that this set of probes spans base pairs 0 through 10,000,000 on chromosome 3. The same information could arguably also be queried from the probes themselves, but if you have 40k probes, retrieving it from the ProbeSet alone (if annotated so) would be much faster.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
chromosome_name
xref     Reference to the chromosome this position belongs to. This xref uses {chromosome_name} to find related elements in file chromosome.txt based on unique column {name}.
cm decimal     genetic map position in centimorgan (cM).
bpstart long     numeric basepair position (5') on the chromosome.
bpend long     numeric basepair position (3') on the chromosome.
seq text     The FASTA text representation of the sequence.
symbol string     todo.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
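
The mref columns above hold several references in one cell, separated by '|'. A minimal sketch of splitting such a cell is shown below. The function name split_mref is hypothetical, and trimming whitespace and skipping empty parts are assumptions rather than documented importer behaviour.

```python
def split_mref(cell):
    """Split an mref cell such as 'ref1|ref2|ref3' into a list of names."""
    if cell is None:
        return []
    # Trimming and dropping empty parts is an assumption, not documented behaviour.
    return [part.strip() for part in cell.split("|") if part.strip()]


references = split_mref("ref1|ref2|ref3")
print(references)
```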

File: masspeak.txt

Contents:
A peak that has been selected within a mass spectrometry experiment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
mz decimal     Mass over charge ratio of this peak.
retentiontime decimal     The retention-time of this peak in minutes.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
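
As an illustration of how the decimal columns of this file type might be read, the sketch below parses tab-separated text with Python's csv module and converts mz and retentiontime to Decimal. The function name read_masspeaks and the example values are made up for illustration; this is not the CsvImport implementation.

```python
import csv
import io
from decimal import Decimal


def read_masspeaks(handle):
    """Yield one dict per row, with mz and retentiontime parsed as Decimal."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        for column in ("mz", "retentiontime"):
            value = row.get(column)
            # Both columns are optional; keep None when the cell is empty or absent.
            row[column] = Decimal(value) if value else None
        yield row


# Illustrative data only; the second peak has no retention time.
example = io.StringIO(
    "name\tmz\tretentiontime\n"
    "peak1\t445.12\t12.5\n"
    "peak2\t512.30\t\n"
)
peaks = list(read_masspeaks(example))
print(peaks[0]["mz"], peaks[1]["retentiontime"])
```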

File: investigationfile.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
description text     description field.
investigation_name
xref YES   Reference to the Study. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
Constraint: values in column name should be unique.
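
The extension column is described as being mapped to a MIME type at runtime. How the xQTL workbench performs that mapping is not documented here; the sketch below only illustrates the idea using Python's standard mimetypes module. The function name and the fallback to 'application/octet-stream' are assumptions.

```python
import mimetypes


def mime_type_for_extension(extension):
    """Look up a MIME type for a bare extension such as 'png'."""
    guessed, _encoding = mimetypes.guess_type("file." + extension.lower())
    # Fall back to a generic binary type when the extension is unknown (assumption).
    return guessed or "application/octet-stream"


print(mime_type_for_extension("png"))
```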

File: tissue.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: samplelabel.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
ontology_name
xref     (Optional) The source ontology or controlled vocabulary list that ontology terms have been obtained from. This xref uses {ontology_name} to find related elements in file ontology.txt based on unique column {name}.
termaccession string     (Optional) The accession number assigned to the ontology term in its source ontology. If empty it is assumed to be a locally defined term.
definition string     (Optional) The definition of the term.
termpath string     EXTENSION. The Ontology Lookup Service path that contains this term.
Constraint: values in the combined columns (ontology, termaccession) should be unique.
Constraint: values in the combined columns (ontology, name) should be unique.

File: sample.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
individual_name
xref     The individual from which this sample was taken. This xref uses {individual_name} to find related elements in file individual.txt based on unique column {name}.
tissue_name
xref     The tissue from which this sample was taken. This xref uses {tissue_name} to find related elements in file tissue.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
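
Each xref above is resolved by looking up a value in the unique name column of another file. A minimal sketch of checking this before upload is shown below, assuming both files have been read into lists of dictionaries. The function name find_unresolved_xrefs and the example rows are hypothetical; empty cells are skipped because these xref columns are not marked as required.

```python
def find_unresolved_xrefs(rows, xref_column, target_rows, target_column="name"):
    """Return the xref values in rows that match no row in target_rows."""
    known = {row[target_column] for row in target_rows}
    missing = []
    for row in rows:
        value = row.get(xref_column, "")
        # An empty cell is allowed because the xref is not marked as required.
        if value and value not in known and value not in missing:
            missing.append(value)
    return missing


# Hypothetical contents of individual.txt and sample.txt.
individuals = [{"name": "mouse1"}, {"name": "mouse2"}]
samples = [
    {"name": "sample1", "individual_name": "mouse1"},
    {"name": "sample2", "individual_name": "mouse3"},
    {"name": "sample3", "individual_name": ""},
]
unresolved = find_unresolved_xrefs(samples, "individual_name", individuals)
print(unresolved)
```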

File: pairedsample.txt

Contents:
A pair of samples labeled for a two-color microarray experiment.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     description field.
investigation_name
xref     Reference to the Study that this data element is part of. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
ontologyreference_name
mref     (Optional) Reference to the formal ontology definition for this element, e.g. 'Animal' or 'GWAS protocol'. This mref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
alternateid_name
mref     Alternative identifiers or symbols that this element is known by. This mref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}. More than one reference can be added, separated by '|', for example: ref1|ref2|ref3.
label string     User friendly textual representation of this ObservationElement. For example: 'male', 'mouse 3 in cage 7' or 'TRA-2 like protein'. Label allows for human-readable name that is potentially not unique.
subject1_name
xref YES   The first subject. This xref uses {subject1_name} to find related elements in file individual.txt based on unique column {name}.
label1_name
xref     Which channel or Fluorescent labeling is associated with the first subject. This xref uses {label1_name} to find related elements in file sampleLabel.txt based on unique column {name}.
subject2_name
xref YES   The second subject. This xref uses {subject2_name} to find related elements in file individual.txt based on unique column {name}.
label2_name
xref     Which channel or Fluorescent labeling is associated with the second subject. This xref uses {label2_name} to find related elements in file sampleLabel.txt based on unique column {name}.
Constraint: values in the combined columns (name, investigation) should be unique.
Constraint: values in column name should be unique.
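
A file of this type can be produced with any tool that writes tab-separated text. The sketch below writes a subset of the columns above with Python's csv module and rejects rows that lack a column marked as required. The function name, the chosen column subset, and the example values (mouse1, mouse2, Cy3, Cy5) are illustrative assumptions; the referenced names would have to exist in individual.txt and sampleLabel.txt.

```python
import csv
import io

# Subset of the pairedsample.txt columns used in this illustration.
COLUMNS = ["name", "subject1_name", "label1_name", "subject2_name", "label2_name"]


def write_paired_samples(rows, handle):
    """Write rows as tab-separated text with a header line."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # name, subject1_name and subject2_name are marked as required above.
        for required in ("name", "subject1_name", "subject2_name"):
            if not row.get(required):
                raise ValueError("missing required column: " + required)
        writer.writerow(row)


buffer = io.StringIO()
write_paired_samples(
    [{"name": "pair1", "subject1_name": "mouse1", "label1_name": "Cy3",
      "subject2_name": "mouse2", "label2_name": "Cy5"}],
    buffer,
)
print(buffer.getvalue())
```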

File: job.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
outputdataname string YES   Name of the matrix that will be written.
timestamp string YES   Date and time when the job was started.
analysis_name
xref YES   Analysis. This xref uses {analysis_name} to find related elements in file analysis.txt based on unique column {name}.
computeresource enum   local ComputeResource.
Constraint: values in column outputdataname should be unique.

File: subjob.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
job_OutputDataName
xref YES   Reference to the job this subjob belongs to. This xref uses {job_outputdataname} to find related elements in file job.txt based on unique column {outputdataname}.
statuscode int YES   Status code of this subjob.
statustext string YES   Status text of this subjob.
statusprogress int     Percentage done.
nr int YES   Number of this subjob within the job.

File: analysis.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
description text     Optional description of this type of analysis.
parameterset_name
xref YES   ParameterSet. This xref uses {parameterset_name} to find related elements in file parameterSet.txt based on unique column {name}.
dataset_name
xref YES   DataSet. This xref uses {dataset_name} to find related elements in file dataSet.txt based on unique column {name}.
targetfunctionname string YES   The function used to start a specific type of analysis on the cluster.
Constraint: values in column name should be unique.

File: parameterset.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
Constraint: values in column name should be unique.

File: parametername.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
parameterset_name
xref YES   ParameterSet. This xref uses {parameterset_name} to find related elements in file parameterSet.txt based on unique column {name}.
description text     Optional description of this parameter.
Constraint: values in the combined columns (name, parameterset) should be unique.

File: parametervalue.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
parametername_name
xref YES   ParameterName. This xref uses {parametername_name} to find related elements in file parameterName.txt based on unique column {name}.
value string YES   Possible value of this parameter.
Constraint: values in the combined columns (name, parametername) should be unique.

File: dataset.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
Constraint: values in column name should be unique.

File: dataname.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
dataset_name
xref YES   DataSet. This xref uses {dataset_name} to find related elements in file dataSet.txt based on unique column {name}.
Constraint: values in the combined columns (name, dataset) should be unique.

File: datavalue.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
dataname_name
xref YES   DataName. This xref uses {dataname_name} to find related elements in file dataName.txt based on unique column {name}.
value_name
xref YES   Possible reference of this Data. This xref uses {value_name} to find related elements in file data.txt based on unique column {name}.
Constraint: values in the combined columns (name, dataname) should be unique.

File: selectedparameter.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
job_OutputDataName
xref YES   Job. This xref uses {job_outputdataname} to find related elements in file job.txt based on unique column {outputdataname}.
parametername string YES   Copied name of this parameter.
parametervalue string YES   Copied value of this parameter.

File: selecteddata.txt

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
job_OutputDataName
xref YES   Job. This xref uses {job_outputdataname} to find related elements in file job.txt based on unique column {outputdataname}.
dataname string YES   Copied name of this Data.
datavalue string YES   Copied referenced name of this Data.

File: rscript.txt

Contents:
Proof of concept showing that users can add scripts to the database; to be replaced later with a more generic version from the compute model.

Structure:
column name type required? auto/default description
id int   n+1 automatically generated id.
name string YES   human-readable name.
extension string YES   The file extension. This will be mapped to MIME type at runtime. For example, a type 'png' will be served out as 'image/png'.
description text     description field.
investigation_name
xref YES   Reference to the Study. This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
Constraint: values in column name should be unique.

Appendix: documentation of the mref tables

xqtl file types

File: investigation_contacts.txt

Contents:
Link table for many-to-many relationship 'Investigation.contacts'.

Structure:
column name type required? auto/default description
contacts_name
xref YES   This xref uses {contacts_name} to find related elements in file person.txt based on unique column {name}.
investigation_name
xref YES   This xref uses {investigation_name} to find related elements in file investigation.txt based on unique column {name}.
Constraint: values in the combined columns (contacts, investigation) should be unique.
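
A link table such as this one holds one row per pair of related elements, whereas an mref column holds the same information as a single '|'-separated cell. The sketch below shows how one mref cell could be expanded into link-table rows. The function name link_rows and the contact names are hypothetical, and whether the importer performs this expansion in the same way is an assumption. Repeated names are dropped so the result satisfies the combined uniqueness constraint.

```python
def link_rows(owner_column, owner_name, mref_column, cell):
    """Expand one mref cell into link-table rows, dropping repeated names."""
    rows = []
    seen = set()
    for name in cell.split("|"):
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            rows.append({mref_column: name, owner_column: owner_name})
    return rows


# 'alice' and 'bob' are made-up contact names for illustration.
rows = link_rows("investigation_name", "Experiment1", "contacts_name", "alice|bob|alice")
for row in rows:
    print(row["contacts_name"], row["investigation_name"], sep="\t")
```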

File: observationelement_ontolo12449.txt

Contents:
Link table for many-to-many relationship 'ObservationElement.ontologyReference'.

Structure:
column name type required? auto/default description
ontologyreference_name
xref YES   This xref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
investigationelement_name
xref YES   This xref uses {investigationelement_name} to find related elements in file observationElement.txt based on unique column {name}.
Constraint: values in the combined columns (ontologyreference, investigationelement) should be unique.

File: observationelement_alternateid.txt

Contents:
Link table for many-to-many relationship 'ObservationElement.AlternateId'.

Structure:
column name type required? auto/default description
alternateid_name
xref YES   This xref uses {alternateid_name} to find related elements in file alternateId.txt based on unique column {name}.
observationelement_name
xref YES   This xref uses {observationelement_name} to find related elements in file observationElement.txt based on unique column {name}.
Constraint: values in the combined columns (alternateid, observationelement) should be unique.

File: measurement_categories.txt

Contents:
Link table for many-to-many relationship 'Measurement.categories'.

Structure:
column name type required? auto/default description
categories_name
xref YES   This xref uses {categories_name} to find related elements in file category.txt based on unique column {name}.
measurement_name
xref YES   This xref uses {measurement_name} to find related elements in file measurement.txt based on unique column {name}.
Constraint: values in the combined columns (categories, measurement) should be unique.

File: panel_individuals.txt

Contents:
Link table for many-to-many relationship 'Panel.Individuals'.

Structure:
column name type required? auto/default description
individuals_name
xref YES   This xref uses {individuals_name} to find related elements in file individual.txt based on unique column {name}.
panel_name
xref YES   This xref uses {panel_name} to find related elements in file panel.txt based on unique column {name}.
Constraint: values in the combined columns (individuals, panel) should be unique.

File: panel_founderpanels.txt

Contents:
Link table for many-to-many relationship 'Panel.FounderPanels'.

Structure:
column name type required? auto/default description
founderpanels_name
xref YES   This xref uses {founderpanels_name} to find related elements in file panel.txt based on unique column {name}.
panel_name
xref YES   This xref uses {panel_name} to find related elements in file panel.txt based on unique column {name}.
Constraint: values in the combined columns (founderpanels, panel) should be unique.

File: protocol_ontologyreference.txt

Contents:
Link table for many-to-many relationship 'Protocol.ontologyReference'.

Structure:
column name type required? auto/default description
ontologyreference_name
xref YES   This xref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
investigationelement_name
xref YES   This xref uses {investigationelement_name} to find related elements in file protocol.txt based on unique column {name}.
Constraint: values in the combined columns (ontologyreference, investigationelement) should be unique.

File: protocol_features.txt

Contents:
Link table for many-to-many relationship 'Protocol.Features'.

Structure:
column name type required? auto/default description
features_name
xref YES   This xref uses {features_name} to find related elements in file observableFeature.txt based on unique column {name}.
protocol_name
xref YES   This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
Constraint: values in the combined columns (features, protocol) should be unique.

File: protocol_subprotocols.txt

Contents:
Link table for many-to-many relationship 'Protocol.subprotocols'.

Structure:
column name type required? auto/default description
subprotocols_name
xref YES   This xref uses {subprotocols_name} to find related elements in file protocol.txt based on unique column {name}.
protocol_name
xref YES   This xref uses {protocol_name} to find related elements in file protocol.txt based on unique column {name}.
Constraint: values in the combined columns (subprotocols, protocol) should be unique.

File: protocolapplication_ontol11768.txt

Contents:
Link table for many-to-many relationship 'ProtocolApplication.ontologyReference'.

Structure:
column name type required? auto/default description
ontologyreference_name
xref YES   This xref uses {ontologyreference_name} to find related elements in file ontologyTerm.txt based on unique column {name}.
investigationelement_name
xref YES   This xref uses {investigationelement_name} to find related elements in file protocolApplication.txt based on unique column {name}.
Constraint: values in the combined columns (ontologyreference, investigationelement) should be unique.

File: protocolapplication_performer.txt

Contents:
Link table for many-to-many relationship 'ProtocolApplication.Performer'.

Structure:
column name type required? auto/default description
performer_name
xref YES   This xref uses {performer_name} to find related elements in file person.txt based on unique column {name}.
protocolapplication_name
xref YES   This xref uses {protocolapplication_name} to find related elements in file protocolApplication.txt based on unique column {name}.
Constraint: values in the combined columns (performer, protocolapplication) should be unique.

File: workflowelement_previoussteps.txt

Contents:
Link table for many-to-many relationship 'WorkflowElement.PreviousSteps'.

Structure:
column name type required? auto/default description
previoussteps_name
xref YES   This xref uses {previoussteps_name} to find related elements in file workflowElement.txt based on unique column {name}.
workflowelement_name
xref YES   This xref uses {workflowelement_name} to find related elements in file workflowElement.txt based on unique column {name}.
Constraint: values in the combined columns (previoussteps, workflowelement) should be unique.

File: marker_reportsfor.txt

Contents:
Link table for many-to-many relationship 'Marker.ReportsFor'.

Structure:
column name type required? auto/default description
reportsfor_name
xref YES   This xref uses {reportsfor_name} to find related elements in file marker.txt based on unique column {name}.
marker_name
xref YES   This xref uses {marker_name} to find related elements in file marker.txt based on unique column {name}.
Constraint: values in the combined columns (reportsfor, marker) should be unique.

File: snp_polymorphism.txt

Contents:
Link table for many-to-many relationship 'SNP.Polymorphism'.

Structure:
column name type required? auto/default description
polymorphism_name
xref YES   This xref uses {polymorphism_name} to find related elements in file polymorphism.txt based on unique column {name}.
snp_name
xref YES   This xref uses {snp_name} to find related elements in file sNP.txt based on unique column {name}.
Constraint: values in the combined columns (polymorphism, snp) should be unique.