OmicSoft Land Metadata Definitions

From Array Suite Wiki


Definitions for Common Land Metadata Columns


The OmicSoft Curation Team processes hundreds of disease-related projects every quarter. This work includes carefully categorizing every sample's metadata so that you can find the data you are interested in.

Sample metadata are derived from metadata submitted to GEO, as well as from the source publication.

Every metadata column has a precise scope, but the logic can sometimes be unclear to a new user. This page provides definitions and examples for the most commonly used metadata columns.


  • Disease Category - Grouping of multiple related specific disease states (auto-generated from DiseaseState, Controlled Vocabulary)
  • Disease State - The specific disease for the defined subject (Controlled Vocabulary)
    • normal control - subject was healthy with no known disease (with or without treatment); wild-type animals or reporter animals, untreated or with short-term treatment, or long-term treatment of compound that does not induce disease; cells isolated from healthy subjects
    • disease control - tissue with normal pathology from a subject with unknown disease, or from deceased subject; biopsy from normal tissue that is not easily accessible (e.g. requiring surgery); non-wild-type animals with no known disease phenotype (e.g. transgene or targeted mutation); cell lines derived from the above
      • "easily" accessible biopsy tissues include skin, blood, hair, saliva, throat swabs, urine/fecal samples
    • general disease - human tissue from subject with a disease, but the specific disease is not specified; animal with a targeted mutation or treated with a set protocol to recapitulate a general disease pathology
    • genetic disease - subjects with a known genetic mutation but no specific disease
  • Sample Pathology - Within a tissue (such as skin from a patient with Psoriasis disease state), whether or not the sample was exhibiting the disease pathology (Controlled Vocabulary)
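The distinction between normal control and disease control for human samples with normal pathology can be sketched in Python. This is a simplified reading of the definitions above, not the curation team's actual procedure: the function name, its parameters, and the exact tissue labels are hypothetical, and it only covers subjects who are either documented as healthy or of unknown health status.

```python
# Tissues the definitions above describe as "easily" accessible.
# The exact label spellings here are assumptions for illustration.
EASILY_ACCESSIBLE = {
    "skin", "blood", "hair", "saliva", "throat swab", "urine", "feces",
}

def control_type(tissue, known_healthy, subject_deceased=False):
    """Suggest a Disease State for a human sample with normal pathology.

    known_healthy: True if the subject is documented as healthy with no
    known disease; False if the subject's disease status is unknown.
    """
    # Deceased subjects and subjects of unknown disease status
    # fall under disease control.
    if subject_deceased or not known_healthy:
        return "disease control"
    # Normal tissue that is not easily accessible (e.g. requires surgery)
    # is also treated as disease control.
    if tissue not in EASILY_ACCESSIBLE:
        return "disease control"
    return "normal control"

print(control_type("skin", known_healthy=True))
print(control_type("liver", known_healthy=True))
```

Animal samples and cell lines follow the separate clauses in the definitions and are deliberately left out of this sketch.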

Controlled vocabulary terms from Human Disease Ontology, Monarch Disease Ontology, MeSH and Orphanet.
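To illustrate how Disease State and Sample Pathology narrow a sample selection together, here is a minimal Python sketch over a made-up metadata table. The sample IDs, the pathology values ("lesional", "non-lesional", "normal"), the column keys, and the `select` helper are all assumptions for illustration; they are not taken from any Land and this is not an OmicSoft API.

```python
# Hypothetical sample metadata rows; real Land values may differ.
samples = [
    {"SampleID": "S1", "DiseaseState": "psoriasis",
     "SamplePathology": "lesional", "Tissue": "skin"},
    {"SampleID": "S2", "DiseaseState": "psoriasis",
     "SamplePathology": "non-lesional", "Tissue": "skin"},
    {"SampleID": "S3", "DiseaseState": "normal control",
     "SamplePathology": "normal", "Tissue": "skin"},
]

def select(rows, **criteria):
    """Return the rows whose columns equal every given criterion."""
    return [
        row for row in rows
        if all(row.get(col) == value for col, value in criteria.items())
    ]

# Disease State describes the subject, so both psoriasis samples match.
psoriasis_all = select(samples, DiseaseState="psoriasis")

# Sample Pathology describes the individual sample, so only one matches.
psoriasis_lesional = select(
    samples, DiseaseState="psoriasis", SamplePathology="lesional"
)

print([row["SampleID"] for row in psoriasis_all])
print([row["SampleID"] for row in psoriasis_lesional])
```

The point of the sketch is that Disease State alone does not tell you whether a given sample shows the disease pathology; Sample Pathology is needed for that.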


  • Tissue Category - General category that groups multiple tissues, automatically generated from Tissue curation (Controlled Vocabulary)
  • Tissue - The most specific tissue term describing where the sample was isolated (Controlled Vocabulary)
  • Cell Type - Identifies the cell type of the sample (Controlled Vocabulary)
  • Sample Source - The source of the biological material of the Sample, unmodified from GEO submission.

Controlled vocabulary terms from Uberon and BRENDA Tissue.
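Because Tissue Category is generated automatically from the curated Tissue term, the relationship can be pictured as a simple lookup. The Python sketch below assumes a hypothetical mapping table and category names; the real mapping is maintained by the curation team and is not reproduced here.

```python
from collections import Counter

# Hypothetical Tissue -> Tissue Category lookup (illustrative values only).
TISSUE_TO_CATEGORY = {
    "skin": "integumentary system",
    "whole blood": "blood",
    "peripheral blood": "blood",
}

def tissue_category(tissue):
    """Derive the general category from the specific tissue term."""
    return TISSUE_TO_CATEGORY.get(tissue, "unmapped")

# Specific Tissue terms as they might appear on four samples.
tissues = ["skin", "whole blood", "peripheral blood", "skin"]

# Grouping by category collapses related tissues into one bucket.
category_counts = Counter(tissue_category(t) for t in tissues)
print(category_counts)
```

Searching by Tissue Category is therefore the broader filter, while Tissue gives the most specific term available for the sample.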


If a specific cell type was isolated and identified by the authors, it will be indicated here.

Controlled vocabulary terms from Cell Ontology, ImmGen, and source publications.


If a commercially available cell line was used, the cell line will be defined here.

Controlled vocabulary terms from ATCC, Cell Line Ontology, and Cellosaurus.


  • Treatment: For in vitro studies, describes the treatment applied to a sample (Controlled Vocabulary)
  • Subject Treatment: For in vivo studies, describes the treatment given to the subject (Controlled Vocabulary)
    • If the same subject was sampled before and after treatment, Subject Treatment will be the same, but Treatment Status will indicate which sample is post-treatment
  • Treatment Status: Indicates an individual sample's treatment, if the sample came from a Subject (i.e. patient) that was sampled pre- and post-treatment (Controlled Vocabulary)

Controlled vocabulary terms from PubChem, NCIT, DrugBank, ChemSpider, and the website of the company that supplies the treatment.
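The interplay between Subject Treatment and Treatment Status for subjects sampled before and after treatment can be sketched as follows. The `SubjectID` key, the status values "pre-treatment" and "post-treatment", the treatment name, and the sample IDs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical rows: Subject Treatment is identical within a subject,
# while Treatment Status distinguishes the individual samples.
samples = [
    {"SampleID": "S1", "SubjectID": "P01",
     "SubjectTreatment": "drug A", "TreatmentStatus": "pre-treatment"},
    {"SampleID": "S2", "SubjectID": "P01",
     "SubjectTreatment": "drug A", "TreatmentStatus": "post-treatment"},
    {"SampleID": "S3", "SubjectID": "P02",
     "SubjectTreatment": "drug A", "TreatmentStatus": "pre-treatment"},
    {"SampleID": "S4", "SubjectID": "P02",
     "SubjectTreatment": "drug A", "TreatmentStatus": "post-treatment"},
]

# Index each subject's samples by their Treatment Status.
by_subject = defaultdict(dict)
for row in samples:
    by_subject[row["SubjectID"]][row["TreatmentStatus"]] = row["SampleID"]

# Keep only subjects that have both a pre- and a post-treatment sample.
pairs = [
    (status["pre-treatment"], status["post-treatment"])
    for status in by_subject.values()
    if "pre-treatment" in status and "post-treatment" in status
]
print(pairs)
```

Filtering on Subject Treatment alone would return all four samples; Treatment Status is what separates the baseline samples from the treated ones.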


"Disease vs. Normal","Disease1 vs. Disease2","CellType1 vs. CellType2","Treatment vs. Control","Treatment1 vs. Treatment2","Responder vs. Non-Responder","Tissue1 vs. Tissue2","Healthy vs. Control","Other Comparisons"