Gene Ontology

Provides access to Gene Ontology and its gene annotations.

Class References

class Ontology[source]

Ontology is the class representing a gene ontology.

Parameters
  • filename (str) – Path to an .obo formatted file.

  • progress_callback – Optional progress callback: a function that takes a float and returns None.

Example

>>> # Load the current ontology (downloading it if necessary)
>>> ontology = Ontology()
>>> term_ids = list(ontology)
>>> term = ontology[term_ids[0]]
__contains__(termid)[source]

Return True if a term with termid is present in the ontology.

__getitem__(termid)[source]

Return a Term object with termid.

Parameters

termid (str) – The id of a Term in the ontology.

Return type

Term

__iter__()[source]

Iterate over all term ids in ontology.

__len__()[source]

Return number of terms in ontology.
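
The four special methods above let an Ontology be used like a read-only mapping keyed by term id. The following is a minimal sketch of code that relies only on that protocol. The helper find_terms is hypothetical, and a plain dict stands in for a loaded Ontology so the snippet runs without downloading anything.

```python
def find_terms(ontology, term_ids):
    """Split term_ids into (found, missing) using only `in` and `[]`."""
    found, missing = {}, []
    for term_id in term_ids:
        if term_id in ontology:                  # uses __contains__
            found[term_id] = ontology[term_id]   # uses __getitem__
        else:
            missing.append(term_id)
    return found, missing


# Stand-in for go.Ontology(): a dict supports the same container protocol.
toy_ontology = {
    "GO:0008150": "biological_process",
    "GO:0097194": "execution phase of apoptosis",
}

found, missing = find_terms(toy_ontology, ["GO:0097194", "GO:9999999"])
print(len(toy_ontology), sorted(toy_ontology))   # uses __len__ and __iter__
print(found, missing)
```

Checking membership before indexing keeps unknown ids from interrupting a batch lookup.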

defined_slims_subsets()[source]

Return a list of defined subsets in the ontology.

Return type

list of str

extract_sub_graph(terms)[source]

Return all sub terms of terms.

Parameters

terms (list) – A list of term IDs.

extract_super_graph(terms)[source]

Return all super terms of terms up to the most general one.

Parameters

terms (list) – A list of term IDs.
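
To illustrate what collecting super terms involves, here is a small sketch that climbs parent links until it reaches the most general term. It is not the library's implementation: super_terms and the toy hierarchy are hypothetical, only is_a style links are followed, and whether the real method includes the input terms in its result is an assumption.

```python
def super_terms(is_a, term_ids):
    """Return term_ids plus every ancestor reachable through is_a links."""
    visited = set()
    stack = list(term_ids)
    while stack:
        term_id = stack.pop()
        if term_id in visited:
            continue                     # already expanded via another path
        visited.add(term_id)
        stack.extend(is_a.get(term_id, []))
    return visited


# Hypothetical hierarchy: each key maps to the ids of its parent terms.
toy_is_a = {
    "T:leaf": ["T:mid"],
    "T:mid": ["T:root"],
    "T:root": [],
    "T:other": ["T:root"],
}

print(sorted(super_terms(toy_is_a, ["T:leaf"])))
```

Collecting sub terms works the same way with the links reversed (parent to children).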

named_slims_subset(subset)[source]

Return all term IDs in a named subset.

Parameters

subset (str) – A string naming a subset in the ontology.

Return type

list of str

set_slims_subset(subset)[source]

Set the ontology's slim term subset (slims_subset) to subset.

Parameters

subset (set) – A subset of GO term IDs.

subset may also be a string, in which case the call is equivalent to ont.set_slims_subset(ont.named_slims_subset(subset)).

slims_for_term(term)[source]

Return a list of slim term IDs for term.

This is the list of the most specific slim terms to which term belongs.

Parameters

term (str) – Term ID.
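
One way to picture the lookup of slim terms is a climb through parent links that stops on each branch at the first slim term it meets. The sketch below does exactly that and nothing more. slims_for and the toy data are hypothetical, and the library may resolve ties between branches differently.

```python
def slims_for(is_a, slims, term_id):
    """Climb is_a links from term_id, stopping each branch at its first slim term."""
    found, visited, stack = set(), set(), [term_id]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        if current in slims:
            found.add(current)           # do not climb past a slim term
        else:
            stack.extend(is_a.get(current, []))
    return found


# Hypothetical hierarchy: each key maps to the ids of its parent terms.
toy_is_a = {"T:leaf": ["T:mid"], "T:mid": ["T:root"], "T:root": []}

print(slims_for(toy_is_a, {"T:mid", "T:root"}, "T:leaf"))
```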

class Term[source]
id

The term id.

namespace

The namespace of the term.

def_

The term definition (note the trailing underscore, which avoids a conflict with the Python keyword def).

is_a

List of term ids this term is a subterm of (parent terms).

related

List of (rel_type, term_id) tuples with rel_type specifying the relationship type with term_id.

class Annotations[source]

An Annotations object holds the gene annotations for an organism.

Parameters
  • organism (str) – An organism specifier, for example the NCBI taxonomy id '9606' (human). Annotations for that organism will be loaded.

  • ontology (Ontology) – The Ontology object the annotations refer to.

gene_annotations

A dictionary mapping a gene (gene_id) to a set of all annotations of that gene.

term_anotations

A dictionary mapping a GO term id to the set of annotations made directly to that term.

annotations

A list of all AnnotationRecord instances.
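
The three attributes above are views of the same records: one flat list and two indices over it. A minimal sketch of that relationship follows. The Record tuple is a hypothetical stand-in carrying only the fields used in the usage examples (gene_id, go_id, evidence), and all ids are made up.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for AnnotationRecord with three of its fields.
Record = namedtuple("Record", ["gene_id", "go_id", "evidence"])

records = [                      # plays the role of `annotations`
    Record("g1", "T:a", "IDA"),
    Record("g1", "T:b", "IEA"),
    Record("g2", "T:b", "IMP"),
]

gene_index = defaultdict(set)    # plays the role of `gene_annotations`
term_index = defaultdict(set)    # plays the role of `term_anotations`
for rec in records:
    gene_index[rec.gene_id].add(rec)
    term_index[rec.go_id].add(rec)

print(sorted(r.go_id for r in gene_index["g1"]))
print(sorted(r.gene_id for r in term_index["T:b"]))
```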

add_annotation(a)[source]

Add a single AnnotationRecord instance to this object.

get_genes_with_known_annotation(genes)[source]

Return only those genes that have a known annotation.

Parameters

genes – List of genes

get_annotations_by_go_id(go_id)[source]

Return a set of all annotations (instances of AnnotationRecord) for the GO term go_id and all its subterms.

Parameters

go_id (str) – GO term id

get_genes_by_go_term(go_id, evidence_codes=None)[source]

Return a list of genes annotated with the specified evidence_codes to the GO term go_id and all its subterms.

Parameters
  • go_id (str) – GO term id

  • evidence_codes (list of str) – List of evidence codes to consider when matching annotations to terms.
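
The lookup described above combines two steps: expand the term to itself plus all of its subterms, then keep the genes whose records match one of the evidence codes. A minimal sketch under those assumptions follows. genes_for_term, the Record tuple and the toy data are hypothetical and do not reproduce the library's code.

```python
from collections import namedtuple

# Hypothetical stand-in for AnnotationRecord with three of its fields.
Record = namedtuple("Record", ["gene_id", "go_id", "evidence"])


def genes_for_term(records, children, go_id, evidence_codes=None):
    """Genes annotated to go_id or any of its subterms, optionally filtered by evidence."""
    terms, stack = set(), [go_id]
    while stack:                                  # collect go_id and its descendants
        term = stack.pop()
        if term not in terms:
            terms.add(term)
            stack.extend(children.get(term, []))
    return sorted({
        rec.gene_id
        for rec in records
        if rec.go_id in terms
        and (evidence_codes is None or rec.evidence in evidence_codes)
    })


toy_children = {"T:root": ["T:child"]}            # parent id -> child ids
toy_records = [
    Record("g1", "T:root", "IDA"),
    Record("g2", "T:child", "IEA"),
    Record("g3", "T:other", "IDA"),
]

print(genes_for_term(toy_records, toy_children, "T:root"))
print(genes_for_term(toy_records, toy_children, "T:root", evidence_codes={"IDA"}))
```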

get_enriched_terms(genes, reference=None, evidence_codes=None, slims_only=False, aspect=None, prob=<orangecontrib.bioinformatics.utils.statistics.Binomial object>, use_fdr=True, progress_callback=None)[source]

Return a dictionary of enriched terms that maps each term id to a tuple of (list_of_genes, p_value, reference_count). P-values are FDR adjusted if use_fdr is True (default).

Parameters
  • genes – List of genes

  • reference – List of genes (if None all genes included in the annotations will be used).

  • evidence_codes – List of evidence codes to consider.

  • slims_only – If True return only slim terms.

  • aspect – Which aspect to use: one of Process (biological process), Function (molecular function) or Component (cellular component). All aspects are used by default.

  • prob

  • use_fdr

  • progress_callback
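
As background for the prob and use_fdr parameters, the sketch below shows one common way to obtain such numbers: a binomial tail probability per term, followed by a Benjamini-Hochberg adjustment across terms. Both function names are hypothetical, and it is an assumption that the library's Binomial object and its FDR procedure compute these exact quantities.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers: all 3 query genes hit a term carried by 10% of the reference.
print(binomial_tail(3, 3, 0.1))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

Here n is the number of query genes, k the number of them annotated to the term, and p the fraction of reference genes annotated to it.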

get_annotated_terms(genes, direct_annotation_only=False, evidence_codes=None, progress_callback=None)[source]

Return all terms that are annotated by genes with evidence_codes.

add(line)[source]

Add one annotation.

append(line)[source]

Add one annotation.

extend(lines)[source]

Add multiple annotations.

class AnnotationRecord[source]

An annotation record mapping a gene to a term.

See ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/README for a description of the individual fields, under the <gene2go> section.

classmethod from_string(string)[source]

Create an instance from a line of an annotations file in the format used by serverfiles.

Usage

Load the ontology and print out some terms:

from orangecontrib.bioinformatics import go
ontology = go.Ontology()
term = ontology["GO:0097194"] # execution phase of apoptosis

# print a term
print(term)

# access fields by name
print(term.id, term.name)
# note the trailing underscore: def is a Python keyword
print(term.def_)

Searching the annotations (part of code/go/gene_annotations.py)

from orangecontrib.bioinformatics import go

ontology = go.Ontology()

# Load annotations for yeast.
annotations = go.Annotations("4932", ontology=ontology)

# keys are symbol names, values are Entrez IDs
genes = {'RRB1': '855161', 'OST4': '851366', 'VID27': '855509'}
res = annotations.get_enriched_terms(genes.values())


print(annotations.gene_annotations['855161'])
for a in annotations.gene_annotations['855161']:
    print(ontology[a.go_id].name + " with evidence code " + a.evidence)


# Get all genes annotated to the same terms as RRB1
ids = {a.go_id for a in annotations.gene_annotations['855161']}
for term_id in ids:
    ants = annotations.get_annotations_by_go_id(term_id)
    # use a separate name so the genes dict defined above is not overwritten
    term_genes = {a.gene_id for a in ants}
    print(", ".join(term_genes) + " annotated to " + term_id + " " + ontology[term_id].name)

Term enrichment (part of code/go/enrichment.py)

from orangecontrib.bioinformatics import go

ontology = go.Ontology()
annotations = go.Annotations("4932", ontology=ontology)

# keys are symbol names, values are Entrez IDs
genes_ids = {'Yta7p': '853186', 'RPN2': '854735', 'RPT2': '851557'}
res = annotations.get_enriched_terms(genes_ids.values())

print(res)
print("Enriched terms:")
for go_id, (genes, p_value, ref) in res.items():
    if p_value < 0.05:
        print(ontology[go_id].name + " with p-value: %.4f " % p_value + ", ".join(genes))

# And again for slims
annotations.ontology.set_slims_subset('goslim_yeast')

res = annotations.get_enriched_terms(genes_ids.values(), slims_only=True)
print("\n\nEnriched slim terms:")
for go_id, (genes, p_value, _) in res.items():
    if p_value < 0.2:
        print(ontology[go_id].name + " with p-value: %.4f " % p_value + ", ".join(genes))
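
The loops above print terms in dictionary order. Since the result maps each term id to a (list_of_genes, p_value, reference_count) tuple, it is easy to filter and rank it first. The helper significant_terms below is a hypothetical sketch, and the toy result uses made-up ids and values so that it runs without loading any annotations.

```python
def significant_terms(result, alpha=0.05):
    """Return (term_id, genes, p_value) triples with p_value < alpha, best first."""
    hits = [
        (term_id, genes, p_value)
        for term_id, (genes, p_value, _ref) in result.items()
        if p_value < alpha
    ]
    return sorted(hits, key=lambda hit: hit[2])


# Made-up result in the documented shape: term id -> (genes, p_value, reference_count).
toy_result = {
    "T:a": (["g1", "g2"], 0.20, 40),
    "T:b": (["g1"], 0.001, 5),
    "T:c": (["g2", "g3"], 0.03, 12),
}

for term_id, genes, p_value in significant_terms(toy_result):
    print("%s p=%.4f %s" % (term_id, p_value, ", ".join(genes)))
```

With a real result, term_id can be passed to ontology[term_id].name to print the term name, as in the listings above.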