KEGG - Kyoto Encyclopedia of Genes and Genomes
kegg is a Python module for accessing KEGG (Kyoto Encyclopedia of Genes and Genomes) using its web services.
>>> # Create a KEGG Genome database interface
>>> genome = KEGGGenome()
>>> # List all available entry ids
>>> keys = list(genome.keys())
>>> print(keys[0])
T01001
>>> # Retrieve the entry for the key.
>>> entry = genome[keys[0]]
>>> print(entry.entry_key)
T01001
>>> print(entry.definition)
Homo sapiens (human)
>>> print(entry)
ENTRY T01001 Complete Genome
NAME hsa, HUMAN, 9606
DEFINITION Homo sapiens (human)
...
The Organism class can be a convenient starting point for organism-specific databases.
>>> organism = Organism("Homo sapiens") # searches for the organism by name
>>> print(organism.org_code) # prints the KEGG organism code
hsa
>>> genes = organism.genes # get the genes database for the organism
>>> gene_ids = list(genes.keys()) # KEGG gene identifiers
>>> entry = genes["hsa:672"]
>>> print(entry.definition)
(RefSeq) BRCA1, DNA repair associated
>>> # print the entry in DBGET database format.
>>> print(entry)
ENTRY 672 CDS T01001
NAME BRCA1, BRCAI, BRCC1, BROVCA1, FANCS, IRIS, PNCA4, PPP1R53, PSCP, RNF53
DEFINITION ...
- class Organism(org)[source]
A convenience class for retrieving information regarding an organism in the KEGG Genes database.
- Parameters
org (str) – KEGG organism code (e.g. “hsa”, “sce”). Can also be a descriptive name (e.g. “yeast”, “homo sapiens”), in which case the organism code is looked up using the KEGG find API.
See also
organism_name_search
Search KEGG for an organism code
- property org
KEGG organism code.
- gene_aliases()[source]
Return a list of sets of equal genes (synonyms) in KEGG for this organism.
Note
This only includes ‘ncbi-geneid’ and ‘ncbi-proteinid’ records from the KEGG Genes DBLINKS entries.
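To make the "list of sets" return shape concrete, here is a minimal sketch that builds one alias set per gene from already-parsed DBLINKS data. The `alias_sets` helper, the shape of the `dblinks` dictionary, and the `db:id` string format are assumptions made for illustration; they are not taken from the module, and the protein id below is a made-up placeholder.

```python
def alias_sets(dblinks, databases=("ncbi-geneid", "ncbi-proteinid")):
    """Return one set of synonymous identifiers per gene.

    `dblinks` maps a KEGG gene id to a {database: [ids]} dict.
    This is a hypothetical, simplified stand-in for parsed DBLINKS sections.
    """
    result = []
    for gene_id, links in dblinks.items():
        aliases = {gene_id}
        # Only the selected databases contribute aliases; others are ignored.
        for db in databases:
            for ext_id in links.get(db, []):
                aliases.add(f"{db}:{ext_id}")
        result.append(aliases)
    return result


# Illustrative input only: "PROT_A" and "UNI_A" are placeholders, not real ids.
example = {
    "hsa:672": {
        "ncbi-geneid": ["672"],
        "ncbi-proteinid": ["PROT_A"],
        "uniprot": ["UNI_A"],
    },
}
sets_ = alias_sets(example)
```

Records from databases outside the selected two (here the placeholder `uniprot` entry) are dropped, which mirrors the restriction stated in the note.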
- organism_name_search(name)[source]
Search for an organism by name and return its KEGG organism code.
DBEntry (entry)
The entry.DBEntry class represents a DBGET database entry.
The individual KEGG Database interfaces below provide their own
specialization for this base class.
KEGG Databases interface (databases)
- class DBDataBase(**kwargs)[source]
Bases: object
Base class for a DBGET database interface.
- ENTRY_TYPE
ENTRY_TYPE constructor (a DBEntry subclass). This should be redefined in subclasses.
alias of DBEntry
- DB = None
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- itervalues()[source]
Return an iterator over all DBDataBase.ENTRY_TYPE instances.
- keys()[source]
Return an iterator over all database keys. These are unique KEGG identifiers that can be used to query the database.
- values()[source]
Return an iterator over all DBDataBase.ENTRY_TYPE instances.
- items()[source]
Return an iterator over all (key, DBDataBase.ENTRY_TYPE) tuples.
- get(key, default=None)[source]
Return a DBDataBase.ENTRY_TYPE instance for the key. Raises KeyError if not found.
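The subclassing contract described for DBDataBase (set DB and ENTRY_TYPE, then let the base constructor run) can be sketched with stand-in classes. Everything below is hypothetical: `ToyEntry`, `ToyDataBase` and `ToyGenome` are illustrative names, and the records live in a plain dict instead of being fetched from the KEGG web services.

```python
class ToyEntry:
    """Stand-in for a DBEntry subclass: wraps the raw entry text."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return f"ToyEntry({self.text!r})"


class ToyDataBase:
    """Stand-in for a DBGET database interface backed by a dict."""

    ENTRY_TYPE = ToyEntry  # subclasses may point this at their own entry class
    DB = None              # must be set before the base __init__ runs

    def __init__(self, records):
        if self.DB is None:
            raise TypeError("DB must be set by a subclass or instance")
        self._records = dict(records)

    def keys(self):
        return iter(self._records)

    def values(self):
        return (self[key] for key in self.keys())

    def items(self):
        return ((key, self[key]) for key in self.keys())

    def __getitem__(self, key):
        # Unknown keys raise KeyError, as with a plain dict.
        return self.ENTRY_TYPE(self._records[key])


class ToyGenome(ToyDataBase):
    DB = "genome"


db = ToyGenome({"T01001": "ENTRY T01001 Complete Genome"})
first_key = next(db.keys())
entry = db[first_key]
```

Keeping DB and ENTRY_TYPE as class attributes means a new database interface only needs to override two names; the mapping-style methods are inherited unchanged.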
- class GenomeEntry(text)[source]
Bases: DBEntry
Entry for a KEGG Genome database.
- property organism_code
A three or four letter KEGG organism code (e.g. ‘hsa’, ‘sce’, …)
- property taxid
Organism NCBI taxonomy id.
- property annotation
ANNOTATION
- property chromosome
CHROMOSOME
- property comment
COMMENT
- property data_source
DATA_SOURCE
- property definition
DEFINITION
- property disease
DISEASE
- property entry
ENTRY
- property keywords
KEYWORDS
- property name
NAME
- property org_code
ORG_CODE
- property original_db
ORIGINAL_DB
- property plasmid
PLASMID
- property reference
REFERENCE
- property statistics
STATISTICS
- property taxonomy
TAXONOMY
- class Genome[source]
Bases: DBDataBase
An interface to the KEGG GENOME database.
- DB = 'genome'
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- ENTRY_TYPE
alias of GenomeEntry
- class GeneEntry(text=None)[source]
Bases: DBEntry
- property aaseq
AASEQ
- property brite
BRITE
- property class_
CLASS
- property dblinks
DBLINKS
- property definition
DEFINITION
- property disease
DISEASE
- property drug_target
DRUG_TARGET
- property entry
ENTRY
- property module
MODULE
- property motif
MOTIF
- property name
NAME
- property ntseq
NTSEQ
- property organism
ORGANISM
- property orthology
ORTHOLOGY
- property pathway
PATHWAY
- property position
POSITION
- property structure
STRUCTURE
- class Genes(org_code)[source]
Bases: DBDataBase
Interface to the KEGG Genes database.
- Parameters
org_code (str) – KEGG organism code (e.g. ‘hsa’).
- DB = None
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- class CompoundEntry(text=None)[source]
Bases: DBEntry
- property atom
ATOM
- property bond
BOND
- property brite
BRITE
- property comment
COMMENT
- property dblinks
DBLINKS
- property entry
ENTRY
- property enzyme
ENZYME
- property exact_mass
EXACT_MASS
- property formula
FORMULA
- property mol_weight
MOL_WEIGHT
- property name
NAME
- property pathway
PATHWAY
- property reaction
REACTION
- property reference
REFERENCE
- property remark
REMARK
- class Compound[source]
Bases: DBDataBase
- DB = 'cpd'
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- ENTRY_TYPE
alias of CompoundEntry
- class ReactionEntry(text=None)[source]
Bases: DBEntry
- property definition
DEFINITION
- property entry
ENTRY
- property enzyme
ENZYME
- property equation
EQUATION
- property name
NAME
- class Reaction[source]
Bases: DBDataBase
- DB = 'rn'
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- ENTRY_TYPE
alias of ReactionEntry
- class EnzymeEntry(text=None)[source]
Bases: DBEntry
- property all_reac
ALL_REAC
- property class_
CLASS
- property comment
COMMENT
- property dblinks
DBLINKS
- property entry
ENTRY
- property genes
GENES
- property name
NAME
- property orthology
ORTHOLOGY
- property pathway
PATHWAY
- property product
PRODUCT
- property reaction
REACTION
- property reference
REFERENCE
- property substrate
SUBSTRATE
- property sysname
SYSNAME
- class Enzyme[source]
Bases: DBDataBase
- DB = 'ec'
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- ENTRY_TYPE
alias of EnzymeEntry
- class PathwayEntry(text=None)[source]
Bases: DBEntry
- property class_
CLASS
- property compound
COMPOUND
- property dblinks
DBLINKS
- property description
DESCRIPTION
- property disease
DISEASE
- property drug
DRUG
- property entry
ENTRY
- property enzyme
ENZYME
- property ko_pathway
KO_PATHWAY
- property module
MODULE
- property name
NAME
- property organism
ORGANISM
- property pathway_map
PATHWAY_MAP
- property reference
REFERENCE
- property rel_pathway
REL_PATHWAY
- class Pathway(prefix='map')[source]
Bases: DBDataBase
KEGG Pathway database
- Parameters
prefix (str) – KEGG Organism code (‘hsa’, …) or ‘map’, ‘ko’, ‘ec’ or ‘rn’
- DB = 'path'
A database name/abbreviation (e.g. ‘pathway’). Needs to be set in a subclass or object instance’s constructor before calling the base __init__.
- ENTRY_TYPE
alias of PathwayEntry
KEGG Pathway (pathway)
- class Pathway(pathway_id, local_cache=None, connection=None)[source]
Bases: object
Class representing a KEGG Pathway (parsed from a “kgml” file)
- Parameters
pathway_id (str) – A KEGG pathway id (e.g. ‘path:hsa05130’)
- property name
Pathway name/id (e.g. “path:hsa05130”)
- property org
Pathway organism code (e.g. ‘hsa’)
- property number
Pathway number as a string (e.g. ‘05130’)
- property title
Pathway title string.
- property image
URL of the pathway image.
- property link
URL to a pathway on the KEGG web site.
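The relationship between the name, org and number properties can be illustrated by splitting a pathway id into its parts. This is a minimal sketch, not the class's implementation: the `split_pathway_id` helper and the assumed id shape (an optional `path:` prefix, a lowercase prefix such as an organism code or `map`, then five digits) are illustrative assumptions.

```python
import re

# Assumed id shape: optional "path:" prefix, lowercase letters, five digits.
_PATHWAY_ID = re.compile(r"^(?:path:)?([a-z]+)(\d{5})$")


def split_pathway_id(pathway_id):
    """Split e.g. 'path:hsa05130' into ('hsa', '05130')."""
    match = _PATHWAY_ID.match(pathway_id)
    if match is None:
        raise ValueError(f"not a recognised pathway id: {pathway_id!r}")
    return match.group(1), match.group(2)


org, number = split_pathway_id("path:hsa05130")
```

The number is kept as a string so that leading zeros survive, matching the documented ‘05130’ form.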
Utilities
- class DBGETEntryParser[source]
A DBGET entry parser (inspired by xml.dom.pulldom).
Example
>>> stream = StringIO(
...     "ENTRY foo\n"
...     "NAME foo's name\n"
...     " BAR A subsection of 'NAME'\n"
... )
>>> parser = DBGETEntryParser()
>>> for event, title, contents_part in parser.parse(stream):
...     print(parser.EVENTS[event], title, repr(contents_part))
...
ENTRY_START None None
SECTION_START ENTRY 'foo\n'
SECTION_END ENTRY None
SECTION_START NAME "foo's name\n"
SUBSECTION_START BAR "A subsection of 'NAME'\n"
SUBSECTION_END BAR None
SECTION_END NAME None
ENTRY_END None None
- ENTRY_END = 1
Entry end event
- ENTRY_START = 0
Entry start event
- SECTION_END = 3
Section end event
- SECTION_START = 2
Section start event
- SUBSECTION_END = 5
Subsection end event
- SUBSECTION_START = 4
Subsection start event
- TEXT = 6
Text element event
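For readers unfamiliar with pull-style parsing, the event stream can be sketched as a small generator. This is not DBGETEntryParser's code: `parse_events`, `field` and `KEY_WIDTH` are hypothetical names, and the sketch assumes the fixed 12-column keyword field of KEGG flat files, with subsection keywords indented inside that field and continuation lines leaving it blank. Only the event constants reuse the values documented above.

```python
from io import StringIO

# Same numeric values as the documented parser constants.
(ENTRY_START, ENTRY_END, SECTION_START, SECTION_END,
 SUBSECTION_START, SUBSECTION_END, TEXT) = range(7)

KEY_WIDTH = 12  # assumed width of the keyword column


def parse_events(stream):
    """Yield (event, title, text) tuples for a single flat-file entry."""
    section = subsection = None
    yield ENTRY_START, None, None
    for line in stream:
        if line.startswith("///"):  # entry terminator
            break
        key, text = line[:KEY_WIDTH], line[KEY_WIDTH:]
        if not key.strip():
            # Blank keyword column: continuation text (or an empty line).
            if text.strip():
                yield TEXT, None, text
            continue
        title = key.strip()
        if subsection is not None:
            # Any new keyword closes the open subsection.
            yield SUBSECTION_END, subsection, None
            subsection = None
        if key[0].isspace():
            # Indented keyword: a subsection of the current section.
            subsection = title
            yield SUBSECTION_START, subsection, text
        else:
            if section is not None:
                yield SECTION_END, section, None
            section = title
            yield SECTION_START, section, text
    if subsection is not None:
        yield SUBSECTION_END, subsection, None
    if section is not None:
        yield SECTION_END, section, None
    yield ENTRY_END, None, None


def field(key, text):
    """Build one flat-file line with the keyword padded to KEY_WIDTH."""
    return key.ljust(KEY_WIDTH) + text + "\n"


sample = (
    field("ENTRY", "foo")
    + field("NAME", "foo's name")
    + field("  BAR", "A subsection of 'NAME'")
    + field("", "continued text")
    + "///\n"
)
events = list(parse_events(StringIO(sample)))
```

Because the generator yields events lazily, a caller can stop reading as soon as it has seen the section it needs, which is the main appeal of the pulldom-style design.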