Computing double-pushout graph transformation rules and atom-to-atom maps from KEGG RCLASS data
Abstract
Background
Atom-to-atom maps play an important role in many applications. However, they are often difficult to obtain. The KEGG reaction database does not provide atom-to-atom maps for its reactions and instead offers a description of local changes for pairs of reactant and product molecules in terms of so-called RCLASSes. Developed for classification purposes, RCLASS data are difficult to use for purposes such as the construction of atom-to-atom maps or reaction rules. Double-pushout (DPO) graph transformation rules, on the other hand, are a convenient and efficient representation for exactly these applications. The RCLASS data can be understood as collections of local graph patterns in the reactants and products of a reaction, together with partial correspondences of atoms. The problem of converting RCLASS data into DPO rules is therefore a special case of the graph reconstruction problem, which consists of inferring a graph from a collection of subgraphs.
Results
We developed a tool that computes explicit DPO rules from KEGG reactions and RCLASS data. The algorithm proceeds stepwise, starting with a translation of individual RDM codes, a notation developed specifically by the KEGG database, into equivalent RDM pattern graphs. Multiple RDM pattern graphs for the same RCLASS are then combined based on their embeddings into the reactant and product molecules, observing certain consistency conditions. In the final step, these combined pairwise patterns are merged into a pair of subgraphs of reactants and products, respectively. If RCLASSes connecting all pairs of reactant and product molecules are available, the complete reaction center (or centers) is contained in the union of these subgraphs. The atom-to-atom map inherited from the RDM codes then defines a DPO transformation rule. Applying these rules to the reactants yields complete atom-to-atom maps (AAMs). Starting from 3195 RCLASSes, the tool generates a total of 1232 DPO rules and 1594 AAMs.
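The combination and rule-construction steps can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names `merge_partial_maps` and `rule_from_map`, the atom identifiers, and the consistency condition used here (the merged atom correspondence must remain a one-to-one partial map) are assumptions made for illustration only. Bonds are stored as unordered atom pairs with a bond order, and the rule is returned as a span of bond sets (left side, preserved context, right side), with every mapped atom treated as preserved.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

AtomMap = Dict[str, str]           # reactant atom id -> product atom id
Bonds = Dict[FrozenSet[str], int]  # unordered atom pair -> bond order


def merge_partial_maps(partials: Iterable[AtomMap]) -> Optional[AtomMap]:
    """Union partial atom maps; return None if they are inconsistent.

    Assumed consistency condition: no reactant atom is sent to two
    different product atoms, and no product atom has two preimages.
    """
    merged: AtomMap = {}
    inverse: Dict[str, str] = {}
    for partial in partials:
        for src, dst in partial.items():
            if merged.get(src, dst) != dst or inverse.get(dst, src) != src:
                return None
            merged[src] = dst
            inverse[dst] = src
    return merged


def rule_from_map(
    reactant_bonds: Bonds, product_bonds: Bonds, atom_map: AtomMap
) -> Tuple[Bonds, Bonds, Bonds]:
    """Build (left, context, right) bond sets of a rule from an atom map.

    All three sets use reactant atom ids, so the context is simply the
    set of bonds that occur with the same order on both sides.
    """
    inverse = {dst: src for src, dst in atom_map.items()}
    # Left side: reactant bonds between mapped atoms.
    left = {
        pair: order
        for pair, order in reactant_bonds.items()
        if all(atom in atom_map for atom in pair)
    }
    # Right side: product bonds between mapped atoms, renamed to reactant ids.
    right: Bonds = {}
    for pair, order in product_bonds.items():
        if all(atom in inverse for atom in pair):
            right[frozenset(inverse[atom] for atom in pair)] = order
    # Context: bonds left unchanged by the rule.
    context = {pair: order for pair, order in left.items() if right.get(pair) == order}
    return left, context, right
```

For example, merging the hypothetical partial maps `{"r1": "p1", "r2": "p2"}` and `{"r1": "p1", "r3": "p3"}` yields one map over three atoms. With a reactant bond r1-r2 and a product bond p1-p3, the resulting rule breaks r1-r2 and forms r1-r3, with no bond in the context.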
Conclusions
The software makes it possible to extract local atom-to-atom maps from the RCLASSes of the KEGG database, covering a large set of enzyme-catalyzed reactions. The results are made available in the form of DPO rules for use in atom-level models of metabolic networks, filling a crucial gap in the available data.