bionetgen.atomizer.atomizer package
Submodules
bionetgen.atomizer.atomizer.analyzeRDF module
bionetgen.atomizer.atomizer.analyzeSBML module
Created on Thu Mar 22 13:11:38 2012
@author: proto
- class bionetgen.atomizer.atomizer.analyzeSBML.SBMLAnalyzer(modelParser, configurationFile, namingConventions, speciesEquivalences=None, conservationOfMass=True)[source]
Bases: object
- analyzeSpeciesModification(baseElement, modifiedElement, partialAnalysis)[source]
A method that tries to read modifications within complexes. This is only possible once their internal structure is known, so the method is called after the dependency graph has been created and resolved.
- analyzeSpeciesModification2(baseElement, modifiedElement, partialAnalysis)[source]
A method to read modifications within complexes.
- approximateMatching2(reactantString, productString, strippedMolecules, differenceParameter)[source]
The core of the naming-convention matching between reactant and product is done here. In short: naming conventions are hard.
- checkCompliance(ruleCompliance, tupleCompliance, ruleBook)[source]
This method is mainly useful when a single reaction could be classified in several ways on its own, but only one classification is consistent with its tuple partners.
- classifyReactions(reactions, molecules, externalDependencyGraph={})[source]
Classifies a group of reactions according to the information in the JSON configuration file.
FIXME: classifyReactions is currently the biggest bottleneck in atomizer, taking up to 80% of the run time, not counting Pathway Commons querying.
- classifyReactionsWithAnnotations(reactions, molecules, annotations, labelDictionary)[source]
This method goes through the list of reactions and, aided by annotation information, assigns a 'modification' tag to those reactions in which some kind of modification takes place.
- findClosestModification(particles, species, annotationDict, originalDependencyGraph)[source]
Maps a set of particles to the complete set of species using lexical analysis. This step is done independently of the reaction network.
- fuzzyArtificialReaction(baseElements, modifiedElement, molecules)[source]
If we do not know how a species is composed but do know its base elements, try to obtain it by concatenating its basic reactants.
- getReactionClassification(reactionDefinition, rules, equivalenceTranslator, indirectEquivalenceTranslator, translationKeys=[])[source]
Goes through the list of rules and the list of rule definitions and reports which rules can be classified according to the rule definitions provided.
- Args:
reactionDefinition: list of conditions that must be met for a reaction to be classified a certain way
rules: the list of reactions
equivalenceTranslator: dictionary containing all complexes that have been determined to be the same through naming conventions
- getReactionProperties()[source]
If a naming convention definition is used in the JSON file, this method returns the component and state names that this reaction uses.
- greedyModificationMatching(speciesString, referenceSpecies)[source]
Recursive function that tries to map a given species string to a permutation of the strings in referenceSpecies.
>>> sa = SBMLAnalyzer(None,'./config/reactionDefinitions.json','./config/namingConventions.json')
>>> sorted(sa.greedyModificationMatching('EGF_EGFR',['EGF','EGFR']))
['EGF', 'EGFR']
>>> sorted(sa.greedyModificationMatching('EGF_EGFR_2_P_Grb2',['EGF','EGFR','EGF_EGFR_2_P','Grb2']))
['EGF_EGFR_2_P', 'Grb2']
>>> sorted(sa.greedyModificationMatching('A_B_C_D',['A','B','C','C_D','A_B_C','A_B']))
['A_B', 'C_D']
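The matching strategy described above can be illustrated with a small standalone sketch. It is not the library's implementation: `greedy_match` is a hypothetical helper that assumes species names are built from underscore-separated tokens, tries the longest reference name first, and backtracks when the remainder cannot be covered.

```python
def greedy_match(species, references):
    """Cover the '_'-separated tokens of `species` with names from `references`.

    Hypothetical illustration only: longest candidate first, with backtracking.
    Returns a list of reference names, or None if no full cover exists.
    """
    tokens = species.split("_")
    known = set(references)

    def cover(start):
        if start == len(tokens):
            return []  # every token has been consumed
        # Try the longest possible chunk first, then shorter ones.
        for end in range(len(tokens), start, -1):
            candidate = "_".join(tokens[start:end])
            if candidate in known:
                rest = cover(end)
                if rest is not None:
                    return [candidate] + rest
        return None  # dead end, caller will try a shorter chunk

    return cover(0)


print(greedy_match("A_B_C_D", ["A", "B", "C", "C_D", "A_B_C", "A_B"]))
```

In the last call the longest prefix `A_B_C` is tried first, but the leftover `D` is not a reference species, so the sketch falls back to `A_B` followed by `C_D`.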
- growString(reactant, product, rp, pp, idx, strippedMolecules, continuityFlag)[source]
Currently this is the slowest method in the system because of the many calls to difflib.
- identifyReactions2(rule, reactionDefinition)[source]
This method goes through the list of common reactions listed in ruleDictionary and tries to find how they are related according to the information in reactionDefinition.
- loadConfigFiles(fileName)[source]
The reactionDefinition file must contain the definitions of the basic reaction types we want to parse and the requirements a reaction must meet to be considered of a given type.
- processAdHocNamingConventions(reactant, product, localSpeciesDict, compartmentChangeFlag, moleculeSet)[source]
1-1 string comparison. This method attempts to detect whether there is a modification relationship between the strings <reactant> and <product>.
>>> sa = SBMLAnalyzer(None,'./config/reactionDefinitions.json','./config/namingConventions.json')
>>> sa.processAdHocNamingConventions('EGF_EGFR_2','EGF_EGFR_2_P', {}, False, ['EGF','EGFR', 'EGF_EGFR_2'])
[[[['EGF_EGFR_2'], ['EGF_EGFR_2_P']], '_p', ('+ _', '+ p')]]
>>> sa.processAdHocNamingConventions('A', 'A_P', {}, False,['A','A_P']) # changes need to be at least 3 characters long
[[[['A'], ['A_P']], None, None]]
>>> sa.processAdHocNamingConventions('Ras_GDP', 'Ras_GTP', {}, False,['Ras_GDP','Ras_GTP', 'Ras'])
[[[['Ras'], ['Ras_GDP']], '_gdp', ('+ _', '+ g', '+ d', '+ p')], [[['Ras'], ['Ras_GTP']], '_gtp', ('+ _', '+ g', '+ t', '+ p')]]
>>> sa.processAdHocNamingConventions('cRas_GDP', 'cRas_GTP', {}, False,['cRas_GDP','cRas_GTP'])
[[[['cRas'], ['cRas_GDP']], '_gdp', ('+ _', '+ g', '+ d', '+ p')], [[['cRas'], ['cRas_GTP']], '_gtp', ('+ _', '+ g', '+ t', '+ p')]]
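The difference tuples in the doctests above, such as `('+ _', '+ p')`, have the shape of character-level output from the standard library's `difflib.ndiff`. The sketch below shows one way such a tuple and suffix could be derived; `name_difference` is a hypothetical helper, and lowercasing both names is an assumption based on the lowercase suffixes in the expected output.

```python
import difflib


def name_difference(reactant, product):
    """Return (suffix, operations) describing how `product` differs from `reactant`.

    Hypothetical illustration: strings are compared character by character,
    and only additions ('+ x') and deletions ('- x') are kept.
    """
    diff = difflib.ndiff(reactant.lower(), product.lower())
    operations = tuple(op for op in diff if op[0] in "+-")
    # Join the changed characters to obtain a compact label such as '_p'.
    label = "".join(op[2:] for op in operations)
    return label, operations


print(name_difference("EGF_EGFR_2", "EGF_EGFR_2_P"))
```

Unlike the method documented above, this sketch applies no minimum length to the detected change.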
- processFuzzyReaction(reaction, translationKeys, conventionDict, indirectEquivalenceTranslator)[source]
- removeExactMatches(reactantList, productList)[source]
goes through the list of lists reactantList and productList and removes the intersection
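A minimal sketch of the idea, assuming that entries are compared as whole items and that every entry present on both sides is dropped. `remove_common_entries` is a hypothetical name, and the real method may treat duplicates differently.

```python
def remove_common_entries(reactant_list, product_list):
    """Return copies of both lists without the entries they share.

    Hypothetical illustration: entries are compared by equality, so nested
    lists such as ['A'] are matched as a whole.
    """
    common = [entry for entry in reactant_list if entry in product_list]
    reactants = [entry for entry in reactant_list if entry not in common]
    products = [entry for entry in product_list if entry not in common]
    return reactants, products


print(remove_common_entries([["A"], ["B"]], [["A"], ["C"]]))
```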
bionetgen.atomizer.atomizer.atomizationAux module
- exception bionetgen.atomizer.atomizer.atomizationAux.CycleError(memory)[source]
Bases: Exception
Exception raised for errors in the input.
- Attributes:
expr – input expression in which the error occurred
msg – explanation of the error
- bionetgen.atomizer.atomizer.atomizationAux.addAssumptions(assumptionType, assumption, assumptions)[source]
- bionetgen.atomizer.atomizer.atomizationAux.addToDependencyGraph(dependencyGraph, label, value)[source]
- bionetgen.atomizer.atomizer.atomizationAux.getAnnotations(annotation)[source]
parses a libsbml.XMLAttributes annotation object into a list of annotations
- bionetgen.atomizer.atomizer.atomizationAux.getURIFromSBML(moleculeName, parser, filterString=None)[source]
Filters a list of URIs so that only the IDs matching filterString are returned.
- bionetgen.atomizer.atomizer.atomizationAux.parseReactions(reaction)[source]
Given a reaction string definition, separates the elements into reactants and products.
>>> parseReactions('A() + B() -> C() k1()')
[['A', 'B'], ['C']]
>>> parseReactions('A()@EC + B()@PM -> C()@PM k1()')
[['A', 'B'], ['C']]
>>> parseReactions('0 -> A() k1()')
['0', ['A']]
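The doctests above can be reproduced with plain string handling. The following is only a sketch under stated assumptions, not the module's parser: `split_reaction` is a hypothetical helper that assumes species are whitespace-separated, that the last token on the right-hand side is the rate law, and that compartment tags follow the parentheses.

```python
def _side_names(tokens):
    # Drop '+' separators and keep the text before '(' (this also strips '@EC' tags).
    names = [token.split("(")[0] for token in tokens if token != "+"]
    # Mirror the doctest above, where a null side is reported as the string '0'.
    return "0" if names == ["0"] else names


def split_reaction(reaction):
    """Split 'A() + B() -> C() k1()' into [reactant names, product names]."""
    left, right = reaction.split("->")
    reactants = _side_names(left.split())
    products = _side_names(right.split()[:-1])  # last token is the rate law
    return [reactants, products]


print(split_reaction("A()@EC + B()@PM -> C()@PM k1()"))
```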
bionetgen.atomizer.atomizer.atomizerUtils module
bionetgen.atomizer.atomizer.detectOntology module
Created on Sat Oct 19 15:19:35 2013
@author: proto
- bionetgen.atomizer.atomizer.detectOntology.analyzeNamingConventions(speciesName, ontologyFile, ontologyDictionary={}, similarityThreshold=4)[source]
- bionetgen.atomizer.atomizer.detectOntology.defineEditDistanceMatrix(speciesName, similarityThreshold=4, parallel=False)[source]
Obtains a distance matrix and the pairs of elements that are close in distance, along with the proposed differences.
- bionetgen.atomizer.atomizer.detectOntology.defineEditDistanceMatrix3(speciesName, similarityThreshold=4, parallel=False)[source]
- bionetgen.atomizer.atomizer.detectOntology.getDifferences(scoreMatrix, speciesName, threshold)[source]
Given a list of strings and a scoreMatrix, returns the list of differences between those strings whose Levenshtein distance is less than threshold.
Returns:
namePairs: list of tuples containing strings with distance <2
differenceList: list of differences between the tuples in namePairs
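To make the thresholding step concrete, here is a small sketch that computes Levenshtein distances with the classic dynamic-programming recurrence and keeps the name pairs below a threshold. The helpers `levenshtein` and `close_pairs` are hypothetical and written from scratch; the module itself may rely on a different implementation or an external library.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def close_pairs(names, threshold):
    """Return the pairs of names whose edit distance is below `threshold`."""
    return [(a, b) for a, b in combinations(names, 2)
            if levenshtein(a, b) < threshold]


print(close_pairs(["EGFR", "EGFR_P", "Ras_GTP"], 4))
```

Only `EGFR` and `EGFR_P` are two edits apart here, so they form the single candidate pair for a naming convention.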
bionetgen.atomizer.atomizer.moleculeCreation module
Created on Tue Apr 2 21:06:43 2013
@author: proto
- bionetgen.atomizer.atomizer.moleculeCreation.addBondToComponent(species, moleculeName, componentName, bond, priority=1)[source]
- bionetgen.atomizer.atomizer.moleculeCreation.addComponentToMolecule(species, moleculeName, componentName)[source]
- bionetgen.atomizer.atomizer.moleculeCreation.addStateToComponent(species, moleculeName, componentName, state)[source]
- bionetgen.atomizer.atomizer.moleculeCreation.atomize(dependencyGraph, weights, translator, reactionProperties, equivalenceDictionary, bioGridFlag, sbmlAnalyzer, database, parser)[source]
The atomizer's main method. Receives a dependency graph.
- bionetgen.atomizer.atomizer.moleculeCreation.createBindingRBM(element, translator, dependencyGraph, bioGridFlag, pathwaycommonsFlag, parser, database)[source]
- bionetgen.atomizer.atomizer.moleculeCreation.createCatalysisRBM(dependencyGraph, element, translator, reactionProperties, equivalenceDictionary, sbmlAnalyzer, database)[source]
if it’s a catalysis reaction create a new component/state
- bionetgen.atomizer.atomizer.moleculeCreation.getBondNumber(molecule1, molecule2)[source]
Keeps a model-level registry of all the molecule pairs and returns a unique index.
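A registry of this kind can be sketched with a dictionary keyed by the molecule pair. The class below is illustrative only: `BondRegistry` is a hypothetical name, and both the order-independent key and the numbering starting at 1 are assumptions rather than documented behavior.

```python
class BondRegistry:
    """Assigns one stable integer index to each pair of molecule names."""

    def __init__(self):
        self._index = {}

    def bond_number(self, molecule1, molecule2):
        # Sorting makes (A, B) and (B, A) share the same entry (assumption).
        key = tuple(sorted((molecule1, molecule2)))
        if key not in self._index:
            self._index[key] = len(self._index) + 1
        return self._index[key]


registry = BondRegistry()
print(registry.bond_number("EGF", "EGFR"))   # first pair registered
print(registry.bond_number("EGFR", "EGF"))   # same pair, same index
print(registry.bond_number("EGFR", "Grb2"))  # new pair, next index
```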
- bionetgen.atomizer.atomizer.moleculeCreation.getComplexationComponents2(moleculeName, species, bioGridFlag, pathwaycommonsFlag=False, parser=None, bondSeeding=[], bondExclusion=[], database=None)[source]
method used during the atomization process. It determines how molecules in a species bind together
- bionetgen.atomizer.atomizer.moleculeCreation.getTrueTag(dependencyGraph, molecule)[source]
given any modified or basic element it returns its basic name
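One plausible reading of this lookup is to follow 1:1 entries in the dependency graph until an element that is not a modification of anything else is reached. The sketch below uses the dependency-graph layout seen in the resolveSCT doctests (name mapped to a list of candidate lists); `true_tag` is a hypothetical helper, and the stopping rule is an assumption.

```python
def true_tag(dependency_graph, molecule):
    """Follow single-element dependencies until a basic element is found."""
    seen = set()
    while molecule not in seen:
        seen.add(molecule)  # guards against cyclic definitions
        candidates = dependency_graph.get(molecule, [])
        # A modification is assumed to have exactly one candidate with one element.
        if not (len(candidates) == 1 and len(candidates[0]) == 1):
            break
        molecule = candidates[0][0]
    return molecule


graph = {"EGFR": [], "EGFR_P": [["EGFR"]], "EGFR_P_P": [["EGFR_P"]]}
print(true_tag(graph, "EGFR_P_P"))
```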
- bionetgen.atomizer.atomizer.moleculeCreation.identifyReaction(equivalenceDictionary, originalElement, modifiedElement)[source]
- bionetgen.atomizer.atomizer.moleculeCreation.isInComplexWith(moleculeSet, parser=None)[source]
Given a list of binding candidates, gets the UniProt ID from the annotation and queries the Pathway Commons class to see whether there is known binding information for those two.
- bionetgen.atomizer.atomizer.moleculeCreation.sanityCheck(database)[source]
checks for critical atomization errors like isomorphism
- bionetgen.atomizer.atomizer.moleculeCreation.solveComplexBinding(totalComplex, pathwaycommonsFlag, parser, compositionEntry)[source]
given two binding complexes it will attempt to find the ways in which they bind using different criteria
- bionetgen.atomizer.atomizer.moleculeCreation.transformMolecules(parser, database, configurationFile, namingConventions, speciesEquivalences=None, bioGridFlag=False, memoizedResolver=True)[source]
Main method. Receives a parser configuration, a configurationFile and a list of user-defined species equivalences, and returns a dictionary containing an atomized version of the model.
Args:
parser: data structure containing the reactions and species we will use
database: data structure containing the result of the outgoing translation
configurationFile:
speciesEquivalences: predefined species
bionetgen.atomizer.atomizer.resolveSCT module
- class bionetgen.atomizer.atomizer.resolveSCT.SCTSolver(database, memoizedResolver=False)[source]
Bases: object
- bindingReactionsAnalysis(dependencyGraph, reaction, classification)[source]
Adds the dependencies derived from addBond-based (binding) reactions to the dependency graph.
>>> dg = dg2 = {}
>>> dummy = SCTSolver(None)
>>> dummy.bindingReactionsAnalysis(dg, [['A', 'B'], ['C']], 'Binding')
>>> dg == {'A': [], 'C': [['A', 'B']], 'B': []}
True
>>> dummy.bindingReactionsAnalysis(dg2, [['C'], ['A', 'B']], 'Binding')
>>> dg2 == {'A': [], 'C': [['A', 'B']], 'B': []}
True
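The doctest above shows that a binding reaction and its reverse produce the same graph entry. A minimal sketch that reproduces this behavior follows; `add_binding_dependency` is a hypothetical function, and deciding which side is the complex by comparing the number of species is an assumption made for illustration.

```python
def add_binding_dependency(dependency_graph, reaction):
    """Record that the complex in `reaction` is composed of the other side's species."""
    reactants, products = reaction
    # Assumption: the side with more species holds the parts of the complex.
    if len(reactants) >= len(products):
        parts, complexes = reactants, products
    else:
        parts, complexes = products, reactants
    for part in parts:
        dependency_graph.setdefault(part, [])
    candidate = sorted(parts)
    for name in complexes:
        entry = dependency_graph.setdefault(name, [])
        if candidate not in entry:  # avoid duplicate candidates
            entry.append(candidate)
    return dependency_graph


forward = add_binding_dependency({}, [["A", "B"], ["C"]])
backward = add_binding_dependency({}, [["C"], ["A", "B"]])
print(forward == backward)
```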
- consolidateDependencyGraph(dependencyGraph, equivalenceTranslator, equivalenceDictionary, sbmlAnalyzer, loginformation=True)[source]
The second part of the Atomizer algorithm. Once the lexical and stoichiometry information has been extracted, all elements of the system are stated unequivocally in terms of their molecule types.
- createSpeciesCompositionGraph(parser, configurationFile, namingConventions, speciesEquivalences=None, bioGridFlag=False)[source]
Main method for the SCT creation.
It first does stoichiometry analysis, then lexical…
- fillSCTwithAnnotationInformation(orphanedSpecies, annotationDict, logResults=True, tentativeFlag=False)[source]
- measureGraph(element, path)[source]
Calculates the weight of individual paths as the sum of the weights of the individual candidates plus the number of candidates. The weight of an individual candidate is equal to the number of strings contained in that candidate that differ from the original reactant.
>>> dummy = SCTSolver(None)
>>> dummy.measureGraph('Trash',['0'])
1
>>> dummy.measureGraph('EGF',[['EGF']])
2
>>> dummy.measureGraph('EGFR_P',[['EGFR']])
3
>>> dummy.measureGraph('EGF_EGFR', [['EGF', 'EGFR']])
4
>>> dummy.measureGraph('A_B_C',[['A', 'B_C'], ['A_B', 'C']])
7
- measureGraph2(element, path)[source]
Identical to measureGraph, but iterative instead of recursive.
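The recursive and iterative formulations can be compared side by side. The two functions below are a reconstruction that agrees with the measureGraph doctest values, not a copy of the library code: each list visited counts as one, and each string other than '0' or the element itself counts as one more.

```python
def measure_graph_recursive(element, path):
    """Weight of `path` for `element`, computed by recursion over nested lists."""
    total = 1  # the list itself
    for item in path:
        if isinstance(item, (list, tuple)):
            total += measure_graph_recursive(element, item)
        elif item != "0" and item != element:
            total += 1
    return total


def measure_graph_iterative(element, path):
    """Same weight, computed with an explicit stack instead of recursion."""
    total = 0
    stack = [path]
    while stack:
        node = stack.pop()
        total += 1  # each (sub)list contributes one
        for item in node:
            if isinstance(item, (list, tuple)):
                stack.append(item)
            elif item != "0" and item != element:
                total += 1
    return total


print(measure_graph_recursive("A_B_C", [["A", "B_C"], ["A_B", "C"]]))
print(measure_graph_iterative("A_B_C", [["A", "B_C"], ["A_B", "C"]]))
```

The explicit stack avoids Python's recursion limit on deeply nested paths, at the cost of visiting sublists in a different order, which does not affect the sum.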
- resolveDependencyGraph(dependencyGraph, reactant, withModifications=False)[source]
Given a full species composition table and a reactant, returns an unrolled list of the molecule types (elements with no dependencies) that define this reactant. The classification into the original candidates is lost, since elements are fully unrolled. To obtain dependencies while keeping candidate consistency, use consolidateDependencyGraph instead.
- Args:
withModifications (bool): returns a list of the 1:1 transformation relationships found in the path to this graph
>>> dummy = SCTSolver(None)
>>> dependencyGraph = {'EGF_EGFR_2':[['EGF_EGFR','EGF_EGFR']],'EGF_EGFR':[['EGF','EGFR']],'EGFR':[],'EGF':[], 'EGFR_P':[['EGFR']],'EGF_EGFR_2_P':[['EGF_EGFR_2']]}
>>> dependencyGraph2 = {'A':[],'B':[],'C':[],'A_B':[['A','B']],'B_C':[['B','C']],'A_B_C':[['A_B','C'],['B_C','A']]}
>>> dummy.resolveDependencyGraph(dependencyGraph, 'EGFR')
[['EGFR']]
>>> dummy.resolveDependencyGraph(dependencyGraph, 'EGF_EGFR')
[['EGF'], ['EGFR']]
>>> sorted(dummy.resolveDependencyGraph(dependencyGraph, 'EGF_EGFR_2_P'))
[['EGF'], ['EGF'], ['EGFR'], ['EGFR']]
>>> sorted(dummy.resolveDependencyGraph(dependencyGraph, 'EGF_EGFR_2_P', withModifications=True))
[('EGF_EGFR_2', 'EGF_EGFR_2_P')]
>>> sorted(dummy.resolveDependencyGraph(dependencyGraph2,'A_B_C'))
[['A'], ['A'], ['B'], ['B'], ['C'], ['C']]
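The unrolling shown in these doctests can be sketched as a depth-first traversal that carries a list of already visited names to detect cycles. This is a reconstruction consistent with the expected outputs above, not the library's code: `resolve` and the local `CycleError` are hypothetical stand-ins, and treating every single-element candidate as a 1:1 modification is an assumption.

```python
class CycleError(Exception):
    """Raised when an element ends up depending on itself."""


def resolve(graph, reactant, with_modifications=False, memory=None):
    """Unroll `reactant` into molecule types, or list its 1:1 modifications."""
    memory = [] if memory is None else memory
    if reactant in memory:
        raise CycleError(memory)
    candidates = graph.get(reactant, [])
    if not candidates:
        # A leaf is a molecule type; it carries no modification information.
        return [] if with_modifications else [[reactant]]
    result = []
    for candidate in candidates:
        if with_modifications and len(candidate) == 1:
            result.append((candidate[0], reactant))
        for element in candidate:
            result.extend(
                resolve(graph, element, with_modifications, memory + [reactant])
            )
    return result


graph = {
    "EGF_EGFR_2": [["EGF_EGFR", "EGF_EGFR"]],
    "EGF_EGFR": [["EGF", "EGFR"]],
    "EGFR": [],
    "EGF": [],
    "EGFR_P": [["EGFR"]],
    "EGF_EGFR_2_P": [["EGF_EGFR_2"]],
}
print(sorted(resolve(graph, "EGF_EGFR_2_P")))
print(resolve(graph, "EGF_EGFR_2_P", with_modifications=True))
```

Passing `memory + [reactant]` creates a fresh list for each branch, so sibling candidates that share a component are not mistaken for a cycle.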
- resolveDependencyGraphHelper(gkey, reactant, memory, withModifications=False)[source]
Helper function for resolveDependencyGraph that adds a memory field to avoid problems with cyclical definitions.
>>> dummy = SCTSolver(None)
>>> dependencyGraph = {'EGF_EGFR_2':[['EGF_EGFR','EGF_EGFR']],'EGF_EGFR':[['EGF','EGFR']],'EGFR':[],'EGF':[], 'EGFR_P':[['EGFR']],'EGF_EGFR_2_P':[['EGF_EGFR_2']]}
>>> dependencyGraph2 = {'A':[],'B':[],'C':[],'A_B':[['A','B']],'B_C':[['B','C']],'A_B_C':[['A_B','C'],['B_C','A']]}
>>> sorted(dummy.resolveDependencyGraphHelper(dependencyGraph, 'EGF_EGFR_2_P',[]))
[['EGF'], ['EGF'], ['EGFR'], ['EGFR']]
>>> sorted(dummy.resolveDependencyGraphHelper(dependencyGraph, 'EGF_EGFR_2_P', [], withModifications=True))
[('EGF_EGFR_2', 'EGF_EGFR_2_P')]
>>> sorted(dummy.resolveDependencyGraphHelper(dependencyGraph2, 'A_B_C', []))
[['A'], ['A'], ['B'], ['B'], ['C'], ['C']]
>>> dependencyGraph3 = {'C1': [['C2']],'C2':[['C3']],'C3':[['C1']]}
>>> dummy.resolveDependencyGraphHelper(dependencyGraph3, 'C3', [], withModifications=True)
Traceback (innermost last):
File "<stdin>", line 1, in ?
CycleError
- unMemoizedResolveDependencyGraphHelper(dependencyGraph, reactant, memory, withModifications=False)[source]
Helper function for resolveDependencyGraph that adds a memory field to avoid problems with cyclical definitions.
>>> dummy = SCTSolver(None)
>>> dependencyGraph = {'EGF_EGFR_2':[['EGF_EGFR','EGF_EGFR']],'EGF_EGFR':[['EGF','EGFR']],'EGFR':[],'EGF':[], 'EGFR_P':[['EGFR']],'EGF_EGFR_2_P':[['EGF_EGFR_2']]}
>>> dependencyGraph2 = {'A':[],'B':[],'C':[],'A_B':[['A','B']],'B_C':[['B','C']],'A_B_C':[['A_B','C'],['B_C','A']]}
>>> sorted(dummy.resolveDependencyGraphHelper(dependencyGraph, 'EGF_EGFR_2_P',[]))
[['EGF'], ['EGF'], ['EGFR'], ['EGFR']]
>>> sorted(dummy.resolveDependencyGraphHelper(dependencyGraph, 'EGF_EGFR_2_P', [], withModifications=True))
[('EGF_EGFR_2', 'EGF_EGFR_2_P')]
>>> sorted(dummy.resolveDependencyGraphHelper(dependencyGraph2, 'A_B_C', []))
[['A'], ['A'], ['B'], ['B'], ['C'], ['C']]
>>> dependencyGraph3 = {'C1': [['C2']],'C2':[['C3']],'C3':[['C1']]}
>>> dummy.resolveDependencyGraphHelper(dependencyGraph3, 'C3', [], withModifications=True)
Traceback (innermost last):
File "<stdin>", line 1, in ?
CycleError
- weightDependencyGraph(dependencyGraph)[source]
Given a dependency graph, returns a list indicating the weight of each of its elements. For each element a path is calculated and weighted.
>>> dummy = SCTSolver(None)
>>> dummy.weightDependencyGraph({'EGF_EGFR_2':[['EGF_EGFR','EGF_EGFR']],'EGF_EGFR':[['EGF','EGFR']],'EGFR':[],'EGF':[], 'EGFR_P':[['EGFR']],'EGF_EGFR_2_P':[['EGF_EGFR_2']]})
[['EGF', 2], ['EGFR', 2], ['EGFR_P', 4], ['EGF_EGFR', 5], ['EGF_EGFR_2', 9], ['EGF_EGFR_2_P', 10]]
>>> dependencyGraph2 = {'A':[],'B':[],'C':[],'A_B':[['A','B']],'B_C':[['B','C']],'A_B_C':[['A_B','C'],['B_C','A']]}
>>> dummy.weightDependencyGraph(dependencyGraph2)
[['A', 2], ['C', 2], ['B', 2], ['B_C', 5], ['A_B', 5], ['A_B_C', 13]]
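The weights in the doctest above are consistent with a simple rule: one for the path itself, one per unrolled molecule type, one more for each molecule type whose name differs from the element, and one per 1:1 modification step. The sketch below encodes that rule. It is a reconstruction from the expected values, assumes an acyclic graph, and uses hypothetical helper names.

```python
def unroll(graph, name):
    """Return (molecule types under `name`, number of 1:1 modification steps)."""
    candidates = graph.get(name, [])
    if not candidates:
        return [name], 0
    leaves, modifications = [], 0
    for candidate in candidates:
        if len(candidate) == 1:  # assumed to be a modification step
            modifications += 1
        for part in candidate:
            sub_leaves, sub_modifications = unroll(graph, part)
            leaves.extend(sub_leaves)
            modifications += sub_modifications
    return leaves, modifications


def weight_graph(graph):
    """Return [name, weight] pairs sorted by weight, then by name length."""
    weights = []
    for name in graph:
        leaves, modifications = unroll(graph, name)
        weight = 1 + sum(1 + (leaf != name) for leaf in leaves) + modifications
        weights.append([name, weight])
    return sorted(weights, key=lambda pair: (pair[1], len(pair[0])))


graph = {
    "EGF_EGFR_2": [["EGF_EGFR", "EGF_EGFR"]],
    "EGF_EGFR": [["EGF", "EGFR"]],
    "EGFR": [],
    "EGF": [],
    "EGFR_P": [["EGFR"]],
    "EGF_EGFR_2_P": [["EGF_EGFR_2"]],
}
print(weight_graph(graph))
```

The order of elements with equal weight and equal name length depends on the iteration order of the input dictionary, so ties may appear in a different order than in the doctest.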