Monosaccharide¶
Represents individual saccharide residues and their associated functions. They are the basic unit of structural representation, possessing graph node-like properties.
Monosaccharide Objects¶
- class glypy.structure.monosaccharide.Monosaccharide(anomer=None, configuration=None, stem=None, superclass=None, ring_start=-1, ring_end=-1, modifications=None, links=None, substituent_links=None, composition=None, reduced=None, id=None, fast=False)[source]¶
Represents a single monosaccharide molecule and its relationships with other molecules through Link objects.
Link objects stored in links connect to other Monosaccharide instances, building a Glycan structure as a graph of Monosaccharide objects. Link objects connecting the Monosaccharide instance to Substituent objects are stored in substituent_links.
Both links and substituent_links are instances of OrderedMultiMap, where the key is the index of the carbon atom in the carbohydrate backbone that hosts the bond. An index of x or -1 represents an unknown location.
Warning
While Monosaccharide objects expose their modifications, links, and substituent_links attributes as mutable, you should treat them as read-only. The methods for altering their contents, add_substituent(), add_monosaccharide(), add_modification(), drop_substituent(), drop_monosaccharide(), and drop_modification(), are all responsible for handling these mutations for you. Link methods like Link.apply() and Link.break_link() are used internally.
- Variables
anomer (Anomer) – An entry of Anomer that corresponds to the linkage type of the carbohydrate backbone. Is an entry of a class based on Enum.
superclass (SuperClass) – An entry of SuperClass that corresponds to the number of carbons in the carbohydrate backbone of the monosaccharide. Controls the base composition of the instance and the number of positions open to be linked to or modified. Is an entry of a class based on Enum.
configuration (Configuration or {‘d’, ‘l’, ‘x’, ‘missing’, None}) – An entry of Configuration which corresponds to the optical stereoisomer state of the instance. Is an entry of a class based on Enum. May possess more than one value.
stem (Stem) – Corresponds to the bond conformation of the carbohydrate backbone. Is an entry of a class based on Enum. May possess more than one value.
ring_start (int) – The index of the carbon of the carbohydrate backbone that starts a ring. A value of -1, 'x', or None corresponds to an unknown start. A value of 0 refers to a linear chain.
ring_end (int) – The index of the carbon of the carbohydrate backbone that ends a ring. A value of -1, 'x', or None corresponds to an unknown end. A value of 0 refers to a linear chain.
stereocode (Stereocode) – The stereochemistry of all carbons of the monosaccharide’s backbone ring/chain.
reducing_end (ReducedEnd) – The reducing end terminal group of the monosaccharide if the monosaccharide is uncyclized.
modifications (OrderedMultiMap) – The mapping of sites to Modification entries. Directly modifies the instance’s composition.
links (OrderedMultiMap) – The mapping of sites to Link entries that refer to other Monosaccharide instances.
substituent_links (OrderedMultiMap) – The mapping of sites to Link entries that refer to Substituent instances.
composition (Composition) – An instance of Composition corresponding to the elemental composition of self and its immediate modifications. If not provided, this will be inferred from field values.
reduced (ReducedEnd) – An instance of ReducedEnd, or the value True, represents a reduced sugar. May be inferred from modifications if “aldi” is present.
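The links and substituent_links attributes are described above as ordered multimaps keyed by backbone carbon index, with x or -1 standing for an unknown position. The following is a minimal sketch of that idea in plain Python. PositionMultiMap and its methods are hypothetical names written for illustration; they are not glypy's OrderedMultiMap, whose actual interface may differ.

```python
from collections import OrderedDict


class PositionMultiMap:
    """Hypothetical ordered multimap: one position may hold several values."""

    UNKNOWN = -1  # assumed sentinel for an unknown backbone position

    def __init__(self):
        self._data = OrderedDict()

    def _key(self, position):
        # Treat 'x' as the unknown position, mirroring the text above.
        return self.UNKNOWN if position == 'x' else int(position)

    def add(self, position, value):
        self._data.setdefault(self._key(position), []).append(value)

    def __getitem__(self, position):
        # Missing positions yield an empty list rather than raising.
        return list(self._data.get(self._key(position), []))

    def items(self):
        # Yield flat (position, value) pairs, keys in first-insertion order.
        for key, values in self._data.items():
            for value in values:
                yield key, value


links = PositionMultiMap()
links.add(4, "child-A")
links.add(4, "child-B")
links.add('x', "child-C")
pairs = list(links.items())
```

Storing a list per key is what lets a single carbon host more than one bond, which is the situation the max_occupancy arguments of the mutation methods are meant to police.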
Monosaccharide Methods
Connection Enumeration¶
- Monosaccharide.parents(links=False)[source]¶
Returns an iterator over the Monosaccharide instances which are considered the ancestors of self.
- Parameters
links (bool) – Whether to return the Link objects, or their parents. Defaults to False
- Returns
list of
position (int) – Location of the bond to the parent Monosaccharide
parent (Monosaccharide) – Monosaccharide at position
- Monosaccharide.children(links=False)[source]¶
Returns an iterator over the Monosaccharide instances which are considered the descendants of self.
>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> ch = n_linked_core.root.children()
>>> ch[0]
(4, RES
1b:b-dglc-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n)
- Parameters
links (bool) – Whether to return the Link objects, or their children. Defaults to False
- Returns
list of
position (int) – Location of the bond to the child Monosaccharide
child (Monosaccharide) – Monosaccharide at position
- Monosaccharide.substituents()[source]¶
Returns an iterator over all substituents attached to self by a Link object stored in substituent_links.
- Returns
list of
position (int) – Location of the bond to the substituent
substituent (Substituent) – Substituent at position
Adding and Removing Connections and Modifications¶
- Monosaccharide.add_monosaccharide(monosaccharide, position=-1, max_occupancy=0, child_position=-1, parent_loss=None, child_loss=None)[source]¶
Adds a Monosaccharide and associated Link to links at the site given by position.
>>> from glypy import monosaccharides
>>> hexnac = monosaccharides.HexNAc
>>> hex = monosaccharides.Hex
>>> hexnac.add_monosaccharide(hex, 1)
RES
1b:x-xx-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n
>>> hexnac.links[1][0].child
RES
1b:x-xx-HEX-1:5
- Parameters
monosaccharide (Monosaccharide) – The monosaccharide to add.
position (int or 'x') – The location to add the Monosaccharide link to links. Defaults to -1.
child_position (int) – The location to add the link to in monosaccharide’s links. Defaults to -1.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0.
parent_loss (Composition or str) – The elemental composition removed from self
child_loss (Composition or str) – The elemental composition removed from monosaccharide
- Raises
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements
- Returns
self, for chain calls
- Return type
Monosaccharide
- Monosaccharide.add_substituent(substituent, position=-1, max_occupancy=0, child_position=1, parent_loss=None, child_loss=None)[source]¶
Adds a Substituent and associated Link to substituent_links at the site given by position. This new substituent is included when calculating mass with substituents included.
>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hex.add_substituent("n-acetyl", 2, parent_loss="OH")
RES
1b:x-xx-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n
>>> hexnac == hex
True
- Parameters
substituent (str or Substituent) – The substituent to add. If passed a str, it will be translated into an instance of Substituent.
position (int or 'x') – The location to add the Substituent link to substituent_links. Defaults to -1.
child_position (int) – The location to add the link to in substituent’s links. Defaults to 1. Substituent indices are currently not checked.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0.
parent_loss (Composition or str) – The elemental composition removed from self
child_loss (Composition or str) – The elemental composition removed from substituent
- Raises
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements
- Returns
self, for chain calls
- Return type
Monosaccharide
- Monosaccharide.add_modification(modification, position, max_occupancy=0)[source]¶
Adds a modification instance to modifications at the site given by position. This directly modifies composition, consequently changing mass().
- Parameters
modification (str or Modification) – The modification to add. If passed a str, it will be translated into an instance of Modification.
position (int or 'x') – The location to add the Modification to.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0.
- Raises
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements
- Returns
self, for chain calls
- Return type
Monosaccharide
- Monosaccharide.drop_monosaccharide(position, refund=True)[source]¶
Remove the glycosidic bond at position, detaching a connected Monosaccharide.
If there is more than one glycosidic bond at position, an error will be raised.
>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> n_linked_core.root.drop_monosaccharide(4)
RES
1b:b-dglc-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n
>>> n_linked_core.mass()
221.08993720321
- Parameters
position (int) – The position to drop the monosaccharide from
refund (bool) – Passed to break_link()
- Raises
- Returns
self, for chain calls
- Return type
Monosaccharide
- Monosaccharide.drop_substituent(position, substituent=None, refund=True)[source]¶
Remove the substituent at position.
If substituent is None, then the first substituent found at position is removed.
>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hexnac.drop_substituent(2)
RES
1b:x-xx-HEX-1:5
>>> hexnac == hex
True
- Parameters
position (int) – The position to drop the substituent from
substituent (Substituent) – The Substituent to remove. If None, the first substituent found at position will be removed
refund (bool) – Passed to break_link()
- Raises
IndexError – If position is not a valid carbohydrate backbone position
ValueError – If substituent is not found at position
- Returns
self, for chain calls
- Return type
Monosaccharide
- Monosaccharide.drop_modification(position, modification)[source]¶
Remove the modification at position.
- Parameters
position (int) – The position to drop the modification from
modification (Modification) – The Modification to remove.
- Raises
IndexError – If position is not a valid carbohydrate backbone position
ValueError – If modification is not found at position
- Returns
self, for chain calls
- Return type
Monosaccharide
Position Occupancy¶
- Monosaccharide.is_occupied(position)[source]¶
Checks to see if a particular backbone position is occupied by a Modification, Substituent, or Link to another Monosaccharide.
- Monosaccharide.open_attachment_sites(max_occupancy=0)[source]¶
When attaching Monosaccharide instances to other objects, bonds are formed between the carbohydrate backbone and the other object. If a site is already bound, the occupying object fills that space on the backbone and prevents other objects from binding there.
This method currently considers only the availability of the hydroxyl group. Because no hydroxyl is attached to the ring-ending carbon, that carbon is not considered an open site.
If any existing attached units have unknown positions, known positions cannot be reported. In that case the result is a list of -1 values whose length is the number of open sites.
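The occupancy rules above can be sketched with plain dictionaries. This is an illustrative model only: the function names are borrowed from the methods they mimic, but the bodies are assumptions rather than glypy's code. In particular, the fallback follows the text literally and returns one -1 per open site without deducting the unplaced attachments.

```python
def is_occupied(occupants, position):
    """Count how many things are attached at `position`."""
    return len(occupants.get(position, []))


def open_attachment_sites(occupants, n_carbons, ring_end, max_occupancy=0):
    """List backbone positions that can still accept a bond (sketch)."""
    candidates = [
        pos for pos in range(1, n_carbons + 1)
        if pos != ring_end  # no free hydroxyl on the ring-ending carbon
    ]
    open_sites = [
        pos for pos in candidates
        if is_occupied(occupants, pos) <= max_occupancy
    ]
    if is_occupied(occupants, -1):
        # Something is attached at an unknown position, so no specific
        # site can be named: return one -1 placeholder per open site.
        return [-1] * len(open_sites)
    return open_sites


# A hexose (6 carbons, ring closing at carbon 5) with two occupied sites.
occupants = {2: ["n-acetyl"], 4: ["GlcNAc"]}
sites = open_attachment_sites(occupants, n_carbons=6, ring_end=5)
```

With max_occupancy left at 0, a site counts as open only when nothing is attached to it, which matches the ValueError condition described for the add_* methods.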
Equality Comparison¶
Monosaccharide objects support the equality comparison operators == and !=. They also support hashing, using the hash() value of Monosaccharide.id.
- Monosaccharide.exact_ordering_equality(other, substituents=True, visited=None)[source]¶
Performs equality testing between two monosaccharides where the exact position (and ordering by sort) of links must match between the input Monosaccharide objects.
- Return type
bool
- Monosaccharide.topological_equality(other, substituents=True, visited=None)[source]¶
Performs equality testing between two monosaccharides where the exact ordering of child links does not have to match between the input Monosaccharide objects, so long as an exact match of the subtrees is found.
- Return type
bool
- Monosaccharide.__eq__(other)[source]¶
Test for equality between Monosaccharide instances. First try scalar equality of fields, and then compare descendants.
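The difference between the two comparison strategies can be illustrated on a toy tree in which each node is a (label, children) tuple and children is a list of (position, child) pairs. This is a sketch under simplifying assumptions, not glypy's implementation: exact_equal and topo_equal are hypothetical helpers, the exact variant compares children in stored order without sorting them first, and the topological variant is assumed to still require matching positions.

```python
def exact_equal(a, b):
    """Children must match pairwise, in the same order and position."""
    label_a, kids_a = a
    label_b, kids_b = b
    if label_a != label_b or len(kids_a) != len(kids_b):
        return False
    return all(
        pos_a == pos_b and exact_equal(child_a, child_b)
        for (pos_a, child_a), (pos_b, child_b) in zip(kids_a, kids_b)
    )


def topo_equal(a, b):
    """Children may appear in any order, as long as subtrees pair up."""
    label_a, kids_a = a
    label_b, kids_b = b
    if label_a != label_b or len(kids_a) != len(kids_b):
        return False
    unmatched = list(kids_b)
    for pos_a, child_a in kids_a:
        for i, (pos_b, child_b) in enumerate(unmatched):
            if pos_a == pos_b and topo_equal(child_a, child_b):
                del unmatched[i]  # each child of b may be used only once
                break
        else:
            return False  # no partner found for this child
    return True


man = ("Man", [])
gal = ("Gal", [])
left = ("GlcNAc", [(3, man), (6, gal)])
right = ("GlcNAc", [(6, gal), (3, man)])
same_exact = exact_equal(left, right)
same_topology = topo_equal(left, right)
```

Here the two trees list the same branches in a different order, so the order-sensitive check rejects them while the order-insensitive check accepts them.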
Serialization¶
- Monosaccharide.serialize(name='glycoct')[source]¶
Convert this object into text using the requested textual encoding
- classmethod Monosaccharide.register_serializer(name, method)[source]¶
Add method as name to the set of serializers to pick from in serialize().
- Parameters
name (str) – The name of the serializer
method (Callable) – A callable object that when called with a Monosaccharide returns a str
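The pairing of serialize() and register_serializer() described above is a name-to-callable registry. A minimal sketch of that pattern follows. Residue is a hypothetical stand-in class, and the "upper" format is invented for the example; neither is part of glypy.

```python
class Residue:
    """Hypothetical stand-in for a class with pluggable serializers."""

    _serializers = {}

    def __init__(self, name):
        self.name = name

    @classmethod
    def register_serializer(cls, name, method):
        # Store the callable under the given format name.
        cls._serializers[name] = method

    def serialize(self, name="glycoct"):
        # Look up the requested encoder and apply it to this object.
        try:
            encoder = self._serializers[name]
        except KeyError:
            raise KeyError("no serializer registered as %r" % name) from None
        return encoder(self)


# Register a toy format, then request it by name.
Residue.register_serializer("upper", lambda residue: residue.name.upper())
text = Residue("hexnac").serialize("upper")
```

Registering on the class rather than on an instance means a format added once is available to every object of that type.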
Mass Spectrometry Utilities¶
- Monosaccharide.total_composition()[source]¶
Computes the sum of the composition of self and each of its linked Substituents.
- Return type
Composition
- Monosaccharide.mass(average=False, charge=0, mass_data=None, substituents=True)[source]¶
Calculates the total mass of self.
- Parameters
average (bool, optional, defaults to False) – Whether or not to use the average isotopic composition when calculating masses. When average == False, masses are calculated using monoisotopic mass.
charge (int, optional, defaults to 0) – If charge is non-zero, m/z is calculated, where m is the theoretical mass, and z is charge
mass_data (dict, optional) – If mass_data is None, standard NIST mass and isotopic abundance data are used. Otherwise the contents of mass_data are assumed to contain elemental mass and isotopic abundance information. Defaults to None.
substituents (bool, optional, defaults to True) – Whether or not to include substituents’ masses.
- Return type
float
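As a worked illustration of a monoisotopic mass calculation, the sketch below sums element masses over a composition. It is a simplified model, not glypy's routine: composition_mass is a hypothetical helper, mass_data is assumed to be a flat element-to-mass dictionary, and the charge handling only divides by the absolute charge without adding any charge-carrier mass. The composition used, C8H15NO6, is that of a HexNAc residue.

```python
# Monoisotopic masses of the most abundant isotopes, in daltons.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def composition_mass(composition, charge=0, mass_data=None):
    """Sum element masses; return m/z instead when charge is non-zero."""
    table = MONOISOTOPIC if mass_data is None else mass_data
    mass = sum(table[element] * count for element, count in composition.items())
    if charge != 0:
        # Simplification: divide by |z| only; no charge carrier is added.
        mass /= abs(charge)
    return mass


hexnac = {"C": 8, "H": 15, "N": 1, "O": 6}
neutral = composition_mass(hexnac)
doubly_charged = composition_mass(hexnac, charge=2)
```

The neutral value comes out at roughly 221.0899 Da, in line with the mass printed in the drop_monosaccharide() example.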
Miscellaneous¶
- Monosaccharide.clone(prop_id=False, fast=True, monosaccharide_type=None)[source]¶
Copies just this Monosaccharide and its Substituents, creating a separate instance with the same data. All mutable data structures are duplicated and distinct from the original.
Does not copy any links, as this would cause recursive duplication of the entire Glycan graph.
- Parameters
prop_id (bool) – Whether to copy id from self to the new instance
fast (bool) – Whether to use the fast-path initialization process in Monosaccharide.__init__()
monosaccharide_type (type) – A subclass of Monosaccharide to use
- Return type
Monosaccharide
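The copy semantics described for clone(), duplicating a node's own mutable data while leaving its links behind, can be sketched on a toy graph node. Node is a hypothetical class for illustration only; in particular, giving the copy an id of None when prop_id is False is an assumption of this sketch, not a statement about what glypy assigns.

```python
import copy


class Node:
    """Hypothetical graph node with per-node data and outgoing links."""

    def __init__(self, id, modifications=None):
        self.id = id
        self.modifications = modifications if modifications is not None else {}
        self.links = {}  # position -> list of neighbouring Node objects

    def clone(self, prop_id=False):
        # Duplicate mutable per-node data so the copy is independent.
        duplicate = Node(
            self.id if prop_id else None,
            copy.deepcopy(self.modifications),
        )
        # Deliberately leave duplicate.links empty: following links here
        # would recursively copy the whole graph.
        return duplicate


root = Node(1, {2: ["d"]})
root.links[4] = [Node(2)]
twin = root.clone(prop_id=True)
twin.modifications[2].append("keto")  # must not leak back into root
```

Cloning a whole structure would then be a matter of walking the graph, cloning each node once, and re-creating the links between the copies.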
Explicit Uncyclized Reducing Ends and Labels¶
- class glypy.structure.monosaccharide.ReducedEnd(composition=None, substituents=None, valence=1, id=None)[source]¶
Represents the composition shift and conformation change created by reducing a Monosaccharide.
- Variables
composition (Composition) – The elemental composition of the reducing end reduction modification.
links (OrderedMultiMap) – The attached substituents
valence (int) – Number of substituents this node can host
id (int) – Unique identifier
There is also a class attribute, name, for comparison with aldi.
- add_substituent(substituent, position=-1, max_occupancy=0, child_position=1, parent_loss=None, child_loss=None)[source]¶
Adds a Substituent and associated Link to substituent_links at the site given by position. This new substituent is included when calculating mass with substituents included.
- Parameters
substituent (str or Substituent) – The substituent to add. If passed a str, it will be translated into an instance of Substituent.
position (int or 'x') – The location to add the Substituent link to substituent_links. Defaults to -1.
child_position (int) – The location to add the link to in substituent’s links. Defaults to 1. Substituent indices are currently not checked.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0.
parent_loss (Composition or str) – The elemental composition removed from self
child_loss (Composition or str) – The elemental composition removed from substituent
- Raises
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements
- children()[source]¶
Returns an iterator over the nodes which are considered the descendants of self.
- clone(prop_id=True)[source]¶
Make a deep copy of self.
- Parameters
prop_id (bool) – Whether to copy over id.
- Return type
ReducedEnd
- drop_substituent(position, substituent=None, refund=True)[source]¶
Remove the substituent at position.
If substituent is None, then the first substituent found at position is removed.
- Parameters
position (int) – The position to drop the substituent from
substituent (Substituent) – The Substituent to remove. If None, the first substituent found at position will be removed
refund (bool) – Passed to break_link()
- Raises
IndexError – If position exceeds valence
ValueError – If substituent is not found at position
- Returns
self, for chaining calls
- Return type
ReducedEnd
- is_occupied(position)[source]¶
Checks to see if a particular backbone position is occupied by a Substituent.
- Parameters
position (int) – The position to check for occupancy. Passing -1 checks for undetermined attachments.
- Returns
The number of occupants at position, or float('inf') if position exceeds valence
- Return type
numeric
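The return convention above, a count for valid positions and infinity for positions beyond the valence, can be sketched as follows. The occupancy function and its dictionary argument are hypothetical and only model the described behaviour.

```python
def occupancy(links, position, valence=1):
    """Return the number of occupants, or infinity past the valence."""
    if position > valence:
        # An out-of-range site reports as infinitely full, so any
        # "occupancy > max_occupancy" test rejects it automatically.
        return float("inf")
    return len(links.get(position, []))


attached = {1: ["label"], -1: ["unplaced"]}
at_one = occupancy(attached, 1)
undetermined = occupancy(attached, -1)
out_of_range = occupancy(attached, 2)
```

Returning infinity instead of raising lets a caller use a single comparison for both "site full" and "site does not exist".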
- mass(average=False, charge=0, mass_data=None)[source]¶
Calculates the total mass of self.
- Parameters
average (bool, optional, defaults to False) – Whether or not to use the average isotopic composition when calculating masses. When average == False, masses are calculated using monoisotopic mass.
charge (int, optional, defaults to 0) – If charge is non-zero, m/z is calculated, where m is the theoretical mass, and z is charge
mass_data (dict, optional) – If mass_data is None, standard NIST mass and isotopic abundance data are used. Otherwise the contents of mass_data are assumed to contain elemental mass and isotopic abundance information. Defaults to None.
- Return type
float