GlycoCT
A parser for GlycoCT{condensed} format.
GlycoCT{condensed} is a multi-line format for representing glycan structures and compositions, published in [1]. The format is intended to be human-readable and easily compressed, and it includes a canonicalization algorithm to ensure that each glycan structure has only a single representation.
GlycoCT{condensed} can represent glycan structures with ambiguous or repeating sub-units. The specification includes additional section directives with support for stochastic sub-units as well as disjoint subgraphs, though these have not been implemented in glypy.
References
- [1] Herget, S., Ranzinger, R., Maass, K., & Lieth, C.-W. V. D. (2008). GlycoCT-a unifying sequence format for carbohydrates. Carbohydrate Research, 343(12), 2162–2171. https://doi.org/10.1016/j.carres.2008.03.011
High Level Functions
- glypy.io.glycoct.dump(structure, buffer=None)
Serialize the Glycan into GlycoCT{condensed}, using buffer to store the result. If buffer is None, then the function will operate on a newly created StringIO object.
- Parameters
structure (Glycan) – The structure to serialize
buffer (file-like or None) – The stream to write the serialized structure to. If None, uses an instance of StringIO
- Return type
file-like or str if buffer is None
- glypy.io.glycoct.load(stream, structure_class=<class 'glypy.structure.glycan.Glycan'>, allow_repeats=True, allow_multiple=True)
Read all structures from the provided text stream.
- glypy.io.glycoct.dumps(structure)
Serialize the Glycan into GlycoCT{condensed}, returning the text as a string.
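The buffer handling described for dump and dumps can be illustrated with a small standard-library sketch. It is not glypy's implementation: dump_lines and dumps_lines are hypothetical names, and they write a pre-made list of text lines instead of serializing a Glycan. The sketch only shows the documented contract, where a missing buffer is replaced by a new StringIO and text is returned instead of a stream.

```python
from io import StringIO


def dump_lines(lines, buffer=None):
    """Write each line to ``buffer``; return a str when no buffer is given.

    Hypothetical helper, used here only to mirror the documented
    behaviour of ``dump(structure, buffer=None)``.
    """
    return_text = buffer is None
    if return_text:
        buffer = StringIO()  # newly created in-memory stream
    for line in lines:
        buffer.write(line + "\n")
    # Plain text if we created the stream, otherwise the caller's stream
    return buffer.getvalue() if return_text else buffer


def dumps_lines(lines):
    """String-returning wrapper, analogous in spirit to ``dumps``."""
    return dump_lines(lines)


record = ["RES", "1b:x-dglc-HEX-1:5", "2s:n-acetyl", "LIN", "1:1d(2+1)2n"]
text = dumps_lines(record)

stream = StringIO()
returned = dump_lines(record, stream)
```

Passing an open file instead of a StringIO works the same way in this sketch, since only write() is called on the buffer.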
Examples
>>> from glypy.io import glycoct
>>> glycoct.loads("""RES
1b:x-dglc-HEX-1:5
2s:n-acetyl
3b:b-dglc-HEX-1:5
4s:n-acetyl
5b:b-dman-HEX-1:5
6b:a-dman-HEX-1:5
7b:b-dglc-HEX-1:5
8s:n-acetyl
9b:a-lgal-HEX-1:5|6:d
10b:b-dgal-HEX-1:5
11b:a-dgro-dgal-NON-2:6|1:a|2:keto|3:d
12s:n-glycolyl
13b:b-dglc-HEX-1:5
14s:n-acetyl
15b:b-dgal-HEX-1:5
16s:n-acetyl
17b:b-dglc-HEX-1:5
18s:n-acetyl
19b:a-dman-HEX-1:5
20b:b-dglc-HEX-1:5
21s:n-acetyl
22b:a-lgal-HEX-1:5|6:d
23b:b-dgal-HEX-1:5
24b:a-dgro-dgal-NON-2:6|1:a|2:keto|3:d
25s:n-glycolyl
26b:b-dglc-HEX-1:5
27s:n-acetyl
28b:a-lgal-HEX-1:5|6:d
29b:b-dgal-HEX-1:5
30b:a-dgro-dgal-NON-2:6|1:a|2:keto|3:d
31s:n-acetyl
32b:a-lgal-HEX-1:5|6:d
LIN
1:1d(2+1)2n
2:1o(4+1)3d
3:3d(2+1)4n
4:3o(4+1)5d
5:5o(3+1)6d
6:6o(2+1)7d
7:7d(2+1)8n
8:7o(3+1)9d
9:7o(4+1)10d
10:10o(3+2)11d
11:11d(5+1)12n
12:6o(4+1)13d
13:13d(2+1)14n
14:13o(4+1)15d
15:15d(2+1)16n
16:5o(4+1)17d
17:17d(2+1)18n
18:5o(6+1)19d
19:19o(2+1)20d
20:20d(2+1)21n
21:20o(3+1)22d
22:20o(4+1)23d
23:23o(3+2)24d
24:24d(5+1)25n
25:19o(6+1)26d
26:26d(2+1)27n
27:26o(3+1)28d
28:26o(4+1)29d
29:29o(3+2)30d
30:30d(5+1)31n
31:1o(6+1)32d
""")
>>>
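To make the layout of the RES and LIN sections in the example above easier to follow, here is a minimal standard-library sketch that splits such text into residue and linkage records. It is not the parser glypy uses. The regular expressions are inferred from the lines shown in the example, the names parse_sections and LinkRecord are hypothetical, and the field names reflect one reading of a LIN line as index, parent residue, parent linkage type, parent position, child position, child residue, child linkage type. Sections such as REP and UND, and linkages with alternative positions, are not handled.

```python
import re
from collections import namedtuple

# Hypothetical record type; field names are an interpretation of a LIN line
LinkRecord = namedtuple(
    "LinkRecord",
    "index parent parent_type parent_pos child_pos child child_type")

# Patterns inferred from the example text, not taken from glypy
RES_LINE = re.compile(r"^(\d+)([a-z]):(.+)$")
LIN_LINE = re.compile(
    r"^(\d+):(\d+)([a-z])\((-?\d+)\+(-?\d+)\)(\d+)([a-z])$")


def parse_sections(text):
    """Return ({residue index: (kind, descriptor)}, [LinkRecord, ...])."""
    residues, links, section = {}, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line in ("RES", "LIN"):
            section = line  # switch parser state on a section header
            continue
        if section == "RES":
            match = RES_LINE.match(line)
            if match is None:
                raise ValueError("unrecognized RES line: %r" % line)
            index, kind, descriptor = match.groups()
            residues[int(index)] = (kind, descriptor)
        elif section == "LIN":
            match = LIN_LINE.match(line)
            if match is None:
                raise ValueError("unrecognized LIN line: %r" % line)
            idx, parent, ptype, ppos, cpos, child, ctype = match.groups()
            links.append(LinkRecord(int(idx), int(parent), ptype,
                                    int(ppos), int(cpos), int(child), ctype))
        else:
            raise ValueError("line outside a known section: %r" % line)
    return residues, links


sample = """RES
1b:x-dglc-HEX-1:5
2s:n-acetyl
3b:b-dglc-HEX-1:5
LIN
1:1d(2+1)2n
2:1o(4+1)3d
"""
residues, links = parse_sections(sample)
```

Here residues maps each RES index to its kind letter (b or s in the example) and the remaining descriptor, while links keeps the LIN entries in file order.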
Object-Oriented Interface
- class glypy.io.glycoct.GlycoCTReader(stream, structure_class=<class 'glypy.structure.glycan.Glycan'>, allow_repeats=True, completes=True)
Parse GlycoCT{condensed} text data into Glycan objects.
The parser implements the Iterator interface, yielding successive glycans from a text stream separated by empty lines.
The parser can understand fully specified and partially ambiguous structures. When allow_repeats is True and a REP section is encountered, it will be expanded to its minimum multiplicity, or 1 if the minimum is unknown. UND sections will be connected to the main graph by AmbiguousLink instead of Link objects.
- Variables
allow_repeats (bool) – Whether or not to permit REP sections. Defaults to True
completes (bool) – Whether or not to translate the built graph into a Glycan object. Defaults to True
handle (file-like) – The text file being read from
in_repeat (bool) – Indicates the parser is currently parsing a REP section's sub-graph
in_undetermined (bool) – Indicates the parser is currently parsing a UND section's sub-graph
postponed (list) – Holds all the deferred operations for the top-most graph as callable objects
root (Monosaccharide) – The root node of the produced graph
state (str) – The current state of the parser's state machine
repeats (dict) – Maps RES section index to RepeatedGlycoCTSubgraph
undetermineds (dict) – Maps UND section index to UndeterminedGlycoCTSubgraph
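The iterator behaviour described for GlycoCTReader, yielding one record per block of text separated by empty lines, can be sketched with the standard library alone. This is an illustration under assumptions, not glypy's code: RecordReader is a hypothetical class that yields each block as raw text, where the real reader builds Glycan objects.

```python
from io import StringIO


class RecordReader:
    """Iterate over blocks of non-empty lines separated by empty lines."""

    def __init__(self, stream):
        self.handle = stream
        # Keep a single line iterator so each call resumes where the last stopped
        self._lines = iter(stream)

    def __iter__(self):
        return self

    def __next__(self):
        block = []
        for line in self._lines:
            line = line.rstrip("\n")
            if line.strip():
                block.append(line)
            elif block:
                # An empty line after content closes the current record
                return "\n".join(block)
        if block:
            # Final record with no trailing empty line
            return "\n".join(block)
        raise StopIteration


stream = StringIO("RES\n1b:x-dglc-HEX-1:5\n\n\nRES\n1b:b-dman-HEX-1:5\n")
records = list(RecordReader(stream))
```

Consecutive empty lines are treated as a single separator in this sketch, so the doubled blank line in the sample does not produce an empty record.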
- glypy.io.glycoct.GlycoCTWriter
alias of UNDOrderRespectingGlycoCTWriter
Implementation Details
- class glypy.io.glycoct.RepeatedGlycoCTSubgraph(graph_index, repeat_index, internal_linkage=None, external_linkage=None, multitude=None, graph=None, parent=None)
Implements the machinery for representing a repeated subgraph in GlycoCT.
- Variables
graph_index (int) –
repeat_index (int) – The i-th repeating subgraph in the graph.
internal_linkage (object) – The linkage connecting two repetitions of the subgraph
external_linkage (object) – The linkage connecting the final repetition to the outside nodes.
multitude (RepeatedMultitude) – Holds the lower and upper bounds on the number of times this subgraph may be repeated.
repetitions (OrderedDict) – The repetitions of this subgraph, materialized during postprocess()
postponed (deque) – A queue of post-processing callbacks.
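The expansion rule stated for REP sections (expand to the minimum multiplicity, or 1 if the minimum is unknown) can be written down as a short sketch. Everything here is illustrative: Multitude is a hypothetical stand-in for RepeatedMultitude, the use of -1 or None to mark an unknown bound is an assumption, and keying the repetitions by a 1-based number is a choice made for this example only.

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-in for the documented RepeatedMultitude
Multitude = namedtuple("Multitude", ["lower", "upper"])

UNKNOWN = -1  # assumption: an unknown bound is written as -1


def expansion_count(multitude):
    """Number of copies to materialize: the minimum, or 1 if unknown."""
    if multitude.lower is None or multitude.lower == UNKNOWN:
        return 1
    return multitude.lower


def expand(unit, multitude):
    """Return ordered copies of ``unit`` keyed by 1-based repetition number."""
    count = expansion_count(multitude)
    return OrderedDict((i, list(unit)) for i in range(1, count + 1))


known = expand(["GlcNAc", "Gal"], Multitude(2, 4))
unknown = expand(["GlcNAc", "Gal"], Multitude(UNKNOWN, UNKNOWN))
```

Each repetition is an independent copy of the unit, so later processing of one repetition does not alter the others.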