Nomenclature and Serialization Formats
There are many ways of representing monosaccharides, substituents, and glycan
structures in text. glypy.io includes modules for reading and writing several
of these formats.
Three of these formats, IUPAC, WURCS, and LinearCode, put an entire structure
on a single line, while GlycoCT uses multi-line blocks to denote different
parts of a structure. The glycoct, iupac, linear_code, and wurcs modules all
provide loads and dumps functions, similar to other Python serialization
interfaces, for converting objects to and from strings.
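Because every format module exposes the same loads/dumps pair, converting between formats amounts to parsing with one module and serializing with another. The sketch below illustrates that pattern with a hypothetical helper, convert, which is not part of glypy. To stay runnable without glypy installed, it is exercised with the standard library's json module, which follows the same convention; the glypy usage in the trailing comment assumes glypy is installed, and iupac_text is a placeholder.

```python
import json


def convert(text, source, target):
    """Parse `text` with source.loads(), then re-serialize with target.dumps().

    `source` and `target` can be any modules (or objects) that expose a
    loads/dumps pair. `convert` is a hypothetical helper, not part of glypy.
    """
    return target.dumps(source.loads(text))


# Runnable stand-in using the standard library's json module.
round_tripped = convert('{"a": 1}', json, json)
print(round_tripped)

# Assuming glypy is installed, the same pattern would look like:
#   from glypy.io import iupac, glycoct
#   glycoct_text = convert(iupac_text, iupac, glycoct)
# where `iupac_text` is a placeholder for an IUPAC string.
```

Passing the modules as arguments keeps the helper independent of any one format, so the same call shape would cover any pairing of the four modules.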
Note
GlycoCT and WURCS support more complex representations than IUPAC, including both structures and compositions, but all three can represent essentially any monosaccharide. LinearCode can only represent a limited number of monosaccharides, not including generic cases with unknown ring stereochemistry.
Warning
IUPAC and LinearCode have to do complex heuristic reasoning to decode modified
versions of monosaccharides with special names (e.g. Neu5Ac, Fuc) that imply
modifications or substituents, which limits their performance. GlycoCT and
WURCS do not require nearly as much introspection, making them considerably
faster.