Nomenclature and Serialization Formats

There are many ways of representing monosaccharides, substituents and glycan structures in text. glypy.io includes modules for reading and writing several of these formats.

Three of these formats, IUPAC, WURCS, and LinearCode, put an entire structure on a single line, while GlycoCT uses multi-line blocks to denote different parts of a structure. The glycoct, iupac, linear_code, and wurcs modules all provide loads and dumps functions for converting objects to and from strings, similar to other Python serialization interfaces.
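Because every format module exposes the same loads/dumps pair, converting between formats reduces to decoding with one module and encoding with another. The following is a minimal sketch of that pattern, not part of glypy itself: convert is a hypothetical helper, and the standard library's json module stands in for a format module so the example runs without glypy installed. The commented lines show how the glypy modules would presumably be passed in, with glycoct_text as a placeholder for a real GlycoCT string.

```python
import json


def convert(text, source, target):
    """Decode `text` with one format module and re-encode it with another.

    `source` only needs a module-level loads() and `target` only needs a
    module-level dumps(). This helper is illustrative, not a glypy API.
    """
    return target.dumps(source.loads(text))


# Stand-in demonstration: json follows the same loads/dumps convention.
normalized = convert('{"a":1}', json, json)
print(normalized)

# Assumed usage with glypy installed (glycoct_text is a placeholder):
#   from glypy.io import glycoct, iupac
#   iupac_text = convert(glycoct_text, glycoct, iupac)
```

Passing the modules as arguments keeps the helper independent of any one format. Whether a given conversion succeeds still depends on what the target format can represent.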

Note

GlycoCT and WURCS support more complex representations than IUPAC, including both structures and compositions, but all three can represent essentially any monosaccharide. LinearCode can only represent a limited number of monosaccharides, which does not include generic cases with unknown ring stereochemistry.

Warning

IUPAC and LinearCode rely on complex heuristics to decode modified versions of monosaccharides with special names (e.g. Neu5Ac, Fuc) that imply modifications or substituents, which limits their parsing performance. GlycoCT and WURCS require far less introspection, making them considerably faster.