Serialization in SGScript
In SGScript, serialization is conversion to a binary stream of a specific format. Serialization in the language API is done by calling the "serialize" function and deserialization is done with "unserialize". The C API has similar functions: sgs_Serialize(Ext) / sgs_SerializeObject and sgs_Unserialize, respectively.
Check the "Possible gotchas" part of this page if end users will have access to serialized data.
The format (modes 1 and 2)
- a list of operations
- each operation consists of a 'type' byte and additional data
- a "push" operation has the byte 'P'; its data consists of the type to push (a byte containing one base flag from SGS_VT_*) followed by the binary data of that type
- a "call" operation has the byte 'C'; the data that follows differs between formats:
  - MODE 1: number of arguments (4 bytes), function name length (1 byte) and the null-terminated function name itself (<length>+1 bytes)
  - MODE 2: number of arguments (4 bytes), function name length (1 byte), argument indices (4 bytes for each, count as previously specified) and the null-terminated function name itself (<length>+1 bytes)
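The byte layout of a "call" operation described above can be sketched as follows. This is an illustration only, not the SGScript implementation: the helper name `encode_call` is hypothetical, and the byte order and signedness of the 4-byte fields (little-endian, signed) are assumptions, since the text above does not specify them.

```python
import struct


def encode_call(name, arg_count, arg_indices=None):
    """Encode one "call" operation as bytes.

    MODE 1 layout is produced when arg_indices is None, MODE 2 otherwise.
    Assumption: 4-byte fields are little-endian signed 32-bit integers.
    """
    raw_name = name.encode("ascii")
    if len(raw_name) > 255:
        raise ValueError("function name length must fit in one byte")

    # 'C' marker, number of arguments (4 bytes), name length (1 byte)
    out = b"C" + struct.pack("<i", arg_count) + struct.pack("<B", len(raw_name))

    if arg_indices is not None:
        # MODE 2 only: one 4-byte index per argument
        if len(arg_indices) != arg_count:
            raise ValueError("need exactly one index per argument")
        out += struct.pack("<%di" % arg_count, *arg_indices)

    # null-terminated function name: <length>+1 bytes
    return out + raw_name + b"\x00"


mode1 = encode_call("array", 3)
mode2 = encode_call("dict", 2, [0, 1])
```

Under these assumptions, the MODE 1 record for a 3-argument call to `array` is 12 bytes long: 1 (marker) + 4 (count) + 1 (length) + 6 (name and terminator).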
The format (mode 3)
- SGScript Object Notation (SGSON) - a minimal version of the script syntax for defining data
- Supported features:
  - keywords: null, false, true
  - integers and real values
  - strings ("..." or '...')
  - containers:
    - array: [...]
    - dict: { a = 1, b = "x" }
    - map: map{ [false] = 5, [{}] = "test" }
  - function calls: <identifier>(<arg1>, <arg2>, ... <argN>)
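To make the notation above concrete, here is a small Python sketch that writes values as SGSON-style text. It is not part of SGScript: `to_sgson` and `SgsMap` are hypothetical names, and the string escaping rules (backslash escapes), the number formatting, and the spelling of empty containers are assumptions not confirmed by this page.

```python
import math
import re

# Assumption: bare dict keys look like C-style identifiers.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


class SgsMap:
    """Hypothetical stand-in for an SGScript map: ordered (key, value) pairs."""

    def __init__(self, pairs):
        self.pairs = list(pairs)


def to_sgson(value):
    """Return SGSON-style text for a Python value (illustrative only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        if not math.isfinite(value):
            raise ValueError("non-finite numbers are not handled in this sketch")
        return repr(value)
    if isinstance(value, str):
        # Assumption: backslash and double quote are escaped with a backslash.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, list):
        return "[" + ", ".join(to_sgson(v) for v in value) + "]"
    if isinstance(value, dict):
        if not value:
            return "{}"
        parts = []
        for key, item in value.items():
            if not (isinstance(key, str) and _IDENT.match(key)):
                raise ValueError("this sketch only writes identifier-like dict keys")
            parts.append("%s = %s" % (key, to_sgson(item)))
        return "{ " + ", ".join(parts) + " }"
    if isinstance(value, SgsMap):
        if not value.pairs:
            return "map{}"
        parts = ["[%s] = %s" % (to_sgson(k), to_sgson(v)) for k, v in value.pairs]
        return "map{ " + ", ".join(parts) + " }"
    raise TypeError("unsupported type: %r" % type(value))


dict_text = to_sgson({"a": 1, "b": "x"})
map_text = to_sgson(SgsMap([(False, 5), ({}, "test")]))
```

With these inputs the sketch reproduces the dict and map examples from the list above.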
Serialization
- "serialize" is called
- "sgs_Serialize(Ext)" is called internally
- type of variable is determined and the data is written
- C functions cannot be serialized; encountering one aborts serialization
- objects will have the SERIALIZE operation called
  - if the operation is not defined, serialization stops
  - if the object is an array, all variables in it will be serialized using sgs_Serialize(Ext) and sgs_SerializeObject will generate a call to 'array'
  - if the object is a 'dict' or a 'map', all keys and values in it will be serialized using sgs_Serialize and sgs_SerializeObject will generate a call to 'dict'/'map'
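The walk described above can be modelled in Python as a function that turns a value into a flat list of operations. This is a simplified sketch, not the C implementation: `emit_operations` is a hypothetical name, operations are shown as tuples rather than bytes, Python callables stand in for C functions, and the argument count for 'dict' (keys and values each counted as one argument) is an assumption.

```python
def emit_operations(value, ops=None):
    """Append the operations that would rebuild `value` to `ops` and return it."""
    if ops is None:
        ops = []

    if callable(value):
        # Stand-in for "C functions cannot be serialized": abort the whole action.
        raise TypeError("functions cannot be serialized in this sketch")

    if isinstance(value, list):
        # Serialize every item first, then generate a call to 'array'.
        for item in value:
            emit_operations(item, ops)
        ops.append(("call", "array", len(value)))
    elif isinstance(value, dict):
        # Serialize every key and value, then generate a call to 'dict'.
        for key, item in value.items():
            emit_operations(key, ops)
            emit_operations(item, ops)
        ops.append(("call", "dict", 2 * len(value)))
    else:
        # Plain values become a single push operation.
        ops.append(("push", value))
    return ops


example_ops = emit_operations([1, {"a": 2}])
```

Nested containers are emitted depth-first, so the inner 'dict' call appears before the outer 'array' call.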
Deserialization
- "unserialize" is called
- "sgs_Unserialize(Ext)" is called internally
- "push", "call", "symbol" operations are executed, thus regenerating the data
  - "push" operation pushes a variable on the stack
  - "call" operation calls the specified function
  - "symbol" operation resolves the given string to a variable from the symbol table
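The three operations can be pictured as a small stack machine. The Python sketch below is a model, not the SGScript code: `run_operations`, `make_array` and `make_dict` are hypothetical names, operations are tuples rather than bytes, and it assumes that "call" takes its arguments from the top of the stack, pushes the result back, and that 'dict' receives alternating keys and values.

```python
def run_operations(ops, functions, symbols=None):
    """Execute push/call/symbol operations and return the single resulting value."""
    symbols = symbols or {}
    stack = []
    for op in ops:
        kind = op[0]
        if kind == "push":
            stack.append(op[1])
        elif kind == "symbol":
            # Resolve the string to a value from the symbol table.
            stack.append(symbols[op[1]])
        elif kind == "call":
            _, name, argc = op
            if name not in functions:
                # Only functions present in the table can be called.
                raise LookupError("function not available: %s" % name)
            if argc > len(stack):
                raise ValueError("not enough values on the stack")
            split = len(stack) - argc
            args = stack[split:]
            del stack[split:]
            stack.append(functions[name](*args))
        else:
            raise ValueError("unknown operation: %r" % (kind,))
    if len(stack) != 1:
        raise ValueError("expected exactly one value at the end")
    return stack[0]


def make_array(*items):
    return list(items)


def make_dict(*flat):
    # Assumption: arguments alternate key, value, key, value, ...
    return dict(zip(flat[0::2], flat[1::2]))


FUNCTIONS = {"array": make_array, "dict": make_dict}

rebuilt = run_operations(
    [("push", 1), ("push", "a"), ("push", 2), ("call", "dict", 2), ("call", "array", 2)],
    FUNCTIONS,
)
```

Because the function table is passed in explicitly, a call to any name missing from it fails instead of running arbitrary code.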
Possible gotchas
- deserialization is not safe by default: a carefully crafted byte buffer can cause any function to be executed; this can be prevented by temporarily replacing the global environment (_G)
- if environment overrides are used on deserialization, remember to add "array", "dict" and "map" to the list if you use them; they are not always supported automatically
- mode 1/3 serialization does not preserve variable reuse relationships, so after deserialization the structure could use more memory than before; if more efficient behavior is desired, use mode 2 (the default) instead
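The last point, about variable reuse, can be illustrated with an analogy from Python's standard library. This is not SGScript code: `json` is used here only as an example of a tree-style format that writes a shared value once per reference (comparable in spirit to modes 1 and 3), and `pickle` as an example of a format that tracks references (comparable in spirit to mode 2).

```python
import json
import pickle

shared = [1, 2, 3]
original = [shared, shared]  # the same list object is referenced twice

# Tree-style round trip: each reference is written out as a separate copy.
tree_copy = json.loads(json.dumps(original))

# Reference-tracking round trip: the shared object is stored once.
graph_copy = pickle.loads(pickle.dumps(original))

same_in_tree = tree_copy[0] is tree_copy[1]
same_in_graph = graph_copy[0] is graph_copy[1]
```

Both copies compare equal to the original, but only the reference-tracking round trip keeps the two entries pointing at one object. The tree-style copy holds two independent lists, which is where the extra memory use comes from.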