new stream adapter, and lots of refactorings

fraillt
2017-10-06 13:50:04 +03:00
parent 5ede853954
commit f3c9a33849
46 changed files with 967 additions and 701 deletions

@@ -2,51 +2,65 @@ To get the most out of **Bitsery**, start with the [tutorial](tutorial/README.md
Once you're familiar with the library consider the following reference material.
Library design:
* `valueNb instead of value`
* `fundamental types`
* `valueNb instead of value`
* `flexible syntax`
* `serializer/deserializer functions overloads`
* `extending library functionality`
* `errors handling`
* `forward/backward compatibility via Growable extension`
Core Serializer/Deserializer functions (alphabetical order):
* `align`
* `boolByte`
* `boolBit`
* `boolValue`
* `container`
* `extend`
* `getContext`
* `ext`
* `context`
* `object`
* `text`
* `value`
Serializer/Deserializer extensions via `extend` method (alphabetical order):
* `ContainerMap`
Serializer/Deserializer extensions via `ext` method (alphabetical order):
* `Entropy`
* `Growable`
* `Optional`
* `StdMap`
* `StdOptional`
* `StdQueue`
* `StdSet`
* `StdStack`
* `ValueRange`
BasicBufferWriter/Reader functions:
AdapterWriter/Reader functions:
* `writeBits/readBits`
* `writeBytes/readBytes`
* `writeBuffer/readBuffer`
* `align`
* `beginSession/endSession`
* `flush (writer only)`
* `writtenBytesCount (writer only)`
* `setError (reader only)`
* `getError (reader only)`
* `isCompletedSuccessfully (reader only)`
Input adapters (buffer and stream) functions:
* `read`
* `error`
* `setError`
* `isCompletedSuccessfully`
Output adapters (buffer and stream) functions:
* `write`
* `flush`
* `writtenBytesCount`
Tips and tricks:
* if you're getting static assert "please define 'serialize' function", most likely it is because your SERIALIZE function is not defined in same namespace as object.
* if you get the static assert "please define 'serialize' function", most likely your **serialize** function is not defined in the same namespace as the object.
Limitations:
* the **text** or **container** size must be less than 2^(n-2) (where n = sizeof(std::size_t) * 8); for 32-bit systems the maximum is 1073741823 (0x3FFFFFFF).
* when using the **Growable** extension, the serialized buffer size in bytes cannot be greater than 2^(n-2) (where n = sizeof(std::size_t) * 8).
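To make the size limit concrete, the bound can be computed from the bit width of `std::size_t`. This is only an arithmetic sketch: `maxEncodableSize` is a hypothetical helper name, not part of bitsery, and it assumes the limit is the largest value strictly below 2^(n-2), which matches the 32-bit decimal figure quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a bitsery function): largest size that fits
// when two of the n bits are unavailable, i.e. 2^(n-2) - 1.
inline std::uint64_t maxEncodableSize(unsigned bits) {
    assert(bits >= 2 && bits <= 64);
    return (std::uint64_t{1} << (bits - 2)) - 1;
}
```

For a 32-bit `std::size_t`, `maxEncodableSize(32)` evaluates to 1073741823.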
Other:
* [Contributing](../CONTRIBUTING.md)
* [Change log](../CHANGELOG.md)

@@ -11,7 +11,7 @@ Most well-known serialization libraries sacrifice memory and speed efficiency by
## A word about JSON
People often use C++ because they want speed and memory efficiency, and JSON is not on the list of efficient serialization formats.
Although JSON is very readable and very convenient when used together with dynamically typed languages (such as JavaScript).
JSON is very readable and very convenient when used together with dynamically typed languages such as JavaScript.
When serializing data from statically typed languages, however, JSON not only has the obvious drawback of runtime inefficiency, but also forces you to write more code to access data (counterintuitively) due to its dynamic-typing serialization system.
Adding optional support for JSON doesn't come for free either.
@@ -34,18 +34,19 @@ Now let's review features in more detail.
* **Cross-platform compatible.** If the same code compiles on Android, a PS3 console, and your PC (either x64 or x86), you can be sure it works on all of them.
To achieve this, bitsery explicitly specifies the size of the underlying data, hence the syntax is *value\<2\>* (alias function *value2b*) instead of *value*, or *container2b* for an element type of 16 bits, e.g. int16_t.
Bitsery also applies endianness transformation if necessary.
**If** however, you don't like this verbose syntax, you can just write *serialize* functions for fundamental types, and forget about *value\<N\>*, *container\<N\>*, etc.
But do it on your own risk, or write static asserts.
* **Flexible syntax.** If you don't like explicitly specifying the underlying type size, as in *container2b* or *value8b*, you can use the flexible syntax.
Just include `<bitsery/flexible.h>` and you can write code as in [cereal](http://uscilab.github.io/cereal/).
But do so at your own risk, and add static asserts using the *assertFundamentalTypeSizes* function if you're planning to use it across multiple platforms.
* **Optimized for speed and space.** The library itself doesn't do any allocations (except if you use backward/forward compatibility), so writing/reading data is as fast as memcpy to/from your buffer.
It also doesn't serialize any type information; all the information needed is written in your code!
* **No code generation required: no IDL or metadata** since it doesn't support any other formats except binary, it doesn't need any metadata.
* **Runtime error checking on deserialization** library designed to be save with untrusted network data, that's why all overloads that work on containers has *maxSize* value, unless container is static size like *std::array*, this way bitsery ensures that no malicious data will not crash you.
* **Runtime error checking on deserialization.** The library is designed to be safe with untrusted network data, which is why all overloads that work on containers take a *maxSize* value, unless the container has a static size like *std::array*; this way bitsery ensures that malicious data cannot crash your application.
* **Supports forward/backward compatibility for your types.** The library has optional forward/backward compatibility for types, implemented in *BasicBufferReader/BasicBufferWriter* by allowing inner data sessions inside the buffer.
This is the only functionality that requires dynamic memory allocation.
The *Growable* extension uses these sessions to add compatibility support for your types in its most basic form.
You can implement your own extensions if you want to be able to add default values.
* **2-in-1 declarative control flow, same code for serialization and deserialization.** only one function to define, for serialization and deserialization in same manner as *cereal* does.
It might be handy to have separate *load*, *save* functions, but Bitsery explicitly doesn't support it, to avoid any serialization deserialization path differences, because it is very hard to catch an errors if you make a bug in one of these functions.
It might be handy to have separate *load* and *save* functions, but Bitsery explicitly doesn't support them, to avoid any divergence between serialization and deserialization, because it is very hard to catch errors when one of these functions has a bug.
The only way around this is through extensions: write your custom flow once, and reuse it where you need it.
* **Allows fine-grained serialization control.** This is a feature that no other library provides.
Bitsery allows you to use bit-level operations and has two extensions that use them:
@@ -53,16 +54,16 @@ Bitsery allows to use bit-level operations and has two extensions that use them:
* *Entropy*: the full term is *entropy encoding*, which means that when you have one or more most common values, it will write just a few bits instead of the full object.
E.g.: imagine that you have a struct Person{ int32_t Id; string Profession; }.
You know that mostly there are young persons, so the most common value will be equal to: "Student", "Child", "NoProfession", in this case you'll pay 2bits for each record, but write no data if string matches.
You might know that most persons are young, so the most common values will be "Student", "Child", and "NoProfession"; in this case you'll pay 2 bits for each record, but write no data if the string matches.
Using these bit-level operations and extensions you can compose your own extensions for vectors, matrices or any other types.
Furthermore, all other operations will not align data automatically for you, so data will be compressed as much as possible.
One more advanced and dangerous feature, is ability to have serialization context, so you can control your serialization flow at runtime, but make sure that these contexts are in sync between serializer and deserializer.
One possible use case for serialization context is to pass min/max ranges for *ValueRange* when your information changes at runtime.
* One more advanced and dangerous feature is the ability to have a serialization context, so you can control your serialization flow at runtime, but make sure that these contexts are in sync between serializer and deserializer.
One possible use case for a serialization context is to pass min/max ranges for *ValueRange* when your information changes at runtime.
* **Easily extendable.** The library is designed to be easily extendable for any type and flow.
If you want to support your custom container, that's fine: there is *ContainerTraits* for this, with only a few methods to implement.
To use same container for buffer writing/reading add specialization to *BufferContainerTraits*.
To use the same container for buffer writing/reading, add a specialization of *BufferAdapterTraits*.
If you want to customize the serialization flow, use extensions: there are only two methods to define, plus *ExtensionTraits* to further customize usage.
* **Configurable endianness support.** The default is *Little Endian*, but if your primary target is the PowerPC architecture, e.g. PlayStation3, just change your configuration to *Big Endian*.
* **No macros.** Not so much to say, if you are like me, then it's a feature :)
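The bit-level savings described for *ValueRange* and *Entropy* come down to how many bits are needed to tell the possible values apart. Below is a minimal sketch of that arithmetic for integer ranges; `bitsRequired` is a hypothetical helper written for illustration and is not bitsery's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bits needed to store any value in
// [minValue, maxValue] as an offset from minValue.
inline unsigned bitsRequired(std::int64_t minValue, std::int64_t maxValue) {
    assert(minValue <= maxValue);
    // Offsets run from 0 to (maxValue - minValue); unsigned arithmetic
    // keeps the subtraction well defined for the whole int64 range.
    std::uint64_t span = static_cast<std::uint64_t>(maxValue) -
                         static_cast<std::uint64_t>(minValue);
    unsigned bits = 0;
    while (span != 0) {
        ++bits;
        span >>= 1;
    }
    return bits;
}
```

For the *Person* example, three common professions plus a "no match" marker give four states, so `bitsRequired(0, 3)` is 2, which agrees with the 2 bits per record mentioned above. A value known to lie in [0, 100] would need 7 bits rather than the 32 bits of an `int32_t`.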

@@ -1,5 +1,5 @@
*document in progress*
* NO_ERROR,
* NoError,
* BUFFER_OVERFLOW,
* INVALID_BUFFER_DATA
* write what happens when data is corrupted

@@ -10,7 +10,30 @@ bitsery can be directly included in your project or installed anywhere you can a
Grab the latest version, and include directory `bitsery_base_dir/include/` to your project.
There's nothing to build or make - **bitsery** is header only.
## Add serialization methods for your types
## Include required headers and define some helper types
```cpp
#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <bitsery/traits/string.h>
using namespace bitsery;
using Buffer = std::vector<uint8_t>;
using OutputAdapter = OutputBufferAdapter<Buffer>;
using InputAdapter = InputBufferAdapter<Buffer>;
```
**bitsery** is very lightweight, so we need to explicitly include what we need.
* `<bitsery/bitsery.h>` is the core header, which includes our Serializer and Deserializer.
* `<bitsery/adapter/buffer.h>`: in order to write/read data we need a specific adapter, depending on what the underlying buffer will be. In this example we'll use std::vector as our buffer, so we include the buffer adapter.
* `<bitsery/traits/...>`: traits tell the library how to efficiently serialize a particular container.
Finally, we create alias types for *InputAdapter* and *OutputAdapter*, using our vector as the buffer.
## Add serialization method for your type
**bitsery** needs to know which data members to serialize in your classes.
Let it know by implementing a serialize method for your type:
@@ -30,11 +53,11 @@ void serialize(S& s, MyStruct& o) {
};
```
**bitsery** also can serialize private class members, just move *serialize* function inside structure, and make it *friend* (*fiend void serialize(.....)*).
**bitsery** also allows you to define the serialize function inside your class, and it can also serialize private class members: just declare *friend bitsery::Access;*
**bitsery** has verbose syntax, because it is cross-platform compatible by default and has full control over how to serialize data (read more about it in [motivation](../design/README.md))
**bitsery** supports two ways to describe your serialization flow: *verbose syntax* (as in the example) or *flexible syntax*, similar to the *cereal* library; just include `<bitsery/flexible.h>` to use it.
This example contains core functionality that you'll use all the time, so lets get through it:
In this example we chose the probably unfamiliar verbose syntax, so let's explain the core functionality that you'll use all the time:
* **s.value4b(o.i);** serializes fundamental types (ints, floats, enums); value**4b** means that the data type is 4 bytes. If you use the same code on different machines and it compiles, it is compatible.
* **s.text1b(o.str);** serializes text (null-terminated) of char type; if you use *wchar* then you would write *text2b*.
* **s.container4b(o.fs, 100);** serializes any container of fundamental types of size 4 bytes; **100** is the max size of the container.
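The "if it compiles, it is compatible" guarantee can be pictured as a compile-time size check. The following is a self-contained sketch of that idea, not bitsery's code: `checkedValue` and `ByteSink` are hypothetical names, and copying raw bytes ignores the endianness handling that the library performs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an output buffer.
using ByteSink = std::vector<std::uint8_t>;

// Appends exactly N bytes of v; refuses to compile if T is not N bytes wide.
template <std::size_t N, typename T>
void checkedValue(ByteSink& out, const T& v) {
    static_assert(sizeof(T) == N,
                  "value size does not match the declared byte count");
    std::uint8_t raw[N];
    std::memcpy(raw, &v, N);  // raw copy in host byte order
    out.insert(out.end(), raw, raw + N);
}
```

With this sketch, `checkedValue<4>(out, std::uint32_t{8941})` compiles, whereas `checkedValue<4>(out, std::uint64_t{1})` is rejected at compile time, so a size mismatch on some platform shows up as a build error rather than as corrupted data.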
@@ -45,57 +68,35 @@ External serialization functions should be placed either in the same namespace a
## Serialization and deserialization
### Create serializer
Create a serializer and send the data you want to serialize to it.
Create a buffer and use the helper functions for serialization and deserialization.
```cpp
std::vector<uint8_t> buffer;
BufferWriter bw{buffer};
Serializer ser{bw};
Buffer buffer;
auto writtenSize = quickSerialization<OutputAdapter>(buffer, data);
auto state = quickDeserialization<InputAdapter>({buffer.begin(), writtenSize}, res);
```
Serialization process consists of three independant parts.
* **std::vector<uint8_t> buffer;** core object, that will store the data for serialization and deserialization.
* **BufferWriter bw{buffer};** writer knows how to write bytes to buffer, and how to resize buffer, or how to use fixed-size buffer. It also applies endianess transformations if nesessary.
* **Serializer ser{bw};** serializer is a high level wrapper that knows how to convert object to stream of bytes, and write then to buffer.
Serializer doesn't store any state, it only has reference to buffer, so it is safe to create many of those if nesessary.
BufferWriter also doesn't own buffer, but it stores state about writing position and container size.
One important note that when using bit-level operations, dont forget to flush buffer writer **bw.flush()** otherwise, some data might not be written to buffer.
### Serialize object
```cpp
MyStruct data{8941, "hello", {15.0f, -8.5f, 0.045f}};
ser.object(data); // serializes data
```
**ser.object(data)** is a final core function along with **value, text, container**.
This function is actually equivalent to calling *serialize(ser, data)* directly, but it displays friendly static assert message if it cannot find *serialize* function for your type.
### Deserialize object
```cpp
BufferReader br{bw.getWrittenRange()};
Deserializer des{br};
MyStruct res{};
des.object(res); //deserializes data
```
Deserialization process is equivalent to serialization, except that *BufferReader* reader has getError() method that returns deserialization state.
These helper functions use the default configuration, *bitsery::DefaultConfig*.
* **quickSerialization** creates a serializer using the output adapter, serializes the data, and returns the written size.
* **quickDeserialization** creates a deserializer using the input adapter, deserializes to the object, and returns the deserialization state.
The deserialization state has two properties: an error code, and a bool that indicates whether the buffer was fully read without errors.
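How such a two-part state might be checked can be sketched without the library. Everything here is a stand-in: `SketchReaderError`, `SketchState` and `deserializedCleanly` are hypothetical names, and only the `NoError` enumerator and the pair-like `first`/`second` shape are taken from the example code.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the library's error enum; only NoError comes
// from the example, SomeFailure is purely illustrative.
enum class SketchReaderError { NoError, SomeFailure };

// Error code plus a flag meaning "buffer fully read and no errors".
using SketchState = std::pair<SketchReaderError, bool>;

// Deserialization counts as successful only when both parts agree.
inline bool deserializedCleanly(const SketchState& state) {
    return state.first == SketchReaderError::NoError && state.second;
}
```

Checking both parts matters: an error code of `NoError` with a `false` flag would mean that the read did not consume the whole buffer.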
## Full example code
```cpp
#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <bitsery/traits/string.h>
using namespace bitsery;
using Buffer = std::vector<uint8_t>;
using OutputAdapter = OutputBufferAdapter<Buffer>;
using InputAdapter = InputBufferAdapter<Buffer>;
struct MyStruct {
uint32_t i;
char str[6];
@@ -110,18 +111,16 @@ void serialize(S& s, MyStruct& o) {
};
int main() {
std::vector<uint8_t> buffer;
BufferWriter bw{buffer};
Serializer ser{bw};
MyStruct data{8941, "hello", {15.0f, -8.5f, 0.045f}};
ser.object(data); // serializes data
BufferReader br{bw.getWrittenRange()};
Deserializer des{br};
MyStruct res{};
des.object(res); //deserializes data
Buffer buffer;
auto writtenSize = quickSerialization<OutputAdapter>(buffer, data);
auto state = quickDeserialization<InputAdapter>({buffer.begin(), writtenSize}, res);
assert(state.first == ReaderError::NoError && state.second);
assert(data.fs == res.fs && data.i == res.i && std::strcmp(data.str, res.str) == 0);
}
```