1.4.1 Wiretypes
The wire-stream is built from six primitive payload types, two of which have been deprecated. A primitive in the wire-stream is a multi-byte string that provides three pieces of information: a wire-type, a user-specified tag (field number), and the raw payload. Apart from the tag and its wire-type, protobuf payloads are not self-describing, because the wire-stream contains no payload type information. The interpreter uses the tag to associate the raw payload with a local host type specified by the template. Hence, a message can only be decoded properly with the template that was used to encode it. Note also that each primitive is interpreted according to the needs of the local host: local word-size and endianness are dealt with at this level.
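As an illustration of how the wire-type and tag travel together, the following minimal sketch packs both into the key that precedes a payload, following the general protobuf encoding rules. It is written in Python only because Python is the language used for the interoperability tests; the function and constant names are hypothetical and are not part of the Prolog library.

```python
# Wire-type codes as defined by the protobuf encoding rules.
# START_GROUP and END_GROUP are the two deprecated types.
VARINT, FIXED64, LENGTH_DELIMITED, START_GROUP, END_GROUP, FIXED32 = range(6)


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low-order group first."""
    if value < 0:
        raise ValueError("encode_varint expects a non-negative integer")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit: more groups follow
        else:
            out.append(group)
            return bytes(out)


def encode_key(tag: int, wire_type: int) -> bytes:
    """Pack the field number (tag) and wire-type into the key before the payload."""
    return encode_varint((tag << 3) | wire_type)


def decode_key(key: int) -> tuple:
    """Split a decoded key back into (tag, wire_type)."""
    return key >> 3, key & 0x07


# Field 1 carrying the varint 150: key byte 0x08, then payload 0x96 0x01.
message = encode_key(1, VARINT) + encode_varint(150)
print(message.hex())  # 089601
```

Nothing in these three bytes says whether 150 is a `uint32`, an `int64`, or an enum value; only the template can settle that.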
The following table shows the association between the types in the .proto file and the primitives used in the wire-stream. For how these correspond to other programming languages, such as C++ and Java, see Protocol Buffers Scalar Value Types, which also has advice on how to choose between the various integer types. (Python3 types are also given here, because Python is used in some of the interoperability tests.)
| Prolog | Wirestream | .proto file | C++ | Python3 | Notes |
|---|---|---|---|---|---|
| double | fixed64 | double | double | float | |
| unsigned64 | fixed64 | fixed64 | uint64 | int | |
| integer64 | fixed64 | sfixed64 | int64 | | |
| float | fixed32 | float | float | float | |
| unsigned32 | fixed32 | fixed32 | uint32 | int | |
| integer32 | fixed32 | sfixed32 | int32 | | |
| integer | varint | sint32 | int32 | int | 1, 2, 9 |
| integer | varint | sint64 | int64 | int | 1, 2, 9 |
| signed32 | varint | int32 | int32 | int | 2, 3, 10 |
| signed64 | varint | int64 | int64 | int | 2, 3, 10 |
| unsigned | varint | uint32 | uint32 | int | 2, 3 |
| unsigned | varint | uint64 | uint64 | int | 2, 3 |
| boolean | varint | bool | bool | bool | 2, 8 |
| enum | varint | (enum) | (enum) | (enum) | |
| atom | length delimited | string | | str (unicode) | |
| codes | length delimited | bytes | | bytes | |
| utf8_codes | length delimited | string | | str (unicode) | |
| string | length delimited | string | string | str (unicode) | |
| embedded | length delimited | message | | (class) | 5 |
| repeated | length delimited | repeated | | (list) | 6 |
| repeated_embedded | length delimited | repeated | | (list) | 11 |
| packed | length delimited | packed repeated | | (list) | |
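To make the fixed-width and length-delimited rows concrete, here is a minimal sketch of how such payloads are laid out on the wire, assuming the standard protobuf rules (fixed-width values are little-endian; length-delimited values carry a varint byte count followed by the bytes). The helper names are hypothetical, and the key that precedes each payload is omitted.

```python
import struct


def encode_varint(value: int) -> bytes:
    """Base-128 varint for a non-negative integer (used here for length prefixes)."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # continuation bit set
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_double(value: float) -> bytes:
    """fixed64 payload for a .proto double: 8 bytes, little-endian IEEE 754."""
    return struct.pack("<d", value)


def encode_fixed32(value: int) -> bytes:
    """fixed32 payload for a .proto fixed32: 4 bytes, little-endian, unsigned."""
    return struct.pack("<I", value)


def encode_sfixed64(value: int) -> bytes:
    """fixed64 payload for a .proto sfixed64: 8 bytes, little-endian, signed."""
    return struct.pack("<q", value)


def encode_string(text: str) -> bytes:
    """Length-delimited payload for a .proto string: byte count, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint(len(data)) + data


print(encode_double(1.0).hex())        # 000000000000f03f
print(encode_fixed32(1).hex())         # 01000000
print(encode_sfixed64(-2).hex())       # feffffffffffffff
print(encode_string("testing").hex())  # 0774657374696e67
```

The explicit `<` in each format string is what keeps the output independent of the host's native byte order.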
Notes:
1. Encoded using a compression technique known as zig-zagging, which is more efficient for negative values, but which is slightly less efficient if you know the values will always be non-negative.
2. Encoded as a modulo 128 string. Its length is proportional to its magnitude. The intrinsic word length is decoupled between parties. If zig-zagging is not used (see note 1), negative numbers become maximum length.
3. SWI-Prolog has unbounded integers, so an unsigned integer isn't a special case (it is range-checked and an exception thrown if its representation would require more than 32 or 64 bits).
4. Encoded as UTF8 in the wire-stream.
5. Specified as `embedded(Tag,protobuf([...]))`.
6. Specified as `repeated(Tag,Type([...,...]))`, where Type is `unsigned`, `integer`, `string`, etc.
7. `repeated ... [packed=true]` in proto2. Cannot contain "length delimited" types.
8. Prolog `boolean(Tag,false)` maps to 0 and `boolean(Tag,true)` maps to 1.
9. Uses "zig-zag" encoding, which is more space-efficient for negative numbers.
10. The documentation says that this doesn't use "zig-zag" encoding, so it's less space-efficient for negative numbers. In particular, both C++ and Python encode negative numbers as 10 bytes, and this implementation does the same for wire-stream compatibility (note that SWI-Prolog typically uses 64-bit integers anyway). Therefore, signed64 is used for both .proto types `int32` and `int64`.
11. Specified as `repeated_embedded(Tag,protobuf([...]),Fields)`.
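
The difference between the zig-zag and plain varint encodings described in the notes can be sketched as follows. This is an illustrative Python sketch of the standard protobuf rules, not the Prolog implementation; all function names are hypothetical, and the range check only mimics the behaviour described for out-of-range integers.

```python
MASK64 = (1 << 64) - 1


def encode_varint(value: int) -> bytes:
    """Base-128 varint; raises if the value does not fit in an unsigned 64-bit word."""
    if not 0 <= value <= MASK64:
        raise ValueError("value does not fit in 64 bits")
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # continuation bit set
        value >>= 7
    out.append(value)
    return bytes(out)


def zigzag_encode(n: int) -> int:
    """Map 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... so small magnitudes stay small."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def encode_sint(n: int) -> bytes:
    """sint32/sint64 payload: zig-zag first, then varint."""
    return encode_varint(zigzag_encode(n))


def encode_int64(n: int) -> bytes:
    """int32/int64 payload: 64-bit two's complement, then varint (no zig-zag)."""
    return encode_varint(n & MASK64)


print(encode_sint(-1).hex())   # 01
print(encode_int64(-1).hex())  # ffffffffffffffffff01
print(len(encode_int64(-1)))   # 10
```

With zig-zag, -1 costs a single byte; without it, the sign bit forces all 64 bits onto the wire, which takes ten 7-bit groups.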