Considering Optional and Required Fields in ProtoBuf

sorcererxw

A World Without optional

For a long time, proto3 did not support the optional keyword that proto2 had. As I mentioned in gRPC and "Scalable Programming", I believe that making all fields optional is good for future extensibility: adding a new field then cannot cause type errors in existing callers' code.

For instance, in TypeScript, every field that is not marked optional must be supplied when an object of that type is constructed. If countless callers are already using this Request type, adding a field to it becomes difficult, because each of those call sites stops compiling until it is updated.
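To make the problem concrete, here is a minimal TypeScript sketch. The RequestV1 and RequestV2 types are hypothetical, hand-written stand-ins for two versions of a request message, not generated code.

```typescript
// Hypothetical request type, version 1: a single required field.
interface RequestV1 {
  arg1: string;
}

// Version 2 adds a field. Because it is declared optional (`?`),
// objects written against version 1 still type-check.
interface RequestV2 {
  arg1: string;
  arg2?: string;
}

// An old call site that only knows about version 1.
const v1Call: RequestV1 = { arg1: "hello" };

// Had `arg2` been declared as `arg2: string`, this assignment would
// fail to compile because the property `arg2` is missing.
const legacyCall: RequestV2 = v1Call;

// A caller that knows about the new field can still set it.
const newCall: RequestV2 = { arg1: "hello", arg2: "world" };

console.log(legacyCall.arg2 === undefined); // true
console.log(newCall.arg2); // world
```

Changing `arg2?: string` to `arg2: string` in this sketch is enough to turn the legacy assignment into a compile error, which is the extensibility cost described above.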

Before optional came back, there were generally two ways to make a field nullable:

  • Wrap the field in a single-member oneof.
    message Foo {
        int32 bar = 1;
        oneof optional_baz {
            int32 baz = 2;
        }
    }
  • Use the wrapper types from google/protobuf/wrappers.proto for basic types. They add a null state on top of the basic values, much like the relationship between Integer and int in Java.
    // requires: import "google/protobuf/wrappers.proto";
    message Foo {
        int32 bar = 1;
        google.protobuf.Int32Value baz = 2;
    }

Neither approach is very elegant, but both work!
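Both workarounds come down to giving baz a third state, "not set", on top of its integer values. A rough hand-written TypeScript model of that idea follows. The Int32Value and Foo shapes and the bazOr helper are assumptions for illustration and do not reproduce the output of any particular code generator.

```typescript
// Hand-written stand-in for google.protobuf.Int32Value:
// a message that boxes a single int32.
interface Int32Value {
  value: number;
}

// Hypothetical model of the Foo message using the wrapper approach.
interface Foo {
  bar: number;      // plain int32: always has a value
  baz?: Int32Value; // boxed int32: may be absent
}

// Returns baz if it was set, otherwise the supplied fallback.
function bazOr(foo: Foo, fallback: number): number {
  return foo.baz !== undefined ? foo.baz.value : fallback;
}

const unsetFoo: Foo = { bar: 1 };
const explicitZero: Foo = { bar: 1, baz: { value: 0 } };

console.log(bazOr(unsetFoo, -1));     // -1
console.log(bazOr(explicitZero, -1)); // 0
```

The point of the box is visible in the last two lines: an explicit 0 and a missing value are no longer the same thing.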

Reintroducing optional

However, ProtoBuf v3.15.0 recently reintroduced the optional keyword for proto3, which truly caught me by surprise.

Release Protocol Buffers v3.15.0 · protocolbuffers/protobuf
https://github.com/protocolbuffers/protobuf/releases/tag/v3.15.0
Protocol Compiler: Optional fields for proto3 are enabled by default, and no longer require the --experimental_allow_proto3_optional flag.

syntax = "proto3";

message Request {
    string arg1 = 1;
    optional string arg2 = 2;
}

So, are fields without the optional keyword optional or required?

With this question in mind, I ran some experiments with gRPC in TypeScript and Go, comparing how optional and non-optional fields behave. The results can be summarized as follows:

  • Ordinary fields:
    • In Go, the field is represented as a basic type, which always has a zero value.
    • TypeScript also represents it as a basic type. There is no default zero value in Node, but both the gRPC client and the server initialize every undefined field to its zero value when they receive a proto message.
  • Optional fields:
    • In Go, the field becomes a pointer type *T, so an unset field is nil rather than the zero value of T.
    • In TypeScript, it is represented as an optional field (field?: T), which does not need to be initialized.

This makes it clear that fields without optional are not truly optional. The caller may choose to initialize only some of them, but the callee then relies on the framework or the language itself to fill in the undefined fields, so that using them does not cause null pointer errors.
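The filling-in step can be pictured roughly as follows. This is a simplified sketch, not the actual gRPC or protobuf runtime: WireRequest, NormalizedRequest, and normalize are hypothetical names, and the message mirrors the Request example, with arg1 as an ordinary field and arg2 as an optional one.

```typescript
// What might arrive from the wire: any field may be missing.
interface WireRequest {
  arg1?: string;
  arg2?: string;
}

// What the handler sees after normalization.
interface NormalizedRequest {
  arg1: string;  // ordinary field: always present
  arg2?: string; // optional field: absence is preserved
}

// Fill ordinary fields with their zero value; leave optional ones alone.
function normalize(wire: WireRequest): NormalizedRequest {
  const req: NormalizedRequest = { arg1: wire.arg1 ?? "" };
  if (wire.arg2 !== undefined) {
    req.arg2 = wire.arg2;
  }
  return req;
}

const received = normalize({});
console.log(received.arg1 === ""); // true
console.log("arg2" in received);   // false
```

After normalization the handler can use arg1 without a null check, but it can no longer tell whether the caller sent an empty string or nothing at all. Only arg2 keeps that information.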

If a field can legitimately be empty and that empty state carries meaning, both the caller and the callee need to be able to see it, so the field should be marked as optional.
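As an illustration of a callee that depends on the empty state, consider a partial update. The UpdateRequest and User types and the applyUpdate function below are hypothetical, and the sketch assumes nickname is declared as optional string in the proto file.

```typescript
// Hypothetical request where `nickname` is an `optional string`.
interface UpdateRequest {
  id: string;
  nickname?: string;
}

interface User {
  id: string;
  nickname: string;
}

// Only touch the nickname if the caller explicitly sent one.
// An empty string is a legitimate value that clears it.
function applyUpdate(user: User, req: UpdateRequest): User {
  if (req.nickname === undefined) {
    return user; // field absent: keep the current value
  }
  return { ...user, nickname: req.nickname };
}

const currentUser: User = { id: "u1", nickname: "alice" };
const untouched = applyUpdate(currentUser, { id: "u1" });
const cleared = applyUpdate(currentUser, { id: "u1", nickname: "" });

console.log(untouched.nickname);      // alice
console.log(cleared.nickname === ""); // true
```

With an ordinary string field, both requests would reach the handler as an empty string, and the handler could not distinguish "leave it alone" from "clear it".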