
The Ingenious Use and Pitfalls of Go Type Embedding in Deserialization

sorcererxw

In Go, struct embedding is often treated as syntactic sugar that stands in for 'extends' in object-oriented languages. Beyond basic object-oriented modeling, embedding also enables a few tricks for overriding fields and methods during JSON deserialization.

Effective Go - The Go Programming Language
https://go.dev/doc/effective_go#embedding

Field Type Replacement

Type embedding lets an outer struct override the type of a field declared in the embedded type. A typical use case:

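import (
	"encoding/json"
	"time"
)

type Inner struct {
	CreateTime time.Time `json:"time"`
}

// Alias has the same fields as Inner but none of its methods.
type Alias Inner

type Outer struct {
	// Shadows Alias.CreateTime: same JSON name, shallower depth.
	CreateTime int64 `json:"time"`

	*Alias
}

func (i *Inner) UnmarshalJSON(b []byte) error {
	var o Outer
	o.Alias = (*Alias)(i) // any other fields decode straight into i

	if err := json.Unmarshal(b, &o); err != nil {
		return err
	}

	i.CreateTime = time.Unix(o.CreateTime, 0)

	return nil
}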

In Go, the built-in JSON decoding of time.Time only accepts a quoted timestamp in RFC 3339 format. So what if we need to deserialize Unix timestamps? We can define UnmarshalJSON on Inner once and use Outer to shadow its CreateTime field. Outer.CreateTime carries the same JSON name as the embedded CreateTime but sits at a shallower depth, so when the time key is parsed, encoding/json writes the value into Outer.CreateTime and bypasses Inner.CreateTime.

One point needs attention: Outer must embed the Alias type rather than Inner itself. If Outer embedded Inner, the UnmarshalJSON method of Inner would be promoted to Outer, and during deserialization the json library would detect that Outer implements UnmarshalJSON and call Inner.UnmarshalJSON directly, without attempting to fill data into Outer. Because that call is made from inside Inner.UnmarshalJSON, it would also recurse without end.
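To see the pattern above end to end, here is a minimal runnable sketch. The Name field, the decodeInner helper, and the sample payload are assumptions added for illustration; they are not part of the original snippet. The call to UTC() is only there to make the printed value independent of the local time zone.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Inner is the type callers work with. Its "time" key arrives as Unix seconds.
type Inner struct {
	Name       string    `json:"name"` // hypothetical extra field
	CreateTime time.Time `json:"time"`
}

// Alias has Inner's fields but none of its methods.
type Alias Inner

// Outer shadows "time" with an int64; every other key falls through to Alias.
type Outer struct {
	CreateTime int64 `json:"time"`
	*Alias
}

func (i *Inner) UnmarshalJSON(b []byte) error {
	o := Outer{Alias: (*Alias)(i)} // non-time fields decode directly into i
	if err := json.Unmarshal(b, &o); err != nil {
		return err
	}
	i.CreateTime = time.Unix(o.CreateTime, 0).UTC()
	return nil
}

// decodeInner is a hypothetical helper that wraps json.Unmarshal.
func decodeInner(data string) (Inner, error) {
	var in Inner
	err := json.Unmarshal([]byte(data), &in)
	return in, err
}

func main() {
	in, err := decodeInner(`{"name":"demo","time":1700000000}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(in.Name, in.CreateTime.Format(time.RFC3339))
}
```

Because the embedded *Alias is set before decoding, keys such as name are written directly into the original Inner value, and only the shadowed time key needs to be converted by hand.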

UnmarshalJSON Hell

As mentioned earlier, if a type embeds another type that implements UnmarshalJSON, that method is promoted to the outer type, and by default the json library calls it directly instead of decoding the outer type's own fields.
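The effect is easy to reproduce. In the sketch below, Inner's UnmarshalJSON is deliberately trivial (its mere presence is what matters), and the ID field and decodeOuter helper are assumptions made up for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Inner struct {
	ID int `json:"id"` // hypothetical field
}

// UnmarshalJSON only repeats the default decoding through a method-less type.
func (i *Inner) UnmarshalJSON(b []byte) error {
	type plain Inner
	return json.Unmarshal(b, (*plain)(i))
}

// Outer does not define UnmarshalJSON, but it gets Inner's by promotion.
type Outer struct {
	Inner
	K any `json:"k"`
}

// decodeOuter is a hypothetical helper that wraps json.Unmarshal.
func decodeOuter(data string) (Outer, error) {
	var o Outer
	err := json.Unmarshal([]byte(data), &o)
	return o, err
}

func main() {
	o, err := decodeOuter(`{"id":1,"k":"v"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%d k=%v\n", o.ID, o.K) // prints "id=1 k=<nil>"
}
```

No error is reported: the k key is silently dropped because only Inner.UnmarshalJSON ever runs.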

If you want deserialization to proceed layer by layer, from the outside in, the outer struct must implement UnmarshalJSON as well. One possible implementation:

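type Outer struct {
	Inner
	K any `json:"k"`
}

func (o *Outer) UnmarshalJSON(b []byte) error {
	// Unmarshal into the embedded Inner using its own UnmarshalJSON.
	if err := json.Unmarshal(b, &o.Inner); err != nil {
		return err
	}
	// Unmarshal self. Alias is a pointer type with no methods, so the
	// promoted UnmarshalJSON is not found and default struct decoding runs.
	// Note that this pass also decodes Inner's exported fields by the default rules.
	type Alias *Outer
	if err := json.Unmarshal(b, (Alias)(o)); err != nil {
		return err
	}
	return nil
}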
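The two-pass approach above can be exercised with a complete program. This is a sketch under a few assumptions: the ID field, the unexported viaCustom marker, and the decodeOuter helper are hypothetical and exist only to show that both layers ran. A struct alias (type Alias Outer) would not work for the second pass, because the alias still contains the embedded Inner and therefore still gets the promoted method. That is why a pointer type is used.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Inner struct {
	ID        int  `json:"id"` // hypothetical field
	viaCustom bool // set by Inner.UnmarshalJSON; never read from JSON
}

func (i *Inner) UnmarshalJSON(b []byte) error {
	type plain Inner
	if err := json.Unmarshal(b, (*plain)(i)); err != nil {
		return err
	}
	i.viaCustom = true // marks that the custom logic ran
	return nil
}

type Outer struct {
	Inner
	K any `json:"k"`
}

func (o *Outer) UnmarshalJSON(b []byte) error {
	// Pass 1: let Inner apply its own decoding logic.
	if err := json.Unmarshal(b, &o.Inner); err != nil {
		return err
	}
	// Pass 2: decode through a method-less pointer type so that the
	// default struct decoding fills Outer's fields.
	type alias *Outer
	return json.Unmarshal(b, alias(o))
}

// decodeOuter is a hypothetical helper that wraps json.Unmarshal.
func decodeOuter(data string) (Outer, error) {
	var o Outer
	err := json.Unmarshal([]byte(data), &o)
	return o, err
}

func main() {
	o, err := decodeOuter(`{"id":1,"k":"v"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%d k=%v viaCustom=%t\n", o.ID, o.K, o.viaCustom)
}
```

One caveat with this design: the second pass decodes Inner's exported fields again using the default rules, so the sketch assumes Inner's JSON is also acceptable to the default decoder. If Inner's custom logic transforms a field or accepts a format the default decoder rejects, the second pass may overwrite that work or fail.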

Going further, with multiple levels of embedding, every level has to implement UnmarshalJSON. Once the innermost type implements UnmarshalJSON, every type that embeds it is affected, and a single oversight can lead to serious bugs.

For this reason, embedding is often better avoided. Declaring named fields explicitly wherever possible helps you avoid falling into these traps later.
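As a rough illustration of the named-field alternative, the sketch below holds Inner as an ordinary field. The field names, the inner JSON key, and the decodeOuter helper are assumptions chosen for the example. Note the trade-off: Inner's data now nests under its own key instead of being flattened into the outer object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Inner struct {
	ID int `json:"id"` // hypothetical field
}

func (i *Inner) UnmarshalJSON(b []byte) error {
	type plain Inner
	return json.Unmarshal(b, (*plain)(i))
}

// Outer holds Inner as a named field, so no methods are promoted
// and Outer is decoded by the default struct rules.
type Outer struct {
	Inner Inner `json:"inner"`
	K     any   `json:"k"`
}

// decodeOuter is a hypothetical helper that wraps json.Unmarshal.
func decodeOuter(data string) (Outer, error) {
	var o Outer
	err := json.Unmarshal([]byte(data), &o)
	return o, err
}

func main() {
	o, err := decodeOuter(`{"inner":{"id":1},"k":"v"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%d k=%v\n", o.Inner.ID, o.K) // prints "id=1 k=v"
}
```

Here Inner.UnmarshalJSON is called only for the nested inner value, and Outer needs no custom method of its own.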