May 30, 2020

Recursive Objects

Earlier we learnt about how we can implement a binary tree in Swift. Remember we started with a Class and not a struct.

Let's try to understand the problem with using a struct. Try compiling the code below.

struct Node {
    var next: Node

It will not compile and give you an error.

value type 'Node' cannot have a stored property that recursively contains it

The reason is, the compiler need to know the memory size of struct node so that it can allocate memory for variables / constants of this type in Stack. Now due to the recursive nature of the struct it means it is an infinite size and compiler cannot allow it.

Now, what do you think about the code below

struct Node {
    var next: [Node]

Well, it compiles fine, which feels a bit weird to some, if you are confused by this go ahead and read the explanation below.

If you think about it Array in Swift is also a value type meaning compiler need to know the size of an Array in advance. But also the array need to hold variable number of elements which it cannot know in advance and hence it need to allocate memory for those elements in heap and hold a reference (or pointer) to that memory.

The references are fixed size and so Array has a fixed size (size of a reference to actual array elements + other fixed size elements for book keeping and other implementation details.)

Now it is an Array's implementation detail that gives you value type semantic despite it containing a reference type. It does it lazily meaning copying only when required.

struct Node {
    var value: Int
    var kids: [Node]

var n1 = Node(value: 1, kids: [ Node(value: 2, kids: []) ])
var n2 = n1

When we copy n1 to n2, we expect a new memory for kids array of n2 and all n1's array elements copied to it. This is because being a value type we expect a value type semantic from Array. Now to optimize unnecessary copy operations Array can actually delay this copy operation and hold until one of n1 or n2 try to mutate their kids array (e.g. remove or add a new element).

So far essentially it mean both reference inside n1 and n2 are actually pointing to the same memory location where the elements are stored. 3, kids: []))

Now at this point, Array implementation will actually allocate a new memory for holding kids of n2, add the new Node in that memory and change the reference inside n2 to point to this memory.

All these details are Array's internal implementation details but it is good to know what is going on inside to explain some concepts we get confused with while using an Array.

Tagged with: