Often developers get confused with Slices with Arrays and maybe because of its syntax.
Slices are more powerful than traditional arrays but great power comes with great responsibility. Before jumping into the main topic today, let's spend some time understanding the significant difference between Arrays and Slices.
What is Slice and Array ?
Arrays:
Arrays are typed collections with a fixed size.
var arr [N]T // N - Size and T is a primitive Type
Eg: arr := [3]int32{1,2,3,4}
Arrays are fixed-length hence size cannot be changed.
Arrays are value types not reference types in Go. This means that when they are assigned to a new variable, a copy of the original array is assigned to the new variable.
arr1 := [3]int32{1,2,3}
arr2 := arr1
arr2[2] = 22
fmt.Println(arr2[2]) //Prints 22
fmt.Println(arr1[2]) //Prints 3
Slices:
Slices don't own any data instead it's just a reference to another array.
Slices hold a pointer to an existing array, so modifying any value in the slice may change the existing array (there is a catch here, will explain a bit later)
arr1 := [3]int32{1,2,3}
slice := arr1[:]
slice[1] = 11
fmt.Println(arr1[1]) //Prints 11
fmt.Println(slice[1]) //Prints 11
Slices are dynamic, meaning new elements can be easily added using the append function.
You can think of a Slice as a struct (SliceHeader)
type SliceHeader struct {
Data uintptr
Len int
Cap int
}
Here Data is a pointer to the first element(index 0) of the existing array because Slice doesn't hold any data.
All Fine!!! what's the problem ?
1. Garbage collection
As you know, Slices hold the reference to the existing array. As long as the Slice is in use, the existing array cannot be garbage collected. Let's suppose we have an array containing 10000 elements or objects and a slice wants a small part of it for processing. let's suppose 11 of them.
arr := [10000]int32{1,2,3,.....}
slice := arr[:11] //holds the reference to the existing array
The important thing to note here is that this existing array won't be garbage collected because the Slice has a reference to the existing array.
Looks like a problem, so what's the solution ?
One way to solve this is to create a new copy of the slice
Output:
arr address: 0xc000114000
[1 2] &{Data:824634851328 Len:2 Cap:5}
[1 2] &{Data:824634867792 Len:2 Cap:2}
if you closely look at the output
Line No 1 - if you decode 0xc000114000 to decimal then you will get 824634851328 which is the data pointer address of tempSlice (refer LineNo 2), which shows that tempSlice is referring to arr
LineNo 3 - The Data pointer address of the slice is 824634867792 which is different and it no longer depends on arr or tempSlice. so in the next GC cycle, arr and tempSlice will get garbage collected.
2. Capacity planning:
Slices provide the flexibility to add new elements dynamically unlike Arrays.
Note: getSliceHeader is a helper method that helps us to view the sliceHeader
Output:
before appending '0': &{Data:0 Len:0 Cap:0}
After appending '0': &{Data:824633811136 Len:1 Cap:1}
before appending '1': &{Data:824633811136 Len:1 Cap:1}
After appending '1': &{Data:824633811248 Len:2 Cap:2}
before appending '2': &{Data:824633811248 Len:2 Cap:2}
After appending '2': &{Data:824633819424 Len:3 Cap:4}
before appending '3': &{Data:824633819424 Len:3 Cap:4}
After appending '3': &{Data:824633819424 Len:4 Cap:4}
before appending '4': &{Data:824633819424 Len:4 Cap:4}
After appending '4': &{Data:824633843968 Len:5 Cap:8}
Final Result is: [0 1 2 3 4]
You may get these questions when you see the output:
Why Data (pointer) is changing when the capacity is exhausted?
Why capacity is getting doubled when Len reaches Cap?
That's a good question, let's talk about the reasons for that behavior:
nil slice starts off with empty capacity (check LineNo 1)
The capacity of the slice doubles while attempting to append a new item when its capacity and length are equal (check LineNo 4, LineNo 6)
When the capacity is doubled, we can also observe that the pointer to the backing array (i.e. the Data field value of reflect.SliceHeader struct) changes.
Wait, What's the problem with this ?
It's clear that you get a new backing array every time Len reaches the Cap if you don't have the capacity planned ahead, leading to an increase in time Complexity. Let's see some code to understand this better
Output:
before appending '0': &{Data:824633851984 Len:0 Cap:10}
After appending '0': &{Data:824633851984 Len:1 Cap:10}
before appending '1': &{Data:824633851984 Len:1 Cap:10}
After appending '1': &{Data:824633851984 Len:2 Cap:10}
before appending '2': &{Data:824633851984 Len:2 Cap:10}
After appending '2': &{Data:824633851984 Len:3 Cap:10}
before appending '3': &{Data:824633851984 Len:3 Cap:10}
After appending '3': &{Data:824633851984 Len:4 Cap:10}
before appending '4': &{Data:824633851984 Len:4 Cap:10}
After appending '4': &{Data:824633851984 Len:5 Cap:10}
[0 1 2 3 4]
Data (Pointer) is always pointing to 824633851984 because we initialize the backing array with a capacity of 10 and this makes sure that all the append operations have run in O(1) time.
3. Becareful with append:
You might be thinking slices are dynamic and append provides a way to add new elements to the end. Yes, that's correct !!!
But if you are not careful while appending then you may end up getting some bugs that are very hard to track down. Let's take an example
You guess the output would be:
[0 1 2 3]
[0 1 2 4]
But the output is:
[0 1 2 4]
[0 1 2 4]
But How ?
let me explain, I explained in previous sections that the slice Header will have Len and Cap, and capacity will get doubled when Len reaches capacity remember?
Let's take the help of the getSliceHeader function and instrument the code
Output:
x is initialized = [] &{Data:0 Len:0 Cap:0}
After appending 0, x = [0] &{Data:824634433544 Len:1 Cap:1}
After appending 1, x = [0 1] &{Data:824634433616 Len:2 Cap:2}
After appending 2, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}
After appending 3, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}
After appending 3, y = [0 1 2 3] &{Data:824634523712 Len:4 Cap:4}
After appending 4, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}
After appending 4, z = [0 1 2 4] &{Data:824634523712 Len:4 Cap:4}
After appending 4, y = [0 1 2 4] &{Data:824634523712 Len:4 Cap:4}
I will clear the confusion around append using the above output.
x is initialized = [] &{Data:0 Len:0 Cap:0}, Empty slice has been created
.........
After appending 2, x = [0 1 2] &{Data:824634523712 Len:3 Cap:4}, After appending 2 the slice capacity got doubled because size reached its Len and Capacity is 4. That means we have one extra space to accommodate to reach the Cap
After appending 3, y = [0 1 2 3] &{Data:824634523712 Len:4 Cap:4}, After appending 3 to x the Len and Cap has been reached to 4 for y slice header. Still, the x slice header is still having Len 3 and Cap 4 which means there is one more space left to reach the Cap
After appending 4, z = [0 1 2 4] &{Data:824634523712 Len:4 Cap:4}, After appending 4 to x the Len and Cap has been reached to 4 for z slice header. Still, the x slice header is still having Len 3 and Cap 4 which means there is one more space left to reach the Cap
Hope it's clear now, x Len is still pointing to 3 and that's the reason for the override
How to fix these ?
One way to fix this is by using a copy
Output:
[0] &{Data:824634433544 Len:1 Cap:1}
[0 1] &{Data:824634433616 Len:2 Cap:2}
[0 1 2] &{Data:824634523712 Len:3 Cap:4}
[0 1 2 4] &{Data:824634425440 Len:4 Cap:6}
[0 1 2 3] &{Data:824634425488 Len:4 Cap:6}
FYI: Copy function won't copy anything if the slice is empty/nil, so you should initialize a slice with some length, check Line No 17,18 for more info.
4. Append on a Sliced Slice:
sometimes appending on a sliced slice can modify the original slice
Output:
a: [1 2 3 4 5], sliceHeader: &{Data:824634425392 Len:5 Cap:5}
b: [3 4], sliceHeader: &{Data:824634425408 Len:2 Cap:3}
a: [1 2 3 4 20], sliceHeader: &{Data:824634425392 Len:5 Cap:5}
b: [3 4 20], sliceHeader: &{Data:824634425408 Len:3 Cap:3}
you might be thinking why a has been changed ?
To understand more about the issue, let's understand how sliced slice [2:4] works internally
When you slice a slice using slice expression like [low: high] sliced slice gets a new capacity.
i.e; Cap = Len - low; (Len = Original Slice length)
b gets a capacity, i.e; 3 (5-2) from the original slice, and this space is shared between the original slice and the sliced slice.
I think now you got the answer, why changing the b slice has changed the a slice. One way to fix this is
Output:
a: [1 2 3 4 5], sliceHeader: &{Data:824635383808 Len:5 Cap:5}
b: [3 4 5], sliceHeader: &{Data:824635383824 Len:3 Cap:3}
a: [1 2 3 4 5], sliceHeader: &{Data:824635383808 Len:5 Cap:5}
b: [3 4 5 20], sliceHeader: &{Data:824635383904 Len:4 Cap:6}
In the above example, we can see that the capacity and length of slice b was 3, and calling append on slice b triggered the grow logic which meant that the values had to be copied to a new array with double the capacity i.e; 6, and that's the reason Original Slice a was not impacted because of this change
Another way to fix the above issue would be to use a copy instead of an append ;)
Conclusion:
Slices in go are very powerful and they are memory-efficient, but unlike arrays, they are not straight forward and devs need to be extra careful while using slices or else you end up wasting a lot of time to track down bugs ;)
References:
Very nice Vasu. thanks for sharing knowledge
Good one Vasu..I have one doubt, how copying existing slice to new slice, would make it eligible for GC? arr :=[10000]int32{1,2,3,.....} tempSlice := arr[:11] slice :=make([]int32,len(tempSlice))copy(slice,arr[:1])fmt.Println(slice) Here tempSlice is still referencing arr right?
Nice one Vasu💯