Go is known for its concise and simple syntax, but even Go has many pitfalls that you may encounter in your work. In this article, I’ll break down common mistakes with examples and tell you how to avoid them.
Arrays and Slices
Let’s start with the basic concepts:
An array is a sequence of elements of a certain type with a fixed length. Its size cannot change after creation, and its capacity is always equal to its length.
A slice is a lightweight structure built on top of an array, with the possibility to change its length.
Common mistakes when working with slices
To understand how slices work, you need to understand their structure. In the code below, you can see the fields for length and capacity and the pointer to the underlying array on top of which the slice is built.
type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}
There are two things to remember about slice length and capacity:
- When we create a new slice, its length is equal to its capacity, unless otherwise specified.
- How exactly the capacity of a slice grows when it runs out of space: since Go 1.20, small slices roughly double their capacity on reallocation, while larger ones grow by about 25% (see the sketch below).
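A quick sketch to see both points in action (the exact capacity values depend on the Go version and are shown only as an illustration):

package main

import "fmt"

func main() {
    s := make([]int, 3) // no capacity specified: len == cap == 3
    fmt.Println(len(s), cap(s))

    for i := 0; i < 5; i++ {
        s = append(s, i)
        // capacity grows in jumps each time the backing array is reallocated
        fmt.Println(len(s), cap(s))
    }
}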
Since in Go all arguments are passed to functions by value, when you pass a slice, a copy of this header structure is passed as the argument: the pointer to the backing array plus the length and capacity. The backing array itself, with all the data it contains, is not copied. If you don’t know this, you may get unexpected results.
Let’s consider some examples further.
Reallocating memory for a new array
func changeSliceValues(s []int) {
    s[0] = 1
}

func main() {
    slice := []int{0}
    fmt.Println(slice) // [0]
    changeSliceValues(slice)
    fmt.Println(slice) // [1]
}
Here, a slice containing a single zero is declared in main and then passed to the changeSliceValues function, where a one is written to index zero. If we print the slice in main before and after the call, we see [0] and [1] respectively.
func changeSliceValues(s []int) {
    s[0] = 1
    s = append(s, 2)
    s[0] = 3
}

func main() {
    slice := []int{0}
    fmt.Println(slice) // [0]
    changeSliceValues(slice)
    fmt.Println(slice) // [1]
}
Now let’s change the example a bit: in changeSliceValues, after writing to index zero, we append 2 to the end of the slice and then write 3 to index zero. Despite these changes, the prints in main still show [0] and [1]. The length of the slice has not changed even though we called append, and the second write to index zero did not take effect either.
In fact, everything becomes clear if we remember the previously mentioned facts and the structure of the slice. At the very beginning, we created a slice whose length equals its capacity, i.e. one.
When changeSliceValues is called, a copy of the slice header is passed as the argument. It points to the same underlying array as the slice in main. For this reason, the first write to index zero is applied to the original array that was created when the slice was initialized in main.
However, when we then call append, the slice's length already equals its capacity, so memory is allocated for a new array. All values from the old array are copied there, and any further work with this slice has no effect on the original slice in main.
The next write to index zero is again performed on the new array. It has no effect on the original array that was created when the slice was declared in main.
Copying slices
You may encounter the same problem if you try to copy a slice.
func main() {
    slice := []int{0, 0, 0}
    newSlice := slice[0:2]
    newSlice = append(newSlice, 1)
    fmt.Println(slice) // [0 0 1]
}
In this example, the header of the original slice, including the pointer to the array with the data, is copied into the variable newSlice by the slicing operation. When append is executed, the data is overwritten in the original array, because newSlice points to the same array as the first slice.
Go has a special built-in copy function that will allow you to safely copy any slice.
func main() {
    slice := []int{0, 0, 0}
    newSlice := make([]int, 2)
    copy(newSlice, slice)
    newSlice = append(newSlice, 1)
    fmt.Println(slice)    // [0 0 0]
    fmt.Println(newSlice) // [0 0 1]
}
In the example above, you can see that by using copy we transferred the elements from the original slice into the new one. Now we can safely call append without fear of overwriting the original data.
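Another option worth knowing (a small sketch, not from the original example) is the full slice expression, which caps the capacity of the new slice so that append is forced to allocate a fresh array:

func main() {
    slice := []int{0, 0, 0}
    newSlice := slice[0:2:2]       // full slice expression: len 2, cap 2
    newSlice = append(newSlice, 1) // capacity exceeded, so a new array is allocated
    fmt.Println(slice)    // [0 0 0]
    fmt.Println(newSlice) // [0 0 1]
}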
Working with strings and runes
Let’s say we want to parse some news portal and cache the first hundred characters of each news item to show a preview to the user.
func receiveArticle() string {
    ...
}

func consumeNewsArticles() {
    for {
        article := receiveArticle()
        storeArticlePreview(getArticlePreview([]rune(article)))
    }
}

func getArticlePreview(article []rune) []rune {
    return article[:100]
}
In an infinite loop we receive news articles, take the first hundred runes of each and pass them to a storeArticlePreview function that takes care of storing the runes in the cache.
However, when the service goes live it will eat much more RAM than planned, because it has a leak. Taking the first hundred runes from a news item creates a slice that is one hundred elements long, but its capacity remains the same as that of the original slice. As a result, the whole array with the news text stays in memory, even though in the end we only ever access the first hundred elements.
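One possible fix (a sketch under the assumptions of this example) is to copy the preview into a fresh slice, so the large backing array of the full article can be garbage collected:

func getArticlePreview(article []rune) []rune {
    n := 100
    if len(article) < n {
        n = len(article)
    }
    preview := make([]rune, n)
    copy(preview, article) // only the preview runes live in the new backing array
    return preview
}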
By the way, why did we convert the string to a slice of runes in this example before taking the first 100 characters from it? Let me show by example how runes differ from bytes.
func main() {
    hello := "Hello World"
    helloRunes := []rune(hello)
    fmt.Println(helloRunes[:5])         // [72 101 108 108 111]
    fmt.Println(string(helloRunes[:5])) // Hello
    fmt.Println(hello[:5])              // Hello
}
We take the standard string Hello World and make a separate variable with runes. By design we want to print the first five characters, i.e. the word Hello.
If you look at the output, there will be nothing unusual in it. First the runes themselves will be printed, then when we convert it to a string, the word Hello is printed. When we take the first five elements from the string, Hello is printed again.
There seems to be no difference between runes and bytes. But here’s what happens if you say hello in Chinese.
func main() {
    hello := "你好世界"
    helloRunes := []rune(hello)
    fmt.Println(helloRunes[:2])         // [20320 22909]
    fmt.Println(string(helloRunes[:2])) // 你好
    fmt.Println(hello[:2])              // �
}
By design, the first two characters should be printed, but plain string slicing doesn’t work here: a garbled character is printed instead of the expected characters.
Remember that strings in Go are UTF-8 encoded, and a single character can be represented by more than one byte. If we take a slice of a string, we are working with its bytes, not its characters. So when we try to take the first two characters from the string, we actually take its first two bytes.
Most string operations work with bytes, but there are exceptions. Slicing a string gives bytes, and the len function also reports the length in bytes. The for range loop, however, uses the byte index where a character starts as the index, but the value is not the byte: it is the rune that starts at that index.
func main() {
    hello := "你好世界"
    fmt.Println(hello[:2])  // bytes
    fmt.Println(len(hello)) // length in bytes
    for i, c := range hello {
        fmt.Println(i, c) // byte index, rune
    }
}
Often you can simply convert a string to a rune slice and work with that. But don’t forget about the overhead: you end up with two variables for each string, one holding the original string and the other holding the array of runes. If there are many strings and they are long, this can make a difference. Fortunately, Go has compiler optimizations that avoid the extra allocation in certain situations.
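If you only need the number of characters rather than the characters themselves, the standard unicode/utf8 package lets you avoid allocating a rune slice at all; a small sketch:

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    hello := "你好世界"
    fmt.Println(len(hello))                    // 12: length in bytes
    fmt.Println(utf8.RuneCountInString(hello)) // 4: length in runes, without converting the string
}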
Channels
Channels are a synchronization primitive that gives one goroutine the ability to send data to another and allows secure access to shared data.
Two questions arise when working with channels: who should close them and whether it is necessary to do so at all. It is important to know what can happen when working with a channel in its different states.
Here is what happens when performing different operations on a channel in each of its states:
- nil channel: send and receive block forever, close causes a panic.
- open channel: send and receive work as usual, close succeeds.
- closed channel: receive returns the remaining values and then the zero value together with false, while send and close cause a panic.
For example, reading from a closed channel causes no problems: we just get the default value and false, which signals that the read from the channel failed. However, writing to a closed channel or closing an already closed channel causes a panic. Conclusion: the goroutine that writes to the channel should be the one to close it.
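A small sketch (not from the original article) showing reads from a closed channel:

func main() {
    ch := make(chan int, 1)
    ch <- 42
    close(ch)

    v, ok := <-ch
    fmt.Println(v, ok) // 42 true: the buffered value is still delivered
    v, ok = <-ch
    fmt.Println(v, ok) // 0 false: the channel is closed and drained
}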
Now let’s try to answer the question: why close a channel at all? Let’s turn to the documentation: “a sender may close a channel to indicate that values will no longer be sent”. If it is the sender who closes the channel, then someone other than the sender must need that signal, for example the channel’s reader. Let’s look at an example of when this might be useful:
func writeToChan(ch chan<- int) {
    ch <- 1
    ch <- 2
    ch <- 3
    close(ch)
}

func main() {
    ch := make(chan int)
    go writeToChan(ch)
    for value := range ch {
        fmt.Println(value)
    }
    // some logic
}
Here we have an aptly named writeToChan function that writes to a channel, and a range loop over this channel in main that reads the values. If we don’t close the channel, the loop will never end and a deadlock will occur. If we do close it, the for range completes successfully and the program logic continues.
You only need to close a channel when the reader has to react to it in some way. Otherwise it’s fine to leave the channel open: the garbage collector can collect it even in that state.
time.After
Since we have discussed channels, let’s talk about constructs that use channels. One of them is time.After. This function returns a channel on which a value will be delivered after the given delay. It is usually used to create timers or to set timeouts for executing certain logic in programs.
func consumer(ch <-chan Event) {
    for {
        select {
        case event := <-ch:
            handle(event)
        case <-time.After(time.Minute * 15):
            fmt.Println("warning: no messages received")
        }
    }
}
Here is an example where we process some events that come from some queue. If we haven’t encountered a single event in the channel in 15 minutes, we’ll throw a warning that something has happened.
If we run the code, it will work. But if there is a dashboard where we track memory consumption and the flow of events is large, we will easily discover a memory leak. With an average flow of a million messages per 15 minutes, the leak is on the order of 200 MB: one Go channel weighs about 200 bytes, and a new channel is created and leaked for every event.
Why does this happen, if the garbage collector can collect a channel that was never closed, and the channel created by time.After goes out of scope after each new event?
Let’s refer to the documentation: “The underlying Timer is not recovered by the garbage collector until the timer fires”. That is, each time.After that we set for 15 minutes hangs around as dead weight for that entire time, even if it is no longer in scope.
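One common workaround (a sketch, not the only option) is to create a single timer with time.NewTimer and reuse it on every iteration instead of calling time.After each time:

func consumer(ch <-chan Event) {
    timer := time.NewTimer(15 * time.Minute)
    defer timer.Stop()

    for {
        select {
        case event := <-ch:
            handle(event)
            // reuse the same timer instead of allocating a new channel on every event
            if !timer.Stop() {
                <-timer.C // drain the channel if the timer has already fired
            }
            timer.Reset(15 * time.Minute)
        case <-timer.C:
            fmt.Println("warning: no messages received")
            timer.Reset(15 * time.Minute)
        }
    }
}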
Goroutines
A goroutine is a lightweight thread of execution that lives in user space, while operating system threads live in kernel space. Goroutines are managed by the Go runtime, while threads are managed by the operating system.
Goroutines are designed to be more efficient than traditional operating system threads. But they also have a few traps that you can fall into when working with them.
In the example below, we create a slice of the numbers from 1 to 5 and start a goroutine for each element in a loop; each goroutine adds its number to the sum variable. You might expect the output to be 15, the sum of the numbers from 1 to 5, but it is not.
func main() {
    digits := []int64{1, 2, 3, 4, 5}
    var sum int64 = 0
    var wg sync.WaitGroup
    for _, value := range digits {
        go func() {
            wg.Add(1)
            defer wg.Done()
            atomic.AddInt64(&sum, value)
        }()
    }
    wg.Wait()
    fmt.Println(sum)
}
The problem here is closures: functions that capture variables from the enclosing scope. In the example, the anonymous function created in the loop is a closure, because it captures the value variable from the outer scope.
What makes them tricky is how the captured variable is used. Closures don’t capture the variable’s value at the moment they are created; they capture a reference to the variable. By the time the goroutines start executing, the loop has often already finished, and value holds the last element of the slice. And since there is no guarantee that the loop finishes before some of the goroutines start running, different goroutines may see different values. As a result, the sum variable ends up containing something other than 15.
This is such a common problem that the Go maintainers decided to change the semantics of for loop variables so that each iteration gets its own copy, preventing unintended use in closures and goroutines. The corresponding experiment appeared in version 1.21, and since version 1.22 this problem no longer reproduces. But since 1.22 is still fresh and not everyone has had time to update yet, keep this peculiarity of closures in mind.
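A minimal sketch (separate from the example above) that shows the capture semantics on their own, without goroutines:

package main

import "fmt"

func main() {
    funcs := make([]func(), 0, 3)
    for _, v := range []int{1, 2, 3} {
        funcs = append(funcs, func() { fmt.Println(v) })
    }
    for _, f := range funcs {
        f() // before Go 1.22 this prints 3 3 3; since Go 1.22 each closure gets its own v and it prints 1 2 3
    }
}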
The sync and atomic packages
In the examples above we used sync.WaitGroup to wait for the goroutines to finish and, by the way, we did it wrong. Who noticed? It’s worth paying attention to where we call wg.Add and thinking about what it actually does.
Let’s figure it out, starting with the WaitGroup structure itself:
type WaitGroup struct {
    noCopy noCopy
    state  atomic.Uint64
    sema   uint32
}
In the WaitGroup structure we see a semaphore and a certain noCopy field. First let’s talk about the semaphore, or rather about the fact that WaitGroup is a simple wrapper over a semaphore with three methods:
- .Add(delta int) increments the value of the semaphore by the passed value.
- .Done() decrements the semaphore value by one.
- .Wait() blocks execution until the semaphore value is equal to 0.
So, the problem with the goroutines we are launching is that there is no guarantee they will start running before .Wait is called. That means .Wait may return before .Add is executed. Since there is no guarantee of the order in which the goroutines run, we risk incorrectly assuming that all goroutines have finished, even though some of them have not even started yet.
go func() {
    wg.Add(1)
    defer wg.Done()
    sum += v
}()
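The fix is to call wg.Add before launching the goroutine, so that by the time .Wait runs the counter already accounts for every goroutine. A corrected sketch of the earlier example (with the loop variable also passed as a parameter for pre-1.22 compatibility):

for _, value := range digits {
    wg.Add(1) // register the goroutine before launching it, not inside it
    go func(v int64) {
        defer wg.Done()
        atomic.AddInt64(&sum, v)
    }(value)
}
wg.Wait()
fmt.Println(sum) // 15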
Now let’s go back to the WaitGroup structure and take a closer look at the noCopy field of type noCopy. What is it? As the name suggests, it marks something that must not be copied, and this field appears in most of the sync package structures. Let’s see what happens if you copy a structure with this marker. For this example we will use a mutex, which is also protected against copying in the same way.
type Counter struct {
    m        sync.Mutex
    counters map[string]int
}

func (c Counter) increment(key string) {
    c.m.Lock()
    defer c.m.Unlock()
    c.counters[key]++
}

func (c Counter) IncrementMultiple(key string, n int) {
    for i := 0; i < n; i++ {
        c.increment(key)
    }
}

func main() {
    c := Counter{counters: map[string]int{"key1": 0, "key2": 0}}
    go c.IncrementMultiple("key1", 100000)
    go c.IncrementMultiple("key1", 100000)
    time.Sleep(300 * time.Millisecond)
    fmt.Println(c.counters)
}
In this program we have a Counter structure that stores a map and a mutex that is supposed to protect the map from concurrent writes. The mutex is marked as non-copyable, just like WaitGroup.
Two methods are defined on the Counter structure: one increments the value of a given key by one, the other by the passed amount, calling the first method in a loop. In main we initialize the Counter, run two goroutines that increment the value of the same key, sleep to give the goroutines time to run, and print the values that end up in our counters map. Unfortunately, we will never see that print, because the program crashes with a panic.
fatal error: concurrent map writes
<goroutines stack>
Process finished with the exit code 2
The problem with the code is that every time increment is called, our Counter c is copied into it, because increment is defined on the Counter type, not on *Counter. In other words, it is a value receiver, not a pointer receiver. Because of this, increment cannot change the original Counter variable we created in main, and on every call the counter with all its contents, including the mutex, is copied.
Now let’s remember that a mutex is essentially a wrapper over a semaphore, and when we copy it, we copy that state as well. The copy and the original then live separate lives: each goroutine locks its own copy of the mutex, while the map, being a reference type, stays shared between them, so nothing actually protects it from concurrent writes. That’s why copying a mutex is wrong.
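A sketch of the fix: define the methods on *Counter so that every call works with the same mutex instance guarding the shared map:

func (c *Counter) increment(key string) {
    c.m.Lock()
    defer c.m.Unlock()
    c.counters[key]++
}

func (c *Counter) IncrementMultiple(key string, n int) {
    for i := 0; i < n; i++ {
        c.increment(key)
    }
}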
So, with the same noCopy trick you can mark any structure as non-copyable (many structures from the sync package are marked this way). Then the go vet command can detect the places where the marked structure is copied and point out a potential problem in your application code.
Atomic
Now let’s move on to another common synchronization primitive: atomics. They provide safe access to shared memory for reading, writing and modifying variables. In addition, atomic operations are generally faster than mutex operations due to the use of a special set of processor instructions. However, this advantage comes with a disadvantage that is occasionally forgotten: atomic operations are atomic individually, but not atomic all together.
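The code for this example is not included above; here is a minimal sketch reconstructing it from the description (the variable names are assumptions):

package main

import (
    "fmt"
    "sync/atomic"
)

func main() {
    var num int64

    go func() {
        for {
            atomic.AddInt64(&num, 1) // constantly increment num
        }
    }()

    for {
        if atomic.LoadInt64(&num)%2 == 0 {
            // num may change between the check above and the print below
            fmt.Println(atomic.LoadInt64(&num))
        }
    }
}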
In this program, a goroutine is started that constantly increments the variable num by one in an infinite loop. At the same time, main contains an infinite loop that checks whether the number is even and, if so, prints it. Yet at startup we can see the number 287 printed, which is distinctly odd. This happens because after num passes the parity check its value is not protected from changes, and the goroutine incrementing num manages to change it before the number is printed.
defer
defer allows you to postpone the execution of a function call until the function in which it was called returns. It is typically used to ensure that resources are freed, such as closing a file or unlocking a mutex, regardless of how the function terminates: a normal return, an error, or a panic.
type ProfileType string

const (
    SimpleProfile     ProfileType = "simple"
    InvestmentProfile ProfileType = "investment"
    BusinessProfile   ProfileType = "business"
)

type Profile struct {
    Type ProfileType
}

func (p *Profile) GetBalance() (balance int) {
    switch p.Type {
    case BusinessProfile:
        return p.getBusinessProfileBalance()
    case InvestmentProfile:
        return p.getBusinessProfileBalance()
    case SimpleProfile:
        return p.getBusinessProfileBalance()
    default:
        panic("unknown profile type")
    }
}
Here we see the Profile structure and several possible types for it, as well as the GetBalance() method, which selects one or another balance calculation method depending on the profile type. Now suppose we want to add a log line with the total balance obtained during the calculation:
type ProfileType string

const (
    SimpleProfile     ProfileType = "simple"
    InvestmentProfile ProfileType = "investment"
    BusinessProfile   ProfileType = "business"
)

type Profile struct {
    Type ProfileType
}

func (p *Profile) GetBalance() (balance int) {
    defer fmt.Println("profile balance:", balance)
    switch p.Type {
    case BusinessProfile:
        return p.getBusinessProfileBalance()
    case InvestmentProfile:
        return p.getBusinessProfileBalance()
    case SimpleProfile:
        return p.getBusinessProfileBalance()
    default:
        panic("unknown profile type")
    }
}
And as a result of the added log we will always see the entry “profile balance: 0”. Why is it so?
Let’s take a closer look at what the language documentation says about defer: “The arguments to the deferred function (which include the receiver if the function is a method) are evaluated when the defer executes, not when the call executes”. In our example, at the moment the defer statement is executed the balance variable still holds its default value of 0, and that is the value our print is called with. To get the result we actually wanted, i.e. to have the final amount appear in the print, we can use a concept we have already met: closures.
defer func() {
    fmt.Println("profile balance:", balance)
}()
The anonymous function has no arguments; the balance variable is used inside its body. The closure therefore stores a reference to the variable, and its actual value is read only when the anonymous function runs.
Interfaces
Interfaces in Go provide code flexibility, allowing you to write universal functions that can work with different data types implementing the same interface. However, not everything is smooth with them too. Let’s take a look at this code example:
type Requester interface {
    MakeRequest() int
}

type ConcreteRequester struct {
    someField int
}

func (r *ConcreteRequester) MakeRequest() int {
    return r.someField
}

func makeRequester(someVal int) Requester {
    var requester *ConcreteRequester
    if someVal > 0 {
        requester = &ConcreteRequester{someField: someVal}
    }
    return requester
}

func main() {
    requester := makeRequester(0)
    fmt.Println("got requester: ", requester)
    if requester == nil {
        fmt.Println("requester is nil")
    } else {
        fmt.Println("requester is not nil")
    }
}
We have a Requester interface that makes some request and returns an int in response, say the status code of the request. There is a ConcreteRequester type that implements the Requester interface. There is also a constructor function that initializes a Requester depending on the value passed in: if it is greater than zero, we return a ConcreteRequester instance; if it is less than or equal to zero, we simply return an uninitialized variable. The logic in main first prints the requester and then prints another message depending on whether it is nil or not.
If we run the program, we see a funny output:
got requester: <nil>
requester is not nil
We got a nil requester, but it is not nil.
To figure it out, we need to take a closer look at the interfaces – more precisely, at how they are organized under the hood.
type eface struct {
    _type *_type
    data  unsafe.Pointer
}

type iface struct {
    tab  *itab
    data unsafe.Pointer
}

type itab struct {
    _type *_type
    ...
}
Under the hood, there are two structures for interfaces: eface for the empty interface and iface for interfaces that define a method set the type must satisfy. What interests us now are the fields they have in common: the type of the concrete value stored in the interface and a pointer to the memory where that value lives. For two interface values to be equal, both of these fields must be equal, and an interface equals nil only when both the type and the value are nil.
Now let’s see what exactly lies in these fields for our requester variable, using fmt.Printf("requester=(%T,%v)\n", requester, requester):
requester=(*main.ConcreteRequester,<nil>)
Aha, so that’s where the problem comes from! Although the actual value of the variable is nil, its type is not, which causes the comparison requester == nil to be false.
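A sketch of one way to avoid this: return an explicit untyped nil from the constructor instead of a typed nil pointer:

func makeRequester(someVal int) Requester {
    if someVal > 0 {
        return &ConcreteRequester{someField: someVal}
    }
    return nil // an untyped nil interface value, not a nil *ConcreteRequester
}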
This behavior is related to so-called value boxing and deserves a separate article of its own.
Peculiarities of vendoring
Let’s imagine that you have created a Go library that makes some network requests. Inside this library you have implemented a client that can make requests, receive data in response and return it as the structures described in the models folder.
Now let’s try to use this library in some service. We add it to go.mod, run go mod tidy and go mod vendor in the console, look into vendor, and, unexpectedly, only part of the files and folders of your library is there.
For those who have not studied how vendoring works, this will seem like something strange. Well, let’s go to the language documentation for answers:
“The go mod vendor command constructs a directory named vendor in the main module’s root directory containing copies of all packages needed to build and test packages in the main module.”
And everything falls into place: only the packages needed to successfully build and test the application end up in vendor. That is, if somewhere in the service we initialize the client from the library, only the packages required for that are vendored.
This situation may seem like just an unexpected feature of the language. In fact, it’s a subtle hint that it’s not a good idea to put the logic for calling an external service inside a library. Doing so increases coupling and also limits the ability of consumer services to customize how the library interacts with external services.