DEV Community

Evan Lin
Evan Lin

Posted on • Originally published at evanlin.com on

Go Channel and Pipeline: A Conference Review

Preface

Hello everyone, I am Evan Lin, a Technical Evangelist at LINE Taiwan. I am very happy to return to the Golang community and participate in the "GolangTaipeiGathering#423" meetup event.

-

Community GolangTW Meetup: https://www.meetup.com/golang-taipei-meetup/

-

This event webpage: Event URL

This time, the main purpose of participating in the community meetup is to explain a problem that is often encountered in data processing, to explain a frequently confusing issue in Go(lang), which is the difference between Buffered/UnBuffered Channels, how to apply them, and how to complete a Data Pipeline through Channels.

Introducing the go channel and pipeline buffered/unbuffered channel / Senior Development Technology Promotion Engineer Evan Lin

Slides

When I was working on the Machine Learning Platform before, I had the opportunity to develop a data processing pipeline. When developing related functions that can handle data processing, the most important thing is to hope that the data processing speed can be as fast as possible. Assuming a photo needs to be judged for a face, the normal approach is to read the image (assuming it takes 100 ms) and then perform face recognition on the image (300 ms), and finally store the data in a database or an intermediate medium (file) (10 ms). This might take 100 + 300 + 10 = 410 ms for an image file, and when dealing with a large amount of data, this time consumption is incalculable. Therefore, at this time, it is necessary to use goroutines and channels to process these data in large quantities. Next, I will explain the differences and usage of Buffered/UnBuffered Channels from the beginning.

Buffered/UnBuffered Channel

When explaining goroutines and channels, the most common questions discussed and understood are "What is an UnBuffered Channel, and what is a Buffered Channel? What are the differences and where are they applied?". Before explaining Buffered and UnBuffered, let's understand the basic characteristics of Channels:

  • Channels are goroutine safe
  • Channels carry messages between goroutines
  • FIFO semantics
  • Channels cause blocking and unblocking

Next, let the author use a few examples to give the reader a clear understanding:

First, let's take an UnBuffered Channel:

This example can be executed directly in Playground.

This is a simple example of using an UnBuffered Channel. The method is to call a goroutine in main() to assign a value (send) to a channel ch, and finally receive (receive) the value from ch.

There are a few small things to note here:

  • Since it is an UnBuffered Channel, both sending (send) and receiving (Receive) will cause a block, especially when the UnBuffered Channel is already sending a value, sending a second value will cause the goroutine to block and must wait for the value to be received (Receive) before continuing to process.
  • If in main() you send a value (send) to an unbuffered channel but do not receive (Receive), it will cause the deadlock detector to detect and report an error fatal error: all goroutines are asleep - deadlock!, but it will not generate an error in the goroutine because it will only generate a goroutine leak. (Note: It will not if you call a function call of libsystem, you can refer to Go issue 33004)
  • If you send (send) a value to an already closed channel, there will be a problem, but receiving a value will only return an empty value.

Next, let's give an example for Buffered Channe.l:

The related example can be executed directly in Playground.

The second example is a simple Buffered Channel example. Comparing it with the previous example, you can understand that Buffered Channel can send and receive more values than UnBuffered Channel without waiting (or blocking). Because of this, it is generally recommended to put data that needs to be processed synchronously in a Buffered Channel, which allows several goroutines to operate on a Buffered Channel.

Golang Pipeline pattern

Next, let's explain a method that is often used in data processing, which is to process pipelines through channels. Here, I will take an example from the Golang Blog: Go Concurrency Patterns: Pipelines and cancellation:

The related example can be executed directly in Playground

This is a simple example to pass data one by one to each function through UnBuffered Channels. This example mainly inputs a series of numbers, gen() sends them into the channel, and sq() receives them from the channel, squares the value, and then sends it to another channel, and finally uses this out channel to print the received values.

You can pay attention to the following things in this example:

  • The common pattern of each function in the pipeline is to create an output channel (out) and receive the input channel (in) through a goroutine.
  • Since data is passed through UnBuffered Channels, only one will be processed each time. Through the above example, 2 will be processed first and then 3.

There are the following advantages to processing data through the Pipeline pattern:

  • Each processing function can focus on its functional part.
  • Because it is pipeline processing, each processing step can be superimposed or the order can be adjusted.

The most important thing is that you can use goroutines to make the data processing pipeline become multiple threading, and the details will be explained next.

Advanced Pipeline patterns

There are the following types in the advanced pattern of Pipeline:

  • Fan-In:
    • Merge multiple channels into a single channel. Mainly used to unify and collect data after Fan-Out.
  • Fan-Out:
    • Conversely, it is to distribute the data in a single Channel to multiple Channels, mainly used for the preparation of parallel processing.

The main application scenario is to transmit data to multiple Channels through Fan-Out, so that you can connect the relevant Pipeline (or process the same Pipeline in parallel) through individual Channels, and finally use Fan-In to organize the data processed in parallel into a Channel, as the basis for subsequent data storage or related processing.

As an example, the relevant example code for Fan-In is as follows:

View the complete runnable code on Playground.

In the processing of merge(), sync.WaitGroup will be used to handle the need to wait in the goroutine to correctly close out, and multiple go output(c) will be used to process the integration of data through goroutines.

Explicit cancellation

In addition, in addition to Fan-In and Fan-Out, because the data processing of the Pipeline often requires a relatively long time, it is also possible to need to pause or cancel the entire process in the middle of processing. At this time, you will need to use the Explicit cancellation method, and make some modifications to the merge() from the previous section:

The complete executable code can be seen in Playground.

Through this code, you can understand that the parameter done chan struct{} passed in is mainly used to control the logic through UnBuffered channels, and among them:

for n := range c {
            select {
            case out <- n:
            case <-done:
            }
        }

Enter fullscreen mode Exit fullscreen mode

is the main logic judgment for whether to stop, select is very flexible in Go, in addition to making judgments about the situation, it can also use a variety of Patterns, and this example is the most commonly used in goroutines. There are two situations that can be judged, and select will automatically go to the judgment that does not cause a block. That is to say, if there is a value that can be received in the done channel, then it will not wait for the out channel to receive the value.

Useful pipeline package - Go-kit

https://github.com/go-kit/kit is a very useful package tool, which provides many useful small tools, including circuitbreaker, ratelimit and other useful tools. But in fact, there is a useful pipeline package called endpoint.

Here is a piece of example code, which demonstrates how to use the endpoint package of go-kit to implement a simple pipeline function in Playground. Readers may also find: When can you start using 3rd party packages in Playground?

The #golang playground now supports third-party imports, pulling them in via https://t.co/IrUZXimjCk ... https://t.co/5Ng5JXpggq 🎉

Multi-file support & few other things up next.

Report bugs at https://t.co/kZELNa2yzY or here on the tweeters.

— Brad Fitzpatrick (@bradfitz) May 13, 2019

According to this tweet, it turns out that after May 2019, the functions of Playgound have been updated to support thrid-party import. This makes it easier to discuss problems through Playground.

Go-kit endpoint Pipeline + Buffered Channel

Finally, let's implement a simple image reading Pipeline through the pipeline package of Go-kit's endpoint and the concept of Buffered Channel.

This example code determines how many goroutines are needed to handle related functions, which is ImageReader(in, out), by inputting numOfImageReaders. And the default size of the Buffered Channel is 1000 to allow all data to be processed immediately without waiting for the pipeline backend to finish processing.

Summary:

This article explains the Pipeline pattern in detail by introducing Buffered/UnBuffered Channels, and explains how to apply Channels to complete the Pipeline through a practical case (photo data reading and processing).

More related content:

-

Go Concurrency Patterns: Pipelines and cancellation https://blog.golang.org/pipelines

-

Advanced Go Concurrency Patterns https://blog.golang.org/advanced-go-concurrency-patterns

Go: issue 33004 - Deadlock detector will disable when use Mac OSX (channel with net.http, no deadlock, why)

Refactoring in Goland for beginner / Julian Chu

This article mainly uses JetBrains' product Goland to quickly help developers with refactoring-related work. The related videos are listed below:

Summarize the related questions:

  • How to switch tabs in GoLand:
    • GoLand (alt + left/right, or ctrl+tab)
    • VScode (ctrl + page up/down)
    • vim/VScodeVim/ideavim (g + t/T)
  • Can GoLand automatically load vimrc or must it be sourced:

Related posts can be found at:

Event Summary

Whenever the inspiration for slides runs dry, I will go back to the Go Taipei Community to warm up. Giving a technical sharing session makes me feel particularly cheerful. Thank you to the partners who discussed enthusiastically that day, and everyone is welcome to come and learn together.

As a co-organizer of Golang.TW, I haven't been back to the community for a purely technical sharing for a long time. Through this opportunity, in addition to sharing, I can also come back to the community to see the partners I haven't seen for a long time.

Join the "LINE Developer Official Community" official account immediately, and you can receive the first-hand Meetup activities, or push notifications of the latest news related to the developer program. ▼

"LINE Developer Official Community" official account ID: @line_tw_dev

About the "LINE Developer Community Program"

LINE launched the "LINE Developer Community Program" in Taiwan at the beginning of this year, and will invest long-term manpower and resources to hold developer community gatherings, recruitment days, developer conferences, etc., both internally and externally, online and offline, in Taiwan, and is expected to hold more than 30 events throughout the year. Readers are welcome to continue to check back for the latest updates. For details, please see 2019 LINE Developer Community Program Activity Schedule (continuously updated)https://engineering.linecorp.com/zh-hant/blog/line-taiwan-developer-relations-2019-plan/)

Top comments (0)