Introduction

I previously wrote an article introducing ants, a goroutine pool library. While looking up related material online, I came across another implementation, tunny. Since ants is still fresh in my mind, now is a good time to study tunny and compare the two libraries. Let's get started.

Quick start

The code in this article uses Go Modules.

Create a directory and initialize:

$ mkdir tunny && cd tunny
$ go mod init github.com/darjun/go-daily-lib/tunny

Use go get to get the tunny library from GitHub:

$ go get -u github.com/Jeffail/tunny

To make the comparison with ants easier, we reimplement the example from the ants article with tunny. It is the segmented-sum example:

const (
  DataSize    = 10000
  DataPerTask = 100
)

func main() {
  numCPUs := runtime.NumCPU()
  p := tunny.NewFunc(numCPUs, func(payload interface{}) interface{} {
    var sum int
    for _, n := range payload.([]int) {
      sum += n
    }
    return sum
  })
  defer p.Close()
  // ...
}

tunny is just as simple to use. First create a Pool; here we use tunny.NewFunc().

The first parameter is the pool size, that is, how many workers (goroutines) run at the same time. Here it is set to the number of logical CPUs. For CPU-intensive tasks, a larger value brings no benefit and may even degrade performance through frequent goroutine switching.

The second parameter is a function of type func(interface{}) interface{} that serves as the task processing function. Every piece of data passed in later is processed by calling this function.

The pool must be closed after use. Here defer p.Close() closes it before the program exits.

Next, generate the test data. As before, it is 10,000 random numbers, to be processed in 100 groups:

nums := make([]int, DataSize)
for i := range nums {
  nums[i] = rand.Intn(1000)
}

Process each set of data:

var wg sync.WaitGroup
wg.Add(DataSize / DataPerTask)
partialSums := make([]int, DataSize/DataPerTask)
for i := 0; i < DataSize/DataPerTask; i++ {
  go func(i int) {
    partialSums[i] = p.Process(nums[i*DataPerTask : (i+1)*DataPerTask]).(int)
    wg.Done()
  }(i)
}

wg.Wait()

Calling p.Process() with the task data makes the pool pick an idle goroutine to process it. Because we set the processing function above, that goroutine calls the function directly with the slice as its argument.

The difference between tunny and ants is that tunny processes tasks synchronously: after calling p.Process(), the current goroutine is suspended and is not woken until the task is complete. Because the call is synchronous, p.Process() can return the result directly. This is also why the program above starts one goroutine per task when distributing work. Without a goroutine per task, each p.Process() call would wait for its task to finish, so a task could not start until all previous ones were done, and the advantage of concurrency would be lost.

Note a small detail here: the for loop variable i is passed to the goroutine function as a parameter. Otherwise, in Go versions before 1.22, all goroutines share the outer i, and by the time they start running the loop has most likely ended. At that point i = DataSize/DataPerTask, so the slice expression nums[i*DataPerTask : (i+1)*DataPerTask] panics with an out-of-range error.
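The parameter-passing pattern can be sketched with the standard library alone. The function name collectIndices is hypothetical and exists only for this illustration; it is not part of tunny.

```go
package main

import (
	"fmt"
	"sync"
)

// collectIndices starts n goroutines and passes the loop variable to each
// one as an argument, so every goroutine works on its own copy of i.
// (Hypothetical helper for illustration only.)
func collectIndices(n int) []int {
	out := make([]int, n)
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(i int) { // this i is a per-goroutine copy of the loop variable
			defer wg.Done()
			out[i] = i // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(collectIndices(5))
}
```

Because every goroutine receives its own i, each slot of the result holds its own index regardless of the order in which the goroutines are scheduled.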

Finally, add up the partial sums and verify the result:

var sum int
for _, s := range partialSums {
  sum += s
}

var expect int
for _, num := range nums {
  expect += num
}
fmt.Printf("finish all tasks, result is %d expect:%d\n", sum, expect)

Run:

$ go run main.go
finish all tasks, result is 5010172 expect:5010172

Timeout

By default, p.Process() blocks until the task is complete, even when no worker is currently idle. Alternatively, we can use ProcessTimed(), a variant of Process() that takes a timeout. If no worker becomes idle within that time, or the task does not finish within it, the call is abandoned and an error is returned.

There are 2 cases of timeout:

  • No idle worker becomes available: all workers are busy, and the tasks they are processing are too time-consuming to finish soon;
  • The task itself is time-consuming.

Below we write a function that computes Fibonacci numbers recursively, a deliberately inefficient implementation:

func fib(n int) int {
  if n <= 1 {
    return 1
  }

  return fib(n-1) + fib(n-2)
}
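For comparison only, here is a linear-time sketch that follows the same convention as the recursive fib (the value is 1 for n <= 1). The name fibIter is hypothetical; the article keeps using the slow recursive version on purpose, so that tasks take a noticeable amount of time.

```go
package main

import "fmt"

// fibIter returns the same values as the recursive fib above
// (fibIter(0) == fibIter(1) == 1) but runs in linear time.
// (Hypothetical helper for illustration only.)
func fibIter(n int) int {
	a, b := 1, 1 // values for n = 0 and n = 1
	for i := 2; i <= n; i++ {
		a, b = b, a+b
	}
	return b
}

func main() {
	fmt.Println(fibIter(11), fibIter(25)) // prints 144 121393
}
```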

Let's look at the time-consuming task case first and create the Pool object. To make the effect easier to observe, a time.Sleep() statement is added to the processing function:

p := tunny.NewFunc(numCPUs, func(payload interface{}) interface{} {
  n := payload.(int)
  result := fib(n)
  time.Sleep(5 * time.Second)
  return result
})
defer p.Close()

Create as many tasks as the pool has workers, call p.ProcessTimed() for each, and set the timeout to 1s:

var wg sync.WaitGroup
wg.Add(numCPUs)
for i := 0; i < numCPUs; i++ {
  go func(i int) {
    n := rand.Intn(30)
    result, err := p.ProcessTimed(n, time.Second)
    nowStr := time.Now().Format("2006-01-02 15:04:05")
    if err != nil {
      fmt.Printf("[%s]task(%d) failed:%v\n", nowStr, i, err)
    } else {
      fmt.Printf("[%s]fib(%d) = %d\n", nowStr, n, result)
    }
    wg.Done()
  }(i)
}

wg.Wait()

Because the processing function sleeps for 5s, every task times out during execution. Run:

$ go run main.go 
[2021-06-10 16:36:26]task(7) failed:job request timed out
[2021-06-10 16:36:26]task(4) failed:job request timed out
[2021-06-10 16:36:26]task(1) failed:job request timed out
[2021-06-10 16:36:26]task(6) failed:job request timed out
[2021-06-10 16:36:26]task(5) failed:job request timed out
[2021-06-10 16:36:26]task(0) failed:job request timed out
[2021-06-10 16:36:26]task(3) failed:job request timed out
[2021-06-10 16:36:26]task(2) failed:job request timed out

All of them time out within the same second.

Now we double the number of tasks and change the sleep in the processing function to 990ms, so that the first batch of tasks completes in time, while the later tasks either never get an idle worker or time out because they run too long. Run:

$ go run main.go
[2021-06-10 16:42:46]fib(11) = 144
[2021-06-10 16:42:46]fib(25) = 121393
[2021-06-10 16:42:46]fib(27) = 317811
[2021-06-10 16:42:46]fib(1) = 1
[2021-06-10 16:42:46]fib(18) = 4181
[2021-06-10 16:42:46]fib(29) = 832040
[2021-06-10 16:42:46]fib(17) = 2584
[2021-06-10 16:42:46]fib(20) = 10946
[2021-06-10 16:42:46]task(5) failed:job request timed out
[2021-06-10 16:42:46]task(14) failed:job request timed out
[2021-06-10 16:42:46]task(8) failed:job request timed out
[2021-06-10 16:42:46]task(7) failed:job request timed out
[2021-06-10 16:42:46]task(13) failed:job request timed out
[2021-06-10 16:42:46]task(12) failed:job request timed out
[2021-06-10 16:42:46]task(11) failed:job request timed out
[2021-06-10 16:42:46]task(6) failed:job request timed out

Context

Context is a tool for coordinating goroutines. tunny provides the ProcessCtx() method, which accepts a context.Context parameter. Once the context enters the Done state, the task stops as well. A context becomes Done through a timeout, a cancellation, and so on. Continuing with the example above:

go func(i int) {
  n := rand.Intn(30)
  ctx, cancel := context.WithCancel(context.Background())
  defer cancel() // release the context's resources on every path
  if i%2 == 0 {
    go func() {
      time.Sleep(500 * time.Millisecond)
      cancel()
    }()
  }

  result, err := p.ProcessCtx(ctx, n)
  if err != nil {
    fmt.Printf("task(%d) failed:%v\n", i, err)
  } else {
    fmt.Printf("fib(%d) = %d\n", n, result)
  }
  wg.Done()
}(i)

The rest of the code is unchanged. We call p.ProcessCtx() to run the task, passing a cancelable Context. For even-numbered tasks, we start a goroutine that calls cancel() on the Context after 500ms. The output is as follows:

$ go run main.go
task(4) failed:context canceled
task(6) failed:context canceled
task(0) failed:context canceled
task(2) failed:context canceled
fib(27) = 317811
fib(25) = 121393
fib(1) = 1
fib(18) = 4181

We see that even-numbered tasks have been cancelled.

Source code

tunny has even less source code than ants: excluding tests and comments, it is under 500 lines. Let's read through it together. The Pool structure is as follows:

// src/github.com/Jeffail/tunny.go
type Pool struct {
  queuedJobs int64

  ctor    func() Worker
  workers []*workerWrapper
  reqChan chan workRequest

  workerMut sync.Mutex
}

The Pool structure has a ctor field, a function object that returns a value implementing the Worker interface:

type Worker interface {
  Process(interface{}) interface{}
  BlockUntilReady()
  Interrupt()
  Terminate()
}

The methods of this interface are called at different stages of task execution. The most important one is undoubtedly Process(interface{}) interface{}, the function that performs the task. tunny provides the New() function to create a Pool object, but it requires us to construct the ctor function object ourselves, which is inconvenient. tunny therefore also provides two default implementations, closureWorker and callbackWorker:

type closureWorker struct {
  processor func(interface{}) interface{}
}

func (w *closureWorker) Process(payload interface{}) interface{} {
  return w.processor(payload)
}

func (w *closureWorker) BlockUntilReady() {}
func (w *closureWorker) Interrupt()       {}
func (w *closureWorker) Terminate()       {}

type callbackWorker struct{}

func (w *callbackWorker) Process(payload interface{}) interface{} {
  f, ok := payload.(func())
  if !ok {
    return ErrJobNotFunc
  }
  f()
  return nil
}

func (w *callbackWorker) BlockUntilReady() {}
func (w *callbackWorker) Interrupt()       {}
func (w *callbackWorker) Terminate()       {}

The tunny.NewFunc() function uses closureWorker:

func NewFunc(n int, f func(interface{}) interface{}) *Pool {
  return New(n, func() Worker {
    return &closureWorker{
      processor: f,
    }
  })
}

The created closureWorker directly uses the parameter f as the task processing function.

The tunny.NewCallback() function uses callbackWorker:

func NewCallback(n int) *Pool {
  return New(n, func() Worker {
    return &callbackWorker{}
  })
}

The callbackWorker structure has no processing function, so the only kind of task you can send it is a function object with no parameters and no return value. Its Process() method simply executes that function.
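When neither default worker fits, for example because each worker needs its own state, a custom Worker can be written. The sketch below repeats the Worker interface locally so that it compiles without importing tunny; countingWorker and newCountingWorker are hypothetical names invented for this illustration.

```go
package main

import "fmt"

// Worker repeats the interface shown above so that this sketch compiles
// without importing tunny.
type Worker interface {
	Process(interface{}) interface{}
	BlockUntilReady()
	Interrupt()
	Terminate()
}

// countingWorker is a hypothetical worker with per-worker state:
// it counts how many payloads it has processed.
type countingWorker struct {
	processed int
	closed    bool
}

// Process doubles an int payload and returns an error value for anything else.
func (w *countingWorker) Process(payload interface{}) interface{} {
	w.processed++
	n, ok := payload.(int)
	if !ok {
		return fmt.Errorf("unexpected payload type %T", payload)
	}
	return n * 2
}

func (w *countingWorker) BlockUntilReady() {}                  // nothing to prepare
func (w *countingWorker) Interrupt()       {}                  // the job cannot be aborted midway
func (w *countingWorker) Terminate()       { w.closed = true } // release resources here

// newCountingWorker has the shape of the ctor function that New() expects.
func newCountingWorker() Worker { return &countingWorker{} }

func main() {
	w := newCountingWorker()
	fmt.Println(w.Process(21)) // prints 42
	w.Terminate()
}
```

With the real library, a constructor of the same shape that returns tunny.Worker would be passed as the second argument of tunny.New(), so each worker goroutine gets its own instance.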

After the Pool object is created, its SetSize() method is called to set the number of workers. This method starts the corresponding number of goroutines:

func (p *Pool) SetSize(n int) {
  p.workerMut.Lock()
  defer p.workerMut.Unlock()

  lWorkers := len(p.workers)
  if lWorkers == n {
    return
  }

  for i := lWorkers; i < n; i++ {
    p.workers = append(p.workers, newWorkerWrapper(p.reqChan, p.ctor()))
  }

  // Stop the excess workers
  for i := n; i < lWorkers; i++ {
    p.workers[i].stop()
  }

  // Wait for the workers to stop
  for i := n; i < lWorkers; i++ {
    p.workers[i].join()
    // -----------------
  }
  p.workers = p.workers[:n]
}

SetSize() is in fact also the method called to grow or shrink the pool. When growing, it creates the required number of workers; when shrinking, it stops the surplus ones. Unlike ants, tunny's growing and shrinking take effect immediately.

At the place I marked with ----------------- in the code, I think there is a small issue with shrinking. Because the underlying array does not change, once the workers slice is shortened the trailing elements of the array are no longer accessible, yet the array still holds references to them, which is a kind of memory leak. So would it be better to add p.workers[i] = nil there?
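The effect can be demonstrated on an ordinary slice of pointers. This is a generic sketch of the suggested fix, not a patch to tunny; the item type and the shrink helper are hypothetical.

```go
package main

import "fmt"

// item stands in for any pointed-to value, such as a worker wrapper.
type item struct{ id int }

// shrink truncates s to n elements and clears the dropped slots first, so
// the backing array no longer references the removed items.
// It assumes 0 <= n <= len(s).
func shrink(s []*item, n int) []*item {
	for i := n; i < len(s); i++ {
		s[i] = nil
	}
	return s[:n]
}

func main() {
	s := []*item{{1}, {2}, {3}, {4}}
	s = shrink(s, 2)

	// Re-extending the slice exposes the backing array: the dropped slots
	// are nil, so the garbage collector is free to reclaim items 3 and 4.
	full := s[:cap(s)]
	fmt.Println(len(s), cap(s), full[2] == nil, full[3] == nil)
}
```

Without the clearing loop, full[2] and full[3] would still point at the dropped items for as long as the backing array stays alive.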

The worker created here is actually a workerWrapper structure, a thin wrapper around the Worker:

// src/github.com/Jeffail/worker.go
type workerWrapper struct {
  worker        Worker
  interruptChan chan struct{}
  reqChan       chan<- workRequest
  closeChan     chan struct{}
  closedChan    chan struct{}
}

func newWorkerWrapper(
  reqChan chan<- workRequest,
  worker Worker,
) *workerWrapper {
  w := workerWrapper{
    worker:        worker,
    interruptChan: make(chan struct{}),
    reqChan:       reqChan,
    closeChan:     make(chan struct{}),
    closedChan:    make(chan struct{}),
  }

  go w.run()
  return &w
}

As soon as a workerWrapper structure is created, a goroutine is started to run its run() method:

func (w *workerWrapper) run() {
  jobChan, retChan := make(chan interface{}), make(chan interface{})
  defer func() {
    w.worker.Terminate()
    close(retChan)
    close(w.closedChan)
  }()

  for {
    w.worker.BlockUntilReady()
    select {
    case w.reqChan <- workRequest{
      jobChan:       jobChan,
      retChan:       retChan,
      interruptFunc: w.interrupt,
    }:
      select {
      case payload := <-jobChan:
        result := w.worker.Process(payload)
        select {
        case retChan <- result:
        case <-w.interruptChan:
          w.interruptChan = make(chan struct{})
        }
      case _, _ = <-w.interruptChan:
        w.interruptChan = make(chan struct{})
      }
    case <-w.closeChan:
      return
    }
  }
}

Each worker goroutine keeps trying to send a workRequest to the w.reqChan channel. Once the send succeeds, it receives the task data from jobChan, calls Worker.Process() to execute the task, and finally sends the result to the retChan channel. Several interactions take place here, which become clearer when read together with the Process() method:

func (p *Pool) Process(payload interface{}) interface{} {
  // The checks for closed channels are omitted for brevity.
  request := <-p.reqChan
  request.jobChan <- payload
  payload = <-request.retChan
  return payload
}

With the irrelevant code removed, this is all that is left. When we call the Pool's Process() method, it first receives a workRequest from reqChan, then sends the task data to the jobChan channel, and finally receives the result from retChan. Combined with the run() flow above, a normally executed task involves 3 interactions between Pool and workerWrapper.

Looking at how Pool creates each workerWrapper, we can see that the reqChan in the workerWrapper structure is the same channel as the reqChan in the Pool structure. That is, after a workerWrapper starts, it blocks trying to send on reqChan until one of the Pool's Process*() methods receives from reqChan. Process() obtains the workRequest and sends the task data to its jobChan. On the other side, once workerWrapper.run() has successfully sent on reqChan, it waits on jobChan, receives the data sent by Process(), executes w.worker.Process(), and then sends the result to retChan. After Process() has successfully sent the data to jobChan, it waits to receive from retChan. Once the result arrives, Process() returns, and workerWrapper.run() goes back to blocking on the w.reqChan <- statement, waiting for the next task. Note that jobChan and retChan are both created inside the workerWrapper.run() method.
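The three interactions can be reproduced in miniature with the standard library. This is a simplified sketch under stated assumptions: the request type, worker, and process are hypothetical reductions of workRequest, workerWrapper.run(), and Pool.Process(), with int payloads and no interrupt handling.

```go
package main

import "fmt"

// request mirrors the idea of workRequest: the worker hands the caller a
// pair of channels to use for exactly one job. (Simplified assumption.)
type request struct {
	jobChan chan<- int
	retChan <-chan int
}

// worker offers itself on reqChan, waits for a job, and returns the result.
func worker(reqChan chan<- request, quit <-chan struct{}) {
	jobChan, retChan := make(chan int), make(chan int)
	for {
		select {
		case reqChan <- request{jobChan: jobChan, retChan: retChan}:
			n := <-jobChan   // interaction 2: receive the job data
			retChan <- n * 2 // interaction 3: send the result back
		case <-quit:
			return
		}
	}
}

// process is the caller's side of the same three interactions.
func process(reqChan <-chan request, n int) int {
	req := <-reqChan     // 1: an idle worker offers itself
	req.jobChan <- n     // 2: send the job data
	return <-req.retChan // 3: wait for the result
}

func main() {
	reqChan := make(chan request)
	quit := make(chan struct{})
	defer close(quit)
	for i := 0; i < 2; i++ {
		go worker(reqChan, quit)
	}
	fmt.Println(process(reqChan, 1), process(reqChan, 2), process(reqChan, 3))
}
```

Because reqChan is unbuffered, a successful receive in process proves that some worker is idle at that moment, which is why no separate bookkeeping of idle workers is needed.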

So how is the timeout implemented? Let's look at ProcessTimed():

func (p *Pool) ProcessTimed(
  payload interface{},
  timeout time.Duration,
) (interface{}, error) {
  // The checks for closed channels are omitted for brevity.
  tout := time.NewTimer(timeout)
  var request workRequest
  select {
  case request = <-p.reqChan:
  case <-tout.C:
    return nil, ErrJobTimedOut
  }

  select {
  case request.jobChan <- payload:
  case <-tout.C:
    request.interruptFunc()
    return nil, ErrJobTimedOut
  }

  select {
  case payload = <-request.retChan:
  case <-tout.C:
    request.interruptFunc()
    return nil, ErrJobTimedOut
  }

  tout.Stop()
  return payload, nil
}

As before, irrelevant code has been removed. First a timer is created, with the timeout given by the parameter. It is followed by three select statements:

  • Waiting to receive from p.reqChan, that is, waiting for a worker to become idle;
  • Waiting to send data to jobChan, that is, waiting for the worker to take the task data from jobChan;
  • Waiting to receive from retChan, that is, waiting for the worker to send the result to retChan.

In the first case, a timeout means all workers are busy, so the timeout error is returned directly. In the latter two cases, a worker has already accepted the task, but the task has not completed within the specified time, so its execution must be interrupted. That is what the call to workRequest.interruptFunc() above does; it is the workerWrapper.interrupt() method:

func (w *workerWrapper) interrupt() {
  close(w.interruptChan)
  w.worker.Interrupt()
}

This method simply closes the interruptChan channel and then calls the Interrupt() method of the worker object, which is empty in the default implementations.

Once the interruptChan channel is closed, the worker goroutine stops waiting to receive from jobChan or to send to retChan:

select {
case payload := <-jobChan:
  result := w.worker.Process(payload)
  select {
  case retChan <- result:
  case <-w.interruptChan:
    w.interruptChan = make(chan struct{})
  }
case _, _ = <-w.interruptChan:
  w.interruptChan = make(chan struct{})
}
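Why closing a channel works as an interrupt signal can be shown in isolation. The function waitForJob is a hypothetical reduction of the worker's inner select, with int jobs instead of interface{} payloads.

```go
package main

import (
	"fmt"
	"time"
)

// waitForJob blocks until a job arrives or interruptChan is closed.
// It is a reduced, hypothetical version of the worker's inner select.
func waitForJob(jobChan <-chan int, interruptChan <-chan struct{}) (int, bool) {
	select {
	case n := <-jobChan:
		return n, true
	case <-interruptChan: // a closed channel is always ready to receive
		return 0, false
	}
}

func main() {
	jobChan := make(chan int)
	interruptChan := make(chan struct{})

	go func() {
		time.Sleep(50 * time.Millisecond)
		close(interruptChan) // the same signal that interrupt() sends
	}()

	_, ok := waitForJob(jobChan, interruptChan)
	fmt.Println("job received:", ok) // prints "job received: false"
}
```

A closed channel stays ready forever, which is why the worker code above replaces w.interruptChan with a fresh channel after each interrupt; otherwise every later job would be interrupted immediately.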

The implementation of ProcessCtx() is similar.

Finally, calling workerWrapper.stop() closes the closeChan channel. This makes the for loop in workerWrapper.run() exit, after which the deferred function runs close(retChan) and close(w.closedChan):

defer func() {
  w.worker.Terminate()
  close(retChan)
  close(w.closedChan)
}()

retChan has to be closed here so that a Process*() method waiting for data on retChan does not block forever.
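The underlying language rule is that a receive from a closed channel returns immediately with the zero value and a false second result. A minimal sketch, where the helper name receiveResult is an assumption made for this illustration:

```go
package main

import "fmt"

// receiveResult reports whether a value was delivered or the channel was
// closed first. (Hypothetical helper for illustration only.)
func receiveResult(retChan <-chan interface{}) (interface{}, bool) {
	v, open := <-retChan // on a closed, empty channel: v is nil, open is false
	return v, open
}

func main() {
	retChan := make(chan interface{})
	close(retChan) // what the deferred function in run() does

	v, open := receiveResult(retChan)
	fmt.Println(v, open) // prints "<nil> false"
}
```

The second result of the receive is what lets a caller distinguish a stopped worker from a genuine nil result.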

Once the closedChan channel is closed, the workerWrapper.join() method returns:

func (w *workerWrapper) join() {
  <-w.closedChan
}

The methods of Worker are called at the following times:

  • Process(): when a task is executed;
  • Interrupt(): when a task is cancelled because of a timeout or a context;
  • BlockUntilReady(): before each new task, in case some resources need to be prepared;
  • Terminate(): in the deferred function of workerWrapper.run(), after the worker is stopped.

These timings can be clearly seen in the code.

Based on the source code, I drew a flowchart:

The interrupt flow is omitted from the figure.

tunny vs ants

The design of tunny is quite different from that of ants:

tunny only supports synchronous task execution. Although the task runs in another goroutine, the goroutine that submitted it must wait for the result or the timeout and cannot do anything else in the meantime. Because of this, the design of tunny is somewhat more complicated: to support timeout and cancellation, several channels are used to communicate with the goroutine that executes the task. Executing one task involves multiple channel communications, which costs some performance. On the other hand, the synchronous programming style is closer to human intuition.

ants executes tasks completely asynchronously, and its performance is slightly higher than tunny's. But also because it is asynchronous, it has no mechanism for task timeout or cancellation. And if you need to collect results, you must write additional code yourself.
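To give an idea of that additional code, here is a sketch of collecting results from asynchronously submitted tasks. The submit function is a hypothetical stand-in for an asynchronous pool's submit call (it simply starts a goroutine) and does not use the ants API; sumAsync is likewise invented for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// submit is a hypothetical stand-in for an asynchronous pool's submit call:
// it returns immediately and runs the task in another goroutine.
func submit(task func()) { go task() }

// sumAsync splits nums into chunks, submits one task per chunk, and
// collects the partial sums through a channel. It assumes chunk > 0.
func sumAsync(nums []int, chunk int) int {
	var wg sync.WaitGroup
	results := make(chan int)

	for start := 0; start < len(nums); start += chunk {
		end := start + chunk
		if end > len(nums) {
			end = len(nums)
		}
		part := nums[start:end]
		wg.Add(1)
		submit(func() {
			defer wg.Done()
			sum := 0
			for _, n := range part {
				sum += n
			}
			results <- sum
		})
	}

	// Close results once every task has reported, so the loop below ends.
	go func() { wg.Wait(); close(results) }()

	total := 0
	for s := range results {
		total += s
	}
	return total
}

func main() {
	nums := make([]int, 100)
	for i := range nums {
		nums[i] = i + 1
	}
	fmt.Println(sumAsync(nums, 7)) // prints 5050
}
```

The WaitGroup and the results channel are exactly the bookkeeping that a synchronous Process() call makes unnecessary, since it returns each result directly to its caller.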

Summary

This article introduced tunny, another goroutine pool implementation. It handles tasks synchronously, which makes code more intuitive to write and gives finer control over task execution, such as timeout and cancellation. The price is a more complicated implementation. The tunny code is under 500 lines, and I highly recommend reading it.

If you find a fun and useful Go library, you are welcome to submit an issue on the Go Daily Library GitHub 😄

References

  1. tunny GitHub: https://github.com/Jeffail/tunny
  2. ants GitHub: https://github.com/panjf2000/ants
  3. Go Daily Library GitHub: https://github.com/darjun/go-daily-lib

About me

My blog: https://darjun.github.io

Welcome to follow my WeChat public account [GoUpUp], learn together and make progress together~


darjun