
mem: implement ReadAll() for more efficient io.Reader consumption #7653

Merged: 6 commits into grpc:master on Nov 12, 2024

Conversation

@ash2k (Contributor) commented Sep 20, 2024

I moved my project to gRPC 1.66.2 and saw a good reduction in RAM consumption. Now the hot spot is decompress(), where io.Copy() allocates a temporary buffer, reads from the reader into it, and then copies the read data into another buffer obtained from the pool. That means an unnecessary allocation, an unnecessary copy, and underutilized buffers from the pool.

This PR adds mem.ReadAll() (like io.ReadAll()) to efficiently consume a reader into buffers from the pool.

[Screenshot, Sep 19, 2024: memory profile showing the io.Copy() allocation hot spot in decompress()]

I found #7631 while working on this code (I have similar code in my project, but decided to contribute it upstream and replace this io.Copy with it).

RELEASE NOTES:

  • mem: implement a ReadAll() method for more efficient io.Reader consumption

codecov bot commented Sep 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.86%. Comparing base (a3a8657) to head (eef8fb4).
Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7653      +/-   ##
==========================================
- Coverage   81.87%   81.86%   -0.02%     
==========================================
  Files         373      373              
  Lines       37822    37880      +58     
==========================================
+ Hits        30967    31009      +42     
- Misses       5563     5579      +16     
  Partials     1292     1292              
Files with missing lines Coverage Δ
mem/buffer_slice.go 96.26% <100.00%> (+1.37%) ⬆️
rpc_util.go 79.48% <100.00%> (-0.06%) ⬇️

... and 30 files with indirect coverage changes

@purnesh42H purnesh42H added this to the 1.69 Release milestone Oct 16, 2024
@ash2k ash2k requested a review from PapaCharlie October 18, 2024 08:19
@aranjans aranjans added the Type: Performance Performance improvements (CPU, network, memory, etc) label Oct 22, 2024
@purnesh42H (Contributor) commented:

@ash2k are you working on this actively?

@ash2k (Contributor, Author) commented Oct 22, 2024

@purnesh42H No, I'm not. I think I've answered the questions and it's ready to be merged. Is something off still? Let me know what needs to be changed.

@ash2k (Contributor, Author) commented Oct 28, 2024

@PapaCharlie PTAL

@easwars (Contributor) commented Oct 29, 2024

There are still a whole lot of comments which are not wrapped at 80 columns. Could you please take care of that?

Also, please don't mark comments as resolved. It is the responsibility of the person making the comment to mark it as resolved when they think that the comment has been sufficiently addressed.

@ash2k (Contributor, Author) commented Oct 29, 2024

@easwars

There are still a whole lot of comments which are not wrapped at 80-cols. Could you please take care of that.

Wrapped. Let me know if I missed something.

Also, please don't mark comments as resolved. It is the responsibility of the person making the comment to mark it as resolved when they think that the comment has been sufficiently addressed.

Ok, fair point. I used that as a way to track what I have addressed, but I see why that's not the best idea.

@ash2k (Contributor, Author) commented Oct 29, 2024

Related question: I see quite a few calls with a nil pool, e.g. mem.NewBuffer(&someDataSlice, nil). Why not replace these with a type conversion like mem.SliceBuffer(someDataSlice)? No function call needed.

@easwars (Contributor) left a comment:

Thanks for taking care of the comments.

@easwars (Contributor) commented Oct 30, 2024

Related question: I see quite a few calls with a nil pool - mem.NewBuffer(&someDataSlice, nil). Why not swap this with a type cast like this mem.SliceBuffer(someDataSlice)? No need to call a function.

IIRC, the SliceBuffer type was added a little later during review of the PR where all the buffering functionality landed, and there is a good chance some call sites were not updated. I don't see a reason to be opposed to that change and would happily review a PR that makes it. Thanks.
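To illustrate why the conversion form is attractive: a named slice type can satisfy a buffer interface directly, so wrapping a []byte is a compile-time type conversion rather than a constructor call. A self-contained sketch with a stand-in type (sliceBuffer here is hypothetical, not the real mem.SliceBuffer):

```go
package main

import "fmt"

// sliceBuffer mimics the idea behind mem.SliceBuffer: a named []byte
// type with buffer-style methods. Converting a []byte to it is a type
// conversion, not a function call, so it allocates nothing, cannot
// fail, and shares the original backing array.
type sliceBuffer []byte

func (s sliceBuffer) ReadOnlyData() []byte { return s }
func (s sliceBuffer) Len() int             { return len(s) }

func main() {
	data := []byte("hello")
	buf := sliceBuffer(data) // zero-cost conversion
	fmt.Println(buf.Len(), string(buf.ReadOnlyData())) // 5 hello
}
```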

@easwars easwars changed the title mem: ReadAll for more efficient io.Reader consumption mem: implement ReadAll() for more efficient io.Reader consumption Oct 30, 2024
Comment on lines 241 to 242:

wt, ok := r.(io.WriterTo)
if ok {

Member:

Please combine into a compound if to limit the scope of wt, too.

Contributor Author:

Done

Comment on lines 237 to 238
// A failed call returns a non-nil error and could return partially read
// buffers. It is the responsibility of the caller to free these buffers.
Member:

This is surprising behavior. The one saving grace is that freeing buffers is optional: the GC will reclaim them if you forget. If not for that, I would say this is definitely not OK.

I highly doubt gRPC would ever want the partial data, and I'm curious why you want it, too.

Contributor Author:

This behavior matches io.ReadAll(). Reasons to do it this way:

  • It is a drop-in replacement for io.ReadAll() (behavior-wise).
  • Sometimes you may need the read data regardless of whether there was an error.

Example of the last point: a proxy forwarding a response from an upstream server. It must send everything it got from the upstream and only then surface the error (where "surface" might mean RST the connection or something else protocol-specific). Isn't a gRPC streaming-response client the same? It reads and delivers messages to the client even if an error arrived after those messages. This is also similar to how HTTP/1.0 and 1.1 Connection: close works: sometimes you don't want Transfer-Encoding: chunked and prefer the underlying connection to be closed on EOF instead of chunking.

Member:

Are you using this completely outside of gRPC? This package isn't really intended as a generic thing, it's intended to meet our needs for our use cases.

Contributor Author:

@dfawley My project uses gRPC "a lot". Most of the uses of this function are not related to gRPC directly at the moment but this may change.

In my opinion the more flexible behavior is the better choice, since it's not an internal package/function. gRPC might need it later, but by then it'd be impossible to change without introducing a new function.

Having said the above, I have no new things to say =) I'm happy to change it to free the buffers on error. My ultimate goal here is to get rid of the unnecessary allocations inside of gRPC. I can absolutely live with a copy of this function in my codebase and use the modified version for my own purposes.

Please let me know how you want to proceed.

Member:

OK, this is fine, but I have some minor edits I'll make to the docstring to make it stand out a little more.

@aranjans aranjans assigned dfawley and unassigned ash2k and easwars Nov 6, 2024
@easwars (Contributor) commented Nov 8, 2024

@ash2k
Could we have some benchmarks like the ones on #7786?

@ash2k (Contributor, Author) commented Nov 9, 2024

@easwars

This is current master vs this branch (just rebased on master, to compare apples to apples), with the benchmark from #7786.

goos: darwin
goarch: arm64
pkg: google.golang.org/grpc
                                              │   ./old.txt   │               ./new.txt               │
                                              │    sec/op     │    sec/op     vs base                 │
RPCCompressor/comp=gzip,payloadSize=1024-10      170.9µ ± ∞ ¹   148.7µ ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=gzip,payloadSize=10240-10     205.0µ ± ∞ ¹   179.2µ ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=gzip,payloadSize=512000-10    1.515m ± ∞ ¹   1.538m ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=1024-10     102.57µ ± ∞ ¹   76.36µ ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=10240-10    111.73µ ± ∞ ¹   84.42µ ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=512000-10    431.3µ ± ∞ ¹   413.9µ ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                          253.0µ         218.7µ        -13.56%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                              │   ./old.txt    │               ./new.txt                │
                                              │      B/op      │     B/op       vs base                 │
RPCCompressor/comp=gzip,payloadSize=1024-10     146.96Ki ± ∞ ¹   25.98Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=gzip,payloadSize=10240-10    185.43Ki ± ∞ ¹   43.04Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=gzip,payloadSize=512000-10   1103.0Ki ± ∞ ¹   994.4Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=1024-10      78.25Ki ± ∞ ¹   14.00Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=10240-10     89.34Ki ± ∞ ¹   23.77Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=512000-10   1137.5Ki ± ∞ ¹   986.3Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                          249.1Ki         84.54Ki        -66.07%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                              │  ./old.txt  │              ./new.txt              │
                                              │  allocs/op  │  allocs/op   vs base                │
RPCCompressor/comp=gzip,payloadSize=1024-10     252.0 ± ∞ ¹   244.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
RPCCompressor/comp=gzip,payloadSize=10240-10    253.0 ± ∞ ¹   244.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
RPCCompressor/comp=gzip,payloadSize=512000-10   294.0 ± ∞ ¹   288.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=1024-10     231.0 ± ∞ ¹   223.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=10240-10    231.0 ± ∞ ¹   223.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
RPCCompressor/comp=noop,payloadSize=512000-10   294.0 ± ∞ ¹   279.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                         257.9         248.9        -3.47%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

@dfawley dfawley merged commit 60c70a4 into grpc:master Nov 12, 2024
15 checks passed
@dfawley (Member) commented Nov 12, 2024

Thanks for the PR! That's a nice performance gain.

@ash2k ash2k deleted the mem-read-all branch November 12, 2024 00:18
@ash2k (Contributor, Author) commented Nov 12, 2024

@dfawley Thanks for the review and merging.

I'll wait for the release and post new RAM usage once I get this deployed.

Labels
Type: Performance Performance improvements (CPU, network, memory, etc)
6 participants