[Misc][V1] Enhance performance of KVCacheManager._get_cached_block #13878
Enhance the `KVCacheManager` cached block dict lookup by:
- a `key in dict` check and a `dict` getitem operation
- `dict.get` fn: caching the fn itself, rather than accessing `dict.get` every time, gives better performance
- `for _ in dict.values()`, which performs better than `next(iter(dict.values()))` because the `for` loop uses the native `GET_ITER` and `FOR_ITER` instructions

This slightly increases the time cost in the "unknown block" case, but we can expect one or more "cached block" hits when `enable_prefix_cache=True`, so the lookups would look like `cache_hit, cache_hit, cache_hit, ..., cache_hit, cache_miss`. So I think it's worth choosing this impl?

Benchmark
repeat 10000 times
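A micro-benchmark along these lines could be sketched as follows. This is only an illustration, not the script used for this PR: the `cache` layout (block hash mapped to a dict of block id to block), the function names, and the string "blocks" are all assumptions made for the example.

```python
import timeit

# Hypothetical stand-in for the cached-block table:
# block hash -> {block id: block}. Strings stand in for real block objects.
cache = {h: {h: f"block-{h}"} for h in range(1000)}


def lookup_contains_getitem(block_hash):
    # Membership check followed by getitem, first value via next(iter(...)).
    if block_hash in cache:
        return next(iter(cache[block_hash].values()))
    return None


def lookup_get_for(block_hash, _get=cache.get):
    # dict.get is bound once as a default argument instead of being
    # looked up on every call; the first value is taken with a for loop.
    blocks = _get(block_hash)
    if blocks is not None:
        for block in blocks.values():
            return block
    return None


def bench(fn, key, repeat=10000):
    # Total time in seconds for `repeat` calls of fn(key).
    return timeit.timeit(lambda: fn(key), number=repeat)


if __name__ == "__main__":
    for name, fn in [("contains+getitem", lookup_contains_getitem),
                     ("cached get + for", lookup_get_for)]:
        hit = bench(fn, 5)     # key present ("cached block")
        miss = bench(fn, -1)   # key absent ("unknown block")
        print(f"{name}: hit={hit:.6f}s miss={miss:.6f}s")
```

Note that binding `cache.get` as a default argument ties the function to that specific dict object, so the bound method goes stale if `cache` is later rebound to a new dict. Timings vary by Python version and machine, so no expected numbers are given here.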
Reference
10+ years performance problem of `dict.get`:
https://www.reddit.com/r/Python/comments/1atp5s/comment/c90onqq/
Iteration instruction compare
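One way to reproduce such a comparison on CPython is with the standard `dis` module. The sketch below uses hypothetical helper names (`first_value_for`, `first_value_next`, `opnames`) and only shows which opcodes each variant compiles to; the exact instruction listing differs between Python versions.

```python
import dis


def first_value_for(d):
    # The for loop compiles to GET_ITER / FOR_ITER bytecode instructions.
    for value in d.values():
        return value
    return None


def first_value_next(d):
    # iter() and next() are ordinary global lookups plus function calls.
    return next(iter(d.values()), None)


def opnames(fn):
    # Names of the bytecode instructions in fn, in order.
    return [ins.opname for ins in dis.get_instructions(fn)]


if __name__ == "__main__":
    for fn in (first_value_for, first_value_next):
        print(fn.__name__)
        dis.dis(fn)
```

Both helpers return the first value of the dict, or `None` when it is empty; only the bytecode used to get there differs.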