feat(context optimization): Optimize LLM Context Management and File Handling #578

Merged
merged 4 commits into stackblitz-labs:main on Dec 13, 2024

Conversation

thecodacus (Collaborator)

Optimize LLM Context Management and File Handling

Overview

This PR significantly improves how we manage LLM context and file handling in chat interactions. The changes optimize memory usage, extend chat context length, and provide a single source of truth for file content, resulting in more reliable and efficient AI operations.

Key Changes

1. Context Optimization

  • Implemented a new file context system that maintains a single source of truth for code files (see the sketch after this list)
  • Added file content filtering using gitignore-style patterns to exclude irrelevant files
  • Simplified bolt actions in chat history by truncating file content
  • Added line numbers to code context for better reference tracking
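
As a rough illustration of that approach, the sketch below (not the merged implementation: the `<codebase>`/`<file>` tag shape, the IGNORE_PATTERNS list, and the FileMap type are assumptions) builds one context block, filters paths with the `ignore` npm package, and prefixes every line with its number:

```ts
import ignore from 'ignore';

// Illustrative defaults; the PR uses a set of common gitignore-style patterns.
const IGNORE_PATTERNS = ['node_modules/**', 'dist/**', '.git/**', '*.lock'];

type FileMap = Record<string, string>; // relative path -> file content

function createFilesContext(files: FileMap): string {
  const ig = ignore().add(IGNORE_PATTERNS);

  const sections = Object.entries(files)
    .filter(([path]) => !ig.ignores(path)) // drop irrelevant files
    .map(([path, content]) => {
      // Prefix each line with its number so the model can reference exact locations.
      const numbered = content
        .split('\n')
        .map((line, i) => `${i + 1}|${line}`)
        .join('\n');
      return `<file path="${path}">\n${numbered}\n</file>`;
    });

  return `<codebase>\n${sections.join('\n')}\n</codebase>`;
}
```

Generating this block from the current files gives the model a single source of truth, instead of stale copies scattered through the chat history.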

2. Chat History Management

  • Optimized message processing to remove redundant file contexts from chat history
  • Implemented content simplification for assistant messages containing file actions (see the sketch after this list)
  • Added support for streaming file contexts efficiently
  • Modified the chat client to handle the new file context system
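
A minimal sketch of that simplification (assuming bolt's `<boltAction type="file">` artifact tags; the regex and placeholder text are illustrative, not the exact code that was merged):

```ts
// Strip file bodies out of earlier assistant messages; the up-to-date content
// lives only in the file context, so the history stays small.
function simplifyBoltActions(content: string): string {
  const fileActionRegex =
    /(<boltAction[^>]*type="file"[^>]*>)([\s\S]*?)(<\/boltAction>)/g;

  return content.replace(
    fileActionRegex,
    (_match, openTag, _body, closeTag) =>
      `${openTag}\n[file content truncated - see the file context for the current version]\n${closeTag}`,
  );
}
```

Applied to every assistant message before the history is sent back to the model, this keeps the record that a file was written without repeating its contents.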

3. Workbench Improvements

  • Enhanced action execution queue management
  • Added duplicate action execution prevention (see the sketch after this list)
  • Improved file action handling in the webcontainer
  • Modified artifact management for better state tracking
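
For illustration, a queued runner with duplicate prevention could look like this (a sketch only; ActionRunner, ActionData, and runAction are placeholder names rather than the workbench's real API):

```ts
interface ActionData {
  actionId: string;
  type: 'file' | 'shell';
  content: string;
}

class ActionRunner {
  private executed = new Set<string>();
  private queue: Promise<void> = Promise.resolve();

  addAction(action: ActionData, runAction: (a: ActionData) => Promise<void>) {
    // Duplicate prevention: an action id is only ever executed once.
    if (this.executed.has(action.actionId)) {
      return;
    }
    this.executed.add(action.actionId);

    // Chain onto the queue so actions run one at a time, in order.
    this.queue = this.queue
      .then(() => runAction(action))
      .catch((error) => console.error('action failed', error));
  }
}
```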

Benefits

  • Extended Chat Context: Enables longer conversations by reducing redundant content
  • Improved Accuracy: Single source of truth for file content reduces hallucinations
  • Better Memory Usage: Optimized context management reduces memory overhead
  • Enhanced Reliability: Consistent file context handling improves AI response accuracy

Technical Details

  • Added createFilesContext function to generate structured file contexts
  • Implemented simplifyBoltActions for optimizing chat history
  • Modified stream handling to incorporate file context efficiently (see the sketch after this list)
  • Added proper file filtering using common ignore patterns
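
Putting the pieces together, the request path could look roughly like the sketch below. It builds on the two sketches above and assumes the Vercel AI SDK's streamText; the surrounding names are illustrative only:

```ts
import { streamText, type CoreMessage, type LanguageModel } from 'ai';

// Assumed helpers from the earlier sketches:
declare function createFilesContext(files: Record<string, string>): string;
declare function simplifyBoltActions(content: string): string;

async function chatWithContext(
  model: LanguageModel,
  systemPrompt: string,
  messages: CoreMessage[],
  files: Record<string, string>,
) {
  // Drop stale file bodies from previous assistant turns.
  const history = messages.map((m) =>
    m.role === 'assistant' && typeof m.content === 'string'
      ? { ...m, content: simplifyBoltActions(m.content) }
      : m,
  );

  // The current files appear exactly once, in the system prompt, with line numbers.
  return streamText({
    model,
    system: `${systemPrompt}\n\nCurrent project files:\n${createFilesContext(files)}`,
    messages: history,
  });
}
```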

Testing

  • Verified context optimization with large codebases
  • Tested chat history management with complex interactions
  • Validated file content handling and reference accuracy
  • Confirmed proper handling of ignore patterns

Migration Notes

No breaking changes. Existing chat interactions will automatically benefit from the optimizations.

Future Improvements

  • Implement differential context updates
  • Optimize context generation for even larger repositories

Preview

Context.Optimization.demo.mp4

coleam00 (Collaborator) commented on Dec 7, 2024

@thecodacus I am going to review this over the next couple of days, this is absolutely fantastic and I want to give it some proper testing!

coleam00 (Collaborator) left a comment


Fantastic work @thecodacus!! I took a look at everything and tested it on my end.

I think this is ready to merge except I just have a couple of questions:

  1. Do you intend to remove all the debug messages in the terminal (the ones that print out processedMessages)? They were super helpful for me to see what is happening behind the scenes, btw! But I'm thinking it would be good to remove them before merging.

  2. Which LLMs have you tested this with? I tested with Qwen 2.5 Coder 32B and seemed to get suboptimal results compared to what I usually get, but then again, you never know with local LLMs; it could be a fluke.

thecodacus (Collaborator, Author)

Tested with some local models (7B ones and Llama 3.2); not much difference there. Also tested with GPT-4o and Claude Sonnet and got almost identical output.

thecodacus (Collaborator, Author) commented on Dec 12, 2024

@wonderwhy-er, since this is a small architectural change, I'd like to confirm with you whether it conflicts with any changes you have planned for the future.

If not, I will merge.

aliasfoxkde (Collaborator)

Fantastic work @thecodacus!! I took a look at everything and tested it on my end.

I think this is ready to merge except I just have a couple of questions:

  1. Do you intend to remove all the debug messages in the terminal (the ones that print out processedMessages)? They were super helpful for me to see what is happening behind the scenes, btw! But I'm thinking it would be good to remove them before merging.
  2. Which LLMs have you tested this with? I tested with Qwen 2.5 Coder 32B and seemed to get suboptimal results compared to what I usually get, but then again, you never know with local LLMs; it could be a fluke.

I like this a lot, but yes, the console logging is very verbose. The way I would imagine handling this is to disable console logging for production builds, so the build command would turn it off (or something like that). For development and tracking down issues, this is great!
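
One way to get that behavior (a sketch, assuming a Vite-style build where import.meta.env.DEV is available) is a small logger that only prints during development:

```ts
// No-op in production builds; verbose only while developing.
const debugLog = (...args: unknown[]) => {
  if (import.meta.env.DEV) {
    console.log('[context]', ...args);
  }
};

// debugLog('processedMessages', processedMessages);
```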

thecodacus merged commit 8c4397a into stackblitz-labs:main on Dec 13, 2024
1 check passed
JJ-Dynamite pushed a commit to val-x/valenClient that referenced this pull request on Jan 29, 2025: feat(context optimization): Optimize LLM Context Management and File Handling