Replies: 2 comments
-
You can see exactly how the prompt is crafted by previewing it with
I haven't decided yet. I would like to leverage the fact that you can tag your instructions and add commentary, in the hope that this will produce a much more accurate vectorization. Right now, the approach I have in my head is to vectorize only the references and memoize the vectorization, rebuilding the index only when the references have actually changed. We can chunk the references to make them fit the model's context constraints. I know RAG is something that will definitely be added at some point, but at the moment there are other, more crucial features I would like to have that will help me work on the package, such as linked instructions. RAG will essentially free you from having to supply query tags, which is very lovely.
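A minimal sketch of that memoization idea, assuming some embedding function is available; none of these names are from the package, they are purely illustrative:

```elisp
;; Sketch: re-embed references only when their contents change.
;; All names here are hypothetical, not part of the package.
(defvar my/ref-index-cache nil
  "Cons of (HASH . INDEX) for the last vectorized reference set.")

(defun my/chunk-string (text size)
  "Split TEXT into chunks of at most SIZE characters."
  (let (chunks)
    (while (> (length text) size)
      (push (substring text 0 size) chunks)
      (setq text (substring text size)))
    (push text chunks)
    (nreverse chunks)))

(defun my/reference-index (refs embed-fn chunk-size)
  "Return a vector index for REFS, rebuilding only when they change.
REFS is a list of strings; EMBED-FN embeds one chunk; CHUNK-SIZE
bounds chunk length so each chunk fits the model's context."
  (let ((hash (sxhash-equal refs)))
    (unless (and my/ref-index-cache
                 (eql (car my/ref-index-cache) hash))
      (setq my/ref-index-cache
            (cons hash
                  (mapcar embed-fn
                          (mapcan (lambda (r)
                                    (my/chunk-string r chunk-size))
                                  refs)))))
    (cdr my/ref-index-cache)))
```

The cache key is a hash of the full reference set, so an unchanged set costs one hash comparison instead of a re-embedding pass.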
-
Thanks for the explanation.
-
I’ll install and try ASAP, but wanted to ask: are “instructions” text that’s sent with the prompt as-is, on the assumption that it won’t be truncated to fit the context? I’m guessing that with Gemini models and others with huge context windows it’s not a problem, but with ollama, which I believe defaults to 2k, this could be an issue (I think gptel might have it hardcoded to 4k or 8k, if I remember correctly).
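For what it’s worth, Ollama’s default context window can be raised per request through the API’s `options.num_ctx` parameter. A rough sketch of how that might look in a gptel Ollama backend, assuming a gptel version that supports `:request-params`; the host and model are placeholders:

```elisp
;; Sketch: raise Ollama's default (~2k-token) context window via the
;; API's options.num_ctx parameter. Assumes a gptel version with
;; :request-params support; host and model are placeholders.
(gptel-make-ollama "Ollama"
  :host "localhost:11434"
  :stream t
  :models '(mistral:latest)
  :request-params '(:options (:num_ctx 8192)))
```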
And in this context, what are your plans for RAG? I’ve been trying elisa for a while, but lately it simply hangs while vectorizing the prompt, or is very slow. But the idea sounds cool.