Is there a way to only load the encoding needed? #23

Closed
blackdiz opened this issue May 3, 2023 · 5 comments
Comments

Contributor

blackdiz commented May 3, 2023

Hey there, thanks for your hard work. We're interested in using this library on mobile, but we noticed that the initialization process takes some time. We dug into the code and saw that DefaultEncodingRegistry.initializeDefaultEncodings() loads all the encodings. We only require the r50k_base.tiktoken encoding, so is there a way to load just that one and speed up the initialization?

Contributor

tox-p commented May 3, 2023

Not currently, but I would be open to adding such functionality.

Would adding a new LazyEncodingRegistry fit your needs? It would not initialize any encodings on construction and would instead load each encoding lazily on the first getEncoding call for that encoding.
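
To illustrate the lazy-loading idea proposed above, here is a minimal sketch. It is not the library's code: `SketchEncoding`, `LazyRegistrySketch`, the string-keyed lookup, and the `loadCount` counter are all hypothetical names introduced only for this example, and the loader function stands in for whatever actually parses an encoding file.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical stand-in for a loaded encoding; not the library's type. */
final class SketchEncoding {
    final String name;

    SketchEncoding(String name) {
        this.name = name;
    }
}

/** Hypothetical registry: loads nothing on construction, loads each encoding on first request. */
final class LazyRegistrySketch {
    private final Map<String, SketchEncoding> cache = new ConcurrentHashMap<>();
    private final Function<String, SketchEncoding> loader;

    /** Counts how often the loader ran, only so the laziness is observable. */
    final AtomicInteger loadCount = new AtomicInteger();

    LazyRegistrySketch(Function<String, SketchEncoding> loader) {
        this.loader = loader;
    }

    SketchEncoding getEncoding(String name) {
        // computeIfAbsent runs the loader at most once per key, even under concurrent access.
        return cache.computeIfAbsent(name, key -> {
            loadCount.incrementAndGet();
            return loader.apply(key);
        });
    }
}
```

With this shape, constructing the registry costs almost nothing, and an application that only ever asks for `r50k_base` pays the loading cost for that one encoding, on its first lookup.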

Contributor Author

blackdiz commented May 3, 2023

Sure, that would be much appreciated!

@blackdiz blackdiz closed this as completed May 3, 2023
@blackdiz blackdiz reopened this May 3, 2023
Contributor

tox-p commented May 3, 2023

I am currently a little bit busy :) Could you open a PR with the change? If I am not mistaken, it should be pretty straightforward: extract the common functionality of DefaultEncodingRegistry into an AbstractEncodingRegistry, rename DefaultEncodingRegistry to EagerEncodingRegistry, create the LazyEncodingRegistry alongside it, and expose it via a new newLazyEncodingFactory in the Encodings class.
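
The refactoring outlined above could be shaped roughly as follows. This is only a sketch under assumptions: the class names `AbstractRegistrySketch`, `EagerRegistrySketch`, and `LazyVariantSketch` are hypothetical, encodings are represented as plain strings, and the list of known names is illustrative rather than taken from the library.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical base class holding the logic shared by the eager and lazy variants. */
abstract class AbstractRegistrySketch {
    // Illustrative list of encoding names; an assumption for this sketch.
    static final List<String> KNOWN =
            List.of("r50k_base", "p50k_base", "p50k_edit", "cl100k_base");

    private final Map<String, String> loaded = new ConcurrentHashMap<>();

    /** Loads an encoding once and caches it; a real version would parse a resource file here. */
    protected final String loadIfAbsent(String name) {
        if (!KNOWN.contains(name)) {
            throw new IllegalArgumentException("Unknown encoding: " + name);
        }
        return loaded.computeIfAbsent(name, key -> "ranks-for-" + key);
    }

    final String getEncoding(String name) {
        return loadIfAbsent(name);
    }

    final int loadedCount() {
        return loaded.size();
    }
}

/** Eager variant: pays the full loading cost up front, in the constructor. */
final class EagerRegistrySketch extends AbstractRegistrySketch {
    EagerRegistrySketch() {
        KNOWN.forEach(this::loadIfAbsent);
    }
}

/** Lazy variant: adds nothing; encodings load on the first getEncoding call. */
final class LazyVariantSketch extends AbstractRegistrySketch {
}
```

The two variants differ only in when `loadIfAbsent` first runs, which is why pulling the shared logic into a base class keeps both subclasses small.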

Contributor Author

blackdiz commented May 4, 2023

OK, I don't have any experience contributing to open-source projects, but I'll give it a try.

Contributor

tox-p commented May 16, 2023

Thanks for the implementation 😊 This feature is released as part of 0.5.0 and should soon be available on Maven Central.
