Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hang when parsing unclosed comment #37

Closed
wants to merge 2 commits into from

Conversation

cebtenzzre
Copy link
Contributor

Since c73b53f, minja hangs when parsing the template {#, e.g.:

#include <minja/minja.hpp>
int main() {
    try {
        minja::Parser::parse("{#", {});
    } catch (...) {
        return 1;
    }
}

This program should exit with status code 1, but instead it hangs.

I tried a different fix for the lack of std::regex::multiline on MSVC, and it no longer hangs. Also, all of the tests pass—it seems like the version with multiline was also buggy. multiline makes anchors match at the end of a line, which I think is the opposite of what was intended.

The ability to parse incomplete templates without crashing or hanging is important to GPT4All, which parses the template as it is being typed in the Settings page. This provides instant feedback on whether it is valid.

I'd like to add a regression test for this change. Let me know where it belongs. Or, you can add one if you'd like.

@cebtenzzre
Copy link
Contributor Author

The CI failure is not related to this PR:

Error processing model meta-llama/Llama-3.1-8B-Instruct: 401, message='Unauthorized', url='https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/raw/main/tokenizer_config.json'

@ochafik
Copy link
Collaborator

ochafik commented Feb 1, 2025

I tried a different fix for the lack of std::regex::multiline on MSVC, and it no longer hangs. Also, all of the tests pass—it seems like the version with multiline was also buggy. multiline makes anchors match at the end of a line, which I think is the opposite of what was intended.

Thank you so much for diving into this! (my Windows debugging setup has been quite suboptimal - VNC on a EC2 instance ☠️ - #40 is going to need love soon)

I'd like to add a regression test for this change. Let me know where it belongs. Or, you can add one if you'd like.

Something in tests/test-syntax.cpp would be ace

@ochafik
Copy link
Collaborator

ochafik commented Feb 1, 2025

The CI failure is not related to this PR:

Ah, somehow it's not giving you access to the HF_TOKEN so no gated models for you :-S (did you try and run scripts/run_tests.sh locally?

There also seems to be failures in test-capabilities, more worrying. Testing out your patch :-)

@ochafik
Copy link
Collaborator

ochafik commented Feb 3, 2025

The following doesn't seem to work (assuming i patched it right):

{#-
...
-#}

@cebtenzzre
Copy link
Contributor Author

This is causing a parsing error on Windows that is not seen on Linux or macOS:

Unexpected character at row 1, column 13:
{% if foo %}
            ^
{% endif %}

Also, the tests are failing on Windows.

It seems like std::regex in the Microsoft STL may handle newlines differently. I think the reason std::regex::multiline is not available is that it is stuck ON by default on Windows, with no way to disable it.

@ochafik
Copy link
Collaborator

ochafik commented Feb 3, 2025

It seems like std::regex in the Microsoft STL may handle newlines differently. I think the reason std::regex::multiline is not available is that it is stuck ON by default on Windows, with no way to disable it.

Yes there's a lot going on on Windows (newlines being \r\n and MS's interpretations of stdc++ lib compliance being even more creative than Apple's). Hopefully #45 gets us going.

@ochafik ochafik closed this Feb 4, 2025
@ochafik
Copy link
Collaborator

ochafik commented Feb 4, 2025

@cebtenzzre thanks again for preparing this

@cebtenzzre
Copy link
Contributor Author

@ochafik I found a fix that works even with the nonconforming Microsoft STL, and added a test. #45 did not fix the issue. Could this PR be reopened?

https://github.com/nomic-ai/minja/tree/fix-opencomment-hang

@ochafik
Copy link
Collaborator

ochafik commented Feb 5, 2025

Ohh, sorry about this / thanks for keeping at it!

@ochafik
Copy link
Collaborator

ochafik commented Feb 5, 2025

Doesn’t seem to let me reopen it, necromancy has limits it seems. Could you please open a new one?

@cebtenzzre
Copy link
Contributor Author

Doesn’t seem to let me reopen it, necromancy has limits it seems. Could you please open a new one?

Done: #49

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants