using enc_utf16() in prepare() and valid() functions is not working right. #2252
Here are some things the current, messy implementation tries to achieve:
The list will grow as I recall more 😵
Please note that any core changes related to this issue have to go to a topic branch. That includes my own changes. This is really, really tricky shit; I know that from the past. There are sooo many parameters and ways of starting john (let alone resuming).
I figured as much. This 'seems' like not that big of a deal, but I know that encoding issues are MESSY.
I agree on the branch. There was no way I was going to make any change, only to have 9 months go by before obscure stuff shows up totally busted.
From #2243
I think if we make sure that all current formats are properly protected from this issue (the formats listed in the OP of this thread), then at least we buy time to look at this. Encoding is very powerful, but damn, it sux digging through the complexities. Kinda like having something obscure in dynamic not work. Some little-looking issue can end up being 100 lines changed in 8 places.
IRL this is much less of an issue than it may seem. I believe the problems only show up when using input files that are not encoded in UTF-8? At least that was the case in #2243. All the affected formats are best used with UTF-8 input files. In the unlikely case that a user picked some legacy encoding instead, they really should specify it on the command line and not rely on any heuristics (the heuristics are mainly for LM, which is somewhat of a special case for several reasons). So possibly this is a "solution":
Please test cd4cf0d. It now bails with an error like this:
Similar fixes were made to the mscash and oracle formats.
An even simpler fix was made to as400-ssha1 in 33e4c9f.
Are you sure you want to call error() in this valid()? I would think a one-time message, followed by returning 0 for the bad hashes, would be more proper for valid(). If the user calls john with no -format=, then each format's valid() gets called blindly. It would suck to have an input file that john bailed out on, especially since the format does NOT know whether it is actually being used or is just being queried for validity.
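A minimal sketch of the "warn once, then return 0" behaviour suggested above. This is not the code from the commits mentioned in this thread: `sketch_valid()`, `field_is_decodable()` and the `warned` flag are hypothetical names, and the 7-bit ASCII test is only a placeholder for whatever decodability check a real format would use.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical placeholder check: accept only 7-bit ASCII input.
 * A real format would test whether the field decodes under the
 * encoding that is actually in effect. */
static int field_is_decodable(const char *s)
{
    for (; *s; s++)
        if ((unsigned char)*s & 0x80)
            return 0;
    return 1;
}

static int warned; /* becomes 1 after the first warning is printed */

/* Sketch of a valid()-style function: returns 1 to accept the hash,
 * 0 to reject it. It never aborts the program, so other formats can
 * still be probed with the same input file. */
static int sketch_valid(const char *ciphertext)
{
    assert(ciphertext != NULL);

    if (!field_is_decodable(ciphertext)) {
        if (!warned) {
            warned = 1;
            fprintf(stderr, "sketch: rejecting hash(es) with undecodable "
                            "input; consider specifying the input encoding\n");
        }
        return 0; /* reject this hash only */
    }
    return 1;
}
```

The point of the design is that a rejection stays local to one hash and one format: the message appears once no matter how many bad lines the input file contains, and the caller keeps going.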
You are right. Thanks!
Ok, this 'buys' us a bandage until we can figure out a real solution, OR we close this with a document stating that encoding usage is not valid until after init() has been called.
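If the "not valid until after init()" rule ends up being the documented answer, it could also be made checkable. The sketch below is purely illustrative: `encoding_ready`, `sketch_init()` and `sketch_to_utf16()` are hypothetical names, not part of john's API, and the byte-widening loop is a stand-in for a real conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: set once the encoding tables are configured. */
static int encoding_ready;

/* Hypothetical stand-in for the point where init() has run. */
static void sketch_init(void)
{
    encoding_ready = 1;
}

/* Hypothetical conversion wrapper. Widens each input byte to 16 bits
 * (a placeholder, not a real codepage conversion). dst must have room
 * for max + 1 elements. Returns the number of code units written, or
 * -1 if called before sketch_init(). */
static int sketch_to_utf16(unsigned short *dst, size_t max, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);

    if (!encoding_ready)
        return -1; /* caller used the encoding too early */

    while (src[n] && n < max) {
        dst[n] = (unsigned char)src[n];
        n++;
    }
    dst[n] = 0;
    return (int)n;
}
```

A prepare() or valid() that calls such a wrapper too early gets an explicit -1 instead of a silently wrong conversion, which would at least turn the documented restriction into something a test run can catch.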
if we're not seeing "this" format. See #2252.
Reference original issue #2243
Known formats using UTF encoding in prepare()/valid() functions (note: GPU formats will/should have the same issue)
These appear to be the formats, as they currently stand, that want to use the enc_to_utf16*() type functions (I did not check things like strlen16() or other unicode usages).