- Sponsor
-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some performance optimisations for Dict.merge and Dict.toList #5458
base: main
Are you sure you want to change the base?
Conversation
…ction from FSharpPlus so we can do type-checking in one loop. (The union function in FSharpPlus relies on Map.add internally so this is equivalent, but we have to be a bit careful which map we are inserting from and which one we are inserting into.) 2. Change Dict.toList to convert the dict to a list in one loop.
| _, vm, _, [ DDict(_vtTODO1, intoMap); DDict(_vtTODO2, fromMap) ] -> | ||
let f (accType, accMap) k v = | ||
TypeChecker.DvalCreator.dictAddEntry | ||
vm.threadID | ||
accType | ||
accMap | ||
(k, v) | ||
TypeChecker.ReplaceValue | ||
|
||
let initial = (_vtTODO1, intoMap) | ||
Map.fold f initial fromMap |> DDict |> Ply |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great -- some quick thoughts, regarding typechecking and those _vtTODO
s:
- I had only left those as TODOs because I didn't have a function like
dictAddEntry
available, which takes in the ValueType of the dict. - so we can rename those to
vt1
andvt2
(orvtLeft
andvtRight
-- whatever) and actually use them, as you've started to - I think the first part of the impl should likely be a call to
ValueType.merge
-- if it fails, then we raise some sort of RTE early, noting we can't merge the dicts - otherwise, the vt of
initial
(the first part in that tuple) should just be the merged vt -- and the rest of the code can stay as is.
so, I think
- your
f
fn is great - rename those vts
- edit the last 2 lines to something like:
match (vt1, vt2) with
| Ok vt ->
let initial = (vt, intoMap)
Map.fold ....
| Error ?? -> (some RTE failure -- take a peek at RuntimeTypes.fs and TypeChecker.fs to see which RTE to raise)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that makes sense. I remember a runtime error being raised both in TypeChecker.DvalCreator.dict
and dictAddEntry
which I think is the appropriate one you have in mind.
So I drafted the following code:
match VT.merge vt1 vt2 with
| Ok merged ->
let initial = (merged, intoMap)
Map.fold f initial fromMap |> DDict |> Ply
| Error() ->
let (k, v) =
if Map.isEmpty fromMap && Map.isEmpty intoMap then
// is this an impossible case?
else if Map.isEmpty fromMap then
Map.minKeyValue intoMap
else
Map.minKeyValue fromMap
RTE.Dicts.Error.TriedToAddMismatchedData(k, vt1, vt2, v)
|> RTE.Error.Dict
|> raiseRTE vm.threadID
The one thing that's bugging me about this draft is that this particular exception is constructed using a key and a value, but I think it's possible that the user deleted all key/value pairs from both maps, so we might not have the data we need to raise the exception in the Dict.merge
function.
What would you suggest here?
I think it's possible to make the (key, value) pair an option type and do some refactoring, but I'm unsure what the best solution here is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it sort of feels the situation (of 'merging' dicts) isn't fitting into any of the RTE cases/"shapes" that exist. Can you make a new one? Something like RTE.Dicts.Error.TriedToMergeMismatchedDicts(vt1, vt2)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, it sort of feels the situation (of 'merging' dicts) isn't fitting into any of the RTE cases/"shapes" that exist. Can you make a new one? Something like
RTE.Dicts.Error.TriedToMergeMismatchedDicts(vt1, vt2)
?
This suggestion makes sense.
In the latest two commits, I added anew case as you suggested and replaced the call to dictAddEntry
in the fold function with the simple Map.add
since we perform type-checking at the start of the builtin. I think this commit will probably be fine now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There don't seem to be any tests around this builtin. (looking at backend/testfiles/execution/stdlib/dict.dark
). Could you please add at least one?
Edit: Ok, no failure-case tests, in which this RTE would actually be raised..
In doing so, you'll see you have to "add"+handle this RTE case in a few (2) places -- in order to do this, I'd recommend looking for all instances of "TriedToAddMismatchedData" throughout the codebase, and following the pattern you see there. Namely, you'll have to adjust 2 files related to the in-Dark representation of RTEs, and the pretty-printing of such.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the thorough instructions; I tried to follow them as best I could, and the pattern matching is exhaustive now (at least on the F# side; no compile errors there, and I tried to be thorough on the .dark
side too), but I think there's an error on my part somewhere.
Here are the changes I attempted in the latest commit:
- Try making error pattern matching in
RuntimeTypesToDarkTypes.fs
exhaustive in functions:RuntimeError.Dicts.toDT" and
RuntimeError.Dicts.fromDT`. - Make pattern matching exhaustive in
languageTools/runtimeErrors.dark
- Add a pretty printer for the new error type in
prettyPrinter/runtimeError.dark
- Add a couple of tests for
Dict.merge
The problem I'm having is that it seems the case for pretty-printing this new error type isn't getting triggered, because running the new tests displays an entirely different message for the test in Expecto.
I tried figuring out why, and in my local copy, I replaced the return values for toDT
and fromDT
from RuntimeTypesToDarkTypes.fs
with exceptions to see if they get triggered, but it doesn't look like it to me.
(When I replace the body of another error case with an exception, I see that exception printed by Expecto, but I'm not seeing an exception printed for this case here.)
I'd appreciate any guidance when you're free to have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I see what's happening:
The case of "trying to merge mismatched dicts" is prevented by the function signatures (of both the Builtin and the Package fns) requiring Dict<'a> on the left and Dict<'a> on the right.
Basically, the RTE is currently unreachable, caught by earlier stuff.
If the fns looked like let merge (l: Dict<'a>) (r: Dict<'b>)
, then the RTE could be hit, as the fn-calling typechecking wouldn't catch this thing, but I'd say the signatures are actually right.
... in which case: I think we can actually delete the new RTE case, as it's seemingly unreachable.
And in its place, call upon Exception.raiseInternal
, (see usage elsewhere) which will turn into the UncaughtException
case of RTE.Error
. Please, when calling raiseInternal
, include metadata for vt1 and vt2. (though, I suspect that code will never be hit)
You can remove the new tests - we try to keep fn-arg-checking tests somewhere in testfiles/execution/language
.
Sorry for the back-and-forth here - it's been a bit since I was in this space, so didn't think of the fn sigs preventing this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been going through Andre Appel's book "Modern Compiler Implementation in ML" to hopefully build an interpreter for Standard ML (MLton takes some time compiling programs), and looking at places in the Darklang codebase which are new to me is a good lesson for that project that wouldn't have happened without the back-and-forth, so it's all good. 😅
I think the latest commit is fine in addressing your concerns, but let me know if there's anything else for me to improve on. (I removed the new error type, removed the new tests, and called raiseInternal
with the types as metadata.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work -- just some tiny feedback then I think this is good to merge.
… type checking to the Builin function (instead of relying dictAddEntry) and, if initial type checking passes, call 'Map.add' instead of 'dictAddEntry' in fold as 'dictAddEntry' would perform unnecessary type checking on every element due to being repeatedly called by 'Map.fold'
…smatchedDicts'
…ch expressions which were previously exhaustive before adding the new type, and add two tests correctly predicting failure of merging dicts of different types (but error string of these tests is incorrect; need to fix)
…ception is caught elsewhere, and add one test for failing case of 'Dict.merge' (when types mismatch)
…rguments as there are already general tests for checking that types of arguments are correct (somewhere in testfiles/execution/language)
There are two changes here:
** Avoid Map -> List -> Map conversion in
Dict.merge
**We use
dictAddEntry
again, which works. The FSharpPlusMap.union
function works by callingMap.add
for every key, so it's fine if we do the same (viadictAddEntry
).I think the only thing to note is that the initial type provided is
_vtTODO1
, which I don't think I can do anything about as I don't really know anything about type checking.I also changed the name of the maps to
fromMap
(the map we want to insert from/the one whose keys will be present in the final map) andintoMap
(the map we want to insert into which will have its values overwritten). I think that's just a little clearer.** Convert dict to list in one loop instead of two in
Dict.toList
**The initial implementation of
Dict.toList
involved:DList
containingDTuple
s (which involves traversing the whole list).I think we can do a little better in terms of performance, so we only perform one traversal to get the same result.
The technique is using the
Map.foldBack
function (which applies the provided fold function, starting from the highest key in the map, descending to the lowest key).During each key-value pair fold, we can convert they pair to a
DTuple
and add the tuple to the start of the accumulator which is a list. We always get a list which is in order that way.This REPL code demonstrates the trick:
Most of my programming before has been in solo projects and I don't know what things I'[ve picked up that are common knowledge, so I thought it was safe to be detailed about the thought process. 😅