-
-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
moving towards functional programming in sisl #665
Conversation
Once #393 is merged, the code should be moved to |
da2f353
to
f4e194e
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #665 +/- ##
==========================================
- Coverage 87.62% 86.77% -0.85%
==========================================
Files 363 359 -4
Lines 48870 45739 -3131
==========================================
- Hits 42820 39688 -3132
- Misses 6050 6051 +1 ☔ View full report in Codecov by Sentry. |
If I understand correctly, the goal is to have the methods of Love it! This will make it much easier to use these functions as nodes :) And also to play with the composite geometries. |
Yes, that is the plan. Eventually, some of them should also be completely removed from the When we get to it, it would be valuable with some more feedback on the approach, see my "worries" section above. |
@pfebrer do you have anything to say about the way to implement things above? Should we use dispatch methods, or? |
I think a dispatcher is nice to centralize everything, but it shouldn't be the way that users would be expected to use The type hinting is also not trivial to make it work, but it could be done. However, it would be very nice to use the dispatcher internally. I would propose that the dispatcher exists, but when a class is registered to it, it also creates the corresponding method on the class. Some advantages that I can think of:
The creation of the methods could be done at runtime or on the source code if doing it at runtime causes a significant overhead. |
@pfebrer could you have a look now, I think this method seems appropriate. It automatically creates the dispatch method, if needed, and exposes it (perhaps the way of exposing it should be done differently). Hence one can now do: sisl.tile(geometry, 2, 2)
# which is equivalent to
geometry.tile(2, 2) Right now I have only done this for the |
3b00ab8
to
fa65b5e
Compare
@@ -144,13 +144,13 @@ | |||
from numpy import argsort, dot, pi, sum | |||
|
|||
import sisl._array as _a | |||
from sisl._core.grid import Grid |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.grid
@@ -144,13 +144,13 @@ | |||
from numpy import argsort, dot, pi, sum | |||
|
|||
import sisl._array as _a | |||
from sisl._core.grid import Grid | |||
from sisl._core.lattice import Lattice |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.lattice
@@ -353,11 +352,12 @@ | |||
|
|||
# Since all these things depend on other elements | |||
# we will simply import them here. | |||
from sisl.physics.electron import wavefunction |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl.physics.electron
Looks awesome! I just had a quick glance at it, later I will have a deeper look, but I have some doubts:
|
I think so, I don't think users should get used to doing
Agreed, and
Agreed! The re-arrangements of things can be done later, that will be minor changes that won't affect end users. |
documentation looks fine, except that the |
Ok 👍
Yes, my only worry is that once you rearrange things everything basically ends up inside
I can probably fix this, I have a Master in modifying function signatures hahah |
Perhaps the idea would be to:
|
|
||
__all__ = ["Geometry", "sgeom"] | ||
from .atom import Atom, Atoms |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.atom
|
||
__all__ = ["Geometry", "sgeom"] | ||
from .atom import Atom, Atoms | ||
from .lattice import Lattice, LatticeChild |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.lattice
__all__ = ["Geometry", "sgeom"] | ||
from .atom import Atom, Atoms | ||
from .lattice import Lattice, LatticeChild | ||
from .orbital import Orbital |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.orbital
from .utils.mathematics import fnorm | ||
from sisl.utils.mathematics import fnorm | ||
|
||
from .geometry import Geometry |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.geometry
from sisl.utils.mathematics import fnorm | ||
|
||
from .geometry import Geometry | ||
from .lattice import BoundaryCondition, Lattice, LatticeChild |
Check notice
Code scanning / CodeQL
Cyclic import Note
sisl._core.lattice
Do you think there can be a way to treat associated data in a general way? E.g. imagine in In some sense But still for example for the def tile(geometry, ...):
xyz = geometry.xyz.tile(geometry.lattice, ...)
lattice = geometry.lattice.tile(...)
atoms = geometry.lattice.tile(...)
return geometry.__class__(xyz, atoms, lattice) So in essence sisl.Geometry.associated_data = ["xyz", "lattice", "atoms"]
# For the translate ufunc just apply the ufunc on everything, the order doesn't matter
# Perhaps this isn't even needed and the ufunc dispatcher could directly look for associated
# data if there is no implementation for Geometry.
register_ufunc_wrapper(sisl.Geometry, "translate")
# Define how to apply the tile ufunc on Geometry
register_ufunc_wrapper(sisl.Geometry, "tile", [
("xyz", ["lattice"]),
... # Meaning the rest, without a required order.
]) It doesn't need to be exactly like that, but you get the idea. Maybe it is too abstract, but I just put my ideas here because you might come up with a version of it that is less crazy :) |
def sub( | ||
coefficient: Coefficient, index: IndexArgument, inplace: bool = False | ||
) -> Optional[Coefficient]: |
Check notice
Code scanning / CodeQL
Explicit returns mixed with implicit (fall through) returns Note
|
||
|
||
@register_sisl_dispatch(module="sisl.physics") | ||
def sub(state: State, index: IndexArgument, inplace: bool = False) -> Optional[State]: |
Check notice
Code scanning / CodeQL
Explicit returns mixed with implicit (fall through) returns Note
def sub( | ||
statec: StateC, index: IndexArgument, inplace: bool = False | ||
) -> Optional[StateC]: |
Check notice
Code scanning / CodeQL
Explicit returns mixed with implicit (fall through) returns Note
ee13f5d
to
b4b04e3
Compare
Let's deal with that later. Right now this is already getting very complicated, it shouldn't affect this PR. |
@pfebrer you might know how to do this, In commit b4b04e3 you can see I have disabled lots of documentation with type-hinting. Basically, sphinx and the way the typehinting was enforced in my changes would not allow it to work. The basic problem is that
I have seached around, and can find numerous bug reports on python, sphinx, sphinx-autodoc-typehints, and basically Python is at fault here due to ForwardRefs being fully resolved. So the other tools would require some hackery to fix it. I don't know how, but I also don't want type-hinting to be a major burden. So the current way (on this branch) seems the simplest to work-around these issues, however, it will have some problems for the build documentation. |
Some day next week I'll try to build the documentation and see what it looks like. By the way, a certain ufunc has common arguments that should work for all objects + some arguments that are specific to the object. I think it would be nice to define the interface of the ufunc for the common arguments, and add some documentation that indicates what the operation is expected to do in a generic way. E.g. here: I would like that the signature is something like:
And the docstring explains what the operation is expected to do in an abstract way. Maybe in the parameters section of the docstring you could have a list of what types I believe that then we could expect that people really starts using |
I would very much like that, however, the problem of inspecting the type-hints is what is causing circular imports, and hence I can't get this to work. diff --git a/src/sisl/_ufuncs.py b/src/sisl/_ufuncs.py
index 3f2fdc622..71292988c 100644
--- a/src/sisl/_ufuncs.py
+++ b/src/sisl/_ufuncs.py
@@ -3,9 +3,10 @@
# file, You can obtain one at https://mozilla.org/MPL/2.0/.
from __future__ import annotations
+from collections import OrderedDict
from functools import singledispatch
from textwrap import dedent
-from typing import Callable, Optional
+from typing import Any, Callable, Dict, Optional, get_type_hints
from sisl.messages import SislError, warn
@@ -52,6 +53,37 @@ def register_sisl_function(name, cls, module=None):
return decorator
+def _args_intersection(args1: Dict, args2: Dict, skip: int = 1, order: bool = True):
+ """The intersection of two dictionaries containing typed arguments
+
+ Parameters
+ ----------
+ args1, args2 : Callable[..., Any]
+ the functions to compare and return the argument list from.
+ This function will only compare typed arguments (may change later)
+ skip: int
+ number of initial arguments to be skipped from the comparison
+ order: bool
+ whether the order of the arguments matter, or if they should just
+ be the overlap of arguments
+ """
+ args = OrderedDict()
+ args1 = _ordereddict_strip_first(args1, skip)
+ args2 = _ordereddict_strip_first(args2, skip)
+ if order:
+ for (n1, t1), (n2, t2) in zip(args1.items(), args2.items()):
+ if n1 == n2 and t1 == t2:
+ args[n1] = t1
+ else:
+ break
+ else:
+ inter = args1.keys() & args2.keys()
+ for n in inter:
+ if args1[n] == args2[n]:
+ args[n] = args1[n]
+ return args
+
+
def _append_doc_dispatch(method: Callable, cls: type):
"""Append to the doc-string of the dispatch method retrieved by `method` that the `cls` class can be used"""
global _registry
@@ -64,9 +96,47 @@ def _append_doc_dispatch(method: Callable, cls: type):
# Append to doc string
doc = (
- f"\n{cls.__name__}.{name}: equivalent to '{name}({cls.__name__.lower()}, ...)'."
+ f"{cls.__name__}.{name}: equivalent to '{name}({cls.__name__.lower()}, ...)'.\n"
)
method_registry.__doc__ += doc
+ doc = method_registry.__doc__
+
+
+def _ordereddict_strip_first(d: Dict, skip: int):
+ out = OrderedDict()
+ for i, item in enumerate(d.items()):
+ if i < skip:
+ continue
+ out[item[0]] = item[1]
+ return out
+
+
+_param = "Parameters\n----------\n"
+_see_also = "See Also"
+
+
+def _args_to_doc(args):
+ return _param + "\n".join(map(lambda x: f"{x[0]}: {x[1]}", args.items()))
+
+
+def _patch_doc_parameters(method_registry):
+ """Amend the documentation by adding the parameters section, possibly overwriting the old one"""
+
+ doc = method_registry.__doc__
+ args = method_registry.__sisl_args__
+
+ # strip parameters section and put in a new one
+ iloc = [0, 0]
+ iloc[0] = doc.find(_param)
+ if iloc[0] >= 0:
+ iloc[1] = doc.find(_see_also)
+ doc = doc[: iloc[0]] + doc[iloc[1] :]
+
+ param_docs = _param + "cls:\n" + _args_to_doc(args) + "\n"
+
+ iloc = doc.find(_see_also)
+ doc = doc[:iloc] + param_docs + "\n" + doc[iloc:]
+ method_registry.__doc__ = doc
def register_sisl_dispatch(
@@ -79,7 +149,7 @@ def register_sisl_dispatch(
global _registry
if method is None and module is None:
return lambda f: register_sisl_dispatch(cls, method=f)
- elif method is None:
+ if method is None:
if isinstance(module, str):
return lambda f: register_sisl_dispatch(cls, module, method=f)
return register_sisl_dispatch(cls, method=module)
@@ -104,6 +174,10 @@ def register_sisl_dispatch(
method_registry.__doc__ = doc
method_registry.__name__ = name
method_registry.__module__ = "sisl"
+ method_registry.__sisl_args__ = _ordereddict_strip_first(
+ get_type_hints(method), 1
+ )
+
_registry[name] = singledispatch(method_registry)
# Retrieve the dispatched method
@@ -121,6 +195,12 @@ def register_sisl_dispatch(
_append_doc_dispatch(method, cls)
register_sisl_function(name, cls)(method)
+ # retrieve the overlapping arguments
+ args = get_type_hints(method)
+ args = _args_intersection(args, method_registry.__sisl_args__)
+ method_registry.__sisl_args__ = args
+ _patch_doc_parameters(method_registry)
+
return method It simply results in:
which is the problem mentioned above. Lastly, with the current state, the doc will look like this (and they are added in a new segment called I think the above looks kind of OK. |
I don't understand, why do you need l inspect the typehints to define a generic signature for I think @tile.register(Geometry)
def tile(...):
... Then Creating dispatchers on the fly is also a good feature but I think the ufuncs provided as "builtins" by |
Agreed, but on the other hand, you would still have to read the documentation for I think for now this is ok, adding this later is just a matter of defining the ufuncs the first time and change the API. So it can easily be fixed. Probably the best thing to do now would be to finalize the methods that needs moving to the functional paradigm. Then we can always take it from there. |
Wouldn't the import problem disappear if you first import the objects (classes) and then the functions on Anyways I think it makes more sense to import them in this order: objects, then functions. Because functions make no sense without objects 😅 |
By the way, does this functional paradigm support inheritance? Should it? E.g. if you do: class MyGeometry(Geometry):
...
geom = MyGeometry(...)
sisl.tile(geom) Does it work? |
Yes, check this out: from functools import singledispatch
class A:
pass
class B(A):
pass
class C(B):
pass
class D(C):
pass
@singledispatch
def test(a):
print('P', type(a))
@test.register(A)
def _(a):
print('A', type(a))
@test.register(B)
def _(a):
print('B', type(a))
@test.register(C)
def _(a):
print('C', type(a))
test(A())
test(B())
test(C())
test(D()) which works as expected. :) This was high on the list, otherwise it wouldn't have been feasible. However, there are still some things that are annoying:
|
Great!
Hmm that leads into thinking that the right approach would be to have a separate class for each spin type matrix 😅 I know, crazy, but that is what I thought hahah |
Hey, while there aren't standard signatures for ufuncs, could the default signature be
|
Good idea, however I played a bit with it and the documentation for the objects methods turns out to be confusing. They will mention and object argument that isn't part of the api. |
do we have more details here, or should we start with getting this in? @pfebrer have you played with the docs here? |
I haven't looked at the docs, I thought everyhting was solved now. What are the things that are left to fix? Apart from it I think it is good to just merge this, since for now it doesn't result in any change on the API. One thing that I believe would be very important to do later is to define a generic description and interface for sisl ufuncs, since otherwise there is nothing "universal" about them :) |
I hope this may be of use to somebody ;) |
If not the interface, at least the description. There is some reason for the functions to share a common name, so that could be explained. |
This commit separates the geometry from the global name-space, and puts it in a sub-module for it self. This enables one to more coherently add more functionalities to the geometry, without disrupting the entire code-base. Signed-off-by: Nick Papior <[email protected]>
Now all sisl core functionality should be placed in the _core module. This should adapt the needs for future additions more easily and aid in the modularization of the code structure. As it is, nothing needs to be changed in terms of usage. One thing that does make it simpler to use is that the aim of sisl is to make the methods that expose a functional approach be a dispatch method. For instance: def tile(geometry: Geometry, ...) def tile(lattice: Lattice, ...) are now defined once, and then attached to the appropriate class of the same name. Simultaneously the method is hosted in a dispatch form so that: sisl.tile(geom, ...) can be done. A call sisl.tile(0, ...) will result in a SislError since the first argument has not been registered. As such, one cannot really see anything. To gain access to the direct method (with appropriate documentation), simply do: geometry_tile = sisl.tile.dispatch(sisl.Geometry) Signed-off-by: Nick Papior <[email protected]>
These accommodations required an upgrade of the typing system. Sadly there is a *bug*/*feature-request* that broke >=3.9 with missing names. It is related to this: https://bugs.python.org/issue41987 Basically the AtomsArgument can't be used in the type-hinting because the ForwardRef lookup tries to resolve it, regardless of whether we are in a TYPE_CHECKING state or not. Hence it *must* be defined before we can use it. This happens, even if __future__ annotations has been imported. This forces us to define them via strings manually, *sigh*. Signed-off-by: Nick Papior <[email protected]>
Added for the physics/state.py code. Also fixed cases where derived classes got a method assigned through the dispatch method. These were wrongly warned before. Now they will be ignored. Signed-off-by: Nick Papior <[email protected]>
Sadly, the register function of singledispatch does not handle the ForwardRefs very elegantly as the objects references *needs* to be available in the global space. This is rather unfortunate as this is exactly what we are trying to avoid using __future__ annotations. I'll have to see later if this causes issues in the documentation build. Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Added overwriting class name for the ufunc registration. Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
b0c91f5
to
b0427f9
Compare
|
||
from .sparse import * | ||
from .sparse_geometry import * | ||
from ._core import * |
Check notice
Code scanning / CodeQL
'import *' may pollute namespace Note
sisl._core
@@ -5,8 +5,8 @@ | |||
|
|||
import sisl._array as _a | |||
from sisl import SislError | |||
from sisl._core.sparse import _rows_and_cols |
Check notice
Code scanning / CodeQL
Unused import Note
|
||
from functools import singledispatch | ||
from textwrap import dedent | ||
from typing import Callable, Optional |
Check notice
Code scanning / CodeQL
Unused import Note
from itertools import product | ||
from math import acos | ||
from numbers import Integral, Real | ||
from pathlib import Path | ||
from typing import TYPE_CHECKING, Iterator, List, Optional, Sequence, Tuple, Union | ||
from typing import Iterator, List, Optional, Sequence, Tuple, Union |
Check notice
Code scanning / CodeQL
Unused import Note
|
||
if scale_atoms: | ||
# Atoms are rescaled to the maximum scale factor | ||
atoms = geometry.atoms.scale(max_scale) |
Check failure
Code scanning / CodeQL
Potentially uninitialized local variable Error
This commit separates the geometry from the
global name-space, and puts it in a sub-module
for it self.
This enables one to more coherently add
more functionalities to the geometry, without
disrupting the entire code-base.
isort .
andblack .
at top-leveldocs/
CHANGELOG.md
This is primarily a placeholder for future developments.
I think that this functional approach, much like numpy's, could be very instructive.
Methods could be added at will to the
Geometry
class, and patched as needed in user-codes.I believe this functional approach would also make the passivation requests more useful as the documentation could be made simpler and more separated.
My main concerns right now are these:
translate
in your own code, and then all internals would also use that oneg_
? Or something else? We don't want to type too much, but we should also consider that many of the methods also exists forLattice
.g_
prefixes, be a dispatch method based on the class of the first argument?I.e.
translate
is the exposed method, and then the real method is called based on the type of the first object.Geometry
classThis is mainly for my own remembrance!