Some extra features? #3

Open
vans163 opened this issue Jan 22, 2020 · 6 comments

Comments


vans163 commented Jan 22, 2020

What are your thoughts on join/leave subscriptions and allowing metadata with the registration?

Owner

max-au commented Jan 27, 2020

Yes, I am working on these two, but do not have any specific timeframe yet.

Author

vans163 commented Jan 27, 2020

Waiting for ForgETS as well :P. It would be nice to try it even if it's not ready (if it's planned to be open source).

Owner

max-au commented Feb 9, 2023

Better late than never... pg monitoring (join/leave subscriptions) was implemented around October 2021 and is available in OTP 25.1 and above. Of course, spg has it as well.
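
For readers who have not used the monitoring API yet, here is a minimal sketch of a join/leave subscription in Elixir. It assumes OTP 25.1 or later and the default `pg` scope; the group name `{:demo_group, 1}` and the throwaway member process are made up for illustration.

```elixir
# Start the default pg scope if it is not already running.
case :pg.start_link() do
  {:ok, _pid} -> :ok
  {:error, {:already_started, _pid}} -> :ok
end

group = {:demo_group, 1}  # hypothetical group name

# Subscribe to the group; the current members come back with the reference.
{ref, members_at_subscribe} = :pg.monitor(group)
IO.inspect(members_at_subscribe, label: "members at subscribe")

# A throwaway member process that stays alive until told to stop.
member =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

:ok = :pg.join(group, member)

# The subscriber gets {ref, :join, group, pids} for every join.
joined =
  receive do
    {^ref, :join, ^group, pids} -> pids
  after
    1_000 -> :timeout
  end

# When the member exits, pg removes it and sends {ref, :leave, group, pids}.
send(member, :stop)

left =
  receive do
    {^ref, :leave, ^group, pids} -> pids
  after
    1_000 -> :timeout
  end

IO.inspect({joined, left}, label: "joined / left")
:pg.demonitor(ref)
```

`:pg.monitor_scope/0` works the same way but covers every group in the scope and returns a map of group to members.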

Metadata is a trickier question; I still don't have a performant enough implementation. But there is a workaround based on full scope monitoring: we run a few more processes that monitor the groups, and those processes attach the necessary metadata.
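
The implementation is not shown in this thread, so the following is only one possible reading of the idea: a process subscribes to the whole scope and keeps its own map of members with metadata attached. The module name `MetaTracker` and the `meta_fun` callback are hypothetical, and the sketch assumes OTP 25.1+ with the default `pg` scope already started.

```elixir
defmodule MetaTracker do
  # Hypothetical helper, not part of pg or spg.
  use GenServer

  # meta_fun is a hypothetical callback: (group, pid) -> metadata
  def start_link(meta_fun), do: GenServer.start_link(__MODULE__, meta_fun, name: __MODULE__)

  # Returns %{pid => metadata} for one group.
  def members(group), do: GenServer.call(__MODULE__, {:members, group})

  @impl true
  def init(meta_fun) do
    # Subscribe to every group in the default scope.
    {ref, groups} = :pg.monitor_scope()
    tracked = Map.new(groups, fn {group, pids} -> {group, attach(meta_fun, group, pids)} end)
    {:ok, %{ref: ref, meta_fun: meta_fun, groups: tracked}}
  end

  @impl true
  def handle_call({:members, group}, _from, state) do
    {:reply, Map.get(state.groups, group, %{}), state}
  end

  @impl true
  def handle_info({ref, :join, group, pids}, %{ref: ref} = state) do
    added = attach(state.meta_fun, group, pids)
    groups = Map.update(state.groups, group, added, &Map.merge(&1, added))
    {:noreply, %{state | groups: groups}}
  end

  def handle_info({ref, :leave, group, pids}, %{ref: ref} = state) do
    groups = Map.update(state.groups, group, %{}, &Map.drop(&1, pids))
    {:noreply, %{state | groups: groups}}
  end

  # Ignore anything that is not a pg notification.
  def handle_info(_other, state), do: {:noreply, state}

  defp attach(meta_fun, group, pids), do: Map.new(pids, &{&1, meta_fun.(group, &1)})
end
```

A caller would start it with something like `MetaTracker.start_link(fn _group, _pid -> System.monotonic_time() end)` and read `MetaTracker.members(group)`. The sketch tracks each pid once per group, so a process that joins the same group several times is not counted several times.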

Author

vans163 commented Feb 12, 2023

I used a hack for metadata where the group name itself carried the metadata, and the query became a direct ETS lookup. I think it was mentioned on the OTP issue tracker as well.

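# `worker` is assumed to be a map or struct with a :uuid field,
# and `job_weight` the known cost of this inference (e.g. 0.25).
worker = %{uuid: :physical_node_X}
:pg.join({PGInferenceWeight, worker.uuid, job_weight}, self())

    def inference_job_weight(worker_uuid) do
        # Relies on pg internals: the scope's ETS table (named :pg by default)
        # holds {group, all_pids, local_pids} rows.
        :ets.select(:pg, [{
            {{PGInferenceWeight, worker_uuid, :"$1"}, :"$2", :_},
            [],
            [{{:"$1", :"$2"}}]
        }])
        |> Enum.reduce(0, fn {weight, pids}, acc ->
            # Each member of a group contributes that group's weight.
            acc + weight * length(pids)
        end)
    end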

Example: we have a physical worker node that does AI inference, and it can run more than one inference in parallel. Because different types of inference take different amounts of resources, and we know the cost upfront, we assign a weight to each. A job joins the pg group; to schedule, we query the groups and sum the weights, and if cum_weight >= 1 we do not queue further inferences on that node.
If anything goes wrong, for example a client randomly drops, the inference request automatically leaves the group (pg removes a process when it exits), and cum_weight adjusts the next time it's queried. This leads to much cleaner-looking (and less buggy/racy) code.
In the super rare case of a race (there are no locks or sync primitives), we don't really care: the node will simply be a bit overloaded for the next few seconds. So it becomes best-effort load balancing.
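
The admission check described above could be sketched roughly as follows. This variant uses only the documented pg API (`:pg.which_groups/0` and `:pg.get_members/1`) instead of the direct ETS select, so it scans every group in the scope and will be slower. The module name `InferenceGate` and its functions are hypothetical, and the default `pg` scope is assumed to be running.

```elixir
defmodule InferenceGate do
  # Hypothetical helper, not part of the original code base.

  # Sum of weight * member count over all groups belonging to this worker.
  def cum_weight(worker_uuid) do
    for {PGInferenceWeight, ^worker_uuid, weight} = group <- :pg.which_groups(), reduce: 0 do
      acc -> acc + weight * length(:pg.get_members(group))
    end
  end

  # Best-effort admission: check and join are not atomic, so two callers
  # can both pass the check and briefly overload the node.
  def try_admit(worker_uuid, job_weight, pid \\ self()) do
    if cum_weight(worker_uuid) >= 1 do
      {:error, :busy}
    else
      :pg.join({PGInferenceWeight, worker_uuid, job_weight}, pid)
    end
  end
end
```

With two jobs of weight 0.5 already admitted, a third `InferenceGate.try_admit/2` call for the same worker returns `{:error, :busy}` until one of the joined processes exits or leaves.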

About ForgETS: we coded this up: https://github.com/xenomorphtech/mnesia_kv. It's missing distributed functionality (so far the company has not been fortunate enough to reach the scale that needs it), which ideally would be C A P, with the C handled by a group leader (if you want basic C, you need to execute the TX on the leader node).

Owner

max-au commented Feb 13, 2023

We are using the same technique (encoding sharding/partitioning in the Group Name), but one thing that is missing from spg/pg is ETS table type. Right now it's hardcoded as a set, while it would be more performant to use ordered_set for faster selection in a consistent hash ring.
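
To illustrate why an ordered_set helps here, this is a small sketch of a successor lookup on a hash ring. It uses a standalone ETS table rather than pg's own table, and the slot positions and node names are made up; a real ring would derive the positions from a hash such as `:erlang.phash2/2`.

```elixir
# Standalone ordered_set table standing in for a ring of slots.
ring = :ets.new(:hash_ring, [:ordered_set])
:ets.insert(ring, [{100, :node_a}, {200, :node_b}, {300, :node_c}])

owner = fn point ->
  # First slot at or after `point`; past the last slot, wrap to the first one.
  slot =
    case :ets.next(ring, point - 1) do
      :"$end_of_table" -> :ets.first(ring)
      key -> key
    end

  [{^slot, node}] = :ets.lookup(ring, slot)
  node
end

IO.inspect(owner.(150))  # slot 200
IO.inspect(owner.(301))  # wraps around to slot 100
```

On an ordered_set, `:ets.next/2` accepts a key that is not in the table and returns the next key in term order, so the successor is found without scanning. On a set table the same lookup needs a select over all slots.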

Author

vans163 commented Feb 17, 2023

> We are using the same technique (encoding sharding/partitioning in the Group Name), but one thing that is missing from spg/pg is ETS table type. Right now it's hardcoded as a set, while it would be more performant to use ordered_set for faster selection in a consistent hash ring.

It would be nice to make it an ordered_set, or to let users configure the table type when they start pg.
