Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solver 1.2.4.5 #49

Open
wants to merge 4 commits into
base: empty
Choose a base branch
from
Open

Solver 1.2.4.5 #49

wants to merge 4 commits into from

Conversation

janper
Copy link
Contributor

@janper janper commented Jun 1, 2021

I'd like to ask you to review the code and suggest high-level refactoring.

@janper janper requested a review from yanchith June 1, 2021 13:57
@janper
Copy link
Contributor Author

janper commented Jun 6, 2021

I am reopening this PR.
This Solver will be part of a huge update. I'm going to publish it only when several conceptual issues are resolved. The Solver is just one of them.

@yanchith
Copy link
Member

Starting review now

Copy link
Member

@yanchith yanchith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More feedback inline. In general I see a lot of the following patterns:

  • Significant overhead with intermediate data structures virtually everywhere. The solver component has at least 3 (but possibly as many as 5, depending on how you count) representation layers, each of which uses its own memory storage. This would hugely benefit from fused iteration and more rigid definition of what each layer means. This also makes the file more difficult to read.

  • Abundant use of helper methods (e.g. All and Any). These usually iterate over everything, but visually seem cheap. If you find yourself in O(N^2) or O(N * M) or O(N * M * K) situations this often, maybe there's a data structure that's better suited to your needs than lists (of lists (of lists)).

  • What I was missing the most was paragraphs of text describing the why of things (except the ones I wrote way back). Any contributor here would welcome a page describing what are kinds of transformation the function does on the representations of slots, modules and rules.

I realize these are not simple problems to solve, and there might be an underlying problem that the design of Monoceros itself is more complicated than it needs to be, or that Monoceros tries to solve too many problems. This might be just a difference in our preference, but I still see it as the greatest shortcoming. Regardless of the high level concern, this file can be cleaned up independently.

pManager.AddParameter(new RuleParameter(),
"Rules",
"R",
"All Monoceros rules",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capital R? .)

Comment on lines 22 to 24
/// <summary>
/// Registers all the input parameters for this component.
/// </summary>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment really says anything useful. You are overriding a standard method and are not doing anything unexpected there. If you omit it, I believe the standard method's docs will be used instead. Applies to more places than this.

}

/// <summary>
/// Wrap input geometry into module cages.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated comment? Or maybe I just don't understand what it wants to say. Once again - you are not doing anything nonstandard here. I'd just remove this doc comment completely.

var rules = new List<Rule>();
var randomSeed = 42;
var maxAttempts = 10;
var maxTime = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a unit suffix (secs, millis, nanos) might help readability here.

var maxObservations = 0;

// Due to many early return branches it is easier to set and the re-set the output pin
DA.SetData(3, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using indices here, it might be helpful if you defined constants for in/out parameter slots for better readability. See https://github.com/subdgtl/WFC/blob/master/wfc_gh/WaveFunctionCollapse.cs#L15

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You sure? The GH convention is different AFAIK: https://github.com/search?l=C%23&p=50&q=DA.SetData&type=Code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know about GH conventions, but DA.SetData(3, false); tells me little, while DA.SetData(OUT_PARAM_CONTRADICTORY, false); tells me what's happening without needing to scan elsewhere.

Comment on lines 474 to 478
if (slot.AllowsAnyModule) {
AddRuntimeMessage(GH_RuntimeMessageLevel.Error,
"Unwrapping failed for slot at " + slot.AbsoluteCenter + ".");
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the never branch. It may be too much...

Comment on lines 966 to 980
public override string ToString( ) {
var b = new StringBuilder("Slot state { ", 64);

b.Append("[");
b.Append(slot_state[0]);
b.Append("][");
b.Append(slot_state[1]);
b.Append("][");
b.Append(slot_state[2]);
b.Append("][");
b.Append(slot_state[3]);
b.Append("] }");

return b.ToString();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if used. Consider removing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't even know what that is ;) I guess it's your legacy?

Comment on lines +837 to +848
if (observationResult == WfcObserveResult.Deterministic) {
stats.deterministic = true;
stats.contradictory = false;
}
if (observationResult == WfcObserveResult.Nondeterministic) {
stats.deterministic = false;
stats.contradictory = false;
}
if (observationResult == WfcObserveResult.Contradiction) {
stats.deterministic = false;
stats.contradictory = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum?

Copy link
Contributor Author

@janper janper Jun 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, why? these are the output values for Grasshopper pins

Native.wfc_rng_state_init(&wfcRngStateHandle, rngSeedLow, rngSeedHigh);

fixed (AdjacencyRule* adjacencyRulesPtr = &adjacencyRules[0]) {
var result = Native.wfc_world_state_init(&wfcWorldStateHandle,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be initialized just once? It is not free.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please help me here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's look to our other solver component for guidance. https://github.com/subdgtl/WFC/blob/master/wfc_gh/WaveFunctionCollapse.cs#L470

You want to call wfc_world_state_init and wfc_world_state_slots_set just once per run of the solver. These two perform a lot of validation and cleanup.

You can use wfc_world_state_init_from to get the backup copy "for free". To restore from backup cheaply, use wfc_world_state_clone_from. You already use these, but they do not do anything, because you moved wfc_world_state_init and wfc_world_state_slots_set into the loop, which overwrites the results.

As for the RNG state, if your strategy is to change the seed with each attempt, you'll need to init and free it each time through the loop, which will have some overhead, but that's probably worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wfc_rng_state_init requires rngSeedLow and rngSeedHigh, which are coming from the seeded RNG. I'm changing the seed with each attempt. is the content of wfcRngStateHandle reusable for differnt rngSeedLow and rngSeedHigh?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, wfc_rng_state_init is the only place where the random values are used, so if I only call it once and then keep copying the first result, how would I set the new random seed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See last paragraph of my comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also have a short call or online session here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's done. I've added some profiling reports and the pre and post processing is already fairly fast (tested on my old XPS):

Monoceros WFC Solver found a solution.

Data preprocessing: 5ms 
WFC Solver: 9ms 
WFC Pre & Post-processing: 1ms 
WFC Attempts: 8ms; Fastest: 8ms; Slowest: 8ms 
Data postprocessing: 0ms 

Rule count: 99
Part count: 7
Slot count (including outer Slots): 1914
Solution random seed: 42
Solve attempts: 1
Approach to entropy calculation: linear
Solver observations limited to: 4294967295
Observations required to find the solution: 130
The solution is deterministic - each Slot allows placement of exactly one Module Part.
WFC solver failed to find solution within 10 attempts

Data preprocessing: 3ms 
WFC Solver: 78ms 
WFC Pre & Post-processing: 2ms 
WFC Attempts: 76ms; Fastest: 7ms; Slowest: 8ms 
Data postprocessing: 0ms 

Rule count: 99
Part count: 7
Slot count (including outer Slots): 1914
Solution random seed: 372913049
Solve attempts: 10
Approach to entropy calculation: linear
Solver observations limited to: 4294967295
Observations required to find the solution: 132
Average observation count per attempt: 13
The solution is contradictory - some Slots are not allowed to contain any Module.
Monoceros WFC Solver found a solution.

Data preprocessing: 51ms 
WFC Solver: 2245ms 
WFC Pre & Post-processing: 10ms 
WFC Attempts: 2235ms; Fastest: 2235ms; Slowest: 2235ms 
Data postprocessing: 1ms 

Rule count: 3210
Part count: 92
Slot count (including outer Slots): 1452
Solution random seed: 1235407487
Solve attempts: 1
Approach to entropy calculation: linear
Solver observations limited to: 4294967295
Observations required to find the solution: 70
The solution is deterministic - each Slot allows placement of exactly one Module Part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's even better in the release mode:

Monoceros WFC Solver found a solution.

Data preprocessing: 2ms 
WFC Solver: 9ms 
WFC Pre & Post-processing: 1ms 
WFC Attempts: 8ms; Fastest: 8ms; Slowest: 8ms 
Data postprocessing: 0ms 

Rule count: 99
Part count: 7
Slot count (including outer Slots): 1914
Solution random seed: 42
Solve attempts: 1
Approach to entropy calculation: linear
Solver observations limited to: 4294967295
Observations required to find the solution: 130
The solution is deterministic - each Slot allows placement of exactly one Module Part.
WFC solver failed to find solution within 10 attempts

Data preprocessing: 1ms 
WFC Solver: 92ms 
WFC Pre & Post-processing: 1ms 
WFC Attempts: 91ms; Fastest: 8ms; Slowest: 10ms 
Data postprocessing: 0ms 

Rule count: 99
Part count: 7
Slot count (including outer Slots): 2001
Solution random seed: 372913049
Solve attempts: 10
Approach to entropy calculation: linear
Solver observations limited to: 4294967295
Observations required to find the solution: 134
Average observation count per attempt: 13
The solution is contradictory - some Slots are not allowed to contain any Module.
Monoceros WFC Solver found a solution.

Data preprocessing: 25ms 
WFC Solver: 2148ms 
WFC Pre & Post-processing: 9ms 
WFC Attempts: 2139ms; Fastest: 2139ms; Slowest: 2139ms 
Data postprocessing: 1ms 

Rule count: 3210
Part count: 92
Slot count (including outer Slots): 1452
Solution random seed: 1235407487
Solve attempts: 1
Approach to entropy calculation: linear
Solver observations limited to: 4294967295
Observations required to find the solution: 70
The solution is deterministic - each Slot allows placement of exactly one Module Part.

It's actually bearable even on my old computer. The last setup is quite big and difficult actually.


var timeStart = DateTime.UtcNow;

while (true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rework this entire loop. I stared at it for quite some time and I am not completely sure I found every exit condition. Also, it does a lot of unnecessary work.

Any reason why this isn't structured as a seed-trying-loop and an attempt-loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because then you wouldn't know the seed that generated the valid result, would you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant nested loops, with the outer one trying seeds. Sorry if that wasn't clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I understand that. then you'll know the successful seed but to get a result you may still need more than one attempt. I want to record settings that secure an immediate solution

@janper
Copy link
Contributor Author

janper commented Jun 10, 2021

Thank you. I'll implement most suggestions and explain those I believe are not wrong.

@janper
Copy link
Contributor Author

janper commented Jun 10, 2021

  • Abundant use of helper methods (e.g. All and Any). These usually iterate over everything, but visually seem cheap.

I believe they exit as soon as they know the answer, which would make them as fast as conventional for loops. I can make or google some benchmarks. to be sure.

@janper
Copy link
Contributor Author

janper commented Jun 10, 2021

  • each of which uses its own memory storage

I'm using IEnumerate as much as possible, which (I believe) shouldn't allocate unless necessary.

EDIT: It does. IEnumerate always allocates, which puts the entire LINQ out of question. https://nede.dev/blog/preventing-unnecessary-allocation-in-net-collections
Surprisingly (to me), the best combination is List<> and foreach loop. I'll try to rewrite the Solver to minimize allocations and see if it was worth it.

@janper
Copy link
Contributor Author

janper commented Jun 10, 2021

  • The solver component has at least 3 (but possibly as many as 5, depending on how you count) representation layers

Is this what you mean?

  • Typed Rule -> Explicit Rule -> Rule for Solver
  • Module -> Module Parts
  • Slot allows all -> Slot with listed Modules -> Slot with listed Module Parts -> Slot for Solver
  • Solver component -> Slove function -> WFC library

What could be done about it?

@yanchith
Copy link
Member

yanchith commented Jun 11, 2021

  • Abundant use of helper methods (e.g. All and Any). These usually iterate over everything, but visually seem cheap.

I believe they exit as soon as they know the answer, which would make them as fast as conventional for loops. I can make or google some benchmarks. to be sure.

See second part of my comment. This is not All/Any vs for. This is list vs "a better suited datastructure". I don't know what that data structure is. Sometimes it might not be worth it, but there's a lot of multiplicative and quadratic time complexity.

@yanchith
Copy link
Member

  • each of which uses its own memory storage

I'm using IEnumerate as much as possible, which (I believe) shouldn't allocate unless necessary.

EDIT: It does. IEnumerate always allocates, which puts the entire LINQ out of question. https://nede.dev/blog/preventing-unnecessary-allocation-in-net-collections Surprisingly (to me), the best combination is List<> and foreach loop. I'll try to rewrite the Solver to minimize allocations and see if it was worth it.

If you think on how you'd implement some methods of IEnumerable, you'd find out that there's a difficult tradeoff there to make (space/heap-allocation vs time). The library runtime making this tradeoff for the user would be ok only if performance didn't matter.

@yanchith
Copy link
Member

  • The solver component has at least 3 (but possibly as many as 5, depending on how you count) representation layers

Is this what you mean?

* Typed Rule -> Explicit Rule -> Rule for Solver

* Module -> Module Parts

* Slot allows all -> Slot with listed Modules -> Slot with listed Module Parts -> Slot for Solver

* Solver component -> Slove function -> WFC library

What could be done about it?

Yes, exactly. I'd try reducing the number of arrows in your text, or at least describing in prose why every of these is necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants