U_net_graph_dict key, Ising model #11

mengxi7 · 2025-03-16T08:27:40Z

It seems that U_net needs the value of "U_net_graph_dict" in the lattice data, but "U_net_graph_dict" is never computed during data generation.

sanokows · 2025-03-16T08:43:06Z

Can you give me some more details on what you have tried?

mengxi7 · 2025-03-16T13:19:47Z

I use python prepare_datasets.py to generate the Ising model data, with
the --datasets parameter set to ["NxNLattice_4x4", "NxNLattice_8x8", "NxNLattice_16x16"] and
the --problems parameter set to ["IsingModel"].

After generating the IsingModel data, I run the command

python argparse_ray_main.py --lrs 0.002 --GPUs 5 --MCMC_steps 400 --temps 0.6 --IsingMode NxNLattice_4x4 --EnergyFunction IsingModel --N_anneal 2000 --n_diffusion_steps 300 --batch_size 20 --n_basis_states 10 --noise_potential bernoulli --project_name IsingRun --seed 123 --graph_mode U_net

and get two errors.
The first problem is in the initialization of the Base class in BaseTrainer.py: self.beta = 1 / self.T_target, where self.T_target = 0. I worked around it by changing the parser default to parser.add_argument('--T_target', default=1e-5, type=float, help='Define target temperature'), but I don't know whether that is correct.
The second problem:

Traceback (most recent call last):
  File "/DiffUCO/argparse_ray_main.py", line 348, in <module>
    meanfield_run()
  File "/DiffUCO/argparse_ray_main.py", line 136, in meanfield_run
    detect_and_run_for_loops()
  File "/DiffUCO/argparse_ray_main.py", line 250, in detect_and_run_for_loops
    run(flexible_config=flexible_config, overwrite=True)
  File "/DiffUCO/argparse_ray_main.py", line 337, in run
    train = TrainMeanField(config)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/DiffUCO/train.py", line 264, in __init__
    self.__init_optimizer_and_params()
  File "/DiffUCO/train.py", line 362, in __init_optimizer_and_params
    self.__init_params()
  File "/DiffUCO/train.py", line 477, in __init_params
    input_graph_list, energy_graphs = self._prepare_graphs(jraph_graph_dict)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/DiffUCO/train.py", line 1061, in _prepare_graphs
    input_graph_dict = pmap_batch_U_net_graph_dict_and_pad(batch_dict["U_net_graph_dict"], k = self.pad_k)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/DiffUCO/jraph_utils.py", line 400, in pmap_batch_U_net_graph_dict_and_pad
    keys = U_net_graph_dict_list[0].keys()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'keys'

It seems that "U_net_graph_dict" is required, but "U_net_graph_dict = None" is set by the SolutionDataset_InMemory.getitem() function.
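
The traceback shows that the first element of the list passed to the batching function is None. A minimal sketch of an early check that would turn this into a readable error is given below. The helper name `check_graph_dicts` and its arguments are hypothetical and are not part of the repository; only the key name "U_net_graph_dict" comes from the traceback.

```python
from typing import Any, Mapping, Optional, Sequence


def check_graph_dicts(
    graph_dicts: Optional[Sequence[Optional[Mapping[str, Any]]]],
    key: str = "U_net_graph_dict",
) -> Sequence[Mapping[str, Any]]:
    """Hypothetical guard: fail early if any batch item lacks its graph dict."""
    if graph_dicts is None or len(graph_dicts) == 0:
        raise ValueError(f"batch contains no '{key}' entries")
    # Collect the positions of all items whose entry was never computed.
    missing = [i for i, d in enumerate(graph_dicts) if d is None]
    if missing:
        raise ValueError(
            f"'{key}' is None for batch items {missing}; "
            "the dataset was probably generated without this field"
        )
    return graph_dicts
```

Calling such a check right before batching would report which items are missing the field instead of failing on `.keys()`.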

sanokows · 2025-03-16T13:52:24Z

Do you have the latest version of the code?
In the latest version there is no '--graph_mode U_net'; there is only '--graph_mode Unet'.

The default for --T_target is 0 because this is the target temperature for CO problems. But for the Ising model, a target temperature of 0 is not supported. Usually, one considers the Ising model at a target temperature T >> 0, so you should set --T_target to a higher temperature.

mengxi7 · 2025-03-16T14:12:11Z

Because the check in train.py compares against U_net, I changed the parameter:

As for the temperature: the corresponding distribution $p(x) \propto \exp\left(-\frac{1}{T} H(x)\right)$ is the same for both CO and the Ising model, so why is the temperature schedule different? Is it because the CO problems are all posed as maximization problems while the Ising model is a minimization problem?

sanokows · 2025-03-16T14:33:45Z

The U_net graph mode is a graph U-Net that I tried out; it is deprecated because it did not work well. The supported Unet is a standard U-Net based on a convolutional network.

The temperature schedule is the same for the Ising model and for CO. However, I think unbiased sampling is not possible with a target temperature of T = 0 because it requires computing 1/T, so this is not supported for the Ising model. In CO we never have terms that compute 1/T, so it is not a problem there.
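
To illustrate the 1/T point, here is a minimal sketch of Boltzmann weights $p(x) \propto \exp(-H(x)/T)$ over a small list of energies. The function names and the explicit check on `T_target` are assumptions made for illustration, not code from the repository.

```python
import math
from typing import List, Sequence


def inverse_temperature(T_target: float) -> float:
    """Return beta = 1 / T_target, rejecting non-positive temperatures."""
    if T_target <= 0.0:
        raise ValueError("T_target must be > 0 to compute beta = 1 / T_target")
    return 1.0 / T_target


def boltzmann_probs(energies: Sequence[float], T_target: float) -> List[float]:
    """Normalized weights exp(-beta * H) for a finite list of energies."""
    beta = inverse_temperature(T_target)
    e_min = min(energies)  # shift energies so the largest weight is exp(0) = 1
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights]
```

With a very small positive `T_target` the weights are still defined, but almost all probability mass sits on the lowest-energy state, which is a different regime from sampling the Ising model at a finite temperature.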

mengxi7 · 2025-03-17T00:52:29Z

The problem is solved!
Thank you for your detailed response!

mengxi7 · 2025-03-17T01:51:59Z

I ran into another problem when I run:

python argparse_ray_main.py --train_mode PPO --lrs 0.0005 --temps 0.5 --GPUs 0 --minib_diff_steps 7 --n_diffusion_steps 14 --batch_size 140 --n_basis_states 4 --minib_basis_states 4 --relaxed --n_GNN_layers 8 --N_anneal 2000 --IsingMode BA_small --EnergyFunction MaxCut --mode Diffusion --beta_factor 1. --noise_potential bernoulli --multi_gpu --project_name final_runs --mov_average 0.09 --n_rand_nodes 3 --seed 123 --graph_mode normal --inner_loop_steps 1 --diff_schedule exp

but I seem to hit a CUDA memory error:

Is there a way to use multiple GPUs?

sanokows · 2025-03-17T08:40:38Z

You can use multiple GPUs by passing '--GPUs 0 1 2 3', etc.

mengxi7 · 2025-03-17T13:26:06Z

Sorry for asking yet another question: is it possible to resume training from a checkpoint?

sanokows · 2025-03-17T16:12:10Z

Yes, you can resume training with the script continue_training.py; you just need to specify the GPUs and the wandb ID.

For models trained with PPO there is a small bug when training is resumed: PPO uses a moving average of the mean and standard deviation of the reward, and I forgot to implement storing and loading those values. So for PPO, resuming training does not work perfectly, but in most cases this should not be a problem.
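
For illustration only, a minimal sketch of what storing and loading such moving statistics could look like. The class `RunningMoments`, its fields, and the smoothing factor are hypothetical; this thread does not show how the repository actually tracks the reward statistics.

```python
import json
import math
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class RunningMoments:
    """Exponential moving average of the reward mean and variance (hypothetical)."""

    alpha: float = 0.1  # assumed smoothing factor
    mean: float = 0.0
    var: float = 1.0

    def update(self, batch_mean: float, batch_var: float) -> None:
        # Blend the previous estimate with the statistics of the new batch.
        self.mean = (1.0 - self.alpha) * self.mean + self.alpha * batch_mean
        self.var = (1.0 - self.alpha) * self.var + self.alpha * batch_var

    @property
    def std(self) -> float:
        return math.sqrt(self.var)

    def state_dict(self) -> Dict[str, float]:
        # Only the dataclass fields are serialized; std is derived from var.
        return asdict(self)

    @classmethod
    def from_state_dict(cls, state: Dict[str, float]) -> "RunningMoments":
        return cls(**state)


# Usage: serialize alongside the model checkpoint, then restore on resume.
stats = RunningMoments()
stats.update(batch_mean=2.0, batch_var=4.0)
saved = json.dumps(stats.state_dict())
restored = RunningMoments.from_state_dict(json.loads(saved))
```

Saving this small dictionary together with the parameters would let a resumed run normalize rewards with the same statistics as before the interruption.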

mengxi7 · 2025-03-18T06:16:23Z

OK, and thank you for the detailed and patient explanation!

sanokows closed this as completed Mar 26, 2025
