Two restart tests have answer changes due to moving mountain update in cam6_4_078 #1284
Comments
A little debugging info: The only field that is different in this test
is In the WACCM-x test
differences are in
Based on @fvitt's idea that tests shorter than the radiation time step are the ones failing, I tried replacing the 9-step ERP test with 6, 10, 12, and 20 time steps:
ERP_D_Ln6.ne30pg3_ne30pg3_mt232.QPC7.derecho_intel.cam-outfrq3s_cosp.20250325_094658_mcchhl - PASS
ERP_D_Ln10.ne30pg3_ne30pg3_mt232.QPC7.derecho_intel.cam-outfrq3s_cosp.20250325_094658_mcchhl - FAIL
ERP_D_Ln12.ne30pg3_ne30pg3_mt232.QPC7.derecho_intel.cam-outfrq3s_cosp.20250325_094658_mcchhl - PASS
ERP_D_Ln20.ne30pg3_ne30pg3_mt232.QPC7.derecho_intel.cam-outfrq3s_cosp.20250325_094658_mcchhl - PASS
This issue might be related to
I will update both tests to use the outfrq3h use case. Hopefully that solves all the problems!
The short 9-step cam7 physics WACCMX restart test failed in tag cam6_4_077, when it was first introduced. As noted by @PeterHjortLauritzen, differences are only in the 3-hour restart test passes:
@fvitt - based on what you just reported, do we need to add the If we need to add that variable, could someone advise on how to do this? Believe it or not, I've never added a variable to the restart files, and I'm not sure where these mods would need to happen.
It would be good to get input from @brian-eaton on this. This seems to be a lower-priority issue and can probably wait for his return. My first thought is to change how RRTMGP updates this diagnostic to be similar to RRTMG's behavior. Adding a field to the restart file is another option.
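The two options raised above (changing how the diagnostic is updated, or adding a field to the restart file) can be illustrated with a toy time loop. This is a minimal Python sketch, not CAM code: `run`, `diag`, `rad_every`, and `zero_when_skipped` are hypothetical names, and it assumes the problem is a diagnostic that is only refreshed on radiation steps and is not carried across a restart. Setting the diagnostic to zero on non-radiation steps is just one way to give it a well-defined value; the actual fix may differ.

```python
def run(nsteps, rad_every, start=0, state=0.0, diag=0.0, zero_when_skipped=False):
    """Toy time loop (hypothetical, not CAM code).

    state : prognostic value, assumed to be written to the restart file
    diag  : diagnostic refreshed only on radiation steps
    Returns the final state and the per-step history of diag.
    """
    history = []
    for step in range(start, nsteps):
        state += 1.0
        if step % rad_every == 0:
            diag = 10.0 * state          # radiation step: recompute
        elif zero_when_skipped:
            diag = 0.0                   # non-radiation step: defined value
        # otherwise diag silently carries over from the last radiation step
        history.append(diag)
    return state, history


# Full 9-step run versus a run restarted at step 5, radiation every 3 steps.
_, full = run(9, 3)
leg1_state, leg1_hist = run(5, 3)

# Case 1: diag is not in the restart file, so it comes back as 0.0.
_, restarted = run(9, 3, start=5, state=leg1_state)
broken = full[5:] != restarted

# Case 2: diag is given a defined value on non-radiation steps.
_, full_fixed = run(9, 3, zero_when_skipped=True)
_, restarted_fixed = run(9, 3, start=5, state=leg1_state, zero_when_skipped=True)
fixed_by_update = full_fixed[5:] == restarted_fixed

# Case 3: diag is saved in the restart file and restored.
_, restarted_saved = run(9, 3, start=5, state=leg1_state, diag=leg1_hist[-1])
fixed_by_restart_field = full[5:] == restarted_saved
```

In this toy setup the restarted run only differs on the steps between the restart and the next radiation step, which is consistent with the idea that very short tests are the ones that expose the problem.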
I can't look into details, but just want to mention that CS_RAINCERT is known to not be task independent. I thought I had added it to the fexcl in the user_nl_cam testmods file for cosp tests. Maybe I missed this one. All diagnostics from the cloudsat simulator (CS_*) may have this issue.
@brian-eaton and @fvitt - thank you both! I've opened a PR that contains both of the fixes y'all have suggested (fexcl-ing CS_RAINCERT, and fixing how we handle swnet when shortwave radiation is not performed) and the failing tests pass. We'll get it in soon.
What happened?
Restart tests for a couple of regression tests indicate answer changes between a full 9 time step run and a run that is broken in two pieces with a restart to finish the run. The test failures were missed when making cam6_4_078.
Subsequent investigation indicates the problem occurs when both the change to gw_convect.F90 and the new setting of effgw_beres_dp=0.15 are implemented together. The line in gw_convect.F90 is:
hdepth = max(1000._r8, hdepth*qbo_hdepth_scaling)
Restarts are bit-for-bit when the line is reverted back to:
hdepth = hdepth*qbo_hdepth_scaling
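To make the difference between the two forms of the line concrete, here is a small Python sketch (not the CAM Fortran). The function name `scale_hdepth` and the sample depths are made up for illustration; `hdepth` and `qbo_hdepth_scaling` are taken from the lines above, and the scaling is set to 1.0 only to keep the arithmetic obvious.

```python
def scale_hdepth(hdepth, qbo_hdepth_scaling, floor=None):
    """Scale a heating depth and optionally enforce a lower bound.

    floor=None   mirrors  hdepth = hdepth*qbo_hdepth_scaling
    floor=1000.0 mirrors  hdepth = max(1000._r8, hdepth*qbo_hdepth_scaling)
    """
    scaled = hdepth * qbo_hdepth_scaling
    return scaled if floor is None else max(floor, scaled)


# Made-up sample depths, in whatever units hdepth carries in the model.
samples = [0.0, 400.0, 2500.0]
reverted = [scale_hdepth(h, 1.0) for h in samples]
floored = [scale_hdepth(h, 1.0, floor=1000.0) for h in samples]
```

Only values whose scaled depth falls below the floor change, including a depth of zero, so those are the cases in which the two versions of the code can diverge.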
It was noted in issue #1276 :
Change in gw_convect.f90: we introduced a check to make sure that the latent depths (variable called hdepth) do not exceed the range of latent heating depths covered by the lookup table.
A fix needs to be implemented to get the restart tests to be bit-for-bit.
What are the steps to reproduce the bug?
Using cam6_4_078:
Run: ERP_D_Ln9.ne30pg3_ne30pg3_mt232.QPC7.derecho_intel.cam-outfrq3s_cosp
This test was investigated in the detail described above.
The following test also fails restart and is assumed to be due to the same problem, but it has not been investigated:
ERS_Ln9.ne30pg3_ne30pg3_mg17.FHISTC_WXma.derecho_intel.cam-outfrq9s
What CAM tag were you using?
cam6_4_078
What machine were you running CAM on?
CISL machine (e.g. cheyenne)
What compiler were you using?
Intel
Path to a case directory, if applicable
/glade/derecho/scratch/cacraig/test_cac_intel_20250320150236
Will you be addressing this bug yourself?
Yes, but I will need some help
Extra info
@JulioTBacmeister @PeterHjortLauritzen @mbramberger