Open Issues

Internal ID Description Vendor ID Reproducer Path PoC Priority? ETA Date Opened Last Updated
122 [IntelPython] Bug in DPCTL support for the order parameter of dpt.asnumpy No response https://github.com/IntelPython/dpctl/issues/2138 Abhi No response 2026-02-23 2026-02-23
121 [IntelPython] Feature request for sub-class support in dpnp arrays No response https://github.com/IntelPython/dpnp/issues/2764 Abhi No response 2026-02-23 2026-02-23
120 [IntelPython] dpnp array .data.ptr on array views ignores USM offset No response https://github.com/IntelPython/dpnp/issues/2781 Abhi 🚨 No response 2026-02-23 2026-02-23
119 [IntelPython] Indexing bug with dpnp nd-array No response https://github.com/IntelPython/dpnp/issues/2783 Abhi 🚨 No response 2026-02-23 2026-02-23
118 Incorrect RUNPATH in libimf.so and libirng.so No response Embedded Ye Luo No response 2026-02-19 2026-02-19
117 Fortran ICE module + input)in) CMPLRLLVM-73523 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/ifx/ice_module_in Thomas / Victor 2026.0 2026-02-18 2026-02-23
116 PCIe counters have a regression on 1.3.X for Datacenter Max GPUs https://github.com/intel/xpumanager/issues/119 https://github.com/intel/xpumanager/issues/119 Servesh No response 2026-02-18 2026-02-18
115 flash attn and fused_moe/test_grouped_gemm tests fail in the VLLM framework No response source/reproducers/frameworks/vllm/ Servesh / Nathan No response 2026-02-16 2026-02-17
114 Offline debugging issues No response N/A Servesh 3/5 next release (~1146.56-8), 2/5 still under analysis 2026-02-16 2026-02-18
113 Engineering version of vtune-backend is extremely slow VASP-33498 /tmp/rcaddy/tmp on Aurora head node 11 Robert Caddy No response 2026-02-11 2026-02-18
111 [Frameworks] alltoallv with zero-sized buffer from pytorch https://github.com/uxlfoundation/oneCCL/issues/190 MLSL-4075 https://github.com/argonne-lcf/nekRS-ML/blob/alcf4/3rd_party/dist-gnn/run_all2all_bench.sh Riccardo Balin In debug 2026-02-05 2026-02-18
110 [Frameworks] degraded Ptycho_Vit performance Vs A100 No response https://github.com/SYNAPS-I/ptycho-vit/tree/aurora_port Varuni Katti Sastry No response 2026-02-03 2026-02-04
109 Global MPI rank issue with STAT HPE ticket CPE-13691 /home/jkwack/Tools/STAT/Multi-node_test on Sunspot JaeHyuk Kwack 🚨 No response 2026-02-02 2026-02-02
108 [LZ] Hanging on event with multiple immediate command lists No response source/reproducers/l0/synch_hang_multi_imm Paulius Velesko Working with 2025.3 SDK + 1146.40 runtime 2026-01-27 2026-01-27
107 Vtune times out even when run with collection paused VASP-33391 /lus/flare/projects/CoreCollapseModel/rcaddy/vtune_issue /lus/flare/projects/Tools/jkwack-tools-reproducer/Robert_Caddy/vtune_issue Robert Caddy 2026.0 2026-01-07 2026-02-11
106 [LZ] Hang on zeEventPoolDestroy when called before a non-related non-same-pool signal GSD-12152 source/reproducers/l0/multi_event_pools_hang Colleen, Paulius No response 2026-01-07 2026-01-07
105 PCIe counters not working on LTS2 2523.31 and xpu-smi 1.2.X https://github.com/intel/xpumanager/issues/114 GSD-12079 in issue Servesh Fixed in newer PE on sunspot. will close once it's default on aurora 2026-01-06 2026-02-18
104 [LZ] Crashing with UseKMDMigration GSD-12102 source/reproducers/dpcpp/supercontext Thomas Under investigation 2025-12-17 2026-01-07
103 [Frameworks][PyTorch][IPEX] PyTorch Complex Matmul support W/O IPEX No response /lus/flare/projects/datasets/softwares/testing/ptychi_tests/complex.py in test set at: source/reproducers/frameworks/pytorch_matmul_ipex Khalid Hossain PyTorch-2.10 (next drop, currently on sunspot) 2025-12-17 2026-02-16
102 [Frameworks][Triton] "No device of requested type available" when ONEAPI_DEVICE_SELECTOR="level_zero:gpu" PYTORCHDCQ-7882 source/reproducers/frameworks/triton_get_device Nathan Nichols WA: ONEAPI_DEVICE_SELECTOR="*:gpu" https://github.com/intel/intel-xpu-backend-for-triton/pull/5745 2025-12-17 2026-01-06
101 Signalling a clSetUserEventStatus does not wake up a barrier depending on it for in-order queues. GSD-12087 source/reproducers/opencl/user_event_in_order Paulius Velesko No response 2025-12-11 2025-12-12
100 Level Zero event used between an in-order immediate command list and out-of-order regular command list results in ZE_RESULT_ERROR_INVALID_ARGUMENT GSD-12085 source/reproducers/l0/inorder_outorder_event/ Paulius Velesko No response 2025-12-11 2025-12-11
99 Advisor tripcounts analysis fails with a PyTorch example. ADV-10735 /flare/Performance/jkwack/Tools/Roofline/SC25_tutorial/ai_ml_profiling/reproducer or /lus/flare/projects/Tools/jkwack-tools-reproducer/JaeHyuk/advisor_pytorch/reproducer or source/reproducers/tools/pytorch_advisor JaeHyuk Kwack 🚨 Under investigation, Advisor and python compatibility issues maybe 2025-12-09 2026-02-18
98 Hanging OpenCL code when one command queue waits on an event from another command queue CMPLRLLVM-72048 / GSD-12075 source/reproducers/opencl/hanging_marker Colleen Under investigation 2025-12-02 2026-01-07
97 SHMEM on Aurora: Unit test wait_until_all-on_queue-2 hanging https://github.com/oneapi-src/ishmem/issues/15 source/applications/ishmem_sis Colleen / Abhi Actively working on it 2025-11-21 2025-12-10
96 Sporadic libze_intel_gpu.so segmentation fault when running QMCPACK GSD-12033 See attached reproducer Ye Luo 🚨 Intel working on reproducing 2025-11-17 2026-02-18
95 Memory leak in Libfabric No response /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/mpi/cxi_memory_lead Rob Lathan Fixed by https://github.com/ofiwg/libfabric/pull/11334, thanks Rob! expect in SHS 13.1.0, on aurora end of March 2025-11-13 2026-02-18
94 zeMemFree slowdown in a loop GSD-11962, NEO-17411 source/reproducers/l0/zememfree_slowdown/ Colleen 🚨 being investigated 2025-11-08 2026-02-18
93 oneCCL exception with PyTorch DTensor: SYCL recv is not supported for multi-node case MLSL-3951 In the text body source/reproducers/frameworks/pytorch_93 (note only for single node, we must test by hand for 2 nodes) Väinö Hatanpää oneAPI 2025.3, oneCCL 2021.17 2025-11-05 2026-02-04
92 SYCL device info free_memory wrong on 2-stack PVC1550 GPU CMPLRLLVM-71510, GSD-12043 source/reproducers/dpcpp/sycl_free_flat Jakub H Under investigation 2025-10-31 2026-01-07
91 sycl failed malloc_device on GPU takes 20 seconds GSD-10587 source/reproducers/dpcpp/slow_alloc/ Jakub H post-1146.40, fixed internally, but under investigation on what to cherry 2025-10-31 2026-01-07
90 Device Sanitizer + LIBOMPTARGET_DEBUG=1 issues for the GAMESS RI-MP2 mini-app CMPLRLLVM-71455 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/tools/sanitizer_rimp2_test (w60 ones) Brian Fixed internally, 2026.0 (end of april) 2025-10-31 2026-02-18
89 Device Sanitizer breaks with MKL DGEMM call in GAMESS RI-MP2 mini-app MKLD-19334 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/tools/sanitizer_rimp2_test (w30 ones) Brian, JaeHyuk Fixed internally, 2026.0 (end of april) 2025-10-31 2026-02-18
87 QUDA compile fail cmplrllvm-70981 source/reproducers/openmp/quda_crash Xiayong Jin / Brian W 2026.0 (end of april) 2025-10-28 2026-01-07
86 omp_alloc should support pinned memory, or implement proper fallback behavior CMPLRLIBS-35442 /home/kweide/projects/OpenMP_VV/tests/5.1/allocate/test_omp_alloctrait_pinned.c and source/reproducers/openmp/omp_alloctrait_pinned in the test set Klaus Weide fixed internally -- correct error message. likely 2026.1 2025-10-28 2026-02-18
85 zeEventQueryKernelTimestampsExt is broken with IMM command lists GSD-11124 source/reproducers/l0/zeEventQueryKernelTimestampsExt_clock Thomas/John Mellor-Crummey In progress 2025-10-27 2025-11-12
84 Device Sanitizer is not functional with OpenMP C/Fortran codes /lus/flare/projects/Aurora_deployment/jkwack/JK_AT_Tools/sanitizer and source/reproducers/tools/sanitizer JaeHyuk Kwack 🚨 2025.3 2025-10-22 2026-02-18
83 With ifx, openmp_version is missing from omp_lib CMPLRLIBS-35365 /home/kweide/tests/test_openmp_version.f90 and source/reproducers/openmp/omp_version in the test set Klaus Weide 2025.3 2025-10-20 2025-10-24
81 IGC_StackOverflowDetection not working GSD-11763 source/reproducers/openmp/stack_overflow_not_working Brian In progress 2025-10-15 2025-10-29
80 VTune fails with "Assertion failed: tool_gtpin_support:126: (buffer) " VASP-32612, GTPIN-1169 /lus/flare/projects/Aurora_deployment/jkwack/JK_AT_Tools/Apps/GAMESS_RI-MP2_MiniApp source/reproducers/tools/vtune_gtpin_fail in the test set JaeHyuk Kwack 🚨 2025.3 2025-10-10 2025-10-30
79 Advisor fail with "advisor: Warning: The application returned a non-zero exit value." ADV-10687 source/reproducers/tools/advisor_gflop JaeHyuk Kwack Fixed with advisor --version == 616302, which should be in 2025.3 2025-10-08 2025-10-30
77 [SYCL] Function pointers compilation issue CMPLRLLVM-16317 Reproducer below and source/reproducers/dpcpp/func_pointers Abhi, Patrick Steinbrecher 🚨 Under discussion 2025-10-06 2025-10-15
76 Segfaults in MPICH routines in next-eval No response for XGC: /lus/flare/projects/catalyst/world_shared/zippy/xgc Tim Williams 🚨 No response 2025-10-01 2025-10-01
74 ZES_ENABLE_SYSMAN should default to 1 in the oneapi module No response see Details Tim Williams No response 2025-09-29 2025-10-15
73 "error: undefined reference to `old_llvm.umul.with.overflow.i64'" in newer kokkos CMPLRLLVM-70603, GSD-12239 source/reproducers/dpcpp/kokkos_mdspan_umul Daniel Arndt Compiler-side fixed, waiting on agama fix 2025-09-17 2026-02-18
71 RPC launch error tracking 2025-09-15 2025-09-23
70 PALS gpu-bind, composite, envall lead to "launch failed" DCE Case 5392152905 applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/mpi/envall Thomas Applencourt Fixed in USS-1.5 (March '26) 2025-09-10 2025-12-09
68 warpx segfaults/hangs with OpenPMD enabled No response /lus/flare/projects/catalyst/world_shared/zippy/reproducers/issue52/ Tim Williams No response 2025-08-23 2026-01-08
67 warpx Debug build crashes oneAPI compiler CMPLRLLVM-24314 /lus/flare/projects/catalyst/world_shared/zippy/reproducers/issue52/ Tim Williams No response 2025-08-21 2025-10-29
65 Clarification requested about ZE_DEVICE_PROPERTY_FLAG_ONDEMANDPAGING on PVC GSD-11510 source/reproducers/l0/ondemand_paging/ Colleen implemented, post-1146.41+, ~ Jan. (1146.58) 2025-08-20 2026-02-18
64 E3SM fortran compile ICE CMPLRLLVM-69862 source/reproducers/ifx/e3sm_homme_ICE_error Abhi 2025.3.0 2025-08-18 2025-10-09
63 Kokkos kernels fails to build with kokkos built with openmp enabled CMPLRLLVM-69908 source/applications/kokkos-kernels Sean Koyama / Colleen Bertoni gone starting with 4.19 (fixed in 2025.3 branch) 2025-08-18 2025-09-16
62 -ftarget-register-alloc-mode=pvc:large and "-device 12.60.7" for AOT GSD-11490 source/reproducers/general/ftarget-register-alloc-mode_flag Steve Rangel Fixed internally, 1146.58 2025-08-14 2026-02-18
60 ext_oneapi_memcpy2d is significantly slower with implicit scaling than explicit and on PVC vs A100 GSD-11132, GSD-12277 source/reproducers/dpcpp/ext_oneapi_memcpy2d_perf Natalie Beams No response 2025-07-29 2026-02-03
58 kokkos inclusive and exclusive scan giving incorrect answers for 1146.10 CMPLRLLVM-69285, GSD-11736 source/reproducers/dpcpp/kokkos_optimization_scan Daniel Arndt 🚨 1146.40 (two weeks out -- end of Nov) 2025-07-23 2025-12-10
57 GPU segfault in gtensor_bench with 2025.2 MKLD-18276, CMPLRLIBS-35326, CMPLRLLVM-68696 source/applications/gtensor_bench Colleen Bertoni 2025.3 2025-07-22 2025-08-11
56 RSBench-SYCL incorrect answers with 1146.10 GSD-11247 source/applications/RSBench/ John Tramm, Colleen Bertoni 1146.31 2025-07-22 2025-09-17
55 Linking in LZ causes changes in signal handling cmplrlibs-35385, GSD-11413 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/l0/signal_handler/ Thomas Applencourt, Colleen Bertoni Fixed internally, still in vetting 2025-07-22 2025-12-10
54 oneCCL zeMemGetAddressRange error with alltoallv and zero-sized buffers oneCCL GitHub Issue: https://github.com/uxlfoundation/oneCCL/issues/174, MLSL-3764 See instructions on oneCCL GitHub Issue: https://github.com/uxlfoundation/oneCCL/issues/174 and source/reproducers/mpi/oneccl_174 Riccardo Balin 🚨 oneCCL 2021.17, oneAPI 2025.3 2025-07-18 2026-02-04
52 compiler segfaults linking warpx binary GSD-11357, GSD-11855 /lus/flare/projects/catalyst/world_shared/zippy/reproducers/issue52/warpx Tim Williams 🚨 2025.2 + 1146.10 2025-07-07 2026-01-08
47 Non standard MPI knobs suggested for performance ANL-291 N/A Servesh M No response 2025-06-23 2025-06-27
43 CMake can't find MKL::MKL_SYCL with MPI wrapper compilers No response https://github.com/thilinarmtb/onemkl_cmake_mpi_bug Thilina Ratnayaka, Colleen Bertoni improvements will be part of the next oneMKL release, 2025.3. 2025-06-11 2025-06-25
39 Feature request for Aurora runtime to include debugging symbols ANL-286, HPCS-15374, GSD-11427 feature request Ye Luo 1146.40 drop 2025-05-29 2025-12-10
38 One application in GRID consistently hangs GSD-11441 /lus/flare/projects/Aurora_deployment/xyjin/W/test_grid_g5r5_paboyle Xiao-Yong Jin 🚨 Internal investigation, testing a patch, ~1146.58 2025-05-27 2026-02-18
36 (Occasional Interruptible) hangs in applications Possibly related to ANL-215 /lus/flare/projects/Aurora_deployment/xyjin/W/test_example_detar.skel Xiao-Yong Jin 🚨 No response 2025-05-15 2025-07-09
33 Crash when calling too many MPI_Probe https://github.com/pmodels/mpich/issues/7427 https://github.com/pmodels/mpich/issues/7427 David--Cléris Timothée No response 2025-05-15 2025-05-15
32 PETSc segfaults in sparse matrix calls IGDB-6516, GSD-10450 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/mkl/csr_gemv_usm/ Junchao Zhang 🚨 2025.3 for part malloc_shared in MKL 2025-05-15 2025-06-25
31 GAMESS segfaults with -O0 GSD-10393, CMPLRLIBS-35345,GSD-11035 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/openmp/gamess_O0_page_fault Colleen Bertoni 🚨 1146.31 (Targeted for LTS2 (1146.12+), contained with the IGC 2.16 series / WW34 (2-3 weeks)) 2025-05-14 2025-09-17
30 Copy 2D/3D are broken (zeCommandListAppendMemoryCopyRegion) NEO-14954, GSD-11132 https://github.com/rpereira-dev/ze-zoo also source/reproducers/l0/copyRegionPitch Romain PEREIRA and Thomas APPLENCOURT 🚨 No response 2025-05-10 2026-02-18
29 Significant slowdown with LAMMPS in first run, subsequent runs much faster No response /flare/catalyst/proj_shared/knight/projects/ExtremeCarbon/snap-carbon-scaling/1B/ Christopher Knight No response 2025-05-09 2026-01-06
17 hang with MPI pipelining https://github.com/pmodels/mpich/issues/7373 Build and run commands are in the MPICH issue. James Osborn Merged in https://github.com/pmodels/mpich/pull/7622 2025-04-03 2026-02-20
13 XGC hangs at scale CMPLRTST-27836 xgc-es-cpp-gpu app, ES_ITER test case Tim Williams 🚨 No response 2025-04-03 2026-01-07

Closed Issues

Internal ID Description Vendor ID Reproducer Path PoC Priority? Date Opened Closed Date
112 [MPI] MPI_probe crashing with H/W event overflow CAST-39582 in the issue and source/reproducers/mpi/mpi_probe Colleen 2026-02-09 2026-02-13
88 RPATH issue when mixing and matching SDK and spack packages built by another SDK No response No need. Reproducer attached in this ticket Ye Luo 2025-10-30 2026-02-18
82 Symbol missing issue with 1.3 version onwards in SLES and Intel Datacenter Max GPU on Aurora https://github.com/intel/xpumanager/issues/113 https://github.com/intel/xpumanager/issues/113 Servesh 2025-10-16 2026-02-18
78 Applications failing to compile with "is too large for Clang to process" or generating significantly larger exes with "-g" CMPLRLLVM-70962, (general and related: CMPLRLLVM-53145, CMPLRLLVM-69909, CMPLRLLVM-24314) source/reproducers/dpcpp/jit_too_large_for_Clang Abhi 🚨 2025-10-06 2026-01-06
75 "MPL_gpu_query_is_same_dev(int, int): Assertion `global_dev1 >= 0 && global_dev1 < known_ze_device_count' failed." with mpich.dbg No response https://github.com/pmodels/mpich/issues/7602 Tim, JaeHyuk, Colleen 2025-09-30 2025-10-13
72 MPI_aborts in many applications in next-eval at larger scales No response N/A Brian Holland / Tim Williams 2025-09-16 2025-09-30
66 Compiling with "-g" leads to a much larger binary than without CMPLRLLVM-69909, CMPLRLLVM-24314 (similar JIRAs) lammps + -g Brian Holland 2025-08-20 2026-01-06
61 Failing unit tests on PVCs with 2025.2 oneAPI SDK -- is it expected? https://github.com/uxlfoundation/oneMath/issues/703, CMPLRLLVM-69572, ONSAM-1930, GSD-11482 https://github.com/uxlfoundation/oneMath/issues/703 Colleen Bertoni 2025-07-30 2025-11-12
59 [ISHMEM] Unit test fails with ishmem 1.4.0 https://github.com/oneapi-src/ishmem/issues/10 https://github.com/oneapi-src/ishmem/issues/10 and source/applications/ishmem_sos Abhi 2025-07-25 2025-07-31
53 IFX Compiler reads and stores floating point values from a text file at single-precision No response /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/ifx/fp_precision Victor Anisimov 🚨 2025-07-09 2025-07-10
51 [SYCL] Bug from SYCL peer_access No response /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/dpcpp/sycl_peer_access Abhi 2025-07-02 2025-10-13
50 OpenMP Thread binding No response See below Romain PEREIRA 2025-07-02 2025-07-02
49 [E3SM] MPICH bug related to collectives tuning https://github.com/pmodels/mpich/issues/7456 https://github.com/pmodels/mpich/issues/7456 Abhi 🚨 2025-06-27 2025-10-09
48 Zombie Processes GSD-11266 none yet Servesh M 🚨 2025-06-25 2025-10-29
45 DDT issues since Aurora upgrade No response /lus/flare/projects/catalyst/world_shared/zippy/ddt Tim Williams 2025-06-12 2025-11-03
44 QMCPACK segfault in libomp No response Not yet created Ye Luo 🚨 2025-06-12 2025-07-23
42 Linking fails with old build environment No response /lus/flare/projects/PHASTA_aesp_CNDA/jrwrigh/petsc_build_test Kris Rowe 2025-06-06 2025-06-10
41 torch.compile segfaults for >2 tiles MLSL-3728 /flare/Aurora_deployment/vsastry/torch_compile Varuni Sastry 2025-06-06 2025-07-24
40 Need SYSMAN support for all modes in recent releases HPCS-15366, related: GSD-11104 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/l0/leak_zesMemoryGetState Thomas Applencourt 🚨 2025-05-30 2025-06-17
37 xpu-smi reports "N/A" for GPU Utilization RITM0428460, ANL-279, GSD-11252 any run of xpu-smi Kyle Felker / Colleen Bertoni 2025-05-22 2025-10-29
35 Avoid outputs exceeding few KBs to stdout/stderr from MPI ranks RITM0425437 First issue Large MPI writes to stdout Servesh Muralidharan 2025-05-15 2025-07-23
34 Runtime Error: pytorch DDP with CCL_BCAST=<"double_tree, direct, naive, maybe others?"> MLSL-3729 In issue Nathan Nichols 2025-05-15 2025-10-13
28 CMake failures with SYCL No response /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/dpcpp/ Abhishek Bagusetty 2025-05-09 2025-05-09
27 Build failures on PVC with Cutlass GSD-11099, https://github.com/codeplaysoftware/cutlass-sycl/issues/329 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/dpcpp/cutlass-sycl Abhi 🚨 2025-05-07 2025-10-13
26 L0 memcpy bug GSD-11142, NEO-14641 I was doing the same run as QMCPACK SOW runs in the reframe Ye Luo 🚨 2025-05-06 2025-10-13
25 Compile fail in Lattice App Brian reproduced and confirms fixed in 2025.1 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/dpcpp/bug_cgpt_icpx Xiao-Yong Jin 🚨 2025-05-01 2025-10-13
24 Noticeably more "ping failed" than before the 2025.1 SDK + 1099.12 UMD/KMD upgrade JIRA is:  HPCS-15331 N/A Xiao-Yong Jin Colleen Bertoni 2025-05-01 2025-05-16
23 Apps stop running after Apr 29 upgrade due to libstdc++ dependency No response See details Ye Luo 2025-04-30 2025-05-06
22 SYCL In-order queue broken NEO-14641 /lus/flare/projects/Aurora_deployment/applications.hpc.argonne-national-lab.aurora.anl-testing/source/reproducers/dpcpp/in-order Thomas Applencourt 🚨 2025-04-23 2025-10-13
21 Error during write with Quantum ESPRESSO No response see .zip file attached below, also /lus/flare/projects/matml_aesp_CNDA/dir_io_QE_crash Filippo Simini 🚨 2025-04-17 2025-04-18
20 Issue with gpu-bind for mpiexec under ZE_FLAT_DEVICE_HIERARCHY=FLAT mode ANL-283/HPE Support Case 5390607860 See below Abhishek, Nathan, Khalid 2025-04-16 2025-10-01
19 Severe CPU memory growth in MPICH No response /flare/catalyst/world_shared/zippy/reproducers/issue19 Tim Williams 2025-04-04 2025-07-31
18 Ping failures and hangs with production runs using GPT/GRID ANL-251, RITM0404147, RITM0404148, RITM0405730, GSD-11441 /lus/flare/projects/LatticeFlavor/lehner Xiao-Yong Jin 🚨 2025-04-04 2025-12-11
16 Catastrophic memory error in context lmp_aurora_kokkos No response public LAMMPS Chris Knight 2025-04-03 2025-07-23
12 CXI alloc failed on cxi1: request exceeds ACs limits No response None Not Thomas 2025-04-01 2025-12-09
9 Multithreaded data-transfer can cause page-fault N/A Full QMCPACK Ye Luo 2025-04-01 2025-05-08
8 Lots of H2D copies produce CPU I9 error and incorrect value N/A Full QMCPACK Ye Luo 🚨 2025-04-01 2025-05-28
7 MPI_Bcast gets faster when turning off XPMEM pmodels/mpich#7334 see Issue on MPICH GitHub repo Ye Luo 2025-04-01 2025-04-24
6 MPICH memory allocation slows down at scale pmodels/mpich#7333 see MPICH issue Ye Luo 🚨 2025-04-01 2025-04-24
4 Incorrect results in receive buffer in GPU memory MPICH 7312 grid application (lattice QCD) Patrick Steinbrecher, Tim Williams 🚨 2025-03-25 2025-04-24
3 Linker error found by XGC CMPLRLLVM-66496 /home/zippy/smalltests/aurora/xgc42/fails Tim Williams 2025-03-19 2025-03-28

Update tables

The tables are updated automatically every night. To update them now, wait 10-15 s after the last change to AuroraBugTracking Issues, then run the following on any machine where gh is authenticated:

gh workflow run "Update Submodules" --repo "argonne-lcf/user-guides" && GH_FORCE_TTY=100% watch -c -n1 gh run list --repo "argonne-lcf/user-guides"
Then wait about 2 minutes until no jobs are running.

Alternatively, run aurora-bug-table-sync.sh to perform every step automatically and see exactly when the changes are live online.
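
The manual steps above (trigger the workflow, then wait until no jobs are running) can also be sketched as a small shell function. This is only an illustration and is not the contents of aurora-bug-table-sync.sh. The repository and workflow names come from the command above. The function names, the POLL_SECONDS variable, and the polling logic are assumptions. It also assumes a gh version that supports --json and --jq on gh run list.

```shell
#!/usr/bin/env bash
# Sketch: trigger the table update and wait for it to finish.
# REPO and WORKFLOW are taken from the command shown above; the function
# names, POLL_SECONDS, and the polling approach are illustrative assumptions.

REPO="argonne-lcf/user-guides"
WORKFLOW="Update Submodules"
POLL_SECONDS="${POLL_SECONDS:-10}"

# Print the number of recent runs in REPO that have not completed yet
# (queued, in progress, waiting, ...).
active_runs() {
    gh run list --repo "$REPO" --json status \
        --jq '[.[] | select(.status != "completed")] | length'
}

# Trigger the workflow, then poll until no run is active.
sync_tables() {
    gh workflow run "$WORKFLOW" --repo "$REPO" || return 1
    sleep "$POLL_SECONDS"   # give GitHub a moment to register the new run
    while [ "$(active_runs)" -gt 0 ]; do
        sleep "$POLL_SECONDS"
    done
    echo "No jobs running; tables should be up to date."
}
```

After sourcing the file, calling sync_tables replaces watching the run list by hand. It still requires an authenticated gh, and it waits for every active run in the repository, not only the one it triggered.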