My repo for std::simd and std::experimental::simd related work.
Matthew Malcomson 304d08fea9 libgomp: Ensure memory sync after performing tasks
As described in PR 122356 there is a theoretical bug around not
"publishing" user data written in a task when that task has been
executed by a thread after entry to a barrier.

Key points of the C memory model that are relevant:
1) Memory writes can be seen in a different order in different threads.
2) When one thread (A) reads a value with acquire memory ordering that
   another thread (B) has written with release memory ordering, then all
   data written in thread (B) before the write that set this value will
   be visible to thread (A) after that read.
3) This point requires that the read and write operate on the same
   value.  The guarantee is one-way:  It specifies that thread (A) will
   see the writes that thread (B) has performed before the specified
   write.  It does not specify that thread (B) will see writes that
   thread (A) has performed before reading this value.
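Points 2 and 3 can be illustrated with a minimal message-passing sketch in
C11 atomics.  This is not libgomp code: `payload`, `ready`, `writer`,
`reader` and `run_message_passing` are hypothetical names chosen for the
example, and POSIX threads are assumed to be available.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain (non-atomic) user data */
static atomic_int ready;   /* the one value both threads read and write */

/* Thread (B): write the data, then publish it with a release store. */
static void *writer(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Thread (A): spin until the acquire load observes the release store.
   Only after that is reading `payload` guaranteed to see 42. */
static void *reader(void *arg)
{
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;
    *(int *)arg = payload;
    return NULL;
}

/* Run one writer/reader pair; return what the reader saw, or -1 if a
   thread could not be created. */
int run_message_passing(void)
{
    pthread_t a, b;
    int seen = -1;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);

    if (pthread_create(&a, NULL, reader, &seen) != 0)
        return -1;
    if (pthread_create(&b, NULL, writer, NULL) != 0) {
        /* Unblock the reader before giving up. */
        atomic_store_explicit(&ready, 1, memory_order_release);
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```

The guarantee only runs from `writer` to `reader`: nothing here lets the
writer see stores the reader made before its acquire load.  If the store
to `ready` were a plain assignment, no ordering would be formed at all.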

Outline of the issue:
1) While there is a memory sync at entry to the barrier, user code can
   be run after threads have all entered the barrier.
2) There are various points where a memory sync can occur after entry to
   the barrier:
   - One thread getting the `task_lock` mutex that another thread has
     released.
   - Last thread incrementing `bar->generation` with `MEMMODEL_RELEASE`
     and some other thread reading it with `MEMMODEL_ACQUIRE`.
   However there are code paths that can avoid these points.
3) On the code-paths that can avoid these points we could have no memory
   synchronisation between a write to user data that happened in a task
   executed after entry to the barrier, and some other thread running
   the implicit task after the barrier.  Hence that "other thread" may
   read a stale value that should have been overwritten in the explicit
   task.

There are two code-paths that I believe I've identified:
1) The last thread sees `task_count == 0` and increments the generation
   with `MEMMODEL_RELEASE` before continuing on to the next implicit
   task.
   If some other thread had executed a task that wrote user data I
   don't see any way in which an acquire-release ordering *from* the
   thread writing user data *to* the last thread would have been formed.
2) After all threads have entered the barrier, some thread (A) is
   waiting in `do_wait`.  Some other thread (B) completes a task writing
   user data.  Thread (B) increments the generation using
   `gomp_team_barrier_done` (non atomically -- hence not allowing the
   formation of any acquire-release ordering with this write).  Thread
   (A) reads the generation with `MEMMODEL_ACQUIRE`, but since the write
   was not atomic that does not form an ordering.

This patch makes two changes:
1) The write that sets `task_count` to zero in
   `gomp_barrier_handle_tasks` is done atomically, and the read of
   `task_count` in `gomp_team_barrier_wait_end` is also made atomic.
   This addresses the first case by forming an acquire-release ordering
   *from* the thread executing tasks *to* the thread that will increment
   the generation and continue.
2) The write of `bar->generation` via `gomp_team_barrier_done` called
   from `gomp_barrier_handle_tasks` is done atomically.  This means that
   it will form an acquire-release synchronisation with the existing
   atomic read of `bar->generation` in the main loop of
   `gomp_team_barrier_wait_end`.
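The shape of the two changes can be sketched with the GCC `__atomic`
builtins.  This is an illustration under assumptions, not the patch
itself: `struct sketch_team` and the `sketch_*` functions are
hypothetical stand-ins, and the sketch decrements atomically for every
task, whereas the ChangeLog below only mentions the write that takes
`task_count` to zero.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the team/barrier state; not libgomp's layout. */
struct sketch_team {
    unsigned task_count;   /* tasks still outstanding */
    unsigned generation;   /* barrier generation counter */
};

/* Called by a thread that has just finished a task.  The release
   ordering publishes the task's writes to whoever later observes the
   new count with an acquire load.  Returns true if this was the last
   outstanding task. */
bool sketch_task_finished(struct sketch_team *t)
{
    return __atomic_sub_fetch(&t->task_count, 1, __ATOMIC_RELEASE) == 0;
}

/* Called by the last thread into the barrier.  The acquire load means
   that seeing zero also makes the finished tasks' writes visible. */
bool sketch_no_tasks_left(struct sketch_team *t)
{
    return __atomic_load_n(&t->task_count, __ATOMIC_ACQUIRE) == 0;
}

/* Advance the generation with a release store instead of a plain write.
   Assumes only one thread advances the generation at a time. */
void sketch_barrier_done(struct sketch_team *t)
{
    unsigned gen = __atomic_load_n(&t->generation, __ATOMIC_RELAXED);
    __atomic_store_n(&t->generation, gen + 1, __ATOMIC_RELEASE);
}

/* Waiting thread: acquire load that pairs with sketch_barrier_done. */
unsigned sketch_read_generation(struct sketch_team *t)
{
    return __atomic_load_n(&t->generation, __ATOMIC_ACQUIRE);
}
```

Each release operation is paired with an acquire load of the same
variable, which is what point 3 of the memory model summary requires.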

Testing done:
- Bootstrap & regtest on aarch64 and x86_64.
  - With & without _LIBGOMP_CHECKING_.
  - Testsuite with & without OMP_WAIT_POLICY=passive
- Cross compilation & regtest on arm.
- TSAN done on this as part of all my upstream patches.

libgomp/ChangeLog:
	PR libgomp/122356
	* config/gcn/bar.c (gomp_team_barrier_wait_end): Atomically read
	team->task_count.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* config/gcn/bar.h (gomp_team_barrier_done): Atomically write
	bar->generation.
	* config/linux/bar.c (gomp_team_barrier_wait_end): Atomically
	read team->task_count.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* config/linux/bar.h (gomp_team_barrier_done): Atomically write
	bar->generation.
	* config/posix/bar.c (gomp_team_barrier_wait_end): Atomically
	read team->task_count.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* config/posix/bar.h (gomp_team_barrier_done): Atomically write
	bar->generation.
	* config/rtems/bar.h (gomp_team_barrier_done): Atomically write
	bar->generation.
	* task.c (gomp_barrier_handle_tasks): Atomically write
	team->task_count when decrementing to zero.
	* testsuite/libgomp.c/pr122356.c: New test.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
2026-01-20 03:54:51 +00:00
.forgejo Containerfile for base forge actions 2026-01-08 07:31:40 -05:00
.github
c++tools
config
contrib Daily bump. 2026-01-10 00:16:49 +00:00
fixincludes Daily bump. 2026-01-10 00:16:49 +00:00
gcc cobol: Fix up -Wmove-index option description 2026-01-20 01:20:07 +01:00
gnattools Daily bump. 2026-01-10 00:16:49 +00:00
gotools
include
INSTALL
libada
libatomic Daily bump. 2026-01-20 00:16:30 +00:00
libbacktrace
libcc1 Daily bump. 2026-01-16 00:16:30 +00:00
libcody
libcpp Daily bump. 2026-01-16 00:16:30 +00:00
libdecnumber Daily bump. 2026-01-10 00:16:49 +00:00
libffi
libga68 Daily bump. 2026-01-20 00:16:30 +00:00
libgcc Daily bump. 2026-01-14 00:16:30 +00:00
libgcobol Daily bump. 2026-01-18 00:16:31 +00:00
libgfortran Daily bump. 2026-01-14 00:16:30 +00:00
libgm2
libgo
libgomp libgomp: Ensure memory sync after performing tasks 2026-01-20 03:54:51 +00:00
libgrust
libiberty Daily bump. 2026-01-13 00:16:32 +00:00
libitm
libobjc
libphobos
libquadmath
libsanitizer
libssp
libstdc++-v3 Daily bump. 2026-01-20 00:16:30 +00:00
libvtv
lto-plugin
maintainer-scripts
zlib
.b4-config
.dir-locals.el
.editorconfig
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2026-01-20 00:16:30 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in toplevel: Unbreak Ada build [PR123490] 2026-01-10 11:36:25 +01:00
config.guess
config.rpath
config.sub
configure s390: Deprecate -m31 2026-01-19 09:56:51 +01:00
configure.ac s390: Deprecate -m31 2026-01-19 09:56:51 +01:00
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS MAINTAINERS: update my email address 2026-01-14 11:49:11 -08:00
Makefile.def
Makefile.in
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt
symlink-tree bugzilla: remove gcc-bugs@ mailing list address 2026-01-09 11:10:38 +01:00
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.