Skip to content

[OpenMP] Unmapping of 'middle' structures leaks mappings (at runtime) #141046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jtb20 opened this issue May 22, 2025 · 1 comment
Open

[OpenMP] Unmapping of 'middle' structures leaks mappings (at runtime) #141046

jtb20 opened this issue May 22, 2025 · 1 comment
Assignees
Labels

Comments

@jtb20
Copy link
Contributor

jtb20 commented May 22, 2025

Consider this testcase:

#include <cstdlib>
#include <cstdio>

struct S {
  int *p0;
  int *p1;
  int *p2;
};

struct T {
  S *s0;
  S *s1;
  S *s2;
};

int main() {
  T *v = (T *) malloc (sizeof(T));
  v->s0 = (S *) malloc (sizeof(S));
  v->s1 = (S *) malloc (sizeof(S));
  v->s2 = (S *) malloc (sizeof(S));
  v->s1->p0 = (int*) calloc(8, sizeof(int));
  v->s1->p1 = (int*) calloc(8, sizeof(int));
  v->s1->p2 = (int*) calloc(8, sizeof(int));

  fprintf(stderr, "about to call target region:\n");
  fprintf(stderr, "&v=%p, &v->s0=%p, &v->s1=%p, &v->s2=%p, v->s0=%p, v->s1=%p, v->s2=%p\n",
          &v, &v->s0, &v->s1, &v->s2, (void*)v->s0, (void*)v->s1, (void*)v->s2);
  fprintf(stderr, "&v->s1->p0=%p, &v->s1->p1=%p, &v->s1->p2=%p, v->s1->p0=%p, v->s1->p1=%p, v->s1->p2=%p\n",
          &v->s1->p0, &v->s1->p1, &v->s1->p2, v->s1->p0, v->s1->p1, v->s1->p2);
  fprintf(stderr, "sizeof(S)=%d\n", (int)sizeof(S));
  fprintf(stderr, "sizeof(T)=%d\n", (int)sizeof(T));

#pragma omp target map(tofrom: v->s1->p1[:1])
  {
    v->s1->p1[0] += 3;
  }

  fprintf(stderr, "v->s1->p0[0]=%d, v->s1->p1[0]=%d, v->s1->p2[0]=%d\n",
          v->s1->p0[0], v->s1->p1[0], v->s1->p2[0]);

  free(v->s1->p0);
  free(v->s1->p1);
  free(v->s1->p2);
  free(v->s0);
  free(v->s1);
  free(v->s2);
  free(v);

  return 0;
}

Compiled with offloading to AMDGCN, this appears to work, but there is a problem: three blocks are created by the omp target directive, but only two are destroyed. Setting LIBOMPTARGET_DEBUG=5, we can see:

...
omptarget --> Entry  0: Base=0x000057d1c272a350, Begin=0x000057d1c272a358, Size=8, Type=0x20, Name=unknown
(base is &v->s0, begin is &v->s1)
omptarget --> Entry  1: Base=0x000057d1c272a358, Begin=0x000057d1c272a398, Size=8, Type=0x1000000000010, Name=unknown
(base is &v->s1, begin is &v->s1->p1)
omptarget --> Entry  2: Base=0x000057d1c272a398, Begin=0x000057d1c272a400, Size=4, Type=0x13, Name=unknown
(base is &v->s1->p1, begin is v->s1->p1)
...
omptarget --> Creating new map entry with HstPtrBase=0x000057d1c272a350, HstPtrBegin=0x000057d1c272a358, TgtAllocBegin=0x000071bc75c00000, TgtPtrBegin=0x000071bc75c00008, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown
(HstPtrBegin is &v->s1)
omptarget --> Creating new map entry with HstPtrBase=0x000057d1c272a390, HstPtrBegin=0x000057d1c272a398, TgtAllocBegin=0x000071bc75c01000, TgtPtrBegin=0x000071bc75c01000, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown
(HstPtrBegin is &v->s1->p1)
omptarget --> Creating new map entry with HstPtrBase=0x000057d1c272a400, HstPtrBegin=0x000057d1c272a400, TgtAllocBegin=0x000071bc75c02000, TgtPtrBegin=0x000071bc75c02000, Size=4, DynRefCount=1, HoldRefCount=0, Name=unknown
(HstPtrBegin is v->s1->p1)
...
omptarget --> Removing map entry with HstPtrBegin=0x000057d1c272a400, TgtPtrBegin=0x000071bc75c02000, Size=4, Name=unknown
omptarget --> Deleting tgt data 0x000071bc75c02000 of size 4 by freeing allocation starting at 0x000071bc75c02000
(HstPtrBegin is v->s1->p1)
omptarget --> Removing map entry with HstPtrBegin=0x000057d1c272a358, TgtPtrBegin=0x000071bc75c00008, Size=8, Name=unknown
omptarget --> Deleting tgt data 0x000071bc75c00008 of size 8 by freeing allocation starting at 0x000071bc75c00000
(HstPtrBegin is &v->s1)

(Excerpt, parenthesized lines edited in.)

The map entry for the pointer v->s1->p1 is created, but never destroyed.

Without delving too deep into this in this initial bug report, I'm not sure the current arrangement of type bits/flags is sufficient to describe this case well. I think the lowering code in CGOpenMPRuntime.cpp:generateInfoForComponentList is operating according to the description in the comments, but the runtime can't see that entry 1 is both a MEMBER_OF mapping and a containing structure itself, so as to distinguish this case from other cases that it does handle correctly at present.

The 'entry 1' is not deleted because its refcount does not reach zero in omptarget.cpp:postProcessingTargetDataEnd:

    if (DelEntry && (Entry->getTotalRefCount() != 0 || IsNotLastUser)) {
      // The thread is not in charge of deletion anymore. Give up access
      // to the HDTT map and unset the deletion flag.
      HDTTMap.destroy();
      DelEntry = false;
    }

So this is either a Clang bug, or a runtime bug, but I'm not sure exactly which. I think probably some new flag bit needs adding and the runtime adjusted to process it properly. But then what about backwards-compatibility concerns?

This is related to, but slightly different from, #141042.

@llvmbot
Copy link
Member

llvmbot commented May 22, 2025

@llvm/issue-subscribers-openmp

Author: Julian Brown (jtb20)

Consider this testcase: ``` #include <cstdlib> #include <cstdio>

struct S {
int *p0;
int *p1;
int *p2;
};

struct T {
S *s0;
S *s1;
S *s2;
};

int main() {
T *v = (T *) malloc (sizeof(T));
v->s0 = (S ) malloc (sizeof(S));
v->s1 = (S ) malloc (sizeof(S));
v->s2 = (S ) malloc (sizeof(S));
v->s1->p0 = (int
) calloc(8, sizeof(int));
v->s1->p1 = (int
) calloc(8, sizeof(int));
v->s1->p2 = (int
) calloc(8, sizeof(int));

fprintf(stderr, "about to call target region:\n");
fprintf(stderr, "&v=%p, &v->s0=%p, &v->s1=%p, &v->s2=%p, v->s0=%p, v->s1=%p, v->s2=%p\n",
&v, &v->s0, &v->s1, &v->s2, (void*)v->s0, (void*)v->s1, (void*)v->s2);
fprintf(stderr, "&v->s1->p0=%p, &v->s1->p1=%p, &v->s1->p2=%p, v->s1->p0=%p, v->s1->p1=%p, v->s1->p2=%p\n",
&v->s1->p0, &v->s1->p1, &v->s1->p2, v->s1->p0, v->s1->p1, v->s1->p2);
fprintf(stderr, "sizeof(S)=%d\n", (int)sizeof(S));
fprintf(stderr, "sizeof(T)=%d\n", (int)sizeof(T));

#pragma omp target map(tofrom: v->s1->p1[:1])
{
v->s1->p1[0] += 3;
}

fprintf(stderr, "v->s1->p0[0]=%d, v->s1->p1[0]=%d, v->s1->p2[0]=%d\n",
v->s1->p0[0], v->s1->p1[0], v->s1->p2[0]);

free(v->s1->p0);
free(v->s1->p1);
free(v->s1->p2);
free(v->s0);
free(v->s1);
free(v->s2);
free(v);

return 0;
}

Compiled with offloading to AMDGCN, this appears to work, but there is a problem: three blocks are created by the `omp target` directive, but only two are destroyed. Setting LIBOMPTARGET_DEBUG=5, we can see:

...
omptarget --> Entry 0: Base=0x000057d1c272a350, Begin=0x000057d1c272a358, Size=8, Type=0x20, Name=unknown
(base is &v->s0, begin is &v->s1)
omptarget --> Entry 1: Base=0x000057d1c272a358, Begin=0x000057d1c272a398, Size=8, Type=0x1000000000010, Name=unknown
(base is &v->s1, begin is &v->s1->p1)
omptarget --> Entry 2: Base=0x000057d1c272a398, Begin=0x000057d1c272a400, Size=4, Type=0x13, Name=unknown
(base is &v->s1->p1, begin is v->s1->p1)
...
omptarget --> Creating new map entry with HstPtrBase=0x000057d1c272a350, HstPtrBegin=0x000057d1c272a358, TgtAllocBegin=0x000071bc75c00000, TgtPtrBegin=0x000071bc75c00008, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown
(HstPtrBegin is &v->s1)
omptarget --> Creating new map entry with HstPtrBase=0x000057d1c272a390, HstPtrBegin=0x000057d1c272a398, TgtAllocBegin=0x000071bc75c01000, TgtPtrBegin=0x000071bc75c01000, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown
(HstPtrBegin is &v->s1->p1)
omptarget --> Creating new map entry with HstPtrBase=0x000057d1c272a400, HstPtrBegin=0x000057d1c272a400, TgtAllocBegin=0x000071bc75c02000, TgtPtrBegin=0x000071bc75c02000, Size=4, DynRefCount=1, HoldRefCount=0, Name=unknown
(HstPtrBegin is v->s1->p1)
...
omptarget --> Removing map entry with HstPtrBegin=0x000057d1c272a400, TgtPtrBegin=0x000071bc75c02000, Size=4, Name=unknown
omptarget --> Deleting tgt data 0x000071bc75c02000 of size 4 by freeing allocation starting at 0x000071bc75c02000
(HstPtrBegin is v->s1->p1)
omptarget --> Removing map entry with HstPtrBegin=0x000057d1c272a358, TgtPtrBegin=0x000071bc75c00008, Size=8, Name=unknown
omptarget --> Deleting tgt data 0x000071bc75c00008 of size 8 by freeing allocation starting at 0x000071bc75c00000
(HstPtrBegin is &v->s1)

(Excerpt, parenthesized lines edited in.)

The map entry for the pointer `v-&gt;s1-&gt;p1` is created, but never destroyed.

Without delving too deep into this in this initial bug report, I'm not sure the current arrangement of type bits/flags is sufficient to describe this case well. I think the lowering code in `CGOpenMPRuntime.cpp:generateInfoForComponentList` is operating according to the description in the comments, but the runtime can't see that entry 1 is both a `MEMBER_OF` mapping and a containing structure itself, so as to distinguish this case from other cases that it does handle correctly at present.

The 'entry 1' is not deleted because its refcount does not reach zero in `omptarget.cpp:postProcessingTargetDataEnd`:
if (DelEntry &amp;&amp; (Entry-&gt;getTotalRefCount() != 0 || IsNotLastUser)) {
  // The thread is not in charge of deletion anymore. Give up access
  // to the HDTT map and unset the deletion flag.
  HDTTMap.destroy();
  DelEntry = false;
}

So this is either a Clang bug, or a runtime bug, but I'm not sure exactly which. I think probably some new flag bit needs adding and the runtime adjusted to process it properly. But then what about backwards-compatibility concerns?

This is related to, but slightly different from, https://github.com/llvm/llvm-project/issues/141042.
</details>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants