Commit 4e7bef8

runtime: mark newly-mapped memory as scavenged
On most platforms newly-mapped memory is untouched, meaning the pages backing the region haven't been faulted in yet. However, we mark this memory as unscavenged, which means the background scavenger aggressively "returns" this memory to the OS if the heap is small.

The only platform where newly-mapped memory is actually unscavenged (and counts toward the application's RSS) is Windows, since (*mheap).sysAlloc commits the reservation. Instead of making a special case for Windows, I change the requirements a bit for a sysReserve'd region: it must now be both sysMap'd and sysUsed'd, with sysMap being a no-op on Windows.

Comments about memory allocation have been updated to include a more up-to-date mental model of the states a region of memory may be in (at a very low level) and how to transition between these states.

This now means we can correctly mark newly-mapped heap memory as scavenged on every platform, reducing the load on the background scavenger early in an application's lifetime when the heap is small. As a result, heap-growth scavenging is no longer necessary, since any actual RSS growth will be accounted for on the allocation codepath.

Finally, this change also cleans up grow a little so that it no longer pretends to free an in-use span, and instead performs the necessary operations directly.

Fixes #32012.
Fixes #31966.
Updates #26473.

Change-Id: Ie06061eb638162e0560cdeb0b8993d94cfb4d290
Reviewed-on: https://go-review.googlesource.com/c/go/+/177097
Run-TryBot: Michael Knyszek <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Austin Clements <[email protected]>
1 parent cb5c82b commit 4e7bef8

File tree: 3 files changed, +95 -54 lines

src/runtime/malloc.go
src/runtime/mem_windows.go
src/runtime/mheap.go
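
The state machine described in the commit message (None, Reserved, Prepared, Ready) is implemented per-OS by the sys* helpers in the diffs below. As a rough illustration of how those transitions look on a POSIX-y system, here is a stand-alone Go sketch using golang.org/x/sys/unix. The helper names (reserve, prepare, makeReady, makeUnused) are invented for this example, and it assumes Linux-style MADV_FREE; the runtime's real helpers are sysReserve, sysMap, sysUsed, and sysUnused, and they issue raw syscalls rather than using this package.

// State sketch: None -> Reserved -> Prepared -> Ready, per the model in
// this commit. Helper names are hypothetical; this is not runtime code.
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// reserve: None -> Reserved. PROT_NONE address space; it faults on access
// and is never backed by physical memory.
func reserve(n int) ([]byte, error) {
	return unix.Mmap(-1, 0, n, unix.PROT_NONE, unix.MAP_PRIVATE|unix.MAP_ANON)
}

// prepare: Reserved -> Prepared. Set the permission bits once; the pages
// still need not be faulted in by the OS.
func prepare(b []byte) error {
	return unix.Mprotect(b, unix.PROT_READ|unix.PROT_WRITE)
}

// makeReady: Prepared -> Ready. A no-op here, as on Linux: the first touch
// faults pages in. (On Windows this step is an explicit MEM_COMMIT.)
func makeReady(b []byte) {}

// makeUnused: Ready -> Prepared. MADV_FREE lets the kernel reclaim the
// pages lazily; contents are forfeit until the region is used again.
func makeUnused(b []byte) error {
	return unix.Madvise(b, unix.MADV_FREE)
}

func main() {
	const n = 1 << 20
	b, err := reserve(n)
	if err != nil {
		panic(err)
	}
	if err := prepare(b); err != nil {
		panic(err)
	}
	makeReady(b)
	b[0] = 1 // safe: the region is Ready
	if err := makeUnused(b); err != nil {
		panic(err)
	}
	fmt.Println("told the OS it may reclaim", n, "bytes")
	_ = unix.Munmap(b) // any state -> None
}

Note how the permission bits are set once, in prepare; afterwards the region bounces between Ready and Prepared with a single madvise call, which is exactly the performance argument the new malloc.go comments make for having a Prepared state at all.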

src/runtime/malloc.go

Lines changed: 66 additions & 25 deletions
@@ -332,37 +332,74 @@ var physPageSize uintptr
 // value is always safe (though potentially less efficient).
 var physHugePageSize uintptr

-// OS-defined helpers:
+// OS memory management abstraction layer
 //
-// sysAlloc obtains a large chunk of zeroed memory from the
-// operating system, typically on the order of a hundred kilobytes
-// or a megabyte.
-// NOTE: sysAlloc returns OS-aligned memory, but the heap allocator
-// may use larger alignment, so the caller must be careful to realign the
-// memory obtained by sysAlloc.
+// Regions of the address space managed by the runtime may be in one of four
+// states at any given time:
+// 1) None - Unreserved and unmapped, the default state of any region.
+// 2) Reserved - Owned by the runtime, but accessing it would cause a fault.
+//               Does not count against the process' memory footprint.
+// 3) Prepared - Reserved, intended not to be backed by physical memory (though
+//               an OS may implement this lazily). Can transition efficiently to
+//               Ready. Accessing memory in such a region is undefined (may
+//               fault, may give back unexpected zeroes, etc.).
+// 4) Ready - may be accessed safely.
+//
+// This set of states is more than is strictly necessary to support all the
+// currently supported platforms. One could get by with just None, Reserved, and
+// Ready. However, the Prepared state gives us flexibility for performance
+// purposes. For example, on POSIX-y operating systems, Reserved is usually a
+// private anonymous mmap'd region with PROT_NONE set, and to transition
+// to Ready would require setting PROT_READ|PROT_WRITE. However, the
+// underspecification of Prepared lets us use just MADV_FREE to transition from
+// Ready to Prepared. Thus, with the Prepared state we can set the permission
+// bits just once early on, and we can efficiently tell the OS that it's free to
+// take pages away from us when we don't strictly need them.
+//
+// For each OS there is a common set of helpers defined that transition
+// memory regions between these states. The helpers are as follows:
 //
-// sysUnused notifies the operating system that the contents
-// of the memory region are no longer needed and can be reused
-// for other purposes.
-// sysUsed notifies the operating system that the contents
-// of the memory region are needed again.
+// sysAlloc transitions an OS-chosen region of memory from None to Ready.
+// More specifically, it obtains a large chunk of zeroed memory from the
+// operating system, typically on the order of a hundred kilobytes
+// or a megabyte. This memory is always immediately available for use.
 //
-// sysFree returns it unconditionally; this is only used if
-// an out-of-memory error has been detected midway through
-// an allocation. It is okay if sysFree is a no-op.
+// sysFree transitions a memory region from any state to None. Therefore, it
+// returns memory unconditionally. It is used if an out-of-memory error has been
+// detected midway through an allocation or to carve out an aligned section of
+// the address space. It is okay if sysFree is a no-op only if sysReserve always
+// returns a memory region aligned to the heap allocator's alignment
+// restrictions.
 //
-// sysReserve reserves address space without allocating memory.
+// sysReserve transitions a memory region from None to Reserved. It reserves
+// address space in such a way that it would cause a fatal fault upon access
+// (either via permissions or not committing the memory). Such a reservation is
+// thus never backed by physical memory.
 // If the pointer passed to it is non-nil, the caller wants the
 // reservation there, but sysReserve can still choose another
 // location if that one is unavailable.
 // NOTE: sysReserve returns OS-aligned memory, but the heap allocator
 // may use larger alignment, so the caller must be careful to realign the
-// memory obtained by sysAlloc.
+// memory obtained by sysReserve.
+//
+// sysMap transitions a memory region from Reserved to Prepared. It ensures the
+// memory region can be efficiently transitioned to Ready.
 //
-// sysMap maps previously reserved address space for use.
+// sysUsed transitions a memory region from Prepared to Ready. It notifies the
+// operating system that the memory region is needed and ensures that the region
+// may be safely accessed. This is typically a no-op on systems that don't have
+// an explicit commit step and hard over-commit limits, but is critical on
+// Windows, for example.
 //
-// sysFault marks a (already sysAlloc'd) region to fault
-// if accessed. Used only for debugging the runtime.
+// sysUnused transitions a memory region from Ready to Prepared. It notifies the
+// operating system that the physical pages backing this memory region are no
+// longer needed and can be reused for other purposes. The contents of a
+// sysUnused memory region are considered forfeit and the region must not be
+// accessed again until sysUsed is called.
+//
+// sysFault transitions a memory region from Ready or Prepared to Reserved. It
+// marks a region such that it will always fault if accessed. Used only for
+// debugging the runtime.

 func mallocinit() {
 	if class_to_size[_TinySizeClass] != _TinySize {

@@ -539,6 +576,9 @@ func mallocinit() {
 // heapArenaBytes. sysAlloc returns nil on failure.
 // There is no corresponding free function.
 //
+// sysAlloc returns a memory region in the Prepared state. This region must
+// be transitioned to Ready before use.
+//
 // h must be locked.
 func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
 	n = round(n, heapArenaBytes)

@@ -580,7 +620,7 @@ func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
 			// TODO: This would be cleaner if sysReserve could be
 			// told to only return the requested address. In
 			// particular, this is already how Windows behaves, so
-			// it would simply things there.
+			// it would simplify things there.
 			if v != nil {
 				sysFree(v, n, nil)
 			}

@@ -637,7 +677,7 @@ func (h *mheap) sysAlloc(n uintptr) (v unsafe.Pointer, size uintptr) {
 		throw("misrounded allocation in sysAlloc")
 	}

-	// Back the reservation.
+	// Transition from Reserved to Prepared.
 	sysMap(v, size, &memstats.heap_sys)

 mapped:

@@ -1288,8 +1328,8 @@ func inPersistentAlloc(p uintptr) bool {
 }

 // linearAlloc is a simple linear allocator that pre-reserves a region
-// of memory and then maps that region as needed. The caller is
-// responsible for locking.
+// of memory and then maps that region into the Ready state as needed. The
+// caller is responsible for locking.
 type linearAlloc struct {
 	next   uintptr // next free byte
 	mapped uintptr // one byte past end of mapped space

@@ -1308,8 +1348,9 @@ func (l *linearAlloc) alloc(size, align uintptr, sysStat *uint64) unsafe.Pointer
 	}
 	l.next = p + size
 	if pEnd := round(l.next-1, physPageSize); pEnd > l.mapped {
-		// We need to map more of the reserved space.
+		// Transition from Reserved to Prepared to Ready.
 		sysMap(unsafe.Pointer(l.mapped), pEnd-l.mapped, sysStat)
+		sysUsed(unsafe.Pointer(l.mapped), pEnd-l.mapped)
 		l.mapped = pEnd
 	}
 	return unsafe.Pointer(p)
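
The linearAlloc change above is the new contract in miniature: a caller holding a Reserved region must now perform both transitions (sysMap, then sysUsed) before handing memory out. Below is a minimal sketch of that pattern, with print-only stubs standing in for the runtime's sys* helpers; the stubs, the fake base address, and the 4 KiB page size are all illustrative assumptions.

// Sketch of the linearAlloc pattern after this commit: grow the mapped
// prefix of a reserved region via sysMap (Reserved -> Prepared) and
// sysUsed (Prepared -> Ready) before returning memory to the caller.
package main

import "fmt"

const pageSize = 4096 // illustrative; the runtime uses physPageSize

func sysMap(base, n uintptr)  { fmt.Printf("sysMap  [%#x, %#x)\n", base, base+n) }
func sysUsed(base, n uintptr) { fmt.Printf("sysUsed [%#x, %#x)\n", base, base+n) }

// round rounds n up to a multiple of a (a power of two), mirroring the
// runtime's helper of the same name.
func round(n, a uintptr) uintptr { return (n + a - 1) &^ (a - 1) }

type linearAlloc struct {
	next   uintptr // next free byte
	mapped uintptr // one byte past end of mapped space
	end    uintptr // end of reserved space
}

func (l *linearAlloc) alloc(size, align uintptr) uintptr {
	p := round(l.next, align)
	if p+size > l.end {
		return 0 // reservation exhausted
	}
	l.next = p + size
	if pEnd := round(l.next-1, pageSize); pEnd > l.mapped {
		// Take the new pages all the way to Ready, as the patched
		// (*linearAlloc).alloc now does.
		sysMap(l.mapped, pEnd-l.mapped)
		sysUsed(l.mapped, pEnd-l.mapped)
		l.mapped = pEnd
	}
	return p
}

func main() {
	base := uintptr(0x10000000) // pretend a reservation starts here
	l := &linearAlloc{next: base, mapped: base, end: base + 1<<20}
	for i := 0; i < 3; i++ {
		fmt.Printf("alloc -> %#x\n", l.alloc(3000, 8))
	}
}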

src/runtime/mem_windows.go

Lines changed: 17 additions & 18 deletions
@@ -60,24 +60,34 @@ func sysUnused(v unsafe.Pointer, n uintptr) {
 }

 func sysUsed(v unsafe.Pointer, n uintptr) {
-	r := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
-	if r != 0 {
+	p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
+	if p == uintptr(v) {
 		return
 	}

 	// Commit failed. See SysUnused.
-	for n > 0 {
-		small := n
+	// Hold on to n here so we can give back a better error message
+	// for certain cases.
+	k := n
+	for k > 0 {
+		small := k
 		for small >= 4096 && stdcall4(_VirtualAlloc, uintptr(v), small, _MEM_COMMIT, _PAGE_READWRITE) == 0 {
 			small /= 2
 			small &^= 4096 - 1
 		}
 		if small < 4096 {
-			print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", getlasterror(), "\n")
-			throw("runtime: failed to commit pages")
+			errno := getlasterror()
+			switch errno {
+			case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT:
+				print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n")
+				throw("out of memory")
+			default:
+				print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", errno, "\n")
+				throw("runtime: failed to commit pages")
+			}
 		}
 		v = add(v, small)
-		n -= small
+		k -= small
 	}
 }

@@ -116,15 +126,4 @@ func sysReserve(v unsafe.Pointer, n uintptr) unsafe.Pointer {

 func sysMap(v unsafe.Pointer, n uintptr, sysStat *uint64) {
 	mSysStatInc(sysStat, n)
-	p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE)
-	if p != uintptr(v) {
-		errno := getlasterror()
-		print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n")
-		switch errno {
-		case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT:
-			throw("out of memory")
-		default:
-			throw("runtime: cannot map pages in arena address space")
-		}
-	}
 }
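
On Windows the state machine maps directly onto reserved versus committed pages, which is why the MEM_COMMIT step now lives in sysUsed while sysMap becomes a no-op (aside from accounting). The following sketch shows that mapping using golang.org/x/sys/windows; it is illustrative user code, assuming that package's wrappers, whereas the runtime calls VirtualAlloc and VirtualFree through stdcall wrappers instead.

// Sketch of the Windows mapping of the state machine: Reserved and
// Prepared collapse onto MEM_RESERVE, and Ready is MEM_COMMIT.
//go:build windows

package main

import (
	"fmt"

	"golang.org/x/sys/windows"
)

func main() {
	const n = 1 << 20

	// sysReserve analogue: None -> Reserved. No commit charge yet.
	v, err := windows.VirtualAlloc(0, n, windows.MEM_RESERVE, windows.PAGE_READWRITE)
	if err != nil {
		panic(err)
	}

	// sysMap analogue: a no-op on Windows after this commit; the region
	// is already cheap to commit later, which is all Prepared promises.

	// sysUsed analogue: Prepared -> Ready via MEM_COMMIT. This is the
	// step that counts against the commit limit and, once touched, RSS.
	if _, err := windows.VirtualAlloc(v, n, windows.MEM_COMMIT, windows.PAGE_READWRITE); err != nil {
		panic(err)
	}

	// sysUnused analogue: Ready -> Prepared via MEM_DECOMMIT.
	if err := windows.VirtualFree(v, n, windows.MEM_DECOMMIT); err != nil {
		panic(err)
	}

	// sysFree analogue: any state -> None. MEM_RELEASE requires size 0.
	if err := windows.VirtualFree(v, 0, windows.MEM_RELEASE); err != nil {
		panic(err)
	}
	fmt.Println("reserved, committed, decommitted, released", n, "bytes")
}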

src/runtime/mheap.go

Lines changed: 12 additions & 11 deletions
@@ -1246,20 +1246,22 @@ func (h *mheap) grow(npage uintptr) bool {
 		return false
 	}

-	// Scavenge some pages out of the free treap to make up for
-	// the virtual memory space we just allocated, but only if
-	// we need to.
-	h.scavengeIfNeededLocked(size)
-
 	// Create a fake "in use" span and free it, so that the
-	// right coalescing happens.
+	// right accounting and coalescing happens.
 	s := (*mspan)(h.spanalloc.alloc())
 	s.init(uintptr(v), size/pageSize)
 	h.setSpans(s.base(), s.npages, s)
-	atomic.Store(&s.sweepgen, h.sweepgen)
-	s.state = mSpanInUse
-	h.pagesInUse += uint64(s.npages)
-	h.freeSpanLocked(s, false, true)
+	s.state = mSpanFree
+	memstats.heap_idle += uint64(size)
+	// (*mheap).sysAlloc returns untouched/uncommitted memory.
+	s.scavenged = true
+	// s is always aligned to the heap arena size, which is always > physPageSize,
+	// so it's totally safe to just add directly to heap_released. Coalescing,
+	// if possible, will also always be correct in terms of accounting, because
+	// s.base() must be a physical page boundary.
+	memstats.heap_released += uint64(size)
+	h.coalesce(s)
+	h.free.insert(s)
 	return true
 }

@@ -1314,7 +1316,6 @@ func (h *mheap) freeManual(s *mspan, stat *uint64) {
 	unlock(&h.lock)
 }

-// s must be on the busy list or unlinked.
 func (h *mheap) freeSpanLocked(s *mspan, acctinuse, acctidle bool) {
 	switch s.state {
 	case mSpanManual:
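
The new comment in grow leans on an alignment fact: heapArenaBytes is a multiple of physPageSize, so an arena-aligned span base and size are also physical-page-aligned, and the whole region can be credited to heap_released without over- or under-counting partial pages. A minimal sketch of that arithmetic, with illustrative sizes (the real heapArenaBytes and physPageSize are platform-dependent):

// Why grow can add size directly to heap_released: arena alignment
// implies physical-page alignment, so the scavenged-byte count is exact.
package main

import "fmt"

func main() {
	const (
		physPageSize   = 4 << 10  // illustrative; platform-dependent
		heapArenaBytes = 64 << 20 // illustrative; 64 MiB on linux/amd64
	)
	base := uintptr(17 * heapArenaBytes) // any arena-aligned address
	size := uintptr(2 * heapArenaBytes)  // grow works in whole arenas

	// Both hold because heapArenaBytes % physPageSize == 0.
	fmt.Println(base%physPageSize == 0, size%physPageSize == 0) // true true
}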
