A Story Of realloc (And Laziness)

A Story Of realloc (And Laziness)(blog.httrack.com)

130 points by janerik 12 years ago | 63 comments

optimiz3 12 years ago |

Code in the article for realloc is dangerous and wrong:

  void *realloc(void *ptr, size_t size) {
    void *nptr = malloc(size);
    if (nptr == NULL) {
      free(ptr);
      return NULL;
    }
    memcpy(nptr, ptr, size); // KABOOM
    free(ptr);
    return nptr;
  }

Line marked KABOOM copies $DEST_BYTE_COUNT, rather than $SOURCE_BYTE_COUNT.

Say you want to realloc a 1 byte buffer to a 4 byte buffer - you just copied 4 bytes from a 1 byte buffer which means you're reading 3 bytes from 0xDEADBEEF/0xBADF000D/segfault land.

EDIT: Also, this is why the ENTIRE PREMISE of implementing your own reallocator speced to just the realloc prototype doesn't make much sense. You simply don't know the size of the original data with just a C heap pointer as this is not standardized AFAIK.

greenyoda 12 years ago | |

"Also, this is why the ENTIRE PREMISE of implementing your own reallocator speced to just the realloc prototype doesn't make much sense."

If you're reimplementing realloc() it's pretty easy to know the size of the allocated regions - you just need to store the size somewhere when you allocate a block. One common method is to allocate N extra bytes of memory whenever you do malloc() to hold the block header and return a pointer to (block_address + N) to the user. When you then want to realloc() a block, just look in the block header (N bytes before the user's pointer) for the size.

The block header can store other useful stuff, like debugging information. I once implemented a memory manager for debugging that could generate a list of all leaked blocks at the end of the program with the file names and line numbers where they were allocated.

ANTSANTS 12 years ago | | |

That would require either replacing malloc as well, or programming to the hairy details of your system's libc (ie knowing how and where it lays out the buffer metadata). The point is not that either are impossible, but that you can't replace realloc without doing one or the other.

xroche 12 years ago | |

Yes, indeed - but the code was not meant to be an actual implementation, just a (bad) minimalistic example.

mgraczyk 12 years ago | |

Not to mention that this realloc incorrectly frees the input ptr on failure. The standard says

    "If the space cannot be allocated, the object [*ptr] shall remain unchanged."

prutschman 12 years ago | |

In the context of the code presented all mallocs were returned by sbrk. Even if the previous malloc had been "allocated" with size zero, the sbrk called by this function's invocation of malloc will ensure that there are enough bytes of addressable memory to ensure that, although it will be copying data out of earlier allocations, it will not segfault. realloc makes no guarantees of zero-initializing a newly allocated extent.

CJefferson 12 years ago |

I have also found people often unestimate realloc (but have never done the same level of investigation to find out just how clever it is!)

On several occasions I have wanted to use mmap, mremap, and friends more often to do fancy things like copy-on-write memory. However, I always find this whole area depressingly poorly documented, and hard to do (because if you mess up a copy-on-write call, it just turns into a copy it seems, with the same result but less performance).

While it's good realloc is clever, I find it increasingly embarassing how badly C (and even worse, C++ which doesn't even really have realloc (as most C++ types can't be bitwise moved) memory allocation maps to what operating systems efficiently support.

plorkyeran 12 years ago | |

C++ really wants a realloc variant that extends an allocation if it can be extended without a copy, and leaves the allocation unchanged if it can't. The annoying thing is that there's no good reason why this can't exist beyond that the STL allocator interface happens not to have it.

bodyfour 12 years ago | | |

jemalloc's non-standard interface gives you some of what you want, expectially xallocx() http://www.canonware.com/download/jemalloc/jemalloc-latest/d...

There have been C++ templates written that use jemalloc-specific calls; for instance see Folly from facebook. I haven't taken a close look, but I know they do some jemalloc-specific code: https://github.com/facebook/folly/tree/master/folly

The other allocated-related thing that C++ really wants (and could benefit C as well) is "sized deallocation". Most of the time you often know the exact size of the memory you allocated. If you could pass that to free() the allocator could save some work determining it. In the case of C++ the compiler can often do this on your behalf for many "delete" calls (at least in the cases where it knows the exact type). Google did an implementation of this idea and got good results. They submitted a proposal to the standards body but I don't know if there is any recent activity. I hope it does happen though: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n353...

nroets 12 years ago | | |

Are you saying that vector<char> with 'grow()' is substantially slower than the given C macro ? By how much ?

asveikau 12 years ago |

This bothers me so much:

    buffer = realloc(buffer, capa);

Yeah, 'cause when it fails we didn't need the old buffer anyway... Might as well leak it.

ctz 12 years ago |

The realloc implementation in this blog is incorrect: the passed in pointer must not be freed if realloc is called with a non-zero length and returns NULL. This will cause a double free in correct callers.

As someone else pointed out, the example call of realloc is also incorrect.

edit: also, malloc is incorrect for three reasons: 1) sbrk doesn't return NULL on failure, 2) a large size_t length will cause a contraction in the heap segment rather than an allocation, and 3) sbrk doesn't return a pointer aligned in any particular way, whereas malloc must return a pointer suitably aligned for all types.

xroche 12 years ago | |

I fixed the double free. I must admit that the code was typed as I wrote the blog entry, and is horribly wrong :)

kabdib 12 years ago |

I fixed a crippling bug on another platform that was taking down whole servers, because someone was depending on a clever realloc to behave well.

This is implementation coupling at its worst. Don't do it.

xroche 12 years ago | |

Yes and no. The real error here would be to realloc without any geometric progression IMHO - ie. reallocating one more byte each time, which would behave well on Linux (except the libc call cost of course) but not on other implementations (such as some Microsoft's MSVCRT versions). Assuming realloc has no catastrophic performance impact is not something too daring.

picomancer 12 years ago |

This is really neat. Somehow I always assumed realloc() copied stuff instead of using the page table.

But say you have 4K page table size. You malloc() in turn a 2K object, a 256K object, and another 2K object, ending up with 2K, 256K, 2K in memory. Then your 256K is not aligned on a page boundary. If you realloc() the 256K it has to move since it's surrounded by two objects. When you do that, you'll wind up with the two pages on the end being mapped to multiple addresses. Which is actually just fine...Interesting...

nhaehnle 12 years ago | |

The libc memory allocator does not simply hand out memory contiguously. In your example, the 256K block will end up being 4K aligned.

In fact, that's what the article already explains: the large alloc will just end up being passed through to the kernel, which only deals at page granularity.

whoopdedo 12 years ago | | |

What the article revealed to me is that there is no guarantee a contiguous block of allocated virtual memory will be backed by contiguous physical memory. In hindsight, that should be obvious.

But what does this mean for locality? Will I be thrashing the cache if I use realloc frequently? Do I even have the promise that malloc will return unfragmented memory?

crackerz 12 years ago |

And this is why OpenSource is awesome.

__david__ 12 years ago | |

Agreed. Not sure why people are downvoting you. It blows my mind every time I think, "I wonder how <some program> works?" and I'm able to just "apt-get source some_program" and check it out.

Working with Linux and having the source (and the ability to change it) for entire stack all the way down to and including the kernel is liberating. As a programmer it feels like the entire world is open to me.

I guess that's GNU's dream brought to life, really.

mjcohen 12 years ago |

In the original #define, the parameter is lower case "c" and the expansion uses upper case "C".

I don't know about what other operating systems do. Apparently all modern unixes overcommit address space in much the same way as Linux. However, i can't believe that FreeBSD handles this as crassly as Linux does.