Binglong's space

C and Objective-C Blocks: Saving and Capturing

Posted by binglongx on October 1, 2024

Introduction

A Block is a C language extension, similar to a C++ lambda as a callable object. It is most widely used in Objective-C code, but it is also supported in plain C (and in C++ too). In C or C++ code, blocks are enabled with the -fblocks compiler option. This is a clang feature: Apple’s legacy GCC supported it, but mainline GCC does not. If you need to use Block_copy() and Block_release(), you need to #include <Block.h>; on non-Apple platforms you typically also need to link against a Blocks runtime library (for example with -lBlocksRuntime).

This post does not explain general use of Block. You can refer to clang’s Language Specification for Blocks, or Apple’s Working with Blocks for more details about Block.

Instead, this post clarifies how Block capture works and how to avoid capture-related problems, by comparison with C++ lambdas.

What Is a Block Variable?

You can easily create a Block, by writing something like:

^(parameter list) {
    // code of block body
}

The construct above is called a block literal. It is basically an anonymous function, similar to a C++ lambda. You cannot specify a capture list, but in the block body you can refer to objects in the enclosing scope, and doing so captures those objects automatically. The Block documentation points out that, by default, a Block captures a const copy of the value of the object in the enclosing scope. That is, if a variable from the enclosing scope is mentioned in the block body, a copy of that object is created in the Block state, and from within the block body it can only be accessed as a constant value. (The exception is a variable declared with the __block storage qualifier, which is shared with the block and stays mutable; this post does not cover it.)

The block literal can be assigned to a block variable.

// clang++ test.cpp -std=c++20 -fblocks -O0

#include <iostream>

struct Big{
    char aa[100];
};

int main(int argc, char** argv) {
    int x = 0;
    Big big;

    auto lambda1 = [=](){
        return x; // captures x by value
    };
    std::cout << "sizeof(lambda1) : " << sizeof(lambda1) << std::endl;

    auto lambda2 = [=](){
        return big.aa[10]; // capture big by value
    };
    std::cout << "sizeof(lambda2) : " << sizeof(lambda2) << std::endl;

    auto block1 = ^(){
        return x; // captures a constant copy of x
    };
    std::cout << "sizeof(block1)  : " << sizeof(block1) << std::endl;

    auto block2 = ^(){
        return big.aa[10]; // captures a constant copy of big (but reads only the element at index 10)
    };
    std::cout << "sizeof(block2)  : " << sizeof(block2) << std::endl;

    return 0;
}

The code prints:

sizeof(lambda1) : 4
sizeof(lambda2) : 100
sizeof(block1)  : 8
sizeof(block2)  : 8

In C++, a lambda is a value object with all of its state enclosed in it, so its size depends on how much it captures. It is no surprise that lambda1 and lambda2 have very different sizes (clearly int is 4 bytes on this architecture).

One would expect the two blocks to have different state sizes as well, because block2 obviously captures a big object. The printout shows instead that a block variable (block1 or block2) is essentially a pointer. The actual state of the block, i.e., the block object, is stored somewhere else. To avoid confusion, let’s call the block variable the block pointer, and its pointee the block object.

Clearly a block pointer, although it appears similar to a C++ lambda, is quite different in terms of how memory is managed.

Where Is the Block Object?

So where is the block object, which may contain big state if it captures stuff?

Let’s find out with code:

// clang++ test.cpp -std=c++20 -fblocks -O0

#include <iostream>

struct Big{
    char aa[100];
};

int global = 42;
int main(int argc, char** argv) {
    int x = 0;
    Big big;
    auto block0 = ^(){
        // no captures
    };
    auto block1 = ^(){
        return x; // captures a constant copy of x
    };
    auto block2 = ^(){
        return big.aa[10]; // captures a constant copy of big (but reads only the element at index 10)
    };
    int y = 1;

    std::cout << "x              address : " << (void*)(&x) << std::endl;
    std::cout << "block0 pointer address : " << (void*)(&block0) << std::endl;
    std::cout << "block1 pointer address : " << (void*)(&block1) << std::endl;
    std::cout << "block2 pointer address : " << (void*)(&block2) << std::endl;
    std::cout << "y              address : " << (void*)(&y) << std::endl;
    std::cout << "block0 object  address : " << (void*)(block0) << std::endl;
    std::cout << "block1 object  address : " << (void*)(block1) << std::endl;
    std::cout << "block2 object  address : " << (void*)(block2) << std::endl;
    std::cout << "main()         address : " << (void*)(main) << std::endl;
    std::cout << "global         address : " << (void*)(&global) << std::endl;

    return 0;
}

This prints:

x              address : 0x16b6a2e7c
block0 pointer address : 0x16b6a2e70
block1 pointer address : 0x16b6a2e68
block2 pointer address : 0x16b6a2e38
y              address : 0x16b6a2e34
block0 object  address : 0x1047600e0
block1 object  address : 0x16b6a2e40
block2 object  address : 0x16b6a2e90
main()         address : 0x10475edb0
global         address : 0x104764000

For the obvious objects on the stack (x, block0, block1, block2, y), the addresses decrease nicely from 0x16b6a2e7c to 0x16b6a2e34 (the stack grows down on this architecture). No surprise there.

The surprising part is the block objects.

  • The block1 and block2 objects are on the stack, at whatever location was convenient for the compiler.
  • The block0 object, however, is clearly not on the stack. It is at a much lower address, close to the global variable and the code of main().

The reason is that block0 does not capture anything, so its state is minimal and does not depend on anything at the site where it is created; it can therefore live in some “global” location.

For block1 and block2, however, the state depends on what they capture in the function scope, so they are created on the stack. As long as they are only used within that scope, stack memory is the sensible choice.

What Is the Block Object Size?

To understand how block captures objects, we have to check what’s stored in the block.

We can check the Block’s run-time source code in LLVM: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/BlocksRuntime/Block_private.h#L62-L77. This provides definitions of Block_layout and Block_descriptor:

struct Block_descriptor {
    unsigned long int reserved;
    unsigned long int size;
    void (*copy)(void *dst, void *src);
    void (*dispose)(void *);
};

struct Block_layout {
    void *isa;
    int flags;
    int reserved; 
    void (*invoke)(void *, ...);
    struct Block_descriptor *descriptor;
    /* Imported variables. */
};

We can use this to check more details of a block object. A block pointer basically points to a Block_layout object. The /* Imported variables. */ part is not part of the base Block_layout structure; the compiler appends it for each block literal, so it differs from block to block. The actual size of a concrete block object is given in descriptor->size, and it depends on how much the block captures.
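As a quick sanity check of the size arithmetic, here is a small sketch in standard C++ (no -fblocks needed). The two structs are copied from the definitions above; naiveBlockSize() is a hypothetical helper that simply adds the capture bytes to the header size. It assumes a 64-bit target and captures that need no extra alignment padding, so treat it as an estimate rather than as what the compiler is guaranteed to emit.

```cpp
#include <cstddef>

// Copied from the definitions quoted above (compiler-rt's Block_private.h).
struct Block_descriptor {
    unsigned long int reserved;
    unsigned long int size;
    void (*copy)(void *dst, void *src);
    void (*dispose)(void *);
};

struct Block_layout {
    void *isa;
    int flags;
    int reserved;
    void (*invoke)(void *, ...);
    struct Block_descriptor *descriptor;
    /* Imported variables follow here in a concrete block object. */
};

// Hypothetical helper (not part of any runtime): header size plus raw capture bytes.
// Assumption: the captured variables need no alignment padding after the header.
constexpr std::size_t naiveBlockSize(std::size_t captureBytes) {
    return sizeof(Block_layout) + captureBytes;
}
```

On a 64-bit target the header is 8 + 4 + 4 + 8 + 8 = 32 bytes, so an int capture gives 36, a 100-byte struct gives 132, and a single pointer gives 40, which matches the totals reported in this post.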

Let’s write some code to inspect. Block_copy() “semantically” makes a copy of the block object; what this means is to be discussed below.

// Needs <iostream> and <Block.h>, plus the Block_layout / Block_descriptor definitions above.
void inspectBlock(void* block) {
    auto aBlock = (Block_layout *)block;
    // size of block's state, including captured objects
    auto total_size = aBlock->descriptor->size;
    // size of block's state without captures
    auto base_size = sizeof(Block_layout);
    std::cout << " block at " << block << ", descriptor at " << aBlock->descriptor 
        << ", base size " << base_size << ", total size " << total_size << std::endl;
}

template<typename BlockPtr>
void testBlock(BlockPtr block, const char* caption) {
    std::cout << caption << ": \n"; 
    std::cout << "  original            : "; inspectBlock(block);
    auto copy1 = Block_copy(block);
    std::cout << "  copy 1 of original  : "; inspectBlock(copy1);
    auto copy2 = Block_copy(block);
    std::cout << "  copy 2 of original  : "; inspectBlock(copy2);
    auto copy3 = Block_copy(copy2);
    std::cout << "  copy of copy 2      : "; inspectBlock(copy3);
}

int main(int argc, char** argv) {

    auto block0 = ^(){
    };
    testBlock(block0, "block0");

    int x = 0;
    auto block1 = ^(){
        return x;
    };
    testBlock(block1, "block1");

    Big big;
    auto block2 = ^(){
        return big.aa[10];
    };
    testBlock(block2, "block2");

    std::cout << "========\n";
    std::cout << "block2: "; inspectBlock(block2);
    auto block3 = ^(){
        return big.aa[10];
    };
    std::cout << "block3: "; inspectBlock(block3);
    auto block4 = ^(){
        return big.aa[10];
    };
    std::cout << "block4: "; inspectBlock(block4);
    auto block5 = ^(){
        return big.aa[10];
    };
    std::cout << "block5: "; inspectBlock(block5);

    return 0;
}

This prints:

block0: 
  original            :  block at 0x102878108, descriptor at 0x1028780e8, base size 32, total size 32
  copy 1 of original  :  block at 0x102878108, descriptor at 0x1028780e8, base size 32, total size 32
  copy 2 of original  :  block at 0x102878108, descriptor at 0x1028780e8, base size 32, total size 32
  copy of copy 2      :  block at 0x102878108, descriptor at 0x1028780e8, base size 32, total size 32
block1: 
  original            :  block at 0x16d58aca8, descriptor at 0x102878128, base size 32, total size 36
  copy 1 of original  :  block at 0x12a605eb0, descriptor at 0x102878128, base size 32, total size 36
  copy 2 of original  :  block at 0x12a606080, descriptor at 0x102878128, base size 32, total size 36
  copy of copy 2      :  block at 0x12a606080, descriptor at 0x102878128, base size 32, total size 36
block2: 
  original            :  block at 0x16d58ae90, descriptor at 0x102878148, base size 32, total size 132
  copy 1 of original  :  block at 0x12a605d70, descriptor at 0x102878148, base size 32, total size 132
  copy 2 of original  :  block at 0x12a605e00, descriptor at 0x102878148, base size 32, total size 132
  copy of copy 2      :  block at 0x12a605e00, descriptor at 0x102878148, base size 32, total size 132
========
block2:  block at 0x16d58ae90, descriptor at 0x102878148, base size 32, total size 132
block3:  block at 0x16d58ae08, descriptor at 0x102878168, base size 32, total size 132
block4:  block at 0x16d58ad80, descriptor at 0x102878188, base size 32, total size 132
block5:  block at 0x16d58acf8, descriptor at 0x1028781a8, base size 32, total size 132

A few observations:

  • Every block’s descriptor lives in low-address “global” memory (the 0x102878000 area). It holds pointers to the functions that copy and dispose the block object, so it is understandably code-like metadata. In C++ terms, the “descriptor” is the per-class “static” data of a block literal, while the “layout” is the per-object “instance” data.
  • The size of a block object depends on its captures. A non-capturing block takes 32 bytes, which is the minimum; the more a block captures, the bigger it is.
  • A non-capturing block (block0 here) has its block object in “global” memory (see also the previous section), and Block_copy() returns that same object.
  • The original block object from a capturing block literal sits on the stack, and each Block_copy() of the stack block object returns a new, unique heap object (the 0x12a605000 area: lower than the stack, higher than global data and code). This can be expensive.
  • Block_copy() of a heap block object returns a pointer to the same heap block object; see “copy of copy 2”. Most likely only a reference count is bumped, so “copying” a heap block object is cheaper than copying the stack block object. Therefore, avoid copying the original stack block object repeatedly.
  • The last part of the printout, for block2 through block5, shows that the block objects really sit on the stack, each taking 0x88 = 136 bytes. The reported size is 132 bytes, so the extra 4 bytes are probably alignment padding on the stack. Perhaps a block object of type T does not need to satisfy sizeof(T) % alignof(T) == 0 the way a regular C/C++ object does.
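The copy policy observed above can be summarized with a toy model in standard C++. This is not the real Blocks runtime: ToyBlock, toyCopy() and toyRelease() are made-up names, and the reference count is an assumption inferred from the printout (the post only observes that copying a heap block returns the same pointer).

```cpp
// Toy model only: NOT the real Blocks runtime, just the copy policy seen in the printout.
enum class Storage { Global, Stack, Heap };

struct ToyBlock {
    Storage storage;  // where this block object lives
    int refcount;     // only meaningful for Heap blocks (assumed, not observed directly)
    int captured;     // stand-in for the captured state
};

// Mimics the observed Block_copy() behavior.
ToyBlock* toyCopy(ToyBlock* block) {
    switch (block->storage) {
    case Storage::Global:
        return block;        // non-capturing block: same object comes back
    case Storage::Heap:
        ++block->refcount;   // cheap path: bump the count, same object comes back
        return block;
    case Storage::Stack:
        break;               // handled below
    }
    // Expensive path: every copy of a stack block creates a new heap object.
    return new ToyBlock{Storage::Heap, 1, block->captured};
}

// Mimics Block_release(): only heap blocks are counted and freed.
void toyRelease(ToyBlock* block) {
    if (block->storage == Storage::Heap && --block->refcount == 0) {
        delete block;
    }
}
```

In this model, copying the same stack block twice yields two distinct heap objects, while copying a heap copy returns the same pointer with its count raised to 2. That is why it pays to copy the stack block once and pass the heap copy around afterwards.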

How Does a Block Capture a Reference?

What happens if we capture a C++ reference?

int main(int argc, char** argv) {
    Big big;
    auto block1 = ^(){
        return big.aa[10];      // capture an object
    };
    std::cout << "block1 : full size: " << ((Block_layout *)block1)->descriptor->size << std::endl;

    Big& bigRef = big;
    auto block2 = ^(){
        return bigRef.aa[10];   // capture a reference
    };
    std::cout << "block2 : full size: " << ((Block_layout *)block2)->descriptor->size << std::endl;

    return 0;
}

Surprise! It prints:

block1 : full size: 132
block2 : full size: 40

Only a reference (effectively a pointer, 8 bytes here) is captured. When the compiler sees a reference to Big mentioned in a block literal, it captures only a copy of the pointer to Big, not the Big object being referenced.

It would be clearer if we compare with C++ lambda capture:

    auto lambda1 = [=]{
        return big.aa[10];      // capture object by value
    };
    std::cout << "lambda1 : size: " << sizeof(lambda1) << std::endl;

    auto lambda2 = [&]{
        return big.aa[10];      // capture object by reference
    };
    std::cout << "lambda2 : size: " << sizeof(lambda2) << std::endl;

    auto lambda3 = [=]{
        return bigRef.aa[10];   // capture reference by value
    };
    std::cout << "lambda3 : size: " << sizeof(lambda3) << std::endl;

    auto lambda4 = [&]{
        return bigRef.aa[10];   // capture reference by reference
    };
    std::cout << "lambda4 : size: " << sizeof(lambda4) << std::endl;

This prints:

lambda1 : size: 100
lambda2 : size: 8
lambda3 : size: 100
lambda4 : size: 8

lambda1 and lambda2 print what we expect from C++, and a Block capturing an object by value is clearly similar to lambda1.

The confusing part: is a Block capturing a reference “by value” similar to lambda3 or to lambda4? As shown, it is similar to lambda4, i.e. “capture reference by reference” in C++ terms, not “capture reference by value”.

Block capture does not offer the flexibility of C++ lambda capture, where you can specify by value or by reference. Always keep in mind that a Block captures the apparent type by value: an object as an object, and a reference as a reference (pointers and references are therefore captured shallowly).

If you remember this rule, you will normally not make mistakes. If you capture anything by reference (or by pointer), you must make sure the object in question stays alive for as long as the block may be called, to avoid access through a dangling reference. This is the same as with by-reference lambda captures in C++.
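The difference is not only about size, it is also observable in behavior. The following sketch uses standard C++ lambdas only: readThroughCopy() behaves like lambda3, while readThroughPointer() imitates what a Block does with a captured reference by storing just the address. Both function names are made up for this illustration.

```cpp
struct Big {
    char aa[100];
};

// Like lambda3: [=] applied to a reference copies the referenced object.
char readThroughCopy() {
    Big big{};
    big.aa[10] = 'a';
    Big& bigRef = big;
    auto closure = [=] { return bigRef.aa[10]; };  // takes a snapshot of big
    big.aa[10] = 'b';                               // the snapshot does not see this
    return closure();
}

// Like a Block capturing bigRef: only the address behind the reference is stored.
char readThroughPointer() {
    Big big{};
    big.aa[10] = 'a';
    Big& bigRef = big;
    auto closure = [p = &bigRef] { return p->aa[10]; };  // shallow capture
    big.aa[10] = 'b';                                     // visible through the pointer
    return closure();
}
```

readThroughCopy() returns 'a' and readThroughPointer() returns 'b'. The second closure is only safe to call while big is still alive, which is exactly the lifetime obligation described above.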

How Does Block_copy() Copy Captured Objects?

How does Block_copy() copy the captured objects, especially C++ class objects? Let’s use a probe class to check.

void inspectBlock(void* block, const char* name) {
    auto aBlock = (Block_layout *)block;
    // size of block's state, including captured objects
    auto total_size = aBlock->descriptor->size;
    // size of block's state without captures
    auto base_size = sizeof(Block_layout);
    std::cout << name << ": block at " << block << ", descriptor at " << aBlock->descriptor 
        << ", total size " << total_size
        << ", captures start at " << (void*)((char*)(&aBlock->descriptor) + sizeof(aBlock->descriptor))
        << std::endl;
}

struct Foo{
    Foo() {
        std::cout << "  " << this << " Foo:Foo()\n";
    }
    Foo(int i) : i(std::make_unique<int>(i)) {
        std::cout << "  " << this << " Foo:Foo(int)\n";
    }
    Foo(const Foo& foo) {
        std::cout << "  " << this << " Foo:Foo(const Foo&)\n";
        if( foo.i )
            i = std::make_unique<int>(*foo.i);
    }
    Foo& operator = (const Foo& foo) {
        if( foo.i )
            i = std::make_unique<int>(*foo.i);
        else
            i.reset();
        std::cout << "  " << this << " Foo:operator = (const Foo&)\n";
        return *this;
    }
    ~Foo() {
        std::cout << "  " << this << " Foo:~Foo()\n";
    }

    std::unique_ptr<int> i;
};


int main(int argc, char** argv) {
    {
        Foo foo(42);

        std::cout << "creating block1 capturing Foo object\n";
        auto block1 = ^(){
            return *foo.i;
        };
        inspectBlock(block1, "block1");
        
        std::cout << "copying block1 to block1a\n";
        auto block1a = Block_copy(block1);
        inspectBlock(block1a, "block1a");

        std::cout << "copying block1 to block1b\n";
        auto block1b = Block_copy(block1);
        inspectBlock(block1b, "block1b");

        std::cout << "copying block1b to block1c\n";
        auto block1c = Block_copy(block1b);
        inspectBlock(block1c, "block1c");
    }
    std::cout << "\n";
    {
        Foo foo(55);
        std::cout << "creating block2 capturing Foo reference\n";
        Foo& fooRef = foo;
        auto block2 = ^(){
            return *fooRef.i;
        };
        inspectBlock(block2, "block2");
    }

    return 0;
}

This prints:

  0x16fbc6f78 Foo:Foo(int)
creating block1 capturing Foo object
  0x16fbc6f58 Foo:Foo(const Foo&)
block1: block at 0x16fbc6f38, descriptor at 0x10023c0c8, total size 40, captures start at 0x16fbc6f58
copying block1 to block1a
  0x12a605ed0 Foo:Foo(const Foo&)
block1a: block at 0x12a605eb0, descriptor at 0x10023c0c8, total size 40, captures start at 0x12a605ed0
copying block1 to block1b
  0x12a6060a0 Foo:Foo(const Foo&)
block1b: block at 0x12a606080, descriptor at 0x10023c0c8, total size 40, captures start at 0x12a6060a0
copying block1b to block1c
block1c: block at 0x12a606080, descriptor at 0x10023c0c8, total size 40, captures start at 0x12a6060a0
  0x16fbc6f58 Foo:~Foo()
  0x16fbc6f78 Foo:~Foo()

  0x16fbc6f18 Foo:Foo(int)
creating block2 capturing Foo reference
block2: block at 0x16fbc6ee0, descriptor at 0x10023c0f8, total size 40, captures start at 0x16fbc6f00
  0x16fbc6f18 Foo:~Foo()

The first section of the printout shows that:

  • Block_copy() of a stack block object calls the copy constructor of each captured object it has to copy. When necessary, Block_copy() allocates memory and then calls descriptor->copy of the Block_layout, which points to compiler-synthesized code that copy-constructs the captured objects in the new block object.
  • A stack block object destructs its captured objects when the scope is left; see 0x16fbc6f58.
  • A heap block object from Block_copy() is never destroyed if there is no matching Block_release(); see 0x12a605ed0 and 0x12a6060a0, whose destructors never run. The block leaks along with its captures.
  • The captured objects do start right after .descriptor in the Block_layout struct.

The second section of the printout shows that capturing a reference does not create a new Foo object, as discussed before. Copying such a block would merely copy the captured reference, i.e. a pointer (not shown here).

How to Use Block Pointer Later?

As shown in previous sections, a block pointer created from a capturing block literal in a local scope points to a hidden stack object, and is therefore only guaranteed to be valid in the scope where it is created.

char (^g_block)();

void setBlockCallback() {
    std::string s("This is a Long long long long string.");
    auto block = ^(){
        return s[10];
    };
    g_block = block; // Line AA
}

int main(int argc, char** argv) {
    setBlockCallback();
    std::cout << g_block();    // Line BB: BOOM
    std::cout << std::endl;
    return 0;
}

The program can crash at Line BB, because Line AA naively assigns block to g_block; once the function returns, g_block holds a pointer to a stack block object that no longer exists.

If remembering whether a block object is on the stack or on the heap is difficult, you can view it from the perspective of properly retaining and releasing the block object. Think of it this way: when the block is created there is an invisible retain that keeps it alive, and at the end of the local scope there is an invisible, automatic Block_release(block). (In reality the stack block object just evaporates.) Saving the block pointer for future use therefore needs another retain, i.e., a call to Block_copy().

To fix the problem, always use Block_copy() to save a block for later use in a different scope (here it is copied from the stack to the heap, but you can ignore this implementation detail), and use a matching Block_release() when you are finished with it.

char (^g_block)();

void setBlockCallback() {
    std::string s("This is a Long long long long string.");
    auto block = ^(){
        return s[10];
    };
    g_block = Block_copy(block); // Line AA, g_block now points to a copy of block object in heap
}

int main(int argc, char** argv) {
    setBlockCallback();
    std::cout << g_block();      // 'L'
    std::cout << std::endl;
    Block_release(g_block);      // Line BB
    return 0;
}

Line AA above makes sure we can safely use g_block later. Line BB makes sure we don’t leak the block object.

If your function takes a block pointer argument, you can assume that the block is valid on entry and use it safely for the duration of the call. But if you want to save that block for use after the current call returns, you must use Block_copy().
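In C++ code, one way to keep every copy paired with a release is a small scope guard. The sketch below is generic standard C++ and does not touch <Block.h>: ScopedRelease and makeScopedRelease() are hypothetical helpers, and the release action is whatever callable you pass in. With the Blocks runtime that callable would presumably wrap Block_release(), but that combination is an assumption and is not tested here.

```cpp
#include <utility>

// Hypothetical helper, not part of <Block.h>: calls release(handle) once at scope exit.
template <class Handle, class Release>
class ScopedRelease {
public:
    ScopedRelease(Handle handle, Release release)
        : handle_(handle), release_(std::move(release)), active_(true) {}

    ScopedRelease(ScopedRelease&& other)
        : handle_(other.handle_), release_(std::move(other.release_)), active_(other.active_) {
        other.active_ = false;  // the moved-from guard must not release again
    }

    ScopedRelease(const ScopedRelease&) = delete;
    ScopedRelease& operator=(const ScopedRelease&) = delete;

    ~ScopedRelease() {
        if (active_) release_(handle_);
    }

    Handle get() const { return handle_; }

private:
    Handle handle_;
    Release release_;
    bool active_;
};

// Factory function so the template arguments are deduced at the call site.
template <class Handle, class Release>
ScopedRelease<Handle, Release> makeScopedRelease(Handle handle, Release release) {
    return ScopedRelease<Handle, Release>(handle, std::move(release));
}
```

A guard created with makeScopedRelease(handle, releaseFn) runs releaseFn(handle) exactly once when it goes out of scope, including on early returns, so a forgotten release cannot leak the object.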

What to Be Careful about Capturing C++ Lambda?

Since a C++ lambda is an object, you can capture and use it in a Block. The same is true of capturing a block in a lambda.

In C++ code we often pass lambdas around by forwarding (universal) reference, and capturing them by value in another lambda is fine, because the capturing lambda keeps its own copy of the captured lambda, as shown by lambda3 above.

However, if your code receives a lambda through a reference (often a forwarding reference) and you capture it in a block, you have to be careful, because “by value” has a different meaning here. See this example:

char (^g_block)();

template<class FO>
auto setBlockCallback(FO&& fo){
    auto block = ^(){
        return fo();    // Line CC captures fo: fo is a reference!
    };
    g_block = Block_copy(block);
}

int main(int argc, char** argv) {
    std::string s("This is a Long long long long string.");
    setBlockCallback([=]{
        return s[10];   // capture s by value, OK
    }); // Line AA
    std::cout << "g_block : full size: " << getBlockFullSize(g_block) << std::endl; // 40
    std::cout << g_block() << "\n";    // Line BB: UB
    Block_release(g_block);
    return 0;
}

The lambda captures s by value, so fo arrives in setBlockCallback() intact. The problem is that at Line CC the block captures fo, which is a C++ reference, “by value” in the Block sense, which really means it holds only a reference to the lambda object. That lambda object is a temporary that is destroyed at the semicolon of Line AA, so at Line BB g_block holds a dangling reference, and executing g_block is Undefined Behavior. (getBlockFullSize() is not shown in this post; presumably it returns descriptor->size as in the earlier examples. The reported 40 bytes are the 32-byte header plus one 8-byte pointer.)

The fix is simple: make the Block capture the object, not the reference:

char (^g_block)();

template<class FO>
auto setBlockCallback(FO&& fo){
    auto foObj = fo;    // Line DD
    auto block = ^(){
        return foObj();    // Line CC captures a copy of fo
    };
    g_block = Block_copy(block);
}

int main(int argc, char** argv) {
    std::string s("This is a Long long long long string.");
    setBlockCallback([=]{
        return s[10];   // capture s by value, OK
    }); // Line AA
    std::cout << "g_block : full size: " << getBlockFullSize(g_block) << std::endl; // 56
    std::cout << g_block() << "\n";    // Line BB: 'L'
    Block_release(g_block);
    return 0;
}

Line DD gets a copy of fo in C++. Then Line CC captures that object as a copy in block. This fixes the problem.
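The same “own the callable before storing it” pattern can be tried in standard C++ without -fblocks. In this sketch, std::function stands in for the global block pointer, and g_callback, setCallback() and readAfterTemporaryDies() are made-up names. It also uses std::forward so that an rvalue argument is moved rather than copied; that is a small variation on Line DD, not something the code above does.

```cpp
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for g_block.
std::function<char()> g_callback;

template <class FO>
void setCallback(FO&& fo) {
    // Same role as Line DD: materialize an owned object first.
    // std::decay_t strips the reference; std::forward moves from rvalues.
    std::decay_t<FO> foObj(std::forward<FO>(fo));
    // The stored closure captures the owned object by copy, never the reference.
    g_callback = [foObj] { return foObj(); };
}

char readAfterTemporaryDies() {
    std::string s("This is a Long long long long string.");
    setCallback([=] { return s[10]; });  // the temporary lambda dies at this semicolon
    return g_callback();                 // safe: g_callback owns its own copy
}
```

readAfterTemporaryDies() returns 'L', because the stored closure owns a copy of the lambda, which in turn owns a copy of s.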

Does Block Work with ARC (Automatic Reference Counting)?

In all the examples above, we use C/C++. ARC is not available for C and C++ code, so we must call Block_copy() and Block_release() manually where necessary.

If you are writing Objective-C or Objective-C++ code (.m/.mm files), you can turn on the ARC compiler option (which is in fact recommended). The compiler then performs Block_copy and Block_release for you automatically on many occasions, but there are still situations where you need to do it manually. Please consult the Objective-C documentation, e.g. clang’s Objective-C Automatic Reference Counting (ARC) and Apple’s Transitioning to ARC Release Notes.

For this reason, it’s recommended that you do not put code that uses blocks in a header file (such as in templates) that can be included from both C++ and Objective-C/C++ code, because the same template code may behave differently depending on whether a .cpp or a .mm file uses it. If you have to do this, use Block_copy/Block_release defensively.

You can, however, safely declare a block pointer in a header file that is used by both C/C++ and Objective-C/C++ code. Each side then just follows its own rules when dealing with the block.

Conclusion

Be careful about a Block capturing C++ references. It captures a reference to the object, not a copy, unlike a C++ lambda capturing by value. This is true for references to ordinary objects and to lambda objects.

Also, remember to Block_copy() if you want to use a block beyond its current scope.
