Overflow content

Tansparent handles may cause issues

If your library is stateful, the state is often managed by a handle used as an argument in the library function calls. Let's say internally in the library we need to track some ID and a position. Those members are expected to be used only by the library itself so the consumers of the library should not care or use whatever is inside the handle.

/* mylib.h */
struct ml_hndl {
	int ml_id;
	int ml_pos;
};

/* mylib.c */
struct handle *
ml_init(void)
{
	/* allocate and init the handle first */
	/* ... */

	return (h);
}

/* mylib.c continues */
enum ml_err
ml_parse(struct handle *h, char *url)
{
	enum ml_err e = ML_OK;

	/* parse the URI */
	/* ... */

	return (e);
}

/* ... */

And the consumer would use it as follows:

#include <mylib.h>

int
main(void)
{
	ml_err ret;
	struct ml_hndl *h;

	h = ml_init(void);
	if ((ret = ml_parse(h, "some URI")) != ML_OK)
		errx(1, "...");

	/* ... */
}

So far, so good. However, what if the consumer code delves into the handle and uses the information in there (because the header file is part of the library and available to the consumer)?

int
main(void)
{
	struct ml_hndl *h;

	/* ... */

	print_debug("[Debug] handle ID: %d\n", h->ml_id);

	/* ... */
}

This will work fine until the library provider changes the handle (believed to be internal to the library):

/* mylib.h */
struct ml_hndl {
	long long ml_id;	/* used to be an int */
	int ml_pos;
};

Unless the consumer code is recompiled after the change, which may not happen as it could be some 3rd party software bought by the customer years ago, the debug print (and any manipulation with the ml_id member really) is now incorrect as it only prints the first 4 bytes of the structure, not 8.

There are ways to mitigate that:

use something non-revealing as a handle, like in index to the internal array of structures that keep the states. That is possible but you now need to manage the internal structure that keeps the state structures. You also cannot change the index type unless you recompile all the consumers as well.
the library provider could use something called incomplete types and use them to provide an opaque handle.

Incomplete types

An incomplete type is a type that describes an object but lacks information needed to determine its size. You cannot declare objects using such types as the lack of the size prevents to allocated memory for them on the stack or heap.

$ cat main.c
struct x a;	/* declaring a variable using a structure of an unknown size */

int
main(void)
{
}

$ cc main.c
main.c:1:10: error: tentative definition has type 'struct x' that is never
      completed
struct x a;
         ^
main.c:1:8: note: forward declaration of 'struct x'
struct x a;
       ^
1 error generated.

The x structure could be completed, for all declarations of that type, by declaring the same structure tag with its defining content later in the same scope. We could only forward declare the structure though (that is, no defining any object of that structure type).

As we do not need to know the size of the x structure, the following code compiles.

$ cat main.c
struct x;	/* this is called forward declaration */

int
main(void)
{
}

$ cc main.c
$ echo $?
0

BTW, the void type is an incomplete type that can be never completed. So, you cannot declare a variable of type void.

$ cat main.c
int
main(void)
{
	void x;
}

$ cc main.c
main.c:4:7: error: variable has incomplete type 'void'
        void x;
             ^
1 error generated.

However, you can always declare and work with a pointer to an incomplete type (all structures are always aligned to the same byte boundary). The following will compile and run fine.

#include <stdio.h>

struct x;

struct x *
myfn(struct x *p)
{
        return (p);
}

int
main(void)
{
        myfn(NULL);
}

This feature of the C language can be used to represent opaque handles, and those can be used to solve the problem with transparent handles .

Opaque structures (handles)

With the help of incomplete types, it is possible to declare a structure in a header file along with API that consumes it without revealing what the structure actually contains. The reason to do that is not to allow a programmer to depend (= use) on the structure members, thus allowing to change the structure innards anytime.

/* Forward declaration in opaque.h */
struct opaque;

void
do_stuff(struct opaque *f);

The file implementing do_stuff() will cast the opaque structure object to an internal structure type.

/* in opaque_impl.h not provided to the consumers */
struct opaque_impl {
    int x;
    int y;
};

/* library implementation code */
void
do_stuff(struct opaque *f)
{
	struct opaque_impl *fi = (struct opaque_impl *)f;

	fi->x = ...
	fi->y = ...
}

Then any .c file that includes opaque.h can work with the structure (by passing it to do_stuff()) however cannot access its members directly. The structure is usable only via a pointer. Thus, the library has to provide a function to allocate a structure and return the pointer to it (and possibly also to initialize it) and functions to get/set the data inside the structure.

This is handy for libraries so they are free to change the layout of structures without breaking the consumers.

The consumer of the library then #includes only opaque.h but it does not have access to opaque_impl.h.

#include <stdio.h>

#include "opaque.h"

int
main(void)
{
	struct opaque *h;

	h = getOpaque();
	doStuff(h);
}

👁️

🔧 Home assignment

Note that home assignments are entirely voluntary but writing code is the only way to learn a programming language.

`getbits`()

Implement a function getbits(x,p,n) that returns the (right adjusted) n-bit field of x that begins at position p. We assume that bit position 0 is at the right end and that n and p are sensible positive values. For example, getbits(x,4,3) returns the three bits in positions 4, 3 and 2, right-adjusted.

The idea is to zero out anything that is to the left of p, then shift those n bits to the very right, and you get the result.

/* 1 0000 0000: should return 1 */
i = getbits(0x100, 8, 1);
(void) printf("0x%X\n", i);
assert(i == 1);

/* 1110 0100: should return 9 (1001 in binary) */
i = getbits(0xe4, 5, 4);
(void) printf("0x%X\n", i);
assert(i == 9);

/* 1010 1010: should return the same number, ie. 0xAA */
i = getbits(0xAA, 7, 8);
(void) printf("0x%X\n", i);
assert(i == 0xAA);

Create and traverse path tree

In memory, build a tree of paths, where each node is a path element. The tree starts with root / (assuming Unix-like system), the non-leaf nodes are directories, the leaves are regular files. The tree can look e.g. like this for paths /foo/a.c, /foo/b.c, /f.txt, /bar/c.c:

	 "/"
       /  |   \
      /   |     \
  "foo/" "f.txt"  "bar/"
  /   \            |
"a.c" "b.c"       "c.c"

Populate the tree with given paths and print the leaves (no particular order), one per line. Then free the allocated memory.

Each node can have constant number of children.

Possible input:

char *paths[] = {
    "/beverages/coffee/espresso.java",
    "/beverages/alcohol/beer.c",
    "/food/healthy/vegetarian/salad.txt",
    "/food/healthy/fruit/blueberries.hs",
    "/food/unhealthy/cake.md"
};

🔑 dirtree.c

Bonus tasks

make the function used for tree traversal generic, i.e. pass in a function pointer
Print also non-leaf nodes via depth first traversal, with each node indented based on its tree depth. E.g:

$ ./a.out
/
foo/
	a.c
	b.c
f.txt
bar/
	c.c

Remove the limitation for the constant number of children.
Print full paths of all files in the tree (hint: backpointers)

13.md