10.md

Recap

  • pointer to pointer is useful when we need to change the address of where the pointer points to (in a function)
  • pointer to pointer is not the same as pointer to array of pointers
    • esp. in a function
  • explicit casting
    • sometimes needed even for integer types (e.g. in printf arguments)
    • handy for pointers to structures
    • when casting pointers beware of alignment issues
  • heap allocation
    • always check return values of functions that allocate memory!
    • risk of memory leaks: use static/dynamic checkers
  • no need to cast when working with void *

Leftover: 👀 2d-static-array-as-1d.c

🔧 warm-up strsepc

implement

      char *strsepc(char **stringp, int c);

which behaves like strsep(3) except that it searches only for the first occurence of a single character.

Try to use strsep() first 👀 strsep.c

🔑 code: 👀 strsepc.c

2D array initialization

Static array initialization is just an extension of 1D array initialization. It is an array of arrays, so each element of the array (of arrays) is an array initializer. You initialize by rows.

int a[MYSIZE][MYSIZE] = {
        {  0,  1,  2,  3 },     // a[0]
        {  4,  5,  6,  7 },     // a[1]
        {  8,  9, 10, 11 },     // a[2]
        { 12, 13, 14, 15 },     // a[3]
};

code: 👀 2d-static-array-initialization.c

C stores the array in a single piece of memory, consequtively, row after row. That means that all but the last dimension must be known. So you could do:

int a[][3] = {
        {  0,  1,  2 },
        {  4,  5,  6 },
        {  8 },         // a[2][1], a[2][2] implicitly initialized to 0
        { 12, 13, 14 },
}

However, you cannot do:

int a[][] = {
        {  0,  1,  2 },
        {  4,  5,  6 },
        {  8,  9, 10 },
        { 12, 13, 14 }  // no comma here, that's OK, but it's
                        // recommended to use it to make it consistent
}
xxx.c:12:7: error: array has incomplete element type 'int []'
        int a[][] = {
             ^

The word incomplete here means the storage size of the array element (which is an array itself) is unknown.

Note that in C99, you may or may not use a comma (,) after the last initializer. In C89, it was prohibited to use the comma after the last one.

Remember that we do not have to set all elements of the row when initializing an array, the C compiler will zero out the rest.

int a[10] = { 1, 2 }; // from a[2] to a[9], elements are 0

Neither you can do the following (same compiler error as above). As the compiler stores the array data in rows, if it does not know how long is the row, it cannot proceed.

int a[4][] = {
        {  0,  1,  2 },
        {  4,  5,  6 },
        {  8,  9, 10 },
        { 12, 13, 14 }
}

You already know that you can initialize a character array using a string instead of { 'a', 'b', ... }:

char s[] = "hello, world";

Same way you can initialize an array of strings. We already know from the operator priority that [] is of higher priority than *, so the following is an array of pointers to char.

char *s[] = {
        "hello",
        "hi",
        "good day",
        "what's up"
};

Note that if you wanted a pointer to an array of chars, it would be char (*s)[]. More on that later.

However, while string is technically an array of characters, it's technically an array whose value is the address of its 1st element, i.e. the 1st character. However, { 'a', 'b', ... } is an initializer for a static array, it is not a pointer. So, the following will not compile:

char *s[] = {
        { 'h', 'e', 'l', 'l', 'o', '\0' },
        { 'h', 'i', '\0' }
};

Same as the following is incorrect:

char *s = { 'a', '\n' };

Also note that the following two lines mean something quite different:

char a[] = "hello, world";
char *p = "hello, world";

In the first case, the string is only used to initialize a static array. You cannot assign anything to a after that but you can change the array's elements.

In the second case, a static string is created and stored in the binary (i.e. the program file), and p contains its address, i.e. the address of character 'h'. One can change the p variable to point to something else but it is undefined what happens when one tries to change the characters of the string (which is internally an array of chars). With gcc, all such strings are read-only:

$ cat main.c
#include <sys/types.h>

int
main(void)
{
        char *p = "hello, world";

        *p = 'H';
        return (0);
}
$ gcc main.c
$ ./a.out
Bus error: 10

Wikipedia:

In computing, a bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name.

Unions

  • similar to structures however have different semantics
    • stores its members in the same memory location
    • allows for different "views" of the same data
  • usually combined with structures
  • handy for efficient data storage or HW programming

consider this declaration:

union foo {
	unsigned short i; // 2 bytes
	struct {
		unsigned char low;
		unsigned char high;
	} bytes;
};
  • the sizeof(union foo) will be that of its largest member
  • modify a value of i and it will be reflected in the low/high values because i and bytes share the same location
		0             high memory addresses
		+-------------+
		|      i      |
		+-------------+
		| low  | high |
                +-------------+
  • union can be part of a structure, 👀 union-in-struct.c and vice versa 👀 struct-in-union.c
  • union can be "anonymous" (e.g. no name inside of a structure), however this is non-standard just like for structures

🔧 Task: use the above declaration to find out if the program is running on big or little endian system (least significant byte is stored on lowest address). The program will print either "big endian" or "little endian" to standard output.

🔑 code: 👀 union-lowhigh.c

Storage classes

  • two classes - automatic and static
  • declaration within a block creates an automatic object. Its storage is valid only within the very same block.
  • the storage class determines the lifetime of the storage associated with the identified object
  • only one storage class specifier may be given in a declaration
  • objects declared outside of any block is always of the static storage class (e.g. global variables).
  • static objects retain their value upon reentry to functions and blocks.
  • you can initialize a static object. The initialization happens just once.

code: 👀 fn-static-object.c

This one also shows how to use goto. More on that later.

code: 👀 block-static-object.c

Static variables and functions

  • if a function is declared static, it is visible only from the same source code file

  • static variables: retain value across function calls. That is, the initialization of a static variable within a function is done only once.

  • yes, the word static is overloaded in C

Internal vs external linkage

Static objects with the keyword static are of internal linkage, meaning they are not seen from other compilation units. Static objects without the keyword static are implicitly external.

Use extern keyword for objects that are defined in a different compilation unit.

Example:

$ cc linkage.c ext.c
Undefined symbols for architecture x86_64:
  "_si", referenced from:
      _main in linkage-917564.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see
invocation)

👀 linkage.c 👀 ext.c

Also note that each object must have exactly one definition. For objects with internal linkage, this rule applies separately to each translation unit, because internally-linked objects are unique to a translation unit.

Function pointers.

A function name is a pointer to the function. You can pass it as an argument.

int
myadd(int a, int b)
{
	return (a + b);
}

int
process_numbers(int a, int b, int (*f)(int, int))
{
	return ((*f)(a, b));
}

Code: 👀 fn-ptr.c

🔧 argv sorting

Use qsort(3) function to sort argv, and print it sorted then. Check the man page, you will need to write a function that can compare two elements of the array you provide to the qsort() function.

First use strcmp() to sort the argument alphabetically, then use atoi() to sort them numerically.

After that, come up with another way of sorting the arguments and write a function for it as well.

🔑 Code: 👀 argv-sort.c (for alphabetical sorting only)

🔧 Home assignment

Note that home assignments are entirely voluntary but writing code is the only way to learn a programming language.

mygetopt()

Implement a command line parser ala getopt(3). The global variable myoptarg will be set if an option has an argument.

Note: call the function mygetopt() not to intermingle with the standard library's getopt(). Implement the "POSIXly correct solution", i.e. stop once non-option argument is encountered (i.e. one not starting with -) or after the -- option is reached.

🔧 bonus task: implement myoptind/myopterr/myoptopt variables

See the following code on how to use getopt(3):

https://github.com/devnull-cz/unix-linux-prog-in-c-src/blob/master/getopt/getopt.c