02.md

recap of the first class

  • make sure you do subscribe to the mailing list (see http://mff.devnull.cz/c-prog-lang on how to do that). If you miss some information posted to the mailing list because you are not subscribed, it is your problem :-)

  • in general, if unsure:

    1. experiment
    2. look into the C standard
    3. use internet search to get information
  • assignment from the previous class: moving star on the console

    • our solution (we strongly recommend you code it first by yourself) is in moving-star.c
    • the home assignments are not mandatory. Also, do not send the solutions to us.

warmup

Print conversion table for https://en.wikipedia.org/wiki/Ell (rope length used by Frodo and Sam in The Lord of the Rings had some 30 ells) to inches and centimeters, i.e. table with 3 columns separated by tabs.

Print the centimeter value as float with 2 digit precision. The cm value will be the last column.

Each 10 lines print a line (sequence of - characters, say 20 times). The line will immediately follow the table header and then will appear each 10 lines. Use while cycle to print the line.

Print 30 numeric rows.

Sample output:

Ell	Inches	Centimeters
--------------------
1	45	114.30
2	90	228.60
3	135	342.90
4	180	457.20
5	225	571.50
6	270	685.80
7	315	800.10
8	360	914.40
9	405	1028.70
10	450	1143.00
--------------------
11	495	1257.30
12	540	1371.60
13	585	1485.90
14	630	1600.20
15	675	1714.50
16	720	1828.80
17	765	1943.10
18	810	2057.40
19	855	2171.70
20	900	2286.00
--------------------
21	945	2400.30
22	990	2514.60
23	1035	2628.90
24	1080	2743.20
25	1125	2857.50
26	1170	2971.80
27	1215	3086.10
28	1260	3200.40
29	1305	3314.70
30	1350	3429.00

Source code management

  • keep all of your code somewhere. Use a distributed source code management (SCM) system, ie. Git or Mercurial.
    • you could keep your repo in your home directory in the Linux lab as those machines are accessible via SSH from anywhere.
    • never use centralized SCMs like Subversion or CVS, unless you really have to, as those are things of the past century.

Comments

  • /* one line comment */

  • multi line comment:

  /*
   * multi line comment.  Follow the C style.
   * multi line comment.  Follow the C style.
   */
  • // one line comment from C99+

  • use comments sparingly

    • not very useful:
    /* increment i */
    ++i;
    • produce meaningful comments, not like this:
    /* probably makes sense, but maybe not */
    if (...)
             do_something()

Preprocessor

  • defines
  • includes
  • conditional compilation
  • use parens for #define to prevent problems with macro expansion
    • #define X (1 + 1)
  • run the compiler with the -E flag to see preprocessor output

To see the result of running preprocessor on your code, use cpp or the -E option of the compiler.

🔧 Task: reimplement fahr-to-cent.c using defines instead of literal numbers

  • solution: fahr-to-cent_defines.c

Expressions

  • any expression has a value

  • logical expression has value 0 or 1, and it's type is always int

1 > 10	... 0
10 > 1	... 1

printf("%d\n", 1 < 10);
--> 1
/* Yes, "equal to" in C is "==" as "=" is used for an assignment */
printf("%d\n", 100 == 101);
--> 0
  • neverending while loop is then:
while (1) {
	...
}

The break statement

  • statement break will cause a jump out of a most inner while loop (well, any loop but we only introduced the while loop so far).
	int finished = 0;
	while (1) {
		if (finished)
			break;
		/* not finished work done here */
		call_a_function();
		k = xxx;
		...
		if (yyy) {
			...
			finished = 1;
		}
		/* more work done here */
		...
	}

Operators

  • equality is == since a single = is an assignment operator
int i = 13;
if (i == 13) {
	// will do something here
}
  • logical AND and OR
if (i == 13 && j < 10) {
	// ...

if (i == 1 || k > 100) {
	// ...
  • you do not need extra ()'s as || and && have lower priority than == and <, >, <=, >=, and !=. We will learn more about operator priority in later lectures.

  • non-equality is !=

if (i != 13) {
	// ...
}

The boolean type

There is a new _Bool type as of C99. Put 0 as false, and a non-zero (stick to 1 though) as a true value.

The keyword name starts with an underscore as such keywords were always reserved in C while bool, true, nor false never were. So, some older code might actually use those names so if C99 just put those in the language, it would have broken the code and that is generally not acceptable in C.

If you are certain that neither bool, true, nor false are used in the code on its own, you can use those macros if you include <stdbool.h>.

See 👀 bool.c

Numbers and types

  • 1, 7, 20000 are always integers of type int if they fit (the range is [-2^31, 2^31 - 1] on 32/64 bit CPUs)

  • Hexadecimal numbers start with 0x or 0X. Eg. 0xFF, 0Xaa, 0x13f, etc.

  • Octal numbers start with 0. Eg. 010 is 8 in decimal. Also remember the Unix file mask (umask), eg. 0644.

  • 'A' is called a character constant and is always of type int. See man ascii. The ASCII standard defines characters with values 0-127.

  • float, double

    • man 3 printf, see %f is of type double. You can use:
	  printf("%f\n", float)
  • floats are automatically converted to doubles if used as arguments in functions with variable number of arguments (known as a "variadic function"), i.e. like printf()

  • char (1 byte), short (usually 2 bytes), long (4 or 8 bytes), long long (usually 8 bytes, and can not be less). It also depends on whether your binary is compiled in 32 or 64 bits.

    • 🔧 see what code emits your compiler by default (i.e. without using either -m32 or -m64 options)
      • use file to display the information about the binary
  • see also 5.2.4.2 Numerical limits in the C spec. For example, int must be at least 4 bytes but the C spec does not prevent it from being 8 bytes in the future.

  • chars and shorts are automatically converted to int if used as arguments in variadic functions.

  • as 'X' is int but within 0-127, it's OK to do the following as it will fit even if char is signed:

	char c = 'A';

Signedness

  • each integer type has a signed and unsigned variant. By default, the numeric types are signed aside from char which depends on the implementation. If you need an unsigned type, use unsigned reserved word.
  signed int;
  unsigned int;
  unsigned long;
  unsigned long long;
  ...
  • for ints, you do not need to use the int keyword, ie. signed i, unsigned u are valid but it is recommended to use signed int i and unsigned int u anyway.

  • You can use long int and long long int or just long and long long, respectively. The latter is mostly used in C.

  • char and short int is converted to int in variadic functions (we will talk more about integer conversions later in semester). Eg. the following is OK as the compiler will first convert variable c to int type, then put it on the stack (Intel x32 passing argument convention) or in a register (Intel x64 convention).

	/* OK */
	char c;
	printf("%d\n", c);

	/* OK */
	short sh;
	printf("%d\n", sh);

Modifiers for printf()

  • l for long, eg. long l; printf("%ld\n", l);

  • ll for long long, eg. long long ll; printf("%lld\n", ll);

  • u is unsigned, x is unsigned hexa, X is unsigned HEXA

	unsigned int u = 13;
	printf("%u\n", u);

	unsigned long long llu = 13;
	printf("%llu\n", llu);

	unsigned int u = 13;
	printf("%x\n", u);
	// --> d
	printf("%X\n", u);
	// --> D
  • the following is a problem though if compiled in 32 bits as you put 4 bytes on the stack but printf will take 8 bytes. Older compilers may not warn you at all!
	/* DEFINITELY NOT OK.  Remember, 13 is of the "int" type. */
	printf("%lld\n", 13);

	$ cc -m32 wrong-modifier.c
	wrong-modifier.c:6:19: warning: format specifies type 'long
	long' but the argument has type 'int' [-Wformat]
		printf("%lld\n", 13);
			~~~~     ^~
			%d
	1 warning generated.
	$ ./a.out
	2026120757116941
  • when compiled in 64 bits, it is still as wrong as before but it will work anyway as 13 is assigned to a 64 bit register (because of x64 ABI).
	$ cc -m64 wrong-modifier.c
	wrong-modifier.c:6:19: warning: format specifies type 'long
	long' but the argument has type 'int' [-Wformat]
		printf("%lld\n", 13);
			~~~~     ^~
			%d
	1 warning generated.
	$ ./a.out
	13

Suffixes

  • you can explicitly specify larger integers with suffices

    • 13L and 13l is a long
    • 13LL and 13ll is a long long (Ll and lL is illegal)
    • 13u and 13U is an unsigned int
    • 13lu and 13LU is an unsigned long
    • 13llu and 13LLU is an unsigned long long
  • so, 0xFULL and 0XFULL is unsigned long long 15 :-)

	printf("%llu\n", 0xFULL);
	// --> 15
	printf("%lld", 13LL);	/* OK */
	// --> 13
	/* in general NOT OK as long may be 4 bytes while long long 8 bytes */
	printf("%ld", 13LL);
	// --> ??
  • Escape sequences \ooo and \xhh (not \Xhh) are character sized bit patterns, either specified as octal or hexadecimal numbers. They can be used in string constants and in character constants.
	printf("\110\x6F\154\x61");
	printf("%c\n", '\x21');
	// -> Hola!

The sizeof operator

  • the sizeof operator computes the byte size of its argument which is an expression or a type
    • its type is size_t which is an unsigned integer according to the standard. However, the implementation (= compiler) can choose whether it's an unsigned int, an unsigned long int, or an unsigned long long int.
    • in printf(), the z modifier modifies u to size_t, so this is the right way to do it:
printf("%zu\n", sizeof (13));
// --> 4
  • the expression within the sizeof operator is never executed (the compiler should warn you about such code). Only the size in bytes needed to store the value if evaluated is returned.
int i = 1;
printf("%zu\n", sizeof (i = i + 1));
// --> 4
printf("%d\n", i);
// --> 1
  • 🔧 try sizeof on various values and types in printf(), compile with -m 32 and -m 64 and see the difference
sizeof (1);
sizeof (char);
sizeof (long);
sizeof (long long);
sizeof ('A');
sizeof (1LL);
// ...
  • we will get there later in semester but if you are bored, try to figure out why the following is gonna print 1 4 4:
char c;
printf("%zu\n", sizeof (c));
// --> 1
printf("%zu\n", sizeof (c + 1));
// --> 4
printf("%zu\n", sizeof (+c));
// --> 4

Integer Literals

  • An integer literal can be a decimal, octal, or hexadecimal constant.

  • so, all of these are equal:

	printf("%c\n", 'A'); // will print an integer as a character
	// --> A
	printf("%c\n", 0101);
	// --> A
	printf("%c\n", 0x41);
	// --> A
	printf("%c\n", 65);
	// --> A
  • if you use a larger number than fits within a byte as an argument for the %c conversion, the higher bits are trimmed. The rule here is that the int argument is converted to unsigned char (not just char!), then printed as a character (= letter). More on integer conversion in upcoming lectures. See also Numbers on why you never pass a char nor short to a variadic function.
    • also note the existence of h and hh modifiers. See the printf() man page for more information.
	printf("%c\n", 65 + 256 + 256 + 256 * 100);
	// --> still prints A
  • assignment is also an expression, meaning it has a value of the result, so the following is legal and all variables a, b, and c will be initialized with 13
	int a, b, c;
	a = b = c = 13;

🔧 Task: print ASCII table

Print ASCII table with hexadecimal values like on in the ascii(7) man page https://man.openbsd.org/ascii except for non-printable characters print NP (non-printable).

To determine whether a character is printable you can use the isprint() function.

Use just while and if (without else).

Solution: 👀 ascii-hex.c

getchar()

  • getchar() function reads one character from the process standard input and returns its value as an integer.
    • when it reaches end of input (for example, by pressing Ctrl-D in the terminal), it returns EOF
    • EOF is a define, usually set as -1. That is why getchar() returns an int instead of a char as it needs an extra value for EOF.
    • getchar() needs #include <stdio.h>
    • you can verify that [EOF is part of <stdio>] (https://pubs.opengroup.org/onlinepubs/9699919799/)

🔧 Task: write code that will read characters from a terminal and prints them out.

It should work like this:

	$ cat /etc/passwd | ./a.out > passwd
	$ diff passwd /etc/passwd
	$ echo $?
	0
  • remember, we said above that an assignment is just an expression, so it has a value. So, you can do this:
	if ((c = getchar()) == EOF)
		return (0);

instead of:

	c = getchar();
	if (c == EOF)
		return (0);

However, do NOT abuse it as you may create a hard to read code. Note the parentheses around the assignment. The "=" operator has lower priority than the "==" operator. If the parens are not used, the following would happen:

if (c = getchar() == EOF) would be evaluated as:

if (c = (getchar() == EOF)), meaning that "c" would be either 0 or 1 based on whether we read a character or the terminal input is closed.

We will learn more about operator priority later in the semester.

Solution: 👀 getchar.c

🔧 Home assignment

Note that home assignments are entirely voluntary but writing code is the only way to learn a programming language.

🔧 Count digit occurrence

If unsure about the behavior, compile our solution and run it.

  • count occurence of each 0-9 digit. Only use what we have learned so far. You end up with longer code than otherwise necessary but that is OK.
$ cat /etc/passwd | ./a.out
0: 27
1: 37
2: 152
3: 38
4: 39
5: 43
6: 34
7: 35
8: 29
9: 31

Solution: 👀 count-numbers.c

🔧 to upper

Convert small characters to upper chars in input. Use the fact that a-z and A-Z are in two consequtive sections of the ASCII table.

Use the else branch:

	if (a) {
		...
	} else {
		...
	{

Expected output:

	$ cat /etc/passwd  | ./a.out
	##
	# USER DATABASE
	#
	# NOTE THAT THIS FILE IS CONSULTED DIRECTLY ONLY WHEN THE SYSTEM IS RUNNING
	# IN SINGLE-USER MODE.  AT OTHER TIMES THIS INFORMATION IS PROVIDED BY
	# OPEN DIRECTORY.
	#
	# SEE THE OPENDIRECTORYD(8) MAN PAGE FOR ADDITIONAL INFORMATION ABOUT
	# OPEN DIRECTORY.
	##
	NOBODY:*:-2:-2:UNPRIVILEGED USER:/VAR/EMPTY:/USR/BIN/FALSE
	...
	...

Solution: 👀 to-upper.c