Implement:
int strcmp(const char *s1, const char *s2);
Returns (according to the strcmp(3) manual page):
an integer greater than, equal to, or less than 0, according as the string s1 is greater than, equal to, or less than the string s2.
You can add the following piece of code into your main function to test it (and include assert.h):
assert(strcmp("", "") == 0);
assert(strcmp("", "foo") < 0);
assert(strcmp("foo", "") > 0);
assert(strcmp("abc", "abd") < 0);
assert(strcmp("foo", "bar") > 0);
assert(strcmp("foo", "foo") == 0);
assert(strcmp("foo", "fooz") < 0);
assert(strcmp("fooz", "foo") > 0);
If any assert triggers, your implementation is buggy.
🔑 strcmp.c
🔧 Compare your solution with the above. Try to reimplement it so that it is as small (in terms of lines of code) as possible.
int main(int argc, char *argv[]);
- The argv is declared as an array of pointers.
  - i.e. argv[i] is a pointer to char
- The arguments of main() can have arbitrary names, however please stick to the convention to avoid confusing those who might be reading your program.
- The argc is the number of command line arguments, including the command name itself (in argv[0]).
- argv[i] are arguments as strings. They are strings even if you put numbers there on the command line.
- argv[argc] is a null pointer by definition.
Note: remember (see the notes about arrays passed to functions) that in a function argument, an array is always treated as a pointer, so the above effectively becomes:
int main(int argc, char **argv);
I.e. in this context, char *argv[] and char **argv are the same thing and there is no preferred option.
The declaration merely hints at the memory layout. That is how it was conceived by the fathers of C; unfortunately it often causes confusion.
Also, you already know that you can use array notation with strings as well, so you could use argv[i][j] to print individual characters. Just make sure that it is not out of range.
- The memory for argc, argv is allocated before main() is called and the C99 standard leaves unspecified where argc/argv are stored.
  - section 5.1.2.2.1: the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.
- A quote from the execve(2) man page on Unix systems: The argv is an array of pointers to null-terminated strings and must be terminated by a null pointer.
If unsure, draw a diagram. The memory addresses are just examples:
argv
+-----------+
|  0xFF00   |---------->+--------------+
+-----------+    0xFF00 |   0xBB0000   |------------>+---+---+---+---+----+
                        +--------------+    0xBB0000 | p | r | o | g | \0 |
                 0xFF08 |   0xFFAA00   |-\           +---+---+---+---+----+
argc                    +--------------+  \
+----------+     0xFF10 |   0xCCFF00   |-\ \-------->+---+---+---+----+
|    3     |            +--------------+  \ 0xFFAA00 | f | o | o | \0 |
+----------+            |     NULL     |   \         +---+---+---+----+
                        +--------------+    \
                                             ->+---+---+---+----+
                                      0xCCFF00 | b | a | r | \0 |
                                               +---+---+---+----+
- Print all command line arguments using argc
- Print all command line arguments using just argv
- Print all command line arguments not starting with -
- Print all command line arguments using a recursive function (that accepts pointer to pointer to char).
Note: for all arguments print their address as well.
Note: do not print the terminating null pointer entry.
Some printf() implementations barf on a null pointer when printing via the %s format string.
- Print all command line arguments without using square brackets.
- As above but do not use any variable aside from argv.
Write a program with usage ./a.out <a> <b> <string> to find the distance (number of characters) between the first occurrence of character <a> and the first occurrence of character <b> in a string <string>. If either of the characters is not found in the string, print an error.
./a.out a x "ahello xworld"
7
Note: do not use strchr() or the like.
- A usage message is usually printed when an invalid option or invalid arguments are specified
  - Can be handled via errx()
- The usage usually contains the program name followed by the argument schema
  - See e.g. the nc(1) man page
- Optional arguments are enclosed in square brackets, mandatory arguments are enclosed in <> or left without brackets
🔧 Write a program that takes 1 or 2 arguments. If run with any other count, print a meaningful usage and exit.
🔑 usage.c
Usage: ./a.out <r> <n> [args]
Ignore argv[0], argv[1], and argv[2]. If there are not at least n extra arguments or the n-th argument is not long enough, print a helpful message.
Only use pointer arithmetic, do not use square brackets (i.e. argv[i][j] is not allowed).
./a.out 2 3 hey hi world
l
Note: use atoi() to convert the first 2 arguments to integers
Assume that the arguments are sufficiently long.
Skipping ahead: prefix ++ and dereference operator * have the same precedence so they are evaluated based on associativity which is right-to-left.
#include <stdio.h>

int
main(int argc, char **argv)
{
    printf("%s\n", argv[1]);
    printf("%s\n", ++*++argv);
    printf("%s\n", argv[0]);
    printf("%s\n", ++*++argv);
    printf("%s\n", argv[0]);
}
Now with an extra dereference:
int
main(int argc, char **argv)
{
    printf("%s\n", *++*++argv);
}
Note: the last function might not compile with smarter compilers (such as LLVM) that include format string checks. What is expected to happen if the last piece of code does compile and is run with one argument?
- Collection of one or more members, possibly of different types, grouped together under a single name.
- Is one of the two aggregate types (the other aggregate type is the array)
- Structures permit a group of related members to be treated as a unit (precursor to a class in Object Oriented Programming).
- Structures can contain other structures.
- Structure is specified as:
struct foo {
    ... // members
};
e.g.
struct foo {
    int a;
    char b;
};
- Any type that is not incomplete and not a function can be a member of a structure. Incomplete means its size is unknown; more on that later.
  - That means a structure may not contain itself (before the structure definition is finished with the terminating }, it is an incomplete type as its size is not yet known)
    - There is a minor exception though, see C99 6.7.2.1, paragraph 2 and 16, and also a flexible array member.
  - However: a pointer to its own type is possible (remember, a pointer is just a number referencing a piece of memory, and its size is known)
  - Unlike in C++, a structure cannot contain functions. It may contain pointers to functions, though.
- Structure does not need a name, 👀 struct-unnamed.c
  - However then its use is limited to a variable declaration
  - One can even have an "anonymous structure", however that is a C11 extension, 👀 struct-anon.c
- The struct declaration itself cannot contain initializers. However, the structure can be initialized with a list of initializers in the same way as arrays.
So, you cannot do:
struct foo {
    int a;
} = { 1 };
- When the structure has been defined, you can declare a variable of its type:
struct foo f;
- Sometimes the _s or _st postfix is used to hint that a name is a structure
Note: the struct keyword has to be used for its definition and declaration: foo f; is not valid.
- You can declare a structure and objects of its type at the same time, and you can also initialize them at the same time:
struct foo_s {
    ...
} foo = { 0, ... };
- However, this is unusual because structures are normally saved to header files, and including such a header file would mean actual object definition(s) which is rarely desirable.
- For better code readability and also to be able to search for the members in a (large) code base, members are often prefixed with a common string ending with an underscore to denote their structure membership, e.g.:
/*
 * 'sin' is a shortcut for 'Sockaddr_IN', the Internet socket address
 */
struct sockaddr_in {
    short sin_family;
    u_short sin_port;
};
- When looking for variable names in a big source code repository (using ctags, cstyle or tools such as OpenGrok), there would be a large number of generically named variables like port, size, etc. in various source code files. However, with the prefix, like sin_port, very often you find just one, the one you are looking for.
struct X { int a; char b; int c; };
- The offset of the first member will be always 0.
- Other members will be padded to preserve self-alignment (i.e. a member is always aligned in memory to a multiple of its own size).
  - The value of the padding bits is undefined by definition and you must not rely on it.
- What will be the result of sizeof (struct X) above?
  - Why? (think about efficiency of accessing members that cross a word in memory).
- What if char d is added at the end of the data structure?
  - Why is that? (think about arrays and memory access again).
- What if char *d is added at the end of the data structure? (i.e. it will have 4 members).
  - Assume this is being compiled on a 64-bit machine.
  - Again, for efficiency the access to the pointer should be aligned to its size.
- If in doubt, draw a picture.
+-----------+----+--------+------------+
| a | b | pad | c |
+-----------+----+--------+------------+
- Does the compiler reorder struct members? No, C is designed to trust the programmer.
Note: gcc/Clang has the -fpack-struct option that will condense the members at the expense of speed when accessing them. Use only when you know what you are doing as it may not be safe on all architectures.
There is also an attribute (or preprocessor pragma) that can be used on a per-structure basis.
Link: http://www.catb.org/esr/structure-packing/
Members are accessed via 2 operators: . and ->
- Infix operators, left-to-right associativity, both are in the group of operators with the highest precedence (priority)
- -> is used if the variable is a pointer, . otherwise
- E.g.:
struct foo_s {
    int a;
    char b;
} foo;
foo.a = 42;
foo.b = 'C';
The . and -> operators have higher precedence than * and &, so: &foo.b gets the address of the member b.
Structure assignment is done byte by byte (shallow copy - does not follow pointers):
struct foo_s one, two;
one = two;
- Handy for members that are pointers.
- On the other hand for large structures (say hundreds of bytes) this can be quite an expensive operation.
Pointers to structures are often used:
struct foo_s *foo;
foo->a = 42;
foo->b = 'C';
🔧 Write the above assignments to the members a and b using a de-reference operator on foo.
🔧 Now, if a was a pointer to integer, how would the code change?
Write a macro (or start with a function with hardcoded values) that will print the offset of the specified member of a given structure.
offsetof(struct X, a)
Hint: exploit the fact that a pointer can be assigned an integer (0) + use pointer arithmetic
The macro is useful for debugging (mapping disassembly to C code based on literal offsets) and also when working with flexible array member.
Note: offsetof() is a standard macro available since ANSI C via stddef.h.
You can initialize a structure in its definition using an initializer list of values.
You must either follow the ordering of members:
struct foo_s {
    int a;
    char b;
    char *s;
};
struct foo_s foo = { 1, 'C', "hello world" };
or use designated initializers from C99:
struct foo_s foo = {
    .b = 'C',
    .a = 1,
};
With designated initializers, the ordering in the struct declaration does not have to be preserved (but you really should follow it).
Omitted members are implicitly initialized the same as objects that have static storage duration (i.e. will be initialized to 0).
You can only:
- Copy a structure.
- Assign to it as a unit.
- Take its address with &.
- Access its members.
So, structures cannot be:
- Compared (for that one has to implement a comparator function).
- Incremented (obviously).
Define an array of structures of this type:
struct animal {
    char name[NAME_MAX]; // max filename length should be sufficient
                         // even for these long Latin names
    size_t legs;         // can have many legs
};
And initialize it with some samples (you can define the array in animals.h) and implement a function:
size_t count_minlegs(struct animal *, size_t len, size_t min);
That will return the number of animals in the array (of len items) that have at least min legs.
Notice that the function returns size_t. This way it is ready for future expansion. If it returned unsigned int and 32 bits later turned out not to be enough, the prototype would have to be changed, which would cause problems for the consumers of this API.
The function will be implemented in a separate file. (Do not forget to create a header file(s).)
In the main() program (first program argument will specify the min parameter for the function) pass an array of structures to the count_minlegs function and report the result.
Note: you will need:
- limits.h for the NAME_MAX definition
- stddef.h for size_t (as per C99, §7.17)
Note: for compilation it is only necessary to compile the *.c files and then link them together.
It can be done e.g. like this:
cc struct-animals.c animal_minlegs.c
The compiler will do the compilation of the individual object files and then call the linker to construct the binary (named a.out).
Or as follows which is closer to what would be done using a Makefile:
cc -c struct-animals.c animal_minlegs.c
cc -o animals struct-animals.o animal_minlegs.o
Technically, animals.h contains code, however, given it is included in a .c file it is not necessary to compile it individually.
🔧 Use the code from the previous task and implement (in a separate .c file)
static size_t getlegs(struct animal *);
That will return number of legs for a given animal.
Note that home assignments are entirely voluntary but writing code is the only way to learn a programming language.
Implement:
struct animal *maxlegs(struct animal *, size_t len);
It will use the getlegs() function and will return the animal with the highest leg count. Return a pointer to the structure (= array element) from the function.
The main() function (in a separate file) will define an array of animals and will call maxlegs(). The name of the animal with the maximum number of legs will be printed to standard output.
Note: does the original structure change if the structure returned from the function is modified within the function? How to fix this?
Code:
🔑 animal_maxlegs.c 🔑 maxlegs.c 🔑 animals.h 🔑 animal.h