Programming Language Concepts Using C and C++/Generic Containers

From Wikibooks, open books for an open world
Jump to navigation Jump to search

In this chapter, we will take a look at an implementation of a generic container, namely a vector. Our module will enable its clients to create and manipulate vectors housing items of different data types. In doing so, we will simulate the C++-style containers, which contain objects of a certain type.[1] That is, a particular container will be either a collection of ints or a collection of doubles, but not a mixture of both. This is in contrast with the pre-1.5 Java-style containers, where a container is a collection of Objects and therefore can have items belonging to incompatible classes.

Another aspect of C programming that we will be discussing in this handout is the use of variadic functions to compensate for the absence of overloading. Enabling the programmer to have an indefinite number of unnamed formal parameters on top of the named ones, variadic functions can be used to provide a poor man’s substitute for overloading. All we have to do is to get the caller to provide an extra argument for discrimination among overloaded function candidates.

Example: Instead of writing

void f(int); void f(int, int); void f(double);

f(3); f(3, 3); f(3.0);

we now write

void f(int func_type, ...);

f(ONE_INT, 3); f(TWO_INTS, 3, 3); f(ONE_DOUBLE, 3.0)

Our presentation will be concluded with a few words on static object module libraries.

Module[edit | edit source]

Interface[edit | edit source]

General.h
#ifndef GENERAL_H

#define GENERAL_H

...
...

Following macro definitions stand for constructor types that one should consider in the process of implementing an abstract data type.

#define COPY 0
#define DEFAULT 1

...

Creating a copy of a Vector object from an already existing one involves copying its components too. With a little insight we can see that there is no universal function to serve this purpose. As a matter of fact, same data type may require different such functions in different contexts. Same argument applies to the case of destroying a Vector object.

So, are we (the programmers) supposed to provide each and every possible copy and destruction functions? This beats the purpose of providing a generic container. We have to get the user to cooperate with us. And the way to do so is through the following pointer-to-function type definitions.

typedef Object (*COPY_FUNC)(Object);
typedef void (*DESTRUCTION_FUNC)(Object);

#endif
Vector.h
#ifndef VECTOR_H
#define VECTOR_H

#include "General.h"

struct _VECTOR;
typedef struct _VECTOR* Vector;

We want to implement a generic vector. That is, a vector that can have anything as its component: a vector of ints, a vector of character strings, a vector of student records, and so on. Given this requirement and the fact that the structure of the component can be known only to the user [not the implementer], we need to ask the user for some assistance about copying and destroying components of the vector. Basically, the user must implement these two functions and somehow pass it to the implementer. In other words, the user and the implementer must cooperate; the user must fill in the missing parts in the framework that is provided by the implementer. A user maintaining a Vector of student records has to provide copy and destruction functions for student record type; a user dealing with employee records has to provide the same functions for employee record type; ...

This style of programming differs from the conventional technique, where the user of the library develops algorithms that invoke library functions, but the overall structure of the application and flow of control differs for each new application. Although almost any C program makes use of the I/O library, the overall structure of a particular program is independent of others; routines provided in the I/O library are simply 'low-level' operations common to all programs and do not have to be tailored to the needs of a particular application. This is the technique used with procedural programming languages such as Pascal and C.[2]

Traditional libraries

In software framework usage, however, it is the overall structure, the high-level view of the algorithm that remains constant and is reused from application to application. The programmer (the user of the framework) specializes the framework, but majority of code remains the same across all applications. Because of this inversion in the relationship between library code and the user-supplied code, a software framework is sometimes referred to as an "upside-down" library.

Software frameworks

The best well-known software frameworks are those associated with graphical user interfaces. The major tasks involved in executing a graphical program- waiting for the user to move or click the mouse, waiting for keystrokes, moving or resizing windows, and so on- do not change much from one application to another. Thus these actions can be usefully combined in a single framework not only to save programming effort in developing new applications, but also to ensure consistency among applications. What distinguishes one program from another are features such as the contents of windows and the action to be performed in response to a mouse click or keystroke. It is these actions that must be programmed anew in each specific application of the framework.

This is the style of choice in object-oriented programming languages. To create a new application, it is necessary only to subclass from the original class [using inheritance] and then implement the behavior associated with these deferred functions. All other functionality, including the overall structure of the application, is obtained “for free” through the use of inheritance. Popular examples are the C++-based MFC (Microsoft Foundation Classes) and Java-based JFC (Java Foundation Classes).

This style of programming (software framework) would be possible if the implementer could 'call back' to some user code. In an object-oriented programming language, this can be done by implementing an interface defined within the framework or subclassing an abstract class and implementing the abstract functions—such as mouse click, keystrokes, and etc.—in this class. In C, we can do this by using function pointers. All we have to do is to define the function signatures, which is what we are doing here, and use these definitions as types of parameters in certain functions. The callback takes place in these functions by the implementer calling the user’s code through the function pointer passed to the function.

typedef struct _Object_funcs {
  COPY_FUNC copy_component;
  DESTRUCTION_FUNC destroy_component;
} Vector_Component_Funcs;

#define NON_EXISTING_VECTOR -1

#define INIT_CAP 2
#define BOTH 3

The following function signature is an example to the declaration of variadic functions in C. Ellipsis used in the specification signifies an unknown number of parameters coming after the named arguments.[3] Number of arguments passed to a variadic function is deduced from the contents of one or more of the named arguments. Take the printf function for instance, whose prototype is provided below.

int printf(const char *format_string, ...);

Placeholders, certain character sequences starting with ‘%’, in format_string help in figuring out the number and type of arguments.

extern Vector Vector_Create(int constructor_type, ...);
extern int Vector_Destroy(Vector* this);
extern void Vector_Assign(Vector* lhs, const Vector rhs);
extern Object Vector_ElementAt(const Vector this, unsigned int pos);
extern Object Vector_InsertAt(const Vector this, Object item, unsigned int pos);
extern Object Vector_RemoveAt(const Vector this, unsigned int pos);
extern Object Vector_UpdateAt(const Vector this, Object new_item, unsigned int pos);
extern unsigned int Vector_Size(const Vector this);

#endif

Implementation[edit | edit source]

Vector.c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "General.h"
#include "utility/Vector.h"

For controlling the lifetime of a particular object shared among different handles (actually pointers in our case), we use an extra field to hold the number of handles that provide access to the same underlying object. Such a field is called a reference count.

The last field of the structure definition can be seen, for pedagogical purposes, as Object container[];. That is, a Vector is basically an array of Objects. Using * instead of [] gives us the flexibility of changing the size of our Vector dynamically. Had we used [] we would have had to specify the size at compile time, which would have been too restrictive for our purposes.

struct _VECTOR {
  COPY_FUNC copy_component;
  DESTRUCTION_FUNC destroy_component; 
  unsigned int ref_count;
  unsigned int cap;
  unsigned int cap_inc;
  unsigned int cur_size;
  Object *container;
};

static BOOL adjust_capacity(const Vector); 
static void shift_down(const Vector, unsigned int);
static void shift_up(const Vector, unsigned int); 

Vector Vector_Create(int type, ...) {
  int i;

The following line declares a variable that will be used in traversing the list of unnamed arguments.

  va_list ap;
  Vector existing_vec;
  Vector_Component_Funcs comp_funcs;

  Vector ret_vec = (Vector) malloc(sizeof(struct _VECTOR));
  if (!ret_vec) {
    fprintf(stderr, "Out of memory...\n");
    return NULL;
  } /* end of (!ret_vec) */

  ret_vec->cur_size = 0; 
  ret_vec->ref_count = 1;
  ret_vec->container = NULL;

The following macro is used to initialize the list of unnamed arguments so that an internal pointer in ap points right after the last named argument.

Initializing the variable argument list


  va_start(ap, type);
  switch (type) {
    case COPY:

Next line returns an argument of type Vector and advances the internal pointer so that it points to the location right after this Vector argument. The run-time stack after execution of this statement is as shown below:[4]

Unnamed parameters (conpy constructor)


      existing_vec = va_arg(ap, Vector);

This is where we retrieve the function pointers passed by the user. Each time we need to copy or destroy a component we will call back to the functions found at the given addresses.

      ret_vec->copy_component = existing_vec->copy_component;
      ret_vec->destroy_component = existing_vec->destroy_component;
      ret_vec->cap = ret_vec->cur_size = existing_vec->cur_size;
      ret_vec->cap_inc = existing_vec->cap_inc;

In addition to malloc, we can use calloc to allocate a region of memory from the heap. This function allocates memory large enough to hold n (passed as the first argument) elements, each of size m (passed as the second element); it returns a pointer to the first element of the array. This can certainly be done by malloc. As an extra, the returned region is bitwise zeroed, which is not done by malloc.

Memory allocated calloc can be returned using free or cfree. But keep in mind that despite its omnipresence cfree is not a standard C function.

Question:
Can we use calloc to allocate memory that holds an int with an initial value of 0? What about a float with an initial value of 0.0? Can you guarantee that answer will always be the same?

float *f = (float *) calloc(1, sizeof(float)); /* What is the value contained in *f? Is it 0.0? */ int *i = (int *) calloc(1, sizeof(int)); /* What is the value contained in *i? Is it 0? */

      ret_vec->container = calloc(existing_vec->cur_size, sizeof(Object));
      for(i = 0; i < existing_vec->cur_size; i++) 
        ret_vec->container[i] = existing_vec->copy_component(existing_vec->container[i]);
      break;
    case DEFAULT:
      comp_funcs = va_arg(ap, Vector_Component_Funcs);
      ret_vec->copy_component = comp_funcs.copy_component;
      ret_vec->destroy_component = comp_funcs.destroy_component;
      ret_vec->cap = ret_vec->cap_inc = 0;
      break;
    case INIT_CAP:
      comp_funcs = va_arg(ap, Vector_Component_Funcs);
      ret_vec->copy_component = comp_funcs.copy_component;
      ret_vec->destroy_component = comp_funcs.destroy_component;


Unnamed parameters (with initial capacity)


      ret_vec->cap = va_arg(ap, unsigned int);
      ret_vec->cap_inc = 0;
      if (ret_vec->cap <= 0) break; 
      ret_vec->container = calloc(ret_vec->cap, sizeof(Object));
      break;
    case BOTH: 
      comp_funcs = va_arg(ap, Vector_Component_Funcs);
      ret_vec->copy_component = comp_funcs.copy_component;
      ret_vec->destroy_component = comp_funcs.destroy_component;
      ret_vec->cap = va_arg(ap, unsigned int);


Unnamed parameters (with both initial capacity and capacity increment)


      ret_vec->cap_inc = va_arg(ap, unsigned int);
      if (ret_vec->cap <= 0) break; 
      ret_vec->container = calloc(ret_vec->cap, sizeof(Object));
      break;
  } /* end of switch(type) */

va_end performs necessary cleanup operations on ap. In other words, it serves the purpose of a destructor. It should be called by the programmer after all arguments have been read with va_arg.

  va_end(ap);

  return(ret_vec);
Question:
Which one of the following memory layouts is more reasonable for implementing variadic functions?
Question:
In C, which one do you think is responsible for removing the arguments from the run-time stack, caller or the callee?
} /* end of Vector Vector_Create(Constructor_type, struct basic_funcs, ...) */

The structure of a Vector object is as given below.

Insert figure

The user accesses a Vector object through a handle (this variable in the figure). The representation is hidden and it is modified only through the functions in the public interface of Vector.

Note that the handle is a pointer to the properties of the Vector, not to its components. This is the typical approach taken to implement collection types. You have a handle on some metadata, which (in addition to defining properties of the collection) has a handle on the container that is used to hold the components.

int Vector_Destroy(Vector* this) {
  int i;
  Vector this_copy = *this;

  if (this_copy == NULL) return(NON_EXISTING_VECTOR);

The Vector_Destroy function having been called doesn’t mean the underlying object will be destroyed. In the case of an object that is shared among different handles, the reference count is decremented and the underlying object is left untouched. If the reference count is one we are about to severe the tie between the object and the last handle on it. In such a situation, we must also destroy the underlying object!

  if(this_copy->ref_count > 1) {
    *this = NULL;
    return(--this_copy->ref_count);
  } /* end of if(this_copy->ref_count > 1) */

Note that destroy_component is a pointer to function. That is, the function is called through a pointer; we can change the function being invoked by assigning different pointer values. This is actually what gives us the ability to deal with different types of data.[8]

  for (i = this_copy->cur_size - 1; i >= 0; i--)
    this_copy->destroy_component(this_copy->container[i]));

  free(this_copy->container); free(this_copy);
  *this = NULL;

  return 0;
} /* end of int Vector_Destroy(Vector*) */

Object Vector_ElementAt(const Vector this, unsigned int pos) {
  Object ret_object;

  if (pos >= this->cur_size) return NULL;

See that we return a copy of the required component, not a pointer to it. The rationale behind this is to avoid giving any access to the underlying data structure.

  ret_object = this->copy_component(this->container[pos]);

  return(ret_object);
} /* end of Object Vector_ElementAt(const Vector, unsigned int) */

Next function assigns a Vector object to another. Since we don’t have operator overloading in C, we have to come up with a function name and use this name to perform assignment. Of course, we could still use '=' to do this. But, doing so would lead to an inconsistent state, where reference count and the number of handles sharing the underlying object do not agree.

Vector v1, v2; v1 = Vector_Create(DEFAULT, ...); Vector_InsertAt(v1, ...); ...; ...; Vector_InsertAt(v1, ...); ... ... Vector_Assign(&v2, v1); ...

The previous code fragment will produce the picture given below.

Insert figure
void Vector_Assign(Vector* lhs, const Vector rhs{
  Vector_Destroy(lhs);
  *lhs = rhs;
  rhs->ref_count++;
} /* end of void Vector_Assign(Vector*, const Vector) */

Object Vector_InsertAt(const Vector this, Object new_item, unsigned int pos) {
  if (this->cur_size < pos) pos = this->cur_size;

Before we make the insertion, we have to make sure there is enough room in our structure. If not, we adjust the container capacity according to some predefined formula. In case we may be out of heap memory, we do not effect any changes in the structure and just return a NULL value from the function. Otherwise, we extend the memory region needed for the container and go ahead with the insertion: if the new component is inserted at the end just copy it at the end of the vector, which is indicated by the cur_size field; if the new component is inserted somewhere other than the end, shift the components that are to come after location of the new item down by one and make the insertion.

  if (this->cap == this->cur_size)
  if (!adjust_capacity(this)) return NULL;
  if (pos != this->cur_size) shift_down(this, pos);
  this->container[pos] = this->copy_component(new_item);
  this->cur_size++;

  return new_item;
} /* end of Object Vector_InsertAt(const Vector, Object, unsigned int) */

Object Vector_RemoveAt(const Vector this, unsigned int pos) {
  Object ret_object;

  if (pos >= this->cur_size) return NULL;
  ret_object = *(this->container + pos);
  if (pos != this->cur_size - 1) shift_up(this, pos);
  this->cur_size--;

  return(ret_object);
} /* end of Object Vector_RemoveAt(const Vector, unsigned int) */

Object Vector_UpdateAt(const Vector this, Object new_item, unsigned int pos) {
  Object old_value;

  if (pos >= this->cur_size) return NULL;

  old_value = *(this->container + pos);
  this->container[pos] = this->copy_component(new_value);

  return old_value;
} /* end of Object Vector_UpdateAt(const Vector, Object, unsigned int)*/

unsigned int Vector_Size(const Vector this) { return(this->cur_size); }

adjust_capacity function extends the capacity of a Vector according to some given formula. In our implementation, we adopt the formula used in most C++ compilers: if the Vector has a nonzero capacity increment value, capacity is incremented by this amount; if the user has not provided any value for this field or it is zero, the new capacity is calculated by doubling the previous one. (If the current capacity value is zero then new capacity is taken to be one.) Failure to allocate enough room for the Vector object results in returning a FALSE value.

static BOOL adjust_capacity(const Vector this) {
  int i;
  const Vector old_this = (Vector)  malloc(sizeof(struct _VECTOR));
  if(!old_this) {
    fprintf(stderr, "Out of memory...\n");
    return FALSE;
  } /* end of if(!old_this) */
  memcpy(old_this, this, sizeof(struct _VECTOR));

  if (this->cap_inc != 0) this->cap += this->cap_inc;
  else if (this->cap == 0) this->cap = 1;
    else this->cap *= 2;
  this->container = calloc(this->cap, sizeof(Object));

  if (!this->container) {
    this->cap = old_this->cap;
    this->container = old_this->container;
    free(old_this);
    fprintf(stderr, "Out of memory...\n");
    return FALSE;
  } /* end of if(!this->container) */

  if (old_this->cap == 0) {
    free(old_this);
    return TRUE;
  } /* end of if(old_this->cap == 0) */
  memcpy(this->container, old_this->container, 
  sizeof(Object) * old_this->cur_size);

  free(old_this->container);
  free(old_this);
  return TRUE;
} /* end of BOOL adjust_capacity(const Vector) */

Shifting down is required whenever a new item is inserted into any location other than the end of the Vector.

static void shift_down(const Vector this, unsigned int pos) {
  unsigned int i;
  for(i = this->cur_size; i > pos; i--)
    memcpy(this->container + i, this->container + (i - 1),sizeof(Object));
} /* end of void shift_down(const Vector, unsigned int) */

Similarly, shifting up is required whenever an item is removed from any location other than the end of the Vector.

static void shift_up(const Vector this, unsigned int pos) {
  unsigned int i;
  for(i = pos; i < this->cur_size - 1; i++)
    memcpy(this->container + i, this->container + (i + 1), sizeof(Object));
} /* end of void shift_up(const Vector, unsigned int) */

Test Programs[edit | edit source]

Vector_Test.c
#include <stdio.h>

#include "General.h"
#include "Wrappers.h"
#include "utility/Vector.h"

enum {CREATE_COPY = 1, CREATE_DEFAULT, DESTROY, INSERT = 6, PEEK, REMOVE, UPDATE, DISPLAY, EXIT = 0};

Vector v[2];

void display_vec(unsigned int which) {
  unsigned int i;

  if (!v[which]) { 
    printf("-----\n");
    return; 
  } /* end of if(!v[which]) */
  printf("+++++\n");
  printf("Contents of vector #%d\n", which);
  for (i = 0; i < Vector_Size(v[which]); i++) 
    printf("#%d: %d\t", i, Integer_Value(Vector_ElementAt(v[which], i)));
  printf("\n");

  return;
} /* end of void display_vec(unsigned int) */

int main(void) {
  int val;
  Integer pval;
  unsigned int action, pos, which_vec, from_which_vec;
  Vector_Component_Funcs int_funcs = {&Integer_CopyInteger, &Integer_DestroyInteger};

  do {
    scanf("%d", &action);
    switch (action) {
      case CREATE_COPY:
        scanf("%d %d", &which_vec, &from_which_vec);
        printf("Trying to create a copy of vector#%d into vector#%d...", from_which_vec, which_vec);
        v[which_vec] = Vector_Create(COPY, v[from_which_vec]);
        if (v[which_vec]) printf("+++++\n");
          else printf("-----\n");
        break;
      case CREATE_DEFAULT:
        scanf("%d", &which_vec);
        printf("Trying to create vector #%d using default constructor...", which_vec);
        v[which_vec] = Vector_Create(DEFAULT, int_funcs );
        if (v[which_vec]) printf("+++++\n");
          else printf("-----\n");
        break;
      case DESTROY:
        scanf("%d", &which_vec);
        printf("Trying to destroy vector#%d...", which_vec);
        if (!Vector_Destroy(&v[which_vec])) printf("+++++\n");
          else printf("-----\n");
        break;
      case INSERT:
        scanf("%d %d %d", &which_vec, &pos, &val);
        printf("Trying to insert %d at position %d of vector#%d...", val, pos, which_vec);

Note we insert Integer objects—which are basically objects of an opaque type that wrap around an int—not ints. This is a consequence of the genericness provided by means of void*, which requires insertion of pointers.

        pval = Integer_Create(val);
        if (Vector_InsertAt(v[which_vec], pval, pos)) printf("+++++\n");
          else printf("-----\n");
        break;


Partial memory layout before Vector_InsertAt is called


Partial memory layout after insertion (in Vector_InsertAt)


      case REMOVE:
        scanf("%d %d", &which_vec, &pos);
        printf("Trying to remove the item at position %d from vector#%d...", pos, which_vec);

Now that the value returned (ret_obj) will be assigned to pval, we will have a handle on the removed object through pval.

Partial memory layout before Vector_RemoveAt is called


Partial memory layout after removal (in Vector_RemoveAt)


        pval = Vector_RemoveAt(v[which_vec], pos);
        if (pval) printf("+++++%d\n", Integer_Value(pval));
          else printf("-----\n");
        break; 
      case PEEK:
        scanf("%d %d", &which_vec, &pos);
        printf("Trying to peek in vector#%d at position %d...", which_vec, pos);


Partial memory layout after peeking (in Vector_ElementAt)


Realize the object is cloned and then returned. This new object, after assignment to pval, can be manipulated independently of the original copy.

        pval = Vector_ElementAt(v[which_vec], pos);
        if (pval) printf("+++++%d\n", Integer_Value(pval));
          else printf("-----\n");
        break;
      case UPDATE:
	scanf("%d %d %d", &which_vec, &pos, &val);
	printf("Trying to update position %d of vector#%d with %d...", pos, which_vec, val);
	pval = Integer_Create(val);
	if (Vector_UpdateAt(v[which_vec], pval, pos))
          printf("+++++\n");
          else printf("-----\n");
	break;
      case DISPLAY:
        scanf("%d", &which_vec);
        printf("Trying to display vector#%d...", which_vec);
        display_vec(which_vec);
        break;
      case EXIT:
        exit(0);
    } /* end of switch(action) */
  } while (1);
} /* end of int main(void) */

Running the Test Program[edit | edit source]

(Using Cygwin)

gcc –c –ID:/include Vector.c↵
gcc –o Vector_Test.exe Vector_Test.c Vector.o –lWrappers –ID:/include –LD:/library↵

The above command will bring into the executable the required object files found in the [static] library named libWrappers.a. Name of the library to use is figured out by simply prefixing "lib" and suffixing ".a" to the file name supplied with the –l switch. Search for this library will make use of the directory passed with the –L switch.

Vector_Test < Vector_Test.input > Vector_Test.output↵

This command will redirect both input and output to disk files. It will still make the program believe that it is receiving its input from the keyboard and sending its output to the screen. This becomes possible by remapping stdin and stdout to different physical files. With no remapping, all processes in a system have the following initial file table:

Per-process open file table
Physical file Other fields
0 Keyboard ...
1 Screen ...
2 Screen ...

After remapping we have:

Per-process open file table after redirection
Physical file Other fields
0 Vector_Test.input ...
1 Vector_Test.output ...
2 Screen ...

For both before and after remapping, we have the following macro definitions in place:

#define stdin 0 #define stdout 1 #define stderr 2

This is all done by the command-line interpreter. The user doesn’t need to make any modifications in the source code; (s)he can still program as if input were entered at the keyboard and output were written to the screen. This is possible because we read from or write to a physical file through a handle (logical file). If we change the physical file the handle is mapped to, although we write to the same logical file (handle) the final destination affected changes.

Vector_Test.input

2 0 6 0 10 5 6 0 10 15 6 0 8 20 6 0 8 25 6 0 2 30 6 0 0 35 10 0 8 0 2 8 0 3 8 0 100 10 0 1 1 0 8 1 1 7 1 0 9 1 2 26 10 1 7 0 2 7 0 5 10 0 3 0 10 0 0

Vector_Test.output

Trying to create vector #0 using default constructor...+++++ Trying to insert 5 at position 10 of vector#0...+++++ Trying to insert 15 at position 10 of vector#0...+++++ Trying to insert 20 at position 8 of vector#0...+++++ Trying to insert 25 at position 8 of vector#0...+++++ Trying to insert 30 at position 2 of vector#0...+++++ Trying to insert 35 at position 0 of vector#0...+++++ Trying to display vector#0...+++++ Contents of vector #0 #0: 35 #1: 5 #2: 15 #3: 30 #4: 20 #5: 25 Trying to remove the item at position 2 from vector#0...+++++15 Trying to remove the item at position 3 from vector#0...+++++20 Trying to remove the item at position 100 from vector#0...----- Trying to display vector#0...+++++ Contents of vector #0 #0: 35 #1: 5 #2: 30 #3: 25 Trying to create a copy of vector#0 into vector#1...+++++ Trying to remove the item at position 1 from vector#1...+++++5 Trying to peek in vector#1 at position 0...+++++35 Trying to update position 2 of vector#1 with 26...+++++ Trying to display vector#1...+++++ Contents of vector #1 #0: 35 #1: 30 #2: 25 Trying to peek in vector#0 at position 2...+++++30 Trying to peek in vector#0 at position 5...----- Trying to display vector#0...+++++ Contents of vector #0 #0: 35 #1: 5 #2: 30 #3: 25 Trying to destroy vector#0...+++++ Trying to display vector#0...-----

Object Module Libraries[edit | edit source]

Whether you use them or not, linking statically with an object module brings in all the functionality within the module. This means code you never use in your program will be there to increase size of the executable. One can avoid this by splitting the contents of the module into smaller chunks. Using these smaller modules will probably reduce the size of the executable but you now face another problem: having to write the names of the object modules separately on the command line.[9]

Libraries offer a combination of both worlds.[10] Linking with a library means all external entities within the individual modules are visible. However, only the modules that are used are brought into the executable. Switching from one module to another in the library does not force you to change the command line. Everything is taken care of by the linker and the library manager.

Example: Suppose libWrappers.a contains Integer.o, Char.o, Short.o, and so on. Having used some functionality from Integer.o, the following command will cause only Integer.o to be brought in.

gcc –o Vector_Test.exe Vector_Test.c Vector.o –lWrappers –ID:/include –LD:/library↵

If we later decide to use some function from, say, Char.o, we don’t need to make any modifications on the command line. The linker-library manager pair will sort out the details for us and bring in only Char.o and Integer.o.

Definition: A library is an archive of object modules, which—in addition to the modules—generally has some sort of metadata to quicken locating the individual files.

Library Header File[edit | edit source]

Wrappers.h includes forward declarations and prototypes required for boxing primitive type data and unboxing wrapped data. Note same thing could have been done by placing relevant information in separate header files. There is no rule saying that one must have a single header file for an entire library. As a matter of fact, you can have a separate header file for each and every object module [or even function] in the library. Choosing to have a separate header file for each object module, however, has a disadvantage: As you add a new function call in your code you have to figure out the header file it comes from and [probably] add a relevant include directive, which is a reincarnation of the problem of having to write multiple object files on the command line.

In some cases, striking a balance between the two extremes might be a better choice. Take, for instance, a math library that offers tens, if not hundreds, of functions. Sifting through a single header file in such a case is admittedly a painstaking process. Having multiple headers corresponding to different aspects of the math library might be a good idea.

Wrappers.h
#ifndef WRAPPERS_H
#define WRAPPERS_H

#include "General.h"

Since our generic vector expects an opaque pointer as its component, using it with primitive data may turn out to be painful: You must first wrap the data and use the pointer to this wrapped data in your transactions. Later on when you want to print the data to some medium, you must do some unboxing—that is, retrieve the contents of the region pointed by the opaque pointer—and then print the data. Similar things can be said about comparing, copying, and destroying the data.

This pattern is so common that a special library (libWrappers.a) is offered for the user’s convenience. In case (s)he may want to use the Vector module and the like with primitive data types, (s)he doesn’t have to reinvent the wheel; all (s)he has to do is to use the functions found in libWrappers.a.

The whole thing can be likened to the wrapper classes in Java and the concept of automatic boxing-unboxing in C#. As a matter of fact, this way of using a container is so frequent that Java, as of 1.5, supports automatic boxing-unboxing.

struct _CHAR;
typedef struct _CHAR* Char;
extern Char Char_Create(unsigned char val);
extern void Char_Destroy(Char* this);

extern int Char_CompareChars(Object, Object);
extern Object Char_CopyChar(Object);
extern void Char_DestroyChar(Object);

extern unsigned char Char_Value(Char wrappedChar);

struct _SCHAR;
typedef struct _SCHAR* SChar;
extern SChar SChar_Create(signed char val);
extern void SChar_Destroy(SChar* this);

extern int SChar_CompareSChars(Object, Object);
extern Object SChar_CopySChar(Object);
extern void SChar_DestroySChar(Object);

extern signed char SChar_Value(SChar wrappedSChar);

struct _INTEGER;
typedef struct _INTEGER* Integer;
extern Integer Integer_Create(signed int val);
extern void Integer_Destroy(Integer* this);

extern int Integer_CompareIntegers(Object, Object);
extern Object Integer_CopyInteger(Object);
extern void Integer_DestroyInteger(Object);

extern signed int Integer_Value(Integer wrappedInt);

struct _UINTEGER;
typedef struct _UINTEGER* UInteger;
extern UInteger UInteger_Create(unsigned int val);
extern void UInteger_Destroy(UInteger* this);

extern int UInteger_CompareUIntegers(Object, Object);
extern Object UInteger_CopyUInteger(Object);
extern void UInteger_DestroyUInteger(Object);

extern unsigned int UInteger_Value(UInteger wrappedUInt);

struct _LONG;
typedef struct _LONG* Long;
extern Long Long_Create(signed long val);
extern void Long_Destroy(Long* this);

extern int Long_CompareLongs(Object, Object);
extern Object Long_CopyLong(Object);
extern void Long_DestroyLong(Object);

extern signed long Long_Value(Long wrappedLong);

struct _ULONG;
typedef struct _ULONG* ULong;
extern ULong ULong_Create(unsigned long val);
extern void ULong_Destroy(ULong*);

extern int ULong_CompareULongs(Object, Object);
extern Object ULong_CopyULong(Object);
extern void ULong_DestroyULong(Object);

extern unsigned long ULong_Value(ULong wrappedULong);

struct _SHORT;
typedef struct _SHORT* Short;
extern Short Short_Create(signed short val);
extern void Short_Destroy(Short*);

extern int Short_CompareShorts(Object, Object);
extern Object Short_CopyShort(Object);
extern void Short_DestroyShort(Object);

extern signed short Short_Value(Short wrappedShort);

struct _USHORT;
typedef struct _USHORT* UShort;
extern UShort UShort_Create(unsigned short val);
extern void UShort_Destroy(UShort*);

extern int UShort_CompareUShorts(Object, Object);
extern Object UShort_CopyUShort(Object);
extern void UShort_DestroyUShort(Object);

extern unsigned short UShort_Value(UShort wrappedUShort);

struct _DOUBLE;
typedef struct _DOUBLE* Double;
extern Double Double_Create(double val);
extern void Double_Destroy(Double*);

extern void Double_DestroyDouble(Object);
extern int Double_CompareDoubles(Object, Object);
extern Object Double_CopyDouble(Object);

extern double Double_Value(Double wrappedDouble);

struct _FLOAT;
typedef struct _FLOAT* Float;
extern Float Float_Create(float val);
extern void Float_Destroy(Float*);

extern void Float_DestroyFloat(Object);
extern int Float_CompareFloats(Object, Object);
extern Object Float_CopyFloat(Object);

extern float Float_Value(Float wrappedFloat);

#endif

A Sample Module[edit | edit source]

What follows is an implementation of the wrapper interface for type int. Wrappers for the other primitive types can be provided similarly and will not be included in this handout.

Integer.c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#include "Wrappers.h"

struct _INTEGER { signed int value; };

Integer Integer_Create(signed int val) {
  Integer this = (Integer) malloc(sizeof(struct _INTEGER));
  if (!this) {
    fprintf(stderr, "Out of memory...\n");
    return NULL;
  } /* end of if(!this) */
  this->value = val;

  return this;
} /* Integer Integer_Create(signed int) */

void Integer_Destroy(Integer* this) {
  Integer this_copy = *this;
  if (this_copy == NULL) return;

  free(this_copy);
  *this = NULL;
} /* end of void Integer_Destroy(Integer*) */

void Integer_DestroyInteger(Object this) {
  Integer_Destroy((Integer*) &this);
} /* void Integer_DestroyInteger(Object) */

int Integer_CompareIntegers(Object this, Object rhs) {
  signed int thisValue = ((Integer) this)->value;
  signed int rhsValue = ((Integer) rhs)->value;

  if (thisValue == rhsValue) return 0;
  else if (thisValue < rhsValue) return -1;
    else return 1;  
} /* int Integer_CompareIntegers(Object, Object)*/

Object Integer_CopyInteger(Object this) {
  Integer retVal = Integer_Create(((Integer) this)->value);

  return retVal;
} /* Object Integer_CopyInteger(Object) */

signed int Integer_Value(Integer wrappedInt) {
  return wrappedInt->value;
} /* signed int Integer_Value(Integer) */

Building the Library[edit | edit source]

Using Cygwin

Before building the library we need to create the individual object modules that are to go into it.

gcc –c Char.c –ID:/include
gcc –c Integer.c –ID:/include
...
...

If all source files are in the same directory and no other files exist in that location, you can do it all in one step:

gcc –c *.c –ID:/include

We can now create the library—or archive in Unix lingo—and append the object modules to it by the following command. This will put all files ending with .o into libWrappers.a.

ar qc libWrappers.a *.o # For more on ar, simply type “man ar” on the command line.

If you are not convinced you may want to list the contents of the library.

ar t libWrappers.a
Char.o
Double.o
...
...
UShort.o

Now that libWrappers.a has been built, we can now place it in some fixed location and utilize the functions residing in it as we did in our test program.

Using Windows

Before you issue the following commands make sure command-line tools are made available as external commands. This may be easily done by executing [the commands contained in] the batch file named vcvars32.bat found in the bin subdirectory of the Microsoft Visual C/C++ compiler.

cl /ID:/include /c /TC Char.c
cl /ID:/include /c /TC Integer.c
...
...

The preceding lines create the object files we need for building the library. Although rather short, they underline the fact that we are using a compiler-driver, not a plain compiler. In order to successfully bring in the required headers, using the /I option we pass the preprocessor information about places to look for them. Thanks to the /c we also tell the compiler-driver to stop before linking. Finally, not really needed in our example, using /TC we tell the compiler proper to treat the files passed on the command line as C source code.

lib /OUT:Wrappers.lib Char.obj Integer.obj ... # builds the library
lib /LIST Wrappers.lib # lists the library contents
Microsoft (R) Library Manager Version 7.10.3077
Copyright ( C ) Microsoft Corporation. All rights reserved.
UShort.obj
ULong.obj
...
...
Char.obj

lib, Windows counterpart of ar, is the library-manager command. Passing different options we can get it to build static libraries, list library contents, and so on.

cl /ID:/include /FeVector_Test.exe Vector.obj Wrappers.lib Vector_Test.c # To get help on options, pass /? to the cl
/D_CRT_SECURE_NO_DEPRECATE /link /NODEFAULTLIB:libc.lib /DEFAULTLIB:libcmt.lib /LIBPATH:D:/library[11]

Another convincing proof that cl is actually a compiler-driver! In order to get Vector_Test.exe; preprocess and compile Vector_Test.c and link the resulting object file with Vector.obj and needed object files from Wrappers.lib, which can be found as a result of a search starting with the location given by means of the /LIBPATH option passed to the linker.

Notes[edit | edit source]

  1. With the introduction of inheritance, the same container might end up having items belonging to different- but compatible- types.
  2. Nevertheless, this doesn’t mean that we cannot use other techniques.
  3. Traditional C allows a variadic function with no named arguments while in Standard C there is a minimum of one named argument.
  4. Note that existing_vec is a local variable and therefore should appear in the lower part of the figure, just below ap to be exact. The slot marked with existing_vec actually stands for the unnamed formal parameter that corresponds to the second argument.
  5. This can be used as an argument for why a compiler may choose not to evaluate actual parameters from left-to-right. As a matter of fact, a right-to-left evaluation seems to be a better choice. After all, it is the rightmost one that gets pushed onto the run-time stack first.
  6. While perusing Windows source you may at times see a function name prefixed with specifiers such as __cdecl (C calling convention), __stdcall (standard calling convention) or __pascal. Such a specifier is used to tell the type of arrangement, naming convention and cleanup protocol used in the function call. __cdecl is used with variadic functions, while __stdcall is used for the rest.
  7. As a matter of fact, removing the arguments is usually no more than adjusting the stack pointer by a certain amount.
  8. One should also add the genericness provided through void*.
  9. If you use a visual tool this would mean adding the object files to your project.
  10. Depending on the time of linkage to the executable, a library can be either static or dynamic. The former brings in the object module at link-time while in the latter this is deferred to the load- or run-time. Our presentation in this handout will be limited to static libraries.
  11. This post-2005 MS Visual Studio command line reflects a non-standard aspect of the Microsoft compiler: in the name of making C a safer(!) programming language, compiler advises—by means of warnings—the programmer to use scanf_s, which is a Microsoft-only function, instead of the standard scanf. We ignore this by defining the _CRT_SECURE_NO_DEPRECATE macro. Starting with MS Visual Studio 2005, Microsoft has also removed libc.lib- single-threaded static C runtime- from its distribution. The default C runtime library that is now linked to form the executable is libcmt.lib, which is the multi-threaded version of libc.lib. However, for some reason, the linker may still end up looking for libc.lib! In order to fix this glaring glitch, using the /NODEFAULTLIB option, we remove libc.lib from the list of libraries searched in the process of resolving external references and, using /DEFAULTLIB option, add libcmt.lib in its place. Therefore, in a pre-2005 MS Visual Studio, this command line should be replaced with the following.
    cl /ID:/include /FeVector_Test.exe Vector.obj Wrappers.lib Vector_Test.c /link /LIBPATH:D:/library