15. POD and non-POD classes
Video:
Value and reference semantics
Java and some other object-oriented languages use reference semantics. Reference semantic means that the declared variables represented by a single, fixed-sized pointer or reference, and the real object with all its fields are allocated in the heap by the new operation. When we assign such a variable, we assign the references, therefore the assigned variable will refer to the new object.
C++ however, uses value-semantic. Value semantic means, that the declared variable is stored directly in the computer’s memory without any reference. The following object is stored in the memory as the fields appended after each other (with optionally padding between the fields).
In the following picture, the upper side is the value semantics (C++-like) behavior, the lower side is the reference semantics (Java-like) behavior.
In C++ we usually use the value semantics.
1
2
3
4
5
struct MyStruct
{
int i;
double d;
};
When such object is being copied, then by default the memory area of the source is copied directly into the target. More precisly, the default assignment (and copy constructor) is implemented as a memberwise copy:
1
2
3
MyStruct x, y;
y = x; // means: y.i = x.i; y.d = x.d;
POD and non-POD types
This default behavior works for many cases. However, there are situations when the default (memberwise) copy does not fit for the purposes.
Consider the following DVector class which stores up to maxsize double elements in a buffer dynamically allocated in the heap.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// dvector.h
#ifndef DVECTOR_H
#define DVECTOR_H
// vector storing double values up to 64 elements.
class DVector
{
public:
DVector(); // constructor
const int maxsize = 64;
int size() const; // actual size
double& operator[](int i); // unchecked access
double operator[](int i) const;// unchecked, const member
void push_back(double d); // append to end
void pop_back(); // remove from end;
private:
int _size; // actual number of elements
int _capacity; // buffer size
int* _ptr; // pointer to buffer
};
#endif /* DVECTOR_H */
We can insert new elements to the end of the buffer with the push_back function, while pop_back removes the last element. Inserted elements can be accessed with indexes between 0 and size()-1 by the index operator) operator[].
The constructor creates an empty DVector, allocating the buffer for maxsize elements. (Later we will create a version where the buffer automatically reallocated on push_back when full.)
The class has three fields: capacity to store the actual capacity (in this version it is always maxsize), size to store the elements actually inserted, and ptr pointing to the buffer allocated in heap.
The implementation in dvector.cpp is straightforward.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// dvector.cpp
#include <stdexcept>
#include "dvector.h"
DVector::DVector()
{
_capacity = maxsize;
_size = 0;
_ptr = new double[_capacity];
}
int DVector::size() const
{
return _size;
}
double& DVector::operator[](int i)
{
return _ptr[i];
}
double DVector::operator[](int i) const
{
return _ptr[i];
}
void DVector::push_back(double d)
{
if ( _size == _capacity )
throw std::out_of_range("vector full");
_ptr[_size] = d;
++_size;
}
void DVector::pop_back()
{
if ( 0 == _size )
throw std::out_of_range("vector empty");
--_size;
}
And here is the client code which tests the program:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#include <iostream>
#include "dvector.h"
void print(const DVec& dv, char *name)
{
std::cout << s << " = [ ";
for (int i = 0; i < dv.size(); ++i )
std::cout << dv[i] << " ";
std::cout << "]" << std::endl;
}
int main()
{
DVector x; // declare and fill x
for (int i = 0; i < 10; ++i )
x.push_back(i);
DVector y; // declare and fill y
for (int i = 0; i < 15; ++i )
y.push_back(i+20);
print(x,"x");
print(y,"y");
std::cout << " x = y;" << std::endl;
x = y;
print(x,"x");
print(y,"y");
std::cout << "x[0] = 999;" << std::endl;
x[0] = 999;
print(x,"x");
print(y,"y");
}
The result is not really what we expected:
1
2
3
4
5
6
7
8
9
$ ./a.out
x = [ 0 1 2 3 4 5 6 7 8 9 ]
y = [ 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 ]
x = y;
x = [ 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 ]
y = [ 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 ]
x[0] = 999;
x = [ 999 21 22 23 24 25 26 27 28 29 30 31 32 33 34 ]
y = [ 999 21 22 23 24 25 26 27 28 29 30 31 32 33 34 ]
Everything looks fine up to line 7. Both x and y objects have created, filled with integer literals automatically converted to doubles. When y is assigned to x in line 4 all elements of y seems to be copied to x. However, when x[0] is modified in line 7, surprisingly y is modified. This is definitely not what we expect.
Where is the error?
Let consider the object layouts:
__________
| _size | y
|_capacity | ___________________________
| _ptr ---------------> | 20 21 22 ... 34 |
|__________| |_________________|_________|
_size _capacity
__________
| _size | x
|_capacity | _____________________
| _ptr ---------------> | 1 2 3 ... 9 |
|__________| |_______________|_____|
_size _capacity
On the x = y assignment, we copy the values of all members of y to
x, including the ptr, which means, that x._ptr will points to
where y._ptr pointed to – the buffer of y object.
This happened on x = y:
__________
| _size | y
|_capacity | ____________________________
| _ptr ---------------> | 20 21 22 ... 34 |
|__________| / |_________________|__________|
/
/
__________ /
| _size | x /
|_capacity | /
| _ptr ---------
|__________|
As we see, the default memberwise copy is not always fits to the requirements. Those types where the memberwise copy is fine we call POD types (from hystorical Plain Old Data expression). Otherwise, we speak about a non-POD type.
Copy constructor and assignment operator
When the default memberwise copy operations are not sufficient, we have to provide our own copy operations: the copy constructor and the assignment operator. The role of these special memberfunctions is to implement the correct copy algorithms.
Copy constructor
The copy constructor has a strict signature: MyType( const MyType &rhs). (In very special cases – like for some smart pointer – we can omit the const qualifier.) The copy constructor is called when a new object is created and initialized from the same type as argument.
Usually the semantics of the copy constructor is to allocate resources for the new object and initialize it copying from the argument object.
Assignment operator
The assignment operator is a normal operator implementing the value semantics
for assignment. The usual (but non-mandatory) syntax is
MyType& operator=( const MyType &rhs). The assignment operator is called
when an existing object is assigned to an other existing object. It is
possible, that the types of these objects are not the same, e.g. a char*
object is assignable to an std::string.
The parameter of the assignment operator is usually a refrence parameter, but contrary to the copy constructor this is not mandatory. The return type of the assignment is a refefence to the same type, and returns the freshly updated object – the one on the left hand side of the assignment.
Usually the semantics of the assignment is to free the old resurces of the object, then allocate the new resources and copy the value from the argument object. The function returns with *this, returning the freshly updated object by reference.
With the correct operations defined the effect of the x = y; copy is the following:
__________
| _size | y
|_capacity | ____________________________
| _ptr ---------------> | 20 21 22 ... 34 |
|__________| |_________________|__________|
| | | |
| | | |
__________ ...copy...
| _size | x | | | |
|_capacity | __V__V__V______V____________
| _ptr ---------------> | 20 21 22 ... 34 |
|__________| |_________________|__________|
Destructor
There is an other issue with our DVector class. When our object is going out of its lifetime the date members of the object: capacity, size, and ptr will be freed. However, the buffer in the heap, that is allocated in the constructor, remains allocated.
In C++ we have no garbage collector (contrary to Java and C#). There are no more reference to this memory anymore, therefore this memory is wasted, it is a memory leak. Memory leak is a critical issue in C++.
The constructor is the place when we initialize our objects. The special memberfunction to execute the reverse operations – deallocate resources, other clean-up activities, etc. – is the destructor. There might be only one destructor for a class, and its signature is ~MyType().
The destructor – if defined – is always executed when an object’s lifetime ends.
Implmenting the copy operations and the destructor
Here we can see the implementation ofthe DVector with proper copy operations and destructor.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// dvector.h
#ifndef DVECTOR_H
#define DVECTOR_H
class DVector
{
public:
DVector(); // constructor
DVector(const DVector& rhs); // copy constructor
DVector& operator=(const DVector& rhs); // assignment
~DVector(); // destructor
int size() const; // actual size
double& operator[](int i); // unchecked access
double operator[](int i) const;// unchecked, const member
void push_back(double d); // append to end
void pop_back(); // remove from end;
private:
int _size; // actual number of elements
int _capacity; // buffer size
double* _ptr; // pointer to buffer
};
#endif /* DVECTOR_H */
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
// dvector.cpp
#include <stdexcept>
#include "dvector.h"
DVector::DVector()
{
_capacity = 64;
_size = 0;
_ptr = new double[_capacity];
}
DVector::DVector(const DVector& rhs)
{
_capacity = rhs._capacity;
_size = rhs._size;
_ptr = new double[_capacity];
for (int i = 0; i < _size; ++i)
_ptr[i] = rhs._ptr[i];
}
DVector& DVector::operator=(const DVector& rhs)
{
if ( this != &rhs ) // avoid x = x
{
delete [] _ptr;
_capacity = rhs._capacity;
_size = rhs._size;
_ptr = new double[_capacity];
for (int i = 0; i < _size; ++i)
_ptr[i] = rhs._ptr[i];
}
return *this; // for x = y = z
}
DVector::~DVector()
{
delete [] _ptr;
}
//... rest of dvector.cpp is the same
Let recognize, the self-assignment check in line 21 of dvector.cpp. This is necessary to avoid the issues when someone accidentally try to execute x = x.
Make DVector growing dynamically
Our current implementation has a limited capacity. It is easy to remove this restriction making the capacity of the object dynamically growing on demand.
We check in push_back whether the logical size of the object reaches the physical capacity, and then let the buffer expanded by the newly defined grow method.
1
2
3
4
5
6
7
8
void DVector::push_back(double d)
{
if ( _size == _capacity )
grow();
_ptr[_size] = d;
++_size;
}
The grow method doubles the capacity and allocate a new buffer, copies the elements from the old buffer, then deallocates the old buffer. The size does not change here.
1
2
3
4
5
6
7
8
9
10
11
void DVector::grow()
{
double *_oldptr = _ptr;
_capacity = 2 * _capacity;
_ptr = new double[_capacity];
for ( int i = 0; i < _size; ++i)
_ptr[i] = _oldptr[i];
delete [] _oldptr;
}
This is very similar how the standard library std::vector is implemented.
The complete DVector class
To make our class complete, we do some refactoring work following the DRY – Do not Repeat Yourself – principle. We create the copy and release methods to implement common activies of the copy constructor, assignment operator and the destructor. Naturally, as neither of grow, copy, release are part of the intended interface of the class, we declare them as private methods.
We conclude with the following code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// dvector.h
#ifndef DVECTOR_H
#define DVECTOR_H
class DVector
{
public:
DVector(); // constructor
DVector(const DVector& rhs); // copy constructor
DVector& operator=(const DVector& rhs); // assignment
~DVector(); // destructor
int size() const; // actual size
double& operator[](int i); // unchecked access
double operator[](int i) const;// unchecked, const member
void push_back(double d); // append to end
void pop_back(); // remove from end;
private:
int _size; // actual number of elements
int _capacity; // buffer size
double* _ptr; // pointer to buffer
void copy(const DVector& rhs);// private helper function
void release(); // private helper function
void grow(); // reallocate buffer
};
#endif /* DVECTOR_H */
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
// dvector.cpp
#include <stdexcept>
#include "dvector.h"
DVector::DVector()
{
_capacity = 4;
_size = 0;
_ptr = new double[_capacity];
}
DVector::DVector(const DVector& rhs)
{
copy(rhs);
}
DVector& DVector::operator=(const DVector& rhs)
{
if ( this != &rhs ) // avoid x = x
{
release();
copy(rhs);
}
return *this; // for x = y = z
}
DVector::~DVector()
{
release();
}
void DVector::copy(const DVector& rhs)
{
_capacity = rhs._capacity;
_size = rhs._size;
_ptr = new double[_capacity];
for (int i = 0; i < _size; ++i)
_ptr[i] = rhs._ptr[i];
}
void DVector::release()
{
delete [] _ptr;
}
void DVector::grow()
{
double *_oldptr = _ptr;
_capacity = 2 * _capacity;
_ptr = new double[_capacity];
for ( int i = 0; i < _size; ++i)
_ptr[i] = _oldptr[i];
delete [] _oldptr;
}
int DVector::size() const
{
return _size;
}
double& DVector::operator[](int i)
{
return _ptr[i];
}
double DVector::operator[](int i) const
{
return _ptr[i];
}
void DVector::push_back(double d)
{
if ( _size == _capacity )
grow();
_ptr[_size] = d;
++_size;
}
void DVector::pop_back()
{
if ( 0 == _size )
throw std::out_of_range("vector empty");
--_size;
}