Chapter 8

Structs

Our second major way to organize the data in our programs is structured data, or structs. A struct allows a programmer to group related data into a single unit. Variables of the struct type can be created, and the individual fields within the struct can be accessed and updated.

Creation of Structs

Recall from chapter 7 our student data:

Name: Dave 
School: VT
Course: ECE 1574
Grade: B+
Average: 88.65

This is a good example of related information. All of the fields (the name, school, course, grade and average) relate to this one student. The student is made up of these fields, and the fields for each student "belong" to that student. The student "owns" the information related to themselves. If we wanted to create a struct for the student we could do so like this:

struct Student
{
    string name;
    string school;
    string course;
    string grade;
    double average; 
};

This is, of course, just one way we could represent this information. Different choices could be made about how to represent the fields that make up the struct Student. Note that the definition starts with the keyword struct, followed by the name of the struct, and that the fields are declared inside the block. The definition of the struct ends with a semicolon. The semicolon is needed because C++ allows us to create variables of the struct right at the definition, between the closing brace and the semicolon. I typically don't do this because most of my structs are declared at global scope, and that would create a global variable. Normally I declare my variables within a local scope.
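As a minimal sketch of what creating a variable at the definition looks like, the example below declares a struct and a variable of that struct in one statement. The names Point and origin are hypothetical and used only for this illustration; they are not part of the student example.

```cpp
// Point and origin are hypothetical names used only for this illustration.
struct Point
{
    int x;
    int y;
} origin; // origin is a variable of type Point, created at the definition

// Because this definition sits at global scope, origin is a global variable.
// Any function in the file could read or change origin.x and origin.y.
```

Leaving out the variable name and writing only the closing brace followed by the semicolon gives the form used for Student above, which defines the type without creating any variable.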

Assuming we have declared the struct Student as above, we can now create variables of type Student like this:

Student myStudent;

We can access the individual fields of myStudent using dot notation. We give the name of the variable, then a dot, and then the field we wish to access.

myStudent.name      = "Dave";
myStudent.school    = "VT";
myStudent.course    = "ECE 1574";
myStudent.grade     = "B+";
myStudent.average   = 88.65;

This would be how we could update or set the individual fields. If we use the variable and dot notation on the left side of an assignment we can update the individual parts. If we use the variable and the dot notation on the right side of an assignment we can retrieve or access the parts. We can also use this same method for displaying the parts.

cout << myStudent.name << " " << myStudent.school << " " << myStudent.course << " " << myStudent.grade << " " << myStudent.average << "\n";

Assuming that our myStudent variable has the values associated with it as assigned, then we would have this output:

Dave VT ECE 1574 B+ 88.65

We can also assign an entire structure into another struct of the same type. For example:

Student copy;
copy = myStudent;

After these two lines of code, copy contains all the same values that are stored in myStudent. We can use this to make copies of our structs for any purpose. We can pass structs into functions or return them from functions. A struct comes very close to creating a new data type. A data type, as you might recall, supplies both the data that can be stored and the operations we can perform on that data. A struct provides the mechanism for defining the data that can be stored, but it doesn't define the operations that can be performed. In our next chapter we will see how a class completes these requirements and allows for user-created data types in C++.
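Passing a struct into a function and returning one from a function might look like the sketch below. The helper names makeStudent and isPassing, and the passing cutoff of 60, are assumptions made up for this illustration.

```cpp
#include <string>

using std::string;

struct Student
{
    string name;
    string school;
    string course;
    string grade;
    double average;
};

// Hypothetical helper: builds a Student and returns the whole struct by value.
Student makeStudent(const string& name, const string& school,
                    const string& course, const string& grade, double average)
{
    Student s;
    s.name    = name;
    s.school  = school;
    s.course  = course;
    s.grade   = grade;
    s.average = average;
    return s; // the caller receives a copy of every field
}

// Hypothetical helper: receives a Student by const reference,
// so the struct is neither copied nor modified.
bool isPassing(const Student& s)
{
    return s.average >= 60.0; // the 60.0 cutoff is an assumption, not a rule from the text
}
```

With these in place, a line such as Student myStudent = makeStudent("Dave", "VT", "ECE 1574", "B+", 88.65); fills in all five fields at once, and isPassing(myStudent) can then be used in an if statement.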

Arrays and Structs

You may recall from the last chapter that one of the limitations of an array is that it can only store a single type of data. For example, if we needed to create a grade book to store all of the data for the student above in arrays, we would need 5 separate arrays. Each array would store one field of the student, and the same index in each array would refer to the same student. While this is possible, solutions that use this sort of parallel array end up being complicated. Say, for example, we decided to add a new field of data, a final exam grade. We would need to create a new array to store those grades, and then we would need to update every function that works with the arrays. Now pretend we decided to store more than just the final exam grade, but rather all of the grades the student earned. This could end up creating many more arrays. This can very quickly get out of hand.

A better solution is to create the struct with all of the required fields and then make a single array of that struct. Later, if a new field needs to be added to the struct, the single array will still be sufficient since it's an array of the struct. The array simply now has a new field for storage. To create an array of the Student struct we would create an array just like before, but use the name of the struct.

Student myStudents[10];

This would create an array of the Student struct and give us memory to store up to 10 students' information.

An example at this point might help give a fuller picture of using structs. For this example we will use the Student struct from above and create a small program that will allow us to read in student information from a CSV file, store the data in an array of structs and then display the data to the screen.

#include <string>
#include <fstream>
#include <iostream>

using std::string;
using std::ifstream;
using std::cout;

struct Student
{
    string name;
    string school;
    string course;
    string grade;
    double average; 
};

const int SIZE = 10;//this is the maximum size for our array

int main()
{
    ifstream in("studentData.txt");
    Student tempStudent; //I'll use this for reading.

    Student myStudents[SIZE]; //this is the array of students
    int count = 0; //this is the variable for keeping track of how many students I actually have.

    getline(in, tempStudent.name, ',');//priming read...I'll read just the name
    while( !in.fail() )
    {
        getline(in, tempStudent.school, ',');//read the rest of the student
        getline(in, tempStudent.course, ',');
        getline(in, tempStudent.grade, ',');
        in >> tempStudent.average;
        in.ignore( 200, '\n');//ignore the newline after the grade.

        if ( count < SIZE)//make sure I have space to store the temp student
        {
            myStudents[count] = tempStudent;//use the whole struct assignment to copy the temp student into the array
            count++;//I've got 1 more student
        }
        getline(in, tempStudent.name, ','); //reprime the next first name.
    }
    in.close();//close our file

    for ( int i=0; i<count; i++)
    {
        cout << myStudents[i].name << "\t" << myStudents[i].school << "\t" << myStudents[i].course << "\t" 
        << myStudents[i].grade << "\t" << myStudents[i].average << "\n";
    }
}

Program 8.1

If the studentData.txt file contains this information

Dave,VT,ECE 1574,B+,88.65
Kelly,VT,ECE 2574,A,95
Matt,VT,ECE 3574,C+,79.3
Molly,VT,ECE 2524,A,99

Then these are the results when I run Program 8.1 on my computer

Dave    VT  ECE 1574    B+  88.65
Kelly   VT  ECE 2574    A   95
Matt    VT  ECE 3574    C+  79.3
Molly   VT  ECE 2524    A   99

This program essentially takes a comma-separated file and creates a tab-delimited version. While this might be a simple example, we could easily use this same idea to create a more functional program. For example, instead of storing the average, we could store the individual grades and then compute the average and letter grade.

Functions and Structs

Another improvement we could make to this example would be the use of functions. We could create our program like this:

#include <string>
#include <fstream>
#include <iostream>

using std::string;
using std::ifstream;
using std::cout;
using std::ostream;
using std::istream;

struct Student
{
    string name;
    string school;
    string course;
    string grade;
    double average; 
};

const int SIZE = 10;//this is the maximum size for our array
void readData( istream& in, Student myStudents[], int &count);
void displayData( ostream& out, const Student myStudents[], int count);
int main()
{
    ifstream in("studentData.txt");

    Student myStudents[SIZE]; //this is the array of students
    int count = 0; //this is the variable for keeping track of how many students I actually have.

    readData( in, myStudents, count);
    displayData( cout, myStudents, count);

    in.close();//close our file
}

void readData( istream& in, Student myStudents[], int &count)
{
    Student tempStudent; //I'll use this for reading.
    count = 0;//make sure before we start that the count is 0
    getline(in, tempStudent.name, ',');//priming read...I'll read just the name
    while( !in.fail() )
    {
        getline(in, tempStudent.school, ',');//read the rest of the student
        getline(in, tempStudent.course, ',');
        getline(in, tempStudent.grade, ',');
        in >> tempStudent.average;
        in.ignore( 200, '\n');//ignore the newline after the grade.

        if ( count < SIZE)//make sure I have space to store the temp student
        {
            myStudents[count] = tempStudent;//use the whole struct assignment to copy the temp student into the array
            count++;//I've got 1 more student
        }
        getline(in, tempStudent.name, ','); //reprime the next first name.
    }
}

void displayData( ostream& out, const Student myStudents[], int count)
{
    if ( count < 0 ) //check the count, if it's bad, i.e. negative, set it to 0
        count = 0;
    else if ( count > SIZE )//another check that if it's too big, i.e. > SIZE, set it equal to SIZE
        count = SIZE;
    for ( int i=0; i<count; i++)
    {
        out << myStudents[i].name << "\t" << myStudents[i].school << "\t" << myStudents[i].course << "\t" 
        << myStudents[i].grade << "\t" << myStudents[i].average << "\n";
    }
}

Program 8.2

In Program 8.2 I would argue the main function is much simpler. It simply opens the file and calls two functions. All of the work gets done in the functions. There are many benefits to writing code this way:

If the structure changes, we can simply make the changes in the functions.
Our main program is very short and I would argue easier to understand.
We can add functionality via functions to do almost anything, e.g. sorting, computing averages, etc.

We do this by creating functions that use both our struct and an array. Using this pattern allows us to be very creative in how we can organize our data and build up our program.
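As a sketch of that last benefit, here are two functions that could be added alongside readData and displayData without changing the struct. The names classAverage and sortByAverage are hypothetical, and the sort is a simple selection sort chosen for readability, not speed.

```cpp
#include <string>

using std::string;

struct Student
{
    string name;
    string school;
    string course;
    string grade;
    double average;
};

// Hypothetical helper: returns the mean of the stored averages,
// or 0 when there are no students.
double classAverage(const Student myStudents[], int count)
{
    if (count <= 0)
        return 0.0;

    double total = 0.0;
    for (int i = 0; i < count; i++)
        total += myStudents[i].average;
    return total / count;
}

// Hypothetical helper: selection sort, highest average first.
// Whole struct assignment moves every field of a student at once.
void sortByAverage(Student myStudents[], int count)
{
    for (int i = 0; i < count - 1; i++)
    {
        int best = i; // index of the highest average seen so far
        for (int j = i + 1; j < count; j++)
        {
            if (myStudents[j].average > myStudents[best].average)
                best = j;
        }
        if (best != i)
        {
            Student temp = myStudents[i];
            myStudents[i] = myStudents[best];
            myStudents[best] = temp;
        }
    }
}
```

In Program 8.2 these could be called between readData and displayData, for example sortByAverage(myStudents, count); so that the students are displayed from the highest average to the lowest.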

More Structs

We saw in our Student struct that we can put doubles and strings in our struct, but what are we limited to putting inside of our struct? Almost nothing is off limits. It's even possible to put a struct within our struct. I don't normally do that, but it is very common to put an array in our struct. A few times I mentioned the idea of storing the individual grades instead of the computed average. Let's say we have the following categories of assignments:

Homework/Labs
Attendance
Projects
Midterm exam
Final exam

We could then modify our struct to look something like this:

struct Student
{
    string name;
    double homeworkGrades[25];
    int homeworkGradeCount;
    double projectGrades[5];
    int projectGradeCount;
    double attendanceGrades[15];
    int attendanceGradeCount;
    double midtermExam;
    double finalExam;
};

Adding the arrays and their associated counts to the struct would allow us to store the individual grades, and if we knew the weighting for each category we could compute the final average.

We can continue to create structs to be as complex as we need to allow us to store the data that is needed to solve our problems.

Limitations of Structs

Structs are great and they allow us to bundle together related data. They have one major drawback. Structs are completely public by default. That means that any code can directly access the data that is stored within the struct. While this may sound like a good thing, it also means that the code must "know" how the data is stored within the struct.

Imagine you are writing code that is security sensitive. Allowing any outside direct access to your data would not be a good idea. It would be possible to modify sensitive data or retrieve it and store it outside of your code. This obviously isn't a good thing.
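To make the problem concrete, here is a small sketch of outside code changing a student's record. The function name tamperWith is hypothetical; the point is only that the compiler accepts every line of it.

```cpp
#include <string>

using std::string;

struct Student
{
    string name;
    string school;
    string course;
    string grade;
    double average;
};

// Hypothetical outside code: nothing in the struct stops it from
// reading any field or from storing values that make no sense.
void tamperWith(Student& s)
{
    string stolenName = s.name; // any field can be read and kept elsewhere
    s.average = -500.0;         // compiles, even though no real average is negative
    s.grade   = "Z";            // compiles, even though "Z" is not a letter grade
    s.name    = stolenName + " (edited)";
}
```

The struct has no way to refuse the negative average or the made-up letter grade, because every field is open to any code that can reach the variable.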

The second problem is more subtle. It might seem okay if you don't have sensitive information, so another program being able to access your data might be fine. But now imagine you are using a struct and you have developed a complete suite of software that uses the struct. Your software relies on how the data is stored and on the names of the variables. Now imagine version 2 of the struct is released and all of the internal representations have changed. Now you have the new job of rewriting your software to use the new version of the struct.

Summary

Structs are useful: they let us bundle related data together and work with it as a unit. I feel they are a good halfway step to true user-defined data types in C++. We will see in the next chapter how a class in C++ gives us a user-defined data type and how it closes the loop on structs, allowing for software that is more secure and more resilient to changes in other components.