Due date: March 1, 2024 11:59pm
Introduction
Last semester in COMP 1701, you wrote a program to process a file containing applicants to Canada’s Federal Skilled Worker Program. The program read the file, calculated a score for each applicant, displayed the number of applicants qualifying for the next step, and wrote a new file summarizing the qualifying applicants.
In this assignment, you’ll be doing it all over again, but in C++. You’ll also be passing the file names as command line arguments rather than input prompts to take advantage of tab completion.
For those of you who did not do this assignment in COMP 1701, there will be a bit more mental overhead in understanding the problem, but a Python solution is provided for you to study.
Objectives
- Develop a comprehensive program in C++ that reads from and writes to files
- Learn how to use command line arguments in C++
- Learn how to work with structures, C-strings, and arrays in C++
Note: This assignment is not to be completed using std::vector, std::string, or other abstract data types. The goal is to learn about C-style arrays/structures/C-strings and the associated memory management.
The Program
Upon completion, your program should behave as follows:
$ ./a2 data/input/dataset-full.txt data/output/qualified_applicants.txt
There were 1608 qualified applicants.
The program should read from the file data/input/dataset-full.txt and write to the file data/output/qualified_applicants.txt. Sample input and output files are provided in the data directory. The output format should be identical to the Python solution provided, i.e.:
First Name | Last Name | Age | Score
--------------------+--------------------+-----+------
Aaron Grimm 30 86
Aaron Koressel 33 86
Aaron Russo 30 86
Abigail Kelley 57 67
Abraham Harper 58 85
Abraham Howard 26 82
Ada Parker 26 72
Ada Walker 61 70
Note: depending on your instructor, the output format may differ somewhat from your COMP 1701 solution.
The Data
The input file has a header line followed by lines (records) of data. There are 19 fields (columns), described as follows:
Col# | Title | Description |
---|---|---|
0 | first_name | The first name of the applicant. |
1 | last_name | The last name of the applicant. |
2 | age | The age of the applicant. |
3 | marital_status | The marital status of the applicant (connected to the spouse fields in columns 12, 13, and 14). |
4 | speak_1 | The applicant’s primary language speaking Canadian Language Benchmark (CLB). |
5 | listen_1 | The applicant’s primary language listening Canadian Language Benchmark (CLB). |
6 | read_1 | The applicant’s primary language reading Canadian Language Benchmark (CLB). |
7 | write_1 | The applicant’s primary language writing Canadian Language Benchmark (CLB). |
8 | all_2 | ‘yes’ or ‘no’. ‘yes’ indicates that the applicant has a CLB of at least 5 in all four skills of their second official language. |
9 | education | Text representing the education that the applicant has received. The text will come from the Education entry in the Scoring System section. |
10 | work_experience | The number of years of relevant work experience. |
11 | arranged_employment | ‘yes’ indicating that the applicant has acceptable work arranged; ‘no’ otherwise. |
12 | adaptability_spouse_language | ‘yes’ or ‘no’ value representing whether the applicant’s spouse has an acceptable language score. |
13 | adaptability_spouse_education | ‘yes’ or ‘no’ value representing whether the applicant’s spouse has relevant educational qualifications. |
14 | adaptability_spouse_work | ‘yes’ or ‘no’ value representing whether the applicant’s spouse has relevant work experience. |
15 | adaptability_you_education | ‘yes’ or ‘no’ value representing whether the applicant has relevant education. |
16 | adaptability_you_work | ‘yes’ or ‘no’ value representing whether the applicant has relevant work experience. |
17 | adaptability_you_employment | ‘yes’ or ‘no’ value representing whether the applicant has arranged employment. |
18 | adaptability_relatives | ‘yes’ or ‘no’ value representing whether the applicant (or their spouse/partner) has family in Canada that qualifies. |
The Scoring System
An applicant’s score is computed by scoring each category and summing the results.
Language Skills (max 28 points)
First official language points (max 24 points)
CLB Level | Speaking | Listening | Reading | Writing |
---|---|---|---|---|
CLB level 9 or higher | 6 | 6 | 6 | 6 |
CLB level 8 | 5 | 5 | 5 | 5 |
CLB level 7 | 4 | 4 | 4 | 4 |
Second official language (max 4 points)
You get 4 points if and only if you have a score of at least CLB 5 in each of the 4 language abilities. 0 points otherwise.
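The two language tables above could be turned into code along these lines. This is only a sketch: the function names first_language_points and language_points are made up for illustration, and it assumes that a CLB level below 7 earns 0 points because the table has no row for it.

```cpp
#include <cassert>

// Hypothetical helper: points for ONE first-language skill.
// Assumption: CLB levels below 7 earn 0 points (the table lists no row for them).
int first_language_points(int clb) {
    if (clb >= 9) return 6;
    if (clb == 8) return 5;
    if (clb == 7) return 4;
    return 0;
}

// Hypothetical helper: total language score (max 28).
// skills holds the four CLB values (speak, listen, read, write);
// all_2 is true when the second-language field was 'yes'.
int language_points(const int skills[4], bool all_2) {
    assert(skills != nullptr);
    int total = 0;
    for (int i = 0; i < 4; i++) {
        total += first_language_points(skills[i]);
    }
    if (all_2) {
        total += 4;  // second official language bonus
    }
    return total;
}
```

Keeping the per-skill lookup separate from the loop means the same small function handles all four entries of the language array.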
Education (max 25 points)
Level of Education for Express Entry Profile | Federal Skilled Workers Program Factor Points |
---|---|
Secondary school (high school diploma) | 5 |
One-year degree, diploma or certificate | 15 |
Two-year degree, diploma or certificate | 19 |
Bachelor’s degree or other programs (three or more years) | 21 |
Two or more certificates, diplomas, or degrees | 22 |
Professional degree needed to practice in a licensed profession | 23 |
University degree at the Master’s level | 23 |
University degree at the Doctoral (PhD) level | 25 |
Caution: If you copy/paste the above strings, you may need to re-type the apostrophes in “Bachelor’s” and “Master’s”, as the web browser may render them as a different character from the one used in the data files.
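Since the education field is text, scoring it means comparing C-strings with strcmp rather than ==. The sketch below shows one possible table-driven lookup. The names EducationRow, EDUCATION_TABLE, and education_points are hypothetical, the strings are typed with plain ASCII apostrophes on the assumption that the data files use them, and returning 0 for unrecognized text is also an assumption.

```cpp
#include <cassert>
#include <cstring>

// One row of the education table: the exact text and its points.
struct EducationRow {
    const char *text;
    int points;
};

// Hypothetical lookup table, typed with plain ASCII apostrophes.
const EducationRow EDUCATION_TABLE[] = {
    {"Secondary school (high school diploma)", 5},
    {"One-year degree, diploma or certificate", 15},
    {"Two-year degree, diploma or certificate", 19},
    {"Bachelor's degree or other programs (three or more years)", 21},
    {"Two or more certificates, diplomas, or degrees", 22},
    {"Professional degree needed to practice in a licensed profession", 23},
    {"University degree at the Master's level", 23},
    {"University degree at the Doctoral (PhD) level", 25},
};
const int EDUCATION_ROWS = sizeof(EDUCATION_TABLE) / sizeof(EDUCATION_TABLE[0]);

// Hypothetical helper: returns the points for an education string.
// Assumption: text that matches no row earns 0 points.
int education_points(const char *education) {
    assert(education != nullptr);
    for (int i = 0; i < EDUCATION_ROWS; i++) {
        // strcmp returns 0 when the two C-strings are identical
        if (std::strcmp(education, EDUCATION_TABLE[i].text) == 0) {
            return EDUCATION_TABLE[i].points;
        }
    }
    return 0;
}
```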
Work Experience (max 15 points)
Experience | Points |
---|---|
Under 1 year | 0 |
1 year | 9 |
2-3 years | 11 |
4-5 years | 13 |
6 or more years | 15 |
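The work experience table maps naturally onto a chain of comparisons checked from the top band down. A minimal sketch, with the hypothetical name work_experience_points and assuming the field holds a whole number of years:

```cpp
#include <cassert>

// Hypothetical helper: points for years of relevant work experience.
int work_experience_points(int years) {
    assert(years >= 0);  // a negative value would point to a parsing bug
    if (years >= 6) return 15;
    if (years >= 4) return 13;
    if (years >= 2) return 11;
    if (years >= 1) return 9;
    return 0;  // under 1 year
}
```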
Age (max 12 points)
Age | Points |
---|---|
Under 18 | 0 |
18-35 | 12 |
36 | 11 |
37 | 10 |
38 | 9 |
39 | 8 |
40 | 7 |
41 | 6 |
42 | 5 |
43 | 4 |
44 | 3 |
45 | 2 |
46 | 1 |
47 or older | 0 |
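Between 36 and 46 the points drop by one per year, so that part of the table can be computed instead of listed. One possible sketch, using the hypothetical name age_points:

```cpp
#include <cassert>

// Hypothetical helper: points for the applicant's age.
int age_points(int age) {
    assert(age >= 0);  // a negative age would point to a parsing bug
    if (age < 18 || age >= 47) return 0;
    if (age <= 35) return 12;
    return 47 - age;  // 36 gives 11, 37 gives 10, and so on down to 46 giving 1
}
```

A chain of if statements with one branch per row would work just as well; the arithmetic form is simply shorter.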
Arranged Employment (max 10 points)
An applicant receives 10 points if the arranged_employment field is ‘yes’ (the applicant has arranged suitable employment), and 0 points otherwise.
Adaptability (max 10 points)
Adaptability | Description | Points |
---|---|---|
Your spouse or partner’s language level | ‘yes’ if your spouse has achieved the minimum standard; ‘no’ otherwise | 5 |
Your spouse or partner’s past studies in Canada | ‘yes’ if your spouse completed at least 2 years of full-time study; ‘no’ otherwise | 5 |
Your spouse or common-law partner’s past work in Canada | ‘yes’ if your partner did at least 1 year of full-time work in Canada; ‘no’ otherwise | 5 |
Your past studies in Canada | ‘yes’ if you have completed at least 2 academic years of full-time study; ‘no’ otherwise | 5 |
Your past work in Canada | ‘yes’ if you did at least 1 year of full-time work in Canada; ‘no’ otherwise | 10 |
You have arranged employment in Canada | ‘yes’ if you have arranged employment in Canada; ‘no’ otherwise | 5 |
Relatives in Canada | ‘yes’ if you, your spouse or common-law partner have a qualifying relative; ‘no’ otherwise | 5 |
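The individual adaptability items add up to more than 10, so the sketch below reads the "max 10 points" in the heading as a cap on the category total. That reading is an assumption, as is the function name adaptability_points. The six 5-point items are passed as one array and the 10-point "past work" item separately.

```cpp
#include <cassert>

// Hypothetical helper: total adaptability points.
// five_point_items holds the six yes/no items worth 5 points each;
// you_work is the "Your past work in Canada" item worth 10 points.
// Assumption: the category total is capped at 10.
int adaptability_points(const bool five_point_items[6], bool you_work) {
    assert(five_point_items != nullptr);
    int total = 0;
    for (int i = 0; i < 6; i++) {
        if (five_point_items[i]) {
            total += 5;
        }
    }
    if (you_work) {
        total += 10;
    }
    return (total > 10) ? 10 : total;
}
```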
The Starter Code
cd to your 1633 directory (or wherever you keep your files for this course), then clone the starter code repository from /library/students/comp1633/a2.git using the following command:
$ git clone /library/students/comp1633/a2.git
This will create a directory called a2 in your current directory. cd into a2 and configure your assignment dropbox using the git-asg-config command as described in lab 2. Select option 2 to configure for assignment 2.
Inside this directory you will find the following structure:
$ tree
.
├── applicant.cpp
├── applicant.h
├── data
│   ├── input
│   │   ├── dataset-100.txt
│   │   ├── dataset-10.txt
│   │   └── full-dataset.txt
│   └── output
│       └── qualified_applicants-10.txt
├── main.cpp
├── makefile
├── python_solution
│   ├── applicant.py
│   └── assign4.py
├── score.cpp
├── score.h
└── testing
    ├── makefile
    └── test_a2.cpp
As with assignment 1, the header files are populated, but the .cpp files are mostly empty.
applicant.h
The applicant.h file contains the definition of the Applicant structure. This structure is used to store the data for each applicant. The structure is defined as follows:
struct Applicant {
    char first_name[BUFSIZE];
    char last_name[BUFSIZE];
    int age;
    char marital_status[BUFSIZE];
    int language[4];
    bool all_2;
    char education[BUFSIZE];
    int work_experience;
    bool arranged_employment;
    bool adaptability[6];
    bool adaptability_you_work;
    int score;
};
Again, depending on your instructor from COMP 1701, this may be similar to the provided Python Applicant class, or you may have used lists of lists.
Two functions are declared in applicant.h:
void read(std::istream &in, Applicant &a);
void write(const Applicant &a, std::ostream &out);
These functions should be implemented in applicant.cpp in order to read and write Applicant objects. Note that the data types for in and out are std::istream and std::ostream respectively. These are the generic types for C++ input and output streams, which can be either a file or std::cin/std::cout.
The read and write functions should read/write a single Applicant, regardless of how many records are in the given stream. This means that the read/write functions need to be called repeatedly in order to read/write multiple records.
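One way to picture the repeated calls is the loop sketched below. To keep it short it uses a stand-in read_one that reads a single int rather than an Applicant, and the names read_one, read_all, and MAX_ITEMS are all hypothetical. The same pattern should carry over to an array of Applicant: call the single-record function, check the stream, and only then count the record.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

const int MAX_ITEMS = 5;  // hypothetical capacity for this demo

// Stand-in for read(): reads exactly one "record" (here, one int).
void read_one(std::istream &in, int &value) {
    in >> value;
}

// Fills items[] one record at a time until the stream fails or the
// array is full, and returns how many records were stored.
int read_all(std::istream &in, int items[], int capacity) {
    assert(items != nullptr && capacity >= 0);
    int count = 0;
    while (count < capacity) {
        read_one(in, items[count]);
        if (!in) {
            break;  // the last read failed, so do not count it
        }
        count++;
    }
    return count;
}
```

Because read_all takes a std::istream, it can be handed a file stream, std::cin, or (for quick experiments only) a std::istringstream.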
read tips
The trickiest part about reading the data is dealing with the different data types and whitespace in the input file. Here are some tips to help you get started:
- Each field in the input file is separated by the tab ('\t') character, and each record is separated by the newline ('\n') character. This means that the last field in each record will be followed by a newline, not a tab.
- You can use in.getline to read a string with whitespace into a C-string, and the delimiter will be consumed and discarded. For example, in.getline(a.first_name, BUFSIZE, '\t') will read a string into a.first_name until it encounters a tab character, and then discard the tab character.
- You can use in >> variable to read a single word/number into a variable. When >> is used, any leading whitespace is ignored.
- To skip over any whitespace in the buffer, you can use in >> ws, or in.ignore(n) to skip over n characters.
- The language and adaptability fields are arrays that group together multiple fields that all get the same scoring treatment. Note that there are actually 7 adaptability fields: adaptability_you_work is handled separately as it is worth 10 points, not 5.
- Finally, there are quite a few yes/no fields that should be converted to bool values. You might want to define a helper function to handle this conversion.
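The tips above can be tried out on a cut-down record before tackling all 19 fields. In the sketch below, MiniRecord, read_mini, and yes_no_to_bool are hypothetical names, the record has only three fields (a name, a number, and a yes/no value), and the BUFSIZE value of 64 is an assumption since the real one is set in applicant.h.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

const int BUFSIZE = 64;  // assumed value; the real one is defined in applicant.h

// Hypothetical helper: converts a yes/no C-string to a bool.
// Anything other than "yes" is treated as false.
bool yes_no_to_bool(const char *text) {
    assert(text != nullptr);
    return std::strcmp(text, "yes") == 0;
}

// Hypothetical cut-down record with only three fields.
struct MiniRecord {
    char first_name[BUFSIZE];
    int age;
    bool all_2;
};

// Reads one tab-separated, newline-terminated MiniRecord.
void read_mini(std::istream &in, MiniRecord &r) {
    char word[BUFSIZE];
    in.getline(r.first_name, BUFSIZE, '\t');  // text up to the tab; the tab is discarded
    in >> r.age;                              // stops just before the next tab
    in.ignore(1);                             // skip that tab
    in.getline(word, BUFSIZE, '\n');          // last field ends with a newline
    r.all_2 = yes_no_to_bool(word);
}
```

Note that a name such as "Mary Ann" survives intact because getline stops only at the tab, whereas in >> would stop at the space.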
write tips
This one’s more straightforward, and should just involve some out << ... statements along with some formatting. You can either include the newline here, or write it separately when you call it.
The syntax to write to a specific field width is out << setw(n) where n is the width of the field. Combine this with out << left or out << right to left- or right-align the text within the field.
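Here is a small sketch of those manipulators in action. The function name write_row and the widths 12 and 5 are made up for illustration and are not the widths required by the output format; check the sample output for those.

```cpp
#include <cassert>
#include <iomanip>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes a left-aligned name in a 12-character
// field followed by a right-aligned number in a 5-character field.
void write_row(std::ostream &out, const char *name, int age) {
    assert(name != nullptr);
    out << std::left << std::setw(12) << name
        << std::right << std::setw(5) << age << '\n';
}
```

Keep in mind that setw applies only to the next item written, while left and right stay in effect until changed, which is why each field sets its own width.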
score.h
The score.h file contains the declaration of the score function, which takes an Applicant object and updates the int score field of the applicant object according to the criteria described above. score should be implemented in score.cpp.
You’re welcome to implement any other helper functions you need in score.cpp to help with the scoring process, but only functions that are called from outside score.cpp should be declared in score.h.
main.cpp
The main.cpp file contains the main function, of course. main should handle the following:
- Make sure there are exactly 3 command line arguments (the program name plus the two file names), and print a usage message if not:
  $ ./a2
  Usage: ./a2 input_file output_file
- Open the input and output files for reading/writing, respectively. If an error occurs, print an error message, e.g.:
  $ ./a2 data/input/does-not-exist.txt data/output/qualified_applicants.txt
  Error: could not open data/input/does-not-exist.txt for reading
  Note: an error in a file stream can be checked by simply using the stream as a boolean in an if statement, e.g. if (!in) { ... }.
- Declare an array of Applicant objects, and read the data from the input file into the array. The array should be at least big enough for the full dataset, which has over 2000 records.
- Score each Applicant in the array using the score function.
- Write the qualified Applicant records to the output file, and print the number of qualified applicants to the console.
- Don’t forget to close the file streams when you’re done with them!
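The argument check and file opening steps might look roughly like the sketch below. It places them in a hypothetical function run so that main can simply return run(argc, argv). The return value of 1 on error and the wording of the "for writing" message are assumptions modelled on the examples above.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>

// Hypothetical function holding the start-up checks.
// Returns 0 on success and 1 on any error (an assumed convention).
int run(int argc, char *argv[]) {
    assert(argc >= 1 && argv != nullptr);  // argv[0] is the program name

    if (argc != 3) {  // program name + input file + output file
        std::cout << "Usage: " << argv[0] << " input_file output_file\n";
        return 1;
    }

    std::ifstream in(argv[1]);
    if (!in) {  // a stream used as a boolean is false after a failure
        std::cout << "Error: could not open " << argv[1] << " for reading\n";
        return 1;
    }

    std::ofstream out(argv[2]);
    if (!out) {
        std::cout << "Error: could not open " << argv[2] << " for writing\n";
        in.close();
        return 1;
    }

    // ... read, score, and write the records here ...

    in.close();
    out.close();
    return 0;
}
```

Checking each stream immediately after opening it keeps the error message specific to the file that actually failed.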
You may implement helper functions for the above tasks, but main.cpp should not implement the functionality that belongs in applicant.cpp or score.cpp.
Testing and development tips
Test early, test often! Do not try to write the entire program at once - start with each piece and test as you go.
As with labs and the previous assignment, the starter code includes a test program that you can use to test some of the functionality. To run the tests, cd into the testing directory and run make. This will compile the test suite, at which point it can be run with ./test_a2.
Tip: The test code tests the read, write, and score functions. To test just one at a time, define a function stub for the other functions so that the test code compiles. For example, if you want to test read, you can define a stub for score in score.cpp that simply returns a constant.
Incremental development
A portion of the marks for this assignment is for “evidence of incremental development”. This means I am looking for several meaningful commits to your git repository rather than just adding your solution all in one go. This is ultimately a good habit to get into! I recommend committing your changes whenever you:
- Add a new function
- Solve a problem you’ve been trying to fix
- Stop working on your code to go do something else
- Or any other time it feels right
You may add, commit, and push your code as many times as you like up until the assignment deadline. I will only mark the final version, but I will look at your commit history to see how you developed your solution.
Marking scheme
This assignment is worth 8% of your final grade and roughly divided as:
- 40% for program functionality (does it behave as specified?)
- 40% for program implementation (is it written as specified?)
- 10% for style and documentation (is it readable and appropriately commented/cited?)
- 10% for incremental development (did you commit your changes regularly with descriptive commit messages?)
Refer to the style guide for a reference on style and documentation. If you use external resources such as Stack Overflow, ChatGPT, or a friend in the class, make sure to cite them in your comments. Failure to cite external resources will be considered plagiarism. An example of a citation is as follows:
// ChatGPT helped me with this function
void foo(int bar) {
    // ...
}
If your solution uses std::string, vectors, or other data structures or techniques not covered in class, you will receive a reduced grade, possibly as low as 0. If you have previous experience, try to challenge yourself to solve this problem using only the basics.
In addition, there will be an automatic 20% deduction if your code fails to compile or run. If the problem is extreme and I cannot fix it with a small change, a grade of 0 may be assigned. Make sure your code compiles and runs on INS with the given makefile before submitting.