Assignment 1C
CAB420, Machine Learning
This document sets out the two (2) questions you are to complete for CAB420 Assignment
1C. The assignment is worth 12% of the overall subject grade. All questions are weighted
equally. Students are to work individually. Students should submit their answers in a single
document (either a PDF or Word document) and upload it to TurnItIn.
Further Instructions:
1. Data required for this assessment is available on Blackboard alongside this document
in CAB420 Assessment 1C Data.zip. Please refer to individual questions regarding
which data to use for which question.
2. Answers should be submitted via the TurnItIn submission system, linked to on
Blackboard. In the event that TurnItIn is down, or you are unable to submit via TurnItIn,
please email your responses to cab420query@qut.edu.au.
3. For each question, a concise written response (approximately 2-3 pages) is expected.
This response should explain and justify the approach taken to address the question
(including, if relevant, why the approach was selected over other possible methods),
and include results, relevant figures, and analysis. Python notebooks or similar
materials will not, on their own, constitute a valid response to a question
and will score a mark of 0.
4. Python code, including live scripts or notebooks (or equivalent materials for other
languages) may optionally be included as appendices. Figures and outputs/results
that are critical to question answers should be included in the main question
response, and not appear only in an appendix.
5. Students who require an extension should lodge their extension application with HiQ
(see http://external-apps.qut.edu.au/studentservices/concession/). Please
note that teaching staff (including the unit coordinator) cannot grant extensions.
Problem 1. Clustering and Recommendations. Recommendation engines are typically
built around clustering, i.e. finding a group of people similar to a person of interest and
making recommendations for the target person based on the responses of other subjects
within the identified cluster.
You have been provided with a copy of the MovieLens small dataset¹, which contains movie
review data for 600 subjects. The data is contained in the Q1 directory within the data
archive, and is split over several files as follows:
ratings.csv: Contains the movie ratings, and consists of a user ID, a movie ID, a
rating (out of 5), and a timestamp.
movies.csv: A list of all movie IDs, alongside the movie titles and a list of genres.
tags.csv: A list of tags applied to movies by users. Each entry consists of a user ID,
a movie ID, the text tag, and a timestamp.
links.csv: Contains IDs to link the MovieLens dataset to IMDB and TMDB.
It is recommended that you do not use the tags.csv and links.csv files, though they are
contained here for completeness and you may choose to use them if you wish.
You have been provided with data loading functions for ratings.csv and movies.csv that
will:
Compute the average rating of each movie;
Reformat the list of genres in movies.csv to a set of columns, one per genre, where a
value of 1 indicates that the movie belongs to that genre and a value of NaN indicates
that it does not;
Merge the ratings.csv and movies.csv tables, obtaining a table that provides
detailed information on each movie a user has reviewed;
Create a combined table that computes the average rating each user has reported for
movies belonging to each genre.
Note that each movie can belong to multiple genres.
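The supplied loading functions are not reproduced here. As a rough illustration of the steps listed above, a pandas sketch might look like the following. The toy tables stand in for ratings.csv and movies.csv, the column names (userId, movieId, rating, genres with pipe-separated values) are assumed from the public MovieLens format, and build_user_genre_table is a hypothetical name rather than one of the provided functions.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for ratings.csv and movies.csv (column names are assumed).
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3],
    "movieId": [10, 20, 10, 30, 20],
    "rating":  [4.0, 2.0, 5.0, 3.0, 1.0],
})
movies = pd.DataFrame({
    "movieId": [10, 20, 30],
    "title":   ["A", "B", "C"],
    "genres":  ["Action|Comedy", "Drama", "Comedy"],
})

def build_user_genre_table(ratings, movies):
    """Hypothetical helper: average rating per user per genre."""
    # One column per genre: 1 if the movie has the genre, NaN otherwise.
    flags = movies["genres"].str.get_dummies(sep="|").astype(float)
    flags = flags.where(flags == 1)
    genre_cols = list(flags.columns)
    movies_wide = pd.concat([movies[["movieId", "title"]], flags], axis=1)
    # One row per (user, movie) review, with the movie's genre flags attached.
    merged = ratings.merge(movies_wide, on="movieId", how="left")
    # Scale each flag by the rating so genres the movie lacks stay NaN,
    # then average per user; mean() skips NaN by default.
    rated = merged[genre_cols].mul(merged["rating"], axis=0)
    return rated.groupby(merged["userId"]).mean()

user_genre = build_user_genre_table(ratings, movies)
movie_mean = ratings.groupby("movieId")["rating"].mean()  # average rating per movie
print(user_genre)
```

A movie tagged with two genres contributes its rating to both genre columns, which is why user 1 in the toy data ends up with the same average for Action and Comedy.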
Your Task: Using the provided data and (optionally) the code described above, you are
to develop a method to cluster users based on their movie viewing preferences. Having
developed this, provide recommendations for the users with the IDs 4, 42, and 314.
A suggested approach to solving this problem is to:
Cluster the combined table that contains the average rating each user has reported for
movies belonging to each genre. You will have to decide how to treat genres that
have an average rating of NaN (which indicates that the user has not watched any
movies from that genre), and select an appropriate clustering method and clustering
hyper-parameters.
¹ https://grouplens.org/datasets/movielens/
Identify the clusters that contain the target users, 4, 42, and 314.
Within the clusters that contain the target users, find the most popular movies that
the target users have not already seen.
Note that the above is simply a suggested approach, and you are welcome to select an
alternative method.
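One possible reading of the suggested approach is sketched below on toy data. Every modelling choice in it is an assumption made for illustration and would need to be justified (or replaced) in your response: NaN values are filled with the user's own mean rating, K-means with two clusters is used, and "most popular" is interpreted as highest mean rating among cluster peers. The recommend function and the toy tables are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Toy user-by-genre table of average ratings (NaN = genre never watched).
user_genre = pd.DataFrame(
    {"Action": [4.5, 4.0, 4.8, 1.5, 1.0],
     "Drama":  [2.0, 1.5, 1.0, 4.5, 5.0],
     "Horror": [np.nan, 3.0, np.nan, np.nan, 3.0]},
    index=pd.Index([4, 7, 314, 42, 99], name="userId"),
)
# Toy ratings table: which user rated which movie, and how highly.
ratings = pd.DataFrame({
    "userId":  [4, 7, 7, 314, 314, 42, 99, 99],
    "movieId": [1, 1, 2, 2, 3, 5, 5, 6],
    "rating":  [5.0, 4.0, 4.5, 5.0, 3.0, 4.0, 4.5, 5.0],
})

# Assumption: replace NaN with the user's own mean rating across genres.
filled = user_genre.apply(lambda row: row.fillna(row.mean()), axis=1)

# Assumption: K-means with k=2; both choices need justifying on the real data.
labels = pd.Series(
    KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(filled),
    index=filled.index,
)

def recommend(user_id, top_n=5, min_ratings=1):
    """Hypothetical helper: best-rated unseen movies within the user's cluster."""
    peers = labels.index[labels == labels.loc[user_id]]
    seen = set(ratings.loc[ratings["userId"] == user_id, "movieId"])
    pool = ratings[ratings["userId"].isin(peers) & ~ratings["movieId"].isin(seen)]
    stats = pool.groupby("movieId")["rating"].agg(["mean", "count"])
    stats = stats[stats["count"] >= min_ratings]
    return stats.sort_values(["mean", "count"], ascending=False).head(top_n)

print(recommend(4))
```

On the real data a larger min_ratings would usually be sensible, so that a movie rated highly by a single peer does not dominate the list.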
Your final response should include sections that address the following:
A description of and justification for your clustering method. This should include:
– Description and justification of the data that you chose to cluster;
– Justification for the selected clustering method;
– Justification for the selected clustering hyper-parameters.
A brief discussion and analysis of the results of the clustering, including interpretation
of the resultant clusters (i.e. are the clusters distinct, and do they capture different
viewer habits?).
Recommendations for the three users with IDs: 4, 42, and 314; and a short discussion
of these recommendations which includes:
– A brief description and justification for how recommendations were obtained;
– Whether the recommendations make sense given these users' viewing history and
previous ratings.
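When justifying hyper-parameters and discussing whether clusters are distinct, one commonly used piece of evidence is the silhouette score. The sketch below is an illustration only: it uses synthetic data in place of the user-by-genre table and assumes K-means, so the number it selects says nothing about the MovieLens data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for a NaN-free user-by-genre matrix: three tight blobs.
rng = np.random.default_rng(0)
centres = np.array([[1.0, 4.5], [4.5, 1.0], [3.0, 3.0]])
X = np.vstack([c + 0.2 * rng.standard_normal((30, 2)) for c in centres])

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Silhouette lies in [-1, 1]; higher means tighter, better separated clusters.
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

A single score is rarely sufficient justification on its own; inspecting the cluster centres to see which genres separate the groups gives a more interpretable account of viewer habits.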
Problem 2. Multi-Task Learning. Semantic person search is the task of matching a person
to a semantic query. For example, given the query ‘1.8m tall man wearing jeans and a red
shirt’, a semantic person search method should return images that feature people matching
that description. As such, a semantic search process needs to consider multiple traits. A
simple approach to enable this form of search is to use classification to determine the traits
present in an input image.
You have been provided with a dataset (see Q2/Q2.tar.gz) that contains the following
semantic annotations:
Gender: -1 (unknown), 0 (male), 1 (female)
Pose: -1 (unknown), 0 (front), 1 (back), 2 (45 degrees), 3 (90 degrees)
Torso Clothing Type: -1 (unknown), 0 (long), 1 (short)
Torso Clothing Colour: -1 (unknown), 0 (black), 1 (blue), 2 (brown), 3 (green), 4
(grey), 5 (orange), 6 (pink), 7 (purple), 8 (red), 9 (white), 10 (yellow)
Torso Clothing Texture: -1 (unknown), 0 (irregular), 1 (plaid), 2 (diagonal plaid), 3
(plain), 4 (spots), 5 (diagonal stripes), 6 (horizontal stripes), 7 (vertical stripes)
Leg Clothing Type: -1 (unknown), 0 (long), 1 (short)
Leg Clothing Colour: -1 (unknown), 0 (black), 1 (brown), 2 (blue), 3 (green), 4 (grey),
5 (orange), 6 (pink), 7 (purple), 8 (red), 9 (white), 10 (yellow)
Leg Clothing Texture: -1 (unknown), 0 (irregular), 1 (plaid), 2 (diagonal plaid), 3
(plain), 4 (spots), 5 (diagonal stripes), 6 (horizontal stripes), 7 (vertical stripes)
Luggage: -1 (unknown), 0 (yes), 1 (no)
The unknown class can be considered either a class in its own right (i.e. three classes of
gender), or can be considered as missing data. Note that three colours are annotated for each
of the torso clothing and the leg clothing, indicating the primary, secondary and tertiary colours.
One or both of the secondary and tertiary colours may be set to unknown (-1) to indicate
that there are only 1 or 2 colours in the garment.
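If the unknown class is treated as missing data, one option is to exclude those samples from the loss for the affected trait while still using them for the other traits. The NumPy sketch below illustrates the idea only; masked_cross_entropy is a hypothetical name, and in a Keras or TensorFlow pipeline the same effect would need to be achieved with that framework's own mechanisms.

```python
import numpy as np

def masked_cross_entropy(probs, labels, unknown=-1):
    """Mean cross-entropy over the samples whose label is known.

    probs:  (n_samples, n_classes) predicted class probabilities.
    labels: (n_samples,) integer class IDs, with `unknown` marking missing labels.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    rows = np.flatnonzero(labels != unknown)  # indices of samples with a label
    if rows.size == 0:
        return 0.0  # nothing to learn from for this trait
    picked = probs[rows, labels[rows]]  # probability given to the true class
    return float(-np.log(np.clip(picked, 1e-12, 1.0)).mean())

# Three samples; the third has an unknown gender label and is ignored.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
gender = np.array([0, 1, -1])
loss = masked_cross_entropy(probs, gender)
print(loss)
```

The alternative mentioned above, treating unknown as a class in its own right, would instead shift every label up by one and add an extra output to the classifier for that trait.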
In addition, the dataset contains semantic segmentation for each image in the training data,
that breaks the image down into the following regions:
Leg clothing
Shoes
Torso clothing
Luggage
Leg skin regions
Torso/arm skin regions
Facial skin regions
Hair
Semantic segmentation information is supplied as a set of masks, with an individual mask
for each component.
Your Task: Using this data you are to implement a multi-task deep learning approach
that, given an input image, classifies the traits:
Gender
Torso Clothing Type
Primary Torso Clothing Colour
Leg Clothing Type
Primary Leg Clothing Colour, and
Presence of Luggage.
Pose and the semantic segmentation data may optionally be used when developing your
approach (though remember that semantic segmentation data is only available for the training
set, so cannot be used as a model input). Additional traits (clothing texture, secondary and
tertiary torso and leg colours) should be ignored.
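A common shape for a multi-task classifier is a shared feature extractor followed by one output head per trait. The Keras sketch below is a minimal illustration, not a recommended design: the input size, layer widths and output names are assumptions, and the class counts are taken from the annotation list with the unknown class excluded, which presumes that samples labelled -1 are handled separately.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Classes per target trait, from the annotation list (unknown excluded).
TRAITS = {
    "gender": 2,
    "torso_type": 2,
    "torso_colour": 11,
    "leg_type": 2,
    "leg_colour": 11,
    "luggage": 2,
}

def build_model(input_shape=(100, 60, 3)):
    """Small shared CNN trunk with one softmax head per trait (sizes are illustrative)."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for filters in (16, 32, 64):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(64, activation="relu")(x)  # representation shared by all heads
    outputs = {
        name: layers.Dense(n_classes, activation="softmax", name=name)(x)
        for name, n_classes in TRAITS.items()
    }
    return tf.keras.Model(inputs, outputs)

model = build_model()
model.compile(
    optimizer="adam",
    loss={name: "sparse_categorical_crossentropy" for name in TRAITS},
    metrics={name: ["accuracy"] for name in TRAITS},
)
```

Because the heads share a trunk, each trait's loss influences the shared features. Note that the loss used here does not accept labels of -1, so unknown labels must be remapped or excluded before training.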
You have been provided with code to:
Load the images, labels and semantic masks ready for use with Keras and TensorFlow;
Demonstrate how to use a generator to augment an input image and produce multiple
outputs.
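The provided generator code is not reproduced here. To illustrate the general pattern only, the sketch below shows a plain Python generator that augments a batch of images and yields one label array per output. The function name, the dictionary layout and the choice of a horizontal flip are all assumptions and may differ from the supplied code.

```python
import numpy as np

def multi_output_generator(images, labels, batch_size=2, seed=0):
    """Hypothetical generator: yields (augmented batch, dict of label batches) forever.

    images: (n, height, width, channels) array.
    labels: dict mapping output name -> (n,) array of class IDs.
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        idx = rng.choice(n, size=batch_size, replace=False)
        batch = images[idx].copy()
        # Assumed augmentation: mirror each image left-right with probability 0.5.
        flip = rng.random(batch_size) < 0.5
        batch[flip] = batch[flip, :, ::-1, :]
        yield batch, {name: values[idx] for name, values in labels.items()}

# Toy data: four 2x3 single-channel images with two label sets.
images = np.arange(24, dtype=float).reshape(4, 2, 3, 1)
labels = {"gender": np.array([0, 1, 0, 1]), "luggage": np.array([1, 1, 0, 0])}
x, y = next(multi_output_generator(images, labels))
print(x.shape, y)
```

A left-right flip leaves the six target traits unchanged, so the labels can be passed through as they are; an augmentation that alters colour would not be safe for the colour traits.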
Your final response should include sections that address the following:
Any pre-processing that is performed on the data (cropping, resizing), any data
augmentation that is used, and how the missing data (i.e. instances of -1) are handled.
Note that you may wish to crop and/or resize data to reduce the computational
demands of your approach. This is completely acceptable, though the pre-processing
should be explained, and care should be taken to ensure that the images are not
resized to such an extent that traits become indistinguishable.
A description and justification for your approach. This should include justification for
the network design and training. If you choose to use a pre-trained neural network (or
part of a network) and fine-tune it, details and justification for this must be provided.
An evaluation of performance for each of the traits using the provided test set. The
evaluation should include an investigation of situations where the proposed solution
performs poorly, and a discussion of the implications of classifier performance
for the overall task: semantic search. Note that while this discussion should
consider the broader context of the semantic search task, you are not required or
expected to implement the semantic search task.
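As a starting point for the per-trait evaluation, the sketch below computes accuracy and a confusion matrix for a single trait while skipping test samples whose label is unknown. It is an illustration under assumptions: evaluate_trait is a hypothetical helper, and excluding -1 is only appropriate if unknown was treated as missing data. The final lines show why per-trait errors matter for search, under the simplifying assumption that errors are independent across traits.

```python
import numpy as np

def evaluate_trait(y_true, y_pred, n_classes, unknown=-1):
    """Accuracy and confusion matrix over the samples with a known label."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    known = y_true != unknown
    t, p = y_true[known], y_pred[known]
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(confusion, (t, p), 1)  # rows: true class, columns: predicted class
    accuracy = float((t == p).mean()) if known.any() else float("nan")
    return accuracy, confusion

# Toy example for one binary trait; the fourth sample has no label.
y_true = [0, 1, 1, -1, 0]
y_pred = [0, 1, 0, 1, 0]
acc, cm = evaluate_trait(y_true, y_pred, n_classes=2)
print(acc)
print(cm)

# If six traits were each 90% accurate and errors were independent (an
# assumption), the chance of getting all six right for one image is 0.9 ** 6.
joint = float(np.prod([0.9] * 6))
print(round(joint, 3))
```

The off-diagonal entries of the confusion matrix show which classes are confused with which, which is a natural place to begin investigating situations where the solution performs poorly.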