Clustering Analysis
Chapter 6 (1)
Guani Wu
STATS 102B: Introduction to Computation and Optimization for
Statistics
Guani Wu, 2021 1/13
Introduction
- The aim of cluster analysis is to create a grouping of objects
  such that objects within a group are similar and objects in
  different groups are not similar. Before we look at the method
  in more detail, we first motivate cluster analysis with an
  example.
- Customer preference: Imagine you run a large online store and
  would like to personalise users’ shopping experience. You hope
  that by improving their shopping experience, users will buy
  more.
- One way to do this is to provide each user with a set of unique
  recommendations that they see when they access your site.
  You do not directly know each user’s personal preferences and
  tastes but you do have lots of data.
- Assuming that we can define a measure of similarity between
  customers based on their purchasing history, we could use
  cluster analysis to group customers into K groups.
- Within each group, customers have similar shopping patterns.
  Differences between customers in the same group could form
  the basis of a recommender system: for example, an item bought
  by one customer in a group but not yet by another is a natural
  candidate to recommend to the latter.
- A recommender system could also be created by clustering the
  items based on the customers they were bought by. If items 1
  and 2 were both bought by customers A, D, E and G, then they
  could be considered similar. Customers could then be
  recommended items that are similar to items they have already
  bought.
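
The item-based idea can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the purchase data beyond items 1 and 2, the use of Jaccard similarity (the slides do not name a similarity measure), the 0.5 threshold, and the helper names `jaccard` and `recommend`.

```python
# Hypothetical purchase data: item -> set of customers who bought it.
# Items 1 and 2 follow the example in the text; items 3 and 4 are made up.
buyers = {
    "item1": {"A", "D", "E", "G"},
    "item2": {"A", "D", "E", "G"},
    "item3": {"A", "D", "E"},
    "item4": {"B", "C"},
}


def jaccard(s, t):
    """Jaccard similarity |s & t| / |s | t| of two sets (0.0 if both are empty)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def recommend(customer, buyers, threshold=0.5):
    """Suggest items the customer has not bought whose buyer set is similar
    (Jaccard >= threshold) to that of at least one item they have bought."""
    owned = {item for item, who in buyers.items() if customer in who}
    suggestions = set()
    for item, who in buyers.items():
        if item in owned:
            continue
        if any(jaccard(who, buyers[o]) >= threshold for o in owned):
            suggestions.add(item)
    return sorted(suggestions)


print(jaccard(buyers["item1"], buyers["item2"]))  # identical buyer sets
print(recommend("G", buyers))                     # items similar to G's purchases
```

Items 1 and 2 have identical buyer sets, so any reasonable set-based similarity would rate them as maximally similar. A full system would cluster the items with such a measure rather than compare them pairwise against a fixed threshold.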
Synthetic dataset for clustering examples
Consider the data shown below. It consists of 200 observations,
$\mathbf{x}_1, \ldots, \mathbf{x}_{200}$, each represented by two attributes, $\mathbf{x}^T = [x_1, x_2]$.
[Figure: plot of the synthetic dataset]
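
To experiment with data of the same shape (200 observations, two attributes each), a stand-in dataset can be generated as in the sketch below. This is not the dataset from the slides: the number of groups, their centres, the spread, and the function name `make_synthetic` are all assumptions chosen only for illustration.

```python
import random


def make_synthetic(n=200, centres=((2.0, 2.0), (8.0, 3.0), (5.0, 8.0)),
                   spread=1.0, seed=0):
    """Return n observations [x1, x2], each drawn from an isotropic Gaussian
    around one of the (hypothetical) centres, which are cycled through in turn."""
    rng = random.Random(seed)  # local generator so results are reproducible
    data = []
    for i in range(n):
        cx, cy = centres[i % len(centres)]
        data.append([rng.gauss(cx, spread), rng.gauss(cy, spread)])
    return data


X = make_synthetic()
print(len(X), len(X[0]))  # number of observations, attributes per observation
```

Fixing the seed makes the data identical on every run, which helps when comparing clustering results across experiments.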