Published September 3rd 2022
Assessment (non-exam) Brief
Module code/name MSIN0025 Data Analytics II
Module leader name Deyu Ming
Academic year 2022/23
Term 2
Assessment title Group Coursework Late Summer Assessment Report
Individual/group assessment Individual
Submission deadlines: Students should submit all work by the published deadline date and time. Students
experiencing sudden or unexpected events beyond their control which impact their ability to complete assessed
work by the set deadlines may request mitigation via the extenuating circumstances procedure. Students with
disabilities or ongoing, long-term conditions should explore a Summary of Reasonable Adjustments.
Return and status of marked assessments: Students should expect to receive feedback within one calendar month
of the submission deadline, as per UCL guidelines. The module team will update you if there are delays through
unforeseen circumstances (e.g. ill health). All results when first published are provisional until confirmed by the
Examination Board.
Copyright Note to students: Copyright of this assessment brief is with UCL and the module leader(s) named above. If
this brief draws upon work by third parties (e.g. Case Study publishers) such third parties also hold copyright. It must
not be copied, reproduced, transferred, distributed, leased, licensed or shared with any other individual(s) and/or
organisations, including web-based organisations, without permission of the copyright holder(s) at any point in time.
Academic Misconduct: Academic Misconduct is defined as any action or attempted action that may result in a
student obtaining an unfair academic advantage. Academic misconduct includes plagiarism, obtaining help
from/sharing work with others, be they individuals and/or organisations, or any other form of cheating. Refer to
Academic Manual Chapter 6, Section 9: Student Academic Misconduct Procedure – 9.2 Definitions.
Referencing: You must reference and provide a full citation for ALL sources used, including articles, textbooks, lecture
slides and module materials. This includes any direct quotes and paraphrased text. If in doubt, reference it. If you
need further guidance on referencing please see UCL’s referencing tutorial for students. Failure to cite references
correctly may result in your work being referred to the Academic Misconduct Panel.
Content of this assessment brief
Section Content
A Core information
B Coursework brief and requirements
C Module learning outcomes covered in this assessment
D Groupwork instructions (if applicable)
E How your work is assessed
F Additional information
Section A: Core information
Submission date: 21/08/2023
Submission time: 10:00am UK time
Assessment is marked out of: 100 marks
% weighting of this assessment within total module mark: 30%
Maximum word count/page length/duration: 2000 words (excluding appendices)
Footnotes, appendices, tables, figures, diagrams, charts included in/excluded from word count/page length:
Appendices are excluded from the word count. Footnotes, captions of figures, diagrams, charts and tables are
included in the word count.
Bibliographies, reference lists included in/excluded from word count/page length:
Bibliographies are excluded from the word count.
Penalty for exceeding word count/page length:
The penalty for exceeding the word count is a deduction of 10 percentage points (capped at 40% for
Levels 4, 5 and 6, and 50% for Level 7). Refer to Academic Manual Section 3: Module Assessment – 3.13 Word Counts.
Penalty for late submission: Standard UCL penalties apply. Students should refer to
https://www.ucl.ac.uk/academic-manual/chapters/chapter-4-assessment-framework-taught-programmes/section-3-module-assessment#3.12
Submitting your assessment: The assignment MUST be submitted to the module submission link located within
this module’s Moodle ‘Submissions’ tab by the specified deadline.
Anonymity of identity. Normally, all submissions are anonymous unless the nature of the submission is such
that anonymity is not appropriate, illustratively as in presentations or where minutes of group meetings
are required as part of a group work submission:
The nature of this assessment is such that anonymity is not required.
Section B: Assessment Brief and Requirements
For this assignment, you will need to identify an important business problem, find one or more relevant
datasets, generate insightful visualisations of the data, fit a range of regression models to the data to
produce your best predictions/forecasts, and make and justify recommendations to a decision maker related
to this problem. The business problem should be different from the one you explored in your final individual
assignment for this module (including the LSA of the individual assignment, if it applies to you) and from
your assignments for other modules. A key goal for this assignment is to demonstrate a wide range of the
concepts covered in the module.
This assignment is worth 30% of the overall module assessment.
Report Structure
Section 1: The Problem (10%)
Discuss the problem you are addressing.
What are the questions and business/management decisions your analysis is trying to address?
Describe your problem’s decision maker and what is important for them to know from your data
analysis
Discuss the source of your data. Questions to consider include:
– Where did you find this data?
– How reliable or uncertain is this data?
– How old is the data?
– Is the data recorded at given dates or times?
Discuss and justify why your problem relates to a regression analysis.
Identify and justify your choice of target attribute(s) and explain how this/these should be
derived, if not already available.
Section 2: Understand the Data (30%)
Discuss the nature and size of the dataset(s) you are using.
Discuss the data attributes that are relevant to your problem. Exactly what does the data
represent and, if relevant, how was it derived? How is it distributed? What type of data is it?
Explore and discuss whether any of the data attributes you have focused on are closely correlated
with other attributes – either positively or negatively.
Include at least 2 Tableau-generated visualisations (e.g., map, scatter plot, bar chart, pie chart,
box-and-whisker plot) that give different insights to support your discussions.
Include at least 2 R-generated plots or aggregation tables that give different insights to support
your discussions.
Include the R-code you used in the appendix of your report.
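As an illustration only, the following R sketch shows one way to explore correlations between attributes. It uses the built-in mtcars dataset as a stand-in for your own data, and the choice of mpg as the target attribute is an assumption made purely for this example. It is not a required or recommended approach.

```r
# Illustrative only: mtcars is a built-in R dataset standing in for your own data.
data(mtcars)

# Pairwise Pearson correlations between a few numeric attributes.
# The column selection here is an assumption made for the example.
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp", "disp")])
print(round(cor_matrix, 2))

# Correlation of each remaining attribute with the assumed target (mpg),
# sorted from most negative to most positive.
target_cor <- sort(cor_matrix["mpg", -1])
print(round(target_cor, 2))
```

Values close to -1 or 1 indicate closely correlated attributes; the sign shows whether the relationship is negative or positive.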
Section 3: Prepare the Data (10%)
If required, explain how you have derived your chosen target attribute(s) in Tableau or in R.
Discuss and justify what other steps you may have taken to prepare your data, including, where
relevant: removing attributes from consideration, adding further “derived” attributes (e.g., Dates),
imputing “reasonable” values for missing data, transforming attributes, and standardising data
values.
Prepare suitable separate “Training” and “Testing” datasets.
Include any R-code you used to prepare your data in the appendix of your report.
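As an illustration only, a minimal R sketch of preparing separate "Training" and "Testing" datasets might look as follows. The built-in mtcars dataset, the 80/20 split ratio, the seed value and the wt_std column name are all assumptions made for this example; your own data and choices will differ and should be justified in your report.

```r
# Illustrative only: mtcars stands in for your own prepared dataset.
set.seed(42)  # fix the random seed so the split is reproducible
data(mtcars)

n <- nrow(mtcars)
# Randomly choose 80% of the rows for training (assumed ratio).
train_idx <- sample(seq_len(n), size = floor(0.8 * n))

train_data <- mtcars[train_idx, ]
test_data  <- mtcars[-train_idx, ]

# Standardise one attribute using training-set statistics only,
# then apply the same transformation to the testing set.
wt_mean <- mean(train_data$wt)
wt_sd   <- sd(train_data$wt)
train_data$wt_std <- (train_data$wt - wt_mean) / wt_sd
test_data$wt_std  <- (test_data$wt - wt_mean) / wt_sd
```

Computing the mean and standard deviation on the training rows alone keeps information from the testing set out of the data preparation step.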
Section 4: Generate and Test Prediction Models (40%)
Select and justify at least 2 different prediction models that are likely to best help with your stated
problem objectives.
Configure your models (e.g., select attributes and/or other model tuning parameters) that you
expect will best deliver relevant insights and/or provide the lowest error rates, justifying your
decisions.
Run these models, discussing the model outputs and drawing, where possible, insights related to
your problem.
Select proper evaluation metrics to measure the accuracy of your models. Determine and
comment on the best of your prediction models.
Discuss what steps you may have taken to improve your individual models.
Include any R-code you used in the appendix of your report.
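As an illustration only, the following R sketch compares two regression models on a held-out testing set using RMSE and MAE. The mtcars dataset, the two linear model formulas, and the helper functions rmse and mae are assumptions made for this example; they are not the required models or metrics for your report.

```r
# Illustrative only: mtcars and both model formulas are assumptions.
set.seed(42)
data(mtcars)
train_idx  <- sample(seq_len(nrow(mtcars)), size = floor(0.8 * nrow(mtcars)))
train_data <- mtcars[train_idx, ]
test_data  <- mtcars[-train_idx, ]

# Two candidate regression models fitted on the training data only.
model_a <- lm(mpg ~ wt, data = train_data)
model_b <- lm(mpg ~ wt + hp, data = train_data)

# Hypothetical helper functions for two common error metrics (lower is better).
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))

# Predictions on the testing data, which the models have not seen.
pred_a <- predict(model_a, newdata = test_data)
pred_b <- predict(model_b, newdata = test_data)

results <- data.frame(
  model = c("mpg ~ wt", "mpg ~ wt + hp"),
  RMSE  = c(rmse(test_data$mpg, pred_a), rmse(test_data$mpg, pred_b)),
  MAE   = c(mae(test_data$mpg, pred_a), mae(test_data$mpg, pred_b))
)
print(results)
```

Evaluating both models on the same testing set makes the error metrics directly comparable.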
Section 5: Problem Conclusions and Recommendations (10%)
Combining the results from your various analysis steps, draw conclusions about the particular
problem and questions stated at the beginning.
What recommendations would you now make to your problem’s decision maker and why? E.g.,
– Which are the most important variables/features for the decision maker to look at?
– What benefits would the decision maker gain by implementing your prediction model?
Marking Criteria
Marks will be awarded for:
Using Tableau and R in a way that is relevant and appropriately justified, and that is ideally
different from that presented in the lectures and other module materials.
Discussing meaningful insights after each analysis task.
Ensuring that your analysis flows, with each step building on the last.
Structuring your report and analysis so as to follow the standard stages of a data science project.
The correctness, reproducibility, and quality of your code, visualisations and conclusions.
Employing a wide range of the concepts and methods covered in this module.
Problem identification: finding a novel and significant problem.
Proposing a compelling solution/recommendation: generating important business or
policy insights.
Writing a clear and compelling report.
Submission Requirement
You are required to submit 3 files for this assignment:
1. A PDF file containing your fully completed report, including an appendix containing all your R-based analysis.
2. A runnable R script file (.R file) that contains all your R-based analysis.
3. The data file that you used for your analysis, if it is not too large to upload to Moodle. If it is too
large, please include a link (either to the original dataset, if it is freely available online, or to the
cloud storage, e.g., Dropbox or GitHub, where you store the dataset) in the appendix of your PDF report.
Only the PDF file will be marked. The code file and data file are used only to check that
your code works as you have claimed.
Section C: Module Learning Outcomes covered in this
Assessment
This assessment contributes towards the achievement of the following module Learning Outcomes:
During the module, students will work with example data sets to experience and
understand the stages of the data science process: they will visualise data, propose models
that might fit the data, choose a best-fit model, use that model to make predictions, and
test those predictions against new realisations.
The module builds on ideas and tools introduced in MSIN0010 Data Analytics I and
MSIN0023 Computational Thinking, including R and Tableau, statistical software used by
the world’s leading data scientists.
Section D: Groupwork Instructions (where
relevant/appropriate)
1. NA
Section E: How your work is assessed
Within each section of this assessment you may be assessed on the following aspects, as applicable and
appropriate to this assessment, and should thus consider these aspects when fulfilling the requirements of
each section:
The accuracy of any calculations required;
The strengths and quality of your overall analysis and evaluation;
Appropriate use of relevant theoretical models, concepts and frameworks;
The rationale and evidence that you provide in support of your arguments;
The credibility and viability of the evidenced conclusions/recommendations/plans of action
you put forward;
Structure and coherence of your considerations and reports;
Appropriate and relevant use of real-world examples, academic materials and referenced sources,
where applicable. Any references should use either the Harvard OR
Vancouver referencing system (see References, Citations and Avoiding Plagiarism);
Academic judgement regarding the blend of scope, thrust and communication of ideas,
contentions, evidence, knowledge, arguments, conclusions.
Each assessment requirement has allocated marks/weightings.
Student submissions are reviewed/scrutinised by an internal assessor and are available to an External
Examiner for further review/scrutiny before consideration by the relevant Examination Board.
It is not uncommon for some students to feel that their submissions deserve higher marks (irrespective of
whether they actually deserve higher marks). To help you assess the relative strengths and weaknesses of
your submission please refer to UCL Assessment Criteria Guidelines, located at