Programming Case Study: COMP3425 Assignment 1

COMP3425 Data Mining S1 2022
Assignment 1

Maximum marks: 100
Minimum to pass hurdle: 30
Length: Maximum of 8 pages excluding cover sheet, bibliography and appendices.
Layout: A4. At least 11 point type size. Use of typeface, margins and headings consistent with a professional style.
Submission deadline: 9 am, Monday 14 March
Submission mode: Electronic, PDF via Wattle, file-name includes u-number
Estimated time: 15 hours
Penalty for lateness: 100% after the deadline has passed
First posted: 21 Feb, 9am
Last modified: 21 Feb, 9am
Questions to: Wattle Discussion Forum

This assignment specification may be updated to reflect clarifications and modifications after it is first issued.

In this assignment, you are required to submit a single report comprising your answers to set questions in the form of a single PDF file with a file-name that includes your University u-number ID. The first page must have a clearly identified title and author, identified by both name and university u-number. You may also attach supporting information (appendices) in the same PDF file. Appendices will not be marked but may be treated as supporting information to your answers.

This is a single-person assignment and should be completed on your own. Make certain you carefully reference all the material that you use. Any material that you wish to quote must have the source clearly referenced. It is unacceptable to present any portion of another author’s work as your own. Anyone found doing so will be penalised in marks. In addition, CECS procedures for plagiarism will apply.

It is strongly suggested that you start working on the assignment right away. You can submit as many times as you wish. Only the most recent submission at the due date will be assessed.

Task

The Australian Computer Society Code of Professional Conduct 2014 is expected to be applied by all Computing Professionals in Australia. It sets out six values but stresses the primacy of the public interest as the overriding value.
In 2017, the US Branch of the Association for Computing Machinery (ACM), recognizing the ubiquity and far-reaching impact of algorithms in daily lives, issued a Statement on Algorithmic Transparency and Accountability incorporating seven Principles designed to address potential harmful social discrimination due to bias. In 2018, the Australian Government Office of the Australian Information Commissioner released the Guide to Data Analytics and the Australian Privacy Principles (APP). These three documents are provided with this assignment specification.

You must also read the paper, Clarke R. (2018), “Guidelines for the Responsible Application of Data Analytics”, Computer Law & Security Review 34, 3 (Jul-Aug 2018), that is provided with this assignment specification and hereafter referred to as the Guidelines. You must also read the paper, Du, Liu and Hu (2020), “Techniques for Interpretable Machine Learning”, Communications of the ACM 63(1), that is also provided with the assignment.

You are to consider the application of the ACS code of conduct, the 7 US ACM Principles, Clarke’s Guidelines and Du et al’s Techniques to the following fictitious ad targeting scenario. You may also use the APP guide, where it is helpful.

Ad Targeting Scenario (from Clarke R. (2016), “Big Data, Big Risks”, Information Systems Journal 26, 1 (January 2016) 77-90, PrePrint at http://www.rogerclarke.com/EC/BDBR.html)

A social media service-provider accumulates a vast amount of social transaction data, and some economic transaction data, through activity on its own sites and those of strategic partners. It applies complex data analytics techniques to this data to infer attributes of individual digital personae. It projects third-party ads and its own promotional materials based on the inferred attributes of online identities and the characteristics of the material being projected.
The ‘brute force’ nature of the data consolidation and analysis means that no account is taken of the incidence of partial identities, conflated identities, obfuscated identities, and imaginary, fanciful, falsified and fraudulent profiles. This results in mis-placement of a significant proportion of ads, to the detriment mostly of advertisers, but to some extent also of individual consumers. It is challenging to conduct audits of ad-targeting effectiveness, and hence advertisers remain unaware of the low quality of the data and of the inferences. This approach to business is undermined by inappropriate content appearing on children’s screens, and gambling and alcohol ads seen by partners in the browser-windows of nominally reformed gamblers and drinkers.

You must answer the following questions, clearly indicating which question you are answering within your submission. The page lengths suggested for each question here are for guidance only; the given page length limit for the overall assignment is mandatory.

Question 1. (1 page)
Consider the ACS code of conduct. For each of the six values, taking account of any relevant sub-parts, discuss whether the value was demonstrated in the scenario and to what extent. If you assess any value as largely irrelevant to the scenario, then a very brief reason for this assessment is sufficient.

Question 2. (1/2 page)
Consider the 7 US ACM Principles. Looking closely at Principle 1, Awareness, discuss how this principle is applied (or not) in the scenario and identify any “potential harm” that might have ensued.

Question 3. (2 pages)
Consider the numbered guidelines in Table 2 of Clarke’s Guidelines for the responsible application of data analytics. From every segment (1 General, 2 Data Acquisition, 3 Data Analysis, and 4 Use of the Inferences) choose one guideline that you consider would have been applied in the scenario.
Its application may not be explicit in the scenario description, but it should be relevant and important to the scenario and you can argue that it was applied properly and therefore did not contribute to the negative consequences of the scenario. Explain its role in the scenario including how it would have contributed to positive outcomes. Justify why it is more relevant than every one of the other guidelines that you consider would have been applied in the same segment. Argue how it is more or less relevant than any guidelines in the same segment that you consider may have been disregarded in the scenario. Be careful to consider the intention of the guidelines rather than an overly literal interpretation; you may rephrase the chosen guideline for the scenario context where beneficial. For further explanation of this point, see Section 3 in Clarke’s Guidelines.

Question 4. (1 page)
(a) Choose one numbered guideline (e.g. guideline 3.3) in Table 2 of the Guidelines that you consider to have been disregarded in the scenario. You may choose any guideline that you did not choose for Question 3. Discuss how the failure to consider the guideline could have contributed to the negative outcome of the scenario.
(b) In addition, identify any other potential consequences that could have occurred due to the failure to consider that same guideline. For this purpose, the consequences you identify are not necessarily explicit within the scenario description. You might find it helpful to think of this activity as contributing to a risk assessment process prior to your hypothetical involvement in the analysis work of the scenario.

Question 5. (1 page)
Consider the paper by Du et al, Techniques for Interpretable Machine Learning. Discuss whether and how intrinsic and post-hoc interpretability techniques could be applied to the scenario and what benefits could ensue.

General Comments

An abstract or executive summary is not required.
A cover sheet is optional and does not contribute to the page count. No particular layout is specified, but you should follow a professional style and use no smaller than 11 point typeface and stay within the maximum specified page count. Page margins, heading sizes, paragraph breaks and so forth are not specified but a professional style must be maintained. Text beyond the page limit or word count limit will be treated as non-existent. Appendices may be used and do not contribute to the page count, but appendices might be only quickly scanned or used for reference and will not be specifically marked.

You must properly attribute the source documents provided for your assignment (but not this assignment specification itself) and any other reference materials you choose to use. You are not required to use additional materials. No particular referencing style is required. However, you are expected to reference conventionally, conveniently, and consistently. Your references should be sufficient to unambiguously identify the source, to describe the nature of the source, and also to retrieve the source in online and (if possible) traditional publisher formats.

An assessment rubric is provided. The rubric will be used to mark your assignment. You are advised to use it to supplement your understanding of what is expected for the assignment and to direct your effort towards the most rewarding parts of the work.

Your assignment submission will be treated confidentially, but it will be available to ANU staff involved in the course for the purposes of marking.

Assessment Rubric

Your assignment will be marked out of 100, and marks will be scaled back to contribute to the defined weighting for assessment of the course.

Each review criterion below gives its maximum mark, followed by the mark range and descriptor for each band: Exemplary, Excellent, Good, Acceptable, Unsatisfactory.

Communication, Structure and Presentation (max mark 10)
Exemplary (9-10): Exemplary use of language enhancing the quality of the submission. Very well ordered with logical and clear structure supported by appropriate headings and sub-headings. All use of others’ ideas and materials acknowledged. References are all included and are formatted consistently and appropriately. Diagrams and/or images are ideally suited to the points where they are used. Professional presentation style.
Excellent (7-8): Very good use of language. Well-ordered and logical. Headings and sub-headings assist the reader. All use of others’ ideas and material is acknowledged. All references are included, though some minor inconsistency of in-text citation or formatting. Diagrams and/or images are used effectively. Professional presentation style.
Good (6): Reasonable but needs some revision. Mostly well-ordered and logical, most supported by headings and sub-headings. All use of others’ ideas and material is acknowledged. Some references are missing and occasional inconsistencies of in-text citation and formatting. Diagrams and/or images improve readability. Professional presentation style.
Acceptable (5): Poor, needs significant revision. Order is not always logical and is sometimes confusing. Headings are largely those suggested by the assignment specification and the questions posed. All use of others’ ideas and material is acknowledged, though sometimes inconsistently. Missing references and inconsistent in-text citation and formatting. Diagrams and/or images are not well selected. Professional style attempted.
Unsatisfactory (0-4): Very difficult to understand. Order is confusing and not always logical. Headings and sub-headings do little to help clarify the text. Not all use of others’ ideas and material is acknowledged. Missing in-text citations, i.e. plagiarism. References in the bibliography not used in the text. Poorly and inconsistently formatted. Diagrams and/or images detract from the key messages.

Question 1: Code of Conduct (max mark 20)
Exemplary (17-20): The discussion raises subtle and challenging ethical issues related to the code of conduct in important aspects of the scenario. The code itself may be questioned with persuasive argument. All values are addressed in full, with clearly identified extent of demonstration in the scenario. The extent to which the value is pertinent is justified by argument.
Excellent (14-16): All values are addressed in full. For each value, the extent to which it is demonstrated in the scenario is clear. The extent to which the value is pertinent is justified by argument. More attention may be given to more important or relevant values.
Good (12-13): For most values, the extent to which the value is demonstrated in and pertinent to the scenario is given.
Acceptable (10-11): Perfunctory but arguably correct analysis is given for most of the six values.
Unsatisfactory (0-9): Work does not demonstrate an adequate understanding of the code of conduct.

Question 2: ACM Principle (max mark 10)
Exemplary (9-10): The principle has been well understood as demonstrated by the identification of application (or not) in the scenario supported by considered, reasoned argument with evidence drawn from the scenario together with real-world knowledge. Analysis of the ethical issues demonstrates awareness of alternative viewpoints and possibly cost vs benefits.
Excellent (7-8): Multiple aspects of the scenario have been used to discuss the application of the principle to the scenario. Potential harm analysis considers a diverse range of harms.
Good (6): It is clear how the principle applies (or not) to the scenario. Potential harm may be too narrowly interpreted.
Acceptable (5): There is a cursory attempt to relate the scenario to the ACM Statement but the analysis is shallow.
Unsatisfactory (0-4): Unclear whether the relevance and purpose of the ACM Statement has been fully understood.

Question 3: Guidelines (max mark 20)
Exemplary (17-20): All 4 segments considered. All selected guidelines show good understanding of the guideline, are rephrased where appropriate, and the benefit of application to the scenario context is clear. All justifications of relevance convincingly argue the relative importance of the guideline to all others in the segment.
Excellent (14-16): All 4 segments considered. Most selected guidelines show good understanding of the guideline, are rephrased where appropriate, and are beneficially applied to the scenario context. Most justifications of relevance convincingly argue the relative importance of the guideline to others in the segment.
Good (12-13): All 4 segments considered. Most chosen guidelines are well explained in the scenario context. For most chosen guidelines, the argument for its relevance is made with reference to the alternative guidelines in the segment.
Acceptable (10-11): All 4 segments have been considered with a chosen guideline from each segment both explained and justified.
Unsatisfactory (0-9): Partial attempt, incomplete or hard to follow.

Question 4: Disregarded guideline (max mark 20)
Exemplary (17-20): The significance of the selected guideline is argued persuasively. The impact of the failure of the guideline is supported by critical analysis demonstrating an understanding of both the costs and benefits of applying the guideline in the scenario from multiple viewpoints. The consequence is thought-provoking and its connection to the failed guideline is explained and logical. Arguments are supported by real-world evidence or literature.
Excellent (14-16): Guideline that was selected is made clear and is relevant to the scenario. Discussion on impact of the failure of the guideline is supported by critical analysis of both the guideline itself and the events of the scenario with multiple viewpoints on the situation presented. The connection of the alternative potential consequence to the failed guideline is explained and logical.
Good (12-13): Guideline that was selected is made clear and is relevant to the scenario. Discussion on impact of the failure to follow the guideline is reasoned and related to scenario. The connection of the alternative potential consequence to the failed guideline is explained.
Acceptable (10-11): Guideline that was selected is made clear and is relevant to the scenario. There is a cursory attempt to explain the impact of the failure to follow the guideline on the scenario outcome. Alternative potential consequence identified.
Unsatisfactory (0-9): Unclear that the selected guideline was understood. Unconvincing or implausible impact on the scenario outcome. Other potential consequence missing or implausible.

Question 5 (max mark 20)
Exemplary (17-20): Understanding of both techniques is demonstrated by detailed description of application to scenario. Potential benefits of interpretability are well situated in the scenario.
Excellent (14-16): The techniques and their application in the scenario are well explained and potential benefits are articulated.
Good (12-13): It is shown how the techniques can be used in the scenario and some potential benefits are enumerated.
Acceptable (10-11): The techniques and their purpose are broadly understood.
Unsatisfactory (0-9): It is not clear that the paper was read.
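
As a rough illustration of the marking arithmetic described in this specification, the sketch below adds up the maximum marks listed in the rubric, applies the pass hurdle of 30, and scales a total out of 100 to a course weighting. The function names, the example marks and the 20% weighting are hypothetical and chosen only for illustration; the specification does not state the actual weighting or how marks are recorded.

```python
# Maximum marks per review criterion, as listed in the rubric.
MAX_MARKS = {
    "Communication, Structure and Presentation": 10,
    "Question 1: Code of Conduct": 20,
    "Question 2: ACM Principle": 10,
    "Question 3: Guidelines": 20,
    "Question 4: Disregarded guideline": 20,
    "Question 5": 20,
}

# "Minimum to pass hurdle 30" from the assignment header.
HURDLE = 30


def total_mark(awarded):
    """Sum the awarded marks after checking each against its rubric maximum.

    Criteria missing from `awarded` simply contribute nothing to the total.
    """
    for criterion, mark in awarded.items():
        maximum = MAX_MARKS[criterion]
        if not 0 <= mark <= maximum:
            raise ValueError(f"{criterion}: {mark} is outside 0-{maximum}")
    return sum(awarded.values())


def passes_hurdle(total):
    """Return True when the total meets the minimum needed to pass the hurdle."""
    return total >= HURDLE


def scaled_contribution(total, course_weight_percent):
    """Scale a mark out of 100 to a course weighting.

    The weighting is a hypothetical input; the specification does not give it.
    """
    return total * course_weight_percent / sum(MAX_MARKS.values())


if __name__ == "__main__":
    # Hypothetical marks, one per criterion, in rubric order.
    example = dict(zip(MAX_MARKS, [8, 15, 7, 14, 13, 16]))
    total = total_mark(example)
    print(total, passes_hurdle(total), scaled_contribution(total, 20))
```

The per-criterion check simply guards against entering a mark above the band limits shown in the rubric; any real weighting should be taken from the course outline rather than from this sketch.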