Purpose

The purpose of the final paper is to summarize results for two interesting questions using a combination of figures, tables, and modeling techniques. Written communication is an integral part of data science. This is your opportunity to develop a high-quality blog post or article that could be published on the web and used in future job interviews.

Requirements

After the exploratory data analysis, your group should have two questions that are interesting, relevant, and worth sharing with the world. These questions should involve multiple variables and should be answerable by predictive modeling techniques. Be innovative and creative. Do not try to answer questions that have obvious solutions or have been extensively studied. Pick questions that would prompt a reader or fellow researcher to ask more questions or engage in discussion.

The final paper consists of four sections: Introduction, Data, Results, and Conclusion. Each section will be graded separately according to a rubric that combines objective and subjective requirements. The course website provides a simple RMarkdown template with predefined headings. The template also contains requirements and suggestions for each of the four sections.

The Deliverer is responsible for compiling all the information into the RMarkdown template provided on the course website. This document should be carefully proofread and submitted as a PDF file via Gradescope by the due date. Please submit the file as a group submission and add all your group members. A minimum 2-point penalty will be applied if this document is submitted late. This penalty applies to your entire group.

In the final PDF document, there should be absolutely no R code. The writing and proofreading of the document should be shared by all members of the group. Points are deducted in each section for spelling and grammatical errors. All figures should have appropriate legends, titles, and colors. All tables should be displayed in PDF format. Each required figure and table is worth 2 points: the first point is for the appropriateness of the figure or table for the situation, and the second is for aesthetics. You are encouraged to use Markdown syntax for subsections, bold, italics, hyperlinks, tables, etc.
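One way to keep R code out of the knitted PDF is sketched below. It assumes the template is knit with knitr (the RMarkdown default) and that these lines go in a setup chunk near the top of the .Rmd file. The built-in `mtcars` data set and the caption text are placeholders for illustration only, not part of the assignment.

```r
# Place inside a setup chunk near the top of the .Rmd file.
# Assumption: the document is knit with knitr, the RMarkdown default.
knitr::opts_chunk$set(
  echo = FALSE,     # do not print R source code in the output
  message = FALSE,  # suppress package start-up and other messages
  warning = FALSE   # suppress warnings in the output
)

# Render a data frame as a formatted table instead of raw console output.
# `mtcars` is a built-in data set used here only as a placeholder.
knitr::kable(head(mtcars[, 1:4]), caption = "Example descriptive table")
```

Setting these options once applies them to every later chunk, so individual chunks do not each need `echo = FALSE`.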

Five points of the final project are based on an average score measuring overall contribution as seen by you and the other members of your group. Each group member should score every person in their group on a continuous scale from 0 (Bad) to 10 (Good). Before the due date of the final paper, every member is required to submit the group scoring through the Google survey link on the course website. Your name and this information will remain private between you and me. If you fail to submit this group scoring before the deadline, I will subtract 2 points from your personal score.

Rubric

| Requirement | Points |
|---|---|
| Introduction: 2 Questions Clearly Defined | 2 Points |
| Introduction: Am I Interested? | 2 Points |
| Introduction: Free of Spelling and Grammatical Errors | 1 Point |
| Data: Adequately Describes Data | 2 Points |
| Data: 1 Descriptive Table | 1 Point |
| Data: 1 Descriptive Figure | 1 Point |
| Data: Free of Spelling and Grammatical Errors | 1 Point |
| Results: Appropriate Methods Using Multiple Models | 3 Points |
| Results: Adequately Explains Results | 2 Points |
| Results: 4 Figures and/or Tables | 8 Points |
| Results: Free of Spelling and Grammatical Errors | 1 Point |
| Conclusion: Summarize Questions with Results | 1 Point |
| Conclusion: Do I Want to Learn More? | 2 Points |
| Conclusion: Free of Spelling and Grammatical Errors | 1 Point |
| Overall: No R Code | 1 Point |
| Overall: Followed RMarkdown Template | 1 Point |
| Individual Score (reported separately in Sakai) | 10 Points |
| Total | 40 Points |