STAT570 Data Handling and Visualization

Course Code: 2460570
METU Credit (Theoretical-Laboratory hours/week): 3 (3.00 - 0.00)
ECTS Credit: 8.0
Department: Statistics
Language of Instruction: English
Level of Study: Graduate
Course Coordinator: Prof. Dr. CEYLAN YOZGATLIGİL
Offered Semester: Fall and Spring Semesters.

Course Objectives

This course aims to equip students with the essential skills to proficiently manage, manipulate, and visually represent data. By the end of this course, students will be adept at handling diverse data formats, cleaning and preparing data for downstream tasks, and creating informative data visualizations. They will also develop an understanding of data ethics, privacy, and the significance of effective data communication. Through hands-on experience and practical projects, students will become confident in their ability to work with data across various domains and industries.


Course Content

Structured, semi-structured, and unstructured data types. Data manipulation and preprocessing. Dimension reduction. Sampling, oversampling, undersampling. Data scraping and wrangling. Visualization of multivariate data. Panel displays, surface plots, 3D scatterplots, contour plots. 2D representation of multivariate data. Interactive graphics revealing any structure in data: Asimov's grand tour, projection pursuit exploratory data analysis (PPEDA). Visualization of categorical data. Dynamic graphics.


Course Learning Outcomes

In this course, students will develop a comprehensive set of skills and competencies to proficiently handle and visualize data using Unix and R. They will learn to collect data from various sources, ensuring data quality and integrity, with an emphasis on Unix-based data processing. Through hands-on experience, they will gain expertise in preprocessing and cleaning data within the Unix environment and R. Students will become adept at using R for data analysis and data visualization, creating insightful visual representations, and interpreting these visualizations to extract valuable insights. Furthermore, ethical considerations surrounding data privacy and security will be emphasized. Students will enhance their communication skills by effectively conveying data-driven findings through visualizations and reports created using R. Collaborative project work will promote teamwork, while critical evaluation of visualizations will foster a discerning approach. Staying adaptable to emerging technologies and applying these skills to domain-specific contexts will be integral to the learning journey. Additionally, students will utilize GitHub as a key tool for version control, collaboration, and sharing of data-related projects within the Unix and R environments, reinforcing their ability to work effectively in data-related teams and showcasing their work to a broader audience.


Program Outcomes Matrix

Level of Contribution
#   Program Outcomes   0   1   2   3
1. Ability for converting theoretical, methodological, and computational statistical knowledge into analytical solutions in research requiring statistical analyses.
2. Ability for specifying problems in real-life situations bearing uncertainty, forming hypotheses, modeling, application, and interpreting the results.
3. Ability for using current technology, computer software for statistical applications, computer programming for specific problems when necessary, writing computer code for speeding up statistical calculations, organizing and cleaning databases, and preparing them for statistical analyses, and data mining.
4. Ability for taking part in intra/inter disciplinary team work, efficient use of time, taking responsibility as a team leader, and entrepreneurship.
5. Ability for taking responsibility in solitary work and producing creative solutions.
6. Ability for keeping up-to-date with current advancements in statistical sciences, doing research, being open-minded, and adopting critical thinking.
7. Ability for effective communication both in Turkish and English in specification of statistical problems, analyses, and interpretation of findings.
8. Ability for using the knowledge in the field of expertise for the welfare of the society.
9. Ability for suggesting to researchers, in a comprehensible way, the appropriate statistical methods for problems in fields that use statistics, such as economics, finance, industrial engineering, genetics, and medicine, and applying them if needed.
10. Ability for catalyzing discussions and presentations, public speaking, making presentations, communicating topics of expertise to the audience in a comprehensible way.

0: No Contribution; 1: Little Contribution; 2: Partial Contribution; 3: Full Contribution