STAT112 INTRODUCTION TO DATA PROCESSING AND VISUALIZATION

Course Code: 2460112
METU Credit (Theoretical-Laboratory hours/week): 4 (3.00 - 2.00)
ECTS Credit: 5.0
Department: Statistics
Language of Instruction: English
Level of Study: Undergraduate
Course Coordinator:
Offered Semester: Fall and Spring Semesters.

Course Objectives

This is an introductory course for statistics students. It focuses on the fundamental principles and best practices of data manipulation and visualization, and it has two parts. The first part covers basic concepts such as data types, data manipulation, and querying. The second part covers data visualization, starting with exploratory data analysis using various statistical plots. The data manipulation and visualization methods are implemented using Tableau, Flourish, and Python packages. Students create their own data visualizations and learn to use open-source data visualization tools.


Course Content

Basic definitions and management of different types of data. Introduction to data manipulation (indexing, subsetting, reshaping, transforming, etc.), visualization, mapping, and analysis. Dealing with common problems such as missing or inconsistent values in datasets. Use of related R and/or Python programming packages. Merging multiple data tables (equivalent to an SQL JOIN).
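As an illustration of two of the topics listed above (merging tables and handling missing values), here is a minimal sketch in Python with pandas. The tables, column names, and values are hypothetical and not taken from the course materials. Filling gaps with the column mean is only one of several possible strategies.

```python
import pandas as pd

# Hypothetical example tables; names and values are made up for illustration.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Ada", "Bora", "Can", "Deniz"],
})
grades = pd.DataFrame({
    "student_id": [1, 2, 2, 5],
    "score": [85.0, None, 70.0, 90.0],  # None becomes NaN (a missing value)
})

# Left merge: keep every row of `students`, similar to an SQL LEFT JOIN
# on student_id. Students without a grade get NaN in `score`.
merged = students.merge(grades, on="student_id", how="left")

# Count the missing scores, then fill them with the mean of the observed ones.
n_missing = int(merged["score"].isna().sum())
filled = merged.assign(score=merged["score"].fillna(merged["score"].mean()))

print(merged)
print("missing scores:", n_missing)
print(filled)
```

A left merge is used here so that students with no recorded grade remain visible. With `how="inner"` those rows would be dropped, which can hide the missing-data problem instead of exposing it.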


Course Learning Outcomes

  • Learn the basic data types.
  • Perform basic data operations using Tableau and Python, including indexing, slicing, and subsetting in pandas DataFrames. Apply data transformations such as aggregation and filtering for visualization.
  • Gain practical experience building and evaluating visualization systems. Design and create data visualizations.
  • Conduct exploratory data analysis using visualization.
  • Use the Grammar of Graphics to convert data into figures with the seaborn and matplotlib libraries.
  • Identify potential pitfalls when handling data with Tableau and Python.
  • Create basic and visually appealing diagrams using Tableau and Python.
  • Arrange visual presentations of data for effective communication.
  • Design and evaluate color palettes for visualization based on principles of perception.
  • Identify opportunities for application of data visualization in various domains.
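
The pandas operations named in the outcomes above (indexing, slicing, subsetting, filtering, and aggregation) can be sketched as follows. The DataFrame and its column names are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical sales data, made up for illustration.
df = pd.DataFrame({
    "city": ["Ankara", "Izmir", "Ankara", "Izmir", "Bursa"],
    "year": [2022, 2022, 2023, 2023, 2023],
    "sales": [10, 20, 15, 25, 5],
})

# Slicing by position: the first two rows.
first_two = df.iloc[:2]

# Indexing a single column returns a Series.
sales_col = df["sales"]

# Filtering with a boolean mask: rows from 2023 only.
recent = df[df["year"] == 2023]

# Subsetting rows and columns together with label-based .loc.
subset = df.loc[df["sales"] > 10, ["city", "sales"]]

# Aggregation: total sales per city, a typical step before plotting.
totals = df.groupby("city")["sales"].sum()

print(first_two, recent, subset, totals, sep="\n\n")
```

The aggregated `totals` Series is the kind of summarized input that a bar chart in matplotlib or seaborn would typically be drawn from.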

Program Outcomes Matrix

Level of Contribution
#    Program Outcomes    0    1    2    3
1    Applying the knowledge of statistics, mathematics and computer to statistical problems and developing analytical solutions.
2    Defining, modeling and solving real life problems that involve uncertainty, and interpreting results.
3    To decide on the data collection technique, and apply it through experiment, observation, questionnaire or simulation.
4    Analysing small and big volumes of data and interpreting results.
5    Utilizing up-to-date techniques, computer hardware and software required for statistical applications; developing software programs and numerical solutions for specific problems when necessary.
6    Taking part in intradisciplinary and interdisciplinary teamwork, using time efficiently, taking leadership responsibilities and being entrepreneurial.
7    Taking responsibility in individual work and offering authentic solutions.
8    Following contemporary developments and publications in statistical science, conducting research, being open to novelty and thinking critically.
9    Efficiently communicating in Turkish and English to define and analyze statistical problems and to interpret the results.
10   Having a professional and ethical sense of responsibility.
11   Developing computational solutions to statistical problems that cannot be solved analytically.
12   Having theoretical background and developing new theories in statistics, building relations between theoretical and practical knowledge.
13   Serving the society with the expertise in the field.

0: No Contribution, 1: Little Contribution, 2: Partial Contribution, 3: Full Contribution