STAT112 INTRODUCTION TO DATA PROCESSING AND VISUALIZATION
Course Code: | 2460112 |
METU Credit (Theoretical-Laboratory hours/week): | 4 (3.00 - 2.00) |
ECTS Credit: | 5.0 |
Department: | Statistics |
Language of Instruction: | English |
Level of Study: | Undergraduate |
Course Coordinator: | |
Offered Semester: | Fall and Spring Semesters. |
Course Objectives
This is an introductory course for statistics students. It focuses on the fundamental principles and best practices of data manipulation and visualization. The course has two parts. The first part covers basic concepts such as data types, data manipulation, and querying. The second part covers data visualization, starting with exploratory data analysis using various statistical plots. The data manipulation and visualization methods are implemented using Tableau, Flourish, and Python packages. Students create their own data visualizations and learn to use open-source data visualization tools.
Course Content
Basic definitions and managing different types of data. Introduction to manipulation (indexing, subsetting, reshaping, transforming, etc.), visualization, mapping, and analysis of data. Dealing with common problems such as missing or inconsistent values in datasets. Use of related R and/or Python programming packages. Merging multiple data tables (equivalent to an SQL JOIN).
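As an illustration of two of the topics above, merging tables and handling missing values, here is a minimal pandas sketch. The tables, column names, and values are hypothetical and are not taken from the course materials. Filling gaps with the mean is only one of several possible strategies.

```python
import pandas as pd

# Hypothetical example tables; names and values are illustrative only.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Ada", "Bora", "Cem", "Deniz"],
})
grades = pd.DataFrame({
    "student_id": [1, 2, 4],
    "midterm": [78.0, None, 91.0],
})

# Left join: keep every student, similar to SQL LEFT JOIN ... ON student_id.
merged = students.merge(grades, on="student_id", how="left")

# Count missing midterm scores (one unrecorded score, one student with no grades row).
n_missing = int(merged["midterm"].isna().sum())

# One simple strategy: fill missing scores with the mean of the observed scores.
mean_score = merged["midterm"].mean()
filled = merged.assign(midterm=merged["midterm"].fillna(mean_score))

print(merged)
print("Missing midterm scores:", n_missing)
print(filled)
```

The `how="left"` argument keeps students who have no matching row in `grades`. An inner join (`how="inner"`) would drop them instead.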
Course Learning Outcomes
- Learn data types
- Perform basic data operations using Tableau and Python, including indexing, slicing, and subsetting in pandas DataFrames. Apply data transformations such as aggregation and filtering for visualization.
- Gain practical experience building and evaluating visualization systems. Design and create data visualizations.
- Conduct exploratory data analysis using visualization.
- Use the Grammar of Graphics to convert data into figures with the seaborn and matplotlib libraries
- Identify potential pitfalls when handling data with Tableau and Python
- Create basic and visually appealing diagrams using Tableau and Python
- Arrange visual presentations of data for effective communication.
- Design and evaluate color palettes for visualization based on principles of perception.
- Identify opportunities for application of data visualization in various domains
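The indexing, slicing, subsetting, filtering, and aggregation operations named in the outcomes above can be sketched in pandas as follows. The dataset is hypothetical and serves only to show the operations. It is not drawn from the course itself.

```python
import pandas as pd

# Hypothetical sales data; the values are illustrative only.
sales = pd.DataFrame({
    "city": ["Ankara", "Izmir", "Ankara", "Istanbul", "Izmir"],
    "year": [2022, 2022, 2023, 2023, 2023],
    "units": [10, 7, 12, 20, 9],
})

# Slicing by position: the first two rows.
first_two = sales.iloc[:2]

# Subsetting by label: rows for Ankara, keeping only two columns.
ankara = sales.loc[sales["city"] == "Ankara", ["year", "units"]]

# Filtering with a boolean mask: rows from 2023 only.
recent = sales[sales["year"] == 2023]

# Aggregation: total units per city in 2023, ready to be plotted.
totals = recent.groupby("city")["units"].sum()

print(first_two)
print(ankara)
print(totals)
```

A summary such as `totals` is a typical input for a bar chart, whether drawn with matplotlib, seaborn, or Tableau.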
Program Outcomes Matrix
Level of Contribution | |||||
# | Program Outcomes | 0 | 1 | 2 | 3 |
1 | Applying the knowledge of statistics, mathematics and computer to statistical problems and developing analytical solutions. | ✔ | |||
2 | Defining, modeling and solving real life problems that involve uncertainty, and interpreting results. | ✔ | |||
3 | To decide on the data collection technique, and apply it through experiment, observation, questionnaire or simulation. | ✔ | |||
4 | Analysing small and big volumes of data and interpreting results. | ✔ | |||
5 | Utilizing up-to-date techniques, computer hardware and software required for statistical applications; developing software programs and numerical solutions for specific problems when necessary. | ✔ | |||
6 | Taking part in intradisciplinary and interdisciplinary teamwork, using time efficiently, taking leadership responsibilities and being entrepreneurial. | ✔ | |||
7 | Taking responsibility in individual work and offering authentic solutions. | ✔ | |||
8 | Following contemporary developments and publications in statistical science, conducting research, being open to novelty and thinking critically. | ✔ | |||
9 | Efficiently communicating in Turkish and English to define and analyze statistical problems and to interpret the results. | ✔ | |||
10 | Having a professional and ethical sense of responsibility. | ✔ | |||
11 | Developing computational solutions to statistical problems that cannot be solved analytically. | ✔ | |||
12 | Having theoretical background and developing new theories in statistics, building relations between theoretical and practical knowledge. | ✔ | |||
13 | Serving the society with the expertise in the field. | ✔ |
0: No Contribution; 1: Little Contribution; 2: Partial Contribution; 3: Full Contribution