The course Data science: Data analysis and visualization offers a range of statistical and graphical techniques for uncovering hidden structures in data, including machine learning and data mining techniques. The course has a strong practical focus; participants actively learn how to apply these techniques to real data.
This course can be taken separately or as part of the Data science specialization.
What puts former criminals on the right track? How can we prevent heart disease? Can Twitter predict election outcomes? What does a violent brain look like? How many social classes does 21st century society have? Are hospitals spending too much on health care, or too little?
Data analysis is the art and science of tackling questions like these by looking at data. Just as cartographers make maps to see what a country looks like, data analysts explore the hidden structures of data by creating informative pictures and summarizing relationships among variables. And just as doctors diagnose sick patients and advise healthy ones on how to stay healthy, data analysts predict important events and variables so that we can act on this knowledge. Methods from statistics, data mining, and machine learning play an important part in this process, as do visualizations that help analysts and their audiences understand what can be concluded from the available facts.
During this course, participants will actively learn how to apply the main statistical methods in data analysis and how to use data mining algorithms and visualization techniques. The course is strongly practical: rather than concentrating on the mathematics and background of the techniques discussed, you will gain hands-on experience in applying them to real data and interpreting the results. The course covers both classical and modern topics in data analysis and visualization.
Participants are requested to bring their own laptop for lab meetings.
Basic knowledge of the statistical software program R is required (e.g. at the level of the Summer School Data Science: Statistical Programming with R or the book R for Data Science by Wickham and Grolemund).
This course is part of the Summer School Data Science specialization, a series of courses taught by UU’s department of Methodology & Statistics.
Summer School Data Science specialization:
1. Data science: Statistical Programming with R (S24)
2. Data science: Multiple Imputation in Practice (S28)
3. Data science: Data analysis & visualization (this course) (S31)
Upon completing all courses in the specialization, students can obtain a certificate. Each course may also be taken separately.