Course - Data analysis and visualisation in chemistry - KJ2222
New from the academic year 2025/2026
About the course
Course content
The course introduces simple data analysis and visualisation techniques in chemistry, using Python as a general-purpose scripting and programming language. You will learn the basic data analysis techniques needed to interpret the data acquired in your experimental lab courses, and you will be introduced to the Jupyter notebook as an interactive tool for scripting and programming. Python is used here to aid the quantitative analysis of data; the course is not intended to be a comprehensive course in general programming. It will, however, introduce the elements of the Python language, and should therefore serve as a useful first step towards Python programming.
The course is delivered through a combination of lectures and interactive sessions using Jupyter notebooks. The first part of the course will introduce simple data types in Python, 1D and 2D numerical arrays, simple numerical calculations, function definitions, and plotting with the Matplotlib data visualisation library. The focus will be on:
- Production of clean, labelled, easy-to-read linear and logarithmic 2D plots, using appropriate units.
- General statistical quantities, including mean, mode, median, standard deviation, standard error and relative standard deviation.
- Visual presentation of statistical data using histograms and other plot types.
- Extraction and manipulation of subsets of a larger data set, using slicing methods and logical operators.
- Simple methods for writing data to text files and importing data from text files into Python.
- Weighted and unweighted linear fitting of chemical data to polynomials and other basic functions, with emphasis on the dangers of over-fitting data with high-order polynomials.
- Use of Python modules to extend the functionality of the core Python programming language, using SciPy's curve_fit function for nonlinear fitting as an example.
- Introduction to 3D curve plotting, together with the use of contour plots and projections for 2D representation of 3D data.
- Use of Experimental Design (factorial design) to plan and execute experiments.
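As a rough illustration of the statistics and data-selection topics listed above, the following minimal sketch uses NumPy. The titration volumes and the selection window are hypothetical values chosen only for the example; they are not course data.

```python
import numpy as np

# Hypothetical replicate titration volumes in mL (illustrative values only)
volumes = np.array([24.9, 25.1, 25.0, 25.3, 24.8, 25.1, 27.9])

# Slicing: the first three readings
first_three = volumes[:3]

# Logical operators: keep only readings inside an assumed window of 24-26 mL
mask = (volumes > 24.0) & (volumes < 26.0)
selected = volumes[mask]

mean = selected.mean()
median = np.median(selected)
values, counts = np.unique(selected, return_counts=True)
mode = values[counts.argmax()]          # most frequent value
std = selected.std(ddof=1)              # sample standard deviation
sem = std / np.sqrt(selected.size)      # standard error of the mean
rsd = 100 * std / mean                  # relative standard deviation in %

print(f"mean = {mean:.3f} mL, median = {median:.3f} mL, mode = {mode:.1f} mL")
print(f"s = {std:.3f} mL, SEM = {sem:.3f} mL, RSD = {rsd:.2f} %")
```

Note that `ddof=1` gives the sample standard deviation; NumPy's default (`ddof=0`) gives the population value instead.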
In the second part of the course, you will carry out a number of pass/fail and assessed assignments that make use of the concepts introduced in the earlier part of the course. These assignments will illustrate the application of the methods we have learned to common problems in chemistry, including the analysis of kinetic data, fitting of chemical data to known physical models, statistical analysis of chemical data, and the planning of chemistry-based experiments.
The assessed assignments will also include a laboratory-based component, where you will be required to gather experimental data for further analysis in Python.
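To give a flavour of what fitting kinetic data to a known physical model can look like, here is a minimal sketch using SciPy's curve_fit. The first-order rate law, the parameter values and the noise level are assumptions made for the example, and the data are synthetic rather than measured.

```python
import numpy as np
from scipy.optimize import curve_fit


def first_order(t, c0, k):
    """First-order decay model: c(t) = c0 * exp(-k * t)."""
    return c0 * np.exp(-k * t)


# Synthetic data generated from assumed parameters (not measured values)
rng = np.random.default_rng(0)
t = np.linspace(0, 100, 21)                    # time / s
c_true = first_order(t, 1.0, 0.03)             # c0 = 1.0 mol/L, k = 0.03 1/s
c_obs = c_true + rng.normal(0, 0.01, t.size)   # add small Gaussian noise

# Nonlinear least-squares fit, starting from a rough initial guess
popt, pcov = curve_fit(first_order, t, c_obs, p0=[1.0, 0.01])
c0_fit, k_fit = popt
c0_err, k_err = np.sqrt(np.diag(pcov))         # one-sigma parameter uncertainties

print(f"c0 = {c0_fit:.3f} ± {c0_err:.3f} mol/L")
print(f"k  = {k_fit:.4f} ± {k_err:.4f} 1/s")
```

The square roots of the diagonal of the covariance matrix give an estimate of the uncertainty in each fitted parameter, which is usually worth reporting alongside the value itself.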
Learning outcome
On successful completion of the course you should be comfortable using Python for simple analysis and plotting of experimental data. In particular, you will be able to:
- Use Python as a scripting language for analysis and plotting.
- Use Jupyter notebooks as an interactive scripting environment.
- Understand and use common data types in Python, in both scalar and array format.
- Extract basic statistical information from data arrays, including mean, median, mode, standard deviation, and relative standard deviation.
- Use linear and non-linear fitting methods to fit data to simple functions.
- Generate clean, labelled plots in 2D and 3D with appropriate units.
- Understand how the above techniques can be applied to common problems and tasks in physical, organic, inorganic and analytical chemistry.
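As one possible illustration of the linear fitting outcome above, the sketch below fits a straight line to a hypothetical calibration data set with NumPy's polyfit, with and without weights. The concentrations, absorbances and uncertainties are invented for the example.

```python
import numpy as np

# Hypothetical calibration data (illustrative values only)
conc = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])              # concentration / mM
absorbance = np.array([0.01, 0.21, 0.39, 0.62, 0.80, 1.01])
sigma = np.array([0.01, 0.01, 0.01, 0.02, 0.02, 0.03])       # assumed uncertainties

# Unweighted straight-line fit; polyfit returns [slope, intercept]
slope_u, intercept_u = np.polyfit(conc, absorbance, 1)

# Weighted fit: for Gaussian uncertainties, polyfit expects w = 1/sigma
slope_w, intercept_w = np.polyfit(conc, absorbance, 1, w=1 / sigma)

# Over-fitting: a degree-5 polynomial passes through all six points,
# so its residuals vanish even though the model has no physical meaning
overfit = np.polyfit(conc, absorbance, 5)
residuals = absorbance - np.polyval(overfit, conc)

print(f"unweighted: slope = {slope_u:.4f}, intercept = {intercept_u:.4f}")
print(f"weighted:   slope = {slope_w:.4f}, intercept = {intercept_w:.4f}")
print(f"largest degree-5 residual = {np.abs(residuals).max():.2e}")
```

Residuals that are essentially zero are a warning sign here rather than a success: the high-order polynomial is reproducing the noise in the data as well as the trend.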
Learning methods and activities
The course will be delivered through a combination of lectures and Jupyter notebook exercises. The notebook exercises are compulsory, and all of them must be approved before you can access the examined assignments. The examined assignments will take the form of two mixed experimental/computational exercises: the first focuses on the acquisition and analysis of kinetic data, and the second on the use and application of factorial design in planning chemical experiments.
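For orientation, a full two-level factorial design simply lists every combination of the chosen factor levels. The sketch below generates such a run table using only the Python standard library; the factor names and levels are hypothetical and serve purely as an example.

```python
from itertools import product

# Hypothetical factors, each with a low and a high level
factors = {
    "temperature_C": (25, 60),
    "catalyst_mol_percent": (1, 5),
    "time_min": (30, 120),
}

# Full two-level factorial design: every combination of levels (2**3 = 8 runs)
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for number, run in enumerate(runs, start=1):
    print(number, run)
```

In practice the run order would normally be randomised before carrying out the experiments; that step is omitted here to keep the sketch short.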
The total workload is 200 hours, including lectures, problem sets, and self-study.
Compulsory assignments
- Practical training 1
- Practical training 2
- Practical training 3
- Practical training 4
- Practical training 5
- Practical training 6
Further on evaluation
Notebook exercises
Two assessed experimental/computational assignments
Recommended previous knowledge
Assumed previous knowledge: high-school level maths.
Required previous knowledge
There are no requirements for admission to the course.
Course materials
A series of interactive Jupyter notebooks will be provided.
Subject areas
- Chemistry