Course - Data Science and Machine Learning in Natural Sciences - KP8907
Data Science and Machine Learning in Natural Sciences
New from the academic year 2025/2026
About the course
Course content
This course offers a concise yet comprehensive introduction to the integration and utility of ML in the natural sciences. Participants will learn about clustering and dimensionality reduction to group and visualize high-dimensional data, machine learning methods such as Quantitative Structure-Activity Relationship (QSAR) modeling for molecular property prediction and classification, physics-informed methods for combining physical principles with machine learning, and surrogate models to reduce model run time. By combining theory with practical examples, the course equips attendees with the skills and knowledge to apply machine learning effectively in their respective fields. Additionally, students will develop an understanding of the relationship between ML principles and traditional fields such as mathematics, chemometrics, and automatic control, thereby demystifying machine learning and providing a clear path to its practical application in the natural sciences.
Learning outcome
Learning Outcomes:
- Develop advanced machine learning methodologies by leveraging domain-specific context, ensuring their theoretical soundness and practical applicability in addressing scientific challenges within natural sciences.
- Apply machine learning techniques to scientific research problems, effectively combining data-driven models with physical laws to generate new insights and extend current scientific knowledge.
Learning Objectives - After completing this course, you will be able to:
- Program in Python or Julia within a Jupyter Notebook environment.
- Load CSV files into Python Data Frames, visualize variables, select columns, summarize data, and fill in missing data.
- Solve supervised learning tasks like regression and classification.
- Solve unsupervised learning tasks, such as clustering, dimensionality reduction, and density estimation.
- Understand and use the mathematical structure of deep neural networks and their applications to regression and classification.
- Understand basic automatic control theory, apply it in data science and machine learning contexts, and set up simple control structures.
- Develop and apply models that combine physical knowledge with machine learning to enhance predictive power.
- Formulate and solve the molecular property prediction problem as a regression/classification task.
- Formulate and solve time series forecasting problems as regression tasks.
Learning methods and activities
Problem-driven learning sessions take place weekly (4 hours), with associated exercises and guidance (2 hours).
Compulsory assignments
- Compulsory exercises
Further on evaluation
The report and presentation from the project work count for 100% of the grade. Six exercises must be approved in order to receive a grade.
In the event of a postponed examination (re-sit examination), the written examination may be changed to an oral examination.
Specific conditions
Admission to a programme of study is required:
Chemical Engineering (PHKJPROS)
Course materials
Data Analysis with Python online book - For Modules 1, 2 and 3, Chapters 2 to 7.
Multivariable Feedback Control: Analysis and Design, 2nd Edition, by Sigurd Skogestad and Ian Postlethwaite
Applications of Deep Neural Networks with Keras by Jeff Heaton - For deep learning, Modules and 10.
Parallel Computing and Scientific Machine Learning (SciML): Methods and Applications
Subject areas
- Technological subjects