Course - Computer Vision and Deep Learning - TDT4265
TDT4265 - Computer Vision and Deep Learning
Examination arrangement
Examination arrangement: Aggregate score
Grade: Letter grades
Evaluation | Weighting | Duration | Grade deviation | Examination aids |
---|---|---|---|---|
Assignment | 40/100 | | | |
School exam | 60/100 | 4 hours | | D |
Course content
Computer vision is the field of computer science that focuses on creating digital systems that can process, analyze, and make sense of visual data (images, videos, point clouds, etc.) in the same way that humans do. The human visual system is very good at interpreting visual content, and a decade ago the general view was that a computer would never beat a human at this. Modern computer vision (CV) is built on the deep learning revolution, in which machines learn to interpret visual content from data, and today we see super-human performance in many fields. CV has become one of the most active research fields and is a key enabling technology for everything from self-driving cars to automated medical image analysis. The focus of this course is modern CV, while still giving you a small glimpse of how things looked prior to the introduction of the famous AlexNet in 2012, as well as some CV-related tasks where more traditional methods still outperform the new data-driven methods. Some of the topics covered in the course are as follows:
- CV fundamentals, incl. various CV tasks.
- DL basics: neurons, weights, biases, transfer functions and cost/loss functions.
- FCNN and CNN: Forward and Backward pass.
- Generalization and Overfitting, GD and Optimizers, Parameter Initialization and Hyperparameter Tuning, Data and Batch Normalization, Regularization methods, and Data Augmentation.
- Image Classification and Backbones
- Object Detection and Tracking
- Segmentation (semantic, instance and panoptic) and Depth Estimation.
- Vision Transformers for various CV tasks.
- Generative Models (VAEs and GANs)
- Deep RL (state defined based on visual input)
- The use of simulators for training and verifying models.
- Augmented / Mixed Reality
- The CV landscape prior to 2012
- CV methods that have not been replaced by DL-based methods.
Learning outcome
A student taking this course should have:
- an overall view of the CV field and its increasing importance in society,
- an in-depth understanding of several fundamental techniques in modern CV, including the underlying concepts of learning from data,
- a thorough understanding of the underlying principles and the mathematical foundations of deep learning (DL), as well as the practical skills and know-how to use modern DL frameworks like PyTorch to develop solutions for key CV tasks such as image classification, object detection and tracking, and dense prediction tasks like segmentation and depth estimation,
- knowledge of key datasets in the field, well-known and state-of-the-art architectures for various CV tasks, and an overview of the metrics needed to assess and compare the quality of different DL-based CV models,
- the skills to design and construct advanced CV modules that function within a system to achieve the vision system's goals,
- a good understanding of how CV relates to key application domains like autonomy/robotics (vehicles, drones and ships) and medical image analysis,
- the knowledge and skills needed for CV-related jobs in industry or the public sector, as well as for future doctoral research within the field.
Learning methods and activities
Lectures, compulsory assignments, and a real-world mini project. Lectures will be given in English. Developing practical skills (tools, key DL-frameworks etc.) is an important part of the course.
Compulsory assignments
- Exercises
Further on evaluation
The final grade is based on two parts: a real-world mini-project (40%) and a digital school exam (60%). Each part is assigned a letter grade, and the two grades are then weighted and combined to form the final letter grade in the course. Both parts must be passed individually in the same semester in order to pass the course.
If a student decides to retake the course for grade improvement or if the student failed the course, then they have to redo both parts of the course.
The examination papers will be given in English only. If there is a re-sit examination, the examination form may change from written to oral.
Traditional assignments are considered a compulsory activity, and a certain amount of this work must be approved before a student is allowed to attend the exam.
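As an illustration of the 40/60 weighting described above, the following is a minimal sketch of how two letter grades could be combined. The course page does not state the conversion rule, so the letter-to-number mapping (A=5 down to F=0), the rounding to the nearest whole grade point, and the function name `final_grade` are all assumptions made for this example, not the official procedure.

```python
# Hypothetical letter-to-number mapping; the course page does not specify one.
GRADE_POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "F": 0}
POINTS_TO_GRADE = {points: letter for letter, points in GRADE_POINTS.items()}


def final_grade(project: str, exam: str,
                project_pct: int = 40, exam_pct: int = 60) -> str:
    """Combine the mini-project and exam letter grades into one letter grade."""
    # Both parts must be passed individually, so a single F fails the course.
    if project == "F" or exam == "F":
        return "F"
    # Weighted sum in hundredths of a grade point (integer arithmetic).
    weighted = project_pct * GRADE_POINTS[project] + exam_pct * GRADE_POINTS[exam]
    # Assumed rounding rule: nearest whole grade point, halves rounded up.
    rounded = (weighted + 50) // 100
    return POINTS_TO_GRADE[rounded]


if __name__ == "__main__":
    for project, exam in [("A", "C"), ("C", "A"), ("B", "D"), ("A", "F")]:
        print(project, exam, "->", final_grade(project, exam))
```

Under these assumptions the exam grade dominates only slightly: an A on the project with a C on the exam and the reverse combination both round to the same final grade.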
Recommended previous knowledge
TDT4195 Visual Computing Fundamentals or equivalent.
Course materials
- Book: Neural Networks and Deep Learning, Michael Nielsen (online)
- Book: Deep Learning, Ian Goodfellow et al. (online)
- Supplementary material will be handed out as needed.
Credit reductions
Course code | Reduction | From | To |
---|---|---|---|
SIF8066 | 7.5 | | |
No
Version: 1
Credits: 7.5 SP
Study level: Second degree level
Term no.: 1
Teaching semester: SPRING 2024
Language of instruction: English
Location: Trondheim
- Informatics
- Technological subjects
Department with academic responsibility
Department of Computer Science
Examination
Examination arrangement: Aggregate score
Term | Status code | Evaluation | Weighting | Examination aids | Date | Time | Examination system | Room * |
---|---|---|---|---|---|---|---|---|
Spring | ORD | School exam | 60/100 | D | | | INSPERA | |
Spring | ORD | Assignment | 40/100 | | | | INSPERA | |
Summer | UTS | School exam | 60/100 | D | | | INSPERA | |
* The location (room) for a written examination is published 3 days before the examination date. If more than one room is listed, you will find your room at Studentweb.
For more information regarding registration for examination and examination procedures, see "Innsida - Exams"