TDT4265

Computer Vision and Deep Learning

Credits 7.5

Level Second degree level

Course start Spring 2023

Duration 1 semester

Language of instruction English

Location Trondheim

Examination arrangement Aggregate score

About the course

Course content

Computer vision is the field of computer science that focuses on creating digital systems that can process, analyze, and make sense of visual data (images, videos, point clouds, etc.) in the same way that humans do. The human visual system is great at interpreting visual content, and a decade ago the general belief was that a computer would never beat a human at this. Modern computer vision (CV) is built on the deep learning revolution, in which machines learn to interpret visual content from data, and today we see superhuman performance in many fields. CV has become one of the hottest research fields and is the key enabling technology for everything from self-driving cars to automated medical image analysis. The focus of this course is modern CV, while still giving you a small glimpse of how things looked prior to the introduction of the famous AlexNet in 2012, as well as of some CV-related tasks where more traditional methods still outperform the new data-driven methods. Some of the topics covered in the course are as follows:

CV fundamentals, including various CV tasks.
DL basics: neurons, weights, biases, transfer functions, and cost/loss functions.
FCNN and CNN: Forward and Backward pass.
Generalization and Overfitting, GD and Optimizers, Parameter Initialization and Hyperparameter Tuning, Data and Batch Normalization, Regularization Methods, and Data Augmentation.
Image Classification and Backbones
Object Detection and Tracking
Segmentation (semantic, instance and panoptic) and Depth Estimation.
Vision Transformers for various CV tasks.
Generative Models (VAEs and GANs)
Deep RL (state defined based on visual input)
The use of simulators for training and verifying models.
Augmented / Mixed Reality
The CV landscape prior to 2012
CV methods that have not been replaced by DL-based methods.

Learning outcome

A student taking this course should have:

an overall view of the CV field and its increasing importance in society,
an in-depth understanding of several fundamental techniques in modern CV, including the underlying concepts of learning from data,
a thorough understanding of the underlying principles and the mathematical foundations of deep learning (DL), as well as the practical skills and know-how to use modern DL frameworks like PyTorch to develop solutions for key CV tasks like image classification, object detection and tracking, and dense prediction tasks like segmentation and depth estimation,
knowledge about key datasets in the field, well-known and state-of-the-art architectures for various CV tasks, and an overview of the metrics needed to assess the quality of, and compare, different DL-based CV models,
the skills to design and construct advanced CV modules that function within a system to achieve the vision system's goals,
a good understanding of how CV relates to key application domains like autonomy/robotics (vehicles, drones and ships) and medical image analysis,
the knowledge and skills needed for CV-related jobs in industry or the public sector, as well as for future doctoral research within the field.

Learning methods and activities

Lectures, compulsory assignments, and a real-world mini project. Lectures will be given in English. Developing practical skills (tools, key DL-frameworks etc.) is an important part of the course.

Compulsory assignments

Exercises

Further on evaluation

The final grade is based on two parts: a real-world mini-project (40%) and a digital school exam (60%). Each part is assigned a letter grade, and the two grades are then weighted and combined to form the final letter grade in the course. Both parts must be passed individually in order to pass the course.

The examination papers will be given in English only. If there is a re-sit examination, the examination form may change from written to oral.

Traditional assignments are considered compulsory activity, and a certain amount of this work must be approved before a student is allowed to take the exam.

Recommended previous knowledge

TDT4195 Visual Computing fundamentals or equivalent.

Course materials

Book: Neural Networks and Deep Learning, Michael Nielsen (online)
Book: Deep Learning, Ian Goodfellow et al. (online)
Supplementary material will be handed out as needed.

Credit reductions

Course code	Reduction	From
SIF8066	7.5 credits

This course has academic overlap with the course in the table above. If you take overlapping courses, you will receive a credit reduction in the course where you have the lowest grade. If the grades are the same, the reduction will be applied to the course completed most recently.

Subject areas

Informatics
Technological subjects

Contact information

Course coordinator

Gabriel Hanssen Kiss

Department with academic responsibility

Department of Computer Science

Examination

Examination arrangement: Aggregate score

Grade: Letter grades

Ordinary examination - Spring 2023

School exam

Weighting 60/100
Examination aids Code D
Date 2023-05-24
Time 09:00
Duration 4 hours
Exam system Inspera Assessment

Place and room for school exam

The specified room can be changed and the final location will be ready no later than 3 days before the exam. You can find your room location on Studentweb.

Sluppenvegen 14

Room SL311 grønn sone

68 candidates

Room SL311 orange sone

15 candidates

Room SL520

47 candidates

Assignment

Weighting 40/100
Submission date 2023-04-21
Submission time 14:00

Re-sit examination - Summer 2023

School exam

Weighting 60/100
Examination aids Code D
Duration 4 hours
Exam system Inspera Assessment
Place and room Not specified yet.
