TDT4265

Computer Vision and Deep Learning

Credits 7.5
Level Second degree level
Course start Spring 2023
Duration 1 semester
Language of instruction English
Location Trondheim
Examination arrangement Aggregate score

About the course

Course content

Computer vision is the field of computer science that focuses on creating digital systems that can process, analyze, and make sense of visual data (images, videos, point clouds, etc.) in the same way that humans do. The human visual system is very good at interpreting visual content, and a decade ago the prevailing view was that a computer would never beat a human at this. Modern computer vision (CV) is built on the deep learning revolution, where machines learn to interpret visual content from data, and today we see super-human performance in many fields. CV has become one of the most active research fields and is the key enabling technology for everything from self-driving cars to automated medical image analysis. The focus of this course is modern CV, while still giving you a small glimpse of what the field looked like prior to the introduction of the famous AlexNet in 2012, as well as some CV-related tasks where more traditional methods still outperform the new data-driven methods. Some of the topics covered in the course are as follows:

  • CV fundamentals, including various CV tasks.
  • DL basics: neurons, weights, biases, transfer functions and cost/loss functions.
  • FCNN and CNN: forward and backward pass.
  • Generalization and overfitting, GD and optimizers, parameter initialization and hyperparameter tuning, data and batch normalization, regularization methods, and data augmentation.
  • Image classification and backbones.
  • Object detection and tracking.
  • Segmentation (semantic, instance and panoptic) and depth estimation.
  • Vision Transformers for various CV tasks.
  • Generative models (VAEs and GANs).
  • Deep RL (state defined based on visual input).
  • The use of simulators for training and verifying models.
  • Augmented / Mixed Reality.
  • The CV landscape prior to 2012.
  • CV methods that have not been replaced by DL-based methods.

Learning outcome

A student taking this course should have:

  • an overall view of the CV field and its increasing importance in society,
  • an in-depth understanding of several fundamental techniques in modern CV, including the underlying concepts of learning from data,
  • a thorough understanding of the underlying principles and the mathematical foundations of deep learning (DL), as well as the practical skills and know-how to use modern DL frameworks like PyTorch to develop solutions for key CV tasks like image classification, object detection and tracking, and dense prediction tasks like segmentation and depth estimation,
  • knowledge about key datasets in the field, well-known and state-of-the-art architectures for various CV tasks, and an overview of the metrics needed to assess the quality of, and compare, different DL-based CV models,
  • the skills to design and construct advanced CV modules that function within a system to achieve the vision system's goals,
  • a good understanding of how CV relates to key application domains like autonomy/robotics (vehicles, drones and ships) and medical image analysis,
  • the knowledge and skills needed for CV-related jobs in industry or the public sector, as well as for future doctoral research within the field.

Learning methods and activities

Lectures, compulsory assignments, and a real-world mini-project. Lectures will be given in English. Developing practical skills (tools, key DL frameworks, etc.) is an important part of the course.

Compulsory assignments

  • Exercises

Further on evaluation

The final grade is based on two parts: a real-world mini-project (40%) and a digital school exam (60%). Each part is assigned a letter grade, and the two grades are then weighted and combined to form the final letter grade in the course. Both parts must be passed individually in order to pass the course.

The examination papers will be given in English only. If there is a re-sit examination, the examination form may change from written to oral.

Traditional assignments are considered compulsory activities, and a certain amount of this work must be approved before a student is allowed to take the exam.

Course materials

  • Book: Neural Networks and Deep Learning, Michael Nielsen (online)
  • Book: Deep Learning, Ian Goodfellow et al. (online)
  • Supplementary material will be handed out as needed.

Credit reductions

Course code Reduction From
SIF8066 7.5 credits
This course has academic overlap with the course in the table above. If you take overlapping courses, you will receive a credit reduction in the course where you have the lowest grade. If the grades are the same, the reduction will be applied to the course completed most recently.

Subject areas

  • Informatics
  • Technological subjects

Contact information

Course coordinator

Department with academic responsibility

Department of Computer Science

Examination

Examination arrangement: Aggregate score
Grade: Letter grades

Ordinary examination - Spring 2023

School exam
Weighting 60/100
Examination aids Code D
Date 2023-05-24
Time 09:00
Duration 4 hours
Exam system Inspera Assessment
Place and room for school exam

The specified room may change, and the final location will be available no later than 3 days before the exam. You can find your room location on Studentweb.

Sluppenvegen 14
Room SL311 grønn sone
68 candidates
Room SL311 orange sone
15 candidates
Room SL520
47 candidates
Assignment
Weighting 40/100
Submission date 2023-04-21
Submission time 14:00

Re-sit examination - Summer 2023

School exam
Weighting 60/100
Examination aids Code D
Duration 4 hours
Exam system Inspera Assessment
Place and room Not specified yet.