TDT4265 - Computer Vision and Deep Learning

About

Examination arrangement

Examination arrangement: Aggregate score
Grade: Letter grades

Evaluation    Weighting  Duration  Grade deviation  Examination aids
Assignment    40/100
School exam   60/100     4 hours                    D

Course content

Computer vision (CV) is the field of computer science concerned with building digital systems that can process, analyze, and make sense of visual data (images, videos, point clouds, etc.) in the way that humans do. The human visual system is very good at interpreting visual content, and a decade ago the prevailing view was that a computer would never beat a human at this. Modern CV is built on the deep learning revolution, in which machines learn to interpret visual content from data, and today we see super-human performance in many fields. CV has become one of the most active research fields and is a key enabling technology for everything from self-driving cars to automated medical image analysis. The course focuses on modern CV while still giving a brief glimpse of how things looked before the introduction of the famous AlexNet in 2012, as well as of some CV-related tasks where more traditional methods still outperform the new data-driven methods. Some of the topics covered in the course are as follows:

  • CV fundamentals, including various CV tasks.
  • DL basics: neurons, weights, biases, transfer functions and cost/loss functions.
  • FCNN and CNN: Forward and Backward pass.
  • Generalization and Overfitting, Gradient Descent (GD) and Optimizers, Parameter Initialization and Hyperparameter Tuning, Data and Batch Normalization, Regularization methods and Data Augmentation.
  • Image Classification and Backbones
  • Object Detection and Tracking
  • Segmentation (semantic, instance and panoptic) and Depth Estimation.
  • Vision Transformers for various CV tasks.
  • Generative Models (VAEs and GANs)
  • Deep RL (state defined based on visual input)
  • The use of simulators for training and verifying models.
  • Augmented / Mixed Reality
  • The CV landscape prior to 2012
  • CV methods that have not been replaced by DL-based methods.

Learning outcome

A student taking this course should have:

  • an overall view of the CV field and its increasing importance in society,
  • an in-depth understanding of several fundamental techniques in modern CV, including the underlying concepts of learning from data,
  • a thorough understanding of the underlying principles and mathematical foundations of deep learning (DL), as well as the practical skills and know-how to use modern DL frameworks like PyTorch to develop solutions for key CV tasks such as image classification, object detection and tracking, and dense prediction tasks like segmentation and depth estimation,
  • knowledge of key datasets in the field and of well-known and state-of-the-art architectures for various CV tasks, and an overview of the metrics needed to assess and compare different DL-based CV models,
  • the skills to design and construct advanced CV modules that function within a system to achieve the vision system's goals,
  • a good understanding of how CV relates to key application domains like autonomy/robotics (vehicles, drones and ships) and medical image analysis,
  • the knowledge and skills needed for CV-related jobs in industry or the public sector, as well as for future doctoral research within the field.

Learning methods and activities

Lectures, compulsory assignments, and a real-world mini-project. Lectures will be given in English. Developing practical skills (tools, key DL frameworks, etc.) is an important part of the course.

Compulsory assignments

  • Exercises

Further on evaluation

The final grade is based on two parts: a real-world mini-project (40%) and a digital school exam (60%). Each part is assigned a letter grade, and the two grades are then weighted and combined to form the final letter grade in the course. Both parts must be passed individually in order to pass the course.
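
As a rough illustration of the 40/60 weighting described above, the sketch below combines two letter grades into one. The numeric mapping (A=5 down to F=0), the rounding to the nearest whole grade point, and the function name `combine_grades` are assumptions made for this example only. The course description does not say how letter grades are converted or rounded, so the official regulations take precedence.

```python
# Assumed mapping from letter grades to points (not given in the course description).
GRADE_POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "F": 0}
POINTS_TO_GRADE = {points: letter for letter, points in GRADE_POINTS.items()}


def combine_grades(project: str, exam: str,
                   project_weight: int = 40, exam_weight: int = 60) -> str:
    """Combine the mini-project and exam grades (weights given out of 100)."""
    # Both parts must be passed individually in order to pass the course.
    if project == "F" or exam == "F":
        return "F"
    # Integer arithmetic avoids floating-point rounding surprises.
    weighted_x100 = project_weight * GRADE_POINTS[project] + exam_weight * GRADE_POINTS[exam]
    # Assumed rule: round to the nearest whole grade point.
    rounded = (weighted_x100 + 50) // 100
    return POINTS_TO_GRADE[rounded]


print(combine_grades("A", "C"))  # 0.4 * 5 + 0.6 * 3 = 3.8, which rounds to 4 -> B
print(combine_grades("B", "F"))  # a failed part fails the course -> F
```

Under these assumptions, a strong project cannot compensate for a failed exam, because the pass requirement is checked before any weighting is applied.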

The examination papers will be given in English only. If there is a re-sit examination, the examination form may change from written to oral.

Traditional assignments are considered compulsory activities, and a certain amount of this work must be approved before a student is allowed to sit the exam.

Specific conditions

Compulsory activities from a previous semester may be approved by the department.

Course materials

  • Book: Neural Networks and Deep Learning, Michael Nielsen (online)
  • Book: Deep Learning, Ian Goodfellow et al. (online)
  • Supplementary material will be handed out as needed.

Credit reductions

Course code  Reduction  From  To
SIF8066      7.5

Facts

Version: 1
Credits: 7.5 SP
Study level: Second degree level

Coursework

Term no.: 1
Teaching semester: SPRING 2023

Language of instruction: English

Location: Trondheim

Subject area(s)
  • Informatics
  • Technological subjects
Contact information
Course coordinator:

Department with academic responsibility
Department of Computer Science

Examination

Examination arrangement: Aggregate score

Term    Status code  Evaluation   Weighting  Examination aids  Date  Time  Examination system  Room *
Spring  ORD          School exam  60/100     D                             INSPERA
Spring  ORD          Assignment   40/100
Summer  UTS          School exam  60/100     D                             INSPERA
* The location (room) for a written examination is published 3 days before the examination date. If more than one room is listed, you will find your room in Studentweb.
Examination

For more information regarding registration for examination and examination procedures, see "Innsida - Exams"
