2019 IEEE International Conference on Visual Communications and Image Processing (VCIP)

December 1-4, 2019 • Sydney, AUSTRALIA

Tutorial - Deep learning for visual computing and image processing

Time: 10:30-17:30, Sunday, 1 December
Location: Broadway Room, Aerial UTS Function Centre

During the past few years, deep learning techniques have achieved great success in various computer vision tasks, such as recognition, detection, and tracking. Compared to traditional AI models, deep learning techniques are generally much more powerful owing to their impressive representation capability. Some state-of-the-art deep learning models can even surpass human-level performance in visual computation and image processing. Deep learning is now shaping the future of AI, as well as of human society.

In this tutorial, we will introduce the fundamental deep learning techniques for visual computation and image processing and review recent progress in different sub-areas of deep learning. First, we will discuss the general development of deep learning and AI. Some representative advancements will be introduced to illustrate the path from perceiving to learning, reasoning and behaving, giving a broad view of current deep learning techniques and the challenges that lie ahead. Second, we will discuss the domain shift problem in deep learning and introduce deep domain adaptation approaches that can tackle this problem effectively. Next, we will focus on the structure in visual data, which can provide rich information to help improve the performance of deep models. Recent progress in using deep learning to model the structure in visual data will be reviewed and discussed in detail. Following this, we will discuss the concept of generative adversarial networks (GANs) in deep learning. Different adversarial losses and adversarial strategies that can stabilize the training of GANs will be analysed and discussed. Then we will introduce recent progress in deep learning based person re-identification methods. In particular, an analysis of studies on supervised and unsupervised person re-id will be presented in detail. Lastly, we will introduce how generic object detection is implemented using deep learning techniques. We will describe the most representative deep learning-based object detectors proposed in the last few years. To sum up, in this tutorial, we will introduce the latest deep learning technologies for visual computation and image processing with in-depth analysis and discussion.


  • Lecture 1 - The recent progress of AI (by Dacheng Tao)
    Since the concept of the Turing machine was first proposed in 1936, the capability of machines to perform intelligent tasks has kept growing exponentially. Artificial Intelligence (AI), as an essential accelerator, pursues the target of making machines as intelligent as human beings. It has already reformed how we live, work, learn, discover and communicate. In this talk, I will review our recent progress on AI by introducing some representative advancements from algorithms to applications, and illustrate the steps towards its realization, from perceiving to learning, reasoning and behaving. To push AI from the narrow to the general, many challenges lie ahead. I will bring some examples out into the open and shed light on our future target. Today, we teach machines how to be as intelligent as we are. Tomorrow, they will be our partners in daily life.
  • Lecture 2 - Deep domain adaptation (by Dong Xu)
    In many visual recognition tasks, the training data used to learn a model and the testing data to which the model is applied often have different distributions. To enhance the generalization capability of the learnt models on the testing data, domain adaptation can be applied to visual computing tasks. In this lecture, we introduce deep domain adaptation approaches for explicitly reducing the data distribution mismatch between the training samples in the source domain and the testing samples in the target domain.
  • Lecture 3 - Structured deep learning for visual computing (by Wanli Ouyang)
    Structure in data provides rich information that helps to reduce the complexity and improve the effectiveness of a model. In this lecture, an introduction will be given to recent progress in using deep learning as a tool for modelling the structure in visual data. We show that observations about the problem at hand are useful in modelling the structure of deep models and help to improve their effectiveness for many visual computing problems.
  • Lecture 4 - Stabilizing GANs training via evolutionary strategy (by Chaoyue Wang)
    In this lecture, we will introduce adversarial strategies for training generative adversarial networks (GANs). First, we review the adversarial losses of existing GANs. Then, the advantages and disadvantages of different adversarial losses will be analyzed and discussed. In the end, we demonstrate the limitations of two-player adversarial games and further introduce the concept of an evolutionary adversarial learning strategy, which aims to achieve better training stability and generative performance.
  • Lecture 5 - Deep learning for person re-identification (by Jingya Wang)
    In this lecture, we will introduce deep learning based person re-identification methods for public surveillance. First, we will introduce the main challenges of surveillance data analysis and the person re-identification pipeline for real-world applications. Then we will analyse recent work on supervised person re-id and unsupervised/domain adaptation person re-id. In the end, we will discuss the relationship between, and the future of, person re-identification (ReID) and multi-target multi-camera tracking (MTMCT) research.
  • Lecture 6 - Generic Object Detection in the Deep Learning Era (by Zhe Chen)
    In this lecture, we will discuss recent progress in deep learning-based generic object detection. First, we will discuss the background, formulation, and some applications of generic object detection. Next, we will discuss the two major types of deep learning-based object detection frameworks: single-stage and two-stage detection frameworks. Lastly, we will discuss some potential directions for next-generation deep learning-based generic object detectors.

Organizers & Presenters

Dacheng Tao


University of Sydney


Dacheng Tao (F’15) is Professor of Computer Science and ARC Laureate Fellow in the School of Computer Science and the Faculty of Engineering and Information Technologies, and the Inaugural Director of the UBTECH Sydney Artificial Intelligence Centre, at the University of Sydney. He mainly applies statistics and mathematics to Artificial Intelligence and Data Science. His research results have been expounded in one monograph and 200+ publications in prestigious journals and at prominent conferences, such as IEEE T-PAMI, T-IP, T-NNLS, IJCV, JMLR, NIPS, ICML, CVPR, ICCV, ECCV, ICDM, and ACM SIGKDD, with several best paper awards, such as the best theory/algorithm paper runner-up award at IEEE ICDM’07, the best student paper award at IEEE ICDM’13, the distinguished paper award at the 2018 IJCAI, the 2014 ICDM 10-year highest-impact paper award, and the 2017 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the Australian Academy of Science, AAAS, IEEE, IAPR, OSA and SPIE.

Dong Xu


University of Sydney


Dong Xu is Chair in Computer Engineering in the School of Electrical and Information Engineering, The University of Sydney, Australia. He has published more than 100 papers in IEEE Transactions and top-tier conferences. His co-authored work received the Best Student Paper Award at CVPR 2010. He is on the editorial boards of the IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), the IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), and the IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT). He is a Fellow of the IEEE.

Wanli Ouyang

Senior Lecturer

University of Sydney


Dr. Wanli Ouyang received the PhD degree from the Department of Electronic Engineering, the Chinese University of Hong Kong. He is now a senior lecturer at the University of Sydney. His research interests include image processing, computer vision and pattern recognition. He received the best reviewer award of ICCV. He organized workshops at ECCV 2018, CVPR 2017 and ACCV 2014, and gave a tutorial at ACCV 2016. He served as an area chair of the International Conference on Pattern Recognition (ICPR) 2018. He is a senior member of the IEEE.

Chaoyue Wang

Postdoc Researcher

University of Sydney


Chaoyue Wang is a postdoctoral researcher in Machine Learning and Computer Vision at the School of Computer Science, The University of Sydney. He received a bachelor's degree from Tianjin University (TJU), China, and a Ph.D. degree from the University of Technology Sydney (UTS), Australia. His research interests mainly include machine learning, deep learning, and generative models. He received the Distinguished Student Paper Award at the 2017 International Joint Conference on Artificial Intelligence (IJCAI).

Jingya Wang

Research Fellow

University of Sydney


Jingya Wang received the B.S. degree from Swinburne University of Technology, Australia, and the Ph.D. degree from Queen Mary University of London, UK, both in computer science. She is currently a research fellow at the University of Sydney. Her research interests include computer vision, human-centred visual understanding and surveillance data analysis. She received the CVPR Doctoral Consortium Award, and one of her works was selected as a Best of CVPR 2018 paper by Computer Vision News magazine. She has published in several top-tier conferences and journals, including ICCV, CVPR, AAAI and Artificial Intelligence.

Zhe Chen

Ph.D. candidate

University of Sydney


Zhe Chen received the B.S. degree in Computer Science from the University of Science and Technology of China, Hefei, China, in 2014. He is currently pursuing the Ph.D. degree at the UBTECH Sydney Artificial Intelligence Centre, the University of Sydney. His research interests include object recognition, pedestrian detection, road detection, visual object tracking, and deep learning. His studies have been published at CVPR and ECCV.


Wanli Ouyang (wanli.ouyang@sydney.edu.au)
