Image Understanding

Duration: 4 SWS
Lecturers: Prof. Michael Beetz, Ferenc Balint-Benczedi, Feroz Ahmed Siddiky, Thiemo Wiedemeyer, Jan-Hendrik Worch
Language: German, English
Schedule: Mon., 10:00 - 12:00, Room: TAB 1.58
Remarks: Lectures begin 18.04.2016

Organizational issues and materials can be found on our Stud.IP page.


The seminar addresses the challenges of semantic perception in the context of robotics, covering its various aspects. Students will first receive an overview of the field, followed by individual presentations and written reports on pre-defined topics.



Segmentation

Weakly supervised graph based semantic segmentation by learning communities of image-parts

Decision Making under Uncertain Segmentations


Features and Descriptors

KAZE Features + Fast Explicit Diffusion for Accelerated Features in Nonlinear Scale Spaces

B-SHOT: A Binary Feature Descriptor for Fast and Efficient Keypoint Matching on 3D Point Clouds

Rotation and Translation Invariant 3D Descriptor for Surfaces

Object Detection, Recognition and Tracking

Real-time Pose Detection and Tracking of Hundreds of Objects + SimTrack: A Simulation-based Framework for Scalable Real-time Object Pose Detection and Tracking

Surface Oriented Traverse for Robust Instance Detection in RGB-D

RGB-D Object Modelling for Object Recognition and Tracking

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Rich feature hierarchies for accurate object detection and semantic segmentation

Efficient RGB-D Object Categorization Using Cascaded Ensembles of Randomized Decision Trees

Robust 3D Tracking of Unknown Objects

Depth-Based Tracking with Physical Constraints for Robot Manipulation


Affordances

AfNet: The Affordance Network

Affordance Detection of Tool Parts from Geometric Features

Long-term human affordance maps

Deep Learning

Visualizing and Understanding Convolutional Networks

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation

Flowing ConvNets for Human Pose Estimation in Videos

Multimodal deep learning for robust RGB-D object recognition

RGB-D Object Recognition and Pose Estimation Based on Pre-Trained Convolutional Neural Network Features

Unsupervised Deep Learning

Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition

Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations

Sparse Feature Learning for Deep Belief Networks

Efficient sparse coding algorithms

Human Detection and Tracking

Automatic initialization for skeleton tracking in optical motion capture

Unsupervised robot learning to predict person motion

Pose estimation for a partially observable human body from RGB-D cameras

Real-time full-body human attribute classification in RGB-D using a tessellation boosting approach

Action Recognition

Learning symbolic representations of actions from human demonstrations

Fast Target Prediction of Human Reaching Motion for Cooperative Human-Robot Manipulation Tasks Using Time Series Classification

Effective 3D action recognition using EigenJoints

Sequence of the most informative joints (SMIJ): A new representation for human skeletal action recognition

Unsupervised Temporal Segmentation of Repetitive Human Actions Based on Kinematic Modeling and Frequency Analysis

sEMG-based decoding of detailed human intentions from finger-level hand motions

Human motion classification and recognition using whole-body contact force

Context-based intent understanding using an Activation Spreading architecture

A framework for unsupervised online human reaching motion recognition and early prediction

Human intention inference and motion modeling using approximate E-M with online learning


The following four papers count as one block, i.e., they must be presented together.

RoboSherlock: Unstructured Information Processing for Robot Perception

RoboSherlock: Unstructured Information Processing Framework for Robotic Perception

Pervasive 'Calm' Perception for Autonomous Robotic Agents

Perception for Everyday Human Robot Interaction