====== Feroz Ahmed Siddiky ======
| {{:wiki:siddiky.jpg?0x180}} ||||
|::: ||Research Staff\\ \\ ||
|:::|Tel: |+49 -421 218 64028|
|:::|Fax: |+49 -421 218 64047|
|:::|Room: |TAB 1.81|
|:::| ||
  
==== Deep Action Observer ====

Robotic agents have to learn how to perform manipulation tasks. One of the biggest challenges in this context is that manipulation actions are performed in a variety of ways depending on the objects that the robot acts on, the tools it is using, the task context, as well as the scene the action is to be executed in. This raises the question of when to perform a manipulation action in which way. In this paper we propose to let the robot read text instructions and watch the corresponding videos illustrating how the steps are performed, in order to generate symbolic action descriptions from the text instructions. The text instructions are disambiguated and completed with the information contained in the videos. The resulting action descriptions are close to action descriptions that can be executed by leading-edge cognition-enabled robot control plans. To perform this learning task we combine two of the most powerful learning and reasoning mechanisms: Deep Learning and Markov Logic Networks. Convolutional networks parameterized through deep learning recognize objects and hand poses and estimate poses and motions, while the Markov Logic Networks use the joint probability over the relational structure of the instructions to fill in missing information and disambiguate descriptions. Besides the combination of symbolic and sub-symbolic reasoning, the novel contributions include a Multi Task Network developed in a single framework, optimized for computational cost, which can process 10 frames per second. We evaluate our framework on a large number of video clips and show its ability to interpret the manipulation tasks.
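To make the Markov Logic side of this concrete, here is a minimal, self-contained Python sketch of the underlying log-linear model: candidate completions of an underspecified instruction step are scored by exponentiating the weights of satisfied formulas and normalizing. The predicates, weights, and detection scores below are invented for illustration and are not taken from the paper.

<code python>
import math

# Toy question: which tool completes the underspecified instruction "flip the pancake"?
TOOLS = ["spatula", "fork", "hand"]

# Hypothetical per-frame evidence from the vision networks (assumed scores, not real output).
detection_score = {"spatula": 0.9, "fork": 0.2, "hand": 0.4}

def world_score(tool):
    """Sum of weights of the MLN formulas satisfied in the world where UsesTool(flip, tool) holds."""
    score = 0.0
    # Weighted unary formula encoding a prior: flipping is usually done with a spatula.
    if tool == "spatula":
        score += 1.5
    # Evidence formula: a detected tool makes the corresponding completion more likely.
    score += 2.0 * detection_score[tool]
    return score

# MLN log-linear model: P(world) is proportional to exp(sum of satisfied formula weights).
Z = sum(math.exp(world_score(t)) for t in TOOLS)
posterior = {t: math.exp(world_score(t)) / Z for t in TOOLS}

for tool, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"UsesTool(flip, {tool}): {p:.3f}")
</code>

In a real MLN the formulas are first-order and grounded over all instruction steps and detected entities, but the scoring rule is the same weighted-satisfaction model shown here.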
  
[1] Feroz Ahmed Siddiky and Michael Beetz, "DeepActionObserver: Refining Instructions for Manipulation Actions by Watching Instruction Videos", https://www.dropbox.com/s/60fweieljn9pbky/deep-action-observer.pdf?dl=0
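As a rough illustration of the shared-backbone idea behind a Multi Task Network (one convolutional trunk feeding several task heads, so the per-frame cost is paid once), a hypothetical PyTorch layout might look as follows. Layer sizes, head dimensions, and task outputs are assumptions for the sketch, not the architecture from [1].

<code python>
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared convolutional backbone with three task-specific heads (illustrative only)."""
    def __init__(self, n_objects=20, n_hand_poses=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.object_head = nn.Linear(64, n_objects)    # object recognition
        self.hand_head = nn.Linear(64, n_hand_poses)   # hand-pose classification
        self.pose_head = nn.Linear(64, 6)              # 6-DoF pose regression

    def forward(self, frames):
        features = self.backbone(frames)               # computed once per frame
        return self.object_head(features), self.hand_head(features), self.pose_head(features)

net = MultiTaskNet()
frames = torch.randn(1, 3, 128, 128)                   # one RGB frame
objects, hands, pose = net(frames)
print(objects.shape, hands.shape, pose.shape)
</code>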



