Building smart robots at covariant.ai (formerly Embodied Intelligence). We are hiring!

Before co-founding covariant.ai, I was a PhD student in EECS at UC Berkeley, advised by Pieter Abbeel. My research focused on Deep Learning, Reinforcement Learning and Robotics.

I received my bachelor's degree from UC Berkeley with a double major in Computer Science and Statistics.


Publications


One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, Sergey Levine

To appear in Robotics: Science and Systems (RSS), 2018.

We present an approach for one-shot learning from a video of a human by using human and robot demonstration data from a variety of previous tasks to build up prior knowledge through meta-learning. Then, combining this prior knowledge and only a single video demonstration from a human, the robot can perform the task that the human demonstrated. We show experiments on both a PR2 arm and a Sawyer arm, demonstrating that after meta-learning, the robot can learn to place, push, and pick-and-place new objects using just one video of a human performing the manipulation.
arXiv

Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation

Tianhao Zhang*, Zoe McCarthy*, Owen Jow, Dennis Lee, Xi (Peter) Chen, Ken Goldberg, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2018.

We describe how consumer-grade virtual reality headsets and hand tracking hardware can be used to naturally teleoperate robots to perform complex tasks. We also describe how imitation learning can learn deep neural network policies (mapping from pixels to actions) that can acquire the demonstrated skills. Our experiments showcase the effectiveness of our approach for learning visuomotor skills.
arXiv Video Website


One-Shot Visual Imitation Learning via Meta-Learning

Chelsea Finn*, Tianhe Yu*, Tianhao Zhang, Pieter Abbeel, Sergey Levine

In the 1st Annual Conference on Robot Learning (CoRL), 2017.

We present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration. Unlike prior methods for one-shot imitation, our method can scale to raw pixel inputs and requires data from significantly fewer prior tasks for effective learning of new skills. Our experiments on both simulated and real robot platforms demonstrate the ability to learn new tasks, end-to-end, from a single visual demonstration.
arXiv Video Website


Learning from the Hindsight Plan -- Episodic MPC Improvement

Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2017.

Model predictive control (MPC) is an effective control method, but practical constraints limit it to planning over a short horizon. We propose a general policy improvement scheme for MPC, hindsight iterative MPC (HIMPC), which incorporates long-term reasoning into MPC's short-horizon planning and demonstrates superior empirical performance in simulated and real contact-rich manipulation tasks.
arXiv Video Website


PLATO: Policy Learning using Adaptive Trajectory Optimization

Gregory Kahn, Tianhao Zhang, Sergey Levine, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2017.

We propose PLATO, an algorithm that trains complex control policies with supervised learning, using model-predictive control (MPC) to generate the supervision. PLATO uses an adaptive training method to modify the behavior of MPC to gradually match the learned policy, in order to generate training samples at states that are likely to be visited by the policy while avoiding highly undesirable on-policy actions. We prove that this type of adaptive MPC expert produces supervision that leads to good long-horizon performance of the resulting policy, and empirically demonstrate that MPC can still avoid dangerous on-policy actions in unexpected situations during training.
arXiv Video Website


Learning Deep Control Policies for Autonomous Aerial Vehicles with MPC-Guided Policy Search

Tianhao Zhang, Gregory Kahn, Sergey Levine, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2016.
Also, in Neural Information Processing Systems (NIPS) Deep Reinforcement Learning Workshop, 2015.

Model predictive control (MPC) is crucial for underactuated systems such as autonomous aerial vehicles, but its application can be computationally demanding. We propose to combine MPC with reinforcement learning in the framework of guided policy search (GPS). The resulting neural network policy can successfully control the robot at a fraction of the computational cost of MPC.
arXiv Video Slides

Projects


Towards Stochastic Neural Network Control Policies

UC Berkeley CS287 (Advanced Robotics) and CS281A (Statistical Learning Theory) Final Project (Fall 2015).

Deterministic neural networks (DNNs) have been shown to be effective policies with good generalization in robot control. However, their determinism restricts DNNs to modelling only uni-modal controls. Training DNNs on multi-modal controls can be inefficient or lead to garbage results. In contrast, stochastic neural networks (SNNs) are able to learn one-to-many mappings. In this paper, we introduce SNNs as control policies and extend an existing learning algorithm for feed-forward SNNs to recurrent ones.
PDF Poster


Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN

Craig Hiller, David Zhang, Tianhao Zhang, Zihao Zhang

UC Berkeley CS280 (Computer Vision) Final Project (Spring 2015).

We propose an algorithm to automatically identify window regions on exterior-facing building facades in a colored 3D point cloud generated using data captured from an ambulatory backpack sensor system outfitted with multiple LiDAR sensors and cameras. Our work is based on an R-CNN-inspired algorithm with novel filtering and preprocessing techniques. We use multiscale combinatorial grouping (MCG) for region proposal generation, pass the proposals to a convolutional neural network (CNN), and train a random forest with the CNN output vectors.
PDF

Teaching


UC Berkeley CS188 (Introduction to Artificial Intelligence)

Instructors: Prof. Pieter Abbeel and Prof. Anca Dragan

Spring 2016 - CS188

Head Student Instructor

Spring 2015 - CS188.1x (MOOC)

Course Moderator

About


Education

University of California, Berkeley
Aug 2016 - present
PhD, Electrical Engineering and Computer Science
Advisor: Pieter Abbeel

University of California, Berkeley
Aug 2012 - May 2016
B.A., Computer Science and Statistics
Cumulative GPA: 3.92
Selected Coursework: (2xx - graduate courses)
(CS287) Advanced Robotics (A+; Rank: 2/34)
(Stat241A/CS281A) Statistical Learning Theory (A+)
(CS189) Machine Learning (A+; Rank: 1/297)
(CS188) Artificial Intelligence (A+)
(CS280) Computer Vision (A)
(CS288) Natural Language Processing (A)
(CS170) Efficient Algorithms (A)
(CS294-12) Deep Reinforcement Learning (A)
(Math110) Linear Algebra (A)
(Math104) Real Analysis (A)
Honors:
EECS Honors Degree Graduate (expected)
Dean's Honors List (five semesters)

Personal

Piano
I'm a fan of Chopin. Here are some of my recordings (list to be expanded):
Nocturne in E-flat major, Op. 9 No. 2   YouTube   Sheet
Nocturne in F minor, Op. 55 No. 1   YouTube   Sheet
Nocturne in B-flat minor, Op. 9 No. 1 (in the works)
Polonaise in A major, Op. 40 No. 1 (a.k.a. Military Polonaise) (in the works)

Photography
Coming soon. (in fact, maybe much later...)

Contact

Email: tianhao.z AT eecs.berkeley.edu