Yuwei Hu

HPC Intern at TuSimple Inc.


About Me

I am a research intern in TuSimple’s HPC group, taking a gap year after earning a bachelor’s degree in Electrical Engineering from Beihang University.

My research interests lie at the intersection of computing systems and deep learning. More specifically, I have worked on schedule optimization of deep network workloads using tensor compilation and kernel fusion, and on the co-design of algorithms and systems for accelerating deep learning inference.

I am applying to Fall 2018 PhD programs in Electrical and Computer Engineering.


BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU [pdf]
Yuwei Hu, Jidong Zhai, Dinghua Li, Yifan Gong, Yuhao Zhu. In submission to IPDPS 2018.


TVM: Tensor IR Stack for Deep Learning Systems dmlc/tvm

TVM is a novel framework that can represent and optimize common deep learning computation workloads for CPUs, GPUs, and other specialized hardware, and automatically transform the computation graph to optimize data layout and fuse computation patterns.

My contributions:
           source: tvmlang.org/release-announcement.html

Here is the TVM tutorial blog post, which summarizes what I learned from implementing depthwise convolution.

MXNet to TensorRT Model Converter

Deploys neural network models trained with MXNet to TensorRT for fast inference. It consists of a network symbol converter and a parameter file converter. Detection and ReID models converted this way went into TuSimple's autonomous driving system.

Make things as simple as possible, but no simpler.