Research Experience

Graduate Researcher

TELEDIA, Prof. Carolyn Penstein Rose

Sep 2020 – May 2021 Pittsburgh, PA
  • Learned relational representations of identity labels that reveal which dimensions of similarity and difference are relevant to content propagation.
  • Developed an architecture for reblog prediction and comprehensively analyzed blog descriptions, communities, and following relationships using real-world Tumblr data.

Undergraduate Researcher

Vision and Learning Lab, Prof. Yu-chiang Frank Wang

Feb 2018 – Jul 2020 Taipei, Taiwan

Disentanglement for 3D Point Cloud | Demo | Report

  • Researched 3D representations and applied a generative model to disentangle 3D point clouds.
  • Proposed an autoencoder-based model to disentangle human poses using continuous labels.

Conventional Computer Vision | Demo

  • Researched and implemented various applications, including segmentation, Fisherfaces, and depth map generation.

Undergraduate Researcher

Speech Processing Lab, Prof. Lin-shan Lee and Prof. Hung-yi Lee

Sep 2017 – Jan 2020 Taipei, Taiwan

Speech Disentanglement and Voice Conversion

  • Researched unsupervised voice conversion by extracting personality and prosody information.

Personalized Dialogue Generation | Demo | Paper

  • Proposed a GAN-based model that produces responses for multiple personas with a single model via unsupervised learning, placing fewer constraints on the required training data.
  • The proposed model achieves an 18.3% increase in persona accuracy over the SOTA model, and the paper was accepted at INTERSPEECH 2019.
  • Introduced the BERT model, ran experiments on the discriminator architecture and the initialization of the word embedding vectors, and comprehensively analyzed the performance for each character.

Large-vocabulary Speech Recognition System

  • Implemented a large-vocabulary speech recognition system from scratch using Kaldi.
  • Developed a learning-based ASR model and compared it with the rule-based method.

Work Experience

Siri TTS R&D Intern

Apple Inc.

May 2021 – Aug 2021 Seattle, WA (Telecommuting)
  • Identified the needs of the modeling teams and built robust scripts and systems to meet them.

  • Developed a robust system that detects anomalies in data and considerably reduces the required evaluation time.

Natural Language Processing Intern

DeepHow Inc.

Apr 2019 – Aug 2020 Detroit, MI (Telecommuting)

Unsupervised Temporal Embedding for video segmentation

  • Researched self-supervised multi-modal networks and helped develop video recommendation systems.

  • Implemented an unsupervised architecture that detects and segments actions in untrimmed videos, and deployed it on the DeepHow platform (AI Stephanie), improving accuracy by 30%.

Step-embedding for video recommendation

  • Developed a new sentence embedding method that encodes ASR sentences from video clips.

  • Verified on real-world videos that the generated embeddings capture features from the text and can be used to recommend other video clips.

Software Engineering Intern

HTC Taiwan (DeepQ)

Jul 2018 – Mar 2019 Taipei, Taiwan

Software Engineering Intern

  • Implemented various architecture search and model compression models.
  • Applied differentiable architecture search models, which use three orders of magnitude fewer computational resources, to the DeepQ AI platform product.

Generative Model for Image Morphing

  • Developed a new generative model for morphing images of human expressions.
  • Verified on real-world data that the proposed model generates vivid morphing images.

Teaching

Machine Learning (SPRING 2019)

I designed a homework assignment on linear regression for the whole class (ppt, Website). I also led 10+ groups for the final project, Image Dehazing (ppt).

Machine Learning and having it deep and structured (SPRING 2019)

I led and advised the whole class of more than 30 students in building and analyzing chatbots. I taught the students how to program a chatbot and gave them a short talk on recent papers. | Talk & ppt

Signal and System Processing (SPRING 2019)

Aside from setting up homework and answering course questions, I was also responsible for designing some problems for the midterm exam.

Selected Projects

Functionally Reduced And-Inverter Graph (FRAIG) [C++]

This is an implementation of circuit simplification. Using unused-gate sweeping, trivial optimization, simplification by structural hashing, and prior simulation, I first simplify the circuits efficiently. I then merge equivalent gates using a Boolean satisfiability (SAT) solver. I collect functionally equivalent candidates (FECs) by circuit simulation, and each simulation round can split the FECs into separate groups. Because the number of simulation rounds is crucial for performance, I dynamically adjust the stopping criterion of the simulation according to how often the FEC groups split.

PokeCan [rpi][C++][Python][Node.js]

PokeCan is a trash can that automatically detects the level of trash inside it. When it is full, it walks along a path set by the user and dumps the trash into a larger trash can. After emptying itself, PokeCan walks back to its original location.

TTS Without T [Pytorch]

We compare two schemes, the Multilabel-Binary Vectors (MBV) autoencoder and the Vector Quantized Variational Autoencoder (VQVAE), both of which can discover discrete representations of subword units from speech without any text labels, phoneme labels, or alignments. By combining the two methods, we aim to leverage their strengths and achieve better performance in the ZeroSpeech 2019 Challenge in terms of either bitrate or quality.

Atari games using RL [pytorch]

Implemented agents that play Atari games using deep reinforcement learning. In this project, I implemented Policy Gradient, Deep Q-Learning (DQN), Double DQN, Dueling DQN, and A2C for games such as LunarLander, Assault, and Mario.

Cartoon Face Generation Using Conditional GAN [pytorch]

In this project, ACGAN and VAE were implemented for cartoon face generation. Moreover, I ran several experiments on the model architectures to verify the capabilities of the models.

Depth Map Generation [pytorch]

To generate a disparity map from left and right images, we combine a learning-based method with our understanding of stereo geometry. Once trained end to end, the model generates the disparity map from the two images alone.

Housing Agency System [Python]

As STOs (Security Token Offerings) gain more and more attention, we believe the concept can be applied to real estate transactions. In addition, introducing a consortium blockchain brings many advantages to the market.

Door Friend [Rpi][Arduino][Python]

Imagine that you are busy cooking a great dinner for your party, and your friend is arriving at your house. Your friend rings the doorbell, but you can’t open the door with your dirty hands! Now Door Friend comes to save the day. It can recognize your and your friends’ faces and voices and open the door.

Fashion Ceiba [html][node.js][graphql][react.js][mongo]

It is an online teaching platform that supports teaching with features such as taking real-time notes, asking questions, and updating handouts. Deployed at https://fashion-ceiba.herokuapp.com/login .

Contact