We compare two schema, the Multilabel-Binary Vectors (MBV) au-toencoder and the Vector Quantized Variational Autoencoder (VQVAE), in which discrete representations of subword units could be discovered from speech without any text label, phoneme label and alignment. By combining the methods, we aim to utilize their strengths and achieve a better performance in the ZeroSpeech2019 Challenge, in terms of either bitrate or quality.
Implement an agent to play Atari games using Deep Reinforcement Learning In this project, I implemented Policy Gradient, Deep Q-Learning (DQN), Double DQN, Dueling DQN, and A2C for the atari games, such as LunarLander, Assault, and Mario.
In theis project, ACGAN and VAE were implemented for the cartoon face generation. Morevoer, I also did serveral experiments on the model architectures to verify the capabilities of the models.
To generate the disparity map given the left and right images, we utilize learning-based method and our understanding of stereo geometry. By training an end-to-end model, it can generate the disparity map only with two images.