Deep Learning via Fast.AI

22 students in 14 teams

This section of the CS 6101 lab rotation course will run through the self-paced, publicly available deep learning material on the fast.ai website. Our section will be conducted as a group seminar, with class participants nominating themselves to present the materials and lead the discussion. The study group covers the Practical Deep Learning for Coders short course, and Session II will cover the (no longer really) Cutting Edge Deep Learning for Coders course (deep learning is currently evolving so quickly that materials go out of date on a weekly basis). Nevertheless, the material is meant to be introductory and should be understandable given some prior study.

Project List


Combine Machine Learning and Probabilistic Model for Sports Analytics

Traditionally, model checkers have been used to verify the correctness of software systems and protocols. Recently, probabilistic models have also been used in sports analytics. We use machine learning to learn the probabilistic parameters of the model.


TL;DR - Abstractive Summarization of News Articles

In this information age, we find ourselves overwhelmed by an incredible amount of information. One way to cope is to selectively ignore some of it, but choosing what to ignore is itself a problem. This project hopes to reduce information overload by building a model capable of summarizing lengthy text documents. A summary lets people decide whether an article is of interest and spend less time digesting the information. This model will power a browser extension, allowing users to summarize any web page they visit.



Lecture videos have not until recently been considered a theoretical source of entertainment. As a result of this limitation, years of student productivity have been lost. In this research, we demonstrate a novel solution to this millennium problem using state-of-the-art deep learning models.


Auto Manga Colorization

Manga colorization can be a tedious process. What if a neural network could do it for you?


EmoChat: Emoji Prediction based on Text Inputs

Emojis are playing an increasingly important role in people's lives: they carry information while easing the mood of a conversation. Keyboards on Android and iPhone list the emojis available for use, but sometimes there are too many of them and the user may find it difficult to locate the desired one. Hence, in this project we aim to build a CNN/LSTM model that predicts emojis from the user's text input. To achieve this goal, we are using Twitter data that contains emojis.
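As a rough illustration of how tweets containing emojis could be turned into training pairs of text and emoji labels, here is a minimal sketch. It is an assumption about the data preparation, not the team's actual pipeline: the function names `is_emoji` and `split_tweet` are hypothetical, and the code point ranges are only a coarse approximation of what counts as an emoji.

```python
from typing import List, Tuple

# Coarse, assumed approximation of emoji code points; not an official list.
EMOJI_RANGES = [(0x1F300, 0x1FAFF), (0x2600, 0x27BF)]
# Variation selector-16 and zero-width joiner often accompany emojis.
JOINERS = {"\uFE0F", "\u200D"}


def is_emoji(ch: str) -> bool:
    """Return True if the character falls in one of the assumed emoji ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in EMOJI_RANGES)


def split_tweet(tweet: str) -> Tuple[str, List[str]]:
    """Separate a tweet into its plain text and the emojis it contains."""
    emojis = [ch for ch in tweet if is_emoji(ch)]
    text = "".join(ch for ch in tweet if not is_emoji(ch) and ch not in JOINERS)
    return text.strip(), emojis


text, labels = split_tweet("I love pizza \U0001F355\U0001F355")
print(text, labels)
```

Each resulting pair could then serve as one training example, with the text as input and one of the emojis as the target label. How to choose a single label when a tweet has several emojis is left open here.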


Predicting Housing Prices in Singapore

Buying a flat or house is the biggest investment most people make in their lives. Many spend months or even years researching housing prices. In the end, however, a non-expert's estimate remains just a good guess. Urban Zoom has collected a huge data set of real estate prices in Singapore and provided it to NUS students for a Kaggle competition. The Fast.ai library has been a great solution for many data projects. We will contribute to its reputation by solving this house price problem. The results will interest most Singaporean residents and the government. Martin Strobel, Sylvain Riondet



Visual Dialog

Given an image, the agent is tasked with maintaining a dialog and providing the right answers to the queries. This challenge sits at the boundary of perception and natural language understanding. The aim of this project is to build a basic encoder-decoder style network to address it.


Deep Learning project on Medical Images (X-ray and CT scan images) for classifying multiple Tuberculosis manifestations using CNN

Medical image diagnosis for tuberculosis.



Domain Adaptation for 2D Human Pose Estimation

Human pose estimation is a key step for understanding human activity, and is a classical problem in computer vision and graphics. The objective of human pose estimation is to estimate the locations of key human body parts in an image. A lot of work has been done so far, and these works, particularly those based on deep learning, have achieved excellent results. However, most existing works are based on fully supervised learning, where accurate 2D pose labels for each image are available. Unfortunately, annotating these 2D pose labels manually is a time-consuming and expensive process. Most human pose datasets are thus captured in lab environments using specialized depth and motion capture cameras. Such data is not representative of real-world environments, and neural networks trained on indoor datasets do not generalize well to outdoor scenarios. It is therefore important to explore the possibility of training a deep network under the constraint of limited data. To solve this problem, we propose to use transfer learning across different domains. Specifically, we plan to adapt the network trained on the source (indoor) dataset to the target (outdoor) domain. We will use a network to compute a feature representation for both the source and target data. The feature representation will then be optimized by minimizing the distance between the feature distributions of the two domains. In this way, we will obtain a domain-invariant 2D pose estimation model that performs well in the target domain even when no labels are available there and domain shift exists.
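The feature-alignment idea described above can be sketched as follows. The abstract does not say which distance between feature distributions will be used, so this example assumes one simple choice: the squared distance between the mean source feature and the mean target feature, which corresponds to maximum mean discrepancy with a linear kernel. The names `feature_mean`, `mean_discrepancy`, `total_loss`, and the `weight` parameter are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def feature_mean(features: Sequence[Vector]) -> List[float]:
    """Average each feature dimension over a batch of feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


def mean_discrepancy(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Squared Euclidean distance between source and target feature means.

    This is an assumed stand-in for the distribution distance; it equals
    maximum mean discrepancy (MMD) with a linear kernel.
    """
    mu_s = feature_mean(source)
    mu_t = feature_mean(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def total_loss(pose_loss: float, source: Sequence[Vector],
               target: Sequence[Vector], weight: float = 0.1) -> float:
    """Supervised pose loss on labeled source data plus the alignment penalty."""
    return pose_loss + weight * mean_discrepancy(source, target)


# Toy features standing in for network outputs on indoor and outdoor images.
indoor = [[0.0, 0.0], [2.0, 2.0]]
outdoor = [[1.0, 1.0], [3.0, 3.0]]
print(mean_discrepancy(indoor, outdoor))
```

In a real training loop the features would come from the shared network, and both terms would be minimized jointly by gradient descent. Only the source batch needs pose labels, since the alignment term uses features alone.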


Perspective Control in Architectural Photography

Have you ever taken a photo of a building taller than you, only to find that it appears to tilt backward in the photo? Have you used Instagram's adjust tools to make the lines in your photo perfectly vertical? If so, you've done perspective control! It is such a manual process, with no promise of a perfect photo. What if AI could do it for us?


Neural Singing Synthesis, but faster.

A neural-network-based singing synthesizer built on a few design modifications to WaveNet. These modifications primarily aim for a significant speedup over WaveNet in terms of training time to generate one second of audio.


Bidirectional LSTM-CRF for Named Entity Recognition

I propose a Long Short-Term Memory (LSTM) based model for Named Entity Recognition (NER), a well-known sequence labeling task in Natural Language Processing. The model combines a bidirectional LSTM (BI-LSTM) with a bidirectional Conditional Random Field (CRF) layer. This work is the first to apply a bidirectional CRF to neural architectures for sequence tagging tasks. The research shows that the CRF can be extended to capture the dependencies between labels in both the left and right directions of the sequence. This variation of the CRF is referred to as BI-CRF, and results show that BI-CRF significantly improves the performance of the NER model compared to a unidirectional CRF.
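To make concrete how a CRF layer captures dependencies between labels, here is a minimal sketch of Viterbi decoding for a standard, unidirectional linear-chain CRF. It is not the proposed BI-CRF, whose formulation is not given in this abstract. The function name `viterbi_decode` and the score matrices are hypothetical; in a BI-LSTM-CRF the emission scores would come from the BI-LSTM.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence.

    emissions[t][j]: score of label j at position t.
    transitions[i][j]: score of moving from label i to label j.
    """
    num_labels = len(transitions)
    # best[j]: best score of any path ending in label j at the current position.
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_labels):
            # Pick the previous label that maximizes the path score into j.
            prev = max(range(num_labels),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    label = max(range(num_labels), key=lambda j: best[j])
    path = [label]
    for pointers in reversed(backpointers):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


# Two labels (0 and 1); the transition scores reward keeping the same label.
emissions = [[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
transitions = [[0.5, -1.0], [-1.0, 0.5]]
print(viterbi_decode(emissions, transitions))  # [0, 0, 0]
```

With all transition scores set to zero, the same emissions decode to `[0, 1, 0]`, the per-position argmax. The transition scores are what let the CRF override a locally preferred label in favor of a more consistent sequence.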




To be edited


Deep Learning for Nucleus Detection

Create a deep learning model that can identify a range of nuclei across varied conditions.