CS6101

Deep Learning for Natural Language Processing

Description

This course is taken almost verbatim from CS 224N Deep Learning for Natural Language Processing, Richard Socher's course at Stanford. We follow their course's formulation and selection of papers, with Socher's permission. This is a section of CS 6101 Exploration of Computer Science Research at NUS. CS 6101 is a 4 modular credit, pass/fail module for new incoming graduate programme students to obtain background in a research area with an instructor's support. It is designed as a "lab rotation" to familiarise students with the methods and ways of research in a particular research area. Our section is conducted as a group seminar, with class participants nominating themselves to present the materials and lead the discussion. It is not a lecture-oriented course and is not as in-depth as Socher's original course at Stanford; it is therefore not a replacement, but a class to spur local interest in Deep Learning for NLP.

Participation

35 students in 17 teams

Projects
CS6101-01

Abstractive Summarisation of Long Academic Papers

In this project, we use two publicly available datasets to train abstractive summarisation models for long academic papers, which typically run 2,000 to 3,000 words. The models generate summaries of around 250 words, and we evaluate and compare the performance of the different models.
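
Summary quality is commonly scored with ROUGE; below is a minimal sketch of such an evaluation, assuming the `rouge-score` package (the strings and the package choice are illustrative, not necessarily what the project used).

```python
# Minimal ROUGE evaluation sketch (assumes `pip install rouge-score`).
# The reference and generated summaries here are illustrative stand-ins.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "the paper proposes a new attention mechanism for long documents"
generated = "a new attention mechanism for long documents is proposed"

scores = scorer.score(reference, generated)
for name, s in scores.items():
    print(f"{name}: precision={s.precision:.3f} recall={s.recall:.3f} f1={s.fmeasure:.3f}")
```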


CS6101-02

CNN for Japanese Text Classification


CS6101-03

Event-Driven Stock Prediction from News


CS6101-04

Foreign Exchange Forecasting

This project explored forecasting foreign currency exchange prices using both historical price data and market news for the trading pair. We first built a market sentiment classifier that derives a daily market sentiment score from news about the trading pair. To reduce the amount of labelled training data required, we applied transfer learning: we first trained a language model on Wikipedia data, fine-tuned it on market news, and then used it as the basis for the market sentiment classifier. We then combined the market sentiment scores with historical price data and fed them into an end-to-end encoder-decoder recurrent neural network (RNN) to forecast foreign exchange trends.
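
A minimal sketch of the final fusion and forecasting step, assuming PyTorch with illustrative layer sizes (the project's actual architecture may differ): daily sentiment scores are concatenated with price features, encoded by a GRU, and decoded autoregressively.

```python
# Sketch: fuse sentiment scores with prices, then encoder-decoder forecast.
# All names, sizes, and the random inputs are illustrative assumptions.
import torch
import torch.nn as nn

class FXForecaster(nn.Module):
    def __init__(self, n_features=2, hidden=64, horizon=5):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)
        self.decoder = nn.GRU(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, prices, sentiment):
        # prices: (batch, seq_len, 1); sentiment: (batch, seq_len, 1)
        x = torch.cat([prices, sentiment], dim=-1)  # fuse the two signals
        _, h = self.encoder(x)                      # summarise the history
        step = prices[:, -1:, :]                    # seed with the last price
        outputs = []
        for _ in range(self.horizon):               # autoregressive decoding
            out, h = self.decoder(step, h)
            step = self.head(out)
            outputs.append(step)
        return torch.cat(outputs, dim=1)            # (batch, horizon, 1)

model = FXForecaster()
pred = model(torch.randn(8, 30, 1), torch.randn(8, 30, 1))  # 30-day history
print(pred.shape)  # torch.Size([8, 5, 1])
```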


CS6101-05

Gaining Insight from News Articles for Stock Prediction

Analysing Twitter feed sentiment to predict stock market movement.


CS6101-07

Generative Stock Question Answering


CS6101-08

MC NLP Rap Lyrics Generator

We train a rap lyrics generation language model on a dataset of Eminem's lyrics using an LSTM.
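
A minimal sketch of such an LSTM language model in PyTorch, shown here as a character-level variant (the project may operate at word level); the corpus snippet and hyperparameters are illustrative stand-ins.

```python
# Minimal character-level LSTM language model sketch (PyTorch).
# Corpus, vocabulary, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class LyricsLM(nn.Module):
    def __init__(self, vocab_size, embed=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.out(h), state

# Training step: predict the next character at every position.
corpus = "look if you had one shot one opportunity"  # stand-in for the lyrics
vocab = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(vocab)}
ids = torch.tensor([[stoi[c] for c in corpus]])

model = LyricsLM(len(vocab))
logits, _ = model(ids[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, len(vocab)),
                                   ids[:, 1:].reshape(-1))
loss.backward()  # an optimiser step would follow in a real training loop
```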


CS6101-09

Open Source Software Vulnerability Identification With Machine Learning

Open source software is widely used, but its security is a major concern. We use both traditional machine learning algorithms and deep learning to identify vulnerabilities across different kinds of open source project repositories.
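
As a sketch of the "traditional" side, a TF-IDF plus logistic regression baseline over code snippets might look like the following (scikit-learn; the toy diffs and labels are invented for illustration).

```python
# Sketch of a traditional baseline: TF-IDF features over code snippets with
# a logistic-regression vulnerability classifier. Toy data is illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

diffs = [
    "strcpy(buf, user_input)",          # unbounded copy
    "snprintf(buf, sizeof(buf), src)",  # bounded copy
    "system(user_supplied_command)",    # shell injection risk
    "validated = sanitize(input)",      # sanitised input
]
labels = [1, 0, 1, 0]  # 1 = vulnerable, 0 = benign

clf = make_pipeline(TfidfVectorizer(token_pattern=r"\w+"), LogisticRegression())
clf.fit(diffs, labels)
print(clf.predict(["strcpy(dest, attacker_data)"]))
```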


CS6101-10

Legal Document Classifier based on Legal Topics

This project trains and fine-tunes a text classifier that labels legal decisions by legal topic, based on the Universal Language Model Fine-tuning (ULMFiT) approach described in "Universal Language Model Fine-tuning for Text Classification" by Howard and Ruder (2018).
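
The ULMFiT recipe has three stages: pretrain a language model on general text, fine-tune it on the target domain, then fit a classifier on top of the fine-tuned encoder. A sketch assuming fastai v1's text API; the file names, column names, and hyperparameters are illustrative, not the project's actual setup.

```python
# ULMFiT sketch, assuming fastai v1's text API. File/column names are
# hypothetical; AWD_LSTM comes pretrained on Wikitext-103 in fastai v1.
from fastai.text import *

path = Path('data')

# Stages 1-2: fine-tune the Wikitext-pretrained language model on
# (unlabelled) legal decisions.
data_lm = TextLMDataBunch.from_csv(path, 'legal_decisions.csv', text_cols='text')
lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5)
lm.fit_one_cycle(1, 1e-2)
lm.save_encoder('legal_enc')

# Stage 3: train the topic classifier on top of the fine-tuned encoder.
data_clas = TextClasDataBunch.from_csv(path, 'labelled_decisions.csv',
                                       text_cols='text', label_cols='topic',
                                       vocab=data_lm.vocab)
clf = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
clf.load_encoder('legal_enc')
clf.fit_one_cycle(1, 1e-2)  # gradual unfreezing would follow in full ULMFiT
```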


CS6101-12

Source Code Comment Generation

We learn to generate comments for Java source code by training a neural machine translation (NMT) model over a large corpus of open source software.
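
A minimal sketch of the underlying encoder-decoder idea in PyTorch; the vocabulary sizes and inputs are illustrative assumptions, and a practical system would add attention and beam search.

```python
# NMT-style sketch: GRU encoder over Java tokens, GRU decoder over comment
# tokens. Sizes and random inputs are illustrative assumptions.
import torch
import torch.nn as nn

class Code2Comment(nn.Module):
    def __init__(self, code_vocab, text_vocab, embed=128, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(code_vocab, embed)
        self.tgt_embed = nn.Embedding(text_vocab, embed)
        self.encoder = nn.GRU(embed, hidden, batch_first=True)
        self.decoder = nn.GRU(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, text_vocab)

    def forward(self, code_ids, comment_ids):
        _, h = self.encoder(self.src_embed(code_ids))  # encode the method body
        dec, _ = self.decoder(self.tgt_embed(comment_ids), h)
        return self.out(dec)                           # next-token logits

# e.g. tokens of "public int add(int a, int b)" paired with comment tokens
model = Code2Comment(code_vocab=5000, text_vocab=8000)
logits = model(torch.randint(0, 5000, (4, 60)), torch.randint(0, 8000, (4, 12)))
print(logits.shape)  # torch.Size([4, 12, 8000])
```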


CS6101-13

Reading Wikipedia to Answer Open-Domain Questions

We train a system that reads Wikipedia to answer open-domain general knowledge questions.
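
The pipeline in the eponymous paper (Chen et al., 2017) first retrieves relevant Wikipedia passages and then runs a neural reading-comprehension model over them. A minimal sketch of the retrieval stage with scikit-learn TF-IDF; the toy articles are illustrative stand-ins.

```python
# Retriever-stage sketch: TF-IDF over passages, cosine similarity to the
# question. The toy "articles" stand in for Wikipedia paragraphs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "Singapore is a city-state in Southeast Asia.",
    "The Eiffel Tower is a landmark in Paris, France.",
    "Python is a programming language created by Guido van Rossum.",
]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
doc_matrix = vectorizer.fit_transform(articles)

question = "Who created the Python programming language?"
scores = cosine_similarity(vectorizer.transform([question]), doc_matrix)[0]
best = scores.argmax()
print(articles[best])  # passage handed to the neural reader
```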


CS6101-14

Beyond Affine Neurons

Recurrent Neural Networks (RNNs) have emerged as one of the most successful frameworks for time series prediction and natural language processing. However, the fundamental building block of RNNs, the perceptron [1], has remained largely unchanged since its inception: a nonlinearity applied to an affine function. An affine function cannot easily capture the complex behaviour of functions of degree 2 and higher, and the sum-of-products signal propagation for the hidden state and current input implicitly assumes that these two variables are independent and uncoupled. As a result, RNNs are increasingly forced to use very large numbers of neurons in complex architectures to achieve good results. In this project, inspired by Ref. [2], we propose to add simple and efficient degree-2 behaviour to RNN neuron cells to improve the expressive power of each individual neuron. We analyse the benefits of our approach on common language prediction tasks. The clear advantage of our approach is that it needs fewer neurons, and therefore has reduced computational complexity and cost.

[1] Rosenblatt, Frank. The Perceptron, a Perceiving and Recognizing Automaton (Project Para). Cornell Aeronautical Laboratory, 1957.
[2] Mirco Milletari, Thiparat Chotibut and Paolo E. Trevisanutto. Expectation Propagation: A Probabilistic View of Deep Feed Forward Networks. arXiv:1805.08786, 2018.
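
For reference, a standard RNN cell computes h_t = tanh(W_x x_t + W_h h_{t-1} + b), which is affine in each argument. One way to add degree-2 behaviour is a bilinear term that couples x_t and h_{t-1} directly; the PyTorch sketch below is an illustrative assumption, not necessarily the project's exact formulation.

```python
# RNN cell with a degree-2 (bilinear) input-hidden coupling added to the
# usual affine terms. The parameterisation is an illustrative assumption.
import torch
import torch.nn as nn

class QuadraticRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.affine = nn.Linear(input_size + hidden_size, hidden_size)
        # Bilinear term couples x_t and h_{t-1}: degree-2 interactions that
        # the affine perceptron alone cannot represent.
        self.bilinear = nn.Bilinear(input_size, hidden_size, hidden_size)

    def forward(self, x, h):
        return torch.tanh(self.affine(torch.cat([x, h], dim=-1)) +
                          self.bilinear(x, h))

cell = QuadraticRNNCell(input_size=16, hidden_size=32)
h = torch.zeros(4, 32)
for x in torch.randn(10, 4, 16):  # unroll over 10 time steps
    h = cell(x, h)
print(h.shape)  # torch.Size([4, 32])
```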


CS6101-15

CNN-RNN for Image Annotation with Label Correlations

Convolutional Neural Networks (CNNs) have shown great success in image recognition, where each image belongs to only one category (label), whereas in multi-label prediction their performance is suboptimal, mainly because they neglect label correlations. Recurrent Neural Networks (RNNs) are better at capturing label relationships, such as label dependency and semantic redundancy. Hence, in this project we implement a CNN-RNN framework for multi-label image annotation that exploits the CNN's capability for image-to-label recognition and the RNN's complementary strength in label-to-label inference. We experiment on the popular IAPRTC12 benchmark and show that the CNN-RNN combination improves performance over the CNN baseline.
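
A sketch of the CNN-RNN wiring in PyTorch: a CNN backbone produces image features that initialise an LSTM, which then emits labels one at a time so each prediction can condition on previously emitted labels. The backbone choice and sizes are illustrative assumptions.

```python
# CNN-RNN multi-label annotation sketch. ResNet-18 and the layer sizes are
# illustrative assumptions; IAPRTC12 is commonly used with 291 labels.
import torch
import torch.nn as nn
from torchvision import models

class CNNRNNAnnotator(nn.Module):
    def __init__(self, n_labels, embed=256, hidden=512):
        super().__init__()
        backbone = models.resnet18(pretrained=True)
        backbone.fc = nn.Identity()                  # keep 512-d features
        self.cnn = backbone
        self.label_embed = nn.Embedding(n_labels + 1, embed)  # +1 for <start>
        self.proj = nn.Linear(512, hidden)
        self.rnn = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_labels)

    def forward(self, images, label_seq):
        feats = self.proj(self.cnn(images))          # (batch, hidden)
        h0 = feats.unsqueeze(0)                      # image features init LSTM
        c0 = torch.zeros_like(h0)
        dec, _ = self.rnn(self.label_embed(label_seq), (h0, c0))
        return self.out(dec)                         # logits over next label

model = CNNRNNAnnotator(n_labels=291)
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 292, (2, 4)))
print(logits.shape)  # torch.Size([2, 4, 291])
```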


CS6101-16

Video Summaries from TV


CS6101-17

Voice Transfer


CS6101-18

Understanding Neural Translation Models' Translations

The recent success of neural machine translation (NMT) has shown the power of neural approaches to machine translation. Nonetheless, one of the biggest problems of NMT models, and of deep learning models in general, is their lack of interpretability. In light of this, this project aims to "unbox" the black box of widely used NMT models using the framework proposed by Sundararajan et al. (2017).
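
The cited framework is presumably integrated gradients (Sundararajan et al., 2017, "Axiomatic Attribution for Deep Networks"): interpolate from a baseline embedding to the real one and accumulate gradients along the path. A minimal PyTorch sketch over token embeddings, using a tiny stand-in model rather than a real NMT system.

```python
# Integrated gradients sketch over token embeddings. The "model" is a toy
# stand-in; a real application would attribute an NMT model's output score.
import torch
import torch.nn as nn

def integrated_gradients(forward_fn, embeddings, steps=50):
    baseline = torch.zeros_like(embeddings)        # zero-embedding baseline
    total_grads = torch.zeros_like(embeddings)
    for alpha in torch.linspace(0, 1, steps):
        point = baseline + alpha * (embeddings - baseline)
        point.requires_grad_(True)
        forward_fn(point).sum().backward()         # accumulate path gradients
        total_grads += point.grad
    # average gradient along the path, scaled by the input difference
    return (embeddings - baseline) * total_grads / steps

embed = nn.Embedding(100, 8)
score = nn.Linear(8, 1)
tokens = torch.tensor([[5, 42, 7]])
emb = embed(tokens).detach()
attributions = integrated_gradients(lambda e: score(e.mean(dim=1)), emb)
print(attributions.sum(dim=-1))  # per-token attribution scores
```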


CS6101-19

InstaCapt: Generating Human-Like Natural Language Captions from Pictures

