Your class project is an opportunity for you to develop new methods for interesting problems in the context of real-world data sets. You may reproduce a method covered in the course or pursue your own ideas. The instructor and TAs will consult with you, but the final responsibility for defining and executing an interesting piece of work is, of course, yours.

Projects should be done in teams of two to four students. Your project will be worth 40% of your final class grade and has three deliverables:

  1. Proposal: 3 pages excluding references (20%)
  2. Presentation: oral presentation (30%)
  3. Final Report: 5 pages excluding references (50%)

All write-ups should use the NeurIPS style.



Team Formation

You are responsible for forming project teams of 2-4 people. In some cases, we will also accept teams of 1, but a 2-4 person group is preferred. Once you have formed your group, please send one email per team to the class instructor list with the names of all team members. If you have trouble forming a group, please send us an email and we will help you find project partners.

The team formation email will be due at 11:59 PM on September 09th.

Project Proposal

You must turn in a brief project proposal that provides an overview of your idea and contains a brief survey of related work on the topic. We will provide a list of suggested project ideas to choose from, though you may also discuss other project ideas with us, whether applied or theoretical. Note that while you may use datasets you have worked with before, you may not use work started prior to this class as your project.

Proposals should be approximately 3 pages long excluding references, and should include the following information:

  • Project title and list of group members.
  • Overview of project idea. This should be approximately half a page long.
  • A short literature survey of 4 or more relevant papers. The literature review should take up approximately one page.
  • Description of potential data sets to use for the experiments.
  • Plan of activities, including what you plan to complete by the middle of the semester and how you plan to divide up the work.

The grading breakdown for the proposal is as follows:

  • 40% for clear and concise description of proposed method
  • 40% for literature survey that covers at least 4 relevant papers
  • 10% for plan of activities
  • 10% for quality of writing

The project proposal will be due at 11:59 PM on October 16th.

Final Report

Your final report is expected to be 5 pages excluding references. It should have roughly the following format:

  • Introduction: problem definition and motivation
  • Background & Related Work: background info and literature survey
  • Methods
      – Overview of your proposed method
      – Intuition on why it should be better than the state of the art
      – Details of the models and algorithms that you developed
  • Experiments
      – Description of your testbed and a list of questions your experiments are designed to answer
      – Details of the experiments and results
  • Conclusion: discussion and future work

The grading breakdown for the final report is as follows:

  • 10% for introduction and literature survey
  • 30% for proposed method (soundness and originality)
  • 30% for correctness, completeness, and difficulty of experiments and figures
  • 10% for empirical and theoretical analysis of results and methods
  • 20% for quality of writing (clarity, organization, flow, etc.)

The project final report will be due at 11:59 PM on December 9th.

Presentation

All project teams will present their work at the end of the semester. Each team should present within its allocated time slot. Where applicable, live demonstrations of your software are highly encouraged.

Project Examples


Generative Models Comparison

A variety of generative models have been proposed, including variational auto-encoders, energy-based models, generative adversarial networks, diffusion models, and so on.

What are the pros and cons of each model in training, inference, and downstream tasks (text vs. images)? Design experiments to demonstrate your claims.

References

  1. A Tutorial on Energy-Based Learning.
  2. Auto-Encoding Variational Bayes.
  3. Generative Adversarial Networks.
  4. Glow: Generative Flow with Invertible 1x1 Convolutions.
  5. Denoising Diffusion Probabilistic Models.
  6. Score-Based Generative Modeling through Stochastic Differential Equations.
  7. Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow.
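
As one concrete anchor for such a comparison, here is a minimal sketch of the forward (noising) process used by denoising diffusion models. The linear beta schedule and the function names below are illustrative assumptions, not taken from any particular codebase.

```python
import math
import random

def alpha_bar(t, T=1000, beta_min=1e-4, beta_max=0.02):
    """Cumulative product of (1 - beta_s) up to step t, with a linear beta schedule."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_min + (beta_max - beta_min) * (s - 1) / (T - 1)
        prod *= (1.0 - beta)
    return prod

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) = sqrt(ab_t) * x_0 + sqrt(1 - ab_t) * noise."""
    ab = alpha_bar(t)
    noise = rng.gauss(0.0, 1.0)
    return math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * noise
```

At t = 0 the sample is the data point itself; as t grows, `alpha_bar` decays toward zero and the sample approaches pure Gaussian noise, which is the property a comparison of sampling cost across model families would build on.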

Representation Learning

We have introduced a variety of representation learning algorithms, including SimCLR, CLIP, BYOL, spectral contrastive learning, and so on.

What are the pros and cons of each method? Design experiments to demonstrate your claims.

References:

  1. A Simple Framework for Contrastive Learning of Visual Representations.
  2. Learning Transferable Visual Models From Natural Language Supervision.
  3. Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss.
  4. Bootstrap your own latent: A new approach to self-supervised Learning.
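
To make the comparison concrete, here is a minimal sketch of the NT-Xent (normalized temperature-scaled cross-entropy) loss at the core of SimCLR, written with plain Python lists rather than a tensor library; the function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two (nonzero) vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(z1, z2, temperature=0.5):
    """z1[i] and z2[i] are embeddings of two augmentations of example i."""
    z = z1 + z2                  # 2N embeddings
    n = len(z1)
    loss = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same example
        denom = sum(math.exp(cosine(z[i], z[j]) / temperature)
                    for j in range(2 * n) if j != i)
        pos_sim = math.exp(cosine(z[i], z[pos]) / temperature)
        loss += -math.log(pos_sim / denom)
    return loss / (2 * n)
```

The loss is low when the two views of each example are close and other pairs are far apart, so it drops as alignment between positive pairs improves; per-method variants of this objective are one natural axis for your experiments.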

Reinforcement Learning

Reinforcement learning has become increasingly important, demonstrating its capabilities in systems ranging from AlphaGo to large language models. Consider applying RL to practical problems, e.g., Atari games, robot control in simulators, controllable generation, and language models.

References:

  1. Direct Preference Optimization: Your Language Model is Secretly a Reward Model.
  2. Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review.
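
A project along these lines could start from something as small as the following sketch of the REINFORCE policy-gradient update on a toy two-armed bandit; the setup and names are illustrative, not from any RL library.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def train_bandit(arm_means=(0.2, 0.8), steps=2000, lr=0.1, seed=0):
    """REINFORCE on a 2-armed Bernoulli bandit; returns the learned policy."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]              # policy parameters
    for _ in range(steps):
        probs = softmax(logits)
        a = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < arm_means[a] else 0.0
        # grad of log pi(a) w.r.t. logits is one_hot(a) - probs
        for i in range(2):
            grad = (1.0 if i == a else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return softmax(logits)
```

After training, the policy should prefer the higher-reward arm; scaling this update rule from bandits to Atari or simulated robot control is essentially what the deep RL papers above do.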

Machine Learning on/with Graphs

Machine learning on graphs is an important and ubiquitous task, with applications ranging from drug design to social network modeling. How to conduct machine learning with graph data is an important question, covering generative models for graphs, classification on graphs, and so on.

References:

  1. Learning Deep Generative Models of Graphs.
  2. GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models.
  3. Scalable Deep Generative Modeling for Sparse Graphs.
  4. Discriminative Embeddings of Latent Variable Models for Structured Data.
  5. Representation Learning on Graphs: Methods and Applications.
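
Much of the work above builds on one core operation: aggregating each node's features with those of its neighbors ("message passing"). Here is a minimal sketch of a single aggregation round on an adjacency-list graph; the representation and function name are illustrative.

```python
def aggregate(adj, features):
    """One round of mean aggregation over a graph.

    adj: {node: [neighbor nodes]}, features: {node: float}.
    Returns each node's feature averaged with its neighbors' features.
    """
    out = {}
    for node, neighbors in adj.items():
        vals = [features[node]] + [features[n] for n in neighbors]
        out[node] = sum(vals) / len(vals)
    return out
```

Stacking such rounds, with learned transformations between them, is the basic recipe behind graph neural networks used for both classification on graphs and graph generation.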