GSOC Ideas


About LU

Learning Unlimited (LU) creates opportunities for middle school and high school students to explore a variety of topics --- from quantum mechanics to urban design to modern poetry to street drumming --- so that students can find what they love to learn. Our mentoring and support enables college students to run independent educational programs on their campuses where they teach their passions, sharing them with younger generations.

We're a lean organization, but we now support programs at 17 universities (including MIT, Stanford, the University of Chicago, and Duke), which reach over 7,000 pre-college students and over 1,000 volunteers each year. We are also mentoring several startup efforts and continue to grow each year. To find out more about us, take a look at our website.

LU maintains an extensive open-source Web application system that provides the automation needed for small teams of college students to run programs that scale to hundreds of classes and thousands of participants. This system allows teachers to register classes, students to sign up for classes, and administrators to schedule classes; however, this is only the tip of the iceberg. Web sites running on our code have complex user searching, querying, and e-mailing capabilities; automated generation of printed materials such as course catalogs and attendance sheets; and heavily customizable front-end interfaces. Users on the site include students (aka high school students), teachers (aka college students), admins (also college students), and miscellaneous others such as parents. We're always looking for more people to join the dedicated volunteers that work on this codebase.

One unique aspect of our project is that we're small. Right now, there are about seven active contributors to the codebase, although some of them devote a lot of time to it! If you join up, this carries some benefits because you'll know everyone and you'll be able to have a big impact on the final outcome. On the other hand, you need to be self-directed and OK with a smaller (and tight-knit) community.

Contacting Us

We invite you to join us in creating software that impacts the experiences of thousands of people via educational outreach, while also learning and inventing state-of-the-art Web programming techniques. If you're interested, we invite you to contact us and introduce yourself:

  • E-mail: gsoc at (feel free to add yourself to the list too!)
  • IRC: #lu-web at
  • Some of us hang out in a Jabber chat room:

Our IRC server listens on an alternate port, 8001, as well as the default IRC port; this can be useful if your university blocks the default port. You can also try which is a Web interface to IRC.

We have four mentors; while each project will be assigned one primary mentor, most discussions will take place with all mentors so we can all have a role regardless of which project you work on.

  • Michael Price (MIT '08) participated heavily in MIT ESP programs and now coordinates the LU Web Team. He worked as an electrical engineer doing R&D for small UAVs and satellites at Aurora Flight Sciences, and recently returned to MIT as a graduate student.
  • Andrew Geng (MIT '11) is a graduate student in mathematics at the University of Chicago and has been working on Splash Web sites since 2007.
  • Jordan Moldow (MIT '14) participated in GSoC as a student last summer. He is now a major contributor to our code and has been a treasurer and summer program director of MIT ESP. He is studying mathematics and computer science.
  • J.D. Zamfirescu (MIT '05) has volunteered for both MIT ESP and Stanford ESP and now chairs LU's board of directors. He co-founded AppJet (creators of Etherpad), which was recently acquired by Google.

The Code

Our website is based on the Django framework (using the Python programming language). Our source code repository is hosted by GitHub, where you'll also find records of our bug reports, feature requests and code reviews. We also have a Trac site with some Wiki pages and instructions to check out the code repository and to get a copy running on your computer. The site is open-source, licensed under the AGPLv3.

Most of our developers have set up the application stack (code, Django, database and cache servers) on one of their personal machines in order to try out changes without affecting a production site. We refer to this as a "dev server." You may find it helpful to set up your own dev server so you can explore the code and learn how to make improvements. You can do this with our automated setup script or by downloading a prepared virtual machine image. Once you have a dev server working, try our Dev server tutorial.

We recommend first getting to know the site informally as a user. For example, request an account on this Wiki and browse our user documentation. Or look through some of our existing sites.

About This Wiki

Access to much of the wiki is limited to those with user accounts because we want leaders of each program to feel free to discuss internal logistics. However, we are happy to allow individuals (who have a reason for being here) full access. Just request an account and mention that you're thinking about GSOC.

If you want to find out more about what this code is actually used for, you might want to check out What is a Splash.



Below are some new features that would make the work of the student leaders much easier. We encourage proposals that explore variations or entirely new ideas.

We've included a list of "relevant skills" with each of these ideas. All of them implicitly include an understanding of Python and experience working with Web applications, since our code is primarily a Python Web application.

However, we've found that the most important prerequisite is motivation. Some of our best developers in the past started off with very little relevant experience, and they learned as they went. If you're really excited about a project, go for it; we're more than happy to help you learn some of the background, so that "Relevant Skills" becomes "Skills Learned". And certainly feel free to get in touch: e-mail web-team at and we'll do our best to advise you.

Each project is also accompanied by a list of "relevant tasks." These tasks should be within reach of a new developer (we'll gladly assist) and may provide a useful experience when preparing for the associated project. Setting up a dev server and contributing code is not required for a GSoC application, but may help you write a realistic and specific proposal.

New e-mail handling system

Relevant skills/what you might learn:

  • Web development
  • User-interface design
  • Experience with SMTP protocol

Difficulty: Hard

We currently have two different systems that interact to serve the e-mail needs of our chapters; they don't always interact well, nor does either of them suffice by itself. The systems are currently divided as follows:

  • Program directors send e-mails to targeted groups in order to advertise programs and communicate important logistics. This might mean e-mailing all former students to announce the next program dates, all students registered for a program to tell them where to show up, or any number of subsets of this list such as students who've applied for the lottery but didn't get in to certain classes, students whose classes need to be canceled, students who have received financial aid, etc. This is currently handled by a "dbmail" app in our codebase, which has a lot of potential but is complicated and difficult to use.
  • Local programs need regular mailing lists, with tools for moderation and subscription management. This is currently handled by Mailman, which is almost perfect except it doesn't handle virtual hosts.

Potential projects:

  • Implement moderation and subscription-management for "dbmail" mailing lists.
  • Make it easy for regular users to create "dynamic" mailing lists, based on specified filters and criteria, that respect users' subscription preferences.
  • Study how to reduce the spam score of our e-mail server; implement technical fixes and recommend policies to chapters that will help prevent their e-mails from being blocked and ensure compliance with the CAN-SPAM Act.

Relevant tasks:

  • Add a customizable opt-out link to outgoing e-mails (Github ticket)
  • Standardize on a single send_mail() function (Github ticket)
  • Reduce space consumption of e-mail logs in database (Github ticket)
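As a starting point for the opt-out task, here is a minimal sketch of building an outgoing message with an opt-out footer, using only the Python standard library. The base URL, the query parameters, and the function names are assumptions made for illustration; they are not taken from the "dbmail" app, and a real implementation would need a signed or otherwise unguessable link.

```python
from email.message import EmailMessage
from urllib.parse import urlencode

# Hypothetical base URL; each chapter site would supply its own.
OPT_OUT_BASE = "https://example.org/unsubscribe"


def opt_out_url(recipient, list_name):
    """Build a per-recipient opt-out URL (illustrative scheme only)."""
    query = urlencode({"email": recipient, "list": list_name})
    return OPT_OUT_BASE + "?" + query


def build_message(sender, recipient, subject, body, list_name):
    """Return an EmailMessage with an opt-out footer and header."""
    url = opt_out_url(recipient, list_name)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # List-Unsubscribe (RFC 2369) lets mail clients offer an
    # "unsubscribe" control of their own.
    msg["List-Unsubscribe"] = "<" + url + ">"
    footer = "\n\n--\nTo stop receiving these e-mails, visit:\n" + url + "\n"
    msg.set_content(body.rstrip() + footer)
    return msg
```

Routing every outgoing message through one function like this is also one way to approach the "single send_mail()" task, since the footer and headers are then added in exactly one place.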

Integration with automatic scheduler

Relevant skills/what you might learn:

  • Designing models (data structures) for complex systems
  • Interprocess communication
  • Logistics and event planning

Difficulty: Moderate

Scheduling classes is a combinatorial optimization problem that can grow quite large (e.g. 1,000 class sections over 20 hours). This problem has several types of constraints, including teacher availability and classroom size, and it takes many man-hours of work to put classes in the schedule manually. Louis Wasserman (a student at the University of Chicago) designed a program, which, after several design iterations over the last 3 years, was first used successfully for the MIT Spark program in spring 2012. The goal of this project is to make it possible for chapters with little technical expertise to take advantage of this powerful and complex tool.

The automatic scheduler (written in Java) is a standalone program that accepts a human-readable "program description file" and generates a list of tuples of the form (class, room, timeslot) which represent a schedule. The input file contains a list of teachers, classes, rooms, equipment, and the relationships and constraints between these objects. However, the data structures in our "resources" app and the views used to collect relevant data are not yet suited to the task. In order to make automatic scheduling routine (and hence free up loads of volunteer time) we need to make these models consistent and complete, and give Web site users the ability to specify all of the necessary information in an intuitive manner.
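To make the (class, room, timeslot) output concrete, the sketch below shows how the Python side might sanity-check a schedule once it has been read back in. The field names, the shape of the teacher and availability data, and both function names are assumptions for illustration; the scheduler's actual file formats are not described here.

```python
from collections import Counter, namedtuple

# One scheduled section, mirroring a (class, room, timeslot) tuple.
Assignment = namedtuple("Assignment", ["class_id", "room", "timeslot"])


def find_room_conflicts(assignments):
    """Return sorted (room, timeslot) pairs that are used more than once."""
    counts = Counter((a.room, a.timeslot) for a in assignments)
    return sorted(slot for slot, n in counts.items() if n > 1)


def find_availability_violations(assignments, teachers, availability):
    """Return assignments where some teacher is not free in the timeslot.

    teachers: dict mapping class_id -> list of teacher names (assumed shape)
    availability: dict mapping teacher name -> set of free timeslots
    """
    violations = []
    for a in assignments:
        for teacher in teachers.get(a.class_id, []):
            if a.timeslot not in availability.get(teacher, set()):
                violations.append(a)
                break  # report each assignment at most once
    return violations
```

Checks like these are cheap to run before a generated schedule is saved to the database, and they give directors a readable list of problems rather than a silent failure.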

Possible projects:

  • The project we suggest is doing all of the work necessary to set up and run the automatic scheduler through a Web interface. This includes reworking our "resources" models and teacher registration pages, as well as designing the views needed to prepare and edit a program description and the plumbing needed to hook up the Python/Django Web app to the Java scheduling process.

Relevant tasks:

  • Write a view that returns a ZIP file of all of the JSON input data used to initialize a program description.
  • Write a view that allows searching for keywords in the "message to directors" specified by teachers.
  • Add the ability to associate timeslots with a range of grades (Github ticket).
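For the ZIP task, the packaging step can be sketched independently of Django. The helper below is hypothetical, and the document names in the usage note are placeholders rather than the real input files.

```python
import io
import json
import zipfile


def zip_json_documents(documents):
    """Pack {name: JSON-serializable object} into an in-memory ZIP.

    Returns the archive as bytes, with one "<name>.json" member per entry.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in sorted(documents.items()):
            text = json.dumps(data, indent=2, sort_keys=True)
            archive.writestr(name + ".json", text)
    return buffer.getvalue()
```

A view could then return these bytes in a Django HttpResponse with content_type="application/zip", for example after calling zip_json_documents({"rooms": [...], "teachers": [...]}).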

Performance testing

Relevant skills/what you might learn:

  • Web site scraping and data collection
  • Caching techniques and correctness
  • Using and extending WSGI profiling tools

Difficulty: Hard

Performance is a significant issue for our Web sites, especially when students pile on to sign up for classes at the beginning of a registration period. The MIT ESP site in particular has been overloaded countless times in the past, leading to unpredictable (and unfair) behavior and inciting hundreds of concerned e-mails and phone calls that volunteers had to deal with. Our other sites are growing towards this point as well. We're already applying a few techniques to speed things up, including:

We have a test suite for checking the correctness of our code, and now we need a test suite for checking the performance of our code. The test suite needs to simulate the behavior of our actual users as well as possible (perhaps based on log information that we collect). It might hook into profiling tools on the server to collect data during tests.

Many of the "low-hanging fruit" for improving performance in our system have already been picked, and it is becoming less and less obvious how to speed things up. More information on the performance of our code would provide much-needed motivation for architectural changes and tweaking that benefit our users' experience.

Potential projects:

  • The suggested project is to write a stress tester that simulates the load of many users on a site and helps us identify which portions of our code to focus optimization effort on. It may include both a client component (the simulated users, possibly distributed) and a server component (profiling tools).

Relevant tasks:

  • Identify one or more pages that we do not currently proxy-cache (using Varnish) but could, and explain how.
  • Write a script to fetch a page N times while logged in as a single user and compare the performance to loading the page once via N different logged-in users.
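The timing half of the second task might start from something like the sketch below. It deliberately leaves out login handling: the caller passes in any zero-argument function that performs one fetch, so the same timer can be reused for one session or for N different sessions. All names here are illustrative.

```python
import statistics
import time
import urllib.request


def fetch_url(url):
    """Fetch a URL anonymously and return the response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_fetches(fetch, n):
    """Call fetch() n times and summarize the elapsed wall-clock seconds."""
    if n < 1:
        raise ValueError("n must be at least 1")
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return {
        "count": n,
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }
```

For example, time_fetches(lambda: fetch_url("http://localhost:8000/"), 50) would time 50 anonymous requests against a dev server, assuming one is listening at that address. Sequential requests like these measure latency only; simulating a registration rush would also need concurrent clients.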

Class pages for student/teacher interaction

Relevant skills/what you might learn:

  • Development of standalone web-application components (database, server-side logic, HTML, optionally JavaScript)
  • Server administration to integrate third-party libraries
  • Interacting with users to design a feature

Difficulty: Moderate

If you've ever used a Web-app like Moodle, or Claroline, or any flavor of course-management tools, you've seen that it can be very useful for teachers to have online tools to help them distribute homework assignments, host online discussions between students, and the like. We would really like to have a similar set of features for our own teachers. This is a very open-ended option; the first step will be to look around at what's out there and talk with some of our teachers (we can put you in contact) and work out a plan for exactly what this functionality will look like. At that point, you would develop a plan for what to implement, either by creating a new system specifically for our site or incorporating another open-source option.

Some past discussion is archived here: Website Design/Class Pages

Schema simplification

Relevant skills/what you might learn:

  • Database design, schema
  • Usage/customization of Django Web framework (models, admin UI)
  • User interfaces; usability analysis

Difficulty: Moderate

Our database schema is often difficult to query efficiently. For example, we store a great deal of data as UserBits, where a UserBit is an RDF-like relation between a user, an "object" on our site, and an action such as "can view this document" or "has registered for this class". Both objects and actions are hierarchical, and UserBits specify time ranges that they are valid for. This is a lot more power than we need to mark someone as a teacher for a class. Similarly, we rarely delete old data but instead mark it as "invalid"; this is necessary for accountability reasons, but it would be nice if it could get shuffled off to old log tables so that it doesn't clutter up the data that we're still working with.
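To illustrate why this is more power than a simple "teaches this class" relation needs, here is a toy model of the idea in plain Python. It is only a sketch built from the description above: the field names, the "/"-separated paths, and the matching rules are assumptions, not the actual UserBit implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class UserBit:
    user: str
    obj: str       # hierarchical path, e.g. "Programs/Splash/2012"
    action: str    # hierarchical path, e.g. "Register/Student"
    start: Optional[datetime] = None   # None means "no lower bound"
    end: Optional[datetime] = None     # None means "no upper bound"


def covers(granted, requested):
    """True if `granted` equals `requested` or is one of its ancestors."""
    return requested == granted or requested.startswith(granted + "/")


def user_has_bit(bits, user, obj, action, when):
    """Check every bit: user, time range, and both hierarchies must match."""
    for bit in bits:
        if bit.user != user:
            continue
        if bit.start is not None and when < bit.start:
            continue
        if bit.end is not None and when >= bit.end:
            continue
        if covers(bit.obj, obj) and covers(bit.action, action):
            return True
    return False
```

Even in this toy version, answering "is this user a teacher of this class?" means comparing two hierarchies and a time range for every candidate row. An explicit relation (for example, a table of teacher and class pairs) would answer the same question with a single indexed lookup.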

Django, the Web framework that we use, provides a nice Administration interface that lets ordinary users browse table contents and insert and update rows in tables from the Web. One metric of success might be that untrained admins can go through this interface and, because the schema is so intuitive, get lots of things done simply by looking at the table names, clicking around, and taking a guess at what to do.

Potential projects:

  • Propose a narrower scope for the usage of UserBits and explicitly implement many of the relationships currently implied by UserBits.
  • Identify volatile models and views that can be substituted with the dynamic models and views provided by our custom forms app (a previous GSoC project) and implement the improvements needed to consolidate them.

Relevant tasks:

  • Switch to using a many-to-many field to represent which surveys a user has filled out.
  • Make a view that can be used to open student or teacher registration for an individual user.
  • Remove legacy fields from the ClassSubject model.

Generic templates

Relevant skills/what you might learn:

  • User-interface and user-experience design
  • Code generation
  • Javascript and CSS

Difficulty: Moderate

We're working on simplifying and improving the user experience across all of our chapters' sites. The appearance and navigational structure of our sites is not specified in our codebase (by design), but we should provide the tools chapters need to create and maintain such a structure.

Most of our views take advantage of Django's hierarchical templates: one template derives from another, and they are based on a "root" or "main" template. Site designs are currently implemented by manually editing this "main" HTML template and the associated stylesheets and image files. However, it may be possible to generate the main template based on user preferences. This is similar to the concept of "skins" used by popular blogging and forum software packages. Having such a capability would make it much easier for new chapters (often run by 1-2 overworked student volunteers) to get their Web sites started, and for established chapters to maintain and improve their interfaces.
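The idea of generating the main template from user preferences can be sketched with the standard library alone. The option names, defaults, and markup below are invented for illustration and are far simpler than a real site's main template.

```python
from string import Template

# Hypothetical skin options and their defaults.
DEFAULT_SKIN = {
    "primary_color": "#336699",
    "font_family": "sans-serif",
    "site_title": "Splash",
}

STYLESHEET = Template("""\
body { font-family: $font_family; }
#header { background: $primary_color; }
""")

# The generated output is itself a Django template, so the block tags
# are passed through untouched for child templates to fill in.
MAIN_TEMPLATE = Template("""\
<html>
<head><title>$site_title</title><style>
$stylesheet</style></head>
<body><div id="header">$site_title</div>
{% block content %}{% endblock %}
</body>
</html>
""")


def render_main_template(preferences):
    """Merge preferences over the defaults and generate the main template."""
    unknown = set(preferences) - set(DEFAULT_SKIN)
    if unknown:
        raise ValueError("unknown skin options: " + ", ".join(sorted(unknown)))
    options = dict(DEFAULT_SKIN, **preferences)
    css = STYLESHEET.substitute(options)
    return MAIN_TEMPLATE.substitute(options, stylesheet=css)
```

Rejecting unknown options keeps typos from silently falling back to defaults. This sketch does no escaping, so a real version would have to validate or escape values before writing them into HTML and CSS.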

Potential projects:

  • Design a system for letting users specify the appearance of their site (there could be dozens or hundreds of options, which must be displayed in an efficient manner) with a WYSIWYG preview, and apply/revert the changes to a running site.
  • Using our workflow plans as a guide, design one or more sets of generic templates that look professional, are easy to use, and make heavy use of CSS so that the design can be parameterized and adjusted easily.
  • Generalize our SVG-based image generator (originally developed by MIT) to generate skins for other sites with different colors, shapes and text specified by a user through a Web interface.

Redesign student registration

Relevant skills/what you might learn:

  • User-interface design
  • Ajax and Javascript

Difficulty: Moderate

The current student registration system is very Web 1.0: it involves clicking through a series of Web pages (consisting mostly of static HTML). This project aims to bring it up to Web 2.0 standards -- we want to improve the student registration process by having a more dynamic scheduling system and a better user experience. And running more of student registration on the client would have the additional effect of reducing server load.

We've been working on standardizing our Javascript views to use a consistent API for fetching data from the server, and provided JSON views for much of the information that client-side apps need to use. This new API was used to redesign the lottery registration sign-up module used by MIT in February. It should now require much less server-side hacking to make a smooth client-side registration app.

Some previous proposals for an advanced student registration system can be found here.

Potential projects:

  • The suggested project is to convert the entire student registration process (e.g. /learn/[program]/[instance]/studentreg) into a client-side app having the same modular form.
  • Innovative ideas on how to help students select classes (e.g. presenting a lot of information without overloading the user) will be appreciated.
  • Please consider how to minimize the amount of information exchanged with the server, as these views may be used by thousands of people on a single site concurrently.

Relevant tasks:

Financial system

Relevant skills/what you might learn:

  • Web development
  • Experience with financial accounting

Difficulty: Hard

Students use our Web sites to register for programs run at universities. Sometimes these programs are run for free; sometimes there is an entrance fee. Sometimes the students can buy things like T-shirts with the program's logo on them, or tickets to lunch and dinner events. They can pay for these things online with a credit card, or in person with a check or cash. Some students receive financial aid; this reduces the admission fee for the program, and the fee for some purchasable items. Sometimes the financial aid is granted uniformly and sometimes the administrators adjust it per-student based on need.

On the other hand, teachers sometimes buy supplies for their classes and get reimbursed for them through a precisely-specified reimbursement process. At some universities, this process involves integrating with an internal accounting system, to submit reimbursement requests. Each class has a cap on how much all of its teachers can be reimbursed for in total, but some exceptions are granted for specific classes.

All in all, there's a lot of money moving around between a lot of people. Our accounting system needs to keep track of these transactions and make it easy for people to create, modify and view them.

Our existing code allows some of these tasks to be performed and tracked through the website, but it is confusing and incomplete. Our models store accounting data using a double-entry accounting system, but our users (and many programmers!) don't fully understand double-entry accounting. We invite proposals that address these problems in an elegant fashion, keeping in mind that sometimes less is more.
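For readers new to this style of bookkeeping, the core idea is that every transaction is recorded twice: once as a debit from one account and once as an equal credit to another, so the sum over all accounts is always zero. The sketch below is a generic illustration with made-up account names; it does not reflect our actual accounting models.

```python
from collections import defaultdict
from decimal import Decimal


class Ledger:
    """A toy ledger in which every transfer creates two balancing entries."""

    def __init__(self):
        self.entries = []  # list of (account, signed amount, memo)

    def transfer(self, source, destination, amount, memo=""):
        """Move a positive amount from one account to another."""
        amount = Decimal(amount)
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.entries.append((source, -amount, memo))
        self.entries.append((destination, amount, memo))

    def balances(self):
        """Return the net balance of every account that has entries."""
        totals = defaultdict(Decimal)
        for account, amount, _memo in self.entries:
            totals[account] += amount
        return dict(totals)
```

With this structure, a program fee is a transfer from a student's account to the program's account, and financial aid is a transfer from an aid account back to the student. A non-zero grand total is an immediate sign that something was recorded incorrectly.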

Potential projects:

  • Simplify the accounting models (while preserving most of their capabilities) and create the UIs necessary for users to interact with them.
  • Integrate the site with a separate open-source accounting and/or e-commerce library.

Relevant tasks:

  • Add a view that displays financial information for an individual student.
  • Alter the SplashInfoModule (which is used by some of our chapters to collect students' lunch preferences and apply sibling discounts) to record charges in the accounting system.
  • Display the amount collected so far in credit card payments (Github ticket).

Mobile apps

Relevant skills/what you might learn:

  • Mobile application development (both client-side and server-side)
  • User interface design

Difficulty: Moderate

More and more of our program participants are equipped with smart phones, and many do not have a computer (or use a phone as their primary computing device). Our volunteers have not yet had the time to implement separate (or additional) features specifically for mobile devices. GSoC provides us with an excellent opportunity to take advantage of this technology, both to expand the accessibility of our chapters' sites and to smooth logistics at their programs.

Potential projects:

  • Design a user interface (e.g. HTML templates) for our sites that works well with limited display size and resolution, and figure out how to ensure that these templates are used when rendering pages for small-screen devices.
  • Create a dedicated mobile app that implements a subset of our most commonly used Web site features, for example creating a student account and signing up for a program.
  • Create a dedicated mobile app that allows students to check in for a program electronically, avoiding the crowds of an in-person registration.

Relevant tasks:

  • Parse our server logs and find out what percentage of site visitors used each type of mobile Internet device.
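A first pass at this task might look like the sketch below. It assumes the logs are in the common Apache/nginx "combined" format, where the user agent is the last quoted field on each line, and it classifies devices with a few substring checks. Both the format and the marker list are assumptions to adjust against the real logs.

```python
import re
from collections import Counter

# In the "combined" log format, the user agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

# Checked in order; the first marker found in the user agent wins.
DEVICE_MARKERS = [
    ("iPad", "iPad"),
    ("iPhone", "iPhone"),
    ("Android", "Android"),
    ("Mobile", "other mobile"),
]


def classify(user_agent):
    """Map a user-agent string to a coarse device label."""
    for marker, label in DEVICE_MARKERS:
        if marker in user_agent:
            return label
    return "non-mobile"


def device_percentages(log_lines):
    """Return {label: percentage of requests} for lines with a user agent."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_RE.search(line)
        if match:
            counts[classify(match.group(1))] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Note that this counts requests, not visitors. Estimating the share of visitors would require grouping lines first, for example by IP address or session, before classifying.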

Propose your own

If you've worked with a Splash before, then you might have your own ideas for what might make the software better. Please tell us about them! You may think of something great that we never considered.

Other warm-up tasks

We invite small contributions, as a means of getting familiar with our code and our existing volunteer contributors. Some of the contributions we think would be appropriate for a new GSoC student are listed in association with one of the projects above. Here are some more ideas, of varying complexity:

  • Allow teachers to copy a class from a previous program (Github ticket)
  • Fix the account merging feature.
  • Generalize the per-user loading/saving features implemented here to be usable for any custom form.
  • Allow the customizable registration step (also implemented here) to be used multiple times with different forms.
  • Allow a specified UserBit to be granted to each user when they submit a custom form.
  • Use Django groups for something we currently do using UserBits.
  • Improve the workflow for account registration when there are duplicate e-mail addresses and/or names.
  • Clean up the jQuery code used in the custom forms builder.

If you'd like to work on one of these tasks, we will be happy to help over e-mail and IRC. Remember that there is also a dev server setup script to help you get started.

A complete list of open feature requests and bug reports can be found at our Github issues page.

Past Projects

LU participated in Summer of Code 2011. The students that we selected last year (Vishal Dugar and Jordan Moldow) have agreed to let us link to their proposals in order to provide examples of successful proposals.

This is a good example of the kind of proposal we are looking for; it is concise, with a clear summary of Vishal's previous work, an overview of his vision for the new app, a selection of design elements that he wanted to include, and an implementation timeline. Vishal discussed his project ideas extensively with us on our e-mail list, and prototyped a key element of the system (dynamic models in Django) during the application period.

Jordan had made significant contributions to our code already, so in this case we were content with a lack of technical detail in his proposal.

It should also be noted that he adjusted the topic of his work significantly after submitting the proposal; the scope was narrowed to focus on querying and viewing data, given that Vishal was taking care of custom forms. The takeaway from this is: come up with the best proposal you have for what you want to do, without worrying about interaction with other potential projects. We'll select the best proposals independently and then help the students coordinate their work with one another.