Sponsor:
Charles Isbell
isbell@cc.gatech.edu
CRB 380
Area: Machine Learning
Problem: I get a lot of email. And when I say a lot, I mean a lot. Given my mail problems, a few friends and I wrote an email server/client system that handles all my mail for me. It's called Ishmail, and one of its many cool features is that it's an extensible research platform.
Anyway, like many high-traffic users, I use rules to sort my mail. I have a Scheme-based language I hacked up that helps me define and use those rules (although that language is actually hidden from the user by a GUI). For most cases, it is pretty easy to define the rules I need; however, sometimes I'm too lazy to do it on my own or I can't get it quite right. What would be nice is if I could select a bunch of examples of mail I want to be in the same mailbox (and perhaps some mail I don't want in that mailbox as well) and have the system extract a candidate rule for me that I could then edit and use. Further, if I already have a rule for that mailbox, it would be nice if the new candidate rule were as similar as possible to the old rule.
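To make the idea concrete, here is a minimal sketch of one way a candidate rule could be extracted from examples. It is not Ishmail's actual learner or rule language: the message representation (a dict of header name to value), the "header contains token" condition form, and the names `conditions_of`, `matches`, and `extract_rule` are all assumptions made up for illustration. The sketch keeps whatever part of the old rule still holds for every positive example, then greedily adds shared conditions until no negative example matches.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Message = Dict[str, str]      # hypothetical: header name -> header value
Condition = Tuple[str, str]   # hypothetical: (header, token) = "header contains token"


def conditions_of(msg: Message) -> Set[Condition]:
    """Every (header, lowercased token) pair that holds for one message."""
    return {(h, tok) for h, v in msg.items() for tok in v.lower().split()}


def matches(rule: Iterable[Condition], msg: Message) -> bool:
    """A rule is a conjunction: all of its conditions must hold."""
    return set(rule) <= conditions_of(msg)


def extract_rule(
    positives: List[Message],
    negatives: Iterable[Message] = (),
    old_rule: Iterable[Condition] = (),
) -> Optional[List[Condition]]:
    """Return a conjunctive rule covering all positives and no negatives,
    or None if these features cannot separate the two sets."""
    if not positives:
        return None
    # Only conditions true of every positive example are candidates.
    shared = set.intersection(*(conditions_of(m) for m in positives))
    # Stay close to the old rule: keep the parts of it that are still valid.
    rule = sorted(c for c in set(old_rule) if c in shared)
    remaining = [m for m in negatives if matches(rule, m)]
    # Greedily add conditions that exclude at least one remaining negative.
    for cond in sorted(shared - set(rule)):
        if not remaining:
            break
        still = [m for m in remaining if cond in conditions_of(m)]
        if len(still) < len(remaining):
            rule.append(cond)
            remaining = still
    if remaining:
        return None
    # With nothing to exclude, fall back to the most specific rule.
    result = rule or sorted(shared)
    return result or None
```

A real extractor would need richer conditions (sender domains, regular expressions, disjunctions) and a better notion of "similar to the old rule" than simply reusing its surviving conditions, but the sketch shows the basic shape of the problem.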
It would also be nice if the system could notice when I'm manually moving a lot of messages from one mailbox to another and come up with a modification of either the source or target mailbox rules that would get the right messages in the right place (taking into account things like the order in which rules are fired). Note that one could accomplish a simple version of this without doing much real work on the underlying rule extractor. There are, in fact, a slew of HCI-related issues here.
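The "simple version" might amount to little more than bookkeeping. The sketch below is a guess at what that could look like, not anything that exists in Ishmail: the class name `MoveTracker`, its methods, and the threshold of five moves are all hypothetical. It records each manual move and flags any source/target pair that has seen enough of them to be worth suggesting a rule change.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # hypothetical: (source mailbox, target mailbox)


class MoveTracker:
    """Counts manual moves and flags mailbox pairs that look rule-worthy."""

    def __init__(self, threshold: int = 5) -> None:
        # Assumed cutoff; a real system would need to tune or learn this.
        self.threshold = threshold
        self.moved: Dict[Pair, List[str]] = defaultdict(list)

    def record(self, msg_id: str, source: str, target: str) -> None:
        """Remember that the user dragged msg_id from source to target."""
        self.moved[(source, target)].append(msg_id)

    def suggestions(self) -> Dict[Pair, List[str]]:
        """Pairs with at least `threshold` manual moves, with their messages."""
        return {
            pair: ids
            for pair, ids in self.moved.items()
            if len(ids) >= self.threshold
        }
```

The flagged messages could then serve as positive examples for the target mailbox's rule and as negative examples for the source mailbox's rule. Deciding which of the two rules to change, and how rule firing order affects that choice, is left open here.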
So I need someone to add this functionality to Ishmail.
What needs to be done: At this point, an undergraduate has already done a simple implementation of an inductive learner for this problem, based in part on a simple rule-learning system that was used on a very old Emacs-based version of Ishmail. It should be possible to build upon this work. See me for a pointer to the paper and code. The deliverable, then, would be either a paper that explicates the problem or, better yet, the beginnings of an integrated working prototype.