bfilter plumbed in (sorta)

Posted on January 6, 2016

I had a pretty dismal end to 2015, at least partly because my beautiful little boy discovered that he can i) stand up, and ii) press keys on Daddy’s laptop, and this is definitely one of the best things ever. Unfortunately his Haskell coding skills are rather lacking at the moment, although he can write syntactically valid Perl. :)

I decided I must do better in 2016, and last night I put the very first union of flare and bfilter live on my own email. There are some very serious limitations to work on, but it is already doing a fantastic job of diverting my haskell-cafe and fedora-devel email. It also seems to be doing surprisingly well at picking out spam, although that’s probably largely because it’s assuming pretty much everything is spam (which is pretty much true).

Here are some notes on the limitations from the flare side:

  1. the list of categories is fixed at compile time.
  2. the “Good” and “Spam” buttons need to be made to do sensible things.
  3. there is only a single bfilter database, not one per user.

For 1, I think I’ll add a new database table of the available categories (for each user). Um, but then there’ll need to be UI to add to the list… and what happens if you remove a category?

I think there must be at least 2 “fixed” categories, Spam and, idk, General, and the Spam / Good buttons simply move the message to those. We don’t currently have any way to know when to display the buttons (see below about exposing the range), which I have “solved” by never displaying them.
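To make the per-user category table and the two fixed categories a bit more concrete, here is a rough Python sketch using sqlite3. Everything in it is an assumption for illustration: the table and column names, the `General` name for the second fixed category, and especially the answer to “what happens if you remove a category?”, which here is simply “its messages fall back to General”. It is not flare’s actual schema.

```python
import sqlite3

# Assumed names for the two "fixed" categories; only "Spam" is settled.
FIXED = ("Spam", "General")


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) tables if needed."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS category (
            user_id INTEGER NOT NULL,
            name    TEXT    NOT NULL,
            PRIMARY KEY (user_id, name)
        );
        CREATE TABLE IF NOT EXISTS message (
            id       INTEGER PRIMARY KEY,
            user_id  INTEGER NOT NULL,
            category TEXT    NOT NULL
        );
    """)
    return db


def add_user(db, user_id):
    """Every user starts with the fixed categories."""
    db.executemany("INSERT OR IGNORE INTO category VALUES (?, ?)",
                   [(user_id, name) for name in FIXED])


def add_category(db, user_id, name):
    db.execute("INSERT OR IGNORE INTO category VALUES (?, ?)",
               (user_id, name))


def remove_category(db, user_id, name):
    """Refuse to remove a fixed category; otherwise re-home its messages."""
    if name in FIXED:
        raise ValueError(f"{name} is a fixed category")
    with db:  # one transaction: move the messages, then drop the category
        db.execute("UPDATE message SET category = 'General' "
                   "WHERE user_id = ? AND category = ?", (user_id, name))
        db.execute("DELETE FROM category WHERE user_id = ? AND name = ?",
                   (user_id, name))


def categories(db, user_id):
    rows = db.execute("SELECT name FROM category WHERE user_id = ? "
                      "ORDER BY name", (user_id,))
    return [row[0] for row in rows]
```

Note that this only covers flare’s own bookkeeping. The classifier would still know about the removed category until bfilter itself grows a way to forget one.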

Obviously 3 is fairly easy to fix.

And on the bfilter side:

  1. I think I’d like to expose the “range” (which is used to decide when to say UNSURE), since we will need it to know when to display the Good / Spam buttons.
  2. There is no way to untrain a message…
  3. …or remove a category.
  4. I really want to be able to have two (or n) databases, so we can have a honeypot style thing. Simplest way to do that would probably be to define BFILTER_DBS_EXTRA as a colon-separated list of, errm, extra databases.
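
For the last item, reading a colon-separated BFILTER_DBS_EXTRA could look something like the Python sketch below. The variable does not exist yet, so this is purely a guess at how it might behave; the function name `extra_dbs` and the example paths are made up, and skipping empty entries is my own assumption.

```python
import os


def extra_dbs(environ=os.environ):
    """Return the extra database paths listed in BFILTER_DBS_EXTRA.

    The value is treated as a colon-separated list. Empty entries
    (e.g. from a stray "::" or a trailing ":") are skipped, and an
    unset variable means "no extra databases".
    """
    raw = environ.get("BFILTER_DBS_EXTRA", "")
    return [path for path in raw.split(":") if path]


if __name__ == "__main__":
    # Hypothetical paths, just to show the shape of the result.
    env = {"BFILTER_DBS_EXTRA": "/var/lib/bfilter/honeypot.db:/home/me/extra.db"}
    for path in extra_dbs(env):
        print(path)
```

Passing the environment in as a parameter is only there to make the function easy to poke at without touching the real environment.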

So, plenty of work to do. But still, I’m feeling really very smug about how well it’s going so far.