] There's been some discussion in the blog world about
] using a Bayesian categorizer to enable a person to
] discriminate along various interest/non-interest axes. I
] took a run at this recently and, although my experiments
] haven't been wildly successful, I want to report them
] because I think the idea may have merit.
It seems like a nice idea, but this approach probably wouldn't work well with blogs. Spam is easily classified by Bayesian filters. Two blog entries, however, could easily share many of the same keywords yet be of very different interest to the reader.
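To make the keyword problem concrete, here is a rough sketch of a bag-of-words naive Bayes scorer in Python. It is not the categorizer from the quoted experiment; the training entries, the labels, and the helper names (tokenize, train, log_score) are all made up for illustration. The two test entries use exactly the same words but say opposite things:

```python
import math
from collections import Counter

def tokenize(text):
    # Bag of words: lowercase, split on whitespace, ignore word order.
    return text.lower().split()

def train(labeled_entries):
    # Count words per label and entries per label.
    counts, docs = {}, Counter()
    for label, text in labeled_entries:
        counts.setdefault(label, Counter()).update(tokenize(text))
        docs[label] += 1
    return counts, docs

def log_score(text, label, counts, docs):
    # Log prior plus add-one smoothed log likelihood of each word.
    vocab = set()
    for word_counts in counts.values():
        vocab.update(word_counts)
    total = sum(counts[label].values())
    score = math.log(docs[label] / sum(docs.values()))
    # Only the word counts of the entry are used, never the order.
    for word, n in sorted(Counter(tokenize(text)).items()):
        score += n * math.log((counts[label][word] + 1) / (total + len(vocab)))
    return score

# Hypothetical training data, invented for this example.
counts, docs = train([
    ("interesting", "new parser is fast"),
    ("boring", "old parser is slow"),
])

# Same words, opposite claims.
entry_a = "the new parser is fast and the old parser is slow"
entry_b = "the old parser is fast and the new parser is slow"

score_a = log_score(entry_a, "interesting", counts, docs)
score_b = log_score(entry_b, "interesting", counts, docs)
print(score_a == score_b)
```

Both entries get the same score for "interesting", because the scorer only ever sees which words occur and how often. A real filter would use better tokenizing and more training data, but as long as it works on keywords alone it has the same blind spot.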
Also, if you're going to go to the trouble of structuring a set of articles so that some filter can parse them, effectively restricting the article database to a single system, you might as well be using Memestreams, or at least its methods. I believe the results would be more worthwhile.