I’m annoyed. I’ve been playing with various AntiXss tools, including the one made by Microsoft. They all work, and they do what is expected, but they expose a fairly nasty bug. This bug is not in the Xss tools themselves, but rather in the HttpUtility’s HtmlEncode() method.

You see, it’s a bad, buggy method. It can encode a < as &lt;, but that’s about all it can do. It cannot encode something like a • (&bull;), a — (&mdash;), or a ξ (&xi;). Why? Because it doesn’t know how.

Because of this, the AntiXss methods are next to useless. What I effectively have to do is go over the resulting markup after the transformation has taken place, doing ridiculous stuff like x = x.Replace("•", "&bull;"). This is a real annoyance because I cannot be expected to cover every HTML entity under the sun.
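One way to avoid listing entities one by one, as a rough sketch only: let HtmlEncode() handle the markup characters, then rewrite every character outside ASCII as a numeric character reference (&#8226; instead of &bull;). Browsers treat the two forms the same, so no entity table is needed. The EntityEncoder class and EncodeNonAscii method are names I made up for illustration; they are not part of HttpUtility or of any AntiXss library.

```csharp
using System;
using System.Text;
using System.Web;

// Hypothetical helper, not part of any library.
public static class EntityEncoder
{
    // HTML-encodes the input, then turns every remaining non-ASCII
    // character into a numeric character reference such as &#8226;.
    public static string EncodeNonAscii(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // Let the framework escape <, >, &, and quotes first.
        string encoded = HttpUtility.HtmlEncode(input);
        var sb = new StringBuilder(encoded.Length);

        for (int i = 0; i < encoded.Length; i++)
        {
            char c = encoded[i];
            if (c < 128)
            {
                // Plain ASCII passes through untouched.
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < encoded.Length
                     && char.IsLowSurrogate(encoded[i + 1]))
            {
                // Characters outside the BMP arrive as a surrogate pair;
                // emit one reference for the combined code point.
                int codePoint = char.ConvertToUtf32(c, encoded[i + 1]);
                sb.Append("&#").Append(codePoint).Append(';');
                i++; // skip the low surrogate just consumed
            }
            else if (char.IsSurrogate(c))
            {
                // Lone surrogate: substitute U+FFFD, the replacement character.
                sb.Append("&#65533;");
            }
            else
            {
                sb.Append("&#").Append((int)c).Append(';');
            }
        }
        return sb.ToString();
    }
}
```

With this, EntityEncoder.EncodeNonAscii("<b>• — ξ</b>") should come back as &lt;b&gt;&#8226; &#8212; &#958;&lt;/b&gt;. The trade-off is readability of the generated source: numeric references are harder to eyeball than &bull; or &mdash;, but nothing has to be maintained by hand.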

This goes to show a simple fact: the idea of doing AntiXss by HTML-decoding everything and then encoding only what you want is fundamentally wrong. There has got to be a better way out there somewhere.