HTML::TagFilter

...is available from CPAN or here. It's a fine-grained cleaner of HTML tags, able to filter out disallowed tags, attributes and attribute values according to your specifications. Its default configuration protects against cross-site scripting attacks and removes all but basic formatting markup; alternatively, you can supply your own rules.
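To give a rough idea of what that looks like in practice, here is a minimal sketch. It assumes the `new`, `allow` and `filter` interface described in the module's CPAN documentation. The sample markup, the variable names and the particular rule set are invented for illustration, so check the rule syntax against the documentation for the version you have installed.

```perl
use strict;
use warnings;
use HTML::TagFilter;

# Made-up input: an event handler, a script block and a font tag
# mixed in with ordinary formatting markup.
my $dirty = '<p class="plain" onclick="evil()">Hello <b>world</b></p>'
          . '<script>alert(1)</script><font color="red">bye</font>';

# 1. Default configuration: basic formatting only, with the
#    cross-site scripting protection left switched on.
my $default_filter = HTML::TagFilter->new;
my $clean = $default_filter->filter($dirty);
print "default rules: $clean\n";

# 2. Your own rules. Assumed rule shape, as given in the module
#    documentation: tag => { attribute => [permitted values] },
#    with an empty list meaning any value, and the special
#    attribute name 'any' meaning any attribute at all.
my $strict_filter = HTML::TagFilter->new(
    allow => {
        p => { class => [ 'plain', 'lurid' ] },
        b => { any => [] },
    },
);
my $strict = $strict_filter->filter($dirty);
print "custom rules:  $strict\n";
```

Only the tags themselves are removed. The text that sat between them is passed through untouched, so the contents of the script block will still appear in the output as plain text.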

It's a fairly simple module (mostly because it inherits from HTML::Parser, which is not), and it only does one thing, but it does it thoroughly and quite well. It can't act on the text around tags, which is a shame, nor does it know anything about document structure or the relations between tags, but it will clean your HTML up nicely.

And I must apologise: I committed an intercapitalisation. It was a long time ago, I don't know what I was thinking and I'm stuck with it now.

last updated 3rd December 2004