Simple HTML Sanitization with NekoHTML

For the last few days I have been working on an HTML sanitization problem. So that I do not forget it, here is a small code snippet that does simple HTML sanitization using the NekoHTML library.

Please note that this is not a real solution. It is just an idea, and an example of how NekoHTML filters can be used. I will post a better solution later (one that is not based on NekoHTML), after some testing on my side.

This example sanitizes the "description" variable. safeTags and safeAttributes list the tags and attributes that are allowed in the resulting 'safe' HTML:

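        // imports needed by this snippet (they go at the top of the file):
        // import java.io.StringReader;
        // import java.io.StringWriter;
        // import org.apache.xerces.xni.parser.XMLDocumentFilter;
        // import org.apache.xerces.xni.parser.XMLInputSource;
        // import org.apache.xerces.xni.parser.XMLParserConfiguration;
        // import org.cyberneko.html.HTMLConfiguration;
        // import org.cyberneko.html.filters.ElementRemover;

        String[] safeTags = {"font","color","img","b","i","a","p","br","pre","center","table","tr","td","tbody","th","h1","h2","h3","h4","h5","h6"};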
        String[] safeAttributes = {"href"};
       
        // accept only the safe tags and attributes; any other element is
        // stripped, but its text content is kept
        ElementRemover remover = new ElementRemover();
        for (String tag : safeTags) {
            remover.acceptElement(tag, safeAttributes);
        }
       
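        // script elements are removed completely, together with their content
        remover.removeElement("script");
        // writer that serializes the filtered document into a string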
        StringWriter filteredDescription = new StringWriter();
        org.cyberneko.html.filters.Writer writer =
            new org.cyberneko.html.filters.Writer(filteredDescription, null);

        // setup filter chain
        XMLDocumentFilter[] filters = {
            remover,
            writer,
        };
        // create HTML parser
        XMLParserConfiguration parser = new HTMLConfiguration();
        parser.setProperty("http://cyberneko.org/html/properties/filters", filters);       
        XMLInputSource source = new XMLInputSource(null, null, null, new StringReader(description), null);
        try {
            parser.parse(source);
            description = filteredDescription.toString();
        } catch (Exception ex) {
            log.warn("Cannot process description", ex);
        }
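
One reason the snippet above is only an idea: as far as I know, ElementRemover filters by element and attribute name only, so an accepted href attribute keeps whatever value it had, including a javascript: link. The sketch below shows one possible extra check on href values, using only the JDK. The class name HrefSchemeCheck, the method isSafeHref and the list of allowed schemes are all made up for this illustration and are not part of NekoHTML. It is a rough sketch, not a complete URL validator.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of NekoHTML.
final class HrefSchemeCheck {

    // Assumed allow-list of URL schemes; adjust it to your own needs.
    private static final Set<String> SAFE_SCHEMES =
        new HashSet<>(Arrays.asList("http", "https", "mailto"));

    // Returns true for relative URLs and for URLs with an allowed scheme.
    static boolean isSafeHref(String href) {
        if (href == null) {
            return false;
        }
        // Drop spaces and control characters first, because browsers
        // tolerate them inside a scheme name such as "java\tscript:".
        StringBuilder cleaned = new StringBuilder();
        for (int i = 0; i < href.length(); i++) {
            char c = href.charAt(i);
            if (c > ' ') {
                cleaned.append(c);
            }
        }
        int colon = cleaned.indexOf(":");
        if (colon < 0) {
            return true; // no scheme at all: a relative URL
        }
        // A colon that comes after '/', '?' or '#' belongs to the path,
        // query or fragment, so the URL is still relative.
        for (int i = 0; i < colon; i++) {
            char c = cleaned.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                return true;
            }
        }
        String scheme = cleaned.substring(0, colon).toLowerCase(Locale.ROOT);
        return SAFE_SCHEMES.contains(scheme);
    }
}
```

To use such a check with NekoHTML, it would probably have to live in a custom filter placed in the chain after the remover (for example a subclass of org.cyberneko.html.filters.DefaultFilter that inspects attributes in startElement). That wiring is left out here.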

Alexey Kakunin
