Ant usage

Apache Ant is a widely used Java-based build tool. First, make sure that the HtmlCleaner JAR file is on Ant's classpath. Then define the Ant task as follows:

    <taskdef name="mytask" classname="org.htmlcleaner.HtmlCleanerForAnt"/>
    ....
    <target name="...">
        <mytask [src = "..."] 
                [incharset = "..."]
                [dest = "..."] 
                [outcharset = "..."] 
                [taginfofile = "..."] 
                [outputtype = "simple" | "compact" | "pretty"]
                [advancedxmlescape = "true" | "false"]
                [transrescharstoncr = "true" | "false"]
                [usecdata = "true" | "false"]
                [specialentities = "true" | "false"]
                [transspecialentitiestoncr = "true" | "false"]
                [unicodechars = "true" | "false"]
                [omitunknowntags = "true" | "false"]
                [treatunknowntagsascontent = "true" | "false"]
                [omitdeprtags = "true" | "false"]
                [treatdeprtagsascontent = "true" | "false"]
                [omitcomments = "true" | "false"]
                [omitxmldecl = "true" | "false"]
                [omitdoctypedecl = "true" | "false"]
                [useemptyelementtags = "true" | "false"]
                [allowmultiwordattributes = "true" | "false"]
                [allowhtmlinsideattributes = "true" | "false"]
                [ignoreqe = "true" | "false"]
                [namespacesaware = "true" | "false"]
                [hyphenreplacement = "..."]
                [prunetags = "..."]
                [booleanatts = "true" | "false"]
                [nodebyxpath = "..."]
                [omitenvelope = "true" | "false"]
                [transform = "..."] >
 
            .... optional HTML code ....
            
        </mytask>
    </target>
    
Note: to distinguish URLs from files, URLs in the src attribute must begin with http:// or https://.
If the src attribute is not specified, the HTML in the task's body is used.
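
As an illustration of the file and URL forms of src, here is a minimal sketch of a build file. The project name, target names, file paths, the example.com URL, and the lib/htmlcleaner.jar location are all hypothetical placeholders. It also assumes that dest names the output file. Only attributes listed above are used.

    <project name="clean-example" default="clean-file">
        <!-- Hypothetical JAR location; point classpath at your own copy of htmlcleaner.jar. -->
        <taskdef name="mytask"
                 classname="org.htmlcleaner.HtmlCleanerForAnt"
                 classpath="lib/htmlcleaner.jar"/>

        <!-- src has no http:// or https:// prefix, so it is treated as a file. -->
        <target name="clean-file">
            <mytask src="input/page.html"
                    dest="output/page.xml"
                    outputtype="pretty"
                    omitcomments="true"/>
        </target>

        <!-- src begins with https://, so it is treated as a URL. -->
        <target name="clean-url">
            <mytask src="https://example.com/"
                    dest="output/example.xml"
                    outputtype="compact"/>
        </target>
    </project>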

The optional taginfofile parameter is the path to an XML file that describes all tags and their dependencies. It is used in the cleaning process instead of the default tag info set. See the description file of the default tag info set for reference.

Transformations are specified with the transform attribute, whose value should consist of a series of transformation rules separated by the pipe character (|). For the example, its value would be: cfoutput|c:block=div,false|font=span,true|font.size|font.face|font.style=${style};font-family=${face};font-size=${size};
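
When this value is written in a build file, note that Ant itself uses the ${...} syntax for property references. The sketch below assumes the task should receive the ${style}, ${face} and ${size} placeholders literally, and therefore writes each $ as $$, which is Ant's escape for a literal dollar sign. The target name and file paths are hypothetical, and mytask is assumed to be defined by the taskdef shown earlier.

    <target name="clean-with-transform">
        <!-- $$ is Ant's escape for a literal $, so the attribute value contains
             ${style}, ${face} and ${size} unexpanded. -->
        <mytask src="input/page.html"
                dest="output/page.xml"
                transform="cfoutput|c:block=div,false|font=span,true|font.size|font.face|font.style=$${style};font-family=$${face};font-size=$${size};"/>
    </target>

Ant leaves references to undefined properties unchanged, so the unescaped form may also work. The escaped form avoids surprises if the build happens to define a property named style, face or size.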

Starting with V2.20, you can also output the result of cleaning as an Ant property, which allows HtmlCleaner itself to be used as part of an Ant build process. For example, an antlib.xml would look like:

    <antlib>
        <taskdef name="htmlcleaner" classname="org.htmlcleaner.HtmlCleanerForAnt"/>
    </antlib>

... and the code for adding a htmlcleaner task in a build.xml file would be:

    <typedef resource="org/htmlcleaner/antlib.xml" classpath=".../htmlcleaner.jar"/>
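
Putting the two pieces together, a complete build.xml might look roughly like the sketch below. It assumes that the antlib.xml resource can be loaded from the JAR on the given classpath. The htmlcleaner.jar property, its lib/htmlcleaner.jar value, the project and target names, and the file paths are hypothetical. The attribute that stores the result in an Ant property is not named in this section, so it is not shown.

    <project name="antlib-example" default="clean-page">
        <!-- Hypothetical JAR location; the real path is elided in the text above. -->
        <property name="htmlcleaner.jar" location="lib/htmlcleaner.jar"/>

        <!-- Loads the task definitions declared in antlib.xml. -->
        <typedef resource="org/htmlcleaner/antlib.xml" classpath="${htmlcleaner.jar}"/>

        <target name="clean-page">
            <!-- The task name comes from the taskdef inside antlib.xml. -->
            <htmlcleaner src="input/page.html" dest="output/page.xml"/>
        </target>
    </project>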