Release notes

Sep. 02, 2008: HtmlCleaner release 2.1
  • Parsing transformations added, making it easy to skip or change specified tags or attributes during the cleanup process.
  • A few more constructors added to the HtmlCleaner class, making it possible to reuse the same cleaner properties across multiple cleaner instances.
  • Code cleanup.
Jul. 15, 2008: HtmlCleaner release 2.0
  • Complete code refactoring done to better separate the roles of the cleaner, cleaner properties, object model nodes and serializers. The API is not compatible with previous versions, though it remains very simple to use.
  • Post-cleaning node manipulation is enabled with a rich set of methods in the TagNode class. There is no longer any need to create a DOM or JDOM document out of the HtmlCleaner object model in order to select, add or remove nodes or attributes.
  • Basic XPath is supported on the HtmlCleaner object model. Although the implementation is partial, it should be powerful enough to find or collect nodes/attributes/text even with fairly complex criteria.
  • Modifying an already cleaned HtmlCleaner object model is enabled with HtmlCleaner.setInnerHtml(node, html), similar to the DHTML feature for setting the inner HTML of an object.
  • Creating a custom tag rule set is now much easier: it is defined in an XML configuration file.
  • New properties booleanAttributeValues and nodeByXPath for setting cleaner's behavior are introduced.
  • Test cases added to source code.
  • Memory leak problem in Java 1.4 fixed.
  • A number of bug fixes.
Dec. 26, 2007: HtmlCleaner release 1.6
  • New flag parameter ignoreQuestAndExclam is introduced offering control over special tags - <?TAGNAME....>, <!TAGNAME....>.
  • Bug fixes.
Sep. 27, 2007: HtmlCleaner release 1.55
  • Added Reader based HtmlCleaner constructors.
  • New parameter pruneTags is introduced, offering a way to remove undesired tags together with all their children from the XML tree after parsing and cleaning.
  • Bug fixes.
Sep. 8, 2007: HtmlCleaner release 1.5
  • Several bug fixes.
  • Added option to escape XML content in DOM serializer - HtmlCleaner.createDOM(boolean escapeXml)
Aug. 24, 2007: HtmlCleaner release 1.4
  • New flag allowHtmlInsideAttributes is introduced in order to give the parser flexibility in handling attribute values.
  • Several bug fixes.
Jul. 12, 2007: HtmlCleaner release 1.3
  • New browser-compact serializer added that preserves a single whitespace where multiple occur.
  • New flag namespacesAware is introduced in order to control namespace prefixes and namespace declarations. It should be used instead of omitXmlnsAttributes that existed in previous versions and had limited functionality.
  • New flag allowMultiWordAttributes is introduced giving HtmlCleaner's parser flexibility to (dis)allow tag attributes consisting of multiple words.
  • New flag useEmptyElementTags is introduced in order to control the output of tags with an empty body
    (<xxx/> vs <xxx></xxx>).
  • Several bug fixes.
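
The whitespace handling described for the browser-compact serializer in release 1.3 can be illustrated with a minimal sketch. It is not HtmlCleaner code: the class and method names below are hypothetical, and it assumes that "preserves single whitespace where multiple occur" means each run of whitespace is reduced to one space.

```java
public class WhitespaceCompactionSketch {

    // Hypothetical helper, not part of the HtmlCleaner API: replaces every
    // run of whitespace characters (spaces, tabs, line breaks) with a
    // single space. The real serializer may treat some cases differently.
    public static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String input = "Hello,   world!\n\t  Second   line.";
        System.out.println(collapseWhitespace(input));
    }
}
```

Unlike a fully compact serializer that strips unneeded whitespace, this approach keeps one separator between words, which is what browsers render anyway.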
May 05, 2007: HtmlCleaner release 1.2
  • Several bugs fixed.
  • New flags added to control behaviour of unknown/deprecated tags.
  • New flag added to optionally remove HTML envelope from resulting XML.
  • JDOM serializer added.
Apr. 13, 2007: HtmlCleaner release 1.13
  • Serialization of XML to Java DOM supported with createDOM() method of HtmlCleaner class.
Jan. 28, 2007: HtmlCleaner release 1.12
  • Escaping of hexadecimal entities supported (e.g. &#x09;).
Jan. 11, 2007: HtmlCleaner release 1.1
  • Compact XML serializer improved.
  • Minor XML escaping bug fixed.
Jan. 02, 2007: HtmlCleaner release 1.0.5
  • An HTML tokenizing bug fixed.
  • Methods of the class TagNode made public in order to enable creating custom XML serializers.
  • Method writeXml(XmlSerializer) added to HtmlCleaner class in order to support creating custom XML serializers.
Dec. 23, 2006: HtmlCleaner release 1.0
  • Minor bug in advanced XML escaping fixed.
Dec. 05, 2006: HtmlCleaner release 0.9
  • HtmlCleaner Ant task added
  • XML compact serializer added - strips all unneeded whitespace from the result
  • A few minor bugs fixed
Nov. 27, 2006: HtmlCleaner initial release (version 0.8)