org.htmlcleaner
Class CleanerProperties

java.lang.Object
  extended by org.htmlcleaner.CleanerProperties
All Implemented Interfaces:
HtmlModificationListener

public class CleanerProperties
extends Object
implements HtmlModificationListener

Properties defining the cleaner's behaviour.


Field Summary
static String BOOL_ATT_EMPTY
           
static String BOOL_ATT_SELF
           
static String BOOL_ATT_TRUE
           
static String DEFAULT_CHARSET
           
 
Constructor Summary
CleanerProperties()
           
CleanerProperties(ITagInfoProvider tagInfoProvider)
           
 
Method Summary
 void addHtmlModificationListener(HtmlModificationListener listener)
          Adds a listener to the list of objects that are notified of the changes the cleaner makes during the cleanup process.
 void addPruneTagNodeCondition(ITagNodeCondition condition)
          Adds the condition to the existing prune tag set.
 void fireConditionModification(ITagNodeCondition condition, TagNode tagNode)
          Fired when the cleaner modifies HTML because an ITagNodeCondition matched.
 void fireHtmlError(boolean certainty, TagNode startTagToken, ErrorType type)
          Fired when the cleaner fixes an error in HTML syntax.
 void fireUglyHtml(boolean certainty, TagNode startTagToken, ErrorType errorType)
          Fired when the cleaner fixes ugly HTML: the syntax was correct, but the markup achieved its purpose in an awkward way.
 void fireUserDefinedModification(boolean certainty, TagNode tagNode, ErrorType errorType)
          Fired when the cleaner modifies HTML because of user-specified rules.
 String getAllowTags()
           
 Set<ITagNodeCondition> getAllowTagSet()
           
 String getBooleanAttributeValues()
           
 String getCharset()
           
 CleanerTransformations getCleanerTransformations()
           
 String getHyphenReplacementInComment()
           
 String getPruneTags()
           
 Set<ITagNodeCondition> getPruneTagSet()
           
 ITagInfoProvider getTagInfoProvider()
           
 boolean isAddNewlineToHeadAndBody()
           
 boolean isAdvancedXmlEscape()
           
 boolean isAllowHtmlInsideAttributes()
           
 boolean isAllowMultiWordAttributes()
           
 boolean isIgnoreQuestAndExclam()
           
 boolean isKeepWhitespaceAndCommentsInHead()
           
 boolean isNamespacesAware()
           
 boolean isOmitCdataOutsideScriptAndStyle()
           
 boolean isOmitComments()
           
 boolean isOmitDeprecatedTags()
           
 boolean isOmitDoctypeDeclaration()
           
 boolean isOmitHtmlEnvelope()
           
 boolean isOmitUnknownTags()
           
 boolean isOmitXmlDeclaration()
           
 boolean isRecognizeUnicodeChars()
           
 boolean isTranslateSpecialEntities()
           
 boolean isTransResCharsToNCR()
           
 boolean isTransSpecialEntitiesToNCR()
           
 boolean isTreatDeprecatedTagsAsContent()
           
 boolean isTreatUnknownTagsAsContent()
           
 boolean isUseCdataForScriptAndStyle()
           
 boolean isUseEmptyElementTags()
           
 void reset()
          Resets the properties to their default values (listed in full under Method Detail).
 void setAddNewlineToHeadAndBody(boolean addNewlineToHeadAndBody)
           
 void setAdvancedXmlEscape(boolean advancedXmlEscape)
           
 void setAllowHtmlInsideAttributes(boolean allowHtmlInsideAttributes)
           
 void setAllowMultiWordAttributes(boolean allowMultiWordAttributes)
           
 void setAllowTags(String allowTags)
           
 void setBooleanAttributeValues(String booleanAttributeValues)
           
 void setCharset(String charset)
           
 void setCleanerTransformations(CleanerTransformations cleanerTransformations)
           
 void setHyphenReplacementInComment(String hyphenReplacementInComment)
           
 void setIgnoreQuestAndExclam(boolean ignoreQuestAndExclam)
           
 void setKeepWhitespaceAndCommentsInHead(boolean keepHeadWhitespace)
           
 void setNamespacesAware(boolean namespacesAware)
           
 void setOmitCdataOutsideScriptAndStyle(boolean value)
           
 void setOmitComments(boolean omitComments)
           
 void setOmitDeprecatedTags(boolean omitDeprecatedTags)
           
 void setOmitDoctypeDeclaration(boolean omitDoctypeDeclaration)
           
 void setOmitHtmlEnvelope(boolean omitHtmlEnvelope)
           
 void setOmitUnknownTags(boolean omitUnknownTags)
           
 void setOmitXmlDeclaration(boolean omitXmlDeclaration)
           
 void setPruneTags(String pruneTags)
          Resets the prune tag set and adds tag name conditions to it.
 void setRecognizeUnicodeChars(boolean recognizeUnicodeChars)
           
 void setTranslateSpecialEntities(boolean translateSpecialEntities)
          TODO: use OptionalOutput
 void setTransResCharsToNCR(boolean transResCharsToNCR)
           
 void setTransSpecialEntitiesToNCR(boolean transSpecialEntitiesToNCR)
           
 void setTreatDeprecatedTagsAsContent(boolean treatDeprecatedTagsAsContent)
           
 void setTreatUnknownTagsAsContent(boolean treatUnknownTagsAsContent)
           
 void setUseCdataForScriptAndStyle(boolean useCdataForScriptAndStyle)
           
 void setUseEmptyElementTags(boolean useEmptyElementTags)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_CHARSET

public static final String DEFAULT_CHARSET
See Also:
Constant Field Values

BOOL_ATT_SELF

public static final String BOOL_ATT_SELF
See Also:
Constant Field Values

BOOL_ATT_EMPTY

public static final String BOOL_ATT_EMPTY
See Also:
Constant Field Values

BOOL_ATT_TRUE

public static final String BOOL_ATT_TRUE
See Also:
Constant Field Values
Constructor Detail

CleanerProperties

public CleanerProperties()

CleanerProperties

public CleanerProperties(ITagInfoProvider tagInfoProvider)
Parameters:
tagInfoProvider - the tag info provider to use
Method Detail

getTagInfoProvider

public ITagInfoProvider getTagInfoProvider()

isAdvancedXmlEscape

public boolean isAdvancedXmlEscape()

setAdvancedXmlEscape

public void setAdvancedXmlEscape(boolean advancedXmlEscape)

isTransResCharsToNCR

public boolean isTransResCharsToNCR()

setTransResCharsToNCR

public void setTransResCharsToNCR(boolean transResCharsToNCR)

isUseCdataForScriptAndStyle

public boolean isUseCdataForScriptAndStyle()

setUseCdataForScriptAndStyle

public void setUseCdataForScriptAndStyle(boolean useCdataForScriptAndStyle)

isTranslateSpecialEntities

public boolean isTranslateSpecialEntities()

setTranslateSpecialEntities

public void setTranslateSpecialEntities(boolean translateSpecialEntities)
TODO: use OptionalOutput

Parameters:
translateSpecialEntities -

isRecognizeUnicodeChars

public boolean isRecognizeUnicodeChars()

setRecognizeUnicodeChars

public void setRecognizeUnicodeChars(boolean recognizeUnicodeChars)

isOmitUnknownTags

public boolean isOmitUnknownTags()

setOmitUnknownTags

public void setOmitUnknownTags(boolean omitUnknownTags)

isTreatUnknownTagsAsContent

public boolean isTreatUnknownTagsAsContent()

setTreatUnknownTagsAsContent

public void setTreatUnknownTagsAsContent(boolean treatUnknownTagsAsContent)

isOmitDeprecatedTags

public boolean isOmitDeprecatedTags()

setOmitDeprecatedTags

public void setOmitDeprecatedTags(boolean omitDeprecatedTags)

isTreatDeprecatedTagsAsContent

public boolean isTreatDeprecatedTagsAsContent()

setTreatDeprecatedTagsAsContent

public void setTreatDeprecatedTagsAsContent(boolean treatDeprecatedTagsAsContent)

isOmitComments

public boolean isOmitComments()

setOmitComments

public void setOmitComments(boolean omitComments)

isOmitXmlDeclaration

public boolean isOmitXmlDeclaration()

setOmitXmlDeclaration

public void setOmitXmlDeclaration(boolean omitXmlDeclaration)

isOmitDoctypeDeclaration

public boolean isOmitDoctypeDeclaration()
Returns:
true if the doctype declaration is omitted; also returns true when the HTML envelope is omitted
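
The Returns note implies that this getter combines two settings. A minimal sketch of that relationship follows. It uses a hypothetical stand-in class (OutputFlagsSketch) with plain boolean fields, and is an illustration under those assumptions rather than the library's own code:

```java
/** Hypothetical stand-in, not the HtmlCleaner source: models only the two flags discussed here. */
class OutputFlagsSketch {
    private boolean omitDoctypeDeclaration;
    private boolean omitHtmlEnvelope;

    void setOmitDoctypeDeclaration(boolean value) { omitDoctypeDeclaration = value; }

    void setOmitHtmlEnvelope(boolean value) { omitHtmlEnvelope = value; }

    boolean isOmitHtmlEnvelope() { return omitHtmlEnvelope; }

    /** True if the doctype is omitted directly, or implicitly because the envelope is omitted. */
    boolean isOmitDoctypeDeclaration() {
        return omitDoctypeDeclaration || omitHtmlEnvelope;
    }
}
```

In this sketch, asking to omit the envelope is enough for the doctype getter to report true, even if setOmitDoctypeDeclaration(true) was never called.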

setOmitDoctypeDeclaration

public void setOmitDoctypeDeclaration(boolean omitDoctypeDeclaration)

isOmitHtmlEnvelope

public boolean isOmitHtmlEnvelope()

setOmitHtmlEnvelope

public void setOmitHtmlEnvelope(boolean omitHtmlEnvelope)

isUseEmptyElementTags

public boolean isUseEmptyElementTags()

setUseEmptyElementTags

public void setUseEmptyElementTags(boolean useEmptyElementTags)

isAllowMultiWordAttributes

public boolean isAllowMultiWordAttributes()

setAllowMultiWordAttributes

public void setAllowMultiWordAttributes(boolean allowMultiWordAttributes)

isAllowHtmlInsideAttributes

public boolean isAllowHtmlInsideAttributes()

setAllowHtmlInsideAttributes

public void setAllowHtmlInsideAttributes(boolean allowHtmlInsideAttributes)

isIgnoreQuestAndExclam

public boolean isIgnoreQuestAndExclam()

setIgnoreQuestAndExclam

public void setIgnoreQuestAndExclam(boolean ignoreQuestAndExclam)

isNamespacesAware

public boolean isNamespacesAware()

setNamespacesAware

public void setNamespacesAware(boolean namespacesAware)

isAddNewlineToHeadAndBody

public boolean isAddNewlineToHeadAndBody()

setAddNewlineToHeadAndBody

public void setAddNewlineToHeadAndBody(boolean addNewlineToHeadAndBody)

isKeepWhitespaceAndCommentsInHead

public boolean isKeepWhitespaceAndCommentsInHead()

setKeepWhitespaceAndCommentsInHead

public void setKeepWhitespaceAndCommentsInHead(boolean keepHeadWhitespace)

getHyphenReplacementInComment

public String getHyphenReplacementInComment()

setHyphenReplacementInComment

public void setHyphenReplacementInComment(String hyphenReplacementInComment)

getPruneTags

public String getPruneTags()

isOmitCdataOutsideScriptAndStyle

public boolean isOmitCdataOutsideScriptAndStyle()

setOmitCdataOutsideScriptAndStyle

public void setOmitCdataOutsideScriptAndStyle(boolean value)

setPruneTags

public void setPruneTags(String pruneTags)
Resets the prune tag set and adds tag name conditions to it. Every tag listed in the pruneTags parameter is added.

Parameters:
pruneTags -
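
The reset-then-add behaviour described for setPruneTags, together with the additive behaviour of addPruneTagNodeCondition, can be sketched as follows. Everything here is a hypothetical stand-in: PruneTagsSketch is not a library class, a Predicate on the tag name replaces ITagNodeCondition, and the comma-separated, case-insensitive format of the pruneTags string is an assumption that this page does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

/** Hypothetical stand-in for the prune-tag bookkeeping; not the HtmlCleaner source. */
class PruneTagsSketch {
    // Each condition decides, from a lower-cased tag name, whether a node is pruned.
    private final List<Predicate<String>> pruneConditions = new ArrayList<>();

    /** Clears all existing conditions, then adds one name condition per listed tag. */
    void setPruneTags(String pruneTags) {
        pruneConditions.clear();
        if (pruneTags == null) {
            return;
        }
        // Assumption: tag names are separated by commas.
        for (String token : pruneTags.split(",")) {
            String name = token.trim().toLowerCase(Locale.ROOT);
            if (!name.isEmpty()) {
                pruneConditions.add(name::equals);
            }
        }
    }

    /** Adds a condition without clearing the ones already present. */
    void addPruneCondition(Predicate<String> condition) {
        pruneConditions.add(condition);
    }

    /** Returns true if any registered condition matches the tag name. */
    boolean shouldPrune(String tagName) {
        String lowered = tagName.toLowerCase(Locale.ROOT);
        for (Predicate<String> condition : pruneConditions) {
            if (condition.test(lowered)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that in this sketch a later call to setPruneTags also discards conditions added through addPruneCondition, which is one reading of "resets"; whether the library behaves the same way should be checked against its source.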

addPruneTagNodeCondition

public void addPruneTagNodeCondition(ITagNodeCondition condition)
Adds the condition to the existing prune tag set.

Parameters:
condition -

getPruneTagSet

public Set<ITagNodeCondition> getPruneTagSet()

getAllowTags

public String getAllowTags()

setAllowTags

public void setAllowTags(String allowTags)

isTransSpecialEntitiesToNCR

public boolean isTransSpecialEntitiesToNCR()

setTransSpecialEntitiesToNCR

public void setTransSpecialEntitiesToNCR(boolean transSpecialEntitiesToNCR)

getAllowTagSet

public Set<ITagNodeCondition> getAllowTagSet()

setCharset

public void setCharset(String charset)
Parameters:
charset - the charset to set

getCharset

public String getCharset()
Returns:
the charset

getBooleanAttributeValues

public String getBooleanAttributeValues()

setBooleanAttributeValues

public void setBooleanAttributeValues(String booleanAttributeValues)

reset

public void reset()
Resets the properties to their default values:
advancedXmlEscape = true;
useCdataForScriptAndStyle = true;
translateSpecialEntities = true;
recognizeUnicodeChars = true;
omitUnknownTags = false;
treatUnknownTagsAsContent = false;
omitDeprecatedTags = false;
treatDeprecatedTagsAsContent = false;
omitComments = false;
omitXmlDeclaration = OptionalOutput.alwaysOutput;
omitDoctypeDeclaration = OptionalOutput.alwaysOutput;
omitHtmlEnvelope = OptionalOutput.alwaysOutput;
useEmptyElementTags = true;
allowMultiWordAttributes = true;
allowHtmlInsideAttributes = false;
ignoreQuestAndExclam = true;
namespacesAware = true;
keepHeadWhitespace = true;
addNewlineToHeadAndBody = true;
hyphenReplacementInComment = "=";
pruneTags = null;
allowTags = null;
booleanAttributeValues = BOOL_ATT_SELF;
collapseNullHtml = CollapseHtml.none;
charset = "UTF-8";


getCleanerTransformations

public CleanerTransformations getCleanerTransformations()
Returns:
the cleanerTransformations

setCleanerTransformations

public void setCleanerTransformations(CleanerTransformations cleanerTransformations)

addHtmlModificationListener

public void addHtmlModificationListener(HtmlModificationListener listener)
Adds a listener to the list of objects that are notified of the changes the cleaner makes during the cleanup process.

Parameters:
listener - the listener object to be notified of the changes.
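
Because this class both implements HtmlModificationListener and accepts listeners, a plausible reading is that it forwards each fire* call to every registered listener. The sketch below illustrates that fan-out pattern under that assumption. All types in it are hypothetical stand-ins: the listener interface is reduced to a single callback, and a String replaces TagNode.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of listener fan-out; not the HtmlCleaner source. */
class ListenerFanOutSketch {
    /** Simplified stand-in for HtmlModificationListener, reduced to one callback. */
    interface ModificationListener {
        void fireHtmlError(boolean certainty, String tagName);
    }

    /** Stand-in for the properties object: itself a listener that forwards to registered ones. */
    static class PropsStandIn implements ModificationListener {
        private final List<ModificationListener> listeners = new ArrayList<>();

        void addHtmlModificationListener(ModificationListener listener) {
            listeners.add(listener);
        }

        @Override
        public void fireHtmlError(boolean certainty, String tagName) {
            // Forward the event to every registered listener, in registration order.
            for (ModificationListener listener : listeners) {
                listener.fireHtmlError(certainty, tagName);
            }
        }
    }

    /** Registers a recording listener, fires one event, and returns what was recorded. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        PropsStandIn props = new PropsStandIn();
        props.addHtmlModificationListener((certainty, tag) -> log.add(tag + ":" + certainty));
        props.fireHtmlError(true, "td");
        return log;
    }
}
```

With this arrangement the cleaner only needs a reference to the properties object in order to notify any number of observers.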

fireConditionModification

public void fireConditionModification(ITagNodeCondition condition,
                                      TagNode tagNode)
Description copied from interface: HtmlModificationListener
Fired when the cleaner modifies HTML because an ITagNodeCondition matched.

Specified by:
fireConditionModification in interface HtmlModificationListener
Parameters:
condition - the condition that was applied to make the modification.
tagNode - the problematic node.

fireHtmlError

public void fireHtmlError(boolean certainty,
                          TagNode startTagToken,
                          ErrorType type)
Description copied from interface: HtmlModificationListener
Fired when the cleaner fixes an error in HTML syntax.

Specified by:
fireHtmlError in interface HtmlModificationListener
Parameters:
certainty - true if the change made does not harm the resulting document.
startTagToken - the problematic node.

fireUglyHtml

public void fireUglyHtml(boolean certainty,
                         TagNode startTagToken,
                         ErrorType errorType)
Description copied from interface: HtmlModificationListener
Fired when the cleaner fixes ugly HTML: the syntax was correct, but the markup achieved its purpose in an awkward way. For example, this is fired when deprecated tags are removed.

Specified by:
fireUglyHtml in interface HtmlModificationListener
Parameters:
certainty - true if the change made does not harm the resulting document.
startTagToken - the problematic node.

fireUserDefinedModification

public void fireUserDefinedModification(boolean certainty,
                                        TagNode tagNode,
                                        ErrorType errorType)
Description copied from interface: HtmlModificationListener
Fired when the cleaner modifies HTML because of user-specified rules.

Specified by:
fireUserDefinedModification in interface HtmlModificationListener
Parameters:
certainty - true if the change made does not harm the resulting document.
tagNode - the problematic node.


Copyright © 2006-2013. All Rights Reserved.