org.htmlcleaner
Class TagNode

java.lang.Object
  extended by org.htmlcleaner.TagToken
      extended by org.htmlcleaner.TagNode
All Implemented Interfaces:
BaseToken

public class TagNode
extends TagToken

XML node tag: the basic node of the cleaned HTML tree. It also represents a start tag token after the HTML parsing phase and before the cleaning phase. After cleaning, the resulting tree consists of tag nodes (TagNode class), content (text nodes - ContentToken), comments (CommentToken) and, optionally, a doctype node (DoctypeToken).

Created by: Vladimir Nikic
Date: November, 2006.


Nested Class Summary
static interface TagNode.ITagNodeCondition
          Used as base for different node checkers.
 class TagNode.TagAllCondition
          All nodes.
 class TagNode.TagNodeAttExistsCondition
          Checks if node contains specified attribute.
 class TagNode.TagNodeAttValueCondition
          Checks if node has specified attribute with specified value.
 class TagNode.TagNodeNameCondition
          Checks if node has specified name.
 
Field Summary
 
Fields inherited from class org.htmlcleaner.TagToken
name
 
Constructor Summary
TagNode(java.lang.String name)
           
TagNode(java.lang.String name, HtmlCleaner cleaner)
           
 
Method Summary
 void addAttribute(java.lang.String attName, java.lang.String attValue)
          Adds specified attribute to this tag or overrides existing one.
 void addChild(java.lang.Object child)
           
 void addChildren(java.util.List newChildren)
          Add all elements from specified list to this node.
 java.lang.Object[] evaluateXPath(java.lang.String xPathExpression)
          Evaluates XPath expression on the given node.
 TagNode findElementByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
           
 TagNode findElementByName(java.lang.String findName, boolean isRecursive)
           
 TagNode findElementHavingAttribute(java.lang.String attName, boolean isRecursive)
           
 TagNode[] getAllElements(boolean isRecursive)
           
 java.util.List getAllElementsList(boolean isRecursive)
           
 java.lang.String getAttributeByName(java.lang.String attName)
           
 java.util.Map getAttributes()
           
 java.util.List getChildren()
           
 java.util.List getChildTagList()
           
 TagNode[] getChildTags()
           
 DoctypeToken getDocType()
           
 java.util.List getElementListByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
           
 java.util.List getElementListByName(java.lang.String findName, boolean isRecursive)
           
 java.util.List getElementListHavingAttribute(java.lang.String attName, boolean isRecursive)
           
 TagNode[] getElementsByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
           
 TagNode[] getElementsByName(java.lang.String findName, boolean isRecursive)
           
 TagNode[] getElementsHavingAttribute(java.lang.String attName, boolean isRecursive)
           
 TagNode getParent()
           
 java.lang.StringBuffer getText()
           
 boolean hasAttribute(java.lang.String attName)
          Checks existence of specified attribute.
 TagNode makeCopy()
           
 void removeAttribute(java.lang.String attName)
          Removes specified attribute from this tag.
 boolean removeChild(java.lang.Object child)
          Remove specified child element from this node.
 boolean removeFromTree()
          Remove this node from the tree.
 void serialize(XmlSerializer xmlSerializer, java.io.Writer writer)
           
 void setDocType(DoctypeToken docType)
           
 
Methods inherited from class org.htmlcleaner.TagToken
getName, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TagNode

public TagNode(java.lang.String name)

TagNode

public TagNode(java.lang.String name,
               HtmlCleaner cleaner)
Method Detail

getAttributeByName

public java.lang.String getAttributeByName(java.lang.String attName)
Parameters:
attName -
Returns:
Value of the specified attribute, or null if this tag doesn't contain it.

getAttributes

public java.util.Map getAttributes()
Returns:
Map instance containing all attribute name/value pairs.

hasAttribute

public boolean hasAttribute(java.lang.String attName)
Checks existence of specified attribute.

Parameters:
attName -

addAttribute

public void addAttribute(java.lang.String attName,
                         java.lang.String attValue)
Adds specified attribute to this tag or overrides existing one.

Parameters:
attName -
attValue -

removeAttribute

public void removeAttribute(java.lang.String attName)
Removes specified attribute from this tag.

Parameters:
attName -

getChildren

public java.util.List getChildren()
Returns:
List of child objects. During the cleanup process there may be different kinds of children inside; however, after cleaning there should be only TagNode instances.

getChildTagList

public java.util.List getChildTagList()

getChildTags

public TagNode[] getChildTags()
Returns:
An array of child TagNode instances.

getText

public java.lang.StringBuffer getText()
Returns:
Text content of this node and its subelements.
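
To illustrate what "text content of this node and its subelements" means, here is a minimal sketch that walks a tree and concatenates text in document order. TextSketchNode is a hypothetical stand-in, not the library's TagNode, and plain String children stand in for ContentToken. It returns a StringBuilder for brevity, whereas the documented method returns a StringBuffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tag node. Children are either nested
// TextSketchNode instances (tags) or String values (text content).
class TextSketchNode {
    final String name;
    final List<Object> children = new ArrayList<>();

    TextSketchNode(String name) {
        this.name = name;
    }

    // Appends a child and returns this node so calls can be chained.
    TextSketchNode add(Object child) {
        children.add(child);
        return this;
    }

    // Concatenates the text of this node and all nested nodes, in document order.
    StringBuilder getText() {
        StringBuilder text = new StringBuilder();
        for (Object child : children) {
            if (child instanceof TextSketchNode) {
                text.append(((TextSketchNode) child).getText());
            } else {
                text.append(child);
            }
        }
        return text;
    }

    public static void main(String[] args) {
        // Models <p>Hello <b>wide</b> world</p>
        TextSketchNode p = new TextSketchNode("p")
                .add("Hello ")
                .add(new TextSketchNode("b").add("wide"))
                .add(" world");
        System.out.println(p.getText()); // Hello wide world
    }
}
```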

getParent

public TagNode getParent()
Returns:
Parent of this node, or null if this is the root node.

getDocType

public DoctypeToken getDocType()

setDocType

public void setDocType(DoctypeToken docType)

addChild

public void addChild(java.lang.Object child)

addChildren

public void addChildren(java.util.List newChildren)
Add all elements from specified list to this node.

Parameters:
newChildren -

getAllElementsList

public java.util.List getAllElementsList(boolean isRecursive)

getAllElements

public TagNode[] getAllElements(boolean isRecursive)

findElementByName

public TagNode findElementByName(java.lang.String findName,
                                 boolean isRecursive)

getElementListByName

public java.util.List getElementListByName(java.lang.String findName,
                                           boolean isRecursive)

getElementsByName

public TagNode[] getElementsByName(java.lang.String findName,
                                   boolean isRecursive)
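
The name-based lookups come in three shapes: find... returns a single node, getElementList... returns a List, and getElements... returns an array. The sketch below shows how the isRecursive flag is assumed to work: false inspects direct children only, true inspects all descendants. NameSearchNode is a hypothetical stand-in rather than the library class; returning null when nothing matches and comparing names exactly are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tag node with child tags only.
class NameSearchNode {
    final String name;
    final List<NameSearchNode> children = new ArrayList<>();

    NameSearchNode(String name, NameSearchNode... kids) {
        this.name = name;
        for (NameSearchNode kid : kids) {
            children.add(kid);
        }
    }

    // Collects matching nodes in document order. With isRecursive == false
    // only direct children are checked; otherwise all descendants are.
    List<NameSearchNode> getElementListByName(String findName, boolean isRecursive) {
        List<NameSearchNode> result = new ArrayList<>();
        for (NameSearchNode child : children) {
            if (child.name.equals(findName)) {
                result.add(child);
            }
            if (isRecursive) {
                result.addAll(child.getElementListByName(findName, true));
            }
        }
        return result;
    }

    // First match, or null when nothing matches (an assumption of this sketch).
    NameSearchNode findElementByName(String findName, boolean isRecursive) {
        List<NameSearchNode> all = getElementListByName(findName, isRecursive);
        return all.isEmpty() ? null : all.get(0);
    }

    public static void main(String[] args) {
        // Models <body><div><a/></div><a/></body>
        NameSearchNode body = new NameSearchNode("body",
                new NameSearchNode("div", new NameSearchNode("a")),
                new NameSearchNode("a"));
        System.out.println(body.getElementListByName("a", false).size()); // 1
        System.out.println(body.getElementListByName("a", true).size());  // 2
    }
}
```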

findElementHavingAttribute

public TagNode findElementHavingAttribute(java.lang.String attName,
                                          boolean isRecursive)

getElementListHavingAttribute

public java.util.List getElementListHavingAttribute(java.lang.String attName,
                                                    boolean isRecursive)

getElementsHavingAttribute

public TagNode[] getElementsHavingAttribute(java.lang.String attName,
                                            boolean isRecursive)

findElementByAttValue

public TagNode findElementByAttValue(java.lang.String attName,
                                     java.lang.String attValue,
                                     boolean isRecursive,
                                     boolean isCaseSensitive)

getElementListByAttValue

public java.util.List getElementListByAttValue(java.lang.String attName,
                                               java.lang.String attValue,
                                               boolean isRecursive,
                                               boolean isCaseSensitive)

getElementsByAttValue

public TagNode[] getElementsByAttValue(java.lang.String attName,
                                       java.lang.String attValue,
                                       boolean isRecursive,
                                       boolean isCaseSensitive)
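
The attribute-value lookups add an isCaseSensitive flag. The sketch below assumes the flag governs how the attribute value is compared (exact match versus case-insensitive match); the page itself does not spell this out, so treat it as an assumption. AttrSearchNode and its attr/child helpers are hypothetical stand-ins, not HtmlCleaner classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tag node with attributes and child tags.
class AttrSearchNode {
    final String name;
    final Map<String, String> attributes = new HashMap<>();
    final List<AttrSearchNode> children = new ArrayList<>();

    AttrSearchNode(String name) {
        this.name = name;
    }

    AttrSearchNode attr(String attName, String attValue) {
        attributes.put(attName, attValue);
        return this;
    }

    AttrSearchNode child(AttrSearchNode node) {
        children.add(node);
        return this;
    }

    // Collects nodes whose attribute attName equals attValue.
    // Assumption: isCaseSensitive applies to the value comparison.
    List<AttrSearchNode> getElementListByAttValue(String attName, String attValue,
                                                  boolean isRecursive, boolean isCaseSensitive) {
        List<AttrSearchNode> result = new ArrayList<>();
        for (AttrSearchNode child : children) {
            String actual = child.attributes.get(attName);
            boolean matches = actual != null
                    && (isCaseSensitive ? actual.equals(attValue) : actual.equalsIgnoreCase(attValue));
            if (matches) {
                result.add(child);
            }
            if (isRecursive) {
                result.addAll(child.getElementListByAttValue(attName, attValue, true, isCaseSensitive));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Models <body><a class="Menu"/><div><a class="menu"/></div></body>
        AttrSearchNode body = new AttrSearchNode("body")
                .child(new AttrSearchNode("a").attr("class", "Menu"))
                .child(new AttrSearchNode("div")
                        .child(new AttrSearchNode("a").attr("class", "menu")));
        System.out.println(body.getElementListByAttValue("class", "menu", true, true).size());  // 1
        System.out.println(body.getElementListByAttValue("class", "menu", true, false).size()); // 2
    }
}
```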

evaluateXPath

public java.lang.Object[] evaluateXPath(java.lang.String xPathExpression)
                                 throws XPatherException
Evaluates XPath expression on the given node.
This is not a fully featured XPath parser and evaluator. The examples below show the supported elements:
  • //div//a
  • //div//a[@id][@class]
  • /body/*[1]/@type
  • //div[3]//a[@id][@href='r/n4']
  • (//div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
  • //div[2]/@*[2]
  • data(//div//a[@id][@class])
  • //p/last()
  • //body//div[3][@class]//span[12.2
  • data(//a['v' < @id])

Parameters:
xPathExpression -
Returns:
Throws:
XPatherException
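
To show what a few of the listed expressions select, the sketch below evaluates three of them that are also valid XPath 1.0. It deliberately uses the JDK's standard javax.xml.xpath engine on a small well-formed document instead of HtmlCleaner's evaluator, so it runs without the library; results from evaluateXPath may differ, in particular because it is evaluated relative to a TagNode and returns an Object[]. The class name XPathSubsetSketch and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Evaluates sample expressions with the JDK XPath engine (a swapped-in
// technique, not HtmlCleaner's own evaluator).
class XPathSubsetSketch {
    static final String XML =
            "<body>"
          + "<div type='nav'><a id='home' class='top' href='/'>Home</a><a href='/about'>About</a></div>"
          + "<div><p><a id='more' class='link' href='/more'>More</a></p></div>"
          + "</body>";

    // Parses the sample document and returns the nodes selected by the expression.
    static NodeList select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("//div//a").getLength());                     // 3
        System.out.println(select("//div//a[@id][@class]").getLength());        // 2
        System.out.println(select("/body/*[1]/@type").item(0).getNodeValue());  // nav
    }
}
```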

removeFromTree

public boolean removeFromTree()
Remove this node from the tree.

Returns:
True if the element was removed (that is, if it is not the root node).

removeChild

public boolean removeChild(java.lang.Object child)
Remove specified child element from this node.

Parameters:
child -
Returns:
True if child object existed in the children list.
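
The return values of the two removal methods can be summarized with a minimal sketch: removing from the tree fails for a node without a parent, and removing a child reports whether it was actually in the children list. RemovalSketchNode and childCount are hypothetical; clearing the parent reference after removal is an assumption of this sketch, not something this page documents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tag node that tracks its parent.
class RemovalSketchNode {
    final String name;
    private RemovalSketchNode parent;
    private final List<Object> children = new ArrayList<>();

    RemovalSketchNode(String name) {
        this.name = name;
    }

    // Appends a child and records this node as its parent when it is a tag.
    void addChild(Object child) {
        children.add(child);
        if (child instanceof RemovalSketchNode) {
            ((RemovalSketchNode) child).parent = this;
        }
    }

    // True only if the child was present in the children list.
    boolean removeChild(Object child) {
        return children.remove(child);
    }

    // False for the root node (no parent); otherwise detaches this node.
    boolean removeFromTree() {
        if (parent == null) {
            return false;
        }
        boolean existed = parent.removeChild(this);
        parent = null; // assumption: a detached node no longer has a parent
        return existed;
    }

    RemovalSketchNode getParent() {
        return parent;
    }

    int childCount() {
        return children.size();
    }

    public static void main(String[] args) {
        RemovalSketchNode root = new RemovalSketchNode("html");
        RemovalSketchNode body = new RemovalSketchNode("body");
        root.addChild(body);
        System.out.println(root.removeFromTree()); // false
        System.out.println(body.removeFromTree()); // true
        System.out.println(root.removeChild(body)); // false
    }
}
```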

serialize

public void serialize(XmlSerializer xmlSerializer,
                      java.io.Writer writer)
               throws java.io.IOException
Throws:
java.io.IOException

makeCopy

public TagNode makeCopy()