java.lang.Object
  org.htmlcleaner.HtmlCleaner
public class HtmlCleaner
Main HtmlCleaner class.
It represents the public interface to the user. Its task is to call the tokenizer with the specified source HTML, traverse the list of produced tokens, and create the internal object model. It also offers a set of methods to write the resulting XML to a string, file, or any output stream.
Typical usage is the following:
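A minimal sketch of that flow: create a cleaner, adjust its properties, clean the source, and serialize the result. It assumes that `SimpleXmlSerializer` and its `getAsString(TagNode)` method, as well as `CleanerProperties.setOmitComments`, are available in the HtmlCleaner version on the classpath; none of them is documented on this page. The class name `TypicalUsage` and the helper `cleanToXml` are hypothetical.

```java
import java.io.IOException;

import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.SimpleXmlSerializer;
import org.htmlcleaner.TagNode;

public class TypicalUsage {

    // Hypothetical helper: cleans an HTML string and returns the serialized XML.
    static String cleanToXml(String html) throws IOException {
        // Create a cleaner with the default tag info provider and default properties.
        HtmlCleaner cleaner = new HtmlCleaner();

        // Optionally customize behaviour through the cleaner's properties.
        // setOmitComments is assumed to exist on CleanerProperties.
        CleanerProperties props = cleaner.getProperties();
        props.setOmitComments(true);

        // Run the cleaning process; the result is the root of the node tree.
        TagNode root = cleaner.clean(html);

        // Serialize the tree. The serializer class is an assumption, see above.
        return new SimpleXmlSerializer(props).getAsString(root);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(cleanToXml("<p>Unclosed paragraph<br><!-- note -->"));
    }
}
```

The cleaner keeps no reference to the input here, so a single instance could be reused for several `clean` calls with the same properties.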
Nested Class Summary | |
---|---|
protected class | HtmlCleaner.NestingState |
Constructor Summary | |
---|---|
HtmlCleaner() | Creates a cleaner instance with the default tag info provider and default properties. |
HtmlCleaner(CleanerProperties properties) | Creates the instance with the default tag info provider and the specified properties. |
HtmlCleaner(ITagInfoProvider tagInfoProvider) | Creates the instance with the specified tag info provider and default properties. |
HtmlCleaner(ITagInfoProvider tagInfoProvider, CleanerProperties properties) | Creates the instance with the specified tag info provider and the specified properties. |
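As a rough illustration of choosing between these constructors, the sketch below builds one cleaner from a `CleanerProperties` object and a second one that reuses the tag info provider of an existing cleaner. It assumes `CleanerProperties` has a public no-argument constructor and the `setOmitComments`/`isOmitComments` accessors, which this page does not document. The class and method names are hypothetical.

```java
import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.ITagInfoProvider;

public class ConstructorChoices {

    // Default tag info provider, caller-supplied properties.
    static HtmlCleaner withProperties() {
        CleanerProperties props = new CleanerProperties(); // assumed public constructor
        props.setOmitComments(true);                       // assumed property setter
        return new HtmlCleaner(props);
    }

    // Caller-supplied provider (taken from another cleaner) and properties.
    static HtmlCleaner sharingProvider(HtmlCleaner existing, CleanerProperties props) {
        ITagInfoProvider provider = existing.getTagInfoProvider();
        return new HtmlCleaner(provider, props);
    }

    public static void main(String[] args) {
        HtmlCleaner first = withProperties();
        HtmlCleaner second = sharingProvider(first, new CleanerProperties());
        System.out.println(first.getProperties().isOmitComments());
        System.out.println(second.getTagInfoProvider() != null);
    }
}
```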
Method Summary | | |
---|---|---|
protected void | addPruneNode(TagNode node, org.htmlcleaner.CleanTimeValues cleanTimeValues) | |
TagNode | clean(File file) | |
TagNode | clean(File file, String charset) | |
TagNode | clean(InputStream in) | |
TagNode | clean(InputStream in, String charset) | |
TagNode | clean(Reader reader) | |
protected TagNode | clean(Reader reader, org.htmlcleaner.CleanTimeValues cleanTimeValues) | Basic version of the cleaning call. |
TagNode | clean(String htmlContent) | |
TagNode | clean(URL url) | Creates an instance from the content downloaded from the specified URL. |
TagNode | clean(URL url, String charset) | Deprecated. |
protected Set<ITagNodeCondition> | getAllowTagSet(org.htmlcleaner.CleanTimeValues cleanTimeValues) | |
protected Set<String> | getAllTags(org.htmlcleaner.CleanTimeValues cleanTimeValues) | |
String | getInnerHtml(TagNode node) | For the specified node, returns its content as a string. |
CleanerProperties | getProperties() | |
protected Set<ITagNodeCondition> | getPruneTagSet(org.htmlcleaner.CleanTimeValues cleanTimeValues) | |
ITagInfoProvider | getTagInfoProvider() | |
CleanerTransformations | getTransformations() | |
void | initCleanerTransformations(Map transInfos) | |
protected boolean | isRemovingNodeReasonablySafe(TagNode startTagToken) | |
void | setInnerHtml(TagNode node, String content) | For the specified tag node, defines its HTML content. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
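The `clean` overloads in the method summary differ only in where the HTML comes from. The sketch below, with hypothetical class and helper names, feeds the same markup through `clean(InputStream in, String charset)` and `clean(Reader reader)`. It assumes `TagNode.getText()` is available for reading back the text content; that method is not documented on this page. The caller closes the stream and reader itself rather than relying on the cleaner to do so.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

public class CleanOverloads {

    // Cleans raw bytes, decoding them with an explicitly named charset.
    static TagNode fromStream(byte[] bytes, String charset) throws IOException {
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            return new HtmlCleaner().clean(in, charset);
        }
    }

    // Cleans already-decoded character data.
    static TagNode fromReader(String html) throws IOException {
        try (Reader reader = new StringReader(html)) {
            return new HtmlCleaner().clean(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        String html = "<p>caf\u00e9";
        TagNode a = fromStream(html.getBytes(StandardCharsets.UTF_8), "UTF-8");
        TagNode b = fromReader(html);
        // getText() is assumed to return the concatenated text of the subtree.
        System.out.println(a.getText().toString());
        System.out.println(b.getText().toString());
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding when the input is a byte stream or file.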
Constructor Detail |
---|
public HtmlCleaner()
public HtmlCleaner(ITagInfoProvider tagInfoProvider)
Parameters:
tagInfoProvider - Provider for tag filtering and balancing
public HtmlCleaner(CleanerProperties properties)
Parameters:
properties - Properties used during parsing and serializing
public HtmlCleaner(ITagInfoProvider tagInfoProvider, CleanerProperties properties)
Parameters:
tagInfoProvider - Provider for tag filtering and balancing
properties - Properties used during parsing and serializing
Method Detail |
---|
public TagNode clean(String htmlContent)
public TagNode clean(File file, String charset) throws IOException
Throws:
IOException
public TagNode clean(File file) throws IOException
Throws:
IOException
@Deprecated public TagNode clean(URL url, String charset) throws IOException
Parameters:
url
charset
Throws:
IOException
public TagNode clean(URL url) throws IOException
Parameters:
url
Throws:
IOException
public TagNode clean(InputStream in, String charset) throws IOException
Throws:
IOException
public TagNode clean(InputStream in) throws IOException
Throws:
IOException
public TagNode clean(Reader reader) throws IOException
Throws:
IOException
protected TagNode clean(Reader reader, org.htmlcleaner.CleanTimeValues cleanTimeValues) throws IOException
Parameters:
reader - (not closed)
Throws:
IOException
protected boolean isRemovingNodeReasonablySafe(TagNode startTagToken)
Parameters:
startTagToken
public CleanerProperties getProperties()
protected Set<ITagNodeCondition> getPruneTagSet(org.htmlcleaner.CleanTimeValues cleanTimeValues)
protected Set<ITagNodeCondition> getAllowTagSet(org.htmlcleaner.CleanTimeValues cleanTimeValues)
protected void addPruneNode(TagNode node, org.htmlcleaner.CleanTimeValues cleanTimeValues)
protected Set<String> getAllTags(org.htmlcleaner.CleanTimeValues cleanTimeValues)
public ITagInfoProvider getTagInfoProvider()
public CleanerTransformations getTransformations()
public String getInnerHtml(TagNode node)
Parameters:
node
public void setInnerHtml(TagNode node, String content)
Parameters:
node
content
public void initCleanerTransformations(Map transInfos)
Parameters:
transInfos
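To illustrate `getInnerHtml` and `setInnerHtml` together, the following sketch replaces the content of the first `div` in a cleaned document and reads it back. It assumes `TagNode.findElementByName(String, boolean)` is available for locating the node; that method belongs to `TagNode` and is not documented on this page. The class and helper names are hypothetical.

```java
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

public class InnerHtmlDemo {

    // Replaces the content of the first <div> and returns its new inner HTML,
    // or null when the document contains no <div>.
    static String replaceDivContent(String html, String newContent) {
        HtmlCleaner cleaner = new HtmlCleaner();
        TagNode root = cleaner.clean(html);

        // Recursive search for the first element named "div" (assumed TagNode API).
        TagNode div = root.findElementByName("div", true);
        if (div == null) {
            return null;
        }

        // The new content is itself HTML; the cleaner parses it into child nodes.
        cleaner.setInnerHtml(div, newContent);
        return cleaner.getInnerHtml(div);
    }

    public static void main(String[] args) {
        System.out.println(replaceDivContent("<div>old text</div>", "<b>new text</b>"));
    }
}
```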