org.htmlcleaner
Class Utils

java.lang.Object
  extended by org.htmlcleaner.Utils

public class Utils
extends Object

Common utilities.

Created by: Vladimir Nikic
Date: November, 2006.


Field Summary
static Pattern DECIMAL
           
static Pattern HEX_RELAXED
           
static Pattern HEX_STRICT
           
 
Constructor Summary
Utils()
           
 
Method Summary
static String escapeHtml(String s, CleanerProperties props)
          Escapes an HTML string.
static String escapeXml(String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR)
          Change notes: 1) ASCII characters encoded in &#xx; form are converted to the characters themselves, since such encoding may be an attempt to slip in malicious HTML; 2) characters in &#xxx; form are converted to their &quot;-style (named entity) representation where one exists for the character.
static String escapeXml(String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR, boolean isHtmlOutput)
          Change notes: 1) ASCII characters encoded in &#xx; form are converted to the characters themselves, since such encoding may be an attempt to slip in malicious HTML; 2) characters in &#xxx; form are converted to their &quot;-style (named entity) representation where one exists for the character.
static String escapeXml(String s, CleanerProperties props, boolean isDomCreation)
          Escapes XML string.
static String getXmlName(String name)
           
static String getXmlNSPrefix(String name)
           
static boolean isEmptyString(Object o)
           
static boolean isIdentifierHelperChar(char ch)
          Checks if the specified character can be part of an XML identifier (tag name or attribute name) and is not a standard identifier character.
static boolean isValidXmlIdentifier(String s)
          Checks whether the specified string can be a valid tag name or attribute name in XML.
static boolean isWhitespaceString(Object object)
          Checks whether the specified object's string representation is an empty string (consisting only of whitespace).
static String ltrim(String s)
          Trims specified string from left.
static String rtrim(String s)
          Trims specified string from right.
static String[] tokenize(String s, String delimiters)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

HEX_STRICT

public static Pattern HEX_STRICT

HEX_RELAXED

public static Pattern HEX_RELAXED

DECIMAL

public static Pattern DECIMAL
Constructor Detail

Utils

public Utils()
Method Detail

escapeHtml

public static String escapeHtml(String s,
                                CleanerProperties props)
Escapes an HTML string.

Parameters:
s - String to be escaped
props - Cleaner properties that affect escaping behaviour
Returns:

escapeXml

public static String escapeXml(String s,
                               CleanerProperties props,
                               boolean isDomCreation)
Escapes XML string.

Parameters:
s - String to be escaped
props - Cleaner properties that affect escaping behaviour
isDomCreation - Tells if escaped content will be part of the DOM
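
As a point of reference for what escaping an XML string involves, the following minimal sketch replaces the five characters reserved in XML with their predefined entities. It is independent of HtmlCleaner: the class and method names are hypothetical, and it deliberately ignores CleanerProperties and isDomCreation, which may change the actual result.

```java
// Hypothetical helper, not part of org.htmlcleaner.
final class XmlEscapeSketch {

    // Replaces the five XML-reserved characters with predefined entities.
    static String escapeReserved(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeReserved("a < b & \"c\""));
    }
}
```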

escapeXml

public static String escapeXml(String s,
                               boolean advanced,
                               boolean recognizeUnicodeChars,
                               boolean translateSpecialEntities,
                               boolean isDomCreation,
                               boolean transResCharsToNCR,
                               boolean translateSpecialEntitiesToNCR)
Change notes:
1) ASCII characters encoded in &#xx; form are converted to the characters themselves, since such encoding may be an attempt to slip in malicious HTML.
2) Characters in &#xxx; form are converted to their &quot;-style (named entity) representation where one exists for the character.
3) HTML special entities are converted to XML &#xxx; form when outputting XML.

Parameters:
s - String to be escaped
advanced -
recognizeUnicodeChars -
translateSpecialEntities -
isDomCreation - Tells if escaped content will be part of the DOM
Returns:
TODO Consider moving to CleanerProperties since a long list of params is misleading.
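
Change note 1 can be illustrated with a small sketch that decodes numeric character references back to printable ASCII characters, so that markup cannot hide behind an encoded form. This is an illustration under assumptions, not HtmlCleaner's implementation: the class name, the method name and the regular expression are hypothetical, and the real method's behaviour depends on its boolean flags.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of org.htmlcleaner.
final class NumericReferenceSketch {

    // Assumed pattern: hexadecimal (&#x41;) or decimal (&#65;) reference.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    // Decodes references to printable ASCII; leaves all others untouched.
    static String decodeAsciiReferences(String s) {
        Matcher m = NUMERIC_REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int code;
            try {
                code = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2));
            } catch (NumberFormatException e) {
                code = -1; // too large to be a character; keep as written
            }
            boolean printableAscii = code >= 0x20 && code < 0x7F;
            String replacement = printableAscii
                    ? String.valueOf((char) code)
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeAsciiReferences("&#60;b&#x3E; &#233;"));
    }
}
```

Once decoded, characters such as < and > are plain text again and can go through ordinary reserved-character escaping.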

escapeXml

public static String escapeXml(String s,
                               boolean advanced,
                               boolean recognizeUnicodeChars,
                               boolean translateSpecialEntities,
                               boolean isDomCreation,
                               boolean transResCharsToNCR,
                               boolean translateSpecialEntitiesToNCR,
                               boolean isHtmlOutput)
Change notes:
1) ASCII characters encoded in &#xx; form are converted to the characters themselves, since such encoding may be an attempt to slip in malicious HTML.
2) Characters in &#xxx; form are converted to their &quot;-style (named entity) representation where one exists for the character.
3) HTML special entities are converted to XML &#xxx; form when outputting XML.

Parameters:
s - String to be escaped
advanced -
recognizeUnicodeChars -
translateSpecialEntities -
isDomCreation - Tells if escaped content will be part of the DOM
isHtmlOutput -
Returns:
TODO Consider moving to CleanerProperties since a long list of params is misleading.
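
Change note 3 concerns HTML named entities, most of which are not defined in plain XML and therefore need a numeric form there. The sketch below shows the idea with a deliberately tiny lookup table. The class, the method and the table contents are hypothetical choices for illustration; HtmlCleaner's own entity table and flag handling are not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of org.htmlcleaner.
final class EntityToNcrSketch {

    // A few HTML named entities and their Unicode code points.
    private static final Map<String, Integer> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("nbsp", 160);
        ENTITIES.put("copy", 169);
        ENTITIES.put("eacute", 233);
        ENTITIES.put("euro", 8364);
    }

    private static final Pattern NAMED = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Rewrites known named entities as decimal numeric character references.
    static String toNumericReferences(String s) {
        Matcher m = NAMED.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer code = ENTITIES.get(m.group(1));
            // Entities missing from the table (such as &amp;) are kept as written.
            String replacement = code != null ? "&#" + code + ";" : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNumericReferences("caf&eacute;&nbsp;&copy; &amp;"));
    }
}
```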

isIdentifierHelperChar

public static boolean isIdentifierHelperChar(char ch)
Checks if the specified character can be part of an XML identifier (tag name or attribute name) and is not a standard identifier character.

Parameters:
ch - Character to be checked
Returns:
True if it can be part of xml identifier

isValidXmlIdentifier

public static boolean isValidXmlIdentifier(String s)
Checks whether the specified string can be a valid tag name or attribute name in XML.

Parameters:
s - String to be checked
Returns:
True if string is valid xml identifier, false otherwise
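
The two identifier checks can be pictured with the simplified sketch below. It assumes that the "helper" characters are ':', '.', '-' and '_', and that a name must start with a letter or underscore. Both are assumptions made for illustration rather than the rules this class applies, and the sketch is stricter than the Name production of the XML specification, which also permits a leading colon.

```java
// Hypothetical helper, not part of org.htmlcleaner.
final class XmlIdentifierSketch {

    // Assumed set of non-alphanumeric characters allowed inside a name.
    static boolean isHelperChar(char ch) {
        return ch == ':' || ch == '.' || ch == '-' || ch == '_';
    }

    // True if s looks like a usable tag or attribute name.
    static boolean isValidIdentifier(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            // Letters are allowed anywhere; digits only after the first position.
            boolean standard = Character.isLetter(ch) || (i > 0 && Character.isDigit(ch));
            // Underscore may start a name; other helper characters may not.
            boolean helper = ch == '_' || (i > 0 && isHelperChar(ch));
            if (!standard && !helper) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidIdentifier("xsl:template"));
        System.out.println(isValidIdentifier("1abc"));
    }
}
```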

isEmptyString

public static boolean isEmptyString(Object o)
Parameters:
o - Object whose string representation is checked
Returns:
True if the specified string is null or contains only whitespace characters
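
The documented return value corresponds to a check along the following lines. This is a sketch with hypothetical names. It assumes the object is inspected through toString() and that "whitespace" means whatever String.trim() strips (code points up to U+0020).

```java
// Hypothetical helper, not part of org.htmlcleaner.
final class EmptyStringSketch {

    // True for null, and for objects whose text is empty after trimming.
    static boolean isEmpty(Object o) {
        return o == null || o.toString().trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(null));
        System.out.println(isEmpty("  \t\n"));
        System.out.println(isEmpty(" a "));
    }
}
```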

tokenize

public static String[] tokenize(String s,
                                String delimiters)

getXmlNSPrefix

public static String getXmlNSPrefix(String name)
Parameters:
name - XML element name or attribute name
Returns:
The prefix of the XML element or attribute name (the part before ':'), or null if there is no prefix

getXmlName

public static String getXmlName(String name)
Parameters:
name - XML element name or attribute name
Returns:
The XML element or attribute name without its prefix (the part after ':')
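
The prefix and name parts described for getXmlNSPrefix and getXmlName can be sketched as a split at the first colon. The names below are hypothetical, and the edge-case handling (a leading or trailing colon, or an unprefixed name returned unchanged) is an assumption, since this page does not specify it.

```java
// Hypothetical helper, not part of org.htmlcleaner.
final class XmlNameSketch {

    // Part before the first ':', or null when there is no prefix.
    static String nsPrefix(String name) {
        int colon = name.indexOf(':');
        return colon > 0 ? name.substring(0, colon) : null;
    }

    // Part after the first ':', or the whole name when there is no prefix.
    static String localName(String name) {
        int colon = name.indexOf(':');
        boolean hasBothParts = colon > 0 && colon < name.length() - 1;
        return hasBothParts ? name.substring(colon + 1) : name;
    }

    public static void main(String[] args) {
        System.out.println(nsPrefix("xsl:template"));
        System.out.println(localName("xsl:template"));
        System.out.println(nsPrefix("div"));
    }
}
```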

ltrim

public static String ltrim(String s)
Trims specified string from left.

Parameters:
s - String to be trimmed

rtrim

public static String rtrim(String s)
Trims specified string from right.

Parameters:
s - String to be trimmed
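
One-sided trimming as described for ltrim and rtrim might look like the sketch below. The class name is hypothetical, and two details are assumptions: whitespace is taken to mean Character.isWhitespace, and null input is passed through as null.

```java
// Hypothetical helper, not part of org.htmlcleaner.
final class TrimSketch {

    // Removes leading whitespace only.
    static String ltrim(String s) {
        if (s == null) {
            return null;
        }
        int start = 0;
        while (start < s.length() && Character.isWhitespace(s.charAt(start))) {
            start++;
        }
        return s.substring(start);
    }

    // Removes trailing whitespace only.
    static String rtrim(String s) {
        if (s == null) {
            return null;
        }
        int end = s.length();
        while (end > 0 && Character.isWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + ltrim("  a b ") + "]");
        System.out.println("[" + rtrim("  a b ") + "]");
    }
}
```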

isWhitespaceString

public static boolean isWhitespaceString(Object object)
Checks whether the specified object's string representation is an empty string (consisting only of whitespace).

Parameters:
object - Object whose string representation is checked
Returns:
True if the string representation is empty or consists only of whitespace, false otherwise


Copyright © 2006-2014. All Rights Reserved.