Package com.inet.editor
Class HtmlConverter
java.lang.Object
com.inet.editor.HtmlConverter
Utility class for converting text/plain to HTML and back.
-
Nested Class Summary
Nested Classes
Modifier and Type: static class
Class: HtmlConverter.ConvertResult
Description: Container class for the result of a html2text call
Constructor Summary
Constructors -
Method Summary
static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap)
    Writes the content of a document and removes anything which does not influence the visual appearance of the content.
static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions)
    Writes the content of a document and removes anything which does not influence the visual appearance of the content.
static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath)
    Writes the content of a document and removes anything which does not influence the visual appearance of the content.
static @Nonnull String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap, Map<String, String> hrefMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath)
    Writes the content of a document and removes anything which does not influence the visual appearance of the content.
static @Nonnull String getCompactHtmlText(String htmlText, Map<String, String> imageMap)
    Parses a HTML string and removes anything which does not influence the visual appearance of the content.
static @Nonnull String getCompactHtmlText(String htmlText, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions)
    Parses a HTML string and removes anything which does not influence the visual appearance of the content.
static @Nonnull String getCompactHtmlText(String htmlText, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath)
    Parses a HTML string and removes anything which does not influence the visual appearance of the content.
static @Nonnull String getCompactHtmlText(String htmlText, Map<String, String> imageMap, Map<String, String> hrefMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath)
    Parses a HTML string and removes anything which does not influence the visual appearance of the content.
static String getInlinedHtml(String htmlText)
    Returns the inlined html from the text. NOTE: It's an inline of the content without HTML or SPAN container elements so the style content is not encapsulated and may be affected by the target context.
static @Nonnull String html2inlinedHtml(String htmlText)
    Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
static @Nonnull String html2inlinedHtml(String htmlText, boolean inlineImages)
    Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
static @Nonnull String html2inlinedHtml(String htmlText, boolean inlineImages, boolean contentOnly, boolean compact, URL baseURL)
    Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
static @Nonnull String html2inlinedHtml(String htmlText, boolean inlineImages, boolean contentOnly, URL baseURL)
    Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
static @Nonnull String html2inlinedHtml(String htmlText, boolean inlineImages, URL baseURL)
    Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
static String html2text(htmlText)
    Converts HTML to plain text.
static HtmlConverter.ConvertResult html2text(htmlText, maxLength)
    Converts HTML to plain text.
static String text2html(plainText, font)
    Converts a text/plain string into HTML.
static String text2html(plainText, font, startWithP)
    Converts a text/plain string into HTML.
static String text2html(String plainText, Font font, boolean startWithP, String defaultClass)
    Converts a text/plain string into HTML.
-
Constructor Details
-
HtmlConverter
public HtmlConverter()
-
-
Method Details
-
text2html
Converts a text/plain string into HTML. The HTML body content will always begin with an opening P tag. Therefore it is recommended to use this method to convert text blocks.
Parameters:
plainText - the text to be converted
font - the base font for the HTML content, may be null for none
Returns:
the HTML formatted content
-
text2html
Converts a text/plain string into HTML.
Parameters:
plainText - the text to be converted
font - the base font for the HTML content, may be null for none
startWithP - set to true to start the content in a P element, false to start the content as inline text
Returns:
the HTML formatted content
-
text2html
public static String text2html(String plainText, Font font, boolean startWithP, String defaultClass)
Converts a text/plain string into HTML.
Parameters:
plainText - the text to be converted
font - the base font for the HTML content, may be null for none
startWithP - set to true to start the content in a P element, false to start the content as inline text
defaultClass - the default class attribute value for all generated elements. This class can be used to set a separate CSS style for all generated elements.
Returns:
the HTML formatted content
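To make the effect of startWithP and defaultClass concrete, here is a minimal, self-contained sketch of a plain-text-to-HTML conversion. It is not the library's implementation: the class name TextToHtmlSketch, the escaping rules, and the choice to turn line breaks into BR elements are assumptions made for illustration, and the font parameter is left out entirely.

```java
class TextToHtmlSketch {

    /** Escapes the characters that have a special meaning in HTML text and attributes. */
    static String escape(String text) {
        // '&' must be replaced first so that the other entities are not escaped twice.
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    /**
     * Hypothetical stand-in for a text-to-HTML conversion.
     * Assumption: every generated element (P and BR here) receives the default class.
     */
    static String text2html(String plainText, boolean startWithP, String defaultClass) {
        String classAttr = defaultClass == null ? "" : " class=\"" + escape(defaultClass) + "\"";
        String lineBreak = "<br" + classAttr + ">";
        String body = escape(plainText).replace("\r\n", "\n").replace("\n", lineBreak);
        // startWithP == true wraps the content in a P element, false keeps it inline.
        return startWithP ? "<p" + classAttr + ">" + body + "</p>" : body;
    }

    public static void main(String[] args) {
        System.out.println(text2html("a < b\nc", true, "note"));
        System.out.println(text2html("a < b\nc", false, null));
    }
}
```

In this sketch the first call yields a block that starts with an opening P tag, while the second yields inline content that can be placed inside an existing element.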
-
html2text
Converts HTML to plain text. The conversion replaces BR elements as well as block-level elements with line breaks. HR elements are replaced as well. Tables are block-level elements and are handled as such; there is no distinct table conversion.
Parameters:
htmlText - the text to convert; a null value returns an empty string
Returns:
the converted content, never null
-
html2text
Converts HTML to plain text. The conversion replaces BR elements as well as block-level elements with line breaks. HR elements are replaced as well. Tables are block-level elements and are handled as such; there is no distinct table conversion.
Parameters:
htmlText - the text to convert; a null value returns an empty string
maxLength - the maximum length of the output string, any further content will be discarded
Returns:
the conversion result containing the text content and a flag indicating whether maxLength was reached
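The behaviour described above (line breaks for BR, HR and block-level elements, an empty string for null, and a length limit with a flag) can be sketched with the JDK alone. This is an illustration under stated assumptions, not the library's code: the class HtmlToTextSketch, the nested Result type and its field names are hypothetical, the list of block-level tags is deliberately short, and the flag is set only when content was actually discarded.

```java
import java.util.regex.Pattern;

class HtmlToTextSketch {

    /** Hypothetical result holder: the text plus a flag telling whether content was cut off. */
    static final class Result {
        final String text;
        final boolean maxLengthReached;

        Result(String text, boolean maxLengthReached) {
            this.text = text;
            this.maxLengthReached = maxLengthReached;
        }
    }

    // BR, HR and the closing tags of a few block-level elements become line breaks.
    private static final Pattern LINE_BREAKING = Pattern.compile(
            "<br\\s*/?>|<hr\\s*/?>|</(p|div|h[1-6]|li|tr|table)\\s*>",
            Pattern.CASE_INSENSITIVE);

    // Every remaining tag is simply dropped.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]*>");

    static Result html2text(String htmlText, int maxLength) {
        if (htmlText == null) {
            return new Result("", false); // null input yields an empty string
        }
        String text = LINE_BREAKING.matcher(htmlText).replaceAll("\n");
        text = ANY_TAG.matcher(text).replaceAll("");
        // Decode only the most common entities; '&amp;' goes last to avoid double decoding.
        text = text.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
        if (text.length() > maxLength) {
            return new Result(text.substring(0, maxLength), true);
        }
        return new Result(text, false);
    }

    public static void main(String[] args) {
        Result r = html2text("<p>Hello</p><p>World<br>!</p>", 5);
        System.out.println(r.text + " (cut off: " + r.maxLengthReached + ")");
    }
}
```

Note that a table row ends up as a single line in this sketch, which mirrors the statement that tables get no distinct conversion.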
-
html2inlinedHtml
@Nonnull public static String html2inlinedHtml(String htmlText)
Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements). The content of all images will be converted to inline images as well.
Parameters:
htmlText - HTML-coded content
Returns:
HTML-coded content with all styles inline
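As a rough picture of what "all styles defined inline" means, the sketch below moves declarations from a class-based rule into the style attribute of each matching element. It is a deliberately simplified assumption-laden example: StyleInliningSketch and inlineClassStyles are hypothetical names, only single class names in double-quoted class attributes are handled, and real CSS inlining has to deal with selectors, the cascade and existing style attributes.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StyleInliningSketch {

    // Matches a double-quoted class attribute and captures its value.
    private static final Pattern CLASS_ATTR = Pattern.compile("\\bclass=\"([^\"]*)\"");

    /**
     * Replaces class="name" by style="declarations" for every class name found in
     * classStyles (class name mapped to CSS declarations). Unknown classes are left untouched.
     */
    static String inlineClassStyles(String html, Map<String, String> classStyles) {
        return CLASS_ATTR.matcher(html).replaceAll(match -> {
            String declarations = classStyles.get(match.group(1));
            String replacement = declarations == null
                    ? match.group()
                    : "style=\"" + declarations + "\"";
            return Matcher.quoteReplacement(replacement);
        });
    }

    public static void main(String[] args) {
        String html = "<p class=\"note\">Hi</p>";
        System.out.println(inlineClassStyles(html, Map.of("note", "color: red")));
    }
}
```

The resulting markup no longer depends on a stylesheet, which is the property that makes inlined HTML safe to paste into another document or mail body.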
-
html2inlinedHtml
@Nonnull public static String html2inlinedHtml(String htmlText, boolean inlineImages)
Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
Parameters:
htmlText - HTML-coded content
inlineImages - if true, the content of the images will be converted to inlined data as well, eliminating all external references from this document.
Returns:
HTML-coded content with all styles inline
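The source does not state how image content is inlined. A common way to embed image bytes directly in an IMG tag is a Base64 data: URI, and the following sketch shows that technique under this assumption. DataUriSketch and toDataUri are hypothetical names and not part of the documented API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DataUriSketch {

    /** Builds a data: URI that carries the given bytes, so no external reference is needed. */
    static String toDataUri(byte[] content, String mimeType) {
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        // Tiny payload for demonstration; real callers would pass the bytes of an image file.
        byte[] bytes = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println("<img src=\"" + toDataUri(bytes, "text/plain") + "\">");
    }
}
```

Embedding images this way increases the size of the HTML, so inlineImages is most useful when the document has to be self-contained.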
-
html2inlinedHtml
@Nonnull public static String html2inlinedHtml(String htmlText, boolean inlineImages, URL baseURL)
Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
Parameters:
htmlText - HTML-coded content
inlineImages - if true, the content of the images will be converted to inlined data as well, eliminating all external references from this document.
baseURL - the base URL of the HTML content. Required to resolve relative URIs within the document. This parameter should be set if inlineImages is set to true.
Returns:
HTML-coded content with all styles inline
-
html2inlinedHtml
@Nonnull public static String html2inlinedHtml(String htmlText, boolean inlineImages, boolean contentOnly, URL baseURL)
Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
Parameters:
htmlText - HTML-coded content
inlineImages - if true, the content of the images will be converted to inlined data as well, eliminating all external references from this document.
contentOnly - if true, only the content of the body will be returned. This is recommended if the original content will be inserted into another document. If false, a valid HTML document will be returned.
baseURL - the base URL of the HTML content. Required to resolve relative URIs within the document. This parameter should be set if inlineImages is set to true.
Returns:
HTML-coded content with all styles inline
-
html2inlinedHtml
@Nonnull public static String html2inlinedHtml(String htmlText, boolean inlineImages, boolean contentOnly, boolean compact, URL baseURL)
Converts HTML content with CSS references or global CSS definitions to HTML with all styles defined inline (within the style attributes of the elements).
Parameters:
htmlText - HTML-coded content
inlineImages - if true, the content of the images will be converted to inlined data as well, eliminating all external references from this document.
contentOnly - if true, only the content of the body will be returned. This is recommended if the original content will be inserted into another document. If false, a valid HTML document will be returned.
compact - if true, the output will contain no indentation or fill spaces. This does not affect the rendering; it only reduces the output size.
baseURL - the base URL of the HTML content. Required to resolve relative URIs within the document. This parameter should be set if inlineImages is set to true.
Returns:
HTML-coded content with all styles inline
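The baseURL parameter matters because relative references such as images/logo.png have no meaning without a location to resolve them against. The sketch below uses the standard java.net.URI class to show that resolution step; BaseUrlSketch, its resolve helper and the example.com addresses are illustrative assumptions and say nothing about how the library resolves references internally.

```java
import java.net.URI;

class BaseUrlSketch {

    /** Resolves a reference found in the document against the base URL of that document. */
    static String resolve(String baseUrl, String reference) {
        return URI.create(baseUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String base = "https://example.com/docs/page.html";
        // Relative to the directory of the base document.
        System.out.println(resolve(base, "images/logo.png"));
        // Relative to the server root.
        System.out.println(resolve(base, "/css/site.css"));
        // Absolute references are returned unchanged.
        System.out.println(resolve(base, "https://cdn.example.org/a.png"));
    }
}
```

Without a base, the first two references could not be fetched at all, which is why the parameter should be set whenever images are to be inlined.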
-
getCompactHtmlText
@Nonnull public static String getCompactHtmlText(String htmlText, Map<String, String> imageMap)
Parses a HTML string and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
htmlText - the HTML content to be converted
imageMap - a map used to replace image source links
Returns:
the compacted content or the original one in case of an error
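One plausible reading of the imageMap is that its keys are the original SRC values and its values are the replacements. The sketch below shows such a replacement with a regular expression. This key/value direction, the class ImageMapSketch and the method replaceImageSources are assumptions for illustration only; the source does not spell out the map's layout.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageMapSketch {

    // Captures the part before the src value, the value itself, and the closing quote.
    private static final Pattern IMG_SRC = Pattern.compile(
            "(<img\\b[^>]*?\\bsrc=\")([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    /** Replaces the src of every IMG tag that has an entry in imageMap; others stay as they are. */
    static String replaceImageSources(String html, Map<String, String> imageMap) {
        return IMG_SRC.matcher(html).replaceAll(match -> {
            String original = match.group(2);
            String mapped = imageMap.getOrDefault(original, original);
            return Matcher.quoteReplacement(match.group(1) + mapped + match.group(3));
        });
    }

    public static void main(String[] args) {
        String html = "<p><img src=\"C:/tmp/logo.png\" alt=\"Logo\"></p>";
        System.out.println(replaceImageSources(html, Map.of("C:/tmp/logo.png", "images/logo.png")));
    }
}
```

A mapping like this is typically useful when locally stored images have to be pointed at their final location before the HTML is saved or sent.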
-
getCompactHtmlText
@Nonnull public static String getCompactHtmlText(String htmlText, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions)
Parses a HTML string and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
htmlText - the HTML content to be converted
imageMap - a map used to replace image source links
tagWritingOptions - map with special writing settings
Returns:
the compacted content or the original one in case of an error
-
getCompactHtmlText
@Nonnull public static String getCompactHtmlText(String htmlText, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath)
Parses a HTML string and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
htmlText - the HTML content to be converted
imageMap - a map used to replace image source links
tagWritingOptions - map with special writing settings
trustedImagePath - indicates whether the image paths in the image map are already encoded. If true, the paths will not be URL path encoded, which may lead to a corrupted file if the paths are not properly encoded. When in doubt, set to false or don't use this method.
Returns:
the compacted content or the original one in case of an error
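The trade-off behind trustedImagePath can be shown with the JDK's own path encoding. In the sketch below, an untrusted path is encoded and a trusted one is passed through unchanged. ImagePathEncodingSketch and toSrcValue are hypothetical names, and using the multi-argument java.net.URI constructor for the encoding is an assumption; the library may encode differently.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ImagePathEncodingSketch {

    /**
     * Returns the value to write into a src attribute.
     * Trusted paths are taken as already encoded; all others are URL path encoded.
     */
    static String toSrcValue(String path, boolean trustedImagePath) {
        if (trustedImagePath) {
            return path;
        }
        try {
            // The multi-argument URI constructors quote illegal characters, including '%'.
            return new URI(null, null, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid path: " + path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toSrcValue("images/my logo.png", false));   // raw path gets encoded
        System.out.println(toSrcValue("images/my%20logo.png", true));  // encoded path kept as-is
        System.out.println(toSrcValue("images/my%20logo.png", false)); // encoded twice
    }
}
```

The third call shows the reason for the flag: encoding a path that is already encoded corrupts it, while skipping the encoding for a raw path leaves illegal characters in the output.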
-
getCompactHtmlText
@Nonnull public static String getCompactHtmlText(String htmlText, Map<String, String> imageMap, Map<String, String> hrefMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath)
Parses a HTML string and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
htmlText - the HTML content to be converted
imageMap - a map used to replace image source links
hrefMap - a map used to replace a-href links
tagWritingOptions - map with special writing settings
trustedImagePath - indicates whether the image paths in the image map are already encoded. If true, the paths will not be URL path encoded, which may lead to a corrupted file if the paths are not properly encoded. When in doubt, set to false or don't use this method.
Returns:
the compacted content or the original one in case of an error
Since:
1.12
-
getCompactHtmlText
public static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap) throws BadLocationException
Writes the content of a document and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
doc - the document content to be converted, must not be null
imageMap - a map used to replace image source links
Returns:
the compacted content or the original one in case of an error
Throws:
BadLocationException - thrown in case of a corrupt model
-
getCompactHtmlText
public static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions) throws BadLocationException
Writes the content of a document and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
doc - the document content to be converted, must not be null
imageMap - a map used to replace image source links
tagWritingOptions - map with special writing settings
Returns:
the compacted content or the original one in case of an error
Throws:
BadLocationException - thrown in case of a corrupt model
-
getCompactHtmlText
public static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath) throws BadLocationException
Writes the content of a document and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
doc - the document content to be converted, must not be null
imageMap - a map used to replace image source links
tagWritingOptions - map with special writing settings
trustedImagePath - indicates whether the image paths in the image map are already encoded. If true, the paths will not be URL path encoded, which may lead to a corrupted file if the paths are not properly encoded. When in doubt, set to false or don't use this method.
Returns:
the compacted content or the original one in case of an error
Throws:
BadLocationException - thrown in case of a corrupt model
-
getCompactHtmlText
@Nonnull public static String getCompactHtmlText(InetHtmlDocument doc, Map<String, String> imageMap, Map<String, String> hrefMap, Map<HTML.Tag, Boolean> tagWritingOptions, boolean trustedImagePath) throws BadLocationException
Writes the content of a document and removes anything which does not influence the visual appearance of the content. The SRC links of all IMG tags are replaced using the imageMap.
Parameters:
doc - the document content to be converted, must not be null
imageMap - a map used to replace image source links
hrefMap - a map used to replace a-href links
tagWritingOptions - map with special writing settings
trustedImagePath - indicates whether the image paths in the image map are already encoded. If true, the paths will not be URL path encoded, which may lead to a corrupted file if the paths are not properly encoded. When in doubt, set to false or don't use this method.
Returns:
the compacted content or the original one in case of an error
Throws:
BadLocationException - thrown in case of a corrupt model
Since:
1.12
-
getInlinedHtml
public static String getInlinedHtml(String htmlText)
Returns the inlined HTML from the text.
NOTE: The content is inlined without HTML or SPAN container elements, so the styles are not encapsulated and may be affected by the target context.
Parameters:
htmlText - the HTML text
Returns:
the inlined HTML text
-