public interface CommonCrawlFormat extends Closeable
Modifier and Type | Method and Description |
---|---|
void | close(): Optional method that can be implemented if the actual format needs a close procedure. |
List<String> | getInLinks(): Gets the list of inlinks. |
String | getJsonData() |
String | getJsonData(String url, Content content, Metadata metadata): Returns a string representation of the JSON structure of the URL content. |
String | getJsonData(String url, Content content, Metadata metadata, ParseData parseData): Returns a string representation of the JSON structure of the URL content, taking into account the parsed metadata about the URL. |
void | setInLinks(List<String> inLinks): Sets the inlinks of this document. |
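
The methods summarized above can be illustrated with a minimal sketch. It does not use the Nutch interface itself: `InLinkFormatSketch`, `SimpleInLinkFormat`, and `InLinkFormatDemo` are hypothetical stand-ins, and the overloads that take `Content`, `Metadata`, and `ParseData` are left out so the example compiles without Nutch on the classpath. The JSON layout is also an assumption, not the layout a real implementation produces.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in that mirrors part of CommonCrawlFormat; not the Nutch interface.
interface InLinkFormatSketch extends Closeable {
    List<String> getInLinks();
    void setInLinks(List<String> inLinks);
    String getJsonData() throws IOException;
}

// Hypothetical implementation that emits only the URL and its inlinks as JSON.
final class SimpleInLinkFormat implements InLinkFormatSketch {
    private final String url;
    private List<String> inLinks = new ArrayList<>();
    private boolean closed = false;

    SimpleInLinkFormat(String url) {
        this.url = url;
    }

    @Override
    public List<String> getInLinks() {
        return Collections.unmodifiableList(inLinks);
    }

    @Override
    public void setInLinks(List<String> inLinks) {
        // Copy defensively so later changes to the caller's list do not leak in.
        this.inLinks = new ArrayList<>(inLinks);
    }

    @Override
    public String getJsonData() throws IOException {
        if (closed) {
            throw new IOException("format already closed");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("{\"url\":\"").append(escape(url)).append("\",\"inlinks\":[");
        for (int i = 0; i < inLinks.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(escape(inLinks.get(i))).append('"');
        }
        return sb.append("]}").toString();
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Minimal escaping: backslashes and double quotes only.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}

final class InLinkFormatDemo {
    public static void main(String[] args) throws IOException {
        // try-with-resources works because the interface extends Closeable.
        try (SimpleInLinkFormat format = new SimpleInLinkFormat("http://example.com/")) {
            format.setInLinks(Arrays.asList("http://a.example/", "http://b.example/"));
            System.out.println(format.getJsonData());
        }
    }
}
```

A real implementation would normally delegate JSON generation to a JSON library rather than building strings by hand; the hand-rolled builder here only keeps the sketch dependency-free.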
String getJsonData() throws IOException
Throws: IOException
String getJsonData(String url, Content content, Metadata metadata) throws IOException
Parameters: url, content, metadata
Throws: IOException
String getJsonData(String url, Content content, Metadata metadata, ParseData parseData) throws IOException
Parameters: url, content, metadata, parseData
Throws: IOException
void setInLinks(List<String> inLinks)
Parameters: inLinks - list of inlinks
void close()
Specified by: close in interface AutoCloseable
Specified by: close in interface Closeable
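
Because `close()` is specified by both `AutoCloseable` and `Closeable`, an implementation can be managed with try-with-resources. The sketch below illustrates that under stated assumptions: `CountingFormat` and `CloseDemo` are hypothetical names, not Nutch classes, and the `getJsonData()` body is a placeholder. It shows that `close()` runs even when the body of the `try` fails.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical resource standing in for a CommonCrawlFormat implementation.
final class CountingFormat implements Closeable {
    private boolean closed = false;

    String getJsonData() throws IOException {
        if (closed) {
            throw new IOException("format already closed");
        }
        return "{}"; // placeholder payload, not a real Common Crawl record
    }

    // Safe to call more than once: later calls have no further effect.
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class CloseDemo {
    // Runs a body that fails and reports whether the format was closed anyway.
    static boolean closedAfterFailure() {
        CountingFormat format = new CountingFormat();
        try (CountingFormat f = format) {
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException expected) {
            // close() has already run by the time this handler executes.
        }
        return format.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(closedAfterFailure()); // prints "true"
    }
}
```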
Copyright © 2021 The Apache Software Foundation