public class CommonCrawlFormatFactory extends Object
Factory for CommonCrawlFormat objects (a.k.a. formatters) that map crawled files to CommonCrawl format.

Constructor and Description |
---|
CommonCrawlFormatFactory() |
Modifier and Type | Method and Description |
---|---|
static CommonCrawlFormat | getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config) |
static CommonCrawlFormat | getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) Deprecated. |
public static CommonCrawlFormat getCommonCrawlFormat(String formatType, String url, Content content, Metadata metadata, Configuration nutchConf, CommonCrawlConfig config) throws IOException

Returns a CommonCrawlFormat object for the specified type of formatter.

Parameters:
formatType - the type of formatter to be created.
url - the URL.
content - the content.
metadata - the metadata.
nutchConf - the configuration.
config - the CommonCrawl output configuration.
Returns:
the CommonCrawlFormat object.
Throws:
IOException - If any I/O error occurs.

public static CommonCrawlFormat getCommonCrawlFormat(String formatType, Configuration nutchConf, CommonCrawlConfig config) throws IOException

Throws:
IOException
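
The static factory methods above select a formatter from a type name. The following is a minimal, self-contained sketch of that general pattern, not the Nutch implementation. Every name in it (FormatFactorySketch, Formatter, JsonLikeFormatter, PlainFormatter, getFormatter, and the type names "json" and "plain") is hypothetical, and throwing IOException for an unknown type is an assumption made for illustration; this page does not state how the real factory handles that case.

```java
import java.io.IOException;
import java.util.Locale;

// Illustrative stand-ins only: these are NOT the Nutch classes.
public class FormatFactorySketch {

    /** Hypothetical stand-in for a formatter type such as CommonCrawlFormat. */
    public interface Formatter {
        String format(String url, String body) throws IOException;
    }

    /** Hypothetical formatter that emits a one-line JSON-like record. */
    static final class JsonLikeFormatter implements Formatter {
        @Override
        public String format(String url, String body) {
            return "{\"url\":\"" + url + "\",\"length\":" + body.length() + "}";
        }
    }

    /** Hypothetical formatter that emits a tab-separated record. */
    static final class PlainFormatter implements Formatter {
        @Override
        public String format(String url, String body) {
            return url + "\t" + body.length();
        }
    }

    /**
     * Static factory method: picks an implementation from a type name.
     * The names "json" and "plain" are assumptions made for this sketch.
     */
    public static Formatter getFormatter(String formatType) throws IOException {
        if (formatType == null) {
            throw new IOException("formatType must not be null");
        }
        // Compare case-insensitively so "JSON" and "json" behave the same.
        switch (formatType.toLowerCase(Locale.ROOT)) {
            case "json":
                return new JsonLikeFormatter();
            case "plain":
                return new PlainFormatter();
            default:
                throw new IOException("Unknown format type: " + formatType);
        }
    }

    public static void main(String[] args) throws IOException {
        Formatter formatter = getFormatter("json");
        System.out.println(formatter.format("http://example.com/", "hello"));
    }
}
```

Callers depend only on the Formatter interface, so adding a new output format in this sketch means adding one implementation and one case in the factory method.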
Copyright © 2021 The Apache Software Foundation