Package | Description |
---|---|
org.apache.nutch.crawl | Crawl control code and tools to run the crawler. |
org.apache.nutch.fetcher | The Nutch robot. |
org.apache.nutch.hostdb | |
org.apache.nutch.indexer | Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index. |
org.apache.nutch.metadata | A multi-valued metadata container, and a set of constant fields for Nutch metadata. |
org.apache.nutch.scoring.webgraph | |
org.apache.nutch.segment | A segment stores all data from one generate/fetch/update cycle: fetch list, protocol status, raw content, parsed content, and extracted outgoing links. |
org.apache.nutch.tools.warc | Tools to import / export between Nutch segments and WARC archives. |
Modifier and Type | Method and Description |
---|---|
void | CrawlDbReader.CrawlDbStatReducer.reduce(Text key, Iterable<NutchWritable> values, Reducer.Context context) |
Modifier and Type | Method and Description |
---|---|
RecordWriter<Text,NutchWritable> | FetcherOutputFormat.getRecordWriter(TaskAttemptContext context) |
Modifier and Type | Method and Description |
---|---|
void | UpdateHostDbReducer.reduce(Text key, Iterable<NutchWritable> values, Reducer.Context context) |
Modifier and Type | Method and Description |
---|---|
void | IndexerMapReduce.IndexerReducer.reduce(Text key, Iterable<NutchWritable> values, Reducer.Context context) |
Modifier and Type | Class and Description |
---|---|
class | MetaWrapper: A simple decorator that adds metadata to any Writable that can be serialized by NutchWritable. |
Modifier and Type | Method and Description |
---|---|
void | WebGraph.OutlinkDb.OutlinkDbReducer.reduce(Text key, Iterable<NutchWritable> values, Reducer.Context context) |
Modifier and Type | Method and Description |
---|---|
void | SegmentReader.InputCompatReducer.reduce(Text key, Iterable<NutchWritable> values, Reducer.Context context) |
Modifier and Type | Method and Description |
---|---|
void | WARCExporter.WARCMapReduce.WARCReducer.reduce(Text key, Iterable<NutchWritable> values, Reducer.Context context) |
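
The reducers listed above all receive their values as `Iterable<NutchWritable>`, which suggests a common pattern: each wrapper is unwrapped and the reducer branches on the concrete type of the wrapped value. The following is a minimal sketch of that pattern in plain Java, without the Hadoop or Nutch libraries. `Wrapped`, `FetchStatus`, `PageText`, and this `reduce` helper are hypothetical stand-ins introduced for illustration only; they are not part of the Nutch API, and a real reducer would work with `Text`, `NutchWritable`, and `Reducer.Context` instead.

```java
import java.util.ArrayList;
import java.util.List;

public class UnwrapSketch {

  // Hypothetical stand-in for NutchWritable: carries one value of any supported type.
  static final class Wrapped {
    private final Object value;
    Wrapped(Object value) { this.value = value; }
    Object get() { return value; }
  }

  // Hypothetical stand-ins for two value types that can arrive under the same key.
  static final class FetchStatus {
    final int code;
    FetchStatus(int code) { this.code = code; }
  }

  static final class PageText {
    final String text;
    PageText(String text) { this.text = text; }
  }

  // Mirrors the shape of reduce(Text key, Iterable<NutchWritable> values, ...):
  // unwrap each value, then dispatch on its concrete type.
  static String reduce(String key, Iterable<Wrapped> values) {
    int lastStatus = -1;                      // -1 means no status was seen
    StringBuilder text = new StringBuilder();
    for (Wrapped wrapped : values) {
      Object value = wrapped.get();           // unwrap
      if (value instanceof FetchStatus) {
        lastStatus = ((FetchStatus) value).code;
      } else if (value instanceof PageText) {
        text.append(((PageText) value).text);
      }
    }
    return key + " status=" + lastStatus + " text=" + text;
  }

  public static void main(String[] args) {
    List<Wrapped> values = new ArrayList<>();
    values.add(new Wrapped(new FetchStatus(200)));
    values.add(new Wrapped(new PageText("hello")));
    // prints "http://example.com/ status=200 text=hello"
    System.out.println(reduce("http://example.com/", values));
  }
}
```

Wrapping heterogeneous values in a single type lets one reducer consume records from several inputs under the same key, at the cost of a type check per value.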
Copyright © 2021 The Apache Software Foundation