Package | Description |
---|---|
org.apache.nutch.crawl | Crawl control code and tools to run the crawler. |
org.apache.nutch.indexer | Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index. |
org.apache.nutch.net | Web-related interfaces: URL filters and normalizers. |
org.apache.nutch.parse | The Parse interface and related classes. |

Modifier and Type | Class and Description |
---|---|
class | CrawlDbReader: Read utility for the CrawlDB. |
class | LinkDbReader: Read utility for the LinkDb. |

Modifier and Type | Class and Description |
---|---|
class | IndexingFiltersChecker: Reads and parses a URL and runs the indexers on it. |

Modifier and Type | Class and Description |
---|---|
class | URLFilterChecker: Checks one given filter or all filters. |
class | URLNormalizerChecker: Checks one given normalizer or all normalizers. |

Modifier and Type | Class and Description |
---|---|
class | ParserChecker: Parser checker, useful for testing a parser. |

Copyright © 2021 The Apache Software Foundation