Package | Description |
---|---|
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.any23 | This package uses the Apache Any23 library to parse and extract structured data in RDF format from a variety of Web documents. |
org.apache.nutch.exchange | Control code for the exchange component, which runs in the indexing job and decides, based on plugin behavior, to which index writer a document should be routed. |
org.apache.nutch.exchange.jexl | Exchange component plugin based on JEXL expressions. |
org.apache.nutch.indexer | Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index. |
org.apache.nutch.indexer.anchor | An indexing plugin for inbound anchor text. |
org.apache.nutch.indexer.basic | A basic indexing plugin that adds basic fields: url, host, title, content, etc. |
org.apache.nutch.indexer.feed | Indexing filter to index metadata from RSS feeds. |
org.apache.nutch.indexer.filter | |
org.apache.nutch.indexer.geoip | This plugin implements an indexing filter which takes advantage of the GeoIP2-java API. |
org.apache.nutch.indexer.jexl | This plugin implements a dynamic indexing filter which uses JEXL expressions to allow filtering based on the page's metadata. |
org.apache.nutch.indexer.links | |
org.apache.nutch.indexer.metadata | Indexing filter to add document metadata to the index. |
org.apache.nutch.indexer.more | An indexing plugin that adds "more" index fields: last modified date, MIME type, content length. |
org.apache.nutch.indexer.replace | Indexing filter to allow pattern replacements on metadata. |
org.apache.nutch.indexer.staticfield | A simple plugin called at indexing that adds fields with static data. |
org.apache.nutch.indexer.subcollection | Indexing filter to assign documents to subcollections. |
org.apache.nutch.indexer.tld | Top Level Domain Indexing plugin. |
org.apache.nutch.indexer.urlmeta | URL Meta Tag Indexing Plugin. |
org.apache.nutch.indexwriter.cloudsearch | |
org.apache.nutch.indexwriter.csv | Index writer plugin to write a plain CSV file. |
org.apache.nutch.indexwriter.dummy | Index writer plugin for debugging; writes pairs of <action, url> to a text file, where action is one of "add", "update", or "delete". |
org.apache.nutch.indexwriter.elastic | Index writer plugin for Elasticsearch. |
org.apache.nutch.indexwriter.kafka | Index writer plugin to produce JSON messages to Kafka. |
org.apache.nutch.indexwriter.rabbit | |
org.apache.nutch.indexwriter.solr | Index writer plugin for Apache Solr. |
org.apache.nutch.microformats.reltag | A microformats Rel-Tag Parser/Indexer/Querier plugin. |
org.apache.nutch.scoring | The ScoringFilter interface. |
org.apache.nutch.scoring.depth | Scoring filter to stop crawling at a configurable depth (number of "hops" from seed URLs). |
org.apache.nutch.scoring.link | Scoring filter used in conjunction with WebGraph. |
org.apache.nutch.scoring.opic | Scoring filter implementing a variant of the Online Page Importance Computation (OPIC) algorithm. |
org.apache.nutch.scoring.tld | Top Level Domain Scoring plugin. |
org.apache.nutch.scoring.urlmeta | URL Meta Tag Scoring Plugin. |
org.apache.nutch.tools | Miscellaneous tools. |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
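
The org.apache.nutch.scoring.depth entry above describes stopping a crawl at a configurable number of "hops" from the seed URLs. As a rough illustration of that idea only, the following self-contained Java sketch walks a small in-memory link graph breadth-first and does not follow links out of pages that already sit at the maximum depth. The class name DepthLimitSketch, the method crawl, the in-memory graph, and the choice to count seeds as depth 0 are all assumptions made for this example; it does not use or reproduce the Nutch ScoringFilter API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; this is not the Nutch scoring plugin. */
public class DepthLimitSketch {

    /**
     * Breadth-first walk over a link graph. Records, for every URL reached,
     * its depth in hops from the nearest seed (seeds are depth 0, an
     * assumption of this sketch) and never expands a page at maxDepth.
     */
    public static Map<String, Integer> crawl(Map<String, List<String>> links,
                                             List<String> seeds,
                                             int maxDepth) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String seed : seeds) {
            // putIfAbsent returns null only when the URL was not seen before.
            if (depth.putIfAbsent(seed, 0) == null) {
                queue.add(seed);
            }
        }
        while (!queue.isEmpty()) {
            String url = queue.poll();
            int d = depth.get(url);
            if (d >= maxDepth) {
                continue; // depth limit reached: do not follow outlinks
            }
            for (String out : links.getOrDefault(url, List.of())) {
                if (depth.putIfAbsent(out, d + 1) == null) {
                    queue.add(out);
                }
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // Toy chain a -> b -> c -> d; with maxDepth 2, "d" is never reached.
        Map<String, List<String>> links = new HashMap<>();
        links.put("a", List.of("b"));
        links.put("b", List.of("c"));
        links.put("c", List.of("d"));
        System.out.println(crawl(links, List.of("a"), 2)); // prints {a=0, b=1, c=2}
    }
}
```

In this sketch the limit is enforced when a page's outlinks are considered, so pages exactly at the maximum depth are still recorded but contribute no further URLs.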
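
Classes in org.apache.nutch.indexer used by org.apache.nutch.analysis.lang

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.any23

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |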
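
Classes in org.apache.nutch.indexer used by org.apache.nutch.exchange

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.exchange.jexl

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |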

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer

Class | Description |
---|---|
IndexingException | |
IndexWriterParams | |
IndexWriters | Creates and caches IndexWriter implementing plugins. |
NutchDocument | A NutchDocument is the unit of indexing. |
NutchField | This class represents a multi-valued field with a weight. |
NutchIndexAction | A NutchIndexAction is the new unit of indexing holding the document and action information. |
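
Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.anchor

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.basic

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.feed

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.filter

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.geoip

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.jexl

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.links

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.metadata

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.more

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.replace

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.staticfield

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.subcollection

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.tld

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexer.urlmeta

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |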
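
Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.cloudsearch

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.csv

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.dummy

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.elastic

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.kafka

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.rabbit

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.indexwriter.solr

Class | Description |
---|---|
IndexWriter | |
IndexWriterParams | |
NutchDocument | A NutchDocument is the unit of indexing. |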

Classes in org.apache.nutch.indexer used by org.apache.nutch.microformats.reltag

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |
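
Classes in org.apache.nutch.indexer used by org.apache.nutch.scoring

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.scoring.depth

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.scoring.link

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.scoring.opic

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.scoring.tld

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.scoring.urlmeta

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |

Classes in org.apache.nutch.indexer used by org.apache.nutch.tools

Class | Description |
---|---|
NutchDocument | A NutchDocument is the unit of indexing. |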

Classes in org.apache.nutch.indexer used by org.creativecommons.nutch

Class | Description |
---|---|
IndexingException | |
IndexingFilter | Extension point for indexing. |
NutchDocument | A NutchDocument is the unit of indexing. |

Copyright © 2021 The Apache Software Foundation