Package | Description |
---|---|
org.apache.nutch.protocol | Classes related to the Protocol interface, see also org.apache.nutch.net.protocols. |
org.apache.nutch.protocol.file | Protocol plugin which supports retrieving local file resources. |
org.apache.nutch.protocol.ftp | Protocol plugin which supports retrieving documents via the ftp protocol. |
org.apache.nutch.protocol.http.api | Common API used by HTTP plugins (http, httpclient) |
org.apache.nutch.util | Miscellaneous utility classes. |
Modifier and Type | Method and Description |
---|---|
ProtocolOutput | Protocol.getProtocolOutput(Text url, CrawlDatum datum): Returns the Content for a fetchlist entry. |
Modifier and Type | Method and Description |
---|---|
ProtocolOutput | File.getProtocolOutput(Text url, CrawlDatum datum): Creates a FileResponse object corresponding to the url and returns a ProtocolOutput object as per the content received. |
Modifier and Type | Method and Description |
---|---|
ProtocolOutput | Ftp.getProtocolOutput(Text url, CrawlDatum datum): Creates a FtpResponse object corresponding to the url and returns a ProtocolOutput object as per the content received. |
Modifier and Type | Method and Description |
---|---|
ProtocolOutput | HttpBase.getProtocolOutput(Text url, CrawlDatum datum) |
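
The methods listed above share one shape: each protocol plugin takes a URL and returns a ProtocolOutput. As a rough illustration of that pattern only, the sketch below uses simplified stand-in types and dispatches on the URL scheme. Fetcher, Output, REGISTRY, schemeOf, fetch and the status strings are all hypothetical names made up for this example; they are not Nutch classes, and the sketch assumes nothing about how Nutch itself selects a plugin.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class ProtocolDispatchSketch {

    /** Hypothetical stand-in for a fetch result: content plus a status label. */
    static final class Output {
        final String content;
        final String status;

        Output(String content, String status) {
            this.content = content;
            this.status = status;
        }
    }

    /** Hypothetical stand-in for a protocol plugin: one method, URL in, result out. */
    interface Fetcher {
        Output getOutput(String url);
    }

    /** One stand-in plugin per URL scheme (assumption: lookup is by scheme only). */
    static final Map<String, Fetcher> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("file", url -> new Output("local:" + url, "SUCCESS"));
        REGISTRY.put("ftp", url -> new Output("ftp:" + url, "SUCCESS"));
    }

    /** Returns the lower-cased text before the first ':' or "" if there is none. */
    static String schemeOf(String url) {
        int colon = url.indexOf(':');
        return colon < 0 ? "" : url.substring(0, colon).toLowerCase(Locale.ROOT);
    }

    /** Picks the stand-in plugin for the URL's scheme and delegates to it. */
    static Output fetch(String url) {
        Fetcher fetcher = REGISTRY.get(schemeOf(url));
        if (fetcher == null) {
            // No plugin registered for this scheme: report it instead of throwing.
            return new Output(null, "NO_PLUGIN");
        }
        return fetcher.getOutput(url);
    }

    public static void main(String[] args) {
        System.out.println(fetch("file:/tmp/a.txt").status);
        System.out.println(fetch("gopher://example.org/").status);
    }
}
```

Returning a result object that carries a status, instead of throwing for an unsupported scheme, keeps the caller's handling uniform across plugins; that is a design choice of this sketch, not a statement about the Nutch implementation.
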
Modifier and Type | Method and Description |
---|---|
protected ProtocolOutput | AbstractChecker.getProtocolOutput(String url, CrawlDatum datum, boolean checkRobotsTxt) |
Copyright © 2021 The Apache Software Foundation