Package | Description |
---|---|
org.apache.nutch.protocol.htmlunit | Protocol plugin which supports retrieving documents via the http protocol. |
org.apache.nutch.protocol.http | Protocol plugin which supports retrieving documents via the http protocol. |
org.apache.nutch.protocol.http.api | Common API used by HTTP plugins (http, httpclient) |
org.apache.nutch.protocol.httpclient | Protocol plugin which supports retrieving documents via the HTTP and HTTPS protocols, optionally with Basic, Digest and NTLM authentication schemes for web server as well as proxy server. |
org.apache.nutch.protocol.interactiveselenium | Protocol plugin which supports retrieving documents via selenium. |
org.apache.nutch.protocol.okhttp | Protocol plugin based on okhttp, supports http, https, http/2. |
org.apache.nutch.protocol.selenium | Protocol plugin which supports retrieving documents via selenium. |
Modifier and Type | Class and Description |
---|---|
class | HttpResponse: An HTTP response. |
Modifier and Type | Method and Description |
---|---|
protected Response | Http.getResponse(URL url, CrawlDatum datum, boolean redirect) |
Modifier and Type | Method and Description |
---|---|
protected Response | Http.getResponse(URL url, CrawlDatum datum, boolean redirect) |
Modifier and Type | Method and Description |
---|---|
protected abstract Response | HttpBase.getResponse(URL url, CrawlDatum datum, boolean followRedirects) |
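
The abstract `HttpBase.getResponse` listed above is overridden by each plugin's `Http.getResponse` (or `OkHttp.getResponse`). The following is a minimal, self-contained sketch of that relationship, not Nutch code: the `Response` interface, `BaseProtocol`, `CannedProtocol`, and the `describe` helper are hypothetical stand-ins, and a plain `Map` replaces `CrawlDatum` so the example compiles without Nutch on the classpath.

```java
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class GetResponseSketch {

    /** Hypothetical stand-in for the Response type; not the real Nutch interface. */
    interface Response {
        URL getUrl();
        int getCode();
        byte[] getContent();
    }

    /** Hypothetical stand-in for HttpBase: declares the abstract hook. */
    abstract static class BaseProtocol {
        // Mirrors the shape of HttpBase.getResponse(URL, CrawlDatum, boolean);
        // the Map parameter is an assumption replacing CrawlDatum.
        protected abstract Response getResponse(URL url, Map<String, String> datum,
                                                boolean followRedirects) throws Exception;

        /** Shared logic that relies on the subclass-provided getResponse. */
        public String describe(URL url) throws Exception {
            Response r = getResponse(url, new HashMap<>(), false);
            return r.getCode() + " " + r.getUrl() + " (" + r.getContent().length + " bytes)";
        }
    }

    /** Hypothetical plugin that returns a canned response instead of using the network. */
    static class CannedProtocol extends BaseProtocol {
        @Override
        protected Response getResponse(URL url, Map<String, String> datum,
                                       boolean followRedirects) {
            final byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
            return new Response() {
                @Override public URL getUrl() { return url; }
                @Override public int getCode() { return 200; }
                @Override public byte[] getContent() { return body; }
            };
        }
    }

    public static void main(String[] args) throws Exception {
        BaseProtocol protocol = new CannedProtocol();
        System.out.println(protocol.describe(URI.create("http://example.org/").toURL()));
    }
}
```

In this sketch the base class owns the shared logic and each subclass supplies only the fetch step, which is the same division of labour the method tables on this page suggest for `HttpBase` and the protocol plugins.
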
Modifier and Type | Method and Description |
---|---|
protected void | HttpRobotRulesParser.addRobotsContent(List<Content> robotsTxtContent, URL robotsUrl, Response robotsResponse): Appends the Content of robots.txt to robotsTxtContent. |
Modifier and Type | Method and Description |
---|---|
protected Response | Http.getResponse(URL url, CrawlDatum datum, boolean redirect): Fetches the url with a configured HTTP client and gets the response. |
Modifier and Type | Method and Description |
---|---|
protected Response | Http.getResponse(URL url, CrawlDatum datum, boolean redirect) |
Modifier and Type | Class and Description |
---|---|
class | OkHttpResponse |
Modifier and Type | Method and Description |
---|---|
protected Response | OkHttp.getResponse(URL url, CrawlDatum datum, boolean redirect) |
Modifier and Type | Method and Description |
---|---|
protected Response | Http.getResponse(URL url, CrawlDatum datum, boolean redirect) |
Copyright © 2021 The Apache Software Foundation