Description
I'm currently having a small problem changing the HTTP client's transport. Two things cause it:
- httpClient (ext.go:120) is private
- DefaultExtender.Fetch uses it
In my case I needed to change the standard transport to add timeouts:
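    client.Transport = &http.Transport{
        // Dial with a connect timeout, then cap the connection's total lifetime.
        Dial: func(netw, addr string) (net.Conn, error) {
            deadline := time.Now().Add(ext.Info.Timeout)
            c, err := net.DialTimeout(netw, addr, ext.Info.Timeout)
            if err != nil {
                return nil, err
            }
            // Absolute deadline covering all reads and writes on this connection.
            c.SetDeadline(deadline)
            return c, nil
        },
    }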
As things stand, I ended up copy-pasting your default 'Fetch' code and the 'CheckRedirect' part of your httpClient, both as-is. I also had to copy-paste your 'isRobotsTxt' func :)
But I think it would be better to allow changing just parts of this logic without rewriting all of it.
For example, for the transport, it would be enough to add a Transport field to Options and use it in the instantiation code (which should then probably move from the package-level vars into DefaultExtender):
    httpClient = &http.Client{Transport: options.Transport, CheckRedirect: func(req *http.Request, via []*http.Request) error {
        // For robots.txt URLs, allow up to 10 redirects, like the default http client.
        // Rationale: the site owner explicitly tells us that this specific robots.txt
        // should be used for this domain.
        if isRobotsTxtUrl(req.URL) {
            if len(via) >= 10 {
                return errors.New("stopped after 10 redirects")
            }
            return nil
        }
        // For all other URLs, do NOT follow redirections, the default Fetch() implementation
        // will ask the worker to enqueue the new (redirect-to) URL. Returning an error
        // will make httpClient.Do() return a url.Error, with the URL field containing the new URL.
        return &EnqueueRedirectError{"redirection not followed"}
    }}
I'm not sure whether this is already part of your package reorganization plans, but I thought I should submit it just in case.