httpClient.Transport #9

@ghost

Description

I'm having trouble changing the HTTP client's transport, for two reasons:

  • httpClient (ext.go:120) is private
  • DefaultExtender.Fetch uses it

In my case I needed to change the standard transport to add timeouts:

client.Transport = &http.Transport{
    Dial: func(netw, addr string) (net.Conn, error) {
        // One deadline covers both the dial and all later I/O on the connection.
        deadline := time.Now().Add(ext.Info.Timeout)
        c, err := net.DialTimeout(netw, addr, ext.Info.Timeout)
        if err != nil {
            return nil, err
        }
        if err := c.SetDeadline(deadline); err != nil {
            c.Close()
            return nil, err
        }
        return c, nil
    },
}
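To make the effect of such a transport easy to check, here is a minimal, self-contained sketch that runs it against a local test server. The names `newTimeoutTransport` and `tryFetch` are hypothetical and not part of the package; the timeout is passed in directly instead of coming from `ext.Info.Timeout`.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// newTimeoutTransport (hypothetical name) returns a transport whose
// connections must be dialed within timeout and stop doing I/O once the
// same absolute deadline has passed.
func newTimeoutTransport(timeout time.Duration) *http.Transport {
	return &http.Transport{
		Dial: func(netw, addr string) (net.Conn, error) {
			deadline := time.Now().Add(timeout)
			c, err := net.DialTimeout(netw, addr, timeout)
			if err != nil {
				return nil, err
			}
			if err := c.SetDeadline(deadline); err != nil {
				c.Close()
				return nil, err
			}
			return c, nil
		},
	}
}

// tryFetch (hypothetical helper) starts a local server whose handler
// sleeps for delayMs, then fetches it through a client limited to
// timeoutMs. It returns the error from the request, if any.
func tryFetch(delayMs, timeoutMs int) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(time.Duration(delayMs) * time.Millisecond)
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	tr := newTimeoutTransport(time.Duration(timeoutMs) * time.Millisecond)
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	res, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	return res.Body.Close()
}

func main() {
	fmt.Println("fast server error:", tryFetch(0, 2000))
	fmt.Println("slow server error:", tryFetch(300, 50))
}
```

One thing to keep in mind with this approach: the deadline is absolute and set once at dial time, so a kept-alive connection that gets reused after the deadline will fail as well.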

As things stand, I ended up copy-pasting your default 'Fetch' code and the 'CheckRedirect' part of your httpClient, both as-is. I also had to copy-paste your 'isRobotsTxt' func :)

I think it would be better to allow changing just parts of this logic, without having to rewrite all of it.

For example, for the transport it would be enough to add a Transport field to Options and use it where the client is instantiated (which probably means moving the client from a package-level var into DefaultExtender):

httpClient = &http.Client{
    Transport: options.Transport,
    CheckRedirect: func(req *http.Request, via []*http.Request) error {
        // For robots.txt URLs, allow up to 10 redirects, like the default http client.
        // Rationale: the site owner explicitly tells us that this specific robots.txt
        // should be used for this domain.
        if isRobotsTxtUrl(req.URL) {
            if len(via) >= 10 {
                return errors.New("stopped after 10 redirects")
            }
            return nil
        }

        // For all other URLs, do NOT follow redirections, the default Fetch() implementation
        // will ask the worker to enqueue the new (redirect-to) URL. Returning an error
        // will make httpClient.Do() return a url.Error, with the URL field containing the new URL.
        return &EnqueueRedirectError{"redirection not followed"}
    },
}
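A rough, self-contained sketch of how that could fit together is below. Everything in it is an assumption made for illustration: `Options`, `DefaultExtender`, `EnqueueRedirectError` and `isRobotsTxtUrl` are simplified stand-ins for the package's own definitions, and `NewDefaultExtender` and `redirectErr` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// Options is a stand-in for the package's options type, with the
// proposed Transport field. A nil Transport means http.DefaultTransport.
type Options struct {
	Transport http.RoundTripper
}

// EnqueueRedirectError is a stand-in for the package's error type.
type EnqueueRedirectError struct{ msg string }

func (e *EnqueueRedirectError) Error() string { return e.msg }

// isRobotsTxtUrl is a simplified stand-in for the package's helper.
func isRobotsTxtUrl(u *url.URL) bool {
	return u != nil && strings.ToLower(u.Path) == "/robots.txt"
}

// DefaultExtender owns its client instead of using a package-level var.
type DefaultExtender struct {
	client *http.Client
}

// NewDefaultExtender (hypothetical constructor) builds the client from
// the options, keeping the existing redirect policy unchanged.
func NewDefaultExtender(opts *Options) *DefaultExtender {
	return &DefaultExtender{client: &http.Client{
		Transport: opts.Transport,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			if isRobotsTxtUrl(req.URL) {
				if len(via) >= 10 {
					return errors.New("stopped after 10 redirects")
				}
				return nil
			}
			return &EnqueueRedirectError{"redirection not followed"}
		},
	}}
}

// redirectErr (hypothetical helper) reports what the extender's redirect
// policy says about a first redirect to rawURL.
func redirectErr(e *DefaultExtender, rawURL string) error {
	req, err := http.NewRequest("GET", rawURL, nil)
	if err != nil {
		return err
	}
	return e.client.CheckRedirect(req, nil)
}

func main() {
	tr := &http.Transport{} // any http.RoundTripper, e.g. one with timeouts
	ext := NewDefaultExtender(&Options{Transport: tr})

	fmt.Println("custom transport in use:", ext.client.Transport == tr)
	fmt.Println("robots.txt redirect:", redirectErr(ext, "http://example.com/robots.txt"))
	fmt.Println("page redirect:", redirectErr(ext, "http://example.com/page"))
}
```

With something along these lines, a caller would only supply the transport, and the redirect handling and robots.txt check would stay in one place.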

I'm not sure whether this is already part of your package reorganization plans, but I thought I should submit it just in case.
