Skip to content

Document exact semantics of DNS interfaces #152

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 15, 2025

Conversation

gucci-on-fleek
Copy link
Collaborator

@gucci-on-fleek gucci-on-fleek commented Mar 9, 2025

I'm posting these documentation suggestions as a draft PR, as suggested in #145 (comment). I'll discuss the specific changes below, but some more general comments:

  1. I've generally tried to document this “how they work right now” rather than “how they should work”, although I haven't actually looked through very many provider implementations, so some of my assumptions may be incorrect.
  2. I have very little Go experience, so I'm not very familiar with the Go documentation conventions.
  3. I'm just a hobbyist and certainly not a DNS expert, so if something looks incorrect, then it's probably me that made the mistake.

Thanks!

Closes #145.

libdns.go Outdated
@@ -43,7 +47,8 @@ import (

// RecordGetter can get records from a DNS zone.
type RecordGetter interface {
// GetRecords returns all the records in the DNS zone.
// GetRecords returns all the records in the DNS zone. DNSSEC-specific
// records may or may not be included in an implementation-defined way.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rationale: Most providers implement DNSSEC automatically (or not at all) rather than expecting users to manually sign their zone. Because of this, many providers don't include the DNSSEC records in their standard "get zone" API calls, even though they're technically just standard records in the zone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point; maybe we should be even clearer about this so users don't have to change their code based on the specific providers being used (which may not even be known).

Gah, this is a frustrating one. If a DNS provider expects users to configure DNSSEC manually (via actual DNS records), then the API needs to support this. If it abstracts away those records (which it sounds like many do), then it's annoying to have every application's code potentially filter them. But then again, it's annoying to expect all DNS providers to work the same in this regard. Both adding and stripping DNSSEC records artificially is annoying and error-prone.

Maybe we should clarify DNSSEC records usually aren't exposed via this API, but provider implementations should clearly document if DNSSEC records are included. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should clarify DNSSEC records usually aren't exposed via this API, but provider implementations should clearly document if DNSSEC records are included. What do you think?

One possible problem is that even if GetRecords returns DNSSEC records, you might not be able to set them using SetRecords. That's the case at least for libdns/rfc2136 + Knot (which is what I'm using). So if we say that it's implementation defined behaviour, then

recs, err := GetRecords(ctx, "example.com")
recs, err = SetRecords(ctx, "example.com", recs)

will work for some implementations and fail for others. But if we say "you must always filter out DNSSEC records", then it would be impossible to do anything involving DNSSEC.

Copy link
Contributor

@mholt mholt Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it definitely feels weird if that code doesn't work doesn't it... I feel like that should work.

What if the provider's DNSSEC behavior can be configured? Somewhere, there is a concrete type that the user has configured (with API key, etc). By default, all DNSSEC gets filtered out by all providers -- that way, calls always work reliably. But if a provider wanted to enable it, it could be configurable.

(If we do this, we should document what DNSSEC records look like. Maybe even provide some boilerplate code for them or an if statement or something. But I'm not familiar with DNSSEC records... what would they look like?)

Comment on lines -72 to -75
// Records that have an ID associating it with a particular resource
// on the provider will be directly replaced. If no ID is given, this
// method may use what information is given to do lookups and will
// ensure that only necessary changes are made to the zone.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rationale: We want for the interfaces to have consistent semantics between different providers, but not all providers give access to something like a "record ID", so there's no way for something like this to give consistent results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think record IDs are crucial, when available, and should be honored, so I wonder if a "simple" algorithm for identifying records would be useful.

  • If an ID is set, the method affects only those records.
  • Otherwise, a record is identified by its Name+Type ("RRset" -- which the RFC defines as a tuple of "label, class, and type" but I'm not sure what "class" in the spec means. I think label must be the name.)
  • For DeleteRecords: if a Value is set, only matching records with that value are deleted.

Or, the alternative is to introduce a third parameter for SetRecords and DeleteRecords that is exclusively for identifying records, rather than doubling up the Records input to both identify and mutate.

Since we're not 1.0 yet, I'm okay with breaking changes, but I'd like to only do it once at this point.

If we introduce the third parameter, I'd probably have to go around to the top repos, at least, and either make the patch for them, or notify them of the upcoming change, as a courtesy.

My preferred approach is that we recommend an ID-ing algorithm so that libraries can update when they are able without major breakage. (Most existing uses of the APIs won't be affected by these changing semantics.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I'm not sure what "class" in the spec means

The “Class” can either be IN for Internet, CH for Chaosnet, or HS for Hesiod. I know of exactly 1 modern use for CH; otherwise, the class will always be IN.

I do think record IDs are crucial, when available, and should be honored, so I wonder if a "simple" algorithm for identifying records would be useful.

  • If an ID is set, the method affects only those records.

  • Otherwise, a record is identified by its Name+Type ("RRset" -- which the RFC defines as a tuple of "label, class, and type" but I'm not sure what "class" in the spec means. I think label must be the name.)

The problem is that this could be really dangerous for providers that don't implement IDs. Suppose that you're making a new go-spf program that sets SPF records for you. Then if you wanted to add an “append” command that adds additional domains to the current SPF record, one possible implementation could look like this:

func AppendSpf(zone string, domain string, contents string) {
    ctx = Context.TODO()
    recs, err := provider.GetRecords(ctx, zone)

    var target libdns.Record
    var value string
    for _, rec := range records {
        if rec.Name == domain && rec.Type == "TXT" {
            value = rec.Value
            if strings.HasPrefix(rec.Value, "v=spf1") {
                target = rec
                break
            }
        }
    }

    value = strings.TrimSuffix(value, " ~all") + " " + contents + " ~all"
    target.Value = value
    provider.SetRecords(ctx, zone, []libdns.Record{target})
}

Then, if you're using a provider that sets IDs on every record, you'd test the program and everything would look good:

$ dig +noall +answer caddyserver.com txt
caddyserver.com.	300	IN	TXT	"zoho-verification=zb14563515.zmverify.zoho.com"
caddyserver.com.	300	IN	TXT	"v=spf1 include:zoho.com include:servers.mcsv.net ~all"

$ go-spf -zone "caddyserver.com" -domain "@" -append "a:example.com"

$ dig +noall +answer caddyserver.com txt
caddyserver.com.	300	IN	TXT	"zoho-verification=zb14563515.zmverify.zoho.com"
caddyserver.com.	300	IN	TXT	"v=spf1 include:zoho.com include:servers.mcsv.net a:example.com ~all"

But if I were then to run the program with a provider that doesn't support IDs, I'd get the following, which is definitely not good:

$ dig +noall +answer maxchernoff.ca txt
maxchernoff.ca.		86400	IN	TXT	"hosted-email-verify=ghwfivk6"
maxchernoff.ca.		86400	IN	TXT	"v=spf1 include:spf.migadu.com mx:tug.org ~all"
maxchernoff.ca.		86400	IN	TXT	"security_contact=mailto:security@maxchernoff.ca"
maxchernoff.ca.		86400	IN	TXT	"security_contact=https://www.maxchernoff.ca/#contact"
maxchernoff.ca.		86400	IN	TXT	"google-site-verification=gWIJ3Mg-zy1MuwAJHw8PkhOEENqOmLxUNslbQ4ZPfAE"

$ go-spf -zone "caddyserver.com" -domain "@" -append "a:example.com"

$ dig +noall +answer maxchernoff.ca txt
maxchernoff.ca.		86400	IN	TXT	"v=spf1 include:spf.migadu.com mx:tug.org a:example.com ~all"

Or, the alternative is to introduce a third parameter for SetRecords and DeleteRecords that is exclusively for identifying records, rather than doubling up the Records input to both identify and mutate.

Or maybe deprecate SetRecords and add two new interfaces SetRecordsByRrset and SetRecordsById, with only SetRecordsByRrset mandatory to implement? Or maybe even just deprecate SetRecords entirely and require users to only use AppendRecords and DeleteRecords?

If we introduce the third parameter, I'd probably have to go around to the top repos, at least, and either make the patch for them, or notify them of the upcoming change, as a courtesy.

Yes, for sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point.

What if records with IDs were rejected with an error if the provider does not support IDs? Obviously, IDs are the ideal way to manage records, so it'd be nice to make it possible to use them if available, but it's true that they aren't portable across providers.

(Note that I'm talking about cross-provider portability here, not cross-zone portability like I was with my argument for making record names zone-relative. My hunch is that moving records around across zones is fairly common, and moving records across providers is less common.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if records with IDs were rejected with an error if the provider does not support IDs?

The problem that I was suggesting was that if the maintainer of a module that uses libdns only ever tested it with ID-supporting providers, then the module might inadvertently delete records for non-ID-supporting providers. This would be easily noticed if the module maintainer tested with multiple providers, but most people only have accounts with 1 or 2 DNS providers at most, so this isn't that easy to do.

Or maybe I'm just misunderstanding something here? I haven't really used any of the libdns APIs myself, so this is quite likely.

Obviously, IDs are the ideal way to manage records, so it'd be nice to make it possible to use them if available,

Agreed. I wonder if it's possible to always synthesize an ID? Like, if the IDs are just an opaque string, could the ID just fallback to being the complete DNS record as a zone-file-formatted string?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem that I was suggesting was that if the maintainer of a module that uses libdns only ever tested it with ID-supporting providers, then the module might inadvertently delete records for non-ID-supporting providers. ...

Or maybe I'm just misunderstanding something here?

No, I do think you're right.

What if we document that IDs should be stripped if using records across providers? We could even provide a helper method, I dunno.

And providers should reject records with IDs if they don't support record IDs, by returning an error.

I just think for the use cases where records have IDs, that will be invaluable. But yeah, for portability, we need to strip IDs, but that's an explicit "I am porting records across providers" which probably won't happen super often.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we document that IDs should be stripped if using records across providers?

[...]

but that's an explicit "I am porting records across providers" which probably won't happen super often.

But even within one zone/provider, this might lead to issues. I think (although I may be misunderstanding) that what you're proposing is

  • For ID-supporting-providers, SetRecords only updates records with matching IDs
  • For non-ID-supporting-providers, SetRecords updates records with matching RRsets.

Each of these are both reasonable definitions for SetRecords, but they really have very little in common. If there were two separate functions SetRecordsById and SetRecordsByRRset, then I think that it would break most applications to randomly swap the definitions of those functions; so similarly, I think that having SetRecords map to either SetRecordsById or SetRecordsByRRset depending on the current provider will lead to bugs.

I guess what the best choice to use here depends on how many providers support record IDs. If most of them support IDs, then having SetRecords == SetRecordsById would probably make sense since the semantics of SetRecordsById are easier to reason about. This would be a problem for providers that don't support IDs, but it there's very few of them, then we could just require them to synthesize an ID and figure out how to handle that themselves. But if lots of providers don't support IDs, then having SetRecords == SetRecordsByRRset probably makes the most sense because SetRecordsByRRset is essentially the lowest-common-denominator that has to be supported by all providers.

I think that I'm probably just misunderstanding something here though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmmmmm! No, that makes sense, I'm glad you've thought about it more carefully than I have.

If I get you right, you're basically saying: "If there is an ID, use that as the ID. If there is not an ID, use the RRset as an ID."

There may be unpleasant surprises if a record has an ID but the provider doesn't support them (or vice-versa: a record doesn't have an ID but the provider supports them), whereas all records are part of an RRset.

So... gosh, I dunno, maybe we drop the idea of an ID field?

I almost wonder if an "equivalence" type or function could be defined to help customize what equality means when comparing DNS records. But that would be a matter for another discussion I think (unless you have some ideas on that now).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... gosh, I dunno, maybe we drop the idea of an ID field?

Yeah, I think that that's the only fully-portable option here.

I almost wonder if an "equivalence" type or function could be defined to help customize what equality means when comparing DNS records. But that would be a matter for another discussion I think (unless you have some ideas on that now).

Hmm, that's an interesting idea. I haven't thought about it too much though, so we can probably leave that for later.

libdns.go Outdated
Comment on lines 125 to 129
// DeleteRecords deletes the given records from the zone if they exist in
// the zone and exactly match the input. If the input records do not exist
// in the zone, they are silently ignored. `DeleteRecords` returns only the
// the records that were deleted, and does not return any records that were
// provided in the input but did not exist in the zone.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rationale: I think that this is the semantics intended by the current interfaces? But it wasn't obvious before if trying to delete a non-existent record would lead to an error or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think deleting should be idempotent, i.e. if it already doesn't exist, it's not an error.

In fact, I wonder if the semantics of Delete could be extended to include the Value of the records being passed in. In other words, "If the record exists and has this value, delete it." That should help prevent accidentally deleting records of the same name/type but maybe with a different value.

Of course, differentiating "empty value" and "value is irrelevant (nil)" will be difficult with the current API, since value is not a pointer. Maybe that's okay though since an empty record is basically as good as none right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think deleting should be idempotent, i.e. if it already doesn't exist, it's not an error.

👍

In fact, I wonder if the semantics of Delete could be extended to include the Value of the records being passed in. In other words, "If the record exists and has this value, delete it." That should help prevent accidentally deleting records of the same name/type but maybe with a different value.

That seems like a good idea to me; I'll add that in.

Of course, differentiating "empty value" and "value is irrelevant (nil)" will be difficult with the current API, since value is not a pointer. Maybe that's okay though since an empty record is basically as good as none right?

I think that that should be okay. The only record type that I'm aware of that's valid with an empty value is TXT, but an empty TXT is mostly useless, so this should be fine. Depending on how we resolve the "@"-vs-"" problem for the root domain, an empty name could be ambiguous, but I see no need to support “delete any records with this type/value, regardless of their name”. And a TTL of 0 seconds is also valid, but all records in an RRset are supposed to have the same TTL, so this should never be ambiguous.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!

Copy link
Contributor

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start! Thanks for taking initiative on this.

Some difficult questions to answer, for sure. Let's keep hashing this out!

I like the use of examples, nicely done on those.

libdns.go Outdated
@@ -43,7 +47,8 @@ import (

// RecordGetter can get records from a DNS zone.
type RecordGetter interface {
// GetRecords returns all the records in the DNS zone.
// GetRecords returns all the records in the DNS zone. DNSSEC-specific
// records may or may not be included in an implementation-defined way.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point; maybe we should be even clearer about this so users don't have to change their code based on the specific providers being used (which may not even be known).

Gah, this is a frustrating one. If a DNS provider expects users to configure DNSSEC manually (via actual DNS records), then the API needs to support this. If it abstracts away those records (which it sounds like many do), then it's annoying to have every application's code potentially filter them. But then again, it's annoying to expect all DNS providers to work the same in this regard. Both adding and stripping DNSSEC records artificially is annoying and error-prone.

Maybe we should clarify DNSSEC records usually aren't exposed via this API, but provider implementations should clearly document if DNSSEC records are included. What do you think?

libdns.go Outdated
Comment on lines 125 to 129
// DeleteRecords deletes the given records from the zone if they exist in
// the zone and exactly match the input. If the input records do not exist
// in the zone, they are silently ignored. `DeleteRecords` returns only the
// the records that were deleted, and does not return any records that were
// provided in the input but did not exist in the zone.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think deleting should be idempotent, i.e. if it already doesn't exist, it's not an error.

In fact, I wonder if the semantics of Delete could be extended to include the Value of the records being passed in. In other words, "If the record exists and has this value, delete it." That should help prevent accidentally deleting records of the same name/type but maybe with a different value.

Of course, differentiating "empty value" and "value is irrelevant (nil)" will be difficult with the current API, since value is not a pointer. Maybe that's okay though since an empty record is basically as good as none right?

libdns.go Outdated

// The `TTL` field specifies the time-to-live of the record. This is
// represented in the DNS as an unsigned integral number of seconds, but
// is provided here as a time.Duration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should mention that fractional seconds are truncated?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should mention that fractional seconds are truncated?

Good idea, I'll fix that in the next push. Also, I'll add a note saying that some providers might reject or silently increase TTLs that are considered to be “too low”.

Copy link
Contributor

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the discussion! I think we're making progress. Curious to know your thoughts.

I don't expect you to design a bunch of new struct types, but I think we are starting to converge on a plan for a stable 1.0 libdns API.

Comment on lines -72 to -75
// Records that have an ID associating it with a particular resource
// on the provider will be directly replaced. If no ID is given, this
// method may use what information is given to do lookups and will
// ensure that only necessary changes are made to the zone.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think record IDs are crucial, when available, and should be honored, so I wonder if a "simple" algorithm for identifying records would be useful.

  • If an ID is set, the method affects only those records.
  • Otherwise, a record is identified by its Name+Type ("RRset" -- which the RFC defines as a tuple of "label, class, and type" but I'm not sure what "class" in the spec means. I think label must be the name.)
  • For DeleteRecords: if a Value is set, only matching records with that value are deleted.

Or, the alternative is to introduce a third parameter for SetRecords and DeleteRecords that is exclusively for identifying records, rather than doubling up the Records input to both identify and mutate.

Since we're not 1.0 yet, I'm okay with breaking changes, but I'd like to only do it once at this point.

If we introduce the third parameter, I'd probably have to go around to the top repos, at least, and either make the patch for them, or notify them of the upcoming change, as a courtesy.

My preferred approach is that we recommend an ID-ing algorithm so that libraries can update when they are able without major breakage. (Most existing uses of the APIs won't be affected by these changing semantics.)

Copy link
Contributor

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great discussion!! We're getting close.

Comment on lines -72 to -75
// Records that have an ID associating it with a particular resource
// on the provider will be directly replaced. If no ID is given, this
// method may use what information is given to do lookups and will
// ensure that only necessary changes are made to the zone.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point.

What if records with IDs were rejected with an error if the provider does not support IDs? Obviously, IDs are the ideal way to manage records, so it'd be nice to make it possible to use them if available, but it's true that they aren't portable across providers.

(Note that I'm talking about cross-provider portability here, not cross-zone portability like I was with my argument for making record names zone-relative. My hunch is that moving records around across zones is fairly common, and moving records across providers is less common.)

@mholt
Copy link
Contributor

mholt commented Mar 14, 2025

Alrighty, so, to wrap this PR up for now, do you want to make any final adjustments if you have any you want to do, and then I'll merge this, then I'll open a new PR with the discussed changes / expansion to the library? Would be happy to have your review at that point too.

@gucci-on-fleek gucci-on-fleek force-pushed the documentation-updates branch from 1b96ca6 to 6c12460 Compare March 15, 2025 10:47
@gucci-on-fleek gucci-on-fleek marked this pull request as ready for review March 15, 2025 10:48
@gucci-on-fleek
Copy link
Collaborator Author

Alrighty, so, to wrap this PR up for now, do you want to make any final adjustments if you have any you want to do,

Alright, done. The content should be the same as before, but I went ahead and reformatted it so that it displays better on https://pkg.go.dev.

I'll open a new PR with the discussed changes / expansion to the library? Would be happy to have your review at that point too.

Sure, sounds good. Please @mention me when you do.

Thanks again for your helpful reviews/comments!

Copy link
Contributor

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! I'll get to work! (Might be after the weekend)

@mholt mholt merged commit 013384d into libdns:master Mar 15, 2025
@mholt mholt added the documentation Improvements or additions to documentation label Mar 15, 2025
mholt added a commit that referenced this pull request Mar 19, 2025
Initial draft for v1.0 pre-releases.

Follow-up from #152.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify semantics for SetRecords, GetRecords and Close
2 participants