-
Notifications
You must be signed in to change notification settings - Fork 274
Remove unused code paths and switch the policy object to a policy builder #4791
Remove unused code paths and switch the policy object to a policy builder #4791
Conversation
cba5b7c
to
93e3857
Compare
Codecov Report
@@ Coverage Diff @@
## main #4791 +/- ##
==========================================
- Coverage 68.96% 68.64% -0.33%
==========================================
Files 227 223 -4
Lines 16454 16266 -188
==========================================
- Hits 11348 11166 -182
+ Misses 5054 5049 -5
+ Partials 52 51 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
…lder. Signed-off-by: Sean Teeling <seanteeling@microsoft.com>
93e3857
to
6062c30
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see too many parts of the code being changed without a clear motivation being described in the commit message. It's hard for me to understand why significant parts of the RBAC policy builder + traffic policy helpers are being refactored.
My request to you is:
- Clearly describe the motivation of the PR in the commit and PR descriptions
- Break this PR into smaller independent units (removal of unused code, traffic policy, RBAC etc.)
- Open a Github issue for larger refactors so they can be tracked and discussed
A net 650 lines of code removed is a big benefit in my opinion. All of the refactored code is due to unused code paths, or inlining functions that are called in one place (particularly after portions of that function that are not used are removed). The net result is a significant improvement in readability, simplicity, with an added benefit of less bytes used in the envoy config itself :) |
I am not disputing the removal of unused code, rather pointing out that there's not enough context to review these changes effectively. For example, the RBAC policy code is being refactored, yet there isn't any indication on why this is necessary or if there are existing bugs. It would be a lot easier to review by breaking this change up into smaller units, each potentially addressing a localized problem. |
It's mostly for readability. I have to make a couple small changes here with respect to the trust domain, and ended up spending a lot more time than I anticipated in simply understanding the code. In my experience this understanding will last for about a month, and then I'll need to refresh myself. The goal here is purely around simplicity. I can add more context, but I'd expect that this PR is simple enough to remain as a single PR vs splitting it up |
I suggest not removing what is already supported. There is a reason that And/Or rules was implemented for permissions and principals, so that we could support more fine grained RBAC on attributes other than just service-account, eg. source/destination IP/port, L7 headers etc. I am not in favor of refactoring out this functionality for the purpose of integrating the trust domain work. Instead, I'd like to see the trust domain transparently plumbed using the existing code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the transition to a builder pattern here, but I'm hesitant to remove the code powering AND semantics. Just because SMI doesn't really support it now doesn't mean customers don't want/need that policy crafting tool
FWIW i wrote this PR in < a day, and I'd estimate re-adding that functionality to take a similar amount of time. However we read code much more than we write it, so I think removing unused features, when it would be trivial to add them back, makes sense. Especially since the and/or rules, as written are limited, in that they don't support nesting. If we do need AND later, there's no promise that the current implementation would even support the need. I'd love to get more thoughts here, especially around the idea of maintaining unused code, particularly since I'll be adding on top of this, so essentially we're adding code on top of tech debt. |
A lot of effort went into ensuring the given RBAC policy generator supports basic AND/OR semantics to accommodate future use cases beyond just the service principal. We didn't need complex nesting semantics, so it wasn't an immediate priority back then. I am not in favor of refactoring this code and removing core functionality just to plumb the trust domain work, which I don't believe requires significant refactoring of this code path. If this refactor was done without removing the generic AND/OR support for readability, then it would be more appealing to me. But as it stands, I am not in favor of this change. |
I'd say YAGNI should be a leading principle here. Since this seems to be relatively subjective I'd like to open it up to others to get opinions before reverting if that's ok |
@nojnhuh @trstringer @jaellio pinging other maintainers for opinions on above |
Adding to this argument, I have some proposals to remove most of the logic from the trafficpolicy objects like Rules. Instead, the meshcatalog should present more normalized data, and allow builders (like policy.Builder), to specify the rules. So instead of
Where rules contains business logic on how the rules were applied (or at least needed a lot of business logic to construct), we would instead do: builder.SetApplyPermissionsWithAnd(true)
// or
builder.SetAndRules(rules) This allows for that flexible nesting that is currently not possible, removes this "double" translation logic that is currently happening, simplifies the code base, and is trivial to implement. With that system I double down that we don't need to maintain unused AND logic when we don't use it. If for nothing else that
Again pinging the rest of the maintainers for there thoughts here As a middle ground, I'd be happy to provide that toggle on the policy builder, but I don't think it belongs in the mesh catalog |
As a final point of data, why does the mesh catalog know that a port translates into a rule? We should follow the rule of single responsbility, where the mesh catalog is aware about the state of the mesh, not how that translates to an envoy config (which is the responsbility of the builders). Instead, it should be supplying available ports, not a set of AND/OR rules. That places envoy config logic in multiple places. Again, happy to implement on the policy builder side but I strongly believe adding this logic within the mesh catalog would take us in the wrong direction |
My primary concern pertains to removing the capability to remove AND/OR rule semantics from the existing RBAC code. I am not against refactoring this code to use a data model. As we are adding more complex features, I can imagine us needing to use AND/OR semantics for RBAC rules, as the API is complex due to nesting of protos. I also believe we are not going to expose nested APIs to users, so top level AND/OR support should be good enough (e.g. filter on source IP AND destination port, or filter on source IP OR source principal, etc.). Further, I understand that you are proposing a design change to the MeshCatalog. I am not against this either. I want to point out that the existing design of MeshCatalog is to translate k8s config to objects that can easily be translated to XDS objects, vs needing to call into k8s APIs from |
In general, I'm in favor of docs or other formats as a method of detailing technical goals maintainers & other community members have for a project's code base. These kinds of docs serve as a guiding light to folks working on the project, and when a group of people have an explicit, common goal, accomplishing that goal tends to be smoother. |
I think we're on the same page :) I just added a commit that includes the AND semantics to the policy portion, PTAL. Agreed, I have a doc I'll be sharing with the team that lays out some founding principals to help make decision on how this code should be structured going forward |
Signed-off-by: Sean Teeling <seanteeling@microsoft.com>
3455cd8
to
31ff5e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not comfortable with the changes made to the RBAC code. Here's a suggestion, let me know if that works.
- Refactor the existing policy code in a way where the test expectations remain the same. Currently, I see a lot of test code being updated, so it's extremely hard to spot whether the rules are applied exactly as before (defaultin to OR). If this is not the case, then this PR title is misleading. It would not be removing unusued code paths, but rather changing the behavior of the existing RBAC policies, which I believe should be a separate change in itself.
- In a follow-up change, migrate from OR rules to top-level rules.
This way, there is an easier path to making the refactoring without changing the underlying behavior.
pkg/catalog/traffictarget_test.go
Outdated
@@ -507,7 +507,7 @@ func TestListInboundTrafficTargetsWithRoutes(t *testing.T) { | |||
}, | |||
TCPRouteMatches: []trafficpolicy.TCPRouteMatch{ | |||
{ | |||
Ports: []int{8000, 9000}, | |||
Ports: []uint32{8000, 9000}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these be uint16 instead as port ranges are from 0-65535
}, | ||
}, | ||
}, | ||
|
||
expectedPolicy: &xds_rbac.Policy{ | ||
Permissions: []*xds_rbac.Permission{ | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above.
Discussed this offline with @steeling. Although this change moves from OR based fields to top-level fields for both principals and permissions, by default Envoy uses OR based rules for top-level fields: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/rbac/v3/rbac.proto#envoy-v3-api-msg-config-rbac-v3-policy |
Signed-off-by: Sean Teeling <seanteeling@microsoft.com>
852c7e8
to
37506ce
Compare
Remove large swaths of code by removing unused code paths and merging some of the resulting small functions into their callers.
This PR adds the following simplifications (with justification provided) around building RBAC rules from the policy object:
AddRule
, which is only called in a single instance, where there are no pre-existing rules (permissive mode).getTrafficSpecName
from a method to a function. This provides additional clarity to the reader since the reader knows the method getTrafficSpecName only operates on its parameters, and not on the fields of the method's instance.ANY
, whereas before it could have bothANY
, and specific principals.allowPartialHostnamesMatch
, since only a single value was ever provided.