Member

@jwendell jwendell commented Feb 22, 2021

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% accurate, meaning
that for a small window of time there might be two instances
acting as the leader - which could lead to duplicate routes
being created if a new gateway is created in that time frame -
we also change the way the Route name is created: instead of
setting the generateName field, we now explicitly pass a name to
the Route object to be created. Because the name is deterministic,
Route creation fails when there's already a Route object
with the same name (created by the other leader in that time frame).

@jwendell
Member Author

@rcernich @dgn @bison @luksa

You have more experience with controllers than I do, so I'd appreciate your input.

The problem arises when multiple replicas of istiod are in place.

IOR listens for Gateway changes and reacts to them. If there is more than one replica, each one reacts independently and might end up, for instance, creating duplicate routes when a Gateway is created.

This PR "almost" fixes this issue by making use of leader election: only one IOR (the leader) is active. The problem is what happens when the leader is demoted. Its stop signal is not closed immediately when a new leader is elected, meaning that, for a small amount of time (a few seconds), both istiod/IOR instances act as leader. So, if a gateway is created in that time window, both IORs will still react.

I talked to a few people, and it seems that having exactly one leader at all times is not guaranteed by the election system. So our code should be robust enough to deal with that scenario.

Do you have any suggestions here? Perhaps a whole refactoring of the IOR code? How?

@jwendell
Member Author

Perhaps a simpler solution would be to have a static name for the route, deterministic, instead of having a dynamic one. Currently we are making use of the GenerateName field passing only a prefix and letting OpenShift create the name, similar to pod names that derive from the deployment/replicaset name.

Thus, the second attempt to create the route will fail with a duplicate name.
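To illustrate why a deterministic name helps during the brief two-leader window, here is a minimal sketch. It does not use the real OpenShift client: `routeStore`, `ErrAlreadyExists`, and `createFromTwoLeaders` are hypothetical stand-ins for the API server, its "already exists" error, and two replicas that both believe they are the leader.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyExists is a hypothetical stand-in for the API server's
// "already exists" error.
var ErrAlreadyExists = errors.New("route already exists")

// routeStore is a hypothetical in-memory stand-in for the API server.
type routeStore struct {
	mu     sync.Mutex
	routes map[string]struct{}
}

func newRouteStore() *routeStore {
	return &routeStore{routes: make(map[string]struct{})}
}

// Create stores a route under namespace/name and rejects a second
// object with the same key, as a create with an explicit name would.
func (s *routeStore) Create(namespace, name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	key := namespace + "/" + name
	if _, ok := s.routes[key]; ok {
		return ErrAlreadyExists
	}
	s.routes[key] = struct{}{}
	return nil
}

// Len reports how many routes are stored.
func (s *routeStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.routes)
}

// createFromTwoLeaders simulates two replicas reacting to the same
// Gateway event with the same deterministic route name.
func createFromTwoLeaders(s *routeStore, namespace, name string) (created, rejected int) {
	var wg sync.WaitGroup
	var mu sync.Mutex
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			err := s.Create(namespace, name)
			mu.Lock()
			defer mu.Unlock()
			if err == nil {
				created++
			} else if errors.Is(err, ErrAlreadyExists) {
				rejected++
			}
		}()
	}
	wg.Wait()
	return created, rejected
}

func main() {
	s := newRouteStore()
	created, rejected := createFromTwoLeaders(s, "istio-system", "example-gateway-route")
	fmt.Printf("created=%d rejected=%d stored=%d\n", created, rejected, s.Len())
}
```

With a generated name, both calls would succeed under different names; with a fixed name, whichever call arrives second is rejected, so only one route ends up stored.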

@rcernich
Contributor

I think this is an all around good idea. Not having a consistent name makes the generated routes pretty useless, as you don't know what the name will be, so you can't have a static link for it.

@jwendell
Member Author

@rcernich I meant the Route object's name, not the hostname.

@bison
Contributor

bison commented Feb 24, 2021

Perhaps a simpler solution would be to have a static name for the route, deterministic, instead of having a dynamic one. Currently we are making use of the GenerateName field passing only a prefix and letting OpenShift create the name, similar to pod names that derive from the deployment/replicaset name.

I think this is getting closer to the correct solution, though we should still be doing leader election of course. Ultimately, we need the control loops to be properly idempotent.

To be honest, I'm not that familiar with how IOR is supposed to work. Does one Gateway always result in exactly one Route? I think GenerateName is intended for one-to-many relationships based on label selectors, e.g. Deployment > ReplicaSets > Pods. If it's one-to-one, then I agree we should just name it predictably.

@dgn
Contributor

dgn commented Feb 24, 2021

+1 for deterministic names. We can use e.g. the name and a hash of the Gateway resource as part of the name.

Update: actually that would result in new routes being created whenever we change a resource.. so maybe just namespace+name of the gateway resource?

@bison
Contributor

bison commented Feb 24, 2021

Update: actually that would result in new routes being created whenever we change a resource.. so maybe just namespace+name of the gateway resource?

Yeah, exactly. I think it's as simple as ${NAMESPACE}-${NAME} assuming they go in a different namespace than the corresponding Gateway. If they go in the same namespace, then literally just give them the exact same name.

@jwendell
Member Author

OK folks, I'll work on that. Just a quick note: it can't be just "ns+gw name", as a gateway can have more than one host, which means more than one route. I'll probably append the hostname too. I'll think about it and update this PR. Thanks.
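One possible shape for such a name, as a sketch only: the thread does not show the format the PR finally uses, so `routeName` and its `namespace-gateway-hash` layout are assumptions. The host is hashed rather than appended verbatim because Gateway hosts may contain characters such as `*` that are not valid in object names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// routeName derives a deterministic Route name from the Gateway's
// namespace, the Gateway's name, and one of its hosts. The format is
// hypothetical. The same inputs always give the same name, and each
// host of a Gateway gets its own name.
func routeName(namespace, gateway, host string) string {
	sum := sha256.Sum256([]byte(host))
	// Keep only the first 16 hex characters to keep the name short.
	suffix := hex.EncodeToString(sum[:])[:16]
	return fmt.Sprintf("%s-%s-%s", namespace, gateway, suffix)
}

func main() {
	for _, host := range []string{"a.example.com", "*.example.com"} {
		fmt.Println(routeName("bookinfo", "bookinfo-gateway", host))
	}
}
```

Unlike a hash of the whole Gateway resource, this name only changes when the namespace, Gateway name, or host changes. The sketch does not enforce the overall object-name length limit; a real implementation would need to truncate long namespace or Gateway names.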

@jwendell jwendell changed the title MAISTRA-2149: Make sure only the leader runs IOR MAISTRA-2149: Make IOR robust in multiple replicas Feb 24, 2021
@jwendell
Member Author

Just pushed the fix. I've updated the commit/PR description to reflect the changes.

@rcernich
Contributor

instead of hash, maybe the uid for the gateway object?

@jwendell
Member Author

instead of hash, maybe the uid for the gateway object?

A gateway can have more than 1 hostname. So, 1 Gateway -> N Routes

@jwendell
Member Author

/retest

@maistra-bot maistra-bot merged commit 430a16a into maistra:maistra-2.0 Mar 1, 2021
@maistra-bot

In response to a cherrypick label: #275 failed to apply on top of branch "maistra-2.1":

Applying: MAISTRA-2149: Make IOR robust in multiple replicas
Using index info to reconstruct a base tree...
M	pilot/pkg/bootstrap/configcontroller.go
M	pilot/pkg/config/kube/ior/ior.go
M	pilot/pkg/config/kube/ior/route.go
Falling back to patching base and 3-way merge...
Auto-merging pilot/pkg/config/kube/ior/route.go
Auto-merging pilot/pkg/config/kube/ior/ior.go
CONFLICT (content): Merge conflict in pilot/pkg/config/kube/ior/ior.go
Auto-merging pilot/pkg/bootstrap/configcontroller.go
CONFLICT (content): Merge conflict in pilot/pkg/bootstrap/configcontroller.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 MAISTRA-2149: Make IOR robust in multiple replicas
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

jwendell added a commit to jwendell/istio-maistra that referenced this pull request Mar 1, 2021
Use an exclusive leader ID for IOR

Manual cherrypick of maistra#275
luksa pushed a commit to luksa/istio-maistra that referenced this pull request Feb 7, 2022
* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% accurate, meaning
that for a small window of time there might be two instances
being the leader - which could lead to duplicated routes
being created if a new gateway is created in that time frame -
we also change the way the Route name is created: Instead of
having a generateName field, we now explicitly pass a name to
the Route object to be created. Being deterministic, it allows
the Route creation to fail when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API
server than necessary, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.
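The opt-out rule in the last item (MAISTRA-2205) could be sketched as follows. This is not the IOR code: `shouldManageRoutes` is a hypothetical helper, and the single `metadata` map stands in for wherever the two keys actually live on the Gateway. The exact string comparison against `false` is also an assumption.

```go
package main

import "fmt"

// manageRouteAnnotation is the opt-out key named in the commit message.
const manageRouteAnnotation = "maistra.io/manageRoute"

// shouldManageRoutes reports whether routes should be created and
// managed for a Gateway carrying the given metadata (hypothetical helper).
func shouldManageRoutes(metadata map[string]string) bool {
	// Explicit opt-out: the Gateway asks not to have its routes managed.
	if metadata[manageRouteAnnotation] == "false" {
		return false
	}
	// Egress gateways are not meant to have routes.
	if metadata["istio"] == "egressgateway" {
		return false
	}
	return true
}

func main() {
	fmt.Println(shouldManageRoutes(map[string]string{manageRouteAnnotation: "false"}))
	fmt.Println(shouldManageRoutes(map[string]string{"istio": "egressgateway"}))
	fmt.Println(shouldManageRoutes(map[string]string{"istio": "ingressgateway"}))
}
```

Gateways that do not set either key are managed by default, which matches the opt-out wording of the commit message.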
luksa pushed a commit to luksa/istio-maistra that referenced this pull request Feb 9, 2022
luksa pushed a commit to luksa/istio-maistra that referenced this pull request Feb 28, 2022
luksa pushed a commit to luksa/istio-maistra that referenced this pull request Mar 1, 2022
maistra-bot pushed a commit that referenced this pull request Mar 1, 2022
jewertow pushed a commit to jewertow/istio that referenced this pull request Jun 29, 2022
jewertow pushed a commit to jewertow/istio that referenced this pull request Jun 30, 2022
jewertow pushed a commit to jewertow/istio that referenced this pull request Jun 30, 2022
jewertow pushed a commit to jewertow/istio that referenced this pull request Jul 1, 2022
maistra-bot pushed a commit that referenced this pull request Jul 4, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (#135) (#240)

* MAISTRA-1400: Add IOR to Pilot (#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from #269.

* MAISTRA-2149: Make IOR robust in multiple replicas (#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, leader election is not 100% accurate: for a small window of
time there might be two instances acting as the leader, which could
lead to duplicate routes if a new gateway is created in that time
frame. We therefore change the way the Route name is created. Instead
of using the generateName field, we now explicitly pass a name to
the Route object to be created. Because the name is deterministic,
Route creation fails when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR

* Manual cherrypick of #275

* MAISTRA-1813: Add unit tests for IOR (#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API
server than necessary, to avoid regressions.
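
The refactoring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the IOR implementation: `fakeAPI`, `cache`, `newCache`, `onAdd` and `onDelete` are hypothetical names, and the asynchronous warm-up of the Gateway store is left out. The call counter shows how a test could assert that events do not trigger extra API calls:

```go
package main

import "fmt"

// fakeAPI counts list calls so a test can assert that no extra calls are made.
type fakeAPI struct {
	listCalls int
	gateways  []string
}

func (a *fakeAPI) list() []string {
	a.listCalls++
	return a.gateways
}

// cache keeps the known gateways in memory, keyed by "namespace/name".
type cache struct {
	known map[string]bool
}

// newCache performs the single initial synchronization against the API.
func newCache(api *fakeAPI) *cache {
	c := &cache{known: map[string]bool{}}
	for _, gw := range api.list() {
		c.known[gw] = true
	}
	return c
}

// Event handlers only touch the in-memory state; they make no API calls.
func (c *cache) onAdd(gw string)    { c.known[gw] = true }
func (c *cache) onDelete(gw string) { delete(c.known, gw) }

func main() {
	api := &fakeAPI{gateways: []string{"ns-a/gw1"}}
	c := newCache(api)
	c.onAdd("ns-b/gw2")
	c.onDelete("ns-a/gw1")
	fmt.Println(len(c.known), api.listCalls)
}
```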

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.
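
As a rough illustration of the opt-out check, assuming the annotation value is compared as the literal string `false` (the helper name `shouldManageRoute` is hypothetical, and the egress gateway check is not shown):

```go
package main

import "fmt"

// manageRouteAnnotation is the annotation key named in the commit message.
const manageRouteAnnotation = "maistra.io/manageRoute"

// shouldManageRoute reports whether routes should be managed for a Gateway
// with the given annotations. Only an explicit "false" opts out; a missing
// annotation keeps the default behaviour. Exact-match comparison is an
// assumption of this sketch.
func shouldManageRoute(annotations map[string]string) bool {
	return annotations[manageRouteAnnotation] != "false"
}

func main() {
	optedOut := map[string]string{manageRouteAnnotation: "false"}
	fmt.Println(shouldManageRoute(optedOut)) // opted out
	fmt.Println(shouldManageRoute(nil))      // no annotation: still managed
}
```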

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always deletes and
recreates routes when receiving an UPDATE event, this might lead to some
service downtime, given that for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <bavery@redhat.com>
Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>
jewertow added a commit to jewertow/istio that referenced this pull request Aug 23, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, leader election is not 100% accurate: for a small window of
time there might be two instances acting as the leader, which could
lead to duplicate routes if a new gateway is created in that time
frame. We therefore change the way the Route name is created. Instead
of using the generateName field, we now explicitly pass a name to
the Route object to be created. Because the name is deterministic,
Route creation fails when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API
server than necessary, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always deletes and
recreates routes when receiving an UPDATE event, this might lead to some
service downtime, given that for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <bavery@redhat.com>
Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when the hostname
is already taken). In this case the `syncRoute` object exists with zero
routes in it. Thus the loop is a no-op and the function wrongly returns
with an error of `eventDuplicatedMessage`. By comparing directly using
the `syncRoute.metadata` we fix this.

OSSM-1105: Support namespace portion in gateway hostnames

They are not used by routes, so we essentially ignore the namespace part,
that is, anything on the left side of a "namespace/hostname" string.
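
A small sketch of dropping the namespace portion, assuming the first slash separates namespace from hostname (the helper name `hostWithoutNamespace` is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// hostWithoutNamespace drops everything up to and including the first "/",
// so "bookinfo/example.com" and "example.com" yield the same hostname.
func hostWithoutNamespace(host string) string {
	if i := strings.Index(host, "/"); i >= 0 {
		return host[i+1:]
	}
	return host
}

func main() {
	for _, h := range []string{"bookinfo/example.com", "*/example.com", "example.com"} {
		fmt.Println(hostWithoutNamespace(h))
	}
}
```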
jewertow added a commit to jewertow/istio that referenced this pull request Aug 24, 2022
jewertow added a commit to jewertow/istio that referenced this pull request Aug 24, 2022
jewertow added a commit to jewertow/istio that referenced this pull request Aug 25, 2022
jewertow added a commit to jewertow/istio that referenced this pull request Aug 25, 2022
jewertow added a commit to jewertow/istio that referenced this pull request Aug 25, 2022
jewertow added a commit to jewertow/istio that referenced this pull request Aug 29, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% acurate, meaning
that for a small window of time there might be two instances
being the leader - which could lead to duplicated routes
being created if a new gateway is created in that time frame -
we also change the way the Route name is created: Instead of
having a generateName field, we now explicitly pass a name to
the Route object to be created. Being deterministic, it allows
the Route creation to fail when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests that make sure we do not make more calls to the API server than
the necessary were added, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always deletes and
recreates routes when receiving an UPDATE event, this might lead to some
service downtime, given that for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <bavery@redhat.com>
Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when hostname is already
taken). In this case the `syncRoute` object exists with zero routes in
it. Thus the loop is a no-op and the function wrongly returns with an
error of `eventDuplicatedMessage`. By comparing directly using the
`syncRoute.metadata` we fix this.

OSSM-1105: Support namespace portion in gateway hostnames

The namespace portion is not used by routes, so we essentially ignore it,
i.e., anything on the left side of the slash in a "namespace/hostname" string.
jewertow added a commit to jewertow/istio that referenced this pull request Aug 29, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% accurate, there might be
two instances acting as leader for a small window of time. If a new
gateway is created in that window, duplicate routes could be created.
To guard against this, we also change the way the Route name is
created: instead of using the generateName field, we now explicitly
pass a name to the Route object to be created. Because the name is
deterministic, Route creation fails when a Route object with the same
name already exists (created by the other leader in that time frame).
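
A minimal sketch of the deterministic-name idea, not the actual IOR code. The naming scheme (`routeName`, a truncated SHA-256 of namespace/gateway/host) and the in-memory `fakeRouteStore` are assumptions made for illustration; the store only imitates the "create fails if the name is taken" behaviour described above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the "already exists" error returned when
// an object with the same name is created twice.
var errAlreadyExists = errors.New("route already exists")

// routeName derives a stable name from the gateway identity and the host.
// Hypothetical scheme: the same inputs always yield the same name.
func routeName(namespace, gateway, host string) string {
	sum := sha256.Sum256([]byte(namespace + "/" + gateway + "/" + host))
	return namespace + "-" + gateway + "-" + hex.EncodeToString(sum[:8])
}

// fakeRouteStore is a sequential, in-memory stand-in for the Route API.
type fakeRouteStore struct {
	routes map[string]bool
}

func newFakeRouteStore() *fakeRouteStore {
	return &fakeRouteStore{routes: map[string]bool{}}
}

// create rejects a name that is already present.
func (s *fakeRouteStore) create(name string) error {
	if s.routes[name] {
		return errAlreadyExists
	}
	s.routes[name] = true
	return nil
}

func main() {
	store := newFakeRouteStore()
	name := routeName("istio-system", "my-gateway", "example.com")

	// Two overlapping leaders react to the same Gateway creation.
	for _, leader := range []string{"replica-a", "replica-b"} {
		err := store.create(name)
		switch {
		case err == nil:
			fmt.Println(leader, "created route", name)
		case errors.Is(err, errAlreadyExists):
			fmt.Println(leader, "found existing route, nothing to do")
		default:
			fmt.Println(leader, "failed:", err)
		}
	}
}
```

With a generated name, both creations would succeed and leave two routes; with a fixed name the second one is rejected.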

Use an exclusive leader ID for IOR

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API
server than necessary, to avoid regressions.
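
The shape of this refactoring can be sketched as follows. All names (`routeCache`, `gatewayEvent`, `listCalls`) are hypothetical and the "API server" is just a callback; the point is only that the expensive listing happens once, and events afterwards touch memory alone.

```go
package main

import "fmt"

type eventKind int

const (
	eventAdd eventKind = iota
	eventUpdate
	eventDelete
)

// gatewayEvent is a simplified Gateway notification.
type gatewayEvent struct {
	kind  eventKind
	key   string // "namespace/name" of the Gateway
	hosts []string
}

// routeCache keeps, per Gateway, the hosts that have routes.
type routeCache struct {
	hostsByGateway map[string][]string
	listCalls      int // number of simulated list calls to the API server
}

// initialSync populates the cache with a single listing. It must run
// before handle is called.
func (c *routeCache) initialSync(list func() map[string][]string) {
	c.listCalls++
	c.hostsByGateway = make(map[string][]string)
	for key, hosts := range list() {
		c.hostsByGateway[key] = append([]string(nil), hosts...)
	}
}

// handle updates the in-memory state without listing anything again.
func (c *routeCache) handle(ev gatewayEvent) {
	switch ev.kind {
	case eventAdd, eventUpdate:
		c.hostsByGateway[ev.key] = append([]string(nil), ev.hosts...)
	case eventDelete:
		delete(c.hostsByGateway, ev.key)
	}
}

func main() {
	cache := &routeCache{}
	cache.initialSync(func() map[string][]string {
		return map[string][]string{"ns1/gw1": {"a.example.com"}}
	})
	cache.handle(gatewayEvent{kind: eventAdd, key: "ns2/gw2", hosts: []string{"b.example.com"}})
	cache.handle(gatewayEvent{kind: eventDelete, key: "ns1/gw1"})
	fmt.Println("gateways:", len(cache.hostsByGateway), "list calls:", cache.listCalls)
}
```

A counter like `listCalls` is also one way a test could assert that no extra API calls creep back in.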

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.
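
As a rough illustration of the opt-out check, assuming the Gateway's annotations are available as a plain map. `shouldManageRoute` is a hypothetical helper, and treating the value case-insensitively is an assumption. The sketch follows the wording above and looks up `istio: egressgateway` in the same map; where that key really lives on the Gateway object is not stated here.

```go
package main

import (
	"fmt"
	"strings"
)

// manageRouteAnnotation is the annotation named in the commit message.
const manageRouteAnnotation = "maistra.io/manageRoute"

// shouldManageRoute reports whether routes should be created for a
// Gateway carrying the given annotations. Hypothetical helper.
func shouldManageRoute(annotations map[string]string) bool {
	// Explicit opt-out: maistra.io/manageRoute: false
	if v, ok := annotations[manageRouteAnnotation]; ok &&
		strings.EqualFold(strings.TrimSpace(v), "false") {
		return false
	}
	// Egress gateways are not meant to have routes.
	if annotations["istio"] == "egressgateway" {
		return false
	}
	return true
}

func main() {
	fmt.Println(shouldManageRoute(map[string]string{manageRouteAnnotation: "false"}))
	fmt.Println(shouldManageRoute(map[string]string{"istio": "egressgateway"}))
	fmt.Println(shouldManageRoute(nil))
}
```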

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always deletes and
recreates routes when receiving an UPDATE event, this might lead to some
service downtime, given that for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <bavery@redhat.com>
Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when hostname is already
taken). In this case the `syncRoute` object exists with zero routes in
it. Thus the loop is a no-op and the function wrongly returns with an
error of `eventDuplicatedMessage`. By comparing directly using the
`syncRoute.metadata` we fix this.
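
To make the corner case concrete, here is a small sketch. The types are simplified stand-ins, and `loopBasedDuplicate` is a reconstruction of the behaviour described above, not the original code. With zero routes the loop never runs, so the update is wrongly reported as a duplicate; comparing against the stored gateway metadata does not have that problem.

```go
package main

import "fmt"

// gatewayMeta is a simplified stand-in for the Gateway metadata.
type gatewayMeta struct {
	Namespace, Name, ResourceVersion string
}

// syncRoute pairs the last seen Gateway metadata with its routes.
type syncRoute struct {
	metadata gatewayMeta
	routes   []string
}

// loopBasedDuplicate mimics the old approach: compare the incoming
// version against the version recorded on every route.
func loopBasedDuplicate(routeVersions []string, incoming string) bool {
	for _, v := range routeVersions {
		if v != incoming {
			return false
		}
	}
	// With zero routes the loop is a no-op and we end up here.
	return true
}

// isDuplicateUpdate compares directly against the stored metadata.
func isDuplicateUpdate(existing *syncRoute, incoming gatewayMeta) bool {
	return existing != nil &&
		existing.metadata.ResourceVersion == incoming.ResourceVersion
}

func main() {
	// Route creation was rejected earlier, so no routes are stored.
	stored := &syncRoute{metadata: gatewayMeta{Namespace: "ns", Name: "gw", ResourceVersion: "41"}}
	incoming := gatewayMeta{Namespace: "ns", Name: "gw", ResourceVersion: "42"}

	fmt.Println("loop-based says duplicate:", loopBasedDuplicate(nil, incoming.ResourceVersion))
	fmt.Println("metadata-based says duplicate:", isDuplicateUpdate(stored, incoming))
}
```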

OSSM-1105: Support namespace portion in gateway hostnames

The namespace portion is not used by routes, so we essentially ignore it,
i.e., anything on the left side of the slash in a "namespace/hostname" string.
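
Dropping the namespace part could look like the sketch below. `hostWithoutNamespace` is a hypothetical helper name; it assumes the host contains at most one slash, as in the "namespace/hostname" form mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// hostWithoutNamespace returns the hostname part of a Gateway host
// entry, ignoring anything on the left side of the slash.
func hostWithoutNamespace(host string) string {
	if i := strings.Index(host, "/"); i >= 0 {
		return host[i+1:]
	}
	return host
}

func main() {
	for _, h := range []string{"bookinfo/example.com", "*/example.com", "example.com"} {
		fmt.Println(h, "->", hostWithoutNamespace(h))
	}
}
```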
jewertow added a commit to jewertow/istio that referenced this pull request Aug 29, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

OSSM-1650 Make sure initialSync and event loop behave the same (maistra#551)
jewertow added a commit to jewertow/istio that referenced this pull request Aug 29, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

maistra-bot pushed a commit that referenced this pull request Aug 30, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

jwendell pushed a commit to jwendell/istio-maistra that referenced this pull request Nov 10, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

jwendell pushed a commit to jwendell/istio-maistra that referenced this pull request Nov 15, 2022
* [ior] MAISTRA-1400 Add IOR to Pilot

maistra-bot added a commit that referenced this pull request Nov 16, 2022
* [ior] OSSM-2256: Add IOR

* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (#135) (#240)

* MAISTRA-1400: Add IOR to Pilot (#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from #269.

* MAISTRA-2149: Make IOR robust in multiple replicas (#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% acurate, meaning
that for a small window of time there might be two instances
being the leader - which could lead to duplicated routes
being created if a new gateway is created in that time frame -
we also change the way the Route name is created: Instead of
having a generateName field, we now explicitly pass a name to
the Route object to be created. Being deterministic, it allows
the Route creation to fail when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR

* Manual cherrypick of #275

* MAISTRA-1813: Add unit tests for IOR (#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests that make sure we do not make more calls to the API server than
the necessary were added, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always delete and
recreate routes when receiving an UPDATE event, this might lead to some
service downtime, given for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <bavery@redhat.com>
Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when the hostname is
already taken). In this case the `syncRoute` object exists with zero routes in
it. Thus the loop is a no-op and the function wrongly returns with an
error of `eventDuplicatedMessage`. By comparing directly using the
`syncRoute.metadata` we fix this.

OSSM-1105: Support namespace portion in gateway hostnames

Namespaces are not used by routes, so we essentially ignore the namespace
part (anything to the left of the slash in a "namespace/hostname" string).
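
As an illustration of the behavior described above, a hypothetical helper (`hostWithoutNamespace` is not a name taken from the source) could drop everything up to and including the first slash:

```go
package main

import (
	"fmt"
	"strings"
)

// hostWithoutNamespace is a hypothetical helper that strips the optional
// "namespace/" prefix from a Gateway host entry and returns the hostname.
func hostWithoutNamespace(host string) string {
	if i := strings.Index(host, "/"); i >= 0 {
		return host[i+1:]
	}
	return host
}

func main() {
	for _, h := range []string{"istio-system/example.com", "*/example.com", "example.com"} {
		fmt.Printf("%q -> %q\n", h, hostWithoutNamespace(h))
	}
}
```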

OSSM-1650 Make sure initialSync and event loop behave the same (#551)

* OSSM-1301 Wait for Route resource type to become available on ior startup (#631)

* OSSM-2109 Fix flaky IOR unit test (#648)

The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.

Co-authored-by: Marko Lukša <marko.luksa@gmail.com>

* OSSM-2006 Fix multiNamespaceInformer.HasSynced()

Co-authored-by: Jacek Ewertowski <jewertow@redhat.com>
Co-authored-by: Marko Lukša <marko.luksa@gmail.com>
Co-authored-by: maistra-bot <57098434+maistra-bot@users.noreply.github.com>
yannuil pushed a commit to yannuil/maistra-istio that referenced this pull request Aug 23, 2023
* [ior] OSSM-2256: Add IOR

* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, leader election is not 100% accurate: for a small window of time
two instances might both be the leader, which could lead to duplicated
routes if a new gateway is created in that time frame. To handle this,
we also change the way the Route name is created: instead of
using the generateName field, we now explicitly pass a name to
the Route object to be created. Because the name is deterministic,
Route creation fails when a Route object with the same name already
exists (created by the other leader in that time frame).
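
A rough sketch of why a deterministic name helps, under stated assumptions: the naming scheme in `routeName` (namespace, name, and a short hash of the host) is purely illustrative and is not claimed to be the scheme IOR uses, and `fakeStore` is a hypothetical in-memory stand-in for the API server rejecting a duplicate name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

var errAlreadyExists = errors.New("route already exists")

// routeName derives the same name for the same gateway and host every time.
// The exact format is an assumption made for this example.
func routeName(gwNamespace, gwName, host string) string {
	sum := sha256.Sum256([]byte(gwNamespace + "/" + gwName + "/" + host))
	return fmt.Sprintf("%s-%s-%s", gwNamespace, gwName, hex.EncodeToString(sum[:8]))
}

// fakeStore mimics a server that rejects creation of an existing name.
type fakeStore struct {
	mu     sync.Mutex
	routes map[string]struct{}
}

func (s *fakeStore) create(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.routes[name]; ok {
		return errAlreadyExists
	}
	s.routes[name] = struct{}{}
	return nil
}

func main() {
	store := &fakeStore{routes: map[string]struct{}{}}
	name := routeName("istio-system", "my-gateway", "example.com")

	fmt.Println(store.create(name)) // first leader creates the route
	fmt.Println(store.create(name)) // second leader is rejected
}
```

With `generateName`, both leaders would receive distinct server-generated names and both creations would succeed, which is the duplication the change avoids.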

Use an exclusive leader ID for IOR

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API server
than necessary, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events in which
the new object is equal to the old one. Because IOR always deletes and
recreates routes when it receives an UPDATE event, this might cause brief
service downtime, since the route will not exist for a few moments.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <bavery@redhat.com>
Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when the hostname is
already taken). In this case the `syncRoute` object exists with zero routes in
it. Thus the loop is a no-op and the function wrongly returns with an
error of `eventDuplicatedMessage`. By comparing directly using the
`syncRoute.metadata` we fix this.

OSSM-1105: Support namespace portion in gateway hostnames

Namespaces are not used by routes, so we essentially ignore the namespace
part (anything to the left of the slash in a "namespace/hostname" string).

OSSM-1650 Make sure initialSync and event loop behave the same (maistra#551)

* OSSM-1301 Wait for Route resource type to become available on ior startup (maistra#631)

* OSSM-2109 Fix flaky IOR unit test (maistra#648)

The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.

Co-authored-by: Marko Lukša <marko.luksa@gmail.com>

* OSSM-2006 Fix multiNamespaceInformer.HasSynced()

Co-authored-by: Jacek Ewertowski <jewertow@redhat.com>
Co-authored-by: Marko Lukša <marko.luksa@gmail.com>
Co-authored-by: maistra-bot <57098434+maistra-bot@users.noreply.github.com>
Signed-off-by: Yann Liu <yannliu@redhat.com>
yannuil pushed a commit to yannuil/maistra-istio that referenced this pull request Sep 4, 2023
yannuil pushed a commit to yannuil/maistra-istio that referenced this pull request Sep 4, 2023
yannuil added a commit to yannuil/maistra-istio that referenced this pull request Sep 6, 2023
commit 466ae69
Author: Yang Liu <yannliu@redhat.com>
Date:   Thu Mar 23 04:22:40 2023 +0800

    OSSM-1689 Simplify IOR (maistra#747)

    * Rework IOR initialization

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove `initialSync`

    `initialSync` is not needed.

    - During bootstrap, `SetNamespaces` is always called with no namespaces.
    - When removing or adding a namespace, the underlying informer will
      trigger an `ADD` event for all resources the informer watches.

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Disable TestPref

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rename

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Call `findService` once for each gateway

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Use original host to generate Route name

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Skip duplicate update test

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Improve concurrency test

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Introduce update Route on Gateway update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix data race

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Format and lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Respect log level

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Refactor IOR

    - `gatawayMap` is removed. `Routes` are retrieved via the API.
    - `reconcileGateway` is used to achieve the desired state.
    - `processEvent` will only process the latest event and try to abort early.

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unused functions

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Use `Lister` for finding target service

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Start IOR before kube client

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unused properties

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rework test initialization

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Log correct debug information

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unnecessary parameters

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove ResourceVersion usage

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Avoid deletion of a route when failing to update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Update FakeRouter to record API call counts

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rework initialization

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Keep startup process order consistent

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix creating matching service

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Test IOR to be idempotent

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unused parameters

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rename symbol

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove used struct

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Improve styling and wording

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Add support for listing across namespaces in faker

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint and format

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Introduce Openshift Route informer

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Run make gen

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix data race

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix test data race

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rename variables

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix update route

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Increase wait for the delete

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Maximize time to wait for the route deletion

    * Fix route update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix route update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Test with a 30 second wait

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix flaky test

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Add disabling IOR and clean up

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Defer clean up

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Clear only ior routes

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * rename newRoute to newRouteController

    * rename route.go to controller.go

    ---------

    Signed-off-by: Yann Liu <yannliu@redhat.com>
    Co-authored-by: Marko Lukša <marko.luksa@gmail.com>
    Signed-off-by: Yann Liu <yannliu@redhat.com>

commit afe4692
Author: Jonh Wendell <jonh.wendell@redhat.com>
Date:   Wed Nov 16 08:10:44 2022 -0500

    OSSM-2256: Add IOR (maistra#680)

    * [ior] OSSM-2256: Add IOR

    * [ior] MAISTRA-1400 Add IOR to Pilot

    * [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

    * MAISTRA-1400: Add IOR to Pilot (maistra#135)

    * MAISTRA-1400: Add IOR to Pilot

    * [MAISTRA-1744] Add route annotation propagation (maistra#158)

    * MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

    * MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

    * MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

    If Gateway's httpsRedirect is set to true, create the OpenShift Route
    with Insecure Policy set to `Redirect`.

    Manual cherrypick from maistra#269.

    * MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

    In scenarios where multiple replicas of istiod are running,
    only one IOR should be in charge of keeping routes in sync
    with Istio Gateways. We achieve this by making sure IOR only
    runs in the leader replica.

    Also, leader election is not 100% accurate: for a small window of time
    two instances might both be the leader, which could lead to duplicated
    routes if a new gateway is created in that time frame. To handle this,
    we also change the way the Route name is created: instead of
    using the generateName field, we now explicitly pass a name to
    the Route object to be created. Because the name is deterministic,
    Route creation fails when a Route object with the same name already
    exists (created by the other leader in that time frame).

    Use an exclusive leader ID for IOR

    * Manual cherrypick of maistra#275

    * MAISTRA-1813: Add unit tests for IOR (maistra#286)

    * MAISTRA-2051 fixes for maistra install

    * MAISTRA-2164: Refactor IOR internals (maistra#295)

    Instead of doing lots of API calls on every event - this
    does not scale well with lots of namespaces - keep the state
    in memory, by doing an initial synchronization on start up and
    updating it when receiving events.

    The initial synchronization is more complex, as we have to deal with
    asynchronous events (e.g., we have to wait for the Gateway store to
    be warmed up). Once it's initialized, handling events as they arrive
    becomes trivial.

    Tests were added to make sure we do not make more calls to the API server
    than necessary, to avoid regressions.

    * MAISTRA-2205: Add an option to opt-out for automatic route creation

    If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
    then IOR ignores it and doesn't attempt to create or manage route(s) for
    this Gateway.

    Also, ignore Gateways with the annotation `istio: egressgateway` as
    these are not meant to have routes.

    * Add integration test for IOR

    Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

    For some obscure reason, it looks like we may receive UPDATE events in which
    the new object is equal to the old one. Because IOR always deletes and
    recreates routes when it receives an UPDATE event, this might cause brief
    service downtime, since the route will not exist for a few moments.

    We guard against this behavior by comparing the `resourceVersion` field
    of the new object and the one stored in the Route object.

    * Add test

    Co-authored-by: Brian Avery <bavery@redhat.com>
    Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

    Fix debug log formatting

    OSSM-1800: Copy gateway labels to routes

    Simplify the comparison of resource versions

    We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
    There's no need to loop over the routes to perform the comparison.

    This also fixes the corner case where the gateway has one host and for
    some reason OCP rejects the creation of the route (e.g., when the hostname is
    already taken). In this case the `syncRoute` object exists with zero routes in
    it. Thus the loop is a no-op and the function wrongly returns with an
    error of `eventDuplicatedMessage`. By comparing directly using the
    `syncRoute.metadata` we fix this.

    OSSM-1105: Support namespace portion in gateway hostnames

    Namespaces are not used by routes, so we essentially ignore the namespace
    part (anything to the left of the slash in a "namespace/hostname" string).

    OSSM-1650 Make sure initialSync and event loop behave the same (maistra#551)

    * OSSM-1301 Wait for Route resource type to become available on ior startup (maistra#631)

    * OSSM-2109 Fix flaky IOR unit test (maistra#648)

    The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

    Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.

    Co-authored-by: Marko Lukša <marko.luksa@gmail.com>

    * OSSM-2006 Fix multiNamespaceInformer.HasSynced()

    Co-authored-by: Jacek Ewertowski <jewertow@redhat.com>
    Co-authored-by: Marko Lukša <marko.luksa@gmail.com>
    Co-authored-by: maistra-bot <57098434+maistra-bot@users.noreply.github.com>
    Signed-off-by: Yann Liu <yannliu@redhat.com>

Signed-off-by: Yann Liu <yannliu@redhat.com>
openshift-merge-robot pushed a commit that referenced this pull request Sep 6, 2023
commit 466ae69
Author: Yang Liu <yannliu@redhat.com>
Date:   Thu Mar 23 04:22:40 2023 +0800

    OSSM-1689 Simplify IOR (#747)

    * Rework IOR initialization

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove `initialSync`

    `initialSync` is not needed.

    - During boostrap, `SetNamesapces`is always called with no namespaces.
    - When removing or adding a namespace, the underlaying informer will
      trigger an `ADD` event for all resources the informer watches

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * DIsable TestPref

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rename

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Call `findService` once for each gateway

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Use original host to generate Route name

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Skip duplicate update test

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Improve concurrency test

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Introduce update Route on Gateway update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix data race

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Format and lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Respect log level

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Refactor IOR

    - `gatawayMap` is removed. `Routes` are retrived via API.
    -  `reconcileGateway` is used to achieve the desired state.
    - `processEvent` will only process the latest and try to abort early.

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unused functions

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Use `Lister` for finding target service

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Start IOR before kube client

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unused properties

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rework test initialization

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Log correct debug information

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unnecessary parameters

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove ResourceVersion usage

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Avoid deletion of a route when failing to update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Update FakeRouter to record API call counts

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rework initialization

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Keep startup process order consistent

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix creating matching service

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Test IOR to be idempotent

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove unused parameters

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rename symbol

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Remove used struct

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Improve styling and wording

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Add support list across namespaces in faker

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint and format

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Introduce Openshift Route informer

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Run make gen

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix data race

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix test data race

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Rename variables

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix update route

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Lint

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Increase wait for the delete

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Maximize time to wait for the route deletion

    * Fix route update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix route update

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Test with a 30 second wait

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Fix flaky test

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Add disabling IOR and clean up

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Defer clean up

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * Clear only ior routes

    Signed-off-by: Yann Liu <yannliu@redhat.com>

    * rename newRoute to newRouteController

    * rename route.go to controller.go

    ---------

    Signed-off-by: Yann Liu <yannliu@redhat.com>
    Co-authored-by: Marko Lukša <marko.luksa@gmail.com>
    Signed-off-by: Yann Liu <yannliu@redhat.com>

commit afe4692
Author: Jonh Wendell <jonh.wendell@redhat.com>
Date:   Wed Nov 16 08:10:44 2022 -0500

    OSSM-2256: Add IOR (#680)

    * [ior] OSSM-2256: Add IOR

    * [ior] MAISTRA-1400 Add IOR to Pilot

    * [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (#135) (#240)

    * MAISTRA-1400: Add IOR to Pilot (#135)

    * MAISTRA-1400: Add IOR to Pilot

    * [MAISTRA-1744] Add route annotation propagation (#158)

    * MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (#190)

    * MAISTRA-1089 Add support for IOR routes in all namespaces (#193)

    * MAISTRA-2131: ior: honor Gateway's httpsRedirect (#276)

    If Gateway's httpsRedirect is set to true, create the OpenShift Route
    with Insecure Policy set to `Redirect`.

    Manual cherrypick from #269.

    * MAISTRA-2149: Make IOR robust in multiple replicas (#282)

    In scenarios where multiple replicas of istiod are running,
    only one IOR should be in charge of keeping routes in sync
    with Istio Gateways. We achieve this by making sure IOR only
    runs in the leader replica.

    Also, leader election is not 100% accurate: for a small window of
    time there might be two instances acting as leader, which could
    lead to duplicate routes if a new gateway is created in that time
    frame. To guard against this, we also change the way the Route name
    is created: instead of using the generateName field, we now
    explicitly pass a name to the Route object to be created. Because
    the name is deterministic, Route creation fails when a Route object
    with the same name already exists (created by the other leader in
    that time frame).
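A minimal sketch of the deterministic-name idea, assuming a hash of the gateway namespace, gateway name and host. The `routeName` layout, the `fakeAPI` type and `errAlreadyExists` are hypothetical stand-ins for illustration and are not taken from the IOR code; the only point is that two leaders computing the same name cannot both succeed in creating the Route.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// errAlreadyExists plays the role of the API server's "already exists" error.
var errAlreadyExists = errors.New("route already exists")

// routeName derives a stable name from the gateway identity and the host.
// The "namespace-gateway-hash" layout is an assumption made for this sketch.
func routeName(namespace, gateway, host string) string {
	sum := sha256.Sum256([]byte(namespace + "/" + gateway + "/" + host))
	return fmt.Sprintf("%s-%s-%s", namespace, gateway, hex.EncodeToString(sum[:8]))
}

// fakeAPI is a hypothetical in-memory store that enforces unique names,
// as the API server does for objects created with an explicit name.
type fakeAPI struct{ routes map[string]string }

// create stores the route, or fails if the name is already taken.
func (a *fakeAPI) create(name, host string) error {
	if _, ok := a.routes[name]; ok {
		return errAlreadyExists
	}
	a.routes[name] = host
	return nil
}

func main() {
	api := &fakeAPI{routes: map[string]string{}}
	name := routeName("istio-system", "my-gateway", "example.com")

	// Two "leaders" react to the same Gateway inside the overlap window.
	fmt.Println("first leader: ", api.create(name, "example.com"))
	fmt.Println("second leader:", api.create(name, "example.com"))
}
```

With `generateName`, each create call would receive a fresh server-generated suffix, so both calls would succeed and leave two routes behind.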

    Use an exclusive leader ID for IOR

    * Manual cherrypick of #275

    * MAISTRA-1813: Add unit tests for IOR (#286)

    * MAISTRA-2051 fixes for maistra install

    * MAISTRA-2164: Refactor IOR internals (#295)

    Instead of doing lots of API calls on every event - this
    does not scale well with lots of namespaces - keep the state
    in memory, by doing an initial synchronization on start up and
    updating it when receiving events.

    The initial synchronization is more complex, as we have to deal with
    asynchronous events (e.g., we have to wait for the Gateway store to
    be warmed up). Once it's initialized, handling events as they arrive
    becomes trivial.

    Tests were added to make sure we do not make more calls to the API
    server than necessary, to avoid regressions.

    * MAISTRA-2205: Add an option to opt out of automatic route creation

    If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
    then IOR ignores it and doesn't attempt to create or manage route(s) for
    this Gateway.

    Also, ignore Gateways with the annotation `istio: egressgateway` as
    these are not meant to have routes.
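The opt-out check could look roughly like the sketch below. The annotation key comes from the message above; the function name `shouldManageRoute` is hypothetical. Treating only the value `false` (case-insensitive, surrounding whitespace ignored) as an opt-out is an assumption, not a statement about how IOR parses the value.

```go
package main

import (
	"fmt"
	"strings"
)

// manageRouteAnnotation is the annotation key named in the commit message.
const manageRouteAnnotation = "maistra.io/manageRoute"

// shouldManageRoute reports whether routes should be created for a Gateway
// with the given annotations. Assumption: only "false" opts out; a missing
// annotation or any other value keeps the default behaviour.
func shouldManageRoute(annotations map[string]string) bool {
	v, ok := annotations[manageRouteAnnotation]
	if !ok {
		return true
	}
	return !strings.EqualFold(strings.TrimSpace(v), "false")
}

func main() {
	optedOut := map[string]string{manageRouteAnnotation: "false"}
	fmt.Println("opted out:    ", shouldManageRoute(optedOut))
	fmt.Println("no annotation:", shouldManageRoute(nil))
}
```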

    * Add integration test for IOR

    Signed-off-by: Jacek Ewertowski <jewertow@redhat.com>

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (#516)

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

    For some obscure reason, it looks like we may receive UPDATE events with
    the new object being equal to the old one. As IOR always deletes and
    recreates routes when receiving an UPDATE event, this might lead to some
    service downtime, given that for a few moments the route will not exist.

    We guard against this behavior by comparing the `resourceVersion` field
    of the new object and the one stored in the Route object.
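The guard described above can be sketched as follows. The `gateway` and `route` types and the `needsReconcile` function are simplified, hypothetical stand-ins; how IOR actually records the Gateway's `resourceVersion` on the Route is not shown in this message.

```go
package main

import "fmt"

// gateway is a simplified stand-in for an Istio Gateway object.
type gateway struct {
	name            string
	resourceVersion string
}

// route is a simplified stand-in for a Route that remembers the
// resourceVersion of the Gateway it was generated from.
type route struct {
	gatewayResourceVersion string
}

// needsReconcile reports whether an UPDATE event should trigger the
// delete-and-recreate cycle. A nil route means nothing was created yet.
func needsReconcile(gw gateway, existing *route) bool {
	if existing == nil {
		return true
	}
	return existing.gatewayResourceVersion != gw.resourceVersion
}

func main() {
	stored := &route{gatewayResourceVersion: "42"}
	fmt.Println("same version:", needsReconcile(gateway{"gw", "42"}, stored))
	fmt.Println("new version: ", needsReconcile(gateway{"gw", "43"}, stored))
}
```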

    * Add test

    Co-authored-by: Brian Avery <bavery@redhat.com>
    Co-authored-by: Jonh Wendell <jonh.wendell@redhat.com>

    Fix debug log formatting

    OSSM-1800: Copy gateway labels to routes

    Simplify the comparison of resource versions

    We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
    There's no need to loop over the routes to perform the comparison.

    This also fixes the corner case where the gateway has one host and for
    some reason OCP rejects the creation of the route (e.g., when hostname is already
    taken). In this case the `syncRoute` object exists with zero routes in
    it. Thus the loop is a no-op and the function wrongly returns with an
    error of `eventDuplicatedMessage`. By comparing directly using the
    `syncRoute.metadata` we fix this.

    OSSM-1105: Support namespace portion in gateway hostnames

    They are not used by routes, so we essentially ignore the namespace part
    - anything on the left side of a "namespace/hostname" string.
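Dropping the namespace portion amounts to discarding everything up to the first `/`. A minimal sketch, with the hypothetical helper name `hostWithoutNamespace`:

```go
package main

import (
	"fmt"
	"strings"
)

// hostWithoutNamespace strips the optional "namespace/" prefix from a
// Gateway host entry and returns the hostname part unchanged.
func hostWithoutNamespace(host string) string {
	if i := strings.Index(host, "/"); i >= 0 {
		return host[i+1:]
	}
	return host
}

func main() {
	for _, h := range []string{"bookinfo/example.com", "*/*.example.com", "example.com"} {
		fmt.Printf("%q -> %q\n", h, hostWithoutNamespace(h))
	}
}
```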

    OSSM-1650 Make sure initialSync and event loop behave the same (#551)

    * OSSM-1301 Wait for Route resource type to become available on ior startup (#631)

    * OSSM-2109 Fix flaky IOR unit test (#648)

    The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

    Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.
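The arithmetic can be checked with a small polling sketch. `pollInterval` and `waitFor` are hypothetical names, and the loop only approximates the shape of `ensureNamespaceExists()`. The 10s production timeout is implied by the message (100ms is 1/100 of 10s) rather than stated in it.

```go
package main

import (
	"fmt"
	"time"
)

// pollInterval derives the sleep between attempts from the overall
// timeout, so a short test timeout still allows many attempts.
func pollInterval(timeout time.Duration) time.Duration {
	return timeout / 100
}

// waitFor polls check until it returns true or the timeout elapses.
// It returns whether check succeeded and how many attempts were made.
func waitFor(timeout time.Duration, check func() bool) (bool, int) {
	deadline := time.Now().Add(timeout)
	attempts := 0
	for {
		attempts++
		if check() {
			return true, attempts
		}
		if !time.Now().Before(deadline) {
			return false, attempts
		}
		time.Sleep(pollInterval(timeout))
	}
}

func main() {
	fmt.Println("production interval:", pollInterval(10*time.Second))
	fmt.Println("unit test interval: ", pollInterval(time.Millisecond))

	calls := 0
	ok, attempts := waitFor(50*time.Millisecond, func() bool {
		calls++
		return calls >= 3 // the "namespace" shows up on the third look
	})
	fmt.Println("succeeded:", ok, "attempts:", attempts)
}
```

With a fixed 100ms sleep and a 1ms timeout, the deadline would already have passed after the first sleep, which matches the single-iteration behaviour described above.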

    Co-authored-by: Marko Lukša <marko.luksa@gmail.com>

    * OSSM-2006 Fix multiNamespaceInformer.HasSynced()

    Co-authored-by: Jacek Ewertowski <jewertow@redhat.com>
    Co-authored-by: Marko Lukša <marko.luksa@gmail.com>
    Co-authored-by: maistra-bot <57098434+maistra-bot@users.noreply.github.com>
    Signed-off-by: Yann Liu <yannliu@redhat.com>

Signed-off-by: Yann Liu <yannliu@redhat.com>