Skip to content

Conversation

jcoupey
Copy link
Collaborator

@jcoupey jcoupey commented Aug 8, 2019

Issue

The scope of this PR is to implement support for mixing independent pickups and deliveries in a single problem (#241). It should also fix #197 along the way.

Here "independent" stands for "no precedence constraints involved" (as opposed to the discussion in #189). So this is "just" a matter of making sure all our capacity checks handle mixed scenarios.

API

No big change required, amounts are already defined using Capacity which is an alias for int64_t. We should just make it clear that deliveries/pickups for jobs can be modeled using positive/negative capacities. Maybe just enforce that vehicle capacity values are unsigned integers. And perhaps report vehicle load at step level in output.

EDIT : #262 (comment)

Data changes

Modelling

If capacities can be positive/negative in input, the easiest way is probably to split pickups and deliveries at job level. This could be achieved by replacing the current Job::amount by two members Job::delivery and Job::pickup.

Example: a job amount of [4, 6, -2] in input would result in a [4, 6, 0] job delivery and a [0, 0, 2] pickup at the Job object level.

EDIT : #262 (comment)

Maintaining relevant information

The crucial part here to avoid performance degradation is to keep constant-time lookups for all capacity checks. So going through a whole route part to decide if adding a job will break something before or after is out of question. We should be able to achieve this by storing relevant information upon route modifications (those do not happen very often compared to validity checks).

Say we have a route s -> j_1 -> j_2 -> ... -> j_i -> ... -> j_n -> e and pickup (resp. delivery) amount for job j_i is p_i (resp. d_i).

  1. Π_i = sum(p_k) for 1 <= k <= i (all pickups up to j_i in the route)
  2. Δ_i = sum(d_k) for i < k <= n (all pending deliveries after j_i)
  3. Vehicle load at j_i is then L_i = Π_i + Δ_i
  4. Max vehicle load up to j_i is FL_i = max(L_k) for k <= i
  5. Max vehicle load after j_i is BL_i = max(L_k) for i <= k <= n

This is quite similar to what is currently stored in SolutionState::fwd_amounts and SolutionState::bwd_amounts, which are updated upon route modification through SolutionState::update_amounts. Warning: LocalSearch::try_job_additions currently has its own logic to update those values, it should be replaced to use the same function.

Validity checks

Additions

Suppose we want to check whether it is OK to include:

  • something with delivery D and pickup P (be it a job or bundle of jobs)
  • between steps i and i + 1
  • in a route for vehicle with capacity C

Then we have to check for BL_{i + 1} + P <= C && FL_i + D <= C.

Removal

Should always be OK.

Tasks

  • Add delivery and pickup members for jobs
  • Refactor try_job_addition to use SolutionState::update_amounts
  • Store and maintain additional relevant information
  • Move all amount-related members of SolutionState to RawRoute
  • Add capacity check functions as RawRoute members.
  • Adjust all capacity checks in Solomon-based heuristics
    - [ ] Adjust all capacity checks in clustering heuristics
  • Adjust try_job_additions capacity checks
  • Adjust all local search operators capacity checks
  • Add asserts on capacity/load upon final route construction
  • Switch some gain and validity checks to fix Cut down validity checks time #266
    - [ ] Fall back to simpler validity checks for delivery-only or pickup-only instances
    - [ ] Update 2-opt early abort checks
  • Add pickup and delivery keys at job level
  • Deprecate amount keys in both input and output
  • Deprecate CVRP heuristics to fix Deprecate CVRP clustering heuristics? #267
  • Test on use-cases with pickups only and mixed scenarios
  • Enforce that vehicle capacities are unsigned integers?
  • Update docs/API.md
  • Update CHANGELOG.md
  • review

@jcoupey jcoupey added this to the v1.5.0 milestone Aug 8, 2019
@jcoupey
Copy link
Collaborator Author

jcoupey commented Aug 13, 2019

On second thoughts, we can't just keep the old amount values for jobs because it does not allow to model a situation where there are both pickups and deliveries at the same job.

Example: say we have a delivery of 6 and a pickup of 5 (whatever metric) for a given job. We'd have to just pass the delta in the job amount as 1, but we'd then be loosing mandatory information for capacity checks. The vehicle need to have enough room for a 6 delivery from the start up to that job, and enough room for a 5 pickup from that job on to the end of its route...

Solution: add two pickup and delivery keys at job level and deprecate amount keys in both input and output. To maintain previous behaviour before deprecation is effective, we'll probably have to treat an amount as a delivery with an empty pickup.

@jcoupey
Copy link
Collaborator Author

jcoupey commented Sep 3, 2019

More on point 4 from previous comment. I've been profiling executions with perf and doing some flame graphs on Solomon instance R203 at 42e901b (parent commit in master for this PR) and 4d46ca4 (current PR HEAD):

I'm puzzled by:

  • the huge depth of the second one compared to the first, this looks wrong
  • the fact that we see a lot of malloc and cfree@GLIBC_2.2.5 stuff appearing in the global view
  • the changes observed when browsing to e.g. vroom::vrptw::CrossExchange::is_valid in both graphs. The one for 42e901b is totally expected (the usual TW checks with helper function, plus the calls to vroom::cvrp::CrossExchange::is_valid that consist mostly in operator<= checks for amounts). On the other hand for 4d46ca4, vroom::cvrp::CrossExchange::is_valid has a lot of stuff I don't get on top of the expected vroom::RawRoute::* checks, including some cfree, malloc, operator new, delete, etc.

The place where the latest start showing off is at f16aa71, the first check update:

$ git diff f16aa71~ f16aa71
diff --git a/src/problems/cvrp/operators/exchange.cpp b/src/problems/cvrp/operators/exchange.cpp
index 4f2a611..726a150 100644
--- a/src/problems/cvrp/operators/exchange.cpp
+++ b/src/problems/cvrp/operators/exchange.cpp
@@ -115,13 +115,19 @@ bool Exchange::is_valid() {
   bool valid = _input.vehicle_ok_with_job(t_vehicle, s_job_rank);
   valid &= _input.vehicle_ok_with_job(s_vehicle, t_job_rank);
 
-  valid &= (_sol_state.fwd_amounts[t_vehicle].back() -
-              _input.jobs[t_job_rank].amount + _input.jobs[s_job_rank].amount <=
-            _input.vehicles[t_vehicle].capacity);
-
-  valid &= (_sol_state.fwd_amounts[s_vehicle].back() -
-              _input.jobs[s_job_rank].amount + _input.jobs[t_job_rank].amount <=
-            _input.vehicles[s_vehicle].capacity);
+  valid &=
+    target.is_valid_addition_for_capacity(_input,
+                                          _input.jobs[s_route[s_rank]].pickup,
+                                          _input.jobs[s_route[s_rank]].delivery,
+                                          t_rank,
+                                          t_rank + 1);
+
+  valid &=
+    source.is_valid_addition_for_capacity(_input,
+                                          _input.jobs[t_route[t_rank]].pickup,
+                                          _input.jobs[t_route[t_rank]].delivery,
+                                          s_rank,
+                                          s_rank + 1);
 
   return valid;
 }

@krypt-n I suspect this might be related to the way I've been working with amounts for validity checks, any idea?

@krypt-n
Copy link
Contributor

krypt-n commented Sep 3, 2019

The 5-argument is_valid_addition_for_capacity from the diff is now called is_valid_addition_for_capacity_margins if I am not mistaken.

bool RawRoute::is_valid_addition_for_capacity_margins(
  const Input& input,
  const Amount& pickup,
  const Amount& delivery,
  const Index first_rank,
  const Index last_rank) const {
  assert(1 <= last_rank);
  assert(last_rank <= route.size() + 1);

  auto first_deliveries =
    (first_rank == 0) ? current_loads[0] : bwd_deliveries[first_rank - 1];

  auto first_pickups = (first_rank == 0) ? Amount(input.amount_size())
                                         : fwd_pickups[first_rank - 1];

  auto replaced_deliveries = first_deliveries - bwd_deliveries[last_rank - 1];

  return (fwd_peaks[first_rank] + delivery <=
          capacity + replaced_deliveries) and
         (bwd_peaks[last_rank] + pickup <=
          capacity + fwd_pickups[last_rank - 1] - first_pickups);
}

I'd suggest the following changes which get rid of the allocations that are noticeable in the flamegraph:

  • auto first_deliveries -> auto& first_deliveries. This avoids copying Amounts which is expensive since they are basically std::vectors. We can do this change since the lifetime of current_loads and bwd_deliveries is longer than the lifetime of the reference and other threads don't write to them (hopefully).

  • Replace Amount(input.amount_size()) with a global const Amount zero that is initialized once at the beginning to the correct size. Could be a new member of Input. This avoids allocating a new zero-Amount every time we use it.

  • Now that both of the possible values are not temporaries, we can change auto first_pickups to auto& first_pickups. This avoids copying the involved Amount objects.

The declaration auto replaced_deliveries does not need to be turned into a reference since this just constructs an AmountDiff object, which does not involve allocations.

The other is_valid_addition... methods would need similar changes to get the performance back into the same ballpark as before.

Hope that helps

@jcoupey
Copy link
Collaborator Author

jcoupey commented Sep 3, 2019

@krypt-n thanks a lot for your input, I've been able to reduce allocations following your suggestions. I still see some in the flame graph for the last commit but this is probably related to this Amount object creation we can't really get rid of.

@jcoupey
Copy link
Collaborator Author

jcoupey commented Sep 4, 2019

As several other changes are related to the current work in this PR, I'm extending it's scope to also cover #266 and #267. I've updated the task list accordingly.

@jcoupey
Copy link
Collaborator Author

jcoupey commented Sep 5, 2019

The last few commits to address #266 are really promising on the computing times side. Average computing time on master before this PR (42e901b) for Solomon benchmark instances was 604ms. It went up to 2052ms (more than a 3x increase!) at 474d6bc with the increase of complexity in the various is_valid implementations across local search operators.
Now reducing the number of validity checks altogether bring it down to 489ms for latest commit (cb5a4c8). It means that:

  • we should be able to handle mixed pickups and deliveries as a new feature with similar computing times (maybe even a bit lower)
  • we'll probably even get a nice speed-up for delivery-only instances when falling back to simpler validity checks for capacity

@jcoupey jcoupey merged commit e6cd768 into master Sep 17, 2019
@jcoupey jcoupey deleted the feature/mixed-pickup-delivery branch October 14, 2019 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate CVRP clustering heuristics? Cut down validity checks time Invalid solution with negative amount
2 participants