Courier routing and assignment for food delivery service using reinforcement learning. (February 2022)
- Record Type:
- Journal Article
- Title:
- Courier routing and assignment for food delivery service using reinforcement learning. (February 2022)
- Main Title:
- Courier routing and assignment for food delivery service using reinforcement learning
- Authors:
- Bozanta, Aysun
Cevik, Mucahit
Kavaklioglu, Can
Kavuk, Eray M.
Tosun, Ayse
Sonuc, Sibel B.
Duranel, Alper
Basar, Ayse
- Abstract:
- Highlights: We model a real-world food delivery service using Markov decision process. We compare Q-learning, DDQN, and rule-based policy. Policies generated by Q-Learning and DDQN collect more reward than benchmark. A policy generated by DDQN for the single courier outperforms all other algorithms.
Abstract: We consider a Markov decision process model mimicking a real-world food delivery service where the objective is to maximize the revenue derived from served requests given a limited number of couriers over a period of time. The model incorporates the courier location, order origin, and order destination. Each courier's task is to pick-up an assigned order and deliver it to the requested destination. We apply three different approaches to solve this problem. In the first approach, we simplify the model to a one courier case and then solve the resulting model using Q-Learning. The resulting policy is used for each courier in the model with more than one courier based on the assumption that all couriers are identical. In the second approach, we use the same logic, however, the underlying one courier model is solved using Double Deep Q-Networks (DDQN). In the third approach, the extensive model is considered where a system state consists of the positions of all couriers and all orders in the system. We use DDQN to solve the extensive model. Policies generated by these approaches are compared against a benchmark rule-based policy. We observe that the resulting policy of training a single courier with Q-learning accumulates higher rewards than the reward collected by the rule-based policy. In addition, DDQN algorithm for a single courier outperforms both the Q-learning and the rule-based approaches, however, DDQN performance is noted to be highly dependent on the hyper-parameters of the algorithm. …
- Is Part Of:
- Computers & industrial engineering. Volume 164(2022)
- Journal:
- Computers & industrial engineering
- Issue:
- Volume 164(2022)
- Issue Display:
- Volume 164, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 164
- Issue:
- 2022
- Issue Sort Value:
- 2022-0164-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02
- Subjects:
- Q-Learning -- DQN -- DDQN -- Courier routing -- Courier assignment
00-01 -- 99-00
Engineering -- Data processing -- Periodicals
Industrial engineering -- Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03608352
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cie.2021.107871
- Languages:
- English
- ISSNs:
- 0360-8352
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.713000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20361.xml
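
The abstract's first approach (a one-courier model solved with Q-Learning) can be illustrated with a minimal tabular sketch. This is not the authors' model: the 4x4 grid, the single fixed order, the reward values (+10 on delivery, -1 per move), the hyper-parameters, and every name below (`step`, `train`, `greedy_route`) are assumptions made for illustration only. The DDQN variants described in the abstract are not shown.

```python
import random

GRID = 4                                    # hypothetical 4x4 service area
START = (0, 0)                              # hypothetical courier start cell
PICKUP, DROPOFF = (2, 3), (0, 3)            # hypothetical order origin / destination
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply one move and return (next_state, reward, done)."""
    (row, col), carrying = state
    d_row, d_col = MOVES[action]
    # Clamp to the grid so a move into a wall leaves the courier in place.
    row = min(max(row + d_row, 0), GRID - 1)
    col = min(max(col + d_col, 0), GRID - 1)
    if not carrying and (row, col) == PICKUP:
        carrying = True                     # order is picked up on arrival
    if carrying and (row, col) == DROPOFF:
        return ((row, col), carrying), 10.0, True   # assumed delivery revenue
    return ((row, col), carrying), -1.0, False      # assumed cost per move


def train(episodes=3000, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated return; missing entries count as 0

    def best(state):
        return max(range(4), key=lambda a: q.get((state, a), 0.0))

    for _ in range(episodes):
        state = (START, False)
        for _ in range(200):                # cap on episode length
            explore = rng.random() < epsilon
            action = rng.randrange(4) if explore else best(state)
            nxt, reward, done = step(state, action)
            future = 0.0 if done else gamma * q.get((nxt, best(nxt)), 0.0)
            old = q.get((state, action), 0.0)
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, action)] = old + alpha * (reward + future - old)
            state = nxt
            if done:
                break
    return q


def greedy_route(q, max_steps=50):
    """Follow the learned greedy policy and return the visited cells."""
    state, route = (START, False), [START]
    for _ in range(max_steps):
        action = max(range(4), key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        route.append(state[0])
        if done:
            break
    return route


if __name__ == "__main__":
    print(greedy_route(train()))
```

In this toy layout the shortest possible route takes 7 moves (5 to reach the pickup cell, then 2 to the drop-off), which gives a simple yardstick for the learned route. Reusing one learned table for several identical couriers, as the abstract describes, would amount to calling the same greedy lookup once per courier.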