1. Lisbon’s approach procedures to RWY 02
The ORCI project focuses on the North configuration at Lisbon Airport, with mixed mode operations (ARR/DEP mixed) conducted on RWY 02.
Arrivals to this runway are managed through RNAV1 instrument approach procedures based on the Point Merge System (PMS), a systemised method for sequencing arrival flows developed by the EUROCONTROL Experimental Centre (EEC) in 2006. The procedure is designed to manage high traffic levels without the need for radar vectoring. It uses a predefined route structure made up of a merge point and several fixed sequencing legs, all located at the same distance from that point. Aircraft follow these legs until they are instructed to proceed directly to the merge point.
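The equidistance property of the sequencing legs can be checked numerically. The sketch below is illustrative only: the helper names, the 0.5 NM tolerance and the coordinates are assumptions and are not taken from the published Lisbon procedure.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius expressed in nautical miles


def great_circle_nm(lat1, lon1, lat2, lon2):
    """Haversine distance in NM between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def leg_is_equidistant(merge_point, leg_waypoints, tolerance_nm=0.5):
    """True if every waypoint of a leg lies at (nearly) the same distance
    from the merge point. The tolerance is a hypothetical value."""
    dists = [great_circle_nm(*merge_point, *wp) for wp in leg_waypoints]
    return max(dists) - min(dists) <= tolerance_nm


# Hypothetical coordinates (lat, lon), chosen only to exercise the check.
merge = (0.0, 0.0)
leg = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
print(leg_is_equidistant(merge, leg))
```

Because every point on a leg is at the same distance from the merge point, the remaining track distance after the "direct to" instruction is the same wherever on the leg the aircraft turns, which is what makes the sequencing predictable.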

2. Data Cleaning / Trajectory Validation
The ORCI project focuses on landing operations on RWY 02 and does not consider departure operations. In addition, only flights that perform the PMS maneuver and whose trajectories are reasonably similar to the published procedure are included.
To prepare the data for training, it is necessary to carefully clean and select the flight trajectories. Although the PMS procedure is clearly defined, it is not always followed in practice. For example, during periods of low traffic, the sequencing legs may be skipped and aircraft may be guided directly to the initial fix (PESEX). For this reason, only those flights that closely follow the defined approach procedure have been selected, ensuring that the training data contains representative trajectories. The first image shows trajectories where the PMS is not used, while the second and third images show trajectories that follow the published PMS, approaching from the East and West, respectively.
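One simple way to express such a selection rule is to require that a flight spends a minimum number of consecutive track points at roughly the sequencing-leg distance from the merge point. The sketch below assumes this criterion; the function names, the leg radius, the tolerance and the minimum run length are all hypothetical and do not describe the filter actually used in ORCI.

```python
def longest_run_on_leg(distances_nm, leg_radius_nm, tolerance_nm=1.0):
    """Length of the longest run of consecutive track points whose distance
    to the merge point stays within tolerance of the leg radius."""
    longest = current = 0
    for d in distances_nm:
        if abs(d - leg_radius_nm) <= tolerance_nm:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def follows_pms(distances_nm, leg_radius_nm, tolerance_nm=1.0, min_points=5):
    """Hypothetical acceptance rule: the flight must stay on the leg for at
    least `min_points` consecutive samples."""
    return longest_run_on_leg(distances_nm, leg_radius_nm, tolerance_nm) >= min_points


# Distances to the merge point (NM) for two made-up flights.
direct_flight = [30, 27, 24, 21, 18, 15, 12]               # crosses the leg radius once
pms_flight = [30, 25, 20.5, 20.2, 19.9, 20.1, 20.0, 15, 10]  # flies along the leg

print(follows_pms(direct_flight, leg_radius_nm=20))
print(follows_pms(pms_flight, leg_radius_nm=20))
```

A flight cleared directly to the fix only crosses the leg distance briefly, so it fails the run-length test, while a flight that flies along the leg passes it.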



3. Creating synthetic flights
The main objective of ORCI is to predict the spacing between two consecutive aircraft at the moment the preceding aircraft arrives at PESEX (IF). For this purpose, each aircraft is paired with the one immediately ahead of it, designating the first to land as the preceding aircraft and the second as the follower. The same aircraft may therefore act as the follower in one pair and as the preceding aircraft in the next, depending on its position in the landing sequence.
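The pairing logic can be sketched in a few lines. The example assumes that the landing order is given by each flight's time over PESEX; the callsigns, times and the `build_pairs` name are hypothetical.

```python
def build_pairs(arrivals):
    """Pair each arrival with the one immediately behind it.

    arrivals: list of (callsign, time_over_fix_s) tuples in any order.
    Returns a list of (preceding, follower) callsign pairs.
    """
    ordered = sorted(arrivals, key=lambda a: a[1])
    return [(lead[0], follow[0]) for lead, follow in zip(ordered, ordered[1:])]


# Made-up arrivals: callsign and time over the fix in seconds.
arrivals = [("FLT2", 120), ("FLT1", 0), ("FLT3", 250)]
print(build_pairs(arrivals))
```

Here `FLT2` appears twice: as the follower of `FLT1` and as the preceding aircraft of `FLT3`.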
Since the tool is developed using real traffic data, the range of situations it is exposed to is limited to those that occur in normal operations. However, for ORCI to provide reliable predictions, it must also be able to handle a wider variety of situations, including rare or extreme cases that are not present in historical data.
To overcome this limitation, synthetic training samples are created by shifting the timing of the preceding aircraft in each valid flight pair forward or backward by a few seconds, while keeping the follower unchanged and preserving the original 3D trajectories. This allows a single real flight pair to generate multiple scenarios with different spacing conditions.
This process makes it possible to cover a full range of situations, including:
- Real operational situations
- Excessive spacing situations
- Unrealistic situations where the following aircraft overtakes or collides with the leading aircraft
By expanding the original dataset from around 3,700 real flights to approximately 96,000 samples, the tool is exposed to separation values ranging roughly from 0 to 14 NM, enabling it to learn from both tighter and wider spacing scenarios than those observed in real operations.
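The time-shifting idea can be illustrated with a minimal sketch. It assumes that a trajectory is stored as a list of `(t, lat, lon, alt)` samples; the data layout, the function names and the offsets are hypothetical and only show that the 3D path is left untouched while the timestamps of the preceding aircraft move.

```python
def shift_in_time(track, offset_s):
    """Return a copy of the track with every timestamp moved by offset_s.
    Positions and altitudes are preserved."""
    return [(t + offset_s, lat, lon, alt) for t, lat, lon, alt in track]


def synthetic_pairs(preceding, follower, offsets_s):
    """Build one (preceding, follower) sample per time offset.
    Only the preceding aircraft is shifted; the follower is reused as-is."""
    return [(shift_in_time(preceding, dt), follower) for dt in offsets_s]


# Made-up two-point tracks: (time_s, lat, lon, altitude_ft).
preceding = [(0, 38.0, -9.0, 5000), (10, 38.1, -9.0, 4800)]
follower = [(90, 37.9, -9.0, 6000), (100, 38.0, -9.0, 5800)]

samples = synthetic_pairs(preceding, follower, offsets_s=[-10, 0, 10])
print(len(samples))
```

An offset of zero reproduces the real pair, negative offsets move the preceding aircraft earlier (wider spacing) and positive offsets move it later (tighter spacing).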


4. Machine Learning Models
The dataset includes information for both the follower and the preceding aircraft at the theoretical moment when the turn instruction is commanded. Some columns contain data directly extracted from the flight information at that instant; other columns are derived features. Finally, the dataset includes the target column.
| Raw flight variables | Calculated variables | Prediction field |
|---|---|---|
| Geographical coordinates for both aircraft | Angle to PESEX of the preceding aircraft | Inter-distance when the preceding aircraft flies over PESEX |
| GS for both aircraft | Distance to PESEX of the preceding aircraft | |
| FL for both aircraft | Distance to PESEX of the follower aircraft | |
| Wake turbulence (follower) | Median-centered GS for both aircraft | |
| | Median-centered FL for both aircraft | |
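As an illustration of the derived features, median-centering subtracts a reference median from each value. The sketch below assumes that the reference median is taken from the training data and reused for new samples; this choice, the function name and the ground-speed values are hypothetical.

```python
from statistics import median


def median_center(values, reference=None):
    """Subtract the median of `reference` (or of `values` themselves when no
    reference is given) from every value."""
    ref = median(values if reference is None else reference)
    return [v - ref for v in values]


# Made-up ground speeds in knots.
train_gs = [400, 420, 460]
print(median_center(train_gs))
print(median_center([430], reference=train_gs))
```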
The data were split by allocating three weeks, one from each month, to the test set and the remaining weeks to the training set, resulting in an approximate 75%/25% train/test ratio. This temporal separation was chosen to prevent synthetic samples derived from the same flight from appearing in both datasets.
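A week-based split of this kind can be sketched as follows. The example assumes that each sample carries its flight date and that test weeks are identified by ISO year and week number; the dates, the chosen week and the `split_by_week` name are hypothetical.

```python
from datetime import date


def split_by_week(samples, test_weeks):
    """Split samples into train and test sets by calendar week.

    samples:    list of (flight_date, payload) tuples
    test_weeks: set of (iso_year, iso_week) tuples reserved for testing
    """
    train, test = [], []
    for flight_date, payload in samples:
        iso = flight_date.isocalendar()
        key = (iso[0], iso[1])
        (test if key in test_weeks else train).append((flight_date, payload))
    return train, test


# Made-up samples; all synthetic variants of a flight share its date.
samples = [
    (date(2024, 1, 3), "pair-a"),
    (date(2024, 1, 10), "pair-b"),
    (date(2024, 1, 11), "pair-b-shifted"),
    (date(2024, 1, 17), "pair-c"),
]
train, test = split_by_week(samples, test_weeks={(2024, 2)})
print(len(train), len(test))
```

Since every synthetic sample inherits the date of the real flight pair it was derived from, splitting on whole weeks keeps all variants of a flight on the same side of the split.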
The geometry of the Point Merge system means that spacing evolution is influenced by the entry point of the follower aircraft into the structure. For this reason, two separate linear regression models were trained: one for aircraft entering the Point Merge from the east and another for those entering from the west. This distinction allows the model to better capture the different trajectory patterns and convergence behaviours associated with each entry flow, improving prediction accuracy while maintaining overall model simplicity and interpretability.
ML Results
The development of the ORCI prediction tool prioritized explainable machine-learning models, particularly linear regression, which offers a good balance between transparency and performance. During the exploratory phase, more complex algorithms such as Random Forests, gradient boosting methods and neural networks were also evaluated, but they did not provide significant improvements in predictive accuracy.
An initial linear regression model was trained to predict the spacing between aircraft. In this context, prediction error is defined as the predicted spacing minus the actual spacing. Positive errors therefore correspond to situations where aircraft end up closer than predicted, which is the least desirable outcome from a safety perspective.
To mitigate this effect, a second iteration of the model was developed using a weighted linear regression. Higher weights were assigned to cases with positive errors, allowing the model to focus on these critical situations and reduce the likelihood of non-conservative predictions.
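The effect of such a weighting can be illustrated with a single-feature example. The sketch below is an assumption about how the weights might be built, not the ORCI implementation: it fits an ordinary line first, marks the samples with positive error (predicted minus actual), and refits with a hypothetical weight of 3 on those samples.

```python
def fit_weighted_line(x, y, w):
    """Weighted least squares for y ~ a + b*x. Returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def fit_conservative_line(x, y, positive_weight=3.0):
    """Two-pass fit: an unweighted fit finds the samples with positive error,
    then those samples get a larger (hypothetical) weight in the refit."""
    a, b = fit_weighted_line(x, y, [1.0] * len(x))
    errors = [(a + b * xi) - yi for xi, yi in zip(x, y)]  # predicted - actual
    weights = [positive_weight if e > 0 else 1.0 for e in errors]
    return fit_weighted_line(x, y, weights)


def mean_error(a, b, x, y):
    """Mean of predicted minus actual."""
    return sum((a + b * xi) - yi for xi, yi in zip(x, y)) / len(x)


# Made-up data: one feature and the observed spacing in NM.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.5, 1.5, 3.0]

a0, b0 = fit_weighted_line(x, y, [1.0] * len(x))
a1, b1 = fit_conservative_line(x, y)
print(mean_error(a0, b0, x, y), mean_error(a1, b1, x, y))
```

On this toy data the unweighted fit has a mean error of essentially zero, while the reweighted fit is pulled towards the samples it previously over-predicted, so its mean error becomes negative, that is, more conservative.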
| Regression type | Model | Dataset | MAE | MSE | R² | p10 | p50 | p90 |
|---|---|---|---|---|---|---|---|---|
| Conventional LR | Model E | Train | 0.5009 | 0.404 | 0.962 | -0.809 | 0.0127 | 0.776 |
| Conventional LR | Model E | Test | 0.498 | 0.399 | 0.963 | -0.773 | 0.0174 | 0.794 |
| Conventional LR | Model W | Train | 0.5137 | 0.4227 | 0.9614 | -0.766 | 0.008 | 0.756 |
| Conventional LR | Model W | Test | 0.4845 | 0.3758 | 0.9659 | -0.715 | 0.0176 | 0.741 |
| Weighted LR | Model E | Train | 0.603 | 0.583 | 0.948 | -1.25 | -0.4 | 0.35 |
| Weighted LR | Model E | Test | 0.596 | 0.559 | 0.9502 | -1.21 | -0.41 | 0.35 |
| Weighted LR | Model W | Train | 0.576 | 0.529 | 0.953 | -1.18 | -0.39 | 0.34 |
| Weighted LR | Model W | Test | 0.561 | 0.514 | 0.955 | -1.12 | -0.38 | 0.32 |


Conventional linear regression models achieve better global performance, with lower MAE (around 0.50 NM) and higher R² values (~0.96), indicating strong overall accuracy and consistency between training and test datasets.
In contrast, weighted linear regression introduces a slight degradation in global metrics (MAE increasing to ~0.56–0.60 NM and R² slightly reduced), as expected from the applied weighting strategy. However, this comes with a significant change in the error distribution.
The weighted models effectively reduce large positive errors, as reflected by a substantial decrease in the upper percentile (p90 from ~0.74–0.79 down to ~0.32–0.35). This indicates a lower risk of non-conservative predictions, which are critical from an operational perspective. At the same time, the error distribution shifts towards more negative values (p50 moves from roughly 0 to about -0.4), meaning the model becomes more conservative by slightly underestimating spacing.
Overall, the results show a clear trade-off: conventional models optimize global accuracy, while weighted models prioritize operational safety by reducing the most critical error cases.
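For reference, the metrics discussed here can be computed from predicted and actual spacing as in the sketch below. It assumes that the error is predicted minus actual and that percentiles use linear interpolation between sorted values; the interpolation rule, the function names and the sample numbers are assumptions, since the exact definitions are not stated.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def error_report(predicted, actual):
    """MAE, MSE, R2 and signed-error percentiles (error = predicted - actual)."""
    errors = [p - a for p, a in zip(predicted, actual)]
    n = len(errors)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": ss_res / n,
        "R2": 1 - ss_res / ss_tot,
        "p10": percentile(errors, 10),
        "p50": percentile(errors, 50),
        "p90": percentile(errors, 90),
    }


# Made-up predicted and actual spacing values in NM.
report = error_report([1, 2, 3, 4, 5], [1.5, 2, 2.5, 4, 5])
print(report)
```

MAE, MSE and R² summarise overall accuracy, while the signed percentiles describe the shape of the error distribution: p90 captures the non-conservative tail and p10 the conservative one.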


