How to Build Your Own Fare-Prediction Model Using Commodity and Cargo Data


scanflight
2026-02-01 12:00:00

Build a DIY fare-prediction model merging commodity prices, air cargo stats and historical fares—tutorial with spreadsheets, 2026 trends, and alerts.

Beat rising fares: build a DIY fare-prediction model with commodity & cargo data

If unpredictable airfare is draining your travel budget, you don't need an ML team to get an edge. In 2026, airlines are reacting faster to supply shocks—cargo-first airlines, commodity swings and SAF costs—so travelers who model those signals can spot cheap booking windows earlier. This guide walks data-savvy travelers through a practical spreadsheet tutorial to combine commodity prices, air cargo statistics and historical fares into a simple, actionable fare-prediction model.

Why this matters in 2026

Late 2025 and early 2026 saw renewed volatility across commodity markets and a structural shift in airfreight: industrial shipments like aluminium coils surged into the U.S., reshaping belly-capacity allocation on passenger routes. Airline cost bases also changed with broader SAF adoption and new route-level cargo demand, both plausible drivers of fare swings. Put simply: air cargo trends and commodity prices can now act as leading signals for some passenger fares.

"Industrial airfreight growth—not just consumer parcels—has been a growing driver of belly-space allocation and route capacity in late 2025." — industry reporting

Quick overview: the model you'll build

You'll create a time-series regression in a spreadsheet (Google Sheets or Excel) using monthly or weekly observations. The dependent variable is fare (average price for a route / cabin). Predictors include:

  • Jet fuel / crude oil prices (direct cost influence).
  • Commodity prices relevant to trade on that route (aluminium, cotton, soy, wheat—where applicable).
  • Air cargo activity (tonnes shipped, freight tonne-kilometres, belly-capacity utilization): regional FTK and carrier data are available from IATA and national sources.
  • Demand & capacity proxies (Google Trends, seat capacity / ASK if available, days-to-departure).
  • Seasonality (month dummies, holiday flags).

Step 1 — Decide scope & frequency

Be explicit up front. Two choices determine everything:

  • Route granularity: A single route (e.g., LAX–JFK) is easiest. Multi-route panels require more work but give stronger models.
  • Time frequency: Weekly for short-term booking signals; monthly for longer-term trend predictions. Weekly captures volatile commodity shocks but increases data-collection effort.

Recommendation

Start with monthly data for a pilot (12–36 months), then move to weekly once you validate a baseline model.

Step 2 — Collect historical fares

For a DIY model you need a consistent fare series for your route/cabin. Options:

  • Use APIs: Kiwi Tequila, Skyscanner or RapidAPI flight endpoints can return historical snapshots or current search prices. These are easiest if you're comfortable with API keys and CSV exports.
  • Google Flights / ITA Matrix: manually sample the price graph on consistent days (e.g., first of each week) and record the average roundtrip fare for your date-window.
  • Browser automation: if you know Python, a simple Selenium script can query and record prices daily into a CSV—importable to Sheets.

For pilots without code, collect the median price from 7–14 days of searches per month to reduce sampling noise. Store columns: DateObserved, Route, Cabin, FareUSD, DaysToDeparture (if using), SampleSize.
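The monthly-median step above can be sketched in a few lines of Python. This is a minimal illustration, not a required part of the workflow: the function name `monthly_median_fares` and the sample fares are hypothetical, and the same result is achievable with MEDIAN over a filtered range in a spreadsheet.

```python
from collections import defaultdict
from statistics import median

def monthly_median_fares(samples):
    """Collapse (date_observed, fare_usd) samples into one median per month.

    samples: iterable of ("YYYY-MM-DD", float) pairs.
    Returns {"YYYY-MM": {"FareUSD": median, "SampleSize": n}}.
    """
    by_month = defaultdict(list)
    for date_observed, fare_usd in samples:
        by_month[date_observed[:7]].append(fare_usd)  # "YYYY-MM" key
    return {
        month: {"FareUSD": median(fares), "SampleSize": len(fares)}
        for month, fares in sorted(by_month.items())
    }

# Hypothetical sample values, for illustration only.
samples = [
    ("2025-03-03", 612.0), ("2025-03-10", 655.0), ("2025-03-17", 598.0),
    ("2025-04-07", 701.0), ("2025-04-14", 689.0),
]
monthly = monthly_median_fares(samples)
```

Keeping SampleSize next to each median makes it easy to spot months where the fare estimate rests on too few searches.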

Step 3 — Pull commodity and fuel data

Commodities and fuel are available from reliable public sources. Key datasets:

  • Jet fuel / crude: U.S. EIA (Weekly/Monthly spot prices), Brent/WTI via Yahoo Finance or GoogleFinance in Sheets.
  • Metals & agricultural commodities: Aluminium, cotton, corn, wheat, soy—download monthly futures settlement prices from CME Group or Nasdaq Data Link (formerly Quandl).
  • Currency / USD index: USD strength shifts trade volumes (a strong dollar tends to make U.S. imports cheaper and U.S. exports dearer); get DXY or a trade-weighted dollar index from FRED or financial APIs.

In Google Sheets you can use =GOOGLEFINANCE("CURRENCY:USDEUR") for FX (string arguments need double quotes) or import CSVs from Nasdaq Data Link. In Excel, use Power Query to pull and refresh CSV/JSON feeds.

Step 4 — Add air cargo statistics

Air cargo activity is the critical connection: if cargo demand rises, belly capacity shrinks and fares can rise on affected routes. Useful sources:

  • U.S. Bureau of Transportation Statistics (BTS): T-100 data reports monthly freight and mail carried (in pounds) by carrier and route, for domestic and international flights touching the U.S. Convert to tonnes for consistency with other series.
  • IATA: publishes monthly cargo performance and freight tonne-kilometres (FTK) at a regional level.
  • National port or customs agencies sometimes publish commodity-specific import airfreight volumes (e.g., aluminium coil shipments); industry reporting can surface these trends.

Collect monthly tonnes or FTK for your route or closest available market. If route-level data is unavailable, use regional FTK and note the limitation.

Step 5 — Build your merged dataset

Key data engineering steps in Sheets/Excel:

  1. Create a master date column (e.g., 2023-01, 2023-02).
  2. Align all series to that frequency (aggregate weekly -> monthly using average or end-of-month snapshot).
  3. Normalize currencies and units (convert commodity settlement to USD where needed).
  4. Create lagged variables (1-month, 2-month lags) for commodities and cargo—many effects are delayed.
  5. Fill missing data carefully: use forward-fill for short gaps or leave NA if too long; avoid heavy imputation for small samples.
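For readers who prefer to prepare the merged table outside the spreadsheet, the alignment, gap-filling and lagging steps above might look like the following sketch. The helper names (`to_monthly`, `forward_fill`, `lag`), the one-month gap limit and the sample numbers are assumptions for illustration.

```python
from statistics import mean

def to_monthly(weekly):
    """Average weekly (date, value) observations into {"YYYY-MM": mean}."""
    buckets = {}
    for date, value in weekly:
        buckets.setdefault(date[:7], []).append(value)
    return {m: mean(v) for m, v in buckets.items()}

def forward_fill(months, series, max_gap=1):
    """Fill gaps of up to max_gap months with the last seen value, else None."""
    out, last, gap = [], None, 0
    for m in months:
        if m in series:
            last, gap = series[m], 0
            out.append(last)
        else:
            gap += 1
            out.append(last if last is not None and gap <= max_gap else None)
    return out

def lag(values, k):
    """Shift a series k periods later, so row t holds the value from t-k."""
    return [None] * k + values[:-k] if k > 0 else list(values)

# Hypothetical weekly jet fuel prices with March missing.
months = ["2025-01", "2025-02", "2025-03", "2025-04"]
weekly = [("2025-01-06", 80.0), ("2025-01-20", 84.0),
          ("2025-02-03", 90.0), ("2025-04-07", 100.0)]
filled = forward_fill(months, to_monthly(weekly))
lagged = lag(filled, 1)
```

Rows where a lagged column is None should be excluded from the regression range rather than filled with zeros.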

Example columns

  • Date
  • AvgFareUSD (dependent)
  • JetFuel_USDperBarrel
  • Brent_USD
  • Aluminium_USDperTon (or futures close)
  • Cotton_USDperlb
  • FTK_Region (monthly freight tonne-km)
  • SeatCapacity_Est (optional)
  • GoogleTrendsIndex_Route
  • Month (1–12) and HolidayFlag (0/1)

Step 6 — Feature engineering (make signals usable)

How you transform raw series determines model power. Try these:

  • Percent changes (month-over-month % change) for commodities and FTK—often more predictive than levels.
  • Moving averages (3-month, 6-month) to reduce noise and capture trend direction.
  • Lags (t-1, t-2): cargo and commodity shocks take weeks/months to influence capacity and pricing.
  • Interaction terms: JetFuel * FTK can capture cost-pressure when cargo demand is high and fuel spikes.
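  • Seasonality dummies (one 0/1 column per month, e.g. Jan=1 in January rows and 0 otherwise; omit one month as the baseline so the dummies are not collinear with the intercept) or sin/cos transforms if you prefer continuous seasonality.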
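The transforms in this list are simple enough to express directly. The sketch below is one possible Python rendering; the function names are illustrative, and treating January as the omitted baseline month is an assumption (any month works).

```python
import math

def pct_change(values):
    """Period-over-period % change; the first entry has no predecessor."""
    return [None] + [
        (cur - prev) / prev * 100.0 for prev, cur in zip(values, values[1:])
    ]

def moving_average(values, window):
    """Trailing moving average; None until a full window is available."""
    return [
        sum(values[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(values))
    ]

def interaction(a, b):
    """Element-wise product, e.g. JetFuel * FTK; None if either side is missing."""
    return [None if x is None or y is None else x * y for x, y in zip(a, b)]

def month_dummies(month):
    """11 dummies for Feb..Dec; January is the omitted baseline (assumption)."""
    return [1 if month == m else 0 for m in range(2, 13)]

def month_sin_cos(month):
    """Continuous alternative: position of the month on a 12-month circle."""
    angle = 2 * math.pi * (month - 1) / 12
    return math.sin(angle), math.cos(angle)
```

For example, `pct_change([100.0, 110.0, 99.0])` gives no value for the first period, then a 10% rise and a 10% fall.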

Step 7 — Build the regression in your spreadsheet

Keep the first model simple: linear regression with a handful of predictors. In Google Sheets use LINEST; in Excel use LINEST or Data > Data Analysis > Regression (requires the Analysis ToolPak add-in).

Example formula (Google Sheets):

=LINEST(B2:B37, {C2:C37,D2:D37,E2:E37,F2:F37}, TRUE, TRUE)

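Where B is AvgFareUSD, and C–F are predictors (JetFuel, Aluminium_pctChange_lag1, FTK_pctChange_lag1, GoogleTrendsIndex). Note that LINEST returns coefficients in reverse column order (F first, then E, D, C), with the intercept last.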
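To see what the spreadsheet regression is doing, or to cross-check its coefficients, here is a dependency-free least-squares sketch. It solves the normal equations directly, which is adequate for a handful of predictors but is not how production statistics libraries do it; the function names and toy data are hypothetical.

```python
def ols(y, X):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    y: list of floats; X: list of predictor rows (no intercept column).
    Returns [intercept, b1, b2, ...] in the same order as the columns of X.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept term
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, row):
    """Apply fitted coefficients to one row of predictors."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))

# Toy data generated from y = 2 + 3*x1 - 1*x2, for illustration only.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [3, 7, 6, 11, 9]
coefs = ols(y, X)
```

Unlike LINEST, this returns the intercept first and the slopes in column order, so compare positions carefully when checking one against the other.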

Interpreting output

  • Coefficients: the sign and magnitude tell you direction & sensitivity (e.g., $X increase in fare per $1/barrel fuel rise).
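  • R-squared: expect modest values for fares. An R² of 0.4–0.6 for route-level monthly data is strong.
  • p-values: LINEST reports standard errors rather than p-values. Divide each coefficient by its standard error to get a t-statistic, then convert it with T.DIST.2T using the residual degrees of freedom that LINEST returns (Excel's Regression tool reports p-values directly). Drop variables with p > 0.1 unless they are theoretically necessary.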

Step 8 — Backtest and validate

Split your data chronologically: train on the earliest 70–80% and test on the remainder. In the test set compute:

  • MAE (mean absolute error)
  • MAPE (mean absolute percentage error) — useful to compare across routes
  • Direction accuracy (did the model correctly predict up/down?)
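These three metrics are short enough to define explicitly. The sketch below uses one common convention for direction accuracy (comparing the predicted and actual move against the previous actual fare); that convention and the sample numbers are assumptions, so adapt them to your own definition.

```python
def mae(actual, predicted):
    """Mean absolute error, in fare units (USD)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def direction_accuracy(actual, predicted):
    """Share of periods where the predicted move (vs. the previous actual fare)
    has the same up/down direction as the actual move."""
    hits = total = 0
    for prev, a, p in zip(actual, actual[1:], predicted[1:]):
        total += 1
        hits += (a - prev > 0) == (p - prev > 0)
    return hits / total

# Hypothetical test-set fares, for illustration only.
actual = [600.0, 630.0, 615.0, 650.0]
predicted = [590.0, 620.0, 640.0, 640.0]
test_mae = mae(actual, predicted)
test_mape = mape(actual, predicted)
test_dir = direction_accuracy(actual, predicted)
```

Here the model misses the direction once (it predicts a rise in the third period when fares fell), so direction accuracy is two out of three.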

Keep a simple rule of thumb: if your model predicts a price drop of more than 8–10% and the test-set MAE is clearly smaller than that predicted drop, consider waiting to book.

Step 9 — Turn model output into booking signals

Convert predicted fare and confidence interval into actionable alerts:

  • Green: predicted drop >8% with low uncertainty — wait and set a price alert.
  • Yellow: predicted small change (<8%) — monitor day-to-day and consider flexible dates.
  • Red: predicted rise >8% or high volatility — book now if travel is certain.

Use conditional formatting in Sheets to visualize signals and auto-email via Apps Script when thresholds are met.
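The Green/Yellow/Red rules can be written as a single function. In this sketch the 8% threshold comes from the rules above, while `max_uncertainty_pct` (the cut-off for "high volatility") and the use of a confidence-interval half-width are assumptions you should tune to your own backtest.

```python
def booking_signal(current_fare, predicted_fare, interval_halfwidth,
                   threshold_pct=8.0, max_uncertainty_pct=5.0):
    """Map a fare forecast to a Green / Yellow / Red booking signal.

    interval_halfwidth: half the width of the forecast confidence interval, in USD.
    max_uncertainty_pct: assumed cut-off above which the forecast is treated as
    too volatile to act on (not a value from the article).
    """
    change_pct = (predicted_fare - current_fare) / current_fare * 100.0
    uncertainty_pct = interval_halfwidth / current_fare * 100.0
    if uncertainty_pct > max_uncertainty_pct:
        return "Red"      # high volatility: book now if travel is certain
    if change_pct > threshold_pct:
        return "Red"      # predicted rise above the threshold
    if change_pct < -threshold_pct:
        return "Green"    # predicted drop with low uncertainty: wait
    return "Yellow"       # small predicted change: monitor
```

For instance, a forecast of $540 against a current $600 fare with a $10 half-width returns "Green", while the same forecast with a $60 half-width returns "Red".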

Advanced tips & common pitfalls

1) Lag selection matters

Commodity and cargo signals often act with a lag. Test 1–3 month lags and use cross-correlation plots (manual or via the CORREL function across shifted ranges) to find lead/lag relationships.
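The shifted-range CORREL approach translates to a small loop. This is a sketch under the assumption that both series are aligned, equally spaced and free of gaps; the function names and the toy series (built so fares follow the predictor by two periods) are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_correlations(predictor, fare, max_lag=3):
    """Correlation of fare[t] with predictor[t-k] for k = 0..max_lag."""
    return {
        k: pearson(predictor[: len(predictor) - k], fare[k:])
        for k in range(max_lag + 1)
    }

# Toy series: from the third period on, fare = 500 + 10 * predictor two periods earlier.
predictor = [1, 3, 2, 5, 4, 6, 8, 7]
fare = [520, 515, 510, 530, 520, 550, 540, 560]
corrs = lagged_correlations(predictor, fare)
best_lag = max(corrs, key=corrs.get)
```

With only 12–36 observations, each extra lag removes a data point, so treat small differences between lags with caution.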

2) Beware multicollinearity

Jet fuel and crude are correlated; metals can correlate with macro USD moves. If coefficient signs flip or p-values inflate, drop or combine correlated predictors (e.g., construct a single CommodityPressureIndex using normalized z-scores).
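One way to build such an index is to z-score each series and average them. The equal weighting below is an assumption, as are the function names and sample prices; weights based on cost shares would be an equally reasonable choice.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a series to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def commodity_pressure_index(*series):
    """Equal-weight average of z-scored series, one value per period."""
    standardized = [zscores(s) for s in series]
    return [mean(col) for col in zip(*standardized)]

# Hypothetical monthly prices, for illustration only.
jet = [80.0, 90.0, 100.0]
brent = [70.0, 80.0, 90.0]
index = commodity_pressure_index(jet, brent)
```

Because the z-scores use the mean and standard deviation of the whole sample, compute them on the training window only when backtesting to avoid leaking future information.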

3) Use robust validation

Time-series cross-validation (rolling or expanding windows) is better than a single train/test split. In a spreadsheet, simulate by retraining on expanding windows and recording test errors. To speed up validation, prune noisy data sources first.
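An expanding-window backtest can be sketched as a loop that refits on all rows before period t and scores a one-step-ahead forecast for t. The `fit` and `predict` callables, the `min_train` default and the placeholder mean model below are assumptions; swap in your own regression.

```python
def expanding_window_errors(y, X, fit, predict, min_train=24):
    """Refit on rows [0, t) and record the absolute error of the forecast for row t."""
    errors = []
    for t in range(min_train, len(y)):
        model = fit(y[:t], X[:t])
        errors.append(abs(y[t] - predict(model, X[t])))
    return errors

# Placeholder model (assumption): forecast the training-set mean, ignoring predictors.
def fit_mean(y_train, X_train):
    return sum(y_train) / len(y_train)

def predict_mean(model, row):
    return model

y = [100.0, 110.0, 120.0, 130.0]
X = [[0.0]] * 4
errors = expanding_window_errors(y, X, fit_mean, predict_mean, min_train=2)
```

Averaging `errors` gives an out-of-sample MAE that is usually more honest than a single holdout split, since every forecast uses only data available at the time.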

4) Include events and structural shifts

Late 2025 saw structural changes (industrial airfreight surges). Add binary flags for events (e.g., major port strike, SAF mandate start date) so the model can account for sudden breaks.
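Event flags are just 0/1 columns keyed on the date. A minimal sketch, assuming month labels in "YYYY-MM" format (which sort correctly as strings); the function name and the dates used are hypothetical.

```python
def event_flag(months, start, end=None):
    """1 for months in [start, end] inclusive, else 0; open-ended if end is None."""
    return [1 if m >= start and (end is None or m <= end) else 0 for m in months]

months = ["2025-10", "2025-11", "2025-12", "2026-01"]
strike_flag = event_flag(months, "2025-11", "2025-12")  # temporary shock
mandate_flag = event_flag(months, "2025-12")            # permanent level shift
```

A bounded flag suits temporary shocks such as a strike, while an open-ended flag suits a lasting change such as a mandate taking effect.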

Case study (practical walkthrough)

Example: LAX → FRA (Los Angeles to Frankfurt) monthly model, Jan 2023–Dec 2025 (36 months)

  1. Collected median monthly roundtrip fares via Skyscanner API.
  2. Downloaded monthly Brent and EIA Jet-Kerosene indices, aluminium futures, and IATA regional FTK data (Europe–North America).
  3. Created 1-month lag for aluminium and FTK, 3-month moving avg for jet fuel.
  4. Ran regression: AvgFare = β0 + β1*Jet3MA + β2*Alu_pct_lag1 + β3*FTK_pct_lag1 + β4*MonthDummies + ε.

Results summary:

  • Jet fuel coefficient positive and significant: a $5/barrel rise in fuel -> ~$6–$9 increase in average route fare.
  • FTK_pct_lag1 positive and significant: a 10% rise in cargo FTK was associated with a roughly $15 rise in fares the following month.
  • Aluminium_pct_lag1 mattered mainly in the months when U.S.-bound aluminium shipments surged (late 2025), consistent with industry reporting that industrial imports reshaped airfreight patterns.

Backtest held: MAPE ~12% and direction accuracy ~68%—sufficient to generate conservative booking signals.

How to operationalize: alerts, dashboards and automation

Automation tips to make this a usable tool:

  • Use Sheets' IMPORTDATA or Power Query to refresh commodity CSVs weekly.
  • Automate fare pulls with a flight-search API or a low-frequency headless search (once daily).
  • Create a dashboard: include predicted fare, confidence band, recent cargo and commodity changes and a color-coded booking suggestion.
  • Set up email or SMS alerts via Apps Script (Sheets) or Power Automate (Excel) when the signal flips to Green/Red. Keep integrations minimal until the signal has proven useful.

What to expect—limits and realistic gains

Fares are noisy: events like schedule changes, inventory moves from revenue management, or airline promotions are hard to predict with public macro signals alone. Expect your spreadsheet model to provide probabilistic guidance, not certainties. Good use cases:

  • Deciding to wait vs. book for non-urgent discretionary travel.
  • Finding routes likely to spike because of cargo-driven capacity pressure.
  • Spotting longer-term trends to time multi-month bookings.

Ongoing trends you can incorporate now:

  • SAF pass-throughs: Governments and airlines are increasingly passing SAF-linked costs to tickets, so track regulatory milestones and SAF price indices.
  • Industrial airfreight: Expect more metal and high-value parts to be flown as supply-chain reshoring continues. Monitor trade press and monthly import data to capture route-specific shocks.
  • Data democratization: More freight pricing portals and APIs emerged by late-2025, making cargo rate signals easier to ingest into consumer models.

Sources & further reading

Use official datasets where possible: U.S. Bureau of Transportation Statistics (T-100), IATA cargo reports, U.S. EIA fuel prices, and futures data from CME/Nasdaq Data Link. Industry reporting in late 2025 documented increased aluminium imports flown into the U.S., demonstrating how industrial demand can reshape belly capacity and passenger fares.

Simple checklist to get started (30–90 minutes)

  1. Pick a route and time frequency (weekly or monthly).
  2. Grab 12–36 months of historical fares via an API or manual sampling.
  3. Download monthly Brent & jet fuel, and 1–2 commodity futures series relevant to the route.
  4. Pull monthly FTK or freight-volume data for the region or route from IATA or BTS.
  5. Create the merged sheet, compute % changes and lags, and run a quick LINEST regression.
  6. Backtest with a holdout sample — if MAPE <15% and direction accuracy >60%, build alerts.

Final actionable takeaways

  • Model what you can measure: cargo and commodity shifts are measurable leading indicators for fare pressure on certain routes.
  • Keep models simple: a handful of well-engineered predictors often beats a noisy overfit model in spreadsheets.
  • Automate incrementally: start manual, validate signal value, then automate data pulls and alerts.
  • Use predictions as signals, not certainties: combine your model’s output with fare rules, seat sales calendars, and personal flexibility.

Ready to build yours?

If you want a head start, we built a starter spreadsheet template with example feeds, formulas and a sample LAX–FRA model tuned on 2023–2025 data. Download the template, adapt it to your routes, and sign up for our weekly scan that highlights routes where commodity and cargo signals predict upcoming fare moves.

Sign up at scanflight.direct to get the template and weekly route alerts—start making data work for your next booking.
