Forelore reports a "high-confidence floor": the number of pleasant outdoor hours that was met or exceeded in 24 of the 25 observed years for a given venue × date pair. We communicate this externally as ~90% one-sided confidence, derived from order statistic theory under the Weibull plotting position and adjusted for climate autocorrelation (the effective sample size drops from 25 to ~18–22 because of multi-year cycles such as ENSO and AMO).
1. The order statistic foundation
1.1 Definition
For any venue × date, Forelore computes a per-year value (pleasant daytime hours, after applying the event-window operator — 6/7 for vacation mode, 3/3 for wedding mode). This yields 25 yearly values (years 2001–2025 inclusive).
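The "high-confidence floor" is defined as the second-lowest of these 25 yearly values, i.e. the value that was met or exceeded in 24 of the 25 years.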
In order statistic notation: X_(2) where X_(1) ≤ X_(2) ≤ … ≤ X_(25).
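As a minimal sketch of this definition, the floor can be read off a sorted copy of the per-year series. The function name `high_confidence_floor` and the sample values are hypothetical and chosen only for illustration; they are not Forelore data.

```python
def high_confidence_floor(yearly_values):
    """Return the second-lowest value, X_(2), of the per-year series."""
    if len(yearly_values) < 2:
        raise ValueError("need at least two yearly values")
    ordered = sorted(yearly_values)  # X_(1) <= X_(2) <= ... <= X_(n)
    return ordered[1]                # index 1 holds the second-lowest value


# Made-up series of 25 yearly values (hours), for illustration only.
example_hours = [12.0, 0.0, 11.5, 13.0, 0.5] + [10.0 + 0.1 * i for i in range(20)]

floor = high_confidence_floor(example_hours)
years_at_or_above = sum(v >= floor for v in example_hours)
print(floor, years_at_or_above)  # 0.5 24
```

In this example the single worst year (0.0h) is dropped and the remaining 24 years all sit at or above the reported floor.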
1.2 Quantile interpretation
For n iid (independent and identically distributed) samples from an underlying distribution F, the k-th order statistic X_(k) is a point estimator of the p-quantile of F, where p depends on the plotting position formula:
| Formula | p for k=2, n=25 | Source |
|---|---|---|
| Weibull | k / (n+1) = 2/26 = 7.69% | Weibull (1939); standard in hydrology, climatology |
| Hazen | (k − 0.5) / n = 1.5/25 = 6.00% | Hazen (1914) |
| Cunnane | (k − 0.4) / (n + 0.2) = 1.6/25.2 ≈ 6.35% | Cunnane (1978); often preferred for extremes |
We adopt the Weibull plotting position as the canonical choice — it is the standard in extreme value statistics for atmospheric and hydrological data (Wilks 2011, §5.6).
One-sided lower confidence = 1 − p:
- Weibull: 1 − 0.0769 = 92.3%
- Hazen: 1 − 0.06 = 94.0%
- Cunnane: 1 − 0.0635 = 93.7%
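Under the iid assumption, the Forelore floor corresponds to a ~92% one-sided lower confidence bound on the distribution of pleasant-hours values at that venue × date: a new year drawn from the same distribution is expected to meet or exceed the floor about 92% of the time.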
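The arithmetic in the table and the list above can be reproduced with a short sketch. The function name `plotting_positions` is an illustrative choice, not part of any Forelore code.

```python
def plotting_positions(k, n):
    """Return the plotting position p of the k-th order statistic of n samples."""
    return {
        "Weibull": k / (n + 1),
        "Hazen": (k - 0.5) / n,
        "Cunnane": (k - 0.4) / (n + 0.2),
    }


for name, p in plotting_positions(k=2, n=25).items():
    # One-sided lower confidence is taken as 1 - p, as in the text.
    print(f"{name}: p = {p:.2%}, confidence = {1 - p:.1%}")
# Weibull: p = 7.69%, confidence = 92.3%
# Hazen: p = 6.00%, confidence = 94.0%
# Cunnane: p = 6.35%, confidence = 93.7%
```

The three formulas differ by less than two percentage points at k=2, n=25, so the choice of plotting position matters less than the autocorrelation adjustment discussed next.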
2. The climate autocorrelation adjustment
The iid assumption is violated by climate data. Successive years are not statistically independent because of multi-year climate modes (ENSO, AMO, NAO, PDO, sub-decadal variability).
2.1 Why this matters
For positively autocorrelated samples, the effective sample size N_eff is smaller than the nominal N. This lowers the confidence that can be attached to any order statistic computed from the raw N values.
The standard correction (Bretherton et al. 1999; Wilks 2011, §5.2.4):
N_eff ≈ N × (1 − ρ) / (1 + ρ)
where ρ is the lag-1 autocorrelation of the per-year series.
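A minimal sketch of this correction is shown below. It assumes the common sample estimator of lag-1 autocorrelation (lagged products divided by the full-series sum of squares); the source does not say which estimator Forelore uses, and both function names are hypothetical.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return num / denom


def effective_sample_size(n, rho):
    """N_eff = N * (1 - rho) / (1 + rho), the lag-1 correction from the text."""
    return n * (1 - rho) / (1 + rho)


print(round(effective_sample_size(25, 0.20), 2))  # 16.67
```

With only 25 points the estimate of ρ is itself noisy, so in practice it would be worth checking how much N_eff moves across a plausible range of ρ rather than relying on a single value.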
2.2 Empirical estimate for Forelore
Per-year window-floor values exhibit modest autocorrelation. Typical lag-1 ρ for tropical and sub-tropical pleasant-hours series at sub-monthly resolution is 0.15–0.25 (based on ENSO and Atlantic Multidecadal Oscillation signatures filtering down to daily rain and thermal-comfort metrics).
For ρ = 0.20:
- Lag-1 only (conservative): N_eff ≈ 25 × 0.8 / 1.2 ≈ 16.67
- Realistic with higher-lag corrections: N_eff ≈ 18–22
2.3 Impact on confidence
With N_eff ≈ 20 (midpoint of the 18–22 range), we substitute N_eff for n in the Weibull formula: the 2nd-lowest value corresponds to a plotting position of 2/(20+1) ≈ 9.5%, i.e., a one-sided confidence of ~90%.
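To see how sensitive the headline figure is to the choice of effective sample size, the same substitution can be evaluated over a range of values. This is a sketch under the assumption stated in the text (N_eff simply replaces n in the Weibull formula); the function name is illustrative.

```python
def one_sided_confidence(k, n_eff):
    """1 - k / (n_eff + 1): Weibull plotting position with n replaced by n_eff."""
    return 1 - k / (n_eff + 1)


for n_eff in (16.67, 18, 20, 22, 25):
    print(f"N_eff = {n_eff}: confidence = {one_sided_confidence(2, n_eff):.1%}")
```

Across N_eff = 18–22 the result stays between roughly 89.5% and 91.3%, and at N_eff = 20 it is about 90.5%.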
3. Why we report 90%
3.1 Why round down
We report "~90% one-sided confidence" rather than the higher 92% (iid theoretical) for three reasons:
- Conservativeness: choosing the lower end is consistent with Forelore's "audited, not averaged" stance: we err toward under-stating, not over-stating, confidence.
- Memorability: "~90% confidence" is a round, easily-communicated number. Planners and advisors retain "90%" better than "92.3%".
- Buffer against unknown unknowns: Effective-N estimates carry their own uncertainty (we estimate ρ from finite data). The 90% figure absorbs that uncertainty in the planner's favor.
3.2 Customer interpretation
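The "90% one-sided confidence floor" can be interpreted by a non-statistical customer as: in roughly 9 out of 10 years, this venue and date can be expected to deliver at least the floor number of pleasant hours.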
4. Why 24/25 and not 25/25
We deliberately drop the single worst year (24/25, i.e. the 2nd-lowest value) rather than report the absolute minimum (25/25), for the following reasons:
- Robustness to single-event outliers: A single tropical storm, satellite data gap, or model-resolution artifact can produce a year-min of 0.0h that is not representative of the underlying climate distribution. Dropping the worst year reduces this brittleness.
- Standard practice in climate risk reporting: The IPCC, World Bank, and major reinsurance models use percentile-based floors (typically 5th–10th percentile) rather than absolute minima for similar reasons.
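The robustness argument can be illustrated with a small sketch that compares the minimum and the second-lowest value before and after a single artifact year is introduced. The series below is made up for illustration and is not Forelore data; `lowest_two` is a hypothetical helper.

```python
def lowest_two(yearly_values):
    """Return (minimum, second-lowest) of a per-year series."""
    ordered = sorted(yearly_values)
    return ordered[0], ordered[1]


# Made-up series of 25 yearly values (hours).
clean = [10.0, 10.5, 11.0, 11.5, 12.0] * 5
# Same series with one year replaced by a 0.0h reading (e.g. a data gap).
with_artifact = clean[:-1] + [0.0]

print(lowest_two(clean))          # (10.0, 10.0)
print(lowest_two(with_artifact))  # (0.0, 10.0)
```

A single bad reading drags the absolute minimum from 10.0h to 0.0h, while the second-lowest value is unchanged. Two bad years would still move the 24/25 floor, which is the intended behavior for a date that is repeatedly poor.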
5. ENSO robustness
5.1 The concern
The 25-year sample window (2001–2025) does not necessarily contain ENSO phases (El Niño / La Niña / Neutral) at their long-run climatological frequencies. If our sample under-represents the ENSO phase that drives bad outcomes at a given venue × date, the audited floor could be optimistically biased.
5.2 Sample composition
| ENSO phase | Count in sample | Sample frequency | Long-run climatology (1950–2023) |
|---|---|---|---|
| El Niño | 5 | 20% | ~28–32% |
| La Niña | 3 | 12% | ~22–28% |
| Neutral | 17 | 68% | ~40–50% |
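Our 25-year window under-represents La Niña years by roughly half (12% vs ~25% long-run). El Niño is also somewhat under-represented (20% vs ~28–32%) and Neutral is over-represented (68% vs ~40–50%). The bias lies in how often each phase appears in the sample, not in how any individual year's ENSO state is classified.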
5.3 Empirical test — Bahamas (Apr–May)
For the Harbor Island Wedding Audit sample, we stratified the per-year window-floor values by ENSO phase:
| Date | Phase | N | Mean (h) | Median (h) | Min (h) |
|---|---|---|---|---|---|
| Apr 30 | El Niño | 5 | 9.8 | 10.5 | 3.5 |
| Apr 30 | La Niña | 3 | 6.2 | 6.0 | 0.0 |
| Apr 30 | Neutral | 17 | 11.3 | 12.5 | 0.5 |
| May 8 | El Niño | 5 | 13.2 | 13.0 | 12.0 |
| May 8 | La Niña | 3 | 13.0 | 14.0 | 11.0 |
| May 8 | Neutral | 17 | 12.7 | 13.5 | 4.5 |
ENSO has a large effect on the worst date (Apr 30: the La Niña median is 6.5h below the Neutral median, or −46% on a 14h scale) but no detectable effect on the protected date (May 8). The Apr 30 effect does not clear α=0.05 (Mann-Whitney p=0.121, Kruskal-Wallis p=0.162). With only N=3 La Niña years the tests are underpowered, so the non-significant result should not be read as evidence that the effect is small.
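The stratification behind a table like the one above could be sketched as follows. The record layout (a list of phase and hours pairs), the phase labels, and the sample values are all assumptions made for illustration; the significance tests quoted above are not reproduced here.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by_phase(records):
    """Group (phase, hours) pairs by phase and compute N, mean, median, min."""
    groups = defaultdict(list)
    for phase, hours in records:
        groups[phase].append(hours)
    return {
        phase: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
        }
        for phase, values in groups.items()
    }


# Made-up records, not Forelore data.
records = [
    ("EN", 10.0), ("EN", 4.0),
    ("LN", 6.0), ("LN", 0.0), ("LN", 12.0),
    ("Neut", 13.0), ("Neut", 11.0),
]
for phase, stats in summarize_by_phase(records).items():
    print(phase, stats)
```

Reporting N alongside each statistic keeps the small La Niña cohort visible whenever the per-phase numbers are quoted.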
5.4 Bootstrap re-weighting
To test whether the 24/25 floor is robust to ENSO sample bias, we resampled the 25 years 10,000 times using climatological ENSO weights (30% EN, 25% LN, 45% Neut) instead of observed-sample weights:
| Date | Observed 24/25 floor | Re-weighted mean floor | Re-weighted P5–P95 |
|---|---|---|---|
| Apr 30 | 0.5h | 0.8h | 0.0–3.5h |
| May 8 | 10.5h | 10.1h | 4.5–11.0h |
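One way such a re-weighted bootstrap could be implemented is sketched below. It assumes each year is drawn with probability equal to its phase's climatological weight divided by the number of sample years in that phase, and that the floor of each resample is its second-lowest value; the source does not spell out these details. The function name and the input values are hypothetical.

```python
import random


def reweighted_floor_bootstrap(values_by_phase, phase_weights,
                               n_draws=25, n_boot=10_000, seed=0):
    """Bootstrap the second-lowest value under target ENSO phase weights.

    Each year is drawn with probability (phase weight / years in that phase),
    so the expected phase mix of a resample matches `phase_weights`.
    """
    rng = random.Random(seed)
    pool, probs = [], []
    for phase, values in values_by_phase.items():
        for v in values:
            pool.append(v)
            probs.append(phase_weights[phase] / len(values))

    floors = []
    for _ in range(n_boot):
        resample = rng.choices(pool, weights=probs, k=n_draws)
        floors.append(sorted(resample)[1])  # second-lowest of the resample

    floors.sort()
    return {
        "mean": sum(floors) / n_boot,
        "p5": floors[int(0.05 * (n_boot - 1))],
        "p95": floors[int(0.95 * (n_boot - 1))],
    }


# Made-up per-year hours grouped by phase (5 EN, 3 LN, 17 Neut); not Forelore data.
hypothetical = {
    "EN": [3.5, 9.0, 10.5, 12.0, 14.0],
    "LN": [0.0, 6.0, 12.5],
    "Neut": [0.5, 8.0, 9.5, 10.0, 11.0, 11.5, 12.0, 12.5, 12.5,
             13.0, 13.0, 13.5, 13.5, 14.0, 14.0, 14.0, 14.0],
}
weights = {"EN": 0.30, "LN": 0.25, "Neut": 0.45}
summary = reweighted_floor_bootstrap(hypothetical, weights)
print(summary)
```

Fixing the seed makes the run repeatable. Because only three La Niña years are available, up-weighting that phase mostly repeats the same three values, so the re-weighted spread should be read with that limitation in mind.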
5.5 Residual risk in extreme-ENSO years
For a future strong La Niña year on a date already flagged as risky (e.g., Apr 30 Bahamas), the conditional residual risk is higher than the headline ~10%. Empirically, 1 of 3 La Niña years in the sample fell below the 0.5h Apr 30 floor — a conditional residual risk closer to ~20–35% in strong-LN years.
This is not a flaw in the methodology — it is a property of the underlying climate. The 0.5h floor for Apr 30 is correctly communicating "this is a high-risk date." A customer who sees the floor and reroutes to a protected window (e.g., May 8) is using the audit exactly as intended.
6. References
- David, H. A., and Nagaraja, H. N. (2003). Order Statistics, 3rd ed. Hoboken, NJ: Wiley.
- Weibull, W. (1939). "A Statistical Theory of the Strength of Materials." Ingeniörsvetenskapsakademiens Handlingar, No. 151.
- Hazen, A. (1914). "Storage to be provided in impounding reservoirs for municipal water supply." Transactions of the ASCE, 77, 1539–1640.
- Cunnane, C. (1978). "Unbiased plotting positions — A review." Journal of Hydrology, 37(3–4), 205–222.
- Bretherton, C. S., Widmann, M., Dymnikov, V. P., Wallace, J. M., and Bladé, I. (1999). "The effective number of spatial degrees of freedom of a time-varying field." Journal of Climate, 12(7), 1990–2009.
- Wilks, D. S. (2011). Statistical Methods in the Atmospheric Sciences, 3rd ed. Academic Press. Chapter 5.
- IPCC (2021). Climate Change 2021: The Physical Science Basis. WG-I to the Sixth Assessment Report. Cambridge University Press. Chapters 11–12 use order statistics and percentile-based floors throughout.
- Bröde, P., Fiala, D., Błażejczyk, K., et al. (2012). "Deriving the operational procedure for the Universal Thermal Climate Index (UTCI)." International Journal of Biometeorology, 56(3), 481–494.
- Di Napoli, C., Barnard, C., Prudhomme, C., Cloke, H. L., and Pappenberger, F. (2021). "ERA5-HEAT: A global gridded historical dataset of human thermal comfort indices from climate reanalysis." Geoscience Data Journal, 8(1), 2–10. DOI: 10.1002/gdj3.102.