Forelore — Methodology · The 90% confidence floor explained

TL;DR

Forelore reports a "high-confidence floor": the number of pleasant outdoor hours that was matched or exceeded in 24 of 25 observed years for a given venue × date pair. We communicate this externally as ~90% one-sided confidence. The figure is derived from order statistic theory under the Weibull plotting position and adjusted for climate autocorrelation, which reduces the effective sample size from 25 to ~18–22 because of multi-year cycles such as ENSO and AMO.

1. The order statistic foundation

1.1 Definition

For any venue × date, Forelore computes a per-year value (pleasant daytime hours, after applying the event-window operator — 6/7 for vacation mode, 3/3 for wedding mode). This yields 25 yearly values (years 2001–2025 inclusive).

The "high-confidence floor" is defined as:

The 2nd-lowest value of the 25 yearly observations, i.e., the value that 24 of 25 observed years matched or exceeded.

In order statistic notation: X_(2) where X_(1) ≤ X_(2) ≤ … ≤ X_(25).
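
As a minimal sketch of this definition, the floor can be computed by sorting the yearly values and taking the second entry. The function names and the sample series below are hypothetical and only illustrate the order statistic; they are not Forelore's production code or data.

```python
def high_confidence_floor(yearly_values):
    """Return X_(2), the 2nd-lowest of the yearly observations."""
    if len(yearly_values) < 2:
        raise ValueError("need at least two yearly values")
    ordered = sorted(yearly_values)
    return ordered[1]  # index 0 is X_(1), index 1 is X_(2)


def years_at_or_above(yearly_values, floor):
    """Count the years whose value matched or exceeded the floor."""
    return sum(1 for value in yearly_values if value >= floor)


# Hypothetical 25-year series of pleasant hours (illustrative values only).
sample = [0.0, 0.5] + [8.0 + 0.25 * i for i in range(23)]

floor = high_confidence_floor(sample)
print(floor, years_at_or_above(sample, floor))
```

With 25 distinct values, exactly 24 years sit at or above the returned floor, which is the "24 of 25" property stated above.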

1.2 Quantile interpretation

For n iid (independent and identically distributed) samples from an underlying distribution F, the k-th order statistic X_(k) is a point estimator of the p-quantile of F where p depends on the plotting position formula:

Formula	p for k=2, n=25	Source
Weibull	k / (n+1) = 2/26 = 7.69%	Weibull (1939); standard in hydrology, climatology
Hazen	(k − 0.5) / n = 1.5/25 = 6.00%	Hazen (1914)
Cunnane	(k − 0.4) / (n + 0.2) = 1.6/25.2 ≈ 6.35%	Cunnane (1978); often preferred for extremes

We adopt the Weibull plotting position as the canonical choice — it is the standard in extreme value statistics for atmospheric and hydrological data (Wilks 2011, §5.6).

One-sided lower confidence = 1 − p:

Weibull: 1 − 0.0769 = 92.3%
Hazen: 1 − 0.06 = 94.0%
Cunnane: 1 − 0.0635 = 93.7%

Under the iid assumption, the Forelore floor corresponds to a ~92% one-sided lower confidence bound on the true distribution of pleasant-hours values at that venue × date.
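
The arithmetic in the table can be reproduced with a few lines of Python. This is only a sketch of the three plotting-position formulas as written above; the function names are illustrative.

```python
def weibull(k, n):
    """Weibull plotting position: k / (n + 1)."""
    return k / (n + 1)


def hazen(k, n):
    """Hazen plotting position: (k - 0.5) / n."""
    return (k - 0.5) / n


def cunnane(k, n):
    """Cunnane plotting position: (k - 0.4) / (n + 0.2)."""
    return (k - 0.4) / (n + 0.2)


# k = 2 (2nd-lowest) and n = 25 (years), as in the text.
for name, formula in [("Weibull", weibull), ("Hazen", hazen), ("Cunnane", cunnane)]:
    p = formula(2, 25)
    print(f"{name}: p = {p:.2%}, one-sided confidence = {1 - p:.1%}")
# Weibull: p = 7.69%, one-sided confidence = 92.3%
# Hazen: p = 6.00%, one-sided confidence = 94.0%
# Cunnane: p = 6.35%, one-sided confidence = 93.7%
```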

2. The climate autocorrelation adjustment

The iid assumption is violated by climate data. Successive years are not statistically independent because of multi-year climate modes (ENSO, AMO, NAO, PDO, sub-decadal variability).

2.1 Why this matters

For positively autocorrelated samples, the effective sample size N_eff is less than the nominal N. This degrades the confidence of any order statistic computed on the raw N.

The standard correction (Bretherton et al. 1999; Wilks 2011, §5.2.4):

N_eff ≈ N × (1 − ρ) / (1 + ρ)

where ρ is the lag-1 autocorrelation of the per-year series.

2.2 Empirical estimate for Forelore

Per-year window-floor values exhibit modest autocorrelation. Typical lag-1 ρ for tropical and sub-tropical pleasant-hours series at sub-monthly resolution is 0.15–0.25 (based on ENSO and Atlantic Multidecadal Oscillation signatures filtering down to daily rain and thermal-comfort metrics).

For ρ = 0.20:

N_eff ≈ 25 × 0.8 / 1.2 = 16.67 (lag-1 only, conservative)
Realistic with higher-lag corrections: N_eff ≈ 18–22
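
The lag-1 figure can be checked with a short sketch. It assumes the common sample estimator of lag-1 autocorrelation (lagged covariance divided by total variance) and applies the correction formula from section 2.1. The function names are illustrative, and the higher-lag correction behind the 18–22 range is not reproduced here.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: lagged covariance over total variance."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("series has zero variance")
    numerator = sum(
        (series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)
    )
    return numerator / denominator


def effective_sample_size(n, rho):
    """N_eff = N * (1 - rho) / (1 + rho), the lag-1 correction in the text."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1 - rho) / (1 + rho)


print(round(effective_sample_size(25, 0.20), 2))  # 16.67
```

In practice ρ would come from `lag1_autocorrelation` applied to the 25 per-year values for a venue × date, so the estimate itself carries sampling noise.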

2.3 Impact on confidence

With N_eff ≈ 20 (midpoint of the 18–22 range), the 2nd-lowest of 25 corresponds to Weibull plotting position 2/(20+1) ≈ 9.5%, i.e., a one-sided confidence of ~90%.
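
To see how sensitive the headline figure is to the choice of N_eff, the same Weibull formula can be evaluated across the range discussed in section 2.2. This is a sketch of the arithmetic only; the function name is illustrative.

```python
def adjusted_confidence(k, n_eff):
    """One-sided confidence 1 - k / (N_eff + 1), using N_eff in place of N."""
    return 1 - k / (n_eff + 1)


# N_eff values taken from the text: lag-1 only, the 18-22 range, and nominal 25.
for n_eff in (16.67, 18, 20, 22, 25):
    print(f"N_eff = {n_eff}: confidence = {adjusted_confidence(2, n_eff):.1%}")
```

Across N_eff = 18–22 the result stays within roughly a percentage point of 90%, which is why the midpoint is a reasonable summary.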

3. Why we report 90%

3.1 Why round down

We report "~90% one-sided confidence" rather than the higher 92% (iid theoretical) for three reasons:

Conservativeness: choosing the lower end is consistent with Forelore's "audited, not averaged" stance: we err toward under-stating, not over-stating, confidence.
Memorability: "~90% confidence" is a round, easily-communicated number. Planners and advisors retain "90%" better than "92.3%".
Buffer against unknown unknowns: Effective-N estimates carry their own uncertainty (we estimate ρ from finite data). The 90% figure absorbs that uncertainty in the planner's favor.

3.2 Customer interpretation

The "90% one-sided confidence floor" can be interpreted by a non-statistical customer as:

"In any given year at this venue on this date, there is approximately a 1-in-10 chance that the actual pleasant-hours value will be lower than the floor. The audit identifies the floor: the value that all but the single worst year of the recent climate record matched or exceeded."

4. Why 24/25 and not 25/25

We deliberately drop the single worst year and report the 2nd-lowest value (24/25) rather than the absolute minimum (25/25). Two justifications:

Robustness to single-event outliers: A single tropical storm, satellite data gap, or model-resolution artifact can produce a year-min of 0.0h that is not representative of the underlying climate distribution. Dropping the worst year reduces this brittleness.
Standard practice in climate risk reporting: The IPCC, World Bank, and major reinsurance models use percentile-based floors (typically 5th–10th percentile) rather than absolute minima for similar reasons.

5. ENSO robustness

5.1 The concern

The 25-year sample window (2001–2025) does not necessarily contain ENSO phases (El Niño / La Niña / Neutral) at their long-run climatological frequencies. If our sample under-represents the ENSO phase that drives bad outcomes at a given venue × date, the audited floor could be optimistically biased.

5.2 Sample composition

ENSO phase	Count in sample	Sample frequency	Long-run climatology (1950–2023)
El Niño	5	20%	~28–32%
La Niña	3	12%	~22–28%
Neutral	17	68%	~40–50%

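Our 25-year window under-represents La Niña years by roughly half (12% vs ~25% long-run) and El Niño years to a lesser degree (20% vs ~30%), while over-representing Neutral years (68% vs ~45%). The bias lies in how often each phase appears in the sample, not in how any individual year is classified.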

5.3 Empirical test — Bahamas (Apr–May)

For the Harbor Island Wedding Audit sample, we stratified the per-year window-floor values by ENSO phase:

Date	Phase	N	Mean (h)	Median (h)	Min (h)
Apr 30	El Niño	5	9.8	10.5	3.5
Apr 30	La Niña	3	6.2	6.0	0.0
Apr 30	Neutral	17	11.3	12.5	0.5
May 8	El Niño	5	13.2	13.0	12.0
May 8	La Niña	3	13.0	14.0	11.0
May 8	Neutral	17	12.7	13.5	4.5

ENSO has a large effect on the worst date: for Apr 30, the La Niña median is 6.5h below the Neutral median, about 46% of the 14h scale. It has no detectable effect on the protected date (May 8). The Apr 30 effect does not clear α=0.05 (Mann-Whitney p=0.121, Kruskal-Wallis p=0.162), but with only N=3 La Niña years the tests are underpowered, so the non-significant result does not imply a small magnitude.
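
A stratified summary like the table above can be sketched as follows. The input values and phase labels are hypothetical placeholders, not the Harbor Island per-year data, and the function name is illustrative.

```python
from statistics import mean, median


def summarize_by_phase(values, phases):
    """Group per-year values by ENSO phase and report N, mean, median, min."""
    groups = {}
    for value, phase in zip(values, phases):
        groups.setdefault(phase, []).append(value)
    return {
        phase: {
            "n": len(group),
            "mean": mean(group),
            "median": median(group),
            "min": min(group),
        }
        for phase, group in groups.items()
    }


# Hypothetical per-year values (hours) with their ENSO phase labels.
values = [10.0, 12.0, 6.0, 0.0, 13.0, 3.5]
phases = ["Neutral", "Neutral", "La Niña", "La Niña", "Neutral", "El Niño"]

summary = summarize_by_phase(values, phases)
for phase, stats in summary.items():
    print(phase, stats)
```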

5.4 Bootstrap re-weighting

To test whether the 24/25 floor is robust to ENSO sample bias, we resampled the 25 years 10,000 times using climatological ENSO weights (30% El Niño, 25% La Niña, 45% Neutral) instead of observed-sample weights:

Date	Observed 24/25 floor	Re-weighted mean floor	Re-weighted P5–P95
Apr 30	0.5h	0.8h	0.0–3.5h
May 8	10.5h	10.1h	4.5–11.0h

The audit floor is robust to ENSO sample bias. Climatological re-weighting moves the mean floor by ≤0.4h (0.5h to 0.8h for Apr 30; 10.5h to 10.1h for May 8). The 2nd-lowest order statistic already absorbs the worst La Niña year (2012 at 0.0h for Apr 30), so the methodology captures the ENSO tail directly through the empirical data.
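
One way such a re-weighted bootstrap could be implemented is sketched below. It assumes each resampled year is drawn by first picking a phase with the climatological weights and then picking a year uniformly within that phase. The exact resampling scheme is not specified in the text, so this is an assumption, and the per-phase values are hypothetical placeholders rather than the audit data.

```python
import math
import random

# Climatological ENSO weights quoted in the text.
CLIM_WEIGHTS = {"El Niño": 0.30, "La Niña": 0.25, "Neutral": 0.45}


def reweighted_floors(values_by_phase, weights, n_years=25, n_boot=10_000, seed=0):
    """Bootstrap the 2nd-lowest value under phase weights (assumed scheme)."""
    rng = random.Random(seed)
    phases = list(weights)
    phase_weights = [weights[p] for p in phases]
    floors = []
    for _ in range(n_boot):
        # Draw a phase for each resampled year, then a year within that phase.
        drawn = rng.choices(phases, weights=phase_weights, k=n_years)
        resample = [rng.choice(values_by_phase[p]) for p in drawn]
        floors.append(sorted(resample)[1])  # 2nd-lowest of the resample
    return floors


def nearest_rank_percentile(values, q):
    """Nearest-rank percentile (q in 0-100) of a list of values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical per-phase values in hours (illustrative only).
example = {
    "El Niño": [3.5, 9.0, 10.5, 12.0, 13.0],
    "La Niña": [0.0, 6.0, 12.5],
    "Neutral": [0.5, 8.0, 11.0, 12.5, 13.0, 14.0],
}

floors = reweighted_floors(example, CLIM_WEIGHTS, n_boot=2_000)
mean_floor = sum(floors) / len(floors)
p5 = nearest_rank_percentile(floors, 5)
p95 = nearest_rank_percentile(floors, 95)
print(f"mean floor = {mean_floor:.2f}h, P5-P95 = {p5}-{p95}h")
```

Fixing the seed keeps the sketch reproducible. Comparing the mean floor and the P5–P95 band with the observed floor is the robustness check described above.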

5.5 Residual risk in extreme-ENSO years

For a future strong La Niña year on a date already flagged as risky (e.g., Apr 30 Bahamas), the conditional residual risk is higher than the headline ~10%. Empirically, 1 of 3 La Niña years in the sample fell below the 0.5h Apr 30 floor — a conditional residual risk closer to ~20–35% in strong-LN years.

This is not a flaw in the methodology — it is a property of the underlying climate. The 0.5h floor for Apr 30 is correctly communicating "this is a high-risk date." A customer who sees the floor and reroutes to a protected window (e.g., May 8) is using the audit exactly as intended.

6. References

David, H. A., and Nagaraja, H. N. (2003). Order Statistics, 3rd ed. Hoboken, NJ: Wiley.
Weibull, W. (1939). "A Statistical Theory of the Strength of Materials." Ingeniörsvetenskapsakademiens Handlingar, No. 151.
Hazen, A. (1914). "Storage to be provided in impounding reservoirs for municipal water supply." Transactions of the ASCE, 77, 1539–1640.
Cunnane, C. (1978). "Unbiased plotting positions — A review." Journal of Hydrology, 37(3–4), 205–222.
Bretherton, C. S., Widmann, M., Dymnikov, V. P., Wallace, J. M., and Bladé, I. (1999). "The effective number of spatial degrees of freedom of a time-varying field." Journal of Climate, 12(7), 1990–2009.
Wilks, D. S. (2011). Statistical Methods in the Atmospheric Sciences, 3rd ed. Academic Press. Chapter 5.
IPCC (2021). Climate Change 2021: The Physical Science Basis. WG-I to the Sixth Assessment Report. Cambridge University Press. Chapters 11–12 use order statistics and percentile-based floors throughout.
Bröde, P., Fiala, D., Błażejczyk, K., et al. (2012). "Deriving the operational procedure for the Universal Thermal Climate Index (UTCI)." International Journal of Biometeorology, 56(3), 481–494.
Di Napoli, C., Barnard, C., Prudhomme, C., Cloke, H. L., and Pappenberger, F. (2021). "ERA5-HEAT: A global gridded historical dataset of human thermal comfort indices from climate reanalysis." Geoscience Data Journal, 8(1), 2–10. DOI: 10.1002/gdj3.102.
