pyFTS.benchmarks package¶
Submodules¶
pyFTS.benchmarks.Measures module¶
pyFTS module for common benchmark metrics
-
pyFTS.benchmarks.Measures.
BoxLjungStatistic
(data, h)[source]¶ Q Statistic for Ljung–Box test
Parameters: - data –
- h –
Returns:
-
pyFTS.benchmarks.Measures.
BoxPierceStatistic
(data, h)[source]¶ Q Statistic for Box-Pierce test
Parameters: - data –
- h –
Returns:
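As an illustration of what the two Q statistics measure, the textbook definitions can be sketched in plain Python. This is a minimal sketch, not the library's code: the helper names (autocorrelation, box_pierce_q, ljung_box_q) and the sample residuals are hypothetical, and the assumption that data holds model residuals and h is the maximum lag is inferred from the standard tests.

```python
def autocorrelation(data, k):
    """Sample autocorrelation of `data` at lag k (hypothetical helper)."""
    n = len(data)
    mean = sum(data) / n
    denom = sum((x - mean) ** 2 for x in data)
    num = sum((data[t] - mean) * (data[t - k] - mean) for t in range(k, n))
    return num / denom


def box_pierce_q(data, h):
    """Textbook Box-Pierce Q: n * sum of r_k^2 for lags 1..h."""
    n = len(data)
    return n * sum(autocorrelation(data, k) ** 2 for k in range(1, h + 1))


def ljung_box_q(data, h):
    """Textbook Ljung-Box Q: n * (n + 2) * sum of r_k^2 / (n - k) for lags 1..h."""
    n = len(data)
    return n * (n + 2) * sum(
        autocorrelation(data, k) ** 2 / (n - k) for k in range(1, h + 1)
    )


# Made-up residuals, used only to exercise the functions
residuals = [0.5, -0.3, 0.1, -0.4, 0.2, 0.3, -0.1, -0.2]
q_bp = box_pierce_q(residuals, 3)
q_lb = ljung_box_q(residuals, 3)
```

Under the null hypothesis of uncorrelated residuals, both statistics are usually compared against a chi-squared distribution with h degrees of freedom.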
-
pyFTS.benchmarks.Measures.
TheilsInequality
(targets, forecasts)[source]¶ Theil’s Inequality Coefficient
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
UStatistic
(targets, forecasts)[source]¶ Theil’s U Statistic
Parameters: - targets –
- forecasts –
Returns:
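One common formulation of Theil's U compares the model's squared errors with those of a naive "no change" forecast. The sketch below assumes that formulation and assumes forecasts[t] is the prediction for targets[t]; the function name theils_u is hypothetical and the library's exact alignment of the two series may differ.

```python
import math


def theils_u(targets, forecasts):
    """Square root of (model squared error / naive squared error).

    The naive forecast for targets[t] is targets[t - 1], so the first
    observation is skipped in both sums.
    """
    model_err = sum((f - y) ** 2 for f, y in zip(forecasts[1:], targets[1:]))
    naive_err = sum(
        (targets[t] - targets[t - 1]) ** 2 for t in range(1, len(targets))
    )
    return math.sqrt(model_err / naive_err)


targets = [1.0, 2.0, 3.0, 4.0]
u_naive = theils_u(targets, [1.0, 1.0, 2.0, 3.0])    # forecasts equal to the naive method
u_perfect = theils_u(targets, [1.0, 2.0, 3.0, 4.0])  # forecasts equal to the targets
```

With this formulation, values below 1 indicate a model that beats the naive forecast, and values above 1 indicate a model that is worse.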
-
pyFTS.benchmarks.Measures.
acf
(data, k)[source]¶ Autocorrelation function estimate
Parameters: - data –
- k –
Returns:
-
pyFTS.benchmarks.Measures.
brier_score
(targets, densities)[source]¶ Brier (1950). “Verification of Forecasts Expressed in Terms of Probability”. Monthly Weather Review. 78: 1–3.
-
pyFTS.benchmarks.Measures.
coverage
(targets, forecasts)[source]¶ Percentage of target values that fall inside the forecast intervals
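The idea behind coverage can be sketched as follows. This is an illustrative version only: it assumes each forecast is a (lower, upper) pair and returns a fraction in [0, 1]; the name interval_coverage is hypothetical.

```python
def interval_coverage(targets, intervals):
    """Fraction of targets that fall inside their forecast interval."""
    hits = sum(
        1 for y, (lower, upper) in zip(targets, intervals) if lower <= y <= upper
    )
    return hits / len(targets)


# Two of the three targets are inside their intervals
cov = interval_coverage([1, 5, 10], [(0, 2), (6, 8), (9, 11)])
```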
-
pyFTS.benchmarks.Measures.
crps
(targets, densities)[source]¶ Continuous Ranked Probability Score
Parameters: - targets – a list with the target values
- densities – a list of pyFTS.probabilistic.ProbabilityDistribution objects
Returns: float
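For intuition about the score itself, the CRPS of an empirical (sample-based) forecast has a simple closed form. The sketch below uses that form on plain lists of samples rather than ProbabilityDistribution objects, so it is an assumption-laden illustration and not the library's computation; the function names are hypothetical.

```python
def crps_ensemble(samples, target):
    """CRPS of an empirical distribution given by `samples`.

    Uses the identity CRPS = E|X - y| - 0.5 * E|X - X'|, where X and X'
    are independent draws from the forecast distribution.
    """
    m = len(samples)
    term_obs = sum(abs(x - target) for x in samples) / m
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2 * m * m)
    return term_obs - term_spread


def mean_crps(targets, ensembles):
    """Average CRPS over a list of targets and their sample ensembles."""
    return sum(crps_ensemble(e, y) for y, e in zip(targets, ensembles)) / len(targets)


score = crps_ensemble([1.0, 2.0, 3.0], 2.0)
```

For a single-sample forecast the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of it.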
-
pyFTS.benchmarks.Measures.
get_distribution_statistics
(data, model, **kwargs)[source]¶ Get CRPS statistic and time for a forecasting model
Parameters: - data – test data
- model – FTS model with probabilistic forecasting capability
- kwargs –
Returns: a list with the CRPS and execution time
-
pyFTS.benchmarks.Measures.
get_interval_statistics
(data, model, **kwargs)[source]¶ Condense all measures for interval forecasters
Parameters: - data – test data
- model – FTS model with interval forecasting capability
- kwargs –
Returns: a list with the sharpness, resolution, coverage, .05 pinball mean,
.25 pinball mean, .75 pinball mean and .95 pinball mean.
-
pyFTS.benchmarks.Measures.
get_point_statistics
(data, model, **kwargs)[source]¶ Condense all measures for point forecasters
Parameters: - data – test data
- model – FTS model with point forecasting capability
- kwargs –
Returns: a list with the RMSE, SMAPE and U Statistic
-
pyFTS.benchmarks.Measures.
mape
(targets, forecasts)[source]¶ Mean Absolute Percentage Error
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
pinball
(tau, target, forecast)[source]¶ Pinball loss function. Measures the distance from the forecast to the tau-quantile of the target
Parameters: - tau – quantile value in the range (0,1)
- target –
- forecast –
Returns: float, distance of forecast to the tau-quantile of the target
-
pyFTS.benchmarks.Measures.
pinball_mean
(tau, targets, forecasts)[source]¶ Mean pinball loss value of the forecast for a given tau-quantile of the targets
Parameters: - tau – quantile value in the range (0,1)
- targets – list of target values
- forecasts – list of prediction intervals
Returns: float, the pinball loss mean for tau quantile
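The pinball loss and its mean over a set of intervals can be sketched as below. The loss itself follows the standard definition. How an interval is reduced to a single quantile forecast is an assumption here: the sketch uses the lower bound for tau <= 0.5 and the upper bound otherwise. Function names are hypothetical.

```python
def pinball_loss(tau, target, forecast):
    """Standard pinball (quantile) loss for a single quantile forecast."""
    if target >= forecast:
        return tau * (target - forecast)
    return (1 - tau) * (forecast - target)


def pinball_loss_mean(tau, targets, intervals):
    """Mean pinball loss over (lower, upper) intervals.

    Assumption: low quantiles are scored against the lower bound and
    high quantiles against the upper bound.
    """
    index = 0 if tau <= 0.5 else 1
    losses = [
        pinball_loss(tau, y, interval[index])
        for y, interval in zip(targets, intervals)
    ]
    return sum(losses) / len(losses)


under = pinball_loss(0.05, 10, 8)   # target above the forecast quantile
over = pinball_loss(0.05, 8, 10)    # target below the forecast quantile
mean_95 = pinball_loss_mean(0.95, [10, 12], [(8, 12), (9, 11)])
```

The asymmetry is the point of the loss: for a low tau, a target below the forecast quantile is penalized far more heavily than one above it.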
-
pyFTS.benchmarks.Measures.
resolution
(forecasts)[source]¶ Resolution - Standard deviation of the intervals
-
pyFTS.benchmarks.Measures.
rmse
(targets, forecasts)[source]¶ Root Mean Squared Error
Parameters: - targets –
- forecasts –
Returns:
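For reference, the standard RMSE definition can be written in a few lines. This is a generic sketch with a hypothetical function name; it assumes targets and forecasts are aligned, equal-length sequences.

```python
import math


def root_mean_squared_error(targets, forecasts):
    """Square root of the mean squared difference between targets and forecasts."""
    squared = [(y - f) ** 2 for y, f in zip(targets, forecasts)]
    return math.sqrt(sum(squared) / len(squared))


error = root_mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```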
-
pyFTS.benchmarks.Measures.
rmse_interval
(targets, forecasts)[source]¶ Root Mean Squared Error for interval forecasts
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
smape
(targets, forecasts, type=2)[source]¶ Symmetric Mean Absolute Percentage Error
Parameters: - targets –
- forecasts –
- type –
Returns:
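The two percentage errors in this module (mape and smape) can be illustrated with their most common textbook forms. The sketch below shows only one SMAPE variant and does not attempt to reproduce what the type argument selects, which is not documented here. Function names are hypothetical and targets are assumed to be nonzero.

```python
def mean_absolute_percentage_error(targets, forecasts):
    """MAPE in percent: mean of |y - f| / |y|."""
    ratios = [abs(y - f) / abs(y) for y, f in zip(targets, forecasts)]
    return 100 * sum(ratios) / len(ratios)


def symmetric_mape(targets, forecasts):
    """One common SMAPE variant in percent: mean of |f - y| / ((|y| + |f|) / 2)."""
    ratios = [
        abs(f - y) / ((abs(y) + abs(f)) / 2) for y, f in zip(targets, forecasts)
    ]
    return 100 * sum(ratios) / len(ratios)


mape_value = mean_absolute_percentage_error([100.0, 200.0], [110.0, 180.0])
smape_value = symmetric_mape([100.0, 200.0], [110.0, 180.0])
```

Unlike MAPE, this SMAPE variant normalizes by the average magnitude of target and forecast, so over- and under-forecasts of the same size are treated more evenly.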
-
pyFTS.benchmarks.Measures.
winkler_mean
(tau, targets, forecasts)[source]¶ Mean Winkler score value of the forecast for a given tau-quantile of the targets
Parameters: - tau – quantile value in the range (0,1)
- targets – list of target values
- forecasts – list of prediction intervals
Returns: float, the Winkler score mean for tau quantile
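The Winkler score rewards narrow intervals and penalizes targets that fall outside them. A minimal sketch of the standard definition follows, assuming tau plays the role of the significance level (for example 0.05 for a 95% interval) and that each forecast is a (lower, upper) pair; the function names are hypothetical.

```python
def winkler_score(tau, target, interval):
    """Interval width plus a penalty of 2 / tau per unit of miss distance."""
    lower, upper = interval
    width = upper - lower
    if target < lower:
        return width + 2 * (lower - target) / tau
    if target > upper:
        return width + 2 * (target - upper) / tau
    return width


def winkler_score_mean(tau, targets, intervals):
    """Average Winkler score over aligned targets and intervals."""
    scores = [winkler_score(tau, y, i) for y, i in zip(targets, intervals)]
    return sum(scores) / len(scores)


inside = winkler_score(0.05, 10, (8, 12))   # only the width counts
outside = winkler_score(0.05, 7, (8, 12))   # width plus penalty for missing by 1
```

Lower scores are better, so a forecaster cannot improve its score simply by widening every interval.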
pyFTS.benchmarks.ResidualAnalysis module¶
Residual Analysis methods
-
pyFTS.benchmarks.ResidualAnalysis.
chi_squared
(q, h)[source]¶ Chi-Squared value
Parameters: - q –
- h –
Returns:
-
pyFTS.benchmarks.ResidualAnalysis.
compare_residuals
(data, models)[source]¶ Compare residual statistics of several models
Parameters: - data – test data
- models –
Returns: a Pandas dataframe with the Box-Ljung statistic for each model
-
pyFTS.benchmarks.ResidualAnalysis.
plotResiduals
(targets, models, tam=[8, 8], save=False, file=None)[source]¶ Plot residuals and statistics
Parameters: - targets –
- models –
- tam –
- save –
- file –
Returns:
-
pyFTS.benchmarks.ResidualAnalysis.
plot_residuals
(targets, models, tam=[8, 8], save=False, file=None)[source]¶
pyFTS.benchmarks.Util module¶
Facilities for pyFTS Benchmark module
-
pyFTS.benchmarks.Util.
plot_dataframe_interval
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
plot_dataframe_interval_pinball
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
plot_dataframe_point
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['UAVG', 'RMSEAVG', 'USTD', 'RMSESTD'], sort_ascend=[1, 1, 1, 1], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
plot_dataframe_probabilistic
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['CRPS1AVG', 'CRPS2AVG', 'CRPS1STD', 'CRPS2STD'], sort_ascend=[True, True, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
save_dataframe_interval
(coverage, experiments, file, objs, resolution, save, sharpness, synthetic, times, q05, q25, q75, q95, steps, method)[source]¶
-
pyFTS.benchmarks.Util.
save_dataframe_point
(experiments, file, objs, rmse, save, synthetic, smape, times, u, steps, method)[source]¶ Create a dataframe to store the benchmark results
Parameters: - experiments – dictionary with the execution results
- file –
- objs –
- rmse –
- save –
- synthetic –
- smape –
- times –
- u –
Returns:
-
pyFTS.benchmarks.Util.
save_dataframe_probabilistic
(experiments, file, objs, crps, times, save, synthetic, steps, method)[source]¶ Save benchmark results for m-step ahead probabilistic forecasters
Parameters: - experiments –
- file –
- objs –
- crps_interval –
- crps_distr –
- times –
- times2 –
- save –
- synthetic –
Returns:
-
pyFTS.benchmarks.Util.
unified_scaled_interval
(experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
unified_scaled_interval_pinball
(experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
pyFTS.benchmarks.arima module¶
-
class
pyFTS.benchmarks.arima.
ARIMA
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
Façade for statsmodels.tsa.arima_model
-
forecast
(ndata, **kwargs)[source]¶ Point forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with the forecasted values
-
forecast_ahead_distribution
(data, steps, **kwargs)[source]¶ Probabilistic forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted Probability Distributions
-
forecast_ahead_interval
(ndata, steps, **kwargs)[source]¶ Interval forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted intervals
-
forecast_distribution
(data, **kwargs)[source]¶ Probabilistic forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with the forecasted Probability Distributions
-
pyFTS.benchmarks.benchmarks module¶
Benchmarking methods for FTS methods
-
pyFTS.benchmarks.benchmarks.
get_benchmark_interval_methods
()[source]¶ Return all non FTS methods for point_to_interval forecasting
-
pyFTS.benchmarks.benchmarks.
get_benchmark_point_methods
()[source]¶ Return all non FTS methods for point forecasting
-
pyFTS.benchmarks.benchmarks.
get_benchmark_probabilistic_methods
()[source]¶ Return all non FTS methods for probabilistic forecasting
-
pyFTS.benchmarks.benchmarks.
get_interval_methods
()[source]¶ Return all FTS methods for point_to_interval forecasting
-
pyFTS.benchmarks.benchmarks.
get_point_methods
()[source]¶ Return all FTS methods for point forecasting
-
pyFTS.benchmarks.benchmarks.
get_probabilistic_methods
()[source]¶ Return all FTS methods for probabilistic forecasting
-
pyFTS.benchmarks.benchmarks.
plot_compared_intervals_ahead
(original, models, colors, distributions, time_from, time_to, intervals=True, save=False, file=None, tam=[20, 5], resolution=None, cmap='Blues', linewidth=1.5)[source]¶ Plot the forecasts of several one step ahead models, by point or by interval
Parameters: - original – Original time series data (list)
- models – List of models to compare
- colors – List of models colors
- distributions – True to plot a distribution
- time_from – index of data point to start the ahead forecasting
- time_to – number of steps ahead to forecast
- interpol – Fill space between distribution plots
- save – Save the picture on file
- file – Filename to save the picture
- tam – Size of the picture
- resolution –
- cmap – Color map to be used on distribution plot
- option – Distribution type to be passed for models
Returns:
-
pyFTS.benchmarks.benchmarks.
plot_compared_series
(original, models, colors, typeonlegend=False, save=False, file=None, tam=[20, 5], points=True, intervals=True, linewidth=1.5)[source]¶ Plot the forecasts of several one step ahead models, by point or by interval
Parameters: - original – Original time series data (list)
- models – List of models to compare
- colors – List of models colors
- typeonlegend – Add the type of forecast (point / interval) on legend
- save – Save the picture on file
- file – Filename to save the picture
- tam – Size of the picture
- points – True to plot the point forecasts, False otherwise
- intervals – True to plot the interval forecasts, False otherwise
- linewidth –
Returns:
-
pyFTS.benchmarks.benchmarks.
plot_density_rectange
(ax, cmap, density, fig, resolution, time_from, time_to)[source]¶
-
pyFTS.benchmarks.benchmarks.
plot_distribution
(ax, cmap, probabilitydist, fig, time_from, reference_data=None)[source]¶
-
pyFTS.benchmarks.benchmarks.
plot_interval
(axis, intervals, order, label, color='red', typeonlegend=False, ls='-', linewidth=1)[source]¶
-
pyFTS.benchmarks.benchmarks.
plot_point
(axis, points, order, label, color='red', ls='-', linewidth=1)[source]¶
-
pyFTS.benchmarks.benchmarks.
print_distribution_statistics
(original, models, steps, resolution)[source]¶
-
pyFTS.benchmarks.benchmarks.
print_point_statistics
(data, models, externalmodels=None, externalforecasts=None, indexers=None)[source]¶
-
pyFTS.benchmarks.benchmarks.
run_interval
(mfts, partitioner, train_data, test_data, window_key=None, **kwargs)[source]¶ Interval forecast benchmark function to be executed on cluster nodes
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.benchmarks.
run_point
(mfts, partitioner, train_data, test_data, window_key=None, **kwargs)[source]¶ Point forecast benchmark function to be executed on cluster nodes
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.benchmarks.
run_probabilistic
(mfts, partitioner, train_data, test_data, window_key=None, **kwargs)[source]¶ Probabilistic forecast benchmark function to be executed on cluster nodes
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- steps –
- resolution –
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.benchmarks.
simpleSearch_RMSE
(train, test, model, partitions, orders, save=False, file=None, tam=[10, 15], plotforecasts=False, elev=30, azim=144, intervals=False, parameters=None, partitioner=<class 'pyFTS.partitioners.Grid.GridPartitioner'>, transformation=None, indexer=None)[source]¶
-
pyFTS.benchmarks.benchmarks.
sliding_window_benchmarks
(data, windowsize, train=0.8, **kwargs)[source]¶ Sliding window benchmarks for FTS forecasters.
For each data window, the data is split into training and test sets. For each training split, a partitioner model is created for every combination of number of partitions and partitioning method. Then, for each partitioner, order, steps-ahead horizon and FTS method, a forecasting model is trained.
All trained models are then benchmarked on the test data, and the metrics are stored in a sqlite3 database (identified by the ‘file’ parameter) for later analysis.
The whole process can be distributed on a dispy cluster by setting the ‘distributed’ parameter to true and listing the dispy nodes in the ‘nodes’ parameter.
The number of experiments is determined by ‘windowsize’ and ‘inc’ parameters.
Parameters: - data – test data
- windowsize – size of sliding window
- train – percentage of sliding window data used to train the models
- kwargs – dict, optional arguments
- benchmark_methods – a list with Non FTS models to benchmark. The default is None.
- benchmark_methods_parameters – a list with Non FTS models parameters. The default is None.
- benchmark_models – A boolean value indicating if external FTS methods will be used on benchmark. The default is False.
- build_methods – A boolean value indicating if the default FTS methods will be used on benchmark. The default is True.
- dataset – the dataset name to identify the current set of benchmarks results on database.
- distributed – A boolean value indicating if the forecasting procedure will be distributed in a dispy cluster. The default is False.
- file – file path to save the results. The default is benchmarks.db.
- inc – a float on interval [0,1] indicating the percentage of the windowsize to move the window
- methods – a list with FTS class names. The default depends on the forecasting type and contains the list of all FTS methods.
- models – a list with prebuilt FTS objects. The default is None.
- nodes – a list with the dispy cluster nodes addresses. The default is [127.0.0.1].
- orders – a list with orders of the models (for high order models). The default is [1,2,3].
- partitions – a list with the numbers of partitions on the Universe of Discourse. The default is [10].
- partitioners_models – a list with prebuilt Universe of Discourse partitioners objects. The default is None.
- partitioners_methods – a list with Universe of Discourse partitioners class names. The default is [partitioners.Grid.GridPartitioner].
- progress – If true a progress bar will be displayed during the benchmarks. The default is False.
- start – in the multi step forecasting, the index of the data where to start forecasting. The default is 0.
- steps_ahead – a list with the forecasting horizons, i.e., the number of steps ahead to forecast. The default is 1.
- tag – a name to identify the current set of benchmarks results on database.
- type – the forecasting type, one of point, interval or distribution. The default is point.
- transformations – a list with data transformations to apply. The default is [None].
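The window-splitting step described above can be sketched in plain Python. This only illustrates how windowsize, train and inc might interact. The exact indexing, the default value of inc and the handling of a trailing partial window are assumptions (the sketch drops incomplete windows), and the generator name is hypothetical.

```python
def sliding_windows(data, windowsize, train=0.8, inc=0.1):
    """Yield (start, train_split, test_split) for each full window.

    `inc` is the fraction of `windowsize` by which the window advances;
    `train` is the fraction of each window used for training.
    """
    step = max(1, int(windowsize * inc))
    n_train = int(windowsize * train)
    for start in range(0, len(data) - windowsize + 1, step):
        window = data[start:start + windowsize]
        yield start, window[:n_train], window[n_train:]


# Synthetic series of 100 points, windows of 50 points moved by half a window
series = list(range(100))
splits = list(sliding_windows(series, windowsize=50, train=0.8, inc=0.5))
starts = [start for start, _, _ in splits]
```

With these values the sketch produces three windows, each split into 40 training points and 10 test points, which shows how windowsize and inc together determine the number of experiments.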
pyFTS.benchmarks.distributed_benchmarks module¶
pyFTS.benchmarks.knn module¶
-
class
pyFTS.benchmarks.knn.
KNearestNeighbors
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
K-Nearest Neighbors
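As background, a generic k-nearest-neighbors forecast for a univariate series can be sketched as follows. This is not the class's implementation: the function name, the order and k parameters, and the use of Euclidean distance with a plain average are all assumptions made for illustration.

```python
import math


def knn_forecast(history, order=2, k=3):
    """Forecast the next value from the k past windows most similar to the latest one."""
    query = history[-order:]
    candidates = []
    # Consider every window of length `order` that has a known successor
    for start in range(len(history) - order):
        window = history[start:start + order]
        successor = history[start + order]
        candidates.append((math.dist(window, query), successor))
    nearest = sorted(candidates, key=lambda c: c[0])[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


# Periodic toy series: the pattern (1, 2) has always been followed by 3
prediction = knn_forecast([1, 2, 3, 1, 2, 3, 1, 2], order=2, k=2)
```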
pyFTS.benchmarks.naive module¶
-
class
pyFTS.benchmarks.naive.
Naive
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
Naïve Forecasting method
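The naive method is usually defined as "the forecast for the next step is the last observed value". A minimal sketch under that assumption is shown below; the function name and the sample series are hypothetical.

```python
def naive_forecasts(data):
    """One-step-ahead naive forecasts: the prediction for data[t + 1] is data[t]."""
    return list(data[:-1])


series = [10, 12, 11, 13]
forecasts = naive_forecasts(series)
# Align each forecast with the observation it predicts
errors = [y - f for y, f in zip(series[1:], forecasts)]
```

Because it needs no training, the naive method is a common baseline: a model that cannot beat it adds little value.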
pyFTS.benchmarks.parallel_benchmarks module¶
joblib-parallelized benchmarks for FTS methods
-
pyFTS.benchmarks.parallel_benchmarks.
ahead_sliding_window
(data, windowsize, train, steps, resolution, models=None, partitioners=[<class 'pyFTS.partitioners.Grid.GridPartitioner'>], partitions=[10], max_order=3, transformation=None, indexer=None, dump=False, save=False, file=None, sintetic=False)[source]¶ Parallel sliding window benchmarks for FTS probabilistic forecasters
Parameters: - data –
- windowsize – size of sliding window
- train – percentage of sliding window data used to train the models
- steps –
- resolution –
- models – FTS point forecasters
- partitioners – Universe of Discourse partitioner
- partitions – the max number of partitions on the Universe of Discourse
- max_order – the max order of the models (for high order models)
- transformation – data transformation
- indexer – seasonal indexer
- dump –
- save – save results
- file – file path to save the results
- sintetic – if true, only the average and standard deviation of the results
Returns: DataFrame with the results
-
pyFTS.benchmarks.parallel_benchmarks.
interval_sliding_window
(data, windowsize, train=0.8, models=None, partitioners=[<class 'pyFTS.partitioners.Grid.GridPartitioner'>], partitions=[10], max_order=3, transformation=None, indexer=None, dump=False, save=False, file=None, sintetic=False)[source]¶ Parallel sliding window benchmarks for FTS point_to_interval forecasters
Parameters: - data –
- windowsize – size of sliding window
- train – percentage of sliding window data used to train the models
- models – FTS point forecasters
- partitioners – Universe of Discourse partitioner
- partitions – the max number of partitions on the Universe of Discourse
- max_order – the max order of the models (for high order models)
- transformation – data transformation
- indexer – seasonal indexer
- dump –
- save – save results
- file – file path to save the results
- sintetic – if true, only the average and standard deviation of the results
Returns: DataFrame with the results
-
pyFTS.benchmarks.parallel_benchmarks.
point_sliding_window
(data, windowsize, train=0.8, models=None, partitioners=[<class 'pyFTS.partitioners.Grid.GridPartitioner'>], partitions=[10], max_order=3, transformation=None, indexer=None, dump=False, save=False, file=None, sintetic=False)[source]¶ Parallel sliding window benchmarks for FTS point forecasters
Parameters: - data –
- windowsize – size of sliding window
- train – percentage of sliding window data used to train the models
- models – FTS point forecasters
- partitioners – Universe of Discourse partitioner
- partitions – the max number of partitions on the Universe of Discourse
- max_order – the max order of the models (for high order models)
- transformation – data transformation
- indexer – seasonal indexer
- dump –
- save – save results
- file – file path to save the results
- sintetic – if true, only the average and standard deviation of the results
Returns: DataFrame with the results
-
pyFTS.benchmarks.parallel_benchmarks.
run_ahead
(mfts, partitioner, train_data, test_data, steps, resolution, transformation=None, indexer=None)[source]¶ Probabilistic m-step ahead forecast benchmark function to be executed on threads
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- steps –
- resolution –
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.parallel_benchmarks.
run_interval
(mfts, partitioner, train_data, test_data, transformation=None, indexer=None)[source]¶ Interval forecast benchmark function to be executed on threads
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.parallel_benchmarks.
run_point
(mfts, partitioner, train_data, test_data, transformation=None, indexer=None)[source]¶ Point forecast benchmark function to be executed on threads
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
pyFTS.benchmarks.quantreg module¶
-
class
pyFTS.benchmarks.quantreg.
QuantileRegression
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
Façade for statsmodels.regression.quantile_regression
-
forecast
(ndata, **kwargs)[source]¶ Point forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with the forecasted values
-
forecast_ahead_distribution
(ndata, steps, **kwargs)[source]¶ Probabilistic forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted Probability Distributions
-
forecast_ahead_interval
(ndata, steps, **kwargs)[source]¶ Interval forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted intervals
-
forecast_distribution
(ndata, **kwargs)[source]¶ Probabilistic forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with the forecasted Probability Distributions
-
Module contents¶
pyFTS module for benchmarking the FTS models