pyFTS.benchmarks package¶
Module contents¶
pyFTS module for benchmarking the FTS models
Submodules¶
pyFTS.benchmarks.benchmarks module¶
Benchmark methods for FTS models
-
pyFTS.benchmarks.benchmarks.
get_benchmark_interval_methods
()[source]¶ Return all non-FTS methods for point_to_interval forecasting
-
pyFTS.benchmarks.benchmarks.
get_benchmark_point_methods
()[source]¶ Return all non-FTS methods for point forecasting
-
pyFTS.benchmarks.benchmarks.
get_benchmark_probabilistic_methods
()[source]¶ Return all FTS methods for probabilistic forecasting
-
pyFTS.benchmarks.benchmarks.
get_interval_methods
()[source]¶ Return all FTS methods for point_to_interval forecasting
-
pyFTS.benchmarks.benchmarks.
get_point_methods
()[source]¶ Return all FTS methods for point forecasting
-
pyFTS.benchmarks.benchmarks.
get_point_multivariate_methods
()[source]¶ Return all multivariate FTS methods for point forecasting
-
pyFTS.benchmarks.benchmarks.
get_probabilistic_methods
()[source]¶ Return all FTS methods for probabilistic forecasting
-
pyFTS.benchmarks.benchmarks.
plot_compared_series
(original, models, colors, typeonlegend=False, save=False, file=None, tam=[20, 5], points=True, intervals=True, linewidth=1.5)[source]¶ Plot the forecasts of several one step ahead models, by point or by interval
Parameters: - original – Original time series data (list)
- models – List of models to compare
- colors – List of models colors
- typeonlegend – Add the type of forecast (point / interval) on legend
- save – Save the picture on file
- file – Filename to save the picture
- tam – Size of the picture
- points – True to plot the point forecasts, False otherwise
- intervals – True to plot the interval forecasts, False otherwise
- linewidth –
Returns:
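A hypothetical usage sketch (not part of the original docstring); the TAIEX dataset loader, the Grid partitioner and Chen's ConventionalFTS model are assumptions taken from other pyFTS modules:
    from pyFTS.data import TAIEX
    from pyFTS.partitioners import Grid
    from pyFTS.models import chen
    from pyFTS.benchmarks import benchmarks as bchmk

    # Load data and fit a single one step ahead point forecaster
    data = TAIEX.get_data()
    train, test = data[:3000], data[3000:3200]

    partitioner = Grid.GridPartitioner(data=train, npart=20)
    model = chen.ConventionalFTS(partitioner=partitioner)
    model.fit(train)

    # Plot the original test series against the model's point forecasts
    bchmk.plot_compared_series(test, [model], ['blue'], points=True,
                               intervals=False, tam=[20, 5])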
-
pyFTS.benchmarks.benchmarks.
plot_point
(axis, points, order, label, color='red', ls='-', linewidth=1)[source]¶
-
pyFTS.benchmarks.benchmarks.
print_distribution_statistics
(original, models, steps, resolution)[source]¶ Run probabilistic benchmarks on given models and data and print the results
Parameters: - original – test data
- models – a list of FTS models to benchmark
Returns:
-
pyFTS.benchmarks.benchmarks.
print_interval_statistics
(original, models)[source]¶ Run interval benchmarks on given models and data and print the results
Parameters: - original – test data
- models – a list of FTS models to benchmark
Returns:
-
pyFTS.benchmarks.benchmarks.
print_point_statistics
(data, models, externalmodels=None, externalforecasts=None, indexers=None)[source]¶ Run point benchmarks on given models and data and print the results
Parameters: - data – test data
- models – a list of FTS models to benchmark
- externalmodels – a list with benchmark models (façades for other methods)
- externalforecasts –
- indexers –
Returns:
-
pyFTS.benchmarks.benchmarks.
process_interval_jobs
(dataset, tag, job, conn)[source]¶ Extract information from a dictionary with interval benchmark results and save it to a database
Parameters: - dataset – the benchmark dataset name
- tag – alias for the benchmark group being executed
- job – a dictionary with the benchmark results
- conn – a connection to a Sqlite database
Returns:
-
pyFTS.benchmarks.benchmarks.
process_point_jobs
(dataset, tag, job, conn)[source]¶ Extract information from a dictionary with point benchmark results and save it to a database
Parameters: - dataset – the benchmark dataset name
- tag – alias for the benchmark group being executed
- job – a dictionary with the benchmark results
- conn – a connection to a Sqlite database
Returns:
-
pyFTS.benchmarks.benchmarks.
process_probabilistic_jobs
(dataset, tag, job, conn)[source]¶ Extract information from a dictionary with probabilistic benchmark results and save it to a database
Parameters: - dataset – the benchmark dataset name
- tag – alias for the benchmark group being executed
- job – a dictionary with the benchmark results
- conn – a connection to a Sqlite database
Returns:
-
pyFTS.benchmarks.benchmarks.
run_interval
(mfts, partitioner, train_data, test_data, window_key=None, **kwargs)[source]¶ Run the interval forecasting benchmarks
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.benchmarks.
run_point
(mfts, partitioner, train_data, test_data, window_key=None, **kwargs)[source]¶ Run the point forecasting benchmarks
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.benchmarks.
run_probabilistic
(mfts, partitioner, train_data, test_data, window_key=None, **kwargs)[source]¶ Run the probabilistic forecasting benchmarks
Parameters: - mfts – FTS model
- partitioner – Universe of Discourse partitioner
- train_data – data used to train the model
- test_data – data used to test the model
- steps –
- resolution –
- window_key – id of the sliding window
- transformation – data transformation
- indexer – seasonal indexer
Returns: a dictionary with the benchmark results
-
pyFTS.benchmarks.benchmarks.
simpleSearch_RMSE
(train, test, model, partitions, orders, save=False, file=None, tam=[10, 15], plotforecasts=False, elev=30, azim=144, intervals=False, parameters=None, partitioner=<class 'pyFTS.partitioners.Grid.GridPartitioner'>, transformation=None, indexer=None)[source]¶
-
pyFTS.benchmarks.benchmarks.
sliding_window_benchmarks
(data, windowsize, train=0.8, **kwargs)[source]¶ Sliding window benchmarks for FTS forecasters.
For each data window, the data is split into training and test sets. For each training split, a partitioner is built for each combination of partitioning method and number of partitions, and for each partitioner, order, steps-ahead value and FTS method a forecasting model is trained.
All trained models are then benchmarked on the test data and the metrics are stored in a sqlite3 database (identified by the ‘file’ parameter) for later analysis.
The whole process can be distributed over a dispy cluster by setting the ‘distributed’ argument to True and providing the list of dispy nodes in the ‘nodes’ parameter.
The number of experiments is determined by the ‘windowsize’ and ‘inc’ parameters. A usage sketch is given after the parameter list below.
Parameters: - data – the time series data
- windowsize – size of the sliding window
- train – percentage of the sliding window data used to train the models
- kwargs – dict, optional arguments
- benchmark_methods – a list with non-FTS models to benchmark. The default is None.
- benchmark_methods_parameters – a list with non-FTS model parameters. The default is None.
- benchmark_models – A boolean value indicating if external FTS methods will be used on benchmark. The default is False.
- build_methods – A boolean value indicating if the default FTS methods will be used on benchmark. The default is True.
- dataset – the dataset name to identify the current set of benchmarks results on database.
- distributed – A boolean value indicating if the forecasting procedure will be distributed in a dispy cluster. The default is False.
- file – file path to save the results. The default is benchmarks.db.
- inc – a float in the interval [0,1] indicating the fraction of the windowsize by which the window is moved at each step
- methods – a list with FTS class names. The default depends on the forecasting type and contains the list of all FTS methods.
- models – a list with prebuilt FTS objects. The default is None.
- nodes – a list with the dispy cluster nodes addresses. The default is [127.0.0.1].
- orders – a list with orders of the models (for high order models). The default is [1,2,3].
- partitions – a list with the numbers of partitions on the Universe of Discourse. The default is [10].
- partitioners_models – a list with prebuilt Universe of Discourse partitioners objects. The default is None.
- partitioners_methods – a list with Universe of Discourse partitioners class names. The default is [partitioners.Grid.GridPartitioner].
- progress – If true a progress bar will be displayed during the benchmarks. The default is False.
- start – in the multi step forecasting, the index of the data where to start forecasting. The default is 0.
- steps_ahead – a list with the forecasting horizons, i. e., the number of steps ahead to forecast. The default is 1.
- tag – a name to identify the current set of benchmarks results on database.
- type – the forecasting type, one of these values: point (default), interval or distribution.
- transformations – a list with data transformations to apply. The default is [None].
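The usage sketch below is hypothetical and only illustrates the parameters documented above; the TAIEX loader and the FTS method classes come from other pyFTS modules and are assumptions here:
    from pyFTS.data import TAIEX
    from pyFTS.partitioners import Grid
    from pyFTS.models import chen, hofts
    from pyFTS.benchmarks import benchmarks as bchmk

    data = TAIEX.get_data()

    # Point forecasting benchmarks over sliding windows of 1000 points,
    # moving the window by 20% of its size between experiments.
    bchmk.sliding_window_benchmarks(
        data, 1000, train=0.8, inc=0.2,
        methods=[chen.ConventionalFTS, hofts.HighOrderFTS],
        orders=[1, 2],
        partitions=[10, 20],
        partitioners_methods=[Grid.GridPartitioner],
        type='point',
        tag='example', dataset='TAIEX',
        file='benchmarks.db')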
pyFTS.benchmarks.Measures module¶
pyFTS module for common benchmark metrics
-
pyFTS.benchmarks.Measures.
BoxLjungStatistic
(data, h)[source]¶ Q Statistic for Ljung–Box test
Parameters: - data –
- h –
Returns:
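For reference, a minimal sketch of the standard Ljung-Box Q statistic, written in terms of the acf function documented below; this illustrates the formula and is not necessarily the exact pyFTS implementation:
    from pyFTS.benchmarks.Measures import acf

    def ljung_box_q(data, h):
        # Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k), where r_k is the lag-k autocorrelation
        n = len(data)
        return n * (n + 2) * sum(acf(data, k) ** 2 / (n - k) for k in range(1, h + 1))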
-
pyFTS.benchmarks.Measures.
BoxPierceStatistic
(data, h)[source]¶ Q Statistic for Box-Pierce test
Parameters: - data –
- h –
Returns:
-
pyFTS.benchmarks.Measures.
TheilsInequality
(targets, forecasts)[source]¶ Theil’s Inequality Coefficient
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
UStatistic
(targets, forecasts)[source]¶ Theil’s U Statistic
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
acf
(data, k)[source]¶ Autocorrelation function estimate
Parameters: - data –
- k –
Returns:
-
pyFTS.benchmarks.Measures.
brier_score
(targets, densities)[source]¶ Brier (1950). “Verification of Forecasts Expressed in Terms of Probability”. Monthly Weather Review. 78: 1–3.
-
pyFTS.benchmarks.Measures.
coverage
(targets, forecasts)[source]¶ Percentage of target values that fall inside the forecasted intervals
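An illustrative sketch of the coverage computation, assuming each forecast is a [lower, upper] interval (the input format is an assumption, not taken from the docstring):
    def coverage_sketch(targets, forecasts):
        # Share of targets that fall inside their forecasted [lower, upper] interval
        hits = sum(1 for y, (lo, hi) in zip(targets, forecasts) if lo <= y <= hi)
        return hits / len(targets)

    print(coverage_sketch([10, 12, 15], [[9, 11], [11, 13], [16, 18]]))  # 0.666...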
-
pyFTS.benchmarks.Measures.
crps
(targets, densities)[source]¶ Continuous Ranked Probability Score
Parameters: - targets – a list with the target values
- densities – a list with pyFTS.probabilistic.ProbabilityDistribution objects
Returns: float
-
pyFTS.benchmarks.Measures.
get_distribution_statistics
(data, model, **kwargs)[source]¶ Get CRPS statistic and time for a forecasting model
Parameters: - data – test data
- model – FTS model with probabilistic forecasting capability
- kwargs –
Returns: a list with the CRPS and execution time
-
pyFTS.benchmarks.Measures.
get_interval_statistics
(data, model, **kwargs)[source]¶ Consolidate all measures for interval forecasters
Parameters: - data – test data
- model – FTS model with interval forecasting capability
- kwargs –
Returns: a list with the sharpness, resolution, coverage, .05 pinball mean,
.25 pinball mean, .75 pinball mean and .95 pinball mean.
-
pyFTS.benchmarks.Measures.
get_point_statistics
(data, model, **kwargs)[source]¶ Consolidate all measures for point forecasters
Parameters: - data – test data
- model – FTS model with point forecasting capability
- kwargs –
Returns: a list with the RMSE, SMAPE and U Statistic
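A hypothetical end-to-end sketch: fit a point forecaster and consolidate its accuracy measures. The Grid partitioner and Chen's ConventionalFTS are assumptions taken from other pyFTS modules:
    from pyFTS.data import TAIEX
    from pyFTS.partitioners import Grid
    from pyFTS.models import chen
    from pyFTS.benchmarks import Measures

    data = TAIEX.get_data()
    train, test = data[:3000], data[3000:3500]

    model = chen.ConventionalFTS(partitioner=Grid.GridPartitioner(data=train, npart=20))
    model.fit(train)

    rmse, smape, u = Measures.get_point_statistics(test, model)
    print(rmse, smape, u)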
-
pyFTS.benchmarks.Measures.
mape
(targets, forecasts)[source]¶ Mean Absolute Percentage Error (MAPE)
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
pinball
(tau, target, forecast)[source]¶ Pinball loss function. Measures the distance of the forecast to the tau-quantile of the target
Parameters: - tau – quantile value in the range (0,1)
- target –
- forecast –
Returns: float, distance of forecast to the tau-quantile of the target
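A reference sketch of the pinball loss described above (an illustration of the standard formula, not necessarily the exact pyFTS code): forecasts below the target are weighted by tau and forecasts above it by (1 - tau).
    def pinball_sketch(tau, target, forecast):
        # Asymmetric penalty around the tau-quantile of the target
        if target >= forecast:
            return (target - forecast) * tau
        return (forecast - target) * (1 - tau)

    # At the 5% quantile, over-forecasting is penalized far more heavily:
    print(pinball_sketch(0.05, 10.0, 12.0))  # 1.9
    print(pinball_sketch(0.05, 10.0, 9.0))   # 0.05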
-
pyFTS.benchmarks.Measures.
pinball_mean
(tau, targets, forecasts)[source]¶ Mean pinball loss value of the forecast for a given tau-quantile of the targets
Parameters: - tau – quantile value in the range (0,1)
- targets – list of target values
- forecasts – list of prediction intervals
Returns: float, the pinball loss mean for tau quantile
-
pyFTS.benchmarks.Measures.
resolution
(forecasts)[source]¶ Resolution - Standard deviation of the intervals
-
pyFTS.benchmarks.Measures.
rmse
(targets, forecasts)[source]¶ Root Mean Squared Error
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
rmse_interval
(targets, forecasts)[source]¶ Root Mean Squared Error for interval forecasts
Parameters: - targets –
- forecasts –
Returns:
-
pyFTS.benchmarks.Measures.
smape
(targets, forecasts, type=2)[source]¶ Symmetric Mean Absolute Percentage Error (SMAPE)
Parameters: - targets –
- forecasts –
- type –
Returns:
-
pyFTS.benchmarks.Measures.
winkler_mean
(tau, targets, forecasts)[source]¶ Mean Winkler score value of the forecast for a given tau-quantile of the targets
Parameters: - tau – quantile value in the range (0,1)
- targets – list of target values
- forecasts – list of prediction intervals
Returns: float, the Winkler score mean for tau quantile
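For reference, a sketch of the standard Winkler score for a single prediction interval (an illustration of the formula, not necessarily the exact pyFTS code); a penalty proportional to 2/tau is added when the target falls outside the interval:
    def winkler_sketch(tau, target, forecast):
        # forecast is assumed to be a [lower, upper] prediction interval
        lower, upper = forecast
        width = upper - lower
        if target < lower:
            return width + (2.0 / tau) * (lower - target)
        if target > upper:
            return width + (2.0 / tau) * (target - upper)
        return width

    print(winkler_sketch(0.05, 10.0, [9.0, 11.0]))  # inside the interval: 2.0 (the width)
    print(winkler_sketch(0.05, 12.0, [9.0, 11.0]))  # outside: 2.0 + 40 * 1.0 = 42.0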
pyFTS.benchmarks.ResidualAnalysis module¶
Residual Analysis methods
-
pyFTS.benchmarks.ResidualAnalysis.
chi_squared
(q, h)[source]¶ Chi-Squared value
Parameters: - q –
- h –
Returns:
-
pyFTS.benchmarks.ResidualAnalysis.
compare_residuals
(data, models)[source]¶ Compare the residual statistics of several models
Parameters: - data – test data
- models –
Returns: a Pandas dataframe with the Box-Ljung statistic for each model
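A hypothetical usage sketch; the model construction (Grid partitioner, Chen and high-order FTS methods) is an assumption taken from other pyFTS modules:
    from pyFTS.data import TAIEX
    from pyFTS.partitioners import Grid
    from pyFTS.models import chen, hofts
    from pyFTS.benchmarks import ResidualAnalysis

    data = TAIEX.get_data()
    train, test = data[:3000], data[3000:3500]

    models = []
    for method in [chen.ConventionalFTS, hofts.HighOrderFTS]:
        model = method(partitioner=Grid.GridPartitioner(data=train, npart=20))
        model.fit(train)
        models.append(model)

    print(ResidualAnalysis.compare_residuals(test, models))
    ResidualAnalysis.plot_residuals(test, models, tam=[8, 8])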
-
pyFTS.benchmarks.ResidualAnalysis.
plotResiduals
(targets, models, tam=[8, 8], save=False, file=None)[source]¶ Plot residuals and statistics
Parameters: - targets –
- models –
- tam –
- save –
- file –
Returns:
-
pyFTS.benchmarks.ResidualAnalysis.
plot_residuals
(targets, models, tam=[8, 8], save=False, file=None)[source]¶
pyFTS.benchmarks.Util module¶
Facilities for pyFTS Benchmark module
-
pyFTS.benchmarks.Util.
create_benchmark_tables
(conn)[source]¶ Create a sqlite3 table designed to store benchmark results.
Parameters: conn – a sqlite3 database connection
-
pyFTS.benchmarks.Util.
get_dataframe_from_bd
(file, filter)[source]¶ Query the sqlite benchmark database and return a pandas dataframe with the results
Parameters: - file – the url of the benchmark database
- filter – sql conditions to filter
Returns: pandas dataframe with the query results
-
pyFTS.benchmarks.Util.
insert_benchmark
(data, conn)[source]¶ Insert benchmark data into the database
Parameters: - data – a tuple with the benchmark data, with the following fields:
ID: integer incremental primary key
Date: date/hour of the benchmark execution
Dataset: identifies on which dataset the benchmark was performed
Tag: a user defined word that identifies a benchmark set
Type: forecasting type (point, interval, distribution)
Model: FTS model
Transformation: the name of the data transformation, if one was used
Order: the order of the FTS method
Scheme: UoD partitioning scheme
Partitions: number of partitions
Size: number of rules of the FTS model
Steps: prediction horizon, i.e., the number of steps ahead
Measure: accuracy measure
Value: the measure value
- conn – a sqlite3 database connection
Returns:
-
pyFTS.benchmarks.Util.
open_benchmark_db
(name)[source]¶ Open a connection with a Sqlite database designed to store benchmark results.
Parameters: name – database filename
Returns: a sqlite3 database connection
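A hypothetical query sketch combining the Util helpers above; the column names used in the filter and the groupby come from the tuple format documented for insert_benchmark above, while the concrete filter values are illustrative assumptions:
    from pyFTS.benchmarks import Util

    # Open a connection to the benchmark results database
    conn = Util.open_benchmark_db('benchmarks.db')

    # Query the stored results into a pandas dataframe and summarize them
    df = Util.get_dataframe_from_bd('benchmarks.db',
                                    "Dataset = 'TAIEX' AND Measure = 'rmse'")
    print(df.groupby(['Model', 'Order', 'Partitions'])['Value'].mean())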
-
pyFTS.benchmarks.Util.
plot_dataframe_interval
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
plot_dataframe_interval_pinball
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
plot_dataframe_point
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['UAVG', 'RMSEAVG', 'USTD', 'RMSESTD'], sort_ascend=[1, 1, 1, 1], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
plot_dataframe_probabilistic
(file_synthetic, file_analytic, experiments, tam, save=False, file=None, sort_columns=['CRPS1AVG', 'CRPS2AVG', 'CRPS1STD', 'CRPS2STD'], sort_ascend=[True, True, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
process_common_data
(dataset, tag, type, job)[source]¶ Wrap benchmark information in a tuple for the sqlite database
Parameters: - dataset – benchmark dataset
- tag – benchmark set alias
- type – forecasting type
- job – a dictionary with benchmark data
Returns: tuple for sqlite database
-
pyFTS.benchmarks.Util.
save_dataframe_interval
(coverage, experiments, file, objs, resolution, save, sharpness, synthetic, times, q05, q25, q75, q95, steps, method)[source]¶
-
pyFTS.benchmarks.Util.
save_dataframe_point
(experiments, file, objs, rmse, save, synthetic, smape, times, u, steps, method)[source]¶ Create a dataframe to store the benchmark results
Parameters: - experiments – dictionary with the execution results
- file –
- objs –
- rmse –
- save –
- synthetic –
- smape –
- times –
- u –
Returns:
-
pyFTS.benchmarks.Util.
save_dataframe_probabilistic
(experiments, file, objs, crps, times, save, synthetic, steps, method)[source]¶ Save benchmark results for m-step ahead probabilistic forecasters
Parameters: - experiments –
- file –
- objs –
- crps –
- times –
- save –
- synthetic –
Returns:
-
pyFTS.benchmarks.Util.
unified_scaled_interval
(experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
-
pyFTS.benchmarks.Util.
unified_scaled_interval_pinball
(experiments, tam, save=False, file=None, sort_columns=['COVAVG', 'SHARPAVG', 'COVSTD', 'SHARPSTD'], sort_ascend=[True, False, True, True], save_best=False, ignore=None, replace=None)[source]¶
pyFTS.benchmarks.arima module¶
-
class
pyFTS.benchmarks.arima.
ARIMA
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
Façade for statsmodels.tsa.arima_model
-
forecast
(ndata, **kwargs)[source]¶ Point forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with the forecasted values
-
forecast_ahead_distribution
(data, steps, **kwargs)[source]¶ Probabilistic forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted Probability Distributions
-
forecast_ahead_interval
(ndata, steps, **kwargs)[source]¶ Interval forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted intervals
-
forecast_distribution
(data, **kwargs)[source]¶ Probabilistic forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with probabilistic.ProbabilityDistribution objects representing the forecasted Probability Distributions
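A hypothetical usage sketch of the ARIMA façade; the (p, d, q) ‘order’ keyword is an assumption about the constructor, which only documents **kwargs, and the TAIEX loader comes from another pyFTS module:
    from pyFTS.data import TAIEX
    from pyFTS.benchmarks import arima

    data = TAIEX.get_data()
    train, test = data[:3000], data[3000:3100]

    model = arima.ARIMA(order=(2, 0, 1))   # assumed keyword for the ARIMA(p, d, q) order
    model.fit(train)

    points = model.forecast(test)                        # one step ahead point forecasts
    intervals = model.forecast_ahead_interval(test, 10)  # interval forecasts 10 steps ahead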
pyFTS.benchmarks.knn module¶
-
class
pyFTS.benchmarks.knn.
KNearestNeighbors
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
K-Nearest Neighbors
-
forecast_distribution
(data, **kwargs)[source]¶ Probabilistic forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with probabilistic.ProbabilityDistribution objects representing the forecasted Probability Distributions
pyFTS.benchmarks.naive module¶
-
class
pyFTS.benchmarks.naive.
Naive
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
Naïve Forecasting method
pyFTS.benchmarks.quantreg module¶
-
class
pyFTS.benchmarks.quantreg.
QuantileRegression
(**kwargs)[source]¶ Bases:
pyFTS.common.fts.FTS
Façade for statsmodels.regression.quantile_regression
-
forecast
(ndata, **kwargs)[source]¶ Point forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with the forecasted values
-
forecast_ahead_distribution
(ndata, steps, **kwargs)[source]¶ Probabilistic forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted Probability Distributions
-
forecast_ahead_interval
(ndata, steps, **kwargs)[source]¶ Interval forecast n steps ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- steps – the number of steps ahead to forecast
- kwargs – model specific parameters
Returns: a list with the forecasted intervals
-
forecast_distribution
(ndata, **kwargs)[source]¶ Probabilistic forecast one step ahead
Parameters: - data – time series data with the minimal length equal to the max_lag of the model
- kwargs – model specific parameters
Returns: a list with probabilistic.ProbabilityDistribution objects representing the forecasted Probability Distributions
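A hypothetical usage sketch of the QuantileRegression façade; the ‘order’ and ‘alpha’ keywords are assumptions about the constructor, which only documents **kwargs, and the TAIEX loader comes from another pyFTS module:
    from pyFTS.data import TAIEX
    from pyFTS.benchmarks import quantreg

    data = TAIEX.get_data()
    train, test = data[:3000], data[3000:3100]

    # 'order' (number of lags) and 'alpha' (interval significance) are assumed kwargs
    model = quantreg.QuantileRegression(order=2, alpha=0.05)
    model.fit(train)

    points = model.forecast(test)                        # one step ahead point forecasts
    intervals = model.forecast_ahead_interval(test, 10)  # interval forecasts 10 steps ahead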