Module futureexpert.matcher
Contains the models with the configuration for the matcher and the result format.
Classes
class ActualsCovsConfiguration (**data: Any)
class ActualsCovsConfiguration(BaseModel):
    """Configuration of actuals and covariates via name and lag.

    Parameters
    ----------
    actuals_name: builtins.str
        Name of the time series.
    covs_configurations: builtins.list
        List of Covariates.
    """
    actuals_name: str
    covs_configurations: list[CovariateRef]
Configuration of actuals and covariates via name and lag.
Parameters
actuals_name
:builtins.str
- Name of the time series.
covs_configurations
:builtins.list
- List of Covariates.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Ancestors
- pydantic.main.BaseModel
Class variables
var actuals_name : str
var covs_configurations : list[CovariateRef]
var model_config
class CovariateRankingDetails (**data: Any)
class CovariateRankingDetails(BaseModel):
    """Final rank for a given set of covariates.

    Parameters
    ----------
    rank: futureexpert.shared_models.PositiveInt
        Rank for the given set of covariates.
    covariates: builtins.list
        Used covariates (might be zero or more than one).
    """
    model_config = ConfigDict(arbitrary_types_allowed=True)
    rank: ValidatedPositiveInt
    covariates: list[Covariate]
Final rank for a given set of covariates.
Parameters
rank
:PositiveInt
- Rank for the given set of covariates.
covariates
:builtins.list
- Used covariates (might be zero or more than one).
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Ancestors
- pydantic.main.BaseModel
Class variables
var covariates : list[Covariate]
var model_config
var rank : PositiveInt
class LagSelectionConfig (**data: Any)
class LagSelectionConfig(BaseModel):
    """Configures covariate lag selection.

    Parameters
    ----------
    fixed_lags: typing.Optional
        Lags that are tested in the lag selection.
    min_lag: typing.Optional
        Minimal lag that is tested in the lag selection. For example, a lag 3 means the covariate
        is shifted 3 data points into the future.
    max_lag: typing.Optional
        Maximal lag that is tested in the lag selection. For example, a lag 12 means the covariate
        is shifted 12 data points into the future.
    """
    min_lag: Optional[int] = None
    max_lag: Optional[int] = None
    fixed_lags: Optional[list[int]] = None

    @model_validator(mode='after')
    def _check_range(self) -> Self:
        if (self.min_lag is None) ^ (self.max_lag is None):
            raise ValueError(
                'If one of `min_lag` and `max_lag` is set the other one also needs to be set.')
        if self.min_lag and self.max_lag:
            if self.fixed_lags is not None:
                raise ValueError('Fixed lags and min/max lag are mutually exclusive.')
            if self.max_lag < self.min_lag:
                raise ValueError('max_lag needs to be greater or equal to min_lag.')
            lag_range = abs(self.max_lag - self.min_lag) + 1
            if lag_range > 15:
                raise ValueError(f'Only 15 lags are allowed to be tested. The requested range has length {lag_range}.')
        if self.fixed_lags and len(self.fixed_lags) > 15:
            raise ValueError(
                f'Only 15 lags are allowed to be tested. The provided fixed lags has length {len(self.fixed_lags)}.')
        return self
Configures covariate lag selection.
Parameters
fixed_lags
:typing.Optional
- Lags that are tested in the lag selection.
min_lag
:typing.Optional
- Minimal lag that is tested in the lag selection. For example, a lag 3 means the covariate is shifted 3 data points into the future.
max_lag
:typing.Optional
- Maximal lag that is tested in the lag selection. For example, a lag 12 means the covariate is shifted 12 data points into the future.
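As a minimal illustration of what a lag means here, the sketch below shifts a covariate by a number of data points into the future, using plain integer positions as time indices. The function name `shift_into_future` and the dictionary representation are assumptions made for this example and are not part of futureexpert; how the library aligns series internally is not shown on this page.

```python
def shift_into_future(series: dict[int, float], lag: int) -> dict[int, float]:
    """Return a copy of `series` with every observation moved `lag` positions later."""
    return {position + lag: value for position, value in series.items()}


# Hypothetical covariate observed at positions 0, 1 and 2.
covariate = {0: 10.0, 1: 11.0, 2: 12.0}

# With a lag of 3, the value observed at position 0 lines up with position 3.
shifted = shift_into_future(covariate, lag=3)
print(shifted)  # {3: 10.0, 4: 11.0, 5: 12.0}
```

Read this way, a lag of 3 pairs each actuals observation with the covariate value recorded three data points earlier.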
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Ancestors
- pydantic.main.BaseModel
Class variables
var fixed_lags : list[int] | None
var max_lag : int | None
var min_lag : int | None
var model_config
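The validation rules of LagSelectionConfig can be summarized with a small stand-alone sketch. It mirrors the checks in the `_check_range` validator from the source listing for this class using only the standard library; the names `check_lag_selection`, `is_accepted` and `MAX_TESTED_LAGS` are hypothetical helpers for this example, and the real class raises a pydantic validation error rather than a bare `ValueError`.

```python
from typing import Optional

MAX_TESTED_LAGS = 15  # limit taken from the error messages in the source listing


def check_lag_selection(min_lag: Optional[int] = None,
                        max_lag: Optional[int] = None,
                        fixed_lags: Optional[list[int]] = None) -> None:
    """Raise ValueError for combinations that the documented validator rejects."""
    # Setting exactly one of the two bounds is rejected.
    if (min_lag is None) ^ (max_lag is None):
        raise ValueError("min_lag and max_lag must be set together.")
    # Same truthiness test as the source: a bound of 0 skips the checks in this branch.
    if min_lag and max_lag:
        if fixed_lags is not None:
            raise ValueError("fixed_lags and min/max lag are mutually exclusive.")
        if max_lag < min_lag:
            raise ValueError("max_lag must be greater than or equal to min_lag.")
        lag_range = max_lag - min_lag + 1
        if lag_range > MAX_TESTED_LAGS:
            raise ValueError(f"A range of length {lag_range} exceeds {MAX_TESTED_LAGS} lags.")
    if fixed_lags and len(fixed_lags) > MAX_TESTED_LAGS:
        raise ValueError(f"{len(fixed_lags)} fixed lags exceed {MAX_TESTED_LAGS} lags.")


def is_accepted(**kwargs) -> bool:
    """Return True if the combination passes the checks above."""
    try:
        check_lag_selection(**kwargs)
        return True
    except ValueError:
        return False


print(is_accepted(min_lag=1, max_lag=15))                 # True
print(is_accepted(min_lag=1, max_lag=16))                 # False
print(is_accepted(min_lag=3))                             # False
print(is_accepted(fixed_lags=[1, 3, 6]))                  # True
print(is_accepted(min_lag=1, max_lag=3, fixed_lags=[2]))  # False
```

In short: either give both `min_lag` and `max_lag` (spanning at most 15 lags) or give `fixed_lags` (at most 15 entries), but not both.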
class MatcherConfig (**data: Any)
class MatcherConfig(BaseConfig):
    """Configuration for a MATCHER run.

    Parameters
    ----------
    title: builtins.str
        A short description of the report.
    actuals_version: builtins.str
        The version ID of the actuals.
    covs_versions: builtins.list
        List of versions of the covariates.
    actuals_filter: builtins.dict
        Filter criterion for actuals time series. The given actuals version is
        automatically added as additional filter criterion. Possible Filter criteria are all fields that are part
        of the TimeSeries class. e.g. {'name': 'Sales'}
        For more complex filter check: https://www.mongodb.com/docs/manual/reference/operator/query/#query-selectors
    covs_filter: builtins.dict
        Filter criterion for covariates time series. The given covariate version is
        automatically added as additional filter criterion. Possible Filter criteria are all fields that are part
        of the TimeSeries class. e.g. {'name': 'Sales'}
        For more complex filter check: https://www.mongodb.com/docs/manual/reference/operator/query/#query-selectors
    max_ts_len: typing.Optional
        At most this number of most recent observations of the actuals time series is used. Check the variable
        MAX_TS_LEN_CONFIG for allowed configuration.
    lag_selection: futureexpert.matcher.LagSelectionConfig
        Configuration of covariate lag selection.
    evaluation_start_date: typing.Optional
        Optional start date for the evaluation. The input should be in the ISO format
        with date and time, "YYYY-mm-DDTHH:MM:SS", e.g., "2024-01-01T16:40:00".
        Actuals and covariate observations prior to this start date are dropped.
    evaluation_end_date: typing.Optional
        Optional end date for the evaluation. The input should be in the ISO format
        with date and time, "YYYY-mm-DDTHH:MM:SS", e.g., "2024-01-01T16:40:00".
        Actuals and covariate observations after this end date are dropped.
    max_publication_lag: builtins.int
        Maximal publication lag for the covariates. The publication lag of a covariate
        is the number of most recent observations (compared to the actuals) that are
        missing for the covariate. E.g., if the actuals (for monthly granularity) end
        in April 2023 but the covariate ends in February 2023, the covariate has a
        publication lag of 2.
    post_selection_queries: builtins.list
        List of queries that are executed on the ranking summary DataFrame. Only ranking entries that
        match the queries are kept. The query strings need to satisfy the pandas query syntax
        (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html). Here are the columns
        of the ranking summary DataFrame that you might want to filter on:

        Column Name      | Data Type | Description
        -----------------------------------------------------------------------------------------------
        Lag              | Int64     | Lag of the covariate.
        Rank             | float64   | Rank of the model.
        BetterThanNoCov  | bool      | Indicates whether the model is better than the non-cov model.
    enable_leading_covariate_selection: builtins.bool
        When True, all covariates after the lag is applied that do not have at least one more
        datapoint beyond the time period covered by actuals are removed from the candidate
        covariates passed to covariate selection.
    fixed_season_length: typing.Optional
        An optional parameter specifying the length of a season in the dataset.
    pool_covs: typing.Optional
        List of covariate definitions.
    db_name: typing.Optional
        Only accessible for internal use. Name of the database to use for storing the results.
    """
    title: str
    actuals_version: str
    covs_versions: list[str] = Field(default_factory=list)
    actuals_filter: dict[str, Any] = Field(default_factory=dict)
    covs_filter: dict[str, Any] = Field(default_factory=dict)
    max_ts_len: Annotated[
        Optional[int], pydantic.Field(ge=1, le=1500)] = None
    lag_selection: LagSelectionConfig = LagSelectionConfig()
    evaluation_start_date: Optional[str] = None
    evaluation_end_date: Optional[str] = None
    max_publication_lag: int = 2
    post_selection_queries: list[str] = []
    enable_leading_covariate_selection: bool = True
    fixed_season_length: Optional[int] = None
    pool_covs: Optional[list[PoolCovDefinition]] = None
    db_name: Optional[str] = None

    @model_validator(mode='after')
    def _validate_post_selection_queries(self) -> Self:
        # Validate the post-selection queries.
        invalid_queries = []
        columns = {
            'Lag': 'int',
            'Rank': 'float',
            'BetterThanNoCov': 'bool'
        }
        # Create an empty DataFrame with the specified column names and data types
        validation_df = pd.DataFrame(columns=columns.keys()).astype(columns)
        for postselection_query in self.post_selection_queries:
            try:
                validation_df.query(postselection_query, )
            except Exception:
                invalid_queries.append(postselection_query)

        if len(invalid_queries):
            raise ValueError("The following post-selection queries are invalidly formatted: "
                             f"{', '.join(invalid_queries)}. ")

        return self
Configuration for a MATCHER run.
Parameters
title
:builtins.str
- A short description of the report.
actuals_version
:builtins.str
- The version ID of the actuals.
covs_versions
:builtins.list
- List of versions of the covariates.
actuals_filter
:builtins.dict
- Filter criterion for the actuals time series. The given actuals version is automatically added as an additional filter criterion. Any field of the TimeSeries class can be used as a filter criterion, e.g. {'name': 'Sales'}. For more complex filters, see: https://www.mongodb.com/docs/manual/reference/operator/query/#query-selectors
covs_filter
:builtins.dict
- Filter criterion for the covariate time series. The given covariate version is automatically added as an additional filter criterion. Any field of the TimeSeries class can be used as a filter criterion, e.g. {'name': 'Sales'}. For more complex filters, see: https://www.mongodb.com/docs/manual/reference/operator/query/#query-selectors
max_ts_len
:typing.Optional
- At most this number of most recent observations of the actuals time series is used. Check the variable MAX_TS_LEN_CONFIG for allowed configuration.
lag_selection
:LagSelectionConfig
- Configuration of covariate lag selection.
evaluation_start_date
:typing.Optional
- Optional start date for the evaluation. The input should be in ISO format with date and time, "YYYY-mm-DDTHH:MM:SS", e.g., "2024-01-01T16:40:00". Actuals and covariate observations prior to this start date are dropped.
evaluation_end_date
:typing.Optional
- Optional end date for the evaluation. The input should be in ISO format with date and time, "YYYY-mm-DDTHH:MM:SS", e.g., "2024-01-01T16:40:00". Actuals and covariate observations after this end date are dropped.
max_publication_lag
:builtins.int
- Maximal publication lag for the covariates. The publication lag of a covariate is the number of most recent observations (compared to the actuals) that are missing for the covariate. E.g., if the actuals (for monthly granularity) end in April 2023 but the covariate ends in February 2023, the covariate has a publication lag of 2.
post_selection_queries
:builtins.list
- List of queries that are executed on the ranking summary DataFrame. Only ranking entries that match the queries are kept. The query strings need to satisfy the pandas query syntax (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html). Here are the columns of the ranking summary DataFrame that you might want to filter on:
Column Name | Data Type | Description
Lag | Int64 | Lag of the covariate.
Rank | float64 | Rank of the model.
BetterThanNoCov | bool | Indicates whether the model is better than the non-cov model.
enable_leading_covariate_selection
:builtins.bool
- When True, covariates that, after the lag is applied, do not have at least one data point beyond the time period covered by the actuals are removed from the candidate covariates passed to covariate selection.
fixed_season_length
:typing.Optional
- An optional parameter specifying the length of a season in the dataset.
pool_covs
:typing.Optional
- List of covariate definitions.
db_name
:typing.Optional
- Only accessible for internal use. Name of the database to use for storing the results.
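The publication lag described for max_publication_lag can be worked through for monthly data. The sketch below is a stand-alone illustration using only the standard library; the function name `monthly_publication_lag` is a hypothetical helper, and it only reproduces the arithmetic of the April 2023 / February 2023 example, not the library's own computation.

```python
from datetime import date


def monthly_publication_lag(actuals_end: date, covariate_end: date) -> int:
    """Number of most recent monthly observations the covariate is missing."""
    missing = ((actuals_end.year - covariate_end.year) * 12
               + (actuals_end.month - covariate_end.month))
    # A covariate that ends at or after the actuals is not missing anything.
    return max(missing, 0)


MAX_PUBLICATION_LAG = 2  # default of `max_publication_lag` in the source listing

# Actuals end in April 2023, the covariate ends in February 2023.
lag = monthly_publication_lag(actuals_end=date(2023, 4, 1), covariate_end=date(2023, 2, 1))
print(lag)                         # 2
print(lag <= MAX_PUBLICATION_LAG)  # True
```

With the default limit of 2, the covariate from the documented example is still within the allowed publication lag.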
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Ancestors
- BaseConfig
- pydantic.main.BaseModel
Class variables
var actuals_filter : dict[str, typing.Any]
var actuals_version : str
var covs_filter : dict[str, typing.Any]
var covs_versions : list[str]
var db_name : str | None
var enable_leading_covariate_selection : bool
var evaluation_end_date : str | None
var evaluation_start_date : str | None
var fixed_season_length : int | None
var lag_selection : LagSelectionConfig
var max_publication_lag : int
var max_ts_len : int | None
var model_config
var pool_covs : list[PoolCovDefinition] | None
var post_selection_queries : list[str]
var title : str
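The effect of evaluation_start_date and evaluation_end_date can be sketched as a simple window filter. This is an illustration under stated assumptions, not the library's implementation: the helpers `parse_evaluation_date` and `restrict_to_window` are hypothetical, the format string is inferred from the documented example "2024-01-01T16:40:00", and observations exactly on a boundary are assumed to be kept, since only observations prior to the start or after the end are described as dropped.

```python
from datetime import datetime
from typing import Optional

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"  # matches the documented example "2024-01-01T16:40:00"


def parse_evaluation_date(text: Optional[str]) -> Optional[datetime]:
    """Parse an evaluation date string, passing None through unchanged."""
    return None if text is None else datetime.strptime(text, ISO_FORMAT)


def restrict_to_window(observations, start: Optional[str] = None, end: Optional[str] = None):
    """Keep (timestamp, value) pairs that are not before `start` and not after `end`."""
    start_dt = parse_evaluation_date(start)
    end_dt = parse_evaluation_date(end)
    return [
        (timestamp, value)
        for timestamp, value in observations
        if (start_dt is None or timestamp >= start_dt)
        and (end_dt is None or timestamp <= end_dt)
    ]


# Hypothetical monthly observations for January to June 2023.
observations = [(datetime(2023, month, 1), float(month)) for month in range(1, 7)]
kept = restrict_to_window(observations, start="2023-02-01T00:00:00", end="2023-04-01T00:00:00")
print([value for _, value in kept])  # [2.0, 3.0, 4.0]
```

Leaving either bound as None keeps that side of the series unrestricted, matching the optional nature of both parameters.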
class MatcherResult (**data: Any)
class MatcherResult(BaseModel):
    """Result of a covariate matcher run and the corresponding input data.

    Parameters
    ----------
    actuals: futureexpert.shared_models.TimeSeries
        Time series for which the matching was performed.
    ranking: builtins.list
        Ranking of the different covariate and non-covariate models.
    """
    actuals: TimeSeries
    ranking: list[CovariateRankingDetails]

    def convert_ranking_to_forecast_config(self) -> ActualsCovsConfiguration:
        """Converts MATCHER results into the input format for the FORECAST."""
        covs_config = [CovariateRef(name=cov.ts.name, lag=cov.lag)
                       for r in self.ranking for cov in r.covariates]
        return ActualsCovsConfiguration(actuals_name=self.actuals.name,
                                        covs_configurations=covs_config)
Result of a covariate matcher run and the corresponding input data.
Parameters
actuals
:TimeSeries
- Time series for which the matching was performed.
ranking
:builtins.list
- Ranking of the different covariate and non-covariate models.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Ancestors
- pydantic.main.BaseModel
Class variables
var actuals : TimeSeries
var model_config
var ranking : list[CovariateRankingDetails]
Methods
def convert_ranking_to_forecast_config(self) ‑> ActualsCovsConfiguration
def convert_ranking_to_forecast_config(self) -> ActualsCovsConfiguration:
    """Converts MATCHER results into the input format for the FORECAST."""
    covs_config = [CovariateRef(name=cov.ts.name, lag=cov.lag)
                   for r in self.ranking for cov in r.covariates]
    return ActualsCovsConfiguration(actuals_name=self.actuals.name,
                                    covs_configurations=covs_config)
Converts MATCHER results into the input format for the FORECAST.
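To make the conversion concrete, the sketch below reproduces the flattening done by the list comprehension in the source of this method, using standard-library dataclasses as stand-ins. The classes `TimeSeriesStub`, `CovariateStub` and `RankingEntryStub`, the function `flatten_ranking`, and the series names are all hypothetical; the real method returns an ActualsCovsConfiguration holding CovariateRef objects rather than a dictionary of tuples.

```python
from dataclasses import dataclass, field


@dataclass
class TimeSeriesStub:
    name: str


@dataclass
class CovariateStub:
    ts: TimeSeriesStub
    lag: int


@dataclass
class RankingEntryStub:
    rank: int
    covariates: list[CovariateStub] = field(default_factory=list)


def flatten_ranking(actuals_name: str, ranking: list[RankingEntryStub]) -> dict:
    """Collect (name, lag) pairs of all covariates across all ranking entries, in order."""
    covs = [(cov.ts.name, cov.lag) for entry in ranking for cov in entry.covariates]
    return {"actuals_name": actuals_name, "covs_configurations": covs}


# Hypothetical ranking: one single-covariate model, one model without covariates,
# and one model that combines two covariates.
ranking = [
    RankingEntryStub(rank=1, covariates=[CovariateStub(TimeSeriesStub("GDP"), lag=2)]),
    RankingEntryStub(rank=2),
    RankingEntryStub(rank=3, covariates=[CovariateStub(TimeSeriesStub("GDP"), lag=2),
                                         CovariateStub(TimeSeriesStub("CPI"), lag=1)]),
]

config = flatten_ranking("Sales", ranking)
print(config["covs_configurations"])  # [('GDP', 2), ('GDP', 2), ('CPI', 1)]
```

As the output suggests, ranking entries without covariates contribute nothing, and a covariate that appears in several ranking entries is listed once per appearance, because the comprehension does not deduplicate.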