paradance.evaluation.Calculator

class paradance.evaluation.Calculator(df: DataFrame, selected_columns: List[str], overall_score_lower_bound: float | None = None, overall_score_upper_bound: float | None = None, equation_type: str = 'product', weights_for_groups: Series | None = None, equation_eval_str: str | None = None, equation_json: Dict | None = None, delimiter: str | None = '#', rerank_eval_str: str | None = None)

A calculator for processing and analyzing data within a DataFrame based on specified equations and methods.

df

The DataFrame to perform calculations on.

Type:

pd.DataFrame

selected_columns

The names of the columns to include in calculations.

Type:

List[str]

overall_score_lower_bound

The lower bound for overall scores.

Type:

Optional[float]

overall_score_upper_bound

The upper bound for overall scores.

Type:

Optional[float]

equation_eval_str

A string representing a custom equation to evaluate.

Type:

Optional[str]

equation_type

The type of equation to use for calculations (“product”, “sum”, “free_style”, or “json”).

Type:

str

selected_values

The values of the selected columns in the DataFrame.

Type:

np.ndarray

value_scales

The negative average log10 magnitude of absolute values for selected columns.

Type:

np.ndarray

weights_for_groups

A Series containing weights for different groups within the DataFrame.

Type:

pd.Series

__init__(df: DataFrame, selected_columns: List[str], overall_score_lower_bound: float | None = None, overall_score_upper_bound: float | None = None, equation_type: str = 'product', weights_for_groups: Series | None = None, equation_eval_str: str | None = None, equation_json: Dict | None = None, delimiter: str | None = '#', rerank_eval_str: str | None = None) → None

Initializes the Calculator object.

Parameters:
  • df (pd.DataFrame) – The DataFrame to perform calculations on.

  • selected_columns (List[str]) – The names of the columns to include in calculations.

  • equation_type (str, optional) – The type of equation to use for score calculation. Defaults to “product”.

  • weights_for_groups (Optional[pd.Series], optional) – A Series containing weights for different groups. Defaults to None, which sets equal weights.

  • equation_eval_str (Optional[str], optional) – A string representing a custom equation for free-style calculations. Defaults to None.

  • rerank_eval_str (Optional[str], optional) – A string representing a custom equation for reranking. Defaults to None.

Methods

__init__(df, selected_columns[, ...])

Initializes the Calculator object.

calculate_auc_triple_parameters(grid_interval)

calculate_corrcoef(mask_column, ...)

calculate_cumulative_deviation(mask_column, ...)

calculate_distinct_count_portfolio_concentration(...)

calculate_distinct_top_coverage(mask_column, ...)

calculate_inverse_pair(target_column[, ...])

calculate_log_mse(target_column[, ...])

calculate_mean(mask_column, target_column, ...)

calculate_neg_rank_ratio([label_column])

calculate_portfolio_concentration(...)

calculate_proportion(mask_column, ...)

calculate_standard_deviation(mask_column, ...)

calculate_tau(target_column, groupby[, ...])

calculate_top_coverage(mask_column, ...)

calculate_woauc(groupby, target_column[, ...])

calculate_wuauc(mask_column, target_column, ...)

clip_max(left, right)

Caps the values of right (an array or scalar) so that none exceeds left.

clip_min(left, right)

Raises the values of right (an array or scalar) so that none falls below left.

create_score_columns(boundary_dict[, ...])

Creates new columns in the DataFrame to categorize rows based on score boundaries.

get_overall_score(weights_for_equation)

Calculates the overall score for each row in the DataFrame based on the specified equation type and weights.

initialize_fq_sampler(sample_size, score_column)

Initializes a frequency sampler for a given score column and applies sampling results to create new columns.

initialize_local_dict(weights_for_equation, ...)

Initializes a dictionary that can be used for additional calculations.

rerank_with_side_information()

Reranks the rows in the DataFrame based on side information.

value_scale()

Calculates the negative average log10 magnitude of absolute values for selected columns in the dataframe, storing the result in self.value_scales.
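
The clip_max and clip_min entries in the summary above take the bound first and the data second, which is easy to get backwards. The sketch below shows one way to read that argument order using NumPy's element-wise minimum and maximum. The function names clip_max_sketch and clip_min_sketch are hypothetical stand-ins written for this illustration; they are not the library's implementation.

```python
import numpy as np


def clip_max_sketch(left, right):
    """Cap `right` at `left`: no returned value exceeds `left` (hypothetical helper)."""
    return np.minimum(right, left)


def clip_min_sketch(left, right):
    """Floor `right` at `left`: no returned value falls below `left` (hypothetical helper)."""
    return np.maximum(right, left)


scores = np.array([0.2, 1.5, 3.0])
capped = clip_max_sketch(1.0, scores)   # array([0.2, 1.0, 1.0])
floored = clip_min_sketch(1.0, scores)  # array([1.0, 1.5, 3.0])
```

Because np.minimum and np.maximum broadcast, the same helpers accept a scalar for right as well as an array.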

value_scale() → None

Calculates the negative average log10 magnitude of absolute values for selected columns in the dataframe, storing the result in self.value_scales.
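
To make "negative average log10 magnitude" concrete, the following is a minimal sketch of the arithmetic. It assumes the average is taken per column and that zeros are skipped (log10 of zero is undefined); the library may handle both points differently. The helper name negative_mean_log10_magnitude is hypothetical.

```python
import numpy as np


def negative_mean_log10_magnitude(values: np.ndarray) -> np.ndarray:
    """Return -mean(log10(|x|)) for each column (hypothetical helper, not library code)."""
    magnitudes = np.abs(values).astype(float)
    # Assumption: zeros are ignored, since log10(0) is undefined.
    magnitudes[magnitudes == 0] = np.nan
    return -np.nanmean(np.log10(magnitudes), axis=0)


# Two columns with very different magnitudes.
selected_values = np.array([
    [0.001, 100.0],
    [0.01, 1000.0],
    [0.1, 10000.0],
])
scales = negative_mean_log10_magnitude(selected_values)
print(scales)  # approximately [ 2. -3.]
```

Under this reading, a column of small values (around 1e-2) gets a positive scale and a column of large values (around 1e3) gets a negative one, so 10 ** scale is a factor that brings each column to roughly unit magnitude.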

get_overall_score(weights_for_equation: List[float]) → None

Calculates the overall score for each row in the DataFrame based on the specified equation type and weights.

Parameters:

weights_for_equation (List[float]) – A list of weights to apply to each selected column for the calculation.
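
The text above does not spell out the formulas behind the "product" and "sum" equation types. As an illustration only, the sketch below assumes a common convention: for "product", each column is raised to its weight and the columns are multiplied; for "sum", the columns are combined as a weighted sum. These forms are assumptions rather than the library's confirmed behavior, and overall_score_sketch is a hypothetical function. It returns the scores, whereas get_overall_score is documented as returning None.

```python
import numpy as np


def overall_score_sketch(selected_values, weights_for_equation, equation_type="product"):
    """Combine columns into one score per row (hypothetical; formulas are assumed)."""
    values = np.asarray(selected_values, dtype=float)
    weights = np.asarray(weights_for_equation, dtype=float)
    if values.ndim != 2 or weights.shape != (values.shape[1],):
        raise ValueError("expected a 2-D array and one weight per selected column")
    if equation_type == "product":
        # Assumed form: weights act as exponents, then columns are multiplied.
        return np.prod(values ** weights, axis=1)
    if equation_type == "sum":
        # Assumed form: weighted sum across columns.
        return values @ weights
    raise ValueError(f"unsupported equation_type in this sketch: {equation_type!r}")


rows = [[2.0, 3.0], [4.0, 1.0]]
weights = [1.0, 2.0]
product_scores = overall_score_sketch(rows, weights, "product")  # [2 * 3**2, 4 * 1**2]
sum_scores = overall_score_sketch(rows, weights, "sum")          # [2 + 2*3, 4 + 2*1]
```

The "free_style" and "json" equation types depend on equation_eval_str and equation_json, whose formats are not described here, so the sketch leaves them out.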

create_score_columns(boundary_dict: dict, score_column: str = 'score') → None

Creates new columns in the DataFrame to categorize rows based on score boundaries.

Parameters:
  • boundary_dict (Dict) – A dictionary with score boundaries as keys and conditions as values.

  • score_column (str, optional) – The name of the column to apply the boundaries to. Defaults to “score”.

initialize_fq_sampler(sample_size: int, score_column: str, slice_from: float | None = None, slice_to: float | None = None, log_scale: bool | None = True, laplace_smoothing: bool | None = True) → None

Initializes a frequency sampler for a given score column and applies sampling results to create new columns.

Parameters:
  • sample_size (int) – The size of the sample to generate.

  • score_column (str) – The name of the score column to sample from.

  • slice_from (Optional[float], optional) – The lower bound of the score range to sample. Defaults to None.

  • slice_to (Optional[float], optional) – The upper bound of the score range to sample. Defaults to None.

  • log_scale (Optional[bool], optional) – Whether to use logarithmic scaling for sampling. Defaults to True.

  • laplace_smoothing (Optional[bool], optional) – Whether to apply Laplace smoothing to the sampling. Defaults to True.