drifter_ml.regression_tests package

Submodules

drifter_ml.regression_tests.regression_tests module

class drifter_ml.regression_tests.regression_tests.RegressionComparison(reg_one, reg_two, test_data, target_name, column_names)

Bases: object

cross_val_mae_result(reg, cv=3)
cross_val_mse_result(reg, cv=3)
cv_two_model_regression_testing(cv=3)
mae_result(reg)
mse_result(reg)
two_model_prediction_run_time_stress_test(sample_sizes)
two_model_regression_testing()
class drifter_ml.regression_tests.regression_tests.RegressionTests(reg, test_data, target_name, column_names)

Bases: object

cross_val_mae_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_mae_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_mae_upper_boundary(upper_boundary, cv=3)
cross_val_mse_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_mse_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_mse_upper_boundary(upper_boundary, cv=3)
cross_val_tae_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_tae_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_tae_upper_boundary(upper_boundary, cv=3)
cross_val_tse_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_tse_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_tse_upper_boundary(upper_boundary, cv=3)
describe_scores(scores, method)

Describes scores.

Parameters:
  • scores (array-like) – the scores from the model, as a list or numpy array
  • method (string) – the method to use to calculate central tendency and spread
Returns:
  The central tendency and spread of the scores, computed by the given method.

  Methods:
  • mean: central tendency (mean), spread (standard deviation)
  • median: central tendency (median), spread (interquartile range)
  • trimean: central tendency (trimean), spread (trimean absolute deviation)
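The three method options can be illustrated with a small standalone function. This is a minimal sketch and not the library's implementation: the name describe_scores_sketch, the (center, spread) return order, and the use of NumPy's population standard deviation are all assumptions made for illustration.

```python
import numpy as np


def trimean(data):
    # (25th percentile + 2 * 50th percentile + 75th percentile) / 4
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    return float((q1 + 2 * q2 + q3) / 4)


def trimean_absolute_deviation(data):
    # Average absolute distance of each value from the trimean.
    data = np.asarray(data, dtype=float)
    return float(np.mean(np.abs(data - trimean(data))))


def describe_scores_sketch(scores, method):
    """Hypothetical stand-in: return (central tendency, spread) for a method."""
    scores = np.asarray(scores, dtype=float)
    if method == "mean":
        # Assumption: population standard deviation (NumPy's default, ddof=0).
        return float(np.mean(scores)), float(np.std(scores))
    if method == "median":
        q1, q3 = np.percentile(scores, [25, 75])
        return float(np.median(scores)), float(q3 - q1)
    if method == "trimean":
        return trimean(scores), trimean_absolute_deviation(scores)
    raise ValueError("method must be 'mean', 'median', or 'trimean'")


scores = [1, 2, 3, 4, 5]
for method in ("mean", "median", "trimean"):
    center, spread = describe_scores_sketch(scores, method)
    print(method, center, spread)
```

The median and trimean variants depend only on quartiles, so they are less sensitive to a single extreme cross-validation fold than the mean and standard deviation.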

get_test_score(cross_val_dict)
mae_cv(cv)

This method performs cross-validation over median absolute error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold median absolute error
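For readers who want to see what k-fold scoring over median absolute error looks like in practice, here is a minimal sketch. It assumes a scikit-learn style estimator and uses scikit-learn's cross_validate, make_scorer, and median_absolute_error; the function name median_absolute_error_cv and the synthetic data are hypothetical, and nothing here is confirmed to match how mae_cv is implemented internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, median_absolute_error
from sklearn.model_selection import cross_validate


def median_absolute_error_cv(reg, X, y, cv=3):
    """Hypothetical helper: one median absolute error score per fold."""
    # make_scorer wraps the metric so cross_validate can call it on each fold.
    scorer = make_scorer(median_absolute_error)
    results = cross_validate(reg, X, y, cv=cv, scoring=scorer)
    # With a single scorer, the per-fold scores live under "test_score".
    return results["test_score"]


# Synthetic data for illustration only: a linear target with a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=60)

scores = median_absolute_error_cv(LinearRegression(), X, y, cv=3)
print(scores)  # one non-negative error value per fold
```

Because the metric is wrapped with the default greater_is_better=True, the returned values are plain errors, so lower is better.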
mae_upper_boundary(upper_boundary)
mse_cv(cv)

This method performs cross-validation over mean squared error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold mean squared error
mse_upper_boundary(upper_boundary)
run_time_stress_test(sample_sizes, max_run_times)
tae_cv(cv)

This method performs cross-validation over trimean absolute error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold trimean absolute error
tae_upper_boundary(upper_boundary)
trimean(data)

I’m exposing this as a public method because the trimean is not implemented in enough packages.

Formula: (25th percentile + 2*50th percentile + 75th percentile)/4

Parameters:data (array-like) – an iterable, either a list or a numpy array
Returns:the trimean
Return type:float
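The formula above can be written directly with NumPy percentiles. This is a sketch of the stated formula, assuming NumPy's default linear interpolation between data points; the library's own percentile handling may differ.

```python
import numpy as np


def trimean(data):
    # (25th percentile + 2 * 50th percentile + 75th percentile) / 4
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    return float((q1 + 2 * median + q3) / 4)


values = [1, 2, 3, 4, 100]
print(trimean(values))         # 3.0
print(float(np.mean(values)))  # 22.0
```

In this example the single outlier drags the mean to 22.0 while the trimean stays at 3.0, which is why it is useful as a robust measure of central tendency.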
trimean_absolute_deviation(data)

The trimean absolute deviation is the average distance from the trimean.

Parameters:data (array-like) – an iterable, either a list or a numpy array
Returns:the average distance to the trimean
Return type:float
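A minimal sketch of this definition, assuming "distance" means the absolute difference between each value and the trimean. The helper names mirror the methods documented here but are standalone functions written for illustration.

```python
import numpy as np


def trimean(data):
    # (25th percentile + 2 * 50th percentile + 75th percentile) / 4
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    return float((q1 + 2 * median + q3) / 4)


def trimean_absolute_deviation(data):
    # Mean of |x - trimean(data)| over all values (assumed definition).
    data = np.asarray(data, dtype=float)
    return float(np.mean(np.abs(data - trimean(data))))


values = [1, 2, 3, 4, 100]
print(trimean_absolute_deviation(values))
```

Note that the center is robust to outliers but the deviations are averaged with a plain mean, so an extreme value still inflates the spread.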
trimean_absolute_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')
trimean_squared_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')
tse_cv(cv)

This method performs cross-validation over trimean squared error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold trimean squared error
tse_upper_boundary(upper_boundary)
upper_bound_regression_testing(mse_upper_boundary, mae_upper_boundary, tse_upper_boundary, tae_upper_boundary)

Module contents

class drifter_ml.regression_tests.RegressionTests(reg, test_data, target_name, column_names)

Bases: object

cross_val_mae_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_mae_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_mae_upper_boundary(upper_boundary, cv=3)
cross_val_mse_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_mse_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_mse_upper_boundary(upper_boundary, cv=3)
cross_val_tae_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_tae_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_tae_upper_boundary(upper_boundary, cv=3)
cross_val_tse_anomaly_detection(tolerance, cv=3, method='mean')
cross_val_tse_avg(minimum_center_tolerance, cv=3, method='mean')
cross_val_tse_upper_boundary(upper_boundary, cv=3)
describe_scores(scores, method)

Describes scores.

Parameters:
  • scores (array-like) – the scores from the model, as a list or numpy array
  • method (string) – the method to use to calculate central tendency and spread
Returns:
  The central tendency and spread of the scores, computed by the given method.

  Methods:
  • mean: central tendency (mean), spread (standard deviation)
  • median: central tendency (median), spread (interquartile range)
  • trimean: central tendency (trimean), spread (trimean absolute deviation)

get_test_score(cross_val_dict)
mae_cv(cv)

This method performs cross-validation over median absolute error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold median absolute error
mae_upper_boundary(upper_boundary)
mse_cv(cv)

This method performs cross-validation over mean squared error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold mean squared error
mse_upper_boundary(upper_boundary)
run_time_stress_test(sample_sizes, max_run_times)
tae_cv(cv)

This method performs cross-validation over trimean absolute error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold trimean absolute error
tae_upper_boundary(upper_boundary)
trimean(data)

I’m exposing this as a public method because the trimean is not implemented in enough packages.

Formula: (25th percentile + 2*50th percentile + 75th percentile)/4

Parameters:data (array-like) – an iterable, either a list or a numpy array
Returns:the trimean
Return type:float
trimean_absolute_deviation(data)

The trimean absolute deviation is the average distance from the trimean.

Parameters:data (array-like) – an iterable, either a list or a numpy array
Returns:the average distance to the trimean
Return type:float
trimean_absolute_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')
trimean_squared_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')
tse_cv(cv)

This method performs cross-validation over trimean squared error.

Parameters:cv (int) – the number of cross-validation folds to perform
Returns:the scores of the k-fold trimean squared error
tse_upper_boundary(upper_boundary)
upper_bound_regression_testing(mse_upper_boundary, mae_upper_boundary, tse_upper_boundary, tae_upper_boundary)
class drifter_ml.regression_tests.RegressionComparison(reg_one, reg_two, test_data, target_name, column_names)

Bases: object

cross_val_mae_result(reg, cv=3)
cross_val_mse_result(reg, cv=3)
cv_two_model_regression_testing(cv=3)
mae_result(reg)
mse_result(reg)
two_model_prediction_run_time_stress_test(sample_sizes)
two_model_regression_testing()