drifter_ml.regression_tests package

Submodules

drifter_ml.regression_tests.regression_tests module

class drifter_ml.regression_tests.regression_tests.RegressionComparison(reg_one, reg_two, test_data, target_name, column_names)

Bases: object

cross_val_mae_result(reg, cv=3)

cross_val_mse_result(reg, cv=3)

cv_two_model_regression_testing(cv=3)

mae_result(reg)

mse_result(reg)

two_model_prediction_run_time_stress_test(performance_boundary)

two_model_regression_testing()

class drifter_ml.regression_tests.regression_tests.RegressionTests(reg, test_data, target_name, column_names)

Bases: object

cross_val_mae_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_mae_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_mae_upper_boundary(upper_boundary, cv=3)

cross_val_mse_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_mse_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_mse_upper_boundary(upper_boundary, cv=3)

cross_val_tae_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_tae_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_tae_upper_boundary(upper_boundary, cv=3)

cross_val_tse_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_tse_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_tse_upper_boundary(upper_boundary, cv=3)

describe_scores(scores, method)

Describes the scores by their central tendency and spread.

Parameters:

scores (array-like) – the scores from the model, as a list or numpy array
method (string) – the method used to calculate central tendency and spread

Returns:

The central tendency and spread of the scores, as determined by method.

Methods:

mean
* central tendency (mean)
* spread (standard deviation)
median
* central tendency (median)
* spread (interquartile range)
trimean
* central tendency (trimean)
* spread (trimean absolute deviation)
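The three method options can be illustrated with a minimal standard-library sketch. This is not the library's implementation: the name `describe_scores_sketch`, the use of the population standard deviation, and the inclusive (linear) quartile interpolation are all assumptions made for illustration.

```python
import statistics


def describe_scores_sketch(scores, method):
    """Return (central tendency, spread) for a list of scores.

    Hypothetical helper mirroring the table above; not part of drifter_ml.
    """
    scores = list(scores)
    # Quartiles with linear interpolation between data points (an assumption).
    q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    if method == "mean":
        # Population standard deviation is assumed here.
        return statistics.mean(scores), statistics.pstdev(scores)
    if method == "median":
        # Spread is the interquartile range.
        return q2, q3 - q1
    if method == "trimean":
        center = (q1 + 2 * q2 + q3) / 4
        # Spread is the average absolute distance from the trimean.
        spread = statistics.mean([abs(s - center) for s in scores])
        return center, spread
    raise ValueError("method must be 'mean', 'median', or 'trimean'")


for name in ("mean", "median", "trimean"):
    print(name, describe_scores_sketch([1, 2, 3, 4, 5], name))
```

The median and trimean variants depend only on quartiles, so a single extreme score moves them far less than it moves the mean and standard deviation.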

get_test_score(cross_val_dict)

mae_cv(cv)

This method performs cross-validation over median absolute error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold median absolute error

mae_upper_boundary(upper_boundary)

mse_cv(cv)

This method performs cross-validation over mean squared error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold mean squared error
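To make "scores of the k-fold mean squared error" concrete, here is a minimal pure-Python sketch of k-fold scoring. It is an illustration only: `k_fold_mse`, `fit_mean`, and the strided fold assignment are hypothetical and do not reflect how the library splits data or fits `reg`.

```python
def k_fold_mse(xs, ys, fit, cv=3):
    """Return one mean squared error per fold.

    `fit(train_xs, train_ys)` must return a callable `predict(x)`.
    Folds are strided (sample i goes to fold i % cv), a simplification.
    """
    n = len(xs)
    scores = []
    for fold in range(cv):
        test_idx = set(range(fold, n, cv))
        train_xs = [x for i, x in enumerate(xs) if i not in test_idx]
        train_ys = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_xs, train_ys)
        # Score the model only on the held-out samples of this fold.
        errors = [(predict(xs[i]) - ys[i]) ** 2 for i in sorted(test_idx)]
        scores.append(sum(errors) / len(errors))
    return scores


def fit_mean(train_xs, train_ys):
    """Toy model: always predict the mean of the training targets."""
    mean_y = sum(train_ys) / len(train_ys)
    return lambda x: mean_y


print(k_fold_mse([0, 1, 2, 3, 4, 5], [0, 3, 6, 0, 3, 6], fit_mean, cv=3))
# → [20.25, 0.0, 20.25]
```

The result has one entry per fold, which is the kind of score list that the cross_val_* checks and describe_scores operate on.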

mse_upper_boundary(upper_boundary)

run_time_stress_test(sample_sizes, max_run_times)

tae_cv(cv)

This method performs cross-validation over trimean absolute error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold trimean absolute error
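The page does not define the trimean absolute error itself. A plausible reading, assumed here rather than confirmed by the source, is the trimean of the absolute residuals, with the squared variant using squared residuals. The function names below are hypothetical, and the sketch ignores the `sample_weight` and `multioutput` parameters.

```python
import statistics


def _trimean(values):
    # (25th percentile + 2 * 50th percentile + 75th percentile) / 4
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


def trimean_absolute_error_sketch(y_true, y_pred):
    """Trimean of |y_true - y_pred| (assumed definition)."""
    return _trimean([abs(t - p) for t, p in zip(y_true, y_pred)])


def trimean_squared_error_sketch(y_true, y_pred):
    """Trimean of (y_true - y_pred)^2 (assumed definition)."""
    return _trimean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


y_true = [3, 5, 7, 9, 11]
y_pred = [2, 5, 9, 12, 15]
print(trimean_absolute_error_sketch(y_true, y_pred))  # → 2.0
print(trimean_squared_error_sketch(y_true, y_pred))   # → 4.5
```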

tae_upper_boundary(upper_boundary)

trimean(data)

I’m exposing this as a public method because the trimean is not implemented in enough packages.

Formula: (25th percentile + 2*50th percentile + 75th percentile)/4

Parameters:	data (array-like) – an iterable, either a list or a numpy array
Returns:	the trimean
Return type:	float
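The formula above can be sketched with the standard library alone. The name `trimean_sketch` is hypothetical, and the inclusive (linear) interpolation for the percentiles is an assumption; other percentile conventions give slightly different values on small samples.

```python
import statistics


def trimean_sketch(data):
    """(25th percentile + 2 * 50th percentile + 75th percentile) / 4."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


data = [1, 2, 3, 4, 100]
print(trimean_sketch(data))   # → 3.0
print(statistics.mean(data))  # → 22
```

The single outlier pulls the mean to 22 while the trimean stays at 3.0, which is why it is useful as a robust measure of center.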

trimean_absolute_deviation(data)

The trimean absolute deviation is the average distance from the trimean.

Parameters:	data (array-like) – an iterable, either a list or a numpy array
Returns:	the average distance to the trimean
Return type:	float
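As a minimal sketch of that definition, the deviation can be computed as the mean of the absolute distances between each value and the trimean. The function name and the percentile interpolation are assumptions, not the library's code.

```python
import statistics


def trimean_absolute_deviation_sketch(data):
    """Average absolute distance of each value from the trimean."""
    data = list(data)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    center = (q1 + 2 * q2 + q3) / 4
    return statistics.mean([abs(x - center) for x in data])


# Trimean is 3.0, so the distances are 2, 1, 0, 1, 2.
print(trimean_absolute_deviation_sketch([1, 2, 3, 4, 5]))
```

Only the center is robust here: the distances are averaged with an ordinary mean, so a far-away value still increases the reported spread.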

trimean_absolute_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')

trimean_squared_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')

tse_cv(cv)

This method performs cross-validation over trimean squared error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold trimean squared error

tse_upper_boundary(upper_boundary)

upper_bound_regression_testing(mse_upper_boundary, mae_upper_boundary, tse_upper_boundary, tae_upper_boundary)

Module contents

class drifter_ml.regression_tests.RegressionTests(reg, test_data, target_name, column_names)

Bases: object

cross_val_mae_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_mae_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_mae_upper_boundary(upper_boundary, cv=3)

cross_val_mse_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_mse_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_mse_upper_boundary(upper_boundary, cv=3)

cross_val_tae_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_tae_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_tae_upper_boundary(upper_boundary, cv=3)

cross_val_tse_anomaly_detection(tolerance, cv=3, method='mean')

cross_val_tse_avg(minimum_center_tolerance, cv=3, method='mean')

cross_val_tse_upper_boundary(upper_boundary, cv=3)

describe_scores(scores, method)

Describes the scores by their central tendency and spread.

Parameters:

scores (array-like) – the scores from the model, as a list or numpy array
method (string) – the method used to calculate central tendency and spread

Returns:

The central tendency and spread of the scores, as determined by method.

Methods:

mean
* central tendency (mean)
* spread (standard deviation)
median
* central tendency (median)
* spread (interquartile range)
trimean
* central tendency (trimean)
* spread (trimean absolute deviation)

get_test_score(cross_val_dict)

mae_cv(cv)

This method performs cross-validation over median absolute error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold median absolute error

mae_upper_boundary(upper_boundary)

mse_cv(cv)

This method performs cross-validation over mean squared error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold mean squared error

mse_upper_boundary(upper_boundary)

run_time_stress_test(sample_sizes, max_run_times)

tae_cv(cv)

This method performs cross-validation over trimean absolute error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold trimean absolute error

tae_upper_boundary(upper_boundary)

trimean(data)

I’m exposing this as a public method because the trimean is not implemented in enough packages.

Formula: (25th percentile + 2*50th percentile + 75th percentile)/4

Parameters:	data (array-like) – an iterable, either a list or a numpy array
Returns:	the trimean
Return type:	float

trimean_absolute_deviation(data)

The trimean absolute deviation is the average distance from the trimean.

Parameters:	data (array-like) – an iterable, either a list or a numpy array
Returns:	the average distance to the trimean
Return type:	float

trimean_absolute_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')

trimean_squared_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')

tse_cv(cv)

This method performs cross-validation over trimean squared error.

Parameters:	cv (int) – the number of cross-validation folds to perform
Returns:	the scores of the k-fold trimean squared error

tse_upper_boundary(upper_boundary)

upper_bound_regression_testing(mse_upper_boundary, mae_upper_boundary, tse_upper_boundary, tae_upper_boundary)

class drifter_ml.regression_tests.RegressionComparison(reg_one, reg_two, test_data, target_name, column_names)

Bases: object

cross_val_mae_result(reg, cv=3)

cross_val_mse_result(reg, cv=3)

cv_two_model_regression_testing(cv=3)

mae_result(reg)

mse_result(reg)

two_model_prediction_run_time_stress_test(performance_boundary)

two_model_regression_testing()