bayes_models package

Submodules

bayes_models.model module

class bayes_models.model.BGLClassifier(numeric_cols, target_col, cat_cols)

Bases: bayes_models.model._AssembleClassificationModelCode, bayes_models.model._SSModelBase

Performs Bayesian-Gaussian logistic regression. Categorical columns (cat_cols) are declared separately from numeric columns (numeric_cols) so that the model treats them as categorical rather than numeric.

fit(df)

Fits the model to the data.

Parameters

df (pandas.DataFrame) – The dataset containing the feature columns and the target column

Returns

self

predict(test_df)

Predicts a binary label for each row of the test dataframe.

Parameters

test_df (pandas.DataFrame) – The dataset we will predict on.

Returns

res – The predicted binary labels

predict_proba(test_df)

Predicts the raw probability for each row of the test dataframe.

Parameters

test_df (pandas.DataFrame) – The dataset we will predict on.

Returns

res – The predicted probabilities
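The relationship between predict_proba and predict can be illustrated with a small sketch. It assumes that binary labels are obtained by cutting the probabilities at 0.5; the documentation above does not state the cutoff, and the helper name labels_from_probabilities is hypothetical, not part of bayes_models.

```python
from typing import List, Sequence


def labels_from_probabilities(
    probs: Sequence[float], threshold: float = 0.5
) -> List[int]:
    """Turn positive-class probabilities into 0/1 labels.

    Assumption: a probability at or above `threshold` maps to 1.
    The real cutoff used by the library is not documented here.
    """
    return [1 if p >= threshold else 0 for p in probs]


example_probs = [0.1, 0.5, 0.93, 0.49]
example_labels = labels_from_probabilities(example_probs)
```

Keeping the threshold as a parameter makes the assumption explicit and easy to change if the library uses a different rule.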

score(x_val, y_val)

Returns the accuracy of the model's predictions on the validation data.

Parameters
  • x_val (pandas.DataFrame) – The validation dataset

  • y_val (numpy.ndarray or pandas.DataFrame) – The actual target values for the validation dataset

Returns

Accuracy of the predictions
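Accuracy here is taken to mean the fraction of validation rows whose predicted label equals the actual label. A minimal sketch of that calculation, written in plain Python and independent of bayes_models (the function name accuracy is illustrative only):

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where the prediction matches the actual label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("accuracy is undefined for empty inputs")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Three of the four predictions match the actual labels.
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```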

class bayes_models.model.BGLRegressor(numeric_cols, target_col, cat_cols)

Bases: bayes_models.model._AssembleRegressionModelCode, bayes_models.model._SSModelBase

Performs Bayesian-Gaussian linear regression. Categorical columns (cat_cols) are declared separately from numeric columns (numeric_cols) so that the model treats them as categorical rather than numeric.

fit(df)

Fits the model to the data.

Parameters

df (pandas.DataFrame) – The dataset containing the feature columns and the target column

Returns

self

predict(test_df)

Predicts a continuous value for each row of the test dataframe.

Parameters

test_df (pandas.DataFrame) – The dataset we will predict on.

Returns

res – The predicted values

score(x_val, y_val)

Returns the R2 score between the predicted values and the actual values.

Parameters
  • x_val (pandas.DataFrame) – The validation dataset

  • y_val (numpy.ndarray or pandas.DataFrame) – The actual target values for the validation dataset

Returns

R2 Score of Predictions
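For reference, the R2 score is conventionally defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it in plain Python; it is not the library's code, the name r_squared is illustrative, and raising an error for a constant target is a choice made for this example.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("R2 is undefined for empty inputs")
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of the actual values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


r2 = r_squared([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.0])
```

A score of 1.0 means the predictions match exactly; 0.0 means they do no better than predicting the mean of the actual values.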

class bayes_models.model.BGRFClassifier(train_cols, target_col, n_estimators, max_depth)

Bases: bayes_models.model._AssembleClassificationModelCode, bayes_models.model._SSModelBase

Performs Bayesian-Gaussian logistic regression on the predictions of the individual decision trees of a fitted sklearn.ensemble.RandomForestClassifier. This lets the model learn how much weight to give each tree's prediction.

fit(df)

First, the random forest is fitted to the dataset. The predictions of its individual decision trees are then used as features for the Bayesian model.

Parameters

df (pandas.DataFrame) – The dataset containing the feature columns and the target column

Returns

self
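The two-stage fit described above can be sketched with scikit-learn alone. This is an illustration under stated assumptions, not the bayes_models implementation: the synthetic data is made up, and a plain sklearn.linear_model.LogisticRegression stands in for the Bayesian-Gaussian logistic regression, whose details are not given here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Stage 1: fit the random forest on the dataset.
forest = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
forest.fit(X, y)

# One column per tree: that tree's probability for the positive class.
tree_features = np.column_stack(
    [tree.predict_proba(X)[:, 1] for tree in forest.estimators_]
)

# Stage 2: fit a second model on the per-tree predictions.
# Stand-in only: ordinary logistic regression, not a Bayesian model.
stacker = LogisticRegression()
stacker.fit(tree_features, y)
stacked_probs = stacker.predict_proba(tree_features)[:, 1]
```

Each coefficient of the second-stage model plays the role of a learned weight for one tree, in contrast to the forest's own equal-weight averaging.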

predict(test_df)

Predicts a binary label for each row of the test dataframe.

Parameters

test_df (pandas.DataFrame) – The dataset we will predict on.

Returns

res – The predicted binary labels

predict_proba(test_df)

Predicts the raw probability for each row of the test dataframe.

Parameters

test_df (pandas.DataFrame) – The dataset we will predict on.

Returns

res – The predicted probabilities

score(x_val, y_val)

Returns the accuracy of the model's predictions on the validation data.

Parameters
  • x_val (pandas.DataFrame) – The validation dataset

  • y_val (numpy.ndarray or pandas.DataFrame) – The actual target values for the validation dataset

Returns

Accuracy of the predictions

class bayes_models.model.BGRFRegressor(train_cols, target_col, n_estimators, max_depth)

Bases: bayes_models.model._AssembleRegressionModelCode, bayes_models.model._SSModelBase

Performs Bayesian-Gaussian linear regression on the predictions of the individual decision trees of a fitted sklearn.ensemble.RandomForestRegressor. This lets the model learn how much weight to give each tree's prediction.

fit(df)

First, the random forest is fitted to the dataset. The predictions of its individual decision trees are then used as features for the Bayesian model.

Parameters

df (pandas.DataFrame) – The dataset containing the feature columns and the target column

Returns

self
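To see what learning per-tree weights changes, compare it with the forest's default behaviour, which averages all trees equally. The sketch below uses made-up data and an ordinary least-squares fit (numpy.linalg.lstsq) as a stand-in for the Bayesian-Gaussian linear regression; it is an assumption-laden illustration, not the library's code.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=8, max_depth=3, random_state=0)
forest.fit(X, y)

# One column per tree: that tree's prediction for every row.
tree_preds = np.column_stack([tree.predict(X) for tree in forest.estimators_])

# The forest's own prediction is the equal-weight mean of its trees.
uniform_pred = tree_preds.mean(axis=1)

# Stand-in for the second-stage model: least-squares weights plus intercept.
design = np.column_stack([np.ones(len(y)), tree_preds])
weights, *_ = np.linalg.lstsq(design, y, rcond=None)
weighted_pred = design @ weights

mse_uniform = float(np.mean((y - uniform_pred) ** 2))
mse_weighted = float(np.mean((y - weighted_pred) ** 2))
```

On the training data the weighted combination cannot do worse than the equal-weight mean, since equal weights are one of the candidate solutions; whether it generalises better depends on the data and on the priors of the actual Bayesian model.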

predict(test_df)

Predicts a continuous value for each row of the test dataframe.

Parameters

test_df (pandas.DataFrame) – The dataset we will predict on.

Returns

res – The predicted values

score(x_val, y_val)

Returns the R2 score between the predicted values and the actual values.

Parameters
  • x_val (pandas.DataFrame) – The validation dataset

  • y_val (numpy.ndarray or pandas.DataFrame) – The actual target values for the validation dataset

Returns

R2 Score of Predictions

Module contents