lal.spark package
Submodules
lal.spark.model module
class lal.spark.model.LALGBSparkBinaryClassifier(**kwargs)
    Bases: lal.spark.model._LALModelBase

    Used when the training labels are binary.

    predict(**kwargs)
        Predicts the most probable label for each sample in the testing dataset.

        Parameters:
            sdf1 (pyspark.sql.dataframe.DataFrame) – The training dataset
            sdf2 (pyspark.sql.dataframe.DataFrame) – The testing dataset
        Returns:

    predict_proba(**kwargs)
        Predicts, for each sample in the testing dataset, the probability of each label available in the training dataset.

        Parameters:
            sdf1 (pyspark.sql.dataframe.DataFrame) – The training dataset
            sdf2 (pyspark.sql.dataframe.DataFrame) – The testing dataset
        Returns:
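The relationship between the two methods above can be illustrated outside of Spark: predict amounts to taking, per sample, the label whose predict_proba score is highest. A minimal plain-Python sketch (the per-label probabilities below are made up for illustration, not produced by lal):

```python
# Hypothetical per-sample label probabilities, shaped like the output of
# predict_proba (label -> probability); the values are made up.
proba_rows = [
    {"0": 0.21, "1": 0.79},
    {"0": 0.66, "1": 0.34},
]

def most_probable_label(row):
    """Pick the label with the highest predicted probability."""
    return max(row, key=row.get)

predictions = [most_probable_label(row) for row in proba_rows]
print(predictions)  # ['1', '0']
```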
class lal.spark.model.LALGBSparkCategoricalClassifier(**kwargs)
    Bases: lal.spark.model._LALModelBase

    predict(**kwargs)
        Predicts the most probable label for each sample in the testing dataset.

        Parameters:
            sdf1 (pyspark.sql.dataframe.DataFrame) – The training dataset
            sdf2 (pyspark.sql.dataframe.DataFrame) – The testing dataset
        Returns:

    predict_proba(**kwargs)
        Predicts, for each sample in the testing dataset, the probability of each label available in the training dataset.

        Parameters:
            sdf1 (pyspark.sql.dataframe.DataFrame) – The training dataset
            sdf2 (pyspark.sql.dataframe.DataFrame) – The testing dataset
        Returns:
class lal.spark.model.LALGBSparkMultiBinaryClassifier(**kwargs)
    Bases: lal.spark.model._LALSparkMultiBase

    The multioutput binary classifier, used when the training labels are all binary.

    predict_proba(sdf1, sdf2)

    task_base
        alias of LALGBSparkBinaryClassifier
class lal.spark.model.LALGBSparkMultiCategoricalClassifier(**kwargs)
    Bases: lal.spark.model._LALSparkMultiBase

    The multioutput categorical classifier, used when the training labels are all categorical.

    predict_proba(sdf1, sdf2)

    task_base
        alias of LALGBSparkCategoricalClassifier
class lal.spark.model.LALGBSparkMultiRegressorClassifier(**kwargs)
    Bases: lal.spark.model._LALSparkMultiBase

    The multioutput regressor, used when the training labels are all continuous.

    task_base
        alias of LALGBSparkRegressor
class lal.spark.model.LALGBSparkRegressor(**kwargs)
    Bases: lal.spark.model._LALModelBase

    Used when the training labels are continuous.

    predict(**kwargs)
        Predicts the value each sample in the testing dataset is likely to have, based on the continuous training labels.

        Parameters:
            sdf1 (pyspark.sql.dataframe.DataFrame) – The training dataset
            sdf2 (pyspark.sql.dataframe.DataFrame) – The testing dataset
        Returns:
lal.spark.nn module
class lal.spark.nn.KNNCosineMatcher(k)
    Bases: lal.spark.nn._CosineDistance, lal.spark.nn._KNNMatcherBase

    The k-nearest-neighbor algorithm with the cosine distance measure.
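The distance measure behind this matcher can be sketched in plain Python (a stdlib stand-in for illustration, not the Spark implementation): cosine distance is 1 minus the cosine of the angle between two feature vectors, and the k nearest neighbors are the k candidates at the smallest distance.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def knn(query, candidates, k, dist):
    """Indices of the k candidates nearest to query under dist."""
    order = sorted(range(len(candidates)),
                   key=lambda i: dist(query, candidates[i]))
    return order[:k]

candidates = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(knn([1.0, 0.0], candidates, k=2, dist=cosine_distance))  # [0, 2]
```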
class lal.spark.nn.KNNMahalanobisMatcher(k)
    Bases: lal.spark.nn._MahalanobisDistance, lal.spark.nn._KNNMatcherBase

    The k-nearest-neighbor algorithm with the Mahalanobis distance measure.
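The Mahalanobis distance is sqrt((x - y)^T S^-1 (x - y)), where S is the feature covariance matrix, so differences along high-variance features are down-weighted. A minimal two-dimensional sketch in plain Python (for illustration only; the covariance matrix here is made up, and the Spark matcher computes this internally):

```python
import math

def inverse_2x2(m):
    """Invert a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mahalanobis(x, y, cov_inv):
    """sqrt((x - y)^T * cov_inv * (x - y)) for 2-D points."""
    d = [x[0] - y[0], x[1] - y[1]]
    t = [cov_inv[0][0] * d[0] + cov_inv[0][1] * d[1],   # cov_inv @ d
         cov_inv[1][0] * d[0] + cov_inv[1][1] * d[1]]
    return math.sqrt(d[0] * t[0] + d[1] * t[1])

cov = [[4.0, 0.0], [0.0, 1.0]]  # feature 0 has 4x the variance of feature 1
cov_inv = inverse_2x2(cov)
# The same-sized displacement counts for less along the noisier axis.
print(mahalanobis([2.0, 0.0], [0.0, 0.0], cov_inv))  # 1.0
print(mahalanobis([0.0, 2.0], [0.0, 0.0], cov_inv))  # 2.0
```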
class lal.spark.nn.KNNPowerMatcher(p, k)
    Bases: lal.spark.nn._PowerDistance, lal.spark.nn._KNNMatcherBase

    The k-nearest-neighbor algorithm with the p-norm distance measure.
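The p-norm (Minkowski) distance generalizes the familiar cases: p=1 gives Manhattan distance and p=2 gives Euclidean distance. A plain-Python sketch for illustration (not the Spark implementation):

```python
def p_norm_distance(u, v, p):
    """Minkowski (p-norm) distance between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

u, v = [0.0, 0.0], [3.0, 4.0]
print(p_norm_distance(u, v, 1))  # 7.0  (Manhattan)
print(p_norm_distance(u, v, 2))  # 5.0  (Euclidean)
```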
lal.spark.weights module
class lal.spark.weights.GBMWeightBinaryClassifier(**kwargs)
    Bases: lal.spark.weights._LGBMWeightsBase

    Derives feature-importance weights from a binary output. It optimizes the gradient-boosting model against a classification metric, then returns the featureImportances of the best-scoring result.
class lal.spark.weights.GBMWeightMultiClassifier(**kwargs)
    Bases: lal.spark.weights._LGBMWeightsBase

    Derives feature-importance weights from a multiclass output. It optimizes the gradient-boosting model against a classification metric, then returns the featureImportances of the best-scoring result.
class lal.spark.weights.GBMWeightRegressor(**kwargs)
    Bases: lal.spark.weights._LGBMWeightsBase

    Derives feature-importance weights from a continuous output. It optimizes the gradient-boosting model against a regression metric, then returns the featureImportances of the best-scoring result.
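The pattern shared by these three weight classes (tune the model, then read the feature importances off the best-scoring configuration) can be sketched generically. Everything below is a stand-in for illustration: the configuration grid, scoring function, and importance values are made up and are not the lal or Spark internals.

```python
def best_feature_importances(configs, fit_and_score):
    """fit_and_score(cfg) -> (metric, importances); higher metric is better.
    Return the importances of the best-scoring configuration."""
    results = [fit_and_score(cfg) for cfg in configs]
    best_metric, best_importances = max(results, key=lambda r: r[0])
    return best_importances

# Toy stand-in: pretend deeper trees score better and shift importance
# toward the second feature (all numbers are made up).
def fake_fit_and_score(cfg):
    depth = cfg["max_depth"]
    metric = 0.7 + 0.05 * depth
    importances = [10 - depth, depth]
    return metric, importances

configs = [{"max_depth": d} for d in (2, 3, 4)]
print(best_feature_importances(configs, fake_fit_and_score))  # [6, 4]
```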