API Reference
Groupyr contains estimator classes that are fully compliant with the scikit-learn ecosystem. Consequently, their initialization, fit, predict, transform, and score methods will be familiar to sklearn users.
Sparse Group Lasso Estimators
These are groupyr's canonical estimators. SGL is intended for regression problems, while LogisticSGL is intended for classification problems.

class groupyr.SGL(l1_ratio=1.0, alpha=0.0, groups=None, scale_l2_by='group_length', fit_intercept=True, max_iter=1000, tol=1e-07, warm_start=False, verbose=0, suppress_solver_warnings=True, include_solver_trace=False)
An sklearn-compatible sparse group lasso regressor.
This solves the sparse group lasso [1] problem for a feature matrix partitioned into groups using the proximal gradient descent (PGD) algorithm.
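The objective itself is not written out on this page. As a rough sketch, assuming the penalty takes the form given by Simon et al. [1] (an L1 term weighted by l1_ratio plus a sum of per-group L2 norms weighted by 1 - l1_ratio, with both scaled by alpha), the regularization term could be computed as follows. The function name `sgl_penalty` is hypothetical and is not part of groupyr.

```python
import numpy as np


def sgl_penalty(coef, groups, alpha, l1_ratio, scale_l2_by="group_length"):
    """Sparse group lasso penalty, assuming the Simon et al. form."""
    coef = np.asarray(coef, dtype=float)
    # Lasso part: plain L1 norm over all coefficients.
    l1_term = np.abs(coef).sum()
    # Group lasso part: one (optionally scaled) L2 norm per group.
    l2_term = 0.0
    for idx in groups:
        weight = np.sqrt(len(idx)) if scale_l2_by == "group_length" else 1.0
        l2_term += weight * np.linalg.norm(coef[idx])
    return alpha * (l1_ratio * l1_term + (1.0 - l1_ratio) * l2_term)


groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
coef = np.array([3.0, 0.0, 4.0, 0.0, 0.0, 0.0])

# l1_ratio=1 leaves only the lasso term: |3| + |4| = 7.
lasso_only = sgl_penalty(coef, groups, alpha=1.0, l1_ratio=1.0)
# l1_ratio=0 leaves only the group term: sqrt(3) * ||(3, 0, 4)||_2.
group_only = sgl_penalty(coef, groups, alpha=1.0, l1_ratio=0.0)
# Without group-length scaling the group term is just the L2 norm, 5.
unscaled = sgl_penalty(coef, groups, alpha=1.0, l1_ratio=0.0, scale_l2_by=None)
```

The second group contributes nothing to either term because all of its coefficients are zero, which is the group-wise sparsity the penalty is meant to encourage.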
Parameters
l1_ratio : float, default=1.0
Hyperparameter: combination between group lasso and lasso. l1_ratio=0 gives the group lasso and l1_ratio=1 gives the lasso.
alpha : float, default=0.0
Hyperparameter: overall regularization strength.
groups : list of numpy.ndarray
List of arrays of non-overlapping indices for each group. For example, if nine features are grouped into equal contiguous groups of three, then groups would be [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]. If the feature matrix contains a bias or intercept feature, do not include it as a group. If None, all features will belong to one group. We set groups in __init__ so that it can be reused in model selection and CV routines.
scale_l2_by : ["group_length", None], default="group_length"
Scaling technique for the groupwise L2 penalty. By default, scale_l2_by="group_length" and the L2 penalty is scaled by the square root of the group length so that each variable has the same effect on the penalty. This may not be appropriate for one-hot encoded features, and scale_l2_by=None would be more appropriate for that case. scale_l2_by=None will also reproduce ElasticNet results when all features belong to one group.
fit_intercept : bool, default=True
Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).
max_iter : int, default=1000
Maximum number of iterations for the PGD solver.
tol : float, default=1e-7
Stopping criterion. Convergence tolerance for the copt proximal gradient solver.
warm_start : bool, default=False
If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.
verbose : int, default=0
Verbosity flag for the PGD solver. Any positive integer will produce verbose output.
suppress_solver_warnings : bool, default=True
If True, suppress convergence warnings from the PGD solver. This is useful for hyperparameter tuning when some combinations of hyperparameters may not converge.
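The groups argument described above can be built with plain NumPy. The following minimal sketch reproduces the nine-feature example and checks the non-overlap requirement; the variable names are illustrative only.

```python
import numpy as np

n_features, group_size = 9, 3

# Split the column indices 0..8 into contiguous groups of three.
groups = np.split(np.arange(n_features), n_features // group_size)

# Sanity checks: no index appears twice, and every feature is covered.
all_idx = np.concatenate(groups)
assert len(np.unique(all_idx)) == len(all_idx)
covers_all = np.array_equal(np.sort(all_idx), np.arange(n_features))
```

For groups of unequal size, the same list can be written out by hand or built with np.split and explicit split points.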
References
[1] Noah Simon, Jerome Friedman, Trevor Hastie & Robert Tibshirani, “A Sparse-Group Lasso,” Journal of Computational and Graphical Statistics, vol. 22:2, pp. 231-245, 2012. DOI: 10.1080/10618600.2012.681250
Attributes
coef_ : array of shape (n_features,)
Estimated coefficients for the linear predictor (X @ coef_ + intercept_).
intercept_ : float
Intercept (a.k.a. bias) added to linear predictor.
n_iter_ : int
Actual number of iterations used in the solver.

class groupyr.LogisticSGL(l1_ratio=1.0, alpha=0.0, groups=None, scale_l2_by='group_length', fit_intercept=True, max_iter=1000, tol=1e-07, warm_start=False, verbose=0, suppress_solver_warnings=True, include_solver_trace=False)
An sklearn-compatible sparse group lasso classifier.
This solves the sparse group lasso [1] problem for a feature matrix partitioned into groups using the proximal gradient descent (PGD) algorithm.
Parameters
l1_ratio : float, default=1.0
Hyperparameter: combination between group lasso and lasso. l1_ratio=0 gives the group lasso and l1_ratio=1 gives the lasso.
alpha : float, default=0.0
Hyperparameter: overall regularization strength.
groups : list of numpy.ndarray
List of arrays of non-overlapping indices for each group. For example, if nine features are grouped into equal contiguous groups of three, then groups would be [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]. If the feature matrix contains a bias or intercept feature, do not include it as a group. If None, all features will belong to one group. We set groups in __init__ so that it can be reused in model selection and CV routines.
scale_l2_by : ["group_length", None], default="group_length"
Scaling technique for the groupwise L2 penalty. By default, scale_l2_by="group_length" and the L2 penalty is scaled by the square root of the group length so that each variable has the same effect on the penalty. This may not be appropriate for one-hot encoded features, and scale_l2_by=None would be more appropriate for that case. scale_l2_by=None will also reproduce ElasticNet results when all features belong to one group.
fit_intercept : bool, default=True
Specifies if a constant (a.k.a. bias or intercept) should be added to the linear predictor (X @ coef + intercept).
max_iter : int, default=1000
Maximum number of iterations for the PGD solver.
tol : float, default=1e-7
Stopping criterion. Convergence tolerance for the copt proximal gradient solver.
warm_start : bool, default=False
If set to True, reuse the solution of the previous call to fit as initialization for coef_ and intercept_.
verbose : int, default=0
Verbosity flag for the PGD solver. Any positive integer will produce verbose output.
suppress_solver_warnings : bool, default=True
If True, suppress convergence warnings from the PGD solver. This is useful for hyperparameter tuning when some combinations of hyperparameters may not converge.
References
[1] Noah Simon, Jerome Friedman, Trevor Hastie & Robert Tibshirani, “A Sparse-Group Lasso,” Journal of Computational and Graphical Statistics, vol. 22:2, pp. 231-245, 2012. DOI: 10.1080/10618600.2012.681250
Attributes
classes_ : ndarray of shape (n_classes,)
A list of class labels known to the classifier.
coef_ : array of shape (n_features,)
Estimated coefficients for the linear predictor (X @ coef_ + intercept_).
intercept_ : float
Intercept (a.k.a. bias) added to linear predictor.
n_iter_ : int
Actual number of iterations used in the solver.
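To make the role of these attributes concrete, here is a small NumPy sketch of how a linear predictor could be turned into class labels. It assumes a binary problem, a logistic link, and a 0.5 probability threshold; the fitted values are invented for illustration, and the classifier's own predict method should be used in practice.

```python
import numpy as np

# Hypothetical fitted attributes, invented for illustration only.
classes_ = np.array(["control", "patient"])
coef_ = np.array([1.5, 0.0, -2.0])
intercept_ = 0.25

X = np.array([[1.0, 5.0, 0.0],
              [0.0, 5.0, 1.0]])

# Linear predictor, as written in the attribute descriptions.
linear_predictor = X @ coef_ + intercept_
# Logistic link (an assumption of this sketch) maps it to a probability.
proba = 1.0 / (1.0 + np.exp(-linear_predictor))
# Threshold at 0.5 and look up the label in classes_.
labels = classes_[(proba > 0.5).astype(int)]
```

The second feature has a zero coefficient, so its value of 5.0 has no effect on either prediction.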
Cross-validation Estimators
These estimators have built-in cross-validation capabilities to find the best values of the hyperparameters alpha and l1_ratio. These are more efficient than using the canonical estimators with grid search because they make use of warm-starting. Alternatively, you can specify tuning_strategy="bayes" to use Bayesian optimization over the hyperparameters instead of a grid search.

class groupyr.SGLCV(l1_ratio=1.0, groups=None, scale_l2_by='group_length', eps=0.001, n_alphas=100, alphas=None, fit_intercept=True, normalize=False, max_iter=1000, tol=1e-07, copy_X=True, cv=None, verbose=False, n_jobs=None, tuning_strategy='grid', n_bayes_iter=50, n_bayes_points=1, random_state=None, suppress_solver_warnings=True)
Iterative SGL model fitting along a regularization path.
See the scikit-learn glossary entry for cross-validation estimator.
Parameters
l1_ratio : float or list of float, default=1.0
Float between 0 and 1 passed to SGL (scaling between group lasso and lasso penalties). For l1_ratio = 0 the penalty is the group lasso penalty. For l1_ratio = 1 it is the lasso penalty. For 0 < l1_ratio < 1, the penalty is a combination of group lasso and lasso. This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values will depend on the problem. For problems where we expect strong overall sparsity and would like to encourage grouping, put more values close to 1 (i.e. Lasso). In contrast, if we expect strong group-wise sparsity, but only mild sparsity within groups, put more values close to 0 (i.e. group lasso).
groups : list of numpy.ndarray
List of arrays of non-overlapping indices for each group. For example, if nine features are grouped into equal contiguous groups of three, then groups would be [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]. If the feature matrix contains a bias or intercept feature, do not include it as a group. If None, all features will belong to one group. We set groups in __init__ so that it can be reused in model selection and CV routines.
scale_l2_by : ["group_length", None], default="group_length"
Scaling technique for the groupwise L2 penalty. By default, scale_l2_by="group_length" and the L2 penalty is scaled by the square root of the group length so that each variable has the same effect on the penalty. This may not be appropriate for one-hot encoded features, and scale_l2_by=None would be more appropriate for that case. scale_l2_by=None will also reproduce ElasticNet results when all features belong to one group.
eps : float, default=1e-3
Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.
n_alphas : int, default=100
Number of alphas along the regularization path, used for each l1_ratio.
alphas : ndarray, default=None
List of alphas where to compute the models. If None, alphas are set automatically.
fit_intercept : bool, default=True
Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).
normalize : bool, default=False
This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.
max_iter : int, default=1000
The maximum number of iterations.
tol : float, default=1e-7
Stopping criterion. Convergence tolerance for the copt proximal gradient solver.
cv : int, cross-validation generator or iterable, default=None
Determines the cross-validation splitting strategy. Possible inputs for cv are:
None, to use the default 5-fold cross-validation,
int, to specify the number of folds,
an sklearn CV splitter,
an iterable yielding (train, test) splits as arrays of indices.
For int/None inputs, sklearn.model_selection.KFold is used. Refer to the scikit-learn User Guide for the various cross-validation strategies that can be used here.
copy_X : bool, default=True
If True, X will be copied; else, it may be overwritten.
verbose : bool or int, default=False
Amount of verbosity.
n_jobs : int, default=None
Number of CPUs to use during the cross-validation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
tuning_strategy : ["grid", "bayes"], default="grid"
Hyperparameter tuning strategy to use. If tuning_strategy == "grid", then evaluate all parameter points on the l1_ratio and alphas grid, using warm start to evaluate different alpha values along the regularization path. If tuning_strategy == "bayes", then a fixed number of parameter settings is sampled using skopt.BayesSearchCV. The fixed number of settings is set by n_bayes_iter. The l1_ratio setting is sampled uniformly between the minimum and maximum of the input l1_ratio parameter. The alpha setting is sampled log-uniformly either between the maximum and minimum of the input alphas parameter, if provided, or from eps * max_alpha to max_alpha, where max_alpha is a conservative estimate of the maximum alpha for which the solution coefficients are non-trivial.
n_bayes_iter : int, default=50
Number of parameter settings that are sampled if using Bayes search for hyperparameter optimization. n_bayes_iter trades off runtime vs quality of the solution. Consider increasing n_bayes_points if you want to try more parameter settings in parallel.
n_bayes_points : int, default=1
Number of parameter settings to sample in parallel if using Bayes search for hyperparameter optimization. If this does not align with n_bayes_iter, the last iteration will sample fewer points.
random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
suppress_solver_warnings : bool, default=True
If True, suppress warnings from BayesSearchCV when the objective is evaluated at the same point multiple times. Setting this to False may be useful for debugging.
See also
sgl_path
SGL
Attributes
alpha_ : float
The amount of penalization chosen by cross-validation.
l1_ratio_ : float
The compromise between l1 and l2 penalization chosen by cross-validation.
coef_ : ndarray of shape (n_features,) or (n_targets, n_features)
Parameter vector (w in the cost function formula).
intercept_ : float or ndarray of shape (n_targets, n_features)
Independent term in the decision function.
scoring_path_ : ndarray of shape (n_l1_ratio, n_alpha, n_folds)
Mean square error for the test set on each fold, varying l1_ratio and alpha.
alphas_ : ndarray of shape (n_alphas,) or (n_l1_ratio, n_alphas)
The grid of alphas used for fitting, for each l1_ratio.
n_iter_ : int
Number of iterations run by the proximal gradient descent solver to reach the specified tolerance for the optimal alpha.
bayes_optimizer_ : skopt.BayesSearchCV instance or None
The BayesSearchCV instance used for hyperparameter optimization if tuning_strategy == "bayes". If tuning_strategy == "grid", then this attribute is None.
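The relationship between eps, n_alphas, and the scoring path can be sketched with NumPy alone. This example assumes a log-spaced alpha grid (the convention used by scikit-learn's regularization paths; groupyr's internals may differ), a hypothetical alpha_max, and a fabricated scoring path with one planted minimum. It then averages the per-fold mean squared errors and picks the best (l1_ratio, alpha) pair.

```python
import numpy as np

eps, n_alphas = 1e-3, 100
alpha_max = 2.0  # hypothetical value; the estimator derives its own estimate

# Log-spaced grid from alpha_max down to eps * alpha_max,
# so that alpha_min / alpha_max equals eps.
alphas = np.geomspace(alpha_max, eps * alpha_max, num=n_alphas)

# Fabricated scoring path with shape (n_l1_ratio, n_alpha, n_folds).
rng = np.random.default_rng(0)
l1_ratios = np.array([0.1, 0.5, 0.9])
scoring_path = rng.uniform(1.0, 2.0, size=(len(l1_ratios), n_alphas, 5))
scoring_path[1, 40, :] = 0.5  # plant an obvious minimum for the demo

# Average over folds, then take the setting with the lowest mean MSE.
mean_mse = scoring_path.mean(axis=-1)
i_ratio, i_alpha = np.unravel_index(np.argmin(mean_mse), mean_mse.shape)
best_l1_ratio, best_alpha = l1_ratios[i_ratio], alphas[i_alpha]
```

For a classifier whose scoring path holds accuracy rather than error, the same selection would use np.argmax instead of np.argmin.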

class groupyr.LogisticSGLCV(l1_ratio=1.0, groups=None, scale_l2_by='group_length', eps=0.001, n_alphas=100, alphas=None, fit_intercept=True, normalize=False, max_iter=1000, tol=1e-07, scoring=None, cv=None, copy_X=True, verbose=False, n_jobs=None, tuning_strategy='grid', n_bayes_iter=50, n_bayes_points=1, random_state=None, suppress_solver_warnings=True)
Iterative Logistic SGL model fitting along a regularization path.
See the scikit-learn glossary entry for cross-validation estimator.
Parameters
l1_ratio : float or list of float, default=1.0
Float between 0 and 1 passed to SGL (scaling between group lasso and lasso penalties). For l1_ratio = 0 the penalty is the group lasso penalty. For l1_ratio = 1 it is the lasso penalty. For 0 < l1_ratio < 1, the penalty is a combination of group lasso and lasso. This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values will depend on the problem. For problems where we expect strong overall sparsity and would like to encourage grouping, put more values close to 1 (i.e. Lasso). In contrast, if we expect strong group-wise sparsity, but only mild sparsity within groups, put more values close to 0 (i.e. group lasso).
groups : list of numpy.ndarray
List of arrays of non-overlapping indices for each group. For example, if nine features are grouped into equal contiguous groups of three, then groups would be [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]. If the feature matrix contains a bias or intercept feature, do not include it as a group. If None, all features will belong to one group. We set groups in __init__ so that it can be reused in model selection and CV routines.
scale_l2_by : ["group_length", None], default="group_length"
Scaling technique for the groupwise L2 penalty. By default, scale_l2_by="group_length" and the L2 penalty is scaled by the square root of the group length so that each variable has the same effect on the penalty. This may not be appropriate for one-hot encoded features, and scale_l2_by=None would be more appropriate for that case. scale_l2_by=None will also reproduce ElasticNet results when all features belong to one group.
eps : float, default=1e-3
Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.
n_alphas : int, default=100
Number of alphas along the regularization path, used for each l1_ratio.
alphas : ndarray, default=None
List of alphas where to compute the models. If None, alphas are set automatically.
fit_intercept : bool, default=True
Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).
normalize : bool, default=False
This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.
max_iter : int, default=1000
The maximum number of iterations.
tol : float, default=1e-7
Stopping criterion. Convergence tolerance for the copt proximal gradient solver.
scoring : str or callable, default=None
A string (see sklearn model evaluation documentation) or a scorer callable object / function with signature scorer(estimator, X, y). For a list of scoring functions that can be used, look at sklearn.metrics. The default scoring option used is accuracy_score.
cv : int, cross-validation generator or iterable, default=None
Determines the cross-validation splitting strategy. Possible inputs for cv are:
None, to use the default 5-fold cross-validation,
int, to specify the number of folds,
an sklearn CV splitter,
an iterable yielding (train, test) splits as arrays of indices.
For int/None inputs, sklearn.model_selection.StratifiedKFold is used. Refer to the scikit-learn User Guide for the various cross-validation strategies that can be used here.
copy_X : bool, default=True
If True, X will be copied; else, it may be overwritten.
verbose : bool or int, default=False
Amount of verbosity.
n_jobs : int, default=None
Number of CPUs to use during the cross-validation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
tuning_strategy : ["grid", "bayes"], default="grid"
Hyperparameter tuning strategy to use. If tuning_strategy == "grid", then evaluate all parameter points on the l1_ratio and alphas grid, using warm start to evaluate different alpha values along the regularization path. If tuning_strategy == "bayes", then a fixed number of parameter settings is sampled using skopt.BayesSearchCV. The fixed number of settings is set by n_bayes_iter. The l1_ratio setting is sampled uniformly between the minimum and maximum of the input l1_ratio parameter. The alpha setting is sampled log-uniformly either between the maximum and minimum of the input alphas parameter, if provided, or from eps * max_alpha to max_alpha, where max_alpha is a conservative estimate of the maximum alpha for which the solution coefficients are non-trivial.
n_bayes_iter : int, default=50
Number of parameter settings that are sampled if using Bayes search for hyperparameter optimization. n_bayes_iter trades off runtime vs quality of the solution. Consider increasing n_bayes_points if you want to try more parameter settings in parallel.
n_bayes_points : int, default=1
Number of parameter settings to sample in parallel if using Bayes search for hyperparameter optimization. If this does not align with n_bayes_iter, the last iteration will sample fewer points.
random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
suppress_solver_warnings : bool, default=True
If True, suppress warnings from BayesSearchCV when the objective is evaluated at the same point multiple times. Setting this to False may be useful for debugging.
See also
logistic_sgl_path
LogisticSGL
Attributes
alpha_ : float
The amount of penalization chosen by cross-validation.
l1_ratio_ : float
The compromise between l1 and l2 penalization chosen by cross-validation.
classes_ : ndarray of shape (n_classes,)
A list of class labels known to the classifier.
coef_ : array of shape (n_features,)
Estimated coefficients for the linear predictor (X @ coef_ + intercept_).
intercept_ : float
Intercept (a.k.a. bias) added to linear predictor.
scoring_path_ : ndarray of shape (n_l1_ratio, n_alpha, n_folds)
Classification score for the test set on each fold, varying l1_ratio and alpha.
alphas_ : ndarray of shape (n_alphas,) or (n_l1_ratio, n_alphas)
The grid of alphas used for fitting, for each l1_ratio.
n_iter_ : int
Number of iterations run by the proximal gradient descent solver to reach the specified tolerance for the optimal alpha.
bayes_optimizer_ : skopt.BayesSearchCV instance or None
The BayesSearchCV instance used for hyperparameter optimization if tuning_strategy == "bayes". If tuning_strategy == "grid", then this attribute is None.
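The search ranges described for tuning_strategy == "bayes" can be illustrated without skopt. The sketch below only draws independent samples from the stated distributions (uniform for l1_ratio, log-uniform for alpha); it does not perform Bayesian optimization, which chooses later points based on earlier scores. The value of max_alpha is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(42)
l1_ratio_input = [0.2, 0.5, 0.8]   # only the min and max define the range
eps, max_alpha = 1e-3, 1.0         # max_alpha is a hypothetical placeholder
n_draws = 1000

# l1_ratio: uniform between the smallest and largest input value.
l1_draws = rng.uniform(min(l1_ratio_input), max(l1_ratio_input), size=n_draws)

# alpha: log-uniform, i.e. uniform in log space and then exponentiated.
log_lo, log_hi = np.log(eps * max_alpha), np.log(max_alpha)
alpha_draws = np.exp(rng.uniform(log_lo, log_hi, size=n_draws))
```

Sampling in log space spreads the draws evenly across orders of magnitude, so small alpha values are explored as thoroughly as large ones.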
Dataset Generation
Use these functions to generate synthetic sparse grouped data.

groupyr.datasets.make_group_classification(n_samples=100, n_groups=20, n_informative_groups=2, n_features_per_group=20, n_informative_per_group=2, n_redundant_per_group=2, n_repeated_per_group=0, n_classes=2, n_clusters_per_class=2, weights=None, flip_y=0.01, class_sep=1.0, hypercube=True, shift=0.0, scale=1.0, shuffle=True, useful_indices=False, random_state=None)
Generate a random n-class sparse group classification problem.
This initially creates clusters of points normally distributed (std=1) about vertices of an n_informative-dimensional hypercube with sides of length 2*class_sep and assigns an equal number of clusters to each class. It introduces interdependence between these features and adds various types of further noise to the data.
Prior to shuffling, X stacks a number of these primary “informative” features, “redundant” linear combinations of these, “repeated” duplicates of sampled features, and arbitrary noise for the remaining features. This method uses sklearn.datasets.make_classification to construct a giant unshuffled classification problem of size n_groups * n_features_per_group and then distributes the returned features to each group. It then optionally shuffles each group.
Parameters
n_samples : int, optional (default=100)
The number of samples.
n_groups : int, optional (default=20)
The number of feature groups.
n_informative_groups : int, optional (default=2)
The total number of informative groups. All other groups will be just noise.
n_features_per_group : int, optional (default=20)
The total number of features per group. These comprise n_informative informative features, n_redundant redundant features, n_repeated duplicated features and n_features-n_informative-n_redundant-n_repeated useless features drawn at random.
n_informative_per_group : int, optional (default=2)
The number of informative features per group. Each class is composed of a number of Gaussian clusters each located around the vertices of a hypercube in a subspace of dimension n_informative_per_group. For each cluster, informative features are drawn independently from N(0, 1) and then randomly linearly combined within each cluster in order to add covariance. The clusters are then placed on the vertices of the hypercube.
n_redundant_per_group : int, optional (default=2)
The number of redundant features per group. These features are generated as random linear combinations of the informative features.
n_repeated_per_group : int, optional (default=0)
The number of duplicated features per group, drawn randomly from the informative and the redundant features.
n_classes : int, optional (default=2)
The number of classes (or labels) of the classification problem.
n_clusters_per_class : int, optional (default=2)
The number of clusters per class.
weights : list of floats or None (default=None)
The proportions of samples assigned to each class. If None, then classes are balanced. Note that if len(weights) == n_classes - 1, then the last class weight is automatically inferred. More than n_samples samples may be returned if the sum of weights exceeds 1.
flip_y : float, optional (default=0.01)
The fraction of samples whose class is randomly exchanged. Larger values introduce noise in the labels and make the classification task harder.
class_sep : float, optional (default=1.0)
The factor multiplying the hypercube size. Larger values spread out the clusters/classes and make the classification task easier.
hypercube : boolean, optional (default=True)
If True, the clusters are put on the vertices of a hypercube. If False, the clusters are put on the vertices of a random polytope.
shift : float, array of shape [n_features] or None, optional (default=0.0)
Shift features by the specified value. If None, then features are shifted by a random value drawn in [-class_sep, class_sep].
scale : float, array of shape [n_features] or None, optional (default=1.0)
Multiply features by the specified value. If None, then features are scaled by a random value drawn in [1, 100]. Note that scaling happens after shifting.
shuffle : boolean, optional (default=True)
Shuffle the samples and the features.
useful_indices : boolean, optional (default=False)
If True, a boolean array indicating useful features is returned.
random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
Returns
X : array of shape [n_samples, n_features]
The generated samples.
y : array of shape [n_samples]
The integer labels for class membership of each sample.
groups : list of arrays
Each element is an array of feature indices that belong to that group.
indices : array of shape [n_features]
A boolean array indicating which features are useful. Returned only if useful_indices is True.
See also
sklearn.datasets.make_classification
non-group-sparse version
sklearn.datasets.make_blobs
simplified variant
sklearn.datasets.make_multilabel_classification
unrelated generator for multilabel tasks
Notes
The algorithm is adapted from Guyon [1] and was designed to generate the “Madelon” dataset.
References
[1] I. Guyon, “Design of experiments for the NIPS 2003 variable selection benchmark”, 2003.
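As a shape-level illustration of the return values, the sketch below uses a random matrix as a stand-in for the generated samples and assumes the groups are laid out contiguously, which may not hold once features are shuffled. It does not call groupyr and does not reproduce the cluster structure described above.

```python
import numpy as np

n_samples, n_groups, n_features_per_group = 100, 20, 20
n_features = n_groups * n_features_per_group  # 400 columns in total

rng = np.random.default_rng(0)
# Stand-in for the generated samples; real data would have class structure.
X = rng.normal(size=(n_samples, n_features))

# One index array per group, assuming a contiguous layout.
groups = np.split(np.arange(n_features), n_groups)

# Select the columns of X that belong to the third group.
X_group = X[:, groups[2]]
```

Indexing X with a single entry of groups is the usual way to inspect or transform one feature group at a time.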

groupyr.datasets.make_group_regression(n_samples=100, n_groups=20, n_informative_groups=5, n_features_per_group=20, n_informative_per_group=5, effective_rank=None, noise=0.0, shift=0.0, scale=1.0, shuffle=False, coef=False, random_state=None)
Generate a sparse group regression problem.
Prior to shuffling, X stacks a number of primary “informative” features and arbitrary noise for the remaining features. This method uses sklearn.datasets.make_regression to construct a giant unshuffled regression problem of size n_groups * n_features_per_group and then distributes the returned features to each group. It then optionally shuffles each group.
Parameters
n_samples : int, optional (default=100)
The number of samples.
n_groups : int, optional (default=20)
The number of feature groups.
n_informative_groups : int, optional (default=5)
The total number of informative groups. All other groups will be just noise.
n_features_per_group : int, optional (default=20)
The total number of features per group. These comprise n_informative informative features and n_features-n_informative useless features drawn at random.
n_informative_per_group : int, optional (default=5)
The number of informative features per group that have a non-zero regression coefficient.
effective_rank : int or None, optional (default=None)
If not None, provides the number of singular vectors to explain the input data.
noise : float, optional (default=0.0)
The standard deviation of the Gaussian noise applied to the output.
shuffle : boolean, optional (default=False)
Shuffle the samples and the features.
coef : boolean, optional (default=False)
If True, returns coefficient values used to generate samples via sklearn.datasets.make_regression.
random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
Returns
X : array of shape [n_samples, n_features]
The generated samples.
y : array of shape [n_samples]
The output values.
groups : list of arrays
Each element is an array of feature indices that belong to that group.
coef : array of shape [n_features]
A numpy array containing true regression coefficient values. Returned only if coef is True.
See also
sklearn.datasets.make_regression
non-group-sparse version
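The idea of a sparse group regression problem can be sketched with NumPy alone. The following is a simplified stand-in, not groupyr's implementation (which delegates to sklearn.datasets.make_regression): it assumes the informative groups come first, that the informative features sit at the start of each group, and that coefficients are drawn uniformly from [1, 100).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_groups, n_features_per_group = 100, 20, 20
n_informative_groups, n_informative_per_group = 5, 5
noise = 0.0

n_features = n_groups * n_features_per_group
groups = np.split(np.arange(n_features), n_groups)

# Non-zero coefficients only for a few features in a few groups
# (placement is an assumption of this sketch).
coef = np.zeros(n_features)
for idx in groups[:n_informative_groups]:
    coef[idx[:n_informative_per_group]] = rng.uniform(
        1.0, 100.0, size=n_informative_per_group
    )

X = rng.normal(size=(n_samples, n_features))
y = X @ coef + noise * rng.normal(size=n_samples)

# Groups that contain at least one non-zero coefficient.
active_groups = [g for g, idx in enumerate(groups) if np.any(coef[idx] != 0)]
```

Only 25 of the 400 coefficients are non-zero, and they are confined to 5 of the 20 groups, which is the structure a sparse group lasso estimator is designed to recover.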