Functions for experiments and utilities

Utilities

Preparing the datasets

pyridge.util.preprocess.prepare_data(folder, dataset, sep='\\s+', scaler='standard')[source]

Read the data from the files and scale them. The target is assumed to be in the last column.

Parameters
  • folder (str) – name of the folder where the dataset is.

  • dataset (str) – name of the dataset to load.

  • sep (str) – column separator, as a string or regular expression. The default matches any whitespace.

  • scaler (str) – name of the scaling method to apply. Default is 'standard'.

Returns
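
As a sketch of the expected behaviour (assuming the function returns the scaled features and the target column; the exact return values are not documented here), standard scaling with the target in the last column looks like:

```python
import io
import numpy as np

# Hypothetical sketch of prepare_data: parse a whitespace-separated
# file, split off the last column as the target, and standard-scale
# the remaining features (zero mean, unit variance per feature).
raw = io.StringIO("5.1 3.5 0\n4.9 3.0 0\n6.2 3.4 1\n")
table = np.loadtxt(raw)  # whitespace separator, like sep='\s+'
data, target = table[:, :-1], table[:, -1]
data = (data - data.mean(axis=0)) / data.std(axis=0)
```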

Activation and kernel functions

pyridge.util.activation.sigmoid(x)[source]

Sigmoid function. It can be replaced with scipy.special.expit.

Parameters

x

Returns

pyridge.util.activation.sigmoid_der(x)[source]

Derivative of the sigmoid function.

Parameters

x

Returns
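
A minimal sketch of both activation helpers, assuming the derivative is evaluated from the input x via s'(x) = s(x)(1 - s(x)):

```python
import numpy as np

def sigmoid(x):
    # Equivalent to scipy.special.expit.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_der(x):
    # Derivative of the sigmoid: s * (1 - s) with s = sigmoid(x).
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0), sigmoid_der(0.0))  # -> 0.5 0.25
```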

pyridge.util.activation.linear_kernel(gamma: float = 1.0, X=None, Y=None)[source]

Linear kernel function, used to obtain the omega matrix.

Parameters
  • gamma

  • X

  • Y

Returns
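
A sketch of a linear kernel under common conventions (whether pyridge scales the linear kernel by gamma is an assumption, not confirmed by this documentation):

```python
import numpy as np

# Hypothetical sketch: the omega (Gram) matrix of inner products,
# here scaled by gamma; Y defaults to X for a symmetric kernel matrix.
def linear_kernel(gamma=1.0, X=None, Y=None):
    Y = X if Y is None else Y
    return gamma * (X @ Y.T)

X = np.array([[1.0, 0.0], [0.0, 2.0]])
omega = linear_kernel(1.0, X)  # omega[i, j] = <x_i, x_j>
```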

pyridge.util.activation.rbf_kernel(gamma: float = 1.0, X=None, Y=None)[source]

Radial basis function (RBF) kernel, used to obtain the omega matrix.

Parameters
  • gamma (float) –

  • X (np.matrix) –

  • Y

Returns
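
A sketch of the RBF omega matrix, assuming the common convention omega[i, j] = exp(-gamma * ||x_i - y_j||^2) (the exact placement of gamma in pyridge is an assumption):

```python
import numpy as np

# Hypothetical sketch of the RBF kernel matrix; Y defaults to X.
def rbf_kernel(gamma=1.0, X=None, Y=None):
    Y = X if Y is None else Y
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

X = np.array([[0.0, 0.0], [1.0, 0.0]])
omega = rbf_kernel(1.0, X)  # diagonal is 1, off-diagonal is exp(-1)
```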

pyridge.util.activation.u_dot_norm(u)[source]

Return the vector u rescaled to norm 1.

Parameters

u

Returns
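
A minimal sketch, assuming the Euclidean norm:

```python
import numpy as np

# Rescale a vector to unit Euclidean norm.
def u_dot_norm(u):
    return u / np.linalg.norm(u)

v = u_dot_norm(np.array([3.0, 4.0]))
print(v)  # -> [0.6 0.8]
```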

Cross validation

pyridge.util.cross.cross_validation(predictor, train_data, train_target, hyperparameter, metric='accuracy', n_folds=5)[source]

Cross-validation training in order to find the best hyperparameters.

Parameters
  • predictor

  • train_data

  • train_target

  • hyperparameter (dict) –

  • metric (str) –

  • n_folds (int) –

Returns
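
The hyperparameter argument is a dict mapping each hyperparameter name to a list of candidate values. A generic sketch of the k-fold grid search (a toy scoring function stands in for training the predictor and evaluating the metric; names here are illustrative, not pyridge's):

```python
import itertools
import numpy as np

# Generic k-fold grid-search sketch; score_fn stands in for training
# the predictor and scoring it with the chosen metric.
def grid_search_cv(score_fn, data, target, hyperparameter, n_folds=5):
    names = list(hyperparameter)
    folds = np.array_split(np.arange(len(data)), n_folds)
    best_score, best_params = -np.inf, None
    # Try every combination of candidate values.
    for values in itertools.product(*(hyperparameter[n] for n in names)):
        params = dict(zip(names, values))
        scores = []
        for i in range(n_folds):
            train = np.concatenate([f for j, f in enumerate(folds) if j != i])
            scores.append(score_fn(params, data[train], target[train],
                                   data[folds[i]], target[folds[i]]))
        if np.mean(scores) > best_score:
            best_score, best_params = np.mean(scores), params
    return best_params

# Toy score: prefers C == 1 regardless of the data.
toy = lambda p, *splits: -(p['C'] - 1.0) ** 2
best = grid_search_cv(toy, np.zeros((10, 1)), np.zeros(10),
                      {'C': [0.1, 1.0, 10.0]})
print(best)  # -> {'C': 1.0}
```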

pyridge.util.cross.train_predictor(predictor, train_data, train_target, hyperparameter, metric='accuracy', n_folds=5)[source]

Train the predictor, using cross-validation to find the best hyperparameters.

Parameters
  • predictor

  • train_data

  • train_target

  • hyperparameter (dict) –

  • metric (str) –

  • n_folds (int) –

Returns

Metrics

pyridge.util.metric.accuracy(clf, pred_data, real_targ)[source]

Percentage of predicted targets that actually coincide with real targets.

Parameters
  • clf – classifier with a predict method.

  • pred_data – data from which the classifier predicts the targets.

  • real_targ – array of the real targets.

Returns
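
A sketch of the metric, assuming it returns the fraction of matching predictions (the toy classifier below is illustrative only):

```python
import numpy as np

# Fraction of predictions from clf.predict that match the real targets.
def accuracy(clf, pred_data, real_targ):
    return np.mean(clf.predict(pred_data) == real_targ)

class ConstantClf:  # toy classifier that always predicts 0
    def predict(self, data):
        return np.zeros(len(data))

acc = accuracy(ConstantClf(), np.empty((4, 2)),
               np.array([0.0, 0.0, 1.0, 0.0]))
print(acc)  # -> 0.75
```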

pyridge.util.metric.rmse(clf, pred_data, real_targ)[source]

Root mean squared error between the predicted targets and the real targets.

Parameters
  • clf

  • pred_data

  • real_targ

Returns
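
A sketch of the standard RMSE computation over the predictions of clf.predict (the toy regressor is illustrative only):

```python
import numpy as np

# Root mean squared error of clf.predict against the real targets.
def rmse(clf, pred_data, real_targ):
    error = clf.predict(pred_data) - real_targ
    return np.sqrt(np.mean(error ** 2))

class ZeroReg:  # toy regressor that always predicts 0
    def predict(self, data):
        return np.zeros(len(data))

err = rmse(ZeroReg(), np.empty((4, 1)), np.array([0.0, 0.0, 2.0, 0.0]))
print(err)  # -> 1.0
```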

pyridge.util.metric.diversity(clf, pred_data=None, real_targ=None)[source]

Diversity measure, ported directly from MATLAB code; the implementation is not pythonic.

Parameters
  • clf – Predictor.

  • pred_data – Not used.

  • real_targ – Not used.

Returns

Experiments

In order to perform several experiments and test the predictors, a generic test function is used with different algorithms and cross-validation hyperparameters.

pyridge.experiment.check.check_fold(folder_dataset='data/iris', train_dataset='train_iris.0', test_dataset='test_iris.0', sep='\\s+', algorithm='ELM', metric_list=['accuracy', 'rmse'], metric_cross='accuracy', hyperparameter=None, classification=True, autoencoder=False)[source]

Generic test.

Parameters
  • folder_dataset (str) –

  • train_dataset (str) –

  • test_dataset (str) –

  • sep (str) –

  • algorithm (str) –

  • metric_list (list) –

  • hyperparameter (dict) –

  • metric_cross (str) –

  • classification (bool) – True for a classification task; False for a regression task.

  • autoencoder (bool) – True for an autoencoder test; False for a classic supervised test.

Returns

a dictionary with the metrics.

pyridge.experiment.check.check_algorithm(folder, dataset, algorithm, metric_list, hyperparameter, metric_cross=None, classification=True, autoencoder=False)[source]

Test a predictor easily across all the folds.

Parameters
  • folder (str) –

  • dataset (str) –

  • algorithm (str) –

  • metric_list (list) –

  • metric_cross (str) –

  • hyperparameter (dict) –

  • classification (bool) – True for a classification task; False for a regression task.

  • autoencoder (bool) – True for an autoencoder test; False for a classic supervised test.

Returns

a dictionary with the metrics.
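
A generic sketch of the fold loop, with a stub standing in for check_fold (the real function trains and scores the chosen algorithm on one train/test pair) and the per-fold metric dictionaries averaged into a summary:

```python
import numpy as np

def check_fold_stub(fold):
    # Stand-in for check_fold: returns the per-fold metrics dictionary.
    return {'accuracy': [0.90, 0.92, 0.94][fold],
            'rmse': [0.30, 0.25, 0.20][fold]}

# Run every fold, then average each metric across folds.
results = [check_fold_stub(fold) for fold in range(3)]
summary = {metric: float(np.mean([r[metric] for r in results]))
           for metric in results[0]}
```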