API

class stella.FlareDataSet(fn_dir=None, catalog=None, downloadSet=None, cadences=200, frac_balance=0.73, training=0.8, validation=0.9)

Given a directory of files, reformat data to create a training set for the convolutional neural network. Files must be in ‘.npy’ file format and contain at minimum the following indices:

  • 0th index = array of time

  • 1st index = array of flux

  • 2nd index = array of flux errors

All other indices in the files are ignored. This class additionally requires a catalog of flare start times for labeling. The flare catalog can be in either ‘.txt’ or ‘.csv’ file format. An instance of this class is passed to the stella.ConvNN class to create and train the neural network.
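As a rough illustration of the file layout described above, the following sketch writes one synthetic light curve and a matching one-row catalog. The file names, the synthetic values, and the choice of ‘TIC’ and ‘tpeak’ as column headers (borrowed from the load_files defaults) are assumptions made for this example only.

```python
import csv
import os
import tempfile

import numpy as np

# Synthetic light curve: 1000 cadences at 2-minute spacing (times in days).
time = 1500.0 + np.arange(1000) * (2.0 / 60.0 / 24.0)
rng = np.random.default_rng(0)
flux = 1.0 + rng.normal(0.0, 1e-3, size=time.size)
flux_err = np.full(time.size, 1e-3)

workdir = tempfile.mkdtemp()

# Index 0 = time, index 1 = flux, index 2 = flux errors.
lc_path = os.path.join(workdir, "123456789_sector09.npy")
np.save(lc_path, np.array([time, flux, flux_err]))

# Hypothetical one-flare catalog. Adding 2457000.0 assumes the catalog
# stores full BJD while the light curve stores BJD - 2457000.
catalog_path = os.path.join(workdir, "catalog.csv")
with open(catalog_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["TIC", "tpeak"])
    writer.writerow([123456789, 2457000.0 + time[500]])

# Read the light curve back to confirm the layout.
loaded = np.load(lc_path)
```

Any additional rows stored after the first three would be ignored, per the description above.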

load_files(id_keyword='TIC', ft_keyword='tpeak', time_offset=2457000.0)

Loads the light curves from the assigned training set directory. File names must begin with the ID of the star, followed by ‘_’ (e.g. 123456789_sector09.npy).
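The naming rule can be illustrated with a small helper that recovers the ID from a file name. The function name star_id_from_filename is hypothetical, and treating the ID as an integer is an assumption; stella may parse file names differently.

```python
import os


def star_id_from_filename(path):
    """Hypothetical helper: return the ID that precedes the first '_'."""
    name = os.path.basename(path)
    return int(name.split("_")[0])


example_id = star_id_from_filename("/data/training/123456789_sector09.npy")
```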

times

An n-dimensional array of times, where n is the number of training set files.

Type

np.ndarray

fluxes

An n-dimensional array of fluxes, where n is the number of training set files.

Type

np.ndarray

flux_errs

An n-dimensional array of flux errors, where n is the number of training set files.

Type

np.ndarray

ids

An array of light curve IDs for each time/flux/flux_err. This is essential for labeling flare events.

Type

np.array

id_keyword

The column header in the catalog that identifies the target ID. Default is ‘TIC’.

Type

str, optional

ft_keyword

The column header in catalog to identify flare peak time. Default is ‘tpeak’.

Type

str, optional

time_offset

Time correction applied to convert flare catalog times to light curve times. It is necessary when using Max Guenther’s catalog. Default is 2457000.0.

Type

float, optional
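A minimal sketch of what such an offset does, assuming the catalog lists peak times in full BJD while the light curves use BJD - 2457000 (the convention commonly used for TESS data). The direction of the correction and the sample values are assumptions, not taken from the stella source.

```python
# Hypothetical catalog peak times in full BJD.
catalog_tpeak = [2458545.25, 2458550.5]

time_offset = 2457000.0

# Shift the catalog times onto the light curve time system.
tpeak_lightcurve = [t - time_offset for t in catalog_tpeak]
```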

reformat_data(random_seed=321)

Reformats the data into cadences-sized arrays and assigns each a label based on the flare times defined in the catalog.

Parameters

random_seed (int, optional) – A random seed to set for randomizing the order of the training_matrix after it is constructed. Default is 321.

training_matrix

An n x cadences-sized array used as the training data.

Type

np.ndarray

labels

An n-sized array of labels for each row in the training data.

Type

np.array
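The general idea of building a labeled n x cadences matrix can be sketched as follows. This is a simplified illustration, not stella's algorithm: it cuts non-overlapping windows, labels a window 1 if any flare time falls inside it, and ignores frac_balance entirely. The helper names make_windows and shuffle_together are hypothetical.

```python
import numpy as np


def make_windows(time, flux, flare_times, cadences=200):
    """Cut flux into non-overlapping windows and label each one.

    A window is labeled 1 if any flare time lies within its time range.
    """
    n_windows = len(flux) // cadences
    matrix = np.zeros((n_windows, cadences))
    labels = np.zeros(n_windows, dtype=int)
    for i in range(n_windows):
        start, stop = i * cadences, (i + 1) * cadences
        matrix[i] = flux[start:stop]
        t_lo, t_hi = time[start], time[stop - 1]
        if np.any((flare_times >= t_lo) & (flare_times <= t_hi)):
            labels[i] = 1
    return matrix, labels


def shuffle_together(matrix, labels, random_seed=321):
    """Apply the same seeded random ordering to rows and labels."""
    order = np.random.default_rng(random_seed).permutation(len(labels))
    return matrix[order], labels[order]


time = np.arange(1000) * 0.1
flux = np.ones(1000)
flare_times = np.array([25.0])  # lies in the second window

matrix, labels = make_windows(time, flux, flare_times)
shuffled_matrix, shuffled_labels = shuffle_together(matrix, labels)
```

Fixing the seed makes the shuffled order reproducible between runs, which is the usual reason for exposing a random_seed argument.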

class stella.ConvNN(output_dir, ds=None, layers=None, optimizer='adam', loss='binary_crossentropy', metrics=None)

Creates and trains the convolutional neural network.

calibration(df, metric_threshold)

Transforms the rankings output by the CNN into actual probabilities. This can only be run for an ensemble of models.

Parameters
  • df (astropy.table.Table) – Table of output predictions from the validation set.

  • metric_threshold (float) – Defines the ranking above which something is considered a flare.

create_model(seed)

Creates the TensorFlow Keras model with appropriate layers.

model

Type

tensorflow.python.keras.engine.sequential.Sequential

cross_validation(seed=2, epochs=350, batch_size=64, n_splits=5, shuffle=False, pred_test=False, save=False)

Performs cross validation for a given number of K-folds. Reassigns the training and validation sets for each fold.

Parameters
  • seed (int, optional) – Sets random seed for creating CNN model. Default is 2.

  • epochs (int, optional) – Number of epochs to run each folded model on. Default is 350.

  • batch_size (int, optional) – The batch size for training. Default is 64.

  • n_splits (int, optional) – Number of folds to perform. Default is 5.

  • shuffle (bool, optional) – Allows for shuffling in sklearn.model_selection.KFold. Default is False.

  • pred_test (bool, optional) – Allows for predicting on the test set. DO NOT SET TO TRUE UNTIL YOU ARE HAPPY WITH YOUR FINAL MODEL. Default is False.

  • save (bool, optional) – Allows the user to save the kfolds table of predictions. Default is False.

crossval_predval

Table of predictions on the validation set from each fold.

Type

astropy.table.Table

crossval_predtest

Table of predictions on the test set from each fold. ONLY EXISTS IF PRED_TEST IS TRUE.

Type

astropy.table.Table

crossval_histories

Table of history values from the model run on each fold.

Type

astropy.table.Table
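To show how K-fold splitting reassigns the training and validation sets on each fold, here is a NumPy-only sketch. It stands in for a library K-fold splitter purely for illustration; the function name kfold_indices is hypothetical and the exact fold boundaries used by stella may differ.

```python
import numpy as np


def kfold_indices(n_samples, n_splits=5, shuffle=False, seed=2):
    """Yield (train_idx, val_idx) pairs, one per fold."""
    indices = np.arange(n_samples)
    if shuffle:
        np.random.default_rng(seed).shuffle(indices)
    folds = np.array_split(indices, n_splits)
    for k in range(n_splits):
        val_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_splits) if j != k]
        )
        yield train_idx, val_idx


splits = list(kfold_indices(10, n_splits=5))
first_train, first_val = splits[0]
```

Every sample lands in the validation set exactly once across the folds, so each fold's model is evaluated on data it did not train on.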

load_model(modelname, mode='validation')

Loads an already created model.

Parameters
  • modelname (str) –

  • mode (str, optional) –

predict(modelname, times, fluxes, errs, multi_models=False, injected=False)

Takes in arrays of time and flux and predicts where the flares are based on the keras model created and trained.

Parameters
  • modelname (str) – Path and filename of a model to load.

  • times (np.ndarray) – Array of times to predict flares in.

  • fluxes (np.ndarray) – Array of fluxes to predict flares in.

  • errs (np.ndarray) – Array of flux errors for predicted flares.

  • injected (bool, optional) – Returns predictions instead of setting attribute. Used for injection-recovery. Default is False.

model

The model input with modelname.

Type

tensorflow.python.keras.engine.sequential.Sequential

predict_time

The input times array.

Type

np.ndarray

predict_flux

The input fluxes array.

Type

np.ndarray

predict_err

The input flux errors array.

Type

np.ndarray

predictions

An array of predictions from the model.

Type

np.ndarray

train_models(seeds=[2], epochs=350, batch_size=64, shuffle=False, pred_test=False, save=False)

Trains n models, one for each of the n initial random seeds given. Also saves each model run to a hidden ~/.stella directory.

Parameters
  • seeds (np.array) – Array of random seed starters of length n, where n is the number of models you want to run.

  • epochs (int, optional) – Number of epochs to train for. Default is 350.

  • batch_size (int, optional) – Setting the batch size for the training. Default is 64.

  • shuffle (bool, optional) – Allows for shuffling of the training set when fitting the model. Default is False.

  • pred_test (bool, optional) – Allows for predictions on the test set. DO NOT SET TO TRUE UNTIL YOU’VE DECIDED ON YOUR FINAL MODEL. Default is False.

  • save (bool, optional) – Saves the predictions and histories from each model in an ascii table to the specified output directory. Default is False.

history_table

Saves the metric values for each model run.

Type

astropy.table.Table

val_pred_table

Predictions on the validation set from each run.

Type

astropy.table.Table

test_pred_table

Predictions on the test set from each run. Must set pred_test = True, or else it is an empty table.

Type

astropy.table.Table

class stella.FitFlares(id, time, flux, flux_err, predictions)

Uses the predictions from the neural network and identifies flaring events based on consecutive points. Users define a given probability threshold for accepting a flare event as real.

get_init_guesses(groupings, time, flux, err, prob, maskregion, region)

Guesses at the initial t0 and amplitude based on probability groups.

Parameters
  • groupings (np.ndarray) – Group of indices for a single flare event.

  • time (np.array) –

  • flux (np.array) –

  • err (np.array) –

  • prob (np.array) –

Returns

  • tpeaks (np.ndarray) – Array of tpeaks for each flare group.

  • amps (np.ndarray) – Array of amplitudes at each tpeak.
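One plausible way to form such initial guesses is sketched below: for each group, take the time of the brightest point as the peak time and its height above the median flux as the amplitude. Both choices are assumptions made for illustration, and the function name initial_guesses is hypothetical; the stella method also takes err, prob, maskregion and region, which this sketch does not use.

```python
import numpy as np


def initial_guesses(groupings, time, flux):
    """Return one (tpeak, amplitude) guess per group of indices."""
    baseline = np.median(flux)  # assumed quiescent level
    tpeaks, amps = [], []
    for inds in groupings:
        inds = np.asarray(inds)
        peak = inds[np.argmax(flux[inds])]  # brightest point in the group
        tpeaks.append(time[peak])
        amps.append(flux[peak] - baseline)
    return np.array(tpeaks), np.array(amps)


time = np.arange(12) * 0.5
flux = np.array([1.0, 1.0, 1.0, 1.5, 1.2, 1.0, 1.0, 1.0, 1.0, 1.3, 1.0, 1.0])
groupings = [np.array([2, 3, 4]), np.array([9, 10])]

tpeaks, amps = initial_guesses(groupings, time, flux)
```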

group_inds(values)

Groups regions marked as flares (> prob_threshold) for flare fitting. Indices within 4 of each other are grouped as one flare.

Returns

results – An array of arrays, each of which is a group of indices presumed to belong to a single flare.

Return type

np.ndarray
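The grouping rule can be sketched with NumPy as below. Reading "within 4 of each other" as "a gap of at most 4 between neighbouring indices" is an interpretation, and the name group_indices is hypothetical.

```python
import numpy as np


def group_indices(values, max_gap=4):
    """Split sorted indices into groups wherever the gap exceeds max_gap."""
    values = np.sort(np.asarray(values))
    if values.size == 0:
        return []
    # Positions where a new group starts.
    breaks = np.where(np.diff(values) > max_gap)[0] + 1
    return np.split(values, breaks)


groups = group_indices([10, 11, 12, 15, 30, 31, 40])
group_lists = [g.tolist() for g in groups]
```

Here the indices 10 to 15 form one group, 30 and 31 a second, and 40 a third.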

identify_flare_peaks(threshold=0.5)

Finds where the predicted value is above the threshold as a flare candidate. Groups consecutive indices as one flaring event.

Parameters

threshold (float, optional) – The probability threshold for believing an event is a flare. Default is 0.5.

threshold

Type

float

flare_table

A table of flare times, amplitudes, and equivalent durations. Equivalent duration given in units of days.

Type

astropy.table.Table

class stella.MeasureProt(IDs, time, flux, flux_err)

Used for measuring rotation periods.

assign_flag(period, power, width, avg, secpow, maxperiod, orbit_flag=0)

Assigns a flag in the table for which periods are reliable.

averaged_per_sector(tab)
Looks at targets observed in different sectors and determines which measured period is likely the best period. Adds a ‘true_period_days’ column to MeasureProt.LS_results with the results.

Returns

Return type

astropy.table.Table

chiSquare(var, mu, x, y, yerr)
Calculates chi-square for fitting a Gaussian to the peak of the LS periodogram.

Parameters
  • var (list) – Variables to fit (std and scale for Gaussian curve).

  • mu (float) – Mean to fit around.

  • x (np.array) –

  • y (np.array) –

  • yerr (np.array) –

Returns

Return type

chi-square value.
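A standard chi-square for a Gaussian model looks like the sketch below. The Gaussian form (no 1/(std*sqrt(2*pi)) normalization) and the ordering of var as [std, scale] are assumptions based on the parameter descriptions above; the names gaussian and chi_square are illustrative, not stella's.

```python
import numpy as np


def gaussian(x, std, scale, mu):
    """Assumed Gaussian form: peak height `scale` at x = mu."""
    return scale * np.exp(-0.5 * ((x - mu) / std) ** 2)


def chi_square(var, mu, x, y, yerr):
    """Sum of squared, error-weighted residuals for the Gaussian model."""
    std, scale = var
    model = gaussian(x, std, scale, mu)
    return np.sum(((y - model) / yerr) ** 2)


x = np.linspace(-1.0, 1.0, 21)
y = gaussian(x, 0.3, 2.0, 0.0)
yerr = np.full(x.size, 0.1)

perfect = chi_square([0.3, 2.0], 0.0, x, y, yerr)
shifted = chi_square([0.3, 2.0], 0.0, x, y + 0.1, yerr)
```

When the data match the model exactly the statistic is zero; shifting every point by one error bar adds one unit per data point.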

fit_LS_peak(period, power, arg)

Fits the LS periodogram at the peak power.

Parameters
  • period (np.array) – Array of periods from Lomb Scargle routine.

  • power (np.array) – Array of powers from the Lomb Scargle routine.

  • arg (int) – Argmax of the power in the periodogram.

Returns

popt – Array of best fit values for Gaussian fit.

Return type

np.array

gauss_curve(x, std, scale, mu)
Fits a Gaussian to the peak of the LS periodogram.

Parameters
  • x (np.array) –

  • std (float) – Standard deviation of gaussian.

  • scale (float) – Scaling for gaussian.

  • mu (float) – Mean to fit around.

Returns

Return type

Gaussian curve.
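For reference, a Gaussian with these three parameters is commonly written as scale * exp(-(x - mu)^2 / (2 * std^2)). The sketch below assumes that form; whether stella includes an extra normalization factor is not stated here, so treat the function as illustrative.

```python
import numpy as np


def gaussian(x, std, scale, mu):
    """Assumed form: peak value `scale` at x = mu, width set by `std`."""
    return scale * np.exp(-0.5 * ((x - mu) / std) ** 2)


std, scale, mu = 0.4, 3.0, 2.0

peak_value = gaussian(mu, std, scale, mu)

# The curve drops to half its peak at mu +/- std * sqrt(2 * ln 2).
half_width = std * np.sqrt(2.0 * np.log(2.0))
half_value = gaussian(mu + half_width, std, scale, mu)
```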

phase_lightcurve(table=None, trough=-0.5, peak=0.5, kernel_size=101)

Finds and creates a phase light curve that traces the spots. Uses only complete rotations and extrapolates outwards until the entire light curve is covered.

Parameters
  • table (astropy.table.Table, optional) – Used for getting the periods of each light curve. Allows users to use already created tables. Default = None, in which case the method searches for stella.MeasureProt.LS_results.

  • trough (float, optional) – Sets the phase value at the minimum. Default = -0.5.

  • peak (float, optional) – Sets the phase value at the maximum. Default = 0.5.

  • kernel_size (odd int, optional) – Sets kernel size for median filter smoothing. Default = 101.

phases

Type

np.ndarray

run_LS(minf=0.08, maxf=10.0, spp=50)

Runs LS fit for each light curve.

Parameters
  • minf (float, optional) – The minimum frequency to search in the LS routine. Default = 0.08.

  • maxf (float, optional) – The maximum frequency to search in the LS routine. Default = 1/0.1.

  • spp (int, optional) – The number of samples per peak. Default = 50.

LS_results

Type

astropy.table.Table
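To make the role of minf and maxf concrete, the sketch below evaluates the classical (unnormalized) Lomb-Scargle periodogram on a frequency grid between the two limits and recovers the period of a synthetic sinusoid. It is a NumPy-only illustration; stella's own routine may rely on a library implementation, and the fixed 5000-point grid used here is an assumption that does not model the spp parameter.

```python
import numpy as np


def lomb_scargle_power(t, y, freqs):
    """Classical Lomb-Scargle power at each frequency in `freqs`."""
    y = y - np.mean(y)
    omega = 2.0 * np.pi * freqs                      # shape (F,)
    wt2 = 2.0 * omega[:, None] * t[None, :]          # shape (F, N)
    # Time offset tau that makes the sine and cosine terms orthogonal.
    tau = np.arctan2(np.sin(wt2).sum(axis=1),
                     np.cos(wt2).sum(axis=1)) / (2.0 * omega)
    arg = omega[:, None] * (t[None, :] - tau[:, None])
    c, s = np.cos(arg), np.sin(arg)
    return 0.5 * ((c @ y) ** 2 / (c ** 2).sum(axis=1)
                  + (s @ y) ** 2 / (s ** 2).sum(axis=1))


# Unevenly sampled synthetic light curve with a 2-day rotation signal.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 27.0, 500))
y = 1.0 + 0.01 * np.sin(2.0 * np.pi * t / 2.0)

minf, maxf = 0.08, 10.0          # frequencies in cycles per day
freqs = np.linspace(minf, maxf, 5000)
power = lomb_scargle_power(t, y, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
```

With these limits the search covers periods from 1/maxf = 0.1 days up to 1/minf = 12.5 days.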

class stella.Visualize(cnn, set='validation')

Creates diagnostic plots for the neural network.

confusion_matrix(threshold=0.5, colormap='inferno')

Plots the confusion matrix of true positives, true negatives, false positives, and false negatives.

Parameters
  • threshold (float, optional) – Defines the threshold for positive vs. negative cases. Default is 0.5 (50%).

  • colormap (str, optional) – Colormap to draw colors from to plot the light curves on the confusion matrix. Default is ‘inferno’.
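The four quantities behind the plot can be counted as in the sketch below. Treating predictions at or above the threshold as positive is an assumption (stella may use a strict inequality), and the function name confusion_counts is hypothetical.

```python
import numpy as np


def confusion_counts(labels, predictions, threshold=0.5):
    """Return (TP, TN, FP, FN) for binary labels and model scores."""
    truth = np.asarray(labels).astype(bool)
    predicted = np.asarray(predictions) >= threshold
    tp = int(np.sum(truth & predicted))
    tn = int(np.sum(~truth & ~predicted))
    fp = int(np.sum(~truth & predicted))
    fn = int(np.sum(truth & ~predicted))
    return tp, tn, fp, fn


labels = [1, 1, 0, 0, 1]
predictions = [0.9, 0.2, 0.7, 0.1, 0.6]
tp, tn, fp, fn = confusion_counts(labels, predictions)
```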

loss_acc(train_color='k', val_color='darkorange')

Plots the loss & accuracy curves for the training and validation sets.

Parameters
  • train_color (str, optional) – Color to plot the training set in. Default is black.

  • val_color (str, optional) – Color to plot the validation set in. Default is dark orange.

precision_recall(**kwargs)

Plots the ensemble-averaged precision recall metric.

Parameters

**kwargs (dictionary, optional) – Dictionary of parameters to pass into matplotlib.
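As an illustration of what an ensemble-averaged precision and recall might involve, the sketch below averages the predictions of two hypothetical models and then scores the mean at a chosen threshold. Averaging the predictions before scoring is an assumption; stella may instead average the metrics themselves.

```python
import numpy as np


def precision_recall_at(labels, scores, threshold):
    """Precision and recall when scores above `threshold` count as flares."""
    truth = np.asarray(labels).astype(bool)
    predicted = np.asarray(scores) > threshold
    tp = np.sum(truth & predicted)
    fp = np.sum(~truth & predicted)
    fn = np.sum(truth & ~predicted)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return float(precision), float(recall)


labels = np.array([1, 0, 1, 0])
model_a = np.array([0.9, 0.4, 0.6, 0.2])   # hypothetical model outputs
model_b = np.array([0.7, 0.6, 0.2, 0.0])

ensemble_mean = np.mean([model_a, model_b], axis=0)
precision, recall = precision_recall_at(labels, ensemble_mean, threshold=0.45)
```

Sweeping the threshold from 0 to 1 and plotting the resulting pairs traces out the precision-recall curve.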