API¶

class stella.DownloadSets(fn_dir=None, flare_catalog_name=None)¶

Downloads the flare catalog and light curves needed for the CNN training, test, and validation sets. This class also reformats the light curves into .npy files and removes the associated FITS files to save space.

download_catalog()¶

Downloads the flare catalog using Vizier. The flare catalog is named ‘Guenther_2020_flare_catalog.txt’. The star catalog is named ‘Guenther_2020_star_catalog.txt’.

flare_table¶

Flare catalog that was downloaded.

Type: astropy.table.Table

download_lightcurves(remove_fits=True)¶

Downloads light curves for the training, validation, and test sets.

Parameters: remove_fits (bool, optional) – Removes the TESS light curve FITS files once they have been reformatted, which saves disk space. Default is True.

download_models(all_models=False)¶

Downloads the stella pre-trained convolutional neural network models from MAST.

Parameters: all_models (bool, optional) – If True, the models attribute lists all 100 trained models; if False, it lists only the 10 models used in the Feinstein et al. (2020) analysis. Default is False.

model_dir¶

Path to where the CNN models have been downloaded.

Type: str

models¶

Array of model filenames.

Type: np.array

class stella.FlareDataSet(fn_dir=None, catalog=None, downloadSet=None, cadences=200, frac_balance=0.73, training=0.8, validation=0.9)¶

Given a directory of files, reformat data to create a training set for the convolutional neural network. Files must be in ‘.npy’ file format and contain at minimum the following indices:

0th index = array of time

1st index = array of flux

2nd index = array of flux errors

All other indices in the files are ignored. This class additionally requires a catalog of flare start times for labeling. The flare catalog can be in either ‘.txt’ or ‘.csv’ file format. An instance of this class is passed to the stella.ConvNN class to create and train the neural network.
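As an illustration of the file layout described above, the following sketch writes and reloads a synthetic light curve in that index order. The flux values, the temporary directory, and the file name are made up for the example; only the row order (time, flux, flux error) comes from the description.

```python
import os
import tempfile

import numpy as np

# Synthetic light curve: 1000 cadences spaced two minutes apart (made-up values).
time = 1325.0 + np.arange(1000) * (2.0 / 1440.0)
flux = np.ones_like(time)
flux_err = np.full_like(time, 1e-3)

# Rows 0, 1, and 2 hold time, flux, and flux error, in that order.
light_curve = np.vstack([time, flux, flux_err])

# Hypothetical file name: star ID first, followed by an underscore.
path = os.path.join(tempfile.mkdtemp(), "123456789_sector09.npy")
np.save(path, light_curve)

# Reload and unpack by index, as a reader of these files would.
loaded = np.load(path)
loaded_time, loaded_flux, loaded_err = loaded[0], loaded[1], loaded[2]
print(loaded.shape)
```

Any extra rows beyond the first three would be ignored according to the description above, so additional diagnostics can be stored in the same file.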

load_files(id_keyword='TIC', ft_keyword='tpeak', time_offset=2457000.0)¶

Loads in light curves from the assigned training set directory. Files must be formatted such that the ID of each star is first and followed by ‘_’ (e.g. 123456789_sector09.npy).
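The naming rule can be checked with a few lines of plain Python. The helper name and the directory below are hypothetical; this is only a sketch of how an ID could be recovered from such a file name, not necessarily how load_files does it.

```python
import os


def star_id_from_filename(path):
    """Return the text before the first underscore in the file name."""
    return os.path.basename(path).split("_")[0]


# Hypothetical path that follows the naming convention above.
example_id = star_id_from_filename("lightcurves/123456789_sector09.npy")
print(example_id)
```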

times¶

An n-dimensional array of times, where n is the number of training set files.

Type: np.ndarray

fluxes¶

An n-dimensional array of fluxes, where n is the number of training set files.

Type: np.ndarray

flux_errs¶

An n-dimensional array of flux errors, where n is the number of training set files.

Type: np.ndarray

ids¶

An array of light curve IDs for each time/flux/flux_err. This is essential for labeling flare events.

Type: np.array

id_keyword¶

The column header in catalog to identify target ID. Default is ‘TIC’.

Type: str, optional

ft_keyword¶

The column header in catalog to identify flare peak time. Default is ‘tpeak’.

Type: str, optional

time_offset¶

Time offset between the flare catalog and the light curves. It is necessary when using Max Guenther’s catalog. Default is 2457000.0.

Type: float, optional

reformat_data(random_seed=321)¶

Reformats the data into arrays of length cadences and assigns a label to each based on the flare times defined in the catalog.

Parameters: random_seed (int, optional) – A random seed to set for randomizing the order of the training_matrix after it is constructed. Default is 321.

training_matrix¶

An n x cadences-sized array used as the training data.

Type: np.ndarray

labels¶

An n-sized array of labels for each row in the training data.

Type: np.array
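To make the windowing and labeling idea concrete, here is a deliberately simplified sketch. It is not stella's implementation: it cuts a light curve into consecutive, non-overlapping windows of cadences points and labels a window 1 if a catalog flare peak falls inside it. The function name is hypothetical, and it assumes the catalog lists peak times as full Julian dates that must be reduced by time_offset to match the light curve times.

```python
def label_windows(times, flare_peaks_jd, cadences=200, time_offset=2457000.0):
    """Split times into consecutive windows and flag those containing a flare peak."""
    # Shift catalog peak times onto the light curve time system (assumed convention).
    peaks = [t - time_offset for t in flare_peaks_jd]
    windows, labels = [], []
    for start in range(0, len(times) - cadences + 1, cadences):
        window = times[start:start + cadences]
        has_flare = any(window[0] <= p <= window[-1] for p in peaks)
        windows.append((start, start + cadences))
        labels.append(1 if has_flare else 0)
    return windows, labels


# Synthetic example: 600 cadences two minutes apart, one flare near index 250.
times = [1325.0 + i * (2.0 / 1440.0) for i in range(600)]
flare_peaks_jd = [2457000.0 + times[250]]
windows, labels = label_windows(times, flare_peaks_jd)
print(windows, labels)
```

The frac_balance argument in the class signature suggests the real routine also balances flare and non-flare examples, which this sketch does not attempt.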

class stella.ConvNN(output_dir, ds=None, layers=None, optimizer='adam', loss='binary_crossentropy', metrics=None)¶

Creates and trains the convolutional neural network.

calibration(df, metric_threshold)¶

Transforms the rankings output by the CNN into actual probabilities. This can only be run for an ensemble of models.

Parameters

df (astropy.table.Table) – Table of output predictions from the validation set.
metric_threshold (float) – Defines the ranking above which an event is considered a flare.

create_model(seed)¶

Creates the Tensorflow keras model with appropriate layers.

model¶

Type: tensorflow.python.keras.engine.sequential.Sequential

cross_validation(seed=2, epochs=350, batch_size=64, n_splits=5, shuffle=False, pred_test=False, save=False)¶

Performs cross validation for a given number of K-folds. Reassigns the training and validation sets for each fold.

Parameters

seed (int, optional) – Sets random seed for creating CNN model. Default is 2.
epochs (int, optional) – Number of epochs to run each folded model on. Default is 350.
batch_size (int, optional) – The batch size for training. Default is 64.
n_splits (int, optional) – Number of folds to perform. Default is 5.
shuffle (bool, optional) – Allows for shuffling in sklearn.model_selection.KFold. Default is False.
pred_test (bool, optional) – Allows for predicting on the test set. DO NOT SET TO TRUE UNTIL YOU ARE HAPPY WITH YOUR FINAL MODEL. Default is False.
save (bool, optional) – Allows the user to save the kfolds table of predictions. Default is False.

crossval_predval¶

Table of predictions on the validation set from each fold.

Type: astropy.table.Table

crossval_predtest¶

Table of predictions on the test set from each fold. ONLY EXISTS IF PRED_TEST IS TRUE.

Type: astropy.table.Table

crossval_histories¶

Table of history values from the model run on each fold.

Type: astropy.table.Table
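For readers unfamiliar with K-fold splitting, the sketch below shows how contiguous, unshuffled folds partition a data set, which is how sklearn.model_selection.KFold behaves when shuffle is False. It is a plain-Python illustration with a hypothetical function name, and stella's own handling of the folds may differ in detail.

```python
def kfold_indices(n_samples, n_splits=5):
    """Yield (train, validation) index lists for contiguous, unshuffled folds."""
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1  # spread the remainder over the first folds
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


folds = list(kfold_indices(12, n_splits=5))
for train, val in folds:
    print(len(train), val)
```

Each sample appears in exactly one validation fold, so every row of the training matrix receives one validation prediction across the folds.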

load_model(modelname, mode='validation')¶

Loads an already created model.

Parameters

modelname (str) –
mode (str, optional) –

predict(modelname, times, fluxes, errs, multi_models=False, injected=False)¶

Takes in arrays of time and flux and predicts where the flares are based on the keras model created and trained.

Parameters

modelname (str) – Path and filename of a model to load.
times (np.ndarray) – Array of times to predict flares in.
fluxes (np.ndarray) – Array of fluxes to predict flares in.
errs (np.ndarray) – Array of flux errors for predicted flares.
injected (bool, optional) – Returns predictions instead of setting attribute. Used for injection-recovery. Default is False.

model¶

The model loaded from modelname.

Type: tensorflow.python.keras.engine.sequential.Sequential

predict_time¶

The input times array.

Type: np.ndarray

predict_flux¶

The input fluxes array.

Type: np.ndarray

predict_err¶

The input flux errors array.

Type: np.ndarray

predictions¶

An array of predictions from the model.

Type: np.ndarray

train_models(seeds=[2], epochs=350, batch_size=64, shuffle=False, pred_test=False, save=False)¶

Trains n models, one for each of the n random seeds given in seeds. Also saves each model run to a hidden ~/.stella directory.

Parameters

seeds (np.array) – Array of random seed starters of length n, where n is the number of models you want to run.
epochs (int, optional) – Number of epochs to train for. Default is 350.
batch_size (int, optional) – Setting the batch size for the training. Default is 64.
shuffle (bool, optional) – Allows for shuffling of the training set when fitting the model. Default is False.
pred_test (bool, optional) – Allows for predictions on the test set. DO NOT SET TO TRUE UNTIL YOU’VE DECIDED ON YOUR FINAL MODEL. Default is False.
save (bool, optional) – Saves the predictions and histories from each model in an ascii table to the specified output directory. Default is False.

history_table¶

Saves the metric values for each model run.

Type: astropy.table.Table

val_pred_table¶

Predictions on the validation set from each run.

Type: astropy.table.Table

test_pred_table¶

Predictions on the test set from each run. Must set pred_test = True, or else it is an empty table.

Type: astropy.table.Table

class stella.FitFlares(id, time, flux, flux_err, predictions)¶

Uses the predictions from the neural network and identifies flaring events based on consecutive points. Users define a given probability threshold for accepting a flare event as real.

get_init_guesses(groupings, time, flux, err, prob, maskregion, region)¶

Guesses at the initial t0 and amplitude based on probability groups.

Parameters

groupings (np.ndarray) – Group of indices for a single flare event.
time (np.array) –
flux (np.array) –
err (np.array) –
prob (np.array) –

Returns

tpeaks (np.ndarray) – Array of tpeaks for each flare group.
amps (np.ndarray) – Array of amplitudes at each tpeak.

group_inds(values)¶

Groups regions marked as flares (> prob_threshold) for flare fitting. Indices within 4 of each other are grouped as one flare.

Returns: results – An array of arrays, each a group of indices presumed to belong to a single flare.
Return type: np.ndarray
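The grouping rule can be sketched in a few lines. This is an illustration rather than the library's code: the function name is hypothetical, and it assumes that "within 4" means a gap of at most 4 between neighbouring indices.

```python
def group_indices(indices, max_gap=4):
    """Group sorted integer indices; a gap larger than max_gap starts a new group."""
    groups = []
    for idx in indices:
        if groups and idx - groups[-1][-1] <= max_gap:
            groups[-1].append(idx)
        else:
            groups.append([idx])
    return groups


# Indices of cadences whose prediction exceeded the threshold (made-up values).
flagged = [10, 11, 12, 15, 40, 41, 50]
groups = group_indices(flagged)
print(groups)  # [[10, 11, 12, 15], [40, 41], [50]]
```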

identify_flare_peaks(threshold=0.5)¶

Finds where the predicted value is above the threshold as a flare candidate. Groups consecutive indices as one flaring event.

Parameters: threshold (float, optional) – The probability threshold for believing an event is a flare. Default is 0.5.

threshold¶

Type: float

flare_table¶

A table of flare times, amplitudes, and equivalent durations. Equivalent duration given in units of days.

Type: astropy.table.Table
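Equivalent duration is conventionally the time integral of the fractional flux excess during a flare, which is why it carries units of time. The sketch below integrates a made-up, normalized flare profile with the trapezoidal rule. Whether stella integrates the data or a fitted flare model is not stated here, so treat this purely as an illustration of the quantity; the function name is hypothetical.

```python
def equivalent_duration(time, norm_flux):
    """Trapezoidal integral of (normalized flux - 1) over time."""
    total = 0.0
    for i in range(1, len(time)):
        dt = time[i] - time[i - 1]
        excess_mean = 0.5 * ((norm_flux[i] - 1.0) + (norm_flux[i - 1] - 1.0))
        total += excess_mean * dt
    return total


# Made-up flare: time in days, flux normalized so the quiescent level is 1.
time = [0.0, 0.01, 0.02, 0.03, 0.04]
flux = [1.0, 1.5, 1.2, 1.1, 1.0]
ed_days = equivalent_duration(time, flux)
ed_seconds = ed_days * 86400.0
print(ed_days, ed_seconds)
```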

class stella.MeasureProt(IDs, time, flux, flux_err)¶

Used for measuring rotation periods.

assign_flag(period, power, width, avg, secpow, maxperiod, orbit_flag=0)¶

Assigns a flag in the table for which periods are reliable.

averaged_per_sector(tab)¶

Looks at targets observed in different sectors and determines which measured period is likely the best period. Adds a ‘true_period_days’ column to MeasureProt.LS_results for the results.

Returns
Return type: astropy.table.Table

chiSquare(var, mu, x, y, yerr)¶

Calculates chi-square for fitting a Gaussian to the peak of the LS periodogram.

Parameters

var (list) – Variables to fit (std and scale for Gaussian curve).
mu (float) – Mean to fit around.
x (np.array) –
y (np.array) –
yerr (np.array) –

Returns

Chi-square value.

fit_LS_peak(period, power, arg)¶

Fits the LS periodogram at the peak power.

Parameters

period (np.array) – Array of periods from Lomb Scargle routine.
power (np.array) – Array of powers from the Lomb Scargle routine.
arg (int) – Argmax of the power in the periodogram.

Returns

popt – Array of best fit values for Gaussian fit.

Return type

np.array

gauss_curve(x, std, scale, mu)¶

Fits a Gaussian to the peak of the LS periodogram.

Parameters

x (np.array) –
std (float) – Standard deviation of gaussian.
scale (float) – Scaling for gaussian.
mu (float) – Mean to fit around.

Returns

Gaussian curve.
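The relationship between gauss_curve and chiSquare can be sketched as follows. The functional form is an assumption: an unnormalized Gaussian multiplied by scale, whereas the library may include a normalization factor. The chi-square is taken to be the usual sum of squared, error-weighted residuals, and the helper is named chi_square here to mark it as illustrative.

```python
import math


def gauss_curve(x, std, scale, mu):
    """Scaled Gaussian evaluated at each x (assumed functional form)."""
    return [scale * math.exp(-0.5 * ((xi - mu) / std) ** 2) for xi in x]


def chi_square(var, mu, x, y, yerr):
    """Sum of squared, error-weighted residuals between y and the Gaussian."""
    std, scale = var  # the two fitted variables; mu is held fixed
    model = gauss_curve(x, std, scale, mu)
    return sum(((yi - mi) / ei) ** 2 for yi, mi, ei in zip(y, model, yerr))


# Made-up periodogram peak centred on a 5-day period.
x = [4.0, 4.5, 5.0, 5.5, 6.0]
y = gauss_curve(x, std=0.5, scale=2.0, mu=5.0)
yerr = [0.1] * len(x)

perfect = chi_square([0.5, 2.0], 5.0, x, y, yerr)
worse = chi_square([0.8, 2.0], 5.0, x, y, yerr)
print(perfect, worse)
```

Holding mu fixed at the periodogram maximum and minimizing over std and scale gives a width for the peak, which is one way to estimate the uncertainty on the period.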

phase_lightcurve(table=None, trough=-0.5, peak=0.5, kernel_size=101)¶

Finds and creates a phase light curve that traces the spots. Uses only complete rotations and extrapolates outwards until the entire light curve is covered.

Parameters

table (astropy.table.Table, optional) – Used for getting the periods of each light curve. Allows users to use already created tables. Default = None, in which case MeasureProt.LS_results is used.
trough (float, optional) – Sets the phase value at the minimum. Default = -0.5.
peak (float, optional) – Sets the phase value at the maximum. Default = 0.5.
kernel_size (odd int, optional) – Sets kernel size for median filter smoothing. Default = 101.

phases¶

Type: np.ndarray

run_LS(minf=0.08, maxf=10.0, spp=50)¶

Runs LS fit for each light curve.

Parameters

minf (float, optional) – The minimum frequency to search in the LS routine. Default = 0.08 (1/12.5).
maxf (float, optional) – The maximum frequency to search in the LS routine. Default = 1/0.1.
spp (int, optional) – The number of samples per peak. Default = 50.

LS_results¶

Type: astropy.table.Table
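Because frequency and period are reciprocals, the frequency limits translate directly into the range of rotation periods searched. The sketch below works this out for the defaults in the signature, assuming frequencies are in cycles per day. The 27-day baseline is a hypothetical value, and reading spp as astropy's samples_per_peak is also an assumption.

```python
minf, maxf, spp = 0.08, 10.0, 50  # defaults from the run_LS signature

# The lowest frequency sets the longest period searched, and vice versa.
longest_period = 1.0 / minf   # days, if frequencies are in 1/day (assumed)
shortest_period = 1.0 / maxf  # days

# Hypothetical 27-day baseline, roughly one TESS sector.
baseline = 27.0
# Grid spacing that astropy's LombScargle.autofrequency uses:
# 1 / (samples_per_peak * baseline).
df = 1.0 / (spp * baseline)

print(longest_period, shortest_period, df)
```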

class stella.Visualize(cnn, set='validation')¶

Creates diagnostic plots for the neural network.

confusion_matrix(threshold=0.5, colormap='inferno')¶

Plots the confusion matrix of true positives, true negatives, false positives, and false negatives.

Parameters

threshold (float, optional) – Defines the threshold for positive vs. negative cases. Default is 0.5 (50%).
colormap (str, optional) – Colormap to draw colors from to plot the light curves on the confusion matrix. Default is ‘inferno’.
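The four cells of the confusion matrix follow directly from comparing thresholded predictions with the true labels. The sketch below counts them for made-up data. It assumes a prediction equal to the threshold counts as positive, which may not match the library's convention, and the function name is hypothetical.

```python
def confusion_counts(labels, predictions, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, score in zip(labels, predictions):
        predicted_flare = score >= threshold  # boundary handling is an assumption
        if predicted_flare and truth == 1:
            counts["tp"] += 1
        elif predicted_flare and truth == 0:
            counts["fp"] += 1
        elif truth == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Made-up validation labels (1 = flare) and CNN outputs.
labels = [1, 1, 0, 0, 1, 0]
predictions = [0.9, 0.4, 0.2, 0.7, 0.6, 0.1]
counts = confusion_counts(labels, predictions)
print(counts)  # {'tp': 2, 'fp': 1, 'tn': 2, 'fn': 1}
```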

loss_acc(train_color='k', val_color='darkorange')¶

Plots the loss & accuracy curves for the training and validation sets.

Parameters

train_color (str, optional) – Color to plot the training set in. Default is black.
val_color (str, optional) – Color to plot the validation set in. Default is dark orange.

precision_recall(**kwargs)¶

Plots the ensemble-averaged precision recall metric.

Parameters: **kwargs (dictionary, optional) – Dictionary of parameters to pass into matplotlib.
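A precision-recall curve is traced by sweeping the decision threshold and recomputing both quantities at each step. The sketch below does this for a single set of made-up predictions. The ensemble averaging mentioned above is not shown, since how the predictions are combined across models is not specified here, and the function name is hypothetical.

```python
def precision_recall_at(labels, predictions, threshold):
    """Return (precision, recall) for predictions thresholded at `threshold`."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for t, p in pairs if t == 1 and p >= threshold)
    fp = sum(1 for t, p in pairs if t == 0 and p >= threshold)
    fn = sum(1 for t, p in pairs if t == 1 and p < threshold)
    # Conventions for empty denominators are a choice made for this sketch.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up validation labels (1 = flare) and CNN outputs.
labels = [1, 1, 0, 0, 1, 0]
predictions = [0.9, 0.4, 0.2, 0.7, 0.6, 0.1]

curve = [(thr, *precision_recall_at(labels, predictions, thr))
         for thr in (0.3, 0.5, 0.8)]
for thr, precision, recall in curve:
    print(thr, precision, recall)
```

Raising the threshold generally trades recall for precision, which is the behaviour the plot is meant to expose.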