API¶
-
class
stella.
DownloadSets
(fn_dir=None, flare_catalog_name=None)¶ Downloads the flare catalog and light curves needed for the CNN training, test, and validation sets. This class also reformats the light curves into .npy files and removes the associated FITS files to save space.
-
download_catalog
()¶ Downloads the flare catalog using Vizier. The flare catalog is named ‘Guenther_2020_flare_catalog.txt’. The star catalog is named ‘Guenther_2020_star_catalog.txt’.
-
flare_table
¶ Flare catalog that was downloaded.
- Type
astropy.table.Table
-
-
download_lightcurves
(remove_fits=True)¶ Downloads light curves for the training, validation, and test sets.
- Parameters
remove_fits (bool, optional) – Allows the user to remove the TESS light curve FITS files when done. This will save space. Default is True.
-
download_models
(all_models=False)¶ Downloads the stella pre-trained convolutional neural network models from MAST.
- Parameters
all_models (bool, optional) – Determines whether to return all 100 trained models or only the 10 models used in the Feinstein et al. (2020) analysis in the attribute models. Default is False.
-
model_dir
¶ Path to where the CNN models have been downloaded.
- Type
str
-
models
¶ Array of model filenames.
- Type
np.array
-
-
class
stella.
FlareDataSet
(fn_dir=None, catalog=None, downloadSet=None, cadences=200, frac_balance=0.73, training=0.8, validation=0.9)¶ Given a directory of files, reformat data to create a training set for the convolutional neural network. Files must be in ‘.npy’ file format and contain at minimum the following indices:
0th index = array of time
1st index = array of flux
2nd index = array of flux errors
All other indices in the files are ignored. This class additionally requires a catalog of flare start times for labeling. The flare catalog can be in either ‘.txt’ or ‘.csv’ file format. An instance of this class is passed into the stella.ConvNN class to create and train the neural network.
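The file layout described above can be sketched with NumPy alone. This is a minimal illustration, not part of stella: the light curve is synthetic, the output directory is a temporary one, and the file name simply follows the `<ID>_...npy` pattern that load_files expects.

```python
import os
import tempfile

import numpy as np

# Synthetic light curve; real inputs would come from TESS data.
time = np.linspace(1500.0, 1501.0, 720)   # days
flux = np.ones_like(time)                 # normalized flux
flux_err = np.full_like(time, 1e-3)       # flux uncertainties

# Index 0 = time, index 1 = flux, index 2 = flux errors.
light_curve = np.array([time, flux, flux_err])

# Illustrative file name: the star ID comes first, followed by "_".
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "123456789_sector09.npy")
np.save(path, light_curve)

# Read the file back and recover the star ID from the file name.
loaded = np.load(path)
star_id = int(os.path.basename(path).split("_")[0])
```

Any rows beyond the first three would be ignored by FlareDataSet, so extra diagnostics can be stored in the same file without affecting training.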
-
load_files
(id_keyword='TIC', ft_keyword='tpeak', time_offset=2457000.0)¶ Loads in light curves from the assigned training set directory. Files must be formatted such that the ID of each star is first and followed by ‘_’ (e.g. 123456789_sector09.npy).
-
times
¶ An n-dimensional array of times, where n is the number of training set files.
- Type
np.ndarray
-
fluxes
¶ An n-dimensional array of fluxes, where n is the number of training set files.
- Type
np.ndarray
-
flux_errs
¶ An n-dimensional array of flux errors, where n is the number of training set files.
- Type
np.ndarray
-
ids
¶ An array of light curve IDs for each time/flux/flux_err. This is essential for labeling flare events.
- Type
np.array
-
id_keyword
¶ The column header in the catalog that identifies the target ID. Default is ‘TIC’.
- Type
str, optional
-
ft_keyword
¶ The column header in catalog to identify flare peak time. Default is ‘tpeak’.
- Type
str, optional
-
time_offset
¶ Time correction applied to convert flare catalog times to the light curve time system. It is necessary when using Max Guenther’s catalog. Default is 2457000.0.
- Type
float, optional
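The role of time_offset can be illustrated with a short NumPy sketch. It assumes, as is conventional for TESS, that the light curve times are stored as BJD - 2457000 while the catalog lists full BJD peak times; the catalog values and the light curve below are made up for illustration.

```python
import numpy as np

TIME_OFFSET = 2457000.0  # default value of time_offset

# Hypothetical catalog peak times in full Barycentric Julian Date.
catalog_tpeak_bjd = np.array([2458325.25, 2458330.75])

# Hypothetical light curve time axis, assumed to be BJD - 2457000.
lc_time = np.linspace(1325.0, 1331.0, 4321)

# Shift the catalog times onto the light curve time axis.
tpeak_lc = catalog_tpeak_bjd - TIME_OFFSET

# Index of the cadence closest to each flare peak.
nearest = np.array([np.argmin(np.abs(lc_time - t)) for t in tpeak_lc])
```

If a catalog already uses the same time system as the light curves, an offset of 0.0 would leave the peak times unchanged.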
-
-
reformat_data
(random_seed=321)¶ Reformats the data into cadences-sized arrays and assigns each a label based on the flare times defined in the catalog.
- Parameters
random_seed (int, optional) – A random seed to set for randomizing the order of the training_matrix after it is constructed. Default is 321.
-
training_matrix
¶ An n x cadences-sized array used as the training data.
- Type
np.ndarray
-
labels
¶ An n-sized array of labels for each row in the training data.
- Type
np.array
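The idea behind reformat_data can be sketched as follows. This is not stella's implementation: it assumes non-overlapping windows, labels a window 1 when a catalog peak time falls inside it, and uses a hypothetical helper name, make_training_matrix.

```python
import numpy as np

def make_training_matrix(time, flux, tpeaks, cadences=200, random_seed=321):
    """Cut one light curve into non-overlapping windows and label them."""
    n_windows = len(flux) // cadences
    usable = n_windows * cadences
    # Each row holds `cadences` consecutive flux values.
    matrix = flux[:usable].reshape(n_windows, cadences)
    t_windows = time[:usable].reshape(n_windows, cadences)

    # Label a window 1 if any catalog peak time falls inside it.
    labels = np.zeros(n_windows, dtype=int)
    for tp in tpeaks:
        inside = (t_windows[:, 0] <= tp) & (tp <= t_windows[:, -1])
        labels[inside] = 1

    # Shuffle rows and labels together with a fixed seed.
    order = np.random.default_rng(random_seed).permutation(n_windows)
    return matrix[order], labels[order]

time = np.arange(1000) * (2.0 / 1440.0)   # 2-minute cadence, in days
flux = np.ones(1000)
flux[450:460] += 0.05                     # toy flare
training_matrix, labels = make_training_matrix(time, flux, tpeaks=[time[452]])
```

The result has the same shape conventions as the attributes above: an n x cadences matrix and an n-sized label array, shuffled in the same order.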
-
class
stella.
ConvNN
(output_dir, ds=None, layers=None, optimizer='adam', loss='binary_crossentropy', metrics=None)¶ Creates and trains the convolutional neural network.
-
calibration
(df, metric_threshold)¶ Transforms the rankings output by the CNN into actual probabilities. This can only be run for an ensemble of models.
- Parameters
df (astropy.table.Table) – Table of output predictions from the validation set.
metric_threshold (float) – Defines the ranking above which something is considered a flare.
-
create_model
(seed)¶ Creates the Tensorflow keras model with appropriate layers.
-
model
¶ - Type
tensorflow.python.keras.engine.sequential.Sequential
-
-
cross_validation
(seed=2, epochs=350, batch_size=64, n_splits=5, shuffle=False, pred_test=False, save=False)¶ Performs cross validation for a given number of K-folds. Reassigns the training and validation sets for each fold.
- Parameters
seed (int, optional) – Sets random seed for creating CNN model. Default is 2.
epochs (int, optional) – Number of epochs to run each folded model on. Default is 350.
batch_size (int, optional) – The batch size for training. Default is 64.
n_splits (int, optional) – Number of folds to perform. Default is 5.
shuffle (bool, optional) – Allows for shuffling in sklearn.model_selection.KFold. Default is False.
pred_test (bool, optional) – Allows for predicting on the test set. DO NOT SET TO TRUE UNTIL YOU ARE HAPPY WITH YOUR FINAL MODEL. Default is False.
save (bool, optional) – Allows the user to save the kfolds table of predictions. Default is False.
-
crossval_predval
¶ Table of predictions on the validation set from each fold.
- Type
astropy.table.Table
-
crossval_predtest
¶ Table of predictions on the test set from each fold. ONLY EXISTS IF PRED_TEST IS TRUE.
- Type
astropy.table.Table
-
crossval_histories
¶ Table of history values from the model run on each fold.
- Type
astropy.table.Table
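How the training and validation sets are reassigned for each fold can be sketched without any machine learning library. The helper below, kfold_indices, is hypothetical and only approximates unshuffled K-fold splitting with contiguous blocks; it does not train a model.

```python
import numpy as np

def kfold_indices(n_samples, n_splits=5):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds."""
    indices = np.arange(n_samples)
    folds = np.array_split(indices, n_splits)
    for k in range(n_splits):
        val_idx = folds[k]
        train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
        yield train_idx, val_idx

splits = list(kfold_indices(n_samples=103, n_splits=5))
for fold, (train_idx, val_idx) in enumerate(splits):
    print(f"fold {fold}: {len(train_idx)} training rows, "
          f"{len(val_idx)} validation rows")
```

Every row serves as validation data in exactly one fold, which is why the per-fold prediction tables together cover the whole training matrix.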
-
load_model
(modelname, mode='validation')¶ Loads an already created model.
- Parameters
modelname (str) –
mode (str, optional) –
-
predict
(modelname, times, fluxes, errs, multi_models=False, injected=False)¶ Takes in arrays of time and flux and predicts where the flares are based on the keras model created and trained.
- Parameters
modelname (str) – Path and filename of a model to load.
times (np.ndarray) – Array of times to predict flares in.
fluxes (np.ndarray) – Array of fluxes to predict flares in.
errs (np.ndarray) – Array of flux errors for predicted flares.
injected (bool, optional) – Returns predictions instead of setting attribute. Used for injection-recovery. Default is False.
-
model
¶ The model input with modelname.
- Type
tensorflow.python.keras.engine.sequential.Sequential
-
predict_time
¶ The input times array.
- Type
np.ndarray
-
predict_flux
¶ The input fluxes array.
- Type
np.ndarray
-
predict_err
¶ The input flux errors array.
- Type
np.ndarray
-
predictions
¶ An array of predictions from the model.
- Type
np.ndarray
-
train_models
(seeds=[2], epochs=350, batch_size=64, shuffle=False, pred_test=False, save=False)¶ Trains n models, one for each of the n initial random seeds given. Also saves each model run to a hidden ~/.stella directory.
- Parameters
seeds (np.array) – Array of random seed starters of length n, where n is the number of models you want to run.
epochs (int, optional) – Number of epochs to train for. Default is 350.
batch_size (int, optional) – Setting the batch size for the training. Default is 64.
shuffle (bool, optional) – Allows for shuffling of the training set when fitting the model. Default is False.
pred_test (bool, optional) – Allows for predictions on the test set. DO NOT SET TO TRUE UNTIL YOU’VE DECIDED ON YOUR FINAL MODEL. Default is False.
save (bool, optional) – Saves the predictions and histories from each model in an ascii table to the specified output directory. Default is False.
-
history_table
¶ Saves the metric values for each model run.
- Type
astropy.table.Table
-
val_pred_table
¶ Predictions on the validation set from each run.
- Type
astropy.table.Table
-
test_pred_table
¶ Predictions on the test set from each run. Must set pred_test = True, or else it is an empty table.
- Type
astropy.table.Table
-
-
class
stella.
FitFlares
(id, time, flux, flux_err, predictions)¶ Uses the predictions from the neural network and identifies flaring events based on consecutive points. Users define a probability threshold for accepting a flare event as real.
-
get_init_guesses
(groupings, time, flux, err, prob, maskregion, region)¶ Guesses at the initial t0 and amplitude based on probability groups.
- Parameters
groupings (np.ndarray) – Group of indices for a single flare event.
time (np.array) –
flux (np.array) –
err (np.array) –
prob (np.array) –
- Returns
tpeaks (np.ndarray) – Array of tpeaks for each flare group.
amps (np.ndarray) – Array of amplitudes at each tpeak.
-
group_inds
(values)¶ Groups regions marked as flares (> prob_threshold) for flare fitting. Indices within 4 of each other are grouped as one flare.
- Returns
results – An array of arrays, each a group of indices presumed to belong to a single flare.
- Return type
np.ndarray
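The grouping rule can be sketched in a few lines of NumPy. This is an illustrative reading of "within 4 of each other" (neighbouring indices differing by at most 4), with a hypothetical helper name, group_indices, and made-up prediction values.

```python
import numpy as np

def group_indices(values, max_gap=4):
    """Split sorted indices into runs whose neighbours differ by <= max_gap."""
    values = np.asarray(values)
    if values.size == 0:
        return []
    # Positions where the jump to the next index exceeds the allowed gap.
    breaks = np.where(np.diff(values) > max_gap)[0] + 1
    return np.split(values, breaks)

predictions = np.array([0.1, 0.9, 0.8, 0.2, 0.7, 0.1, 0.1,
                        0.1, 0.1, 0.1, 0.1, 0.6, 0.9])
# Indices whose predicted value exceeds the probability threshold.
flagged = np.where(predictions > 0.5)[0]
groups = group_indices(flagged)
```

Here the flagged indices 1, 2, and 4 form one candidate flare and indices 11 and 12 form another, because the jump from 4 to 11 is larger than the allowed gap.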
-
identify_flare_peaks
(threshold=0.5)¶ Marks points where the predicted value is above the threshold as flare candidates. Groups consecutive indices as one flaring event.
- Parameters
threshold (float, optional) – The probability threshold for believing an event is a flare. Default is 0.5.
-
threshold
¶ - Type
float
-
flare_table
¶ A table of flare times, amplitudes, and equivalent durations. Equivalent duration is given in units of days.
- Type
astropy.table.Table
-
-
class
stella.
MeasureProt
(IDs, time, flux, flux_err)¶ Used for measuring rotation periods.
-
assign_flag
(period, power, width, avg, secpow, maxperiod, orbit_flag=0)¶ Assigns a flag in the table for which periods are reliable.
-
averaged_per_sector
(tab)¶ Looks at targets observed in different sectors and determines which measured period is likely the best period. Adds a ‘true_period_days’ column to MeasureProt.LS_results for the results.
- Returns
- Return type
astropy.table.Table
-
chiSquare
(var, mu, x, y, yerr)¶ Calculates chi-square for fitting a Gaussian to the peak of the LS periodogram.
- Parameters
var (list) – Variables to fit (std and scale for Gaussian curve).
mu (float) – Mean to fit around.
x (np.array) –
y (np.array) –
yerr (np.array) –
- Returns
- Return type
chi-square value.
-
fit_LS_peak
(period, power, arg)¶ Fits the LS periodogram at the peak power.
- Parameters
period (np.array) – Array of periods from Lomb Scargle routine.
power (np.array) – Array of powers from the Lomb Scargle routine.
arg (int) – Argmax of the power in the periodogram.
- Returns
popt – Array of best fit values for Gaussian fit.
- Return type
np.array
-
gauss_curve
(x, std, scale, mu)¶ Fits a Gaussian to the peak of the LS periodogram.
- Parameters
x (np.array) –
std (float) – Standard deviation of gaussian.
scale (float) – Scaling for gaussian.
mu (float) – Mean to fit around.
- Returns
- Return type
Gaussian curve.
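The relationship between the Gaussian curve and the chi-square it is scored with can be sketched as follows. The exact functional form used by stella is not stated here, so this assumes a Gaussian whose peak height is `scale`, and the chi_square helper is a hypothetical stand-in that follows the documented argument order.

```python
import numpy as np

def gauss_curve(x, std, scale, mu):
    """Gaussian with peak height `scale`, centre `mu`, and width `std`."""
    return scale * np.exp(-0.5 * ((x - mu) / std) ** 2)

def chi_square(var, mu, x, y, yerr):
    """Chi-square of the Gaussian model; `var` holds (std, scale)."""
    std, scale = var
    model = gauss_curve(x, std, scale, mu)
    return np.sum((y - model) ** 2 / yerr ** 2)

# Toy periodogram peak centred on a 3-day period.
x = np.linspace(2.0, 4.0, 201)
y = gauss_curve(x, std=0.2, scale=0.8, mu=3.0)
yerr = np.full_like(x, 0.01)

chi2_true = chi_square([0.2, 0.8], 3.0, x, y, yerr)   # correct parameters
chi2_off = chi_square([0.3, 0.8], 3.0, x, y, yerr)    # width too large
```

A minimizer would vary `var` while holding `mu` fixed at the periodogram peak, which is why the mean is passed separately from the fitted variables.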
-
phase_lightcurve
(table=None, trough=-0.5, peak=0.5, kernel_size=101)¶ Finds and creates a phase light curve that traces the spots. Uses only complete rotations and extrapolates outwards until the entire light curve is covered.
- Parameters
table (astropy.table.Table, optional) – Used for getting the periods of each light curve. Allows users to use already created tables. Default = None, in which case stella.MeasureProt.LS_results is used.
trough (float, optional) – Sets the phase value at the minimum. Default = -0.5.
peak (float, optional) – Sets the phase value at the maximum. Default = 0.5.
kernel_size (odd float, optional) – Sets kernel size for median filter smoothing. Default = 101.
-
phases
¶ - Type
np.ndarray
-
run_LS
(minf=0.08, maxf=10.0, spp=50)¶ Runs LS fit for each light curve.
- Parameters
minf (float, optional) – The minimum frequency to search in the LS routine. Default = 0.08 (1/12.5).
maxf (float, optional) – The maximum frequency to search in the LS routine. Default = 1/0.1.
spp (int, optional) – The number of samples per peak. Default = 50.
-
LS_results
¶ - Type
astropy.table.Table
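To show what minf, maxf, and spp control, here is a self-contained NumPy sketch of a classical Lomb-Scargle periodogram. It is not the routine stella calls: the lomb_scargle helper is hypothetical, and the grid spacing of 1 / (spp * baseline) is an assumed convention for "samples per peak".

```python
import numpy as np

def lomb_scargle(time, flux, minf=0.08, maxf=10.0, spp=50):
    """Classical Lomb-Scargle power on a uniform frequency grid."""
    y = flux - np.mean(flux)
    baseline = time.max() - time.min()
    # Assumed grid spacing: `spp` samples across a peak of width 1/baseline.
    df = 1.0 / (spp * baseline)
    freq = np.arange(minf, maxf, df)

    omega = 2.0 * np.pi * freq[:, None]          # shape (n_freq, 1)
    # Time shift tau that makes the sine and cosine terms orthogonal.
    tau = np.arctan2(np.sum(np.sin(2 * omega * time), axis=1),
                     np.sum(np.cos(2 * omega * time), axis=1)) / (2 * omega[:, 0])
    arg = omega * (time[None, :] - tau[:, None])
    c, s = np.cos(arg), np.sin(arg)
    power = 0.5 * ((c @ y) ** 2 / np.sum(c ** 2, axis=1)
                   + (s @ y) ** 2 / np.sum(s ** 2, axis=1))
    return freq, power

# Toy light curve: 27 days with a 3-day sinusoidal modulation.
time = np.linspace(0.0, 27.0, 400)
flux = 1.0 + 0.01 * np.sin(2.0 * np.pi * time / 3.0)

# A narrower, coarser grid than the defaults keeps this toy example fast.
freq, power = lomb_scargle(time, flux, maxf=2.0, spp=10)
best_period = 1.0 / freq[np.argmax(power)]
```

With the default limits, the searched periods run from 1/maxf = 0.1 days to 1/minf = 12.5 days, and a larger spp gives a finer grid at the cost of more computation.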
-
-
class
stella.
Visualize
(cnn, set='validation')¶ Creates diagnostic plots for the neural network.
-
confusion_matrix
(threshold=0.5, colormap='inferno')¶ Plots the confusion matrix of true positives, true negatives, false positives, and false negatives.
- Parameters
threshold (float, optional) – Defines the threshold for positive vs. negative cases. Default is 0.5 (50%).
colormap (str, optional) – Colormap to draw colors from to plot the light curves on the confusion matrix. Default is ‘inferno’.
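The four quantities shown in a confusion matrix can be computed directly from labels and predictions. The sketch below uses made-up values and a hypothetical helper, confusion_counts; it assumes a prediction strictly above the threshold counts as a positive.

```python
import numpy as np

def confusion_counts(labels, predictions, threshold=0.5):
    """Return (TP, TN, FP, FN) for binary labels and predicted scores."""
    labels = np.asarray(labels).astype(bool)
    # Assumption: strictly above the threshold counts as a positive.
    predicted = np.asarray(predictions) > threshold
    tp = int(np.sum(predicted & labels))     # flares correctly recovered
    tn = int(np.sum(~predicted & ~labels))   # non-flares correctly rejected
    fp = int(np.sum(predicted & ~labels))    # non-flares flagged as flares
    fn = int(np.sum(~predicted & labels))    # flares that were missed
    return tp, tn, fp, fn

labels = np.array([1, 1, 0, 0, 1, 0])
predictions = np.array([0.9, 0.4, 0.2, 0.7, 0.8, 0.1])
tp, tn, fp, fn = confusion_counts(labels, predictions)
```

Raising the threshold generally trades false positives for false negatives, so it is worth inspecting the matrix at more than one threshold.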
-
loss_acc
(train_color='k', val_color='darkorange')¶ Plots the loss & accuracy curves for the training and validation sets.
- Parameters
train_color (str, optional) – Color to plot the training set in. Default is black.
val_color (str, optional) – Color to plot the validation set in. Default is dark orange.
-
precision_recall
(**kwargs)¶ Plots the ensemble-averaged precision recall metric.
- Parameters
**kwargs (dictionary, optional) – Dictionary of parameters to pass into matplotlib.
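One possible reading of "ensemble-averaged" is that the predictions of several models are averaged before precision and recall are computed. The sketch below follows that assumption with made-up predictions and a hypothetical helper, precision_recall_curve; it does not reproduce the plotting.

```python
import numpy as np

def precision_recall_curve(labels, scores, thresholds):
    """Precision and recall at each threshold for binary labels."""
    labels = np.asarray(labels).astype(bool)
    precision, recall = [], []
    for thr in thresholds:
        predicted = scores > thr
        tp = np.sum(predicted & labels)
        fp = np.sum(predicted & ~labels)
        fn = np.sum(~predicted & labels)
        # Convention assumed here: precision is 1 when nothing is flagged.
        precision.append(tp / (tp + fp) if (tp + fp) > 0 else 1.0)
        recall.append(tp / (tp + fn) if (tp + fn) > 0 else 0.0)
    return np.array(precision), np.array(recall)

labels = np.array([1, 1, 0, 0, 1, 0])
# Hypothetical predictions from three models (rows) on six samples (columns).
ensemble = np.array([[0.9, 0.5, 0.2, 0.6, 0.8, 0.1],
                     [0.8, 0.3, 0.3, 0.7, 0.9, 0.2],
                     [1.0, 0.4, 0.1, 0.8, 0.7, 0.0]])
mean_scores = ensemble.mean(axis=0)

thresholds = np.array([0.25, 0.5, 0.75])
precision, recall = precision_recall_curve(labels, mean_scores, thresholds)
```

Averaging the per-model curves instead of the per-model predictions would be an equally plausible interpretation and can give slightly different numbers.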
-