2. tofu.treat

Provide data handling class and methods (storing, processing, plotting...)

class tofu.treat.PreData(data, t=None, Chans=None, Id=None, Exp='AUG', shot=None, Diag='SXR', dtime=None, dtimeIn=False, SavePath=None, LIdDet=None, DtRef=None, MovMeanfreq=100, Resamp=True, interpkind='linear', indOut=None, indCorr=None, DF=None, Harm=True, DFEx=None, HarmEx=True, lt=[], lNames=[], Calc=True)

A class defining a data-handling object: the data is stored as a read-only attribute, processing is applied to copies of it, and methods are provided for plotting, saving...

The name of the class refers to Pre-treatment Data (i.e.: in the context of tomography, data that is pre-treated before being fed to an inversion algorithm). ToFu provides a generic data-handling class, which comes with a robust data-storing policy: the input data is stored in a read-only attribute and the data-processing methods are applied to a copy (e.g.: for computing the SVD or the Fourier transform, shortening the time interval of interest, eliminating some channels...). Furthermore, methods for interactive plotting are provided, as well as a saving method.
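
The storing policy described above can be illustrated with plain NumPy. This is only a minimal sketch of the idea, not the PreData implementation: the class ReadOnlyStore and its attribute names are hypothetical.

```python
import numpy as np


class ReadOnlyStore:
    """Hypothetical stand-in illustrating the policy, not the PreData class."""

    def __init__(self, data, t):
        # Keep private copies of the inputs and lock them against writes
        self._data_ref = np.array(data, dtype=float)
        self._t_ref = np.array(t, dtype=float)
        self._data_ref.flags.writeable = False
        self._t_ref.flags.writeable = False
        # Processing only ever touches this working copy
        self.data = self._data_ref.copy()

    @property
    def data_ref(self):
        # Reference data: readable, but an in-place write raises ValueError
        return self._data_ref

    def reset(self):
        # Discard all processing and start again from the reference data
        self.data = self._data_ref.copy()


store = ReadOnlyStore(np.arange(6.0).reshape(3, 2), t=[0.0, 0.1, 0.2])
store.data -= store.data.mean(axis=0)  # processing modifies the copy only
try:
    store.data_ref[0, 0] = 99.0
    write_blocked = False
except ValueError:
    write_blocked = True
```

Because the reference array is locked, a processing step that goes wrong can always be undone by starting again from the reference data.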

obj
: PreData
The created instance
Corr_add(Val=[], LCrit=['Name', 'Cam', 'CamHead'], indCorr=None, Calc=True)

Add channels to the list of channels that are thought to need correction

When a channel is suspected to need correction (mismatching retrofit due for example to wrong calibration), it can be included in a dedicated correction list. Channels in this list can then be discarded for the inversion, a correction coefficient can be computed from the retrofit, and the inversion can be re-done using this correction coefficient. This list works like the list of excluded / corrupted channels self.Out_list()

Parameters:
  • Val (list) – Fed to self.select(), list of values for criteria in LCrit that should be used to select channels (e.g.: list of channel names or camera names)
  • LCrit (list) – Fed to self.select(), list of criteria against which to select the channels matching the values in Val (should be attributes of tofu.pathfile.ID or of its USRdict attribute)
  • indCorr (None / np.ndarray) – Alternatively, you can directly pass a (N,) bool array where N matches the number of channels and True means that a channel needs correction, thus setting self._indCorr
  • Calc (bool) – Flag indicating whether the calculation should be triggered immediately
Corr_list(Out='Name')

Return the list of channel names needing correction

This lists the channels indicated by self._indCorr, populated using self.Corr_add() and de-populated using self.Corr_remove(). The output can be returned as a list of channel Names

Parameters:Out (str) – Flag indicating in which form to return the output (fed to select())
Returns:L (list) – List of channels needing correction in the required form
Corr_remove(Val=[], LCrit=['Name', 'Cam', 'CamHead'], Calc=True)

Remove channels from the list of channels that are thought to need correction

Works like self.In_add() (i.e.: opposite of self.Corr_add())

Parameters:
  • Val (list) – Fed to self.select(), list of values for criteria in LCrit that should be used to exclude channels (e.g.: list of channel names or camera names)
  • LCrit (list) – Fed to self.select(), list of criteria against which to select the channels matching the values in Val (should be attributes of tofu.pathfile.ID or of its USRdict attribute)
  • indCorr (None / np.ndarray) – Alternatively, you can directly pass a (N,) bool array where N matches the number of channels and True means that a channel should be excluded, thus setting self._indCorr
  • Calc (bool) – Flag indicating whether the calculation should be triggered immediately
In_add(LVal=[], LCrit=['Name', 'Cam', 'CamHead'], Calc=True)

Add channels to the list of channels to be re-included as valid channels

Provides a mechanism opposite to Out_add(). When you change your mind about a series of channels and think they should be re-included as valid, pass them to this method using the same arguments as self.Out_add()

Parameters:
  • Val (list) – Fed to self.select(), list of values for criteria in LCrit that should be used to exclude channels (e.g.: list of channel names or camera names)
  • LCrit (list) – Fed to self.select(), list of criteria against which to select the channels matching the values in Val (should be attributes of tofu.pathfile.ID or of its USRdict attribute)
  • indOut (None / np.ndarray) – Alternatively, you can directly pass a (N,) bool array where N matches the number of channels and True means that a channel should be excluded, thus setting self._indOut
  • Calc (bool) – Flag indicating whether the calculation should be triggered immediately
In_list(Out='Name')

Return the list of included channel names (considered valid)

The equivalent of Out_list(), but this time returning the complementary list

Parameters:Out (str) – Flag indicating in which form to return the output (fed to select())
Returns:L (list) – List of included channels in the required form
Out_add(Val=[], LCrit=['Name', 'Cam', 'CamHead'], indOut=None, Calc=True)

Add desired channels to the list of channels to be excluded

It is possible to store a list of channels that are thought to be corrupted or, more generally, that are considered unfit after closer inspection. This list is then automatically passed on to further ToFu objects (e.g.: for inversions), so that the corresponding data is excluded from all further processing. PreData provides methods to append channel names to this list (in fact you can even exclude whole cameras).

Parameters:
  • Val (list) – Fed to self.select(), list of values for criteria in LCrit that should be used to exclude channels (e.g.: list of channel names or camera names)
  • LCrit (list) – Fed to self.select(), list of criteria against which to select the channels matching the values in Val (should be attributes of tofu.pathfile.ID or of its USRdict attribute)
  • indOut (None / np.ndarray) – Alternatively, you can directly pass a (N,) bool array where N matches the number of channels and True means that a channel should be excluded, thus setting self._indOut
  • Calc (bool) – Flag indicating whether the calculation should be triggered immediately
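
The bookkeeping behind such an exclusion list can be sketched with a boolean mask. This is an illustration only, assuming one flag per channel as described for indOut; the channel names, camera names and variable names below are made up and no ToFu call is involved.

```python
import numpy as np

# Hypothetical channel metadata: one entry per channel
names = np.array(["F_10", "H_021", "H_022", "J_014"])
cams = np.array(["F", "H", "H", "J"])

# Start with no channel excluded (True means "excluded")
ind_out = np.zeros(names.size, dtype=bool)

# Exclude one channel by name and a whole camera by camera name
ind_out |= np.isin(names, ["F_10"])
ind_out |= np.isin(cams, ["J"])

excluded = names[ind_out].tolist()
included = names[~ind_out].tolist()

# Re-including a channel simply clears its flag again
ind_out &= ~np.isin(names, ["F_10"])
```

The included list is just the complement of the excluded one, so a single mask is enough to serve both.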
Out_list(Out='Name')

Return the list of excluded channel names (considered corrupted)

This lists the channels indicated by self._indOut, populated using self.Out_add() and de-populated using self.In_add(). The output can be returned as a list of channel Names

Parameters:Out (str) – Flag indicating in which form to return the output (fed to select())
Returns:L (list) – List of excluded channels in the required form
interp(lt=[], lNames=[], Calc=True)

Perform linear interpolation of data at chosen times for chosen channels

As opposed to self.set_t(), this method should be used to interpolate the data of a small number of channels at a small number of time points. Use it to correct a few time points that are clearly corrupted when you think the rest should be preserved.

!!! This is done with respect to the reference time vector and dataset, to avoid propagating errors through later data treatment (use self.plot(V='Ref') to plot the reference data set) !!!

Parameters:
  • lt (list) – Times at which linear interpolation should be performed
  • lNames (list) –
    Channels for which interpolation should be performed, one element per corresponding time point, elements can be:
    • list of str: list of channel names that should be interpolated for the corresponding time point
    • str: single channel name that should be interpolated for the corresponding time point
    • 'All': all channels should be interpolated for the corresponding time point
  • Calc (bool) – Flag indicating whether data should be updated immediately

Examples

>> obj.interp(lt=[2.55, 5.10, 6.84], lNames=[['H_021','J_014'], 'F_10', 'All'], Calc=True)
Will perform interpolation for 2 channels for the first time point, for one channel for the second, and for all channels for the last time point.
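
The kind of point-wise repair described above can be sketched with NumPy. This is a rough illustration under stated assumptions, not the method's actual code: the helper interp_points is hypothetical, and it assumes that the sample nearest to each requested time is replaced by a linear interpolation between its two neighbours.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 11)            # reference time vector
names = ["F_10", "H_021", "J_014"]
data = np.outer(t, [1.0, 2.0, 3.0])      # (time, channel), linear in time
data[5, 1] = 50.0                        # corrupted sample on 'H_021' at t=0.5


def interp_points(t, data, names, lt, lnames):
    """Replace the sample nearest to each time in lt by a linear
    interpolation between its two neighbours, for the listed channels."""
    out = data.copy()
    for ti, chans in zip(lt, lnames):
        if isinstance(chans, str):
            chans = names if chans == "All" else [chans]
        cols = [names.index(c) for c in chans]
        i = int(np.argmin(np.abs(t - ti)))
        i = min(max(i, 1), t.size - 2)   # need a neighbour on each side
        w = (t[i] - t[i - 1]) / (t[i + 1] - t[i - 1])
        out[i, cols] = (1 - w) * data[i - 1, cols] + w * data[i + 1, cols]
    return out


fixed = interp_points(t, data, names, lt=[0.5], lnames=["H_021"])
```

Only the selected sample changes; the input array and all other channels are left untouched.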
plot(a4=False)

Plot the signal in an interactive window, no arguments needed

Plot an interactive matplotlib window to explore the data

Parameters:a4 (bool) – Flag indicating whether the figure should be the size of an A4 sheet of paper (to facilitate printing)
Returns:Lax (list) – List of plt.Axes on which the plots are made
plot_fft(Val=None, Crit='Name', V='simple', tselect=None, Fselect=None, PreExp=None, PostExp=None, Log='or', InOut='In', SpectNorm=True, DTF=None, RatDef=100.0, Inst=True, MainF=True, ylim=(None, None), cmap=<matplotlib.colors.LinearSegmentedColormap object>, a4=False)

Plot the power spectrum (fft) of the chosen signals

Computes the fft of the data and plots the power spectrum, normalized or not, for the chosen channels

Parameters Val, Crit, PreExp, PostExp, Log and InOut are for channel selection and are fed to select()

Parameters:
  • V (str) – Flag indicating whether the plot should be interactive, values in ['simple','inter']
  • tselect (None /) –
  • Fselect (None /) –
  • SpectNorm (bool) – Flag, if True the power spectrum is normalised to its maximum at each time step (default: True)
  • DTF (float) – Size (in seconds) of the running time window to be used for the windowed fft
  • RatDef (float) – Used if DTF not provided, the number by which the total signal duration is divided to get a time window
  • Inst (bool) – Flag, if True, the average of the signal is subtracted at each time step to emphasize high frequencies (higher than the one associated with the running time window, default: True)
  • MainF (bool) – Flag
  • ylim (tuple) – Each limit which is not None is fed to plt.Axes.set_ylim()
  • a4 (bool) – Flag, if True the figure is sized so as to fill an A4 sheet of paper
Returns:

Lax (list) – List of plt.Axes on which the plots were made
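
The windowed power spectrum described by DTF, Inst and SpectNorm can be sketched as follows. This is a simplified illustration, not the code behind plot_fft(): the function windowed_power is hypothetical, and it uses consecutive non-overlapping windows where the method describes a running window.

```python
import numpy as np


def windowed_power(t, sig, dtf, spect_norm=True, inst=True):
    """Power spectrum of sig in consecutive windows of duration dtf (s)."""
    dt = t[1] - t[0]
    nwin = int(round(dtf / dt))              # samples per window
    nseg = sig.size // nwin                  # number of complete windows
    seg = sig[: nseg * nwin].reshape(nseg, nwin)
    if inst:
        # Remove the window average to emphasise the fast fluctuations
        seg = seg - seg.mean(axis=1, keepdims=True)
    power = np.abs(np.fft.rfft(seg, axis=1)) ** 2
    if spect_norm:
        # Normalise each window to its own maximum
        power = power / power.max(axis=1, keepdims=True)
    freq = np.fft.rfftfreq(nwin, d=dt)
    return freq, power


t = np.arange(10000) * 1e-4                  # 10 kHz sampling, 1 s
sig = 2.0 + np.sin(2 * np.pi * 500.0 * t)    # offset + 500 Hz oscillation
freq, power = windowed_power(t, sig, dtf=0.01)
```

With a 0.01 s window the frequency resolution is 100 Hz, which shows the trade-off controlled by DTF: shorter windows follow fast changes better but resolve frequencies more coarsely.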

plot_svd(Modes=10, NRef=None, a4=False, Test=True)

Plot the chosen modes (topos and chronos) of the svd of the data, and the associated spectrum on a separate figure

Performs a svd of the data and plots the singular values, the temporal and spatial modes

Modes
: int / iterable
Index of the modes to be plotted, the modes are sorted in decreasing order of singular value
  • int : plots all modes in range(0,Modes)
  • iterable : plots all modes whose index is contained in Modes
NRef
: None
Number of columns in the plot, if None set to len(Modes)/2 (i.e.: 2 modes plotted per axes)
a4
: bool
Flag indicating whether the figure should be the size of an A4 sheet of paper (to facilitate printing)
Test
: bool
Flag indicating whether the inputs should be tested for conformity
Returns:Lax (list) – List of plt.Axes on which the plots were made
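
For readers unfamiliar with the topos / chronos vocabulary, the decomposition can be sketched with NumPy on synthetic data. The array layout (time along the first axis, channels along the second) is an assumption made for this illustration and the variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
x = np.linspace(-1.0, 1.0, 12)               # stand-in for channel position

# Synthetic (time, channel) data: two separable structures plus weak noise
clean = (np.outer(np.sin(2 * np.pi * 3 * t), np.exp(-x**2))
         + 0.3 * np.outer(np.cos(2 * np.pi * 7 * t), x))
noise = 0.01 * rng.standard_normal(clean.shape)
data = clean + noise

# Columns of U are the temporal modes (chronos), rows of Vt the spatial
# modes (topos); s holds the singular values in decreasing order
U, s, Vt = np.linalg.svd(data, full_matrices=False)
chronos, topos = U, Vt

# Rebuild the data from the first two modes only
k = 2
approx = (chronos[:, :k] * s[:k]) @ topos[:k, :]
```

Here two modes are enough to recover the signal, which is what a sharp drop in the singular value spectrum after the second value would indicate.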
save(SaveName=None, Path=None, Mode='npz', compressed=False)

Save the object in folder Path, under file name SaveName, using specified mode

Most tofu objects can be saved automatically as numpy arrays (.npz, recommended) at the default location (recommended) by simply calling self.save()

Parameters:
  • SaveName (None / str) – The name to be used for the saved file, if None (recommended) uses self.Id.SaveName
  • Path (None / str) – Path specifying where to save the file, if None (recommended) uses self.Id.SavePath
  • Mode (str) – Flag specifying whether to save the object as a numpy array file ('.npz', recommended) or an object using cPickle (not recommended, heavier and may cause retro-compatibility issues)
  • compressed (bool) – Flag, used when Mode='npz', indicating whether to use np.savez or np.savez_compressed (slower saving and loading but smaller files)
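
The trade-off controlled by compressed can be checked directly with the two NumPy functions named above. The sketch below only exercises np.savez and np.savez_compressed on made-up arrays written to a temporary folder; it does not reproduce what save() stores.

```python
import os
import tempfile

import numpy as np

data = np.zeros((1000, 50))                  # highly compressible dummy data
t = np.linspace(0.0, 1.0, 1000)

with tempfile.TemporaryDirectory() as path:
    plain = os.path.join(path, "demo_plain.npz")
    packed = os.path.join(path, "demo_packed.npz")
    np.savez(plain, data=data, t=t)
    np.savez_compressed(packed, data=data, t=t)
    size_plain = os.path.getsize(plain)
    size_packed = os.path.getsize(packed)
    # Loading is identical in both cases
    with np.load(packed) as f:
        t_back = f["t"]
```

How much is gained depends on the data: real, noisy signals usually compress far less than the array of zeros used here.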
select(Val=None, Crit='Name', PreExp=None, PostExp=None, Log='any', InOut='In', Out=<type 'bool'>, ToIn=False)

Return a sub-set of the data (channels-wise selection)

Return an array of indices of the channels selected according to the chosen criteria with the chosen values. Use either Val or (PreExp and PostExp).

Parameters:
  • Val (list or str) – List of values that the chosen criteria must match (converted to one-item list if str)
  • Crit (str) – Criterion used to select some channels, must be among their tfpf.ID class attributes (e.g.: 'Name', 'SaveName'...) or IFTF.ID.USRdict ('Cam',...)
  • PreExp (list or str) – List of str expressions to be fed to eval(PreExp[ii]+" Detect.Crit "+PostExp[ii]) or eval(PreExp[ii]+" Detect.USRdict.Crit "+PostExp[ii])
  • PostExp (list or str) – List of str expressions to be fed to eval(PreExp[ii]+" Detect.Crit "+PostExp[ii]) or eval(PreExp[ii]+" Detect.USRdict.Crit "+PostExp[ii])
  • Log (str) – Flag ('any' or 'and') indicating whether to select the channels matching any of the criteria or all of them
  • InOut (str) – Flag ('In' or 'Out') indicating whether to select all channels matching the criterion, or all except those
  • Out (type or str) – Flag (bool, int or an attribute of tfpf.ID or tfpf.ID.USRdict) indicating whether to return an array of boolean indices or int indices, or a list of the chosen attributes (e.g.: 'Name')
  • ToIn (bool) – Flag indicating whether indices should be returned with respect to the channels that are considered as included only (see obj.In_list() to see these channels)
Returns:

ind (np.ndarray) – Indices of the selected channels, as a bool or int array

Examples

>> ind = TFT.PreData.select(Val=['H','J'], Crit='Cam', Log='any', InOut='In', Out=bool)
Will return a bool array of the indices of all channels for which 'Cam' is 'H' or 'J'.
>> ind = PreData.select(Crit='Name', PreExp=["'F' in ", "'6' in "], Log='and', InOut='In', Out=int)
Will return an int array of indices of all channels for which 'F' and '6' are both included in the name.
>> ind = PreData.select(Crit='CamHead', PreExp=["'F' in ", "'2' in "], Log='any', InOut='Out', Out='Name')
Will return the names (as a list) of all channels except those that have a camera head name that includes an 'F' or a '2' (i.e.: except camera heads 'F' and 'H2', 'I2', 'J2', 'K2').
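
The selection logic (any / all matching, In / Out, and the form of the output) can be mimicked in a few lines. This is a toy re-implementation for illustration only: the function below is hypothetical, the channel attributes are made up, and it takes Python callables where the real method builds expressions from PreExp and PostExp.

```python
import numpy as np

# Hypothetical channel attributes
channels = [
    {"Name": "F_10", "Cam": "F"},
    {"Name": "F_16", "Cam": "F"},
    {"Name": "H_021", "Cam": "H"},
    {"Name": "J_014", "Cam": "J"},
]


def select(channels, crit, tests, log="any", in_out="In", out=bool):
    """tests is a list of callables applied to the attribute named crit."""
    hits = np.array([[test(ch[crit]) for test in tests] for ch in channels])
    ind = hits.any(axis=1) if log == "any" else hits.all(axis=1)
    if in_out == "Out":
        ind = ~ind                           # keep everything except matches
    if out is bool:
        return ind
    if out is int:
        return ind.nonzero()[0]
    return [ch[out] for ch, keep in zip(channels, ind) if keep]


by_cam = select(channels, "Cam", [lambda v: v in ("H", "J")])
f_and_6 = select(channels, "Name", [lambda v: "F" in v, lambda v: "6" in v],
                 log="and", out=int)
not_f = select(channels, "Cam", [lambda v: v == "F"], in_out="Out", out="Name")
```

Callables are used here instead of strings passed to eval() only to keep the sketch short and safe to run.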
set_Dt(Dt=None, Calc=True)

Set the time interval to which the data should be limited (does not affect the reference data)

While the original data set and time base are always preserved in the background, you can change your mind and focus on a smaller interval included in the original one. This can be convenient for applying data treatment (SVD, fft...) to parts of the signal lifetime only.

Parameters:
  • Dt (None / list) – The time interval of interest, as a list of len()=2 in increasing values
  • Calc (bool) – Flag indicating whether the calculation should be triggered immediately
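
Restricting the working data to a time interval while keeping the reference untouched amounts to a boolean mask on the time vector. A minimal sketch, with made-up arrays and variable names:

```python
import numpy as np

t_ref = np.linspace(0.0, 10.0, 101)                   # reference time vector
data_ref = np.sin(t_ref)[:, None] * np.ones((1, 4))   # (time, channel)

Dt = [2.0, 5.0]                                       # interval of interest
ind_t = (t_ref >= Dt[0]) & (t_ref <= Dt[1])
t, data = t_ref[ind_t], data_ref[ind_t, :]            # boolean indexing copies
```

Since boolean indexing returns copies, any later processing of t and data leaves t_ref and data_ref as they were.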
set_PhysNoise(Mode='svd', Phys=[0, 1, 2, 3, 4, 5, 6, 7], DF=[10000.0, 11000.0], DFEx=None, Harm=True, HarmEx=True, Deg=0, Nbin=3, LimRatio=0.05, Plot=False)

Use a svd or a fft to estimate the physical part of the signal and the part which can be treated as noise, then use the specified degree for the polynomial noise model

This method provides an easy way to compute the noise level on each channel. It can be done in 2 different ways:

  • ‘svd’: you have to provide the mode numbers that you think can be considered as physical, the signal will be re-constructed from these and the rest discarded as noise
  • 'fft': you have to provide the frequency window that you think is physical (optionally the higher harmonics can be included), the signal is re-constructed via inverse Fourier transform and the rest discarded as noise

To help you decide which mode numbers or frequency interval to use, you can first use self.plot_svd() and self.plot_fft() to visualize the decompositions.

Note : this is only used to compute a noise estimate, stored separately, the total original signal is preserved

Parameters:
  • Mode (str) – Flag indicating with which method the noise should be estimated ('svd' or 'fft')
  • Phys (list) – Modes to be extracted from the svd (default: first 8 modes), use method .plot_svd() to choose the modes
  • DF (list) – 2 values delimiting a frequency interval (in Hz) from which to extract signal using a fft and rfft
  • Harm (bool) – Flag, if True all the available higher harmonics of DF will also be included in the physical signal
  • DFEx (list) – 2 values delimiting a frequency interval (in Hz) that shall be avoided in the physical signal (relevant if some high harmonics of DF intersect DFEx)
  • HarmEx (bool) – Flag, if True all the available higher harmonics of DFEx will also be avoided in the physical signal
  • Deg (int) – Degree to be used for the polynomial noise model
  • Nbin (int) – Number of bins to be used for evaluating the noise (std) at various signal values
  • LimRatio (float) – Ratio ... to be finished...
  • Plot (bool) – Flag, if True the histogram of the estimated noise is plotted

Examples

>> obj.set_PhysNoise(Mode='svd', Phys=[0,1,2,3,4,5], Deg=0)
Will take the first 6 modes of the signal svd and consider them physical, the rest is used to compute a constant (Deg=0) noise estimate on each channel.
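
The 'svd' route can be sketched with NumPy on synthetic data. This is a simplified illustration under stated assumptions, not the method's code: it keeps a single mode as physical, uses a (time, channel) layout, and takes the per-channel standard deviation of the remainder as the degree-0 noise estimate (the Nbin and LimRatio logic is not reproduced).

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 500)
clean = np.outer(np.sin(2 * np.pi * 5 * t), np.linspace(1.0, 2.0, 8))
data = clean + 0.05 * rng.standard_normal(clean.shape)   # (time, channel)

phys_modes = [0]                 # modes assumed to carry the physical signal
U, s, Vt = np.linalg.svd(data, full_matrices=False)
phys = (U[:, phys_modes] * s[phys_modes]) @ Vt[phys_modes, :]
noise = data - phys              # everything else is treated as noise

# Degree-0 (constant) noise model: one standard deviation per channel
sigma = noise.std(axis=0)
```

The estimate comes out close to the 0.05 noise level injected here, slightly below it because the retained mode also absorbs a small part of the noise.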
set_Resamp(t=None, f=None, Method='movavrg', interpkind='linear', Calc=True)

Re-sample the data and time vector

Use a new time vector that can either be:
  • provided directly (if t is not None)
  • computed from an input sampling frequency (if f is not None)

If both t and f are provided, t is used as the time vector and f is only used for the moving average

Then, the data is re-computed on this new time vector using either interpolation ('interp') or moving average ('movavrg')

Parameters:
  • t (None / np.ndarray) –
  • f (None / int / float) –
  • Method (str) –
  • Resamp (bool) –
  • interpkind (str) –
  • Calc (bool) – Flag indicating whether the calculation should be triggered immediately
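
The two re-sampling strategies can be compared on a toy signal. This is a sketch under assumptions, not the method's implementation: the new time vector is built from the sampling frequency f_new, and the moving average is taken over a window of width 1/f_new centred on each new time point.

```python
import numpy as np

t = np.arange(1000) * 1e-3                   # original time vector, 1 kHz
data = np.column_stack([t, 2.0 * t])         # (time, channel), linear ramps

f_new = 100.0                                # target sampling frequency (Hz)
t_new = np.arange(t[0], t[-1], 1.0 / f_new)

# 'interp': evaluate each channel on the new time vector
interp = np.column_stack([np.interp(t_new, t, data[:, i])
                          for i in range(data.shape[1])])

# 'movavrg': average the samples within 1/(2*f_new) of each new time point
half = 0.5 / f_new
movavrg = np.array([data[np.abs(t - ti) <= half].mean(axis=0) for ti in t_new])
```

Interpolation keeps individual samples as they are, whereas the moving average also smooths out fluctuations faster than the new sampling frequency.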
set_fft(DF=None, Harm=True, DFEx=None, HarmEx=True, Calc=True)

Return the FFT-filtered signal (and the rest) in the chosen frequency window (in Hz) and in all the higher harmonics (optional)

Can also exclude a given interval and its higher harmonics from the filtering (optional)

Parameters:
  • DF (iterable) – Iterable of len()=2, containing the lower and upper bounds of the frequency interval (Hz) to be used for filtering
  • Harm (bool) – If True all the higher harmonics of the interval DF will also be included
  • DFEx (list) – List or tuple of len()=2, containing the lower and upper bounds of the frequency interval to be excluded from filtering (in case it overlaps with some high harmonics of DF)
  • HarmEx (bool) – If True all the higher harmonics of the interval DFEx will also be excluded
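
Filtering in a frequency window and its harmonics can be sketched with NumPy's real FFT. This is an illustration under assumptions rather than the method's code: the function fft_filter is hypothetical, harmonic number n of the window [f1, f2] is taken to be [n*f1, n*f2], and the exclusion window DFEx is left out for brevity.

```python
import numpy as np


def fft_filter(t, sig, df, harm=True):
    """Split sig into the part inside the band df (Hz), optionally with its
    higher harmonics, and the rest. Assumes df[0] > 0."""
    freq = np.fft.rfftfreq(sig.size, d=t[1] - t[0])
    keep = np.zeros(freq.size, dtype=bool)
    nmax = int(freq[-1] // df[0]) if harm else 1
    for n in range(1, nmax + 1):
        keep |= (freq >= n * df[0]) & (freq <= n * df[1])
    spec = np.fft.rfft(sig)
    inside = np.fft.irfft(np.where(keep, spec, 0.0), n=sig.size)
    return inside, sig - inside


t = np.arange(2000) * 1e-3                       # 1 kHz sampling, 2 s
sig = (np.sin(2 * np.pi * 10 * t)                # fundamental at 10 Hz
       + 0.5 * np.sin(2 * np.pi * 20 * t)        # second harmonic
       + 0.2 * np.sin(2 * np.pi * 15 * t))       # unrelated component
inside, rest = fft_filter(t, sig, df=[9.0, 11.0], harm=True)
```

With this definition the harmonic windows widen with n and eventually overlap, which is one reason an exclusion window such as DFEx can be useful.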
substract_Dt(tsub=None, Calc=True)

Allows subtraction of data at one time step from all data

Can be convenient for plotting background-subtracted signal (background meaning signal before a reference time step).

Parameters:
  • tsub (int / float / iterable) –
    A time value, or a time interval indicating which part of the signal is to be considered as reference and subtracted from the rest
    • int / float :
  • Calc (bool) – Flag indicating whether data should be updated immediately
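
Background subtraction of this kind can be sketched as follows. The behaviour for an interval is not spelled out above, so this sketch assumes that the mean over the interval is used as reference; the helper subtract_reference is hypothetical.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 101)
data = 3.0 + np.outer(t, [1.0, 2.0])         # (time, channel) with an offset


def subtract_reference(t, data, tsub):
    """Subtract the data at time tsub (scalar), or its mean over the
    interval tsub (two values, an assumption), from every time step."""
    if np.isscalar(tsub):
        ref = data[np.argmin(np.abs(t - tsub)), :]
    else:
        ind = (t >= tsub[0]) & (t <= tsub[1])
        ref = data[ind, :].mean(axis=0)
    return data - ref


at_start = subtract_reference(t, data, 0.0)
over_window = subtract_reference(t, data, [0.0, 0.1])
```

Averaging over an interval gives a reference that is less sensitive to noise than a single time step.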
