Pyplatefit tutorial

Pyplatefit tutorial

Roland Bacon

Introduction

Pyplatefit is a python module to perform emission line fitting of astronomical spectra. Although it has been developed for the MUSE integral field spectrograph, it should work for spectrum delivered by other instruments, provided that the input spectrum is in MPDAF format.

The program take as input a spectrum and an input redshift and return the list of emission lines parameters: e.g. flux, velocity dispersion, equivalent width and their errors. The program perform a continuum fit, using a simple population model and then fit the emission lines after continuum subtraction. In addition the program can also perform basic absorption lines fit.

Lines are organized into families. All lines from the same family share the same velocity (or redshift) and velocity dispersion. In addition to the two Balmer and Forbidden families, resonnant lines like \({\rm Ly\alpha}\) or \({\rm Mg_{II}}\) are fitted separately as distinct families. Absorption lines are also processed as a separate family. Each family is fitted independantly.

All emission (and absorption) lines are modelled as Gaussians with 3 parameters: velocity, velocity dispersion (in \({\rm km s^{-1}}\)) and flux. The only exception, is the \({\rm Ly\alpha}\) line which is fitted as a skewed Gaussian, with an additional \({\rm \gamma}\) parameter measing the asymetry of the line.

\[{\rm F(\lambda) = F_{max} \left[ 1 + erf \left( \gamma \frac{\left(\lambda - \lambda_{0}\right)}{\sqrt 2 \sigma} \right) \right] \left[ exp \left( - \frac{\left(\lambda - \lambda_{0}\right)^2}{2 \sigma^2} \right) \right] }\]

Optionnaly, double peaked \({\rm Ly\alpha}\) line can be fitted as the sum of two skewed Gaussians.

The program use the front-end lmfit minimization package based on scipy optimize non-linear least-squares minimization. The default minimization routine is the Trust Region Reflective least squares algorithm, which is more robust and gives better estimate error estimate than the classical Levenberg–Marquardt algorithm. Other minimization algorithm can be selected, see lmfit documentation for detail. The MCMC emcee package is also used to improve error computation.

Importing modules

[1]:
%matplotlib inline
import matplotlib.pyplot as plt
import os
import mpdaf, pyplatefit
from mpdaf.obj import Spectrum
from pyplatefit import fit_spec, plot_fit, print_res
print(f"MPDAF version {mpdaf.__version__}")
print(f"Pyplatefit version {pyplatefit.__version__}")
MPDAF version 3.6.dev21+g8acab809
Pyplatefit version 0.8.dev0+g02018a5.d20220511

Basic usage

Input data

The input spectrum must be in MPDAF format, ie with proper wcs information and data and variance extensions. See MPDAF documentation for more detailed information. The input redshift z (in vacuum) must be precise enough. Pyplatefit will refine it but cannot deal with poorly known redshift.

[2]:
datadir = '../tests/test_data'
sp = Spectrum(os.path.join(datadir, 'udf10_00002.fits'))
sp.info()
[INFO] 3681 Spectrum (../tests/test_data/udf10_00002.fits)
[INFO] .data(3681) (1e-20 erg / (Angstrom cm2 s)), .var(3681)
[INFO] wavelength: min:4750.00 max:9350.00 step:1.25 Angstrom

Running the minimisation

fit_spec is the main routine, see later for the meaning of all input parameters.

[3]:
res = fit_spec(sp, z=0.41892)
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 21.3 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Found 2 non resonnant line families to fit
[DEBUG] Init Fit of family balmer
[DEBUG] Found 9 lines to fit
[DEBUG] added 9 gaussian to the fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 13 lines to fit
[DEBUG] added 13 gaussian to the fit
[DEBUG] Found 0 resonnant line families to fit
[DEBUG] Fitting of Line Family: balmer
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 7 iterations, reached minimum = 192511.056 and redChi2 = 249.367
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 9 iterations, reached minimum = 196213.255 and redChi2 = 255.486
[DEBUG] Saving balmer results to tablines and ztab
[DEBUG] Saving forbidden results to tablines and ztab

By default the logging is in DEBUG mode, but it is possible to run it in quiet mode, using the python logging module.

[4]:
import logging
logger = logging.getLogger('pyplatefit')
logger.setLevel('INFO')
res = fit_spec(sp, z=0.41892)
logger.setLevel('DEBUG')

Analysis of result

The program return a python dictionary with a following items:

  • ztable: the redshift table (astropy format)

  • lines: the lines table (astropy format)

  • spec_fit: the fitted spectrum (MPDAF format)

Note that the resulting dictionary include many other items. We describe here only the ones relevant for this basic usage. More is given later in the advanced usage section.

The ztable contain global information for each fitted family.

[5]:
res['ztable']
[5]:
Table length=2
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
balmer82.133.860.419311.83e-050.4189266.114.26HBETA20.6213.2812.9895249.3773least_squares
forbidden92.495.290.419362.50e-050.4189266.656.07OII3727b17.2616.8620.72125255.4993least_squares

In this case the fit converge to similar velocity (VEL) and velocity dispersion (VDISP) for the balmer and forbidden lines. The highest SNRMAX and corresponding LINE are reported. The number of fitted lines before (NL) and after SNR>3 clipping (NL_CLIPPED) are also given. An relevant parameter is the SNR sum of the clipped lines (SNRSUM_CLIPPED). The convergence status (STATUS), number of function evaluation (NFEV), reduced \(\chi^2\) (RCHI2) and name of the algorithm (METHOD) are also given. Codification of convergence status is specific for each method, for the default least_squares (Trust Reflective Least Square) method see the scipy documentation. When convergence is achieved we expect to get status=3, and status=0 if the algorithm did not converge in the maximum of iterations.

The detailed information for each fitted lines is given in the lines table. In the following we list the five highest SNR emission lines, excluding the blended lines (e.g OII3727b = OII3726+OII3729).

[6]:
lines = res['lines'].copy()
lines = lines[~lines['ISBLEND']]
lines.sort('SNR')
lines[::-1][0:5]
[6]:
Table length=5
FAMILYLINEISBLENDLBDA_RESTDNAMEVELVEL_ERRZZ_ERRZ_INITVDINSTVDISPVDISP_ERRFLUXFLUX_ERRSNRSKEWSKEW_ERRLBDA_OBSPEAK_OBSLBDA_LEFTLBDA_RIGHTFWHM_OBSNSTDLBDA_LNSTDLBDA_RNSTDBLENDEQWEQW_ERRCONT_OBSCONTCONT_ERR
str20str20boolfloat64str20float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64int64float64float64float64float64float64
balmerHBETAFalse4862.6882.133.860.419311.83e-050.4189242.9366.114.268567.47415.4320.620.000.006899.741884.016897.606901.874.27-2.196893.336906.150-7.860.40768.171089.9830.89
forbiddenOII3729False3729.88[Oɪɪ]92.495.290.419362.50e-050.4189261.9766.656.076065.85435.2413.940.000.005292.561506.245290.675294.453.78-2.445282.935298.233727-7.200.55593.92842.7342.22
forbiddenNII6584False6585.28None92.495.290.419362.50e-050.4189233.1966.656.0711604.351017.5211.400.000.009344.301994.899341.579347.035.47-1.799336.109352.500-11.651.45701.99996.06292.68
balmerHGAMMAFalse4341.6882.133.860.419311.83e-050.4189249.8066.114.263647.94348.9210.450.000.006160.48855.676158.476162.484.01-1.856154.476166.490-3.280.32784.021112.4636.08
forbiddenOII3726False3727.09None92.495.290.419362.50e-050.4189262.0466.656.074340.88426.0910.190.000.005288.601078.165286.715290.493.78-2.445282.935298.233727-5.190.53589.73836.7841.88

The detailed explanation of all columns is given in appendix. Note that parameters (unless specfified by OBS in the column name) are given in rest frame. Velocity dispersion (VDISP) is corrected for MUSE instrumental resolution.

A visual check can be performed with the plot_fit routine.

[7]:
fig,ax = plt.subplots(1,3,figsize=(20,5))
plot_fit(ax[0], res)
plot_fit(ax[1], res, line='OII3729')
plot_fit(ax[2], res, line='OII3729', line_only=True, margin=20)
_images/tutorial_17_0.png

The left panel display the full spectrum (in black) and its continuum fit (in blue) with the location of fitted lines (in red). The central panel display a zoom on the [OII] doublet and the right panel the continuum subtracted spectrum.

Improving continuum fit

From the central panel of the previous plot, one can see that the continuum fit (in blue) is slightly offseted with respect to the data. The reason is that the continuum estimation performed in the first step of the fitting process use a fixed value of z, as given by the input redshift. In that case it is wise to use ziter=True in fit_spec to allow a second iteration.

[8]:
logger = logging.getLogger('pyplatefit')
logger.setLevel('INFO')
res2 = fit_spec(sp, z=0.41892, ziter=True)
logger.setLevel('DEBUG')
fig,ax = plt.subplots(1,2,figsize=(15,5))
plot_fit(ax[0], res, line='HDELTA', margin=30, infoline=['FLUX','SNR'])
plot_fit(ax[1], res2, line='HDELTA', margin=30, infoline=['FLUX','SNR'])
_images/tutorial_20_0.png

Fitting all lines together

In some case, it is better to fit all lines together rather than splitting the fit by families. This is done with the fit_all=True flag.

Note that \({\rm Ly\alpha}\), given its systematic offset with respect to the systemic velocity, is still fitted separatly.

[9]:
logger.setLevel('INFO')
res = fit_spec(sp, z=0.41892, ziter=True, fit_all=True)
logger.setLevel('DEBUG')
res['ztable']
[9]:
Table length=1
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
all3.540.710.419343.38e-060.4193364.550.80HBETA90.4775.4188.28211614.4883least_squares

Working with \({\rm Ly\alpha}\)

The \({\rm Ly\alpha}\) line is often the only line available at high z. It is generally broad and asymmetric and then it is better fitted with the skewed gaussian model.

[10]:
sp = Spectrum(os.path.join(datadir,'udf10_00723.fits'))
res = fit_spec(sp, z=3.18817)
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 16.7 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] added 1 asymetric gaussian to the fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 14 lines to fit
[DEBUG] added 14 gaussian to the fit
[DEBUG] Found 1 resonnant line families to fit
[DEBUG] Init fitting of family civ1548
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 11 iterations, reached minimum = 188.100 and redChi2 = 0.308
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] Both `ftol` and `xtol` termination conditions are satisfied. after 16 iterations, reached minimum = 209.310 and redChi2 = 0.350
[DEBUG] Fitting of Line Family: civ1548
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 14 iterations, reached minimum = 212.787 and redChi2 = 0.349
[DEBUG] Saving lya results to tablines and ztab
[DEBUG] Saving forbidden results to tablines and ztab
[DEBUG] Saving civ1548 results to tablines and ztab

In the output debug log, one can see that the continuum fit failed. This is the case when the cotniuum is too faint, then a constant is used (median of the continuum).

We also see that in addition to forbidden lines, \({\rm Ly\alpha}\) and the resonnant CIV doublet were fitted separately. However when looking to the ztable, we see that all lines except \({\rm Ly\alpha}\) are below the detection limit.

[11]:
res['ztable']
[11]:
Table length=3
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
civ1548120.5950.633.189857.07e-043.1881750.0568.64CIV1549b1.541.540.00100.35143least_squares
forbidden-121.11151.503.186482.12e-033.18817300.00124.47SiIV1403b1.562.080.00800.35164least_squares
lyalpha37.0321.983.188693.07e-043.18817263.9649.61LYALPHA7.317.317.31110.31113least_squares

As shown in the following plot, the \({\rm Ly\alpha}\) line is well fitted by a skewed Gaussian with \(\gamma\) positive (SKEW = 8.4), confirming the strong red asymetry of the line profile.

[12]:
fig,ax = plt.subplots(1,2,figsize=(15,5))
plot_fit(ax[0], res)
plot_fit(ax[1], res, line='LYALPHA', infoline=['SNR','SKEW'])
_images/tutorial_28_0.png

When the initial redshift is poorly known the fit might not converge to the right solution. One can check the provided initial solution with the start=True option in the plot_line. As shown in the left panel of the above plot, the initial solution (in blue) was too far from the left peak to converge to the solution.

In that case it is possible to use the option flag find_lya_vel_offset=True in fit_spec to perform an initail peak search prior to the fit. As it can be seen from the right panel, this help the process to converge to the solution.

However, this option must be used with care as it can easily pick another peaked as initial solution.

[13]:
logger.setLevel('INFO')
res = fit_spec(sp, z=3.22)
res2 = fit_spec(sp, z=3.22, find_lya_vel_offset=True)
fig,ax = plt.subplots(1,2,figsize=(15,5),sharey=True)
plot_fit(ax[0], res, line='LYALPHA', start=True, line_only=True)
plot_fit(ax[1], res2, line='LYALPHA', start=True, line_only=True)
_images/tutorial_30_0.png

Double peaked \({\rm Ly\alpha}\) line profile

In some case, the \({\rm Ly\alpha}\) line profile can display a blue bump or even a double peak. In that case we can use the option flag dble_lyafit=True to perform the double skewed gaussain fit.

[14]:
sp = Spectrum(os.path.join(datadir,'udf10_00106.fits'))
res = fit_spec(sp, z=3.27554, dble_lyafit=True)
[15]:
fig,ax = plt.subplots(1,1,figsize=(7,5),sharey=True)
plot_fit(ax, res, line='LYALPHA1')
_images/tutorial_33_0.png

The two line peaks are fitted together with two skewed gaussian. The two profiles are names LYALPHA1 and LYALPHA2 for the blue and red components respectively. The paramaters of the fit are the velocity (or redshift) measured as the center of the peak separation (SEP), the flux, velocity dispersion and skewness of each peak. Note that the redshit given by this fit is the center of the two peaks and not the peak of the line as it is for a single line fit.

[16]:
lines = res['lines'].copy()
lines = lines[~lines['ISBLEND']]
lines[lines['FAMILY']=='lyalpha']['LINE','VEL','Z','SEP','VDISP','FLUX','SKEW']
[16]:
Table length=2
LINEVELZSEPVDISPFLUXSKEW
str20float64float64float64float64float64float64
LYALPHA134.183.27603515.83194.33680.34-2.78
LYALPHA234.183.27603515.83307.581080.474.04

Working with doublets and blends

Many lines are double (or more) which can be marginally resolved at the instrumental spectral resolution. For these lines, the fit report the parameters of each lines but also for the blend, adding a b at the end of the line name (eg OII3727b). The returned table has a BLEND column flag which allow to filter in/out these lines.

[17]:
sp = Spectrum(os.path.join(datadir,'mosaic_01102.fits'))
res1 = fit_spec(sp, z=0.6198)
lines = res1['lines'].copy()
lines[lines['ISBLEND']]
[17]:
Table length=1
FAMILYLINEISBLENDLBDA_RESTDNAMEVELVEL_ERRZZ_ERRZ_INITVDINSTVDISPVDISP_ERRFLUXFLUX_ERRSNRSKEWSKEW_ERRLBDA_OBSPEAK_OBSLBDA_LEFTLBDA_RIGHTFWHM_OBSNSTDLBDA_LNSTDLBDA_RNSTDBLENDEQWEQW_ERRCONT_OBSCONTCONT_ERR
str20str20boolfloat64str20float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64int64float64float64float64float64float64
forbiddenOII3727bTrue3728.48[Oɪɪ]5.9811.070.619835.98e-050.619800.0037.4015.08165.3027.755.960.000.006037.8532.016034.086041.613.010.000.000.003727-10.512.489.7115.7312.24

In some case we may also want to constrain the line ratios to avoid unphysical fit. This is activated with the option flag use_line_ratios=True.

By default the following line ratios are defined:

\({\rm \left( 0.3 < \frac{OII_{3729}}{OII_{3726}} < 1.5 \right)}\) and \({\rm \left( 0.6 < \frac{CIII_{1909}}{CIII_{1907}} < 1.2 \right)}\)

We will see later how to define new lines ratio.

[18]:
res2 = fit_spec(sp, z=0.6198, use_line_ratios=True)
for res in [res1,res2]:
    lines = res['lines'].copy()
    r1 = lines[lines['LINE']=='OII3726'][0]
    r2 = lines[lines['LINE']=='OII3729'][0]
    r21 = r2['FLUX']/r1['FLUX']
    print('ratio: ',r21)
ratio:  1.6316695024665604
ratio:  1.4999997928730038
[19]:
fig,ax = plt.subplots(1,2,figsize=(12,5), sharey=True)
plot_fit(ax[0], res1, line='OII3726', line_only=True, margin=20)
plot_fit(ax[1], res2, line='OII3726', line_only=True, margin=20)
_images/tutorial_40_0.png

Fitting absorption lines

The program offer an option to fit absorption lines. This is activated with the flag fitabs=True. The absorption lines fit is performed in a third step, after the emission line fit. The emission lines are subtracted from the original spectrum, a polynomial is performed to subtract the low order continuum and then all absorption lines are fitted simultaneously similarely to the emission line fit.

Note that, in the case of strong emission lines, the lines residuals left by the emission line fitting, together with the accuracy of the polynomial continuum fit, may lead to inaccurate results in the absorption lines fit.

[20]:
sp = Spectrum(os.path.join(datadir,'DR2_001028.fits'))
res = fit_spec(sp, z=1.90578, fitabs=True)
[21]:
res['ztable']
[21]:
Table length=3
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
forbidden-52.8434.801.905273.37e-041.9057888.0231.01CII2326b2.784.210.00801.26183least_squares
mgii2796279.7617.561.908491.70e-041.9057856.5121.07MgII2799b4.724.724.72111.29273least_squares
abs-16.325.771.905625.60e-051.90578116.566.00MgII2799b16.5023.9728.581080.7463least_squares
[22]:
fig,ax = plt.subplots(1,2,figsize=(12,5))
plot_fit(ax[0], res)
plot_fit(ax[1], res, line='FeII2374', abs_line=True)
_images/tutorial_44_0.png
[23]:
fig,ax = plt.subplots(1,3,figsize=(12,5))
plot_fit(ax[0], res, line='MgII2796', legend=False)
plot_fit(ax[1], res, line='MgII2796', line_only=True, legend=False)
plot_fit(ax[2], res, line='MgII2796', abs_line=True, legend=False)
_images/tutorial_45_0.png

One can see in these last plots that both MgII absorptions and the resonnant MgII emission doublet are raisonably well fitted.

For passive galaxy with no emission line, one can also skip the emission line fitting by setting the fitlines=False flag.

[24]:
res = fit_spec(sp, z=1.90578, fitabs=True, fitlines=False)
res['ztable']
[24]:
Table length=1
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
abs-21.175.631.905575.45e-051.90578108.825.93MgII2799b15.7623.2627.751080.7973least_squares

Fitting accuracy

To judge the fit accuracy pyplatefit provide two indicators:

The reduced chi square value for the simultaneous fit of all lines for each family. This value is given by the lmfit least square routine.

To assess the goodness of the fit for each individual lines, the parameter NSTD is returned in the lines table. NSTD is the log10 of the standard deviation of the residuals after normalisation of the flux by the total model flux. Computation is performed in a window centered on the line peak with a width relative to the FWHM of the line. The window wavelengths limits are saved in the table as LBDA_LNSTD and LBDA_RNSTD. The smaller NSTD, the better. Typical good line fit have NSTD lower than -2.

[25]:
sp = Spectrum(os.path.join(datadir,'udf10_00002.fits'))
res1 = fit_spec(sp, z=0.41892)
res2 = fit_spec(sp, z=0.41892, linepars=dict(vdisp=(100,150,300)))
[26]:
fig,ax = plt.subplots(1,2,figsize=(12,5), sharey=True)
kw = dict(line='HBETA', line_only=True, margin=20, infoline=['NSTD'], infoz=['RCHI2'], legend=False)
plot_fit(ax[0], res1, **kw)
plot_fit(ax[1], res2, **kw)
_images/tutorial_50_0.png

Improving error estimate using MCMC

Although the lmfit least square routine provide in general a reasonnable estimation of errors, it may give wrong results when the line is very noisy or when model parameters are strongly correlated. This is for example the case for the double \({\rm Ly\alpha}\) line fit.

In that case, pyplatefit has an option to evaluate the errors using Markov Chain Monte Carlo using the emcee python module, through the lmfit interface. These options are set with two boolean flags: mcmc_lya and mcmc_all, to run the MCMC for respectively only the \({\rm Ly\alpha}\) line or all emission and absorption lines.

Please note that this option is quite cpu intensive and thus it is recommended to activate only if this is requested.

We use the previous example to show how it works. We compare the \({\rm Ly\alpha}\) fit with and without the MCMC option.

[27]:
sp = Spectrum(os.path.join(datadir,'udf10_00723.fits'))
res0 = fit_spec(sp, z=3.18817, lines=['LYALPHA'])
[28]:
logger.setLevel('DEBUG')
sp = Spectrum(os.path.join(datadir,'udf10_00723.fits'))
res1 = fit_spec(sp, z=3.18817, lines=['LYALPHA'], mcmc_lya=True)
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 1.9 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] added 1 asymetric gaussian to the fit
[DEBUG] Found 0 non resonnant line families to fit
[DEBUG] Found 0 resonnant line families to fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 11 iterations, reached minimum = 23.191 and redChi2 = 0.346
[DEBUG] Emcee fitting: {'method': 'emcee', 'is_weighted': True, 'steps': 10000, 'nwalkers': 40, 'progress': True, 'run_mcmc_kwargs': {'skip_initial_state_check': False}}
100%|██████████| 10000/10000 [01:04<00:00, 156.03it/s]
[DEBUG] status 1 after 400000 fcn eval, chain size ratio = 1.5 max autocorr time = 130.6 mean acceptance fraction = 0.53, reached minimum = 23.446 and redChi2 = 0.350
[DEBUG] Saving lya results to tablines and ztab
[29]:
fig,ax = plt.subplots(1,2,figsize=(12,5), sharey=True)
kw = dict(line='LYALPHA', line_only=True, margin=20, infoline=['FLUX','SNR'], infoz=['METHOD'], legend=False)
plot_fit(ax[0], res0, **kw)
plot_fit(ax[1], res1, **kw)
_images/tutorial_55_0.png

As shown in the plot above the solutions are nearly identical. This is expected given that the MCMC routine use as starting point the least-square solution, and only then explore the parameter space to derive the probability distribution. Note however that the S/N derived is much lower (70%) than the one returned by the least square routine.

To check if the MCMC solution has converged we need to check the autocorrelation time returned by the routine. According to the emcee documentation, the chain length (Steps) must be large enough with respect to the autocorrelation time (Acor). In practice it is recommended to have \(\rm Steps > 50 \times Acor\) for all parameters. The program print the minimum value of 1.5 for the chain ratio which indicate that we are fine.

We can also have a more detailed view of the returned parameters using the print_res function.

[30]:
print_res(res1, 'lyalpha')
[30]:
Table masked=True length=5
NameMin_boundMax_boundInit_valueValueMedianInit_StdStdAcorAcor_ratioMin_p99Min_p95Max_p95Max_p99
bytes40float32float32float32float32float32float32float32float32float32float32float32float32float32
dv_lyalpha-500.000500.000-30.330-30.328-24.32723.28357.119130.6241.531--------
vdisp_lyalpha50.000700.000263.959263.965274.02752.56092.98367.5102.963--------
lyalpha_LYALPHA_asymgauss_l0-infinf1215.6701215.670----nan------------
lyalpha_LYALPHA_asymgauss_flux0.000inf117.554117.567118.99317.02828.85358.1723.438--------
lyalpha_LYALPHA_asymgauss_asym-1.00010.0008.4258.4345.2636.2633.32395.0852.103--------

The returned table gives detailed information for all model parameters. For each parameter we have the bounds (Min_bound and Max_bound), the initial value and standard deviation (Init_value, Init_Std) which are the one returned by the least-square fit, the maximum likehood value (Value), the median value (Median) of the distribution and the standard deviation (Std). We can also see the autocorrelation time (Acor) and the ratio discussed above (Acor_ratio). If the MCMC has converged we should have Acor_ratio larger than 1 for all parameters. For example the flux error estimate has a very large value of 3.7 and then we can be sure that the errors is well estimated.

We will see later that the parameter Acor_ratio is also returned as par_RTAU in the lines table.

Now we run the fit for all lines for the same spectra, using the mcmc_all flag.

[31]:
sp = Spectrum(os.path.join(datadir,'udf10_00723.fits'))
res2 = fit_spec(sp, z=3.18817, mcmc_all=True)
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 16.7 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] added 1 asymetric gaussian to the fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 14 lines to fit
[DEBUG] added 14 gaussian to the fit
[DEBUG] Found 1 resonnant line families to fit
[DEBUG] Init fitting of family civ1548
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 11 iterations, reached minimum = 188.100 and redChi2 = 0.308
[DEBUG] Emcee fitting: {'method': 'emcee', 'is_weighted': True, 'steps': 10000, 'nwalkers': 40, 'progress': True, 'run_mcmc_kwargs': {'skip_initial_state_check': False}}
100%|██████████| 10000/10000 [00:59<00:00, 166.86it/s]
[DEBUG] status 1 after 400000 fcn eval, chain size ratio = 1.9 max autocorr time = 106.4 mean acceptance fraction = 0.52, reached minimum = 188.341 and redChi2 = 0.309
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] Both `ftol` and `xtol` termination conditions are satisfied. after 16 iterations, reached minimum = 209.310 and redChi2 = 0.350
[DEBUG] Emcee fitting: {'method': 'emcee', 'is_weighted': True, 'steps': 10000, 'nwalkers': 160, 'progress': True, 'run_mcmc_kwargs': {'skip_initial_state_check': False}}
100%|██████████| 10000/10000 [10:02<00:00, 16.59it/s]
The chain is shorter than 50 times the integrated autocorrelation time for 16 parameter(s). Use this estimate with caution and run a longer chain!
N/50 = 200;
tau: [618.68915442 538.40293423 562.54241887 614.39365113 550.02117346
 572.08205515 549.5646877  546.86227608 557.1934568  573.14339719
 553.58374741 580.04399232 534.58610532 540.38357826 583.24799832
 592.4571212 ]
[DEBUG] status 0 after 1600000 fcn eval, chain size ratio = 0.3 max autocorr time = 618.7 mean acceptance fraction = 0.16, reached minimum = 210.695 and redChi2 = 0.352
[DEBUG] Fitting of Line Family: civ1548
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 14 iterations, reached minimum = 212.787 and redChi2 = 0.349
[DEBUG] Emcee fitting: {'method': 'emcee', 'is_weighted': True, 'steps': 10000, 'nwalkers': 40, 'progress': True, 'run_mcmc_kwargs': {'skip_initial_state_check': False}}
100%|██████████| 10000/10000 [01:05<00:00, 152.69it/s]
The chain is shorter than 50 times the integrated autocorrelation time for 1 parameter(s). Use this estimate with caution and run a longer chain!
N/50 = 200;
tau: [109.35791379  97.65140256 642.37228804  88.02403696]
[DEBUG] status 0 after 400000 fcn eval, chain size ratio = 0.3 max autocorr time = 642.4 mean acceptance fraction = 0.43, reached minimum = 214.066 and redChi2 = 0.351
[DEBUG] Saving lya results to tablines and ztab
[DEBUG] Saving forbidden results to tablines and ztab
[DEBUG] Saving civ1548 results to tablines and ztab

One can have a summary of the fit by looking to the ztable

[32]:
res2['ztable']
[32]:
Table length=3
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHODNSTEPSRCHAINNBADRCHAIN_CLIPNBAD_CLIP
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25int64float64int64float64int64
civ1548120.64258.793.189863.62e-033.1881749.9989.84CIV1549b1.031.030.00100.354000000emcee100000.3110.000
forbidden-121.10226.723.186483.17e-033.18817300.0042.78CIII1909b1.322.140.00800.3516000000emcee100000.32140.000
lyalpha37.0359.933.188698.37e-043.18817263.9493.81LYALPHA4.114.114.11110.314000001emcee100001.8801.880

One can see from the status value of 0, that the MCMC did not converge for the civ1548 and forbidden families. Indeed the reported value for the chain length ratio (RCHAIN) is below 1. The NBAD column indicate the number of lines that are below 1. All lines have a SNR below 3 as indicated by the zero value of NL_CLIPPED. Note that the SNR value is probably not very reliable because the MCMC did not converge. The velocity dispersion seems also high. We can check this by looking to the bounds value returned by print_res.

[33]:
print_res(res2, 'forbidden')[0:3]
[33]:
Table masked=True length=3
NameMin_boundMax_boundInit_valueValueMedianInit_StdStdAcorAcor_ratioMin_p99Min_p95Max_p95Max_p99
bytes40float32float32float32float32float32float32float32float32float32float32float32float32float32
dv_forbidden-500.000500.000-121.113-121.104-74.678151.501226.719618.6890.323--------
vdisp_forbidden5.000300.000300.000300.000268.465124.47442.780538.4030.371--------
forbidden_NV1238_gauss_l0-infinf1238.8201238.820----nan------------

As shown above, the least-square fit reach the velocity dispersion bounds. Let’s run the fit separately to see if we obtain a better solution.

[34]:
sp = Spectrum(os.path.join(datadir,'udf10_00723.fits'))
res4 = fit_spec(sp, z=3.18817, mcmc_all=True, lines=['CIII1907','CIII1909'])
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 1.8 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Found 0 resonnant line families to fit
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 9 iterations, reached minimum = 19.882 and redChi2 = 0.316
[DEBUG] Emcee fitting: {'method': 'emcee', 'is_weighted': True, 'steps': 10000, 'nwalkers': 40, 'progress': True, 'run_mcmc_kwargs': {'skip_initial_state_check': False}}
100%|██████████| 10000/10000 [01:04<00:00, 155.78it/s]
[DEBUG] status 1 after 400000 fcn eval, chain size ratio = 2.3 max autocorr time = 87.7 mean acceptance fraction = 0.45, reached minimum = 20.591 and redChi2 = 0.327
[DEBUG] Saving forbidden results to tablines and ztab
[35]:
res4['ztable']
[35]:
Table length=1
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHODNSTEPSRCHAINNBADRCHAIN_CLIPNBAD_CLIP
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25int64float64int64float64int64
forbidden-145.83276.133.186133.86e-033.1881782.5681.25CIII1909b1.891.890.00100.334000001emcee100002.2800.000

This time the velocity dispersion is ok and the RCHAIN value is ok. This is better shown in the following plot.

[36]:
ltab = res4['lines']
ltab = ltab[(ltab['LINE']=='CIII1907') | (ltab['LINE']=='CIII1909')]
ltab['LINE','SNR','FLUX','FLUX_RTAU','VDISP']
[36]:
Table length=2
LINESNRFLUXFLUX_RTAUVDISP
str20float64float64float64float64
CIII19071.8235.072.6182.56
CIII19090.539.682.3582.56
[37]:
fig,ax = plt.subplots(1,2,figsize=(12,5), sharey=True)
kw = dict(line='CIII1907', line_only=True, margin=20, infoline=['FLUX','SNR','FLUX_RTAU','VDISP'], legend=False)
plot_fit(ax[0], res2, **kw)
plot_fit(ax[1], res4, **kw)
_images/tutorial_69_0.png

When using the mcmc option, we have access to the full probability distribution of each parameters. The save_proba option will save the 95% and 99% bounds for each parameters. We illustrate this on the double peaked lya fit.

[38]:
sp = Spectrum(os.path.join(datadir,'udf10_00106.fits'))
res = fit_spec(sp, z=3.27554, dble_lyafit=True, lines=['LYALPHA'], mcmc_lya=True, mcmcpars=dict(save_proba=True))
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 2.0 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] Using double asymetric gaussian model
[DEBUG] added 1 double asymetric gaussian to the fit
[DEBUG] Found 0 non resonnant line families to fit
[DEBUG] Found 0 resonnant line families to fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 6 iterations, reached minimum = 46.432 and redChi2 = 0.714
[DEBUG] Emcee fitting: {'method': 'emcee', 'is_weighted': True, 'nwalkers': 80, 'steps': 15000}
100%|██████████| 15000/15000 [04:29<00:00, 55.66it/s]
[DEBUG] status 1 after 1200000 fcn eval, chain size ratio = 2.7 max autocorr time = 110.6 mean acceptance fraction = 0.45, reached minimum = 46.526 and redChi2 = 0.716
[DEBUG] Saving lya results to tablines and ztab
[39]:
print_res(res, 'lyalpha')
[39]:
Table masked=True length=9
NameMin_boundMax_boundInit_valueValueMedianInit_StdStdAcorAcor_ratioMin_p99Min_p95Max_p95Max_p99
bytes40float32float32float32float32float32float32float32float32float32float32float32float32float32
dv_lyalpha-500.000500.00034.18034.17834.2244.8305.98297.4093.08018.33423.75343.78547.688
vdisp1_lyalpha50.000700.000194.328194.329196.49111.93114.19596.7913.099160.546172.276219.412228.622
vdisp2_lyalpha50.000700.000307.577307.555309.83611.40114.02593.4443.210276.797286.658333.031342.889
lyalpha_LYALPHA_dbleasymgauss_l0-infinf1215.6701215.670----nan------------
lyalpha_LYALPHA_dbleasymgauss_sep80.0001000.000515.832515.810514.3719.66112.05997.5823.074488.237495.522535.751546.654
lyalpha_LYALPHA_dbleasymgauss_flux10.000inf680.343680.268680.27321.03224.55883.4103.597623.565640.039720.819737.965
lyalpha_LYALPHA_dbleasymgauss_flux20.000inf1080.4721080.4871080.35526.04530.45990.0183.3331010.3261030.8311131.4261153.226
lyalpha_LYALPHA_dbleasymgauss_asym1-10.0000.000-2.776-2.777-2.8860.4970.667110.5962.713-5.148-4.230-1.939-1.567
lyalpha_LYALPHA_dbleasymgauss_asym20.00010.0004.0404.0414.2190.5840.81498.2293.0542.7313.1035.8776.858

More information is available (see here). For example we can visualize the posterior distributions for the parameters using the corner package:

[40]:
import corner
lmres = res['dline']['lmfit_lya']
names = [e.split('_')[0] for e in lmres.var_names[0:3]] + [e.split('_')[-1] for e in lmres.var_names[3:]]
emcee_plot = corner.corner(lmres.flatchain, labels=names,
                           truths=list(lmres.params.valuesdict().values()))
_images/tutorial_74_0.png

Selecting and updating emission/absorption lines

The program comes with a default line list and provide options to select, adapt and update the line list.

The default line astropy table can be returned by the get_lines function.

[41]:
from pyplatefit import get_lines
get_lines(orig=True)
[41]:
Table length=75
LINEFAMILYLBDA_RESTDOUBLETMAINEMIABSRESONANTDNAME
bytes9bytes9float32int64boolboolboolboolbytes10
OVI1032forbidden1031.911033FalseTrueFalseFalseNone
OVI1038forbidden1037.611033FalseTrueFalseFalseOvɪ
LYALPHAbalmer1215.670TrueTrueFalseTrueLyα
NV1238forbidden1238.821240FalseTrueFalseFalseNone
NV1242forbidden1242.81240FalseTrueFalseFalseNV
SiII1260ism1260.420FalseFalseTrueFalseSiɪɪ
OI1302ism1302.171303FalseFalseTrueFalse
SiII1304ism1304.371303FalseFalseTrueFalseSiɪɪ
CII1334ism1334.530FalseFalseTrueFalseCɪɪ
SiIV1394forbidden1393.761403FalseTrueTrueFalseNone
...........................
MgBism5175.440FalseFalseTrueFalseMgb
HeI5876forbidden5877.250FalseTrueFalseFalseNone
NaDism5891.940FalseFalseTrueFalseNaD
OI6300forbidden6302.050FalseTrueFalseFalse[Oɪ]
NII6548forbidden6549.850FalseTrueFalseFalseNone
HALPHAbalmer6564.610TrueTrueTrueFalse
NII6584forbidden6585.280TrueTrueFalseFalseNone
SII6717forbidden6718.290TrueTrueFalseFalseNone
SII6731forbidden6732.670TrueTrueFalseFalse[Sɪɪ]
ARIII7135forbidden7137.80FalseTrueFalseFalse[Arɪɪɪ]

These table contains the following columns:

  • LINE: the line name. There are no official naming convention for the line but we try to pick up the most commo names currently used.

  • FAMILY: the family name, currently we have balmer, forbidden and ism.

  • LBDA_REST: the vacuum wavelength rest frame

  • DOUBLET: an integer which indicate if non zero, that the line is part of a multiplet. All lines with identical values are belonging to the same multiplet.

  • MAIN: a boolean flag, indicate if the line is considered as a main line.

  • EMI: a boolean flag for emission line.

  • ABS: a boolean flag for absorption lines.

  • RESONNANT: a boolean flag for resonnant lines.

  • DNAME: display name used for plots.

When performing the fit it is possible to restrict the fit to some lines.

The option flag major_lines=True wil only fit the main lines.

[42]:
sp = Spectrum(os.path.join(datadir,'udf10_00723.fits'))
res = fit_spec(sp, z=3.18817, major_lines=True)
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 5.3 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] added 1 asymetric gaussian to the fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Found 1 resonnant line families to fit
[DEBUG] Init fitting of family civ1548
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 11 iterations, reached minimum = 65.515 and redChi2 = 0.343
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 9 iterations, reached minimum = 88.991 and redChi2 = 0.466
[DEBUG] Fitting of Line Family: civ1548
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 14 iterations, reached minimum = 90.202 and redChi2 = 0.472
[DEBUG] Saving lya results to tablines and ztab
[DEBUG] Saving forbidden results to tablines and ztab
[DEBUG] Saving civ1548 results to tablines and ztab
[43]:
res['lines']
[43]:
Table length=7
FAMILYLINEISBLENDLBDA_RESTDNAMEVELVEL_ERRZZ_ERRZ_INITVDINSTVDISPVDISP_ERRFLUXFLUX_ERRSNRSKEWSKEW_ERRLBDA_OBSPEAK_OBSLBDA_LEFTLBDA_RIGHTFWHM_OBSNSTDLBDA_LNSTDLBDA_RNSTDBLENDEQWEQW_ERRCONT_OBSCONTCONT_ERR
str20str20boolfloat64str20float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64int64float64float64float64float64float64
lyalphaLYALPHAFalse1215.67Lyα37.0323.183.188693.24e-043.1881765.68263.9652.32117.5516.956.948.426.235090.6419.335089.475095.135.66-1.575087.125104.100-----0.02-0.0814.53
civ1548CIV1548False1548.20None120.5958.913.189858.23e-043.1881746.4650.0579.860.007.190.000.000.006484.940.006483.206486.683.48-0.906479.726500.921549-----0.09-0.3811.46
civ1548CIV1549bTrue1549.48Cɪᴠ120.5958.913.189858.23e-043.188170.0050.0579.8615.6111.771.330.000.006490.324.216483.206497.443.480.000.000.001549-----0.11-0.4811.38
civ1548CIV1550False1550.77Cɪᴠ120.5958.913.189858.23e-043.1881746.3650.0579.8615.6111.771.330.000.006495.704.216493.966497.443.48-0.906479.726500.921549-----0.12-0.5011.37
forbiddenCIII1907False1906.68None-145.8044.003.186136.15e-043.1881736.7382.5438.2135.0716.112.180.000.007979.425.827976.597982.255.66-1.177970.927996.501909-----0.08-0.3213.28
forbiddenCIII1909bTrue1907.71Cɪɪɪ]-145.8044.003.186136.15e-043.188170.0082.5438.2144.7518.522.420.000.007983.715.827976.597990.835.670.000.000.001909-----0.08-0.3513.26
forbiddenCIII1909False1908.73Cɪɪɪ]-145.8044.003.186136.15e-043.1881736.7082.5438.219.689.221.050.000.007988.001.607985.167990.835.67-1.177970.927996.501909-----0.08-0.3313.29

It is also possible to fit only a subset of lines by providing the list of lines to fit

[44]:
res = fit_spec(sp, z=3.18817, lines=['LYALPHA','CIII1907','CIII1909'])
res['lines']
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 3.7 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] added 1 asymetric gaussian to the fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Found 0 resonnant line families to fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 11 iterations, reached minimum = 46.069 and redChi2 = 0.344
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 9 iterations, reached minimum = 69.546 and redChi2 = 0.519
[DEBUG] Saving lya results to tablines and ztab
[DEBUG] Saving forbidden results to tablines and ztab
[44]:
Table length=4
FAMILYLINEISBLENDLBDA_RESTDNAMEVELVEL_ERRZZ_ERRZ_INITVDINSTVDISPVDISP_ERRFLUXFLUX_ERRSNRSKEWSKEW_ERRLBDA_OBSPEAK_OBSLBDA_LEFTLBDA_RIGHTFWHM_OBSNSTDLBDA_LNSTDLBDA_RNSTDBLENDEQWEQW_ERRCONT_OBSCONTCONT_ERR
str20str20boolfloat64str20float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64int64float64float64float64float64float64
lyalphaLYALPHAFalse1215.67Lyα37.0323.203.188693.24e-043.1881765.68263.9652.38117.5516.976.938.426.245090.6419.335089.475095.135.66-1.575087.125104.100-----0.02-0.0814.53
forbiddenCIII1907False1906.68None-145.8046.433.186136.49e-043.1881736.7382.5440.3335.0717.002.060.000.007979.425.827976.597982.255.66-1.177970.927996.501909-----0.08-0.3213.28
forbiddenCIII1909bTrue1907.71Cɪɪɪ]-145.8046.433.186136.49e-043.188170.0082.5440.3344.7519.542.290.000.007983.715.827976.597990.835.670.000.000.001909-----0.08-0.3513.26
forbiddenCIII1909False1908.73Cɪɪɪ]-145.8046.433.186136.49e-043.1881736.7082.5440.339.689.730.990.000.007988.001.607985.167990.835.67-1.177970.927996.501909-----0.08-0.3313.29

Finally, it is also possible to provide its own table. It must have the same format than the default table.

In the following example, we retreive the default lines table and set the RESONNANT flag to False for the CIV1548,1550 lines. We then run fit_spec and provide the modified lines table. Now the CIV doublet is fitted together with the other forbidden lines.

[45]:
mylines = get_lines(orig=True)
mylines['RESONANT'][(mylines['LINE']=='CIV1548') | (mylines['LINE']=='CIV1550')] = False
res = fit_spec(sp, z=3.18817, lines=mylines, major_lines=True)
res['ztable']
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from user line table...
[DEBUG] 5.3 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Init Lya Fit
[DEBUG] added 1 asymetric gaussian to the fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 4 lines to fit
[DEBUG] added 4 gaussian to the fit
[DEBUG] Found 0 resonnant line families to fit
[DEBUG] Fitting of Line Family: lya
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 11 iterations, reached minimum = 65.515 and redChi2 = 0.343
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 22 iterations, reached minimum = 88.646 and redChi2 = 0.469
[DEBUG] Saving lya results to tablines and ztab
[DEBUG] Saving forbidden results to tablines and ztab
[45]:
Table length=2
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
forbidden121.00155.523.189862.17e-033.18817172.35133.22CIV1549b1.351.790.00200.47223least_squares
lyalpha37.0323.183.188693.24e-043.18817263.9652.32LYALPHA6.946.946.94110.34113least_squares

Adjusting fit parameters

A number of parameters can be changed to adapt the fitting parameters, update bounds, add new line ratios, etc. These parameters are given to fit_spec as python dictionary.

The linepars dictionary is used to define bounds and line ratios of the fitted variable. It can have the following items: - vel = (vmin,vinit,vmax): bounds and initial values of the velocity with respect to the given redshift. Default is (-500,0,500) km/s. Used for all line families except absorption lines.

  • vdisp = (vmin,vinit,vmax): bounds and initial values of the rest-frame velocity dispersion. Default is (5,50,300) km/s. Used for all line families except \({\rm Ly\alpha}\) and absorption lines.

  • velabs = (vmin,vinit,vmax): bounds and initial values of the velocity with respect to the given redshift. Default is (-500,0,500) km/s. Used for absorption lines.

  • vdispabs = (vmin,vinit,vmax): bounds and initial values of the rest-frame velocity dispersion. Default is (5,50,300) km/s. Used for absorption lines.

  • vdisp_lya = (vmin,vinit,vmax): bounds and initial values of the rest-frame \({\rm Ly\alpha}\) velocity dispersion. Default is (50,150,700) km/s.

  • gamma_lya = (vmin,vinit,vmax): bounds and initial values of the skewed \({\rm \gamma \, Ly\alpha}\) parameter. Default is (-1,0,10).

  • sep_2lya = (vmin,vinit,vmax): bounds and initial values of the peak separation for double \({\rm Ly\alpha}\) fit. Default is (80,500,1000) km/s.

  • gamma_2lya1 = (vmin,vinit,vmax): bounds and initial values of the skewed \({\rm \gamma \, Ly\alpha}\) parameter for the blue line. Default is (-10,-2,0).

  • gamma_2lya2 = (vmin,vinit,vmax): bounds and initial values of the skewed \({\rm \gamma \, Ly\alpha}\) parameter for the red line. Default is (0,2,10).

  • line_ratios = list of lines ratio bounds. default is [(“CIII1907”, “CIII1909”, 0.6, 1.2),(“OII3726”, “OII3729”, 0.3, 1.5)]

  • polydegabs = polynomial degree for continuum fit prior to absorption fit, default 12

  • polyiterabs = maximum of sig-clip iteration for continuum fit prior to absorption fit, default 3

  • polywmask = window half size used form masking in polynomial fit prior to absorption fit, default 3.0 Angstroem

  • windmax = window half size to find peak to estimate initial value for the fitting process (default 10 Angstroem)

  • minsnr = minimum SNR value used to clip for ztable computation, default 3.0

  • nstd_relsize = relative size with respect to FWHM for line NSTD estimate, default 3.0

Additional parameters for the MCMC error estimate are given in the mcmpars dictionary: - steps = length of the MCMC chain. Default is 0, which translate to 10000, except for double \(/rm Ly\alpha\) fit where it is 15000. If the process has not converged this is the parameter to adjust. - progress = display a running bar (True/False), default True - nworkers = number of workers. Dafult is 0, which translate to ten times the number of fitted parameters - save_proba = if True, save the 1%, 5%, 95% and 99% percentile for all variables.

For example to constrain the velocity dispersion and the emission line ratio between MgII2796,2803 you have to run the following.

[46]:
sp = Spectrum(os.path.join(datadir,'DR2_001028.fits'))
ratio = [("MgII2796", "MgII2803", 0.5, 1.5)]
params = dict(line_ratios=ratio, vdisp=(5,10,50))
res = fit_spec(sp, z=1.90578, linepars=params)
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 12.7 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 15 lines to fit
[DEBUG] added 15 gaussian to the fit
[DEBUG] Found 1 resonnant line families to fit
[DEBUG] Init fitting of family mgii2796
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 17 iterations, reached minimum = 569.896 and redChi2 = 1.264
[DEBUG] Fitting of Line Family: mgii2796
[DEBUG] Lmfit fitting: {'method': 'least_square', 'xtol': 0.001}
[DEBUG] `xtol` termination condition is satisfied. after 13 iterations, reached minimum = 599.141 and redChi2 = 1.291
[DEBUG] Saving forbidden results to tablines and ztab
[DEBUG] Saving mgii2796 results to tablines and ztab
[47]:
res['ztable']
[47]:
Table length=2
FAMILYVELVEL_ERRZZ_ERRZ_INITVDISPVDISP_ERRLINESNRMAXSNRSUMSNRSUM_CLIPPEDNLNL_CLIPPEDRCHI2NFEVSTATUSMETHOD
str20float64float64float64float64float64float64float64str20float64float64float64int64int64float64int64int64str25
forbidden-64.3020.151.905161.95e-041.9057850.0022.80CII2326b4.624.624.62811.26173least_squares
mgii2796274.9516.101.908451.56e-041.9057850.0020.03MgII2799b4.744.744.74111.29133least_squares

The minpars dictionary is used for the minimization process with lmfit. The default items are: - method = ‘least_squares’, name of the lmfit minimization method - xtol = 0.001, tolerance requested on fitted parameters

The name of parameters are function of the selected minimization, see lmfit minimization and scipy minimization for the list of minimization and associated parameters.

For example, to add a tolerance on the residual and a maxmim number number of function evaluation, given scipy least_squares documentation we will use the following:

[48]:
params = dict(method='least_squares', xtol=1.e-3, ftol=1.e-3, max_nfev=200)
logger.setLevel('DEBUG')
res = fit_spec(sp, z=1.90578, minpars=params)
logger.setLevel('INFO')
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 12.7 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 15 lines to fit
[DEBUG] added 15 gaussian to the fit
[DEBUG] Found 1 resonnant line families to fit
[DEBUG] Init fitting of family mgii2796
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'least_squares', 'xtol': 0.001, 'ftol': 0.001, 'max_nfev': 200}
[DEBUG] `ftol` termination condition is satisfied. after 6 iterations, reached minimum = 567.640 and redChi2 = 1.259
[DEBUG] Fitting of Line Family: mgii2796
[DEBUG] Lmfit fitting: {'method': 'least_squares', 'xtol': 0.001, 'ftol': 0.001, 'max_nfev': 200}
[DEBUG] `ftol` termination condition is satisfied. after 23 iterations, reached minimum = 599.069 and redChi2 = 1.291
[DEBUG] Saving forbidden results to tablines and ztab
[DEBUG] Saving mgii2796 results to tablines and ztab

We can also use try a different minimization routine. For example, the Nelder-Mead simplex algorithm. But note that it use a much larger number of iterations.

[49]:
params = dict(method='nelder', options=dict(xtol=1.e-3))
logger.setLevel('DEBUG')
res = fit_spec(sp, z=1.90578, minpars=params)
logger.setLevel('INFO')
[DEBUG] Fit continuum
[DEBUG] Fit emission lines
[DEBUG] Preparing data for fit
[DEBUG] Getting lines from default line table...
[DEBUG] 12.7 % of the spectrum is used for fitting.
[DEBUG] Initialize fit
[DEBUG] Found 1 non resonnant line families to fit
[DEBUG] Init Fit of family forbidden
[DEBUG] Found 15 lines to fit
[DEBUG] added 15 gaussian to the fit
[DEBUG] Found 1 resonnant line families to fit
[DEBUG] Init fitting of family mgii2796
[DEBUG] Found 2 lines to fit
[DEBUG] added 2 gaussian to the fit
[DEBUG] Fitting of Line Family: forbidden
[DEBUG] Lmfit fitting: {'method': 'nelder', 'options': {'xtol': 0.001}}
[DEBUG] Maximum number of function evaluations has been exceeded. after 3401 iterations, reached minimum = 571.309 and redChi2 = 1.267
[DEBUG] Fitting of Line Family: mgii2796
[DEBUG] Lmfit fitting: {'method': 'nelder', 'options': {'xtol': 0.001}}
[DEBUG] Optimization terminated successfully. after 172 iterations, reached minimum = 637.016 and redChi2 = 1.373
[DEBUG] Saving forbidden results to tablines and ztab
[DEBUG] Saving mgii2796 results to tablines and ztab

LSF

By default, the reported rest frame velocity dispersion are corrected from the instrumental LSF. It is possible to disable this correction by settings lsf=None in fit_spec.

The LSF model is the following MUSE LSF model: \({\rm FWHM \left(\lambda\right) = 5.19939 - 7.56746 \, 10^{-04} \lambda + 4.93397 \, 10^{-08} \lambda^2}\) with FWHM in Angstroem.

It is possible to use its own LSF model by giving his own function \({\rm FWHM \left(\lambda\right)}\), like in the following:

[50]:
import numpy as np
def mylsf(lbda):
    return np.interp(lbda, [4750,9300], [3.5,3.2])

logger.setLevel('INFO')
sp = Spectrum(os.path.join(datadir, 'udf10_00002.fits'))
res0 = fit_spec(sp, z=0.41892)
res1 = fit_spec(sp, z=0.41892, lsf=mylsf)
res2 = fit_spec(sp, z=0.41892, lsf=None)
for res in [res0,res1,res2]:
    lines = res['lines']
    r = lines[lines['LINE']=='OIII5007'][0]
    vdinst = r['VDINST'] if 'VDINST' in lines.columns else 0
    print(f"VDINST={vdinst:.2f} VDISP={r['VDISP']:.2f} FWHM_OBS={r['FWHM_OBS']:.2f}")
VDINST=41.44 VDISP=66.65 FWHM_OBS=4.38
VDINST=59.91 VDISP=54.75 FWHM_OBS=4.53
VDINST=0.00 VDISP=84.04 FWHM_OBS=4.69

Advanced usage

To perform specific operation, one could use the Pyplatefit class. See the class documentation.

Here is an example:

[51]:
from pyplatefit import Platefit
pf = Platefit()
sp = Spectrum(os.path.join(datadir,'udf10_00002.fits'))
z = 0.41892
rescont = pf.fit_cont(sp, z, vdisp=80)
fig,ax = plt.subplots(1,1,figsize=(12,5))
tab = rescont['table_spec']
ax.plot(tab['RESTWL'],tab['FLUX'])
ax.plot(tab['RESTWL'],tab['CONT'], '-r')
ax.set_ylim(400,1500)
[51]:
(400.0, 1500.0)
_images/tutorial_94_1.png
[52]:
resline = pf.fit_lines(rescont['line_spec'], z)
fig,ax = plt.subplots(1,1,figsize=(12,5))
tab = resline['table_spec']
ax.plot(tab['RESTWL'],tab['FLUX'], '-b')
ax.plot(tab['RESTWL'],tab['FORBIDDEN_FIT_LSQ'], '-r', label='forbidden')
ax.plot(tab['RESTWL'],tab['BALMER_FIT_LSQ'], '-b', label='balmer')
ax.legend()
[52]:
<matplotlib.legend.Legend at 0x7fe36bf76ee0>
_images/tutorial_95_1.png

Output results in detail

The dictionary return by fit_spec contain the following items:

  • ztable: the redshift table (see fit_spec in the doc main menu for the meaning of table columns)

  • lines: the lines table (see fit_spec in the doc main menu for the meaning of table columns)

  • spec: the input spectrum

  • spec_fit: the full fit spectrum (cont_fit + line_fit + abs_fit)

  • cont_spec: the fitted model continuum + smooth low order residual spectrum

  • cont_fit: the fitted model continuum spectrum

  • line_spec: the emission line spectrum after subtraction of the fitted continuum (spec - cont_spec)

  • line_fit: the emission line fit spectrum

  • line_initfit: the initial solution for the emission line fit

  • abs_cont: the polynomial continuum fit spectrum

  • abs_init: the initial solution for the absorption fit (spec - abs_cont)

  • abs_fit: the fitted absorption lines spectrum

  • dline: dictionary with detailed results for the emission or absorption lines fit

  • dcont: dictionary with detailed results for the continuum fit

The dictionary dline contain the following items:

  • table_spec: rest frame emission line table data used in the fitting process

  • abs_table_spec: rest frame ansorption line table data used in the fitting process

  • lmfit_family: lmfit raw result with family, the family name.

The dictionary dcont contain the following items:

  • table_spec: rest frame flux table data used in the fitting process

  • success: boolean True if fit was successful

  • status: output fit message

  • z: metallicity for the best fit

  • ebv: E(B-V)

  • init_z: used redshift for the fit

  • chi2: \(\chi^2\) fit

  • ages: return list of ages for the model library

  • weights: returned list of weights for the model library

For example to display the detail lmfit result for the balmer family one can do (check lmfit for all the available info):

[53]:
sp = Spectrum(os.path.join(datadir, 'udf10_00002.fits'))
res = fit_spec(sp, z=0.41892)
res['dline']['lmfit_balmer']
[53]:

Fit Statistics

fitting methodleast_squares
# function evals7
# data points783
# variables11
chi-square 192511.056
reduced chi-square 249.366652
Akaike info crit. 4332.23974
Bayesian info crit. 4383.53420

Variables

name value standard error relative error initial value min max vary
dv_balmer 82.1268012 3.85608873 (4.70%) 0 -500.000000 500.000000 True
vdisp_balmer 66.1085450 4.26045940 (6.44%) 50 5.00000000 300.000000 True
balmer_H11_gauss_l0 3771.69995 0.00000000 (0.00%) 3771.7 -inf inf False
balmer_H11_gauss_flux 196.541129 395.983771 (201.48%) 342.9287624503839 0.00000000 inf True
balmer_H10_gauss_l0 3798.97998 0.00000000 (0.00%) 3798.98 -inf inf False
balmer_H10_gauss_flux 323.278214 384.247522 (118.86%) 343.22898429783106 0.00000000 inf True
balmer_H9_gauss_l0 3836.46997 0.00000000 (0.00%) 3836.47 -inf inf False
balmer_H9_gauss_flux 573.322336 380.741042 (66.41%) 487.2757666583839 0.00000000 inf True
balmer_H8_gauss_l0 3890.14990 0.00000000 (0.00%) 3890.15 -inf inf False
balmer_H8_gauss_flux 1302.15623 386.597943 (29.69%) 1106.9039675218417 0.00000000 inf True
balmer_HEPSILON_gauss_l0 3971.19995 0.00000000 (0.00%) 3971.2 -inf inf False
balmer_HEPSILON_gauss_flux 1107.83505 378.329015 (34.15%) 944.1861023360569 0.00000000 inf True
balmer_HDELTA_gauss_l0 4102.89014 0.00000000 (0.00%) 4102.89 -inf inf False
balmer_HDELTA_gauss_flux 2051.52603 379.267737 (18.49%) 1652.1183642247636 0.00000000 inf True
balmer_HGAMMA_gauss_l0 4341.68018 0.00000000 (0.00%) 4341.68 -inf inf False
balmer_HGAMMA_gauss_flux 3647.93827 348.920115 (9.56%) 2994.6600370494198 0.00000000 inf True
balmer_HBETA_gauss_l0 4862.68018 0.00000000 (0.00%) 4862.68 -inf inf False
balmer_HBETA_gauss_flux 8567.47174 415.428624 (4.85%) 7094.975428006051 0.00000000 inf True
balmer_HALPHA_gauss_l0 6564.60986 0.00000000 (0.00%) 6564.61 -inf inf False
balmer_HALPHA_gauss_flux 23693.2431 2927.08919 (12.35%) 18925.797028608496 0.00000000 inf True

Correlations (unreported correlations are < 0.100)

vdisp_balmerbalmer_HBETA_gauss_flux0.3279
dv_balmerbalmer_HBETA_gauss_flux0.2289
dv_balmerbalmer_HALPHA_gauss_flux-0.2239
vdisp_balmerbalmer_HGAMMA_gauss_flux0.2237
vdisp_balmerbalmer_HDELTA_gauss_flux0.1048
[ ]:

[ ]: