Warning: This program is not maintained anymore. In the near future, it will be removed from the set of GIPSY tasks. An improved version with graphical user interface is called XGAUFIT. Program: GAUFIT Purpose: GAUFIT estimates parameters for multi component one dimensional gauss functions using data in selected profiles. It can use these estimates to fit the parameters to the data in a least squares fit. Category: ANALYSIS, PROFILES, FITTING File: gaufit.c Author: M. Vogelaar Keywords: STARTNEW= Start improved version of this program? [Y]/N A new version with graphical user interface is called XGAUFIT. You can start it here. INSET= Give set and profile direction: Set (subsets) and one operation axis which is the set axis that indicates the direction of the profile. The subsets must be a contiguous block. This is checked by the program. (Maximum number of subsets is 2048.) BOX= Area of operation: [entire subset] OPTION= Option(s) for output: [list options] This input determines the size of your output set and what is stored in the output: 0 Initial estimate(s) 1 Fitted gaussian(s) 2 Errors on the fitted parameters (See description.) NGAUSS= Maximum number of gaussians in profile: [1] (The maximum number is 8.) OUTSET= Give output set (subsets): You need only to give a name. The number of output subsets depends on OPTION= and the maximum number of gaussians fitted to the profile (see description). The calculation of the initial estimates. ========================================= The fit routine in GAUFIT needs initial estimates for the parameters. These estimates can be read from set (ESTSET=) or can be calculated by the program using a method described by Schwarz, 1968, Bull.Astr.Inst. Netherlands, Volume 19, 405. The associated routine needs the r.m.s. noise in the profile (ESTRMS=) and a smoothing parameter (Q=). Estimates can be rejected before gaussians are fitted with the hidden keywords ESTCUTAMP= and ESTCUTDISP=. The default is that no estimates are rejected. ESTSET= Input set/subsets with estimates for fit: [calculate] Keyword is asked hidden if 0 (Initial estimate(s)) was part of the list in OPTION= The number of subsets must be 3*NGAUSS= For each gauss the subsets contain: 1: Estimated amplitudes of component 1 2: Estimated centre of component 1 3: Estimated dispersion of component 1 4: Estimated amplitudes of component 2 5: .... etc. ESTRMS= Give r.m.s. noise level in profiles: (Only asked if no ESTSET= is present) The r.m.s. noise is used in the calculation of the initial estimates for determining the region in the profile with signal. NCORR= Give Number of correlated subsets: [1.0] If Hanning smoothing has been applied, the subsets are two by two correlated as is specified by the default. This is important to correct the r.m.s. noise for making the initial estimates, and also to get correct errors on the parameters; they are a factor sqrt(NCORR=) higher if the subsets are correlated. ** ESTCUTAMP= Lower cutoff for amplitude of est. in : [0.0] (Only asked if no ESTSET= is present) Initial estimates below this amplitude are discarded. ** ESTCUTDISP= Lower cutoff for dispersion of est. in : [minimum] (Only asked if no ESTSET= is present) Initial estimates with a dispersion below this value are discarded. The default is the equivalent of 1 pixel. If a value entered by the user corresponds to a value in pixels less than 1, the cutoff value is set to the equivalent of 1 pixel (usually the channel separation). Q= Give smoothing parameter: [2] Number of data points used in the calculation of the initial estimate is 2Q + 1 around the peak. Q= accepts more than one value. If the fitting fails with the first value, it tries again with the next etc. Example: Q=3 4 5 2 Calculation of the fitted parameters. ===================================== The actual fitting is a least-squares fit of a function to a set of data points. The method used is described in: Marquardt, J.Soc.Ind.Appl.Math. 11, 431 (1963). It is a mixture of the steepest descent method and the Taylor method. The keywords CUTAMP= and CUTDISP= are filters used after fitting the profiles. CUTAMP= Lower cutoff for fitted amplitude in : [0.0] Fitted Gaussians with amplitude lower than this value are rejected. The units are generated by the keyword prompt. CUTDISP= Lower cutoff for fitted dispersion in : [0.0] Gaussians with dispersion lower than this value are rejected. The units are generated by the keyword prompt. TOLERANCE= Fractional tolerance for the chi square: [0.0] Fitting is stopped when successive iterations fail to produce a decrement in reduced chi-squared less than TOLERANCE= The value cannot be less than a certain minimum as set by the system. This means that maximum accuracy can be obtained with TOLERANCE=0.0 If you have profiles with a lot of signal, you can increase the tolerance (e.g. 0.01, or 0.1). **MAXITS= Maximum number of iterations in fit: [50] If this number is exceeded, no parameters could be fitted and blanks are written to the output set. **LAB= Value for mixing parameter: [0.001] Mixing parameter in the least squares fit function. LAB= determines the initial weight of steepest descent method relative to the Taylor method. LAB= should be a small value (i.e. 0.001). Sorting of gaussians. ===================== SORT= Sort output-mode: [0]/1/2/3 (Only asked if option 1 and/or 2 is selected) 0: Do NOT sort. 1: Sort components in DECREASING order of peak value. 2: Sort components in INCREASING order of distance to user given value on operation axis (often the central velocity) as given in CENTVAL= 3: Sort components in DECREASING order of dispersion. CENTVAL= Give centre for sorting in : [calculated] Only asked if SORT=2. Sorting is with respect to this value. The default is the physical coordinate of the centre of the operation axis. The units are generated by the keyword prompt. Description: The decomposition of profiles into Gaussian components can be a powerful method in the study of the kinematics of neutral hydrogen. This program calculates estimates (or reads them from another set) and fits multi component gaussians. If the estimates are not supplied, they are calculated by fitting second degree polynomials to the profiles (See: Schwarz, 1968, Bull.Astr.Inst. Netherlands, Volume 19, 405-413). The errors on the fitted parameters are calculated as real errors by means of the covariance matrix in the fit and the average deviation between data and fit over the entire profile. However, if the region with signal is short compared to the full length of the profile, the average deviation between data and fit is primarily determined along the part of the baseline where no signal is present and then it becomes almost equal to the noise in the profile and the real errors determined this way virtually become formal errors. How to use the TOLERANCE= keyword: ================================= Usually only signal is present in a small region in the profile while the chi square is calculated over the entire baseline. In this case the Chi square primarily represents the fitting of the baseline instead of the gauss and then a very tiny decrease of the chi square might very well be significant for the fitting of the gauss to the signal. So one should be very careful not setting TOLERANCE= too high. If TOLERANCE=0 then the minimization is stopped at machine precision which is of the order of 1e-5. Example: In order to fit gaussians to profiles you need to specify first the input set and the subsets (INSET=). The number of subsets must be greater than 1 and less than 2048 and the subsets are specified by the name of the axis in the direction of your profiles. Suppose you have a set NGC3198 with axes: RA-NCP from -134 to 115 DEC-NCP from -116 to 135 FREQ-OHEL from 2 to 116 and you want to create an output set where the subsets represent the estimated and the fitted parameters (amplitude, central value [central velocity] and dispersion) of gaussians using data in frequency direction (in this case velocities). You allow a maximum of two gaussians in a profile and want to sort the output in order of decreasing amplitude. Then: INSET=NGC3198 freq BOX= OPTION=0 1 2 SORT=1 NGAUSS=2 You have to know the r.m.s. level of the noise in your profiles otherwise it will be impossible to find reasonable estimates. E.g. ESTRMS=0.21 (in units of your image data). This value is modified if the subsets are correlated. For instance with NCORR=2 the value for the r.m.s. is adjusted and also the errors on the fitted parameters will be modified. For the smoothing parameters the array Q=2 1 3 4 5 is used. So we can add the keywords: ESTRMS=0.21 NCORR=2 Q=2 1 3 4 5 Next the set for the output must be supplied (OUTSET=). If it did not exist the program will create a new set with the FREQ axis replaced by a PARAM axis. Its length depends on the selected options for output (OPTION=) and the maximum number of gaussians allowed to be fitted to one profile (NGAUSS=) as follows: For each individual estimated gaussian, fitted gaussians and set of errors on its parameters 3 output subsets are reserved. Suppose you selected: OUTSET=NGC3198PARAM Then your NGC3198PARAM subsets will contain: param 1: Estimated amplitudes of component 1 param 2: Estimated centre of component 1 param 3: Estimated dispersion of component 1 param 4: Estimated amplitudes of component 2 param 5: Estimated centre of component 2 param 6: Estimated dispersion of component 2 param 7: Fitted amplitudes of component 1 param 8: Fitted centre of component 1 param 9: Fitted dispersion of component 1 param 10: Fitted amplitudes of component 2 param 11: Fitted centre of component 2 param 12: Fitted dispersion of component 2 param 13: Error in fitted amplitudes of component 1 param 14: Error in fitted centre of component 1 param 15: Error in fitted dispersion of component 1 param 16: Error in fitted amplitudes of component 2 param 17: Error in fitted centre of component 2 param 18: Error in fitted dispersion of component 2 If the components are sorted in amplitude then subset 1 contains the highest amplitude estimates (Schwarz method) and 7 contains the highest fitted amplitudes (Least squares method). If no parameters could be estimated or fitted then the components are set to blank. The sizes of the other axes of the output are copied from the input and the choice for BOX= does not influence their sizes. If the set already existed, the program will only write output into the region supplied by the user in BOX=. This way an already existing set with estimates or fits can be improved by overwriting the contents in BOX= (For example with other initial estimates or different Q=, TOLERANCE= or MAXITS=) The output of a GAUFIT run: ========================== Profiles from set: NGC3198 Results in set: NGC3198PARAM param 1: Estimated amplitudes of component 1 param 2: Estimated centre of component 1 .... .... Max. number of gaussians in profile is 2 Range operation axis: [393.465,865.341] KM/S Fitted gaussians will have amplitude > 0.000000 W.U. Fitted gaussians will have dispersion > 0.000000 KM/S One pixel on operation axis has (average) width 4.139263 KM/S Processed 63000 profiles in 1998.01 sec (1730.10 cpu sec) Number of profiles fitted with Q= 2: 55816 Number of profiles fitted with Q= 1: 3325 Number of profiles fitted with Q= 3: 740 Number of profiles fitted with Q= 4: 177 Number of profiles fitted with Q= 5: 56 For 2872 profiles no estimate was found. For 2886 profiles the fitting failed. 421 times the max. no. iterations was exceeded. In general, the program will work for profiles in any direction and instead of velocity you could think of any kind of physical coordinate. The program automatically inserts the correct units in the header of the output set. Literature: An alternative least squares method is described in: 1968, Bull.Astr.Inst. Netherlands, Volume 118, 465-487. Updates: Apr 22, 1991: VOG, Document created. Mrt 15, 1994: VOG, Separate cutoff keywords in cutoffs for estimate- and lsqfit routine. Oct 18, 1994: VOG, Included FJ Sickings NCORR= keyword and looping over numbers in Q= array. Mar 03, 1995: VOG, Keep floats between -FLT_MAX and FLT_MAX. Sorting of estimates allowed. Jan 08, 1997: VOG, New minimum for estimated dispersion in gauest_c. Bug removed in increasing 'qcount'. (program crashed on alpha only). Nov 30, 1998: VOG, Set initial value of 'ncorr' to 1 in all cases.