I just discovered zfit and I am very happy with its capabilities. I have a very naive question.

I am trying to build an ExtendedUnbinnedNLL from exponential PDFs with weighted data. I am extracting the Hesse errors for:

  1. The exponential parameters, which seem reasonable.

  2. The normalization parameters (yields), whose errors are almost 0.

I am not sure why (2) happens. With other frameworks (RooFit etc.) I get almost the same results for the exponential parameter and its error, and the same normalization parameter, but with a larger error on it.

I would expect the errors on the normalization parameters to be larger. Maybe they should be ~sqrt(Nbkg), where Nbkg is the expected number of events [or the yield (sum of weights) in this case].
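
To make that expectation concrete, here is a minimal sketch (plain Python, no zfit) of two common back-of-the-envelope estimates for the uncertainty on a weighted yield: sqrt(sum of weights), which is the estimate mentioned above, and sqrt(sum of squared weights), the usual Poisson estimate for weighted events. The function name and the toy weights are made up for illustration; they are not my real data.

```python
import math

def yield_error_estimates(weights):
    """Return (yield, sqrt(sum w), sqrt(sum w^2)) for a list of event weights."""
    total = sum(weights)
    naive = math.sqrt(total)                           # sqrt(N) with N = sum of weights
    weighted = math.sqrt(sum(w * w for w in weights))  # Poisson error for weighted events
    return total, naive, weighted

# Hypothetical toy weights, for illustration only
toy_weights = [0.5] * 400
total, naive, weighted = yield_error_estimates(toy_weights)
print(f"yield = {total:.2f}, sqrt(sum w) = {naive:.2f}, sqrt(sum w^2) = {weighted:.2f}")
```

For unit weights both estimates reduce to sqrt(N). Either way, I would expect an error of the order of a few events, not something close to 0.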

I am using zfit version 0.16.0.

I have attached the code and the log file below.

Thank you very much for your time!

Code:

import tensorflow as tf
import zfit
from zfit import z  # math backend of zfit
from plot_handler import *  # local helper module, provides PlotHandler

# Observable definition
obs = zfit.Space('x', limits=(105, 160))

# Define individual PDFs with unique lambda parameters and normalization
categories = ["Run2 LM1", "Run2 LM2", "Run2 LM3", "Run2 LM4", "Run2 HM1", "Run2 HM2", "Run2 HM3", "Run3 LM1", "Run3 LM2", "Run3 LM3", "Run3 LM4", "Run3 HM1", "Run3 HM2", "Run3 HM3"]

pdfs = {}
datasets = {}
params = {}
normalizations = {}

# Create the PlotHandler and book histograms
plotter = PlotHandler("m_yy_HM_6_5", "m_yy", 22, 105, 160)
hists = plotter.book_histograms(categories)

mini_path = "/eos/user/s/smeriano/BBYY_STUDIES/xmlAnaWSBuilder/yybb_template_bkg_only_updated_corr_bkg_xmls/categorized_minitrees"
category_files = {
    "Run2 LM1": f"{mini_path}/yybb_Run2LM_1.root",
    "Run2 LM2": f"{mini_path}/yybb_Run2LM_2.root",
    "Run2 LM3": f"{mini_path}/yybb_Run2LM_3.root",
    "Run2 LM4": f"{mini_path}/yybb_Run2LM_4.root",
    "Run2 HM1": f"{mini_path}/yybb_Run2HM_1.root",
    "Run2 HM2": f"{mini_path}/yybb_Run2HM_2.root",
    "Run2 HM3": f"{mini_path}/yybb_Run2HM_3.root",
    "Run3 LM1": f"{mini_path}/yybb_Run3LM_1.root",
    "Run3 LM2": f"{mini_path}/yybb_Run3LM_2.root",
    "Run3 LM3": f"{mini_path}/yybb_Run3LM_3.root",
    "Run3 LM4": f"{mini_path}/yybb_Run3LM_4.root",
    "Run3 HM1": f"{mini_path}/yybb_Run3HM_1.root",
    "Run3 HM2": f"{mini_path}/yybb_Run3HM_2.root",
    "Run3 HM3": f"{mini_path}/yybb_Run3HM_3.root",
}

# Loop through categories, create datasets, PDFs, and normalization parameters
for cat in categories:
    # Define a unique lambda parameter for each category
    params[cat] = zfit.Parameter(f"lambda_{cat.replace(' ', '_')}", -2, -5, 1)
    

    # Read data and create zfit datasets
    hists[cat].read_trees([category_files[cat]], "output", "m_yy", "total_weight")

    # Compute sum of weights for initial nbkg estimate
    sum_weights = hists[cat].raw_data["weight"].sum()

    # Define a normalization parameter for each category
    normalizations[cat] = zfit.Parameter(f"norm_{cat.replace(' ', '_')}", sum_weights, 0, 1e7)

    # Define an extended PDF (PDF * normalization) for each category
    pdfs[cat] = zfit.pdf.Exponential(obs=obs, lam=params[cat]).create_extended(normalizations[cat])

    datasets[cat] = zfit.Data.from_pandas(
        hists[cat].raw_data,
        obs=obs,
        weights="weight"
    )

# Use ExtendedUnbinnedNLL for extended PDFs
total_nll = zfit.loss.ExtendedUnbinnedNLL(
    model=list(pdfs.values()),  # List of extended PDFs
    data=list(datasets.values())  # List of datasets
)

minimizer = zfit.minimize.Minuit()

result = minimizer.minimize(total_nll)

# Compute the Hesse errors once for all parameters
hesse_errors = result.hesse(method='minuit_hesse')

# Print Hesse results for each parameter, including their values
print("Hesse Results for Each Category (Values and Errors):")
for cat in categories:
    # Get the parameter values and their Hesse errors
    lambda_value = params[cat].value()
    lambda_error = hesse_errors[params[cat]]["error"]

    norm_value = normalizations[cat].value()
    norm_error = hesse_errors[normalizations[cat]]["error"]

    # Print results
    print(f"{cat}:")
    print(f"  Lambda (p0): {lambda_value:.5f} +/- {lambda_error:.5f}")
    print(f"  Normalization (nbkg): {norm_value:.5f} +/- {norm_error:.5f}")
    print("\n")


print(result)


print(result.covariance())



print(result.params)



# Plot and save results
plotter.manip_normalize()
plotter.plot()
plotter.save()
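
As a cross-check that does not depend on zfit, the yield-dependent part of a weighted extended negative log-likelihood can be written as nu - (sum of weights) * log(nu). This assumes the textbook form of the extended term; I have not checked that it is exactly what zfit evaluates internally. The sketch below takes the second derivative numerically at the minimum and turns it into a Hesse-style error. The helper names and the step size are my own choices.

```python
import math

def yield_nll(nu, sum_w):
    """Yield-dependent part of a weighted extended NLL (textbook form, assumed)."""
    return nu - sum_w * math.log(nu)

def hesse_error(sum_w, step=1e-2):
    """Error on the yield from a central finite-difference second derivative."""
    nu_hat = sum_w  # the term above is minimised at nu = sum of weights
    second = (yield_nll(nu_hat + step, sum_w)
              - 2.0 * yield_nll(nu_hat, sum_w)
              + yield_nll(nu_hat - step, sum_w)) / step**2
    return 1.0 / math.sqrt(second)

# Fitted yield of the first category (value from the log below)
sum_w = 116.92488
print(f"Hesse-style error: {hesse_error(sum_w):.3f}, sqrt(sum w): {math.sqrt(sum_w):.3f}")
```

Under that assumption the curvature gives an error of about sqrt(sum of weights), which is the size I was expecting from the fit.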

And the log file:

[smeriano@lxplus947 plot_handler]$ python3 scripts/multiple_mass.py > spyros.log
2025-01-29 14:55:12.016202: I tensorflow/core/util/port:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-01-29 14:55:12.043904: I tensorflow/core/platform/cpu_feature_guard:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/cvmfs/sft.cern.ch/lcg/views/LCG_105/x86_64-el9-gcc11-opt/lib/python3.9/site-packages/zfit/__init__.py:63: UserWarning: TensorFlow warnings are by default suppressed by zfit. In order to show them, set the environment variable ZFIT_DISABLE_TF_WARNINGS=0. In order to suppress the TensorFlow warnings AND this warning, set ZFIT_DISABLE_TF_WARNINGS=1.
  warnings.warn(
/cvmfs/sft.cern.ch/lcg/views/LCG_105/x86_64-el9-gcc11-opt/lib/python3.9/site-packages/zfit/minimizers/fitresult.py:1204: ChangedFeatureWarning: The behavior of this functionality recently changed.To turn this warning off, use `zfit.settings.changed_warnings.hesse_name = False`  or 'all' with `zfit.settings.changed_warnings.all = False
Default name of hesse (which is currently the method name such as `minuit_hesse`or `hesse_np`) has changed to `hesse` (it still adds the old one as well. This will be removed in the future). INSTRUCTIONS: to stay compatible,  change wherever you access the error to 'hesse' (if you don't explicitly specify the name in hesse(...).
  warn_changed_feature(message, "hesse_name")


Initialized figure...
Hesse Results for Each Category (Values and Errors):
Run2 LM1:
  Lambda (p0): -0.02822 +/- 0.00060
  Normalization (nbkg): 116.92488 +/- 0.00000


Run2 LM2:
  Lambda (p0): -0.02882 +/- 0.00075
  Normalization (nbkg): 72.96577 +/- 0.00000


Run2 LM3:
  Lambda (p0): -0.02940 +/- 0.00098
  Normalization (nbkg): 41.76252 +/- 0.00000


Run2 LM4:
  Lambda (p0): -0.03090 +/- 0.00166
  Normalization (nbkg): 15.38696 +/- 0.00000


Run2 HM1:
  Lambda (p0): -0.02446 +/- 0.00089
  Normalization (nbkg): 39.57303 +/- 0.00000


Run2 HM2:
  Lambda (p0): -0.02610 +/- 0.00149
  Normalization (nbkg): 15.15226 +/- 0.00000


Run2 HM3:
  Lambda (p0): -0.02979 +/- 0.00156
  Normalization (nbkg): 12.84923 +/- 0.00000


Run3 LM1:
  Lambda (p0): -0.02600 +/- 0.00110
  Normalization (nbkg): 34.61648 +/- 0.00000


Run3 LM2:
  Lambda (p0): -0.02600 +/- 0.00130
  Normalization (nbkg): 21.89008 +/- 0.00000


Run3 LM3:
  Lambda (p0): -0.02407 +/- 0.00174
  Normalization (nbkg): 11.90366 +/- 0.00000


Run3 LM4:
  Lambda (p0): -0.03439 +/- 0.00291
  Normalization (nbkg): 4.09738 +/- 0.00000


Run3 HM1:
  Lambda (p0): -0.02366 +/- 0.00143
  Normalization (nbkg): 14.39771 +/- 0.00000


Run3 HM2:
  Lambda (p0): -0.02545 +/- 0.00236
  Normalization (nbkg): 5.68299 +/- 0.00000


Run3 HM3:
  Lambda (p0): -0.02910 +/- 0.00251
  Normalization (nbkg): 4.63954 +/- 0.00000


FitResult of
<ExtendedUnbinnedNLL model=multiple data=multiple constraints=[]> 
with
<Minuit Minuit tol=0.001>

╒═════════╤═════════════╤══════════════════╤═══════╤══════════════════════════════════╕
│  valid  │  converged  │  param at limit  │  edm  │   approx. fmin (full | internal) │
╞═════════╪═════════════╪══════════════════╪═══════╪══════════════════════════════════╡
│  True   │    True     │      False       │ 8e-05 │              1646.14 | -5232.491 │
╘═════════╧═════════════╧══════════════════╧═══════╧══════════════════════════════════╛

Parameters
name               value  (rounded)        hesse    at limit
---------------  ------------------  -----------  ----------
norm_Run2_LM1               116.925  +/- 2.5e-07       False
lambda_Run2_LM1          -0.0282155  +/-  0.0006       False
norm_Run2_LM2               72.9658  +/-   2e-07       False
lambda_Run2_LM2          -0.0288249  +/- 0.00075       False
norm_Run2_LM3               41.7625  +/- 1.5e-07       False
lambda_Run2_LM3          -0.0293982  +/- 0.00098       False
norm_Run2_LM4                15.387  +/- 7.5e-08       False
lambda_Run2_LM4          -0.0309028  +/-  0.0017       False
norm_Run2_HM1                39.573  +/- 1.2e-07       False
lambda_Run2_HM1          -0.0244577  +/- 0.00089       False
norm_Run2_HM2               15.1523  +/-   9e-08       False
lambda_Run2_HM2             -0.0261  +/-  0.0015       False
norm_Run2_HM3               12.8492  +/- 8.3e-08       False
lambda_Run2_HM3          -0.0297945  +/-  0.0016       False
norm_Run3_LM1               34.6165  +/- 1.1e-07       False
lambda_Run3_LM1          -0.0259956  +/-  0.0011       False
norm_Run3_LM2               21.8901  +/- 1.1e-07       False
lambda_Run3_LM2          -0.0260004  +/-  0.0013       False
norm_Run3_LM3               11.9037  +/-   8e-08       False
lambda_Run3_LM3          -0.0240717  +/-  0.0017       False
norm_Run3_LM4               4.09738  +/- 4.7e-08       False
lambda_Run3_LM4          -0.0343914  +/-  0.0029       False
norm_Run3_HM1               14.3977  +/- 7.3e-08       False
lambda_Run3_HM1          -0.0236562  +/-  0.0014       False
norm_Run3_HM2                 5.683  +/- 4.6e-08       False
lambda_Run3_HM2           -0.025451  +/-  0.0024       False
norm_Run3_HM3               4.63954  +/- 4.1e-08       False
lambda_Run3_HM3          -0.0291017  +/-  0.0025       False

name               value  (rounded)        hesse    at limit
---------------  ------------------  -----------  ----------
norm_Run2_LM1               116.925  +/- 2.5e-07       False
lambda_Run2_LM1          -0.0282155  +/-  0.0006       False
norm_Run2_LM2               72.9658  +/-   2e-07       False
lambda_Run2_LM2          -0.0288249  +/- 0.00075       False
norm_Run2_LM3               41.7625  +/- 1.5e-07       False
lambda_Run2_LM3          -0.0293982  +/- 0.00098       False
norm_Run2_LM4                15.387  +/- 7.5e-08       False
lambda_Run2_LM4          -0.0309028  +/-  0.0017       False
norm_Run2_HM1                39.573  +/- 1.2e-07       False
lambda_Run2_HM1          -0.0244577  +/- 0.00089       False
norm_Run2_HM2               15.1523  +/-   9e-08       False
lambda_Run2_HM2             -0.0261  +/-  0.0015       False
norm_Run2_HM3               12.8492  +/- 8.3e-08       False
lambda_Run2_HM3          -0.0297945  +/-  0.0016       False
norm_Run3_LM1               34.6165  +/- 1.1e-07       False
lambda_Run3_LM1          -0.0259956  +/-  0.0011       False
norm_Run3_LM2               21.8901  +/- 1.1e-07       False
lambda_Run3_LM2          -0.0260004  +/-  0.0013       False
norm_Run3_LM3               11.9037  +/-   8e-08       False
lambda_Run3_LM3          -0.0240717  +/-  0.0017       False
norm_Run3_LM4               4.09738  +/- 4.7e-08       False
lambda_Run3_LM4          -0.0343914  +/-  0.0029       False
norm_Run3_HM1               14.3977  +/- 7.3e-08       False
lambda_Run3_HM1          -0.0236562  +/-  0.0014       False
norm_Run3_HM2                 5.683  +/- 4.6e-08       False
lambda_Run3_HM2           -0.025451  +/-  0.0024       False
norm_Run3_HM3               4.63954  +/- 4.1e-08       False
lambda_Run3_HM3          -0.0291017  +/-  0.0025       False
Successfully saved figure at outputs!
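
For a quick comparison, the snippet below takes three of the fitted yields and Hesse errors printed above and puts them next to sqrt(yield). The numbers are copied from the parameter table in the log; the dictionary and variable names are just for illustration.

```python
import math

# (fitted yield, Hesse error) copied from the parameter table in the log
fit_results = {
    "norm_Run2_LM1": (116.925, 2.5e-07),
    "norm_Run2_HM1": (39.573, 1.2e-07),
    "norm_Run3_HM3": (4.63954, 4.1e-08),
}

ratios = {}
for name, (value, hesse) in fit_results.items():
    expected = math.sqrt(value)  # naive sqrt(N) expectation
    ratios[name] = hesse / expected
    print(f"{name}: hesse = {hesse:.1e}, sqrt(N) = {expected:.2f}, ratio = {ratios[name]:.1e}")
```

In all three cases the ratio comes out at the 1e-8 level, so the reported errors are many orders of magnitude below the sqrt(N) expectation.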
