contur.factories package
contur.factories.test_observable module
- class contur.factories.test_observable.Observable(ana_obj, xsec, nev, sm=None)[source]
Processes and decorates an analysis object (AO) into a testable format
- Parameters:
ana_obj – AO to dress, containing signal info.
xsec – _XSEC scatter recording generator cross section in YODA file (contained in all Rivet run outputs)
nev – _EVTCOUNT scatter recording total generated events in YODA file (contained in all Rivet run outputs)
sm – Standard Model prediction for this observable
- property data_scale
Scale factor applied to the refdata histogram/scatter
- doPlot()[source]
Public member function to build yoda plot members for interactive runs
These are only for display, they are not used in any of the statistics calculations.
- get_sm_pval()[source]
Calculate the p-value compatibility (using the chi2 survival function) for the SM prediction and this measurement
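As a rough illustration of the chi2-survival p-value described for get_sm_pval(), the sketch below compares an SM prediction with a measurement bin by bin. It assumes uncorrelated uncertainties (a diagonal covariance), and the helper names chi2_sf and sm_pvalue are hypothetical. It is not Contur's implementation, which may use the full covariance matrix.

```python
import math

def chi2_sf(x, ndof):
    """Chi-square survival function P(X >= x) for ndof degrees of freedom.

    Uses the power series for the regularised lower incomplete gamma
    function, which is adequate for moderate chi2 values.
    """
    if x <= 0.0:
        return 1.0
    a, z = 0.5 * ndof, 0.5 * x
    term, total, n = 1.0, 1.0, 1
    while term > 1e-16 * total and n < 100000:
        term *= z / (a + n)
        total += term
        n += 1
    lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total
    return min(1.0, max(0.0, 1.0 - lower))

def sm_pvalue(measured, sm_prediction, uncertainties):
    """p-value of the SM prediction given the measurement (diagonal covariance assumed)."""
    chi2 = sum(((m - t) / e) ** 2
               for m, t, e in zip(measured, sm_prediction, uncertainties))
    return chi2_sf(chi2, len(measured))

# Three bins with invented numbers, purely for illustration
p = sm_pvalue([10.2, 8.1, 5.0], [10.0, 8.5, 4.6], [0.5, 0.4, 0.4])
```

A small p-value would indicate that the SM prediction describes the measurement poorly.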
- property has_theory
Bool representing if a theory prediction was found for the input signal
- property likelihood
The Likelihood instance derived from this histogram
- property ref
Reference data, observed numbers input to test, scaled if required
- property refplot
Reference data for plotting
- property scaled
Bool representing if there is additional scaling applied on top of luminosity
- property signal_scale
Scale factor applied to the signal histogram/scatter, derived generally from input nEv and xs
- property sigplot
Signal for plotting
- property stack_databg
Stacked, unscaled Signal+background for plotting (data as background)
- property stack_smbg
Stacked, unscaled Signal+background for plotting (SM as background)
- property thy
Reference SM theory data, scaled if required
- property thyplot
Theory for plotting
- class contur.factories.test_observable.ObservableValues(bin_widths=None, central_values=None, err_breakdown=None, covariance_matrix=None, diagonal_matrix=None, isref=False)[source]
A book-keeping class to contain all the numerical info (central values, err_breakdown, covariance) for a given binned observable.
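To make the relation between err_breakdown, covariance_matrix and diagonal_matrix concrete, here is a minimal sketch of one common construction. It assumes the breakdown is a mapping from uncertainty source name to a list of per-bin errors, and that each source is fully correlated between bins. Both are assumptions for illustration, as are the function names; the actual layout used by ObservableValues may differ.

```python
def covariance_from_breakdown(err_breakdown):
    """Sum the outer product of each source's per-bin errors (sources fully correlated across bins)."""
    sources = list(err_breakdown.values())
    nbins = len(sources[0])
    cov = [[0.0] * nbins for _ in range(nbins)]
    for errs in sources:
        for i in range(nbins):
            for j in range(nbins):
                cov[i][j] += errs[i] * errs[j]
    return cov

def diagonal_from_breakdown(err_breakdown):
    """Per-bin variances only: every source treated as uncorrelated between bins."""
    sources = list(err_breakdown.values())
    nbins = len(sources[0])
    return [sum(errs[i] ** 2 for errs in sources) for i in range(nbins)]

# Two bins, two invented uncertainty sources
breakdown = {"jes": [1.0, 2.0], "lumi": [0.5, 0.5]}
cov = covariance_from_breakdown(breakdown)
diag = diagonal_from_breakdown(breakdown)
```

The diagonal of the covariance matrix equals the list of per-bin variances; the off-diagonal terms carry the assumed bin-to-bin correlations.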
contur.factories.likelihood module
This module contains the implementation of the likelihood calculation, and various functions to manipulate test statistics.
Abstracted from the underlying YODA objects, this module defines two ways to construct likelihood functions from numerical types:
Likelihood – The base likelihood building blocks, representing the information extracted from an underlying histogram of potentially correlated observables.
CombinedLikelihood – A shell to combine Likelihood blocks into a full likelihood; it automatically encodes the assumption that the included blocks are uncorrelated.
- class contur.factories.likelihood.CombinedLikelihood(stat_type='all')[source]
Shell to combine Likelihood blocks
This class is used to extract the relevant test statistic from each individual Likelihood and combine them. This is initialised with no arguments as it is just a shell to combine the individual components, and automatically encodes the fact that the blocks are uncorrelated with each other.
Two use cases:
1. Combining subpool likelihoods, where statistics are combined for all stat types.
2. Building the full likelihood, which is done separately for each stat type. This is because different histograms can provide the best exclusion in a pool depending on the stat type.
Technically this could be constructed by building a Likelihood with a master covariance matrix formed from block diagonals of each individual component. Avoiding this is faster but less rigorous.
- add_likelihood(likelihood)[source]
Add a Likelihood block to this combination likelihood
- Parameters:
likelihood – Instance of computed Likelihood
- calc_cls()[source]
Call the calculation of the CLs confidence interval
Triggers the parent class calculation of the CLs interval based on the sum of test statistics added with the add_likelihood() method
- combine_spey_models()[source]
Combines a list of spey models into a single one. Assumes models are statistically uncorrelated
- get_mu_hat(stat_type)[source]
Maximum likelihood estimator of the signal strength parameter.
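The CombinedLikelihood description notes that summing per-block test statistics corresponds to a master covariance matrix built from block diagonals. The sketch below checks this numerically for a Gaussian chi-square, using invented residuals and covariances. All function names are hypothetical and the linear solver is a plain Gauss-Jordan elimination written out to keep the example free of dependencies.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def chi2(delta, cov):
    """Gaussian test statistic delta^T cov^-1 delta."""
    return sum(d * x for d, x in zip(delta, solve(cov, delta)))

def block_diagonal(blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    n = sum(len(b) for b in blocks)
    full = [[0.0] * n for _ in range(n)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                full[offset + i][offset + j] = value
        offset += len(block)
    return full

# Two uncorrelated blocks: residuals (observed minus predicted) and covariances
delta_a, cov_a = [1.0, 2.0], [[4.0, 1.0], [1.0, 9.0]]
delta_b, cov_b = [0.5], [[0.25]]

summed = chi2(delta_a, cov_a) + chi2(delta_b, cov_b)
combined = chi2(delta_a + delta_b, block_diagonal([cov_a, cov_b]))
```

Both routes give the same value; summing per block simply avoids building and inverting the larger matrix.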
- class contur.factories.likelihood.Likelihood(calculate=False, ratio=False, profile=False, lumi=1.0, lumi_fb=1.0, sxsec=None, bxsec=None, tags='', sm_values=None, measured_values=None, bsm_values=None, expected_values=None)[source]
Fundamental likelihood-block class and confidence-interval calculator
This class defines the structure of a series of observables to be constructed into a hypothesis test
- Keyword Arguments:
calculate – Perform the statistics calculation (otherwise the input variables are just set up)
ratio – Flag if data is derived from a ratio measurement (not general! caveat emptor!)
profile – Flag if data is derived from a profile histogram measurement
lumi – the integrated luminosity in the units of the measurement. Used to calculate expected stat uncertainty on signal
lumi_fb – the integrated luminosity for the measurement in fb. Used to scale for HL-LHC
sxsec – Signal cross-section in picobarns, if non-null
tags – names of the histograms this is associated with
sm_values – All the numbers for the SM prediction
measured_values – All the numbers for the measurement
bsm_values – All the numbers for the signal
expected_values – The SM prediction with data uncertainties.
- build_spey_models()[source]
Function to build the Spey statistical models for hypothesis testing. For each stat type, the model constructed is a multivariate Gaussian with one parameter, the signal strength.
- calculate(stat_type)[source]
Default mode: Calculates the CLs exclusion for this histogram (and this stat type)
Spey mode: Calculates several statistics using the spey statistical models, namely: the CLs exclusion; the 95% CLs upper limit on the signal strength parameter mu; and the maximum likelihood estimator of the signal strength parameter, muhat
- calculate_max_mu(stat_type)[source]
Calculate the 95% CLs upper limit on the signal strength parameter mu.
- calculate_mu_hat(stat_type)[source]
Calculate maximum likelihood estimator of the signal strength parameter, mu_hat.
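For a Gaussian model with a single signal strength parameter, the maximum likelihood estimator has a closed form. The sketch below shows it for the simplified case of uncorrelated bins, where the expectation in each bin is background + mu * signal. The function name and the numbers are illustrative assumptions; Contur delegates this calculation to its statistical models rather than using this formula directly.

```python
def mu_hat_gaussian(observed, background, signal, sigma):
    """Closed-form MLE of mu for a Gaussian likelihood with diagonal covariance.

    Minimises sum_i ((n_i - b_i - mu * s_i) / sigma_i)^2 with respect to mu.
    """
    numerator = sum(s * (n - b) / e ** 2
                    for n, b, s, e in zip(observed, background, signal, sigma))
    denominator = sum(s * s / e ** 2 for s, e in zip(signal, sigma))
    return numerator / denominator

background = [100.0, 50.0, 20.0]
signal = [10.0, 20.0, 5.0]
sigma = [10.0, 7.0, 4.5]
# Pseudo-data generated with half of the nominal signal
observed = [b + 0.5 * s for b, s in zip(background, signal)]

mu = mu_hat_gaussian(observed, background, signal, sigma)
```

With pseudo-data built from half the nominal signal, the estimator recovers mu = 0.5 up to rounding.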
- cleanup_model_list()[source]
Delete the single-bin models. This can be done once the bin with the highest exclusion power has been found.
- find_dominant_bin(stat_type)[source]
Function to find the bin that gives the highest CLs for cases with no covariance matrix (either the matrix has no inverse or has not been successfully built)
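When no usable covariance matrix exists, only one bin is used, so the selection reduces to picking the bin with the largest single-bin test statistic. The sketch below assumes the data agree with the background, so that each bin's test statistic is (signal / uncertainty)^2 and a larger value means a stronger exclusion. The name dominant_bin and this simplified test statistic are assumptions made for illustration.

```python
def dominant_bin(signal, sigma):
    """Return (index, test statistic) of the bin with the largest single-bin test statistic."""
    stats = [(s / e) ** 2 for s, e in zip(signal, sigma)]
    best = max(range(len(stats)), key=lambda i: stats[i])
    return best, stats[best]

# Invented per-bin signal yields and total uncertainties
index, ts = dominant_bin([1.0, 4.0, 2.0], [1.0, 2.0, 0.5])
```

Here the third bin dominates even though its signal yield is not the largest, because its uncertainty is the smallest.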
- get_mu_hat(type)[source]
Maximum likelihood estimator of the signal strength parameter.
- get_mu_upper_limit(type)[source]
Upper limit on the signal strength parameter mu at 95% CLs
- get_sm_pval()[source]
Calculate the p-value compatibility (using the chi2 survival function) for the SM prediction and this measurement
- property pools
Pool that the test belongs to
settable parameter
- spey_calculate_CLs(stat_type)[source]
Use the statistical model to calculate the CLs exclusion. This is calculated from the profile likelihood ratio
- property subpools
Subpool the test belongs to
settable parameter
- property tags
Name(s) of source histograms for this block
settable parameter
- contur.factories.likelihood.build_full_likelihood(sorted_blocks, stat_type)[source]
Function to build the full likelihood representing an entire YODA file
This function takes the sorted likelihood blocks and combines them as statistically uncorrelated diagonal contributions to a CombinedLikelihood instance, which is stored as an attribute to this class as likelihood
- Keyword Arguments:
stat_type – Stat type to build full likelihood for
- contur.factories.likelihood.combine_subpool_likelihoods(likelihood_blocks)[source]
build combined likelihoods for any active subpools, and add them to the list of likelihood blocks.
- contur.factories.likelihood.likelihood_blocks_find_dominant_ts(likelihood_blocks, stat_type)[source]
Function that finds the chi-square test statistic that gives the maximum confidence level for each likelihood block for which we don’t have a valid covariance matrix (either the matrix has no inverse or has not been successfully built)
- contur.factories.likelihood.likelihood_blocks_ts_to_cls(likelihood_blocks, stat_type)[source]
Function that calculates the confidence level for each likelihood block extracted from the YODA file using the signal and background test statistic for the block
- contur.factories.likelihood.pval_to_cls(pval_tuple)[source]
Function to calculate a cls when passed background and signal p values.
notes: we are not actually varying a parameter of interest (mu), just checking mu=0 vs mu=1
the tail we integrate to get a p-value depends on whether you’re looking for signal-like or background-like tails. For the signal-like p-value we integrate over all the probability density less signal-like than was observed, i.e. to the right of the observed test stat.
For the background-like p-value we should integrate over the less background-like stuff, i.e. from -infty to t_obs… which is 1 - the t-obs…infty integral.
So CLs is the ratio of the two right-going integrals, which is nice and simple and symmetric, but looks asymmetric when written in terms of the p-values because they contain complementary definitions of the integral limits
The code has implemented them both as right-going integrals, so does look symmetric, hence this comment to hopefully avoid future confusion.
- Parameters:
pval_tuple (Tuple of floats) – Tuple, first element p-value of signal hypothesis, second p-value of background
- Returns:
Confidence Interval in CLs formalism
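The notes above describe CLs as the ratio of two right-going tail integrals. A minimal sketch of that ratio follows. It assumes the returned confidence is reported as 1 - CLs and clipped at zero, and it guards against a vanishing background p-value. Both conventions, and the function name, are assumptions for illustration rather than a statement of what pval_to_cls does internally.

```python
def cls_exclusion(p_signal, p_background):
    """Exclusion confidence 1 - CLs from two right-tail p-values."""
    if p_background <= 0.0:
        # No sensitivity can be claimed if the background tail is empty
        return 0.0
    cls = p_signal / p_background
    return max(0.0, 1.0 - cls)

strong = cls_exclusion(0.01, 0.5)   # signal hypothesis strongly disfavoured
none = cls_exclusion(0.5, 0.5)      # signal and background equally compatible
```

Dividing by the background p-value is what protects against excluding a signal to which the measurement has no sensitivity.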
- contur.factories.likelihood.sort_blocks(likelihood_blocks, stat_type, omitted_pools='')[source]
Function that sorts the list of likelihood blocks extracted from the YODA file
This function implements the sorting algorithm to sort the list of all extracted Likelihood blocks in the likelihood_blocks list, storing the reduced list in the sorted_blocks list
- Keyword Arguments:
stat_type – Which statistic (default, smbg, expected, hlexpected) to sort on.
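One way to picture the reduction from likelihood_blocks to sorted_blocks is to keep, for every pool, only the block with the highest exclusion, and to skip omitted pools. The sketch below does this on plain dictionaries. The dictionary keys, the function name and the use of a tuple for omitted pools are hypothetical stand-ins, and the real algorithm may apply further criteria.

```python
def best_block_per_pool(blocks, omitted_pools=()):
    """Keep the highest-CLs block in each pool, skipping omitted pools."""
    best = {}
    for block in blocks:
        pool = block["pool"]
        if pool in omitted_pools:
            continue
        if pool not in best or block["cls"] > best[pool]["cls"]:
            best[pool] = block
    # Order the survivors from most to least sensitive
    return sorted(best.values(), key=lambda b: b["cls"], reverse=True)

# Invented pool names, tags and exclusion values
blocks = [
    {"pool": "POOL_A", "tag": "histo1", "cls": 0.60},
    {"pool": "POOL_A", "tag": "histo2", "cls": 0.85},
    {"pool": "POOL_B", "tag": "histo3", "cls": 0.95},
    {"pool": "POOL_C", "tag": "histo4", "cls": 0.99},
]
reduced = best_block_per_pool(blocks, omitted_pools=("POOL_C",))
```

Keeping one block per pool avoids double counting measurements that may be correlated with each other.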
- contur.factories.likelihood.ts_to_cls(ts_tuple_list, tags)[source]
Method to directly cast a list of tuples of test statistics (tuple contains background and signal test stats) into a list of CLs values
notes: we are not actually varying a parameter of interest (mu), just checking mu=0 vs mu=1
the tail we integrate to get a p-value depends on whether you’re looking for signal-like or background-like tails. For the signal-like p-value we integrate over all the probability density less signal-like than was observed, i.e. to the right of the observed test stat.
For the background-like p-value we should integrate over the less background-like stuff, i.e. from -infty to t_obs… which is 1 - the t-obs…infty integral.
So CLs is the ratio of the two right-going integrals, which is nice and simple and symmetric, but looks asymmetric when written in terms of the p-values because they contain complementary definitions of the integral limits
The code has implemented them both as right-going integrals, so does look symmetric, hence this comment to hopefully avoid future confusion.
- Parameters:
ts_tuple_list – list of tuples of test statistics (tuples of the form (test stat background, test stat signal))
- Returns:
List of Confidence Intervals in CLs formalism
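The conversion from chi-square-like test statistics to CLs values can be sketched as follows. It assumes each test statistic is turned into a right-tail p-value with the Gaussian survival function evaluated at its square root, and that the result is reported as 1 - CLs. These conventions and the function names are assumptions for illustration only.

```python
import math

def gaussian_sf(x):
    """Upper-tail probability of a standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ts_to_cls_sketch(ts_tuple_list):
    """Map (ts_background, ts_signal) tuples to exclusion values 1 - CLs."""
    results = []
    for ts_b, ts_sb in ts_tuple_list:
        p_b = gaussian_sf(math.sqrt(max(ts_b, 0.0)))
        p_sb = gaussian_sf(math.sqrt(max(ts_sb, 0.0)))
        # Both p-values are right-going integrals, so CLs is their plain ratio
        results.append(max(0.0, 1.0 - p_sb / p_b) if p_b > 0.0 else 0.0)
    return results

exclusions = ts_to_cls_sketch([(0.0, 4.0), (1.0, 1.0)])
```

The first tuple corresponds to a two-sigma signal test statistic with data matching the background, giving an exclusion of about 95%; the second has equal test statistics and therefore no exclusion.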
contur.factories.likelihood_point module
- class contur.factories.likelihood_point.LikelihoodPoint(paramPoint={}, yodaFactory=None)[source]
Stores the statistical information about a model parameter point in a run. This information can then be manipulated to sort the likelihood blocks and to calculate the full likelihood result, exclusion result, test b result and test s+b result for the related stat_type, together with a parameter point dictionary.
If instantiated with a valid parameter dictionary, this will be added as a property. If instantiated with a valid YodaFactory, its likelihood blocks will be associated with this likelihood point.
If these are not provided, a blank point will be created which can be populated later (e.g. from a results database)
Note that in general those likelihood blocks (ie the lists of likelihood objects) will not be present, since a result database does not store them. The statistics info can be retrieved from the relevant dictionaries, but not recalculated from scratch since this signal/background info won’t be available.
- get_dominant_analysis(stat_type, poolid=None, cls_cut=0.0)[source]
return the analysis object which has the biggest exclusion for this point.
- get_full_likelihood(stat_type=None)[source]
The full likelihood representing the result file in its entirety.
If stat_type is specified, return the entry for it. Else return the dict of all of them.
- get_sorted_likelihood_blocks(stat_type=None)[source]
The list of reduced component likelihood blocks extracted from the result file, sorted according to the test statistic of type stat_type. If stat_type is None, return the whole dictionary.
- property likelihood_blocks
The list of all component likelihood blocks extracted from the result file
This attribute is the total information in the result file, but does not account for potential correlation/overlap between the members of the list
- recalculate_CLs(stat_type, omitted_pools='')[source]
Recalculate the combined exclusion after excluding the omitted pool in the class
- Parameters:
omitted_pools – string, the name of the pool to ignore
- resort_blocks(stat_type, omitted_pools='')[source]
Function to sort the list of likelihood blocks. Used exclusively for resorting after a merge.
- Keyword Arguments:
stat_type (string) – which statistic type (default, SM background, expected or hlexpected) is being sorted by.
- store_param_point(paramPoint)[source]
- Parameters:
paramPoint – key: string, param name; value: float
- store_point_info(statType, combinedExclusion, poolExclusion, poolHistos, poolTestb, poolTestsb, obs_excl_dict, yoda_files)[source]
- Parameters:
statType – string, represents the point type
combinedExclusion – full likelihood for a parameter point
poolExclusion – key: string, pool name; value: double
poolHistos – key: string, pool name; value: string
poolTestb – key: string, pool name; value: double
poolTestsb – key: string, pool name; value: double
contur.factories.depot module
The Depot module contains the Depot class. This is intended to be the high level analysis control, most user access methods should be implemented at this level
- class contur.factories.depot.Depot[source]
Parent analysis class to initialise
This can be initialised as a blank canvas; the desired workflow is then to add parameter space points to the Depot using the add_point() method. This appends each considered point to the object's internal points. To get points from a database into the Depot, use the add_points_from_db() method.
Path for writing out objects is determined by cfg.plot_dir
- add_point(yodafile, param_dict)[source]
Add yoda file and the corresponding parameter point into the depot
- add_points_from_db(file_path, runpoint=None)[source]
Get the info of model points from the result database into the depot class
@TODO write a “get from DB” method for likelihood_point?
- property frame
representing the CLs interval for each point in points
- merge(depot)[source]
Function to merge this conturDepot instance with another.
Points with identical parameters will be combined. If a point from the input Depot is not present in this Depot, it will be added.
- Parameters:
depot – Additional conturDepot instance to merge with this one
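The merge rule stated above (combine points with identical parameters, add the rest) can be pictured with a toy model in which each point is a pair of a parameter dictionary and a list of likelihood blocks. This representation and the function name are hypothetical; the sketch only illustrates the rule and does not mirror the Depot internals.

```python
def merge_points(points, incoming):
    """Merge two lists of (param_dict, blocks) pairs, combining identical parameter points."""
    merged = {tuple(sorted(params.items())): list(blocks)
              for params, blocks in points}
    for params, blocks in incoming:
        key = tuple(sorted(params.items()))
        # Existing point: extend its blocks. New point: start a fresh entry.
        merged.setdefault(key, []).extend(blocks)
    return [(dict(key), blocks) for key, blocks in merged.items()]

# Invented parameter names and block labels
mine = [({"mX": 100.0, "g": 0.1}, ["blockA"])]
theirs = [({"g": 0.1, "mX": 100.0}, ["blockB"]),
          ({"mX": 200.0, "g": 0.1}, ["blockC"])]
result = merge_points(mine, theirs)
```

After a merge of this kind the combined blocks of a point are no longer sorted, which is why a resorting step is needed afterwards.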
- property points
The master list of LikelihoodPoint instances added to the Depot instance
- resort_points()[source]
Function to trigger rerunning of the sorting algorithm on all items in the depot, typically if this list has been affected by a merge by a call to merge()
- write(outDir, args, yodafile=None, include_dominant_pools=True, include_per_pool_cls=True)[source]
Function to write depot information to disk
Writes a results db file to outDir. If cfg.csvfile is not None, also writes out a csv file containing the data.
- Parameters:
outDir – String of filesystem location to write out the pickle of this instance to
contur.factories.yoda_factories module
The yoda_factories module contains three main components in the middle of the data flow, sitting between the high level steering in the contur.factories.Depot class and the lower level statistics in the contur.factories.Likelihood class
- class contur.factories.yoda_factories.YodaFactory(yodaFilePath)[source]
Class controlling Contur's YODA file processing ability
This class is initialised from an os path to a YODA file and dresses it by iterating through each ao and wrapping that in an instance of Observable, which encapsulates a YODA analysis object and derives the required Likelihood block for it. This class then contains the aggregated information for all of these instances across the entire YODA file.
- Parameters:
yodaFilePath – Valid os.path filesystem YODA file location
- contur.factories.yoda_factories.load_bg_data(path, sm_id='A', smtest=False)[source]
Load the background (THY) and data (REF) for all the observables associated with this rivet analysis. If smtest, read all SM predictions; if not, only read the default/selected one.
- contur.factories.yoda_factories.load_ref_ao(path, orig_ao, aos)[source]
Load the ao, with the path=path, into memory, as THY or REF object
- contur.factories.yoda_factories.load_ref_aos(f, analysis)[source]
Load the relevant analysis objects (REF or THY) from the file f.
- contur.factories.yoda_factories.load_sm_ao(path, orig_ao, sm)[source]
Load the ao, with the path=path, into memory, as THY or REF object
- contur.factories.yoda_factories.load_sm_aos(sm)[source]
Load the relevant analysis objects (REF or THY) from the file f.
- contur.factories.yoda_factories.root_n_errors(ao, is_evcount, nx=0.0, lumi=1.0, replace=False)[source]
Function to include root(number of expected events) errors in the uncertainties of a 2D scatter.
The uncertainty is based on the expected events for the relevant integrated luminosity. This is not about MC statistics!
The minimum uncertainty is one event… we are not doing proper low-stat treatment in tails, so this is a conservative fudge.
- Parameters:
ao – The analysis object to be manipulated.
nx – factor needed to convert to number of events for non-uniform bin widths (<0, not used, ==0, do nothing).
is_evcount – True if the plot is in event numbers. Otherwise assumed to be a differential cross section.
lumi – Integrated luminosity used to get event counts from differential cross sections
replace – If True replace the uncertainties. If False (default) add them in quadrature.
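The root-N treatment described above can be sketched for a single bin as follows. The sketch assumes a differential cross section is converted to an event count by multiplying by the bin width and the integrated luminosity, and it leaves out the nx handling entirely. The function name and argument layout are illustrative and do not reproduce the root_n_errors signature.

```python
import math

def root_n_error(value, existing_err, lumi=1.0, bin_width=1.0,
                 is_evcount=False, replace=False):
    """Return the bin uncertainty after including a sqrt(N) expected-events term."""
    # Conversion factor between the plotted quantity and an event count
    to_events = 1.0 if is_evcount else lumi * bin_width
    n_events = value * to_events
    # The minimum uncertainty is one event (conservative low-stat fudge)
    err_events = max(math.sqrt(max(n_events, 0.0)), 1.0)
    stat_err = err_events / to_events
    if replace:
        return stat_err
    return math.hypot(existing_err, stat_err)

# 2.0 pb/GeV in a 10 GeV bin at 5 pb^-1 corresponds to 100 expected events
combined = root_n_error(2.0, 0.15, lumi=5.0, bin_width=10.0)
# A sparsely populated bin falls back to the one-event floor
floor = root_n_error(0.001, 0.0, lumi=5.0, bin_width=10.0, replace=True)
```

In the first call the statistical term is 0.2, which added in quadrature to the existing 0.15 gives 0.25. In the second call fewer than one event is expected, so the one-event floor sets the uncertainty.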