Code an indicator

Edit the settings in src/indicators/mod.rs

The configurations for an indicator are represented as a field named after the indicator (R999) on the Settings struct, defined in src/indicators/mod.rs.

pub struct Settings {
    // prepare command.
    pub codelists: Option<HashMap<Codelist, HashMap<String, String>>>,
    pub defaults: Option<Defaults>,
    pub redactions: Option<Redactions>,
    pub corrections: Option<Corrections>,
    pub modifications: Option<Modifications>,
    // indicators command.
    pub currency: Option<String>,
    pub no_price_comparison_procurement_methods: Option<String>,
    pub price_comparison_procurement_methods: Option<String>,
    pub exclusions: Option<Exclusions>,
    pub R003: Option<R003>,
    pub R018: Option<R018>,
    pub R024: Option<FloatThreshold>, // ratio
    pub R025: Option<R025>,
    pub R028: Option<Empty>,
    pub R030: Option<Empty>,
    pub R035: Option<IntegerThreshold>, // count
    pub R036: Option<Empty>,
    pub R038: Option<R038>,
    pub R048: Option<R048>,
    pub R058: Option<FloatThreshold>, // ratio
}

In Cardinal, all configurations are optional. So, the field must be an Option<T>, and the fields on the struct that the Option contains (T) must also be optional.

If the indicator’s only configuration is a threshold (integer or decimal), then the IntegerThreshold or FloatThreshold struct can be used, shown below for easy reference.

pub struct IntegerThreshold {
    threshold: Option<usize>,
}
pub struct FloatThreshold {
    threshold: Option<f64>,
}

If the indicator has no configuration, the Empty struct can be used, which has no fields.

pub struct Empty {}

Otherwise, create a new struct named after the indicator. For example:

#[derive(Clone, Debug, Default, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct R999 {
    my_integer: Option<usize>,
    my_decimal: Option<f64>,
    my_text: Option<String>,
}

The #[serde(deny_unknown_fields)] attribute causes an error if the user sets an unknown property.

Example

R999’s methodology is “A competition completed with few submitted bids.” You will edit the settings to allow users to configure the number of submitted bids (the “threshold”) that raises the red flag.

In src/indicators/mod.rs, the Settings struct already has a field for the indicator from Add boilerplate content:

    pub R999: Option<Empty>,

As is, no configuration is allowed. Cardinal attempts to parse any properties in the [R999] section of the INI file into the Empty struct. Because the struct has no fields, no properties are parsed, and the user sees an error as feedback.

The number of submitted bids can be represented as an integer. To parse a property with the name threshold and an integer value, you can reuse the IntegerThreshold struct:

    pub R999: Option<IntegerThreshold>,

Users can now configure R999’s threshold, using the Settings file. For example:

[R999]
threshold = 2

Try it!

Follow the example, create a settings.ini file with the content above, and run:

echo '{}' | cargo run -- indicators --settings settings.ini -

The output should be {}, with no errors about unknown fields!

Write the module

Open the new module (src/indicators/r999.rs, in this example) in a text editor.

An indicator is an implementation of the Calculate trait on a struct (R999, in this example).

#[derive(Default)]
pub struct R999 {
}

impl Calculate for R999 {

Note that items (like structs) are scoped by their module. In other words, an R999 struct in mod.rs for the indicator’s configuration has no relation with the R999 struct in r999.rs for its internal state.

Hint

Comparing Rust to other languages, structs are like objects, and traits are like interfaces. Structs have data (“fields”), and impl blocks provide a struct’s methods. Like Python, items are scoped by module and are imported (use).

The Calculate trait declares four methods, which are defined in the impl block:

impl Calculate for R999 {
    fn new(settings: &mut Settings) -> Self {
        Self::default()
    }

    fn fold(&self, item: &mut Indicators, release: &Map<String, Value>, ocid: &str) {
    }

    fn reduce(&self, item: &mut Indicators, other: &mut Indicators) {
    }

    fn finalize(&self, item: &mut Indicators) {
    }
}

Edit the new method

If the indicator is not configurable, then the new method and the struct (R999) can be left as-is.

If the indicator is configurable, then the new method reads the settings arguments and returns an instance of the struct (the capitalized Self token refers to the struct).

Hint

To avoid unnecessary memory allocation, you can std::mem::take() the Settings field named after the indicator. Indicators should not use other indicators’ settings.

Example

R999’s methodology is “A competition completed with few submitted bids,” with the default for “few” being 1 bid.

So far, you added the R999 field to the Settings struct in src/indicators/mod.rs.

You can now move the field’s value into the R999 struct in the new module, src/indicators/r999.rs.

  1. Add a corresponding field to the R999 struct. All configurations are optional (in this case, Option<usize>), but the methodology is to set a default of 1. So, we can make the field non-optional on this struct:

    #[derive(Default)]
    pub struct R999 {
        threshold: usize,
    }
    

    If the field’s default value couldn’t be set at initialization, you would make it optional: for example, if the default value depended on order statistics, like quartiles.

    #[derive(Default)]
    pub struct R999 {
        threshold: Option<usize>,
    }
    
  2. Move the value from the Settings struct into the R999 struct:

        fn new(settings: &mut Settings) -> Self {
            Self {
                threshold: std::mem::take(&mut settings.R999).unwrap_or_default().threshold.unwrap_or(1),
            }
        }
    

    This incantation requires understanding the Option type, the Default trait and the std::mem::take() function. In short, the R999 struct’s threshold field is set to the configured value if set and the default value (1), otherwise.

    If the field’s default value couldn’t be set at initialization, you would omit the unwrap_or(1):

        fn new(settings: &mut Settings) -> Self {
            Self {
                threshold: std::mem::take(&mut settings.R999).unwrap_or_default().threshold,
            }
        }
    

Try it!

If you run the command again, the output should still be {}:

echo '{}' | cargo run -- indicators --settings settings.ini -

How data is prepared

As described in the overall workflow, data is prepared before it is processed. This avoids complicating the indicator calculations with many exceptions and edge cases.

Also, as described in the prepare workflow, the prepare command should only warn about quality issues that it can fix and that interfere with the indicator calculations.

With that in mind, while you implement the indicator, think about whether:

  • An existing configuration of the prepare command should be edited to include additional fields.

    For example, at the time of writing, the currency property of the defaults section only applies to /bids/details[]/value/currency, because no indicator uses other currency fields yet.

  • A new configuration should be added, to address a quality issue you encountered.

Create an issue on GitHub to request any changes to the prepare command.

How data is processed

Processing is divided into 3 steps: fold, reduce, and finalize. A trait method corresponds to each step.

Each method accepts an item argument, whose type is Indicators (named after the command).

The Indicators struct has a results field for the final results, and other fields – whose names are prefixed by indicator codes – for intermediate results:

pub struct Indicators {
    pub results: IndexMap<Group, IndexMap<String, HashMap<Indicator, f64>>>,
    pub meta: HashMap<Indicator, RoundMap>,
    pub maps: Maps,
    pub currency: Option<String>,
    /// The percentage difference between the winning bid and the second-lowest valid bid for each `ocid`.
    pub second_lowest_bid_ratios: HashMap<String, f64>,
    pub winner_and_lowest_non_winner: HashMap<String, [String; 2]>,
    /// The ratio of winning bids to submitted bids for each `bids/details/tenderers/id`.
    pub r025_tenderer: HashMap<String, Fraction>,
    /// The ratio of disqualified bids to submitted bids for each `buyer/id`.
    pub r038_buyer: HashMap<String, Fraction>,
    /// The ratio of disqualified bids to submitted bids for each `tender/procuringEntity/id`.
    pub r038_procuring_entity: HashMap<String, Fraction>,
    /// The ratio of disqualified bids to submitted bids for each `bids/details/tenderers/id`.
    pub r038_tenderer: HashMap<String, Fraction>,
    /// The item classifications for each `bids/details/tenderers/id`.
    pub r048_classifications: HashMap<String, (usize, HashSet<String>)>,
    /// Whether to map contracting processes to organizations.
    pub map: bool,
}

Cardinal processes compiled releases concurrently. The responsibilities of the 3 methods are:

Fold

Operate on a single compiled release (its release argument), and write either final results or intermediate results.

Reduce

Combine the intermediate results from the fold step (if any) into one Indicators instance. The other argument represents the instance that is to be combined.

Finalize

Use the intermediate results to write final results.

Use the set_result! macro to write final results. It accepts an item, group (OCID, Tenderer, Buyer, or ProcuringEntity), identifier, indicator code, and result as a decimal (f64). For example:

set_result!(item, OCID, ocid, R999, 1.0);

Or:

set_result!(item, Buyer, id, R999, 1.0);

Hint

If you remember, the indicator code was added as a variant to the Indicator enum in Add boilerplate content.

Note

Implementing an indicator often raises questions about its methodology. In general, try to implement it such that its result is stable. In other words, new data can cause a red flag to be raised, but shouldn’t cause it to be lowered. This typically means waiting for all relevant data to be available. For example, an indicator about the number of submitted bids should wait for all awards to be complete.

fold method

Final results

If the methodology considers compiled releases in isolation, the final results can be written by the fold method. In this case, the reduce and finalize methods can be deleted.

At this point, you need to know Rust, but you can study other indicators and adapt their code.

Example

R999’s methodology is “A competition completed with few submitted bids.” Comments are provided to ease reading.

    fn fold(&self, item: &mut Indicators, release: &Map<String, Value>, ocid: &str) {
        // A competition is complete if an award is complete.

        // This verbose condition is a typical way to traverse JSON.
        if let Some(Value::Array(awards)) = release.get("awards")
            // There are one or more complete awards.
            && awards.iter().any(
                // An award is complete if its status is "active".
                |award| award.get("status").map_or(false, |status| status.as_str() == Some("active"))
            )
        {
            // The Indicators struct has methods for common operations.
            let bids = Indicators::get_submitted_bids(release).len();
            // Thresholds are typically interpreted as inclusive (<= or >=).
            if bids <= self.threshold {
                // The indicator's value is the number of submitted bids.
                set_result!(item, OCID, ocid, R999, bids as f64);
            }
        }
    }

Try it!

If you run:

echo '{"ocid":"F","bids":{"details":[{"status":"valid"}]},"awards":[{"status":"active"}]}' | cargo run -- indicators --settings settings.ini -

The compiled release should be flagged by the R999 indicator!

{"OCID":{"F":{"R999":1.0}}}

Intermediate results

If the methodology considers compiled releases in aggregate – for example, it uses order statistics to identify outliers – then the fold method writes intermediate results to new field(s) on the Indicators struct. For example:

    /// The documentation for the field.
    pub r999_variable_name: HashMap<String, Fraction>,

To do

If you need guidance on this step, create an issue on GitHub.

reduce method

To do

If you need guidance on this step, create an issue on GitHub.

If the indicator considers and flags a subset of tenderers, buyers, or procuring entities, set item.maps. See r038.rs, for example.

finalize method

To do

If you need guidance on this step, create an issue on GitHub.

Update the init command

  • In src/lib.rs, edit the multiline string at the top of the init function to include a section for the new indicator, and any configurations as comments.

  • In docs/cli/init.md, edit the command’s output at the bottom of the file to match the multiline string.

Next step

Now, you can write the tests.