FrameCheck Methods

These methods allow you to declaratively define validation rules for pandas DataFrames. Each method returns the FrameCheck instance to allow method chaining.

Basic DataFrame Checks

framecheck.FrameCheck.empty(self)

Add a check to ensure the DataFrame is empty.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

framecheck.FrameCheck.not_empty(self)

Add a check to ensure the DataFrame is not empty.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

framecheck.FrameCheck.not_null(self, columns=None, warn_only=False)

Add a check to ensure specified columns have no null (NaN) values.

Parameters:
  • columns (list of str, optional) – Column names to check for null values. If None, all columns will be checked.

  • warn_only (bool, optional) – If True, failures are treated as warnings instead of errors.

Returns:

The updated FrameCheck instance with the null check added.

Return type:

FrameCheck
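
To make the semantics concrete, here is a minimal plain-Python sketch of what a null check over selected columns could look like. It is not FrameCheck's implementation: the helpers `is_null` and `find_null_columns`, and the list-of-dicts stand-in for a DataFrame, are assumptions made purely for illustration.

```python
import math


def is_null(value):
    """Treat None and float NaN as null (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def find_null_columns(rows, columns=None):
    """Return the sorted names of checked columns that contain a null.

    rows: list of dicts, one per row (a stand-in for a DataFrame).
    columns: names to check; None means every column seen in the rows.
    """
    if columns is None:
        columns = sorted({name for row in rows for name in row})
    return sorted(
        name for name in columns
        if any(is_null(row.get(name)) for row in rows)
    )


rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": float("nan")},
]
print(find_null_columns(rows))          # ['price']  (all columns checked)
print(find_null_columns(rows, ["id"]))  # []         (only 'id' checked)
```

With columns=None every column is inspected, which mirrors the documented default; passing a list narrows the check to those names.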

framecheck.FrameCheck.only_defined_columns(self)

Restrict validation to only the explicitly defined columns.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

framecheck.FrameCheck.raise_on_error(self)

Configure validate() to raise a ValueError when validation fails, instead of only returning the result.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

Row-Level Checks

framecheck.FrameCheck.row_count(self, n=None, *, exact=None, min=None, max=None, warn_only=False)

Add a row count check for the DataFrame.

Parameters:
  • n (int, optional) – Shortcut for exact row count.

  • exact (int, optional) – Require exactly this many rows.

  • min (int, optional) – Minimum number of rows allowed.

  • max (int, optional) – Maximum number of rows allowed.

  • warn_only (bool, optional) – If True, failures will be treated as warnings.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

Raises:

ValueError – If n is used alongside exact, min, or max.
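
The interaction between n, exact, min, and max can be sketched in plain Python. This is an illustrative assumption about how the arguments might be normalized, not the library's code; `resolve_row_count` and `row_count_ok` are hypothetical names.

```python
def resolve_row_count(n=None, *, exact=None, min=None, max=None):
    """Normalize row-count arguments into an (exact, min, max) tuple.

    The parameter names shadow the built-ins min/max on purpose, to match
    the documented signature; the built-ins are not used in this function.
    """
    if n is not None:
        if exact is not None or min is not None or max is not None:
            raise ValueError("n cannot be combined with exact, min, or max")
        exact = n  # n is only a shortcut for exact
    return exact, min, max


def row_count_ok(count, exact=None, min=None, max=None):
    """Return True if count satisfies every bound that was supplied."""
    if exact is not None and count != exact:
        return False
    if min is not None and count < min:
        return False
    if max is not None and count > max:
        return False
    return True


exact, lower, upper = resolve_row_count(min=1, max=100)
print(row_count_ok(42, exact, lower, upper))  # True
print(row_count_ok(0, exact, lower, upper))   # False
```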

framecheck.FrameCheck.unique(self, columns=None)

Add a uniqueness constraint on one or more columns.

Parameters:

columns (list of str, optional) – Columns that must contain unique combinations of values.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck
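
"Unique combinations" means that the tuple of values across the listed columns must not repeat, even if individual columns do. The sketch below illustrates the idea in plain Python; `duplicate_keys` is a hypothetical helper, and treating columns=None as "use every column" is an assumption, since the behavior for None is not stated above.

```python
def duplicate_keys(rows, columns=None):
    """Return the value combinations that occur more than once.

    rows: list of dicts, one per row (a stand-in for a DataFrame).
    columns: names forming the key; None is assumed to mean all columns.
    """
    seen = set()
    duplicates = []
    for row in rows:
        names = columns if columns is not None else sorted(row)
        key = tuple(row[name] for name in names)
        if key in seen:
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates


rows = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
    {"a": 1, "b": "x"},
]
print(duplicate_keys(rows, ["a", "b"]))  # [(1, 'x')]
print(duplicate_keys(rows, ["a"]))       # [(1,), (1,)]
```

Column a alone repeats on every row, yet the pair (a, b) repeats only once, which is the distinction a multi-column uniqueness constraint captures.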

Column Checks

framecheck.FrameCheck.column(self, name, **kwargs)

Add validation rules for a single column.

Parameters:
  • name (str) – Name of the column.

  • type (str, optional) – The expected data type (e.g., 'int', 'str', 'bool').

  • warn_only (bool, optional) – If True, failures will be treated as warnings.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

Raises:

RuntimeError – If called after .only_defined_columns() was set.

framecheck.FrameCheck.columns(self, names, **kwargs)

Apply the same column checks to multiple columns.

Parameters:
  • names (list of str) – The column names to validate.

  • **kwargs – Additional keyword arguments passed to column().

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

framecheck.FrameCheck.columns_are(self, expected_columns, warn_only=False)

Require that the DataFrame contains only the specified columns in exact order.

Parameters:
  • expected_columns (list of str) – The expected column names.

  • warn_only (bool, optional) – If True, mismatches are warnings instead of errors.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

Cross-Column Validations

framecheck.FrameCheck.compare(self, left_column, operator, right_column, type=None, description=None, warn_only=False)

Add a check comparing values between two columns.

This method creates a validation rule that ensures values in one column have the specified relationship to values in another column. It’s useful for validating business rules like “price > cost” or “end_date > start_date”.

Parameters:
  • left_column (str) – Name of the first column to compare.

  • operator (str) – Comparison operator: "<", "<=", "==", "!=", ">=", or ">".

  • right_column (str) – Name of the second column to compare.

  • type (str, optional) – Type of comparison to perform: 'numeric', 'string', or 'datetime'. If not specified, the type is inferred from the column types.

  • description (str, optional) – Custom description for the validation message.

  • warn_only (bool, optional) – If True, failures are warnings instead of errors.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

Examples

>>> from framecheck import FrameCheck
>>> schema = (FrameCheck()
...     .column('price', type='float')
...     .column('cost', type='float')
...     .compare('price', '>', 'cost')
...     .not_null()
... )
>>> # With date comparison and custom error
>>> schema = (FrameCheck()
...     .column('start_date', type='datetime')
...     .column('end_date', type='datetime')
...     .compare('end_date', '>', 'start_date',
...             type='datetime',
...             description="End date must be after start date")
... )
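
As a rough illustration of what a cross-column comparison evaluates, the following plain-Python sketch maps the operator strings listed above to functions from the standard `operator` module and reports the rows that violate the rule. It is a sketch under stated assumptions, not FrameCheck's internals; `OPERATORS` and `failing_rows` are hypothetical names.

```python
import operator
from datetime import date

# Map each supported operator string to its standard-library function.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


def failing_rows(rows, left_column, op, right_column):
    """Return the indexes of rows where `left op right` does not hold."""
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op!r}")
    compare = OPERATORS[op]
    return [
        index for index, row in enumerate(rows)
        if not compare(row[left_column], row[right_column])
    ]


prices = [
    {"price": 10.0, "cost": 4.0},
    {"price": 3.0, "cost": 5.0},
]
print(failing_rows(prices, "price", ">", "cost"))  # [1]

spans = [
    {"start_date": date(2024, 1, 1), "end_date": date(2024, 2, 1)},
    {"start_date": date(2024, 3, 1), "end_date": date(2024, 3, 1)},
]
print(failing_rows(spans, "end_date", ">", "start_date"))  # [1]
```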

Custom Checks

framecheck.FrameCheck.custom_check(self, function, description=None)

Add a custom user-defined validation function.

Parameters:
  • function (Callable) – A function that takes a row and returns True if the row is valid, False otherwise. For persistence across sessions, use the @register_check_function decorator.

  • description (str, optional) – Description of the custom check.

Returns:

The updated FrameCheck instance.

Return type:

FrameCheck

Examples

>>> from framecheck import FrameCheck, register_check_function
>>> # Using a lambda (not serializable)
>>> check = FrameCheck().custom_check(lambda row: row['age'] >= 18, "Must be adult")
>>>
>>> # Using a registered function (serializable)
>>> @register_check_function()
... def valid_age(row):
...     return row['age'] >= 18
...
>>> check = FrameCheck().custom_check(valid_age, "Must be adult")

Validation Execution

framecheck.FrameCheck.validate(self, df)

Run all defined checks against the provided DataFrame.

Parameters:

df (pandas.DataFrame) – The DataFrame to validate.

Returns:

The result of the validation process.

Return type:

ValidationResult

Raises:

ValueError – If raise_on_error() was set and validation fails.

framecheck.FrameCheck.info(self)

Return a dictionary representation of all validation rules.

Returns:

Dictionary containing all column and DataFrame-level validations.

Return type:

dict

Persistence

framecheck.FrameCheck.to_json(self)

Convert validation rules to a JSON string.

Returns:

JSON representation of validation rules.

Return type:

str

framecheck.FrameCheck.to_dict(self)

Convert validation rules to a serializable dictionary.

Returns:

Dictionary representation of validation rules.

Return type:

dict

framecheck.FrameCheck.load(filepath)

Load a FrameCheck instance from a file.

Parameters:

filepath (str) – Path to the input file.

Returns:

Reconstructed FrameCheck instance.

Return type:

FrameCheck

framecheck.FrameCheck.from_json(json_str)

Create a FrameCheck instance from a JSON string.

Parameters:

json_str (str) – JSON string containing serialized validation rules.

Returns:

Reconstructed FrameCheck instance.

Return type:

FrameCheck

framecheck.FrameCheck.from_dict(data)

Create a FrameCheck instance from a dictionary.

Parameters:

data (dict) – Dictionary containing serialized validation rules.

Returns:

Reconstructed FrameCheck instance.

Return type:

FrameCheck

Function Registry

framecheck.register_check_function(name=None)

Decorator to register a function as a serializable check function.

Parameters:

name (str, optional) – Custom name for the function. If not provided, the function’s name will be used.

Returns:

Decorator function that registers the decorated function.

Return type:

Callable

Examples

>>> from framecheck import register_check_function
>>> @register_check_function()
... def valid_age(row):
...     return 18 <= row['age'] <= 65
...
>>> @register_check_function(name="custom_price_check")
... def check_price_margin(row):
...     return row['price'] >= row['cost'] * 1.2
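
To show why registration makes a check serializable, here is a minimal sketch of a name-based registry. It is an assumption about the general pattern, not the library's actual implementation: `CHECK_REGISTRY` and `register_sketch` are hypothetical names. The idea is that a function object cannot be written to JSON, but a registered name can, and the name can be resolved back to the function on load.

```python
import json

# Hypothetical module-level registry mapping names to check functions.
CHECK_REGISTRY = {}


def register_sketch(name=None):
    """Decorator factory: store the function under `name` or its own name."""
    def decorator(func):
        CHECK_REGISTRY[name or func.__name__] = func
        return func  # the function itself is returned unchanged
    return decorator


@register_sketch()
def valid_age(row):
    return 18 <= row["age"] <= 65


@register_sketch(name="custom_price_check")
def check_price_margin(row):
    return row["price"] >= row["cost"] * 1.2


# Only the registered name is serialized, never the function object.
payload = json.dumps({"function": "valid_age"})

# On load, the name is looked up in the registry to recover the callable.
restored = CHECK_REGISTRY[json.loads(payload)["function"]]
print(restored({"age": 30}))  # True
print(restored({"age": 70}))  # False
```

A lambda has no stable name to look up, which is consistent with the note above that lambdas are not serializable.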