FrameCheck Methods
These methods allow you to declaratively define validation rules for pandas DataFrames. Each method returns the FrameCheck instance to allow method chaining.
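As an illustration of why returning the instance enables chaining, here is a minimal sketch. `RuleBuilder` is a hypothetical toy class written for this example; it is not FrameCheck and does not reflect its internals.

```python
class RuleBuilder:
    """Toy stand-in for a chainable rule builder (hypothetical, not FrameCheck)."""

    def __init__(self):
        self.rules = []

    def not_empty(self):
        self.rules.append("not_empty")
        return self  # returning self lets the next call chain onto this one

    def row_count(self, n):
        self.rules.append(f"row_count={n}")
        return self


# Each call returns the same object, so rules accumulate left to right.
builder = RuleBuilder().not_empty().row_count(3)
print(builder.rules)
```

Because every method returns the same object, a whole schema can be declared in a single expression.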
Basic DataFrame Checks
- framecheck.FrameCheck.empty(self)
Add a check to ensure the DataFrame is empty.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
- framecheck.FrameCheck.not_empty(self)
Add a check to ensure the DataFrame is not empty.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
- framecheck.FrameCheck.not_null(self, columns=None, warn_only=False)
Add a check to ensure specified columns have no null (NaN) values.
- Parameters:
columns (list of str, optional) – Column names to check for null values. If None, all columns will be checked.
warn_only (bool, optional) – If True, failures are treated as warnings instead of errors.
- Returns:
The updated FrameCheck instance with the null check added.
- Return type:
FrameCheck
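To make the null check concrete, the sketch below shows the kind of test it describes using plain pandas. The helper `null_columns` is hypothetical and is not part of FrameCheck; it only mirrors the documented behavior that `columns=None` means every column is checked.

```python
import pandas as pd


def null_columns(df, columns=None):
    """Return the names of the checked columns that contain a null value."""
    # Mirror the documented default: None means check every column.
    cols = list(df.columns) if columns is None else list(columns)
    has_null = df[cols].isna().any()
    return [c for c in cols if has_null[c]]


df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", None, "c"]})
print(null_columns(df))
print(null_columns(df, ["id"]))
```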
- framecheck.FrameCheck.only_defined_columns(self)
Restrict validation to only the explicitly defined columns.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
- framecheck.FrameCheck.raise_on_error(self)
Raise a ValueError if validation fails, instead of just returning the result.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
Row-Level Checks
- framecheck.FrameCheck.row_count(self, n=None, *, exact=None, min=None, max=None, warn_only=False)
Add a row count check for the DataFrame.
- Parameters:
n (int, optional) – Shortcut for exact row count.
exact (int, optional) – Require exactly this many rows.
min (int, optional) – Minimum number of rows allowed.
max (int, optional) – Maximum number of rows allowed.
warn_only (bool, optional) – If True, failures will be treated as warnings.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
- Raises:
ValueError – If n is used alongside exact, min, or max.
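The interaction between `n` and the keyword-only arguments can be sketched as follows. Both helpers, `resolve_row_count` and `row_count_ok`, are hypothetical names written for this illustration; they assume that `n` is simply an alias for `exact`, as the parameter description suggests, and are not FrameCheck's implementation.

```python
def resolve_row_count(n=None, *, exact=None, min=None, max=None):
    """Normalize the arguments; parameter names mirror the documented signature."""
    if n is not None and any(v is not None for v in (exact, min, max)):
        raise ValueError("n cannot be combined with exact, min, or max")
    if n is not None:
        exact = n  # assumption: n is a shortcut for exact
    return {"exact": exact, "min": min, "max": max}


def row_count_ok(num_rows, bounds):
    """Check a row count against the normalized bounds."""
    if bounds["exact"] is not None and num_rows != bounds["exact"]:
        return False
    if bounds["min"] is not None and num_rows < bounds["min"]:
        return False
    if bounds["max"] is not None and num_rows > bounds["max"]:
        return False
    return True


print(resolve_row_count(5))
print(row_count_ok(3, resolve_row_count(min=1, max=5)))
```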
- framecheck.FrameCheck.unique(self, columns=None)
Add a uniqueness constraint on one or more columns.
- Parameters:
columns (list of str, optional) – Columns that must contain unique combinations of values.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
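"Unique combinations of values" means that uniqueness is judged across the listed columns together, not column by column. A plain-pandas sketch of that idea follows; `duplicate_rows` is a hypothetical helper, not a FrameCheck function.

```python
import pandas as pd


def duplicate_rows(df, columns=None):
    """Return rows whose value combination in `columns` occurs more than once."""
    # keep=False marks every member of a duplicate group, not just the repeats.
    # With subset=None, pandas compares entire rows.
    return df[df.duplicated(subset=columns, keep=False)]


df = pd.DataFrame({"user_id": [1, 1, 2], "day": ["mon", "mon", "tue"]})
dupes = duplicate_rows(df, ["user_id", "day"])
print(dupes)
```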
Column Checks
- framecheck.FrameCheck.column(self, name, **kwargs)
Add validation rules for a single column.
- Parameters:
name (str) – Name of the column.
type (str, optional) – The expected data type (e.g., 'int', 'str', 'bool').
warn_only (bool, optional) – If True, failures will be treated as warnings.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
- Raises:
RuntimeError – If called after .only_defined_columns() was set.
- framecheck.FrameCheck.columns(self, names, **kwargs)
Apply the same column check logic to multiple columns.
- Parameters:
names (list of str) – The column names to validate.
**kwargs – Additional keyword arguments passed to column().
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
- framecheck.FrameCheck.columns_are(self, expected_columns, warn_only=False)
Require that the DataFrame contains only the specified columns in exact order.
- Parameters:
expected_columns (list of str) – The expected column names.
warn_only (bool, optional) – If True, mismatches are warnings instead of errors.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
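The "exact order" requirement is the key difference from a set-style comparison. The sketch below illustrates it with plain pandas; `columns_match` is a hypothetical helper and not how FrameCheck necessarily implements the check.

```python
import pandas as pd


def columns_match(df, expected_columns):
    """True only if the DataFrame has exactly these columns, in this order."""
    return list(df.columns) == list(expected_columns)


df = pd.DataFrame({"id": [1], "name": ["a"]})
print(columns_match(df, ["id", "name"]))
print(columns_match(df, ["name", "id"]))
```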
Cross-Column Validations
- framecheck.FrameCheck.compare(self, left_column, operator, right_column, type=None, description=None, warn_only=False)
Add a check comparing values between two columns.
This method creates a validation rule that ensures values in one column have the specified relationship to values in another column. It’s useful for validating business rules like “price > cost” or “end_date > start_date”.
- Parameters:
left_column (str) – Name of the first column to compare.
operator (str) – Comparison operator: "<", "<=", "==", "!=", ">=", or ">".
right_column (str) – Name of the second column to compare.
type (str, optional) – Type of comparison to perform: 'numeric', 'string', or 'datetime'. If not specified, the type is inferred from the column types.
description (str, optional) – Custom description for the validation message.
warn_only (bool, optional) – If True, failures are warnings instead of errors.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
Examples
>>> from framecheck import FrameCheck
>>> schema = (FrameCheck()
...     .column('price', type='float')
...     .column('cost', type='float')
...     .compare('price', '>', 'cost')
...     .not_null()
... )

>>> # With date comparison and custom error
>>> schema = (FrameCheck()
...     .column('start_date', type='datetime')
...     .column('end_date', type='datetime')
...     .compare('end_date', '>', 'start_date',
...              type='datetime',
...              description="End date must be after start date")
... )
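For intuition about what a column comparison evaluates, the following plain-pandas sketch maps each operator string to a function from the standard `operator` module and returns the rows that break the rule. `OPERATORS` and `failing_rows` are hypothetical names for this illustration; FrameCheck's own implementation may differ, for example in how it handles type inference or nulls.

```python
import operator

import pandas as pd

# Map the documented operator strings to element-wise comparison functions.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


def failing_rows(df, left_column, op, right_column):
    """Return the rows where `left_column op right_column` does not hold."""
    passed = OPERATORS[op](df[left_column], df[right_column])
    return df[~passed]


df = pd.DataFrame({"price": [10.0, 5.0, 8.0], "cost": [7.0, 6.0, 8.0]})
bad = failing_rows(df, "price", ">", "cost")
print(bad)
```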
Custom Checks
- framecheck.FrameCheck.custom_check(self, function, description=None)
Add a custom user-defined validation function.
- Parameters:
function (Callable) – A function that takes a row and returns True if the row is valid, False otherwise. For persistence across sessions, use the @register_check_function decorator.
description (str, optional) – Description of the custom check.
- Returns:
The updated FrameCheck instance.
- Return type:
FrameCheck
Examples
>>> from framecheck import FrameCheck, register_check_function
>>> # Using a lambda (not serializable)
>>> check = FrameCheck().custom_check(lambda row: row['age'] >= 18, "Must be adult")
>>>
>>> # Using a registered function (serializable)
>>> @register_check_function()
... def valid_age(row):
...     return row['age'] >= 18
...
>>> check = FrameCheck().custom_check(valid_age, "Must be adult")
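A row-level predicate of this shape can be tried out on its own before attaching it to a schema. The sketch below applies one with plain pandas (`DataFrame.apply` with `axis=1`) to see which rows would fail; this is an assumption about how such a function is evaluated, not a description of FrameCheck's internals.

```python
import pandas as pd


def is_adult(row):
    """Row-level predicate: True for valid rows, False otherwise."""
    return row["age"] >= 18


df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [34, 16, 18]})

# axis=1 passes each row to the predicate as a Series keyed by column name.
passed = df.apply(is_adult, axis=1).astype(bool)
failing = df[~passed]
print(failing)
```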
Validation Execution
- framecheck.FrameCheck.validate(self, df)
Run all defined checks against the provided DataFrame.
- Parameters:
df (pandas.DataFrame) – The DataFrame to validate.
- Returns:
The result of the validation process.
- Return type:
ValidationResult
- Raises:
ValueError – If raise_on_error() was set and validation fails.
- framecheck.FrameCheck.info(self)
Return a dictionary representation of all validation rules.
- Returns:
Dictionary containing all column and DataFrame-level validations.
- Return type:
dict
Persistence
- framecheck.FrameCheck.to_json(self)
Convert validation rules to a JSON string.
- Returns:
JSON representation of validation rules.
- Return type:
str
- framecheck.FrameCheck.to_dict(self)
Convert validation rules to a serializable dictionary.
- Returns:
Dictionary representation of validation rules.
- Return type:
dict
- framecheck.FrameCheck.load(filepath)
Load a FrameCheck instance from a file.
- Parameters:
filepath (str) – Path to the input file.
- Returns:
Reconstructed FrameCheck instance.
- Return type:
FrameCheck
- framecheck.FrameCheck.from_json(json_str)
Create a FrameCheck instance from a JSON string.
- Parameters:
json_str (str) – JSON string containing serialized validation rules.
- Returns:
Reconstructed FrameCheck instance.
- Return type:
FrameCheck
- framecheck.FrameCheck.from_dict(data)
Create a FrameCheck instance from a dictionary.
- Parameters:
data (dict) – Dictionary containing serialized validation rules.
- Returns:
Reconstructed FrameCheck instance.
- Return type:
FrameCheck
Function Registry
- framecheck.register_check_function(name=None)
Decorator to register a function as a serializable check function.
- Parameters:
name (str, optional) – Custom name for the function. If not provided, the function’s name will be used.
- Returns:
Decorator function that registers the decorated function.
- Return type:
Callable
Examples
>>> from framecheck import register_check_function
>>> @register_check_function()
... def valid_age(row):
...     return 18 <= row['age'] <= 65
...
>>> @register_check_function(name="custom_price_check")
... def check_price_margin(row):
...     return row['price'] >= row['cost'] * 1.2
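The general pattern behind a name-based registry decorator can be sketched in a few lines. `REGISTRY` and `register` below are hypothetical names for this illustration and are not FrameCheck's actual registry. The point is that storing functions under a string name lets serialized rules refer to them by that name.

```python
# Hypothetical registry: maps a string name to a check function.
REGISTRY = {}


def register(name=None):
    """Decorator factory: register the function under `name` or its own name."""
    def decorator(func):
        REGISTRY[name or func.__name__] = func
        return func  # the function itself is returned unchanged
    return decorator


@register()
def valid_age(row):
    return 18 <= row["age"] <= 65


@register(name="custom_price_check")
def check_price_margin(row):
    return row["price"] >= row["cost"] * 1.2


print(sorted(REGISTRY))
```

A lambda has no stable name to look up later, which is consistent with the note above that lambdas are not serializable.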