Roadmap
Planned — Near Term
High-priority features that build on existing infrastructure.
Schema Export
Export validation rules and schemas to standard interchange formats.
.to_json_schema()method on Validate for JSON Schema output.to_documentation()for auto-generated data documentation (Markdown, HTML)- Round-trip compatibility with YAML validation configs
Validation Registry
Organizational catalog for sharing and discovering validation definitions.
pb.ValidationRegistryfor storing and retrieving validations- Named validation lookup across teams
- Version tracking for validation definitions
- Integration with YAML-based validation configs
Rich Notification Ecosystem
Expand alerting beyond Slack to additional platforms.
send_email()— SMTP email notificationssend_pagerduty()— PagerDuty incident creationsend_discord()— Discord webhookssend_to_datadog()— Datadog events/metricssend_to_opsgenie()— Opsgenie alertstrigger_webhook()— Generic webhook support
Semantic Validation Enhancements
Improve the existing .prompt() LLM-based validation method.
- Batch processing optimization for LLM calls
- Confidence scores for semantic validations
- Advanced caching and cost optimization
- Fallback strategies for rate limits
- Custom prompt templates
Test Data Generation Enhancements
Extend the existing generate_dataset() capabilities.
- Schema interoperability: use a col_schema_match() schema to generate test data
- Unified schema model bridging simple schemas (column/type pairs) and advanced schemas (with field constraints)
- Edge case generation (nulls, boundaries, Unicode, etc.)
- Hypothesis integration for property-based testing
Planned — Medium Term
Features that expand Pointblank’s scope into new domains.
Intelligent Validation Recommendations
AI-powered suggestions for improving existing validations.
- Post-hoc validation improvement suggestions via
.suggest_improvements() - Automatic detection of uncovered columns
- Correlation-based rule suggestions
- Anomaly-based threshold recommendations
AI-Powered Data Documentation
Auto-generate data dictionaries and documentation from data and validations.
pb.document()function for AI-generated documentation- Multiple output formats (Markdown, HTML, PDF, Quarto)
- Integration with validation results for quality context
- Customizable documentation templates
- Incremental documentation updates
Natural Language Validation Queries
Define validations using plain English descriptions.
.from_prompt()method for NL-to-validation conversion- Interactive clarification for ambiguous requests
- Learning from corrections to improve suggestions
- Support for domain-specific terminology
Pipeline Integration Framework
First-class integration with data orchestration tools.
- Apache Airflow operators
- Prefect tasks and flows
- Dagster asset checks
- dbt test adapter
- Luigi tasks
- Kedro hooks
Data Observability Dashboard
Local web dashboard for monitoring data quality over time.
- Historical validation tracking in DuckDB/SQLite
- Trend visualization for data quality metrics
- Anomaly detection on validation metrics over time
- Alerting rules based on metric trends
- Exportable quality scorecards
Data Profiling & Drift Detection
Expand DataScan into comprehensive profiling with drift detection.
pb.DataProfileclass for comprehensive profiling- Profile persistence and comparison
- Statistical drift detection (KS test, PSI, etc.)
- Schema drift detection
- Distribution visualization
- Automatic validation generation from drift
Planned — Long Term
Larger efforts for future milestones.
Multi-Table & Cross-Dataset Validation
Validate relationships across tables and datasets.
pb.ValidateRelationshipsfor multi-table validation- Foreign key validation
- Referential integrity checks
- Cross-table aggregate validations
- Entity resolution checks
- Join quality validation
VS Code Extension
Bring Pointblank directly into the IDE.
- Inline validation preview while writing code
- Validation report viewer in VS Code
- YAML validation schema with autocomplete
- Quick fix suggestions for validation errors
- Data preview with quality indicators
Jupyter/Notebook Magic Commands
Delightful notebook validation experience.
%%pb_validatecell magic%pb_checkline magic for quick assertions%pb_buildinteractive validation builder widget%pb_reportinline report rendering- Automatic validation suggestions in notebooks
Schema Inference & Serialization
Seamlessly move between data, schemas, and validations.
Schema.from_data()with configurable inference- Export Schema to JSON Schema, Avro, SQL DDL, Pydantic
- Import Schema from Pydantic, SQL, Avro, JSON Schema
.with_rules_from_schema()to auto-generate validation steps- Schema diff and merge operations
- Schema evolution tracking
Type Hints & Static Analysis
Enable static type checking for validated data.
pb.ValidatedFramewrapper with schema tracking- Schema-aware type stubs for IDE support
- MyPy/Pyright plugin for static validation hints
- Integration with Polars/Pandas type systems
- Autocomplete for validated column names
Plugin Architecture
Allow third-party extensions to hook into the validation pipeline.
@pb.register_validationdecorator for custom validations@pb.register_actionfor custom notification actions- Plugin discovery and loading system
- Plugin marketplace/registry
- Plugin testing utilities
Backend Expansion
Certify additional database and data lake backends.
- Snowflake (full certification)
- BigQuery (full certification)
- Databricks (full certification)
- Delta Lake
- Apache Iceberg
- Redshift
- ClickHouse
Benchmarking & Performance
Establish Pointblank as the fastest Python validation library.
- Comprehensive benchmark suite
- Performance comparison with competitors
- Optimization for large datasets (streaming validation)
- Lazy evaluation optimization
- Parallel validation execution