Basic Usage¶
This guide covers the fundamental usage patterns for LavenderTown.
Creating an Inspector¶
The Inspector class is the main entry point for LavenderTown. It accepts a DataFrame (Pandas or Polars) and orchestrates data quality detection.
from lavendertown import Inspector
import pandas as pd
# Load your data
df = pd.read_csv("data.csv")
# Create inspector
inspector = Inspector(df)
Rendering the UI¶
To display the interactive Streamlit UI, call render():
import streamlit as st
inspector = Inspector(df)
inspector.render() # Must be called within Streamlit app context
The UI provides:
- Overview metrics (total findings, severity breakdown)
- Interactive charts and visualizations
- Detailed findings table with filtering
- Export options (JSON, CSV)
- Custom rule management
Getting Findings Programmatically¶
You can also get findings without the UI:
inspector = Inspector(df)
findings = inspector.detect()
# Process findings
for finding in findings:
    print(f"{finding.column}: {finding.description}")
Working with Findings¶
Each finding is a GhostFinding object with the following attributes:
- ghost_type: Category of issue (e.g., "null", "type", "outlier")
- column: Affected column name
- severity: Severity level ("info", "warning", or "error")
- description: Human-readable description
- row_indices: List of affected row indices (when available)
- metadata: Additional diagnostic information
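To make the attribute list concrete, the sketch below builds a stand-in object with the same field names. `FindingSketch` is hypothetical and exists only for this illustration; the field types and defaults are assumptions, and the real `GhostFinding` class may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FindingSketch:
    """Hypothetical stand-in that mirrors the GhostFinding attribute names."""

    ghost_type: str
    column: str
    severity: str
    description: str
    row_indices: Optional[list[int]] = None  # assumed: None when indices are unavailable
    metadata: dict[str, Any] = field(default_factory=dict)


finding = FindingSketch(
    ghost_type="null",
    column="price",
    severity="warning",
    description="Column 'price' has 25% null values",
    row_indices=[3, 7],
    metadata={"null_fraction": 0.25},
)

# Build a one-line summary from the attributes.
summary = f"[{finding.severity}] {finding.column}: {finding.description}"
print(summary)
```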
Filtering Findings¶
findings = inspector.detect()
# Filter by severity
errors = [f for f in findings if f.severity == "error"]
warnings = [f for f in findings if f.severity == "warning"]
# Filter by type
null_findings = [f for f in findings if f.ghost_type == "null"]
outlier_findings = [f for f in findings if f.ghost_type == "outlier"]
# Filter by column
price_findings = [f for f in findings if f.column == "price"]
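Beyond one-off filters, it is often useful to summarize findings in a single pass. The sketch below counts findings per severity and groups ghost types by column. It uses `SimpleNamespace` objects as stand-ins that carry only the attribute names documented above; in a real application the list would come from `inspector.detect()`.

```python
from collections import Counter, defaultdict
from types import SimpleNamespace

# Stand-in findings with the documented attribute names (illustrative data only).
findings = [
    SimpleNamespace(ghost_type="null", column="price", severity="error"),
    SimpleNamespace(ghost_type="outlier", column="price", severity="warning"),
    SimpleNamespace(ghost_type="null", column="name", severity="warning"),
]

# Count findings per severity level.
severity_counts = Counter(f.severity for f in findings)

# Group ghost types by the column they affect.
types_by_column = defaultdict(list)
for f in findings:
    types_by_column[f.column].append(f.ghost_type)

print(dict(severity_counts))
print(dict(types_by_column))
```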
Custom Detectors¶
You can provide custom detectors when creating an Inspector:
from lavendertown.detectors.null import NullGhostDetector
# Create custom detector with specific threshold
null_detector = NullGhostDetector(null_threshold=0.2) # 20% threshold
# Use with Inspector
inspector = Inspector(df, detectors=[null_detector])
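One plausible reading of the 20% threshold is a cutoff on the fraction of missing values per column. The sketch below illustrates that reading with plain Python lists. It is an assumption about what `null_threshold` means, not the detector's actual implementation, and `null_fraction` and `columns_over_threshold` are hypothetical helpers.

```python
def null_fraction(values):
    """Return the fraction of entries that are None (0.0 for an empty column)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def columns_over_threshold(table, threshold):
    """Return names of columns whose null fraction is strictly above the threshold.

    Whether the real detector uses a strict or inclusive comparison is not
    stated in this guide; strict is assumed here.
    """
    return [name for name, values in table.items() if null_fraction(values) > threshold]


table = {
    "price": [9.5, None, 12.0, None, 8.0],  # 2 of 5 missing
    "name": ["a", "b", "c", "d", None],     # 1 of 5 missing
}

flagged = columns_over_threshold(table, threshold=0.2)
print(flagged)
```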
Backend Detection¶
LavenderTown automatically detects whether you're using Pandas or Polars:
import pandas as pd
import polars as pl
# Pandas DataFrame
df_pandas = pd.DataFrame({"value": [1, 2, 3]})
inspector = Inspector(df_pandas) # Automatically uses Pandas backend
# Polars DataFrame
df_polars = pl.DataFrame({"value": [1, 2, 3]})
inspector = Inspector(df_polars) # Automatically uses Polars backend
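For readers curious how this kind of detection can work without importing both libraries, here is a minimal sketch that inspects the module name of the object's class. `detect_backend` is a hypothetical helper written for this guide; LavenderTown's own detection logic may be implemented differently.

```python
def detect_backend(df):
    """Guess the DataFrame library from the top-level module of the object's class."""
    top_level = type(df).__module__.split(".")[0]
    if top_level in ("pandas", "polars"):
        return top_level
    raise TypeError(f"Unsupported DataFrame type: {type(df).__name__}")


# A plain list is neither a Pandas nor a Polars DataFrame, so it is rejected.
try:
    detect_backend([1, 2, 3])
except TypeError as exc:
    message = str(exc)
    print(message)
```

Checking the module name avoids a hard dependency on either library, which matters when only one of them is installed.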
File Upload Component¶
Version
This feature was introduced in v0.6.0.
LavenderTown includes an enhanced file upload component for Streamlit applications with drag-and-drop support, animated progress indicators, and automatic encoding detection.
Using the Upload Component¶
import streamlit as st
from lavendertown.ui.upload import render_file_upload
from lavendertown import Inspector
# Upload file with enhanced UI
uploaded_file, df, encoding_used = render_file_upload(st)
if uploaded_file is not None:
    if df is not None:
        st.success(f"File loaded with {encoding_used} encoding")
        # Use with Inspector
        inspector = Inspector(df)
        inspector.render()
    else:
        st.error("Could not read the uploaded file")
Features¶
- Drag-and-drop interface: Enhanced styling for intuitive file uploads
- Animated progress: Multi-stage progress indicators for visual feedback
- Automatic encoding detection: Tries UTF-8, Latin-1 (also known as ISO-8859-1), and CP1252
- File validation: Clear error messages for invalid files
- File size warnings: Alerts for large files (>10MB)
See the Upload Component API Reference for detailed documentation.
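Automatic encoding detection is commonly implemented as a chain of decode attempts. The sketch below shows that general technique using only the standard library. `decode_with_fallback` and the order of the candidate encodings are assumptions made for illustration, not the upload component's actual code.

```python
def decode_with_fallback(raw, encodings=("utf-8", "latin-1", "cp1252")):
    """Try each encoding in order and return (text, encoding) for the first success."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("None of the candidate encodings could decode the data")


# These bytes are valid Latin-1 but not valid UTF-8, so the first attempt fails.
text, used = decode_with_fallback("café".encode("latin-1"))
print(text, used)
```

Note that Latin-1 maps every possible byte to a character, so in a chain like this one any encoding listed after it is never reached.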
Configuration¶
LavenderTown supports configuration through environment variables and .env files. Configuration is automatically loaded when the package is imported.
Environment Variables¶
Create a .env file in your project root or home directory:
The package searches for .env files in:
1. Current directory
2. Parent directories (up to project root)
3. Home directory (as .lavendertown.env)
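The search order above can be sketched with `pathlib`. `candidate_env_files` is a hypothetical helper, and it walks all the way to the filesystem root because this guide does not say how the project root is recognized; the package's own lookup may stop earlier.

```python
from pathlib import PurePosixPath


def candidate_env_files(start, home):
    """List .env locations in search order: start directory, its parents, then home."""
    start = PurePosixPath(start)
    candidates = [start / ".env"]
    candidates += [parent / ".env" for parent in start.parents]
    candidates.append(PurePosixPath(home) / ".lavendertown.env")
    return candidates


# PurePosixPath keeps the example independent of the operating system.
paths = candidate_env_files("/work/project/src", "/home/user")
for path in paths:
    print(path)
```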
Using Configuration¶
from lavendertown.config import get_config, get_config_bool, get_config_int
# Get configuration values
log_level = get_config("LAVENDERTOWN_LOG_LEVEL", "WARNING")
output_dir = get_config("LAVENDERTOWN_OUTPUT_DIR", "./output")
# Get typed values
debug_mode = get_config_bool("LAVENDERTOWN_DEBUG", False)
max_rows = get_config_int("LAVENDERTOWN_MAX_ROWS", 1000000)
Because configuration is loaded at import time, these helpers work without any additional setup.
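Environment variables are always strings, so typed getters have to parse them. The sketch below shows one way such parsing can be done. `read_bool`, `read_int`, and the set of accepted truthy strings are assumptions for illustration; they are not the implementations of `get_config_bool` or `get_config_int`.

```python
TRUE_STRINGS = {"1", "true", "yes", "on"}  # assumed set of truthy spellings


def read_bool(env, name, default=False):
    """Parse a boolean from a string mapping, falling back to the default if unset."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_STRINGS


def read_int(env, name, default):
    """Parse an integer, falling back to the default if unset or malformed."""
    raw = env.get(name)
    if raw is None:
        return default
    try:
        return int(raw.strip())
    except ValueError:
        return default


env = {"LAVENDERTOWN_DEBUG": "True", "LAVENDERTOWN_MAX_ROWS": "50000"}
debug_mode = read_bool(env, "LAVENDERTOWN_DEBUG")
max_rows = read_int(env, "LAVENDERTOWN_MAX_ROWS", 1000000)
print(debug_mode, max_rows)
```

In a real program the mapping passed as `env` would typically be `os.environ`.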
Modular UI Components¶
Version
This feature was introduced in v0.7.0.
LavenderTown now supports a modular UI component system that allows you to customize the Inspector interface. You can enable/disable components, reorder them, or create completely custom layouts.
Using Custom UI Layouts¶
from lavendertown import Inspector
from lavendertown.ui.layout import ComponentLayout, create_default_layout
from lavendertown.ui.base import ComponentWrapper
from lavendertown.ui.overview import render_overview
from lavendertown.ui.charts import render_charts
# Create a minimal layout with only overview and charts
custom_layout = ComponentLayout(
    components=[
        ComponentWrapper(
            name="overview",
            render_func=render_overview,
            order=10,
            requires_findings=True,
        ),
        ComponentWrapper(
            name="charts",
            render_func=render_charts,
            order=20,
            requires_df=True,
            requires_findings=True,
            requires_backend=True,
        ),
    ]
)
inspector = Inspector(df, ui_layout=custom_layout)
inspector.render()
Disabling Components¶
# Start with default layout and disable components
layout = create_default_layout()
layout.disable_component("sidebar")
layout.disable_component("rule_management")
inspector = Inspector(df, ui_layout=layout)
inspector.render()
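The interplay between `order` values and disabled components can be pictured with a small model. `ComponentSketch` and `LayoutSketch` below are hypothetical stand-ins, and the rule that lower `order` values render first is an assumption; the real `ComponentLayout` may behave differently.

```python
from dataclasses import dataclass


@dataclass
class ComponentSketch:
    """Hypothetical stand-in for a UI component entry."""

    name: str
    order: int
    enabled: bool = True


class LayoutSketch:
    """Hypothetical layout that renders enabled components by ascending order."""

    def __init__(self, components):
        self._components = {c.name: c for c in components}

    def disable_component(self, name):
        self._components[name].enabled = False

    def render_order(self):
        active = [c for c in self._components.values() if c.enabled]
        return [c.name for c in sorted(active, key=lambda c: c.order)]


layout_sketch = LayoutSketch([
    ComponentSketch("charts", order=20),
    ComponentSketch("sidebar", order=5),
    ComponentSketch("overview", order=10),
])
layout_sketch.disable_component("sidebar")
print(layout_sketch.render_order())
```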
See the Modular UI Components Guide for detailed documentation.
Interactive Visualizations with Plotly¶
Version
This feature was introduced in v0.7.0.
LavenderTown now supports Plotly as an optional visualization backend for interactive charts with zoom, pan, and 3D visualizations.
Installing Plotly Support¶
Using Plotly Backend¶
The visualization backend can be selected in the UI when viewing charts. Plotly provides:
- Interactive time-series charts with zoom and pan
- 3D scatter plots for multi-dimensional outlier visualization
- Enhanced bar charts for ghost type distribution
- Heatmaps for correlation analysis
Enhanced UI Components¶
Version
This feature was introduced in v0.7.0.
LavenderTown integrates with Streamlit Extras for enhanced UI components including metric cards, badges, and improved layouts.
Installing Streamlit Extras¶
The enhanced components automatically fall back to standard Streamlit components if Streamlit Extras is not installed.
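A fallback of this kind is usually built on a check for whether the optional package is importable. The sketch below shows the general pattern with `importlib`. `has_module` and `pick_ui_toolkit` are hypothetical helpers, and the sketch only assumes that the Streamlit Extras distribution is imported as `streamlit_extras`; it does not reproduce LavenderTown's internal logic.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module with this name can be imported."""
    return find_spec(name) is not None


def pick_ui_toolkit():
    """Prefer the optional package and fall back to plain Streamlit otherwise."""
    if has_module("streamlit_extras"):
        return "streamlit_extras"
    return "streamlit"


toolkit = pick_ui_toolkit()
print(toolkit)
```

Checking availability up front keeps the optional dependency out of the import path for users who have not installed it.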
Database Backend for Collaboration¶
Version
This feature was introduced in v0.7.0.
LavenderTown now supports SQLAlchemy-based database storage for collaboration features, enabling multi-user scenarios and scalable report storage.
Installing Database Support¶
Configuring Database Backend¶
Set environment variables to use database storage:
LAVENDERTOWN_STORAGE_TYPE=database
LAVENDERTOWN_DATABASE_URL=postgresql://user:pass@localhost/lavendertown
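Because the database URL embeds a password, it is worth masking it before writing the value to logs. The sketch below does this with the standard library's URL parser. `redact_password` is a hypothetical helper for illustration and is not part of LavenderTown or SQLAlchemy.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_password(url):
    """Return the URL with any password replaced by '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit(parts._replace(netloc=netloc))


safe_url = redact_password("postgresql://user:pass@localhost/lavendertown")
print(safe_url)
```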
For SQLite (default):
Next Steps¶
- Learn about Detectors for different detection methods
- Explore Custom Rules for domain-specific validation
- Check out Drift Detection for dataset comparison
- See Modular UI Components for customizing the interface