Collaboration Features¶
Version
Collaboration features (annotations, shareable reports) were introduced in v0.2.0. Database backend support was added in v0.7.0.
LavenderTown includes collaboration features for teams to work together on data quality issues.
Overview¶
Collaboration features enable:
- Adding annotations to findings
- Creating shareable reports
- Tracking issue status
- Team workflows
Annotations¶
Adding Annotations¶
Add comments and tags to findings:
from lavendertown.collaboration.api import add_annotation
from lavendertown import Inspector
inspector = Inspector(df)
findings = inspector.detect()
# Add annotation to a finding
annotation = add_annotation(
    finding=findings[0],
    author="Data Team",
    comment="This looks like a data entry error",
    tags=["data-entry", "needs-review"],
    status="needs-investigation",
)
Status Values¶
- reviewed: Finding has been reviewed
- fixed: Issue has been fixed
- false_positive: Not actually an issue
- needs-investigation: Requires further investigation
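The status values mix underscore and hyphen spellings, so a typo is easy to make. The following is a minimal sketch of a pre-check that could run before a status string is passed on. `VALID_STATUSES` and `check_status` are hypothetical names, not part of LavenderTown, and the set only mirrors the four values listed in this section.

```python
# Status values as listed in this section (note the mixed "_" and "-" spellings).
VALID_STATUSES = frozenset(
    {"reviewed", "fixed", "false_positive", "needs-investigation"}
)


def check_status(status: str) -> str:
    """Return the status unchanged if it is a documented value, else raise ValueError.

    Hypothetical helper for illustration; not a LavenderTown API.
    """
    if status not in VALID_STATUSES:
        allowed = ", ".join(sorted(VALID_STATUSES))
        raise ValueError(f"Unknown status {status!r}; expected one of: {allowed}")
    return status


print(check_status("needs-investigation"))
```

Whether the library itself rejects unknown status strings is not stated here, so a check like this may be redundant in practice.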
Retrieving Annotations¶
Get annotations for a finding:
from lavendertown.collaboration.api import get_annotations
annotations = get_annotations(finding)
for ann in annotations:
    print(f"{ann.author}: {ann.comment}")
    print(f"Tags: {ann.tags}")
    print(f"Status: {ann.status}")
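Once annotations are retrieved, they can be grouped by status to see what still needs attention. The sketch below is illustrative only: the `Annotation` dataclass is a stand-in that assumes nothing beyond the `author`, `comment`, `tags`, and `status` attributes used above, and `group_by_status` is a hypothetical helper.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Stand-in for illustration only; not the LavenderTown class."""
    author: str
    comment: str
    tags: list = field(default_factory=list)
    status: str = "needs-investigation"


def group_by_status(annotations):
    """Map each status string to the list of annotations carrying it."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann.status].append(ann)
    return dict(grouped)


# Made-up sample data.
sample = [
    Annotation("Data Team", "Looks like a data entry error", ["data-entry"]),
    Annotation("Analyst", "Confirmed and corrected upstream", ["verified"], "fixed"),
    Annotation("Analyst", "Still waiting on the source system"),
]

for status, anns in group_by_status(sample).items():
    print(f"{status}: {len(anns)} annotation(s)")
```

Because the helper only reads the `status` attribute, it should work on any objects that expose one.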
Shareable Reports¶
Creating Reports¶
Create reports to share with team members:
from lavendertown.collaboration.api import create_shareable_report
report = create_shareable_report(
    title="Q4 Data Quality Report",
    author="Data Team",
    findings=findings,
    annotations=annotations,
)
Exporting Reports¶
Export reports to JSON files:
from lavendertown.collaboration.api import export_report
report_path = export_report(report)
print(f"Report saved to: {report_path}")
Importing Reports¶
Import previously exported reports:
from lavendertown.collaboration.api import import_report
report = import_report("report.json")
print(f"Title: {report.title}")
print(f"Author: {report.author}")
print(f"Findings: {len(report.findings)}")
print(f"Annotations: {len(report.annotations)}")
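Since reports are exported as JSON, the top-level structure of a file received from a teammate can be inspected before importing it. The sketch below uses only the standard library and assumes nothing about the report schema. `summarize_json` is a hypothetical helper, and the demo file contents are made up.

```python
import json
import tempfile
from pathlib import Path


def summarize_json(path):
    """Return {key: short description} for the top level of a JSON file.

    Makes no assumption about the report schema; it only reports what it finds.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return {"<root>": type(data).__name__}
    summary = {}
    for key, value in data.items():
        if isinstance(value, (list, dict)):
            summary[key] = f"{type(value).__name__} with {len(value)} item(s)"
        else:
            summary[key] = type(value).__name__
    return summary


# Demo file with made-up contents; a real exported report may look different.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "report.json"
    demo.write_text(json.dumps({"title": "Demo", "findings": [1, 2]}), encoding="utf-8")
    print(summarize_json(demo))
```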
UI Integration¶
Collaboration features are integrated into the Streamlit UI:
- Run your app with inspector.render()
- Select a finding in the findings table
- Add annotations through the UI
- Create and export reports
- Import reports from other team members
Storage¶
File-Based Storage (Default)¶
By default, annotations and reports are stored in a .lavendertown/ directory.
Note: The .lavendertown/ directory is automatically created and should be added to .gitignore to avoid committing collaboration data.
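The .gitignore entry can be added by hand. As an alternative, the following is a small sketch of an idempotent helper that appends it only when it is missing. `ensure_gitignored` is a hypothetical name, and the demo runs in a temporary directory so that no real repository is touched.

```python
import tempfile
from pathlib import Path


def ensure_gitignored(repo_root=".", entry=".lavendertown/"):
    """Append `entry` to <repo_root>/.gitignore unless it is already listed.

    Returns True if the file was modified, False otherwise.
    """
    gitignore = Path(repo_root) / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    if entry in (line.strip() for line in existing.splitlines()):
        return False
    # Make sure the new entry starts on its own line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with gitignore.open("a", encoding="utf-8") as fh:
        fh.write(f"{prefix}{entry}\n")
    return True


# Demo in a throwaway directory: the first call adds the entry, the second is a no-op.
with tempfile.TemporaryDirectory() as tmp:
    print(ensure_gitignored(tmp))
    print(ensure_gitignored(tmp))
```

In a real project, the call would be `ensure_gitignored()` run from the repository root.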
Database Backend (v0.7.0)¶
For multi-user scenarios and scalable storage, you can use SQLAlchemy-based database storage:
# Install database support
pip install "lavendertown[database]"
# Configure via environment variables
export LAVENDERTOWN_STORAGE_TYPE=database
export LAVENDERTOWN_DATABASE_URL=postgresql://user:pass@localhost/lavendertown
Supported Databases:
- SQLite (default): Local database at .lavendertown/lavendertown.db
- PostgreSQL: Multi-user database for team collaboration
See the Database Storage API Reference for detailed documentation.
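Before starting the app, it can help to confirm that the two environment variables are set consistently. The sketch below is a hypothetical pre-flight check that only reads the variable names shown above. It does not reproduce how LavenderTown itself loads its configuration, and it never prints the URL, which may contain a password.

```python
import os
from urllib.parse import urlsplit


def read_storage_settings(env=None):
    """Read the two storage variables and run basic sanity checks.

    Hypothetical helper for illustration; not a LavenderTown API.
    """
    env = os.environ if env is None else env
    storage_type = env.get("LAVENDERTOWN_STORAGE_TYPE")
    url = env.get("LAVENDERTOWN_DATABASE_URL")
    problems = []
    if storage_type == "database":
        if not url:
            problems.append("LAVENDERTOWN_DATABASE_URL is not set")
        elif not urlsplit(url).scheme:
            problems.append("LAVENDERTOWN_DATABASE_URL has no scheme")
    return {
        "storage_type": storage_type,
        "database_url_set": bool(url),
        "problems": problems,
    }


settings = read_storage_settings()
print(settings["storage_type"], settings["problems"])
```

Passing a plain dictionary as `env` makes the check easy to exercise without changing the real environment.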
Workflow Example¶
from lavendertown import Inspector
from lavendertown.collaboration.api import (
    add_annotation,
    create_shareable_report,
    export_report,
)
# Analyze data
inspector = Inspector(df)
findings = inspector.detect()
# Add annotations
for finding in findings[:3]:  # Annotate the first 3 findings
    add_annotation(
        finding=finding,
        author="Analyst",
        comment="Needs data source verification",
        tags=["verification", "critical"],
        status="needs-investigation",
    )
# Create report
report = create_shareable_report(
    title="Weekly Data Quality Review",
    author="Data Team",
    findings=findings,
)
# Export for sharing
report_path = export_report(report)
print(f"Report ready: {report_path}")
CLI Integration¶
Collaboration features are also available via the CLI:
# Share a report
lavendertown share report.json
# Import a report
lavendertown import-report report.json
Best Practices¶
- Use descriptive comments: Explain why an issue is important
- Tag appropriately: Use consistent tags across the team
- Update status: Keep status current as issues are resolved
- Regular reports: Create regular reports for stakeholders
- Version control: Track report versions for historical reference
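One way to keep tags consistent across a team is to normalize them before they are attached to an annotation. The following is a minimal sketch under an assumed convention of lowercase, hyphen-separated tags. `normalize_tag` and `normalize_tags` are hypothetical helpers, not LavenderTown functions.

```python
import re


def normalize_tag(tag: str) -> str:
    """Lowercase a tag and collapse whitespace and underscores into single hyphens."""
    cleaned = re.sub(r"[\s_]+", "-", tag.strip().lower())
    return re.sub(r"-{2,}", "-", cleaned).strip("-")


def normalize_tags(tags):
    """Normalize every tag, drop empties and duplicates, keep first-seen order."""
    seen = []
    for tag in tags:
        norm = normalize_tag(tag)
        if norm and norm not in seen:
            seen.append(norm)
    return seen


print(normalize_tags(["Data Entry", "data_entry", "needs-review", " Needs  Review "]))
# prints ['data-entry', 'needs-review']
```

The resulting list can then be passed as the `tags` argument when adding an annotation.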
Next Steps¶
- Learn about Basic Usage for general data quality analysis
- See API Reference for detailed documentation