Releases: awslabs/deequ
2.0.10
New Features
- Are unique check by @eycho-am in #599
- add DQDL parser dependency by @happy-coral in #603
- scaffolding for checking data quality against DQDL rulesets by @happy-coral in #604
- Implement translation of rules and add converter for RowCount rule by @happy-coral in #606
Maintenance / Fixes
- feature/replace-rdd by @shriyavanvari in #586
- Add a test verifying that Deequ's isContainedIn constraint correctly handles string values containing single quotes by @D-Minor in #602
New Contributors
- @shriyavanvari made their first contribution in #586
- @D-Minor made their first contribution in #602
- @happy-coral made their first contribution in #603
Full Changelog: 2.0.9...2.0.10
2.0.9
2.0.8
New Features
- Configurable RetainCompletenessRule by @zeotuan in #564
- Optional specification of instance name in CustomSQL analyzer metric. by @tylermcdaniel0 in #569
- Adding Wilson Score Confidence Interval Strategy by @zeotuan in #567
- CustomAggregator by @joshuazexter in #572
- Add commits from master branch to release/2.0.8-spark-3.5 by @eycho-am in #587
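For readers unfamiliar with the Wilson score interval named in #567, the following standalone Python sketch shows the textbook formula for a binomial proportion. It is not Deequ's code: the function name `wilson_interval` and the default `z = 1.96` (roughly a 95% level) are assumptions made for illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Textbook Wilson score interval for a binomial proportion.

    `successes` out of `n` trials; `z` is the normal quantile for the
    desired confidence level (1.96 is an assumed default, about 95%).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")

    p_hat = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    # Centre of the interval is pulled towards 0.5 for small n.
    center = (p_hat + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    return center - half_width, center + half_width
```

Unlike the plain normal-approximation interval, the bounds stay inside [0, 1] even when the observed proportion is near 0 or 1, which is the usual reason for preferring it on small samples.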
Maintenance / Fixes
- fix typo by @bojackli in #574
- Fix performance of building row-level results by @marcantony in #577
New Contributors
- @joshuazexter made their first contribution in #572
- @bojackli made their first contribution in #574
Full Changelog: 2.0.7...2.0.8
2.0.7
What's Changed
Upgrades
New Features
- New type of MetricsRepository by @VenkataKarthikP:
  - Using Spark tables as the data source in #518
- Row Level Result Treatment Options by @eycho-am:
- Anomaly Detection Changes by @zeotuan:
  - Add Daily Season with Hourly Interval to HoltWinter in #546
- New analyzers:
  - RatioOfSums by @scott-gunn in #552
  - Column Count Analyzer and Check by @mentekid in #555
Maintenance/Fixes
- Fix Breeze dependency conflict in Anomaly Detection Spark 3.4+ by @zeotuan in #545
- Data Sync / DatasetMatch changes by @VenkataKarthikP:
- Row level results fixes:
  - Add analyzerOption to add filteredRowOutcome for isPrimaryKey Check by @eycho-am in #537
  - Fix bug in MinLength and MaxLength when NullBehavior.EmptyString by @eycho-am in #538
  - [Min/Max] Apply filtered row behavior at the row level evaluation by @rdsharma26 in #543
  - [MinLength/MaxLength] Apply filtered row behavior at the row level evaluation by @rdsharma26 in #547
  - Fix for satisfies row level results bug by @rdsharma26 in #553
New Contributors
- @VenkataKarthikP made their first contribution in #518
- @scott-gunn made their first contribution in #552
Full Changelog: 2.0.6...2.0.7
2.0.6
What's Changed
- NEW: Exact Quantile Check
  - Creation of Exact Quantile Check by @jmilis2000 in #512
- Data Synchronization/Matching fixes
  - Delegate to Spark for checking existence of columns in the given dataframes by @rdsharma26 in #515
  - Verify that non key columns exist in each dataset by @rdsharma26 in #517
- Addition of tests
  - Test that exceptions within a check's constraints do not affect other… by @tylermcdaniel0 in #516
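To illustrate what an exact (as opposed to approximate) quantile means in the Exact Quantile Check above, here is a small standalone Python sketch. It sorts the full data and interpolates linearly between neighbouring values; that interpolation rule and the name `exact_quantile` are assumptions for illustration and may differ from what Deequ computes.

```python
from typing import Sequence


def exact_quantile(values: Sequence[float], q: float) -> float:
    """Exact quantile of `values` using linear interpolation (assumed rule)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must lie in [0, 1]")

    ordered = sorted(values)
    # Fractional position of the quantile within the sorted data.
    pos = q * (len(ordered) - 1)
    lower = int(pos)  # floor, since pos is non-negative
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])
```

A check built on this would compare the result against a user-supplied condition, for example requiring that `exact_quantile(data, 0.5)` stays below some threshold. The trade-off against approximate quantiles is cost: the exact version needs the full sorted data.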
New Contributors
- @jmilis2000 made their first contribution in #512
- @tylermcdaniel0 made their first contribution in #516
Full Changelog: 2.0.5...2.0.6
2.0.5
What's Changed
- Spark 3.4 Update
- NEW: Custom SQL analyzer
- Analyzer Improvements
New Contributors
Full Changelog: 2.0.4...2.0.5
2.0.4
What's Changed
- Row-Level Results:
  - MinLength by @eycho-am in #465
  - Uniqueness by @eycho-am in #471
  - ColumnValues by @zixianzh1 in #476
  - ReferentialIntegrity by @rdsharma26 in #466
  - [Experimental] DataSynchronization by @rdsharma26 in #473
- Referential Integrity:
  - Updated Referential Integrity to support multiple columns by @rdsharma26 in #463
- Constraints and Condition Changes:
  - Add population stability index (PSI) to distance methods by @bevhanno in #480
  - Fix chi-square test conditions by @bevhanno in #482
  - Missing Column Precondition for Compliance Check - issue fix 467 by @samarth-c1 in #478
  - Addition of HasMax/HasMin/HasStandardDeviation/HasMean constraint suggestions by @rdsharma26 in #489
  - Alternative aggregate functions to calculate histogram values. by @akalotkin in #475
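As background for the population stability index (PSI) added in #480, the sketch below computes the commonly cited formula, the sum over categories of (actual − expected) · ln(actual / expected), on two categorical count tables. It is a standalone illustration and not Deequ's implementation: the function name and the `eps` floor used to avoid taking the logarithm of zero are assumptions.

```python
import math
from typing import Dict


def population_stability_index(
    expected: Dict[str, int], actual: Dict[str, int], eps: float = 1e-6
) -> float:
    """PSI between two categorical count distributions.

    `eps` is an assumed floor for proportions so that categories missing
    from one side do not produce log(0) or a division by zero.
    """
    total_expected = sum(expected.values())
    total_actual = sum(actual.values())
    if total_expected == 0 or total_actual == 0:
        raise ValueError("both distributions need at least one observation")

    psi = 0.0
    for category in set(expected) | set(actual):
        e = max(expected.get(category, 0) / total_expected, eps)
        a = max(actual.get(category, 0) / total_actual, eps)
        psi += (a - e) * math.log(a / e)
    return psi
```

Each term is non-negative, so the index is zero when the two distributions match and grows as they diverge.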
New Contributors
- @zixianzh1 made their first contribution in #476
- @samarth-c1 made their first contribution in #478
- @akalotkin made their first contribution in #475
Full Changelog: 2.0.3...2.0.4
2.0.3
What's Changed
- Adding chi-square distance method for categorical variables by @bevhanno in #444
- [WIP] Row Level Results by @mentekid in #451
- [Experimental] Addition of dataset comparison utilities by @rdsharma26 in #449
New Contributors
- @rdsharma26 made their first contribution in #447
- @bevhanno made their first contribution in #444
- @mentekid made their first contribution in #451
Full Changelog: 2.0.2...2.0.3