Skip to content

Conversation

@Necrouser045
Copy link

@Necrouser045 Necrouser045 commented Dec 1, 2025

User description

Summary

This PR adds full support for two statistical distribution metrics:

  • SkewnessMetric
  • KurtosisMetric

These metrics already existed in MetricsType but lacked implementations,
datasource handlers, and test coverage. This PR completes the feature.


Changes Included

Core

  • Added SkewnessMetric and KurtosisMetric implementations in numeric_metric.py
  • Registered both metrics in manager.py

Datasources

  • Added SQL datasource handlers:
    • query_get_skewness
    • query_get_kurtosis
  • Added placeholder Search datasource handlers for the same

Tests

  • Added full test suite for skewness and kurtosis:
    • SQL datasource (with and without filters)
    • Search datasource
    • Validation rules
    • Identity generation

All new tests pass.


Why This is Needed

Although skewness and kurtosis were defined in the enum, the framework did not support them.
Users attempting to use these metrics encountered runtime errors. This PR brings complete,
tested support for these two metrics.


Result

7 tests passed successfully


Security Fixes

  • Added _sanitize_identifier to validate SQL identifiers.
  • Updated skewness/kurtosis queries to prevent SQL injection.
  • Compliance warnings resolved.

Thanks for reviewing!


PR Type

Enhancement


Description

  • Implement SkewnessMetric and KurtosisMetric classes with full datasource support

  • Add SQL datasource handlers for skewness and kurtosis calculations

  • Add placeholder Search datasource handlers raising NotImplementedError

  • Register both metrics in the metric manager for framework integration

  • Add comprehensive test suite covering SQL, Search, filters, and validation


Diagram Walkthrough

flowchart LR
  A["MetricsType.SKEWNESS<br/>MetricsType.KURTOSIS"] -->|"Register in"| B["MetricManager"]
  B -->|"Instantiate"| C["SkewnessMetric<br/>KurtosisMetric"]
  C -->|"Call _generate_metric_value"| D["SQLDataSource<br/>SearchIndexDataSource"]
  D -->|"Execute"| E["query_get_skewness<br/>query_get_kurtosis"]
  E -->|"Return"| F["Metric Value"]
Loading

File Walkthrough

Relevant files
Enhancement
numeric_metric.py
Add SkewnessMetric and KurtosisMetric implementations       

dcs_core/core/metric/numeric_metric.py

  • Added SkewnessMetric class extending FieldMetrics with metric identity
    generation
  • Added KurtosisMetric class extending FieldMetrics with metric identity
    generation
  • Both classes implement _generate_metric_value() supporting SQL and
    Search datasources
  • Both classes handle optional filters and raise ValueError for invalid
    datasource types
+66/-0   
sql_datasource.py
Add SQL skewness and kurtosis query handlers                         

dcs_core/core/datasource/sql_datasource.py

  • Added query_get_skewness() method executing SQL skewness function on
    field
  • Added query_get_kurtosis() method executing SQL kurtosis function on
    field
  • Both methods support optional filters via execute_query() helper
+9/-0     
search_datasource.py
Add Search datasource placeholder handlers                             

dcs_core/core/datasource/search_datasource.py

  • Added query_get_skewness() method raising NotImplementedError
  • Added query_get_kurtosis() method raising NotImplementedError
  • Placeholder implementations for future Search datasource support
+7/-0     
Configuration changes
manager.py
Register skewness and kurtosis metrics in manager               

dcs_core/core/metric/manager.py

  • Imported SkewnessMetric and KurtosisMetric from numeric_metric module
  • Registered both metrics in METRIC_MAPPING dictionary with enum values
  • Maps MetricsType.SKEWNESS to "SkewnessMetric" and MetricsType.KURTOSIS
    to "KurtosisMetric"
+4/-0     
Tests
test_numeric_metric.py
Add comprehensive skewness and kurtosis metric tests         

tests/core/metric/test_numeric_metric.py

  • Imported SkewnessMetric and KurtosisMetric test classes
  • Added 7 new test cases in TestSkewnessAndKurtosisMetric class
  • Tests cover SQL datasource with/without filters, Search datasource,
    and validation thresholds
  • Tests verify metric value calculation and validation logic for both
    metrics
+124/-0 

Summary by CodeRabbit

  • New Features

    • Added Skewness and Kurtosis statistical metrics available for SQL and search-backed data sources.
    • Metrics can be computed on individual fields with optional filtering and integrate with the existing metric framework.
  • Tests

    • Added comprehensive tests covering skewness and kurtosis across supported backends, with validations and filter scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 1, 2025

Warning

Rate limit exceeded

@Necrouser045 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 17 minutes and 35 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 08e160e and cbb949b.

📒 Files selected for processing (1)
  • dcs_core/core/datasource/sql_datasource.py (1 hunks)

Walkthrough

Adds Skewness and Kurtosis metrics: two new metric classes, SQL implementations (with identifier sanitization), search datasource stubs that raise NotImplementedError, metric manager registration, and tests covering SQL and search variants.

Changes

Cohort / File(s) Summary
DataSource — SQL
dcs_core/core/datasource/sql_datasource.py
Added _sanitize_identifier(); query_get_skewness() and query_get_kurtosis() that sanitize identifiers, build SELECT skewness(...) / SELECT kurtosis(...) queries and delegate to execute_query, supporting optional filters.
DataSource — Search
dcs_core/core/datasource/search_datasource.py
Added query_get_skewness() and query_get_kurtosis() public methods that raise NotImplementedError (stubs).
Metric Implementation
dcs_core/core/metric/numeric_metric.py
Added SkewnessMetric and KurtosisMetric (subclass FieldMetrics), implement get_metric_identity() and _generate_metric_value() dispatching to SQL and search datasource methods.
Metric Manager
dcs_core/core/metric/manager.py
Imported new metric classes and updated METRIC_CLASS_MAPPING to include MetricsType.SKEWNESS and MetricsType.KURTOSIS.
Tests
tests/core/metric/test_numeric_metric.py
Added TestSkewnessAndKurtosisMetric covering PostgreSQL and OpenSearch variants for both metrics, including validation and filter scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Verify _sanitize_identifier covers all SQL identifier edge cases and expected character sets.
  • Inspect constructed SQL for correct quoting/escaping and correct placement/formatting of filters.
  • Confirm NotImplementedError stubs in search datasource are intentional and message clarity.
  • Check metric registration keys match MetricsType enum values and existing naming conventions.
  • Review new tests for adequate coverage and realistic mocks for SQL and search backends.

Suggested reviewers

  • AnshumanX304
  • rishav2803
  • Ryuk-me

Poem

🐰 I hopped through code to add two new tunes,
Skewness and Kurtosis beneath the moon,
SQL hums moments, search waits its part,
Tests tickle numbers, metrics warm my heart. 🥕📈

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.52% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding SkewnessMetric and KurtosisMetric with SQL/Search datasource support and tests, which matches the primary objective of the PR.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering all required template sections: it identifies the enhancement type, provides detailed summary and changes, explains rationale, includes test results, and notes security improvements.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review
Copy link

qodo-code-review bot commented Dec 1, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴
SQL injection

Description: SQL query is constructed via string interpolation with unescaped table and field
identifiers (f"SELECT skewness({field}) FROM {table}" and f"SELECT kurtosis({field}) FROM
{table}"), enabling SQL injection if table or field are influenced by user input;
identifiers should be safely quoted/validated or bound via parameterized constructs.
sql_datasource.py [1138-1145]

Referred Code
def query_get_skewness(self, table, field, filters=None):
    query = f"SELECT skewness({field}) FROM {table}"
    return self.execute_query(query, filters)


def query_get_kurtosis(self, table, field, filters=None):
    query = f"SELECT kurtosis({field}) FROM {table}"
    return self.execute_query(query, filters)
DoS via unhandled error

Description: Unimplemented search datasource methods for skewness and kurtosis raise
NotImplementedError, which if reachable in production paths can cause denial of service
for certain inputs; ensure callers guard against unsupported metric types or feature-flag
these paths.
search_datasource.py [450-454]

Referred Code
def query_get_skewness(self, index_name, field, filters=None):
    raise NotImplementedError("Skewness metric not implemented for search datasource")

def query_get_kurtosis(self, index_name, field, filters=None):
    raise NotImplementedError("Kurtosis metric not implemented for search datasource")
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing validation: The SQL methods build queries via string interpolation without validating or
parameterizing the table and field inputs, lacking edge-case handling and risking failures
on invalid names or filters.

Referred Code
def query_get_skewness(self, table, field, filters=None):
    query = f"SELECT skewness({field}) FROM {table}"
    return self.execute_query(query, filters)


def query_get_kurtosis(self, table, field, filters=None):
    query = f"SELECT kurtosis({field}) FROM {table}"
    return self.execute_query(query, filters)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
SQL injection risk: The methods construct SQL using f-strings with unvalidated table and field identifiers and
pass filters to execute_query without visible parameterization, introducing a potential
injection vector.

Referred Code
def query_get_skewness(self, table, field, filters=None):
    query = f"SELECT skewness({field}) FROM {table}"
    return self.execute_query(query, filters)


def query_get_kurtosis(self, table, field, filters=None):
    query = f"SELECT kurtosis({field}) FROM {table}"
    return self.execute_query(query, filters)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No auditing: Newly added metric query methods perform data operations without any audit logging of who
executed them or the outcome, which may be required for critical actions depending on
deployment context.

Referred Code
def query_get_skewness(self, table, field, filters=None):
    query = f"SELECT skewness({field}) FROM {table}"
    return self.execute_query(query, filters)


def query_get_kurtosis(self, table, field, filters=None):
    query = f"SELECT kurtosis({field}) FROM {table}"
    return self.execute_query(query, filters)

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

qodo-code-review bot commented Dec 1, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Security
Prevent SQL injection by sanitizing inputs
Suggestion Impact:The commit addressed SQL injection by sanitizing the table and field identifiers via a new _sanitize_identifier method and applying it in both functions, achieving the suggestion’s intent though not using the exact suggested helpers.

code diff:

+    def _sanitize_identifier(self, identifier: str) -> str:
+        """
+        Sanitize SQL identifiers to prevent injection.
+        Allows only alphanumeric characters and underscores.
+        """
+        if not isinstance(identifier, str):
+            raise ValueError("Identifier must be a string")
+
+        if not identifier.replace("_", "").isalnum():
+            raise ValueError(f"Invalid SQL identifier: {identifier}")
+
+        return identifier
 
     def query_get_skewness(self, table, field, filters=None):
+        field = self._sanitize_identifier(field)
+        table = self._sanitize_identifier(table)
+
         query = f"SELECT skewness({field}) FROM {table}"
         return self.execute_query(query, filters)
 
 
     def query_get_kurtosis(self, table, field, filters=None):
+        field = self._sanitize_identifier(field)
+        table = self._sanitize_identifier(table)
+
         query = f"SELECT kurtosis({field}) FROM {table}"
         return self.execute_query(query, filters)

To prevent SQL injection, sanitize the table and field inputs in
query_get_skewness and query_get_kurtosis by using the
self.qualified_table_name() and self.quote_column() helper methods.

dcs_core/core/datasource/sql_datasource.py [1138-1145]

 def query_get_skewness(self, table, field, filters=None):
-    query = f"SELECT skewness({field}) FROM {table}"
+    query = f"SELECT skewness({self.quote_column(field)}) FROM {self.qualified_table_name(table)}"
     return self.execute_query(query, filters)
 
 
 def query_get_kurtosis(self, table, field, filters=None):
-    query = f"SELECT kurtosis({field}) FROM {table}"
+    query = f"SELECT kurtosis({self.quote_column(field)}) FROM {self.qualified_table_name(table)}"
     return self.execute_query(query, filters)

[Suggestion processed]

Suggestion importance[1-10]: 10

__

Why: The suggestion correctly identifies a critical SQL injection vulnerability and provides a fix that aligns with the existing code patterns for sanitizing table and column names.

High
High-level
SQL implementation lacks database portability

The SQL implementation for skewness and kurtosis uses non-standard functions,
which limits database portability. This should be addressed by using standard
SQL aggregates or by documenting the database-specific dependency.

Examples:

dcs_core/core/datasource/sql_datasource.py [1138-1140]
    def query_get_skewness(self, table, field, filters=None):
        query = f"SELECT skewness({field}) FROM {table}"
        return self.execute_query(query, filters)
dcs_core/core/datasource/sql_datasource.py [1143-1145]
    def query_get_kurtosis(self, table, field, filters=None):
        query = f"SELECT kurtosis({field}) FROM {table}"
        return self.execute_query(query, filters)

Solution Walkthrough:

Before:

# dcs_core/core/datasource/sql_datasource.py

class SQLDataSource(DataSource):
    # ...
    def query_get_skewness(self, table, field, filters=None):
        query = f"SELECT skewness({field}) FROM {table}"
        return self.execute_query(query, filters)

    def query_get_kurtosis(self, table, field, filters=None):
        query = f"SELECT kurtosis({field}) FROM {table}"
        return self.execute_query(query, filters)

After:

# dcs_core/core/datasource/sql_datasource.py

class SQLDataSource(DataSource):
    # ...
    def query_get_skewness(self, table, field, filters=None):
        # Option 1: Implement with standard SQL for portability
        # (conceptual example)
        portable_query = """
        WITH stats AS (SELECT AVG({field}) as mean, STDDEV({field}) as stddev FROM {table})
        SELECT (SUM(POWER({field} - mean, 3)) / COUNT({field})) / POWER(stddev, 3)
        FROM {table}, stats GROUP BY stddev
        """
        # Option 2: Check dialect and raise error if unsupported
        if self.dialect not in ["postgresql"]:
            raise NotImplementedError("Skewness not supported for this SQL dialect")
        
        query = f"SELECT skewness({field}) FROM {table}"
        return self.execute_query(query, filters)
    
    # ... similar logic for query_get_kurtosis
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a significant portability issue by using non-standard SQL functions (skewness(), kurtosis()) in a generic SQLDataSource, which limits the feature to specific database systems and undermines the abstraction.

High
Possible issue
Fix incorrect test for unimplemented feature

Update the test_should_return_skewness_opensearch_without_filter test to assert
that get_metric_value() returns None, as the underlying query_get_skewness
method for SearchIndexDataSource raises a NotImplementedError.

tests/core/metric/test_numeric_metric.py [1074-1088]

 def test_should_return_skewness_opensearch_without_filter(self):
     mock_data_source = Mock(spec=SearchIndexDataSource)
     mock_data_source.data_source_name = "test_data_source"
-    mock_data_source.query_get_skewness.return_value = 0.0
+    mock_data_source.query_get_skewness.side_effect = NotImplementedError(
+        "Skewness metric not implemented for search datasource"
+    )
 
     metric = SkewnessMetric(
         name="skewness_metric_test",
         data_source=mock_data_source,
         index_name="numeric_metric_test",
         metric_type=MetricsType.SKEWNESS,
         field_name="value",
     )
 
     result = metric.get_metric_value()
-    assert result.value == 0.0
+    assert result is None
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that a test case is inconsistent with the implementation, providing a false positive. Fixing this improves the accuracy and reliability of the test suite.

Medium
General
Ensure consistent use of enum values
Suggestion Impact:The commit updated the METRIC_MAPPING keys to use .value for SKEWNESS and KURTOSIS, matching the suggested change.

code diff:

-        MetricsType.SKEWNESS: "SkewnessMetric",
-        MetricsType.KURTOSIS: "KurtosisMetric",
+        MetricsType.SKEWNESS.value: "SkewnessMetric",
+        MetricsType.KURTOSIS.value: "KurtosisMetric",

In METRIC_MAPPING, use MetricsType.SKEWNESS.value and MetricsType.KURTOSIS.value
as keys instead of the enum members directly to maintain consistency with other
entries.

dcs_core/core/metric/manager.py [67-69]

 MetricsType.CUSTOM_SQL.value: "CustomSqlMetric",
-MetricsType.SKEWNESS: "SkewnessMetric",
-MetricsType.KURTOSIS: "KurtosisMetric",
+MetricsType.SKEWNESS.value: "SkewnessMetric",
+MetricsType.KURTOSIS.value: "KurtosisMetric",

[Suggestion processed]

Suggestion importance[1-10]: 5

__

Why: The suggestion correctly points out an inconsistency in how enum members are used as dictionary keys, and fixing it improves code consistency and robustness against future changes to the enum.

Low
  • Update

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/core/metric/test_numeric_metric.py (1)

1037-1158: Skewness/Kurtosis tests are consistent and useful

The new TestSkewnessAndKurtosisMetric suite:

  • Mirrors the existing numeric-metric patterns (Postgres vs. OpenSearch, with/without filters, plus validation).
  • Correctly stubs query_get_skewness / query_get_kurtosis on mocks, so tests are isolated from datasource details.
  • Exercises validation semantics in both directions (lt and gt thresholds).

If you want parity with some of the other metric tests, you could optionally add OpenSearch + filter cases for skewness and kurtosis, but the current coverage is already solid.

dcs_core/core/datasource/search_datasource.py (1)

449-455: Search datasource stubs are fine; consider adding type hints/docstrings

Defining query_get_skewness / query_get_kurtosis as NotImplementedError stubs is a reasonable interim step so metrics can at least fail fast when used with search backends.

For consistency with the rest of this class, consider tightening them up with type hints and brief docstrings, e.g.:

-    def query_get_skewness(self, index_name, field, filters=None):
-        raise NotImplementedError("Skewness metric not implemented for search datasource")
+    def query_get_skewness(
+        self, index_name: str, field: str, filters: Dict | None = None
+    ) -> float:
+        """Get skewness for a numeric field in a search index."""
+        raise NotImplementedError(
+            "Skewness metric not implemented for search datasource"
+        )

and similarly for query_get_kurtosis.

dcs_core/core/metric/numeric_metric.py (1)

415-479: New Skewness/Kurtosis metric implementations look correct and consistent

The SkewnessMetric and KurtosisMetric classes:

  • Use MetricIdentity.generate_identity with MetricsType.SKEWNESS / MetricsType.KURTOSIS and the same identity fields as other FieldMetrics implementations.
  • Correctly dispatch _generate_metric_value based on datasource type:
    • SQLDataSourcequery_get_skewness / query_get_kurtosis with table/field/filters.
    • SearchIndexDataSource → corresponding search methods with index_name/field/filters.
  • Fall back to ValueError("Invalid data source type") in the same style as existing metrics.

From a core-metric perspective, this wiring is solid. Any future change to use a more specific exception type should probably be applied uniformly across all FieldMetrics subclasses, not just these two.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 425a7f9 and 4f24cc2.

📒 Files selected for processing (5)
  • dcs_core/core/datasource/search_datasource.py (1 hunks)
  • dcs_core/core/datasource/sql_datasource.py (1 hunks)
  • dcs_core/core/metric/manager.py (2 hunks)
  • dcs_core/core/metric/numeric_metric.py (1 hunks)
  • tests/core/metric/test_numeric_metric.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
dcs_core/core/datasource/sql_datasource.py (1)
dcs_core/core/datasource/search_datasource.py (2)
  • query_get_skewness (450-451)
  • query_get_kurtosis (453-454)
dcs_core/core/datasource/search_datasource.py (1)
dcs_core/core/datasource/sql_datasource.py (2)
  • query_get_skewness (1138-1140)
  • query_get_kurtosis (1143-1145)
dcs_core/core/metric/numeric_metric.py (4)
dcs_core/core/metric/base.py (1)
  • FieldMetrics (213-234)
dcs_core/core/common/models/metric.py (1)
  • MetricsType (26-53)
dcs_core/core/datasource/sql_datasource.py (2)
  • query_get_skewness (1138-1140)
  • query_get_kurtosis (1143-1145)
dcs_core/core/datasource/search_datasource.py (2)
  • query_get_skewness (450-451)
  • query_get_kurtosis (453-454)
tests/core/metric/test_numeric_metric.py (5)
dcs_core/core/metric/numeric_metric.py (2)
  • SkewnessMetric (415-446)
  • KurtosisMetric (448-479)
dcs_core/core/datasource/sql_datasource.py (3)
  • SQLDataSource (28-1145)
  • query_get_skewness (1138-1140)
  • query_get_kurtosis (1143-1145)
dcs_core/core/datasource/search_datasource.py (3)
  • query_get_skewness (450-451)
  • SearchIndexDataSource (23-454)
  • query_get_kurtosis (453-454)
dcs_core/core/common/models/metric.py (1)
  • MetricsType (26-53)
dcs_core/core/metric/base.py (1)
  • get_metric_value (127-175)
dcs_core/core/metric/manager.py (2)
dcs_core/core/metric/numeric_metric.py (2)
  • SkewnessMetric (415-446)
  • KurtosisMetric (448-479)
dcs_core/core/common/models/metric.py (1)
  • MetricsType (26-53)
🪛 Ruff (0.14.6)
dcs_core/core/datasource/sql_datasource.py

1139-1139: Possible SQL injection vector through string-based query construction

(S608)


1144-1144: Possible SQL injection vector through string-based query construction

(S608)

dcs_core/core/metric/numeric_metric.py

446-446: Prefer TypeError exception for invalid type

(TRY004)


446-446: Avoid specifying long messages outside the exception class

(TRY003)


479-479: Prefer TypeError exception for invalid type

(TRY004)


479-479: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (3)
dcs_core/core/metric/manager.py (1)

39-40: New numeric metric imports look correct

Importing SkewnessMetric and KurtosisMetric here is consistent with how other numeric metrics are exposed for the globals() factory; no issues from this change.

tests/core/metric/test_numeric_metric.py (1)

33-34: Imports for new metrics are wired correctly

Exposing SkewnessMetric and KurtosisMetric in the test module aligns with the new metric implementations in numeric_metric.py; nothing to change here.

dcs_core/core/datasource/sql_datasource.py (1)

1138-1145: Align skewness/kurtosis SQL helpers with existing query_ patterns (and avoid S608 risk)*

The new methods currently do:

def query_get_skewness(self, table, field, filters=None):
    query = f"SELECT skewness({field}) FROM {table}"
    return self.execute_query(query, filters)

def query_get_kurtosis(self, table, field, filters=None):
    query = f"SELECT kurtosis({field}) FROM {table}"
    return self.execute_query(query, filters)

Two problems:

  1. Inconsistent with the rest of the class: other helpers (query_get_avg, query_get_variance, etc.) use qualified_table_name and quote_column, and explicitly splice filters into the SQL string.
  2. Static-analysis‑flagged SQL construction: embedding raw table/field without quoting is exactly what Ruff's S608 is warning about and can lead to malformed SQL or injection if those values are not fully controlled.

To match existing patterns and address the warning, consider something along these lines:

-    def query_get_skewness(self, table, field, filters=None):
-        query = f"SELECT skewness({field}) FROM {table}"
-        return self.execute_query(query, filters)
+    def query_get_skewness(
+        self, table: str, field: str, filters: str | None = None
+    ) -> float:
+        qualified_table_name = self.qualified_table_name(table)
+        field = self.quote_column(field)
+        query = f"SELECT skewness({field}) FROM {qualified_table_name}"
+        if filters:
+            query += f" WHERE {filters}"
+        result = self.fetchone(query)
+        return result[0] if result else None
@@
-    def query_get_kurtosis(self, table, field, filters=None):
-        query = f"SELECT kurtosis({field}) FROM {table}"
-        return self.execute_query(query, filters)
+    def query_get_kurtosis(
+        self, table: str, field: str, filters: str | None = None
+    ) -> float:
+        qualified_table_name = self.qualified_table_name(table)
+        field = self.quote_column(field)
+        query = f"SELECT kurtosis({field}) FROM {qualified_table_name}"
+        if filters:
+            query += f" WHERE {filters}"
+        result = self.fetchone(query)
+        return result[0] if result else None

This keeps behavior in line with other numeric metrics and reduces the surface area for SQL construction issues.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
dcs_core/core/datasource/sql_datasource.py (1)

1159-1164: Critical: Same issues as query_get_skewness - execute_query undefined, pattern inconsistency, database compatibility.

This method has identical issues to query_get_skewness:

  • self.execute_query() does not exist in the class
  • Does not follow established patterns using qualified_table_name() and quote_column()
  • kurtosis() is not a standard SQL function

Apply the same fix pattern:

     def query_get_kurtosis(self, table, field, filters=None):
-        field = self._sanitize_identifier(field)
-        table = self._sanitize_identifier(table)
-
-        query = f"SELECT kurtosis({field}) FROM {table}"
-        return self.execute_query(query, filters)
+        qualified_table_name = self.qualified_table_name(table)
+        field = self.quote_column(field)
+        query = f"SELECT kurtosis({field}) FROM {qualified_table_name}"
+        if filters:
+            query += f" WHERE {filters}"
+        return self.fetchone(query)[0]
🧹 Nitpick comments (1)
dcs_core/core/datasource/sql_datasource.py (1)

1138-1149: Use TypeError for type validation and consider integration with existing quoting methods.

The sanitizer validates identifiers correctly for alphanumeric and underscore characters. However:

  1. Type validation should raise TypeError rather than ValueError (line 1144).
  2. This sanitizer doesn't integrate with the existing quote_column() and qualified_table_name() methods used throughout this class. Consider whether the sanitizer should be used in conjunction with those methods or if it's meant to replace them.

Apply this diff to use TypeError for type validation:

     def _sanitize_identifier(self, identifier: str) -> str:
         """
         Sanitize SQL identifiers to prevent injection.
         Allows only alphanumeric characters and underscores.
         """
         if not isinstance(identifier, str):
-            raise ValueError("Identifier must be a string")
+            raise TypeError("Identifier must be a string")
 
         if not identifier.replace("_", "").isalnum():
             raise ValueError(f"Invalid SQL identifier: {identifier}")
 
         return identifier
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f24cc2 and 59664a8.

📒 Files selected for processing (1)
  • dcs_core/core/datasource/sql_datasource.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
dcs_core/core/datasource/sql_datasource.py (1)
dcs_core/core/datasource/search_datasource.py (2)
  • query_get_skewness (450-451)
  • query_get_kurtosis (453-454)
🪛 Ruff (0.14.6)
dcs_core/core/datasource/sql_datasource.py

1144-1144: Prefer TypeError exception for invalid type

(TRY004)


1144-1144: Avoid specifying long messages outside the exception class

(TRY003)


1147-1147: Avoid specifying long messages outside the exception class

(TRY003)


1155-1155: Possible SQL injection vector through string-based query construction

(S608)


1163-1163: Possible SQL injection vector through string-based query construction

(S608)

🔇 Additional comments (1)
dcs_core/core/datasource/sql_datasource.py (1)

1151-1156: I was unable to verify the claims in this review comment due to repository access issues. The shell scripts failed to clone the repository, preventing me from examining:

  • Whether execute_query() method actually exists in the class
  • The actual implementation patterns of other query_get_* methods (e.g., query_get_max, query_get_avg)
  • Whether _sanitize_identifier() is a legitimate method in the class
  • Whether the suggested diff pattern using qualified_table_name(), quote_column(), and fetchone() is correct
  • How skewness() function is used elsewhere in the codebase

The review comment makes specific, actionable claims about critical runtime failures and inconsistent patterns that require verification against the actual codebase. Without access to verify these factual assertions, I cannot confirm whether the issues are real or if the suggested fixes are appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant