Skip to content

feat(ibis): add OpenTelemetry tracing logs #1080

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 6, 2025

Conversation

goldmedal
Copy link
Contributor

@goldmedal goldmedal commented Mar 5, 2025

Description

This PR invokes OpenTelemetry as the tracing framework. It makes us easily export the tracing log to an external framework (e.g., datadog, jaeger, GCP cloud tracing, ..).

How to test

Try to use the following just command to start the ibis server, which will export the tracing log the console:

just run-trace-console

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Integrated comprehensive tracing capabilities with OpenTelemetry to enhance observability of critical operations.
    • Introduced a new command that launches the server with real-time trace logs.
    • Added a new document outlining traced metrics utilized within the ibis-server codebase.
  • Documentation

    • Updated the user guide with steps to enable tracing and configure zero-code instrumentation for improved monitoring.

Copy link

coderabbitai bot commented Mar 5, 2025

Walkthrough

The changes integrate OpenTelemetry tracing throughout the Ibis server. A new "Enable Tracing" section is added to the README with instructions and a new command target in the justfile. Instrumentation is applied to many functions across modules—including rewriter, model substitution, connector classes, and FastAPI endpoints—by wrapping them with OpenTelemetry tracer spans. Additionally, utility functions have been enhanced for header context management, and new dependencies for OpenTelemetry have been added to the project configuration.

Changes

File(s) Change Summary
ibis-server/README.md Added "Enable Tracing" section with instructions for installing dependencies and using the just run-trace-console command.
ibis-server/app/mdl/rewriter.py, ibis-server/app/mdl/substitute.py, ibis-server/app/model/connector.py Instrumented key class methods with OpenTelemetry tracer spans to monitor execution without modifying core logic.
ibis-server/app/routers/v2/connector.py, ibis-server/app/routers/v3/connector.py Updated FastAPI endpoint signatures to include a headers parameter and applied tracer spans for enhanced request observability.
ibis-server/app/util.py Added tracing spans to utility functions and introduced new helper function (build_context) for managing header context.
ibis-server/justfile Added a new target run-trace-console to launch the FastAPI server with OpenTelemetry instrumentation enabled.
ibis-server/pyproject.toml Added dependencies for opentelemetry-api and opentelemetry-sdk with version constraints (>=1.30.0).
ibis-server/Metrics.md Introduced a new document detailing traced metrics across various modules and API endpoints.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API
    participant Tracer
    participant Service
    Client->>API: Sends request with headers
    API->>Tracer: Start span (build_context extracts header info)
    API->>Service: Execute business logic
    Service-->>API: Return result
    Tracer-->>API: End span
    API->>Client: Return response
Loading

Suggested labels

core

Suggested reviewers

  • onlyjackfrost
  • wwwy3y3

Poem

I'm a happy rabbit, hopping through code all day,
With tracing spans lit, guiding my every way.
Logs and headers twinkle like stars in the night,
In a garden of changes, everything feels just right.
Hop on the trail of progress—watch me leap with delight!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 72929af and 98ac503.

📒 Files selected for processing (3)
  • ibis-server/Metrics.md (1 hunks)
  • ibis-server/README.md (1 hunks)
  • ibis-server/app/util.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • ibis-server/Metrics.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • ibis-server/app/util.py
🧰 Additional context used
🪛 LanguageTool
ibis-server/README.md

[uncategorized] ~81-~81: Possible missing preposition found.
Context: ...un the server bash just run ### Enable Tracing We uses OpenTelemetry as its tr...

(AI_HYDRA_LEO_MISSING_TO)


[grammar] ~82-~82: The pronoun ‘We’ must be used with a non-third-person form of a verb.
Context: ...ash just run ``` ### Enable Tracing We uses OpenTelemetry as its tracing framework....

(NON3PRS_VERB)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci
🔇 Additional comments (1)
ibis-server/README.md (1)

81-90: Fix the Tracing Documentation

There are a few issues in this section:

  • The sentence "We uses OpenTelemetry as its tracing framework." contains a grammatical error. Consider revising it to "We use OpenTelemetry as our tracing framework."
  • The fenced code block (lines 84–86) currently doesn’t specify a language. Specifying it as bash (i.e. using "```bash") will improve syntax highlighting and readability.
  • Optionally, adding hyperlinks to the relevant OpenTelemetry documentation could further enhance clarity for users.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~81-~81: Possible missing preposition found.
Context: ...un the server bash just run ### Enable Tracing We uses OpenTelemetry as its tr...

(AI_HYDRA_LEO_MISSING_TO)


[grammar] ~82-~82: The pronoun ‘We’ must be used with a non-third-person form of a verb.
Context: ...ash just run ``` ### Enable Tracing We uses OpenTelemetry as its tracing framework....

(NON3PRS_VERB)


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added documentation Improvements or additions to documentation ibis dependencies Pull requests that update a dependency file python Pull requests that update Python code labels Mar 5, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
ibis-server/app/routers/v2/connector.py (1)

114-115: 🛠️ Refactor suggestion

Missing tracing for get_db_version endpoint

While all other endpoints have been updated to include tracing, the get_db_version endpoint has been overlooked.

For consistency, consider adding tracing to this endpoint as well:

 @router.post("/{data_source}/metadata/version")
-def get_db_version(data_source: DataSource, dto: MetadataDTO) -> str:
-    return MetadataFactory.get_metadata(data_source, dto.connection_info).get_version()
+def get_db_version(
+    data_source: DataSource,
+    dto: MetadataDTO,
+    headers: Annotated[str | None, Header()] = None,
+) -> str:
+    span_name = f"v2_metadata_version_{data_source}"
+    with tracer.start_as_current_span(
+        name=span_name, kind=trace.SpanKind.SERVER, context=build_context(headers)
+    ):
+        return MetadataFactory.get_metadata(data_source, dto.connection_info).get_version()
🧹 Nitpick comments (10)
ibis-server/pyproject.toml (1)

39-40: Appropriate OpenTelemetry dependencies

Adding these dependencies with version constraints of >=1.30.0 is the correct approach. This allows compatibility with future versions while ensuring a minimum feature set.

However, you might want to consider adding the instrumentation packages as well for automatic instrumentation of common libraries.

opentelemetry-api = ">=1.30.0"
opentelemetry-sdk = ">=1.30.0"
+opentelemetry-instrumentation-fastapi = ">=0.44b0"
+opentelemetry-instrumentation-httpx = ">=0.44b0"
ibis-server/app/mdl/substitute.py (3)

18-18: Correctly implemented span for substitute method

The span decorator is properly configured with the appropriate name and span kind. This will provide valuable tracing information about the method's execution.

Consider adding attributes to the span to provide more context about the operation.

-@tracer.start_as_current_span("substitute", kind=trace.SpanKind.INTERNAL)
+@tracer.start_as_current_span("substitute", kind=trace.SpanKind.INTERNAL)
 def substitute(self, sql: str, write: str | None = None) -> str:
+    current_span = trace.get_current_span()
+    current_span.set_attribute("sql.query", sql)
+    current_span.set_attribute("sql.dialect", self.data_source.value)
+    if write:
+        current_span.set_attribute("sql.write_dialect", write)

18-36: Consider adding internal spans for better granularity

For a complex method like this, consider adding nested spans for key operations to provide more detailed tracing.

 @tracer.start_as_current_span("substitute", kind=trace.SpanKind.INTERNAL)
 def substitute(self, sql: str, write: str | None = None) -> str:
-    ast = parse_one(sql, dialect=self.data_source.value)
-    root = build_scope(ast)
+    with tracer.start_as_current_span("parse_sql"):
+        ast = parse_one(sql, dialect=self.data_source.value)
+    
+    with tracer.start_as_current_span("build_scope"):
+        root = build_scope(ast)
+    
+    with tracer.start_as_current_span("replace_models"):
         for scope in root.traverse():
             for alias, (node, source) in scope.selected_sources.items():
                 if isinstance(source, exp.Table):
                     model = self._find_model(source)
                     if model is None:
                         raise SubstituteError(f"Model not found: {source}")
                     source.replace(
                         exp.Table(
                             catalog=quote(self.manifest["catalog"]),
                             db=quote(self.manifest["schema"]),
                             this=quote(model["name"]),
                             alias=quote(alias),
                         )
                     )
-    return ast.sql(dialect=write)
+    
+    with tracer.start_as_current_span("generate_sql"):
+        result = ast.sql(dialect=write)
+    return result

46-50: Consider adding tracing to _find_model method

This helper method might benefit from tracing, especially if model lookups could be a performance bottleneck.

+@tracer.start_as_current_span("_find_model", kind=trace.SpanKind.INTERNAL)
 def _find_model(self, source: exp.Table) -> dict | None:
     catalog = source.catalog or ""
     schema = source.db or ""
     table = source.name
-    return self.model_dict.get(f"{catalog}.{schema}.{table}", None)
+    key = f"{catalog}.{schema}.{table}"
+    current_span = trace.get_current_span()
+    current_span.set_attribute("model.key", key)
+    result = self.model_dict.get(key, None)
+    current_span.set_attribute("model.found", result is not None)
+    return result
ibis-server/app/util.py (1)

104-105: Consider removing unnecessary function

The get_header function simply returns its input without any transformation, making it potentially redundant if not used for specific purposes elsewhere.

-def get_header(headers: Headers) -> Headers:
-    return headers

Unless this function is used elsewhere as a dependency injection point or for future extensibility, it could be removed to simplify the codebase.

ibis-server/app/routers/v3/connector.py (1)

78-93: Consider adding error attributes to spans

While the span implementation is correct, consider adding error details as span attributes when exceptions occur for better observability.

with tracer.start_as_current_span(
    name=span_name, kind=trace.SpanKind.SERVER, context=build_context(headers)
) as span:
    try:
        validator = Validator(
            Connector(data_source, dto.connection_info),
            Rewriter(dto.manifest_str, data_source=data_source, experiment=True),
        )
        await validator.validate(rule_name, dto.parameters, dto.manifest_str)
        return Response(status_code=204)
+    except Exception as e:
+        span.set_status(trace.StatusCode.ERROR)
+        span.record_exception(e)
+        raise
ibis-server/app/model/connector.py (2)

71-77: Consider adding span attributes for query details

While the span implementation is correct, adding attributes like query length or complexity metrics would enhance observability.

@tracer.start_as_current_span("connector_query", kind=trace.SpanKind.CLIENT)
def query(self, sql: str, limit: int) -> pd.DataFrame:
+    with trace.use_span(trace.get_current_span()) as span:
+        span.set_attribute("db.statement.length", len(sql))
+        span.set_attribute("db.result.limit", limit)
    return self.connection.sql(sql).limit(limit).to_pandas()

111-122: Distinguish spans between connector implementations

Consider using more specific span names to differentiate between different connector implementations.

-@tracer.start_as_current_span("connector_query", kind=trace.SpanKind.CLIENT)
+@tracer.start_as_current_span("canner_connector_query", kind=trace.SpanKind.CLIENT)
def query(self, sql: str, limit: int) -> pd.DataFrame:
    # ...
ibis-server/app/routers/v2/connector.py (2)

41-46: Consider adding span attributes for enhanced observability

While the current implementation correctly creates spans, adding attributes to these spans would provide richer context for debugging and monitoring.

Consider enhancing the spans with relevant attributes. For example:

 with tracer.start_as_current_span(
     name=span_name, kind=trace.SpanKind.SERVER, context=build_context(headers)
-):
+) as span:
+    span.set_attribute("data_source", str(data_source))
+    span.set_attribute("operation", "query")
+    # Add other relevant attributes specific to the endpoint

This pattern could be applied to all endpoints, customizing the attributes based on the specific operation and available context data.

Also applies to: 67-70, 89-92, 104-107, 124-126, 139-142, 157-160


41-56: Add error handling to spans

The current implementation doesn't explicitly handle exceptions within spans. Adding proper error handling would ensure errors are properly recorded in traces.

Consider updating the span implementations to include error handling:

 with tracer.start_as_current_span(
     name=span_name, kind=trace.SpanKind.SERVER, context=build_context(headers)
-):
+) as span:
+    try:
         # Existing implementation
+    except Exception as e:
+        span.record_exception(e)
+        span.set_status(trace.StatusCode.ERROR, str(e))
+        raise

This pattern ensures that exceptions are captured in the trace, providing better observability for debugging issues.

Also applies to: 67-80, 89-95, 104-110, 124-129, 139-147, 157-171

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b698a58 and 72929af.

⛔ Files ignored due to path filters (1)
  • ibis-server/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • ibis-server/README.md (1 hunks)
  • ibis-server/app/mdl/rewriter.py (6 hunks)
  • ibis-server/app/mdl/substitute.py (1 hunks)
  • ibis-server/app/model/connector.py (8 hunks)
  • ibis-server/app/routers/v2/connector.py (5 hunks)
  • ibis-server/app/routers/v3/connector.py (3 hunks)
  • ibis-server/app/util.py (2 hunks)
  • ibis-server/justfile (1 hunks)
  • ibis-server/pyproject.toml (1 hunks)
🧰 Additional context used
🪛 LanguageTool
ibis-server/README.md

[grammar] ~82-~82: The pronoun ‘We’ must be used with a non-third-person form of a verb.
Context: ...ash just run ``` ### Enable Tracing We uses OpenTelemetry as its tracing framework....

(NON3PRS_VERB)

🪛 markdownlint-cli2 (0.17.2)
ibis-server/README.md

84-84: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci
🔇 Additional comments (39)
ibis-server/justfile (1)

26-27: Well-implemented tracing configuration

The run-trace-console command correctly configures OpenTelemetry instrumentation with appropriate settings. Exporting traces to console while disabling metrics is a good approach for local development and debugging.

ibis-server/app/mdl/substitute.py (2)

1-1: Good addition of OpenTelemetry import

The OpenTelemetry trace module import is correctly added.


9-10: Well-structured tracer initialization

Creating the tracer with __name__ is the recommended approach as it associates spans with the correct module.

ibis-server/app/util.py (5)

7-10: Good import organization for OpenTelemetry

The necessary imports for OpenTelemetry and header handling are properly organized. This is essential for setting up distributed tracing correctly.


14-14: Appropriate tracer instantiation

The tracer is correctly initialized using the module name, which is a best practice for OpenTelemetry instrumentation.


17-19: Well-structured span decoration for base64_to_dict

The tracing span is properly configured with a descriptive name and the correct kind for internal operations.


22-22: Well-structured span decoration for to_json

Similar to the previous function, this is properly instrumented with a descriptive span name and appropriate kind.


98-101: Properly implemented context extraction

The build_context function correctly handles header propagation for distributed tracing. The null check is important for robustness.

ibis-server/app/mdl/rewriter.py (7)

7-8: Clean import for OpenTelemetry

The import for the trace module is properly isolated and clearly indicates the dependency.


25-26: Correct tracer initialization

The tracer is initialized with the module name, following OpenTelemetry best practices.


45-49: Appropriate span for _transpile method

The span is correctly configured with a descriptive name and the appropriate SpanKind.INTERNAL for an internal operation.


51-61: Well-structured tracing for rewrite method

The span properly wraps the entire method, which is a good approach for capturing the entire operation duration.


63-71: Comprehensive tracing for manifest extraction

The span correctly wraps the manifest extraction logic, including potential exception handling, which is valuable for debugging.


95-105: Use of CLIENT span kind for external rewrite

Excellent choice of using SpanKind.CLIENT for the external rewriter's method, which correctly indicates that this operation involves communication with an external service.


115-122: Consistent tracing for embedded rewrite

The span for the embedded rewrite method maintains consistency with the external rewrite method, while correctly using INTERNAL kind.

ibis-server/app/routers/v3/connector.py (8)

3-6: Properly structured imports for tracing and headers

The imports are cleanly organized with minimal changes to include OpenTelemetry tracing and FastAPI header support.


21-24: Updated util import and tracer initialization

The import for the new build_context function and the tracer initialization are properly implemented.


33-33: Appropriate header parameter addition

The header parameter is correctly added as an optional annotated parameter, following FastAPI best practices.


35-48: Well-structured span for query endpoint

The span is properly configured with a dynamic name based on the operation and data source, and correctly uses the SERVER kind. The context extraction from headers is also properly implemented.


52-59: Consistent tracing approach for dry_plan endpoint

The span follows the same pattern as the query endpoint, maintaining consistency across endpoints.


63-74: Dynamic span naming for data source specific endpoint

Good practice of including the data source in the span name for better context in traces.


97-108: Synchronous function with tracing

Good implementation of tracing for a synchronous function, maintaining the same pattern as the async endpoints.


112-129: Comprehensive tracing for complex model_substitute endpoint

The span properly wraps the entire operation, which involves multiple steps (substitution, dry run, etc.).

ibis-server/app/model/connector.py (6)

18-19: Clean import for OpenTelemetry

The import is isolated and clearly indicates the dependency.


34-35: Correct tracer initialization

The tracer uses the module name for proper identification.


38-55: Tracing for connector initialization

Good practice to trace the initialization process, which helps identify startup performance issues.


94-97: Well-named span for error handling method

The span name clearly indicates its purpose for error message generation.


158-179: Detailed nested span for BigQuery schema handling

Excellent detail in tracing specific error handling logic with a dedicated span. This will help diagnose schema-related issues.


196-213: Appropriate spans for DuckDB operations

The spans are correctly configured with INTERNAL kind, reflecting that these operations don't cross service boundaries.

ibis-server/app/routers/v2/connector.py (10)

3-5: Added OpenTelemetry tracing imports correctly

The imports are properly organized and include the necessary OpenTelemetry trace module.


22-22: Successfully imported build_context utility

The build_context utility function is appropriately imported for handling tracing context.


25-25: Proper tracer initialization

The tracer is correctly initialized using the module name as per OpenTelemetry best practices.


33-56: Well-implemented tracing for query endpoint

The implementation effectively wraps the existing functionality with a tracing span. The span name is dynamically generated based on the dry_run parameter, which provides valuable context.


60-80: Proper tracing implementation for validate endpoint

The tracing span correctly wraps the validation logic with an appropriate span name.


83-95: Appropriate tracing for metadata table endpoint

The implementation follows the established pattern for tracing consistently.


98-110: Good tracing implementation for constraints endpoint

The span naming convention is consistent with other endpoints, which will help with trace analysis.


119-129: Appropriate tracing for dry_plan endpoint

The span name is clear and the tracing implementation is correct.


133-147: Good implementation for dry_plan_for_data_source

The tracing implementation is consistent with other endpoints.


151-171: Well-structured tracing for model_substitute endpoint

The tracing implementation correctly encompasses all operations within the endpoint.

@goldmedal goldmedal requested a review from douenergy March 6, 2025 02:17
Copy link
Contributor

@douenergy douenergy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good. Just a small nitpick: I think we should add what metrics we currently trace.

i just add metrics.md

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @douenergy. It looks great!

@douenergy douenergy merged commit 72a0e97 into Canner:main Mar 6, 2025
7 checks passed
@goldmedal goldmedal deleted the feat/add-tracing-log branch March 6, 2025 06:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation ibis python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants