Skip to content

feat(low-code): add check dynamic stream #223

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 16, 2025

Conversation

lazebnyi
Copy link
Contributor

@lazebnyi lazebnyi commented Jan 16, 2025

What

The current implementation can only check connections for static streams. If a source implements only dynamic streams, this check is bypassed.

Fixed: https://github.com/airbytehq/airbyte-internal-issues/issues/11371

How

Added CheckDynamicStream to the CDK to natively check connections for cases where the source implements only dynamic streams.

Summary by CodeRabbit

  • New Features

    • Introduced CheckDynamicStream to verify connectivity of dynamic streams.
    • Added support for checking multiple streams with configurable stream count.
    • Enhanced schema validation to support dynamic stream checks.
  • Improvements

    • Updated component factory to handle dynamic stream creation.
    • Expanded type mapping for more flexible stream checking.
    • Improved error handling for stream availability checks.
  • Documentation

    • Updated schema and component documentation to reflect new dynamic stream capabilities.
  • Tests

    • Added tests for CheckDynamicStream functionality, covering various HTTP response scenarios.

Copy link
Contributor

coderabbitai bot commented Jan 16, 2025

📝 Walkthrough

Walkthrough

This pull request introduces a CheckDynamicStream feature to the Airbyte CDK, enhancing connection checking capabilities for declarative sources. It adds a new component for verifying connectivity across multiple streams with a configurable stream count. The changes encompass several files, including new model definitions, schema updates, and modifications to existing infrastructure to support dynamic stream checking.

Changes

File Change Summary
airbyte_cdk/sources/declarative/checks/__init__.py Added imports for new models and CheckDynamicStream, created COMPONENTS_CHECKER_TYPE_MAPPING, updated __all__
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py New CheckDynamicStream class implementing dynamic stream connection checking
airbyte_cdk/sources/declarative/declarative_component_schema.yaml Updated check property to support multiple stream check types, added CheckDynamicStream definition
airbyte_cdk/sources/declarative/manifest_declarative_source.py Modified connection checker to use COMPONENTS_CHECKER_TYPE_MAPPING
airbyte_cdk/sources/declarative/models/declarative_component_schema.py Added CheckDynamicStream class, updated type hints for check field
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py Added method to create CheckDynamicStream components
unit_tests/sources/declarative/checks/test_check_dynamic_stream.py New test file for CheckDynamicStream functionality

Possibly related PRs

Suggested Reviewers

  • maxi297
  • darynaishchenko
  • aldogonzalez8

Hey there! 👋 I noticed you've added a really cool dynamic stream checking feature. Would you be open to discussing how this might impact existing declarative source configurations? Wdyt? The new stream_count parameter seems like it could be super flexible for different use cases. 🚀

Finishing Touches

  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (2)

21-22: Minor correction in docstring

Would you consider updating the docstring to change "numbers of streams to check" to "number of streams to check"? Wdyt?


37-50: Instantiate HttpAvailabilityStrategy outside the loop

To improve efficiency, how about instantiating the HttpAvailabilityStrategy before the loop? This way, we avoid creating a new instance in each iteration. Wdyt?

You could apply this diff:

     def check_connection(
         self, source: AbstractSource, logger: logging.Logger, config: Mapping[str, Any]
     ) -> Tuple[bool, Any]:
         streams = source.streams(config=config)
         if len(streams) == 0:
             return False, f"No streams to connect to from source {source}"

+        availability_strategy = HttpAvailabilityStrategy()
         for stream_index in range(min(self.stream_count, len(streams))):
             stream = streams[stream_index]
-            availability_strategy = HttpAvailabilityStrategy()
             try:
                 stream_is_available, reason = availability_strategy.check_availability(
                     stream, logger
                 )
                 if not stream_is_available:
                     return False, reason
             except Exception as error:
                 logger.error(
                     f"Encountered an error trying to connect to stream {stream.name}. Error: \n {traceback.format_exc()}"
                 )
                 return False, f"Unable to connect to stream {stream.name} - {error}"
         return True, None
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c109297 and 1d5f5bc.

📒 Files selected for processing (8)
  • airbyte_cdk/sources/declarative/checks/__init__.py (1 hunks)
  • airbyte_cdk/sources/declarative/checks/check_dynamic_stream.py (1 hunks)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3 hunks)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (2 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
  • airbyte_cdk/sources/declarative/requesters/README.md (2 hunks)
  • unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/sources/declarative/requesters/README.md
👮 Files not reviewed due to content moderation or server errors (4)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py

[warning] 5-455: Import block is un-sorted or un-formatted

airbyte_cdk/sources/declarative/manifest_declarative_source.py

[warning] 5-56: Import block is un-sorted or un-formatted

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py

[warning] 5-14: Import block is un-sorted or un-formatted

airbyte_cdk/sources/declarative/checks/__init__.py

[warning] 5-21: Import block is un-sorted or un-formatted

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/checks/__init__.py (1)

5-21: Address linter warning on import formatting

There's a linter warning indicating that the import block is un-sorted or un-formatted. Maybe we can sort and format the imports to comply with the code style guidelines? Wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] 5-21: Import block is un-sorted or un-formatted

unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (1)

5-14: Address linter warning on import formatting

There's a linter warning about the imports being un-sorted or un-formatted. Perhaps we could sort and format the import statements to fix this? Wdyt?

🧰 Tools
🪛 GitHub Actions: Linters

[warning] 5-14: Import block is un-sorted or un-formatted

@lazebnyi
Copy link
Contributor Author

lazebnyi commented Jan 16, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@lazebnyi lazebnyi changed the title [low-code]: add check dynamic stream feat[low-code]: add check dynamic stream Jan 16, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

847-852: LGTM! Would you consider adding docstring for better maintainability?

The implementation looks good and follows the established patterns. For consistency with other create methods, would you consider adding a docstring to document the purpose and parameters? wdyt?

Here's a suggested docstring:

 @staticmethod
 def create_check_dynamic_stream(
     model: CheckDynamicStreamModel, config: Config, **kwargs: Any
 ) -> CheckDynamicStream:
+    """Creates a CheckDynamicStream component for checking dynamic streams.
+
+    Args:
+        model: The model containing the configuration for the dynamic stream checker
+        config: The connector configuration
+        **kwargs: Additional keyword arguments
+
+    Returns:
+        A CheckDynamicStream instance configured with the specified stream count
+    """
     return CheckDynamicStream(stream_count=model.stream_count, parameters={})
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d5f5bc and 8f017a8.

📒 Files selected for processing (4)
  • airbyte_cdk/sources/declarative/checks/__init__.py (1 hunks)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py (2 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
  • unit_tests/sources/declarative/checks/test_check_dynamic_stream.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • airbyte_cdk/sources/declarative/manifest_declarative_source.py
  • unit_tests/sources/declarative/checks/test_check_dynamic_stream.py
  • airbyte_cdk/sources/declarative/checks/init.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)

57-57: LGTM!

The import statement is correctly placed in the imports section and follows the alphabetical order.


124-126: LGTM!

The import statement for the model is correctly placed and follows the same pattern as other model imports.

@lazebnyi lazebnyi changed the title feat[low-code]: add check dynamic stream feat(low-code): add check dynamic stream Jan 16, 2025
@github-actions github-actions bot added the enhancement New feature or request label Jan 16, 2025
Copy link
Contributor

@darynaishchenko darynaishchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! in terms of source-airtable this is exactly what we need

@lazebnyi
Copy link
Contributor Author

lazebnyi commented Jan 16, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

308-322: The schema looks good! Would you consider enhancing the documentation?

The implementation is solid, but the documentation could be more helpful for users. What do you think about:

  1. Adding examples of when to use CheckDynamicStream vs CheckStream?
  2. Clarifying in the stream_count description how this number is used (e.g., "Specifies how many dynamic streams to sample during the check operation")?
  3. Adding example values for stream_count to guide users?
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

586-588: Consider adding more diverse examples for scopes, wdyt?

The current example shows only CRM-related scopes. Consider adding examples from different domains to better illustrate the variety of possible scope combinations.

         examples=[
-            ["crm.list.read", "crm.objects.contacts.read", "crm.schema.contacts.read"]
+            ["crm.list.read", "crm.objects.contacts.read"],
+            ["analytics.read", "users.write", "files.all"],
+            ["read:users", "write:posts", "admin"]
         ],
🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. The code does not conform to the project's formatting standards.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01b528f and 233c383.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (12 hunks)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/models/declarative_component_schema.py

[warning] File requires formatting. The code does not conform to the project's formatting standards.

⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (7)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

21-23: LGTM! Clean implementation of multiple check types.

The schema change elegantly supports both static and dynamic stream checks while maintaining backward compatibility.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)

55-62: LGTM! Clean implementation of CheckDynamicStream.

The implementation is simple, focused, and well-documented. The stream_count field has clear description and title.

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. The code does not conform to the project's formatting standards.


1670-1670: LGTM! Good enhancement to support dynamic stream checking.

The update to both DeclarativeSource1 and DeclarativeSource2 to support CheckDynamicStream is backward compatible and maintains consistency between the two versions.

Also applies to: 1696-1696

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. The code does not conform to the project's formatting standards.


992-1006: LGTM! Enhanced type safety with explicit type hints.

The addition of type hints improves code maintainability and helps catch potential type-related issues early. The formatting is consistent across all updated sections.

Also applies to: 1615-1617, 2058-2062, 2105-2107, 2141-2145, 2209-2214

🧰 Tools
🪛 GitHub Actions: Linters

[warning] File requires formatting. The code does not conform to the project's formatting standards.

airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)

57-57: LGTM! Clean import addition.

The import of CheckDynamicStream maintains alphabetical order and follows the project's import style.


499-499: LGTM! Proper factory mapping.

The addition of CheckDynamicStreamModel to PYDANTIC_MODEL_TO_CONSTRUCTOR maintains alphabetical order and follows the existing pattern.


853-858: LGTM! Well-structured implementation.

The create_check_dynamic_stream method follows the same pattern as create_check_stream, is properly decorated as a static method, and has a clean implementation.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)

55-62: LGTM! Clean implementation of the new CheckDynamicStream class.

The class is well-structured with proper field descriptions and types. The stream_count field is appropriately marked as required with clear documentation.

Quick question: Would it be helpful to add a default or example value for the stream_count field to guide users? For example: examples=[1, 5]. wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 233c383 and 04aa255.

📒 Files selected for processing (1)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)

1662-1662: LGTM! Good update to support both static and dynamic stream checks.

The change to make check field support both CheckStream and CheckDynamicStream aligns well with the PR objective of enabling connection checks for dynamic streams.


1688-1688: LGTM! Consistent implementation across both source types.

The same change is correctly applied to DeclarativeSource2, maintaining consistency in the schema.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants