Skip to content

Add --exclude-from flag to sync commands #2660

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

anton-107
Copy link
Contributor

@anton-107 anton-107 commented Apr 4, 2025

Why

Exclude-from flag allows users to exclude file patterns from sync operations by providing a .gitignore -like file containing file patterns that should be excluded from sync

Usage:

$ databricks sync <SRC> <DEST> --exclude-from .databricksignore
$ databricks bundle sync --exclude-from .databricksignore

Tests

  1. Added new acceptance tests
  2. Added new unit test

@anton-107 anton-107 temporarily deployed to test-trigger-is April 4, 2025 15:27 — with GitHub Actions Inactive
@anton-107 anton-107 temporarily deployed to test-trigger-is April 4, 2025 15:29 — with GitHub Actions Inactive
@anton-107 anton-107 temporarily deployed to test-trigger-is April 4, 2025 15:33 — with GitHub Actions Inactive
@anton-107 anton-107 marked this pull request as ready for review April 4, 2025 15:33
cat > .databricksignore << EOF
project-folder/blob/*
project-folder/*.sql
EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to create these from script rather than create it in your worktree and commit with git?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, lets serialize test fixture files.

EOF
cat > .databricksignore << EOF
project-folder/blob/*
project-folder/*.sql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could check different version of .databricksignore:

  • refers to exact file (existing / missing(
  • refers to exact directory (existing / missing)
  • contains "**"

scanner := bufio.NewScanner(bytes.NewReader(data))
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
// Skip empty lines and comments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to duplicate functionality already present in go-gitignore

https://github.com/sabhiram/go-gitignore/blob/master/ignore.go#L182

// CompileIgnoreFile uses an ignore file as the input, parses the lines out of
// the file and invokes the CompileIgnoreLines method.
func CompileIgnoreFile(fpath string) (*GitIgnore, error) {

Pattern function strips comments and empty lines:
https://github.com/sabhiram/go-gitignore/blob/master/ignore.go#L77

	// Strip comments [Rule 2]
	if strings.HasPrefix(line, `#`) {
		return nil, false
	}

	// Trim string [Rule 3]
	// TODO: Handle [Rule 3], when the " " is escaped with a \
	line = strings.Trim(line, " ")

cat > .databricksignore << EOF
project-folder/blob/*
project-folder/*.sql
EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, lets serialize test fixture files.

Copy link
Contributor

@shreyas-goenka shreyas-goenka Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an independent test instead of inlining it within bundle/sync? Makes it easier to reason about.

Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect --include-from to be almost identical, could you include it in this PR?

Separately, I would prefer if we not use .databricksignore even in our test files. It implies (to me) there are semantics associated with this file name, even though that's something we explicitly don't want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants