* remove invalid legacy option
* remove unused option
* the tests pass but this is quite messy
* very slight clean up
* Add skip options to csv format
* fix some of the typing issues
* fixme comment
* remove extra log message
* fix typing issues
* skip before header
* skip after header
* format
* add another test
* Automated Commit - Formatting Changes
* auto generate column names
* delete dead code
* update title and description
* true and false values
* Update the tests
* Add comment
* missing test
* rename
* update expected spec
* move to method
* Update comment
* fix typo
* remove unused import
* Add a comment
* None records do not pass the WaitForDiscoverPolicy
* format
* remove second branch to ensure we always go through the same processing
* Raise an exception if the record is None
* reset
* Update tests
* handle unquoted newlines
* Automated Commit - Formatting Changes
* Update test case so the quoting is explicit
* Update comment
* Automated Commit - Formatting Changes
* Fail validation if skipping rows before header and header is autogenerated
* always fail if a record cannot be parsed
* format
* set write line_no in error message
* remove none check
* Automated Commit - Formatting Changes
* enable autogenerate test
* remove duplicate test
* missing unit tests
* Update
* remove branching
* remove unused none check
* Update tests
* remove branching
* format
* extract to function
* comment
* missing type
* type annotation
* use set
* Document that the strings are case-sensitive
* public -> private
* add unit test
* newline
---------
Co-authored-by: girarda <[email protected]>
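One item in the commit list above reads "Fail validation if skipping rows before header and header is autogenerated". A minimal sketch of what such a check could look like follows. It uses a plain dataclass as a hypothetical stand-in for the connector's config class; the name `CsvOptions` and the error text are assumptions for illustration, not the code from this commit.

```python
from dataclasses import dataclass


@dataclass
class CsvOptions:
    """Hypothetical stand-in for the CSV format options touched by this commit."""

    skip_rows_before_header: int = 0
    skip_rows_after_header: int = 0
    autogenerate_column_names: bool = False

    def __post_init__(self) -> None:
        # Autogenerated names imply there is no header row in the file,
        # so "rows before the header" has no meaning in that mode.
        if self.autogenerate_column_names and self.skip_rows_before_header > 0:
            raise ValueError(
                "skip_rows_before_header cannot be combined with "
                "autogenerate_column_names"
            )


# Valid: the header sits on the 3rd row, so two rows are skipped first.
options = CsvOptions(skip_rows_before_header=2)
```

Rejecting the combination at configuration time surfaces the conflict before any file is read, rather than producing silently shifted rows.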
         description="The quoting behavior determines when a value in a row should have quote marks added around it. For example, if Quote Non-numeric is specified, while reading, quotes are expected for row values that do not contain numbers. Or for Quote All, every row value will be expecting quotes.",
     )
-
-    # Noting that the existing S3 connector had a config option newlines_in_values. This was only supported by pyarrow and not
-    # the Python csv package. It has a little adoption, but long term we should ideally phase this out because of the drawbacks
-    # of using pyarrow
+    null_values: Set[str] = Field(
+        title="Null Values",
+        default=[],
+        description="A set of case-sensitive strings that should be interpreted as null values. For example, if the value 'NA' should be interpreted as null, enter 'NA' in this field.",
+    )
+    skip_rows_before_header: int = Field(
+        title="Skip Rows Before Header",
+        default=0,
+        description="The number of rows to skip before the header row. For example, if the header row is on the 3rd row, enter 2 in this field.",
+    )
+    skip_rows_after_header: int = Field(
+        title="Skip Rows After Header", default=0, description="The number of rows to skip after the header row."
+    )
+    autogenerate_column_names: bool = Field(
+        title="Autogenerate Column Names",
+        default=False,
+        description="Whether to autogenerate column names if column_names is empty. If true, column names will be of the form “f0”, “f1”… If false, column names will be read from the first CSV row after skip_rows_before_header.",
+    )
+    true_values: Set[str] = Field(
+        title="True Values",
+        default=DEFAULT_TRUE_VALUES,
+        description="A set of case-sensitive strings that should be interpreted as true values.",
+    )
+    false_values: Set[str] = Field(
+        title="False Values",
+        default=DEFAULT_FALSE_VALUES,
+        description="A set of case-sensitive strings that should be interpreted as false values.",
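To make the option descriptions in the diff concrete, here is a rough sketch of how a reader built on Python's standard `csv` module might apply them. The function names `iter_records` and `interpret` are hypothetical, and the sketch makes simplifying assumptions that the diff does not confirm: `skip_rows_after_header` is ignored when names are autogenerated, and the null/true/false sets are applied to every cell rather than only to columns of a matching type.

```python
import csv
from typing import Dict, Iterable, Iterator, List, Optional, Set, Union


def iter_records(
    lines: Iterable[str],
    skip_rows_before_header: int = 0,
    skip_rows_after_header: int = 0,
    autogenerate_column_names: bool = False,
) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, honoring the skip and autogenerate options."""
    reader = csv.reader(lines)
    for _ in range(skip_rows_before_header):
        next(reader, None)

    header: Optional[List[str]] = None
    if not autogenerate_column_names:
        header = next(reader, None)
        if header is None:
            return  # no header row, so there is nothing to read
        # Assumption: rows are skipped after the header only when a header exists.
        for _ in range(skip_rows_after_header):
            next(reader, None)

    for row in reader:
        if header is None:
            # Autogenerated names follow the "f0", "f1", ... form from the description.
            header = [f"f{i}" for i in range(len(row))]
        yield dict(zip(header, row))


def interpret(
    value: str,
    null_values: Set[str],
    true_values: Set[str],
    false_values: Set[str],
) -> Union[None, bool, str]:
    """Map a raw cell to None/True/False using exact, case-sensitive matches."""
    if value in null_values:
        return None
    if value in true_values:
        return True
    if value in false_values:
        return False
    return value


# Header is on the 3rd row; the row right after it (a units row) is skipped.
sample = ["exported 2023", "", "id,flag,note", "n/a,n/a,n/a", "1,yes,NA", "2,no,hello"]
records = list(iter_records(sample, skip_rows_before_header=2, skip_rows_after_header=1))
```

Because membership in a set is an exact string comparison, `"NA"` matches a null value of `"NA"` while `"na"` does not, which is the case sensitivity the field descriptions call out.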