[Java] Add accessors to get type parameters from vector classes #427
Comments
Jacques Nadeau / @jacques-n: In general, I feel like the Arrow pojo/field APIs are very inadequate. We built something up at Dremio to try to improve the behavior/consolidate typical handling tasks. We've discussed how much of it would be generally useful but have yet to come up with a solution. Would love your thoughts on whether any portion of what we did would be generally helpful. Our work against Arrow types: https://github.com/dremio/dremio-oss/blob/master/sabot/logical/src/main/java/com/dremio/common/expression/CompleteType.java
Bryan Cutler / @BryanCutler: My use case is in Spark when constructing type-specific writers for a given vector:

    private def createFieldWriter(vector: ValueVector): ArrowFieldWriter = {
      vector match {
        ...
        case vector: NullableTimeStampMicroTZVector =>
          val field = vector.getField()
          val timeZone = field.getType.asInstanceOf[ArrowType.Timestamp].getTimezone
          // do something with timeZone
          new TimestampWriter(vector)
        ...

Since the vector has already been cast, it would be more convenient to access the timezone directly from it instead of also having to cast the type. Then it would simplify to this:

    private def createFieldWriter(vector: ValueVector): ArrowFieldWriter = {
      vector match {
        ...
        case vector: NullableTimeStampMicroTZVector =>
          val timeZone = vector.getTimezone()
          // do something with timeZone
          new TimestampWriter(vector)
        ...
Jacques Nadeau / @jacques-n: For example, class NullableTimeStampMicroTZVector { ... Given declaration ... Then I find this a much easier thing to code to (especially if using code generation) as opposed to having specialized method names for each type. I haven't thought through all the ramifications of this approach but was throwing it out there.
Bryan Cutler / @BryanCutler: What do you think about adding a method
Jacques Nadeau / @jacques-n: I like the getType() method and prefer that over many differently named methods.
Vector classes contain private copies of each param in the ArrowType, but do not have any public API to access them. So if given a vector, you would have to get the Field from the vector and cast its type to the correct type. An example is trying to get the timezone from a TimeStampMicroTZVector. It would be more convenient to have direct accessors for these type params on the vector types that have parameters.
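To make the trade-off concrete, here is a minimal, self-contained sketch of the three access paths mentioned in this issue and its comments: casting the type obtained from the Field, a specially named accessor such as getTimezone(), and a single getType() that returns the specific type. Every class below is a hypothetical stand-in written only for illustration; they borrow names from the discussion but are not the real Arrow classes, and the actual Arrow API may look different.

```java
// Hypothetical stand-ins only: these are NOT the real Arrow classes.
final class TypeParamSketch {

    /** Stand-in for a logical type; subclasses carry type parameters. */
    static class ArrowType {
        /** Stand-in for a timestamp type with a timezone parameter. */
        static final class Timestamp extends ArrowType {
            private final String timezone;
            Timestamp(String timezone) { this.timezone = timezone; }
            String getTimezone() { return timezone; }
        }
    }

    /** Stand-in for a field, which only exposes the untyped base type. */
    static final class Field {
        private final ArrowType type;
        Field(ArrowType type) { this.type = type; }
        ArrowType getType() { return type; }
    }

    /** Stand-in base vector. */
    abstract static class ValueVector {
        abstract Field getField();
        abstract ArrowType getType();
    }

    static final class TimeStampMicroTZVector extends ValueVector {
        private final Field field;
        private final String timezone; // private copy of the type parameter

        TimeStampMicroTZVector(String timezone) {
            this.timezone = timezone;
            this.field = new Field(new ArrowType.Timestamp(timezone));
        }

        @Override Field getField() { return field; }

        // Option from the issue text: a direct, specially named accessor.
        String getTimezone() { return timezone; }

        // Option from the comments: one method name on every vector class,
        // narrowed to the specific type via a covariant return type.
        @Override ArrowType.Timestamp getType() {
            return (ArrowType.Timestamp) field.getType();
        }
    }

    /** The status quo described in the issue: the caller casts the type. */
    static String viaFieldCast(ValueVector vector) {
        return ((ArrowType.Timestamp) vector.getField().getType()).getTimezone();
    }

    public static void main(String[] args) {
        TimeStampMicroTZVector vector = new TimeStampMicroTZVector("UTC");
        System.out.println(viaFieldCast(vector));               // cast at the call site
        System.out.println(vector.getTimezone());               // direct accessor
        System.out.println(vector.getType().getTimezone());     // typed getType()
    }
}
```

In this sketch the typed getType() keeps a single method name across vector classes, which is the property the comments favor for generated code, while the direct accessor is shorter at the call site but needs a differently named method for each type parameter.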
Reporter: Bryan Cutler / @BryanCutler
PRs and other links:
Note: This issue was originally created as ARROW-1361. Please see the migration documentation for further details.