How can I use a method similar to pandas' groupby to group a dataset, and use the `numpy.random.uniform` function?

Question 1:
How can I group a dataset in a way similar to pandas' `groupby`, compute the maximum value for each group, and build a pipeline that sklearn2pmml can export to a PMML file? The code below shows the logic I have in mind, but it fails when generating the PMML file. I am still investigating the cause and am open to alternative solutions. My guess is that JPMML has no equivalent function, so the transformer cannot be converted. Is my understanding correct?

Reference code for Question 1:

```
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn_pandas import DataFrameMapper


class MaxIncomeTransformer(BaseEstimator, TransformerMixin):
    """Keep, for each group, the row with the largest target value."""

    def __init__(self, groupby_column, target_column, output_columns=None):
        self.groupby_column = groupby_column
        self.target_column = target_column
        self.output_columns = output_columns

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        if not isinstance(X, pd.DataFrame):
            X = pd.DataFrame(X)

        # Find the index of the maximum target value for each group
        idx = X.groupby(self.groupby_column)[self.target_column].idxmax()
        result = X.loc[idx].reset_index(drop=True)

        # Rename the columns if output_columns is given
        if self.output_columns is not None:
            result.columns = self.output_columns
            return result[self.output_columns[1]]

        # Otherwise fall back to the original target column name
        return result[self.target_column]


# score1 pipeline
fraud_final_cols = ['msisdn', 'score1']
mapper_final_fraud = DataFrameMapper([
    (['msisdn', 'score1'],
     [MaxIncomeTransformer(groupby_column='msisdn', target_column='score1',
                           output_columns=['msisdn', 'score1'])],
     {'alias': 'score1'})
], input_df=True, df_out=True)
```
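To make the intended logic concrete outside of any pipeline, here is a minimal pandas sketch. The sample values are made up for illustration; only the column names `msisdn` and `score1` come from the code above. It contrasts the per-group reduction that the transformer performs (fewer rows out than in) with a row-preserving variant that broadcasts each group's maximum back to every row. The distinction may matter here, because scikit-learn transformers are generally expected to return one output row per input row.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "msisdn": ["a", "a", "b", "b", "b"],
    "score1": [0.2, 0.7, 0.5, 0.1, 0.4],
})

# Per-group reduction: keep the row holding the largest score1 in each group.
idx = df.groupby("msisdn")["score1"].idxmax()
per_group = df.loc[idx].reset_index(drop=True)

# Row-preserving variant: attach each group's maximum to every input row.
df["score1_max"] = df.groupby("msisdn")["score1"].transform("max")
```

`per_group` has one row per `msisdn`, while `df` keeps its original five rows with an extra `score1_max` column.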

Question 2:
How can I use a function like `random.uniform(0.1, 0.2)` inside `ExpressionTransformer` to generate random numbers in a specific range?
What I want is to add a random perturbation to the result, so that the output values are uniformly distributed over a given interval.
Reference code for Question 2:

```
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.preprocessing import ExpressionTransformer

mapper_fea2 = DataFrameMapper([
    (['score_1_wld', 'score_1_zljr', 'score_1_zlhqd'],
     [ExpressionTransformer("random.uniform(0.1, 0.2) if X[0]==1 or X[1]==1 or X[2]==1 else 0")],
     {'alias': 'score_1'}),
], input_df=True, df_out=True)
```
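For reference, the behavior I am after can be written in plain Python as below. This is only a sketch of the intended semantics, not something `ExpressionTransformer` or PMML is known to accept. The helper name `perturbed_score`, the sample rows, and the seed are hypothetical.

```python
import random


def perturbed_score(row, rng, low=0.1, high=0.2):
    """Return a uniform draw from [low, high] if any flag in row equals 1, else 0."""
    if any(flag == 1 for flag in row):
        return rng.uniform(low, high)
    return 0


rng = random.Random(42)  # seeded only so the sketch is reproducible
rows = [(1, 0, 0), (0, 0, 0), (0, 1, 1)]  # hypothetical flag triples
scores = [perturbed_score(row, rng) for row in rows]
```

Each row with at least one flag set gets its own draw from the interval. Rows with no flag set get exactly 0.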
