ValueError: Sample larger than population or is negative #38

Closed
@shashankbansal6

Hi,

I have a small dataset that I am trying to augment. For some of the questions, I am getting the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-337-336aea02b7a2> in <module>
      2 print(len(text))
      3 aug = naw.BertAug(action="insert")
----> 4 augmented_text = aug.augment(text)
      5 print("Original:")
      6 print(text)

~/anaconda3/lib/python3.7/site-packages/nlpaug/base_augmenter.py in augment(self, data)
     69 
     70         if self.action == Action.INSERT:
---> 71             return self.insert(data)
     72         elif self.action == Action.SUBSTITUTE:
     73             return self.substitute(data)

~/anaconda3/lib/python3.7/site-packages/nlpaug/augmenter/word/bert.py in insert(self, data)
     85         for aug_idx in aug_idxes:
     86             results.insert(aug_idx, nml.Bert.MASK)
---> 87             new_word = self.sample(self.model.predict(results, nml.Bert.MASK, self.aug_n), 1)[0]
     88             results[aug_idx] = new_word
     89 

~/anaconda3/lib/python3.7/site-packages/nlpaug/base_augmenter.py in sample(cls, x, num)
    109     @classmethod
    110     def sample(cls, x, num):
--> 111         return random.sample(x, num)
    112 
    113     def generate_aug_cnt(self, size, aug_p=None):

~/anaconda3/lib/python3.7/random.py in sample(self, population, k)
    319         n = len(population)
    320         if not 0 <= k <= n:
--> 321             raise ValueError("Sample larger than population or is negative")
    322         result = [None] * k
    323         setsize = 21        # size of a small set minus size of an empty list

ValueError: Sample larger than population or is negative

After some research, I came across this Stack Overflow question: https://stackoverflow.com/questions/20861497/sample-larger-than-population-in-random-sample-python
However, I am still not sure what exactly the issue is. It works sometimes, but other times it raises this error. Does it have something to do with my questions? Is there a specific format I need to follow for them?
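
For reference, the last two frames of the traceback show `random.sample(x, 1)` failing its `0 <= k <= n` check. With `k = 1`, that can only happen when `n` is 0, so the list returned by `self.model.predict(...)` must have been empty for that input. Why the model returned no candidates is not visible from the traceback. The sketch below reproduces the error in isolation and shows one possible guard. `safe_sample` is a hypothetical helper written for illustration, not part of nlpaug.

```python
import random


def safe_sample(candidates, num):
    """Hypothetical guard (not nlpaug API): sample at most `num` items.

    Returns an empty list instead of raising when there are no candidates.
    """
    candidates = list(candidates)
    if not candidates:
        return []
    # Never ask for more items than the population holds.
    return random.sample(candidates, min(num, len(candidates)))


# Reproduce the failure from the traceback: one item requested from an empty list.
try:
    random.sample([], 1)
except ValueError as exc:
    print(f"ValueError: {exc}")

# With the guard, an empty candidate list yields an empty result.
print(safe_sample([], 1))
print(safe_sample(["word"], 1))
```

A caller using a guard like this would still need to decide what to do when no candidate comes back, for example skipping the insertion at that position.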

Any help would be much appreciated.
