Skip to content

Support collection of a model coverage during generation #78

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 of 5 tasks
buzden opened this issue Jul 25, 2023 · 2 comments
Open
2 of 5 tasks

Support collection of a model coverage during generation #78

buzden opened this issue Jul 25, 2023 · 2 comments
Labels
code: enhancement New feature or improvement part: generators Related to generators part: model coverage Related to model coverage of generators status: feature request Request for new functionality or improvement

Comments

@buzden
Copy link
Owner

buzden commented Jul 25, 2023

It would be good to collect some information about model coverage, i.e. coverage in terms of the original data type (for derived generators) and/or coverage in terms of alternatives inside a generator.

The topic is not univocal, there are definitely a ton of ways how such coverage can be gained and how can it be represented. We'd like to increase slowly starting from collecting just which types and type families were present during a series of generation disregarding the position of this generation, say, in recursion; then we can continue with adding information about which particular constructors were actually used in generation in particular types or type families, again disregarding the position. Then, we can further refine this, say, by considering the position in the generation tree, or by considering coverage of constructors separately for different values of indices for indexed type families.

For the first two steps, there are at least the following steps to be done:

  • We need an elaboration script being given a data type family name, showing the list of all data type families that are used inside. This is needed to list all possible items of a coverage of type families (and their constructors).
  • A data structure indexed by a generation tree (i.e., a generator), which shows which alternative was used to generate this or that data value. This data structure would be returned by an alternative variant of unGen. Having additionally description values of oneOfs, using this indexed data structure, we can at least know which data type families were tried to generate a value.
  • Then, we can extend descriptions of oneOfs to contain particular information about particular constructors. It would allow us to evaluate model coverage more precisely. This all, of course, applied only to derived generators, and to generators, whose authors added descriptions corresponding to the derived ones.
  • Maybe, we need some structure on the descriptions, i.e. descriptions would be not just Maybe String, but some our custom FromString type, which may contain type itself, and, possibly, a Vect of subdescriptions for each alternative.
  • Maybe, we could invent some kind of a %macro, which can fill appropriate description of a hand-written generators.
@buzden buzden added code: enhancement New feature or improvement part: generators Related to generators status: feature request Request for new functionality or improvement labels Jul 25, 2023
@buzden
Copy link
Owner Author

buzden commented Aug 8, 2023

Another idea to support collection of information needed for model coverage is to add special operations (and thus, constructors for the Gen type) for logging, namely two of them:

  • logging of a point, intended to be used with the >>= operation (this needs changing the type of the Bind generator);
  • logging of a bracket, i.e. marking the whole subgenerator; this allows to mark special positions, say, in a product.

The latter one seems to be more useful. Say, it is able to catch lots of stages of metric's refinement: types-only, types+constructors, types+constructors at constructor's positions. For example, when we have

data Gen : ... -> Type where
  ...
  Log : LogType -> (subgen : Gen ...) -> Gen ... 
  ...

and a data type X

data X : Type where
  A : Nat -> X -> X -> X
  B : Nat -> X

(for some LogType, I don't know what can it be yet) we can define a generator during derivation in the following way:

genX : Gen ... X
genX_A : Gen ... X
genX_B : Gen ... X

genX_A = log (constructor "A") [| A (log (constrPos 0) genNat) (log (constPos 1) genX) (log (constPos 2) genX) |]
genX_B = log (constructor "B") [| B (log (constrPos 0) genNat) |]

genX = log (type "X") $ oneOf
  [ genX_A
  , genX_B
  ]

However, there is a problem in definition of, say, <*> for Log constructor on the both sides, since if we do our usual simplification trick, say, putting all Logs out, we will lose cool property for precise logging of the constructor position.

Shortly speaking, I don't know how can Applicative be implemented with the Log constructor like above, which preserves all needed properties for nice logging.

@buzden
Copy link
Owner Author

buzden commented Aug 13, 2023

Even the Log constructor seems to be redundant.

Consider having List LogType (whatever LogType is, as above) only being added to the Pure and Raw constructors. Then, log function may add it directly to the Pure and Raw, ignore it when Empty (however, it can be added even there; but in lots places everything have to be rewritten to not to lose such log messages; still, I don't see any reason for doing this), propagating this log down to alternatives of OneOf and to the LHS of the Bind.

I'm not sure about Bind, because log ?whatever (rawGen >>= cont) seems to tell about the whole bind, i.e., it should not be equivalent to log ?whatever rawGen >>= cont. This can be solved by storing the same List LogType in the Bind constructor itself.

Anyway, I'm not sure it's possible to build a correct tree during unGen when data it stored in this way (without having a separate node for an applicative composition).


Surely, we could store Maybe LogType in every node, but this smells strongly with higher-kinded data, and should be done properly and generally.

@buzden buzden added the part: model coverage Related to model coverage of generators label Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code: enhancement New feature or improvement part: generators Related to generators part: model coverage Related to model coverage of generators status: feature request Request for new functionality or improvement
Projects
None yet
Development

No branches or pull requests

1 participant