Every path below comes with a prefix packages/pytea
.
src/pytea.ts
: PyTea entry pointsrc/service
: Service modules (language server / analysis manager)pyteaService.ts
: Manage Python scripts, setting, logging, and import resolutions
src/frontend
: Translate Python script into PyTea IRtorchFrontend.ts
: Main translation enginetorchStatements.ts
: Definition of PyTea IR
src/backend
: Run online analysis and collect constraintstorchBackend.ts
: Main analysis enginecontext.ts
: Implementation ofContext
andContextSet
. Look after the interfaces of those classes.constraintSet.ts
: Constraint set of a single path. Most of the methods can be accessed from aContext
, so you should not use those methods in this class directly.constraintType.ts
: Definitions of constraintsconstraintSolver.ts
: Online constraint checker (Simple Linear CAS)expUtils.ts
: Simplify symbolic expressions and extract information from itsharpValues.ts
: Values of PyTea IR Static semeanticssharpEnvironments.ts
: Heap and Enviroment of PyTea IRsymExpressions.ts
: Symbolic variables and expressionsrange.ts
: Abstract range domain
src/pylibImplements
: Implementations of PyTea LibCall (Semantics of PyTorch and 3rd-party libraries)index.ts
: RegisterlibCallMap
libcall.ts
: Implemntation of special LibCalls exceptexplicit
(e.g.,import
,callKV
,exportGlobal
). Those are mainly for special Python semantics likef(*args, **kwargs)
(variadic/keyword parameter).
pylib
: Implementation of Python builtin and 3rd-party libraries
- Implement PyTorch (or other library) API by referring to
pylib
and supported Python syntax on directorypylib
. For some well-formed high-level modules like torchvision/models/resnet.py, you can completely copy and paste the original source. - Replace expressions which require explicit constraints by appropriate
LibCall
methods, likeLibCall.torch.matmul(self, other)
, orLibCall.guard.require_lt(0, x)
. EachLibCall.XXX.YYY(...)
will invoke the functionYYY(...)
fromsrc/pylibImplements/XXX/index.ts
orsrc/pylibImplements/XXX.ts
. - Suppose that we use a function
LibCall.foo.bar(x, y)
from Python script. You should implementbar(ctx: Context<LCBase.ExplicitParams>, source?: ParseNode): ContextSet<ShValue>
onsrc/pylibImplements/backend/foo/index.ts
. If there is no directory or filefoo
, create it and registerlibCallImpls
atsrc/pylibImplements/backend/index.ts
. - Register
foo
atlibCallImpls
.libCallImpls
is usually placed in the end of the file.
- PyTea internal values (Context, Env, Heap, Value, Constraint) are implemented with Immutable.js, the efficient persistent data structure library for JS.
- Thus, you cannot fix the value in-place. Every constraint have to be created by context method such as
ctx.genLte(..., ...)
- Thus, you cannot fix the value in-place. Every constraint have to be created by context method such as
- Integer ranges are 0-based, inclusive lower bound and exclusive upper bound, which is analagous to Python range.
ExpShape.slice(shape, start, end)
is equal toshape[start:end]
.
Context<T>
holds every information (e.g. ShEnv
, ShHeap
) to execute a single Python statement. ContextSet<T>
is a set of Context
s, which holds both successful paths and failed paths.
Context<T>
has several setXXX
chaining methods. The user should combine result value and environments to it and construct the next context.
ContextSet
is a monadic structure; the user can combine map
and flatMap
methods to control the execution path, and inject constraints by using require
method.
interface ContextProps<T> {
failId: number; // context id
env: ShEnv; // Id -> Addr
heap: ShHeap; // Addr -> Value
ctrSet: ConstraintSet; // Hard \/ Soft \/ Path constraints
retVal: T; // return value of the last expression or statement
// the properties below are internal values; the user should not directly modify these values.
// SVFunc is a type of Python function, string is a name of LibCall.
callStack: List<[SVFunc | string, CodeSource | undefined]>;
logs: List<ShValue>; // error logs (appended by warnXXX, failXXX, etc...)
imported: ShEnv; // qualified path -> Addr.
relPath: string; // relative path of current file.
// if it is set, this context is regarded as a failed path.
failed?: SVError;
}
To implement semantics of Python or PyTorch API, user should use a bunch of methods in ContextMethods<T>
.
Beware that Context<T>
is immutable. Each method builds a new Context
or ContextSet
, and the source Context
will not be modified.
In a context of LibCall
implementation, the below methods are usually used or required:
toSet
,toSetWith
: Make a single context to context set.toSetWith
sets a return value.setXXX
: Setter function ofContext
. This method is not an in-place method.getAttrDeep
: Get an attribute with__getattr__
method. If the attribute is already assinged,__getattr__
will be ignored just like the Python semantics. It might call any nondeterministic function, so the type of the return value isContextSet
.warn
,warnWithMsg
: Write warning log and set newSVError
type value toctx.retVal
.fail
,failWithMsg
: Immediately stops current path.warnTensorWithMsg
: Write warning log, but set a symbolic unknown-ranked tensor toctx.retVal
. If the shape of the return value cannot be statically known, this method should be called.genSymXXX
: Generate new symbolic variable.genIntGte
,genFlaotGte
: Return new constrainted symbolic variable. the result context holds that variable inctx.retVal
.genConstRankedShape
: Return new ranked tensor.partialDims
is a partial map which defines each dimension of result tensor.genRankedShape
: Return new ranked tensor with unconstrainted symbolic dimensions.genXXX
(e.g.genEq
): Constructors of constraintsparseSize
: if theiterable
is a list or tuple of integer, PyTea regards those values as dimenstions, and returns newExpShape
using those dimensions. If the parsing is failed, it will return string type of error message.shXXX
: Basic shape operations regarding constrints. If the constraints are violated, this context will be added to failed path in the resultContextSet
.require
: Soft constraint setter.guarantee
: Hard constraint setter.ifThenElse
: Return two paths (true/false paths) with a given constraint. If the given constraint can be immediately known to be constant true or false, the other side of result set will be empty set.getCachedRange
: Return a conservative range of a givenExpNum
by regarding currentctrSet
.checkImmediate
: Discriminate that a given constraint is constant true or false, or undecidable by currentctrSet
.
src/backend/backUtils.ts
fetchAddr(value, heap)
: Recursively dereference the address ifvalue
isSVAddr
type.sanitizeAddr(value, heap)
: Recursively dereference the address, but stops when the dereferenced value isSVObject
type.
src/pylibImplements/backend/utils.ts
:genTensor(ctx, shape, source?)
: returntorch.Tensor
object with a given shape.fetchSize(mayAddr, heap)
: ifmayAddr
is a pointer of aSVSize
value, return thatSVSize
object.
ShValue
consists of Python objects and primitive values which are expressed in symbolic expression, such as a + 3 * b
.
Literal values such as SVInt
, SVFloat
, SVString
, and SVBool
have JS number
, string
, boolean
, or SymExp
object. ctx.getCachedRange
can calculate current conservative (i.e. lower and upper bound) range of each value.
SVObject
consists of three internal maps and its pointer (i.e. address). If SVObject
is 'shapable', which can be regarded as shaped value such as Tensor, ndarray, or nested array, PyTea will assign shape
property to it. From this situations, SVObject
value can be casted to SVSize
type. SVSize
type can be treated as Python tuple or torch.Size
.
Three maps in SVObject
represent 'array' (e.g. 'a[1]'), 'attributes' (e.g. a.x
), and 'dictionary' (e.g. a['key']
]). Mathematically, SVObject
is a disjoint union of three maps, Attr -> Value
, Int -> Value
, and String -> Value
.
Unlike the original Python, there is no hash function of a plain object in PyTea. Therefore, we cannot use a plain object as a key of a dictionary. (e.g. 'x = object(); a[x] = 3' is prohibited.)
To make new ShValue, the user should use any derived ShValue.create
functions such as SVInt.create
.