Change Log
CLRadeonExtender 0.1.9:
* add AMD Navi support for assembler and disassembler
* add shorter addressing of FLAT/GLOBAL/SCRATCH
* add literal immediate for SMRD addressing for GCN1.1
* add Amd3 OpenCL binary format for AMD Navi for AMD OpenCL implementation
* include specific extension in device name for ROCm-OpenCL platform
CLRadeonExtender 0.1.8:
* add chapter about binary formats to CLRX documentation
* add some information about compilation under FreeBSD
* add '.nosectdiffs' to disable new section difference behaviour if the new ROCm format is chosen
* small optimization in the AsmScope destructor
* add extra info about setting up the number of SGPR registers to the documentation
* fixed OpenCL detection for AMDGPU-PRO
* add '.enum' pseudo-op to simplify defining enumerations
* add CLRX_VERSION_NUMBER and CLRX_POLICY_UNIFIED_SGPR_COUNT
* add policy to unify SGPR counting for all binary formats (by default disabled)
* fix some mistakes about building in the documentation
* add preliminary support for CPU architectures (untested): SPARC, IA64 and MIPS
* add new '.dims' syntax to distinguish vector group ids and scalar local ids
* improve CLZ32/64 for MSVC
* introduce CTZ32/64
* while disassembling, determine the minimal AMD driver version for the GPU device type
(better code detection while disassembling)
* fixed some typos in documentation
* update list of GPU devices in documentation
* fix stupid and old bug in ImageMix sample
* change a GPU device name for VEGA11 to GFX902
* fixed segfault when attempting to disassemble old Gallium binaries using new Gallium binary format
* sort the kernels by offset order while disassembling
* better input data checking while disassembling code
* add HSALayout mode for AMDCL2 format (code layout similar to the ROCm and Gallium formats)
* introduce kernel code parts ('.kcode' and '.kcodeend') to AMDCL2
* check sanity of LDS use in the AMD VEGA architecture (can be used only in SCRATCH and GLOBAL)
* add new types to the source code: GPUArchMask, AsmKernelId and AsmSectionId
* allow constant literals in sym regranges
* fixed symreg ranges checking
* fixed handling of symbol names similar to register names (like exec_masc)
* add new GPU devices to list (gfx904, gfx905, gfx906 and gfx907)
* add AMD VEGA 20 instruction set
* add a lot of code to handle register allocation (it still doesn't work and is not finished)
* add a DTree structure to save memory in storing register allocation structures
* fixed possible segfault while preparing to write when ASMKERN_INNER is present
CLRadeonExtender 0.1.7:
* update AmdCL2ABI chapter
* fixed kernel arguments sizes in GalliumCompute binary format
* add new GPU devices gfx902-gfx905
* update device tables for AMD Crimson drivers
* small fixes in DynLibrary interface
* add relocations to GalliumCompute binary format (for scratch buffer symbols)
* make getXXXDisasmInputFromBinaryXX a public interface
* speed up evaluation of simple expressions without symbols
* add '.for' and '.while' pseudo-ops ('for' and 'while' loops)
* fixed some grammar/typos in CLRX documentation
* add GPU device names from ROCm-OpenCL
* handle new ROCm binary format with YAML metadata (assembler and disassembler)
* add a few pseudo-ops for ROCm handling
* add new pseudo-ops to set parameters in ROCm YAML metadata
* fixes in GalliumCompute binary generator (for conformance with standards)
* add '.reqd_work_group_size' pseudo-op (equivalent of '.cws')
* add support for work_group_size_hint and vec_type_hint in AMD OpenCL 2.0 binary format
* some small bug fixes in ROCm disassembler
* updates in README.md and INSTALL files
* small sanitizations in DisasmAmd, DisasmAmdCL2 (argument type checking)
* change behaviour of '.cws' (.reqd_work_group_size) while setting default values
* add calculation of section differences in expressions (for ROCm handling)
* fixed invalid reads (potential segfault) after undefining symbol
* fixed old stupid bug: resolve a symbol by using its new value (or, if it is undefined,
do not resolve it) instead of the old unresolved symbol value when the expression
is evaluated later
* add GOT table handling in ROCm binary format
* add new option '--newROCmBinFormat'
* add untested support for ROCm in CLHelper and VectorAdd sample
* add support for multiple OpenCL platforms in CLHelper and samples
* allow the call_convention to be 0xffffffff in AMDHSA config
* handle special cases with relatives while evaluating binary/logical operators
* small fixes in CLRX documentation and Unix manuals
* developing unfinished AsmRegAlloc
* add the missing 'read_write' access qualifier for images in AMD OpenCL 2.0
CLRadeonExtender 0.1.6:
* add support for Mesa3D 17.3.0 (GPU detection)
* fixed segfaults while disassembling new Gallium binaries with AMD HSA
* add ability to supply defined symbols when using the CLHelper
* fixed CLRXDocs mistakes in GcnSmrdInstrs, GcnSmemInstrs, GcnVopXInstrs chapters
* add GCN1.4 (VEGA) instruction descriptions to CLRXDocs
* add support for GCN 1.4 (VEGA) to samples
* fixed encoding/decoding of SMEM instructions with SGPR offset (GCN 1.4)
* add missing GCN 1.4 instructions
* fixed encoding/decoding of OP_SEL (GCN 1.4)
* fixed encoding/decoding of DS_READ_ADDTID_B32 (GCN 1.4)
* fixed encoding/decoding of TBUFFER_x_D16/BUFFER_x_D16 instructions for GCN 1.4
* fixed encoding of CLAMP in VOP3/VOPC instructions (GCN 1.4)
* allow to use OMOD, NEG, ABS, CLAMP modifiers in VOP3/VINTRP instructions
* add new VOP3/VINTRP instruction descriptions to CLRXDocs
* update GCN timings chapter in CLRXDocs
CLRadeonExtender 0.1.5r1:
* add detection of OpenGL to CMakeLists.txt
* add more comments in the source code
* fixed hanging when ROCm code has hundreds or more kernels
* parameter in modifier can have any value
* add 'get_version' pseudo-operation
* add oldModParam mode (old modifier parameter's policy)
* fixes for ROCm disassembler module
* fixes for Gallium binary reader (accept new binaries with many kernels)
* added support for Mesa3D 17.2.x
* added Mesa3D/Gallium device names for AMD Polaris
* add new exceptions to code (to distinguish type of exception)
* fixed position in disassembler code in comments (mainly for Gallium/ROCm)
* add CLRXCLHelper library to facilitate running assembler code on OpenCL
* move some GPU architecture version tables to GPUId
* add new testcase GPUId
CLRadeonExtender 0.1.5:
* ignore case in access qualifier names (Amd and AmdCL2)
* improve handling of '\()' and '\@'
* add SDWA and DPP words to set instruction encoding
* fix a few CLRXDocs typos
* fixes for AMD RX VEGA (GFX900)
* disassembler prints an instruction's position in comments
* update GcnTimings
* update VectorAdd and ReverseBits for LLVM 4.0 and Mesa3D 17.0.0
* updates in ImageMix (correct workSize calculation for kernel)
* small fixes in disassembler
* disassembler can correctly disassemble GalliumCompute for LLVM 4.0
* add '--llvmVersion' to clrxdisasm
* dump AMD HSA configuration for GalliumCompute and AmdCL2 (like in ROCm format)
* disassembler adds '@' to hwreg and sendmsg to make dump compatible with clrxasm
* add '--HSAConfig' to dump AmdCL2 kernel configuration as AMD HSA config
* add AMD HSA configuration pseudo-ops to GalliumCompute and AmdCL2 binary formats
* update device list for Gallium and ROCm binary formats for recognizing device
* fixed support for LLVM>=3.9 and Mesa3D>=17.0.0 in GalliumCompute
* add pseudo-op '.default_hsa_features' to AmdCL2, Gallium and ROCm formats
* update headers in code
* make error handling more compact in assembler's code
* fixed '.machine', '.codeversion' handling (do not print obsolete warnings)
* add pkg-config files to installation
* remove obsolete warnings in CMakeLists.txt
* added GFX901 support (RX VEGA with HBCC ?)
* add Config.h and amdbin/Elf.h headers to Doxygen documentation
* change lowest device for GCN 1.2 to Iceland in GPUId.
* add support for Windows development environments: Cygwin and MinGW
* make 64-bit detection more portable in CMakeLists.txt (use the compiler to do it)
* check whether std::call_once is available when std threads are not fully supported
* use only C++ compiler to check features (Int128Detect.cpp)
CLRadeonExtender 0.1.4r1:
* fixed code operation in SMRD and SMEM instructions
* fixed parsing of symbol register ranges beginning with 'exec', 'vcc', 'tma', ...
* check for end of line when parsing symbol and regvar register ranges
CLRadeonExtender 0.1.4:
* add AMD RX VEGA support (GCN 1.4/VEGA)
* add symbol scopes
* add support for 32-bit AMD OpenCL 2.0 binaries
* update GPU device ids to latest drivers
* add Ellesmere and Baffin support for AMD OpenCL 1.2 binaries
* add support for LLVM 3.9, LLVM 4.0 and Mesa3D 17.0
* add new options to clrxasm (--llvmVersion)
* add GCN 1.2 instruction set documentation
* add new SMEM instruction (s_buffer_atomics)
* add GDS segment size to AMD OpenCL 2.0 binaries
* add code of samples for GCN 1.2
* add option to use old AMD OpenCL 1.2 binary format to samples
* add editor syntax (Notepad++, Kate, Gedit, Vim)
* minor fixes in GCN assembler
* add modifier parametrization
* add options to control case sensitivity in macro names
* fixed handling AMDOCL names for 32-bit Windows environment
* add installation rules for AMDGPU-PRO drivers (OpenSUSE and Ubuntu)
* add new pseudo-ops '.get_64bit', '.get_arch', '.get_format', '.get_gpu'
* add autodetection for LLVM and Mesa3D version
* find correct AMDOCL, MesaOCL and llvm-config at runtime
CLRadeonExtender 0.1.3:
* ROCm binary format support
* fixed '.format' pseudo-op
* fixed resolving variables in some specific cases
* fixed handling of AmdCL2 format for device types later than GCN 1.1
* small fixes in documentation
* fixed disassembling of s_waitcnt
* fixed handling floating point literals in assembler and compatibility mode (bugFP)
* ARMv8 (AArch64) architecture support
* Android support
CLRadeonExtender 0.1.2:
* AMD OpenCL 2.0 support
* 64-bit Gallium binary format support
* support for new closed Linux and Windows drivers
* new samples
* documentation for OpenCL 2.0 support (includes ABI)
* documentation for GCN ISA FLAT encoding
* lit() specifier to distinguish literal and inline constant
* alternate macro syntax
* correct counting registers for automatic configuration
* fixed handling of conditionals and macro pseudo-ops
* disassembler can dump configuration in user-friendly form
CLRadeonExtender 0.1.1:
* support for Windows
* register ranges and symbols of register ranges
* GCN ISA documentation
* fixed AMD Catalyst and Gallium compute binary generator
* fixed clrxasm
CLRadeonExtender 0.1:
* first published version