Commit 996af87

msaboff authored and bterlson committed
Unify handling of RegExp CharacterClassEscapes \w and \W and Word Asserts \b and \B (#525)
* Proposed RegExp CharacterClassEscape changes for \w
* Updated per what was agreed at the May 2016 meeting. Created a new abstract operation "WordCharacters()" that is used by both IsWordChar() for word assertions and \w/\W CharacterClassEscapes.
1 parent 1244a0b commit 996af87

1 file changed

spec.html: 31 additions & 248 deletions
@@ -27724,16 +27724,14 @@ <h1>Assertion</h1>
2772427724
</emu-alg>
2772527725

2772627726
<!-- es6num="21.2.2.6.1" -->
27727-
<emu-clause id="sec-runtime-semantics-iswordchar-abstract-operation" aoid="IsWordChar">
27728-
<h1>Runtime Semantics: IsWordChar Abstract Operation</h1>
27729-
<p>The abstract operation IsWordChar takes an integer parameter _e_ and performs the following steps:</p>
27727+
<emu-clause id="sec-runtime-semantics-wordcharacters-abstract-operation" aoid="WordCharacters">
27728+
<h1>Runtime Semantics: WordCharacters Abstract Operation</h1>
27729+
<p>The abstract operation WordCharacters performs the following steps:</p>
2773027730
<emu-alg>
27731-
1. If _e_ is -1 or _e_ is _InputLength_, return *false*.
27732-
1. Let _c_ be the character _Input_[_e_].
27733-
1. If _c_ is one of the sixty-three characters below, return *true*.
27734-
<figure>
27735-
<table class="lightweight-table">
27736-
<tbody>
27731+
1. Create a set _A_ of characters containing the sixty-three characters:
27732+
<figure>
27733+
<table class="lightweight-table">
27734+
<tbody>
2773727735
<tr>
2773827736
<td>
2773927737
`a`
@@ -27959,14 +27957,30 @@ <h1>Runtime Semantics: IsWordChar Abstract Operation</h1>
2795927957
<td>
2796027958
</td>
2796127959
</tr>
27962-
</tbody>
27963-
</table>
27964-
</figure>
27965-
1. Return *false*.
27960+
</tbody>
27961+
</table>
27962+
</figure>
27963+
1. Create an empty set _U_.
27964+
1. For every character _c_ not in set _A_ where Canonicalize(_c_) is in _A_, add _c_ to _U_.
27965+
1. Assert: Unless _Unicode_ and _IgnoreCase_ are both true, _U_ is empty.
27966+
1. Add the characters in set _U_ to set _A_.
27967+
1. Return _A_.
2796627968
</emu-alg>
2796727969
</emu-clause>
27970+
<!-- es6num="21.2.2.6.2" -->
27971+
<emu-clause id="sec-runtime-semantics-iswordchar-abstract-operation" aoid="IsWordChar">
27972+
<h1>Runtime Semantics: IsWordChar Abstract Operation</h1>
27973+
<p>The abstract operation IsWordChar takes an integer parameter _e_ and performs the following steps:</p>
27974+
<emu-alg>
27975+
1. If _e_ is -1 or _e_ is _InputLength_, return *false*.
27976+
1. Let _c_ be the character _Input_[_e_].
27977+
1. Let _WordChars_ be _WordCharacters_().
27978+
1. If _c_ is in _WordChars_, return *true*.
27979+
1. Return *false*.
27980+
</emu-alg>
27981+
</emu-clause>
27982+
</emu-clause>
2796827983
</emu-clause>
27969-
2797027984
<!-- es6num="21.2.2.7" -->
2797127985
<emu-clause id="sec-quantifier">
2797227986
<h1>Quantifier</h1>
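
For readers who want to see the two algorithms in this hunk as code, here is a rough JavaScript sketch of WordCharacters() and IsWordChar(). It is not spec text. The names `BASIC_WORD_CHARS`, `FOLDS_INTO_ASCII`, `canonicalize`, `wordCharacters` and `isWordChar` are made up for illustration, the `flags` object stands in for the spec's _Unicode_ and _IgnoreCase_, and `canonicalize` is only an approximation of the spec's Canonicalize that is good enough to decide whether a character lands in the basic set. The scan is limited to BMP code units on the assumption that no supplementary code point folds into an ASCII word character.

```javascript
// Illustrative sketch only: names and the folding table are assumptions, not spec text.
const BASIC_WORD_CHARS =
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";

// Assumption: the non-ASCII code points whose simple case folding is a basic word character.
const FOLDS_INTO_ASCII = new Map([
  ["\u017F", "s"], // LATIN SMALL LETTER LONG S
  ["\u212A", "k"], // KELVIN SIGN
]);

// Rough stand-in for Canonicalize, restricted to single BMP code units.
function canonicalize(ch, { unicode, ignoreCase }) {
  if (!ignoreCase) return ch;
  if (unicode) {
    // Approximates simple case folding.
    const folded = FOLDS_INTO_ASCII.get(ch);
    if (folded !== undefined) return folded;
    const lower = ch.toLowerCase();
    return lower.length === 1 ? lower : ch;
  }
  // Non-Unicode mode: uppercase, but never map a non-ASCII character into ASCII.
  const upper = ch.toUpperCase();
  if (upper.length !== 1) return ch;
  if (ch.charCodeAt(0) >= 128 && upper.charCodeAt(0) < 128) return ch;
  return upper;
}

function wordCharacters(flags) {
  const a = new Set(BASIC_WORD_CHARS); // the sixty-three basic characters
  const u = new Set();
  for (let cu = 0; cu <= 0xffff; cu++) {
    const c = String.fromCharCode(cu);
    if (!a.has(c) && a.has(canonicalize(c, flags))) u.add(c);
  }
  // Mirrors the Assert step: U may be non-empty only when both flags are set.
  if (!(flags.unicode && flags.ignoreCase) && u.size !== 0) {
    throw new Error("assertion failed: U should be empty");
  }
  for (const c of u) a.add(c);
  return a;
}

function isWordChar(input, e, flags) {
  if (e === -1 || e === input.length) return false;
  return wordCharacters(flags).has(input[e]);
}
```

A real implementation would compute the set once per pattern rather than on every call. The point of the sketch is only that `\b`/`\B` and `\w`/`\W` now consult the same set.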
@@ -28358,240 +28372,9 @@ <h1>CharacterClassEscape</h1>
2835828372
<p>The production <emu-grammar>CharacterClassEscape :: `d`</emu-grammar> evaluates by returning the ten-element set of characters containing the characters `0` through `9` inclusive.</p>
2835928373
<p>The production <emu-grammar>CharacterClassEscape :: `D`</emu-grammar> evaluates by returning the set of all characters not included in the set returned by <emu-grammar>CharacterClassEscape :: `d`</emu-grammar>.</p>
2836028374
<p>The production <emu-grammar>CharacterClassEscape :: `s`</emu-grammar> evaluates by returning the set of characters containing the characters that are on the right-hand side of the |WhiteSpace| or |LineTerminator| productions.</p>
28361-
<p>The production <emu-grammar>CharacterClassEscape :: `S`</emu-grammar> evaluates by returning the set of all characters not included in the set returned by <emu-grammar>CharacterClassEscape :: `s`</emu-grammar>.</p>
28362-
<p>The production <emu-grammar>CharacterClassEscape :: `w`</emu-grammar> evaluates by returning the set of characters containing the sixty-three characters:</p>
28363-
<figure>
28364-
<table class="lightweight-table">
28365-
<tbody>
28366-
<tr>
28367-
<td>
28368-
`a`
28369-
</td>
28370-
<td>
28371-
`b`
28372-
</td>
28373-
<td>
28374-
`c`
28375-
</td>
28376-
<td>
28377-
`d`
28378-
</td>
28379-
<td>
28380-
`e`
28381-
</td>
28382-
<td>
28383-
`f`
28384-
</td>
28385-
<td>
28386-
`g`
28387-
</td>
28388-
<td>
28389-
`h`
28390-
</td>
28391-
<td>
28392-
`i`
28393-
</td>
28394-
<td>
28395-
`j`
28396-
</td>
28397-
<td>
28398-
`k`
28399-
</td>
28400-
<td>
28401-
`l`
28402-
</td>
28403-
<td>
28404-
`m`
28405-
</td>
28406-
<td>
28407-
`n`
28408-
</td>
28409-
<td>
28410-
`o`
28411-
</td>
28412-
<td>
28413-
`p`
28414-
</td>
28415-
<td>
28416-
`q`
28417-
</td>
28418-
<td>
28419-
`r`
28420-
</td>
28421-
<td>
28422-
`s`
28423-
</td>
28424-
<td>
28425-
`t`
28426-
</td>
28427-
<td>
28428-
`u`
28429-
</td>
28430-
<td>
28431-
`v`
28432-
</td>
28433-
<td>
28434-
`w`
28435-
</td>
28436-
<td>
28437-
`x`
28438-
</td>
28439-
<td>
28440-
`y`
28441-
</td>
28442-
<td>
28443-
`z`
28444-
</td>
28445-
</tr>
28446-
<tr>
28447-
<td>
28448-
`A`
28449-
</td>
28450-
<td>
28451-
`B`
28452-
</td>
28453-
<td>
28454-
`C`
28455-
</td>
28456-
<td>
28457-
`D`
28458-
</td>
28459-
<td>
28460-
`E`
28461-
</td>
28462-
<td>
28463-
`F`
28464-
</td>
28465-
<td>
28466-
`G`
28467-
</td>
28468-
<td>
28469-
`H`
28470-
</td>
28471-
<td>
28472-
`I`
28473-
</td>
28474-
<td>
28475-
`J`
28476-
</td>
28477-
<td>
28478-
`K`
28479-
</td>
28480-
<td>
28481-
`L`
28482-
</td>
28483-
<td>
28484-
`M`
28485-
</td>
28486-
<td>
28487-
`N`
28488-
</td>
28489-
<td>
28490-
`O`
28491-
</td>
28492-
<td>
28493-
`P`
28494-
</td>
28495-
<td>
28496-
`Q`
28497-
</td>
28498-
<td>
28499-
`R`
28500-
</td>
28501-
<td>
28502-
`S`
28503-
</td>
28504-
<td>
28505-
`T`
28506-
</td>
28507-
<td>
28508-
`U`
28509-
</td>
28510-
<td>
28511-
`V`
28512-
</td>
28513-
<td>
28514-
`W`
28515-
</td>
28516-
<td>
28517-
`X`
28518-
</td>
28519-
<td>
28520-
`Y`
28521-
</td>
28522-
<td>
28523-
`Z`
28524-
</td>
28525-
</tr>
28526-
<tr>
28527-
<td>
28528-
`0`
28529-
</td>
28530-
<td>
28531-
`1`
28532-
</td>
28533-
<td>
28534-
`2`
28535-
</td>
28536-
<td>
28537-
`3`
28538-
</td>
28539-
<td>
28540-
`4`
28541-
</td>
28542-
<td>
28543-
`5`
28544-
</td>
28545-
<td>
28546-
`6`
28547-
</td>
28548-
<td>
28549-
`7`
28550-
</td>
28551-
<td>
28552-
`8`
28553-
</td>
28554-
<td>
28555-
`9`
28556-
</td>
28557-
<td>
28558-
`_`
28559-
</td>
28560-
<td>
28561-
</td>
28562-
<td>
28563-
</td>
28564-
<td>
28565-
</td>
28566-
<td>
28567-
</td>
28568-
<td>
28569-
</td>
28570-
<td>
28571-
</td>
28572-
<td>
28573-
</td>
28574-
<td>
28575-
</td>
28576-
<td>
28577-
</td>
28578-
<td>
28579-
</td>
28580-
<td>
28581-
</td>
28582-
<td>
28583-
</td>
28584-
<td>
28585-
</td>
28586-
<td>
28587-
</td>
28588-
<td>
28589-
</td>
28590-
</tr>
28591-
</tbody>
28592-
</table>
28593-
</figure>
28594-
<p>The production <emu-grammar>CharacterClassEscape :: `W`</emu-grammar> evaluates by returning the set of all characters not included in the set returned by <emu-grammar>CharacterClassEscape :: `w`</emu-grammar>.</p>
28375+
<p>The production <emu-grammar>CharacterClassEscape :: `S`</emu-grammar> evaluates by returning the set of all characters not included in the set returned by <emu-grammar>CharacterClassEscape :: `s`</emu-grammar> .</p>
28376+
<p>The production <emu-grammar>CharacterClassEscape :: `w`</emu-grammar> evaluates by returning the set of all characters returned by _WordCharacters_().</p>
28377+
<p>The production <emu-grammar>CharacterClassEscape :: `W`</emu-grammar> evaluates by returning the set of all characters not included in the set returned by <emu-grammar>CharacterClassEscape :: `w`</emu-grammar> .</p>
2859528378
</emu-clause>
2859628379

2859728380
<!-- es6num="21.2.2.13" -->
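
As a quick check of what the new `\w` text implies for script authors, the snippet below probes a regular expression engine with the two characters that should canonicalize into the basic set when both the `u` and `i` flags are present: U+017F (long s) and U+212A (Kelvin sign). This is a sketch of the expected behaviour under the wording in this commit, not a conformance test. The variable names are arbitrary, and engines that predate or diverge from this text may give different answers.

```javascript
// U+017F and U+212A are expected to count as word characters only under /iu.
const longS = "\u017F";
const kelvin = "\u212A";

const results = {
  iu: /\w/iu.test(longS) && /\w/iu.test(kelvin), // both flags: set is extended
  uOnly: /\w/u.test(longS), // no case folding without the i flag
  iOnly: /\w/i.test(longS), // non-Unicode folding never maps non-ASCII into ASCII
  plain: /\w/.test(longS),
};

console.log(results);
```

Under the text in this commit, only the `iu` entry should be true. `\W` is deliberately left out of the probe.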
