Commit 4936575

miss-islington, godlygeek, and pablogsal authored
[3.13] gh-119517: Fixes for pasting in pyrepl (GH-120253) (#120353)
gh-119517: Fixes for pasting in pyrepl (GH-120253)

* Remove pyrepl's optimization for self-insert

  This will be replaced by a less specialized optimization.

* Use line-buffering when pyrepl echoes pastes

  Previously echoing was totally suppressed until the entire command had been pasted and the terminal ended paste mode, but this gives the user no feedback to indicate that an operation is in progress. Drawing something to the screen once per line strikes a balance between perceived responsiveness and performance.

* Remove dead code from pyrepl

  `msg_at_bottom` is always true.

* Speed up pyrepl's screen rendering computation

  The Reader in pyrepl doesn't hold a complete representation of the screen area being drawn as persistent state. Instead, it recomputes it, on each keypress. This is fast enough for a few hundred bytes, but incredibly slow as the input buffer grows into the kilobytes (likely because of pasting).

  Rather than making some expensive and expansive changes to the repl's internal representation of the screen, add some caching: remember some data from one refresh to the next about what was drawn to the screen and, if we don't find anything that has invalidated the results that were computed last time around, reuse them.

  To keep this caching as simple as possible, all we'll do is look for lines in the buffer that were above the cursor the last time we were asked to update the screen, and that are still above the cursor now. We assume that nothing can affect a line that comes before both the old and new cursor location without us being informed. Based on this assumption, we can reuse old lines, which drastically speeds up the overwhelmingly common case where the user is typing near the end of the buffer.

* Speed up pyrepl prompt drawing

  Cache the `can_colorize()` call rather than repeatedly recomputing it. This call looks up an environment variable, and is called once per character typed at the REPL. The environment variable lookup shows up as a hot spot when profiling, and we don't expect this to change while the REPL is running.

* Speed up pasting multiple lines into the REPL

  Previously, we were checking whether the command should be accepted each time a line break was encountered, but that's not the expected behavior. In bracketed paste mode, we expect everything pasted to be part of a single block of code, and encountering a newline shouldn't behave like a user pressing <Enter> to execute a command. The user should always have a chance to review the pasted command before running it.

* Use a read buffer for input in pyrepl

  Previously we were reading one byte at a time, which causes much slower IO than necessary. Instead, read in chunks, processing previously read data before asking for more.

* Optimize finding width of a single character

  `wlen` finds the width of a multi-character string by adding up the width of each character, and then subtracting the width of any escape sequences. It's often called for single character strings, however, which can't possibly contain escape sequences. Optimize for that case.

* Optimize disp_str for ASCII characters

  Since every ASCII character is known to display as single width, we can avoid not only the Unicode data lookup in `disp_str` but also the one hidden in `str_width` for them.

* Speed up cursor movements in long pyrepl commands

  When the current pyrepl command buffer contains many lines, scrolling up becomes slow. We have optimizations in place to reuse lines above the cursor position from one refresh to the next, but don't currently try to reuse lines below the cursor position in the same way, so we wind up with quadratic behavior where all lines of the buffer below the cursor are recomputed each time the cursor moves up another line.

  Optimize this by only computing one screen's worth of lines beyond the cursor position. Any lines beyond that can't possibly be shown by the console, and bounding this makes scrolling up have linear time complexity instead.

---------

(cherry picked from commit 32a0fab)

Signed-off-by: Matt Wozniski <[email protected]>
Co-authored-by: Matt Wozniski <[email protected]>
Co-authored-by: Pablo Galindo <[email protected]>
1 parent 2dd07fd commit 4936575

6 files changed: +134 -57 lines changed

Lib/_pyrepl/commands.py

-3
@@ -368,8 +368,6 @@ def do(self) -> None:
         r = self.reader
         text = self.event * r.get_arg()
         r.insert(text)
-        if len(text) == 1 and r.pos == len(r.buffer):
-            r.calc_screen = r.append_to_screen
 
 
 class insert_nl(EditCommand):
@@ -483,4 +481,3 @@ def do(self) -> None:
         self.reader.paste_mode = False
         self.reader.in_bracketed_paste = False
         self.reader.dirty = True
-        self.reader.calc_screen = self.reader.calc_complete_screen

Lib/_pyrepl/completing_reader.py

+2-6
@@ -209,10 +209,6 @@ def do(self) -> None:
         r = self.reader  # type: ignore[assignment]
 
         commands.self_insert.do(self)
-
-        if r.cmpltn_menu_visible or r.cmpltn_message_visible:
-            r.calc_screen = r.calc_complete_screen
-
         if r.cmpltn_menu_visible:
             stem = r.get_stem()
             if len(stem) < 1:
@@ -261,8 +257,8 @@ def after_command(self, cmd: Command) -> None:
         if not isinstance(cmd, (complete, self_insert)):
             self.cmpltn_reset()
 
-    def calc_complete_screen(self) -> list[str]:
-        screen = super().calc_complete_screen()
+    def calc_screen(self) -> list[str]:
+        screen = super().calc_screen()
         if self.cmpltn_menu_visible:
             ly = self.lxy[1]
             screen[ly:ly] = self.cmpltn_menu

Lib/_pyrepl/reader.py

+109-45
@@ -35,15 +35,13 @@
 # types
 Command = commands.Command
 if False:
-    from typing import Callable
     from .types import Callback, SimpleContextManager, KeySpec, CommandName
-    CalcScreen = Callable[[], list[str]]
 
 
 def disp_str(buffer: str) -> tuple[str, list[int]]:
     """disp_str(buffer:string) -> (string, [int])
 
-    Return the string that should be the printed represenation of
+    Return the string that should be the printed representation of
     |buffer| and a list detailing where the characters of |buffer|
     get used up. E.g.:

@@ -54,11 +52,17 @@ def disp_str(buffer: str) -> tuple[str, list[int]]:
     b: list[int] = []
     s: list[str] = []
     for c in buffer:
-        if ord(c) > 128 and unicodedata.category(c).startswith("C"):
+        if ord(c) < 128:
+            s.append(c)
+            b.append(1)
+        elif unicodedata.category(c).startswith("C"):
             c = r"\u%04x" % ord(c)
-        s.append(c)
-        b.append(wlen(c))
-        b.extend([0] * (len(c) - 1))
+            s.append(c)
+            b.append(str_width(c))
+            b.extend([0] * (len(c) - 1))
+        else:
+            s.append(c)
+            b.append(str_width(c))
     return "".join(s), b

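The ASCII fast path described in the commit message can be illustrated in isolation. What follows is a minimal sketch, not the pyrepl code: `char_width` is a hypothetical stand-in for `str_width`, and `display_widths` is an illustrative helper that only computes per-character widths, leaving out the control-character escaping that `disp_str` also performs.

```python
import unicodedata
from functools import lru_cache


@lru_cache(maxsize=None)
def char_width(c: str) -> int:
    """Hypothetical stand-in for str_width: 2 columns for wide or
    fullwidth East Asian characters, 1 column otherwise."""
    return 2 if unicodedata.east_asian_width(c) in ("W", "F") else 1


def display_widths(buffer: str) -> list[int]:
    """Return the display width of each character in buffer."""
    widths: list[int] = []
    for c in buffer:
        if ord(c) < 128:
            # ASCII fast path: always one column, so no Unicode
            # database lookup (and no cache lookup) is needed.
            widths.append(1)
        else:
            widths.append(char_width(c))
    return widths
```

Checking `ord(c) < 128` first means the common case of plain ASCII source text never reaches `unicodedata` at all.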

@@ -230,16 +234,59 @@ class Reader:
     commands: dict[str, type[Command]] = field(default_factory=make_default_commands)
     last_command: type[Command] | None = None
     syntax_table: dict[str, int] = field(default_factory=make_default_syntax_table)
-    msg_at_bottom: bool = True
     keymap: tuple[tuple[str, str], ...] = ()
     input_trans: input.KeymapTranslator = field(init=False)
     input_trans_stack: list[input.KeymapTranslator] = field(default_factory=list)
     screen: list[str] = field(default_factory=list)
     screeninfo: list[tuple[int, list[int]]] = field(init=False)
     cxy: tuple[int, int] = field(init=False)
     lxy: tuple[int, int] = field(init=False)
-    calc_screen: CalcScreen = field(init=False)
     scheduled_commands: list[str] = field(default_factory=list)
+    can_colorize: bool = False
+
+    ## cached metadata to speed up screen refreshes
+    @dataclass
+    class RefreshCache:
+        in_bracketed_paste: bool = False
+        screen: list[str] = field(default_factory=list)
+        screeninfo: list[tuple[int, list[int]]] = field(init=False)
+        line_end_offsets: list[int] = field(default_factory=list)
+        pos: int = field(init=False)
+        cxy: tuple[int, int] = field(init=False)
+        dimensions: tuple[int, int] = field(init=False)
+
+        def update_cache(self,
+                         reader: Reader,
+                         screen: list[str],
+                         screeninfo: list[tuple[int, list[int]]],
+            ) -> None:
+            self.in_bracketed_paste = reader.in_bracketed_paste
+            self.screen = screen.copy()
+            self.screeninfo = screeninfo.copy()
+            self.pos = reader.pos
+            self.cxy = reader.cxy
+            self.dimensions = reader.console.width, reader.console.height
+
+        def valid(self, reader: Reader) -> bool:
+            dimensions = reader.console.width, reader.console.height
+            dimensions_changed = dimensions != self.dimensions
+            paste_changed = reader.in_bracketed_paste != self.in_bracketed_paste
+            return not (dimensions_changed or paste_changed)
+
+        def get_cached_location(self, reader: Reader) -> tuple[int, int]:
+            offset = 0
+            earliest_common_pos = min(reader.pos, self.pos)
+            num_common_lines = len(self.line_end_offsets)
+            while num_common_lines > 0:
+                offset = self.line_end_offsets[num_common_lines - 1]
+                if earliest_common_pos > offset:
+                    break
+                num_common_lines -= 1
+            else:
+                offset = 0
+            return offset, num_common_lines
+
+    last_refresh_cache: RefreshCache = field(default_factory=RefreshCache)
 
     def __post_init__(self) -> None:
         # Enable the use of `insert` without a `prepare` call - necessary to
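To see how `get_cached_location` decides how many rendered lines can be reused, here is a standalone sketch of the same loop. The function name `cached_location` and its plain-argument signature are assumptions made for illustration; in the commit the values come from the `Reader` and its `RefreshCache`.

```python
def cached_location(
    line_end_offsets: list[int], old_pos: int, new_pos: int
) -> tuple[int, int]:
    """Return (offset, num_common_lines) for a screen refresh.

    line_end_offsets[i] is the buffer offset just past screen line i,
    as recorded during the previous refresh. Only lines that end
    strictly before both cursor positions are considered reusable.
    """
    earliest_common_pos = min(old_pos, new_pos)
    num_common_lines = len(line_end_offsets)
    while num_common_lines > 0:
        offset = line_end_offsets[num_common_lines - 1]
        if earliest_common_pos > offset:
            # This line, and every line before it, ends above both cursors.
            return offset, num_common_lines
        num_common_lines -= 1
    # Nothing can be reused: recompute from the start of the buffer.
    return 0, 0


# Buffer "ab\ncd\nef": each line ends at its length plus one for the newline.
offsets = [3, 6, 9]
print(cached_location(offsets, old_pos=8, new_pos=7))  # (6, 2)
print(cached_location(offsets, old_pos=2, new_pos=9))  # (0, 0)
```

In the first call both cursor positions are on the third line, so the first two screen lines are kept and rendering resumes at buffer offset 6. In the second call the old cursor was on the first line, so nothing is reused.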
@@ -252,60 +299,69 @@ def __post_init__(self) -> None:
         self.screeninfo = [(0, [])]
         self.cxy = self.pos2xy()
         self.lxy = (self.pos, 0)
-        self.calc_screen = self.calc_complete_screen
+        self.can_colorize = can_colorize()
+
+        self.last_refresh_cache.screeninfo = self.screeninfo
+        self.last_refresh_cache.pos = self.pos
+        self.last_refresh_cache.cxy = self.cxy
+        self.last_refresh_cache.dimensions = (0, 0)
 
     def collect_keymap(self) -> tuple[tuple[KeySpec, CommandName], ...]:
         return default_keymap
 
-    def append_to_screen(self) -> list[str]:
-        new_screen = self.screen.copy() or ['']
+    def calc_screen(self) -> list[str]:
+        """Translate changes in self.buffer into changes in self.console.screen."""
+        # Since the last call to calc_screen:
+        # screen and screeninfo may differ due to a completion menu being shown
+        # pos and cxy may differ due to edits, cursor movements, or completion menus
 
-        new_character = self.buffer[-1]
-        new_character_len = wlen(new_character)
+        # Lines that are above both the old and new cursor position can't have changed,
+        # unless the terminal has been resized (which might cause reflowing) or we've
+        # entered or left paste mode (which changes prompts, causing reflowing).
+        num_common_lines = 0
+        offset = 0
+        if self.last_refresh_cache.valid(self):
+            offset, num_common_lines = self.last_refresh_cache.get_cached_location(self)
 
-        last_line_len = wlen(new_screen[-1])
-        if last_line_len + new_character_len >= self.console.width:  # We need to wrap here
-            new_screen[-1] += '\\'
-            self.screeninfo[-1][1].append(1)
-            new_screen.append(self.buffer[-1])
-            self.screeninfo.append((0, [new_character_len]))
-        else:
-            new_screen[-1] += self.buffer[-1]
-            self.screeninfo[-1][1].append(new_character_len)
-        self.cxy = self.pos2xy()
+        screen = self.last_refresh_cache.screen
+        del screen[num_common_lines:]
 
-        # Reset the function that is used for completing the screen
-        self.calc_screen = self.calc_complete_screen
-        return new_screen
+        screeninfo = self.last_refresh_cache.screeninfo
+        del screeninfo[num_common_lines:]
+
+        last_refresh_line_end_offsets = self.last_refresh_cache.line_end_offsets
+        del last_refresh_line_end_offsets[num_common_lines:]
 
-    def calc_complete_screen(self) -> list[str]:
-        """The purpose of this method is to translate changes in
-        self.buffer into changes in self.screen. Currently it rips
-        everything down and starts from scratch, which whilst not
-        especially efficient is certainly simple(r).
-        """
-        lines = self.get_unicode().split("\n")
-        screen: list[str] = []
-        screeninfo: list[tuple[int, list[int]]] = []
         pos = self.pos
-        for ln, line in enumerate(lines):
+        pos -= offset
+
+        lines = "".join(self.buffer[offset:]).split("\n")
+        cursor_found = False
+        lines_beyond_cursor = 0
+        for ln, line in enumerate(lines, num_common_lines):
             ll = len(line)
             if 0 <= pos <= ll:
-                if self.msg and not self.msg_at_bottom:
-                    for mline in self.msg.split("\n"):
-                        screen.append(mline)
-                        screeninfo.append((0, []))
                 self.lxy = pos, ln
+                cursor_found = True
+            elif cursor_found:
+                lines_beyond_cursor += 1
+                if lines_beyond_cursor > self.console.height:
+                    # No need to keep formatting lines.
+                    # The console can't show them.
+                    break
             prompt = self.get_prompt(ln, ll >= pos >= 0)
             while "\n" in prompt:
                 pre_prompt, _, prompt = prompt.partition("\n")
+                last_refresh_line_end_offsets.append(offset)
                 screen.append(pre_prompt)
                 screeninfo.append((0, []))
             pos -= ll + 1
             prompt, lp = self.process_prompt(prompt)
             l, l2 = disp_str(line)
             wrapcount = (wlen(l) + lp) // self.console.width
             if wrapcount == 0:
+                offset += ll + 1  # Takes all of the line plus the newline
+                last_refresh_line_end_offsets.append(offset)
                 screen.append(prompt + l)
                 screeninfo.append((lp, l2))
             else:
@@ -321,22 +377,27 @@ def calc_complete_screen(self) -> list[str]:
                         column += character_width
                     pre = prompt if i == 0 else ""
                     if len(l) > index_to_wrap_before:
+                        offset += index_to_wrap_before
                         post = "\\"
                         after = [1]
                     else:
+                        offset += index_to_wrap_before + 1  # Takes the newline
                         post = ""
                         after = []
+                    last_refresh_line_end_offsets.append(offset)
                     screen.append(pre + l[:index_to_wrap_before] + post)
                     screeninfo.append((prelen, l2[:index_to_wrap_before] + after))
                     l = l[index_to_wrap_before:]
                     l2 = l2[index_to_wrap_before:]
                     i += 1
         self.screeninfo = screeninfo
         self.cxy = self.pos2xy()
-        if self.msg and self.msg_at_bottom:
+        if self.msg:
             for mline in self.msg.split("\n"):
                 screen.append(mline)
                 screeninfo.append((0, []))
+
+        self.last_refresh_cache.update_cache(self, screen, screeninfo)
         return screen
 
     @staticmethod
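The bound on lines rendered below the cursor is easier to follow with the prompt and wrapping logic stripped away. This is a simplified sketch under stated assumptions: `lines_to_format` is a hypothetical helper that returns the indices of the lines that would be formatted, and it ignores line wrapping and the cached lines above the cursor.

```python
def lines_to_format(lines: list[str], pos: int, height: int) -> list[int]:
    """Return indices of the lines that need formatting.

    pos is the cursor offset into "\n".join(lines); height is the
    number of rows the console can show.
    """
    formatted: list[int] = []
    cursor_found = False
    lines_beyond_cursor = 0
    for ln, line in enumerate(lines):
        ll = len(line)
        if 0 <= pos <= ll:
            cursor_found = True
        elif cursor_found:
            lines_beyond_cursor += 1
            if lines_beyond_cursor > height:
                # More than one screen below the cursor: cannot be visible.
                break
        formatted.append(ln)
        pos -= ll + 1  # consume the line and its newline
    return formatted


# Cursor on the first of ten lines, console three rows tall.
print(lines_to_format(["x"] * 10, pos=0, height=3))  # [0, 1, 2, 3]
```

With the cursor on the first line, only the cursor line plus one screen's worth of following lines is processed, so moving the cursor up through a long buffer no longer reformats the whole tail on every step.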
@@ -456,7 +517,7 @@ def get_prompt(self, lineno: int, cursor_on_line: bool) -> str:
         `lineno'."""
         if self.arg is not None and cursor_on_line:
             prompt = f"(arg: {self.arg}) "
-        elif self.paste_mode:
+        elif self.paste_mode and not self.in_bracketed_paste:
             prompt = "(paste) "
         elif "\n" in self.buffer:
             if lineno == 0:
@@ -468,7 +529,7 @@ def get_prompt(self, lineno: int, cursor_on_line: bool) -> str:
         else:
             prompt = self.ps1
 
-        if can_colorize():
+        if self.can_colorize:
             prompt = f"{ANSIColors.BOLD_MAGENTA}{prompt}{ANSIColors.RESET}"
         return prompt
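The prompt change replaces a per-call environment check with a value computed once at construction. A minimal sketch of that pattern follows; `detect_color` is a hypothetical stand-in for `can_colorize()` that only looks at `NO_COLOR`, and `PromptRenderer` and the escape-code constants are illustrative names, not pyrepl's.

```python
import os
from dataclasses import dataclass, field

BOLD_MAGENTA = "\x1b[1;35m"  # illustrative ANSI codes, defined locally
RESET = "\x1b[0m"


def detect_color() -> bool:
    """Hypothetical stand-in for can_colorize(): checks only NO_COLOR."""
    return "NO_COLOR" not in os.environ


@dataclass
class PromptRenderer:
    ps1: str = ">>> "
    use_color: bool = field(init=False)

    def __post_init__(self) -> None:
        # Look up the environment once, instead of on every prompt draw.
        self.use_color = detect_color()

    def get_prompt(self) -> str:
        if self.use_color:
            return f"{BOLD_MAGENTA}{self.ps1}{RESET}"
        return self.ps1
```

The trade-off is the one the commit message states: a change to the environment after construction is not noticed, which is acceptable when the setting is not expected to change while the REPL runs.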

@@ -604,6 +665,9 @@ def update_screen(self) -> None:
 
     def refresh(self) -> None:
         """Recalculate and refresh the screen."""
+        if self.in_bracketed_paste and self.buffer and not self.buffer[-1] == "\n":
+            return
+
         # this call sets up self.cxy, so call it first.
         self.screen = self.calc_screen()
         self.console.refresh(self.screen, self.cxy)
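The early return in `refresh` is what produces line-buffered echo during a paste. The condition can be restated as a small predicate; `should_refresh` is a hypothetical name used only for this sketch, and the buffer is modeled as a list of characters as in the diff.

```python
def should_refresh(in_bracketed_paste: bool, buffer: list[str]) -> bool:
    """Mirror of the guard added to refresh(), inverted for readability.

    Outside a bracketed paste every refresh request is honored. During
    a paste, redraw only when the buffer is empty or ends in a newline,
    which amounts to one redraw per pasted line.
    """
    if in_bracketed_paste and buffer and buffer[-1] != "\n":
        return False
    return True


print(should_refresh(True, list("x = 1")))    # False: mid-line during paste
print(should_refresh(True, list("x = 1\n")))  # True: a line was completed
print(should_refresh(False, list("x = 1")))   # True: normal typing
```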
@@ -627,7 +691,7 @@ def do_cmd(self, cmd: tuple[str, list[str]]) -> None:
 
         self.after_command(command)
 
-        if self.dirty and not self.in_bracketed_paste:
+        if self.dirty:
             self.refresh()
         else:
             self.update_cursor()

Lib/_pyrepl/readline.py

+4
@@ -263,6 +263,10 @@ def do(self) -> None:
         r = self.reader  # type: ignore[assignment]
         r.dirty = True  # this is needed to hide the completion menu, if visible
 
+        if self.reader.in_bracketed_paste:
+            r.insert("\n")
+            return
+
         # if there are already several lines and the cursor
         # is not on the last one, always insert a new \n.
         text = r.get_unicode()

Lib/_pyrepl/unix_console.py

+17-3
@@ -150,6 +150,8 @@ def __init__(
 
         self.pollob = poll()
         self.pollob.register(self.input_fd, select.POLLIN)
+        self.input_buffer = b""
+        self.input_buffer_pos = 0
         curses.setupterm(term or None, self.output_fd)
         self.term = term

@@ -197,6 +199,18 @@ def _my_getstr(cap: str, optional: bool = False) -> bytes | None:
         self.event_queue = EventQueue(self.input_fd, self.encoding)
         self.cursor_visible = 1
 
+    def __read(self, n: int) -> bytes:
+        if not self.input_buffer or self.input_buffer_pos >= len(self.input_buffer):
+            self.input_buffer = os.read(self.input_fd, 10000)
+
+        ret = self.input_buffer[self.input_buffer_pos : self.input_buffer_pos + n]
+        self.input_buffer_pos += len(ret)
+        if self.input_buffer_pos >= len(self.input_buffer):
+            self.input_buffer = b""
+            self.input_buffer_pos = 0
+        return ret
+
+
     def change_encoding(self, encoding: str) -> None:
         """
         Change the encoding used for I/O operations.
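The read buffer added here can be exercised outside the console class. The sketch below follows the same idea as `__read` but is not the pyrepl code: `BufferedFdReader` and its `chunk_size` parameter are names assumed for this example.

```python
import os


class BufferedFdReader:
    """Serve small reads from a chunk fetched with a single os.read call."""

    def __init__(self, fd: int, chunk_size: int = 10000) -> None:
        self.fd = fd
        self.chunk_size = chunk_size
        self.buffer = b""
        self.pos = 0

    def read(self, n: int) -> bytes:
        """Return up to n bytes, refilling only when the buffer is spent.

        Like os.read, this may return fewer than n bytes: it never
        issues a second system call to top up a short result.
        """
        if self.pos >= len(self.buffer):
            self.buffer = os.read(self.fd, self.chunk_size)
            self.pos = 0
        ret = self.buffer[self.pos : self.pos + n]
        self.pos += len(ret)
        return ret
```

Feeding the reader from a pipe shows the effect: after `os.write(w, b"hello")`, the first `read(1)` performs the only `os.read` call, and later calls such as `read(2)` and `read(10)` are answered from memory until the five bytes are consumed.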
@@ -373,7 +387,7 @@ def get_event(self, block: bool = True) -> Event | None:
         while self.event_queue.empty():
             while True:
                 try:
-                    self.push_char(os.read(self.input_fd, 1))
+                    self.push_char(self.__read(1))
                 except OSError as err:
                     if err.errno == errno.EINTR:
                         if not self.event_queue.empty():
@@ -491,7 +505,7 @@ def getpending(self):
                 e.raw += e.raw
 
             amount = struct.unpack("i", ioctl(self.input_fd, FIONREAD, b"\0\0\0\0"))[0]
-            raw = os.read(self.input_fd, amount)
+            raw = self.__read(amount)
             data = str(raw, self.encoding, "replace")
             e.data += data
             e.raw += raw
@@ -514,7 +528,7 @@ def getpending(self):
                 e.raw += e.raw
 
             amount = 10000
-            raw = os.read(self.input_fd, amount)
+            raw = self.__read(amount)
             data = str(raw, self.encoding, "replace")
             e.data += data
             e.raw += raw

Lib/_pyrepl/utils.py

+2
@@ -16,6 +16,8 @@ def str_width(c: str) -> int:
 
 
 def wlen(s: str) -> int:
+    if len(s) == 1:
+        return str_width(s)
     length = sum(str_width(i) for i in s)
     # remove lengths of any escape sequences
     sequence = ANSI_ESCAPE_SEQUENCE.findall(s)
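The single-character shortcut in `wlen` relies on the fact that an escape sequence needs more than one character. A self-contained sketch of the idea is below. It is an approximation, not the module's code: `char_width` stands in for `str_width`, `visible_width` stands in for `wlen`, and the `SGR_SEQUENCE` pattern is an assumption that matches only color and style sequences.

```python
import re
import unicodedata

# Assumed pattern: matches only SGR sequences such as "\x1b[1;35m".
SGR_SEQUENCE = re.compile(r"\x1b\[[0-9;]*m")


def char_width(c: str) -> int:
    """Hypothetical stand-in for str_width."""
    return 2 if unicodedata.east_asian_width(c) in ("W", "F") else 1


def visible_width(s: str) -> int:
    """Display width of s, not counting escape sequences."""
    if len(s) == 1:
        # A lone character cannot be an escape sequence, so skip
        # both the summation and the regex scan.
        return char_width(s)
    length = sum(char_width(c) for c in s)
    # Subtract the characters that belong to escape sequences.
    return length - sum(len(seq) for seq in SGR_SEQUENCE.findall(s))


print(visible_width("a"))                  # 1
print(visible_width("\x1b[1mhi\x1b[0m"))   # 2
```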
