
1# -*- coding: utf-8 -*- 

2#@+leo-ver=5-thin 

3#@+node:ekr.20141012064706.18389: * @file leoAst.py 

4#@@first 

5# This file is part of Leo: https://leoeditor.com 

6# Leo's copyright notice is based on the MIT license: http://leoeditor.com/license.html 

7#@+<< docstring >> 

8#@+node:ekr.20200113081838.1: ** << docstring >> (leoAst.py) 

9""" 

10leoAst.py: This file does not depend on Leo in any way. 

11 

12The classes in this file unify python's token-based and ast-based worlds by 

13creating two-way links between tokens in the token list and ast nodes in 

14the parse tree. For more details, see the "Overview" section below. 

15 

16 

17**Stand-alone operation** 

18 

19usage: 

20 leoAst.py --help 

21 leoAst.py [--fstringify | --fstringify-diff | --orange | --orange-diff] PATHS 

22 leoAst.py --py-cov [ARGS] 

23 leoAst.py --pytest [ARGS] 

24 leoAst.py --unittest [ARGS] 

25 

26examples: 

27 --py-cov "-f TestOrange" 

28 --pytest "-f TestOrange" 

29 --unittest TestOrange 

30 

31positional arguments: 

32 PATHS directory or list of files 

33 

34optional arguments: 

35 -h, --help show this help message and exit 

36 --fstringify leonine fstringify 

37 --fstringify-diff show fstringify diff 

38 --orange leonine Black 

39 --orange-diff show orange diff 

40 --py-cov run pytest --cov on leoAst.py 

41 --pytest run pytest on leoAst.py 

42 --unittest run unittest on leoAst.py 

43 

44 

45**Overview** 

46 

47leoAst.py unifies python's token-oriented and ast-oriented worlds. 

48 

49leoAst.py defines classes that create two-way links between tokens 

50created by python's tokenize module and parse tree nodes created by 

51python's ast module: 

52 

53The Token Order Generator (TOG) class quickly creates the following 

54links: 

55 

56- An *ordered* children array from each ast node to its children. 

57 

58- A parent link from each ast.node to its parent. 

59 

60- Two-way links between tokens in the token list, a list of Token 

61 objects, and the ast nodes in the parse tree: 

62 

63 - For each token, token.node contains the ast.node "responsible" for 

64 the token. 

65 

66 - For each ast node, node.first_i and node.last_i are indices into 

67 the token list. These indices give the range of tokens that can be 

68 said to be "generated" by the ast node. 

69 

70Once the TOG class has inserted parent/child links, the Token Order 

71Traverser (TOT) class traverses trees annotated with parent/child 

72links extremely quickly. 
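
**Example**

The following minimal sketch (assuming this file is importable as leoAst;
the exact token stream may vary slightly between Python versions) creates
the links for a one-line program and prints them::

    from leoAst import TokenOrderGenerator, tokens_for_node, tokens_to_string

    tog = TokenOrderGenerator()
    tokens, tree = tog.init_from_string('a = b + 1\n', '<example>')
    for token in tokens:
        node = getattr(token, 'node', None)
        print(token.index, token.kind, token.value,
            node.__class__.__name__ if node else '')
    # All tokens generated by the ast.Assign statement.
    print(tokens_to_string(tokens_for_node('<example>', tree.body[0], tokens)))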

73 

74 

75**Applicability and importance** 

76 

77Many python developers will find asttokens meets all their needs. 

78asttokens is well documented and easy to use. Nevertheless, two-way 

79links are significant additions to python's tokenize and ast modules: 

80 

81- Links from tokens to nodes are assigned to the nearest possible ast 

82 node, not the nearest statement, as in asttokens. Links can easily 

83 be reassigned, if desired. 

84 

85- The TOG and TOT classes are intended to be the foundation of tools 

86 such as fstringify and black. 

87 

88- The TOG class solves real problems, such as: 

89 https://stackoverflow.com/questions/16748029/ 

90 

91**Known bug** 

92 

93This file has no known bugs *except* when running under Python 3.8. 

94 

95For Python 3.8, syncing tokens will fail for function calls such as: 

96 

97 f(1, x=2, *[3, 4], y=5) 

98 

99that is, for calls where keywords appear before non-keyword args. 

100 

101There are no plans to fix this bug. The workaround is to use Python version 

1023.9 or above. 

103 

104 

105**Figures of merit** 

106 

107Simplicity: The code consists primarily of a set of generators, one 

108for every kind of ast node. 

109 

110Speed: The TOG creates two-way links between tokens and ast nodes in 

111roughly the time taken by python's tokenize.tokenize and ast.parse 

112library methods. This is substantially faster than the asttokens, 

113black or fstringify tools. The TOT class traverses trees annotated 

114with parent/child links even more quickly. 

115 

116Memory: The TOG class makes no significant demands on python's 

117resources. Generators add nothing to python's call stack. 

118TOG.node_stack is the only variable-length data. This stack resides in 

119python's heap, so its length is unimportant. In the worst case, it 

120might contain a few thousand entries. The TOT class uses no 

121variable-length data at all. 

122 

123**Links** 

124 

125Leo... 

126Ask for help: https://groups.google.com/forum/#!forum/leo-editor 

127Report a bug: https://github.com/leo-editor/leo-editor/issues 

128leoAst.py docs: http://leoeditor.com/appendices.html#leoast-py 

129 

130Other tools... 

131asttokens: https://pypi.org/project/asttokens 

132black: https://pypi.org/project/black/ 

133fstringify: https://pypi.org/project/fstringify/ 

134 

135Python modules... 

136tokenize.py: https://docs.python.org/3/library/tokenize.html 

137ast.py: https://docs.python.org/3/library/ast.html 

138 

139**Studying this file** 

140 

141I strongly recommend that you use Leo when studying this code so that you 

142will see the file's intended outline structure. 

143 

144Without Leo, you will see only special **sentinel comments** that create 

145Leo's outline structure. These comments have the form:: 

146 

147 `#@<comment-kind>:<user-id>.<timestamp>.<number>: <outline-level> <headline>` 

148""" 

149#@-<< docstring >> 

150#@+<< imports >> 

151#@+node:ekr.20200105054219.1: ** << imports >> (leoAst.py) 

152import argparse 

153import ast 

154import codecs 

155import difflib 

156import glob 

157import io 

158import os 

159import re 

160import sys 

161import textwrap 

162import tokenize 

163import traceback 

164from typing import List, Optional 

165#@-<< imports >> 

166v1, v2, junk1, junk2, junk3 = sys.version_info 

167py_version = (v1, v2) 

168 

169# Async tokens exist only in Python 3.5 and 3.6. 

170# https://docs.python.org/3/library/token.html 

171has_async_tokens = (3, 5) <= py_version <= (3, 6) 

172 

173# has_position_only_params = (v1, v2) >= (3, 8) 

174#@+others 

175#@+node:ekr.20191226175251.1: ** class LeoGlobals 

176#@@nosearch 

177 

178 

179class LeoGlobals: # pragma: no cover 

180 """ 

181 Simplified version of functions in leoGlobals.py. 

182 """ 

183 

184 total_time = 0.0 # For unit testing. 

185 

186 #@+others 

187 #@+node:ekr.20191226175903.1: *3* LeoGlobals.callerName 

188 def callerName(self, n): 

189 """Get the function name from the call stack.""" 

190 try: 

191 f1 = sys._getframe(n) 

192 code1 = f1.f_code 

193 return code1.co_name 

194 except Exception: 

195 return '' 

196 #@+node:ekr.20191226175426.1: *3* LeoGlobals.callers 

197 def callers(self, n=4): 

198 """ 

199 Return a string containing a comma-separated list of the callers 

200 of the function that called g.callers. 

201 """ 

202 i, result = 2, [] 

203 while True: 

204 s = self.callerName(n=i) 

205 if s: 

206 result.append(s) 

207 if not s or len(result) >= n: 

208 break 

209 i += 1 

210 return ','.join(reversed(result)) 

211 #@+node:ekr.20191226190709.1: *3* leoGlobals.es_exception & helper 

212 def es_exception(self, full=True): 

213 typ, val, tb = sys.exc_info() 

214 for line in traceback.format_exception(typ, val, tb): 

215 print(line) 

216 fileName, n = self.getLastTracebackFileAndLineNumber() 

217 return fileName, n 

218 #@+node:ekr.20191226192030.1: *4* LeoGlobals.getLastTracebackFileAndLineNumber 

219 def getLastTracebackFileAndLineNumber(self): 

220 typ, val, tb = sys.exc_info() 

221 if typ == SyntaxError: 

222 # IndentationError is a subclass of SyntaxError. 

223 # SyntaxError *does* have 'filename' and 'lineno' attributes. 

224 return val.filename, val.lineno # type:ignore 

225 # 

226 # Data is a list of tuples, one per stack entry. 

227 # The tuples have the form (filename, lineNumber, functionName, text). 

228 data = traceback.extract_tb(tb) 

229 item = data[-1] # Get the item at the top of the stack. 

230 filename, n, functionName, text = item 

231 return filename, n 

232 #@+node:ekr.20200220065737.1: *3* LeoGlobals.objToString 

233 def objToString(self, obj, tag=None): 

234 """Simplified version of g.printObj.""" 

235 result = [] 

236 if tag: 

237 result.append(f"{tag}...") 

238 if isinstance(obj, str): 

239 obj = g.splitLines(obj) 

240 if isinstance(obj, list): 

241 result.append('[') 

242 for z in obj: 

243 result.append(f" {z!r}") 

244 result.append(']') 

245 elif isinstance(obj, tuple): 

246 result.append('(') 

247 for z in obj: 

248 result.append(f" {z!r}") 

249 result.append(')') 

250 else: 

251 result.append(repr(obj)) 

252 result.append('') 

253 return '\n'.join(result) 

254 #@+node:ekr.20191226190425.1: *3* LeoGlobals.plural 

255 def plural(self, obj): 

256 """Return "s" or "" depending on n.""" 

257 if isinstance(obj, (list, tuple, str)): 

258 n = len(obj) 

259 else: 

260 n = obj 

261 return '' if n == 1 else 's' 

262 #@+node:ekr.20191226175441.1: *3* LeoGlobals.printObj 

263 def printObj(self, obj, tag=None): 

264 """Simplified version of g.printObj.""" 

265 print(self.objToString(obj, tag)) 

266 #@+node:ekr.20191226190131.1: *3* LeoGlobals.splitLines 

267 def splitLines(self, s): 

268 """Split s into lines, preserving the number of lines and 

269 the endings of all lines, including the last line.""" 

270 # g.stat() 

271 if s: 

272 return s.splitlines(True) 

273 # This is a Python string function! 

274 return [] 

275 #@+node:ekr.20191226190844.1: *3* LeoGlobals.toEncodedString 

276 def toEncodedString(self, s, encoding='utf-8'): 

277 """Convert unicode string to an encoded string.""" 

278 if not isinstance(s, str): 

279 return s 

280 try: 

281 s = s.encode(encoding, "strict") 

282 except UnicodeError: 

283 s = s.encode(encoding, "replace") 

284 print(f"toEncodedString: Error converting {s!r} to {encoding}") 

285 return s 

286 #@+node:ekr.20191226190006.1: *3* LeoGlobals.toUnicode 

287 def toUnicode(self, s, encoding='utf-8'): 

288 """Convert bytes to unicode if necessary.""" 

289 tag = 'g.toUnicode' 

290 if isinstance(s, str): 

291 return s 

292 if not isinstance(s, bytes): 

293 print(f"{tag}: bad s: {s!r}") 

294 return '' 

295 b: bytes = s 

296 try: 

297 s2 = b.decode(encoding, 'strict') 

298 except(UnicodeDecodeError, UnicodeError): 

299 s2 = b.decode(encoding, 'replace') 

300 print(f"{tag}: unicode error. encoding: {encoding!r}, s2:\n{s2!r}") 

301 g.trace(g.callers()) 

302 except Exception: 

303 g.es_exception() 

304 print(f"{tag}: unexpected error! encoding: {encoding!r}, s2:\n{s2!r}") 

305 g.trace(g.callers()) 

306 return s2 

307 #@+node:ekr.20191226175436.1: *3* LeoGlobals.trace 

308 def trace(self, *args): 

309 """Print a tracing message.""" 

310 # Compute the caller name. 

311 try: 

312 f1 = sys._getframe(1) 

313 code1 = f1.f_code 

314 name = code1.co_name 

315 except Exception: 

316 name = '' 

317 print(f"{name}: {' '.join(str(z) for z in args)}") 

318 #@+node:ekr.20191226190241.1: *3* LeoGlobals.truncate 

319 def truncate(self, s, n): 

320 """Return s truncated to n characters.""" 

321 if len(s) <= n: 

322 return s 

323 s2 = s[: n - 3] + f"...({len(s)})" 

324 return s2 + '\n' if s.endswith('\n') else s2 

325 #@-others 

326#@+node:ekr.20200702114522.1: ** leoAst.py: top-level commands 

327#@+node:ekr.20200702114557.1: *3* command: fstringify_command 

328def fstringify_command(files): 

329 """ 

330 Entry point for --fstringify. 

331 

332 Fstringify each of the given files, overwriting them in place. 

333 """ 

334 for filename in files: # pragma: no cover 

335 if os.path.exists(filename): 

336 print(f"fstringify {filename}") 

337 Fstringify().fstringify_file_silent(filename) 

338 else: 

339 print(f"file not found: {filename}") 

340#@+node:ekr.20200702121222.1: *3* command: fstringify_diff_command 

341def fstringify_diff_command(files): 

342 """ 

343 Entry point for --fstringify-diff. 

344 

345 Print the diff that would be produced by fstringify. 

346 """ 

347 for filename in files: # pragma: no cover 

348 if os.path.exists(filename): 

349 print(f"fstringify-diff {filename}") 

350 Fstringify().fstringify_file_diff(filename) 

351 else: 

352 print(f"file not found: {filename}") 

353#@+node:ekr.20200702115002.1: *3* command: orange_command 

354def orange_command(files): 

355 """Entry point for --orange: beautify each of the given files in place.""" 

356 for filename in files: # pragma: no cover 

357 if os.path.exists(filename): 

358 print(f"orange {filename}") 

359 Orange().beautify_file(filename) 

360 else: 

361 print(f"file not found: {filename}") 

362#@+node:ekr.20200702121315.1: *3* command: orange_diff_command 

363def orange_diff_command(files): 

364 """Entry point for --orange-diff: print the diff that orange would produce.""" 

365 for filename in files: # pragma: no cover 

366 if os.path.exists(filename): 

367 print(f"orange-diff {filename}") 

368 Orange().beautify_file_diff(filename) 

369 else: 

370 print(f"file not found: {filename}") 

371#@+node:ekr.20160521104628.1: ** leoAst.py: top-level utils 

372if 1: # pragma: no cover 

373 #@+others 

374 #@+node:ekr.20200702102239.1: *3* function: main (leoAst.py) 

375 def main(): 

376 """Run commands specified by sys.argv.""" 

377 description = textwrap.dedent("""\ 

378 leo-editor/leo/unittests/core/test_leoAst.py contains unit tests (100% coverage). 

379 """) 

380 parser = argparse.ArgumentParser(description=description, formatter_class=argparse.RawTextHelpFormatter) 

381 parser.add_argument('PATHS', nargs='*', help='directory or list of files') 

382 group = parser.add_mutually_exclusive_group(required=False) # Don't require any args. 

383 add = group.add_argument 

384 add('--fstringify', dest='f', action='store_true', help='leonine fstringify') 

385 add('--fstringify-diff', dest='fd', action='store_true', help='show fstringify diff') 

386 add('--orange', dest='o', action='store_true', help='leonine Black') 

387 add('--orange-diff', dest='od', action='store_true', help='show orange diff') 

388 args = parser.parse_args() 

389 files = args.PATHS 

390 if len(files) == 1 and os.path.isdir(files[0]): 

391 files = glob.glob(f"{files[0]}{os.sep}*.py") 

392 if args.f: 

393 fstringify_command(files) 

394 if args.fd: 

395 fstringify_diff_command(files) 

396 if args.o: 

397 orange_command(files) 

398 if args.od: 

399 orange_diff_command(files) 

400 #@+node:ekr.20200107114409.1: *3* functions: reading & writing files 

401 #@+node:ekr.20200218071822.1: *4* function: regularize_nls 

402 def regularize_nls(s): 

403 """Regularize newlines within s.""" 

404 return s.replace('\r\n', '\n').replace('\r', '\n') 

405 #@+node:ekr.20200106171502.1: *4* function: get_encoding_directive 

406 encoding_pattern = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)') 

407 # This is the pattern in PEP 263. 

408 

409 def get_encoding_directive(bb): 

410 """ 

411 Get the encoding from the encoding directive at the start of a file. 

412 

413 bb: The bytes of the file. 

414 

415 Returns the codec name, or 'UTF-8'. 

416 

417 Adapted from pyzo. Copyright 2008 to 2020 by Almar Klein. 

418 """ 

419 for line in bb.split(b'\n', 2)[:2]: 

420 # Try to make line a string 

421 try: 

422 line2 = line.decode('ASCII').strip() 

423 except Exception: 

424 continue 

425 # Does the line match the PEP 263 pattern? 

426 m = encoding_pattern.match(line2) 

427 if not m: 

428 continue 

429 # Is it a known encoding? Correct the name if it is. 

430 try: 

431 c = codecs.lookup(m.group(1)) 

432 return c.name 

433 except Exception: 

434 pass 

435 return 'UTF-8' 
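    # Illustrative examples (not executed here), following the PEP 263 pattern above:
    #   get_encoding_directive(b'# -*- coding: latin-1 -*-\nx = 1\n')
    #     returns the canonical name from codecs.lookup('latin-1'), that is, 'iso8859-1'.
    #   get_encoding_directive(b'x = 1\n') returns 'UTF-8'.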

436 #@+node:ekr.20200103113417.1: *4* function: read_file 

437 def read_file(filename, encoding='utf-8'): 

438 """ 

439 Return the contents of the file with the given name. 

440 Print an error message and return None on error. 

441 """ 

442 tag = 'read_file' 

443 try: 

444 # Translate all newlines to '\n'. 

445 with open(filename, 'r', encoding=encoding) as f: 

446 s = f.read() 

447 return regularize_nls(s) 

448 except Exception: 

449 print(f"{tag}: can not read {filename}") 

450 return None 

451 #@+node:ekr.20200106173430.1: *4* function: read_file_with_encoding 

452 def read_file_with_encoding(filename): 

453 """ 

454 Read the file with the given name, returning (e, s), where: 

455 

456 s is the string, converted to unicode, or '' if there was an error. 

457 

458 e is the encoding of s, computed in the following order: 

459 

460 - The BOM encoding if the file starts with a BOM mark. 

461 - The encoding given in the # -*- coding: utf-8 -*- line. 

462 - 'UTF-8' otherwise. 

464 """ 

465 # First, read the file. 

466 bb, tag = b'', 'read_file_with_encoding'  # Default bb guards against a failed read below. 

467 try: 

468 with open(filename, 'rb') as f: 

469 bb = f.read() 

470 except Exception: 

471 print(f"{tag}: can not read {filename}") 

472 if not bb: 

473 return 'UTF-8', '' 

474 # Look for the BOM. 

475 e, bb = strip_BOM(bb) 

476 if not e: 

477 # Python's encoding comments override everything else. 

478 e = get_encoding_directive(bb) 

479 s = g.toUnicode(bb, encoding=e) 

480 s = regularize_nls(s) 

481 return e, s 

482 #@+node:ekr.20200106174158.1: *4* function: strip_BOM 

483 def strip_BOM(bb): 

484 """ 

485 bb must be the bytes contents of a file. 

486 

487 If bb starts with a BOM (Byte Order Mark), return (e, bb2), where: 

488 

489 - e is the encoding implied by the BOM. 

490 - bb2 is bb, stripped of the BOM. 

491 

492 If there is no BOM, return (None, bb) 

493 """ 

494 assert isinstance(bb, bytes), bb.__class__.__name__ 

495 table = ( 

496 # Test longer bom's first. 

497 (4, 'utf-32', codecs.BOM_UTF32_BE), 

498 (4, 'utf-32', codecs.BOM_UTF32_LE), 

499 (3, 'utf-8', codecs.BOM_UTF8), 

500 (2, 'utf-16', codecs.BOM_UTF16_BE), 

501 (2, 'utf-16', codecs.BOM_UTF16_LE), 

502 ) 

503 for n, e, bom in table: 

504 assert len(bom) == n 

505 if bom == bb[: len(bom)]: 

506 return e, bb[len(bom) :] 

507 return None, bb 
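    # Illustrative examples (not executed here):
    #   strip_BOM(codecs.BOM_UTF8 + b'x = 1\n') == ('utf-8', b'x = 1\n')
    #   strip_BOM(b'x = 1\n') == (None, b'x = 1\n')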

508 #@+node:ekr.20200103163100.1: *4* function: write_file 

509 def write_file(filename, s, encoding='utf-8'): 

510 """ 

511 Write the string s to the file whose name is given. 

512 

513 Handle all exceptions. 

514 

515 Before calling this function, the caller should ensure 

516 that the file actually has been changed. 

517 """ 

518 try: 

519 # Write the file with platform-dependent newlines. 

520 with open(filename, 'w', encoding=encoding) as f: 

521 f.write(s) 

522 except Exception as e: 

523 g.trace(f"Error writing {filename}\n{e}") 

524 #@+node:ekr.20200113154120.1: *3* functions: tokens 

525 #@+node:ekr.20191223093539.1: *4* function: find_anchor_token 

526 def find_anchor_token(node, global_token_list): 

527 """ 

528 Return the anchor_token for node, a token such that token.node == node. 

529 

530 The search starts at node, then proceeds through the usual child nodes. 

531 """ 

532 

533 node1 = node 

534 

535 def anchor_token(node): 

536 """Return the anchor token in node.token_list""" 

537 # Careful: some tokens in the token list may have been killed. 

538 for token in get_node_token_list(node, global_token_list): 

539 if is_ancestor(node1, token): 

540 return token 

541 return None 

542 

543 # This table only has to cover fields for ast.Nodes that 

544 # won't have any associated token. 

545 

546 fields = ( 

547 # Common... 

548 'elt', 'elts', 'body', 'value', 

549 # Less common... 

550 'dims', 'ifs', 'names', 's', 

551 'test', 'values', 'targets', 

552 ) 

553 while node: 

554 # First, try the node itself. 

555 token = anchor_token(node) 

556 if token: 

557 return token 

558 # Second, try the most common nodes w/o token_lists: 

559 if isinstance(node, ast.Call): 

560 node = node.func 

561 elif isinstance(node, ast.Tuple): 

562 node = node.elts # type:ignore 

563 # Finally, try all other nodes. 

564 else: 

565 # This will be used rarely. 

566 for field in fields: 

567 node = getattr(node, field, None) 

568 if node: 

569 token = anchor_token(node) 

570 if token: 

571 return token 

572 else: 

573 break 

574 return None 

575 #@+node:ekr.20191231160225.1: *4* function: find_paren_token (changed signature) 

576 def find_paren_token(i, global_token_list): 

577 """Return i of the next paren token, starting at tokens[i].""" 

578 while i < len(global_token_list): 

579 token = global_token_list[i] 

580 if token.kind == 'op' and token.value in '()': 

581 return i 

582 if is_significant_token(token): 

583 break 

584 i += 1 

585 return None 

586 #@+node:ekr.20200113110505.4: *4* function: get_node_tokens_list 

587 def get_node_token_list(node, global_tokens_list): 

588 """ 

589 global_tokens_list must be the global token list. 

590 Return the tokens assigned to the node, or []. 

591 """ 

592 i = getattr(node, 'first_i', None) 

593 j = getattr(node, 'last_i', None) 

594 return [] if i is None else global_tokens_list[i : j + 1] 

595 #@+node:ekr.20191124123830.1: *4* function: is_significant & is_significant_token 

596 def is_significant(kind, value): 

597 """ 

598 Return True if (kind, value) represent a token that can be used for 

599 syncing generated tokens with the token list. 

600 """ 

601 # Making 'endmarker' significant ensures that all tokens are synced. 

602 return ( 

603 kind in ('async', 'await', 'endmarker', 'name', 'number', 'string') or 

604 kind == 'op' and value not in ',;()') 

605 

606 def is_significant_token(token): 

607 """Return True if the given token is a syncronizing token""" 

608 return is_significant(token.kind, token.value) 
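    # Illustrative sketch: for `a = f(b, 1)` the significant tokens are the
    # names 'a', 'f', 'b', the number '1', the '=' operator and the endmarker;
    # the '(', ')' and ',' tokens (and the newline) are not significant.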

609 #@+node:ekr.20191224093336.1: *4* function: match_parens 

610 def match_parens(filename, i, j, tokens): 

611 """Match parens in tokens[i:j]. Return the new j.""" 

612 if j >= len(tokens): 

613 return len(tokens) 

614 # Calculate paren level... 

615 level = 0 

616 for n in range(i, j + 1): 

617 token = tokens[n] 

618 if token.kind == 'op' and token.value == '(': 

619 level += 1 

620 if token.kind == 'op' and token.value == ')': 

621 if level == 0: 

622 break 

623 level -= 1 

624 # Find matching ')' tokens *after* j. 

625 if level > 0: 

626 while level > 0 and j + 1 < len(tokens): 

627 token = tokens[j + 1] 

628 if token.kind == 'op' and token.value == ')': 

629 level -= 1 

630 elif token.kind == 'op' and token.value == '(': 

631 level += 1 

632 elif is_significant_token(token): 

633 break 

634 j += 1 

635 if level != 0: # pragma: no cover. 

636 line_n = tokens[i].line_number 

637 raise AssignLinksError( 

638 f"\n" 

639 f"Unmatched parens: level={level}\n" 

640 f" file: {filename}\n" 

641 f" line: {line_n}\n") 

642 return j 

643 #@+node:ekr.20191223053324.1: *4* function: tokens_for_node 

644 def tokens_for_node(filename, node, global_token_list): 

645 """Return the list of all tokens descending from node.""" 

646 # Find any token descending from node. 

647 token = find_anchor_token(node, global_token_list) 

648 if not token: 

649 if 0: # A good trace for debugging. 

650 print('') 

651 g.trace('===== no tokens', node.__class__.__name__) 

652 return [] 

653 assert is_ancestor(node, token) 

654 # Scan backward. 

655 i = first_i = token.index 

656 while i > 0:  # Guard: don't wrap around to tokens[-1]. 

657 token2 = global_token_list[i - 1] 

658 if getattr(token2, 'node', None): 

659 if is_ancestor(node, token2): 

660 first_i = i - 1 

661 else: 

662 break 

663 i -= 1 

664 # Scan forward. 

665 j = last_j = token.index 

666 while j + 1 < len(global_token_list): 

667 token2 = global_token_list[j + 1] 

668 if getattr(token2, 'node', None): 

669 if is_ancestor(node, token2): 

670 last_j = j + 1 

671 else: 

672 break 

673 j += 1 

674 last_j = match_parens(filename, first_i, last_j, global_token_list) 

675 results = global_token_list[first_i : last_j + 1] 

676 return results 

677 #@+node:ekr.20200101030236.1: *4* function: tokens_to_string 

678 def tokens_to_string(tokens): 

679 """Return the string represented by the list of tokens.""" 

680 if tokens is None: 

681 # This indicates an internal error. 

682 print('') 

683 g.trace('===== token list is None ===== ') 

684 print('') 

685 return '' 

686 return ''.join([z.to_string() for z in tokens]) 

687 #@+node:ekr.20191231072039.1: *3* functions: utils... 

688 # General utility functions on tokens and nodes. 

689 #@+node:ekr.20191119085222.1: *4* function: obj_id 

690 def obj_id(obj): 

691 """Return the last four digits of id(obj), for dumps & traces.""" 

692 return str(id(obj))[-4:] 

693 #@+node:ekr.20191231060700.1: *4* function: op_name 

694 #@@nobeautify 

695 

696 # https://docs.python.org/3/library/ast.html 

697 

698 _op_names = { 

699 # Binary operators. 

700 'Add': '+', 

701 'BitAnd': '&', 

702 'BitOr': '|', 

703 'BitXor': '^', 

704 'Div': '/', 

705 'FloorDiv': '//', 

706 'LShift': '<<', 

707 'MatMult': '@', # Python 3.5. 

708 'Mod': '%', 

709 'Mult': '*', 

710 'Pow': '**', 

711 'RShift': '>>', 

712 'Sub': '-', 

713 # Boolean operators. 

714 'And': ' and ', 

715 'Or': ' or ', 

716 # Comparison operators 

717 'Eq': '==', 

718 'Gt': '>', 

719 'GtE': '>=', 

720 'In': ' in ', 

721 'Is': ' is ', 

722 'IsNot': ' is not ', 

723 'Lt': '<', 

724 'LtE': '<=', 

725 'NotEq': '!=', 

726 'NotIn': ' not in ', 

727 # Context operators. 

728 'AugLoad': '<AugLoad>', 

729 'AugStore': '<AugStore>', 

730 'Del': '<Del>', 

731 'Load': '<Load>', 

732 'Param': '<Param>', 

733 'Store': '<Store>', 

734 # Unary operators. 

735 'Invert': '~', 

736 'Not': ' not ', 

737 'UAdd': '+', 

738 'USub': '-', 

739 } 

740 

741 def op_name(node): 

742 """Return the print name of an operator node.""" 

743 class_name = node.__class__.__name__ 

744 assert class_name in _op_names, repr(class_name) 

745 return _op_names[class_name].strip() 

746 #@+node:ekr.20200107114452.1: *3* node/token creators... 

747 #@+node:ekr.20200103082049.1: *4* function: make_tokens 

748 def make_tokens(contents): 

749 """ 

750 Return a list (not a generator) of Token objects corresponding to the 

751 list of 5-tuples generated by tokenize.tokenize. 

752 

753 Perform consistency checks and handle all exceptions. 

754 """ 

755 

756 def check(contents, tokens): 

757 result = tokens_to_string(tokens) 

758 ok = result == contents 

759 if not ok: 

760 print('\nRound-trip check FAILS') 

761 print('Contents...\n') 

762 g.printObj(contents) 

763 print('\nResult...\n') 

764 g.printObj(result) 

765 return ok 

766 

767 try: 

768 five_tuples = tokenize.tokenize( 

769 io.BytesIO(contents.encode('utf-8')).readline) 

770 except Exception: 

771 print('make_tokens: exception in tokenize.tokenize') 

772 g.es_exception() 

773 return None 

774 tokens = Tokenizer().create_input_tokens(contents, five_tuples) 

775 assert check(contents, tokens) 

776 return tokens 

777 #@+node:ekr.20191027075648.1: *4* function: parse_ast 

778 def parse_ast(s): 

779 """ 

780 Parse string s, catching & reporting all exceptions. 

781 Return the ast node, or None. 

782 """ 

783 

784 def oops(message): 

785 print('') 

786 print(f"parse_ast: {message}") 

787 g.printObj(s) 

788 print('') 

789 

790 try: 

791 s1 = g.toEncodedString(s) 

792 tree = ast.parse(s1, filename='before', mode='exec') 

793 return tree 

794 except IndentationError: 

795 oops('Indentation Error') 

796 except SyntaxError: 

797 oops('Syntax Error') 

798 except Exception: 

799 oops('Unexpected Exception') 

800 g.es_exception() 

801 return None 

802 #@+node:ekr.20191231110051.1: *3* node/token dumpers... 

803 #@+node:ekr.20191027074436.1: *4* function: dump_ast 

804 def dump_ast(ast, tag='dump_ast'): 

805 """Utility to dump an ast tree.""" 

806 g.printObj(AstDumper().dump_ast(ast), tag=tag) 

807 #@+node:ekr.20191228095945.4: *4* function: dump_contents 

808 def dump_contents(contents, tag='Contents'): 

809 print('') 

810 print(f"{tag}...\n") 

811 for i, z in enumerate(g.splitLines(contents)): 

812 print(f"{i+1:<3} ", z.rstrip()) 

813 print('') 

814 #@+node:ekr.20191228095945.5: *4* function: dump_lines 

815 def dump_lines(tokens, tag='Token lines'): 

816 print('') 

817 print(f"{tag}...\n") 

818 for z in tokens: 

819 if z.line.strip(): 

820 print(z.line.rstrip()) 

821 else: 

822 print(repr(z.line)) 

823 print('') 

824 #@+node:ekr.20191228095945.7: *4* function: dump_results 

825 def dump_results(tokens, tag='Results'): 

826 print('') 

827 print(f"{tag}...\n") 

828 print(tokens_to_string(tokens)) 

829 print('') 

830 #@+node:ekr.20191228095945.8: *4* function: dump_tokens 

831 def dump_tokens(tokens, tag='Tokens'): 

832 print('') 

833 print(f"{tag}...\n") 

834 if not tokens: 

835 return 

836 print("Note: values shown are repr(value) *except* for 'string' tokens.") 

837 tokens[0].dump_header() 

838 for i, z in enumerate(tokens): 

839 # Confusing. 

840 # if (i % 20) == 0: z.dump_header() 

841 print(z.dump()) 

842 print('') 

843 #@+node:ekr.20191228095945.9: *4* function: dump_tree 

844 def dump_tree(tokens, tree, tag='Tree'): 

845 print('') 

846 print(f"{tag}...\n") 

847 print(AstDumper().dump_tree(tokens, tree)) 

848 #@+node:ekr.20200107040729.1: *4* function: show_diffs 

849 def show_diffs(s1, s2, filename=''): 

850 """Print diffs between strings s1 and s2.""" 

851 lines = list(difflib.unified_diff( 

852 g.splitLines(s1), 

853 g.splitLines(s2), 

854 fromfile=f"Old {filename}", 

855 tofile=f"New {filename}", 

856 )) 

857 print('') 

858 tag = f"Diffs for {filename}" if filename else 'Diffs' 

859 g.printObj(lines, tag=tag) 

860 #@+node:ekr.20191223095408.1: *3* node/token nodes... 

861 # Functions that associate tokens with nodes. 

862 #@+node:ekr.20200120082031.1: *4* function: find_statement_node 

863 def find_statement_node(node): 

864 """ 

865 Return the nearest statement node. 

866 Return None if node has only Module for a parent. 

867 """ 

868 if isinstance(node, ast.Module): 

869 return None 

870 parent = node 

871 while parent: 

872 if is_statement_node(parent): 

873 return parent 

874 parent = parent.parent 

875 return None 

876 #@+node:ekr.20191223054300.1: *4* function: is_ancestor 

877 def is_ancestor(node, token): 

878 """Return True if node is an ancestor of token.""" 

879 t_node = token.node 

880 if not t_node: 

881 assert token.kind == 'killed', repr(token) 

882 return False 

883 while t_node: 

884 if t_node == node: 

885 return True 

886 t_node = t_node.parent 

887 return False 

888 #@+node:ekr.20200120082300.1: *4* function: is_long_statement 

889 def is_long_statement(node): 

890 """ 

891 Return True if node is an instance of a node that might be split into 

892 shorter lines. 

893 """ 

894 return isinstance(node, ( 

895 ast.Assign, ast.AnnAssign, ast.AsyncFor, ast.AsyncWith, ast.AugAssign, 

896 ast.Call, ast.Delete, ast.ExceptHandler, ast.For, ast.Global, 

897 ast.If, ast.Import, ast.ImportFrom, 

898 ast.Nonlocal, ast.Return, ast.While, ast.With, ast.Yield, ast.YieldFrom)) 

899 #@+node:ekr.20200120110005.1: *4* function: is_statement_node 

900 def is_statement_node(node): 

901 """Return True if node is a top-level statement.""" 

902 return is_long_statement(node) or isinstance(node, ( 

903 ast.Break, ast.Continue, ast.Pass, ast.Try)) 

904 #@+node:ekr.20191231082137.1: *4* function: nearest_common_ancestor 

905 def nearest_common_ancestor(node1, node2): 

906 """ 

907 Return the nearest common ancestor node for the given nodes. 

908 

909 The nodes must have parent links. 

910 """ 

911 

912 def parents(node): 

913 aList = [] 

914 while node: 

915 aList.append(node) 

916 node = node.parent 

917 return list(reversed(aList)) 

918 

919 result = None 

920 parents1 = parents(node1) 

921 parents2 = parents(node2) 

922 while parents1 and parents2: 

923 parent1 = parents1.pop(0) 

924 parent2 = parents2.pop(0) 

925 if parent1 == parent2: 

926 result = parent1 

927 else: 

928 break 

929 return result 

930 #@+node:ekr.20191225061516.1: *3* node/token replacers... 

931 # Functions that replace tokens or nodes. 

932 #@+node:ekr.20191231162249.1: *4* function: add_token_to_token_list 

933 def add_token_to_token_list(token, node): 

934 """Insert token in the proper location of node.token_list.""" 

935 if getattr(node, 'first_i', None) is None: 

936 node.first_i = node.last_i = token.index 

937 else: 

938 node.first_i = min(node.first_i, token.index) 

939 node.last_i = max(node.last_i, token.index) 

940 #@+node:ekr.20191225055616.1: *4* function: replace_node 

941 def replace_node(new_node, old_node): 

942 """Replace new_node by old_node in the parse tree.""" 

943 parent = old_node.parent 

944 new_node.parent = parent 

945 new_node.node_index = old_node.node_index 

946 children = parent.children 

947 i = children.index(old_node) 

948 children[i] = new_node 

949 # Replace any direct reference to old_node in the parent's ast fields. 

950 fields = getattr(parent, '_fields', None) 

951 if fields: 

952 for field in fields: 

953 if getattr(parent, field, None) == old_node: 

954 setattr(parent, field, new_node) 

955 break 

956 #@+node:ekr.20191225055626.1: *4* function: replace_token 

957 def replace_token(token, kind, value): 

958 """Replace kind and value of the given token.""" 

959 if token.kind in ('endmarker', 'killed'): 

960 return 

961 token.kind = kind 

962 token.value = value 

963 token.node = None # Should be filled later. 

964 #@-others 

965#@+node:ekr.20191027072910.1: ** Exception classes 

966class AssignLinksError(Exception): 

967 """Assigning links to ast nodes failed.""" 

968 

969 

970class AstNotEqual(Exception): 

971 """The two given AST's are not equivalent.""" 

972 

973 

974class FailFast(Exception): 

975 """Abort tests in TestRunner class.""" 

976#@+node:ekr.20141012064706.18390: ** class AstDumper 

977class AstDumper: # pragma: no cover 

978 """A class supporting various kinds of dumps of ast nodes.""" 

979 #@+others 

980 #@+node:ekr.20191112033445.1: *3* dumper.dump_tree & helper 

981 def dump_tree(self, tokens, tree): 

982 """Briefly show a tree, properly indented.""" 

983 self.tokens = tokens 

984 result = [self.show_header()] 

985 self.dump_tree_and_links_helper(tree, 0, result) 

986 return ''.join(result) 

987 #@+node:ekr.20191125035321.1: *4* dumper.dump_tree_and_links_helper 

988 def dump_tree_and_links_helper(self, node, level, result): 

989 """Return the list of lines in result.""" 

990 if node is None: 

991 return 

992 # Let block. 

993 indent = ' ' * 2 * level 

994 children: List[ast.AST] = getattr(node, 'children', []) 

995 node_s = self.compute_node_string(node, level) 

996 # Dump... 

997 if isinstance(node, (list, tuple)): 

998 for z in node: 

999 self.dump_tree_and_links_helper(z, level, result) 

1000 elif isinstance(node, str): 

1001 result.append(f"{indent}{node.__class__.__name__:>8}:{node}\n") 

1002 elif isinstance(node, ast.AST): 

1003 # Node and parent. 

1004 result.append(node_s) 

1005 # Children. 

1006 for z in children: 

1007 self.dump_tree_and_links_helper(z, level + 1, result) 

1008 else: 

1009 result.append(node_s) 

1010 #@+node:ekr.20191125035600.1: *3* dumper.compute_node_string & helpers 

1011 def compute_node_string(self, node, level): 

1012 """Return a string summarizing the node.""" 

1013 indent = ' ' * 2 * level 

1014 parent = getattr(node, 'parent', None) 

1015 node_id = getattr(node, 'node_index', '??') 

1016 parent_id = getattr(parent, 'node_index', '??') 

1017 parent_s = f"{parent_id:>3}.{parent.__class__.__name__} " if parent else '' 

1018 class_name = node.__class__.__name__ 

1019 descriptor_s = f"{node_id}.{class_name}: " + self.show_fields( 

1020 class_name, node, 30) 

1021 tokens_s = self.show_tokens(node, 70, 100) 

1022 lines = self.show_line_range(node) 

1023 full_s1 = f"{parent_s:<16} {lines:<10} {indent}{descriptor_s} " 

1024 node_s = f"{full_s1:<62} {tokens_s}\n" 

1025 return node_s 

1026 #@+node:ekr.20191113223424.1: *4* dumper.show_fields 

1027 def show_fields(self, class_name, node, truncate_n): 

1028 """Return a string showing interesting fields of the node.""" 

1029 val = '' 

1030 if class_name == 'JoinedStr': 

1031 values = node.values 

1032 assert isinstance(values, list) 

1033 # Str tokens may represent *concatenated* strings. 

1034 results = [] 

1035 fstrings, strings = 0, 0 

1036 for z in values: 

1037 assert isinstance(z, (ast.FormattedValue, ast.Str)) 

1038 if isinstance(z, ast.Str): 

1039 results.append(z.s) 

1040 strings += 1 

1041 else: 

1042 results.append(z.__class__.__name__) 

1043 fstrings += 1 

1044 val = f"{strings} str, {fstrings} f-str" 

1045 elif class_name == 'keyword': 

1046 if isinstance(node.value, ast.Str): 

1047 val = f"arg={node.arg}..Str.value.s={node.value.s}" 

1048 elif isinstance(node.value, ast.Name): 

1049 val = f"arg={node.arg}..Name.value.id={node.value.id}" 

1050 else: 

1051 val = f"arg={node.arg}..value={node.value.__class__.__name__}" 

1052 elif class_name == 'Name': 

1053 val = f"id={node.id!r}" 

1054 elif class_name == 'NameConstant': 

1055 val = f"value={node.value!r}" 

1056 elif class_name == 'Num': 

1057 val = f"n={node.n}" 

1058 elif class_name == 'Starred': 

1059 if isinstance(node.value, ast.Str): 

1060 val = f"s={node.value.s}" 

1061 elif isinstance(node.value, ast.Name): 

1062 val = f"id={node.value.id}" 

1063 else: 

1064 val = f"s={node.value.__class__.__name__}" 

1065 elif class_name == 'Str': 

1066 val = f"s={node.s!r}" 

1067 elif class_name in ('AugAssign', 'BinOp', 'BoolOp', 'UnaryOp'): # IfExp 

1068 name = node.op.__class__.__name__ 

1069 val = f"op={_op_names.get(name, name)}" 

1070 elif class_name == 'Compare': 

1071 ops = ','.join([op_name(z) for z in node.ops]) 

1072 val = f"ops='{ops}'" 

1073 else: 

1074 val = '' 

1075 return g.truncate(val, truncate_n) 

1076 #@+node:ekr.20191114054726.1: *4* dumper.show_line_range 

1077 def show_line_range(self, node): 

1078 """Return node's token line-number range as a string.""" 

1079 token_list = get_node_token_list(node, self.tokens) 

1080 if not token_list: 

1081 return '' 

1082 min_ = min([z.line_number for z in token_list]) 

1083 max_ = max([z.line_number for z in token_list]) 

1084 return f"{min_}" if min_ == max_ else f"{min_}..{max_}" 

1085 #@+node:ekr.20191113223425.1: *4* dumper.show_tokens 

1086 def show_tokens(self, node, n, m, show_cruft=False): 

1087 """ 

1088 Return a string showing node.token_list. 

1089 

1090 Split the result if n + len(result) > m 

1091 """ 

1092 token_list = get_node_token_list(node, self.tokens) 

1093 result = [] 

1094 for z in token_list: 

1095 val = None 

1096 if z.kind == 'comment': 

1097 if show_cruft: 

1098 val = g.truncate(z.value, 10) # Short is good. 

1099 result.append(f"{z.kind}.{z.index}({val})") 

1100 elif z.kind == 'name': 

1101 val = g.truncate(z.value, 20) 

1102 result.append(f"{z.kind}.{z.index}({val})") 

1103 elif z.kind == 'newline': 

1104 # result.append(f"{z.kind}.{z.index}({z.line_number}:{len(z.line)})") 

1105 result.append(f"{z.kind}.{z.index}") 

1106 elif z.kind == 'number': 

1107 result.append(f"{z.kind}.{z.index}({z.value})") 

1108 elif z.kind == 'op': 

1109 if z.value not in ',()' or show_cruft: 

1110 result.append(f"{z.kind}.{z.index}({z.value})") 

1111 elif z.kind == 'string': 

1112 val = g.truncate(z.value, 30) 

1113 result.append(f"{z.kind}.{z.index}({val})") 

1114 elif z.kind == 'ws': 

1115 if show_cruft: 

1116 result.append(f"{z.kind}.{z.index}({len(z.value)})") 

1117 else: 

1118 # Indent, dedent, encoding, etc. 

1119 # Don't put a blank. 

1120 continue 

1121 if result and result[-1] != ' ': 

1122 result.append(' ') 

1123 # 

1124 # split the line if it is too long. 

1125 # g.printObj(result, tag='show_tokens') 

1126 if 1: 

1127 return ''.join(result) 

1128 line, lines = [], [] 

1129 for r in result: 

1130 line.append(r) 

1131 if n + len(''.join(line)) >= m: 

1132 lines.append(''.join(line)) 

1133 line = [] 

1134 lines.append(''.join(line)) 

1135 pad = '\n' + ' ' * n 

1136 return pad.join(lines) 

1137 #@+node:ekr.20191110165235.5: *3* dumper.show_header 

1138 def show_header(self): 

1139 """Return a header string, but only the fist time.""" 

1140 return ( 

1141 f"{'parent':<16} {'lines':<10} {'node':<34} {'tokens'}\n" 

1142 f"{'======':<16} {'=====':<10} {'====':<34} {'======'}\n") 

1143 #@+node:ekr.20141012064706.18392: *3* dumper.dump_ast & helper 

1144 annotate_fields = False 

1145 include_attributes = False 

1146 indent_ws = ' ' 

1147 

1148 def dump_ast(self, node, level=0): 

1149 """ 

1150 Dump an ast tree. Adapted from ast.dump. 

1151 """ 

1152 sep1 = '\n%s' % (self.indent_ws * (level + 1)) 

1153 if isinstance(node, ast.AST): 

1154 fields = [(a, self.dump_ast(b, level + 1)) for a, b in self.get_fields(node)] 

1155 if self.include_attributes and node._attributes: 

1156 fields.extend([(a, self.dump_ast(getattr(node, a), level + 1)) 

1157 for a in node._attributes]) 

1158 if self.annotate_fields: 

1159 aList = ['%s=%s' % (a, b) for a, b in fields] 

1160 else: 

1161 aList = [b for a, b in fields] 

1162 name = node.__class__.__name__ 

1163 sep = '' if len(aList) <= 1 else sep1 

1164 return '%s(%s%s)' % (name, sep, sep1.join(aList)) 

1165 if isinstance(node, list): 

1166 sep = sep1 

1167 return 'LIST[%s]' % ''.join( 

1168 ['%s%s' % (sep, self.dump_ast(z, level + 1)) for z in node]) 

1169 return repr(node) 

1170 #@+node:ekr.20141012064706.18393: *4* dumper.get_fields 

1171 def get_fields(self, node): 

1172 """Return (name, value) pairs for node's fields, skipping 'ctx' and empty values.""" 

1173 return ( 

1174 (a, b) for a, b in ast.iter_fields(node) 

1175 if a not in ['ctx',] and b not in (None, []) 

1176 ) 

1177 #@-others 

1178#@+node:ekr.20191227170628.1: ** TOG classes... 

1179#@+node:ekr.20191113063144.1: *3* class TokenOrderGenerator 

1180class TokenOrderGenerator: 

1181 """ 

1182 A class that traverses ast (parse) trees in token order. 

1183 

1184 Overview: https://github.com/leo-editor/leo-editor/issues/1440#issue-522090981 

1185 

1186 Theory of operation: 

1187 - https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-573661883 

1188 - http://leoeditor.com/appendices.html#tokenorder-classes-theory-of-operation 

1189 

1190 How to: http://leoeditor.com/appendices.html#tokenorder-class-how-to 

1191 

1192 Project history: https://github.com/leo-editor/leo-editor/issues/1440#issuecomment-574145510 

1193 """ 

1194 

1195 n_nodes = 0 # The number of nodes that have been visited. 

1196 #@+others 

1197 #@+node:ekr.20200103174914.1: *4* tog: Init... 

1198 #@+node:ekr.20191228184647.1: *5* tog.balance_tokens 

1199 def balance_tokens(self, tokens): 

1200 """ 

1201 TOG.balance_tokens. 

1202 

1203 Insert two-way links between matching paren tokens. 

1204 """ 

1205 count, stack = 0, [] 

1206 for token in tokens: 

1207 if token.kind == 'op': 

1208 if token.value == '(': 

1209 count += 1 

1210 stack.append(token.index) 

1211 if token.value == ')': 

1212 if stack: 

1213 index = stack.pop() 

1214 tokens[index].matching_paren = token.index 

1215 tokens[token.index].matching_paren = index 

1216 else: 

1217 g.trace(f"unmatched ')' at index {token.index}") 

1218 # g.trace(f"tokens: {len(tokens)} matched parens: {count}") 

1219 if stack: 

1220 g.trace("unmatched '(' at {','.join(stack)}") 

1221 return count 

1222 #@+node:ekr.20191113063144.4: *5* tog.create_links 

1223 def create_links(self, tokens, tree, file_name=''): 

1224 """ 

1225 A generator that creates two-way links between the given tokens and the ast tree. 

1226 

1227 Callers should call this generator with list(tog.create_links(...)) 

1228 

1229 The sync_token method creates the links and verifies that the resulting 

1230 tree traversal generates exactly the given tokens in exact order. 

1231 

1232 tokens: the list of Token instances for the input. 

1233 Created by make_tokens(). 

1234 tree: the ast tree for the input. 

1235 Created by parse_ast(). 

1236 """ 

1237 # 

1238 # Init all ivars. 

1239 self.file_name = file_name 

1240 # For tests. 

1241 self.level = 0 

1242 # Python indentation level. 

1243 self.node = None 

1244 # The node being visited. 

1245 # The parent of the about-to-be visited node. 

1246 self.tokens = tokens 

1247 # The immutable list of input tokens. 

1248 self.tree = tree 

1249 # The tree of ast.AST nodes. 

1250 # 

1251 # Traverse the tree. 

1252 try: 

1253 while True: 

1254 next(self.visitor(tree)) 

1255 except StopIteration: 

1256 pass 

1257 # 

1258 # Ensure that all tokens are patched. 

1259 self.node = tree 

1260 yield from self.gen_token('endmarker', '') 

1261 #@+node:ekr.20191229071733.1: *5* tog.init_from_file 

1262 def init_from_file(self, filename): # pragma: no cover 

1263 """ 

1264 Create the tokens and ast tree for the given file. 

1265 Create links between tokens and the parse tree. 

1266 Return (contents, encoding, tokens, tree). 

1267 """ 

1268 self.level = 0 

1269 self.filename = filename 

1270 encoding, contents = read_file_with_encoding(filename) 

1271 if not contents: 

1272 return None, None, None, None 

1273 self.tokens = tokens = make_tokens(contents) 

1274 self.tree = tree = parse_ast(contents) 

1275 list(self.create_links(tokens, tree)) 

1276 return contents, encoding, tokens, tree 

1277 #@+node:ekr.20191229071746.1: *5* tog.init_from_string 

1278 def init_from_string(self, contents, filename): # pragma: no cover 

1279 """ 

1280 Tokenize, parse and create links in the contents string. 

1281 

1282 Return (tokens, tree). 

1283 """ 

1284 self.filename = filename 

1285 self.level = 0 

1286 self.tokens = tokens = make_tokens(contents) 

1287 self.tree = tree = parse_ast(contents) 

1288 list(self.create_links(tokens, tree)) 

1289 return tokens, tree 

1290 #@+node:ekr.20191223052749.1: *4* tog: Traversal... 

1291 #@+node:ekr.20191113063144.3: *5* tog.begin_visitor 

1292 begin_end_stack: List[str] = [] 

1293 node_index = 0 # The index into the node_stack. 

1294 node_stack: List[ast.AST] = [] # The stack of parent nodes. 

1295 

1296 def begin_visitor(self, node): 

1297 """Enter a visitor.""" 

1298 # Update the stats. 

1299 self.n_nodes += 1 

1300 # Do this first, *before* updating self.node. 

1301 node.parent = self.node 

1302 if self.node: 

1303 children = getattr(self.node, 'children', []) # type:ignore 

1304 children.append(node) 

1305 self.node.children = children 

1306 # Inject the node_index field. 

1307 assert not hasattr(node, 'node_index'), g.callers() 

1308 node.node_index = self.node_index 

1309 self.node_index += 1 

1310 # begin_visitor and end_visitor must be paired. 

1311 self.begin_end_stack.append(node.__class__.__name__) 

1312 # Push the previous node. 

1313 self.node_stack.append(self.node) 

1314 # Update self.node *last*. 

1315 self.node = node 

1316 #@+node:ekr.20200104032811.1: *5* tog.end_visitor 

1317 def end_visitor(self, node): 

1318 """Leave a visitor.""" 

1319 # begin_visitor and end_visitor must be paired. 

1320 entry_name = self.begin_end_stack.pop() 

1321 assert entry_name == node.__class__.__name__, f"{entry_name!r} {node.__class__.__name__}" 

1322 assert self.node == node, (repr(self.node), repr(node)) 

1323 # Restore self.node. 

1324 self.node = self.node_stack.pop() 

1325 #@+node:ekr.20200110162044.1: *5* tog.find_next_significant_token 

1326 def find_next_significant_token(self): 

1327 """ 

1328 Scan from *after* self.tokens[px] looking for the next significant 

1329 token. 

1330 

1331 Return the token, or None. Never change self.px. 

1332 """ 

1333 px = self.px + 1 

1334 while px < len(self.tokens): 

1335 token = self.tokens[px] 

1336 px += 1 

1337 if is_significant_token(token): 

1338 return token 

1339 # This will never happen, because the endmarker token is significant. 

1340 return None # pragma: no cover 

1341 #@+node:ekr.20191121180100.1: *5* tog.gen* 

1342 # Useful wrappers... 

1343 

1344 def gen(self, z): 

1345 yield from self.visitor(z) 

1346 

1347 def gen_name(self, val): 

1348 yield from self.visitor(self.sync_name(val)) # type:ignore 

1349 

1350 def gen_op(self, val): 

1351 yield from self.visitor(self.sync_op(val)) # type:ignore 

1352 

1353 def gen_token(self, kind, val): 

1354 yield from self.visitor(self.sync_token(kind, val)) # type:ignore 

1355 #@+node:ekr.20191113063144.7: *5* tog.sync_token & set_links 

1356 px = -1 # Index of the previously synced token. 

1357 

1358 def sync_token(self, kind, val): 

1359 """ 

1360 Sync to a token whose kind & value are given. The token need not be 

1361 significant, but it must be guaranteed to exist in the token list. 

1362 

1363 The checks in this method constitute a strong, ever-present, unit test. 

1364 

1365 Scan the tokens *after* px, looking for a token T matching (kind, val). 

1366 raise AssignLinksError if a significant token is found that doesn't match T. 

1367 Otherwise: 

1368 - Create two-way links between all assignable tokens between px and T. 

1369 - Create two-way links between T and self.node. 

1370 - Advance by updating self.px to point to T. 

1371 """ 

1372 node, tokens = self.node, self.tokens 

1373 assert isinstance(node, ast.AST), repr(node) 

1374 # g.trace( 

1375 # f"px: {self.px:2} " 

1376 # f"node: {node.__class__.__name__:<10} " 

1377 # f"kind: {kind:>10}: val: {val!r}") 

1378 # 

1379 # Step one: Look for token T. 

1380 old_px = px = self.px + 1 

1381 while px < len(self.tokens): 

1382 token = tokens[px] 

1383 if (kind, val) == (token.kind, token.value): 

1384 break # Success. 

1385 if kind == token.kind == 'number': 

1386 val = token.value 

1387 break # Benign: use the token's value, a string, instead of a number. 

1388 if is_significant_token(token): # pragma: no cover 

1389 line_s = f"line {token.line_number}:" 

1390 val = str(val) # for g.truncate. 

1391 raise AssignLinksError( 

1392 f" file: {self.filename}\n" 

1393 f"{line_s:>12} {token.line.strip()}\n" 

1394 f"Looking for: {kind}.{g.truncate(val, 40)!r}\n" 

1395 f" found: {token.kind}.{token.value!r}\n" 

1396 f"token.index: {token.index}\n") 

1397 # Skip the insignificant token. 

1398 px += 1 

1399 else: # pragma: no cover 

1400 val = str(val) # for g.truncate. 

1401 raise AssignLinksError( 

1402 f" file: {self.filename}\n" 

1403 f"Looking for: {kind}.{g.truncate(val, 40)}\n" 

1404 f" found: end of token list") 

1405 # 

1406 # Step two: Assign *secondary* links for comment and newline tokens. 

1407 # Ignore all other non-significant tokens. 

1408 while old_px < px: 

1409 token = tokens[old_px] 

1410 old_px += 1 

1411 if token.kind in ('comment', 'newline', 'nl'): 

1412 self.set_links(node, token) 

1413 # 

1414 # Step three: Set links in the found token. 

1415 token = tokens[px] 

1416 self.set_links(node, token) 

1417 # 

1418 # Step four: Advance. 

1419 self.px = px 

1420 #@+node:ekr.20191125120814.1: *6* tog.set_links 

1421 last_statement_node = None 

1422 

1423 def set_links(self, node, token): 

1424 """Make two-way links between token and the given node.""" 

1425 # Don't bother assigning comment, comma, paren, ws and endmarker tokens. 

1426 if token.kind == 'comment': 

1427 # Append the comment to node.comment_list. 

1428 comment_list = getattr(node, 'comment_list', []) # type:ignore 

1429 node.comment_list = comment_list + [token] 

1430 return 

1431 if token.kind in ('endmarker', 'ws'): 

1432 return 

1433 if token.kind == 'op' and token.value in ',()': 

1434 return 

1435 # *Always* remember the last statement. 

1436 statement = find_statement_node(node) 

1437 if statement: 

1438 self.last_statement_node = statement # type:ignore 

1439 assert not isinstance(self.last_statement_node, ast.Module) 

1440 if token.node is not None: # pragma: no cover 

1441 line_s = f"line {token.line_number}:" 

1442 raise AssignLinksError( 

1443 f" file: {self.filename}\n" 

1444 f"{line_s:>12} {token.line.strip()}\n" 

1445 f"token index: {self.px}\n" 

1446 f"token.node is not None\n" 

1447 f" token.node: {token.node.__class__.__name__}\n" 

1448 f" callers: {g.callers()}") 

1449 # Assign newlines to the previous statement node, if any. 

1450 if token.kind in ('newline', 'nl'): 

1451 # Set an *auxiliary* link for the split/join logic. 

1452 # Do *not* set token.node! 

1453 token.statement_node = self.last_statement_node 

1454 return 

1455 if is_significant_token(token): 

1456 # Link the token to the ast node. 

1457 token.node = node # type:ignore 

1458 # Add the token to node's token_list. 

1459 add_token_to_token_list(token, node) 

1460 #@+node:ekr.20191124083124.1: *5* tog.sync_name and sync_op 

1461 # It's valid for these to return None. 

1462 

1463 def sync_name(self, val): 

1464 aList = val.split('.') 

1465 if len(aList) == 1: 

1466 self.sync_token('name', val) 

1467 else: 

1468 for i, part in enumerate(aList): 

1469 self.sync_token('name', part) 

1470 if i < len(aList) - 1: 

1471 self.sync_op('.') 

1472 

1473 def sync_op(self, val): 

1474 """ 

1475 Sync to the given operator. 

1476 

1477 val may be '(' or ')' *only* if the parens *will* actually exist in the 

1478 token list. 

1479 """ 

1480 self.sync_token('op', val) 

1481 #@+node:ekr.20191113081443.1: *5* tog.visitor (calls begin/end_visitor) 

1482 def visitor(self, node): 

1483 """Given an ast node, return a *generator* from its visitor.""" 

1484 # This saves a lot of tests. 

1485 trace = False 

1486 if node is None: 

1487 return 

1488 if trace: 

1489 # Keep this trace. It's useful. 

1490 cn = node.__class__.__name__ if node else ' ' 

1491 caller1, caller2 = g.callers(2).split(',') 

1492 g.trace(f"{caller1:>15} {caller2:<14} {cn}") 

1493 # More general, more convenient. 

1494 if isinstance(node, (list, tuple)): 

1495 for z in node or []: 

1496 if isinstance(z, ast.AST): 

1497 yield from self.visitor(z) 

1498 else: # pragma: no cover 

1499 # Some fields may contain ints or strings. 

1500 assert isinstance(z, (int, str)), z.__class__.__name__ 

1501 return 

1502 # We *do* want to crash if the visitor doesn't exist. 

1503 method = getattr(self, 'do_' + node.__class__.__name__) 

1504 # Allow begin/end visitor to be generators. 

1505 self.begin_visitor(node) 

1506 yield from method(node) 

1507 self.end_visitor(node) 

1508 #@+node:ekr.20191113063144.13: *4* tog: Visitors... 

1509 #@+node:ekr.20191113063144.32: *5* tog.keyword: not called! 

1510 # keyword arguments supplied to call (NULL identifier for **kwargs) 

1511 

1512 # keyword = (identifier? arg, expr value) 

1513 

1514 def do_keyword(self, node): # pragma: no cover 

1515 """A keyword arg in an ast.Call.""" 

1516 # This should never be called. 

1517 # tog.handle_call_arguments calls self.gen(kwarg_arg.value) instead. 

1518 filename = getattr(self, 'filename', '<no file>') 

1519 raise AssignLinksError( 

1520 f"file: {filename}\n" 

1521 f"do_keyword should never be called\n" 

1522 f"{g.callers(8)}") 

1523 #@+node:ekr.20191113063144.14: *5* tog: Contexts 

1524 #@+node:ekr.20191113063144.28: *6* tog.arg 

1525 # arg = (identifier arg, expr? annotation) 

1526 

1527 def do_arg(self, node): 

1528 """This is one argument of a list of ast.Function or ast.Lambda arguments.""" 

1529 yield from self.gen_name(node.arg) 

1530 annotation = getattr(node, 'annotation', None) 

1531 if annotation is not None: 

1532 yield from self.gen_op(':') 

1533 yield from self.gen(node.annotation) 

1534 #@+node:ekr.20191113063144.27: *6* tog.arguments 

1535 # arguments = ( 

1536 # arg* posonlyargs, arg* args, arg? vararg, arg* kwonlyargs, 

1537 # expr* kw_defaults, arg? kwarg, expr* defaults 

1538 # ) 

1539 

1540 def do_arguments(self, node): 

1541 """Arguments to ast.Function or ast.Lambda, **not** ast.Call.""" 

1542 # 

1543 # No need to generate commas anywhere below. 

1544 # 

1545 # Let block. Some fields may not exist pre Python 3.8. 

1546 n_plain = len(node.args) - len(node.defaults) 

1547 posonlyargs = getattr(node, 'posonlyargs', []) # type:ignore 

1548 vararg = getattr(node, 'vararg', None) 

1549 kwonlyargs = getattr(node, 'kwonlyargs', []) # type:ignore 

1550 kw_defaults = getattr(node, 'kw_defaults', []) # type:ignore 

1551 kwarg = getattr(node, 'kwarg', None) 

1552 if 0: 

1553 g.printObj(ast.dump(node.vararg) if node.vararg else 'None', tag='node.vararg') 

1554 g.printObj([ast.dump(z) for z in node.args], tag='node.args') 

1555 g.printObj([ast.dump(z) for z in node.defaults], tag='node.defaults') 

1556 g.printObj([ast.dump(z) for z in posonlyargs], tag='node.posonlyargs') 

1557 g.printObj([ast.dump(z) for z in kwonlyargs], tag='kwonlyargs') 

1558 g.printObj([ast.dump(z) if z else 'None' for z in kw_defaults], tag='kw_defaults') 

1559 # 1. Sync the position-only args. 

1560 if posonlyargs: 

1561 for n, z in enumerate(posonlyargs): 

1562 # g.trace('pos-only', ast.dump(z)) 

1563 yield from self.gen(z) 

1564 yield from self.gen_op('/') 

1565 # 2. Sync all args. 

1566 for i, z in enumerate(node.args): 

1567 yield from self.gen(z) 

1568 if i >= n_plain: 

1569 yield from self.gen_op('=') 

1570 yield from self.gen(node.defaults[i - n_plain]) 

1571 # 3. Sync the vararg. 

1572 if vararg: 

1573 # g.trace('vararg', ast.dump(vararg)) 

1574 yield from self.gen_op('*') 

1575 yield from self.gen(vararg) 

1576 # 4. Sync the keyword-only args. 

1577 if kwonlyargs: 

1578 if not vararg: 

1579 yield from self.gen_op('*') 

1580 for n, z in enumerate(kwonlyargs): 

1581 # g.trace('keyword-only', ast.dump(z)) 

1582 yield from self.gen(z) 

1583 val = kw_defaults[n] 

1584 if val is not None: 

1585 yield from self.gen_op('=') 

1586 yield from self.gen(val) 

1587 # 5. Sync the kwarg. 

1588 if kwarg: 

1589 # g.trace('kwarg', ast.dump(kwarg)) 

1590 yield from self.gen_op('**') 

1591 yield from self.gen(kwarg) 

1592 

1593 #@+node:ekr.20191113063144.15: *6* tog.AsyncFunctionDef 

1594 # AsyncFunctionDef(identifier name, arguments args, stmt* body, expr* decorator_list, 

1595 # expr? returns) 

1596 

1597 def do_AsyncFunctionDef(self, node): 

1598 

1599 if node.decorator_list: 

1600 for z in node.decorator_list: 

1601 # '@%s\n' 

1602 yield from self.gen_op('@') 

1603 yield from self.gen(z) 

1604 # 'async def (%s): -> %s\n' 

1605 # 'async def %s(%s):\n' 

1606 async_token_type = 'async' if has_async_tokens else 'name' 

1607 yield from self.gen_token(async_token_type, 'async') 

1608 yield from self.gen_name('def') 

1609 yield from self.gen_name(node.name) # A string 

1610 yield from self.gen_op('(') 

1611 yield from self.gen(node.args) 

1612 yield from self.gen_op(')') 

1613 returns = getattr(node, 'returns', None) 

1614 if returns is not None: 

1615 yield from self.gen_op('->') 

1616 yield from self.gen(node.returns) 

1617 yield from self.gen_op(':') 

1618 self.level += 1 

1619 yield from self.gen(node.body) 

1620 self.level -= 1 

1621 #@+node:ekr.20191113063144.16: *6* tog.ClassDef 

1622 def do_ClassDef(self, node, print_body=True): 

1623 

1624 for z in node.decorator_list or []: 

1625 # @{z}\n 

1626 yield from self.gen_op('@') 

1627 yield from self.gen(z) 

1628 # class name(bases):\n 

1629 yield from self.gen_name('class') 

1630 yield from self.gen_name(node.name) # A string. 

1631 if node.bases: 

1632 yield from self.gen_op('(') 

1633 yield from self.gen(node.bases) 

1634 yield from self.gen_op(')') 

1635 yield from self.gen_op(':') 

1636 # Body... 

1637 self.level += 1 

1638 yield from self.gen(node.body) 

1639 self.level -= 1 

1640 #@+node:ekr.20191113063144.17: *6* tog.FunctionDef 

1641 # FunctionDef( 

1642 # identifier name, arguments args, 

1643 # stmt* body, 

1644 # expr* decorator_list, 

1645 # expr? returns, 

1646 # string? type_comment) 

1647 

1648 def do_FunctionDef(self, node): 

1649 

1650 # Guards... 

1651 returns = getattr(node, 'returns', None) 

1652 # Decorators... 

1653 # @{z}\n 

1654 for z in node.decorator_list or []: 

1655 yield from self.gen_op('@') 

1656 yield from self.gen(z) 

1657 # Signature... 

1658 # def name(args): -> returns\n 

1659 # def name(args):\n 

1660 yield from self.gen_name('def') 

1661 yield from self.gen_name(node.name) # A string. 

1662 yield from self.gen_op('(') 

1663 yield from self.gen(node.args) 

1664 yield from self.gen_op(')') 

1665 if returns is not None: 

1666 yield from self.gen_op('->') 

1667 yield from self.gen(node.returns) 

1668 yield from self.gen_op(':') 

1669 # Body... 

1670 self.level += 1 

1671 yield from self.gen(node.body) 

1672 self.level -= 1 

1673 #@+node:ekr.20191113063144.18: *6* tog.Interactive 

1674 def do_Interactive(self, node): # pragma: no cover 

1675 

1676 yield from self.gen(node.body) 

1677 #@+node:ekr.20191113063144.20: *6* tog.Lambda 

1678 def do_Lambda(self, node): 

1679 

1680 yield from self.gen_name('lambda') 

1681 yield from self.gen(node.args) 

1682 yield from self.gen_op(':') 

1683 yield from self.gen(node.body) 

1684 #@+node:ekr.20191113063144.19: *6* tog.Module 

1685 def do_Module(self, node): 

1686 

1687 # Encoding is a non-syncing statement. 

1688 yield from self.gen(node.body) 

1689 #@+node:ekr.20191113063144.21: *5* tog: Expressions 

1690 #@+node:ekr.20191113063144.22: *6* tog.Expr 

1691 def do_Expr(self, node): 

1692 """An outer expression.""" 

1693 # No need to put parentheses. 

1694 yield from self.gen(node.value) 

1695 #@+node:ekr.20191113063144.23: *6* tog.Expression 

1696 def do_Expression(self, node): # pragma: no cover 

1697 """An inner expression.""" 

1698 # No need to put parentheses. 

1699 yield from self.gen(node.body) 

1700 #@+node:ekr.20191113063144.24: *6* tog.GeneratorExp 

1701 def do_GeneratorExp(self, node): 

1702 

1703 # '<gen %s for %s>' % (elt, ','.join(gens)) 

1704 # No need to put parentheses or commas. 

1705 yield from self.gen(node.elt) 

1706 yield from self.gen(node.generators) 

1707 #@+node:ekr.20210321171703.1: *6* tog.NamedExpr 

1708 # NamedExpr(expr target, expr value) 

1709 

1710 def do_NamedExpr(self, node): # Python 3.8+ 

1711 

1712 yield from self.gen(node.target) 

1713 yield from self.gen_op(':=') 

1714 yield from self.gen(node.value) 

1715 #@+node:ekr.20191113063144.26: *5* tog: Operands 

1716 #@+node:ekr.20191113063144.29: *6* tog.Attribute 

1717 # Attribute(expr value, identifier attr, expr_context ctx) 

1718 

1719 def do_Attribute(self, node): 

1720 

1721 yield from self.gen(node.value) 

1722 yield from self.gen_op('.') 

1723 yield from self.gen_name(node.attr) # A string. 

1724 #@+node:ekr.20191113063144.30: *6* tog.Bytes 

1725 def do_Bytes(self, node): 

1726 

1727 """ 

1728 It's invalid to mix bytes and non-bytes literals, so just 

1729 advancing to the next 'string' token suffices. 

1730 """ 

1731 token = self.find_next_significant_token() 

1732 yield from self.gen_token('string', token.value) 

1733 #@+node:ekr.20191113063144.33: *6* tog.comprehension 

1734 # comprehension = (expr target, expr iter, expr* ifs, int is_async) 

1735 

1736 def do_comprehension(self, node): 

1737 

1738 # No need to put parentheses. 

1739 yield from self.gen_name('for') # #1858. 

1740 yield from self.gen(node.target) # A name 

1741 yield from self.gen_name('in') 

1742 yield from self.gen(node.iter) 

1743 for z in node.ifs or []: 

1744 yield from self.gen_name('if') 

1745 yield from self.gen(z) 

1746 #@+node:ekr.20191113063144.34: *6* tog.Constant 

1747 def do_Constant(self, node): # pragma: no cover 

1748 """ 

1749 

1750 https://greentreesnakes.readthedocs.io/en/latest/nodes.html 

1751 

1752 A constant. The value attribute holds the Python object it represents. 

1753 This can be simple types such as a number, string or None, but also 

1754 immutable container types (tuples and frozensets) if all of their 

1755 elements are constant. 

1756 """ 

1757 

1758 # Support Python 3.8. 

1759 if node.value is None or isinstance(node.value, bool): 

1760 # Weird: return a name! 

1761 yield from self.gen_token('name', repr(node.value)) 

1762 elif node.value == Ellipsis: 

1763 yield from self.gen_op('...') 

1764 elif isinstance(node.value, str): 

1765 yield from self.do_Str(node) 

1766 elif isinstance(node.value, (int, float)): 

1767 yield from self.gen_token('number', repr(node.value)) 

1768 elif isinstance(node.value, bytes): 

1769 yield from self.do_Bytes(node) 

1770 elif isinstance(node.value, tuple): 

1771 yield from self.do_Tuple(node) 

1772 elif isinstance(node.value, frozenset): 

1773 yield from self.do_Set(node) 

1774 else: 

1775 # Unknown type. 

1776 g.trace('----- Oops -----', repr(node.value), g.callers()) 

1777 #@+node:ekr.20191113063144.35: *6* tog.Dict 

1778 # Dict(expr* keys, expr* values) 

1779 

1780 def do_Dict(self, node): 

1781 

1782 assert len(node.keys) == len(node.values) 

1783 yield from self.gen_op('{') 

1784 # No need to put commas. 

1785 for i, key in enumerate(node.keys): 

1786 key, value = node.keys[i], node.values[i] 

1787 yield from self.gen(key) # a Str node. 

1788 yield from self.gen_op(':') 

1789 if value is not None: 

1790 yield from self.gen(value) 

1791 yield from self.gen_op('}') 

1792 #@+node:ekr.20191113063144.36: *6* tog.DictComp 

1793 # DictComp(expr key, expr value, comprehension* generators) 

1794 

1795 # d2 = {val: key for key, val in d} 

1796 

1797 def do_DictComp(self, node): 

1798 

1799 yield from self.gen_token('op', '{') 

1800 yield from self.gen(node.key) 

1801 yield from self.gen_op(':') 

1802 yield from self.gen(node.value) 

1803 for z in node.generators or []: 

1804 yield from self.gen(z) 

1805 yield from self.gen_token('op', '}') 

1806 #@+node:ekr.20191113063144.37: *6* tog.Ellipsis 

1807 def do_Ellipsis(self, node): # pragma: no cover (Does not exist for python 3.8+) 

1808 

1809 yield from self.gen_op('...') 

1810 #@+node:ekr.20191113063144.38: *6* tog.ExtSlice 

1811 # https://docs.python.org/3/reference/expressions.html#slicings 

1812 

1813 # ExtSlice(slice* dims) 

1814 

1815 def do_ExtSlice(self, node): # pragma: no cover (deprecated) 

1816 

1817 # ','.join(node.dims) 

1818 for i, z in enumerate(node.dims): 

1819 yield from self.gen(z) 

1820 if i < len(node.dims) - 1: 

1821 yield from self.gen_op(',') 

1822 #@+node:ekr.20191113063144.40: *6* tog.Index 

1823 def do_Index(self, node): # pragma: no cover (deprecated) 

1824 

1825 yield from self.gen(node.value) 

1826 #@+node:ekr.20191113063144.39: *6* tog.FormattedValue: not called! 

1827 # FormattedValue(expr value, int? conversion, expr? format_spec) 

1828 

1829 def do_FormattedValue(self, node): # pragma: no cover 

1830 """ 

1831 This node represents the *components* of a *single* f-string. 

1832 

1833 Happily, JoinedStr nodes *also* represent *all* f-strings, 

1834 so the TOG should *never* visit this node! 

1835 """ 

1836 filename = getattr(self, 'filename', '<no file>') 

1837 raise AssignLinksError( 

1838 f"file: {filename}\n" 

1839 f"do_FormattedValue should never be called") 

1840 

1841 # This code has no chance of being useful... 

1842 

1843 # conv = node.conversion 

1844 # spec = node.format_spec 

1845 # yield from self.gen(node.value) 

1846 # if conv is not None: 

1847 # yield from self.gen_token('number', conv) 

1848 # if spec is not None: 

1849 # yield from self.gen(node.format_spec) 

1850 #@+node:ekr.20191113063144.41: *6* tog.JoinedStr & helpers 

1851 # JoinedStr(expr* values) 

1852 

1853 def do_JoinedStr(self, node): 

1854 """ 

1855 JoinedStr nodes represent at least one f-string and all other strings 

1856 concatenated to it. 

1857 

1858 Analyzing JoinedStr.values would be extremely tricky, for reasons that 

1859 need not be explained here. 

1860 

1861 Instead, we get the tokens *from the token list itself*! 

1862 """ 

1863 for z in self.get_concatenated_string_tokens(): 

1864 yield from self.gen_token(z.kind, z.value) 

1865 #@+node:ekr.20191113063144.42: *6* tog.List 

1866 def do_List(self, node): 

1867 

1868 # No need to put commas. 

1869 yield from self.gen_op('[') 

1870 yield from self.gen(node.elts) 

1871 yield from self.gen_op(']') 

1872 #@+node:ekr.20191113063144.43: *6* tog.ListComp 

1873 # ListComp(expr elt, comprehension* generators) 

1874 

1875 def do_ListComp(self, node): 

1876 

1877 yield from self.gen_op('[') 

1878 yield from self.gen(node.elt) 

1879 for z in node.generators: 

1880 yield from self.gen(z) 

1881 yield from self.gen_op(']') 

1882 #@+node:ekr.20191113063144.44: *6* tog.Name & NameConstant 

1883 def do_Name(self, node): 

1884 

1885 yield from self.gen_name(node.id) 

1886 

1887 def do_NameConstant(self, node): # pragma: no cover (Does not exist in Python 3.8+) 

1888 

1889 yield from self.gen_name(repr(node.value)) 

1890 

1891 #@+node:ekr.20191113063144.45: *6* tog.Num 

1892 def do_Num(self, node): # pragma: no cover (Does not exist in Python 3.8+) 

1893 

1894 yield from self.gen_token('number', node.n) 

1895 #@+node:ekr.20191113063144.47: *6* tog.Set 

1896 # Set(expr* elts) 

1897 

1898 def do_Set(self, node): 

1899 

1900 yield from self.gen_op('{') 

1901 yield from self.gen(node.elts) 

1902 yield from self.gen_op('}') 

1903 #@+node:ekr.20191113063144.48: *6* tog.SetComp 

1904 # SetComp(expr elt, comprehension* generators) 

1905 

1906 def do_SetComp(self, node): 

1907 

1908 yield from self.gen_op('{') 

1909 yield from self.gen(node.elt) 

1910 for z in node.generators or []: 

1911 yield from self.gen(z) 

1912 yield from self.gen_op('}') 

1913 #@+node:ekr.20191113063144.49: *6* tog.Slice 

1914 # slice = Slice(expr? lower, expr? upper, expr? step) 

1915 

1916 def do_Slice(self, node): 

1917 

1918 lower = getattr(node, 'lower', None) 

1919 upper = getattr(node, 'upper', None) 

1920 step = getattr(node, 'step', None) 

1921 if lower is not None: 

1922 yield from self.gen(lower) 

1923 # Always put the colon between lower and upper. 

1924 yield from self.gen_op(':') 

1925 if upper is not None: 

1926 yield from self.gen(upper) 

1927 # Put the second colon if it exists in the token list. 

1928 if step is None: 

1929 token = self.find_next_significant_token() 

1930 if token and token.value == ':': 

1931 yield from self.gen_op(':') 

1932 else: 

1933 yield from self.gen_op(':') 

1934 yield from self.gen(step) 

1935 #@+node:ekr.20191113063144.50: *6* tog.Str & helper 

1936 def do_Str(self, node): 

1937 """This node represents a string constant.""" 

1938 # This loop is necessary to handle string concatenation. 

1939 for z in self.get_concatenated_string_tokens(): 

1940 yield from self.gen_token(z.kind, z.value) 

1941 #@+node:ekr.20200111083914.1: *7* tog.get_concatenated_tokens 

1942 def get_concatenated_string_tokens(self): 

1943 """ 

1944 Return the next 'string' token and all 'string' tokens concatenated to 

1945 it. *Never* update self.px here. 

1946 """ 

1947 trace = False 

1948 tag = 'tog.get_concatenated_string_tokens' 

1949 i = self.px 

1950 # First, find the next significant token. It should be a string. 

1951 i, token = i + 1, None 

1952 while i < len(self.tokens): 

1953 token = self.tokens[i] 

1954 i += 1 

1955 if token.kind == 'string': 

1956 # Rescan the string. 

1957 i -= 1 

1958 break 

1959 # An error. 

1960 if is_significant_token(token): # pragma: no cover 

1961 break 

1962 # Raise an error if we didn't find the expected 'string' token. 

1963 if not token or token.kind != 'string': # pragma: no cover 

1964 if not token: 

1965 token = self.tokens[-1] 

1966 filename = getattr(self, 'filename', '<no filename>') 

1967 raise AssignLinksError( 

1968 f"\n" 

1969 f"{tag}...\n" 

1970 f"file: {filename}\n" 

1971 f"line: {token.line_number}\n" 

1972 f" i: {i}\n" 

1973 f"expected 'string' token, got {token!s}") 

1974 # Accumulate string tokens. 

1975 assert self.tokens[i].kind == 'string' 

1976 results = [] 

1977 while i < len(self.tokens): 

1978 token = self.tokens[i] 

1979 i += 1 

1980 if token.kind == 'string': 

1981 results.append(token) 

1982 elif token.kind == 'op' or is_significant_token(token): 

1983 # Any significant token *or* any op will halt string concatenation. 

1984 break 

1985 # 'ws', 'nl', 'newline', 'comment', 'indent', 'dedent', etc. 

1986 # The (significant) 'endmarker' token ensures we will have a result. 

1987 assert results 

1988 if trace: 

1989 g.printObj(results, tag=f"{tag}: Results") 

1990 return results 

1991 #@+node:ekr.20191113063144.51: *6* tog.Subscript 

1992 # Subscript(expr value, slice slice, expr_context ctx) 

1993 

1994 def do_Subscript(self, node): 

1995 

1996 yield from self.gen(node.value) 

1997 yield from self.gen_op('[') 

1998 yield from self.gen(node.slice) 

1999 yield from self.gen_op(']') 

2000 #@+node:ekr.20191113063144.52: *6* tog.Tuple 

2001 # Tuple(expr* elts, expr_context ctx) 

2002 

2003 def do_Tuple(self, node): 

2004 

2005 # Do not call gen_op for parens or commas here. 

2006 # They do not necessarily exist in the token list! 

2007 yield from self.gen(node.elts) 

2008 #@+node:ekr.20191113063144.53: *5* tog: Operators 

2009 #@+node:ekr.20191113063144.55: *6* tog.BinOp 

2010 def do_BinOp(self, node): 

2011 

2012 op_name_ = op_name(node.op) 

2013 yield from self.gen(node.left) 

2014 yield from self.gen_op(op_name_) 

2015 yield from self.gen(node.right) 

2016 #@+node:ekr.20191113063144.56: *6* tog.BoolOp 

2017 # BoolOp(boolop op, expr* values) 

2018 

2019 def do_BoolOp(self, node): 

2020 

2021 # op.join(node.values) 

2022 op_name_ = op_name(node.op) 

2023 for i, z in enumerate(node.values): 

2024 yield from self.gen(z) 

2025 if i < len(node.values) - 1: 

2026 yield from self.gen_name(op_name_) 

2027 #@+node:ekr.20191113063144.57: *6* tog.Compare 

2028 # Compare(expr left, cmpop* ops, expr* comparators) 

2029 

2030 def do_Compare(self, node): 

2031 

2032 assert len(node.ops) == len(node.comparators) 

2033 yield from self.gen(node.left) 

2034 for i, z in enumerate(node.ops): 

2035 op_name_ = op_name(node.ops[i]) 

2036 if op_name_ in ('not in', 'is not'): 

2037 for z in op_name_.split(' '): 

2038 yield from self.gen_name(z) 

2039 elif op_name_.isalpha(): 

2040 yield from self.gen_name(op_name_) 

2041 else: 

2042 yield from self.gen_op(op_name_) 

2043 yield from self.gen(node.comparators[i]) 

2044 #@+node:ekr.20191113063144.58: *6* tog.UnaryOp 

2045 def do_UnaryOp(self, node): 

2046 

2047 op_name_ = op_name(node.op) 

2048 if op_name_.isalpha(): 

2049 yield from self.gen_name(op_name_) 

2050 else: 

2051 yield from self.gen_op(op_name_) 

2052 yield from self.gen(node.operand) 

2053 #@+node:ekr.20191113063144.59: *6* tog.IfExp (ternary operator) 

2054 # IfExp(expr test, expr body, expr orelse) 

2055 

2056 def do_IfExp(self, node): 

2057 

2058 #'%s if %s else %s' 

2059 yield from self.gen(node.body) 

2060 yield from self.gen_name('if') 

2061 yield from self.gen(node.test) 

2062 yield from self.gen_name('else') 

2063 yield from self.gen(node.orelse) 

2064 #@+node:ekr.20191113063144.60: *5* tog: Statements 

2065 #@+node:ekr.20191113063144.83: *6* tog.Starred 

2066 # Starred(expr value, expr_context ctx) 

2067 

2068 def do_Starred(self, node): 

2069 """A starred argument to an ast.Call""" 

2070 yield from self.gen_op('*') 

2071 yield from self.gen(node.value) 

2072 #@+node:ekr.20191113063144.61: *6* tog.AnnAssign 

2073 # AnnAssign(expr target, expr annotation, expr? value, int simple) 

2074 

2075 def do_AnnAssign(self, node): 

2076 

2077 # {node.target}:{node.annotation}={node.value}\n' 

2078 yield from self.gen(node.target) 

2079 yield from self.gen_op(':') 

2080 yield from self.gen(node.annotation) 

2081 if node.value is not None: # #1851 

2082 yield from self.gen_op('=') 

2083 yield from self.gen(node.value) 

2084 #@+node:ekr.20191113063144.62: *6* tog.Assert 

2085 # Assert(expr test, expr? msg) 

2086 

2087 def do_Assert(self, node): 

2088 

2089 # Guards... 

2090 msg = getattr(node, 'msg', None) 

2091 # No need to put parentheses or commas. 

2092 yield from self.gen_name('assert') 

2093 yield from self.gen(node.test) 

2094 if msg is not None: 

2095 yield from self.gen(node.msg) 

2096 #@+node:ekr.20191113063144.63: *6* tog.Assign 

2097 def do_Assign(self, node): 

2098 

2099 for z in node.targets: 

2100 yield from self.gen(z) 

2101 yield from self.gen_op('=') 

2102 yield from self.gen(node.value) 

2103 #@+node:ekr.20191113063144.64: *6* tog.AsyncFor 

2104 def do_AsyncFor(self, node): 

2105 

2106 # The 'async for' line... 

2107 # Py 3.8 changes the kind of token. 

2108 async_token_type = 'async' if has_async_tokens else 'name' 

2109 yield from self.gen_token(async_token_type, 'async') 

2110 yield from self.gen_name('for') 

2111 yield from self.gen(node.target) 

2112 yield from self.gen_name('in') 

2113 yield from self.gen(node.iter) 

2114 yield from self.gen_op(':') 

2115 # Body... 

2116 self.level += 1 

2117 yield from self.gen(node.body) 

2118 # Else clause... 

2119 if node.orelse: 

2120 yield from self.gen_name('else') 

2121 yield from self.gen_op(':') 

2122 yield from self.gen(node.orelse) 

2123 self.level -= 1 

2124 #@+node:ekr.20191113063144.65: *6* tog.AsyncWith 

2125 def do_AsyncWith(self, node): 

2126 

2127 async_token_type = 'async' if has_async_tokens else 'name' 

2128 yield from self.gen_token(async_token_type, 'async') 

2129 yield from self.do_With(node) 

2130 #@+node:ekr.20191113063144.66: *6* tog.AugAssign 

2131 # AugAssign(expr target, operator op, expr value) 

2132 

2133 def do_AugAssign(self, node): 

2134 

2135 # %s%s=%s\n' 

2136 op_name_ = op_name(node.op) 

2137 yield from self.gen(node.target) 

2138 yield from self.gen_op(op_name_ + '=') 

2139 yield from self.gen(node.value) 

2140 #@+node:ekr.20191113063144.67: *6* tog.Await 

2141 # Await(expr value) 

2142 

2143 def do_Await(self, node): 

2144 

2145 #'await %s\n' 

2146 async_token_type = 'await' if has_async_tokens else 'name' 

2147 yield from self.gen_token(async_token_type, 'await') 

2148 yield from self.gen(node.value) 

2149 #@+node:ekr.20191113063144.68: *6* tog.Break 

2150 def do_Break(self, node): 

2151 

2152 yield from self.gen_name('break') 

2153 #@+node:ekr.20191113063144.31: *6* tog.Call & helpers 

2154 # Call(expr func, expr* args, keyword* keywords) 

2155 

2156 # Python 3 ast.Call nodes do not have 'starargs' or 'kwargs' fields. 

2157 

2158 def do_Call(self, node): 

2159 

2160 # The calls to gen_op(')') and gen_op('(') do nothing by default. 

2161 # Subclasses might handle them in an overridden tog.set_links. 

2162 yield from self.gen(node.func) 

2163 yield from self.gen_op('(') 

2164 # No need to generate any commas. 

2165 yield from self.handle_call_arguments(node) 

2166 yield from self.gen_op(')') 

2167 #@+node:ekr.20191204114930.1: *7* tog.arg_helper 

2168 def arg_helper(self, node): 

2169 """ 

2170 Yield the node, with a special case for strings. 

2171 """ 

2172 if isinstance(node, str): 

2173 yield from self.gen_token('name', node) 

2174 else: 

2175 yield from self.gen(node) 

2176 #@+node:ekr.20191204105506.1: *7* tog.handle_call_arguments 

2177 def handle_call_arguments(self, node): 

2178 """ 

2179 Generate arguments in the correct order. 

2180 

2181 Call(expr func, expr* args, keyword* keywords) 

2182 

2183 https://docs.python.org/3/reference/expressions.html#calls 

2184 

2185 Warning: This code will fail on Python 3.8 only for calls 

2186 containing kwargs in unexpected places. 

2187 """ 

2188 # *args: in node.args[]: Starred(value=Name(id='args')) 

2189 # *[a, 3]: in node.args[]: Starred(value=List(elts=[Name(id='a'), Num(n=3)]) 

2190 # **kwargs: in node.keywords[]: keyword(arg=None, value=Name(id='kwargs')) 

2191 # 

2192 # Scan args for *name or *List 

2193 args = node.args or [] 

2194 keywords = node.keywords or [] 

2195 

2196 def get_pos(obj): 

2197 line1 = getattr(obj, 'lineno', None) 

2198 col1 = getattr(obj, 'col_offset', None) 

2199 return line1, col1, obj 

2200 

2201 def sort_key(aTuple): 

2202 line, col, obj = aTuple 

2203 return line * 1000 + col 

2204 

2205 if 0: 

2206 g.printObj([ast.dump(z) for z in args], tag='args') 

2207 g.printObj([ast.dump(z) for z in keywords], tag='keywords') 

2208 

2209 if py_version >= (3, 9): 

2210 places = [get_pos(z) for z in args + keywords] 

2211 places.sort(key=sort_key) 

2212 ordered_args = [z[2] for z in places] 

2213 for z in ordered_args: 

2214 if isinstance(z, ast.Starred): 

2215 yield from self.gen_op('*') 

2216 yield from self.gen(z.value) 

2217 elif isinstance(z, ast.keyword): 

2218 if getattr(z, 'arg', None) is None: 

2219 yield from self.gen_op('**') 

2220 yield from self.arg_helper(z.value) 

2221 else: 

2222 yield from self.arg_helper(z.arg) 

2223 yield from self.gen_op('=') 

2224 yield from self.arg_helper(z.value) 

2225 else: 

2226 yield from self.arg_helper(z) 

2227 else: # pragma: no cover 

2228 # 

2229 # Legacy code: May fail for Python 3.8 

2230 # 

2231 # Scan args for *arg and *[...] 

2232 kwarg_arg = star_arg = None 

2233 for z in args: 

2234 if isinstance(z, ast.Starred): 

2235 if isinstance(z.value, ast.Name): # *Name. 

2236 star_arg = z 

2237 args.remove(z) 

2238 break 

2239 elif isinstance(z.value, (ast.List, ast.Tuple)): # *[...] 

2240 # star_list = z 

2241 break 

2242 raise AttributeError(f"Invalid * expression: {ast.dump(z)}") # pragma: no cover 

2243 # Scan keywords for **name. 

2244 for z in keywords: 

2245 if hasattr(z, 'arg') and z.arg is None: 

2246 kwarg_arg = z 

2247 keywords.remove(z) 

2248 break 

2249 # Sync the plain arguments. 

2250 for z in args: 

2251 yield from self.arg_helper(z) 

2252 # Sync the keyword args. 

2253 for z in keywords: 

2254 yield from self.arg_helper(z.arg) 

2255 yield from self.gen_op('=') 

2256 yield from self.arg_helper(z.value) 

2257 # Sync the * arg. 

2258 if star_arg: 

2259 yield from self.arg_helper(star_arg) 

2260 # Sync the ** kwarg. 

2261 if kwarg_arg: 

2262 yield from self.gen_op('**') 

2263 yield from self.gen(kwarg_arg.value) 

2264 #@+node:ekr.20191113063144.69: *6* tog.Continue 

2265 def do_Continue(self, node): 

2266 

2267 yield from self.gen_name('continue') 

2268 #@+node:ekr.20191113063144.70: *6* tog.Delete 

2269 def do_Delete(self, node): 

2270 

2271 # No need to put commas. 

2272 yield from self.gen_name('del') 

2273 yield from self.gen(node.targets) 

2274 #@+node:ekr.20191113063144.71: *6* tog.ExceptHandler 

2275 def do_ExceptHandler(self, node): 

2276 

2277 # Except line... 

2278 yield from self.gen_name('except') 

2279 if getattr(node, 'type', None): 

2280 yield from self.gen(node.type) 

2281 if getattr(node, 'name', None): 

2282 yield from self.gen_name('as') 

2283 yield from self.gen_name(node.name) 

2284 yield from self.gen_op(':') 

2285 # Body... 

2286 self.level += 1 

2287 yield from self.gen(node.body) 

2288 self.level -= 1 

2289 #@+node:ekr.20191113063144.73: *6* tog.For 

2290 def do_For(self, node): 

2291 

2292 # The 'for' line... 

2293 yield from self.gen_name('for') 

2294 yield from self.gen(node.target) 

2295 yield from self.gen_name('in') 

2296 yield from self.gen(node.iter) 

2297 yield from self.gen_op(':') 

2298 # Body... 

2299 self.level += 1 

2300 yield from self.gen(node.body) 

2301 # Else clause... 

2302 if node.orelse: 

2303 yield from self.gen_name('else') 

2304 yield from self.gen_op(':') 

2305 yield from self.gen(node.orelse) 

2306 self.level -= 1 

2307 #@+node:ekr.20191113063144.74: *6* tog.Global 

2308 # Global(identifier* names) 

2309 

2310 def do_Global(self, node): 

2311 

2312 yield from self.gen_name('global') 

2313 for z in node.names: 

2314 yield from self.gen_name(z) 

2315 #@+node:ekr.20191113063144.75: *6* tog.If & helpers 

2316 # If(expr test, stmt* body, stmt* orelse) 

2317 

2318 def do_If(self, node): 

2319 #@+<< do_If docstring >> 

2320 #@+node:ekr.20191122222412.1: *7* << do_If docstring >> 

2321 """ 

2322 The parse trees for the following are identical! 

2323 

2324 if 1:                  if 1: 

2325     pass                   pass 

2326 else:                  elif 2: 

2327     if 2:                  pass 

2328         pass 

2329 

2330 So there is *no* way for the 'if' visitor to disambiguate the above two 

2331 cases from the parse tree alone. 

2332 

2333 Instead, we scan the tokens list for the next 'if', 'else' or 'elif' token. 

2334 """ 

2335 #@-<< do_If docstring >> 
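# A quick check of the docstring's claim (sketch, using only the stdlib ast
# module; not executed here):
#     import ast
#     s1 = "if 1:\n    pass\nelse:\n    if 2:\n        pass\n"
#     s2 = "if 1:\n    pass\nelif 2:\n    pass\n"
#     assert ast.dump(ast.parse(s1)) == ast.dump(ast.parse(s2))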

2336 # Use the next significant token to distinguish between 'if' and 'elif'. 

2337 token = self.find_next_significant_token() 

2338 yield from self.gen_name(token.value) 

2339 yield from self.gen(node.test) 

2340 yield from self.gen_op(':') 

2341 # 

2342 # Body... 

2343 self.level += 1 

2344 yield from self.gen(node.body) 

2345 self.level -= 1 

2346 # 

2347 # Else and elif clauses... 

2348 if node.orelse: 

2349 self.level += 1 

2350 token = self.find_next_significant_token() 

2351 if token.value == 'else': 

2352 yield from self.gen_name('else') 

2353 yield from self.gen_op(':') 

2354 yield from self.gen(node.orelse) 

2355 else: 

2356 yield from self.gen(node.orelse) 

2357 self.level -= 1 

2358 #@+node:ekr.20191113063144.76: *6* tog.Import & helper 

2359 def do_Import(self, node): 

2360 

2361 yield from self.gen_name('import') 

2362 for alias in node.names: 

2363 yield from self.gen_name(alias.name) 

2364 if alias.asname: 

2365 yield from self.gen_name('as') 

2366 yield from self.gen_name(alias.asname) 

2367 #@+node:ekr.20191113063144.77: *6* tog.ImportFrom 

2368 # ImportFrom(identifier? module, alias* names, int? level) 

2369 

2370 def do_ImportFrom(self, node): 

2371 

2372 yield from self.gen_name('from') 

2373 for i in range(node.level): 

2374 yield from self.gen_op('.') 

2375 if node.module: 

2376 yield from self.gen_name(node.module) 

2377 yield from self.gen_name('import') 

2378 # No need to put commas. 

2379 for alias in node.names: 

2380 if alias.name == '*': # #1851. 

2381 yield from self.gen_op('*') 

2382 else: 

2383 yield from self.gen_name(alias.name) 

2384 if alias.asname: 

2385 yield from self.gen_name('as') 

2386 yield from self.gen_name(alias.asname) 

2387 #@+node:ekr.20191113063144.78: *6* tog.Nonlocal 

2388 # Nonlocal(identifier* names) 

2389 

2390 def do_Nonlocal(self, node): 

2391 

2392 # nonlocal %s\n' % ','.join(node.names)) 

2393 # No need to put commas. 

2394 yield from self.gen_name('nonlocal') 

2395 for z in node.names: 

2396 yield from self.gen_name(z) 

2397 #@+node:ekr.20191113063144.79: *6* tog.Pass 

2398 def do_Pass(self, node): 

2399 

2400 yield from self.gen_name('pass') 

2401 #@+node:ekr.20191113063144.81: *6* tog.Raise 

2402 # Raise(expr? exc, expr? cause) 

2403 

2404 def do_Raise(self, node): 

2405 

2406 # No need to put commas. 

2407 yield from self.gen_name('raise') 

2408 exc = getattr(node, 'exc', None) 

2409 cause = getattr(node, 'cause', None) 

2410 tback = getattr(node, 'tback', None) 

2411 yield from self.gen(exc) 

2412 yield from self.gen(cause) 

2413 yield from self.gen(tback) 

2414 #@+node:ekr.20191113063144.82: *6* tog.Return 

2415 def do_Return(self, node): 

2416 

2417 yield from self.gen_name('return') 

2418 yield from self.gen(node.value) 

2419 #@+node:ekr.20191113063144.85: *6* tog.Try 

2420 # Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody) 

2421 

2422 def do_Try(self, node): 

2423 

2424 # Try line... 

2425 yield from self.gen_name('try') 

2426 yield from self.gen_op(':') 

2427 # Body... 

2428 self.level += 1 

2429 yield from self.gen(node.body) 

2430 yield from self.gen(node.handlers) 

2431 # Else... 

2432 if node.orelse: 

2433 yield from self.gen_name('else') 

2434 yield from self.gen_op(':') 

2435 yield from self.gen(node.orelse) 

2436 # Finally... 

2437 if node.finalbody: 

2438 yield from self.gen_name('finally') 

2439 yield from self.gen_op(':') 

2440 yield from self.gen(node.finalbody) 

2441 self.level -= 1 

2442 #@+node:ekr.20191113063144.88: *6* tog.While 

2443 def do_While(self, node): 

2444 

2445 # While line... 

2446 # while %s:\n' 

2447 yield from self.gen_name('while') 

2448 yield from self.gen(node.test) 

2449 yield from self.gen_op(':') 

2450 # Body... 

2451 self.level += 1 

2452 yield from self.gen(node.body) 

2453 # Else clause... 

2454 if node.orelse: 

2455 yield from self.gen_name('else') 

2456 yield from self.gen_op(':') 

2457 yield from self.gen(node.orelse) 

2458 self.level -= 1 

2459 #@+node:ekr.20191113063144.89: *6* tog.With 

2460 # With(withitem* items, stmt* body) 

2461 

2462 # withitem = (expr context_expr, expr? optional_vars) 

2463 

2464 def do_With(self, node): 

2465 

2466 expr: Optional[ast.AST] = getattr(node, 'context_expression', None) 

2467 items: List[ast.AST] = getattr(node, 'items', []) 

2468 yield from self.gen_name('with') 

2469 yield from self.gen(expr) 

2470 # No need to put commas. 

2471 for item in items: 

2472 yield from self.gen(item.context_expr) # type:ignore 

2473 optional_vars = getattr(item, 'optional_vars', None) 

2474 if optional_vars is not None: 

2475 yield from self.gen_name('as') 

2476 yield from self.gen(item.optional_vars) # type:ignore 

2477 # End the line. 

2478 yield from self.gen_op(':') 

2479 # Body... 

2480 self.level += 1 

2481 yield from self.gen(node.body) 

2482 self.level -= 1 

2483 #@+node:ekr.20191113063144.90: *6* tog.Yield 

2484 def do_Yield(self, node): 

2485 

2486 yield from self.gen_name('yield') 

2487 if hasattr(node, 'value'): 

2488 yield from self.gen(node.value) 

2489 #@+node:ekr.20191113063144.91: *6* tog.YieldFrom 

2490 # YieldFrom(expr value) 

2491 

2492 def do_YieldFrom(self, node): 

2493 

2494 yield from self.gen_name('yield') 

2495 yield from self.gen_name('from') 

2496 yield from self.gen(node.value) 

2497 #@-others 

2498#@+node:ekr.20191226195813.1: *3* class TokenOrderTraverser 

2499class TokenOrderTraverser: 

2500 """ 

2501 Traverse an ast tree using the parent/child links created by the 

2502 TokenOrderInjector class. 

2503 """ 

2504 #@+others 

2505 #@+node:ekr.20191226200154.1: *4* TOT.traverse 

2506 def traverse(self, tree): 

2507 """ 

2508 Call visit, in token order, for all nodes in tree. 

2509 

2510 Recursion is not allowed. 

2511 

2512 The code follows p.moveToThreadNext exactly. 

2513 """ 

2514 

2515 def has_next(i, node, stack): 

2516 """Return True if stack[i] is a valid child of node.parent.""" 

2517 # g.trace(node.__class__.__name__, stack) 

2518 parent = node.parent 

2519 return bool(parent and parent.children and i < len(parent.children)) 

2520 

2521 # Update stats 

2522 

2523 self.last_node_index = -1 # For visit 

2524 # The stack contains child indices. 

2525 node, stack = tree, [0] 

2526 seen = set() 

2527 while node and stack: 

2528 # g.trace( 

2529 # f"{node.node_index:>3} " 

2530 # f"{node.__class__.__name__:<12} {stack}") 

2531 # Visit the node. 

2532 assert node.node_index not in seen, node.node_index 

2533 seen.add(node.node_index) 

2534 self.visit(node) 

2535 # if p.v.children: p.moveToFirstChild() 

2536 children: List[ast.AST] = getattr(node, 'children', []) 

2537 if children: 

2538 # Move to the first child. 

2539 stack.append(0) 

2540 node = children[0] 

2541 # g.trace(' child:', node.__class__.__name__, stack) 

2542 continue 

2543 # elif p.hasNext(): p.moveToNext() 

2544 stack[-1] += 1 

2545 i = stack[-1] 

2546 if has_next(i, node, stack): 

2547 node = node.parent.children[i] 

2548 continue 

2549 # else... 

2550 # p.moveToParent() 

2551 node = node.parent 

2552 stack.pop() 

2553 # while p: 

2554 while node and stack: 

2555 # if p.hasNext(): 

2556 stack[-1] += 1 

2557 i = stack[-1] 

2558 if has_next(i, node, stack): 

2559 # Move to the next sibling. 

2560 node = node.parent.children[i] 

2561 break # Found. 

2562 # p.moveToParent() 

2563 node = node.parent 

2564 stack.pop() 

2565 # not found. 

2566 else: 

2567 break # pragma: no cover 

2568 return self.last_node_index 

2569 #@+node:ekr.20191227160547.1: *4* TOT.visit 

2570 def visit(self, node): 

2571 

2572 self.last_node_index += 1 

2573 assert self.last_node_index == node.node_index, ( 

2574 self.last_node_index, node.node_index) 

2575 #@-others 

2576#@+node:ekr.20200107165250.1: *3* class Orange 

2577class Orange: 

2578 """ 

2579 A flexible and powerful beautifier for Python. 

2580 Orange is the new black. 

2581 

2582 *Important*: This is predominantly a *token*-based beautifier. 

2583 However, orange.colon and orange.possible_unary_op use the parse 

2584 tree to provide context that would otherwise be difficult to 

2585 deduce. 

2586 """ 

2587 # This switch is really a comment. It will always be false. 

2588 # It marks the code that simulates the operation of the black tool. 

2589 black_mode = False 

2590 

2591 # Patterns... 

2592 nobeautify_pat = re.compile(r'\s*#\s*pragma:\s*no\s*beautify\b|#\s*@@nobeautify') 

2593 

2594 # Patterns from FastAtRead class, specialized for python delims. 

2595 node_pat = re.compile(r'^(\s*)#@\+node:([^:]+): \*(\d+)?(\*?) (.*)$') # @node 

2596 start_doc_pat = re.compile(r'^\s*#@\+(at|doc)?(\s.*?)?$') # @doc or @ 

2597 at_others_pat = re.compile(r'^(\s*)#@(\+|-)others\b(.*)$') # @others 

2598 

2599 # Doc parts end with @c or a node sentinel. Specialized for python. 

2600 end_doc_pat = re.compile(r"^\s*#@(@(c(ode)?)|([+]node\b.*))$") 

2601 #@+others 

2602 #@+node:ekr.20200107165250.2: *4* orange.ctor 

2603 def __init__(self, settings=None): 

2604 """Ctor for Orange class.""" 

2605 if settings is None: 

2606 settings = {} 

2607 valid_keys = ( 

2608 'allow_joined_strings', 

2609 'max_join_line_length', 

2610 'max_split_line_length', 

2611 'orange', 

2612 'tab_width', 

2613 ) 

2614 # For mypy... 

2615 self.kind: str = '' 

2616 # Default settings... 

2617 self.allow_joined_strings = False # EKR's preference. 

2618 self.max_join_line_length = 88 

2619 self.max_split_line_length = 88 

2620 self.tab_width = 4 

2621 # Override from settings dict... 

2622 for key in settings: # pragma: no cover 

2623 value = settings.get(key) 

2624 if key in valid_keys and value is not None: 

2625 setattr(self, key, value) 

2626 else: 

2627 g.trace(f"Unexpected setting: {key} = {value!r}") 

2628 #@+node:ekr.20200107165250.51: *4* orange.push_state 

2629 def push_state(self, kind, value=None): 

2630 """Append a state to the state stack.""" 

2631 state = ParseState(kind, value) 

2632 self.state_stack.append(state) 

2633 #@+node:ekr.20200107165250.8: *4* orange: Entries 

2634 #@+node:ekr.20200107173542.1: *5* orange.beautify (main token loop) 

2635 def oops(self): 

2636 g.trace(f"Unknown kind: {self.kind}") 

2637 

2638 def beautify(self, contents, filename, tokens, tree, max_join_line_length=None, max_split_line_length=None): 

2639 """ 

2640 The main line. Create output tokens and return the result as a string. 

2641 """ 

2642 # Config overrides 

2643 if max_join_line_length is not None: 

2644 self.max_join_line_length = max_join_line_length 

2645 if max_split_line_length is not None: 

2646 self.max_split_line_length = max_split_line_length 

2647 # State vars... 

2648 self.curly_brackets_level = 0 # Number of unmatched '{' tokens. 

2649 self.decorator_seen = False # Set by do_name for do_op. 

2650 self.in_arg_list = 0 # > 0 if in an arg list of a def. 

2651 self.level = 0 # Set only by do_indent and do_dedent. 

2652 self.lws = '' # Leading whitespace. 

2653 self.paren_level = 0 # Number of unmatched '(' tokens. 

2654 self.square_brackets_stack: List[bool] = [] # A stack of bools, for self.word(). 

2655 self.state_stack: List["ParseState"] = [] # Stack of ParseState objects. 

2656 self.val = None # The input token's value (a string). 

2657 self.verbatim = False # True: don't beautify. 

2658 # 

2659 # Init output list and state... 

2660 self.code_list: List[Token] = [] # The list of output tokens. 

2661 self.code_list_index = 0 # The token's index. 

2662 self.tokens = tokens # The list of input tokens. 

2663 self.tree = tree 

2664 self.add_token('file-start', '') 

2665 self.push_state('file-start') 

2666 for i, token in enumerate(tokens): 

2667 self.token = token 

2668 self.kind, self.val, self.line = token.kind, token.value, token.line 

2669 if self.verbatim: 

2670 self.do_verbatim() 

2671 else: 

2672 func = getattr(self, f"do_{token.kind}", self.oops) 

2673 func() 

2674 # Any post pass would go here. 

2675 return tokens_to_string(self.code_list) 

2676 #@+node:ekr.20200107172450.1: *5* orange.beautify_file (entry) 

2677 def beautify_file(self, filename): # pragma: no cover 

2678 """ 

2679 Orange: Beautify the given external file. 

2680 

2681 Return True if the file was changed. 

2682 """ 

2683 tag = 'beautify-file' 

2684 self.filename = filename 

2685 tog = TokenOrderGenerator() 

2686 contents, encoding, tokens, tree = tog.init_from_file(filename) 

2687 if not contents or not tokens or not tree: 

2688 print(f"{tag}: Can not beautify: {filename}") 

2689 return False 

2690 # Beautify. 

2691 results = self.beautify(contents, filename, tokens, tree) 

2692 # Something besides newlines must change. 

2693 if regularize_nls(contents) == regularize_nls(results): 

2694 print(f"{tag}: Unchanged: {filename}") 

2695 return False 

2696 if 0: # This obscures more important error messages. 

2697 # Show the diffs. 

2698 show_diffs(contents, results, filename=filename) 

2699 # Write the results 

2700 print(f"{tag}: Wrote {filename}") 

2701 write_file(filename, results, encoding=encoding) 

2702 return True 

2703 #@+node:ekr.20200107172512.1: *5* orange.beautify_file_diff (entry) 

2704 def beautify_file_diff(self, filename): # pragma: no cover 

2705 """ 

2706 Orange: Print the diffs that would result from the orange-file command. 

2707 

2708 Return True if the file would be changed. 

2709 """ 

2710 tag = 'diff-beautify-file' 

2711 self.filename = filename 

2712 tog = TokenOrderGenerator() 

2713 contents, encoding, tokens, tree = tog.init_from_file(filename) 

2714 if not contents or not tokens or not tree: 

2715 print(f"{tag}: Can not beautify: {filename}") 

2716 return False 

2717 # Beautify. 

2718 results = self.beautify(contents, filename, tokens, tree) 

2719 # Something besides newlines must change. 

2720 if regularize_nls(contents) == regularize_nls(results): 

2721 print(f"{tag}: Unchanged: {filename}") 

2722 return False 

2723 # Show the diffs. 

2724 show_diffs(contents, results, filename=filename) 

2725 return True 

2726 #@+node:ekr.20200107165250.13: *4* orange: Input token handlers 

2727 #@+node:ekr.20200107165250.14: *5* orange.do_comment 

2728 in_doc_part = False 

2729 

2730 def do_comment(self): 

2731 """Handle a comment token.""" 

2732 val = self.val 

2733 # 

2734 # Leo-specific code... 

2735 if self.node_pat.match(val): 

2736 # Clear per-node state. 

2737 self.in_doc_part = False 

2738 self.verbatim = False 

2739 self.decorator_seen = False 

2740 # Do *not* clear other state, which may persist across @others. 

2741 # self.curly_brackets_level = 0 

2742 # self.in_arg_list = 0 

2743 # self.level = 0 

2744 # self.lws = '' 

2745 # self.paren_level = 0 

2746 # self.square_brackets_stack = [] 

2747 # self.state_stack = [] 

2748 else: 

2749 # Keep track of verbatim mode. 

2750 if self.beautify_pat.match(val): 

2751 self.verbatim = False 

2752 elif self.nobeautify_pat.match(val): 

2753 self.verbatim = True 

2754 # Keep track of @doc parts, to honor the convention for splitting lines. 

2755 if self.start_doc_pat.match(val): 

2756 self.in_doc_part = True 

2757 if self.end_doc_pat.match(val): 

2758 self.in_doc_part = False 

2759 # 

2760 # General code: Generate the comment. 

2761 self.clean('blank') 

2762 entire_line = self.line.lstrip().startswith('#') 

2763 if entire_line: 

2764 self.clean('hard-blank') 

2765 self.clean('line-indent') 

2766 # #1496: No further munging needed. 

2767 val = self.line.rstrip() 

2768 else: 

2769 # Exactly two spaces before trailing comments. 

2770 val = ' ' + self.val.rstrip() 

2771 self.add_token('comment', val) 

2772 #@+node:ekr.20200107165250.15: *5* orange.do_encoding 

2773 def do_encoding(self): 

2774 """ 

2775 Handle the encoding token. 

2776 """ 

2777 pass 

2778 #@+node:ekr.20200107165250.16: *5* orange.do_endmarker 

2779 def do_endmarker(self): 

2780 """Handle an endmarker token.""" 

2781 # Ensure the file ends with exactly one newline. 

2782 self.clean_blank_lines() 

2783 self.add_token('line-end', '\n') 

2784 #@+node:ekr.20200107165250.18: *5* orange.do_indent & do_dedent & helper 

2785 def do_dedent(self): 

2786 """Handle dedent token.""" 

2787 self.level -= 1 

2788 self.lws = self.level * self.tab_width * ' ' 

2789 self.line_indent() 

2790 if self.black_mode: # pragma: no cover (black) 

2791 state = self.state_stack[-1] 

2792 if state.kind == 'indent' and state.value == self.level: 

2793 self.state_stack.pop() 

2794 state = self.state_stack[-1] 

2795 if state.kind in ('class', 'def'): 

2796 self.state_stack.pop() 

2797 self.handle_dedent_after_class_or_def(state.kind) 

2798 

2799 def do_indent(self): 

2800 """Handle indent token.""" 

2801 new_indent = self.val 

2802 old_indent = self.level * self.tab_width * ' ' 

2803 if new_indent > old_indent: 

2804 self.level += 1 

2805 elif new_indent < old_indent: # pragma: no cover (defensive) 

2806 g.trace('\n===== can not happen', repr(new_indent), repr(old_indent)) 

2807 self.lws = new_indent 

2808 self.line_indent() 

2809 #@+node:ekr.20200220054928.1: *6* orange.handle_dedent_after_class_or_def 

2810 def handle_dedent_after_class_or_def(self, kind): # pragma: no cover (black) 

2811 """ 

2812 Insert blank lines after a class or def as the result of a 'dedent' token. 

2813 

2814 Normal comment lines may precede the 'dedent'. 

2815 Insert the blank lines *before* such comment lines. 

2816 """ 

2817 # 

2818 # Compute the tail. 

2819 i = len(self.code_list) - 1 

2820 tail: List[Token] = [] 

2821 while i > 0: 

2822 t = self.code_list.pop() 

2823 i -= 1 

2824 if t.kind == 'line-indent': 

2825 pass 

2826 elif t.kind == 'line-end': 

2827 tail.insert(0, t) 

2828 elif t.kind == 'comment': 

2829 # Only underindented single-line comments belong in the tail. 

2830 # @+node comments must never be in the tail. 

2831 single_line = self.code_list[i].kind in ('line-end', 'line-indent') 

2832 lws = len(t.value) - len(t.value.lstrip()) 

2833 underindent = lws <= len(self.lws) 

2834 if underindent and single_line and not self.node_pat.match(t.value): 

2835 # A single-line comment. 

2836 tail.insert(0, t) 

2837 else: 

2838 self.code_list.append(t) 

2839 break 

2840 else: 

2841 self.code_list.append(t) 

2842 break 

2843 # 

2844 # Remove leading 'line-end' tokens from the tail. 

2845 while tail and tail[0].kind == 'line-end': 

2846 tail = tail[1:] 

2847 # 

2848 # Put the newlines *before* the tail. 

2849 # For Leo, always use 1 blank line. 

2850 n = 1 # n = 2 if kind == 'class' else 1 

2851 # Retain the token (intention) for debugging. 

2852 self.add_token('blank-lines', n) 

2853 for i in range(0, n + 1): 

2854 self.add_token('line-end', '\n') 

2855 if tail: 

2856 self.code_list.extend(tail) 

2857 self.line_indent() 

2858 #@+node:ekr.20200107165250.20: *5* orange.do_name 

2859 def do_name(self): 

2860 """Handle a name token.""" 

2861 name = self.val 

2862 if self.black_mode and name in ('class', 'def'): # pragma: no cover (black) 

2863 # Handle newlines before and after 'class' or 'def' 

2864 self.decorator_seen = False 

2865 state = self.state_stack[-1] 

2866 if state.kind == 'decorator': 

2867 # Always do this, regardless of @bool clean-blank-lines. 

2868 self.clean_blank_lines() 

2869 # Suppress split/join. 

2870 self.add_token('hard-newline', '\n') 

2871 self.add_token('line-indent', self.lws) 

2872 self.state_stack.pop() 

2873 else: 

2874 # Always do this, regardless of @bool clean-blank-lines. 

2875 self.blank_lines(2 if name == 'class' else 1) 

2876 self.push_state(name) 

2877 self.push_state('indent', self.level) 

2878 # For trailing lines after inner classes/defs. 

2879 self.word(name) 

2880 return 

2881 # 

2882 # Leo mode... 

2883 if name in ('class', 'def'): 

2884 self.word(name) 

2885 elif name in ( 

2886 'and', 'elif', 'else', 'for', 'if', 'in', 'not', 'not in', 'or', 'while' 

2887 ): 

2888 self.word_op(name) 

2889 else: 

2890 self.word(name) 

2891 #@+node:ekr.20200107165250.21: *5* orange.do_newline & do_nl 

2892 def do_newline(self): 

2893 """Handle a regular newline.""" 

2894 self.line_end() 

2895 

2896 def do_nl(self): 

2897 """Handle a continuation line.""" 

2898 self.line_end() 

2899 #@+node:ekr.20200107165250.22: *5* orange.do_number 

2900 def do_number(self): 

2901 """Handle a number token.""" 

2902 self.blank() 

2903 self.add_token('number', self.val) 

2904 #@+node:ekr.20200107165250.23: *5* orange.do_op 

2905 def do_op(self): 

2906 """Handle an op token.""" 

2907 val = self.val 

2908 if val == '.': 

2909 self.clean('blank') 

2910 self.add_token('op-no-blanks', val) 

2911 elif val == '@': 

2912 if self.black_mode: # pragma: no cover (black) 

2913 if not self.decorator_seen: 

2914 self.blank_lines(1) 

2915 self.decorator_seen = True 

2916 self.clean('blank') 

2917 self.add_token('op-no-blanks', val) 

2918 self.push_state('decorator') 

2919 elif val == ':': 

2920 # Treat slices differently. 

2921 self.colon(val) 

2922 elif val in ',;': 

2923 # Pep 8: Avoid extraneous whitespace immediately before 

2924 # comma, semicolon, or colon. 

2925 self.clean('blank') 

2926 self.add_token('op', val) 

2927 self.blank() 

2928 elif val in '([{': 

2929 # Pep 8: Avoid extraneous whitespace immediately inside 

2930 # parentheses, brackets or braces. 

2931 self.lt(val) 

2932 elif val in ')]}': 

2933 # Ditto. 

2934 self.rt(val) 

2935 elif val == '=': 

2936 # Pep 8: Don't use spaces around the = sign when used to indicate 

2937 # a keyword argument or a default parameter value. 

2938 if self.paren_level: 

2939 self.clean('blank') 

2940 self.add_token('op-no-blanks', val) 

2941 else: 

2942 self.blank() 

2943 self.add_token('op', val) 

2944 self.blank() 
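# Illustrative effect (sketch, not a test): inside parentheses 'f(x = 1)' is
# emitted as 'f(x=1)', while a top-level 'x = 1' keeps the surrounding blanks.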

2945 elif val in '~+-': 

2946 self.possible_unary_op(val) 

2947 elif val == '*': 

2948 self.star_op() 

2949 elif val == '**': 

2950 self.star_star_op() 

2951 else: 

2952 # Pep 8: always surround binary operators with a single space. 

2953 # '==','+=','-=','*=','**=','/=','//=','%=','!=','<=','>=','<','>', 

2954 # '^','~','*','**','&','|','/','//', 

2955 # Pep 8: If operators with different priorities are used, 

2956 # consider adding whitespace around the operators with the lowest priority(ies). 

2957 self.blank() 

2958 self.add_token('op', val) 

2959 self.blank() 

2960 #@+node:ekr.20200107165250.24: *5* orange.do_string 

2961 def do_string(self): 

2962 """Handle a 'string' token.""" 

2963 # Careful: continued strings may contain '\r' 

2964 val = regularize_nls(self.val) 

2965 self.add_token('string', val) 

2966 self.blank() 

2967 #@+node:ekr.20200210175117.1: *5* orange.do_verbatim 

2968 beautify_pat = re.compile( 

2969 r'#\s*pragma:\s*beautify\b|#\s*@@beautify|#\s*@\+node|#\s*@[+-]others|#\s*@[+-]<<') 

2970 

2971 def do_verbatim(self): 

2972 """ 

2973 Handle one token in verbatim mode. 

2974 End verbatim mode when the appropriate comment is seen. 

2975 """ 

2976 kind = self.kind 

2977 # 

2978 # Careful: tokens may contain '\r' 

2979 val = regularize_nls(self.val) 

2980 if kind == 'comment': 

2981 if self.beautify_pat.match(val): 

2982 self.verbatim = False 

2983 val = val.rstrip() 

2984 self.add_token('comment', val) 

2985 return 

2986 if kind == 'indent': 

2987 self.level += 1 

2988 self.lws = self.level * self.tab_width * ' ' 

2989 if kind == 'dedent': 

2990 self.level -= 1 

2991 self.lws = self.level * self.tab_width * ' ' 

2992 self.add_token('verbatim', val) 

2993 #@+node:ekr.20200107165250.25: *5* orange.do_ws 

2994 def do_ws(self): 

2995 """ 

2996 Handle the "ws" pseudo-token. 

2997 

2998 Put the whitespace only if it ends with backslash-newline. 

2999 """ 

3000 val = self.val 

3001 # Handle backslash-newline. 

3002 if '\\\n' in val: 

3003 self.clean('blank') 

3004 self.add_token('op-no-blanks', val) 

3005 return 

3006 # Handle start-of-line whitespace. 

3007 prev = self.code_list[-1] 

3008 inner = self.paren_level or self.square_brackets_stack or self.curly_brackets_level 

3009 if prev.kind == 'line-indent' and inner: 

3010 # Retain the indent that won't be cleaned away. 

3011 self.clean('line-indent') 

3012 self.add_token('hard-blank', val) 

3013 #@+node:ekr.20200107165250.26: *4* orange: Output token generators 

3014 #@+node:ekr.20200118145044.1: *5* orange.add_line_end 

3015 def add_line_end(self): 

3016 """Add a line-end request to the code list.""" 

3017 # This may be called from do_name as well as do_newline and do_nl. 

3018 assert self.token.kind in ('newline', 'nl'), self.token.kind 

3019 self.clean('blank') # Important! 

3020 self.clean('line-indent') 

3021 t = self.add_token('line-end', '\n') 

3022 # Distinguish between kinds of 'line-end' tokens. 

3023 t.newline_kind = self.token.kind 

3024 return t 

3025 #@+node:ekr.20200107170523.1: *5* orange.add_token 

3026 def add_token(self, kind, value): 

3027 """Add an output token to the code list.""" 

3028 tok = Token(kind, value) 

3029 tok.index = self.code_list_index # For debugging only. 

3030 self.code_list_index += 1 

3031 self.code_list.append(tok) 

3032 return tok 

3033 #@+node:ekr.20200107165250.27: *5* orange.blank 

3034 def blank(self): 

3035 """Add a blank request to the code list.""" 

3036 prev = self.code_list[-1] 

3037 if prev.kind not in ( 

3038 'blank', 

3039 'blank-lines', 

3040 'file-start', 

3041 'hard-blank', # Unique to orange. 

3042 'line-end', 

3043 'line-indent', 

3044 'lt', 

3045 'op-no-blanks', 

3046 'unary-op', 

3047 ): 

3048 self.add_token('blank', ' ') 

3049 #@+node:ekr.20200107165250.29: *5* orange.blank_lines (black only) 

3050 def blank_lines(self, n): # pragma: no cover (black) 

3051 """ 

3052 Add a request for n blank lines to the code list. 

3053 Multiple blank-lines requests yield at least the maximum of all requests. 

3054 """ 

3055 self.clean_blank_lines() 

3056 prev = self.code_list[-1] 

3057 if prev.kind == 'file-start': 

3058 self.add_token('blank-lines', n) 

3059 return 

3060 for i in range(0, n + 1): 

3061 self.add_token('line-end', '\n') 

3062 # Retain the token (intention) for debugging. 

3063 self.add_token('blank-lines', n) 

3064 self.line_indent() 

3065 #@+node:ekr.20200107165250.30: *5* orange.clean 

3066 def clean(self, kind): 

3067 """Remove the last item of token list if it has the given kind.""" 

3068 prev = self.code_list[-1] 

3069 if prev.kind == kind: 

3070 self.code_list.pop() 

3071 #@+node:ekr.20200107165250.31: *5* orange.clean_blank_lines 

3072 def clean_blank_lines(self): 

3073 """ 

3074 Remove all vestiges of previous blank lines. 

3075 

3076 Return True if any of the cleaned 'line-end' tokens represented "hard" newlines. 

3077 """ 

3078 cleaned_newline = False 

3079 table = ('blank-lines', 'line-end', 'line-indent') 

3080 while self.code_list[-1].kind in table: 

3081 t = self.code_list.pop() 

3082 if t.kind == 'line-end' and getattr(t, 'newline_kind', None) != 'nl': 

3083 cleaned_newline = True 

3084 return cleaned_newline 

3085 #@+node:ekr.20200107165250.32: *5* orange.colon 

3086 def colon(self, val): 

3087 """Handle a colon.""" 

3088 

3089 def is_expr(node): 

3090 """True if node is any expression other than += number.""" 

3091 if isinstance(node, (ast.BinOp, ast.Call, ast.IfExp)): 

3092 return True 

3093 return isinstance( 

3094 node, ast.UnaryOp) and not isinstance(node.operand, ast.Num) 

3095 
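# Sketch of the intended effect (illustrative examples, not tests):
#     a[1:2]       -> the colon stays tight ('op-no-blanks')
#     a[x + 1 : 2] -> blanks around the colon, because x + 1 is a BinOp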

3096 node = self.token.node 

3097 self.clean('blank') 

3098 if not isinstance(node, ast.Slice): 

3099 self.add_token('op', val) 

3100 self.blank() 

3101 return 

3102 # A slice. 

3103 lower = getattr(node, 'lower', None) 

3104 upper = getattr(node, 'upper', None) 

3105 step = getattr(node, 'step', None) 

3106 if any(is_expr(z) for z in (lower, upper, step)): 

3107 prev = self.code_list[-1] 

3108 if prev.value not in '[:': 

3109 self.blank() 

3110 self.add_token('op', val) 

3111 self.blank() 

3112 else: 

3113 self.add_token('op-no-blanks', val) 

3114 #@+node:ekr.20200107165250.33: *5* orange.line_end 

3115 def line_end(self): 

3116 """Add a line-end request to the code list.""" 

3117 # This should be called only by do_newline and do_nl. 

3118 node, token = self.token.statement_node, self.token 

3119 assert token.kind in ('newline', 'nl'), (token.kind, g.callers()) 

3120 # Create the 'line-end' output token. 

3121 self.add_line_end() 

3122 # Attempt to split the line. 

3123 was_split = self.split_line(node, token) 

3124 # Attempt to join the line only if it has not just been split. 

3125 if not was_split and self.max_join_line_length > 0: 

3126 self.join_lines(node, token) 

3127 self.line_indent() 

3128 # Add the indentation for all lines 

3129 # until the next indent or dedent token. 

3130 #@+node:ekr.20200107165250.40: *5* orange.line_indent 

3131 def line_indent(self): 

3132 """Add a line-indent token.""" 

3133 self.clean('line-indent') 

3134 # Defensive. Should never happen. 

3135 self.add_token('line-indent', self.lws) 

3136 #@+node:ekr.20200107165250.41: *5* orange.lt & rt 

3137 #@+node:ekr.20200107165250.42: *6* orange.lt 

3138 def lt(self, val): 

3139 """Generate code for a left paren or curly/square bracket.""" 

3140 assert val in '([{', repr(val) 

3141 if val == '(': 

3142 self.paren_level += 1 

3143 elif val == '[': 

3144 self.square_brackets_stack.append(False) 

3145 else: 

3146 self.curly_brackets_level += 1 

3147 self.clean('blank') 

3148 prev = self.code_list[-1] 

3149 if prev.kind in ('op', 'word-op'): 

3150 self.blank() 

3151 self.add_token('lt', val) 

3152 elif prev.kind == 'word': 

3153 # Only suppress blanks before '(' or '[' for non-keywords. 

3154 if val == '{' or prev.value in ('if', 'else', 'return', 'for'): 

3155 self.blank() 

3156 elif val == '(': 

3157 self.in_arg_list += 1 

3158 self.add_token('lt', val) 

3159 else: 

3160 self.clean('blank') 

3161 self.add_token('op-no-blanks', val) 

3162 #@+node:ekr.20200107165250.43: *6* orange.rt 

3163 def rt(self, val): 

3164 """Generate code for a right paren or curly/square bracket.""" 

3165 assert val in ')]}', repr(val) 

3166 if val == ')': 

3167 self.paren_level -= 1 

3168 self.in_arg_list = max(0, self.in_arg_list - 1) 

3169 elif val == ']': 

3170 self.square_brackets_stack.pop() 

3171 else: 

3172 self.curly_brackets_level -= 1 

3173 self.clean('blank') 

3174 self.add_token('rt', val) 
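# Illustration (not part of leoAst.py): the bracket conventions above in action.
#
#   print(x)         # no blank between an ordinary name and '('
#   return (a, b)    # a blank is kept after keywords such as 'return'
#   if (a and b):    # likewise after 'if', 'else' and 'for'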

3175 #@+node:ekr.20200107165250.45: *5* orange.possible_unary_op & unary_op 

3176 def possible_unary_op(self, s): 

3177 """Add a unary or binary op to the token list.""" 

3178 node = self.token.node 

3179 self.clean('blank') 

3180 if isinstance(node, ast.UnaryOp): 

3181 self.unary_op(s) 

3182 else: 

3183 self.blank() 

3184 self.add_token('op', s) 

3185 self.blank() 

3186 

3187 def unary_op(self, s): 

3188 """Add an operator request to the code list.""" 

3189 assert s and isinstance(s, str), repr(s) 

3190 self.clean('blank') 

3191 prev = self.code_list[-1] 

3192 if prev.kind == 'lt': 

3193 self.add_token('unary-op', s) 

3194 else: 

3195 self.blank() 

3196 self.add_token('unary-op', s) 

3197 #@+node:ekr.20200107165250.46: *5* orange.star_op 

3198 def star_op(self): 

3199 """Put a '*' op, with special cases for *args.""" 

3200 val = '*' 

3201 self.clean('blank') 

3202 if self.paren_level > 0: 

3203 prev = self.code_list[-1] 

3204 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','): 

3205 self.blank() 

3206 self.add_token('op', val) 

3207 return 

3208 self.blank() 

3209 self.add_token('op', val) 

3210 self.blank() 

3211 #@+node:ekr.20200107165250.47: *5* orange.star_star_op 

3212 def star_star_op(self): 

3213 """Put a ** operator, with a special case for **kwargs.""" 

3214 val = '**' 

3215 self.clean('blank') 

3216 if self.paren_level > 0: 

3217 prev = self.code_list[-1] 

3218 if prev.kind == 'lt' or (prev.kind, prev.value) == ('op', ','): 

3219 self.blank() 

3220 self.add_token('op', val) 

3221 return 

3222 self.blank() 

3223 self.add_token('op', val) 

3224 self.blank() 
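# Illustration (not part of leoAst.py): how the two methods above treat '*' and
# '**'. Inside parentheses, after '(' or ',', the operator hugs its operand
# (the *args / **kwargs case); elsewhere it is spaced like a binary operator.
#
#   f(*args, **kwargs)    # no blank after '*' or '**'
#   x = a * b ** c        # blanks around binary '*' and '**'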

3225 #@+node:ekr.20200107165250.48: *5* orange.word & word_op 

3226 def word(self, s): 

3227 """Add a word request to the code list.""" 

3228 assert s and isinstance(s, str), repr(s) 

3229 if self.square_brackets_stack: 

3230 # A previous 'op-no-blanks' token may cancel this blank. 

3231 self.blank() 

3232 self.add_token('word', s) 

3233 elif self.in_arg_list > 0: 

3234 self.add_token('word', s) 

3235 self.blank() 

3236 else: 

3237 self.blank() 

3238 self.add_token('word', s) 

3239 self.blank() 

3240 

3241 def word_op(self, s): 

3242 """Add a word-op request to the code list.""" 

3243 assert s and isinstance(s, str), repr(s) 

3244 self.blank() 

3245 self.add_token('word-op', s) 

3246 self.blank() 

3247 #@+node:ekr.20200118120049.1: *4* orange: Split/join 

3248 #@+node:ekr.20200107165250.34: *5* orange.split_line & helpers 

3249 def split_line(self, node, token): 

3250 """ 

3251 Split token's line, if possible and enabled. 

3252 

3253 Return True if the line was broken into two or more lines. 

3254 """ 

3255 assert token.kind in ('newline', 'nl'), repr(token) 

3256 # Return if splitting is disabled: 

3257 if self.max_split_line_length <= 0: # pragma: no cover (user option) 

3258 return False 

3259 # Return if the node can't be split. 

3260 if not is_long_statement(node): 

3261 return False 

3262 # Find the *output* tokens of the previous lines. 

3263 line_tokens = self.find_prev_line() 

3264 line_s = ''.join([z.to_string() for z in line_tokens]) 

3265 # Do nothing for short lines. 

3266 if len(line_s) < self.max_split_line_length: 

3267 return False 

3268 # Return if the previous line has no opening delim: (, [ or {. 

3269 if not any(z.kind == 'lt' for z in line_tokens): # pragma: no cover (defensive) 

3270 return False 

3271 prefix = self.find_line_prefix(line_tokens) 

3272 # Calculate the tail before cleaning the prefix. 

3273 tail = line_tokens[len(prefix) :] 

3274 # Cut back the token list: subtract 1 for the trailing line-end. 

3275 self.code_list = self.code_list[: len(self.code_list) - len(line_tokens) - 1] 

3276 # Append the tail, splitting it further, as needed. 

3277 self.append_tail(prefix, tail) 

3278 # Add back the line-end token removed when the code list was cut back above. 

3279 self.add_token('line-end', '\n') 

3280 return True 
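# Illustration (not part of leoAst.py): the kind of rewrite split_line and
# append_tail aim for when a line exceeds max_split_line_length and its tail
# must also be split at commas.
#
#   before:
#       result = some_function(first_argument, second_argument, third_argument)
#   after:
#       result = some_function(
#           first_argument,
#           second_argument,
#           third_argument,
#       )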

3281 #@+node:ekr.20200107165250.35: *6* orange.append_tail 

3282 def append_tail(self, prefix, tail): 

3283 """Append the tail tokens, splitting the line further as necessary.""" 

3284 tail_s = ''.join([z.to_string() for z in tail]) 

3285 if len(tail_s) < self.max_split_line_length: 

3286 # Add the prefix. 

3287 self.code_list.extend(prefix) 

3288 # Start a new line and increase the indentation. 

3289 self.add_token('line-end', '\n') 

3290 self.add_token('line-indent', self.lws + ' ' * 4) 

3291 self.code_list.extend(tail) 

3292 return 

3293 # Still too long. Split the line at commas. 

3294 self.code_list.extend(prefix) 

3295 # Start a new line and increase the indentation. 

3296 self.add_token('line-end', '\n') 

3297 self.add_token('line-indent', self.lws + ' ' * 4) 

3298 open_delim = Token(kind='lt', value=prefix[-1].value) 

3299 value = open_delim.value.replace('(', ')').replace('[', ']').replace('{', '}') 

3300 close_delim = Token(kind='rt', value=value) 

3301 delim_count = 1 

3302 lws = self.lws + ' ' * 4 

3303 for i, t in enumerate(tail): 

3304 if t.kind == 'op' and t.value == ',': 

3305 if delim_count == 1: 

3306 # Start a new line. 

3307 self.add_token('op-no-blanks', ',') 

3308 self.add_token('line-end', '\n') 

3309 self.add_token('line-indent', lws) 

3310 # Kill a following blank. 

3311 if i + 1 < len(tail): 

3312 next_t = tail[i + 1] 

3313 if next_t.kind == 'blank': 

3314 next_t.kind = 'no-op' 

3315 next_t.value = '' 

3316 else: 

3317 self.code_list.append(t) 

3318 elif t.kind == close_delim.kind and t.value == close_delim.value: 

3319 # Done if the delims match. 

3320 delim_count -= 1 

3321 if delim_count == 0: 

3322 # Start a new line 

3323 self.add_token('op-no-blanks', ',') 

3324 self.add_token('line-end', '\n') 

3325 self.add_token('line-indent', self.lws) 

3326 self.code_list.extend(tail[i:]) 

3327 return 

3328 lws = lws[:-4] 

3329 self.code_list.append(t) 

3330 elif t.kind == open_delim.kind and t.value == open_delim.value: 

3331 delim_count += 1 

3332 lws = lws + ' ' * 4 

3333 self.code_list.append(t) 

3334 else: 

3335 self.code_list.append(t) 

3336 g.trace('BAD DELIMS', delim_count) 

3337 #@+node:ekr.20200107165250.36: *6* orange.find_prev_line 

3338 def find_prev_line(self): 

3339 """Return the previous line, as a list of tokens.""" 

3340 line = [] 

3341 for t in reversed(self.code_list[:-1]): 

3342 if t.kind in ('hard-newline', 'line-end'): 

3343 break 

3344 line.append(t) 

3345 return list(reversed(line)) 

3346 #@+node:ekr.20200107165250.37: *6* orange.find_line_prefix 

3347 def find_line_prefix(self, token_list): 

3348 """ 

3349 Return all tokens up to and including the first lt token. 

3350 Also add all lt tokens directly following the first lt token. 

3351 """ 

3352 result = [] 

3353 for i, t in enumerate(token_list): 

3354 result.append(t) 

3355 if t.kind == 'lt': 

3356 break 

3357 return result 

3358 #@+node:ekr.20200107165250.39: *5* orange.join_lines 

3359 def join_lines(self, node, token): 

3360 """ 

3361 Join preceding lines, if possible and enabled. 

3362 token is a line_end token. node is the corresponding ast node. 

3363 """ 

3364 if self.max_join_line_length <= 0: # pragma: no cover (user option) 

3365 return 

3366 assert token.kind in ('newline', 'nl'), repr(token) 

3367 if token.kind == 'nl': 

3368 return 

3369 # Scan backward in the *code* list, 

3370 # looking for 'line-end' tokens with tok.newline_kind == 'nl' 

3371 nls = 0 

3372 i = len(self.code_list) - 1 

3373 t = self.code_list[i] 

3374 assert t.kind == 'line-end', repr(t) 

3375 # Not all tokens have a newline_kind ivar. 

3376 assert t.newline_kind == 'newline' # type:ignore 

3377 i -= 1 

3378 while i >= 0: 

3379 t = self.code_list[i] 

3380 if t.kind == 'comment': 

3381 # Can't join. 

3382 return 

3383 if t.kind == 'string' and not self.allow_joined_strings: 

3384 # An EKR preference: don't join strings, no matter what black does. 

3385 # This allows "short" f-strings to be aligned. 

3386 return 

3387 if t.kind == 'line-end': 

3388 if getattr(t, 'newline_kind', None) == 'nl': 

3389 nls += 1 

3390 else: 

3391 break # pragma: no cover 

3392 i -= 1 

3393 # Retain the file-start token. 

3394 if i <= 0: 

3395 i = 1 

3396 if nls <= 0: # pragma: no cover (rare) 

3397 return 

3398 # Retain the line-end and any following line-indent. 

3399 # Required, so that the regex below won't eat too much. 

3400 while True: 

3401 t = self.code_list[i] 

3402 if t.kind == 'line-end': 

3403 if getattr(t, 'newline_kind', None) == 'nl': # pragma: no cover (rare) 

3404 nls -= 1 

3405 i += 1 

3406 elif self.code_list[i].kind == 'line-indent': 

3407 i += 1 

3408 else: 

3409 break # pragma: no cover (defensive) 

3410 if nls <= 0: # pragma: no cover (defensive) 

3411 return 

3412 # Calculate the joined line. 

3413 tail = self.code_list[i:] 

3414 tail_s = tokens_to_string(tail) 

3415 tail_s = re.sub(r'\n\s*', ' ', tail_s) 

3416 tail_s = tail_s.replace('( ', '(').replace(' )', ')') 

3417 tail_s = tail_s.rstrip() 

3418 # Don't join the lines if they would be too long. 

3419 if len(tail_s) > self.max_join_line_length: # pragma: no cover (defensive) 

3420 return 

3421 # Cut back the code list. 

3422 self.code_list = self.code_list[:i] 

3423 # Add the new output tokens. 

3424 self.add_token('string', tail_s) 

3425 self.add_token('line-end', '\n') 
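# Illustration (not part of leoAst.py): join_lines collapses "soft" newlines
# ('nl' tokens inside brackets) when the joined line would stay within
# max_join_line_length.
#
#   before:
#       f(
#           a,
#           b)
#   after:
#       f(a, b)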

3426 #@-others 

3427#@+node:ekr.20200107170847.1: *3* class OrangeSettings 

3428class OrangeSettings: 

3429 

3430 pass 

3431#@+node:ekr.20200107170126.1: *3* class ParseState 

3432class ParseState: 

3433 """ 

3434 A class representing items in the parse state stack. 

3435 

3436 The present states: 

3437 

3438 'file-start': Ensures the stack is never empty. 

3439 

3440 'decorator': The last '@' was a decorator. 

3441 

3442 do_op(): push_state('decorator') 

3443 do_name(): pops the stack if state.kind == 'decorator'. 

3444 

3445 'indent': The indentation level for 'class' and 'def' names. 

3446 

3447 do_name(): push_state('indent', self.level) 

3448 do_dedent(): pops the stack once or twice if state.value == self.level. 

3449 

3450 """ 

3451 

3452 def __init__(self, kind, value): 

3453 self.kind = kind 

3454 self.value = value 

3455 

3456 def __repr__(self): 

3457 return f"State: {self.kind} {self.value!r}" 

3458 

3459 __str__ = __repr__ 
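# Illustration (not part of leoAst.py): a minimal sketch, using a plain Python
# list as the stack, of the push/pop pattern described in the docstring above.
# The None values are placeholders chosen for the sketch.
#
#   stack = [ParseState('file-start', None)]      # the stack is never empty
#   stack.append(ParseState('decorator', None))   # do_op() saw '@'
#   if stack[-1].kind == 'decorator':
#       stack.pop()                               # do_name() pops it again
#   stack.append(ParseState('indent', 0))         # do_name() saw 'class' or 'def'
#   if stack[-1].kind == 'indent' and stack[-1].value == 0:
#       stack.pop()                               # do_dedent() unwinds it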

3460#@+node:ekr.20200122033203.1: ** TOT classes... 

3461#@+node:ekr.20191222083453.1: *3* class Fstringify (TOT) 

3462class Fstringify(TokenOrderTraverser): 

3463 """A class to fstringify files.""" 

3464 

3465 silent = True # for pytest. Defined in all entries. 

3466 line_number = 0 

3467 line = '' 

3468 

3469 #@+others 

3470 #@+node:ekr.20191222083947.1: *4* fs.fstringify 

3471 def fstringify(self, contents, filename, tokens, tree): 

3472 """ 

3473 Fstringify.fstringify: 

3474 

3475 f-stringify the sources given by (tokens, tree). 

3476 

3477 Return the resulting string. 

3478 """ 

3479 self.filename = filename 

3480 self.tokens = tokens 

3481 self.tree = tree 

3482 # Prepass: reassign tokens. 

3483 ReassignTokens().reassign(filename, tokens, tree) 

3484 # Main pass. 

3485 self.traverse(self.tree) 

3486 results = tokens_to_string(self.tokens) 

3487 return results 
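# Illustration (not part of leoAst.py): a minimal usage sketch, mirroring
# fstringify_file below. 'my_module.py' is a hypothetical path.
#
#   tog = TokenOrderGenerator()
#   contents, encoding, tokens, tree = tog.init_from_file('my_module.py')
#   results = Fstringify().fstringify(contents, 'my_module.py', tokens, tree)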

3488 #@+node:ekr.20200103054101.1: *4* fs.fstringify_file (entry) 

3489 def fstringify_file(self, filename): # pragma: no cover 

3490 """ 

3491 Fstringify.fstringify_file. 

3492 

3493 The entry point for the fstringify-file command. 

3494 

3495 f-stringify the given external file with the Fstringify class. 

3496 

3497 Return True if the file was changed. 

3498 """ 

3499 tag = 'fstringify-file' 

3500 self.filename = filename 

3501 self.silent = False 

3502 tog = TokenOrderGenerator() 

3503 try: 

3504 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3505 if not contents or not tokens or not tree: 

3506 print(f"{tag}: Can not fstringify: {filename}") 

3507 return False 

3508 results = self.fstringify(contents, filename, tokens, tree) 

3509 except Exception as e: 

3510 print(e) 

3511 return False 

3512 # Something besides newlines must change. 

3513 changed = regularize_nls(contents) != regularize_nls(results) 

3514 status = 'Wrote' if changed else 'Unchanged' 

3515 print(f"{tag}: {status:>9}: {filename}") 

3516 if changed: 

3517 write_file(filename, results, encoding=encoding) 

3518 return changed 

3519 #@+node:ekr.20200103065728.1: *4* fs.fstringify_file_diff (entry) 

3520 def fstringify_file_diff(self, filename): # pragma: no cover 

3521 """ 

3522 Fstringify.fstringify_file_diff. 

3523 

3524 The entry point for the diff-fstringify-file command. 

3525 

3526 Print the diffs that would result from the fstringify-file command. 

3527 

3528 Return True if the file would be changed. 

3529 """ 

3530 tag = 'diff-fstringify-file' 

3531 self.filename = filename 

3532 self.silent = False 

3533 tog = TokenOrderGenerator() 

3534 try: 

3535 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3536 if not contents or not tokens or not tree: 

3537 return False 

3538 results = self.fstringify(contents, filename, tokens, tree) 

3539 except Exception as e: 

3540 print(e) 

3541 return False 

3542 # Something besides newlines must change. 

3543 changed = regularize_nls(contents) != regularize_nls(results) 

3544 if changed: 

3545 show_diffs(contents, results, filename=filename) 

3546 else: 

3547 print(f"{tag}: Unchanged: {filename}") 

3548 return changed 

3549 #@+node:ekr.20200112060218.1: *4* fs.fstringify_file_silent (entry) 

3550 def fstringify_file_silent(self, filename): # pragma: no cover 

3551 """ 

3552 Fstringify.fstringify_file_silent. 

3553 

3554 The entry point for the silent-fstringify-file command. 

3555 

3556 fstringify the given file, suppressing all but serious error messages. 

3557 

3558 Return True if the file was changed. 

3559 """ 

3560 self.filename = filename 

3561 self.silent = True 

3562 tog = TokenOrderGenerator() 

3563 try: 

3564 contents, encoding, tokens, tree = tog.init_from_file(filename) 

3565 if not contents or not tokens or not tree: 

3566 return False 

3567 results = self.fstringify(contents, filename, tokens, tree) 

3568 except Exception as e: 

3569 print(e) 

3570 return False 

3571 # Something besides newlines must change. 

3572 changed = regularize_nls(contents) != regularize_nls(results) 

3573 status = 'Wrote' if changed else 'Unchanged' 

3574 # Write the results. 

3575 print(f"{status:>9}: {filename}") 

3576 if changed: 

3577 write_file(filename, results, encoding=encoding) 

3578 return changed 

3579 #@+node:ekr.20191222095754.1: *4* fs.make_fstring & helpers 

3580 def make_fstring(self, node): 

3581 """ 

3582 node is a BinOp node representing a '%' operator. 

3583 node.left is an ast.Str node. 

3584 node.right represents the RHS of the '%' operator. 

3585 

3586 Convert this tree to an f-string, if possible. 

3587 Replace the node's entire tree with a new ast.Str node. 

3588 Replace all the relevant tokens with a single new 'string' token. 

3589 """ 

3590 trace = False 

3591 assert isinstance(node.left, ast.Str), (repr(node.left), g.callers()) 

3592 # Careful: use the tokens, not Str.s. This preserves spelling. 

3593 lt_token_list = get_node_token_list(node.left, self.tokens) 

3594 if not lt_token_list: # pragma: no cover 

3595 print('') 

3596 g.trace('Error: no token list in Str') 

3597 dump_tree(self.tokens, node) 

3598 print('') 

3599 return 

3600 lt_s = tokens_to_string(lt_token_list) 

3601 if trace: 

3602 g.trace('lt_s:', lt_s) 

3603 # Get the RHS values, a list of token lists. 

3604 values = self.scan_rhs(node.right) 

3605 if trace: 

3606 for i, z in enumerate(values): 

3607 dump_tokens(z, tag=f"RHS value {i}") 

3608 # Compute rt_s, self.line and self.line_number for later messages. 

3609 token0 = lt_token_list[0] 

3610 self.line_number = token0.line_number 

3611 self.line = token0.line.strip() 

3612 rt_s = ''.join(tokens_to_string(z) for z in values) 

3613 # Get the % specs in the LHS string. 

3614 specs = self.scan_format_string(lt_s) 

3615 if len(values) != len(specs): # pragma: no cover 

3616 self.message( 

3617 f"can't create f-string: {lt_s!r}\n" 

3618 f":f-string mismatch: " 

3619 f"{len(values)} value{g.plural(len(values))}, " 

3620 f"{len(specs)} spec{g.plural(len(specs))}") 

3621 return 

3622 # Replace specs with values. 

3623 results = self.substitute_values(lt_s, specs, values) 

3624 result = self.compute_result(lt_s, results) 

3625 if not result: 

3626 return 

3627 # Remove whitespace before ! and :. 

3628 result = self.clean_ws(result) 

3629 # Show the results 

3630 if trace: # pragma: no cover 

3631 before = (lt_s + ' % ' + rt_s).replace('\n', '<NL>') 

3632 after = result.replace('\n', '<NL>') 

3633 self.message( 

3634 f"trace:\n" 

3635 f":from: {before!s}\n" 

3636 f": to: {after!s}") 

3637 # Adjust the tree and the token list. 

3638 self.replace(node, result, values) 
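# Illustration (not part of leoAst.py): the kind of rewrite make_fstring
# performs on a BinOp whose operator is '%'. Double quotes are preferred by
# change_quotes below.
#
#   before:  print('%s = %r' % (name, value))
#   after:   print(f"{name} = {value!r}")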

3639 #@+node:ekr.20191222102831.3: *5* fs.clean_ws 

3640 ws_pat = re.compile(r'(\s+)([:!][0-9]\})') 

3641 

3642 def clean_ws(self, s): 

3643 """Carefully remove whitespace before ! and : specifiers.""" 

3644 s = re.sub(self.ws_pat, r'\2', s) 

3645 return s 

3646 #@+node:ekr.20191222102831.4: *5* fs.compute_result & helpers 

3647 def compute_result(self, lt_s, tokens): 

3648 """ 

3649 Create the final result, with various kinds of munges. 

3650 

3651 Return the result string, or None if there are errors. 

3652 """ 

3653 # Fail if there is a backslash within { and }. 

3654 if not self.check_back_slashes(lt_s, tokens): 

3655 return None # pragma: no cover 

3656 # Ensure consistent quotes. 

3657 if not self.change_quotes(lt_s, tokens): 

3658 return None # pragma: no cover 

3659 return tokens_to_string(tokens) 

3660 #@+node:ekr.20200215074309.1: *6* fs.check_back_slashes 

3661 def check_back_slashes(self, lt_s, tokens): 

3662 """ 

3663 Return False if any backslash appears within a {} expression. 

3664 

3665 tokens is the list of tokens on the RHS. 

3666 """ 

3667 count = 0 

3668 for z in tokens: 

3669 if z.kind == 'op': 

3670 if z.value == '{': 

3671 count += 1 

3672 elif z.value == '}': 

3673 count -= 1 

3674 if (count % 2) == 1 and '\\' in z.value: 

3675 if not self.silent: 

3676 self.message( # pragma: no cover (silent during unit tests) 

3677 f"can't create f-string: {lt_s!r}\n" 

3678 f":backslash in {{expr}}:") 

3679 return False 

3680 return True 

3681 #@+node:ekr.20191222102831.7: *6* fs.change_quotes 

3682 def change_quotes(self, lt_s, aList): 

3683 """ 

3684 Carefully check quotes in all "inner" tokens as necessary. 

3685 

3686 Return False if the f-string would contain backslashes. 

3687 

3688 We expect the following "outer" tokens. 

3689 

3690 aList[0]: ('string', 'f') 

3691 aList[1]: ('string', a single or double quote) 

3692 aList[-1]: ('string', a single or double quote matching aList[1]) 

3693 """ 

3694 # Sanity checks. 

3695 if len(aList) < 4: 

3696 return True # pragma: no cover (defensive) 

3697 if not lt_s: # pragma: no cover (defensive) 

3698 self.message("can't create f-string: no lt_s!") 

3699 return False 

3700 delim = lt_s[0] 

3701 # Check tokens 0, 1 and -1. 

3702 token0 = aList[0] 

3703 token1 = aList[1] 

3704 token_last = aList[-1] 

3705 for token in token0, token1, token_last: 

3706 # These are the only kinds of tokens we expect to generate. 

3707 ok = ( 

3708 token.kind == 'string' or 

3709 token.kind == 'op' and token.value in '{}') 

3710 if not ok: # pragma: no cover (defensive) 

3711 self.message( 

3712 f"unexpected token: {token.kind} {token.value}\n" 

3713 f": lt_s: {lt_s!r}") 

3714 return False 

3715 # These checks are important... 

3716 if token0.value != 'f': 

3717 return False # pragma: no cover (defensive) 

3718 val1 = token1.value 

3719 if delim != val1: 

3720 return False # pragma: no cover (defensive) 

3721 val_last = token_last.value 

3722 if delim != val_last: 

3723 return False # pragma: no cover (defensive) 

3724 # 

3725 # Check for conflicting delims, preferring f"..." to f'...'. 

3726 for delim in ('"', "'"): 

3727 aList[1] = aList[-1] = Token('string', delim) 

3728 for z in aList[2:-1]: 

3729 if delim in z.value: 

3730 break 

3731 else: 

3732 return True 

3733 if not self.silent: # pragma: no cover (silent unit test) 

3734 self.message( 

3735 f"can't create f-string: {lt_s!r}\n" 

3736 f": conflicting delims:") 

3737 return False 

3738 #@+node:ekr.20191222102831.6: *5* fs.munge_spec 

3739 def munge_spec(self, spec): 

3740 """ 

3741 Return (head, tail). 

3742 

3743 The resulting spec has the form !head:tail or :tail 

3744 

3745 Example specs: s2, r3 

3746 """ 

3747 # To do: handle more specs. 

3748 head, tail = [], [] 

3749 if spec.startswith('+'): 

3750 pass # Leave it alone! 

3751 elif spec.startswith('-'): 

3752 tail.append('>') 

3753 spec = spec[1:] 

3754 if spec.endswith('s'): 

3755 spec = spec[:-1] 

3756 if spec.endswith('r'): 

3757 head.append('r') 

3758 spec = spec[:-1] 

3759 tail_s = ''.join(tail) + spec 

3760 head_s = ''.join(head) 

3761 return head_s, tail_s 
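# Illustration (not part of leoAst.py): what munge_spec returns for a few
# common specs, and the f-string piece each result produces.
#
#   munge_spec('s')    -> ('', '')      # '%s'    becomes '{value}'
#   munge_spec('r')    -> ('r', '')     # '%r'    becomes '{value!r}'
#   munge_spec('-10s') -> ('', '>10')   # '%-10s' becomes '{value:>10}'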

3762 #@+node:ekr.20191222102831.9: *5* fs.scan_format_string 

3763 # format_spec ::= [[fill]align][sign][#][0][width][,][.precision][type] 

3764 # fill ::= <any character> 

3765 # align ::= "<" | ">" | "=" | "^" 

3766 # sign ::= "+" | "-" | " " 

3767 # width ::= integer 

3768 # precision ::= integer 

3769 # type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%" 

3770 

3771 format_pat = re.compile(r'%(([+-]?[0-9]*(\.)?[0-9]*)*[bcdeEfFgGnoxrsX]?)') 

3772 

3773 def scan_format_string(self, s): 

3774 """Scan the format string s, returning a list of match objects.""" 

3775 result = list(re.finditer(self.format_pat, s)) 

3776 return result 

3777 #@+node:ekr.20191222104224.1: *5* fs.scan_rhs 

3778 def scan_rhs(self, node): 

3779 """ 

3780 Scan the right-hand side of a potential f-string. 

3781 

3782 Return a list of the token lists for each element. 

3783 """ 

3784 trace = False 

3785 # First, try the most common cases. 

3786 if isinstance(node, ast.Str): 

3787 token_list = get_node_token_list(node, self.tokens) 

3788 return [token_list] 

3789 if isinstance(node, (list, tuple, ast.Tuple)): 

3790 result = [] 

3791 elts = node.elts if isinstance(node, ast.Tuple) else node 

3792 for i, elt in enumerate(elts): 

3793 tokens = tokens_for_node(self.filename, elt, self.tokens) 

3794 result.append(tokens) 

3795 if trace: 

3796 g.trace(f"item: {i}: {elt.__class__.__name__}") 

3797 g.printObj(tokens, tag=f"Tokens for item {i}") 

3798 return result 

3799 # Now we expect only one result. 

3800 tokens = tokens_for_node(self.filename, node, self.tokens) 

3801 return [tokens] 

3802 #@+node:ekr.20191226155316.1: *5* fs.substitute_values 

3803 def substitute_values(self, lt_s, specs, values): 

3804 """ 

3805 Replace specifiers with values in lt_s string. 

3806 

3807 Double { and } as needed. 

3808 """ 

3809 i, results = 0, [Token('string', 'f')] 

3810 for spec_i, m in enumerate(specs): 

3811 value = tokens_to_string(values[spec_i]) 

3812 start, end, spec = m.start(0), m.end(0), m.group(1) 

3813 if start > i: 

3814 val = lt_s[i:start].replace('{', '{{').replace('}', '}}') 

3815 results.append(Token('string', val[0])) 

3816 results.append(Token('string', val[1:])) 

3817 head, tail = self.munge_spec(spec) 

3818 results.append(Token('op', '{')) 

3819 results.append(Token('string', value)) 

3820 if head: 

3821 results.append(Token('string', '!')) 

3822 results.append(Token('string', head)) 

3823 if tail: 

3824 results.append(Token('string', ':')) 

3825 results.append(Token('string', tail)) 

3826 results.append(Token('op', '}')) 

3827 i = end 

3828 # Add the tail. 

3829 tail = lt_s[i:] 

3830 if tail: 

3831 tail = tail.replace('{', '{{').replace('}', '}}') 

3832 results.append(Token('string', tail[:-1])) 

3833 results.append(Token('string', tail[-1])) 

3834 return results 
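# Illustration (not part of leoAst.py): literal braces in the original string
# are doubled so the rewritten literal is still a valid f-string.
#
#   before:  'set: {%s}' % x
#   after:   f"set: {{{x}}}"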

3835 #@+node:ekr.20200214142019.1: *4* fs.message 

3836 def message(self, message): # pragma: no cover. 

3837 """ 

3838 Print one or more message lines aligned on the first colon of the message. 

3839 """ 

3840 # Print a leading blank line. 

3841 print('') 

3842 # Calculate the padding. 

3843 lines = g.splitLines(message) 

3844 pad = max(lines[0].find(':'), 30) 

3845 # Print the first line. 

3846 z = lines[0] 

3847 i = z.find(':') 

3848 if i == -1: 

3849 print(z.rstrip()) 

3850 else: 

3851 print(f"{z[:i+2].strip():>{pad+1}} {z[i+2:].strip()}") 

3852 # Print the remaining message lines. 

3853 for z in lines[1:]: 

3854 if z.startswith('<'): 

3855 # Print left aligned. 

3856 print(z[1:].strip()) 

3857 elif z.startswith(':') and -1 < z[1:].find(':') <= pad: 

3858 # Align with the first line. 

3859 i = z[1:].find(':') 

3860 print(f"{z[1:i+2].strip():>{pad+1}} {z[i+2:].strip()}") 

3861 elif z.startswith('>'): 

3862 # Align after the aligning colon. 

3863 print(f"{' ':>{pad+2}}{z[1:].strip()}") 

3864 else: 

3865 # Default: Put the entire line after the aligning colon. 

3866 print(f"{' ':>{pad+2}}{z.strip()}") 

3867 # Print the standard message lines. 

3868 file_s = f"{'file':>{pad}}" 

3869 ln_n_s = f"{'line number':>{pad}}" 

3870 line_s = f"{'line':>{pad}}" 

3871 print( 

3872 f"{file_s}: {self.filename}\n" 

3873 f"{ln_n_s}: {self.line_number}\n" 

3874 f"{line_s}: {self.line!r}") 

3875 #@+node:ekr.20191225054848.1: *4* fs.replace 

3876 def replace(self, node, s, values): 

3877 """ 

3878 Replace node with an ast.Str node for s. 

3879 Replace all tokens in the range of values with a single 'string' token. 

3880 """ 

3881 # Replace the tokens... 

3882 tokens = tokens_for_node(self.filename, node, self.tokens) 

3883 i1 = i = tokens[0].index 

3884 replace_token(self.tokens[i], 'string', s) 

3885 j = 1 

3886 while j < len(tokens): 

3887 replace_token(self.tokens[i1 + j], 'killed', '') 

3888 j += 1 

3889 # Replace the node. 

3890 new_node = ast.Str() 

3891 new_node.s = s 

3892 replace_node(new_node, node) 

3893 # Update the token. 

3894 token = self.tokens[i1] 

3895 token.node = new_node # type:ignore 

3896 # Update the token list. 

3897 add_token_to_token_list(token, new_node) 

3898 #@+node:ekr.20191231055008.1: *4* fs.visit 

3899 def visit(self, node): 

3900 """ 

3901 FStringify.visit. (Overrides TOT visit). 

3902 

3903 Call fs.make_fstring if node is a BinOp that might be converted to an 

3904 f-string. 

3905 """ 

3906 if ( 

3907 isinstance(node, ast.BinOp) 

3908 and op_name(node.op) == '%' 

3909 and isinstance(node.left, ast.Str) 

3910 ): 

3911 self.make_fstring(node) 

3912 #@-others 

3913#@+node:ekr.20191231084514.1: *3* class ReassignTokens (TOT) 

3914class ReassignTokens(TokenOrderTraverser): 

3915 """A class that reassigns tokens to more appropriate ast nodes.""" 

3916 #@+others 

3917 #@+node:ekr.20191231084640.1: *4* reassign.reassign 

3918 def reassign(self, filename, tokens, tree): 

3919 """The main entry point.""" 

3920 self.filename = filename 

3921 self.tokens = tokens 

3922 self.tree = tree 

3923 self.traverse(tree) 

3924 #@+node:ekr.20191231084853.1: *4* reassign.visit 

3925 def visit(self, node): 

3926 """ReassignTokens.visit""" 

3927 # For now, just handle call nodes. 

3928 if not isinstance(node, ast.Call): 

3929 return 

3930 tokens = tokens_for_node(self.filename, node, self.tokens) 

3931 node0, node9 = tokens[0].node, tokens[-1].node 

3932 nca = nearest_common_ancestor(node0, node9) 

3933 if not nca: 

3934 return 

3935 # g.trace(f"{self.filename:20} nca: {nca.__class__.__name__}") 

3936 # Associate () with the call node. 

3937 i = tokens[-1].index 

3938 j = find_paren_token(i + 1, self.tokens) 

3939 if j is None: 

3940 return # pragma: no cover 

3941 k = find_paren_token(j + 1, self.tokens) 

3942 if k is None: 

3943 return # pragma: no cover 

3944 self.tokens[j].node = nca # type:ignore 

3945 self.tokens[k].node = nca # type:ignore 

3946 add_token_to_token_list(self.tokens[j], nca) 

3947 add_token_to_token_list(self.tokens[k], nca) 

3948 #@-others 

3949#@+node:ekr.20191227170803.1: ** Token classes 

3950#@+node:ekr.20191110080535.1: *3* class Token 

3951class Token: 

3952 """ 

3953 A class representing a 5-tuple, plus additional data. 

3954 

3955 The Tokenizer class creates a list of such tokens. 

3956 """ 

3957 

3958 def __init__(self, kind, value): 

3959 

3960 self.kind = kind 

3961 self.value = value 

3962 # 

3963 # Injected by Tokenizer.add_token. 

3964 self.five_tuple = None 

3965 self.index = 0 

3966 self.line = '' 

3967 # The entire line containing the token. 

3968 # Same as five_tuple.line. 

3969 self.line_number = 0 

3970 # The line number, for errors and dumps. 

3971 # Same as five_tuple.start[0] 

3972 # 

3973 # Injected by Tokenizer.add_token. 

3974 self.level = 0 

3975 self.node = None 

3976 

3977 def __repr__(self): 

3978 nl_kind = getattr(self, 'newline_kind', '') 

3979 s = f"{self.kind:}.{self.index:<3}" 

3980 return f"{s:>18}:{nl_kind:7} {self.show_val(80)}" 

3981 

3982 def __str__(self): 

3983 nl_kind = getattr(self, 'newline_kind', '') 

3984 return f"{self.kind}.{self.index:<3}{nl_kind:8} {self.show_val(80)}" 

3985 

3986 def to_string(self): 

3987 """Return the contribution of the token to the source file.""" 

3988 return self.value if isinstance(self.value, str) else '' 

3989 #@+others 

3990 #@+node:ekr.20191231114927.1: *4* token.brief_dump 

3991 def brief_dump(self): # pragma: no cover 

3992 """Dump a token.""" 

3993 return ( 

3994 f"{self.index:>3} line: {self.line_number:<2} " 

3995 f"{self.kind:>11} {self.show_val(100)}") 

3996 #@+node:ekr.20200223022950.11: *4* token.dump 

3997 def dump(self): # pragma: no cover 

3998 """Dump a token and related links.""" 

3999 # Let block. 

4000 node_id = self.node.node_index if self.node else '' 

4001 node_cn = self.node.__class__.__name__ if self.node else '' 

4002 return ( 

4003 f"{self.line_number:4} " 

4004 f"{node_id:5} {node_cn:16} " 

4005 f"{self.index:>5} {self.kind:>11} " 

4006 f"{self.show_val(100)}") 

4007 #@+node:ekr.20200121081151.1: *4* token.dump_header 

4008 def dump_header(self): # pragma: no cover 

4009 """Print the header for token.dump""" 

4010 print( 

4011 f"\n" 

4012 f" node {'':10} token token\n" 

4013 f"line index class {'':10} index kind value\n" 

4014 f"==== ===== ===== {'':10} ===== ==== =====\n") 

4015 #@+node:ekr.20191116154328.1: *4* token.error_dump 

4016 def error_dump(self): # pragma: no cover 

4017 """Dump a token or result node for error message.""" 

4018 if self.node: 

4019 node_id = obj_id(self.node) 

4020 node_s = f"{node_id} {self.node.__class__.__name__}" 

4021 else: 

4022 node_s = "None" 

4023 return ( 

4024 f"index: {self.index:<3} {self.kind:>12} {self.show_val(20):<20} " 

4025 f"{node_s}") 

4026 #@+node:ekr.20191113095507.1: *4* token.show_val 

4027 def show_val(self, truncate_n): # pragma: no cover 

4028 """Return the token.value field.""" 

4029 if self.kind in ('ws', 'indent'): 

4030 val = len(self.value) 

4031 elif self.kind == 'string': 

4032 # Important: don't add a repr for 'string' tokens. 

4033 # repr just adds another layer of confusion. 

4034 val = g.truncate(self.value, truncate_n) # type:ignore 

4035 else: 

4036 val = g.truncate(repr(self.value), truncate_n) # type:ignore 

4037 return val 

4038 #@-others 

4039#@+node:ekr.20191110165235.1: *3* class Tokenizer 

4040class Tokenizer: 

4041 

4042 """Create a list of Tokens from contents.""" 

4043 

4044 results: List[Token] = [] 

4045 

4046 #@+others 

4047 #@+node:ekr.20191110165235.2: *4* tokenizer.add_token 

4048 token_index = 0 

4049 prev_line_token = None 

4050 

4051 def add_token(self, kind, five_tuple, line, s_row, value): 

4052 """ 

4053 Add a token to the results list. 

4054 

4055 Subclasses could override this method to filter out specific tokens. 

4056 """ 

4057 tok = Token(kind, value) 

4058 tok.five_tuple = five_tuple 

4059 tok.index = self.token_index 

4060 # Bump the token index. 

4061 self.token_index += 1 

4062 tok.line = line 

4063 tok.line_number = s_row 

4064 self.results.append(tok) 

4065 #@+node:ekr.20191110170551.1: *4* tokenizer.check_results 

4066 def check_results(self, contents): 

4067 

4068 # Split the results into lines. 

4069 result = ''.join([z.to_string() for z in self.results]) 

4070 result_lines = g.splitLines(result) 

4071 # Check. 

4072 ok = result == contents and result_lines == self.lines 

4073 assert ok, ( 

4074 f"\n" 

4075 f" result: {result!r}\n" 

4076 f" contents: {contents!r}\n" 

4077 f"result_lines: {result_lines}\n" 

4078 f" lines: {self.lines}" 

4079 ) 

4080 #@+node:ekr.20191110165235.3: *4* tokenizer.create_input_tokens 

4081 def create_input_tokens(self, contents, tokens): 

4082 """ 

4083 Generate a list of Token's from tokens, a list of 5-tuples. 

4084 """ 

4085 # Create the physical lines. 

4086 self.lines = contents.splitlines(True) 

4087 # Create the list of character offsets of the start of each physical line. 

4088 last_offset, self.offsets = 0, [0] 

4089 for line in self.lines: 

4090 last_offset += len(line) 

4091 self.offsets.append(last_offset) 

4092 # Handle each token, appending tokens and between-token whitespace to results. 

4093 self.prev_offset, self.results = -1, [] 

4094 for token in tokens: 

4095 self.do_token(contents, token) 

4096 # Check that the results round-trip to the original contents. 

4097 self.check_results(contents) 

4098 # Return results, as a list. 

4099 return self.results 
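# Illustration (not part of leoAst.py): a minimal usage sketch, feeding the
# Tokenizer the 5-tuples produced by python's tokenize module. check_results
# guarantees that the resulting Token list round-trips to the original contents.
#
#   import io, tokenize
#   contents = "x = 1\n"
#   five_tuples = tokenize.generate_tokens(io.StringIO(contents).readline)
#   tokens = Tokenizer().create_input_tokens(contents, five_tuples)
#   assert ''.join(t.to_string() for t in tokens) == contents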

4100 #@+node:ekr.20191110165235.4: *4* tokenizer.do_token (the gem) 

4101 header_has_been_shown = False 

4102 

4103 def do_token(self, contents, five_tuple): 

4104 """ 

4105 Handle the given token, optionally including between-token whitespace. 

4106 

4107 This is part of the "gem". 

4108 

4109 Links: 

4110 

4111 - 11/13/19: ENB: A much better untokenizer 

4112 https://groups.google.com/forum/#!msg/leo-editor/DpZ2cMS03WE/VPqtB9lTEAAJ 

4113 

4114 - Untokenize does not round-trip ws before bs-nl 

4115 https://bugs.python.org/issue38663 

4116 """ 

4117 import token as token_module 

4118 # Unpack the 5-tuple. 

4119 tok_type, val, start, end, line = five_tuple 

4120 s_row, s_col = start # row/col offsets of start of token. 

4121 e_row, e_col = end # row/col offsets of end of token. 

4122 kind = token_module.tok_name[tok_type].lower() 

4123 # Calculate the token's start/end offsets: character offsets into contents. 

4124 s_offset = self.offsets[max(0, s_row - 1)] + s_col 

4125 e_offset = self.offsets[max(0, e_row - 1)] + e_col 

4126 # tok_s is the corresponding substring of contents. 

4127 tok_s = contents[s_offset:e_offset] 

4128 # Add any preceding between-token whitespace. 

4129 ws = contents[self.prev_offset:s_offset] 

4130 if ws: 

4131 # No need for a hook. 

4132 self.add_token('ws', five_tuple, line, s_row, ws) 

4133 # Always add token, even if it contributes no text! 

4134 self.add_token(kind, five_tuple, line, s_row, tok_s) 

4135 # Update the ending offset. 

4136 self.prev_offset = e_offset 

4137 #@-others 

4138#@-others 

4139g = LeoGlobals() 

4140if __name__ == '__main__': 

4141 main() 

4142#@@language python 

4143#@@tabwidth -4 

4144#@@pagewidth 70 

4145#@-leo