
Commit bf2005d

agatti authored and dpgeorge committed
tools/mpy_ld.py: Resolve fixed-address symbols if requested.
This commit lets mpy_ld.py resolve symbols not only from the object files involved in the linking process, or from compiler-supplied static libraries, but also from a list of symbols referenced by an absolute address (usually provided by the system's ROM).

This is needed for ESP8266 targets as some C stdlib functions are provided by the MCU's own ROM code to reduce the final code footprint, and therefore those functions' implementation was removed from the compiler's support libraries. This means that unless `LINK_RUNTIME` is set (which lets tooling look at more libraries to resolve symbols) the build process will fail as tooling is unaware of the ROM symbols' existence.

With this change, fixed-address symbols can be exposed to the symbol resolution step when performing natmod linking. If there are symbols coming in from a fixed-address symbols list and internal code or external libraries, the fixed-address symbol address will take precedence in all cases.

Although this is - in theory - also working for the whole range of ESP32 MCUs, testing is currently limited to Xtensa processors and the example natmods' makefiles only make use of this commit's changes for the ESP8266 target.

Natmod builds can set the MPY_EXTERN_SYM_FILE variable pointing to a linkerscript file containing a series of symbols (weak or strong) at a fixed address; these symbols will then be used by the MicroPython linker when packaging the natmod.

If a different natmod build method is used (eg. custom CMake scripts), `tools/mpy_ld.py` can now accept a command line parameter called `--externs` (or its short variant `-e`) that contains the path of a linkerscript file with the fixed-address symbols to use when performing the linking process.

The linkerscript file parser can handle a very limited subset of binutils's linkerscript syntax, namely just block comments, strong symbols, and weak symbols. Each symbol must be in its own line for the parser to succeed, empty lines or comment blocks are skipped. For an example of what this parser was meant to handle, you can look at `ports/esp8266/boards/eagle.rom.addr.v6.ld` and follow its format.

The natmod developer documentation is also updated to reflect the new command line argument accepted by `mpy_ld.py` and the use cases for the changes introduced by this commit.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
1 parent 9174cff commit bf2005d


3 files changed: +89 -5 lines changed


docs/develop/natmod.rst

Lines changed: 8 additions & 1 deletion
@@ -81,7 +81,14 @@ Linker limitation: the native module is not linked against the symbol table of t
 full MicroPython firmware. Rather, it is linked against an explicit table of exported
 symbols found in ``mp_fun_table`` (in ``py/nativeglue.h``), that is fixed at firmware
 build time. It is thus not possible to simply call some arbitrary HAL/OS/RTOS/system
-function, for example.
+function, for example, unless that resides at a fixed address. In that case, the path
+of a linkerscript containing a series of symbol names and their fixed address can be
+passed to ``mpy_ld.py`` via the ``--externs`` command line argument. That way symbols
+appearing in the linkerscript will take precedence over what is provided from object
+files, but at the moment the object files' implementation will still reside in the
+final MPY file. The linkerscript parser is limited in its capabilities, and is
+currently used only for parsing the ESP8266 port ROM symbols list (see
+``ports/esp8266/boards/eagle.rom.addr.v6.ld``).
 
 New symbols can be added to the end of the table and the firmware rebuilt.
 The symbols also need to be added to ``tools/mpy_ld.py``'s ``fun_table`` dict in the

py/dynruntime.mk

Lines changed: 3 additions & 0 deletions
@@ -172,6 +172,9 @@ endif
 endif
 MPY_LD_FLAGS += $(addprefix -l, $(LIBGCC_PATH) $(LIBM_PATH))
 endif
+ifneq ($(MPY_EXTERN_SYM_FILE),)
+MPY_LD_FLAGS += --externs "$(realpath $(MPY_EXTERN_SYM_FILE))"
+endif
 
 CFLAGS += $(CFLAGS_EXTRA)

tools/mpy_ld.py

Lines changed: 78 additions & 4 deletions
@@ -402,6 +402,7 @@ def __init__(self, arch):
         self.known_syms = {}  # dict of symbols that are defined
         self.unresolved_syms = []  # list of unresolved symbols
         self.mpy_relocs = []  # list of relocations needed in the output .mpy file
+        self.externs = {}  # dict of externally-defined symbols
 
     def check_arch(self, arch_name):
         if arch_name != self.arch.name:
@@ -491,10 +492,14 @@ def populate_got(env):
         sym = got_entry.sym
         if hasattr(sym, "resolved"):
             sym = sym.resolved
-        sec = sym.section
-        addr = sym["st_value"]
-        got_entry.sec_name = sec.name
-        got_entry.link_addr += sec.addr + addr
+        if sym.name in env.externs:
+            got_entry.sec_name = ".external.fixed_addr"
+            got_entry.link_addr = env.externs[sym.name]
+        else:
+            sec = sym.section
+            addr = sym["st_value"]
+            got_entry.sec_name = sec.name
+            got_entry.link_addr += sec.addr + addr
 
     # Get sorted GOT, sorted by external, text, rodata, bss so relocations can be combined
     got_list = sorted(
@@ -520,6 +525,9 @@ def populate_got(env):
             dest = int(got_entry.name.split("+")[1], 16) // env.arch.word_size
         elif got_entry.sec_name == ".external.mp_fun_table":
             dest = got_entry.sym.mp_fun_table_offset
+        elif got_entry.sec_name == ".external.fixed_addr":
+            # Fixed-address symbols should not be relocated.
+            continue
         elif got_entry.sec_name.startswith(".text"):
             dest = ".text"
         elif got_entry.sec_name.startswith(".rodata"):
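The precedence rule applied in `populate_got` can be illustrated with a minimal standalone sketch. The `resolve_link_addr` helper, the symbol names, and the addresses below are hypothetical and are not part of `mpy_ld.py`; the only thing taken from the diff is the rule that a name present in the fixed-address table wins over an object-file definition. The fallback branch is simplified to a plain sum of section base and symbol offset.

```python
def resolve_link_addr(name, externs, section_name, section_addr, st_value):
    """Return (section name, link address) for a symbol.

    Hypothetical helper: it mirrors the branch order used for GOT entries,
    where the fixed-address table is consulted before the object-file data.
    """
    if name in externs:
        # Fixed-address symbol: the absolute address is used unchanged.
        return ".external.fixed_addr", externs[name]
    # Otherwise the address is the section base plus the symbol offset.
    return section_name, section_addr + st_value


# Made-up table of fixed-address symbols, for illustration only.
externs = {"rom_memcpy": 0x40001234}

# Present in the table: the object-file location is ignored.
rom = resolve_link_addr("rom_memcpy", externs, ".text", 0x100, 0x20)

# Absent from the table: resolved against its section as before.
local = resolve_link_addr("my_helper", externs, ".text", 0x100, 0x20)
```

Entries that end up in the `.external.fixed_addr` pseudo-section are then skipped when relocations are emitted, since an absolute address needs no adjustment at load time.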
@@ -1207,13 +1215,25 @@ def link_objects(env, native_qstr_vals_len):
             sym.section = env.obj_table_section
         elif sym.name in env.known_syms:
             sym.resolved = env.known_syms[sym.name]
+        elif sym.name in env.externs:
+            # Fixed-address symbols do not need pre-processing.
+            continue
         else:
             if sym.name in fun_table:
                 sym.section = mp_fun_table_sec
                 sym.mp_fun_table_offset = fun_table[sym.name]
             else:
                 undef_errors.append("{}: undefined symbol: {}".format(sym.filename, sym.name))
 
+    for sym in env.externs:
+        if sym in env.known_syms:
+            log(
+                LOG_LEVEL_1,
+                "Symbol {} is a fixed-address symbol at {:08x} and is also provided from an object file".format(
+                    sym, env.externs[sym]
+                ),
+            )
+
     if undef_errors:
         raise LinkError("\n".join(undef_errors))

@@ -1456,6 +1476,9 @@ def do_link(args):
     log(LOG_LEVEL_2, "qstr vals: " + ", ".join(native_qstr_vals))
     env = LinkEnv(args.arch)
     try:
+        if args.externs:
+            env.externs = parse_linkerscript(args.externs)
+
         # Load object files
         for fn in args.files:
             with open(fn, "rb") as f:
@@ -1484,6 +1507,50 @@ def do_link(args):
         sys.exit(1)
 
 
+def parse_linkerscript(source):
+    # This extracts fixed-address symbol lists from linkerscripts, only parsing
+    # a small subset of all possible directives. Right now the only
+    # linkerscript file this is really tested against is the ESP8266's builtin
+    # ROM functions list ($SDK/ld/eagle.rom.addr.v6.ld).
+    #
+    # The parser should be able to handle symbol entries inside ESP-IDF's ROM
+    # symbol lists for the ESP32 range of MCUs as well (see *.ld files in
+    # $SDK/components/esp_rom/<name>/).
+
+    symbols = {}
+
+    LINE_REGEX = re.compile(
+        r'^(?P<weak>PROVIDE\()?'  # optional weak marker start
+        r'(?P<symbol>[a-zA-Z_]\w*)'  # symbol name
+        r'=0x(?P<address>[\da-fA-F]{1,8})*'  # symbol address
+        r'(?(weak)\));$',  # optional weak marker end and line terminator
+        re.ASCII,
+    )
+
+    inside_comment = False
+    for line in (line.strip() for line in source.readlines()):
+        if line.startswith('/*') and not inside_comment:
+            if not line.endswith('*/'):
+                inside_comment = True
+            continue
+        if inside_comment:
+            if line.endswith('*/'):
+                inside_comment = False
+            continue
+        if line.startswith('//'):
+            continue
+        match = LINE_REGEX.match(''.join(line.split()))
+        if not match:
+            continue
+        tokens = match.groupdict()
+        symbol = tokens['symbol']
+        address = int(tokens['address'], 16)
+        if symbol in symbols:
+            raise ValueError(f"Symbol {symbol} already defined")
+        symbols[symbol] = address
+    return symbols
+
+
 def main():
     import argparse
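To make the accepted linkerscript subset concrete, here is a small standalone sketch that runs the same line-by-line approach over an in-memory sample. It is an adaptation rather than the code from the commit: the function is renamed `parse_symbols`, it takes any iterable of lines, the `//` check is omitted because such lines never match the pattern anyway, and the address group is made mandatory, which is a choice of this sketch. The symbol names and addresses in `SAMPLE` are invented for illustration.

```python
import re

# One symbol per line, whitespace already removed: `name=0x...;` (strong)
# or `PROVIDE(name=0x...);` (weak). The conditional group requires the
# closing parenthesis only when the PROVIDE( prefix was matched.
LINE_REGEX = re.compile(
    r"^(?P<weak>PROVIDE\()?"
    r"(?P<symbol>[a-zA-Z_]\w*)"
    r"=0x(?P<address>[\da-fA-F]{1,8})"
    r"(?(weak)\));$",
    re.ASCII,
)


def parse_symbols(lines):
    """Map symbol names to addresses, skipping comments and other lines."""
    symbols = {}
    inside_comment = False
    for line in (raw.strip() for raw in lines):
        if line.startswith("/*") and not inside_comment:
            # A block comment either closes on the same line or stays open.
            if not line.endswith("*/"):
                inside_comment = True
            continue
        if inside_comment:
            if line.endswith("*/"):
                inside_comment = False
            continue
        # Drop all whitespace so `a = 0x1 ;` and `a=0x1;` look the same.
        match = LINE_REGEX.match("".join(line.split()))
        if not match:
            continue
        symbol = match["symbol"]
        if symbol in symbols:
            raise ValueError(f"Symbol {symbol} already defined")
        symbols[symbol] = int(match["address"], 16)
    return symbols


# Hypothetical input in the style of a ROM address list.
SAMPLE = """\
/* single-line comment */
/*
 * multi-line comment
 */
PROVIDE ( rom_weak_func = 0x40001000 );
rom_strong_func = 0x40002000;

not a symbol line
"""

parsed = parse_symbols(SAMPLE.splitlines())
```

Weak and strong entries land in the same dictionary, so from the linker's point of view the distinction is dropped once the file has been read.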

@@ -1500,6 +1567,13 @@ def main():
     cmd_parser.add_argument(
         "--output", "-o", default=None, help="output .mpy file (default to input with .o->.mpy)"
     )
+    cmd_parser.add_argument(
+        "--externs",
+        "-e",
+        type=argparse.FileType("rt"),
+        default=None,
+        help="linkerscript providing fixed-address symbols to augment symbol resolution",
+    )
     cmd_parser.add_argument("files", nargs="+", help="input files")
     args = cmd_parser.parse_args()
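Because the option is declared with `argparse.FileType("rt")`, the value that reaches the linking step is an already-opened text file rather than a path string. The sketch below shows that behaviour in isolation, under the assumption of a stripped-down parser: `build_parser`, the file name `rom_symbols.ld`, and the `module.o` argument are hypothetical stand-ins, and only the `--externs`/`-e` declaration is copied from the diff.

```python
import argparse
import os
import tempfile


def build_parser():
    # Hypothetical stand-in for the real command line: it declares only the
    # new option and the positional input files.
    parser = argparse.ArgumentParser(description="sketch of the --externs option")
    parser.add_argument("--externs", "-e", type=argparse.FileType("rt"), default=None)
    parser.add_argument("files", nargs="+")
    return parser


with tempfile.TemporaryDirectory() as tmp:
    # Write a one-line symbol list to pass on the command line.
    path = os.path.join(tmp, "rom_symbols.ld")
    with open(path, "w") as f:
        f.write("rom_strong_func = 0x40002000;\n")

    args = build_parser().parse_args(["-e", path, "module.o"])
    # args.externs is a file object opened by argparse; read it and close it.
    with args.externs as source:
        first_line = source.readline().strip()

# Without the option the attribute stays None, so the feature is opt-in.
no_externs = build_parser().parse_args(["module.o"]).externs
```

A missing or unreadable file is reported by argparse while parsing the arguments, before any object file is loaded.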
