Skip to content

Conversation

willcl-ark
Copy link
Member

Fixes #18603

Checks that generated files with special compilation units do not contain disallowed sections. Special instructions and disallowed sections are defined in the script and include SSE42, SSE41, AVX, AVX2, SHANI, and .text.startup, respectively.

The script uses glob to search and lief to parse files. If any disallowed sections are found in a file the script will print an error to stderr and exit with a non-zero exit code. Otherwise, the script will exit with a zero exit code indicating success.

I also have another version on this branch which more closely mimics the interface of symbol-check.py (accepting a list of files as arguments), along with including a test which can be used to check that the script itself is working, but as it's quite a lot more code overall, I thought this branch might be preferable...

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 15, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30664 (build: Remove Autotools-based build system by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@willcl-ark
Copy link
Member Author

@hebasto I see that you are working on symbol-check.py in #25020

I would be curious therefore to know what you thought of my approach to checking for disallowed symbols (to fix #18603) in this PR vs my alternative branch. They're both still WIP and could do with a little cleaning up, but working...

@fanquake
Copy link
Member

fanquake commented May 3, 2023

I thought this branch might be preferable...

If we are going to add a script like this to the repository, it should be incorporated into the release process (guix build). AM thinking about how best to integrate it into that process, if we want to inspect the just-compiled .o files, and if there is another way to perform this test.

cc @laanwj you might have some thoughts here?

@maflcko
Copy link
Member

maflcko commented Jun 7, 2023

Not sure what is up with CI, but lint says:

contrib/devtools/special-instruction-check.py:55: error: "Binary" has no attribute "has_section"  [attr-defined]

Maybe rebase?

@willcl-ark willcl-ark force-pushed the 18603_si_check branch 2 times, most recently from 1d7a276 to 8e40d11 Compare June 7, 2023 18:39
@willcl-ark
Copy link
Member Author

Not sure what is up with CI, but lint says:

contrib/devtools/special-instruction-check.py:55: error: "Binary" has no attribute "has_section"  [attr-defined]

Maybe rebase?

mypy decided it didn't like that line. Got rid of it with # type: ignore[attr-defined] as I don't know any better

Did the CI randomly re-run on this?

@fanquake
Copy link
Member

fanquake commented Jun 7, 2023

Got rid of it with

There shouldn't be any need for changes, as that issue was fixed with upgrading to LIEF 0.13.1. So should be resolved after rebasing.

@willcl-ark
Copy link
Member Author

Ah I see. I rebased an re-ran the lint to see it fail locally, but didn't update my lief version!

@willcl-ark
Copy link
Member Author

willcl-ark commented Jun 7, 2023

It does in fact, appear to need changes:

will@ubuntu in ~/src/bitcoin-18603_si_check on  18603_si_check [$!?] : C v16.0.5-clang : 🐍 3.8.16
$ pip3 install lief==0.13.1
Collecting lief==0.13.1
  Using cached lief-0.13.1-cp38-cp38-manylinux_2_24_x86_64.whl (4.1 MB)
Installing collected packages: lief
Successfully installed lief-0.13.1

will@ubuntu in ~/src/bitcoin-18603_si_check on  18603_si_check [$!?] : C v16.0.5-clang : 🐍 3.8.16
$ ./test/lint/all-lint.py
src/addrdb.cpp:213: unneccessary ==> unnecessary
src/test/miniminer_tests.cpp:466: indeces ==> indices
src/test/miniminer_tests.cpp:467: indeces ==> indices
^ Warning: codespell identified likely spelling errors. Any false positives? Add them to the list of ignored words in test/lint/spelling.ignore-words.txt
contrib/devtools/special-instruction-check.py:55: error: "Binary" has no attribute "has_section"  [attr-defined]
Found 1 error in 1 file (checked 268 source files)
^---- failure generated from lint-python.py

All other python libraries appear to be right:

₿ pip list
Package           Version
----------------- -------
codespell         2.2.1
flake8            5.0.4
lief              0.13.1
mccabe            0.7.0
mypy              0.971
mypy-extensions   1.0.0
pip               23.1.2
pycodestyle       2.9.1
pyflakes          2.5.0
pyzmq             24.0.1
setuptools        56.0.0
toml              0.10.2
tomli             2.0.1
typing_extensions 4.5.0
vulture           2.6

@willcl-ark
Copy link
Member Author

It seems that although the stubs can be found in https://github.com/lief-project/LIEF/blob/b9a5f970b210dbc115c7aaac10e04623bdc08ef8/api/python/lief/ELF.pyi#L369-L371, they are not present for the Binary type https://github.com/lief-project/LIEF/blob/b9a5f970b210dbc115c7aaac10e04623bdc08ef8/api/python/lief/__init__.pyi#L27.

It almost looked possible that we could just switch to lief.ELF.parse(file) instead, but this also returns a Binary on the Python API:

We start by the ELF format. To create an ELF.Binary from a file we just have to give its path to the lief.parse() or lief.ELF.parse() functions

With the Python API, these functions have the same behaviour but in C++, LIEF::Parser::parse() will return a pointer to a LIEF::Binary object whereas LIEF::ELF::Parser::parse() will return a LIEF::ELF::Binary object

If we keep the original lief.parse(file) and use a small Lief patch it is fixable:

diff --git a/api/python/lief/__init__.pyi b/api/python/lief/__init__.pyi
index fe256f55..3872e579 100644
--- a/api/python/lief/__init__.pyi
+++ b/api/python/lief/__init__.pyi
@@ -63,6 +63,8 @@ class Binary(Object):
     def get_function_address(self, function_name: str) -> object: ...
     def get_symbol(self, symbol_name: str) -> lief.Symbol: ...
     def has_symbol(self, symbol_name: str) -> bool: ...
+    def has_section(self, section_name: str) -> bool: ...
+    def has_section_with_offset(self, offset: int) -> bool: ...
     def offset_to_virtual_address(self, offset: int, slide: int = ...) -> object: ...
     @overload
     def patch_address(self, address: int, patch_value: List[int], va_type: lief.Binary.VA_TYPES = ...) -> None: ...

Easiest fix will just be to enumerate the sections so I'll push that for now

@willcl-ark willcl-ark force-pushed the 18603_si_check branch 2 times, most recently from 3893368 to b6ec259 Compare June 9, 2023 08:51
@DrahtBot
Copy link
Contributor

🤔 There hasn't been much activity lately and the CI seems to be failing.

If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn't been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.

@laanwj laanwj self-requested a review April 14, 2024 21:42
@laanwj
Copy link
Member

laanwj commented Apr 14, 2024

cc @laanwj you might have some thoughts here?

i think this script makes sense, it's good to check that objects with special instructions don't contain a startup section (that will always run). Agree it should be part of the build process (guix at least), and ideally the CI, to be useful.

@laanwj
Copy link
Member

laanwj commented Apr 16, 2024

BTW: along with some capstone magic like #29874, and a bare-bones DWARF (or linker map) parser to get function ranges, it would be possible to make a check like this that runs in the final binary, checking that special instructions only appear in allow-listed functions.

fixes: bitcoin#18603

Script will check *.o "special compilation units" (based on special
instructions) and check that they do not contain .text.startup sections.
@willcl-ark
Copy link
Member Author

BTW: along with some capstone magic like #29874, and a bare-bones DWARF (or linker map) parser to get function ranges, it would be possible to make a check like this that runs in the final binary, checking that special instructions only appear in allow-listed functions.

This sounds very cool (and more practically-useful than the check here perhaps).

i think this script makes sense, it's good to check that objects with special instructions don't contain a startup section (that will always run). Agree it should be part of the build process (guix at least), and ideally the CI, to be useful.

I can add it to the CI in this change if that's preferred? But I'm not familiar with adding steps to guix (and it sounds like it could be tricky based on @fanquake 's comment above).

@fanquake
Copy link
Member

I can add it to the CI in this change if that's preferred?

I'm not sure. We ended up removing security/symbol checks from the CI, because most of them wont pass there, and even if they do pass in the CI, that doesn't really matter unless they pass in Guix too. I'm going to think about this a bit more, now that there is some renewed interest, and ideas for new checks (£29874).

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 2, 2024

🐙 This pull request conflicts with the target branch and needs rebase.

@DrahtBot
Copy link
Contributor

⌛ There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@willcl-ark
Copy link
Member Author

Closing for now, as there's not much interest currently. Would be happy to re-open in the future.

@laanwj
Copy link
Member

laanwj commented Dec 3, 2024

What was the blocker here?

Is the problem "how to check just-generated .o files" in the CI/GUIX build?

This sounds very cool (and more practically-useful than the check here perhaps).

Revisiting this, i'm not sure. Checking the final binary would be more practical with integration in the process (it could run with the other symbol/security checks), but with optimizations like LTO it's not necessarily true that the different init functions end up in different places. So there's something to be said for checking the intermediate objects, too, before things are 'mixed'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scripts: check for .text.startup sections
5 participants