generate_checksum() does not close __data__ before reassigning it #356

@adang1345

I am using pefile 2022.5.30 on Windows. Consider the following code, which corrects the header checksum of a PE file.

import pefile

with pefile.PE('libffi-8.dll') as file:
    file.OPTIONAL_HEADER.CheckSum = file.generate_checksum()
    contents = file.write()
with open('libffi-8.dll', 'wb') as file:
    file.write(contents)

With CPython 3.10.7, this code runs without error. With PyPy3.9, I get the following error.

Traceback (most recent call last):
  File "sandbox0.py", line 6, in <module>
    with open('libffi-8.dll', 'wb') as file:
OSError: [Errno 22] Invalid argument: 'libffi-8.dll'

The error happens because the first line of generate_checksum() reassigns self.__data__ to a different object without closing the old one first. If self.__data__ is a memory-mapped file, the mapping stays open, which seems to keep a lock on the file so that later code cannot open it for writing.

def generate_checksum(self):
    self.__data__ = self.write()
    ....

With CPython, no error happens because reference counting closes the memory-mapped file as soon as the last reference to it is dropped. PyPy fails because its garbage collector does not close the memory-mapped file immediately. We should not rely on any particular garbage collection implementation. If I change the beginning of generate_checksum() to the following, I no longer see the issue.
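The underlying behavior can be reproduced without pefile. The following is a minimal sketch, not pefile's code: the scratch file, the `holder` dict (playing the role of `self.__data__`), and the variable names are all assumptions made for illustration. Keeping the extra `mapped` reference alive mimics a collector that has not yet finalized the old object.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (hypothetical stand-in for the DLL).
fd, path = tempfile.mkstemp()
os.write(fd, b"MZ" + b"\x00" * 62)
os.close(fd)

with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

holder = {"data": mapped}    # plays the role of self.__data__
holder["data"] = mapped[:]   # rebinding to a bytes copy closes nothing

# The old mapping is still open as long as something references it and
# no finalizer has run yet.
still_open = not mapped.closed

mapped.close()               # an explicit close releases it deterministically
closed_after = mapped.closed
os.remove(path)
```

Rebinding alone leaves the mapping open; only the explicit `close()` call guarantees its release regardless of the interpreter.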

def generate_checksum(self):
    data = self.write()
    self.close()
    self.__data__ = data
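The copy, close, then rebind order used above can be sketched in isolation. This is an illustrative stand-in under stated assumptions, not pefile's implementation: `MappedImage`, its `data` attribute, and `refresh()` are hypothetical names chosen for the example.

```python
import mmap
import os
import tempfile


class MappedImage:
    """Hypothetical stand-in for pefile.PE; holds either an mmap or bytes."""

    def __init__(self, path):
        with open(path, "rb") as f:
            self.data = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def write(self):
        # Return an independent bytes copy of the current contents.
        return bytes(self.data[:])

    def close(self):
        # Only an mmap needs closing; plain bytes have nothing to release.
        if isinstance(self.data, mmap.mmap):
            self.data.close()

    def refresh(self):
        data = self.write()  # copy first, while the mapping is still valid
        self.close()         # release the mapping explicitly
        self.data = data     # only then rebind the attribute


# Usage: map a scratch file, refresh, then overwrite the same file.
fd, path = tempfile.mkstemp()
os.write(fd, b"MZ" + b"\x00" * 62)
os.close(fd)

image = MappedImage(path)
old_mapping = image.data
image.refresh()
mapping_closed = old_mapping.closed

with open(path, "wb") as out:
    out.write(image.data)
os.remove(path)
```

Because the mapping is closed before the attribute is rebound, reopening the file for writing does not depend on when the interpreter finalizes the old object.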
