Drop Adobe Illustrator (.ai) detection support #743

Merged: 1 commit, May 18, 2025

Conversation

@Borewit (Collaborator) commented Feb 25, 2025

Drop Adobe Illustrator (.ai) file detection support, as the detection mechanism is of very poor quality (as raised in #582).

Skipping 1350 bytes and then searching the next 10 kB for keywords is not a reliable mechanism.
There is no fixed offset at 1350 bytes, and the 10 kB window is not based on any meaningful foundation.
Nor do the keywords necessarily come from a context which reliably indicates that this is an Adobe Illustrator file.
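
To make the objection concrete, here is a minimal sketch of the kind of heuristic described above. It is not the code removed by this pull request: the keyword `AIPrivateData`, the helper names, and the synthetic test files are assumptions made for illustration only; the offset and window size are the figures quoted above.

```typescript
// Hypothetical marker: the thread does not list the actual keywords.
const ASSUMED_KEYWORD = 'AIPrivateData';
const SKIP_BYTES = 1350; // offset quoted in the description
const WINDOW_BYTES = 10 * 1024; // "10kB" search window quoted in the description

// Naive byte-sequence search; returns the first index of needle, or -1.
function indexOfBytes(haystack: Uint8Array, needle: Uint8Array): number {
  outer: for (let i = 0; i + needle.length <= haystack.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// The heuristic: skip a fixed number of bytes, then scan a fixed-size window.
function looksLikeIllustrator(data: Uint8Array): boolean {
  const searchWindow = data.subarray(SKIP_BYTES, SKIP_BYTES + WINDOW_BYTES);
  const keyword = new TextEncoder().encode(ASSUMED_KEYWORD);
  return indexOfBytes(searchWindow, keyword) !== -1;
}

// Builds a synthetic PDF-like buffer with the keyword at a chosen offset.
function makeFile(markerOffset: number, totalSize: number): Uint8Array {
  const bytes = new Uint8Array(totalSize).fill(0x20);
  bytes.set(new TextEncoder().encode('%PDF-1.5'), 0);
  bytes.set(new TextEncoder().encode(ASSUMED_KEYWORD), markerOffset);
  return bytes;
}

const insideWindow = looksLikeIllustrator(makeFile(2000, 20000));
const beforeWindow = looksLikeIllustrator(makeFile(500, 20000));
const afterWindow = looksLikeIllustrator(makeFile(15000, 20000));
```

In this sketch the result depends only on where the keyword happens to sit: the same marker is found at offset 2000 but missed at offsets 500 and 15000, and any ordinary PDF whose text contains the keyword inside the window would be reported as an Illustrator file.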

My second objection is that this is based on (poor) text file parsing, as PDF is a text-based format.
Text-based formats are not in the scope of file-type.

Reverses: #323
Resolves: #582

Related:

@Borewit self-assigned this Feb 25, 2025

@sindresorhus (Owner) commented:

Agreed, but this is a breaking change, and we just did a major version, so I think this should wait a bit.

@sindresorhus (Owner) commented:

You could maybe consider AI detection for your XML detection library as I think I remember that you can detect Illustrator files by looking at its XMP metadata, which is XML.

@Borewit added the "API change" label ("Major change, dependents may need to update their code") Feb 25, 2025

@Borewit (Collaborator, Author) commented Feb 25, 2025

> Agreed, but this is a breaking change, and we just did a major version, so I think this should wait a bit.

No problem, I am not in a rush.

> You could maybe consider AI detection for your XML detection library as I think I remember that you can detect Illustrator files by looking at its XMP metadata, which is XML.

Something like that, starting with basic PDF decoding, for which I could maybe use read-next-line to iterate over the lines. I am not very familiar with the PDF file format, though.
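
As a rough illustration of the XMP idea discussed above (not code from this pull request or from any existing library), the sketch below looks for an XMP packet in the raw bytes and inspects its `xmp:CreatorTool` value. The function names are hypothetical, and the sketch assumes the metadata is stored uncompressed as UTF-8 and that Illustrator records its name in `xmp:CreatorTool`. A PDF exported from Illustrator may carry the same value, so this is a hint rather than proof.

```typescript
// Returns the first <x:xmpmeta>...</x:xmpmeta> block found in the text, if any.
function extractXmpPacket(text: string): string | undefined {
  const start = text.indexOf('<x:xmpmeta');
  if (start === -1) return undefined;
  const endTag = '</x:xmpmeta>';
  const end = text.indexOf(endTag, start);
  if (end === -1) return undefined;
  return text.slice(start, end + endTag.length);
}

// Reads xmp:CreatorTool, which may be written as an element or as an attribute.
function creatorTool(xmp: string): string | undefined {
  const element = /<xmp:CreatorTool>([^<]*)<\/xmp:CreatorTool>/.exec(xmp);
  if (element) return element[1];
  const attribute = /xmp:CreatorTool="([^"]*)"/.exec(xmp);
  return attribute?.[1];
}

// Assumption: an Illustrator file names the application in xmp:CreatorTool.
function xmpSuggestsIllustrator(data: Uint8Array): boolean {
  const text = new TextDecoder().decode(data);
  const xmp = extractXmpPacket(text);
  if (xmp === undefined) return false;
  return (creatorTool(xmp) ?? '').includes('Illustrator');
}

// Synthetic samples, written by hand for this example.
const wrap = (tool: string): Uint8Array =>
  new TextEncoder().encode(
    `%PDF-1.5\n<x:xmpmeta xmlns:x="adobe:ns:meta/">` +
      `<xmp:CreatorTool>${tool}</xmp:CreatorTool></x:xmpmeta>\n`,
  );

const fromIllustrator = xmpSuggestsIllustrator(wrap('Adobe Illustrator 27.0 (Macintosh)'));
const fromOtherTool = xmpSuggestsIllustrator(wrap('Some Other Tool'));
const withoutXmp = xmpSuggestsIllustrator(new TextEncoder().encode('%PDF-1.5\n'));
```

Unlike a fixed-offset keyword scan, this ties the decision to a named metadata field, but it is still text parsing and therefore fits a separate XML-oriented detector better than file-type itself.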

@Borewit force-pushed the drop-ai-support branch from 8707213 to 81c27a3 May 18, 2025 17:17
@sindresorhus merged commit af169f3 into main May 18, 2025
6 checks passed
@sindresorhus deleted the drop-ai-support branch May 18, 2025 21:21
Linked issue: .pdf files can be detected as .ai based on content