Skip to content

Conversation

becojo
Copy link
Contributor

@becojo becojo commented Apr 13, 2023

Description:

Add detection for OpenAI API keys. Interestingly, their API keys look made up of base64 and contains the encoded string "OpenAI" (T3BlbkFJ) between 2 blocks of random bytes so this rule should be pretty precise.

Checklist:

  • Does your PR pass tests?
  • Have you written new tests for your changes?
  • Have you lint your code locally prior to submission?

@zricethezav
Copy link
Collaborator

Can confirm, looks like T3BlbkFJ is present in all keys. This might need to be checked periodically as they might change the signature?

@zricethezav zricethezav merged commit 7dc9ba4 into gitleaks:master Jun 14, 2023
@rgmz
Copy link
Contributor

rgmz commented Jun 16, 2023

@becojo Just curious, what was the reason for removing the sk- keyword?

@becojo
Copy link
Contributor Author

becojo commented Jun 16, 2023

@rgmz It was there mostly because I copy pasted another rule. I figured sk- is pretty short and is likely to be present in a lot of files that don't contain an API key, wasting time running the regular expression for nothing.

@becojo
Copy link
Contributor Author

becojo commented Jun 16, 2023

@zricethezav This might need to be checked periodically as they might change the signature?

They have API keys in the scope of their bug bounty program I would imagine they make some noise if they change the format.

@becojo becojo deleted the openai branch June 16, 2023 16:03
@rgmz
Copy link
Contributor

rgmz commented Jun 16, 2023

Fair enough. The T3BlbkFJ does seem to be a constant, however, it is not guaranteed:

OpenAI API Keys are of the form sk-\w{48}. Anything not of this form will be discarded.

...That being said, the fact that it's OpenAI base64-encoded makes me doubt that they would remove it.


I just wanted to clarify because I noticed it was in gitleaks.toml but not openai.go (#1200). Thanks for confirming!

@becojo
Copy link
Contributor Author

becojo commented Jun 16, 2023

oh, good catch my bad

@rgmz
Copy link
Contributor

rgmz commented Aug 19, 2023

Can confirm, looks like T3BlbkFJ is present in all keys. This might need to be checked periodically as they might change the signature?

Azure's OpenAI Service seems to use a different token format. I discovered an instance where a private Azure endpoint was authenticated with a 32 alphanumeric characters. It's possible the api_key used an Azure AD token; I'm not familiar enough to with Azure to say for certain.

"""
Reference: https://github.com/openai/openai-python#microsoft-azure-active-directory-authentication
"""
import openai

# Setup parameters
openai.api_type = "azure_ad" # I need to double-check what api_type was specified.
openai.api_base = "https://example-endpoint.openai.azure.com/"
openai.api_key = "se23rtrbvtydjt30r0cwspdcpg8a4548"

(I'd expect this to get picked up by the generic rule.)

alayne222 pushed a commit to alayne222/gitleaks that referenced this pull request May 28, 2025
* Add detection for OpenAI API keys

* Remove `sk-` keyword
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants