[youtube] API-only mode/fallback (experimental) #682

Closed
wants to merge 46 commits into from

coletdjnz
Member

@coletdjnz coletdjnz commented Aug 12, 2021

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

WIP & Experimental. This may not be something we merge all of, if any.

Adding a mode for the YouTube extractors to only use the Innertube API without any webpage downloading.

This might be useful for avoiding rate limiting - the webpage requests tend to be more heavily rate-limited.

API-only mode will also provide a fallback if the webpage fails to download. This is especially useful when the webpage requests get rate-limited but the API requests do not (something I have personally seen).
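
The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `DownloadError`, `extract_with_fallback`, and the two fetcher callables are hypothetical names standing in for the real webpage and Innertube API paths.

```python
from typing import Callable


class DownloadError(Exception):
    """Hypothetical stand-in for the error raised when a page download fails."""


def extract_with_fallback(
    fetch_webpage: Callable[[str], dict],
    fetch_api: Callable[[str], dict],
    item_id: str,
    api_only: bool = False,
) -> dict:
    """Prefer the webpage unless api_only is set; fall back to the API on failure."""
    if not api_only:
        try:
            return fetch_webpage(item_id)
        except DownloadError:
            # The webpage request failed (for example, it was rate-limited),
            # so continue and try the API request instead.
            pass
    return fetch_api(item_id)
```

With `api_only=True` the webpage fetcher is never called, which corresponds to the pure API-only mode; with the default, the API is only used when the webpage download raises.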

TODO & Known Issues:

  • Auth issues: Multi-channel account (delegated session id) and multi-account cookies (SESSION_INDEX) are not available as we don't have ytcfg
  • YouTubeIE implementation: deal with not having player url from initial webpage
  • Need to test every feed/tab/playlist etc. types we can think of
  • extractor arg - currently there will be separate arg for YoutubeIE and YoutubeTabIE. Do we want one unified toggle?
  • if a playlist does not exist then _extract_response will throw an ExtractorError with expected=False
  • this TODO

@coletdjnz coletdjnz changed the title [youtube] API-only mode (experimental) [youtube] API-only mode/fallback (experimental) Aug 12, 2021
@coletdjnz
Member Author

coletdjnz commented Aug 12, 2021

Auth issues: Multi-channel account (delegated session id) and multi-account cookies (SESSION_INDEX) are not available as we don't have ytcfg

Some options (only use if authenticated):

  • https://www.youtube.com/getDatasyncIdsEndpoint:

    • will get correct sync id for account
    • will give data sync ids for all accounts
    • does not give session index (don't think it is possible to derive since the list changes)
  • v1/account/accounts_list:

    • will give data sync ids of all channels
    • can derive session id from "accountSigninToken" url
    • does not tell us which account the cookies are for.
    • does not appear to give other accounts that are logged in?
  • https://www.youtube.com/getAccountSwitcherEndpoint:

    • Parts of response are similar to above
    • Shows all accounts & channels

Curious as to how the non-web based clients get this data. OAuth?

For interest: ytmusicapi requires the user to pass the datasyncid (and session index): https://ytmusicapi.readthedocs.io/en/latest/usage.html#brand-accounts

Random note: datasyncid/accountStateToken/DELEGATED_SESSION_ID/onBehalfOfUser/pageIdToken are pretty much all the same thing in this context...

Edit: I've gone with the third option. These "hacky" endpoints, as YouTube itself calls them, seem to live up to the name.
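
For context on why these two values matter, here is a rough sketch of how they might end up on an API request. The header names `X-Goog-PageId` and `X-Goog-AuthUser` are my assumption about how the web client sends the delegated session id and session index; they are not stated in this thread, and `build_auth_headers` is a hypothetical helper.

```python
from typing import Optional


def build_auth_headers(
    delegated_session_id: Optional[str],
    session_index: Optional[int],
) -> dict:
    """Build the account-selection headers (header names are assumptions)."""
    headers = {}
    if delegated_session_id:
        # Selects the channel on a multi-channel account.
        headers["X-Goog-PageId"] = delegated_session_id
    if delegated_session_id or session_index is not None:
        # Selects which logged-in account the cookies refer to;
        # defaults to the first account when no index is known.
        index = session_index if session_index is not None else 0
        headers["X-Goog-AuthUser"] = str(index)
    return headers
```

Without ytcfg neither argument is known, so the function returns no headers and requests go out as the default account and channel.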

@coletdjnz
Member Author

Another issue:
There appears to be only one working STS value (i.e., using an older player version with its STS value will not work). Unless we have a way to get the STS or player version in an API-only way, we are stuck with using only non-JS-based player clients.

@coletdjnz
Member Author

coletdjnz commented Aug 21, 2021

Some changes:

  • When the player url fails to be extracted by other means, we fall back to extracting it from the iframe_api page.
  • We do not extract the player_url/player js/sts if the client does not require it
    • This is configured by setting REQUIRES_JS_PLAYER in the hard-coded ytcfgs. Default is True.
    • We can override this decision globally by using player_js=no_require/require arg (TODO: better name)
  • if a client requires sts/player js and we fail to extract it, that client will be skipped.
    • TODO: maybe don't want this behaviour since in non-signature related cases it doesn't matter if we have sts/player js
  • player url can be also extracted from player_ytcfg (e.g. embedded, web_music)
  • we can disable the initial webpage download with player_skip=webpage.
    • TODO: do we want this under player_skip=configs?
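
The skip logic in the list above could look something like this. It is a sketch under assumptions: only the `REQUIRES_JS_PLAYER` key, its default of True, and the `require`/`no_require` override values come from this PR; the client names and both helper functions are made up for illustration.

```python
from typing import List, Optional

# Hypothetical client table; the client names are placeholders.
CLIENTS = {
    "js_client": {"REQUIRES_JS_PLAYER": True},
    "nojs_client": {"REQUIRES_JS_PLAYER": False},
    "default_client": {},  # key absent, so the default (True) applies
}


def requires_js_player(client: str, override: Optional[str] = None) -> bool:
    """Decide whether a client needs the player JS/STS, honouring the global override."""
    if override == "require":
        return True
    if override == "no_require":
        return False
    return CLIENTS[client].get("REQUIRES_JS_PLAYER", True)


def usable_clients(
    requested: List[str],
    have_player_js: bool,
    override: Optional[str] = None,
) -> List[str]:
    """Drop clients that need the player JS when it could not be extracted."""
    return [
        client
        for client in requested
        if have_player_js or not requires_js_player(client, override)
    ]
```

When the player JS is unavailable, only clients that do not require it survive, unless `no_require` is passed, in which case nothing is skipped.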

Currently, to enable a pure api-only mode for YoutubeIE, we can pass --extractor-args youtube:player_skip=webpage,configs;player_js=no_require.
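
To make the shape of that argument string explicit, here is a small standalone parser. It is not the project's own option parser, just a hypothetical helper (`parse_extractor_args`) showing how the string breaks down: extractor key before the colon, `;` between arguments, `,` between values. Note that `player_js` is an argument introduced by this PR.

```python
from typing import Dict, List, Tuple


def parse_extractor_args(spec: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split 'KEY:arg=v1,v2;arg2=v3' into the extractor key and its arguments."""
    ie_key, _, rest = spec.partition(":")
    args: Dict[str, List[str]] = {}
    for pair in rest.split(";"):
        if not pair:
            continue  # tolerate empty segments such as a trailing ';'
        name, _, values = pair.partition("=")
        args[name] = values.split(",") if values else []
    return ie_key, args


key, args = parse_extractor_args(
    "youtube:player_skip=webpage,configs;player_js=no_require"
)
print(key)   # youtube
print(args)  # {'player_skip': ['webpage', 'configs'], 'player_js': ['no_require']}
```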

@coletdjnz
Member Author

Slightly unrelated: you can get a 429 for downloading too many subtitles in a short amount of time. However, the 429 appears to affect only subtitle downloads.

@coletdjnz
Member Author

coletdjnz commented Sep 27, 2021

Some extra things to fix:

  • MP -> OLAK playlist resolve is broken
  • fix/improve tests
  • generalise extract_webpage and extract_response in some way (might make for another PR though)

@coletdjnz
Member Author

Superseded by #1122

@coletdjnz coletdjnz closed this Sep 29, 2021
@coletdjnz coletdjnz deleted the api-only-real branch March 4, 2022 06:03