Unable to find __INITIAL_DATA__ #330

Open
biegehydra opened this issue May 4, 2025 · 1 comment

biegehydra commented May 4, 2025

Even though the example HTML defines window.__INITIAL_DATA__, the script executed through driver.execute_script doesn't find it and returns 'Not Defined'.

https://github.com/biegehydra/initial_data_bug

Python File

import asyncio
import os
import platform
import shutil
from selenium_driverless import webdriver

async def main():
    options = webdriver.ChromeOptions()
    driver = None

    if platform.system() == "Windows":
        possible_paths = [
            os.path.expandvars(r"%ProgramFiles%\Google\Chrome\Application\chrome.exe"),
            os.path.expandvars(r"%ProgramFiles(x86)%\Google\Chrome\Application\chrome.exe"),
            os.path.expandvars(r"%LocalAppData%\Google\Chrome\Application\chrome.exe")
        ]
        chrome_path = next((path for path in possible_paths if os.path.exists(path)), None)
    else:
        chrome_path = shutil.which("google-chrome")

    if chrome_path is None:
        print("Error: Could not find Google Chrome binary.")
        return

    print(f"Using Chrome binary at: {chrome_path}")
    options.binary_location = chrome_path
    try:
        driver = await webdriver.Chrome(options=options)
        print("WebDriver initialized.")

        # Build a file:// URL from the absolute path to example.html.
        html_file_path = os.path.abspath("example.html")
        # Strip any leading slash so POSIX paths don't produce "file:////...".
        file_url = "file:///" + html_file_path.replace(os.sep, "/").lstrip("/")

        print(f"Navigating to: {file_url}")
        await driver.get(file_url)
        print("Navigation complete.")

        # Wait a moment for scripts to potentially execute (optional)
        await asyncio.sleep(1)

        print("Executing script to get window.__INITIAL_DATA__...")
        script = "return typeof window.__INITIAL_DATA__ !== 'undefined' ? window.__INITIAL_DATA__ : 'Not Defined';"
        initial_data = await driver.execute_script(script)

        print("\n--- Extracted Data ---")
        print(initial_data)
        print("----------------------")

    except Exception as e:
        print(f"\nAn error occurred: {e}")
        try:
            # Attempt to get page source on error for debugging
            source = await driver.page_source
            print("\n--- Page Source on Error ---")
            print(source[:1000] + "...") # Print first 1000 chars
            print("--------------------------")
        except Exception as pe:
            print(f"Could not get page source after error: {pe}")

    finally:
        if driver:
            print("\nClosing WebDriver...")
            await driver.quit()
            print("WebDriver closed.")


if __name__ == "__main__":
    asyncio.run(main())

HTML File

<!DOCTYPE html>
<html>
<head>
    <title>Test Page</title>
    <script type="text/javascript">
        window.__INITIAL_DATA__ = {
            "user": "testUser",
            "sessionId": 12345,
            "settings": {
                "theme": "dark",
                "notifications": true
            }
        };
        console.log("window.__INITIAL_DATA__ defined:", window.__INITIAL_DATA__);
    </script>
</head>
<body>
    <h1>Test Page for window.__INITIAL_DATA__</h1>
    <p>Check the console and the script output.</p>
</body>
</html> 

Output

Using Chrome binary at: C:\Program Files\Google\Chrome\Application\chrome.exe
WebDriver initialized.
Navigating to: file:///C:/Users/sweed/Downloads/initial_data_bug/example.html
Navigation complete.
Executing script to get window.__INITIAL_DATA__...
c:\Python\Lib\site-packages\selenium_driverless\types\deserialize.py:175: UserWarning: got execution_context_id and unique_context=True, defaulting to execution_context_id
  warnings.warn("got execution_context_id and unique_context=True, defaulting to execution_context_id")
--- Extracted Data ---
Not Defined
----------------------

Closing WebDriver...
WebDriver closed.

biegehydra (Author) commented

This was solved by passing unique_context=False, but can you confirm that it is expected behaviour for this not to work when no value is provided for unique_context? The warning says "got execution_context_id and unique_context=True, defaulting to execution_context_id", which makes me think unique_context should be ignored.
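
For anyone hitting the same thing, here is a minimal sketch of the workaround described above. The helper name read_initial_data is hypothetical; the only library call it relies on is driver.execute_script with the unique_context=False keyword mentioned in this comment. My guess is that the default runs the script in an isolated context that does not share the page's globals, but that explanation is an assumption and is not confirmed in this thread.

```python
# Same check as in the report, but returning null (None in Python)
# instead of a marker string when the global is missing.
SCRIPT = (
    "return typeof window.__INITIAL_DATA__ !== 'undefined' "
    "? window.__INITIAL_DATA__ : null;"
)


async def read_initial_data(driver):
    """Hypothetical helper: read window.__INITIAL_DATA__ from the page.

    `driver` is expected to be a selenium_driverless Chrome instance,
    already navigated to the page that defines the global.
    """
    # unique_context=False is the setting reported above to make the
    # lookup succeed; it is passed explicitly rather than left at the default.
    return await driver.execute_script(SCRIPT, unique_context=False)
```

In the reproduction script, this would replace the direct call with `initial_data = await read_initial_data(driver)` after `driver.get(file_url)`.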
