
feat: add npm download statistics tracking system #366


Merged
merged 11 commits into from
Jun 23, 2025

Conversation

Janpot
Member

@Janpot Janpot commented Jun 20, 2025

Summary

  • Add weekly GitHub Action to collect npm download statistics
  • Implement TypeScript script with parallel API fetching for performance
  • Store historical data grouped by major version for efficiency
  • Support for @mui/material and @base-ui/components packages

Features

  • Weekly automation: Runs every Sunday at midnight UTC
  • Manual trigger: Can be triggered manually via workflow_dispatch
  • Parallel fetching: Processes multiple packages simultaneously
  • Historical tracking: Maintains timestamped download history
  • Major version grouping: Aggregates downloads by major version using semver
  • Automatic commits: Commits and pushes updated data files
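The features above map onto a GitHub Actions workflow roughly like the following. This is a sketch only: the workflow name, Node version, script path, and commit message are assumptions, not the PR's actual contents.

```yaml
name: npm-download-stats

on:
  schedule:
    - cron: '0 0 * * 0' # every Sunday at midnight UTC
  workflow_dispatch: # allow manual runs

permissions:
  contents: write # required to push the updated data files

jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      # Hypothetical script path; the real entry point may differ.
      - run: npx tsx scripts/collect-npm-downloads.ts
      - run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add data/npm-versions
          git diff --cached --quiet || git commit -m "Update npm download stats"
          git push
```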

Data Structure

Data is stored in data/npm-versions/{package}.json in the following format:

{
  "package": "@mui/material",
  "timestamps": [1234567890],
  "downloads": {
    "5": [1000000],
    "6": [2000000]
  }
}
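A sketch of how the weekly script might append one datapoint to this shape. The `appendDatapoint` helper and the zero-padding of newly seen majors are illustrative assumptions, not the PR's actual code.

```javascript
// Append one weekly datapoint to the column-store shape shown above.
// `downloadsByMajor` maps a major version string to its weekly download count.
function appendDatapoint(stats, timestamp, downloadsByMajor) {
  stats.timestamps.push(timestamp);
  // Assumption: majors that appear for the first time are zero-padded so
  // every column stays the same length as `timestamps`.
  for (const major of Object.keys(downloadsByMajor)) {
    if (!stats.downloads[major]) {
      stats.downloads[major] = new Array(stats.timestamps.length - 1).fill(0);
    }
  }
  // Push this week's count for every known major (0 if it had no downloads).
  for (const major of Object.keys(stats.downloads)) {
    stats.downloads[major].push(downloadsByMajor[major] ?? 0);
  }
  return stats;
}
```

Because each major is a parallel array indexed against `timestamps`, reading the file back into a chart only requires one `JSON.parse`.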

Test Results

✅ Successfully tested with both packages
✅ Proper directory structure created automatically
✅ Historical data updates working correctly

🤖 Generated with Claude Code

- Add weekly GitHub Action to collect npm download stats
- Implement TypeScript script with parallel API fetching
- Store historical data grouped by major version
- Support for @mui/material and @base-ui/components packages
- Automatic git commits with collected data

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@Janpot Janpot added the scope: code-infra Specific to the core-infra product label Jun 20, 2025
@Janpot Janpot requested a review from a team June 20, 2025 15:25
Janpot and others added 2 commits June 20, 2025 21:04
Co-authored-by: Michał Dudak <[email protected]>
Signed-off-by: Jan Potoms <[email protected]>
Co-authored-by: Michał Dudak <[email protected]>
Signed-off-by: Jan Potoms <[email protected]>

// Determine file path
const dataDir = join(process.cwd(), 'data', 'npm-versions');
const filePath = join(dataDir, `${packageName}.json`);

@brijeshb42 brijeshb42 Jun 20, 2025


The JSONL file format seems more apt for this use case than JSON.
There's no need to read the existing contents, merge in the new data, and write everything back.
You just append the new JSON string as a line at the end of the file.

const fs = require('fs');
const path = require('path');

// Your JSON object
const obj = {
  id: 123,
  name: "Example",
  active: true
};

// Convert the object to a one-line JSON string
const jsonLine = JSON.stringify(obj);

// Path to the .jsonl file
const filePath = path.join(__dirname, 'data.jsonl');

// Append it as a new line to the file
fs.appendFile(filePath, jsonLine + '\n', (err) => {
  if (err) {
    console.error('Error writing to file:', err);
  } else {
    console.log('JSON object appended successfully!');
  }
});

Member

@dav-is dav-is Jun 20, 2025


I think we should be optimizing for the read of this data, not the write. Data is only written once per week as a background task, whereas it might be read much more often than that (like in a dashboard). I think the number of datapoints here (52/year) isn't enough to justify using JSONL. Once we have 5 years of datapoints (260 points per package major), we could archive old data and incorporate it into a "by month" dataset, so I don't think memory usage is a long-term concern either.


JSONL is optimal for reading as well, since we can read it line by line for the data points.
But I agree that, given it's weekly, there's not much point in optimizing this early.
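The line-by-line reading mentioned here can be sketched like this (`parseJsonl` is a hypothetical helper, not anything from the PR):

```javascript
// Parse a .jsonl string: one JSON document per line, blank lines skipped.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}
```

For very large files the same idea works incrementally with Node's `readline` over a read stream, so the whole file never has to sit in memory at once.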

Member

@dav-is dav-is Jun 20, 2025


I agree it's optimal performance-wise, but it has worse DX compared to simply JSON.parse(stats) or import stats from './stats.json' (parsed at build time by a webpack loader).

Member Author

@Janpot Janpot Jun 21, 2025


What I was initially aiming to optimize for was:

  1. Simple, cheap, and maintenance free: no servers or databases.
  2. Read performance: the plan is to read this directly from raw.githubusercontent.com in the infra dashboard, so I want the file to be small.

@dav-is that is exactly where my mind was when building this. I've seen those npm API results range from a few KB up to a few hundred KB. I picked this format (it's basically a column store) because it's so well size-optimized that we can avoid ever building in rollover or expiration logic.

The drawback, though, is that we lose all the individual version information. I'm removing the per-major aggregation and will do that on the client instead.
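The client-side per-major aggregation could look roughly like this. It assumes the raw file stores per-version counts (e.g. `{ "5.15.0": 1000 }`); the PR uses the semver package, while this sketch just splits off the major component.

```javascript
// Aggregate per-version download counts into per-major totals.
// Assumption: keys are plain release versions like "5.15.0".
function groupByMajor(downloadsByVersion) {
  const byMajor = {};
  for (const [version, count] of Object.entries(downloadsByVersion)) {
    // The real script uses semver; splitting on "." suffices for release versions.
    const major = version.split('.')[0];
    byMajor[major] = (byMajor[major] ?? 0) + count;
  }
  return byMajor;
}
```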

Janpot added 8 commits June 21, 2025 07:54
Introduced a fetchWithRetry function to handle network errors and transient server issues when fetching NPM package stats. This improves reliability by retrying failed requests up to three times with a delay.
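The retry behavior described in that commit can be sketched as follows. The real script's signature and delays may differ; the fetch implementation is injected here so the sketch needs no network access.

```javascript
// Retry a fetch-like call on network errors or non-OK responses.
// `doFetch` is injected (e.g. the global fetch) to keep the sketch testable.
async function fetchWithRetry(doFetch, url, retries = 3, delayMs = 1000) {
  for (let attempt = 1; attempt <= retries; attempt += 1) {
    try {
      const res = await doFetch(url);
      if (!res.ok) {
        throw new Error(`HTTP ${res.status}`);
      }
      return res;
    } catch (err) {
      if (attempt === retries) {
        throw err; // out of attempts, surface the last error
      }
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```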
@Janpot Janpot merged commit c54cf4f into master Jun 23, 2025
7 checks passed
@Janpot Janpot deleted the feat/npm-download-stats-tracker branch June 23, 2025 10:55
Janpot added a commit that referenced this pull request Jun 23, 2025
Member

@LukasTy LukasTy left a comment


Nice initiative and great use of Claude Code. 👍

On a related note: have you considered adding any of the X packages to the mix? 🤔

Comment on lines +9 to +10
permissions:
contents: write
Member


Nitpick: If I'm not mistaken, we usually set permissions on the job level. 🤔

I.e.:

permissions: {}

instead of this.
And the following after L14:

permissions:
  contents: write

@Janpot
Member Author

Janpot commented Jun 23, 2025

On a related note: have you considered adding any of the X packages to the mix? 🤔

@LukasTy Yes, I'll add them, but I'm going to put this in a personal repo for now. I know it's possible here, but I don't want to work around branch protection rules for this, and I don't want to create overly powerful bypasses just for this functionality.
