Releases: medialab/hyphe

Hot summer 2025, crawling from within

07 Jul 15:27

boogheta

ChangeLog:

Allow starting individual and multiple crawls directly from the NETWORK page
Make the MANAGE TAGS page also applicable to UNDECIDED webentities (#506 #441)
Hide www from the DEFINE WEBENTITY prefix slider (#510)
Minor frontend fixes (correct crawl statuses for startpages in WebEntity's list of pages, fixed durations for crawls in Monitor all crawls page, proper permalinks to wayback machine for INA Web Archives)

Full Changelog: v1.12.1...v1.12.2

Hot 2025, the post Skybox era

03 Jul 14:05

boogheta

ChangeLog:

Add a button to start a crawl directly from the Network page
Add frontend ways to cancel all pending crawls and to cancel/recrawl individual crawls from the Monitor all crawls page (+ fix the API's crawl.cancel_all route to also cancel crawls not yet scheduled within Scrapy and set their crawl status appropriately)
Improve reviewed crawls button in the Monitor all crawls page
Add default webentity creation rules for Bluesky and X user accounts, as well as skyblogs for webarchives
Fix the Monitor latest crawls page not displaying the most recent ones on some servers due to misaligned timestamps
Small fixes for BnF & INA Web Archives (proper permalinks, adapt to recent upstream changes)
Minor fixes to installation doc and frontend display (make tags validation easier with an "Add" button, visually display the crawl status of each page of a webentity, handle total redirected pages missing from old hyphe corpus versions, autostop network spatialization, make some buttons more visible, fix duration displayed for canceled unscheduled crawls, autofocus input in Import page, etc.)

Full Changelog: v1.12.0...v1.12.1

2025 up in the Skybox

14 May 16:32

boogheta

ChangeLog:

Fix Default WebEntityCreationRule not always applied when different from domain (upgrades to hyphe-traph v2.2) (#499)
Add an option in the web interface to load tags from a CSV file along with importing new or existing WebEntities (#503)
Add the possibility to set a crawl job as reviewed (#478)
Allow renaming a corpus (#457)
Better handle WebEntities with prefixes including special characters in the path (#447)
Distinguish page crawl errors from simple redirections (#492)
Auto-resolve more URLs directly within the crawler (#463)
Fix automatic feeding of recent UserAgents, whether behind a proxy or not
Small fixes for INA & BnF Web Archives (#502 + permalinks with misformatted dates)
Minor fixes to lookups logic, config loading, manual installation doc, corpus landing page (#487) and backend logs display

Full Changelog: v1.11.0...v1.12.0

Early 2024

31 Jan 09:52

boogheta

ChangeLog:

Give access to detailed crawl logs within frontend (#452)
Various small UI fixes/improvements in frontend (#482, #483, #485, #486, #488, #494)
Complete adaptation of web archives handling to INA's (#484)

Full Changelog: v1.10.9...v1.11.0

Back-to-school papercuts

25 Aug 17:31

boogheta

ChangeLog:

Add a button to export metadata from all pages of a webentity (#318)
Explicitly separate startpages warnings regarding redirected pages and faulty ones (#379)
Allow setting a specific User-Agent per crawl within the web interface (#461)
Display hints on the meaning of the different possible statuses of a crawl (#474)
Highlight corresponding webentities when hovering a status or a tag in the network legend (#459)
Switch the User-Agents list used within crawls to rely on https://www.useragents.me/ (#453)
Various improvements (cleaner backend logs, remove empty traphs directories (#475), updated heuristics for webentity links calculation rhythm, visual fixes (#476, #477))

Hot Summer '23

21 Aug 10:34

boogheta

ChangeLog:

Migrated WELinks caching to (working) files instead of mongo to handle huge corpuses
Allow setting archives pass as an ENV variable for docker instances
Display the time required by links indexation on the overview

Summer '23

21 Jul 17:27

boogheta

ChangeLog:

Added handling of more webarchives as sources (Arquivo.pt + INA DLWeb) + fixed various webarchives frontend info (#469, #471,
Added a corpus setting "ignore internal links" to crawl but not record links within the currently crawled webentity, in order to drastically speed up indexation of entities with huge numbers of links (at a cost in functionality: the network of internal pages is then not available, and entities that are split after a crawl will need to be recrawled) (cf #371, #378, #433)
Better handle frontend warning on pending actions when trying to close a tab (#465, #466)
Minor fixes (#448, #460, #467, #468, #470, 50d97e8, 85decf2)

Better, faster, stronger traph, there it is!

29 Nov 18:11

boogheta

ChangeLog:

Switched to the new, breaking version 2.1 of hyphe-traph, which should help speed up indexation on big networks but requires rebuilding corpuses from scratch
Make iterator traph calls less frequent to leave priority to quick user actions
Fixed stack on calling empty callback in List Webentities
Upgraded urllib3 to handle SSL deprecation
Froze dependencies to maintain python2.7 compat

Summer '22

19 Aug 11:48

boogheta

ChangeLog:

Upgraded User Agents list
Added extra default WebEntity CreationRules for Github, Instagram, TikTok, Reddit and a bunch of blog platforms
Added perma.cc to list of default autofollowlinks
Various fixes and extra features for webarchives (links to archive permalinks, etc.)
Minor bugfixes

Spring '22

30 Mar 15:50

boogheta

ChangeLog:

Added a distinction between successful and errored crawled pages to identify Suspicious crawls (#425)
Fixed frontend compatibility within Hyphe-Browser (medialab/hyphe-browser#212)
Fixed WebArchives crawling interface (#431) and behavior with BnF's archives (#426)
Improved network page's interaction using latest sigma.js v2.2 (node highlight etc & #367)
Allowed frontend to automatically restart a closed corpus when reopening the frontend directly on a specific corpus link (#440)
Allowed checking contiguous checkboxes in frontend's lists of webentities using the Shift key (#438)
Allowed tuning the frontend's header color from the config (#430)
Published Hyphe on Zenodo & Software Heritage
Minor fixes (#397, #388, #432, #429, #437, #343, #341, #444, #325)
