Nostr Scraping Project
Core Features Implemented
- Import from nosdump file
- NIP-50 Search Functionality
- Scraping + Job + Worker Functionality
- Real-time Subscriptions
Nuanced Features
- Import from nosdump file
- NIP-50 Search Functionality
- Scraping + Job + Worker Functionality
- Recursive Filter Scraping with logging so scraping can be interrupted and restarted
- Simple Filter Job Management System
- Real-time Subscriptions
- Subscribers on a filter receive events as they are published to the Nostr relay
- TODO
- Export to nosdump file
- Nostr Kind Whitelisting
- Nostr Kind Blacklisting
- NIP-05 Scraping and History
- Nostr Relay Metadata Scraping and History
- Bot Support
- Authentication Support
- Ephemeral Events
- CRON styled Filter Scraping
- nkbip-02 AI Embedding Vector Support
- SQL Table for Indexed tags
- SQL Table for not so indexed tags
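Both the NIP-50 search and the real-time subscription features boil down to sending a NIP-01 `["REQ", ...]` message whose filter may carry the NIP-50 `search` field. A minimal sketch of building such a message (the `make_req` helper is hypothetical, not part of this codebase):

```python
import json

def make_req(sub_id, *, kinds=None, authors=None, search=None, since=None, limit=None):
    """Build a NIP-01 ["REQ", <sub_id>, <filter>] message as a JSON string.

    The "search" field is the NIP-50 full-text extension; the rest are
    standard NIP-01 filter fields. Only fields that were given are emitted.
    """
    filt = {}
    if kinds is not None:
        filt["kinds"] = kinds
    if authors is not None:
        filt["authors"] = authors
    if search is not None:
        filt["search"] = search  # NIP-50 full-text query
    if since is not None:
        filt["since"] = since
    if limit is not None:
        filt["limit"] = limit
    return json.dumps(["REQ", sub_id, filt])

# Example: subscribe to kind-1 notes matching a full-text query
msg = make_req("scrape-1", kinds=[1], search="nostr", limit=100)
```

Sending this over the relay websocket both answers the search and leaves the subscription open for new matching events until a `["CLOSE", <sub_id>]` is sent.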
TODO
- Write a blog post about what I want to get out of Scraping Nostr
- Error logging for nosdump-ingest
- Add a relay as a data source for nosdump-ingest
- Write a blog post about the problem of Activities versus Workflows, and relate it to what we were trying to do with CGFS
- We ought to start using fractal terminology to describe our scraping. With CGFS we are supposed to reference the root of a discussion; in Nostr we are likewise supposed to reference the root event of a discussion. I believe that in the future, raw Nostr events posted without context will be rare and will not be the default in Nostr clients.
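Referencing the root of a discussion is specified in Nostr by NIP-10, where a marked `e` tag of the form `["e", <event-id>, <relay-url>, <marker>]` carries a `root` marker. A sketch of pulling that root id out of an event's tags (`root_event_id` is a hypothetical helper):

```python
def root_event_id(tags):
    """Return the root event id referenced by a NIP-10 marked "e" tag, or None.

    NIP-10 marked tags look like ["e", <event-id>, <relay-url>, <marker>],
    where <marker> is "root" or "reply".
    """
    for tag in tags:
        if len(tag) >= 4 and tag[0] == "e" and tag[3] == "root":
            return tag[1]
    return None
```

A scraper that walks replies back to their root would call this on every event it ingests and follow the returned id.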
Job States
- My Job States
- TODO
- RUNNING
- COMPLETED
- ERROR
- FAILED
- Temporal Activity States
- Running
- Cancelled
- Completed
- Failed
- Terminated
- Timed Out
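One way to relate the two lists above: treat COMPLETED and FAILED as terminal, and ERROR as retryable, so an interrupted or errored scrape can be picked up again. A sketch of that assumption (the state names are from my list above; the Temporal column is my rough correspondence, not Temporal's API):

```python
# States from which a worker must NOT pick the job up again.
# Assumption: ERROR is transient/retryable, FAILED is permanent.
TERMINAL_STATES = {"COMPLETED", "FAILED"}

# Rough correspondence to Temporal activity states (my mapping, not a spec).
TEMPORAL_EQUIVALENT = {
    "RUNNING": "Running",
    "COMPLETED": "Completed",
    "ERROR": "Failed",
    "FAILED": "Failed",
}

def can_restart(state: str) -> bool:
    """A job may be picked up again by a worker unless it reached a terminal state."""
    return state not in TERMINAL_STATES
```

This is what makes the "scraping can be interrupted and restarted" feature work: TODO, RUNNING, and ERROR jobs are all eligible for pickup.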
As per Nostr Scraping Plan 0.0.1, we have several different things we need to scrape separately:
- Events from a User from a Specific Relay
- scrape.pubkey.from.relay.0.0.1
- Replies to a thread
- scrape.replies.to.thread.from.relay.0.0.1
- Reactions to a Thread
- scrape.reactions.to.thread.from.relay.0.0.1
- Follows of an NPUB
- scrape.follows.of.pubkey.from.relay.0.0.1
- Badges sent to a User
- scrape.badges.to.pubkey.from.relay.0.0.1
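The job names above all end in a three-part version, so task and version can be split mechanically. A sketch, assuming the `<task>.<major>.<minor>.<patch>` pattern holds for every job name (`parse_job_name` is a hypothetical helper):

```python
def parse_job_name(name):
    """Split a versioned job name into (task, version).

    Assumes the last three dot-separated fields are a major.minor.patch
    version, e.g. "scrape.pubkey.from.relay.0.0.1".
    """
    parts = name.split(".")
    task, version = parts[:-3], parts[-3:]
    return ".".join(task), ".".join(version)

# parse_job_name("scrape.pubkey.from.relay.0.0.1")
# → ("scrape.pubkey.from.relay", "0.0.1")
```

Keeping the version in the name means old and new revisions of a scrape task can coexist in the same job table.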
NIP-05 Stuff
- scrape.nip05.0.0.1
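Per NIP-05, an identifier like `alice@example.com` resolves to `https://<domain>/.well-known/nostr.json?name=<local-part>`, and the response may include a `relays` object mapping pubkeys to relay URLs, which is the part the scraper cares about. A minimal sketch (helper names are hypothetical):

```python
def nip05_url(identifier):
    """Build the .well-known lookup URL for a NIP-05 identifier like "alice@example.com"."""
    local, _, domain = identifier.partition("@")
    return f"https://{domain}/.well-known/nostr.json?name={local}"

def relays_for(nip05_doc, pubkey):
    """Pull the relay list a NIP-05 response may advertise under its "relays" key."""
    return nip05_doc.get("relays", {}).get(pubkey, [])
```

`scrape.nip05.0.0.1` would fetch the URL, verify the pubkey under `names`, and feed any advertised relays back into the scraping backlog.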
We start with a single NPUB of a popular Nostr user
- We scrape the user's NIP-05 identity for other relays they use
- We scrape all of that user's events from every relay they say they publish to
- We then grab all the
- We then look at their follow list
- We add every NPUB from it to the backlog of pubkeys to scrape
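The bootstrap steps above amount to a breadth-first walk over the follow graph with a deduplicated backlog. A sketch, under the assumption that follow lists are fetched by an injected callable so the loop itself needs no relay connection:

```python
from collections import deque

def crawl(seed_pubkey, fetch_follows, max_pubkeys=1000):
    """Breadth-first walk of the follow graph starting from one pubkey.

    fetch_follows(pubkey) -> list of followed pubkeys; injected so the
    crawl can be exercised without a relay connection. Returns the set
    of pubkeys visited, capped at max_pubkeys.
    """
    backlog = deque([seed_pubkey])
    seen = set()
    while backlog and len(seen) < max_pubkeys:
        pk = backlog.popleft()
        if pk in seen:
            continue
        seen.add(pk)
        for follow in fetch_follows(pk):
            if follow not in seen:
                backlog.append(follow)
    return seen
```

In the real pipeline, visiting a pubkey would also enqueue the per-pubkey scrape jobs (events, reactions, badges) listed earlier; the cap keeps a first crawl from exploding across the whole network.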
Logs
Backlinks
- Zero to One for mememaps.net
- ETL to QE, Update 71, Nostr SQL Over Engineering Complete, Time for Websockets
- ETL to QE, Update 70, Embarrassing Over Engineering
- ETL to QE, Update 66, Do One Thing and Do It Well
- DDaemon 2025 Roadmap Rev. 0.0.4
- DDaemon 2025 Roadmap Rev. 0.0.3
- Dentropy Daemon