big-data architecture · distributed parallel processing · high-availability systems · Linux platform engineering · Linux security | VP of Engineering · big-data architect · senior engineer · consultant | Europe
Isoxya web crawler
Isoxya is an internet data processing system representing years of research into building next-generation crawlers and scrapers. Isoxya comes in two editions: Community Edition (CE), a free and open-source (BSD 3-Clause) mini crawler, suitable for small crawls on a single computer; and Pro Edition (PE), a commercial and closed-source distributed crawler, suitable for small, large, and humongous crawls on high-availability clusters of multiple computers. Both editions use flexible plugins, which extend the core engine via JSON interfaces and can be written in numerous programming languages. Plugins written for Isoxya CE should typically scale to Isoxya PE with minimal or no changes.
Isoxya web crawler Community Edition
⚠️ Postgres-XL upstream hasn't been updated since 2019-08-08, and various important messages on the mailing list have gone unanswered. Thus, this project is now archived.
⚠️ CRAB Cardano stake pool retirement was announced on 2020-10-13, and there's currently no incentive for me to maintain this code or donate build time. Thus, this project is now archived. Please get in touch if you have a proposition for making this project (or operating a Cardano stake pool) viable.
Isoxya web crawler plugin: Crawler HTML
Isoxya web crawler plugin: Elasticsearch
Isoxya web crawler plugin: Spellchecker
Isoxya web crawler Scripts
Contact Nic to enquire about their availability for a project.
- Available now for 2 days / week!
- Nic doesn't work with recruitment agencies.