Document Understanding Subnet - Whitepaper 1.0

Current Capabilities


Checkbox and Associated Text Detection

The Document Understanding Subnet currently supports advanced Checkbox and Associated Text Detection (available on Testnet 236). This feature accurately identifies checkboxes within document images and associates each with its related text.

It leverages a sophisticated vision model to locate checkboxes and an OCR engine to extract the corresponding text. The feature's checkbox-detection accuracy surpasses that of leading centralized solutions, including GPT-4 Vision and Azure Form Recognizer, making it invaluable for processing structured forms and surveys. This capability significantly improves the quality and efficiency of data extraction, and is essential for applications that rely on structured form data.
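The association step described above can be sketched as follows. This is a minimal illustration, not the subnet's actual implementation: it assumes the vision model has already produced checkbox bounding boxes and the OCR engine has produced word boxes, and it simply pairs each checkbox with the nearest text to its right on the same line. All field names and thresholds here are hypothetical.

```python
def associate_checkboxes(checkboxes, words, max_dx=300, max_dy=20):
    """Pair each checkbox with its nearest OCR word box.

    checkboxes: list of dicts {"x", "y", "w", "h", "checked"} from the detector.
    words:      list of dicts {"x", "y", "w", "h", "text"} from the OCR engine.
    A word qualifies if it starts to the right of the checkbox (within max_dx
    pixels) and its vertical center lies within max_dy pixels of the checkbox's.
    Returns a list of (checkbox, text-or-None) pairs.
    """
    pairs = []
    for cb in checkboxes:
        # Right edge and vertical center of the checkbox.
        right_edge = cb["x"] + cb["w"]
        center_y = cb["y"] + cb["h"] / 2
        best, best_dx = None, max_dx
        for wd in words:
            word_center_y = wd["y"] + wd["h"] / 2
            dx = wd["x"] - right_edge
            # Keep the closest word to the right on roughly the same line.
            if 0 <= dx < best_dx and abs(word_center_y - center_y) <= max_dy:
                best, best_dx = wd, dx
        pairs.append((cb, best["text"] if best else None))
    return pairs
```

In practice a production pipeline would also handle labels placed to the left of or above the checkbox, but the nearest-neighbor pairing shown here captures the core idea of linking detector output to OCR output.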

Figure 2: Example of checkbox and associated text detection.
Figure 3: Example of checked text without boxes.