DEV Community

Ajinkya Bobade

📝✨ ClearText

This is a submission for the GitHub Copilot Challenge: Transitions and Transformations

What I Built

I built ClearText, an AI-powered text detection and enhancement tool that makes text in images cleaner.


It's Perfect For 🎯

  • 📄 Document Digitization
  • 📚 Book Scanning
  • 📱 Mobile Photos of Text
  • 🖨️ Improving Scanned Documents
  • 📑 Text Enhancement in Images

Demo

ClearText Demo

Repo

GitHub Repository - ClearText

Here's an example of what ClearText can do:

ClearText Demo


ClearText takes an input image (left-hand side), removes the noise, and outputs clean text (right-hand side).
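One way to picture this step is as masking: keep the pixels that the detector marked as text and paint everything else white. The snippet below is a minimal sketch of that idea, not ClearText's actual code. It assumes the detector returns axis-aligned boxes as `(x0, y0, x1, y1)`, the function name `keep_text_regions` is hypothetical, and plain nested lists stand in for the OpenCV arrays a real pipeline would use so that the example runs without dependencies.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, 0 = black, 255 = white
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive (assumed format)


def keep_text_regions(image: Image, boxes: List[Box], background: int = 255) -> Image:
    """Return a copy of `image` where every pixel outside `boxes` is `background`."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Start from a blank page, then copy back only the detected text areas.
    cleaned = [[background] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clamp each box to the image so out-of-range detections are harmless.
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                cleaned[y][x] = image[y][x]
    return cleaned


# A 4x4 noisy image with one 2x2 text region in the top-left corner.
noisy = [
    [0, 10, 90, 120],
    [5, 0, 200, 60],
    [130, 40, 70, 180],
    [20, 250, 110, 30],
]
cleaned = keep_text_regions(noisy, [(0, 0, 2, 2)])
```

Building a fresh output image instead of editing in place keeps the original available, which is handy for a side-by-side view like the one in the demo.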

ClearText has potential applications in the following fields:

Document Processing 📄

  • Banking & Finance
    • 🏦 Check processing
    • 📊 Financial statement digitization

Healthcare 🏥

  • Medical Records
    • 📋 Patient records digitization
    • 🔬 Lab report enhancement

Legal Industry ⚖️

  • Document Management
    • 📜 Contract digitization
    • 🗄️ Case file processing

Academic Use Cases 📚

  • 📖 Textbook scanning
  • 📑 Research paper digitization

Copilot Experience 🤖

I used GitHub Copilot extensively to complete this project. Here is how Copilot helped me:

Code Completion 📝

  • Auto-completed common OpenCV operations
  • Suggested image processing parameters
  • Completed function signatures for Streamlit components

Chat Assistance 💬

  • Debugged ONNX model loading issues
  • Explained image processing pipeline
  • Suggested optimizations for image transformations

Inline Suggestions ⚡

  • Recommended error handling patterns
  • Suggested variable names and types

Model Switching 🔄

Used different models for specific tasks:

  • Code Completion: GitHub Copilot
  • Documentation: Claude
  • Debugging: GPT-4

Common Prompts Used 🎯

# Function implementation
/explain image processing pipeline
/suggest error handling
/optimize performance

Code Edits ✏️

  • Refactored image processing functions
  • Added blur/no-blur options
  • Improved error messages
  • Enhanced documentation

Project Evolution & Contributions

Building on Open Source

This project builds upon the excellent CRAFT text detection model by CLOVA AI Research, while making significant architectural and functional improvements:

1. Production-Ready Architecture 🏗️

  • Converted the research-focused PyTorch model to production-ready ONNX format
  • Leveraged ONNX Runtime for optimized inference across different hardware
  • Added complete Docker containerization for reliable deployment

2. Enhanced Text Processing Pipeline 🔄

The original CRAFT model provides basic text detection. ClearText significantly expands on this by:

  • Adding custom image preprocessing for better text clarity
  • Implementing new post-processing transforms for enhanced output quality
  • Creating an entirely new text enhancement pipeline
  • Developing a user-friendly web interface for easy interaction
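For context on what the pipeline starts from: CRAFT's detection output is a per-pixel score map, and text boxes are obtained by thresholding it and grouping neighbouring pixels. The sketch below shows that grouping step in simplified form. It is an illustration under stated assumptions, not the CRAFT or ClearText implementation: it uses a single score map, 4-connectivity, axis-aligned boxes, and an illustrative threshold of 0.5, and the name `boxes_from_score_map` is hypothetical. In practice this step is typically done with OpenCV's connected-components routines on NumPy arrays.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


def boxes_from_score_map(scores: List[List[float]], threshold: float = 0.5) -> List[Box]:
    """Group above-threshold pixels into 4-connected blobs; return one box per blob."""
    height = len(scores)
    width = len(scores[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for start_y in range(height):
        for start_x in range(width):
            if seen[start_y][start_x] or scores[start_y][start_x] < threshold:
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            x0 = x1 = start_x
            y0 = y1 = start_y
            queue = deque([(start_x, start_y)])
            seen[start_y][start_x] = True
            while queue:
                x, y = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and scores[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1 + 1, y1 + 1))
    return boxes


# Two separate high-score blobs: top-left (2x2) and bottom-right (2x1).
score_map = [
    [0.9, 0.8, 0.0, 0.0, 0.0],
    [0.7, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.9, 0.8],
]
detected = boxes_from_score_map(score_map)
```

Here `detected` holds two boxes, one per blob. Boxes in this form are the kind of input that later preprocessing and enhancement steps can work from.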

3. Major Output Improvements 📈

ClearText transforms the basic text detection output into a comprehensive text enhancement solution:

  • Original CRAFT: Basic text region detection
  • ClearText Additions:
    • Text clarity enhancement
    • Document digitization capabilities
    • Support for various document types (books, mobile photos, scanned documents)
    • Complete image processing pipeline

Transparency Statement

While this project builds upon CRAFT's foundational text detection capabilities, ClearText represents a significant evolution with entirely new functionality, architecture, and use cases. All original CRAFT code is properly credited and licensed under the MIT License.

Conclusion

Developing ClearText during the GitHub Copilot 1-Day Build Challenge has been an amazing journey. Without Copilot, transforming a complex text detection model into an accessible, user-friendly web application would have been far more difficult. The project shows how AI can bridge the gap between computer vision and practical, everyday use cases.
