Gittxt is an AI-focused CLI and plugin tool for extracting, filtering, and packaging text from GitHub repos. Build LLM-compatible datasets, prep code for prompt engineering, and power AI workflows with structured .txt, .json, .md, or .zip outputs.

🚀 AI-Ready Text Extractor for Git Repos | CLI tool for dataset prep, summaries, reverse engineering & bundling

🚀 Gittxt: Get Text from Git - Optimized for AI


Gittxt is an open-source tool that transforms GitHub repositories into LLM-compatible datasets.

Built for developers, data scientists, and AI engineers, Gittxt extracts repository text and structures it as clean, analyzable .txt, .json, or .md output for use in:

  • Prompt engineering
  • Fine-tuning & retrieval
  • Codebase summarization
  • Open-source LLM workflows

💡 Why Gittxt?

Large Language Models often expect input in very specific formats. Many tools (e.g., ChatGPT, Gemini, Ollama) struggle with arbitrary GitHub URLs, complex folders, or non-text assets.

Gittxt bridges this gap by:

  • Extracting all usable text from a repo
  • Organizing it for easy ingestion by LLMs
  • Offering structured .txt, .json, .md, .zip outputs
  • Giving you full control with filtering, formatting, and plugin support

✨ Features at a Glance

  • ✅ Text extractor for code, docs, config files
  • ✅ Output: .txt, .json, .md, .zip
  • ✅ CLI and plugin system (FastAPI, Streamlit)
  • ✅ AI-ready summaries (OpenAI / Ollama)
  • ✅ Reverse engineer .txt/.json reports back into repo structure
  • ✅ .gittxtignore support
  • ✅ Async scanning for large projects
  • ✅ Works offline and in constrained compute environments

📁 Output Types

outputs/
├── txt/         # Plain text report
├── json/        # Structured metadata
├── md/          # Markdown-formatted summary
└── zip/         # Bundled results + manifest
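
Once a scan has finished, the layout above can be inspected from a script. The following is a minimal sketch, assuming results land in an `outputs/` directory with the four subfolders shown. The helper name `collect_outputs` is hypothetical and not part of Gittxt.

```python
from pathlib import Path

# Subfolder names taken from the tree above.
OUTPUT_KINDS = ("txt", "json", "md", "zip")


def collect_outputs(root):
    """Map each output kind to the sorted file names found in its subfolder."""
    root = Path(root)
    found = {}
    for kind in OUTPUT_KINDS:
        folder = root / kind
        # Assumption: a format that was not requested has no folder,
        # so a missing folder is reported as an empty list.
        if folder.is_dir():
            found[kind] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            found[kind] = []
    return found


if __name__ == "__main__":
    for kind, files in collect_outputs("outputs").items():
        print(f"{kind}: {files}")
```

Returning a plain dictionary keeps the result easy to serialize or to pass on to a later processing step.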

🚀 Quickstart

Install

pip install gittxt

Run your first scan

gittxt scan https://github.com/sandy-sp/gittxt --output-format txt,json --lite --zip
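
To drive the same scan from a script, the command can be assembled as an argument list and handed to `subprocess.run`. This is a minimal sketch that uses only the flags shown above. `build_scan_command` and `run_scan` are hypothetical helpers, and the sketch assumes the `gittxt` executable is on `PATH`.

```python
import subprocess


def build_scan_command(repo_url, formats=("txt", "json"), lite=True, zip_bundle=True):
    """Assemble the argument list for the scan command shown above."""
    cmd = ["gittxt", "scan", repo_url, "--output-format", ",".join(formats)]
    if lite:
        cmd.append("--lite")
    if zip_bundle:
        cmd.append("--zip")
    return cmd


def run_scan(repo_url, **options):
    """Run the scan; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_scan_command(repo_url, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_scan_command("https://github.com/sandy-sp/gittxt")))
```

Passing a list rather than a shell string avoids quoting problems when the repository URL or format list comes from user input.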

Reverse engineer a summary

gittxt re outputs/project.md -o ./restored

🌐 Explore the Visual Web App

Try the hosted version (no install required!)

👉 Launch Streamlit App


📈 Gittxt for AI Workflows

  • Use it to build structured input for LLMs
  • Ideal for prompt chaining, document agents, code summarization
  • Helps transform messy repos into single-file, AI-consumable reports
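
A single-file report can still be larger than a model's context window. As one way to prepare it for prompt chaining, the sketch below splits a plain-text report into chunks on line boundaries under a character budget. The function name `chunk_report` and the default budget are illustrative assumptions, not part of Gittxt.

```python
def chunk_report(text, max_chars=4000):
    """Split text into chunks of at most max_chars, breaking on line boundaries.

    A single line longer than max_chars is hard-split so no chunk exceeds the budget.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        # Hard-split oversized lines into budget-sized pieces.
        pieces = [line[i:i + max_chars] for i in range(0, len(line), max_chars)]
        for piece in pieces:
            # Start a new chunk when adding this piece would exceed the budget.
            if size + len(piece) > max_chars and current:
                chunks.append("".join(current))
                current, size = [], 0
            current.append(piece)
            size += len(piece)
    if current:
        chunks.append("".join(current))
    return chunks
```

Joining the returned chunks reproduces the input exactly, so each chunk can be sent as a separate prompt without losing content. A character budget is only a rough stand-in for a token limit.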

📖 Full Documentation

All CLI flags, plugins, formats, and filters are documented here:

📚 Explore Gittxt Docs


🔧 Plugin Support

Gittxt supports modular plugins:

  • gittxt-api: Run via FastAPI backend
  • gittxt-streamlit: Interactive dashboard

Install & run with:

gittxt plugin install gittxt-streamlit
gittxt plugin run gittxt-streamlit

🧠 Built for Developers & AI Engineers

Created by Sandeep Paidipati, Gittxt was born out of a need to:

  • Quickly preview and summarize GitHub repos with LLMs
  • Avoid manual copying, filtering, and converting files
  • Create AI-ready datasets for learning and experimentation

🙏 Support the Project

  • ⭐️ Star this repo if it helped you
  • 🧵 Share it with your dev/AI community
  • 🤝 Contact me for collaboration or sponsorship

🔒 License

MIT License © Sandeep Paidipati


Gittxt - Get Text from Git - Optimized for AI
