Google Antigravity
Google Antigravity is a desktop app for agentic development and AI workflows, running in your computer's desktop session.
DeepSeek OCR is a powerful open-source OCR (Optical Character Recognition) tool based on the advanced DeepSeek-AI model. It enables accurate text extraction from images and document scans via a user-friendly web interface and API, supports various image formats, and offers configuration options for image size, cropping, and upload limits. DeepSeek OCR features four core recognition modes: Plain OCR for raw text extraction, Describe for intelligent image content descriptions, Find for keyword localization with visual bounding box returns, and Freeform for flexible image understanding tasks based on custom prompts.

**Key Features:**
- High-accuracy text recognition with DeepSeek-OCR, supporting images and multi-page PDF documents
- Preserves document layout including tables, formulas, and structural formatting
- Web frontend (React) and REST API (FastAPI) for easy usage and system integration
- Export results to Markdown, HTML, DOCX, or JSON formats
- Automatic extraction and embedding of images from PDF files
- GPU acceleration and Docker deployment for fast and scalable processing

**Prerequisites:**
- ZimaOS version 1.5.2 or higher, or NVIDIA Open Driver version 580 or higher
- NVIDIA GPU with >= 8 GB VRAM for optimal performance

**Learn More:**
- [DeepSeek OCR App (GitHub)](https://github.com/rdumasia303/deepseek_ocr_app)
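The REST API accepts image uploads, so a client script mostly needs to build a multipart request. The sketch below is stdlib-only; note that the `/api/ocr` route, the `file` and `mode` field names, and the port are hypothetical placeholders — check the running API's docs page for the real schema before using it.

```python
import json
import mimetypes
import urllib.request
import uuid

def build_multipart(file_field: str, filename: str, file_bytes: bytes,
                    fields: dict) -> tuple[bytes, str]:
    """Encode one file plus extra form fields as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((f"--{boundary}\r\n"
                      f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                      f"{value}\r\n").encode())
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts.append((f"--{boundary}\r\n"
                  f'Content-Disposition: form-data; name="{file_field}"; '
                  f'filename="{filename}"\r\n'
                  f"Content-Type: {ctype}\r\n\r\n").encode()
                 + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def ocr_image(image_path: str, mode: str = "plain",
              base_url: str = "http://umbrel.local:8000") -> dict:
    """POST an image to a hypothetical /api/ocr route with a mode field."""
    with open(image_path, "rb") as fh:
        body, content_type = build_multipart("file", image_path, fh.read(),
                                             {"mode": mode})
    req = urllib.request.Request(f"{base_url}/api/ocr", data=body,
                                 headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example (requires a running DeepSeek OCR instance; adjust URL/route first):
# print(ocr_image("scan.png", mode="find"))
```

The four recognition modes from the description would map onto the `mode` field in this sketch; again, verify the actual parameter names against the API.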
Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can be used to prepare raw data or improve existing training data to get more accurate ML models.
Langflow is a powerful, open-source UI designed specifically for building and debugging multi-agent and Retrieval-Augmented Generation (RAG) applications. It provides a visual, drag-and-drop interface that simplifies the process of creating complex AI workflows. The system consists of two main components:

- **Langflow**: The main application providing a visual interface for building AI workflows
- **PostgreSQL**: A robust database system for storing application data and configurations

**Key Features:**
- Visual, drag-and-drop interface for building AI workflows
- Support for multi-agent systems and RAG applications
- Integrated debugging tools for testing and optimization
- Persistent storage for workflows and configurations
- Easy deployment with Docker containers

**Learn More:**
- [Langflow Official Website](https://www.langflow.org)
- [Langflow GitHub Repository](https://github.com/langflow-ai/langflow)
- [Documentation](https://docs.langflow.org)
LLaMA Factory is a comprehensive framework for fine-tuning Large Language Models (LLMs) with support for over 100 models. It provides a user-friendly web interface and powerful training methods including LoRA, QLoRA, and full-parameter training.

**Key Features:**
- Support for 100+ LLMs including LLaMA, Mistral, Qwen, and more
- Multiple fine-tuning methods (LoRA, QLoRA, Full, Freeze)
- Intuitive Web UI for easy model management
- Built-in API server for model inference
- Multi-GPU training support
- Quantization and model export capabilities

**Hardware Requirements:**
- GPU: NVIDIA GPU with CUDA support required

**Learn More:**
- [GitHub Repository](https://github.com/hiyouga/LLaMA-Factory)
- [Documentation](https://llamafactory.readthedocs.io/)
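The built-in API server speaks an OpenAI-style chat-completions schema, so a fine-tuned model can be queried with a few lines of stdlib code. A sketch, assuming the server listens on port 8000 and that the model name matches whatever you loaded — both are assumptions to adjust for your deployment:

```python
import json
import urllib.request

def build_chat_payload(model: str, user_message: str) -> dict:
    """OpenAI-style chat completion request body."""
    return {"model": model,
            "messages": [{"role": "user", "content": user_message}]}

def chat(user_message: str, model: str = "default",
         base_url: str = "http://localhost:8000") -> str:
    """Send one chat turn to the LLaMA Factory API server."""
    data = json.dumps(build_chat_payload(model, user_message)).encode()
    req = urllib.request.Request(f"{base_url}/v1/chat/completions", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Example (requires the LLaMA Factory API server to be running):
# print(chat("Summarize LoRA fine-tuning in one sentence."))
```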
OpenHands is an open-source AI-powered coding assistant that provides developers with intelligent code completion, generation, and debugging capabilities. It runs in a sandboxed environment to ensure security and isolation while allowing access to various development tools and resources.

**Key Features:**
- AI-powered code completion and generation
- Interactive debugging and error resolution
- Support for multiple programming languages
- Secure sandboxed execution environment
- Customizable runtime configurations
- Integration with Docker for containerized workflows

**Learn More:**
- [OpenHands Official Website](https://www.all-hands.dev)
- [OpenHands GitHub Repository](https://github.com/All-Hands-AI/OpenHands)
- [Documentation](https://docs.all-hands.dev)
RagFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It enables users to build their own private ChatGPT by leveraging the power of large language models and deep document parsing capabilities. RagFlow supports various document formats including PDF, Word, Markdown, and more, allowing users to create intelligent question-answering systems based on their own documents.

**Key Features:**
- Deep document understanding with advanced parsing capabilities
- Support for multiple document formats (PDF, Word, Markdown, etc.)
- Private knowledge base with data security assurance
- Customizable RAG workflows for different use cases
- Integration with popular large language models
- Web-based interface for easy management and interaction

**Hardware Requirements:**
- CPU >= 4 cores
- RAM >= 16 GB
- Disk >= 50 GB

**Learn More:**
- [RagFlow Official Website](https://ragflow.io)
- [RagFlow GitHub Repository](https://github.com/infiniflow/ragflow)
WeKnora is a deep document understanding and semantic retrieval framework based on Large Language Models (LLMs), designed for documents with complex structures and heterogeneous content. It adopts a modular architecture that integrates multimodal preprocessing, semantic vector indexing, intelligent retrieval, and large-model inference. Built on the Retrieval-Augmented Generation (RAG) paradigm, it delivers context-aware, high-quality Q&A: WeKnora deeply understands document content in different formats, combines relevant document fragments with language-model inference, and outputs accurate, coherent results.

**Key Features:**
- Multimodal Deep Parsing: Structured content extraction from formats such as PDF, Word, TXT, and images, including OCR text recognition.
- Semantic Vector Indexing and Intelligent Retrieval: High-precision semantic matching and recall through a combination of vector retrieval, keyword retrieval, and knowledge-graph-enhanced retrieval.
- RAG Closed-Loop Q&A Generation: Accurate, coherent answers generated by fusing retrieved fragments with large-language-model inference.
- Agent Mode: Supports a ReACT agent that can call built-in tools, external web search, and more across multiple rounds to handle complex tasks.
- Multi-type Knowledge Base Management: Create FAQ and document-type knowledge bases, flexibly manage tags, and batch-import files or URLs.
- Configurable Dialogue Strategy and UI: An intuitive web interface and REST API allow online adjustment of models, retrieval thresholds, and prompts to control dialogue behavior.

**Learn More:**
- [Official Website](https://weknora.weixin.qq.com)
- [GitHub Link](https://github.com/Tencent/WeKnora)
ChatBot UI is an advanced chatbot kit for OpenAI's chat models aiming to mimic ChatGPT's interface and functionality. Simply add your OpenAI API key and start chatting! This version of ChatBot UI supports both GPT-3.5 and GPT-4 models. Conversations are stored locally within your browser. You can export and import conversations to safeguard against data loss.
Chatpad AI is an alternative user interface for OpenAI's chat models. Simply add your OpenAI API key and you're ready to go! No tracking. No cookies. All your data is stored locally within your browser, and you can export and import conversations to safeguard against data loss.
Kokoro is an advanced Text-to-Speech (TTS) model that delivers impressive speech quality with only 82 million parameters, making it competitive with much larger and more resource-intensive models. Despite its relatively compact architecture, Kokoro effectively transforms text into clear, natural-sounding speech, making it an excellent choice for applications relying on speech synthesis.

The model has been specifically designed to ensure high efficiency and fast processing, making it suitable for both resource-constrained environments and production systems. In comparison to traditional TTS models, which often require substantial computational resources, Kokoro offers a more cost-effective and faster alternative without compromising the quality of speech output. Its lightweight architecture ensures that Kokoro can be deployed even on less powerful devices, making it easier to integrate into various applications.

Developers can use Kokoro in a wide range of projects, whether for virtual assistants, interactive systems, or enhancing accessibility. The model not only provides accurate and intelligible speech, but also introduces emotional nuances that enhance the user experience. With its flexibility and ability to be applied across diverse scenarios, Kokoro is a valuable resource for anyone seeking an efficient, lightweight, and powerful speech synthesis solution in their projects.

⚠️ This app only works in Chromium-based browsers (e.g., Chrome, Edge, Brave) and is available at "umbrel.local:8877/web/". Please note that the app is approximately 4GB in size, so the loading process may take a few moments.

⚙️ The API is available at "umbrel.local:8877", and the API documentation can be found at "umbrel.local:8877/docs".
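The API at umbrel.local:8877 can be called from a script. A sketch, assuming the server exposes an OpenAI-style `/v1/audio/speech` route; the route, model name, and voice name below are assumptions based on common OpenAI-compatible TTS servers — confirm the actual schema at umbrel.local:8877/docs:

```python
import json
import urllib.request

def build_speech_payload(text: str, voice: str = "af_bella",
                         fmt: str = "mp3") -> dict:
    """OpenAI-style text-to-speech request body (voice name is a guess)."""
    return {"model": "kokoro", "input": text, "voice": voice,
            "response_format": fmt}

def synthesize(text: str, out_path: str = "speech.mp3",
               base_url: str = "http://umbrel.local:8877") -> None:
    """Request synthesized speech and write the raw audio bytes to a file."""
    data = json.dumps(build_speech_payload(text)).encode()
    req = urllib.request.Request(f"{base_url}/v1/audio/speech", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        with open(out_path, "wb") as fh:
            fh.write(resp.read())

# Example (requires the Kokoro app to be running):
# synthesize("Hello from Kokoro!")
```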
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting.

✨ Features

- 🖥️ UI & Experience inspired by ChatGPT with enhanced design and features
- 🤖 AI Model Selection:
  - Anthropic (Claude), AWS Bedrock, OpenAI, Azure OpenAI, Google, Vertex AI, OpenAI Assistants API (incl. Azure)
  - Custom Endpoints: Use any OpenAI-compatible API with LibreChat, no proxy required
  - Compatible with Local & Remote AI Providers:
    - Ollama, groq, Cohere, Mistral AI, Apple MLX, koboldcpp, together.ai, OpenRouter, Perplexity, ShuttleAI, Deepseek, Qwen, and more
- 🔧 Code Interpreter API:
  - Secure, Sandboxed Execution in Python, Node.js (JS/TS), Go, C/C++, Java, PHP, Rust, and Fortran
  - Seamless File Handling: Upload, process, and download files directly
  - No Privacy Concerns: Fully isolated and secure execution
- 🔦 Agents & Tools Integration:
  - LibreChat Agents:
    - No-Code Custom Assistants: Build specialized, AI-driven helpers without coding
    - Flexible & Extensible: Attach tools like DALL-E-3, file search, code execution, and more
    - Compatible with Custom Endpoints, OpenAI, Azure, Anthropic, AWS Bedrock, and more
  - Model Context Protocol (MCP) Support for Tools
  - Use LibreChat Agents and OpenAI Assistants with Files, Code Interpreter, Tools, and API Actions
- 🔍 Web Search:
  - Search the internet and retrieve relevant information to enhance your AI context
  - Combines search providers, content scrapers, and result rerankers for optimal results
- 🪄 Generative UI with Code Artifacts:
  - Code Artifacts allow creation of React, HTML, and Mermaid diagrams directly in chat
- 🎨 Image Generation & Editing:
  - Text-to-image and image-to-image with GPT-Image-1
  - Text-to-image with DALL-E (3/2), Stable Diffusion, Flux, or any MCP server
  - Produce stunning visuals from prompts or refine existing images with a single instruction
- 💾 Presets & Context Management:
  - Create, Save, & Share Custom Presets
  - Switch between AI Endpoints and Presets mid-chat
  - Edit, Resubmit, and Continue Messages with Conversation branching
  - Fork Messages & Conversations for Advanced Context control
- 💬 Multimodal & File Interactions:
  - Upload and analyze images with Claude 3, GPT-4.5, GPT-4o, o1, Llama-Vision, and Gemini 📸
  - Chat with Files using Custom Endpoints, OpenAI, Azure, Anthropic, AWS Bedrock, & Google 🗃️
- 🌎 Multilingual UI:
  - English, 中文, Deutsch, Español, Français, Italiano, Polski, Português Brasileiro
  - Русский, 日本語, Svenska, 한국어, Tiếng Việt, 繁體中文, العربية, Türkçe, Nederlands, עברית
- 🧠 Reasoning UI:
  - Dynamic Reasoning UI for Chain-of-Thought/Reasoning AI models like DeepSeek-R1
- 🎨 Customizable Interface:
  - Customizable Dropdown & Interface that adapts to both power users and newcomers
- 🗣️ Speech & Audio:
  - Chat hands-free with Speech-to-Text and Text-to-Speech
  - Automatically send and play Audio
  - Supports OpenAI, Azure OpenAI, and Elevenlabs
- 📥 Import & Export Conversations:
  - Import Conversations from LibreChat, ChatGPT, Chatbot UI
  - Export conversations as screenshots, markdown, text, json
- 🔍 Search & Discovery:
  - Search all messages/conversations
- 👥 Multi-User & Secure Access:
  - Multi-User, Secure Authentication with OAuth2, LDAP, & Email Login Support
  - Built-in Moderation and Token spend tools
- ⚙️ Configuration & Deployment:
  - Configure Proxy, Reverse Proxy, Docker, & many Deployment options
  - Use completely local or deploy on the cloud
- 📖 Open-Source & Community:
  - Completely Open-Source & Built in Public
  - Community-driven development, support, and feedback
⚠️ Removal Notice: The LibreTranslate app has been disabled over trademark issues.

⚠️ This app may take up to 10 minutes or more to become accessible after installation, depending on your hardware and internet connection. LibreTranslate must first download around 10 GB of translation models in the background before the UI becomes available. Please be patient.

LibreTranslate is a free and Open Source Machine Translation API, entirely self-hosted. Unlike other APIs, it doesn't rely on proprietary providers such as Google or Azure to perform translations. Instead, its translation engine is powered by the open source Argos Translate library.
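The translation API itself is simple: a POST to LibreTranslate's documented `/translate` endpoint with source and target language codes. A minimal stdlib sketch; the port 5000 default is an assumption to adjust for your install:

```python
import json
import urllib.request

def build_translate_payload(text: str, source: str = "en",
                            target: str = "es") -> dict:
    """Request body for LibreTranslate's /translate endpoint."""
    return {"q": text, "source": source, "target": target, "format": "text"}

def translate(text: str, source: str = "en", target: str = "es",
              base_url: str = "http://localhost:5000") -> str:
    """Translate one string and return the translated text."""
    data = json.dumps(build_translate_payload(text, source, target)).encode()
    req = urllib.request.Request(f"{base_url}/translate", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["translatedText"]

# Example (requires a running LibreTranslate instance with models downloaded):
# print(translate("Hello, world!", target="es"))
```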
LlamaGPT is a self-hosted, offline, and private chatbot that provides a ChatGPT-like experience, with no data leaving your device.

Note: The download size of LlamaGPT is about 5.5GB. Depending on your internet speed, it may take some time for it to be installed. LlamaGPT consumes approximately 5GB of RAM. As a result, it is not suitable for users running umbrelOS on a Raspberry Pi 4 with 4GB RAM. For the best user experience, at least 8GB RAM is recommended.

LlamaGPT is optimized for the Umbrel Home, generating words as fast as ~3 words/sec. On a Raspberry Pi 4 with 8GB RAM, it generates words at ~1 word/sec. Performance can vary depending on which other apps are installed on your Umbrel.

Powered by the state-of-the-art Nous Hermes Llama 2 7B language model, LlamaGPT is fine-tuned on over 300,000 instructions to offer longer responses and a lower hallucination rate.

LlamaGPT has been made possible thanks to the incredible open source work from various developers and teams. We extend our gratitude to Mckay Wrigley for building the Chatbot UI, Georgi Gerganov for implementing llama.cpp, Andrei for developing the Python bindings for llama.cpp, NousResearch for fine-tuning the model, Tom Jobbins for quantizing the model, and Meta for releasing Llama 2 under a permissive license.

An official app from Umbrel.
💬 An open-source, modern-design ChatGPT/LLMs UI and framework. 🗣️ Supports speech synthesis, multi-modal input, and an extensible (function call) plugin system. 🤖 One-click deployment of your private OpenAI ChatGPT/Claude/Gemini/Groq/Ollama chat application. 🔑 To get started, add your OpenAI API key in the settings or configure one of the many other providers.
LocalAI is the free, Open Source OpenAI alternative. LocalAI acts as a drop-in replacement REST API that's compatible with the OpenAI API specifications for local inferencing. It allows you to run LLMs and generate images and audio locally on consumer-grade hardware, supporting multiple model families and architectures.

⚠️ Note: Before running a model, make sure your device has enough free RAM to support it. Attempting to run a model that exceeds your available memory could cause your device to crash or become unresponsive. Always check the model requirements before downloading or starting it.
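Because the API is OpenAI-compatible, any OpenAI-style client works against it. A dependency-free sketch; the port 8080 default is an assumption for your setup, and the model name comes from whatever you have loaded:

```python
import json
import urllib.request

BASE_URL = "http://umbrel.local:8080"  # assumed LocalAI address; adjust as needed

def list_models(base_url: str = BASE_URL) -> list:
    """GET /v1/models — the OpenAI-compatible model listing."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        return [m["id"] for m in json.loads(resp.read())["data"]]

def build_chat_payload(model: str, user_message: str) -> dict:
    """OpenAI-style chat completion request body."""
    return {"model": model,
            "messages": [{"role": "user", "content": user_message}]}

def chat(model: str, user_message: str, base_url: str = BASE_URL) -> str:
    """Send one chat turn to LocalAI's /v1/chat/completions endpoint."""
    data = json.dumps(build_chat_payload(model, user_message)).encode()
    req = urllib.request.Request(f"{base_url}/v1/chat/completions", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Example (requires a running LocalAI instance with at least one model loaded):
# print(chat(list_models()[0], "Hello!"))
```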
Ollama allows you to download and run advanced AI models directly on your own hardware. Self-hosting AI models ensures full control over your data and protects your privacy.

⚠️ Before running a model, make sure your device has enough free RAM to support it. Attempting to run a model that exceeds your available memory could cause your device to crash or become unresponsive. Always check the model requirements before downloading or starting it.

**Getting Started:**

The easiest way to get started with Ollama is to install the Open WebUI app from the Umbrel App Store. Open WebUI will automatically connect to your Ollama setup, allowing you to manage model downloads and chat with your AI models effortlessly.

**Advanced Setup:**

If you want to connect Ollama to other apps or devices, here's how:

- Apps running on UmbrelOS: Use ollama_ollama_1 as the host and 11434 as the port when configuring other apps to connect to Ollama. For example, the API Base URL would be: http://ollama_ollama_1:11434.
- Custom Integrations: Connect Ollama to third-party apps or your own code using your UmbrelOS local domain (e.g., http://umbrel.local:11434) or your device's IP address, which you can find in the UmbrelOS Settings page (e.g., http://192.168.4.74:11434).
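The connection details above can be exercised from your own code with Ollama's documented `/api/generate` endpoint. A minimal stdlib sketch; the model name `llama3.2` is a placeholder for whichever model you have pulled:

```python
import json
import urllib.request

def build_generate_payload(model: str, prompt: str,
                           stream: bool = False) -> dict:
    """Request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

def generate(prompt: str, model: str = "llama3.2",
             base_url: str = "http://umbrel.local:11434") -> str:
    """Send a single non-streaming completion request to Ollama."""
    data = json.dumps(build_generate_payload(model, prompt)).encode()
    req = urllib.request.Request(f"{base_url}/api/generate", data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama instance with the model pulled):
# print(generate("Why is the sky blue?"))
```

From another UmbrelOS app, swap the base URL for http://ollama_ollama_1:11434 as described above.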
🤖 Perplexica is an intelligent AI-powered search and question-answering engine that redefines the way information is discovered and understood. Instead of providing a long list of links, it processes data from multiple sources, interprets the meaning behind each query, and generates clear, structured, and conversational responses. Every answer is designed to be both informative and contextually relevant, helping users understand complex topics without the need to sift through endless pages of search results.

What makes Perplexica unique is its ability to comprehend intent rather than just keywords. When a user asks a question, it analyzes not only the literal terms but also the underlying purpose of the request. This allows it to deliver responses that feel thoughtful, connected, and natural. It synthesizes knowledge from different perspectives, combines verified information, and presents it in a way that feels like having a conversation with an expert rather than using a traditional search engine.

Perplexica can explore virtually any topic, from technical explanations and factual research to creative brainstorming and educational summaries. It helps users follow up with deeper questions, refine their understanding, and uncover new insights effortlessly. By blending intelligent reasoning with natural language fluency, Perplexica turns online exploration into a smooth, engaging, and insightful experience that adapts to every user's curiosity.