👋 Welcome to Cuterwrite's Blog
A systematic introduction to the Triton tile-based GPU programming model and practical optimization techniques, from vector addition to FlashAttention.
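The tile-based model this post covers can be mimicked in plain Python with no GPU: each "program instance" handles one `BLOCK_SIZE`-wide tile of the output, and a mask guards the ragged tail. This is a conceptual sketch borrowing Triton's vocabulary (`pid`, `BLOCK_SIZE`, mask), not actual Triton code.

```python
# Conceptual sketch of Triton's tile-based decomposition in plain Python.
# Each "program instance" (identified by pid) processes one contiguous
# block of BLOCK_SIZE elements; a mask guards the out-of-range tail.

def vector_add(x, y, BLOCK_SIZE=4):
    n = len(x)
    out = [0.0] * n
    num_programs = (n + BLOCK_SIZE - 1) // BLOCK_SIZE  # ceil-div grid size
    for pid in range(num_programs):          # on a GPU these run in parallel
        offsets = [pid * BLOCK_SIZE + i for i in range(BLOCK_SIZE)]
        mask = [off < n for off in offsets]  # predicate for the ragged tail
        for off, m in zip(offsets, mask):
            if m:
                out[off] = x[off] + y[off]
    return out

print(vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))  # [11, 22, 33, 44, 55]
```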
A comprehensive guide to integrating Open WebUI with the MCP protocol through MCPO, leveraging Claw Cloud Run's free container resources for zero-cost deployment.
This article shows how to build an efficient, intuitive Retrieval-Augmented Generation (RAG) service locally by integrating Open WebUI, Ollama, and the Qwen2.5 model through Docker. The steps include deploying Open WebUI, configuring Ollama to vectorize documents with the bge-m3 embedding model, and answering user queries with the Qwen2.5 generation model. The result is a local system that retrieves documents and generates answers, simplifying operation while strengthening data privacy and broadening the practical reach of generative AI.
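The retrieval step described above can be sketched in plain Python: document chunks are embedded (here with toy hand-written vectors standing in for bge-m3 output), and the chunk nearest the query by cosine similarity becomes context for the generator. The chunks and vectors below are illustrative placeholders, not the actual Open WebUI internals.

```python
import math

# Toy stand-ins for bge-m3 embeddings; in the real pipeline Ollama
# produces these vectors for each document chunk and for the query.
doc_chunks = {
    "Docker installs Open WebUI as a container.": [0.9, 0.1, 0.0],
    "bge-m3 turns text into dense vectors.":      [0.1, 0.9, 0.1],
    "Qwen2.5 generates the final answer.":        [0.0, 0.2, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, top_k=1):
    # Rank chunks by similarity to the query embedding; keep the best top_k.
    ranked = sorted(doc_chunks, key=lambda c: cosine(query_vec, doc_chunks[c]), reverse=True)
    return ranked[:top_k]

context = retrieve([0.05, 0.95, 0.05])  # query embedding closest to the bge-m3 chunk
print(context)  # ['bge-m3 turns text into dense vectors.']
```

In the full pipeline, the retrieved chunks are prepended to the user's question in the prompt sent to Qwen2.5, which is what lets the generator answer from the user's own documents.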
This article introduces the Scalable Matrix Extension (SME) in the Arm architecture, focusing on its efficient matrix computation in Streaming SVE mode and on the ZA array's mechanism for large-scale data storage and flexible access, which together provide powerful hardware acceleration for high-performance computing applications.
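The ZA-based compute pattern the SME post describes, accumulating outer products of vector operands into a two-dimensional accumulator tile (as the FMOPA/SMOPA instructions do), can be illustrated in plain Python. This is a conceptual model only; real SME operates on hardware ZA tiles in Streaming SVE mode.

```python
# Conceptual model of SME's outer-product-accumulate: C = A @ B computed
# as a sum of column-vector x row-vector outer products accumulated into
# a ZA-like 2-D tile (roughly what one FMOPA does per step on hardware).

def matmul_outer_product(A, B):
    rows, inner, cols = len(A), len(A[0]), len(B[0])
    za = [[0.0] * cols for _ in range(rows)]      # the ZA accumulator tile
    for k in range(inner):                        # one outer product per step
        col = [A[i][k] for i in range(rows)]      # k-th column of A
        row = B[k]                                # k-th row of B
        for i in range(rows):
            for j in range(cols):
                za[i][j] += col[i] * row[j]       # accumulate into the tile
    return za

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_outer_product(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```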
This article introduces Arm's Scalable Vector Extension (SVE) and its successor, SVE2. They significantly improve the performance of data-intensive applications (such as HPC and ML) through variable-length vector registers, flexible per-lane predication, and a rich instruction set, while software binary compatibility ensures portability across hardware platforms. SVE also offers the Arm C Language Extensions (ACLE) to help developers program it: SVE instructions can be used directly from C/C++ by calling the intrinsic functions declared in the arm_sve.h header for efficient vectorized operations.
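SVE's per-lane predication, which the ACLE exposes through intrinsics such as those declared in arm_sve.h, can be modeled in plain Python: a whilelt-style predicate disables the lanes past the end of the data, so no scalar tail loop is needed. Everything below is a conceptual model, not real SVE code; the fixed "vector length" is an arbitrary placeholder, whereas real SVE discovers it at run time.

```python
# Conceptual model of an SVE predicated loop: process VL lanes at a time,
# with a whilelt-style predicate masking off lanes past the end of the data.

VL = 4  # placeholder vector length; real SVE hardware determines this

def whilelt(i, n):
    # Governing predicate: lane l is active while i + l < n.
    return [i + lane < n for lane in range(VL)]

def predicated_add(x, y):
    out = [0.0] * len(x)
    i = 0
    while i < len(x):
        pg = whilelt(i, len(x))      # build the predicate for this iteration
        for lane in range(VL):
            if pg[lane]:             # inactive lanes do nothing
                out[i + lane] = x[i + lane] + y[i + lane]
        i += VL                      # advance by the vector length
    return out

print(predicated_add([1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1]))  # [2, 3, 4, 5, 6, 7]
```

The final iteration (elements 4 and 5 with VL = 4) is handled by the same loop body as the rest, which is exactly the tail-handling benefit predication gives vector-length-agnostic SVE code.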
Building a complete LLM application requires more than a powerful model: a thriving LLM ecosystem must cover everything from model training and optimization to deployment and application. This article walks through the layers of the LLM ecosystem and explores how to apply LLMs to real-world scenarios.
This article is reprinted from the Zhihu column: 14. RDMA Memory Window, by Savir. To control memory access permissions more flexibly and conveniently, the IB protocol defines the Memory Window (MW). This article covers what an MW does, its relationship to the MR, its interfaces, and its classification, and goes deeper into L_Key and R_Key than the earlier MR article.
This article is reprinted from the Zhihu column: 11. RDMA Shared Receive Queue, by Savir. The IB protocol's SRQ mechanism significantly reduces memory capacity requirements on the receiving end. This article explains how the SRQ works and how it compares with an ordinary RQ.
This article is reprinted from the Zhihu column: 10. RDMA and Completion Queue, by Savir. The CQ and QP are interdependent: the CQ is the medium through which hardware "reports task status" to software. This article analyzes and explains most of the protocol's CQ-related content.
This article is reprinted from the Zhihu column: 9. RDMA Queue Pair, by Savir. The QP is the most critical concept in RDMA, serving as the medium through which software "issues commands" to hardware. This article analyzes and explains most of the protocol's QP-related content.