Accelerate AI Development with Snowflake

Model customization techniques let you optimize models for your specific use case. Snowflake is introducing serverless fine-tuning (generally available soon), which lets developers fine-tune models for better cost-performance. This fully managed service eliminates the need for developers to build or manage their own infrastructure for training and inference. 
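Serverless fine-tuning is exposed through the SNOWFLAKE.CORTEX.FINETUNE function. The sketch below shows roughly how a job could be created and the resulting model invoked; table, column and model names are illustrative, and the exact options may shift as the feature reaches general availability.

```sql
-- Launch a serverless fine-tuning job from a table of prompt/completion pairs
-- (table, column and model names here are illustrative).
SELECT SNOWFLAKE.CORTEX.FINETUNE(
  'CREATE',
  'support_assistant_tuned',                           -- name for the fine-tuned model
  'mistral-7b',                                        -- base model to customize
  'SELECT prompt, completion FROM support_train_set',  -- training data query
  'SELECT prompt, completion FROM support_val_set'     -- optional validation data query
);

-- Check on the job, then call the tuned model like any other Cortex model
SELECT SNOWFLAKE.CORTEX.FINETUNE('DESCRIBE', '<job_id>');
SELECT SNOWFLAKE.CORTEX.COMPLETE('support_assistant_tuned', 'How do I reset my password?');
```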

Gain multimodal support in COMPLETE function

Enhance AI apps and pipelines with multimodal support for richer responses. With new generative AI capabilities, developers can now process multimodal data and surface the most relevant information in their applications. We are adding multimodal LLM inference (private preview soon) to the Cortex COMPLETE function, starting with image inputs powered by the Llama 3.2 models available in Snowflake Cortex AI. Support for audio, video and image embeddings will follow soon. Expanded multimodal support enriches responses for diverse tasks such as summarization, classification and entity extraction across various media types. 
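For context, here is the shape of a text-only COMPLETE call available today, followed by a purely illustrative sketch of an image-input call; the multimodal syntax is in private preview, so the actual signature, file-reference mechanism and vision model names may differ.

```sql
-- Text-only COMPLETE call available today
SELECT SNOWFLAKE.CORTEX.COMPLETE(
  'llama3.2-3b',
  'Classify the sentiment of this review: "The new dashboard is fantastic."'
);

-- Hypothetical sketch of an image-input call (private preview);
-- the model identifier and image argument shown here are assumptions.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
  'llama3.2-90b-vision',                        -- assumed vision-capable model name
  'Describe the defects visible in this product photo.',
  TO_FILE('@product_images', 'unit_42.jpg')     -- assumed image-reference argument
);
```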

Deliver multimodal analytics with familiar SQL syntax

Database queries are the underlying force that drives insights across organizations and powers data-driven experiences for users. Traditionally, SQL has been limited to structured data neatly organized in tables. Snowflake is introducing new multimodal SQL functions (private preview soon) that enable data teams to run analytical workflows on unstructured data, such as images. With these functions, teams can run tasks such as semantic filters and joins across unstructured data sets using familiar SQL syntax. 
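Because these functions are still in private preview and their final names and signatures are not yet public, the following is only a hypothetical illustration of the idea of a semantic predicate over image files, not released syntax.

```sql
-- Hypothetical example only: SEMANTIC_FILTER is a placeholder name, not a
-- released function, used to illustrate a natural-language predicate over
-- unstructured files referenced from a stage.
SELECT img.relative_path
FROM DIRECTORY(@product_images) img
WHERE SEMANTIC_FILTER(                              -- placeholder function
        'does this photo show a damaged package?',
        TO_FILE('@product_images', img.relative_path));
```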

Confidently process large inference jobs with provisioned throughput capacity

A consistent end-user experience is often a gating factor as developers move beyond proofs of concept. With Provisioned Throughput (public preview soon on AWS), customers can reserve dedicated throughput, ensuring consistent and predictable performance for their workloads. Additionally, we launched cross-region inference, allowing you to access preferred LLMs even if they aren’t available in your primary region.
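Cross-region inference is an account-level opt-in controlled by a single parameter; the minimal sketch below enables routing to any region (the parameter also accepts specific region groups).

```sql
-- Allow Cortex to route inference requests to another region when the
-- requested model is not available in the account's home region.
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';
```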

Develop high-quality, conversational AI apps, faster 

Snowflake now offers new tools to simplify developing and deploying conversational AI applications.

Advanced document preprocessing for RAG

Earlier this year, we launched Cortex Search to help customers unlock insights from unstructured data and turn vast document collections into AI-ready resources without complex coding. This fully managed retrieval solution enables developers to build scalable AI apps that extract insights from unstructured data within Snowflake’s secure environment. It is especially powerful when paired with layout-aware text extraction and chunking, which streamline preprocessing and optimize documents for retrieval using short SQL functions.
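As a rough sketch, a Cortex Search service can be stood up over a table of pre-chunked text with a few lines of SQL; object names below are illustrative, and the preprocessing sketch following the next paragraph shows one way to build such a table.

```sql
-- Minimal Cortex Search service over a table of document chunks
-- (warehouse, table and column names are illustrative).
CREATE OR REPLACE CORTEX SEARCH SERVICE doc_search
  ON chunk_text                      -- column to index for retrieval
  ATTRIBUTES relative_path           -- metadata returned alongside results
  WAREHOUSE = compute_wh
  TARGET_LAG = '1 hour'              -- how fresh the index should stay
  AS (
    SELECT chunk_text, relative_path
    FROM doc_chunks
  );
```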

Now you can make documents AI-ready faster with two new SQL preprocessing functions. These streamline processing of documents from blob storage (e.g., Amazon S3) into text representations for use in retrieval-augmented generation (RAG) applications. SQL users can now replace complex document processing pipelines with simple SQL functions from Cortex AI, such as PARSE_DOCUMENT (public preview) and SPLIT_TEXT_RECURSIVE_CHARACTER (private preview). The parsing function extracts text and layout from documents, so developers do not have to move the raw data from its original storage location. The text splitting function chunks the extracted text into segments optimized for indexing and retrieval. Learn more.
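A minimal sketch of that pipeline, assuming the documents sit on a Snowflake stage; the stage, table names and chunking parameters below are illustrative.

```sql
-- Extract layout-aware text from staged documents without moving the raw files
CREATE OR REPLACE TABLE parsed_docs AS
SELECT
  relative_path,
  TO_VARCHAR(
    SNOWFLAKE.CORTEX.PARSE_DOCUMENT(@doc_stage, relative_path, {'mode': 'LAYOUT'}):content
  ) AS raw_text
FROM DIRECTORY(@doc_stage);

-- Split the extracted text into overlapping chunks sized for indexing and retrieval
CREATE OR REPLACE TABLE doc_chunks AS
SELECT
  relative_path,
  c.value::string AS chunk_text
FROM parsed_docs,
     LATERAL FLATTEN(
       input => SNOWFLAKE.CORTEX.SPLIT_TEXT_RECURSIVE_CHARACTER(raw_text, 'markdown', 512, 64)
     ) c;
```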

Conversational analytics improvements in Cortex Analyst

Expand the scope of accurate, self-service analytics in natural language with Cortex Analyst. Snowflake Cortex Analyst continues to evolve as a fully managed service, providing conversational, self-serve analytics that let users seamlessly interact with structured data in Snowflake. Recent updates deepen both the user experience and the analysis itself. Support for SQL joins across Star and Snowflake schemas (public preview) enables more complex data explorations and richer insights while maintaining high quality. Multiturn conversations (public preview) let users ask follow-up questions for more fluid interactions. Integration with Cortex Search (public preview) improves the accuracy of generated SQL queries by dynamically retrieving exact or similar literal values for complex, high-cardinality data fields, while API-level role-based access controls strengthen security and governance. 

Together, these updates empower enterprises to securely derive accurate, timely insights from their data, reducing the overall cost of data-driven decision-making. To learn more about these new features and related updates, check out our Cortex Analyst blog post.

Advanced orchestration and observability tools for LLM apps

Reduce manual integration and orchestration in chat applications with the Cortex Chat API (public preview soon), which simplifies building interactive applications in Snowflake. By combining retrieval and generation into a single API call, you can build agentic chat apps that talk to both structured and unstructured data. The optimized prompt yields high-quality responses with citations that reduce hallucinations and increase trust, and a single integration endpoint simplifies the application architecture. 

Increase AI app trustworthiness with built-in evaluation and monitoring through the new integrated AI Observability for LLM Apps (private preview). This observability suite provides essential tools to enhance evaluation and trust in LLM applications, supporting customers’ AI compliance efforts. These features let app developers assess quality metrics, such as relevance, groundedness and bias, alongside traditional performance metrics such as latency, all during the development process. They also provide thorough monitoring of application logs, so organizations can keep a close eye on their AI applications.