What is FlockMTL?

Overview

FlockMTL enhances DuckDB by integrating semantic functions and robust resource management capabilities, enabling advanced analytics and language model operations directly within SQL queries.

Semantic Functions

FlockMTL offers a suite of semantic functions that allow users to perform various language model operations:

  • Scalar Map Functions:

    • llm_complete: Generates text completions using a specified language model.
    • llm_filter: Filters data based on language model evaluations, returning boolean values.
    • llm_embedding: Generates embeddings for input text, useful for semantic similarity tasks.
  • Aggregate Reduce Functions:

    • llm_reduce: Aggregates multiple inputs into a single output using a language model.
    • llm_rerank: Reorders query results based on relevance scores from a language model.
    • llm_first: Selects the top-ranked result after reranking.
    • llm_last: Selects the bottom-ranked result after reranking.

These functions let users perform tasks such as text generation, summarization, classification, filtering, fusion, and embedding generation directly in SQL.
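The difference between the two function families above can be illustrated outside SQL. The following is a minimal Python sketch, not FlockMTL's implementation: `fake_llm`, `map_rows`, and `reduce_rows` are hypothetical names introduced only for illustration, and `fake_llm` is a deterministic stand-in for a real model call. It shows that a scalar map function yields one output per input row, while an aggregate reduce function yields one output for a whole group of rows.

```python
from typing import List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a language model call (assumption, not a real API)."""
    return f"<answer to: {prompt}>"


def map_rows(rows: List[str], instruction: str) -> List[str]:
    """Scalar map semantics: each input row produces its own output."""
    return [fake_llm(f"{instruction}: {row}") for row in rows]


def reduce_rows(rows: List[str], instruction: str) -> str:
    """Aggregate reduce semantics: all rows in a group produce a single output."""
    return fake_llm(f"{instruction}: " + " | ".join(rows))


reviews = ["Great battery life", "Screen cracked in a week", "Fast shipping"]

per_row = map_rows(reviews, "Classify the sentiment")      # three outputs, one per row
per_group = reduce_rows(reviews, "Summarize the reviews")  # one output for the group
```

In SQL terms, the first pattern behaves like an ordinary scalar function in a SELECT list, and the second behaves like an aggregate such as one used with GROUP BY.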

Hybrid Search Functions

FlockMTL also provides functions that support hybrid search, namely the following data fusion algorithms, which combine the scores of multiple retrievers:

  • fusion_rrf: Implements Reciprocal Rank Fusion (RRF) to combine rankings from multiple scoring systems.
  • fusion_combsum: Sums normalized scores from different scoring systems.
  • fusion_combmnz: Sums normalized scores and multiplies the sum by the hit count (the number of scoring systems that returned the item).
  • fusion_combmed: Computes the median of normalized scores.
  • fusion_combanz: Calculates the average of normalized scores.

These functions let users combine the strengths of different scoring methods, such as BM25 and embedding scores, to produce better-ranked results and to build end-to-end RAG pipelines.
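The fusion algorithms listed above can be sketched in a few lines of Python. This is an illustration of the textbook definitions, not FlockMTL's code, and it rests on several assumptions: the helper names (`min_max`, `rrf`, `comb`) are hypothetical, scores are min-max normalized per retriever, `None` marks a document a retriever did not return, the RRF constant `k = 60` is a commonly used default, and the median and average are taken over the retrievers that returned the document. FlockMTL's own normalization and handling of missing scores may differ.

```python
from statistics import median
from typing import List, Optional

Scores = List[Optional[float]]


def min_max(scores: Scores) -> Scores:
    """Min-max normalize one retriever's scores to [0, 1]; None (no hit) stays None."""
    present = [s for s in scores if s is not None]
    if not present:
        return list(scores)
    lo, hi = min(present), max(present)
    if hi == lo:
        # All hits share one score; map them to 1.0 to avoid dividing by zero.
        return [None if s is None else 1.0 for s in scores]
    return [None if s is None else (s - lo) / (hi - lo) for s in scores]


def rrf(ranks_per_retriever: List[List[Optional[int]]], k: int = 60) -> List[float]:
    """Reciprocal Rank Fusion: sum of 1 / (k + rank) over retrievers (ranks are 1-based)."""
    n_docs = len(ranks_per_retriever[0])
    return [
        sum(1.0 / (k + ranks[i]) for ranks in ranks_per_retriever if ranks[i] is not None)
        for i in range(n_docs)
    ]


def comb(scores_per_retriever: List[Scores], method: str = "sum") -> List[float]:
    """CombSUM / CombMNZ / CombMED / CombANZ over min-max normalized scores."""
    normalized = [min_max(s) for s in scores_per_retriever]
    fused = []
    for doc_scores in zip(*normalized):
        hits = [s for s in doc_scores if s is not None]
        total = sum(hits)
        if method == "sum":
            fused.append(total)
        elif method == "mnz":
            fused.append(total * len(hits))  # sum multiplied by the hit count
        elif method == "med":
            fused.append(median(hits) if hits else 0.0)
        elif method == "anz":
            fused.append(total / len(hits) if hits else 0.0)
        else:
            raise ValueError(f"unknown method: {method}")
    return fused


# Four documents scored by two retrievers (made-up numbers for illustration).
bm25_scores = [12.0, 8.0, 4.0, None]
vector_scores = [0.2, 0.9, None, 0.5]
bm25_ranks = [1, 2, 3, None]
vector_ranks = [3, 1, None, 2]

fused_sum = comb([bm25_scores, vector_scores], "sum")
fused_rrf = rrf([bm25_ranks, vector_ranks])
```

In this sample, the second document ranks near the top for both retrievers, so it receives the highest fused score under both CombSUM and RRF. RRF uses only ranks, which makes it insensitive to the very different score scales of BM25 and embedding similarity, whereas the Comb* family depends on the normalization step.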

Structured Output

FlockMTL provides structured output capabilities that allow users to obtain predictable, schema-compliant JSON responses from Large Language Models. This feature works with all FlockMTL LLM functions and supports both OpenAI and Ollama providers, ensuring consistent data formats for downstream processing.

Resource Management

FlockMTL introduces a resource management framework that treats models (MODEL) and prompts (PROMPT) similarly to tables, allowing for organized storage and retrieval.

System Requirements

FlockMTL runs on the major operating systems, including:

  • Linux
  • macOS
  • Windows

To run FlockMTL reliably, only two requirements need to be met:

  • DuckDB Setup: Version 1.1.1 or later. FlockMTL is compatible with the latest stable release of DuckDB, which can be installed by following the official DuckDB installation guide.
  • Provider API Key: FlockMTL supports multiple providers such as OpenAI, Azure, and Ollama. Configure the provider of your choice to get started.