GenAI

699 bookmarks

Azure API Management Your Auth Gateway For MCP Servers | Microsoft Community Hub

The Model Context Protocol (MCP) is quickly becoming the standard for integrating Tools 🛠️ with Agents 🤖, and Azure API Management is at the forefront,...

Design Pattern

·techcommunity.microsoft.com·Apr 19, 2025

cohere-developer-experience/notebooks/guides/embed-v4-pdf-search/embed-v4-pdf-search.ipynb at main · cohere-ai/cohere-developer-experience

Docs, Snippets, Guides. Contribute to cohere-ai/cohere-developer-experience development by creating an account on GitHub.

Example #embedding

·github.com·Apr 17, 2025

cohere-developer-experience/notebooks at main · cohere-ai/cohere-developer-experience

Docs, Snippets, Guides. Contribute to cohere-ai/cohere-developer-experience development by creating an account on GitHub.

Example #cohere #embedding

·github.com·Apr 17, 2025

Chain-of-Thought Prompting

Learn how Chain-of-Thought prompting improves AI reasoning by guiding models to explain their thought process. Discover its impact on LLM accuracy and complex tasks.

Prompt Engineering

·learnprompting.org·Apr 12, 2025

gemini-samples/guides/langgraph-react-agent.ipynb at main · philschmid/gemini-samples

Contribute to philschmid/gemini-samples development by creating an account on GitHub.

Tutorial

·github.com·Apr 12, 2025

Relevance Feedback in Informational Retrieval

Relevance feedback: from ancient history to LLMs. Why relevance feedback techniques are good on paper but not popular in neural search, and what we can do about it.

Deep Dive #embedding #re-ranking #vector-similarity

·qdrant.tech·Apr 12, 2025

RAG from Scratch

Contribute to labdmitriy/llm-rag development by creating an account on GitHub.

Example

·github.com·Apr 11, 2025

Google's Prompt Engineering

Prompt Engineering #ebook

·up.raindrop.io·Apr 11, 2025

What Is GraphRAG?

GraphRAG is a powerful retrieval mechanism that improves Generative AI applications by taking advantage of the rich context in graph data structures.

Concept #knowledge-graph #neo4j

·neo4j.com·Apr 11, 2025

Introducing the Weaviate Query Agent | Weaviate

Learn about the Query Agent, our new agentic search service that redefines how you interact with Weaviate’s database!

Design Pattern

·weaviate.io·Apr 11, 2025

An Overview of Late Interaction Retrieval Models: ColBERT, ColPali, and ColQwen

Late interaction allows for semantically rich interactions that enable a precise retrieval process across different modalities of unstructured data, including text and images.

In this context, “interaction” refers to the process of assessing how well a document matches a given search query by comparing their representations.

A dense retrieval model is a model that uses some type of neural network architecture to retrieve relevant documents for a search query.

Traditional methods for retrieval commonly use “no-interaction” retrieval models. In this case, the search query and documents are processed separately.

Advantages of no-interaction retrieval models are primarily that they are fast and computationally efficient.

These characteristics make full interaction models great for second-stage retrieval, like reranking a curated set of candidate documents

extremely computationally expensive

contextually rich

scalable and contextually rich

storage requirements - they require an embedding for each token, which requires a lot more storage for a complete set of vectors

Disadvantages of no-interaction retrieval models lie in the lack of interaction between the search query and the documents.

multimodal late interaction retrieval models

vision language models (VLMs) instead of text-only models
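The highlights above contrast scoring with one vector per text against scoring with one embedding per token. A minimal sketch of the difference, assuming toy 2-dimensional token embeddings and mean pooling as the single-vector stand-in (both are illustrative assumptions, not taken from the article); the late-interaction score uses the MaxSim operator associated with ColBERT-style models:

```python
from math import sqrt

Vector = list[float]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def normalize(v: Vector) -> Vector:
    norm = sqrt(dot(v, v))
    return [x / norm for x in v]

def mean_pool(tokens: list[Vector]) -> Vector:
    # Collapse all token embeddings into one unit-length vector.
    dim = len(tokens[0])
    return normalize([sum(t[i] for t in tokens) / len(tokens) for i in range(dim)])

def no_interaction_score(query_tokens: list[Vector], doc_tokens: list[Vector]) -> float:
    # Query and document are encoded separately and compared once.
    return dot(mean_pool(query_tokens), mean_pool(doc_tokens))

def late_interaction_score(query_tokens: list[Vector], doc_tokens: list[Vector]) -> float:
    # MaxSim: each query token keeps its best-matching document token,
    # and the per-token maxima are summed.
    q = [normalize(t) for t in query_tokens]
    d = [normalize(t) for t in doc_tokens]
    return sum(max(dot(qt, dt) for dt in d) for qt in q)

# Hypothetical toy embeddings: the query has two distinct "concepts".
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.0]]   # covers only the first query token

late_a = late_interaction_score(query, doc_a)
late_b = late_interaction_score(query, doc_b)
pooled_a = no_interaction_score(query, doc_a)
pooled_b = no_interaction_score(query, doc_b)
```

The storage trade-off mentioned in the highlights is visible here: the late-interaction path must keep every token vector of every document, while the pooled path keeps one vector per document.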

Concept

·weaviate.io·Apr 10, 2025

Remote MCP with Azure Functions (Python) - Code Samples

Run a remote MCP server on Azure Functions.

Tutorial

·learn.microsoft.com·Apr 10, 2025

Azure-Samples/remote-mcp-apim-functions-python: Azure API Management as AI Gateway to Remote MCP servers.

Azure API Management as AI Gateway to Remote MCP servers. - Azure-Samples/remote-mcp-apim-functions-python

Example

·github.com·Apr 10, 2025

Authentication and Authorization - Azure App Service

Learn about the built-in authentication and authorization support in Azure App Service and Azure Functions, and how it can help secure your app.

Design Pattern

·learn.microsoft.com·Apr 10, 2025

Azure-Samples/remote-mcp-functions-python

Contribute to Azure-Samples/remote-mcp-functions-python development by creating an account on GitHub.

Example

·github.com·Apr 8, 2025

A Visual Guide to Reasoning LLMs

How do we create LLMs that can reason? Exploring Test-Time Compute Techniques and DeepSeek-R1.

Deep Dive #reasoning

·newsletter.maartengrootendorst.com·Apr 8, 2025

The "think" tool: Enabling Claude to stop and think \ Anthropic

A blog post for developers, describing a new method for complex tool-use situations

The primary evaluation metric used in τ-bench is pass^k, which measures the probability that all k independent task trials are successful for a given task, averaged across all tasks. Unlike the pass@k metric that is common for other LLM evaluations (which measures if at least one of k trials succeeds), pass^k evaluates consistency and reliability—critical qualities for customer service applications where consistent adherence to policies is essential.
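The difference between the two metrics in the highlight above is easy to see numerically. A minimal sketch, assuming each task was run n times with c successes and using the standard combinatorial estimators (the per-task counts are made-up illustration data, and the helper names are hypothetical):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # Probability that at least one of k trials drawn from the n recorded
    # trials succeeded: 1 - C(n - c, k) / C(n, k).
    return 1.0 - comb(n - c, k) / comb(n, k)

def pass_hat_k(n: int, c: int, k: int) -> float:
    # Probability that all k drawn trials succeeded: C(c, k) / C(n, k).
    return comb(c, k) / comb(n, k)

def average_over_tasks(metric, successes: list[int], n: int, k: int) -> float:
    # Both metrics are reported as a mean over tasks.
    return sum(metric(n, c, k) for c in successes) / len(successes)

# Hypothetical results: 3 tasks, 4 trials each, with 4, 2, and 0 successes.
successes = [4, 2, 0]
n, k = 4, 2

avg_pass_at_k = average_over_tasks(pass_at_k, successes, n, k)
avg_pass_hat_k = average_over_tasks(pass_hat_k, successes, n, k)
print(f"pass@{k} = {avg_pass_at_k:.3f}, pass^{k} = {avg_pass_hat_k:.3f}")
```

The task that succeeds only half the time contributes 5/6 to pass@2 but only 1/6 to pass^2, which is why pass^k penalizes inconsistent behavior far more heavily.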

Design Pattern

·anthropic.com·Apr 6, 2025

MCP Security Notification: Tool Poisoning Attacks

We have discovered a critical vulnerability in the Model Context Protocol (MCP) that allows for

Concept #security

·invariantlabs.ai·Apr 6, 2025

LangChain (@LangChainAI) on X

Understanding multi-agent handoffs. Handoffs are a central concept in multi-agent systems. LangGraph swarm is built on them. But they can be hard to understand. Here, we break down the swarm handoff mechanism. 📽️: https://t.co/YkSCFeg9A8

Concept #swarm

·x.com·Apr 4, 2025

VectifyAI/PageIndex: Document Index System for Reasoning-Based RAG

Document Index System for Reasoning-Based RAG

Tool #semantic-search

·github.com·Apr 4, 2025

How to Build a Knowledge Graph in 7 Steps

Discover how to build a knowledge graph in 7 simple steps, from defining your use case to creating a model to ingesting your data.

Tutorial #entity-resolution #knowledge-graph

·neo4j.com·Apr 3, 2025

Open-Source MCP servers | Glama

Enterprise-grade security and privacy, with features like agents, MCP, prompt templates, and more.

Tool

·glama.ai·Apr 2, 2025

Smithery - Model Context Protocol Registry

Extend your agent's capabilities with Model Context Protocol servers.

Tool

·smithery.ai·Apr 2, 2025

TICKing All the Boxes: Generated Checklists Improve LLM Evaluation...

Given the widespread adoption and usage of Large Language Models (LLMs), it is crucial to have flexible and interpretable evaluations of their instruction-following ability. Preference judgments between model outputs have become the de facto evaluation standard, despite distilling complex, multi-faceted preferences into a single ranking. Furthermore, as human annotation is slow and costly, LLMs are increasingly used to make these judgments, at the expense of reliability and interpretability. In this work, we propose TICK (Targeted Instruct-evaluation with ChecKlists), a fully automated, interpretable evaluation protocol that structures evaluations with LLM-generated, instruction-specific checklists. We first show that, given an instruction, LLMs can reliably produce high-quality, tailored evaluation checklists that decompose the instruction into a series of YES/NO questions. Each question asks whether a candidate response meets a specific requirement of the instruction. We demonstrate that using TICK leads to a significant increase (46.4% → 52.2%) in the frequency of exact agreements between LLM judgements and human preferences, as compared to having an LLM directly score an output. We then show that STICK (Self-TICK) can be used to improve generation quality across multiple benchmarks via self-refinement and Best-of-N selection. STICK self-refinement on LiveBench reasoning tasks leads to an absolute gain of +7.8%, whilst Best-of-N selection with STICK attains +6.3% absolute improvement on the real-world instruction dataset, WildBench. In light of this, structured, multi-faceted self-improvement is shown to be a promising way to further advance LLM capabilities. Finally, by providing LLM-generated checklists to human evaluators tasked with directly scoring LLM responses to WildBench instructions, we notably increase inter-annotator agreement (0.194 → 0.256).
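The checklist idea in the abstract above can be sketched in a few lines. This is an illustration only: the judge here is a hypothetical rule-based stand-in for the LLM that would answer each YES/NO question, and scoring a response as the fraction of YES answers is an assumed aggregation, not necessarily the one used in the paper.

```python
from typing import Callable, Sequence

# A judge answers one YES/NO checklist question about one response.
Judge = Callable[[str, str], bool]

def checklist_score(checklist: Sequence[str], response: str, judge: Judge) -> float:
    # Assumed aggregation: fraction of checklist questions answered YES.
    answers = [judge(question, response) for question in checklist]
    return sum(answers) / len(answers)

def best_of_n(checklist: Sequence[str], candidates: Sequence[str], judge: Judge) -> str:
    # Best-of-N selection: keep the candidate with the highest checklist score.
    return max(candidates, key=lambda r: checklist_score(checklist, r, judge))

# Hypothetical checklist for the instruction "State the price in under 20 words."
checklist = [
    "Does the response mention a price?",
    "Is the response under 20 words?",
]

# Toy rule-based checks standing in for an LLM judge.
toy_checks = {
    checklist[0]: lambda r: "$" in r,
    checklist[1]: lambda r: len(r.split()) < 20,
}

def toy_judge(question: str, response: str) -> bool:
    return toy_checks[question](response)

candidates = [
    "It costs $5.",
    "It is affordable and many people like it very much.",
]

scores = [checklist_score(checklist, c, toy_judge) for c in candidates]
winner = best_of_n(checklist, candidates, toy_judge)
```

Because each question is scored separately, the per-question answers also explain why a response ranked where it did, which is the interpretability benefit the abstract emphasizes.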

Literature #explainability

·arxiv.org·Apr 2, 2025

Everything a Developer Needs to Know About the Model Context Protocol (MCP)

The Model Context Protocol gives users a wide range of services they can connect to from the comfort of their AI client.

Concept #neo4j

·neo4j.com·Apr 2, 2025

Running Dockerized Puppeteer in Claude Desktop

Discover how the Model Context Protocol (MCP) simplifies building AI applications by seamlessly integrating Anthropic Claude with Docker Desktop, enhancing developer productivity and workflow efficiency.

Tutorial

·docker.com·Apr 1, 2025

Building Your Own RAG System: Enhancing Claude with Your Documentation

Connecting Claude Desktop to Your Documentation Through MCP and Qdrant

Tutorial #llamaindex

·aiboosted.dev·Mar 31, 2025

mcp-use

Model-Agnostic MCP Library for LLMs

Package

·pypi.org·Mar 31, 2025

It's MCP week apparently! Yesterday we showed you how to use LlamaCloud as… | LlamaIndex

It's MCP week apparently! Yesterday we showed you how to use LlamaCloud as an MCP server, today see how to use LlamaIndex as a client to any MCP…

Example #llamaindex

·linkedin.com·Mar 28, 2025

RAG 2.0 is really about grounding general purpose agents in proprietary… | Jerry Liu

RAG 2.0 is really about grounding general purpose agents in proprietary enterprise context. Instead of simply one-shot answering a simple question, the agent…

Report

·linkedin.com·Mar 27, 2025
