Question

New Embedding transformers on the FME Hub

  • April 21, 2026
  • 0 replies
  • 7 views

oliver.morris
Contributor

Pleased to share that the team has published two custom transformers to the FME Hub, designed specifically to make embedding generation easier.

Vector embeddings are the essential building blocks for semantic search and Retrieval-Augmented Generation (RAG). There is more background on why they are useful here:


To make it easier to generate these embeddings directly within your FME workspaces, we’ve created two new connectors tailored to different deployment preferences:

OllamaEmbeddingsConnector
If you prefer running your Large Language Models (LLMs) locally for strict data privacy, offline processing, or simply to experiment with open-source models (like Llama 3 or Mistral), this one is for you. This transformer connects your FME workflow to a local Ollama instance, allowing you to generate embeddings securely without your data ever leaving your machine.
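For context on what a call like this involves, a local Ollama instance exposes a simple REST endpoint for embeddings. Here is a minimal Python sketch of that kind of request, assuming Ollama is running on its default port (localhost:11434) with an embedding-capable model such as nomic-embed-text already pulled; the helper names are illustrative, not the transformer's actual internals:

```python
import json
import urllib.request

# Ollama's default local embeddings endpoint (assumed default install)
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_payload(model: str, text: str) -> dict:
    """Build the JSON body Ollama expects for an embedding request."""
    return {"model": model, "prompt": text}

def embed(model: str, text: str) -> list:
    """POST the text to the local Ollama instance and return the embedding vector.

    Requires a running Ollama server; no data leaves the machine.
    """
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama responds with {"embedding": [0.12, -0.03, ...]}
        return json.load(resp)["embedding"]
```

The transformer wraps this kind of call per feature, so you get one vector attribute per input text attribute without writing any of the plumbing yourself.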

GeminiEmbeddings
If you’re looking for cloud-powered, enterprise-grade capabilities, this transformer integrates seamlessly with Google’s Gemini API. It’s perfect for generating high-quality embeddings at scale using Google’s infrastructure, directly within your workspace.
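For comparison with the local option, the Gemini API serves embeddings over a REST endpoint (embedContent). A rough sketch of that request shape in Python, assuming a valid API key and an embedding model name such as text-embedding-004; the helper names are illustrative and the exact request shape may differ between API versions:

```python
import json
import urllib.request

# Gemini API base URL (v1beta, as documented at the time of writing)
API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"

def build_payload(text: str) -> dict:
    """Build the JSON body the embedContent method expects."""
    return {"content": {"parts": [{"text": text}]}}

def embed(model: str, text: str, api_key: str) -> list:
    """Call the Gemini embedContent endpoint and return the embedding values.

    Network call requiring a valid API key; not run here.
    """
    req = urllib.request.Request(
        f"{API_BASE}/{model}:embedContent",
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        # Response shape: {"embedding": {"values": [0.01, ...]}}
        return json.load(resp)["embedding"]["values"]
```

The transformer handles authentication and batching for you; the sketch just shows what a single request looks like on the wire.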

What can you do with them?

  • Populate Vector Databases: Generate embeddings from your datasets and feed them straight into vector stores (like pgvector, Pinecone, or Qdrant) for advanced spatial-semantic queries.

  • Build RAG Pipelines: Construct and automate Retrieval-Augmented Generation workflows entirely within FME.

  • Intelligent Data Matching: Group, cluster, and categorise your text data based on semantic similarity rather than relying on exact keyword matches.
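The matching and clustering in that last bullet ultimately comes down to comparing embedding vectors, most commonly by cosine similarity. A small self-contained sketch with toy three-dimensional vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 = similar direction,
    near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for three terms; values invented for illustration only.
road = [0.9, 0.1, 0.0]
street = [0.8, 0.2, 0.0]
river = [0.1, 0.0, 0.9]

print(cosine_similarity(road, street))  # high: semantically similar terms
print(cosine_similarity(road, river))   # low: unrelated terms
```

This is why embeddings beat keyword matching for grouping: "road" and "street" share no characters, but their vectors point in nearly the same direction.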

Have a play; feedback welcome.