1 Commits

Author SHA1 Message Date
justin 2e10279beb eval: new baseline on the 4-endpoint embed pool index
22 queries against the prod image index rebuilt today on the expanded
GPU pool with the resilient embedder (PR #8): dense MRR 0.539→0.924,
bm25+rerank 0.920→0.959, hybrid_rrf+rerank 0.875→0.960 vs the
2026-05-22 baseline. No regression from mixed-provenance embeddings.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 20:38:23 -04:00
6 changed files with 34 additions and 73 deletions
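For readers comparing the numbers in the commit message: MRR (mean reciprocal rank) averages 1/rank of the first relevant hit over all queries. The sketch below is only an illustration, assuming one known-relevant document id per query. The function name and argument shapes are hypothetical and are not taken from this repository's eval harness.

```python
from typing import Hashable, List, Sequence


def mean_reciprocal_rank(
    ranked_ids_per_query: Sequence[Sequence[Hashable]],
    relevant_id_per_query: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the relevant id; a query with no hit scores 0."""
    if not relevant_id_per_query:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id == relevant:
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(relevant_id_per_query)
```

Under this definition, a move such as 0.539 to 0.924 means the relevant document went from sitting near rank 2 on average to being the top hit for most queries.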
+1 -1
@@ -22,7 +22,7 @@ env:
   # Two GPU-pinned Ollama containers on the Gitea host — same infra
   # zerto-docs uses. :11435 = Titan X, :11436 = 1080 Ti. Indexer
   # round-robins per batch.
-  OLLAMA_URLS: http://192.168.0.2:11435,http://192.168.0.2:11436,http://192.168.0.125:11434,http://192.168.0.126:11434
+  OLLAMA_URLS: http://192.168.0.2:11435,http://192.168.0.2:11436
   EMBED_MODEL: nomic-embed-text
   PRODUCT_NAME: hvm
+1 -1
@@ -34,7 +34,7 @@ env:
   # :11435 owns the Titan X, :11436 owns the 1080 Ti; the indexer
   # round-robins per batch so both cards run in parallel. The host's
   # primary Ollama on :11434 is left alone for OpenWebUI etc.
-  OLLAMA_URLS: http://192.168.0.2:11435,http://192.168.0.2:11436,http://192.168.0.125:11434,http://192.168.0.126:11434
+  OLLAMA_URLS: http://192.168.0.2:11435,http://192.168.0.2:11436
   EMBED_MODEL: nomic-embed-text
   PRODUCT_NAME: hvm
File diff suppressed because one or more lines are too long
@@ -6,7 +6,7 @@ New hardware and software are continually being tested and certified. This docum
NOTE NOTE
There is a self-validation program for partners to quickly validate their solutions for compatibility with HVM. For more information about the program, see [HPE Morpheus VM Essentials Software self-validation program](https://www.hpe.com/psnow/doc/a00155725enw?jumpid=in_pdfviewer-psnow). There is a self-validation program for partners to quickly validate their solutions for compatibility with HVM. For more information about the program, see [the self-validation program documentation](https://www.hpe.com/psnow/doc/a00155725enw?jumpid=in_pdfviewer-psnow).
Table 1. Server Hardware Support Table 1. Server Hardware Support
@@ -44,20 +44,16 @@ Table 1. Server Hardware Support
| Cisco | AMD 1 RU server | Cisco UCS C225 M8 | | | Cisco | AMD 1 RU server | Cisco UCS C225 M8 | |
| Dell | | Dell PowerEdge R660 | iSCSI, FC, NFS | | Dell | | Dell PowerEdge R660 | iSCSI, FC, NFS |
| Dell | | Dell PowerEdge R670 | iSCSI, FC, NFS | | Dell | | Dell PowerEdge R670 | iSCSI, FC, NFS |
| Fujitsu | Intel x86 1U rack server | Fujitsu PRIMERGY | For more information, see [PRIMERGY RX2530 M8 Rack Server Data Sheet](https://sp.ts.fujitsu.com/dmsp/Publications/public/ds-py-rx2530-m8-en.pdf). |
| Lenovo | | Lenovo ThinkEdge SE450 | | | Lenovo | | Lenovo ThinkEdge SE450 | |
| Lenovo | Intel x86 2U rack server | Lenovo ThinkSystem SR650 V3 | For more information, see [Lenovo ThinkSystem SR650 V3 Server Product Guide](https://lenovopress.lenovo.com/lp1601-thinksystem-sr650-v3-server). |
| Supermicro | Intel 2 RU server | Supermicro SYS-621H-TN12 | | | Supermicro | Intel 2 RU server | Supermicro SYS-621H-TN12 | |
| xFusion | | Fusion Server 1288H V7 | | | xFusion | | Fusion Server 1288H V7 | |
1 For Synergy hardware device support, review the latest SSP published for Ubuntu at <https://support.hpe.com/docs/display/public/synergy-sw-release/OS_Support.html>. HPE Synergy D3940 is not supported, and for more information about Ubuntu support in HPE Synergy, see the [Customer Notice a00129603](https://support.hpe.com/hpesc/public/docDisplay?docId=emr_na-a00129603en_us). 1 For Synergy hardware device support, review the latest SSP published for Ubuntu at <https://support.hpe.com/docs/display/public/synergy-sw-release/OS_Support.html>. Please note that HPE Synergy D3940 is not supported, and for additional information on Ubuntu support in HPE Synergy, refer to the [Customer Notice a00129603](https://support.hpe.com/hpesc/public/docDisplay?docId=emr_na-a00129603en_us).
Table 2. Storage Hardware Support Table 2. Storage Hardware Support
| Vendor | Hardware Family | Platform Type | Comments | | Vendor | Hardware Family | Platform Type | Comments |
| --- | --- | --- | --- | | --- | --- | --- | --- |
| Lenovo | DE Series | DE6600F | Fibre channel |
| Hitachi Vantara | VSP One B20 | Firmware SVOS RF10.4 or later | Fibre Channel |
| Dell | Dell | PowerStore 4.0.0.2 | iSCSI, FC, NFS | | Dell | Dell | PowerStore 4.0.0.2 | iSCSI, FC, NFS |
| Dell | Dell Unity | Dell EMC Unity XT 880F | | | Dell | Dell Unity | Dell EMC Unity XT 880F | |
| Dell | Dell PowerStore | Dell PowerStore | | | Dell | Dell PowerStore | Dell PowerStore | |
@@ -79,30 +75,27 @@ Table 2. Storage Hardware Support
NOTE NOTE
For more information, see [Storage matrix](https://www.hpe.com/storage/spock). For more detail, please visit the [storage matrix](http://www.hpe.com/storage/spock)
Table 3. Independent Software Vendor (ISV) Support Table 3. Independent Software Vendor (ISV) Support
| Partner | Product Name | Product Version | Deployment | Validation Type | Resources | | Partner | Product Name | Product Version | Deployment | Validation Type | Resources |
| --- | --- | --- | --- | --- | --- | | --- | --- | --- | --- | --- | --- |
| Accops | Accops HyWorks | 4.1.0.23929 | | Partner | - [Connector Management Overview](https://docs.accops.com/HyWorks_4_0/content/core_features/connector_mgmt_overview.html) - [Connector Feature Matrix Comparison](https://docs.accops.com/HyWorks_4_0/content/common/dedicated_provider_feature_matrix.html) - [Features and Enhancements in HyWorks v4.1](https://docs.accops.com/HyWorks_4_0/content/release_notes/4.1/4.1_features_enhancements.html) |
| Opentext | Data Protector | 24.4 and 25.4 | | Partner | - [Opentext Data Protector 24.4 Virtualization Support Matrix](https://docs.microfocus.com/DP/SupportMatrix/24.4/Virtualization_SupportMatrix_DP24.4.pdf) - [Opentext Data Protector 25.4 Virtualization Support Matrix](https://docs.microfocus.com/doc/200/25.4/dpvirtualizationsupportmatrix) |
| Wipro | VisionEDGE | 10.7.0 | | | |
| Aerospike | Aerospike Enterprise Edition Database | 8.1.0.2 | | HPE | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/powering-aerospike-clusters-with-hpe-morpheus-vm-essentials/ba-p/7262354) | | Aerospike | Aerospike Enterprise Edition Database | 8.1.0.2 | | HPE | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/powering-aerospike-clusters-with-hpe-morpheus-vm-essentials/ba-p/7262354) |
| Apache | Cassandra DB | 5.0.6 | | HPE | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/cassandra-with-hpe-morpheus-vm-essentials-software-scale-and/ba-p/7259488) | | Apache | Cassandra DB | 5.0.6 | | HPE | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/cassandra-with-hpe-morpheus-vm-essentials-software-scale-and/ba-p/7259488) |
| CelerData | CelerData Enterprise (StarRocks) | 3.4.4 | | | | | CelerData | CelerData Enterprise (StarRocks) | 3.4.4 | | | |
| Citrix1,3 | Citrix Virtual Apps and Desktops | 7.2402 LTSR CU1 | | HPE | | | Citrix1,3 | Citrix Virtual Apps and Desktops | 7.2402 LTSR CU1 | | HPE | |
| Cohesity | DataProtect | 7.1.2 and later | Agentbased | Partner | - [Technical Brief](https://psnow.ext.hpe.com/doc/a00146586enw) - [TekTalkonPoint](https://vshow.on24.com/vshow/HPETekTalks/content/4929110/) - [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/protect-hpe-morpheus-vm-essentials-software-vms-with-hpe/ba-p/7240793) | | Cohesity | DataProtect | 7.1.2 and later | Agentbased | Partner | [Technical Brief](https://psnow.ext.hpe.com/doc/a00146586enw), [TekTalkonPoint](https://vshow.on24.com/vshow/HPETekTalks/content/4929110/), [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/protect-hpe-morpheus-vm-essentials-software-vms-with-hpe/ba-p/7240793) |
| Cohesity | NetBackup | 11 | Agentbased | Partner | [Release Notes](https://urldefense.com/v3/__https:/www.veritas.com/support/en_US/doc/103228346-168289021-1__;!!NpxR!jDjqUFB8W_nHe21CV5Pr5HQI_JYJVb8JzEDaoWsgX-ql62BKdr7VMcYhflhPHfhA-iDDH26OitC3RorzksoLJQKzxjk$) | | Cohesity | NetBackup | 11 | Agentbased | Partner | [Release Notes](https://urldefense.com/v3/__https:/www.veritas.com/support/en_US/doc/103228346-168289021-1__;!!NpxR!jDjqUFB8W_nHe21CV5Pr5HQI_JYJVb8JzEDaoWsgX-ql62BKdr7VMcYhflhPHfhA-iDDH26OitC3RorzksoLJQKzxjk$) |
| Commvault2 | Commvault Cloud Backup and Recovery | 11.40 | Agent-Based, Imagebased | Partner | - [Technical Brief](https://www.hpe.com/psnow/doc/a50013306enw) - [Tektalk-on-Point](https://vshow.on24.com/vshow/HPETekTalks/content/5025885/) - [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/protect-hpe-morpheus-vm-essentials-software-vms-with-commvault-s/ba-p/7245399) - [Solution Brief](https://www.commvault.com/resources/solution-brief/secure-your-virtual-environment-with-simplicity-and-scale) | | Commvault2 | Commvault Cloud Backup and Recovery | 11.40 | Agent-Based, Imagebased | Partner | [Technical Brief](https://www.hpe.com/psnow/doc/a50013306enw), [Tektalk-on-Point](https://vshow.on24.com/vshow/HPETekTalks/content/5025885/), [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/protect-hpe-morpheus-vm-essentials-software-vms-with-commvault-s/ba-p/7245399), [Solution Brief](https://www.commvault.com/resources/solution-brief/secure-your-virtual-environment-with-simplicity-and-scale) |
| HP | HP Anyware | 25.03.1 | | HPE | [Technical Paper](https://www.hpe.com/psnow/doc/a00155457enw) | | HP | HP Anyware | 25.03.1 | | HPE | [Technical Paper](https://www.hpe.com/psnow/doc/a00155457enw) |
| iTernity | iCAS | 3.7.7.4 | | Partner | | | iTernity | iCAS | 3.7.7.4 | | Partner | |
| OpenText | OpenText Analytics Database (Vertica) | 26 or 25.2 | | HPE | | | OpenText | OpenText Analytics Database (Vertica) | 26 or 25.2 | | HPE | |
| Oracle1 | Database | 19c | Single instance only; Oracle RAC support TBD | HPE | - [Technical Brief](https://www.hpe.com/psnow/doc/a50012368enw) - [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/reduce-costs-with-hpe-vm-essentials-in-your-oracle-database-on/ba-p/7238767) - [TekTalkonPoint](https://vshow.on24.com/vshow/HPETekTalks/content/4937728/) | | Oracle1 | Database | 19c | Single instance only; Oracle RAC support TBD | HPE | [Technical Brief](https://www.hpe.com/psnow/doc/a50012368enw), [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/reduce-costs-with-hpe-vm-essentials-in-your-oracle-database-on/ba-p/7238767), [TekTalkonPoint](https://vshow.on24.com/vshow/HPETekTalks/content/4937728/) |
| Oracle | MySQL Community Edition | 8.4.6 | | | | | Oracle | MySQL Community Edition | 8.4.6 | | | |
| Medical Informatics Corp. (MIC) | Sickbay Clinical Platform | 3.45.4.0 | | HPE | | | Medical Informatics Corp. (MIC) | Sickbay Clinical Platform | 3.45.4.0 | | HPE | |
| Microsoft1 | SQL Server | SQL Server 2016, 2017, 2019, 2022, 2025 | Single instance with Availability Groups | HPE | - [Technical Brief](https://www.hpe.com/psnow/doc/a50012536enw?jumpid=in_ResourceLibrary) - [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/sql-server-runs-on-the-new-hpe-vm-essentials/ba-p/7238640) - [TekTalk-on-Point](https://support.hpe.com/hpesc/public/api/document/sd00006551en_us/render?page=GUID-EA7C0803-E66B-4B17-B994-30D4025A258F.xml#GUID-EA7C0803-E66B-4B17-B994-30D4025A258F) | | Microsoft1 | SQL Server | SQL Server 2016, 2017, 2019, 2022, 2025 | Single instance with Availability Groups | HPE | [Technical Brief](https://www.hpe.com/psnow/doc/a50012536enw?jumpid=in_ResourceLibrary), [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/sql-server-runs-on-the-new-hpe-vm-essentials/ba-p/7238640), [TekTalk-on-Point](https://support.hpe.com/hpesc/public/api/document/sd00006551en_us/render?page=GUID-EA7C0803-E66B-4B17-B994-30D4025A258F.xml#GUID-EA7C0803-E66B-4B17-B994-30D4025A258F) |
| MongoDB1 | Enterprise Advanced | 8.0.0 | | HPE | - [Technical Brief](https://www.hpe.com/psnow/doc/a50012355enw) - [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/optimize-ai-development-how-hpe-vm-essentials-and-mongodb/ba-p/7235922) - [Video](https://youtu.be/UYpOJ6JnuEk) | | MongoDB1 | Enterprise Advanced | 8.0.0 | | HPE | [Technical Brief](https://www.hpe.com/psnow/doc/a50012355enw), [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/optimize-ai-development-how-hpe-vm-essentials-and-mongodb/ba-p/7235922), [Video](https://youtu.be/UYpOJ6JnuEk) |
| Elastic1 | Elastic Stack | 9.0.01 | | HPE | [Technical Paper](https://hpe.seismic.com/Link/Content/DCGR7RhHBHG8QGmT6C74h6gqg4fj) | | Elastic1 | Elastic Stack | 9.0.01 | | HPE | [Technical Paper](https://hpe.seismic.com/Link/Content/DCGR7RhHBHG8QGmT6C74h6gqg4fj) |
| Omnissa1,3 | Horizon | 8.13.1 (Build 11490723527) | | HPE | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/unlock-efficient-vdi-with-hpe-vm-essentials-software-and-omnissa/ba-p/7238879) | | Omnissa1,3 | Horizon | 8.13.1 (Build 11490723527) | | HPE | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/unlock-efficient-vdi-with-hpe-vm-essentials-software-and-omnissa/ba-p/7238879) |
| OpenText | Data Protector | 24.4 | Agent-based | Partner | | | OpenText | Data Protector | 24.4 | Agent-based | Partner | |
@@ -114,8 +107,8 @@ Table 3. Independent Software Vendor (ISV) Support
| Splunk | Splunk Enterprise SmartStore | 9.4.3 | Distributed clustered deployment, single site | HPE | | | Splunk | Splunk Enterprise SmartStore | 9.4.3 | Distributed clustered deployment, single site | HPE | |
| Thales Group | CipherTrust Manager | 2.22.0 and 2.11.1 | | Partner | [Deployment Guide](https://docs-cybersec.thalesgroup.com/bundle/latest-cdsp-cm/page/get_started/deployment/virtual-deployment/private-cloud-deployment/index.html#hpe-vm-essentials-deployment) | | Thales Group | CipherTrust Manager | 2.22.0 and 2.11.1 | | Partner | [Deployment Guide](https://docs-cybersec.thalesgroup.com/bundle/latest-cdsp-cm/page/get_started/deployment/virtual-deployment/private-cloud-deployment/index.html#hpe-vm-essentials-deployment) |
| Vali Cyber | ZeroLock | 3.9.8 | | | | | Vali Cyber | ZeroLock | 3.9.8 | | | |
| Veeam4 | Veeam Backup & Replication | 13.0.1 | Image-based with Changed Block Tracking (CBT) | Partner | - [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/increase-virtualization-efficiency-and-data-protection/ba-p/7262542) - [Blog](https://urldefense.com/v3/__https:/www.veeam.com/blog/veeam-agentless-backup-hpe-morpheus-vm-essentials.html__;!!NpxR!jkrbZXsRMKMZ7lPimKHlkym6liEIAxhAVF0erZc-r0YIQJG_mpDH8gwHuSrw8SF0hOsP-uAS_FKFJWhgvdzOtT9TZg$) - [Video](https://www.hpe.com/h22228/video-gallery/us/en/v100010406/video/) | | Veeam4 | Veeam Backup & Replication | 13.0.1 | Image-based with Changed Block Tracking (CBT) | Partner | [Blog](https://community.hpe.com/t5/the-cloud-experience-everywhere/increase-virtualization-efficiency-and-data-protection/ba-p/7262542), [Blog](https://urldefense.com/v3/__https:/www.veeam.com/blog/veeam-agentless-backup-hpe-morpheus-vm-essentials.html__;!!NpxR!jkrbZXsRMKMZ7lPimKHlkym6liEIAxhAVF0erZc-r0YIQJG_mpDH8gwHuSrw8SF0hOsP-uAS_FKFJWhgvdzOtT9TZg$), [Video](https://www.hpe.com/h22228/video-gallery/us/en/v100010406/video/) |
| Veeam4 | Backup and Replication | 12.3 | Agentbased | Partner | - [Technical Brief](https://www.hpe.com/psnow/doc/a50012338enw) - [Blog-part 1](https://community.veeam.com/blogs-and-podcasts-57/navigating-hpe-vm-essentials-part-1-what-is-it-and-how-to-protect-it-with-veeam-9610) - [Blog-part 2](https://community.veeam.com/blogs-and-podcasts-57/navigating-hpe-vm-essentials-part-2-exploring-backup-strategies-9611) - [Blog-part 3](https://community.veeam.com/blogs-and-podcasts-57/hpe-vme-and-veeam-backup-replication-9863) - [Video](https://psnow.ext.hpe.com/asset?id=7f67fb9a-7e53-4eee-ac47-3f7f89828ca3&preview=true) | | Veeam4 | Backup and Replication | 12.3 | Agentbased | Partner | [Technical Brief](https://www.hpe.com/psnow/doc/a50012338enw), Blog ([part 1](https://community.veeam.com/blogs-and-podcasts-57/navigating-hpe-vm-essentials-part-1-what-is-it-and-how-to-protect-it-with-veeam-9610), [part 2](https://community.veeam.com/blogs-and-podcasts-57/navigating-hpe-vm-essentials-part-2-exploring-backup-strategies-9611), [part 3](https://community.veeam.com/blogs-and-podcasts-57/hpe-vme-and-veeam-backup-replication-9863)), [Video](https://psnow.ext.hpe.com/asset?id=7f67fb9a-7e53-4eee-ac47-3f7f89828ca3&preview=true) |
| Virtual Cable | UDS Enterprise | 4.0 and above | | | | | Virtual Cable | UDS Enterprise | 4.0 and above | | | |
1 Applications have been validated within the bounds of the supported HPE Morpheus Software functionality. Always check the HPE Morpheus Software feature list to determine whether specific functionality is supported by the HVM hypervisor (ex. shared disk access). 1 Applications have been validated within the bounds of the supported HPE Morpheus Software functionality. Always check the HPE Morpheus Software feature list to determine whether specific functionality is supported by the HVM hypervisor (ex. shared disk access).
@@ -128,7 +121,7 @@ Table 3. Independent Software Vendor (ISV) Support
Most modern applications like databases were designed with very “loose” dependance on hardware infrastructure. They can typically run on a variety of hypervisors including virtual machines and containers. The respective ISV vendor typically only specifies the supported underlying operating system (Guest OS) but does not require certification of any hypervisor. However, there can be specific features that a customer deployment of these applications requires at a hypervisor or infrastructure level. For example, a Microsoft SQL Server Failover cluster instance requires a shared disk between multiple SQL Server VMs. Oracle, similarly, requires shared disks for an Oracle Real Application Cluster (RAC) setup. Therefore, it needs to be always validated whether the specific deployment requires certain features and whether these are supported by VM Essentials in its latest release. Most modern applications like databases were designed with very “loose” dependance on hardware infrastructure. They can typically run on a variety of hypervisors including virtual machines and containers. The respective ISV vendor typically only specifies the supported underlying operating system (Guest OS) but does not require certification of any hypervisor. However, there can be specific features that a customer deployment of these applications requires at a hypervisor or infrastructure level. For example, a Microsoft SQL Server Failover cluster instance requires a shared disk between multiple SQL Server VMs. Oracle, similarly, requires shared disks for an Oracle Real Application Cluster (RAC) setup. Therefore, it needs to be always validated whether the specific deployment requires certain features and whether these are supported by VM Essentials in its latest release.
Select ISV applications require “full stack” certifications including OS, hypervisor, compute and storage devices, or even the specific storage connectivity protocol. SAP HANA and related SAP applications are a typical example; so are some Healthcare Electronic Health Record (EDR) applications. If you or your customer plans on running one of these applications, contact your HPE account team. Select ISV applications require “full stack” certifications including OS, hypervisor, compute and storage devices, or even the specific storage connectivity protocol. SAP HANA and related SAP applications are a typical example; so are some Healthcare Electronic Health Record (EDR) applications. If you or your customer plans on running one of these applications, please reach out to your HPE account team.
Table 4. Hypervisor OS Compatibility and Interoperability Matrix Table 4. Hypervisor OS Compatibility and Interoperability Matrix
+4 -35
@@ -19,7 +19,6 @@ from __future__ import annotations
 import os
 import logging
-import time
 from typing import Any
 import httpx
@@ -44,18 +43,11 @@ EMBED_DIM = int(os.environ.get("EMBED_DIM", "768"))
 class OllamaEmbeddings(EmbeddingFunction):
-    """Calls /api/embed across N Ollama endpoints, round-robin per batch.
+    """Calls /api/embed across N Ollama endpoints, naive round-robin.
     For indexing throughput on multiple GPUs, run one Ollama container
     per GPU (pinned via NVIDIA_VISIBLE_DEVICES) and pass all their URLs
     in OLLAMA_URL — the embedder picks the next endpoint per batch.
-    Resilient (ported from zerto-docs PR #45): a failed call rotates to
-    the next endpoint and retries with backoff instead of failing the
-    whole rebuild. HTTP status errors additionally halve the input —
-    the .0.125 Windows Ollama (4090) 400s when its model runner dies on
-    an oversized input array, and one endpoint rejecting a batch the
-    others accept shouldn't kill a multi-hour index build.
     """
     def __init__(self, urls: list[str] = OLLAMA_URLS, model: str = EMBED_MODEL):
@@ -64,37 +56,14 @@ class OllamaEmbeddings(EmbeddingFunction):
         self._next = 0
     def __call__(self, input: Documents) -> Embeddings:
-        return self._embed(list(input), attempt=1)
-    def _embed(self, texts: list, attempt: int) -> Embeddings:
         url = self.urls[self._next % len(self.urls)]
         self._next += 1
-        try:
         with httpx.Client(timeout=300) as c:
             r = c.post(f"{url}/api/embed",
-                       json={"model": self.model, "input": texts})
+                       json={"model": self.model, "input": list(input)})
             r.raise_for_status()
-            return r.json().get("embeddings") or []
-        except (httpx.TransportError, httpx.HTTPStatusError) as e:
-            if isinstance(e, httpx.HTTPStatusError):
-                desc = f"HTTP {e.response.status_code} ({e.response.text[:200]})"
-            else:
-                desc = f"transport error {type(e).__name__}"
-            if attempt >= 5:
-                log.error("%s from %s (%d texts) — giving up after %d attempts",
-                          desc, url, len(texts), attempt)
-                raise
-            if isinstance(e, httpx.HTTPStatusError) and len(texts) > 16:
-                mid = len(texts) // 2
-                log.warning("%s from %s — splitting %d texts into %d+%d (attempt %d)",
-                            desc, url, len(texts), mid, len(texts) - mid, attempt)
-                return (self._embed(texts[:mid], attempt + 1)
-                        + self._embed(texts[mid:], attempt + 1))
-            backoff = 0.5 * (2 ** (attempt - 1))  # 0.5, 1, 2, 4
-            log.warning("%s (attempt %d, %s) — retrying in %.1fs",
-                        desc, attempt, url, backoff)
-            time.sleep(backoff)
-            return self._embed(texts, attempt + 1)
+            data = r.json()
+        return data.get("embeddings") or []
     def name(self) -> str:  # newer chromadb requires this
         return f"ollama:{self.model}"