
@HelloWorld-25 HelloWorld-25 commented Dec 28, 2025

Issue: #230. As highlighted by #229, Searcher's getResults() yields only each result's entry path. While this is convenient for single-archive search, it prevents implementing multi-ZIM search, because the results would be bare path strings with no indication of which ZIM each one came from.

We should therefore implement multi-ZIM search properly by:

1. Binding addArchive to Searcher (reference implementation in #229)
2. Changing the Searcher API so that results can be identified


Our changes:

Allowing multiple archives to be bound to a Searcher

1. Added a _archives list to track all registered archives.

2. Introduced a Searcher.addArchive(archive: Archive) method to register additional archives.

3. Searcher.search(query) now searches across all bound archives.
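
To illustrate the binding behaviour described above, here is a minimal pure-Python sketch. It is not the Cython code from this PR: FakeArchive and FakeSearcher are hypothetical stand-ins, and the substring match is only a placeholder for the real full-text search. It shows the idea of keeping a _archives list and fanning a query out over every bound archive.

```python
from typing import List, Tuple


class FakeArchive:
    """Hypothetical stand-in for an Archive: a name plus its entry paths."""

    def __init__(self, name: str, paths: List[str]) -> None:
        self.name = name
        self.paths = paths


class FakeSearcher:
    """Hypothetical model of a Searcher bound to one or more archives."""

    def __init__(self, archive: FakeArchive) -> None:
        # The first archive is registered at construction time.
        self._archives: List[FakeArchive] = [archive]

    def addArchive(self, archive: FakeArchive) -> None:
        # Register an additional archive to be searched.
        self._archives.append(archive)

    def search(self, query: str) -> List[Tuple[str, str]]:
        # Placeholder matching: a substring test over every bound archive.
        return [
            (archive.name, path)
            for archive in self._archives
            for path in archive.paths
            if query in path
        ]


searcher = FakeSearcher(FakeArchive("wiki", ["A/Python", "A/Java"]))
searcher.addArchive(FakeArchive("docs", ["A/Python_tutorial"]))
hits = searcher.search("Python")
```

Each hit in this sketch carries the archive name alongside the path, which is the property the result type in this PR is meant to guarantee.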

Returning identifiable search results

1. Introduced a new class, SearchResult, containing:

       class SearchResult:
           archive: Archive
           path: str

2. Updated SearchResultSet.__iter__() to yield SearchResult objects instead of strings.

3. Results are now unambiguous because each one includes both the archive and the entry path.
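
The sketch below shows why pairing the archive with the path removes the ambiguity. It is an illustration under assumptions, not the PR's Cython implementation: the dataclass, the constructor taking a list of (archive, path) tuples, and the use of plain strings in place of real Archive objects are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class SearchResult:
    archive: object  # stands in for an Archive instance
    path: str


class SearchResultSet:
    """Hypothetical result set built from (archive, path) pairs."""

    def __init__(self, hits: List[Tuple[object, str]]) -> None:
        self._hits = hits

    def __iter__(self) -> Iterator[SearchResult]:
        # Yield structured results instead of bare path strings.
        for archive, path in self._hits:
            yield SearchResult(archive=archive, path=path)


# Two archives that both contain an entry at the same path.
results = list(SearchResultSet([("wiki.zim", "A/Home"), ("docs.zim", "A/Home")]))
same_path = results[0].path == results[1].path
distinct = results[0] != results[1]
```

With bare strings the two hits would be indistinguishable; here they share a path but compare as different results because the archive differs.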

Updated type hints in search.pyi

1. SearchResultSet.__iter__() now returns Iterator[SearchResult].

2. Searcher.addArchive() is added to the type hints.

3. The type hints now match the new Cython implementation.
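
For readers who want a picture of the typing surface, the following is a rough sketch of how the relevant stub declarations could look. It is an assumption based on the description above, not a copy of search.pyi: the Searcher constructor signature and the empty Archive class are placeholders, and the file is written so that it also runs as ordinary Python.

```python
from typing import Iterator


class Archive:
    """Placeholder for the real Archive type."""


class SearchResult:
    archive: Archive
    path: str


class SearchResultSet:
    def __iter__(self) -> Iterator[SearchResult]: ...


class Searcher:
    # Constructor signature is assumed for illustration.
    def __init__(self, archive: Archive) -> None: ...

    def addArchive(self, archive: Archive) -> None: ...
```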

Benefits:

1. Enables multi-ZIM search.

2. Ensures search results are uniquely identifiable.

3. Provides a clean, maintainable API consistent with libzim C++ internals.

4. Leaves room for features like ranking, filtering, and deduplication across multiple archives.


Backward compatibility:

1. The change from Iterator[str] to Iterator[SearchResult] is an intentional breaking change, needed to support multi-ZIM search.

2. Existing code can still get the path via result.path, which keeps migration simple.
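
As a migration aid, code that previously consumed path strings could be adapted along these lines. This is a sketch under assumptions: result_paths is a hypothetical helper, and the SearchResult dataclass here uses a string for archive only so that the example runs without libzim.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SearchResult:
    archive: str  # stands in for an Archive instance
    path: str


def result_paths(results: Iterable[SearchResult]) -> List[str]:
    """Recover the old list-of-paths shape from the new result objects."""
    return [result.path for result in results]


new_style = [SearchResult("wiki.zim", "A/Home"), SearchResult("wiki.zim", "A/About")]
paths = result_paths(new_style)
```

Callers that only ever search a single archive can wrap the iteration this way and keep the rest of their code unchanged.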

