Add multi-archive search support and identifiable search results #248
+112
−84
Issue: #230. As highlighted by #229, Searcher's getResults() only yields each result's entry path. While convenient for single-archive search, this prevents implementing multi-ZIM search, because the results would be bare path strings with no indication of which ZIM each one came from.
We should therefore implement multi-ZIM search properly by:
Binding addArchive to Searcher (ref impl in #229)
Changing the Searcher API so that results can be identified
Our changes:
Allow multiple archives to be bound to a Searcher
Return identifiable search results instead of bare path strings
Update the type hints in search.pyi to match
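To illustrate why path-only results are ambiguous once several archives are bound, here is a minimal, self-contained sketch. It does not use the libzim bindings: ToySearcher, add_archive, search and the archive_id field are hypothetical stand-ins, and only the path attribute is taken from this PR's description.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class SearchResult:
    # Hypothetical shape: only `path` is named in this PR description;
    # `archive_id` is an assumed identifier for the source ZIM.
    archive_id: str
    path: str


class ToySearcher:
    """In-memory stand-in for a multi-archive searcher (not the libzim binding)."""

    def __init__(self) -> None:
        self._archives: Dict[str, List[str]] = {}

    def add_archive(self, archive_id: str, paths: List[str]) -> None:
        # Bind one more archive to the same searcher.
        self._archives[archive_id] = list(paths)

    def search(self, term: str) -> Iterator[SearchResult]:
        # Naive substring match over entry paths, tagged with the source archive.
        for archive_id, paths in self._archives.items():
            for path in paths:
                if term.lower() in path.lower():
                    yield SearchResult(archive_id=archive_id, path=path)


searcher = ToySearcher()
searcher.add_archive("wikipedia_en", ["A/Paris", "A/London"])
searcher.add_archive("wikivoyage_en", ["A/Paris", "A/Rome"])

results = list(searcher.search("paris"))
paths_only = [r.path for r in results]  # what a path-only API would expose
```

Both hits share the same path, so a list of strings cannot tell them apart, while the result objects stay distinct because each carries its source archive.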
Benefits:
1. Enables multi-ZIM search.
2. Ensures each search result can be traced back to the archive it came from.
3. Keeps the API clean and maintainable, consistent with libzim C++ internals.
4. Leaves room for future features such as ranking, filtering, and deduplication across multiple archives.
Backward compatibility:
1. This is a breaking change: getResults() now returns Iterator[SearchResult] instead of Iterator[str]. The change is intentional, since it is what makes multi-ZIM search possible.
2. Users can still access the path via result.path, which keeps migration simple.
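A rough sketch of what migration could look like for callers that only need paths. The SearchResult class below is a hypothetical stand-in carrying just the path attribute mentioned above, and paths_of is an illustrative helper, not part of this PR.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SearchResult:
    # Stand-in for the new result type; `path` is the only attribute
    # named in this PR description.
    path: str


def paths_of(results: Iterable[SearchResult]) -> List[str]:
    """Recover the old list-of-paths view from the new result objects."""
    return [result.path for result in results]


# Simulated output of the new API, in place of a real search.
new_style_results = iter([SearchResult("A/Paris"), SearchResult("A/Rome")])
legacy_paths = paths_of(new_style_results)
```

Existing code that iterated over strings would only need to read result.path at the point of use, or wrap the iterator once with a helper like this.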