Add Typesense #44
Merged
33 commits
08e7b64 add typesense (ruslandoga)
b59abed fix ci (ruslandoga)
79edf7d Update lib/hexdocs/queue.ex (ruslandoga)
67ae461 add compose (ruslandoga)
4eecb02 hide typesense behind impl (ruslandoga)
1c727da simplify find_search_items/3 (ruslandoga)
2ed6007 cleanup test (ruslandoga)
f0a39a0 include typesense in ci tests (ruslandoga)
649397a add post in hexdocs.http (ruslandoga)
6333e8c read typesense indexing response eagerly (ruslandoga)
e95f897 make collection configurable (ruslandoga)
2862552 remove content-type: text/plain from import (ruslandoga)
25938fb update typesense to 27.1 (ruslandoga)
7ae762a do typesense indexing last (ruslandoga)
f436ed7 extract indexing to separate function, wrap in log lines (ruslandoga)
7ce5be5 refactor (ruslandoga)
f70d9b4 handle more errors (ruslandoga)
64481ed add proglang to collection (ruslandoga)
a29793d refactor tests a bit (ruslandoga)
460bc7a test find_search_items/3 (ruslandoga)
a70bc4b test package delete (ruslandoga)
321b806 refactor log tests (ruslandoga)
58e923d add bad document test (ruslandoga)
d7cb3b9 remove search_data_js format error logs when there is no search_data_js (ruslandoga)
dd222ea rm releases.exs (ruslandoga)
bfc0539 read Typesense collection from env (ruslandoga)
1a97386 switch to :json (ruslandoga)
1087e6a update tests (ruslandoga)
8d58e4f add tests for invalid search items (ruslandoga)
9a839e4 Merge branch 'main' into add-typesense (wojtekmach)
a4ba5c3 Prepare hexdocs staging for typesense (wojtekmach)
26169bb wip (wojtekmach)
ff16251 Revert "wip" (wojtekmach)
New file (@@ -0,0 +1,6 @@):

```yaml
services:
  typesense:
    image: typesense/typesense:27.1
    command: --data-dir /tmp --api-key=hexdocs
    ports:
      - 8108:8108
```
New file (@@ -0,0 +1,9 @@):

```elixir
defmodule Hexdocs.Search.Local do
  @behaviour Hexdocs.Search

  @impl true
  def index(_package, _version, _proglang, _items), do: :ok

  @impl true
  def delete(_package, _version), do: :ok
end
```
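The no-op module above is one implementation of the `Hexdocs.Search` behaviour, and the implementation is chosen at runtime from the `:search_impl` application env key. The sketch below illustrates that dispatch pattern in isolation. The `Demo.Search` and `Demo.Search.Noop` modules and the `:demo_app` application name are hypothetical stand-ins, not part of this PR.

```elixir
defmodule Demo.Search do
  # Hypothetical stand-in for Hexdocs.Search: a behaviour plus a thin dispatcher.
  @callback index(String.t(), String.t(), String.t(), [map]) :: :ok
  @callback delete(String.t(), String.t()) :: :ok

  # The implementation module is looked up on every call, so tests and
  # environments can swap it without recompiling.
  defp impl, do: Application.fetch_env!(:demo_app, :search_impl)

  def index(package, version, proglang, items),
    do: impl().index(package, version, proglang, items)

  def delete(package, version), do: impl().delete(package, version)
end

defmodule Demo.Search.Noop do
  # Plays the role of Hexdocs.Search.Local: accepts everything, does nothing.
  @behaviour Demo.Search

  @impl true
  def index(_package, _version, _proglang, _items), do: :ok

  @impl true
  def delete(_package, _version), do: :ok
end

# Select the implementation, then call through the dispatcher.
Application.put_env(:demo_app, :search_impl, Demo.Search.Noop)
:ok = Demo.Search.index("ecto", "3.12.0", "elixir", [])
:ok = Demo.Search.delete("ecto", "3.12.0")
```

Because callers only ever touch the dispatcher, a development or test environment can presumably point `:search_impl` at the no-op module while production points it at the Typesense-backed one.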
New file (@@ -0,0 +1,93 @@):

```elixir
defmodule Hexdocs.Search do
  require Logger

  @type package :: String.t()
  @type version :: Version.t()
  @type proglang :: String.t()
  @type search_items :: [map]

  @callback index(package, version, proglang, search_items) :: :ok
  @callback delete(package, version) :: :ok

  defp impl, do: Application.fetch_env!(:hexdocs, :search_impl)

  @spec index(package, version, proglang, search_items) :: :ok
  def index(package, version, proglang, search_items) do
    impl().index(package, version, proglang, search_items)
  end

  @spec delete(package, version) :: :ok
  def delete(package, version) do
    impl().delete(package, version)
  end

  @spec find_search_items(package, version, [{Path.t(), content :: iodata}]) ::
          {proglang, search_items} | nil
  def find_search_items(package, version, files) do
    search_data_js =
      Enum.find_value(files, fn {path, content} ->
        case Path.basename(path) do
          "search_data-" <> _digest -> content
          _other -> nil
        end
      end)

    unless search_data_js do
      Logger.info("Failed to find search data for #{package} #{version}")
    end

    search_data_json =
      case search_data_js do
        "searchData=" <> json ->
          json

        _ when is_binary(search_data_js) ->
          Logger.error("Unexpected search_data format for #{package} #{version}")
          nil

        nil ->
          nil
      end

    search_data =
      if search_data_json do
        try do
          :json.decode(search_data_json)
        catch
          _kind, reason ->
            Logger.error(
              "Failed to decode search data json for #{package} #{version}: " <>
                inspect(reason)
            )

            nil
        end
      end

    case search_data do
      %{"items" => [_ | _] = search_items} ->
        proglang = Map.get(search_data, "proglang") || proglang(search_items)
        {proglang, search_items}

      nil ->
        nil

      _ ->
        Logger.error(
          "Failed to extract search items and proglang from search data for #{package} #{version}"
        )

        nil
    end
  end

  defp proglang(search_items) do
    if Enum.any?(search_items, &elixir_module?/1), do: "elixir", else: "erlang"
  end

  defp elixir_module?(%{"type" => "module", "title" => <<first_letter, _::binary>>})
       when first_letter in ?A..?Z,
       do: true

  defp elixir_module?(_), do: false
end
```
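To make two steps of `find_search_items/3` easier to follow, here is a small standalone sketch of the `searchData=` prefix match and the `proglang` fallback heuristic. The module name `SearchDataSketch` and the sample items are hypothetical. JSON decoding is left out so the snippet does not depend on the OTP `:json` module.

```elixir
defmodule SearchDataSketch do
  # Strips the JavaScript assignment prefix, leaving the raw JSON text.
  def extract_json("searchData=" <> json), do: {:ok, json}
  def extract_json(_other), do: :error

  # Same heuristic as the private proglang/1 in the diff: a package counts as
  # Elixir if any "module" item has a title starting with an uppercase ASCII
  # letter; otherwise it is treated as Erlang.
  def proglang(search_items) do
    if Enum.any?(search_items, &elixir_module?/1), do: "elixir", else: "erlang"
  end

  defp elixir_module?(%{"type" => "module", "title" => <<first, _::binary>>})
       when first in ?A..?Z,
       do: true

  defp elixir_module?(_), do: false
end

elixir_items = [%{"type" => "module", "title" => "Ecto.Repo"}]
erlang_items = [%{"type" => "module", "title" => "cowboy_req"}]

IO.inspect(SearchDataSketch.extract_json(~s(searchData={"items":[]})))
IO.inspect(SearchDataSketch.proglang(elixir_items))
IO.inspect(SearchDataSketch.proglang(erlang_items))
```

Note that the heuristic only runs when the search data has no `"proglang"` key, and that an item list with no `"module"` entries at all falls through to `"erlang"`.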
New file (@@ -0,0 +1,105 @@):

```elixir
defmodule Hexdocs.Search.Typesense do
  @moduledoc false
  require Logger
  alias Hexdocs.HTTP

  @behaviour Hexdocs.Search

  @impl true
  def index(package, version, proglang, search_items) do
    full_package = full_package(package, version)

    ndjson =
      Enum.map(search_items, fn item ->
        json =
          Map.take(item, ["type", "ref", "title", "doc"])
          |> Map.put("package", full_package)
          |> Map.put("proglang", proglang)
          |> :json.encode()

        [json, ?\n]
      end)

    url = url("collections/#{collection()}/documents/import?action=create")
    headers = [{"x-typesense-api-key", api_key()}]

    case HTTP.post(url, headers, ndjson, [:with_body]) do
      {:ok, 200, _resp_headers, ndjson} ->
        ndjson
        |> String.split("\n")
        |> Enum.each(fn json ->
          case :json.decode(json) do
            %{"success" => true} ->
              :ok

            %{"success" => false, "error" => error, "document" => document} ->
              Logger.error(
                "Failed to index search item for #{package} #{version} for document #{inspect(document)}: #{inspect(error)}"
              )
          end
        end)

      {:ok, status, _resp_headers, _body} ->
        Logger.error("Failed to index search items for #{package} #{version}: status=#{status}")

      {:error, reason} ->
        Logger.error("Failed to index search items #{package} #{version}: #{inspect(reason)}")
    end
  end

  @impl true
  def delete(package, version) do
    full_package = full_package(package, version)

    query = URI.encode_query([{"filter_by", "package:#{full_package}"}])
    url = url("collections/#{collection()}/documents?" <> query)
    headers = [{"x-typesense-api-key", api_key()}]

    case HTTP.delete(url, headers) do
      {:ok, 200, _resp_headers, _body} ->
        :ok

      {:ok, status, _resp_headers, _body} ->
        Logger.error("Failed to delete search items for #{package} #{version}: status=#{status}")

      {:error, reason} ->
        Logger.error(
          "Failed to delete search items for #{package} #{version}: #{inspect(reason)}"
        )
    end
  end

  @spec collection :: String.t()
  def collection do
    Application.fetch_env!(:hexdocs, :typesense_collection)
  end

  @spec collection_schema :: map
  def collection_schema(collection \\ collection()) do
    %{
      "fields" => [
        %{"facet" => true, "name" => "proglang", "type" => "string"},
        %{"facet" => true, "name" => "type", "type" => "string"},
        %{"name" => "title", "type" => "string"},
        %{"name" => "doc", "type" => "string"},
        %{"facet" => true, "name" => "package", "type" => "string"}
      ],
      "name" => collection,
      "token_separators" => [".", "_", "-", " ", ":", "@", "/"]
    }
  end

  @spec api_key :: String.t()
  def api_key do
    Application.fetch_env!(:hexdocs, :typesense_api_key)
  end

  defp full_package(package, version) do
    "#{package}-#{version}"
  end

  defp url(path) do
    base_url = Application.fetch_env!(:hexdocs, :typesense_url)
    Path.join(base_url, path)
  end
end
```
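Two string-handling details in the module above can be tried on their own: how `delete/2` builds its `filter_by` query, and how an NDJSON response body splits into lines. The sketch below is illustrative only. `TypesenseSketch` and both function names are hypothetical, and the `trim: true` option is a defensive variation that the PR does not use. Whether Typesense ends its import response with a trailing newline is not established here.

```elixir
defmodule TypesenseSketch do
  # Builds the query string that selects every document of one package
  # version, using the same "package:<name>-<version>" filter as the diff.
  def delete_query(package, version) do
    URI.encode_query([{"filter_by", "package:#{package}-#{version}"}])
  end

  # Splits an NDJSON body into one string per line. `trim: true` drops empty
  # strings, so a trailing newline would not reach the JSON decoder.
  def response_lines(body) when is_binary(body) do
    String.split(body, "\n", trim: true)
  end
end

IO.puts(TypesenseSketch.delete_query("ecto", "3.12.0"))
# prints: filter_by=package%3Aecto-3.12.0

IO.inspect(TypesenseSketch.response_lines("line-1\nline-2\n"))
# prints: ["line-1", "line-2"]
```

Without `trim: true`, the second call would yield a final `""` element, and decoding an empty string as JSON would raise inside the `Enum.each/2` callback.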
We cannot trust this data since it's user-provided. Can they do anything dangerous by providing something we don't expect? Maybe we should do some rudimentary validation?
They can provide long strings like https://github.com/cloudpods-dev/docker-engine-api-elixir/blob/813cc557da483f623a8f484db04efc7e58db0376/lib/docker_engine_api/api/container.ex#L67, but Typesense seems to handle it fine. We can check for content size, maybe. I think if Typesense doesn't like the payload, it would simply reject it.
I added a test that checks that invalid fields in search items (like `type` being a map instead of a string, or `doc` being a list) are rejected: 8d58e4f
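For comparison, the "rudimentary validation" floated earlier in this thread could look roughly like the sketch below if it were done on the client side before sending anything to Typesense. This is not what the PR does (it relies on Typesense rejecting bad documents). The `SearchItemCheck` module, the choice of required fields, and the size limit are all assumptions made for illustration.

```elixir
defmodule SearchItemCheck do
  # Assumption: these are the string fields declared in the collection schema
  # that come straight from user-provided search data.
  @required_fields ["type", "title", "doc"]

  # Hypothetical upper bound on a single field, in bytes.
  @max_bytes 1_000_000

  # An item passes only if it is a map and every required field is a binary
  # no larger than the limit.
  def valid?(item) when is_map(item) do
    Enum.all?(@required_fields, fn field ->
      case Map.fetch(item, field) do
        {:ok, value} when is_binary(value) -> byte_size(value) <= @max_bytes
        _missing_or_wrong_type -> false
      end
    end)
  end

  def valid?(_not_a_map), do: false
end

good = %{"type" => "module", "title" => "Ecto.Repo", "doc" => "Defines a repository."}
bad_type = %{good | "type" => %{}}
bad_doc = %{good | "doc" => ["a", "list"]}

IO.inspect(Enum.map([good, bad_type, bad_doc], &SearchItemCheck.valid?/1))
# prints: [true, false, false]
```

A check like this would catch the two cases named in the comment above (a map-valued `type`, a list-valued `doc`) before the request is built, at the cost of duplicating rules the collection schema already enforces.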