Commit 658127e

Documentation and two redis config files.
1 parent 4ab08d4 commit 658127e

4 files changed: +298 -1 lines changed

config-available/redis_sentinel.pl

+133
@@ -0,0 +1,133 @@
/*  Part of SWISH

    Author:        Jan Wielemaker
    WWW:           http://www.swi-prolog.org
    Copyright (C): 2020-2024, SWI-Prolog Solutions b.v.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    1. Redistributions of source code must retain the above copyright
       notice, this list of conditions and the following disclaimer.

    2. Redistributions in binary form must reproduce the above copyright
       notice, this list of conditions and the following disclaimer in
       the documentation and/or other materials provided with the
       distribution.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
    COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE.
*/

:- module(config_redis, []).
% Edit with Redis consumer id.  Make sure each member of the cluster
% has a unique id.
swish:swish_node('<unassigned>').

/** <module> Configure Redis

SWISH may be configured to use the Redis key-value store for the various
databases.  This allows multiple SWISH instances to act as a cluster.

Typically the configuration needs to be edited in two places:

  - redis_server/3 must be called to address the Redis server
  - redis_consumer may be set to identify this instance.  The
    default is derived from the host name and port to which
    this SWISH instance listens.  Clusters are advised to
    assign a stable name to each cluster member.
*/

:- multifile swish_config:config/2.

% Do not activate if we run SWISH in _ide_ mode
:- if(\+swish_config:config(ide,true)).

:- use_module(swish(lib/config), []).
:- use_module(library(redis)).
:- use_module(library(settings)).
:- use_module(swish('config-available/user_profile')).
:- use_module(library(profile/backend/profile_redis), []).
:- use_module(library(http/http_session)).
:- use_module(library(http/http_redis_plugin)).

:- initialization
    redis_servers.

% Edit.  List the sentinels.  Normally we list all of them.  If there are
% more though, they will be picked up as long as we can get hold of at
% least one from the list below.  The second argument is the Redis
% consumer.  We use this to find a working replicator that has the same
% Redis consumer and thus (hopefully) has the lowest latency.  When
% found, this is configured as "read only" Redis DB.
sentinel('1.2.3.4':26379, node1).
sentinel('1.2.3.5':26379, node2).
sentinel('1.2.3.6':26379, node3).

% edit: passwords.  Note that the sentinel and redis passwords
% can be different.
redis_connect_options(
    [ user(swish),
      password("********"),
      version(3),
      tls(true),
      cacert('config-enabled/etc/redis/ca.crt'),
      key('config-enabled/etc/redis/client.key'),
      cert('config-enabled/etc/redis/client.cert'),
      sentinels(Sentinels),
      sentinel_user(query),
      sentinel_password("********")
    ]) :-
    findall(Sentinel, sentinel(Sentinel, _), Sentinels).

redis_servers :-
    redis_master_server(swish),
    redis_ro_server(swish, swish_ro).

redis_master_server(ServerId) :-
    redis_connect_options(Options),
    redis_server(ServerId, sentinel(swipl), Options).

:- dynamic ro_server/1 as volatile.

redis_ro_server(Master, SlaveServerId) :-
    redis_connect_options(Options),
    swish:swish_node(Consumer),
    sentinel(IP:_, Consumer),
    sentinel_slave(Master, swipl, Slave, Options),
    IP == Slave.ip,
    Port = Slave.port,
    redis_server(SlaveServerId, IP:Port, Options),
    assertz(ro_server(SlaveServerId)).
redis_ro_server(_, _).

swish_config:config(redis_ro, Server) :-
    ro_server(Server).
swish_config:config(redis, swish).
swish_config:config(redis_prefix, swish).
swish_config:config(redis_consumer, Consumer) :-
    swish:swish_node(Consumer).

:- set_setting(user_profile:redis_server, swish).
:- set_setting(user_profile:redis_prefix, 'swish:profiles').
:- set_setting(user_profile:backend, impl_profile_redis).
:- set_setting(user_profile:session_persistency, true).

:- http_set_session_options([ redis_db(swish),
                              redis_prefix('swish:http:session')
                            ]).

:- endif.  % \+swish_config:config(ide,true)
File renamed without changes.

doc/PublicInstall.md

+164
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,164 @@
# Running SWISH for large user communities

By default, SWISH uses the local filesystem to store its data.  All
data is stored in a directory `data`.  Data such as chats, user
profiles, etc. are stored using SWI-Prolog's `library(persistency)` as
Prolog terms.  Files are saved in _gitty_ format in the subdirectory
`data/storage`.  The _gitty_ format is based on GIT, saving data
based on the SHA1 hash of the content and linking versions of files
together as _commits_.  This setup supports only a single SWISH server.

## Using Redis

Alternatively, a [redis](https://redis.io/) database can be used that
stores the chats, user profile data and the _HEAD_ commit for each
file.  The files themselves are still stored in the `data/storage`
directory using the same format.  This setup uses Redis _streams_ to
broadcast events of interest to all SWISH nodes in the cluster.  These
events are used to make chat work over the cluster members and to
replicate new objects for the `storage` directory:

  - If an object (hashed document) is added to the store, the cluster
    is informed.  Each cluster member stores the object.
  - If a cluster member is down when the object is added, it will
    find the most recent HEAD commit hash from the Redis DB.  If
    the referenced object is not in its store, it broadcasts a
    _discovery request_ for the hash.  The cluster member that
    has a copy of this document will repost it.
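
The recovery path in the second bullet can be pictured with a small in-memory model. This is only a sketch: it uses no Redis at all, and the predicates `stored/2`, `head_hash/2`, `sync_head/2` and `discover/2`, the node names and the hash are hypothetical stand-ins, not SWISH's actual implementation.

```prolog
:- dynamic stored/2.          % stored(Node, Hash): Node holds the object

% Toy cluster state: node3 was down when object abc123 was added.
stored(node1, abc123).
stored(node2, abc123).

% head_hash(File, Hash): stands in for the HEAD lookup in the Redis DB.
head_hash('demo.pl', abc123).

% sync_head(+Node, +File): make sure Node holds the object that the
% HEAD of File refers to.
sync_head(Node, File) :-
    head_hash(File, Hash),
    (   stored(Node, Hash)
    ->  true                      % object already in the local store
    ;   discover(Node, Hash)      % otherwise ask the cluster
    ).

% discover(+Node, +Hash): models the discovery request.  The first
% peer that has a copy "reposts" it, after which Node stores it too.
% Fails if no peer has the object.
discover(Node, Hash) :-
    stored(Peer, Hash),
    Peer \== Node,
    !,
    assertz(stored(Node, Hash)).
```

After `?- sync_head(node3, 'demo.pl').` the toy store contains `stored(node3, abc123)`. In the real system the lookup and the repost travel over Redis rather than through shared Prolog facts.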

To deploy multiple SWISH instances using this setup, one must first
set up a Redis DB.  Currently SWI-Prolog's redis library supports three
of the four redis configurations:

  - Single node.  This is easy, but vulnerable to data loss.  The
    single node also easily becomes a bottleneck.
  - Single node with static replicators.  This avoids data loss.
    It also allows SWISH instances to configure Redis write operations
    to use the master and read operations to use a nearby replicator.
    No files can be saved if the master goes down.
  - High availability cluster.  In this case a set of redis nodes is
    monitored by a set of _sentinels_, minimally 3.  The network
    operates as above, but if the sentinels discover that the master
    is down, they elect a new master and reconfigure the network.
    The SWISH client asks the sentinels for the current configuration.
    If the client gets Redis errors it will reconsult the sentinels and
    reconnect to the possibly changed configuration.
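
On the SWISH side, the three setups differ mainly in how `redis_server/3` from `library(redis)` is called. The fragment below is a sketch under assumptions: all host names, ports and the server ids `demo_single`, `demo_rw`, `demo_ro` and `demo_ha` are placeholders, and authentication and TLS options are left out. The sentinel form mirrors `config-available/redis_sentinel.pl`, where `swipl` is the master name and SWISH selects the servers through `swish_config:config(redis, Id)` and `swish_config:config(redis_ro, Id)`.

```prolog
:- use_module(library(redis)).

% All host names, ports and server ids below are placeholders.
% redis_server/3 only registers a server; a real SWISH config would
% declare just the variant that matches its Redis deployment.

% 1. Single node.
:- redis_server(demo_single, 'redis.example.org':6379, []).

% 2. Single master with static replicators: one id for writes and
%    one for reads from a nearby replicator.
:- redis_server(demo_rw, 'redis-master.example.org':6379, []).
:- redis_server(demo_ro, 'redis-replica.example.org':6379, []).

% 3. Sentinel-monitored setup.  The address names the master
%    (`swipl`) and the sentinels are passed as an option.
:- redis_server(demo_ha, sentinel(swipl),
                [ sentinels([ '1.2.3.4':26379,
                              '1.2.3.5':26379,
                              '1.2.3.6':26379
                            ])
                ]).
```

Registering a server does not open a connection, so loading this fragment should not require a reachable Redis instance.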

### Getting the Redis network up

Getting the cluster up consists of these steps:

  1. Decide on the Redis configuration to use and configure the
     Redis DB (cluster).
  2. Select a matching SWISH Redis configuration, copy it from
     `config-available` to `config-enabled` and edit to suit
     your setup.  Available configurations:

     - `config-available/redis_simple.pl` <br>
       Simple single node clear-text connection.  Only use on trusted
       networks!
     - `config-available/redis_sentinel.pl` <br>
       Sentinel cluster using TLS for establishing secure connections.
  3. Bring up SWISH
  4. If you have old data, you may use `lib/redis_transfer.pl` to transfer
     the existing data from the Prolog .db files to populate the Redis DB.
  5. Add new nodes.  Make sure to edit the _Redis consumer_ in each
     enabled configuration.

After the above, we have multiple SWISH instances that share the
files, user profiles, chat and HTTP session management.  Each node
needs to be contacted at its own address though.

## Making the nodes operate as a single service

To make the nodes accessible as a single service we need some form of
session aware load balancing.  There are many ways to do this.  The
public site uses an [nginx](https://www.nginx.com/) instance as
_reverse proxy_.  The nginx _upstream_ mechanism with policy `ip_hash`
is used for load balancing.  The skeleton setup is below.  Details,
such as error pages, HTTPS configuration, etc. are left out.

```
upstream swish {
    ip_hash;
    server <url1>;
    server <url2>;
    ...
}

server {
    location / {
        proxy_pass https://swish;
        proxy_http_version 1.1;
        proxy_buffering off;
        client_body_buffer_size 100k;
        proxy_cache off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 86400;
    }

    location /chat {
        proxy_pass https://swish;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 86400;
    }
}
```

## Many files need better search: enable Elastic search

File search in the file tab and top-right search box by default uses
simple Prolog search.  This works great with a few hundred files,
but above that it gets too expensive to walk over each file HEAD commit
and possibly its content.

For this purpose we provide an [Elastic](https://www.elastic.co/)
plugin.  The setup works as follows:

  1. Launch Elastic
  2. Copy `config-available/elastic.pl` to `config-enabled` and edit
     the location of the Elastic instance and the connection details
     (password, certificates).
  3. Use `lib/plugin/es_swish.pl` (loaded by `config-enabled/elastic.pl`) to
     1. Create the Elastic index using `?- es_create_index.`
     2. Run `?- es_add(0, 1 000 000).` to populate the index.  The
        arguments are offset and limit, so you can do the job in
        batches to see how it goes.
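
Populating the index in batches, as suggested in step 3, could be scripted along the following lines. This is a sketch only: `batch_ranges/3` and `run_in_batches/3` are hypothetical helpers that are not part of SWISH, and they assume that the second argument of `es_add/2` is the maximum number of documents to process starting at the offset.

```prolog
:- use_module(library(lists)).

%!  batch_ranges(+Total, +BatchSize, -Ranges) is det.
%
%   Ranges is a list of Offset-Limit pairs that together cover
%   Total documents in batches of at most BatchSize.  BatchSize
%   must be a positive integer.
batch_ranges(Total, BatchSize, Ranges) :-
    NBatches is (Total + BatchSize - 1) // BatchSize,
    findall(Offset-Limit,
            ( between(1, NBatches, I),
              Offset is (I-1) * BatchSize,
              Limit is min(BatchSize, Total - Offset)
            ),
            Ranges).

%!  run_in_batches(+Total, +BatchSize, :Goal) is semidet.
%
%   Call Goal(Offset, Limit) for each batch, printing progress.
%   Stops as soon as one batch fails or raises an exception.
run_in_batches(Total, BatchSize, Goal) :-
    batch_ranges(Total, BatchSize, Ranges),
    forall(member(Offset-Limit, Ranges),
           ( format("Batch at offset ~D, limit ~D~n", [Offset, Limit]),
             call(Goal, Offset, Limit)
           )).
```

With the Elastic plugin loaded, `?- run_in_batches(1 000 000, 10 000, es_swish:es_add).` would then index the same range as the single call above in 100 steps, which makes it easier to stop and resume if a batch fails.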

After enabling, new documents are automatically added to the index
when they are saved.  If there has been a disruption, you can update
the index with all documents added or modified recently using
`es_add_since/1`, which takes the number of seconds to look back.
So, to add documents for the past week, use:

    ?- es_add_since(7*24*3600).

## Deployment hints

The public instance runs on docker images created using the Dockerfile
from the GIT repo below.

  - https://github.com/SWI-Prolog/docker-swish-public.git

The docker version deploys the `libssh` pack that allows logging into
the running SWISH server using SSH.  This is used notably for running
the maintenance commands above.

We maintain the `config-enabled` directory of each node as a git
repository that is a clone of a version maintained on the machine from
which we control all instances.  The version at the node is checked
out on a branch with a commit that reflects the local differences
(consumer in `redis.pl` and the public network address in `network.pl`
if each server can also be accessed explicitly).  To update the
configuration we

  1. Edit it on the maintenance machine
  2. Run `git push <remote> master` to update the remote master
  3. On the node run `git rebase master` to update the node config
  4. Restart SWISH.

lib/plugin/es_swish.pl

+1-1
@@ -35,7 +35,7 @@
 :- module(es_swish,
           [ es_create_index/0,
             es_add_file/1,              % +File
-            es_add/2,                   % +Offset, +Limit +Count
+            es_add/2,                   % +Offset, +Limit
             es_add_since/1,             % +Time
             es_query/2
           ]).
