Commit 7645461

Cleanup files.
1 parent cf2fcc9 commit 7645461

7 files changed: +215 -1 lines changed

.gitignore

+5
@@ -1,3 +1,4 @@
1+
Keep
12
node_modules
23
node_modules
34
swish-bower-components.zip
@@ -14,3 +15,7 @@ https
1415
web/icons/noble
1516
TAGS
1617
yarn.lock
18+
.yarn-senitel
19+
web/js/swish-min-new.js
20+
web/js/swish-min-new.js.map
21+
swish-node-modules.zip

Makefile

+1-1
@@ -61,7 +61,7 @@ $(YARN_ARCHIVE)::
6161
upload::
6262
rm -f $(YARN_ARCHIVE)
6363
zip -r $(YARN_ARCHIVE) web/node_modules
64-
rsync $(YARN_ARCHIVE) ops:/home/swipl/web/download/swish/$(YARN_ARCHIVE)
64+
rsync $(YARN_ARCHIVE) plweb@oehoe:srv/plweb/data/download/swish/$(YARN_ARCHIVE)
6565

6666

6767
################

doc/DS-Blog.md

+61
@@ -0,0 +1,61 @@
1+
# Introducing SWISH DataLab
2+
3+
The SWISH DataLab addresses one of the main bottlenecks of data science:
4+
bringing data from different sources together, cleaning and selecting
5+
this data. Most pipelines use a general purpose programming language
6+
such as Python to clean and ingest the data into a linked data store or
7+
RDBMS after which the relevant data is selected and applicable machine
8+
learning is applied. In contrast, SWISH data management is based on
9+
Prolog, a _relational_ and _logic_ based language. External data sources
10+
such as RDBMS systems, Linked Data, CSV files, XML files, JSON, etc. are
11+
made available using a mixture of _adaptors_ that make the data
12+
available in Prolog's relational model without transferring the data and
13+
_ingestion_, which loads the data into Prolog.
14+
15+
Subsequently, declarative rules are stated to define a clean and
16+
coherent view on the data that is targeted towards analysing this data.
17+
Due to the logic basis of Prolog this view is modular, concise and
18+
declarative, making it easy to maintain. SWI-Prolog's _tabling_
19+
extension provides the same termination properties as DataLog as well as
20+
the same order independence of rules within the subset Prolog shares with
21+
DataLog. Tabling also provides _caching_ of results. At the same time,
22+
users have access to the more general Prolog language to code
23+
transformations that are not supported by DataLog.
24+
25+
SWISH unites [SWI-Prolog](https://www.swi-prolog.org) and
26+
[R](https://www.r-project.org/) behind a web-based IDE that
27+
resembles [Jupyter](https://jupyter.org/) notebooks. This platform can
28+
be deployed on your laptop as well as on a server. The platform allows
29+
multiple data scientists to work on the same data simultaneously while
30+
rule sets can be reused and shared between users. This notably allows
31+
technical people to provide more complicated data transformation steps
32+
to domain experts. The platform can be configured to allow both
33+
authenticated users and anonymous users with limited access rights.
34+
Notebooks and programs are stored in a Git-like repository and fully
35+
versioned. It is possible to create a snapshot of a query and all
36+
relevant programs for reliable reproduction of results. Data views
37+
defined in SWISH may be downloaded as CSV and can be accessed through a
38+
web based API.
39+
40+
Using Prolog for data integration, cleaning and modelling started life
41+
as a valorisation project within [COMMIT/](https://www.commit-nl.nl/). A
42+
web enabled version of SWI-Prolog was pioneered by [Torbjörn
43+
Lager](https://www.gu.se/english/about_the_university/staff/?languageId=100001&userId=xlagto).
44+
The combination of Prolog and R has been pioneered by Nicos Angelopoulos
45+
at the NKI (Netherlands Cancer Institute) in the life sciences domain. SWISH
46+
is in use at CWI to analyse user behaviour based on HTTP log data from
47+
the Dutch national library (Koninklijke Bibliotheek). Samer Abdallah
48+
(University College London) uses SWISH for analysing music. The core of
49+
SWISH is under active development and heavily tested as a shared Prolog
50+
teaching environment.
51+
52+
Useful links:
53+
54+
- [Download SWISH from GitHub](https://github.com/SWI-Prolog/swish)
55+
- [SWISH and R for Docker](https://hub.docker.com/u/swipl)
56+
- [SWISH for Prolog teaching](https://swish.swi-prolog.org)
57+
- [SWISH DataLab: A Web Interface for Data Exploration and Analysis,
58+
BNAIC 2016](https://www.springerprofessional.de/en/swish-datalab-a-web-interface-for-data-exploration-and-analysis/15059986)
59+
60+
61+

doc/REPL.md

+104
@@ -0,0 +1,104 @@
1+
# Clustered SWISH
2+
3+
## Syncing the gitty store
4+
5+
The gitty store is a directed graph of commits. Each commit is linked to
6+
a _data object_. Both commits and data objects are hashed by content and
7+
read-only. This implies they are easily replicated over the network. The
8+
replication takes two forms:
9+
10+
- A node may _announce_ an object by sending the object's content as
11+
a series of chunks.
12+
- A node may _request_ an entire object or a missing object
13+
chunk. Receiving nodes that have the object will broadcast the
14+
missing object.
15+
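The announce/request scheme above can be sketched as follows. This is a minimal illustration, not the SWISH implementation: the names `object_hash`, `announce` and `Receiver`, the use of SHA-1, the message layout and the tiny chunk size are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; kept tiny so the example produces several chunks


def object_hash(content: bytes) -> str:
    """Name an immutable object by its content (SHA-1 is an assumption)."""
    return hashlib.sha1(content).hexdigest()


def announce(content: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (hash, index, total, chunk) messages that announce one object."""
    h = object_hash(content)
    chunks = [content[i:i + chunk_size]
              for i in range(0, len(content), chunk_size)] or [b""]
    for index, chunk in enumerate(chunks):
        yield (h, index, len(chunks), chunk)


class Receiver:
    """Collects chunks and stores an object once it is complete and verified."""

    def __init__(self):
        self.partial = {}  # hash -> {chunk index: chunk bytes}
        self.store = {}    # hash -> complete content

    def on_chunk(self, h, index, total, chunk):
        if h in self.store:
            return  # objects are read-only, so a known hash needs no update
        parts = self.partial.setdefault(h, {})
        parts[index] = chunk
        if len(parts) == total:
            content = b"".join(parts[i] for i in range(total))
            if object_hash(content) == h:  # only accept content matching its hash
                self.store[h] = content
            del self.partial[h]

    def missing(self, h, total):
        """Chunk indexes this node would still have to request."""
        if h in self.store:
            return []
        have = self.partial.get(h, {})
        return [i for i in range(total) if i not in have]
```

Because objects are hashed by content and never change, a receiver can verify a reassembled object locally and can safely ignore repeated announcements of a hash it already stores.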
16+
The real problem is updating the _head pointer_. This is a central
17+
database that defines the latest version of a file with a certain name.
18+
This notion must be synchronised. This is implemented as follows:
19+
20+
- A node asks the cluster for their current head.
21+
- If all nodes agree on the current head we are done, but some
22+
nodes may not have the indicated file.
23+
- If some nodes have no head, _announce_ the head
24+
- Else
25+
- Ask all nodes to produce a backward path of commits that
26+
includes all reported heads from the other nodes.
27+
- Work out the last common hash, possibly by majority vote.
28+
- Work out the changes since this common hash.
29+
- If nodes agree or have no info, fine
30+
- If nodes disagree, go with the majority.
31+
- Propose the new head to all nodes that agreed on the majority
32+
path. These nodes will _accept_ if nothing changed since their
33+
report, blocking further changes for a specified time.
34+
- If all accept, send a new head notion. Else restart from the
35+
beginning.
36+
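The decision part of the steps above might look like the sketch below. It is written under stated assumptions: `reports` maps a node name to its head hash (or `None` if the node does not have the file), `paths` maps a node name to its backward commit path with the newest commit first, and all function names are hypothetical. The accept phase, the blocking timeout and tie-breaking between equally popular heads are not modelled.

```python
from collections import Counter


def last_common_hash(paths):
    """Newest hash on the first path that every other path also contains."""
    ordered = list(paths.values())
    if not ordered:
        return None
    shared = set(ordered[0]).intersection(*ordered[1:])
    for h in ordered[0]:  # paths are newest first
        if h in shared:
            return h
    return None


def majority_head(reports):
    """Return the most reported head and the nodes that reported it."""
    votes = Counter(h for h in reports.values() if h is not None)
    if not votes:
        return None, []
    head, _count = votes.most_common(1)[0]
    return head, sorted(n for n, h in reports.items() if h == head)


def negotiate(reports, paths):
    """Decide the next step: done, announce the head, or propose a new head."""
    heads = {h for h in reports.values() if h is not None}
    if len(heads) <= 1:
        head = next(iter(heads), None)
        lacking = sorted(n for n, h in reports.items() if h is None)
        action = "announce" if head is not None and lacking else "done"
        return {"action": action, "head": head, "nodes": lacking}
    head, supporters = majority_head(reports)
    return {"action": "propose", "head": head, "nodes": supporters,
            "base": last_common_hash(paths)}
```

For example, if nodes `a` and `b` report head `c3` and node `c` reports `d3`, with all three paths sharing `c2`, the sketch proposes `c3` to `a` and `b` and reports `c2` as the last common hash.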
37+
The above deals with a live cluster. Nodes that have missed a
38+
conversation or joined the network later may miss a file or the latest
39+
version of a file.
40+
41+
## Remote syncing
42+
43+
Remote syncing is necessary for both new cluster members and for cluster
44+
members that have been offline for some time.
45+
46+
- Find the node with most changes using a request.
47+
- Ask this node to start the process.
48+
- Each cluster member checks it has the change. If not, it starts
49+
a negotiation using gitty_remote_head/2.
50+
51+
## Profile management and login
52+
53+
FIXME
54+
55+
Remote sync of library(persistency)?
56+
57+
- Realise a distributed ledger of changes.
58+
- Apply these.
59+
60+
61+
- Add serial to each event
62+
- Broadcast them
63+
- Adding an event
64+
- Propos
65+
66+
67+
## Email notifications
68+
69+
FIXME
70+
71+
## Chat subsystem
72+
73+
### Maintain a global overview of visitor count
74+
75+
Visitor change messages carry a `local_visitors` and `visitors` field and
76+
are relayed. Nodes receiving such a message use `local_visitors` to
77+
update their count of visitors on that node. Nodes composing such a
78+
message count the local visitors and add the known totals from the other
79+
nodes.
80+
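A minimal sketch of this bookkeeping, assuming each message also carries the sending node's identifier (the `node` field and the `VisitorCounter` class are hypothetical; only `local_visitors` and `visitors` come from the text above):

```python
class VisitorCounter:
    """Per-node view of the cluster-wide visitor count."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.local = 0    # visitors connected to this node
        self.remote = {}  # other node id -> its last reported local_visitors

    def compose(self):
        """Build a visitor change message: local count plus known remote totals."""
        return {"node": self.node_id,
                "local_visitors": self.local,
                "visitors": self.local + sum(self.remote.values())}

    def visitor_joined(self):
        self.local += 1
        return self.compose()

    def visitor_left(self):
        self.local = max(0, self.local - 1)
        return self.compose()

    def receive(self, message):
        """Record the sender's local count; its global total is not trusted."""
        if message["node"] != self.node_id:
            self.remote[message["node"]] = message["local_visitors"]
```

Storing only the sender's `local_visitors` keeps every node's total a sum of first-hand counts, so a stale `visitors` value in one message cannot propagate through the cluster.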
81+
### Subscribed files
82+
83+
WSID joining a file, leaving a file or logging out is broadcast and
84+
each node maintains a view of the remote users by WSID.
85+
86+
FIXME: need to deal with joining nodes and missed updates.
87+
88+
### Profile changes
89+
90+
Profile changes, logins and logouts are sent to all nodes and each node
91+
sends them to the browsers that have the WSID watching some file.
92+
93+
### Chat syncing
94+
95+
- Find the last message of all nodes for DocID.
96+
- If Serial-ID matches, we are done
97+
- Else
98+
- Ask each node for the history as chat(Serial,ID,Time) triples.
99+
      - Assess agreement (= no info or same)
100+
      - If all agree, send a sync request for the serial range that
101+
is not known everywhere.
102+
- Else, send an agreement _serial_ and a list of Serial-ID
103+
pairs constructed from a chronologically ordered list of
104+
chat messages about which there is no agreement.
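
One possible reading of the chat syncing steps, as a hedged sketch. It assumes `histories` maps a node name to a list of `(Serial, ID, Time)` triples for one DocID, that serials start at 1, and that every message after the first disputed serial is renumbered in chronological order; the function name and the returned dictionaries are hypothetical.

```python
from collections import Counter


def plan_chat_sync(histories):
    """Decide how to reconcile per-node chat histories for one document."""
    # Step 1: compare the last Serial-ID pair of every node.
    lasts = {max(h)[:2] if h else None for h in histories.values()}
    if len(lasts) <= 1:
        return {"action": "done"}

    ids_by_serial = {}  # serial -> set of message ids seen under that serial
    time_of = {}        # message id -> time stamp
    for history in histories.values():
        for serial, msg_id, time in history:
            ids_by_serial.setdefault(serial, set()).add(msg_id)
            time_of[msg_id] = time

    # Agreement means "no info or same": a serial is disputed only if
    # two nodes attach different message ids to it.
    disputed = sorted(s for s, ids in ids_by_serial.items() if len(ids) > 1)
    if not disputed:
        seen = Counter(s for h in histories.values() for s, _id, _t in h)
        partial = sorted(s for s, n in seen.items() if n < len(histories))
        return {"action": "sync", "from": partial[0], "to": partial[-1]}

    # Otherwise renumber everything after the last agreed serial by time.
    agreed = disputed[0] - 1
    open_ids = {m for s, ids in ids_by_serial.items() if s > agreed for m in ids}
    ordered = sorted(open_ids, key=lambda m: (time_of[m], m))
    pairs = [(agreed + 1 + i, m) for i, m in enumerate(ordered)]
    return {"action": "renumber", "agreed_serial": agreed, "pairs": pairs}
```

The sketch only produces a plan; sending the sync request or the new Serial-ID pairs to the other nodes is left out.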

doc/Redis.md

+5
@@ -0,0 +1,5 @@
1+
# Running SWISH using Redis
2+
3+
## Background
4+
5+
- https://docs.gitlab.com/ee/administration/redis/replication_and_failover_external.html

TODO.md doc/TODO.md

File renamed without changes.

doc/impact.md

+39
@@ -0,0 +1,39 @@
1+
- Usage for swish.swi-prolog.org
2+
- Period: Oct 29 2017 - Nov 26 2017
3+
- Visitors: 41433
4+
- Unique visitors: 15375 (based on IP)
5+
- Queries: 738498
6+
- Community:
7+
- Google (Feb 7, 2018)
8+
- "link:swish.swi-prolog.org": 9010 results
9+
      - SWISH Prolog: 26800 results
10+
- GitHub: 6 contributors, 226 stars, 55 forks
11+
- Docker:
12+
- swipl/swish: 121 pulls
13+
- swipl/rserve: 43 pulls (R docker for use with SWISH)
14+
- Commercial use
15+
- Simularity (http://simularity.com/, satellite image analysis)
16+
- Public sites running SWISH with extended versions of Prolog
17+
- http://cplint.ml.unife.it/
18+
Machine learning and R support
19+
- http://lpsdemo.interprolog.com/
20+
"LPS is a logic and computer language for representing the thoughts
21+
and for controlling the behaviour of an intelligent machine situated
22+
in a changing world."
23+
- Publications
24+
- Torbjörn Lager, Jan Wielemaker:
25+
Pengines: Web Logic Programming Made Easy. TPLP 14
26+
- Jan Wielemaker, Torbjörn Lager, Fabrizio Riguzzi:
27+
SWISH: SWI-Prolog for Sharing. IULP 2015. Extended version submitted
28+
to TPLP (Theory and Practice of Logic Programming journal).
29+
- Veruska Zamborlini, Jan Wielemaker, Marcos Da Silveira, Cédric Pruski,
30+
Annette ten Teije, Frank van Harmelen: SWISH for Prototyping Clinical
31+
Guideline Interactions Theory. SWAT4LS 2016
32+
- Wouter Beek, Jan Wielemaker:
33+
SWISH: An Integrated Semantic Web Notebook. International
34+
Semantic Web Conference (Posters & Demos) 2016
35+
- Tessel Bogaard, Jan Wielemaker, Laura Hollink, Jacco van Ossenbruggen:
36+
      SWISH DataLab: A Web Interface for Data Exploration and Analysis. BNAIC 2016
37+
- Marco Alberti, Elena Bellodi, Giuseppe Cota, Fabrizio Riguzzi,
38+
Riccardo Zese: cplint on SWISH: Probabilistic Logical Inference
39+
with a Web Browser. Intelligenza Artificiale
