The scripts are written in PHP. They run in sequence, and each step writes its output to files.
All PDFs are cached in this Git repo, so steps 2 and 3 do not require any downloads.
The summary page is here:
The JSON files can be seen here:
Requirements:
- php
- pdftotext (step 1 / step 1.2 only)
Ubuntu:
apt install php-cli poppler-utils
php 1-valgprotokoll-download.php
- Reads URLs from urls.txt, downloads the PDFs, and converts them to txt with pdftotext.
php 1.2-valgprotokoll-elections-no.php
- Reads PDFs from the elections.no Git repo. The PHP script updates the Git submodule itself (git submodule update --remote elections-no.github.io).
php 2-valgprotokoll-parser.php
- Parses all txt files generated by step 1 / step 1.2. Outputs JSON.
- Files with errors are ignored by default. This can be turned off with:
php 2-valgprotokoll-parser.php throw
php 3-valgprotokoll-html-report.php
- Creates HTML from the JSON output of step 2.
- Search.
- Open the browser dev tools on the results page and run the following to list the result links:
// Collect every link on the current page.
var a = document.getElementsByTagName('a');
var list = '';
for (var i = 0; i < a.length; i++) {
    var that = a[i];
    console.log(that);
    // Keep only links that do not point at the listed Google-related domains.
    if (
        that.href.indexOf('google.com') === -1
        && that.href.indexOf('google.no') === -1
        && that.href.indexOf('youtube.com') === -1
        && that.href.indexOf('blogger.com') === -1
        && that.href.indexOf('googleusercontent.com') === -1
        && that.href.length > 2) {
        list += "\n" + that.href;
    }
}
console.log(list + "\n");
- Browse to the next page and repeat.
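
Since the snippet has to be rerun on every results page, the filter can also be written as a small reusable function. This is only a sketch: the names `BLOCKED_DOMAINS` and `filterResultLinks` are hypothetical and not part of this repo, and it deliberately keeps the same substring matching as the loop above.

```javascript
// Hypothetical helper, not part of this repo.
// Domains that the dev tools snippet above filters out.
var BLOCKED_DOMAINS = [
    'google.com',
    'google.no',
    'youtube.com',
    'blogger.com',
    'googleusercontent.com'
];

// Returns the hrefs that are longer than 2 characters and do not
// contain any of the blocked domains (plain substring test).
function filterResultLinks(hrefs, blockedDomains) {
    return hrefs.filter(function (href) {
        if (href.length <= 2) {
            return false;
        }
        return blockedDomains.every(function (domain) {
            return href.indexOf(domain) === -1;
        });
    });
}

// Only runs in a browser: read the href of every anchor on the page
// and print the links that survive the filter, one per line.
if (typeof document !== 'undefined') {
    var pageHrefs = Array.prototype.map.call(
        document.getElementsByTagName('a'),
        function (anchor) { return anchor.href; }
    );
    console.log(filterResultLinks(pageHrefs, BLOCKED_DOMAINS).join("\n") + "\n");
}
```

Pasting the whole block into the dev tools console prints the filtered links for the current page. Outside a browser, only the function definition is evaluated.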