Commit 2e6589d

committed
bak
1 parent 36bca31 commit 2e6589d

File tree

6 files changed

+86
-8
lines changed

apple-inc.bigb

Lines changed: 15 additions & 0 deletions
@@ -56,6 +56,21 @@ https://en.wikipedia.org/w/index.php?title=Think_different&oldid=990983100#Telev
 {c}
 {parent=Apple Inc.}

+= Apple I
+{c}
+{parent=Apple Inc product}
+{title2=1976}
+{wiki}
+
+\Video[https://www.youtube.com/watch?v=h0UhmEOvU34]
+{title=Steve Jobs' Apple-1 sells for \$945k}
+
+= Apple II
+{c}
+{parent=Apple Inc product}
+{title2=1977}
+{wiki}
+
 = iPod
 {c}
 {parent=Apple Inc product}

artificial-intelligence.bigb

Lines changed: 45 additions & 0 deletions
@@ -446,11 +446,29 @@ Term invented by <Ciro Santilli> to refer to problems that can only be solved on

 It is somewhat of a flawed analogy to <NP-complete>.

+= Polanyi's paradox
+{c}
+{parent=Artificial general intelligence}
+{title2=We can know more than we can tell}
+
+= Mechanistic interpretability
+{parent=Polanyi's paradox}
+{wiki}
+
+* https://x.com/aif_media/status/1923028051149062607
+
+= Interpretability
+{synonym}
+
 = AGI test
 {c}
 {parent=Artificial general intelligence}
 {wiki=https://en.wikipedia.org/w/index.php?title=Artificial_general_intelligence&oldid=1192191193#Tests_for_human-level_AGI}

+= AGI benchmark
+{c}
+{synonym}
+
 = CAPTCHA
 {c}
 {parent=AGI test}
@@ -853,6 +871,7 @@ The topic received some attention with the <AI boom> and rise of <LLMs>:
 * https://leanprover-community.github.io/archive/stream/219941-Machine-Learning-for-Theorem-Proving/topic/autoformalization.3F.html

 = AI Math benchmark
+{c}
 {parent=Automated theorem proving}
 {tag=Computer benchmark}

@@ -1345,6 +1364,10 @@ e.g.:
 ./ollama-expect llama3.2 'What is quantum field theory?'
 ``

+Benchmarks:
+* <Ciro Santilli's hardware/P14s>: 4.8s, CPU only
+* <Ciro Santilli's hardware/P51>: 9.6s, uses <NVIDIA> GPU
+
 = LLM benchmark
 {parent=Large language model}
 {tag=Computer benchmark}
@@ -1399,6 +1422,28 @@ This requires knowing that the probability that twins are born on different days

 Solutions to some of the problems on specific <LLMs> can be seen e.g. at: https://github.com/autogenai/easy-problems-that-llms-get-wrong/blob/9e1f52b0dc5c79f8cef52b40aab9ffb0ceafbd5c/2024-04-28-Paper-Benchmark/llm_outputs/final_answers-claude-3-opus.csv

+= List of LLM benchmarks
+{parent=LLM benchmark}
+
+= MMLU
+{c}
+{parent=List of LLM benchmarks}
+{title2=2020}
+{wiki}
+
+= Humanity's Last Exam
+{c}
+{parent=List of LLM benchmarks}
+{tag=AI Math benchmark}
+{title2=2025}
+{wiki}
+
+Contains highly specialized questions in various academic fields, including <mathematics>. The problems are answered either with a number, or multiple choice, or free text.
+
+* https://arxiv.org/abs/2501.1424
+* https://huggingface.co/datasets/cais/hle
+* https://agi.safe.ai/
+
 = Uncensored LLM
 {parent=Large language model}

cirosantilli.github.io.code-workspace

Lines changed: 1 addition & 0 deletions
@@ -105,6 +105,7 @@
 "inclusionism",
 "Infineon",
 "Intermetallic",
+"interpretability",
 "Ising",
 "Janelia",
 "Jundiaí",

machine-learning.bigb

Lines changed: 21 additions & 7 deletions
@@ -1103,42 +1103,56 @@ OK-ish data explorer: https://knowyourdata-tfds.withgoogle.com/#tab=STATS&datase
 \Image[http://web.archive.org/web/20230430064700im_/https://i.stack.imgur.com/qoTGE.png]
 {title=<MNIST> image 3 of a '1'}

-= Extract MNIST images
+= Extract <MNIST> images
 {c}
 {parent=MNIST database}

-= Extracting MNIST images
+= Extracting <MNIST> images
 {synonym}

 * https://stackoverflow.com/questions/40427435/extract-images-from-idx3-ubyte-file-or-gzip-via-python/75993239#75993239
 * https://stackoverflow.com/questions/55049511/how-to-download-mnist-images-as-pngs/75993252#75993252

-= Best algorithm for MNIST
+= Best algorithm for <MNIST>
 {c}
 {parent=MNIST database}

 The table: https://en.wikipedia.org/w/index.php?title=MNIST_database&oldid=1152541822#Classifiers

-= Fashion MNIST
+= Fashion <MNIST>
 {c}
 {parent=MNIST database}
 {title2=2017}
 {wiki}

-Same style as <MNIST>, but with clothes. Designed to be much harder, and more representative of modern applications, while still retaining the low resolution of <MNIST> for simplicity of training.
+Same style as <MNIST>: 28x28 grayscale images, but with clothes rather than hand written digits.
+
+It was designed to be much harder than <MNIST>, and more representative of modern applications, while still retaining the low resolution of <MNIST> for simplicity of training.
+
+\Image[https://web.archive.org/web/20250511105702im_/https://github.com/zalandoresearch/fashion-mnist/raw/master/doc/img/fashion-mnist-sprite.png]

 = CIFAR-10
 {c}
 {parent=Computer vision dataset}
 {wiki}

-60,000 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
+https://www.cs.toronto.edu/~kriz/cifar.html
+
+60,000 tiny 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.

 TODO release date.

 This dataset can be thought of as an intermediate between the simplicity of <MNIST>, and a more full blown <ImageNet>.

-= Toronto faces dataset
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/airplane1.png]
+
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/automobile1.png]
+
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/bird1.png]
+
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/cat1.png]
+
+= #Toronto faces dataset
 {c}
 {parent=Computer vision dataset}
 {title2=TFD}

ourbigbook.json

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,5 @@
 {
+"generateSitemap": true,
 "ignoreConvert": [
 ".*\\.scss",
 "sponsor/updates/template.bigb"
@@ -27,6 +28,7 @@
 ["cirodown", "ourbigbook"]
 ],
 "publishCommitDate": "2000-01-01T00:00:00",
+"publishRootUrl": "https://cirosantilli.com",
 "publishRemoteUrl": "git@github.com:cirosantilli/cirosantilli.github.io.git",
 "unsafeXss": true,
 "web": {

software.bigb

Lines changed: 2 additions & 1 deletion
@@ -145,6 +145,7 @@ Basically they require users to hand-code a metric and provide a program skeleto
 All the novel results they announced were in <constraint satisfaction problems> or <optimization problem>. Their results are still awesome, but it's not very different from <AlphaGo> style things.

 = AI code generation benchmark
+{c}
 {parent=Automatic programming}

 Bibliography:
@@ -167,7 +168,7 @@ Appears to be a very small number of newly created problems?
 * https://github.com/openai/human-eval
 * https://arxiv.org/abs/2107.03374

-The tests are present in a gzip inside the Git repo: https://github.com/openai/human-eval/blob/master/data/HumanEval.jsonl.gz these researchers.
+The tests are present in a gzip inside the Git repo: https://github.com/openai/human-eval/blob/master/data/HumanEval.jsonl.gz These researchers.

 To get a quick overview of the problems with <jq>:
 ``
