bak

cirosantilli · cirosantilli · commit 2e6589d1b203 · 2025-05-21T11:03:56.000+01:00
diff --git a/apple-inc.bigb b/apple-inc.bigb
@@ -56,6 +56,21 @@ https://en.wikipedia.org/w/index.php?title=Think_different&oldid=990983100#Telev
 {c}
 {parent=Apple Inc.}
 
+= Apple I
+{c}
+{parent=Apple Inc product}
+{title2=1976}
+{wiki}
+
+\Video[https://www.youtube.com/watch?v=h0UhmEOvU34]
+{title=Steve Jobs' Apple-1 sells for \$945k}
+
+= Apple II
+{c}
+{parent=Apple Inc product}
+{title2=1977}
+{wiki}
+
 = iPod
 {c}
 {parent=Apple Inc product}
diff --git a/artificial-intelligence.bigb b/artificial-intelligence.bigb
@@ -446,11 +446,29 @@ Term invented by <Ciro Santilli> to refer to problems that can only be solved on
 
 It is somewhat of a flawed analogy to <NP-complete>.
 
+= Polanyi's paradox
+{c}
+{parent=Artificial general intelligence}
+{title2=We can know more than we can tell}
+
+= Mechanistic interpretability
+{parent=Polanyi's paradox}
+{wiki}
+
+* https://x.com/aif_media/status/1923028051149062607
+
+= Interpretability
+{synonym}
+
 = AGI test
 {c}
 {parent=Artificial general intelligence}
 {wiki=https://en.wikipedia.org/w/index.php?title=Artificial_general_intelligence&oldid=1192191193#Tests_for_human-level_AGI}
 
+= AGI benchmark
+{c}
+{synonym}
+
 = CAPTCHA
 {c}
 {parent=AGI test}
@@ -853,6 +871,7 @@ The topic received some attention with the <AI boom> and rise of <LLMs>:
 * https://leanprover-community.github.io/archive/stream/219941-Machine-Learning-for-Theorem-Proving/topic/autoformalization.3F.html
 
 = AI Math benchmark
+{c}
 {parent=Automated theorem proving}
 {tag=Computer benchmark}
 
@@ -1345,6 +1364,10 @@ e.g.:
 ./ollama-expect llama3.2 'What is quantum field theory?'
 ``
 
+Benchmarks:
+* <Ciro Santilli's hardware/P14s>: 4.8s, CPU only
+* <Ciro Santilli's hardware/P51>: 9.6s, uses <NVIDIA> GPU
+
 = LLM benchmark
 {parent=Large language model}
 {tag=Computer benchmark}
@@ -1399,6 +1422,28 @@ This requires knowing that the probability that twins are born on different days
 
 Solutions to some of the problems on specific <LLMs> can be seen e.g. at: https://github.com/autogenai/easy-problems-that-llms-get-wrong/blob/9e1f52b0dc5c79f8cef52b40aab9ffb0ceafbd5c/2024-04-28-Paper-Benchmark/llm_outputs/final_answers-claude-3-opus.csv
 
+= List of LLM benchmarks
+{parent=LLM benchmark}
+
+= MMLU
+{c}
+{parent=List of LLM benchmarks}
+{title2=2020}
+{wiki}
+
+= Humanity's Last Exam
+{c}
+{parent=List of LLM benchmarks}
+{tag=AI Math benchmark}
+{title2=2025}
+{wiki}
+
+Contains highly specialized questions in various academic fields, including <mathematics>. The problems are answered either with a number, or multiple choice, or free text.
+
+* https://arxiv.org/abs/2501.1424
+* https://huggingface.co/datasets/cais/hle
+* https://agi.safe.ai/
+
 = Uncensored LLM
 {parent=Large language model}
 
diff --git a/cirosantilli.github.io.code-workspace b/cirosantilli.github.io.code-workspace
@@ -105,6 +105,7 @@
             "inclusionism",
             "Infineon",
             "Intermetallic",
+            "interpretability",
             "Ising",
             "Janelia",
             "Jundiaí",
diff --git a/machine-learning.bigb b/machine-learning.bigb
@@ -1103,42 +1103,56 @@ OK-ish data explorer: https://knowyourdata-tfds.withgoogle.com/#tab=STATS&datase
 \Image[http://web.archive.org/web/20230430064700im_/https://i.stack.imgur.com/qoTGE.png]
 {title=<MNIST> image 3 of a '1'}
 
-= Extract MNIST images
+= Extract <MNIST> images
 {c}
 {parent=MNIST database}
 
-= Extracting MNIST images
+= Extracting <MNIST> images
 {synonym}
 
 * https://stackoverflow.com/questions/40427435/extract-images-from-idx3-ubyte-file-or-gzip-via-python/75993239#75993239
 * https://stackoverflow.com/questions/55049511/how-to-download-mnist-images-as-pngs/75993252#75993252
 
-= Best algorithm for MNIST
+= Best algorithm for <MNIST>
 {c}
 {parent=MNIST database}
 
 The table: https://en.wikipedia.org/w/index.php?title=MNIST_database&oldid=1152541822#Classifiers
 
-= Fashion MNIST
+= Fashion <MNIST>
 {c}
 {parent=MNIST database}
 {title2=2017}
 {wiki}
 
-Same style as <MNIST>, but with clothes. Designed to be much harder, and more representative of modern applications, while still retaining the low resolution of <MNIST> for simplicity of training.
+Same style as <MNIST>: 28x28 grayscale images, but with clothes rather than handwritten digits.
+
+It was designed to be much harder than <MNIST>, and more representative of modern applications, while still retaining the low resolution of <MNIST> for simplicity of training.
+
+\Image[https://web.archive.org/web/20250511105702im_/https://github.com/zalandoresearch/fashion-mnist/raw/master/doc/img/fashion-mnist-sprite.png]
 
 = CIFAR-10
 {c}
 {parent=Computer vision dataset}
 {wiki}
 
-60,000 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
+https://www.cs.toronto.edu/~kriz/cifar.html
+
+60,000 tiny 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
 
 TODO release date.
 
 This dataset can be thought of as an intermediate between the simplicity of <MNIST>, and a more full blown <ImageNet>.
 
-= Toronto faces dataset
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/airplane1.png]
+
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/automobile1.png]
+
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/bird1.png]
+
+\Image[https://web.archive.org/web/20250517192041im_/https://www.cs.toronto.edu/~kriz/cifar-10-sample/cat1.png]
+
+= #Toronto faces dataset
 {c}
 {parent=Computer vision dataset}
 {title2=TFD}
diff --git a/ourbigbook.json b/ourbigbook.json
@@ -1,4 +1,5 @@
 {
+  "generateSitemap": true,
   "ignoreConvert": [
     ".*\\.scss",
     "sponsor/updates/template.bigb"
@@ -27,6 +28,7 @@
     ["cirodown", "ourbigbook"]
   ],
   "publishCommitDate": "2000-01-01T00:00:00",
+  "publishRootUrl": "https://cirosantilli.com",
   "publishRemoteUrl": "git@github.com:cirosantilli/cirosantilli.github.io.git",
   "unsafeXss": true,
   "web": {
diff --git a/software.bigb b/software.bigb
@@ -145,6 +145,7 @@ Basically they require users to hand-code a metric and provide a program skeleto
 All the novel results they announced were in <constraint satisfaction problems> or <optimization problem>. Their results are still awesome, but it's not very different from <AlphaGo> style things.
 
 = AI code generation benchmark
+{c}
 {parent=Automatic programming}
 
 Bibliography:
@@ -167,7 +168,7 @@ Appears to be a very small number of newly created problems?
 * https://github.com/openai/human-eval
 * https://arxiv.org/abs/2107.03374
 
-The tests are present in a gzip inside the Git repo: https://github.com/openai/human-eval/blob/master/data/HumanEval.jsonl.gz these researchers.
+The tests are present in a gzip inside the Git repo: https://github.com/openai/human-eval/blob/master/data/HumanEval.jsonl.gz These researchers.
 
 To get a quick overview of the problems with <jq>:
 ``