[pull] master from ggerganov:master #148

pull · 2024-10-23T22:00:38Z

See Commits and Changes for more details.

* llama : deprecate softmax sampler + fix dist sampler
  ggml-ci
* tests : replace macros with functions
  ggml-ci
* sampling : change temperature sampler logic
  For t <= 0.0f, keep the max logit intact and set the rest to -inf
* cont : no need for special "greedy" logic
  top-k == 1 is the same
* tests : init prob correctly
* llama : handle temp <= 0.0 in the temp_ext sampler too
  ggml-ci
* cont : avoid extra loop in temperature sampler for sub-zero temp
  ggml-ci
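The temperature rule described in the commit above (for t <= 0, keep the max logit and set the rest to -inf) can be sketched as follows. This is only an illustration under stated assumptions: `apply_temperature` is a hypothetical helper that works on a plain `float` array, not the llama.cpp sampler API, and keeping the first maximum when logits tie is a choice made here, not something the commit message specifies.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not part of llama.cpp): applies a temperature to an
 * array of logits in place. */
void apply_temperature(float *logits, size_t n, float temp) {
    assert(logits != NULL && n > 0);

    if (temp <= 0.0f) {
        /* Single pass: track the index of the largest logit seen so far and
         * mask every other entry with -inf. On ties the first maximum wins
         * (an assumption of this sketch). */
        size_t max_i = 0;
        for (size_t i = 1; i < n; ++i) {
            if (logits[i] > logits[max_i]) {
                logits[max_i] = -INFINITY; /* previous best is no longer the max */
                max_i = i;
            } else {
                logits[i] = -INFINITY;
            }
        }
        return;
    }

    /* Regular case: scale every logit by 1/temp. */
    for (size_t i = 0; i < n; ++i) {
        logits[i] /= temp;
    }
}
```

After the masking step, a softmax over the array assigns probability 1 to the surviving entry, which is presumably why the commit notes that no separate "greedy" path is needed.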

* Add chat template for RWKV-World
* RWKV: Fix the chat template not being used
* RWKV v6: Set EOT token to ``\n\n``
* readme: add rwkv into supported model list

---------

Signed-off-by: Molly Sophia <[email protected]>

* llama: remove useless template matching for rwkv-world
* converter: Add comment about the hack for rwkv models
* Update src/llama.cpp

---------

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>

This commit renames the member field `batch` in `llm_build_context` to `ubatch`, and also the parameter `batch` in `llama_build_graph` and `llama_set_inputs` to `ubatch`. The motivation for this change is to make the code more readable (considering there are the structs `llama_batch` and `llama_sbatch`), and consistent with other parts of the code base where parameters/fields of type `llama_ubatch` are named `ubatch`.

* llama : fix empty batch cause llama_batch_allocr to crash
* move batch_allocr inside decode/encode_internal
* fix build
* add GGML_ASSERT
* Apply suggestions from code review

---------

Co-authored-by: Georgi Gerganov <[email protected]>

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/5633bcff0c6162b9e4b5f1264264611e950c8ec7?narHash=sha256-9UTxR8eukdg%2BXZeHgxW5hQA9fIKHsKCdOIUycTryeVw%3D' (2024-10-09)
  → 'github:NixOS/nixpkgs/4c2fcb090b1f3e5b47eaa7bd33913b574a11e0a0?narHash=sha256-/uilDXvCIEs3C9l73JTACm4quuHUsIHcns1c%2BcHUJwA%3D' (2024-10-18)

* add pool_2d
* fix im2col and add unittest for N>=1024
* add tests for N % 1024 != 0
* remove trailing whitespaces
* apply suggestions
* apply more optimization - original IM2COL kernel + _ext with MIN()
* apply review: change kernel name of pool_2d
* apply review
* fix more formatting and enhance readability

---------

Signed-off-by: Junhee Yoo <[email protected]>

* added classic vim support
* fixed ring update, removed blank line
* minor
* minor
* minor doc update
* removed unneeded var
* minor
* minor
* fixed job_start creating new scratch buffers
* fixed job_start creating new scratch buffers
* fixed ghost text indenting when expandtab is on
* removed unused code
* minor
* unified fim_on_exit
* minor
* vim ghost text rendering now uses pos_x and pos_y parameters
* renamed *_hlgroup to hlgroup_*
* renamed *_ghost_text to ghost_text_*, moved nvim/vim detection to llama#init()
* minor

---------

Co-authored-by: Michael Coppola <[email protected]>

This commit removes the setting of the `used` field of the contexts in the global state (g_state) in `ggml_init`.

The motivation for this change is that I believe that this additional initialization might not be required after the changes in Commit 45fc4fe ("sync : latest changes from whisper.cpp"), which changed the initialization of the contexts field from `{ 0 }` to `{ { 0 } }`:

```console
     g_state = (struct ggml_state) {
-        /*.contexts =*/ { 0 },
+        /*.contexts =*/ { { 0 } },
     };
```

My understanding is that the `{0}` initialization might not have zero-initialized all the nested fields in every array element because of compiler differences, and might have been the reason for having the explicit setting of the `used` fields to false.
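A minimal sketch of the initialization pattern discussed above, using hypothetical `demo_*` types rather than the real ggml structs (whose exact layout is not shown here). It assumes a conforming C99 compiler, where members without an explicit initializer are initialized as if they had static storage duration, so every array element should end up zeroed after the assignment without a separate loop over `used`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CONTEXTS 4 /* hypothetical; stands in for the real limit */

/* Hypothetical stand-ins for the nested structs held in the global state. */
struct demo_context {
    size_t mem_size;
    void  *mem_buffer;
};

struct demo_container {
    bool                used;
    struct demo_context context;
};

struct demo_state {
    struct demo_container contexts[DEMO_MAX_CONTEXTS];
};

static struct demo_state g_demo_state;

/* Re-initialize the global state with a fully braced compound literal.
 * Outer braces: the struct; middle braces: the contexts array; inner
 * braces: contexts[0]. All remaining elements are zeroed implicitly. */
void demo_init(void) {
    g_demo_state = (struct demo_state) {
        /*.contexts =*/ { { 0 } },
    };
}

/* Returns true when no container is marked as used. */
bool demo_all_unused(void) {
    for (size_t i = 0; i < DEMO_MAX_CONTEXTS; ++i) {
        if (g_demo_state.contexts[i].used) {
            return false;
        }
    }
    return true;
}
```

Setting a few fields to non-zero values, calling `demo_init()`, and then checking `demo_all_unused()` is one way to confirm on a given toolchain that the explicit per-element reset is redundant.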

icppWorld and others added 27 commits October 20, 2024 19:01

readme : update infra list (#9942)

7cab208

llama_cpp_canister allows you to run llama.cpp as a Smart Contract on the Internet Computer. The smart contract runs as WebAssembly in a so-called 'canister'.

readme : update bindings list (#9951)

45f0976

Update the binding list by adding LM-Kit.NET (C# & VB.NET)

fix mul_mat_vec_q and *_vec_q error (#9939)

1db8c84

Co-authored-by: arthw <[email protected]>

speculative : fix handling of some input params (#9963)

bc21975

* speculative : fix batch sizes at initialization
  ggml-ci
* speculative : handle params.n_predict == -1
* speculative : limit batch size to llama_n_batch

rpc : pack only RPC structs (#9959)

d5ebd79

ggml : add asserts for type conversion in fattn kernels (#9971)

f594bc8

ggml-ci

llama.vim : plugin for Neovim (#9787)

dbd5f2f

arg : fix attention non-causal arg value hint (#9985)

94008cc

This commit updates the argument value hint for the `--attention` argument to `non-causal`. The motivation for this change is that the only values for this argument are `causal` and `non-causal`.

readme : update UI list (#9972)

994cfb1

add PocketPal AI app

llama.vim : move info to the right of screen [no ci] (#9787)

e01c67a

'eol' messes up the rendering with nvim v0.10.2 for some reason

llama.vim : fix info text display [no ci] (#9787)

e94a138

arg : fix typo in embeddings argument help [no ci] (#9994)

674804a

This commit fixes two typos in the help text for the `--embd-normalize` and `--embd-separator` arguments. It also updates common.h, which contains the same typo in two comments.

[CANN] Adapt to dynamically loadable backends mechanism (#9970)

6b84473

* [CANN] Adapt to dynamically loadable backends mechanism
* Fix the bug: inference output is garbled when running in debug mode for LM models whose type is Q4_0 class
* Handle the review comments of this pull request

lora : warn user if new token is added in the adapter (#9948)

c421ac0

CUDA: fix 1D im2col, add tests (ggml/993)

80273a3

llama.vim : bump generation time limit to 3s [no ci]

2d3aba9

sync : ggml

190a37d

server : samplers accept the prompt correctly (#10019)

0a1c750

github-actions bot added examples python server labels Oct 23, 2024

github-actions bot added ggml SYCL Nvidia GPU testing script labels Oct 23, 2024

pull bot added ⤵️ pull and removed examples python server ggml SYCL Nvidia GPU testing script labels Oct 23, 2024

teleprint-me closed this Oct 24, 2024
