forked from ggml-org/llama.cpp
[pull] master from ggerganov:master #148
Closed
Conversation
llama_cpp_canister allows you to run llama.cpp as a Smart Contract on the Internet Computer. The smart contract runs as WebAssembly in a so-called 'canister'.
Update the binding list by adding LM-Kit.NET (C# & VB.NET)
Co-authored-by: arthw <[email protected]>
* speculative : fix batch sizes at initialization ggml-ci
* speculative : handle params.n_predict == -1
* speculative : limit batch size to llama_n_batch
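The batch-size limiting described in these commits can be sketched as follows. This is an illustrative stand-in, not llama.cpp's actual API: the function name and parameters are hypothetical, but it captures the two constraints named above (never draft more tokens than `llama_n_batch` allows, and treat `n_predict == -1` as "no limit").

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch: clamp the number of speculative draft tokens.
// n_draft     - requested draft size
// n_batch     - context batch size (upper bound, per llama_n_batch)
// n_predict   - total generation limit, -1 means unlimited
// n_generated - tokens generated so far
static int32_t clamp_n_draft(int32_t n_draft, int32_t n_batch,
                             int32_t n_predict, int32_t n_generated) {
    int32_t n = std::min(n_draft, n_batch);   // never exceed the batch size
    if (n_predict >= 0) {
        n = std::min(n, n_predict - n_generated); // respect the prediction limit
    }
    return std::max(n, (int32_t) 0);          // never negative
}
```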
* llama : deprecate softmax sampler + fix dist sampler ggml-ci
* tests : replace macros with functions ggml-ci
* sampling : change temperature sampler logic: for t <= 0.0f, keep the max logit intact and set the rest to -inf
* cont : no need for special "greedy" logic; top-k == 1 is the same
* tests : init prob correctly
* llama : handle temp <= 0.0 in the temp_ext sampler too ggml-ci
* cont : avoid extra loop in temperature sampler for sub-zero temp ggml-ci
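The t <= 0.0f behavior described above can be sketched like this. This is a simplified illustration, not the actual llama.cpp sampler implementation: keeping the max logit and forcing the rest to -inf makes the subsequent distribution step deterministically pick the argmax, which is why no separate "greedy" path is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical sketch of the temperature step on raw logits.
// For temp <= 0.0f: keep the max logit intact, set the rest to -inf,
// so softmax over the result yields probability 1 for the argmax.
// For temp > 0.0f: the usual division by temperature.
static void apply_temp_sketch(std::vector<float> & logits, float temp) {
    if (temp <= 0.0f) {
        const auto it_max = std::max_element(logits.begin(), logits.end());
        for (auto it = logits.begin(); it != logits.end(); ++it) {
            if (it != it_max) {
                *it = -std::numeric_limits<float>::infinity();
            }
        }
        return;
    }
    for (auto & l : logits) {
        l /= temp;
    }
}
```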
This commit updates the argument value hint for the `--attention` argument to `non-causal`. The motivation for this change is that the only values for this argument are `causal` and `non-causal`.
add PocketPal AI app
'eol' messes up the rendering with nvim v0.10.2 for some reason
This commit fixes two typos in the help text for the `--embd-normalize` and `--embd-separator` arguments. It also updates common.h, which contained the same typo in two comments.
* [CANN] Adapt to the dynamically loadable backends mechanism
* Fix a bug where inference output was garbled in debug builds for LM models whose type is Q4_0
* Address the review comments on this pull request
* Add chat template for RWKV-World
* RWKV: Fix the chat template not being used
* RWKV v6: Set EOT token to `\n\n`
* readme: add rwkv into supported model list

Signed-off-by: Molly Sophia <[email protected]>
* llama: remove useless template matching for rwkv-world
* converter: Add comment about the hack for rwkv models
* Update src/llama.cpp

Signed-off-by: Molly Sophia <[email protected]>
Co-authored-by: Xuan Son Nguyen <[email protected]>
This commit renames the member field batch in llm_build_context to ubatch, and also the parameter batch in llama_build_graph, and llama_set_inputs to ubatch. The motivation for this change is to make the code more readable (considering there are the structs llama_batch and llama_sbatch), and consistent with other parts of the code base where parameters/fields of type llama_ubatch are named ubatch.
* llama : fix empty batch causing llama_batch_allocr to crash
* move batch_allocr inside decode/encode_internal
* fix build
* add GGML_ASSERT
* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <[email protected]>
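The empty-batch guard described above might look roughly like this. The structure and function names here are illustrative stand-ins, not the real llama.cpp internals: the point is to reject `n_tokens == 0` up front with an error code instead of letting the batch allocator crash later.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified batch: only the field relevant to the guard.
struct batch_sketch {
    int32_t n_tokens;
};

// Sketch of decode_internal-style validation: an empty batch returns an
// error to the caller rather than reaching the allocator with zero tokens.
static int32_t decode_sketch(const batch_sketch & batch) {
    if (batch.n_tokens == 0) {
        return -1; // invalid input: caller gets an error instead of a crash
    }
    // ... allocate per-token buffers, build the compute graph, etc.
    return 0;
}
```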
Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/5633bcff0c6162b9e4b5f1264264611e950c8ec7?narHash=sha256-9UTxR8eukdg%2BXZeHgxW5hQA9fIKHsKCdOIUycTryeVw%3D' (2024-10-09)
  → 'github:NixOS/nixpkgs/4c2fcb090b1f3e5b47eaa7bd33913b574a11e0a0?narHash=sha256-/uilDXvCIEs3C9l73JTACm4quuHUsIHcns1c%2BcHUJwA%3D' (2024-10-18)
* add pool_2d
* fix im2col and add unittest for N>=1024
* add tests for N % 1024 != 0
* remove trailing whitespaces
* apply suggestions
* apply more optimization: original IM2COL kernel + _ext with MIN()
* apply review: change kernel name of pool_2d
* apply review
* fix more formatting and enhance readability

Signed-off-by: Junhee Yoo <[email protected]>
* added classic vim support
* fixed ring update, removed blank line
* minor doc update
* removed unneeded var
* fixed job_start creating new scratch buffers
* fixed ghost text indenting when expandtab is on
* removed unused code
* unified fim_on_exit
* vim ghost text rendering now uses pos_x and pos_y parameters
* renamed *_hlgroup to hlgroup_*
* renamed *_ghost_text to ghost_text_*, moved nvim/vim detection to llama#init()

Co-authored-by: Michael Coppola <[email protected]>
This commit removes the setting of the `used` field of the contexts in the global state (g_state) in `ggml_init`. The motivation for this change is that I believe that this additional initialization might not be required after the changes in Commit 45fc4fe ("sync : latest changes from whisper.cpp"), which changed the initialization of the contexts field from `{ 0 }` to `{ { 0 } }`: ```console g_state = (struct ggml_state) { - /*.contexts =*/ { 0 }, + /*.contexts =*/ { { 0 } }, }; ``` My understanding is that the `{0}` initialization might not have zero-initialized all the nested fields in every array element because of compiler differences, and might have been the reason for having the explicit setting of the `used` fields to false.
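The aggregate-initialization rule this commit relies on can be demonstrated with a small stand-in (the struct names and fields below are simplified illustrations, not the real ggml layout): both in C and C++, members and elements without an explicit initializer are zero-initialized, so the explicit `used = false` loop was redundant under a conforming compiler.

```cpp
#include <cassert>

// Illustrative stand-ins for the structures the commit discusses;
// the field names are simplified, not ggml's actual layout.
struct context_container_sketch {
    bool used;
    int  n_objects;
};

struct state_sketch {
    context_container_sketch contexts[4];
};

// Only contexts[0] gets an explicit initializer; per the aggregate
// initialization rules, contexts[1..3] are value-initialized (zeroed),
// so every `used` flag ends up false without any explicit loop.
static state_sketch make_state() {
    state_sketch s = { { { false, 0 } } };
    return s;
}
```

Note that some compilers warn about the `{ 0 }` spelling (`-Wmissing-braces`), which may be why the extra inner braces were added; the zeroing guarantee itself holds either way.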
See Commits and Changes for more details.
Created by pull[bot]