ExLlamav2_HF: Convert logits to FP32 #4310

turboderp · 2023-10-17T19:54:01Z

ExLlama converts logits to FP32 at the end of the forward pass, while ExLlamaV2 returns FP16 logits straight from the lm_head layer. Converting in the V2 loader makes sure the HF sampler behaves the same for V1 and V2.
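For readers wondering why the dtype matters to the sampler, here is a rough, stdlib-only sketch. It does not use ExLlamaV2 or PyTorch and is not the code changed in this PR. The helpers `round_to`, `softmax`, and `max_error` are hypothetical names made up for the illustration, and rounding after every step only approximates real FP16/FP32 arithmetic. The logit values are arbitrary but exactly representable in FP16, so the upcast itself changes nothing; only the math done afterwards differs.

```python
import math
import struct


def round_to(value: float, fmt: str) -> float:
    """Round a Python float to IEEE half ('e') or single ('f') precision."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def softmax(logits, fmt):
    """Softmax with every intermediate result rounded to `fmt` precision."""
    top = max(logits)
    exps = [round_to(math.exp(round_to(x - top, fmt)), fmt) for x in logits]
    total = 0.0
    for e in exps:
        total = round_to(total + e, fmt)
    return [round_to(e / total, fmt) for e in exps]


def max_error(probs, reference):
    """Largest absolute difference between two probability lists."""
    return max(abs(p - r) for p, r in zip(probs, reference))


# Arbitrary example logits, all exactly representable in FP16.
logits = [10.0, 9.5, 3.25, 0.0, -1.0]

# Double-precision reference softmax.
top = max(logits)
ref_exps = [math.exp(x - top) for x in logits]
ref_total = sum(ref_exps)
reference = [e / ref_total for e in ref_exps]

err_fp16 = max_error(softmax(logits, "e"), reference)
err_fp32 = max_error(softmax(logits, "f"), reference)
print(f"max softmax error with FP16 math: {err_fp16:.2e}")
print(f"max softmax error with FP32 math: {err_fp32:.2e}")
```

The sampler sees the same logit values either way; what changes is the precision of the arithmetic applied to them afterwards, which is why the dtype returned by the loader can affect sampling results.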

Checklist:

I have read the Contributing guidelines.

# Conflicts:
#   modules/exllamav2.py
#   requirements.txt
#   requirements_amd.txt
#   requirements_amd_noavx2.txt
#   requirements_apple_intel.txt
#   requirements_apple_silicon.txt
#   requirements_cpu_only.txt
#   requirements_cpu_only_noavx2.txt
#   requirements_noavx2.txt
#   requirements_nowheels.txt

Ph0rk0z · 2023-10-18T11:20:27Z

PPL on evaluation went down slightly after this.

oobabooga · 2023-10-19T02:15:59Z

Thanks @turboderp, I would never have noticed this.

turboderp and others added 7 commits October 5, 2023 17:50

Bump exllamav2 to 0.0.5

c0975e0

Allow configuring BOS token for ExLlamaV2Model

68f90f7

Merge branch 'main' into turboderp-main

186a02d

Update other requirements.txt

fd46da3

add_bos_token is also used by exllamav1

55fbf9c

ExLlamav2_HF: convert logits to FP32 before sampling

28f35a9

oobabooga merged commit ae8cd44 into oobabooga:main Oct 19, 2023
