Commit 008d95c: Update noob_intro_transformers.md (1 parent 0bf9001)

noob_intro_transformers.md (+19 -12)
# Total noob’s intro to Hugging Face Transformers
Welcome to "A Total Noob's Introduction to Hugging Face Transformers," a guide designed specifically for those looking to understand the bare basics of using open-source ML. Our goal is to demystify what Hugging Face Transformers is and how it works, not to turn you into a machine learning practitioner, but to enable better understanding of and collaboration with those who are. That being said, the best way to learn is by doing, so we'll walk through a simple worked example of running Microsoft's Phi-2 LLM in a notebook on a Hugging Face Space.

You might wonder, with the abundance of tutorials on Hugging Face already available, why create another? The answer lies in accessibility: most existing resources assume some technical background, including Python proficiency, which can prevent non-technical individuals from grasping ML fundamentals. As someone who came from the business side of AI, I recognize that the learning curve presents a barrier, and I wanted to offer a more approachable path for like-minded learners.

Therefore, this guide is tailored for a non-technical audience keen to better understand open-source machine learning without having to learn Python from scratch. We assume no prior knowledge and will explain concepts from the ground up to ensure clarity. If you're an engineer, you'll find this guide a bit basic, but for beginners, it's an ideal starting point.

Let’s get stuck in… but first some context.
## What is Hugging Face Transformers?
Hugging Face Transformers is an open-source Python library that provides access to thousands of pre-trained Transformers models for natural language processing (NLP), computer vision, audio tasks, and more. It simplifies the process of implementing Transformer models by abstracting away the complexity of training or deploying models in lower-level ML frameworks like PyTorch, TensorFlow and JAX.

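To give you a taste of what that looks like in practice, here is a minimal sketch (my own example, not part of the original walkthrough) using the library's high-level `pipeline` helper; the prompt and generation settings are arbitrary illustrative choices:

```python
from transformers import pipeline

# One helper call downloads the model and tokenizer and wires them together.
# (Phi-2 is a multi-gigabyte download and really wants a GPU, so treat this as illustrative.)
generator = pipeline("text-generation", model="microsoft/phi-2")

# Ask the pipeline to continue a prompt.
result = generator("Machine learning is", max_new_tokens=20)
print(result[0]["generated_text"])
```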
## What is a library?

[…]

A Docker template is a predefined blueprint for a software environment that includes…

By default, our Space comes with a complimentary CPU, which is fine for some applications. However, the many computations required by LLMs benefit significantly from being run in parallel to improve speed, which is something GPUs are great at.

It's also important to choose a GPU with enough memory to store the model and provide spare working memory. In our case, an A10G Small with 24GB is enough for Phi-2.

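As a rough sanity check on that claim (my own back-of-the-envelope arithmetic, not from the original post): Phi-2 has about 2.7 billion parameters, and at half precision each parameter takes two bytes, so the weights alone need roughly 5-6GB, leaving the rest of the 24GB as working memory.

```python
# Back-of-the-envelope estimate of GPU memory needed for the weights alone,
# assuming half-precision (float16, 2 bytes per parameter).
params = 2.7e9          # Phi-2 has roughly 2.7 billion parameters
bytes_per_param = 2
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.1f} GB for the weights")  # ~5.4 GB, well under 24 GB
```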
<p align="center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/llama2-non-engineers/guide4.png"><br>
</p>

[…]

Although Transformers is already installed, the specific Classes within Transformers…

9. Define which model you want to run
- To tell Transformers which model you want to download and run from the Hugging Face Hub, you need to specify the name of its model repo in your code
- We do this by setting a variable equal to the model name; in this case we call the variable `model_id`
- We'll use Microsoft's Phi-2, a small but surprisingly capable model, which can be found at https://huggingface.co/microsoft/phi-2. Note: Phi-2 is a base model, not an instruction-tuned one, so it will respond unusually if you try to use it for chat.

```python
model_id = "microsoft/phi-2"
```
## What is an instruction-tuned model?
An instruction-tuned language model is a type of model that has been further trained from its base version to understand and respond to commands or prompts given by a user, improving its ability to follow instructions. Base models are able to autocomplete text, but often don't respond to commands in a useful way. We'll see this later when we try to prompt Phi-2.

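For contrast, here is a small sketch (my addition, not part of the guide) of how instruction-tuned models are usually prompted: their tokenizers ship with a chat template that wraps your message in the formatting the model was trained to follow. The model named here is just one arbitrary example of an instruction-tuned model:

```python
from transformers import AutoTokenizer

# An example instruction-tuned model whose tokenizer includes a chat template.
tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")

messages = [{"role": "user", "content": "Who are you?"}]

# Render the chat message into the exact text format the model expects.
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```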
10. Create a model object and load the model
- To load the model from the Hugging Face Hub into our local environment we need to instantiate the model object. We do this by passing the `model_id` we defined in the last step as the argument of the `.from_pretrained` method on the `AutoModelForCausalLM` Class.
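- The code block for this step sits outside this diff, but as a rough sketch (based only on the description above, so the exact arguments are an assumption) it looks like this:

```python
from transformers import AutoModelForCausalLM

# Download the weights from the Hub and instantiate the model object.
model = AutoModelForCausalLM.from_pretrained(model_id)
```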
[…]

`tokenizer = AutoTokenizer.from_pretrained(model_id, add_eos_token=True, padding_…`

A tokenizer is a tool that splits sentences into smaller pieces of text (tokens) and assigns each token a numeric value called an input id. This is needed because our model only understands numbers, so we first must convert (a.k.a. encode) the text into a format the model can understand. Each model has its own tokenizer vocabulary; it's important to use the same tokenizer that the model was trained on, or it will misinterpret the text.

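To make that concrete, here is a tiny sketch of encoding and decoding (my own illustration, assuming the `tokenizer` from step 11; the exact numbers are made up, since they depend entirely on the tokenizer's vocabulary):

```python
# Encode: text -> token ids (the numbers the model actually sees).
ids = tokenizer("Hello world")["input_ids"]
print(ids)  # e.g. [15496, 995] -- the real values vary by tokenizer

# Decode: token ids -> text, reversing the encoding.
print(tokenizer.decode(ids))  # "Hello world"
```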
12. Create the inputs for the model to process
- Define a new variable `input_text` that will take the prompt you want to give the model. In this case I asked "Who are you?", but you can choose whatever you prefer.
- Pass the new variable as an argument to the tokenizer object to create the `input_ids`
- Pass a second argument to the tokenizer object, `return_tensors="pt"`; this ensures the token ids are represented as the correct kind of vector for the model version we are using (i.e. in PyTorch, not TensorFlow)

```python
input_text = "Who are you?"
input_ids = tokenizer(input_text, return_tensors="pt")
```
13. Run generation and decode the output
- Now that the input is in the right format, we need to pass it into the model. We do this by calling the `.generate` method on the `model` object, passing the `input_ids` as an argument and assigning the result to a new variable, `outputs`. We also set a second argument, `max_new_tokens`, equal to 100; this limits the number of tokens the model will generate.
- The outputs are not human readable yet; to return them to text we must decode the output. We can do this with the `.decode` method, saving the result to the variable `decoded_outputs`
- Finally, passing the `decoded_outputs` variable into the print function allows us to see the model output in our notebook.
- Optional: Pass the `outputs` variable into the print function to see how it compares to the `decoded_outputs`

```python
outputs = model.generate(input_ids["input_ids"], max_new_tokens=100)
decoded_outputs = tokenizer.decode(outputs[0])
print(decoded_outputs)
```
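If you do try the optional step, the raw `outputs` will look something like the sketch below (illustrative only; the actual ids depend on the model and your prompt):

```python
print(outputs)
# Illustrative output: a tensor of token ids, not words, e.g.
# tensor([[ 8241,  389,  345,   30,  ...]])
```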
## Why do I need to decode?

Models only understand numbers, so when we provided our `input_ids` as vectors, the model returned an output in the same format. To return those outputs to text, we need to reverse the initial encoding we did using the tokenizer.

## Why does the output not make sense?
Remember that Phi-2 is a base model that hasn't been instruction-tuned for conversational use; as such, it's effectively a massive auto-complete model. Based on your input, it is predicting what it thinks is most likely to come next, drawing on all the web pages, books and other content it has seen previously.

Congratulations, you've run inference on your very first LLM!

I hope that working through this example helped you to better understand the world of open-source ML. If you want to continue your ML learning journey, I recommend the recent [Hugging Face course](https://www.deeplearning.ai/short-courses/open-source-models-hugging-face/) we released in partnership with DeepLearning AI.
