Adds evaluation links to concepts, how tos, tutorials (#5790)

jacoblee93 · web-flow · commit c2c20fedf5b6 · 2024-06-17T17:41:37.000-07:00
diff --git a/docs/core_docs/docs/concepts.mdx b/docs/core_docs/docs/concepts.mdx
@@ -658,6 +658,7 @@ Most modules in LangChain include the `.stream()` method as an ergonomic streami
 ```ts
 import { ChatAnthropic } from "@langchain/anthropic";
 import { concat } from "@langchain/core/utils/stream";
+import type { AIMessageChunk } from "@langchain/core/messages";
 
 const model = new ChatAnthropic({ model: "claude-3-sonnet-20240229" });
 
@@ -673,6 +674,8 @@ for await (const chunk of stream) {
     gathered = concat(gathered, chunk);
   }
 }
+
+console.log(gathered);
 ```
 
 For models (or other components) that don't support streaming natively, this iterator would just yield a single chunk, but
@@ -1085,6 +1088,24 @@ Table columns:
 | Token     | [many classes](/docs/how_to/split_by_token/)                            | Tokens                                |               | Splits text on tokens. There exist a few different ways to measure tokens.                                                                                  |
 | Character | [CharacterTextSplitter](/docs/how_to/character_text_splitter/)          | A user defined character              |               | Splits text based on a user defined character. One of the simpler methods.                                                                                  |
 
+### Evaluation
+
+<span data-heading-keywords="evaluation,evaluate"></span>
+
+Evaluation is the process of assessing the performance and effectiveness of your LLM-powered applications.
+It involves testing the model's responses against a set of predefined criteria or benchmarks to ensure they meet the desired quality standards and fulfill the intended purpose.
+This process is vital for building reliable applications.
+
+![](/img/langsmith_evaluate.png)
+
+[LangSmith](https://docs.smith.langchain.com/) helps with this process in a few ways:
+
+- It makes it easier to create and curate datasets via its tracing and annotation features
+- It provides an evaluation framework that helps you define metrics and run your app against your dataset
+- It allows you to track results over time and automatically run your evaluators on a schedule or as part of CI/CD
+
+To learn more, check out [this LangSmith guide](https://docs.smith.langchain.com/concepts/evaluation).
+
 ### Generative UI
 
 LangChain.js provides a few templates and examples showing off generative UI,
diff --git a/docs/core_docs/docs/how_to/index.mdx b/docs/core_docs/docs/how_to/index.mdx
@@ -274,7 +274,16 @@ You can peruse [LangGraph.js how-to guides here](https://langchain-ai.github.io/
 ## [LangSmith](https://docs.smith.langchain.com/)
 
 LangSmith allows you to closely trace, monitor and evaluate your LLM application.
-It seamlessly integrates with LangChain, and you can use it to inspect and debug individual steps of your chains as you build.
+It seamlessly integrates with LangChain and LangGraph.js, and you can use it to inspect and debug individual steps of your chains as you build.
 
 LangSmith documentation is hosted on a separate site.
 You can peruse [LangSmith how-to guides here](https://docs.smith.langchain.com/how_to_guides/).
+
+### Evaluation
+
+<span data-heading-keywords="evaluation,evaluate"></span>
+
+Evaluating performance is a vital part of building LLM-powered applications.
+LangSmith helps with every step of the process, from creating a dataset to defining metrics to running evaluators.
+
+To learn more, check out the [LangSmith evaluation how-to guides](https://docs.smith.langchain.com/how_to_guides/evaluation).
diff --git a/docs/core_docs/docs/tutorials/index.mdx b/docs/core_docs/docs/tutorials/index.mdx
@@ -7,13 +7,13 @@ sidebar_class_name: hidden
 
 New to LangChain or to LLM app development in general? Read this material to quickly get up and running.
 
-### Basics
+## Basics
 
 - [Build a Simple LLM Application with LCEL](/docs/tutorials/llm_chain)
 - [Build a Chatbot](/docs/tutorials/chatbot)
 - [Build an Agent](/docs/tutorials/agents)
 
-### Working with external knowledge
+## Working with external knowledge
 
 - [Build a Retrieval Augmented Generation (RAG) Application](/docs/tutorials/rag)
 - [Build a Conversational RAG Application](/docs/tutorials/qa_chat_history)
@@ -23,24 +23,30 @@ New to LangChain or to LLM app development in general? Read this material to qui
 - [Build a Question Answering application over a Graph Database](/docs/tutorials/graph)
 - [Build a PDF ingestion and Question/Answering system](/docs/tutorials/pdf_qa/)
 
-### Specialized tasks
+## Specialized tasks
 
 - [Build an Extraction Chain](/docs/tutorials/extraction)
 - [Classify text into labels](/docs/tutorials/classification)
 - [Summarize text](/docs/tutorials/summarization)
 
-### LangGraph.js
+## LangGraph.js
 
 LangGraph.js is an extension of LangChain aimed at
 building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
 
 LangGraph.js documentation is currently hosted on a separate site.
 You can peruse [LangGraph.js tutorials here](https://langchain-ai.github.io/langgraphjs/tutorials/).
 
-### LangSmith
+## LangSmith
 
 LangSmith allows you to closely trace, monitor and evaluate your LLM application.
 It seamlessly integrates with LangChain, and you can use it to inspect and debug individual steps of your chains as you build.
 
 LangSmith documentation is hosted on a separate site.
 You can peruse [LangSmith tutorials here](https://docs.smith.langchain.com/tutorials/).
+
+### Evaluation
+
+LangSmith helps you evaluate the performance of your LLM applications. The tutorial below is a great way to get started:
+
+- [Evaluate your LLM application](https://docs.smith.langchain.com/tutorials/Developers/evaluation)
diff --git a/docs/core_docs/static/img/langsmith_evaluate.png b/docs/core_docs/static/img/langsmith_evaluate.png