# Quick Start

## Creating a Text Summarization & Keyword Extraction Prompt

First, let's create a prompt. In this quick start guide, we'll create a prompt that takes a text as input and outputs a summary and keywords of that text. In other words, we'll create a prompt that implements the function `(text: str) -> (summary: str, keywords: List[str])`.
```mermaid
graph TD
    Input("text: str") --> Function["Text Summarization & Keyword Extraction"]
    Function --> Output1("summary: str")
    Function --> Output2("keywords: List[str]")
```
PromptoGen provides a data class, `pg.Prompt`, to represent prompts. We will create a prompt using this data class. It inherits from `pydantic.BaseModel`.

To create a prompt, the following information is needed:
| Item | Argument Name | Type |
| --- | --- | --- |
| Prompt Name | `name` | `str` |
| Prompt Description | `description` | `str` |
| List of Input Parameters | `input_parameters` | `List[pg.ParameterInfo]` |
| List of Output Parameters | `output_parameters` | `List[pg.ParameterInfo]` |
| Input/Output Template | `template` | `pg.IOExample` |
| List of Input/Output Examples | `examples` | `List[pg.IOExample]` |
Using this information, we'll create the prompt.
```python
import promptogen as pg

summarizer = pg.Prompt(
    name="Text Summarizer and Keyword Extractor",
    description="Summarize text and extract keywords.",
    input_parameters=[
        pg.ParameterInfo(name="text", description="Text to summarize"),
    ],
    output_parameters=[
        pg.ParameterInfo(name="summary", description="Summary of text"),
        pg.ParameterInfo(name="keywords", description="Keywords extracted from text"),
    ],
    template=pg.IOExample(
        input={'text': "This is a sample text to summarize."},
        output={
            'summary': "This is a summary of the text.",
            'keywords': ["sample", "text", "summarize"],
        },
    ),
    examples=[
        pg.IOExample(
            input={'text': "One sunny afternoon, a group of friends decided to gather at the nearby park to engage in various games and activities. They played soccer, badminton, and basketball, laughing and enjoying each other's company while creating unforgettable memories together."},
            output={
                'summary': "A group of friends enjoyed an afternoon playing sports and making memories at a local park.",
                'keywords': ["friends", "park", "sports", "memories"],
            },
        )
    ],
)
```
## Formatting the Prompt as a String without Input Parameters

First, let's try formatting the prompt as a string without any input parameters. With PromptoGen, you can flexibly create formatters to turn prompts into strings. Here, we'll use a formatter called `KeyValuePromptFormatter`, which outputs the keys and values of input/output variables in the form `key: value`.

To format a string without input parameters, use the formatter's `format_prompt_without_input` method. This method takes the prompt as an argument and formats it into a string.
```python
import promptogen as pg

summarizer = pg.Prompt(
    name="Text Summarizer and Keyword Extractor",
    # ... (other parameters omitted) ...
)

formatter = pg.KeyValuePromptFormatter()
print(formatter.format_prompt_without_input(summarizer))
```
```
Summarize text and extract keywords.

Input Parameters:
  - text: Text to summarize

Output Parameters:
  - summary: Summary of text
  - keywords: Keywords extracted from text

Template:
Input:
text: "This is a sample text to summarize."
Output:
summary: """This is a summary of the text."""
keywords: [
  "sample",
  "text",
  "summarize"
]

Example 1:
Input:
text: "One sunny afternoon, a group of friends decided to gather at the nearby park to engage in various games and activities. They played soccer, badminton, and basketball, laughing and enjoying each other's company while creating unforgettable memories together."
Output:
summary: """A group of friends enjoyed an afternoon playing sports and making memories at a local park."""
keywords: [
  "friends",
  "park",
  "sports",
  "memories"
]
```
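The `key: value` rendering above can be sketched with a minimal stand-in. This is an illustration only, not `KeyValuePromptFormatter`'s actual implementation; the real formatter has its own conventions, such as triple-quoted strings and multi-line lists for output values.

```python
import json

def format_key_values(values: dict) -> str:
    # Render each entry as `key: value`, JSON-encoding the value.
    # (Illustrative sketch; the real KeyValuePromptFormatter formats
    # strings and lists with its own, richer conventions.)
    return "\n".join(f"{key}: {json.dumps(value)}" for key, value in values.items())

print(format_key_values({'text': "This is a sample text to summarize."}))
# text: "This is a sample text to summarize."
```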
## Formatting the Prompt as a String with Input Parameters

Next, let's try formatting the prompt as a string with input parameters. Input parameters are specified using a `dict`. To format a string with input parameters, use the `format_prompt` method.
```python
import promptogen as pg

summarizer = pg.Prompt(
    name="Text Summarizer and Keyword Extractor",
    # ... (other parameters omitted) ...
)
formatter = pg.KeyValuePromptFormatter()

input_value = {
    'text': "In the realm of software engineering, developers often collaborate on projects using version control systems like Git. They work together to create and maintain well-structured, efficient code, and tackle issues that arise from implementation complexities, evolving user requirements, and system optimization.",
}
print(formatter.format_prompt(summarizer, input_value))
```
```
Summarize text and extract keywords.

Input Parameters:
  - text: Text to summarize

Output Parameters:
  - summary: Summary of text
  - keywords: Keywords extracted from text

Template:
Input:
text: "This is a sample text to summarize."
Output:
summary: """This is a summary of the text."""
keywords: [
  "sample",
  "text",
  "summarize"
]

Example 1:
Input:
text: "One sunny afternoon, a group of friends decided to gather at the nearby park to engage in various games and activities. They played soccer, badminton, and basketball, laughing and enjoying each other's company while creating unforgettable memories together."
Output:
summary: """A group of friends enjoyed an afternoon playing sports and making memories at a local park."""
keywords: [
  "friends",
  "park",
  "sports",
  "memories"
]

--------

Input:
text: "In the realm of software engineering, developers often collaborate on projects using version control systems like Git. They work together to create and maintain well-structured, efficient code, and tackle issues that arise from implementation complexities, evolving user requirements, and system optimization."
Output:
```
## Generating Output Using a Large Language Model

Next, let's try generating output from a large language model. In PromptoGen, communication with a large language model is done through an abstract class called `TextLLM`, which handles large language models uniformly. `pg.FunctionBasedTextLLM` is an implementation of `TextLLM` that generates output from a large language model via a user-supplied function.
```python
import promptogen as pg

def generate_text_by_text(text: str) -> str:
    # Call the large language model here and return its output
    return "<generated text>"

text_llm = pg.FunctionBasedTextLLM(
    generate_text_by_text=generate_text_by_text,
)
```
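To make the function-based pattern concrete, here is a self-contained stand-in for this kind of wrapper. It is a sketch, not PromptoGen's actual code; it assumes only that a text LLM exposes a `generate` method taking and returning a string, as used later in this guide.

```python
from typing import Callable

class FunctionBasedLLMStub:
    """Minimal stand-in: wrap a str -> str function behind a generate() method."""

    def __init__(self, generate_text_by_text: Callable[[str], str]):
        self._generate_text_by_text = generate_text_by_text

    def generate(self, text: str) -> str:
        # Delegate to the user-supplied function
        return self._generate_text_by_text(text)

stub = FunctionBasedLLMStub(lambda text: "<generated text>")
print(stub.generate("any prompt"))  # <generated text>
```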
## Example: Generating Output Using the OpenAI ChatGPT API

This library does not itself provide a feature for generating output from large language models, but one can be implemented with APIs such as the OpenAI ChatGPT API. Here, let's try generating a summary of the input text using the OpenAI ChatGPT API. First, set the OpenAI API key and organization ID as environment variables.
```python
import os

import openai
import promptogen as pg

openai.api_key = os.getenv("OPENAI_API_KEY")
openai.organization = os.getenv("OPENAI_ORG_ID")

def generate_chat_completion(text: str, model: str) -> str:
    resp = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": text}],
        max_tokens=2048,
        stream=True,
    )
    raw_resp = ""
    for chunk in resp:
        chunk_content = chunk["choices"][0]["delta"].get("content", "")
        raw_resp += chunk_content
    return raw_resp

text_llm = pg.FunctionBasedTextLLM(
    generate_text_by_text=lambda input_text: generate_chat_completion(input_text, "gpt-3.5-turbo"),
)
```
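The streaming loop above simply concatenates the `content` deltas from each chunk. Here is a self-contained simulation of that accumulation, with hard-coded chunks standing in for the API stream:

```python
# Hard-coded chunks mimicking the shape of streamed ChatCompletion deltas.
chunks = [
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},  # the final chunk carries no content
]

raw_resp = ""
for chunk in chunks:
    # .get("content", "") skips chunks without a content delta
    raw_resp += chunk["choices"][0]["delta"].get("content", "")

print(raw_resp)  # Hello, world
```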
Next, let's try formatting the prompt with input parameters and generating output from a large language model.
```python
import promptogen as pg

# ... (omitted) ...

text_llm = pg.FunctionBasedTextLLM(
    # ... (omitted) ...
)

summarizer = pg.Prompt(
    name="Text Summarizer and Keyword Extractor",
    # ... (other parameters omitted) ...
)

raw_req = formatter.format_prompt(summarizer, input_value)
print(raw_req)

raw_resp = text_llm.generate(raw_req)
print(raw_resp)
```
```
summary: """Software engineers collaborate using Git to create and maintain efficient code, and address implementation issues and user requirements."""
keywords: [
  "software engineering",
  "developers",
  "collaborate",
  "projects",
  "version control systems",
  "Git",
  "code",
  "implementation complexities",
  "user requirements",
  "system optimization"
]
```
## Converting Output to a Python Object

Since the LLM output is just a string, let's try converting it to a Python object. You can parse the LLM's output string with the formatter's `parse` method, which parses it against the prompt's output parameters. The parsing result is stored in a Python `dict`.
```python
import promptogen as pg

# ... (omitted) ...

text_llm = pg.FunctionBasedTextLLM(
    # ... (omitted) ...
)

summarizer = pg.Prompt(
    name="Text Summarizer and Keyword Extractor",
    # ... (other parameters omitted) ...
)

raw_req = formatter.format_prompt(summarizer, input_value)
print(raw_req)

raw_resp = text_llm.generate(raw_req)
print(raw_resp)

summarized_resp = formatter.parse(summarizer, raw_resp)
print(summarized_resp)
```
```
{'summary': 'Software engineers collaborate using Git to create and maintain efficient code, and address implementation issues and user requirements.', 'keywords': ['software engineering', 'developers', 'collaborate', 'projects', 'version control systems', 'Git', 'code', 'implementation complexities', 'user requirements', 'system optimization']}
```

This output is a `dict` containing the parsed results from the LLM output string.
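With the parsed `dict` in hand, the `(text: str) -> (summary: str, keywords: List[str])` function from the start of this guide can be realized by chaining the three steps: format, generate, parse. Below is a sketch with stubbed steps; the lambdas merely stand in for the formatter and TextLLM calls shown above and are not part of the library.

```python
from typing import Callable, List, Tuple

def summarize(
    text: str,
    format_prompt: Callable[[dict], str],
    generate: Callable[[str], str],
    parse: Callable[[str], dict],
) -> Tuple[str, List[str]]:
    # Chain the three steps from this guide: format -> generate -> parse.
    raw_req = format_prompt({'text': text})
    raw_resp = generate(raw_req)
    parsed = parse(raw_resp)
    return parsed['summary'], parsed['keywords']

# Stubbed steps, just to demonstrate the data flow without an API call.
summary, keywords = summarize(
    "some text",
    format_prompt=lambda values: f"text: {values['text']}",
    generate=lambda req: 'summary: "a summary"\nkeywords: ["kw"]',
    parse=lambda resp: {'summary': "a summary", 'keywords': ["kw"]},
)
print(summary, keywords)  # a summary ['kw']
```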
## Conclusion

We've introduced the basics of using PromptoGen. The flow covered here is as follows:
- Define a prompt
- Define a formatter
- Use the formatter to format the prompt and input parameters into a string
- Generate output using a large language model
- Convert the output to a Python object
While we've shown a simple example here, PromptoGen allows for easy handling of more complex prompts and input/output parameters.
Furthermore, it's possible to specify the prompt itself as an input or output parameter, allowing for dynamic generation of prompts.