Supplementary code for the Build a Large Language Model From Scratch book by Sebastian Raschka
Code repository: https://github.com/rasbt/LLMs-from-scratch
Generating An Instruction Dataset via Llama 3 and Ollama
This notebook uses an 8-billion-parameter Llama 3 model through ollama to generate a synthetic dataset using the “hack” proposed in the “Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing” paper (https://arxiv.org/abs/2406.08464)
The generated dataset will be an instruction dataset with “instruction” and “output” fields, similar to what can be found in Alpaca:
{
    "instruction": "What is the atomic number of helium?",
    "output": "The atomic number of helium is 2."
}
The code doesn’t require a GPU and runs on a laptop (it was tested on an M3 MacBook Air)
Note that the instruction datasets created here are for educational purposes. However, it is the users’ duty to ensure that their use adheres to the terms of the relevant licensing agreements with Meta AI’s Llama 3.
from importlib.metadata import version

pkgs = [
    "tqdm",  # Progress bar
]

for p in pkgs:
    print(f"{p} version: {version(p)}")
tqdm version: 4.65.0
Installing Ollama and Downloading Llama 3
Ollama is an application to run LLMs efficiently
It is a wrapper around llama.cpp, which implements LLMs in pure C/C++ to maximize efficiency
Note that it is a tool for using LLMs to generate text (inference), not training or finetuning LLMs
Prior to running the code below, install ollama by visiting https://ollama.com and following the instructions (for instance, clicking on the “Download” button and downloading the ollama application for your operating system)
For macOS and Windows users, click on the ollama application you downloaded; if it prompts you to install the command line usage, say “yes”
Linux users can use the installation command provided on the ollama website
In general, before we can use ollama from the command line, we have to either start the ollama application or run ollama serve in a separate terminal

With the ollama application or ollama serve running, open a different terminal and execute the following command to try out the 8-billion-parameter Llama 3 model (the model, which takes up 4.7 GB of storage space, will be automatically downloaded the first time you execute this command)
# 8B model
ollama run llama3
The output looks as follows:
$ ollama run llama3
pulling manifest
pulling 6a0746a1ec1a... 100% ▕████████████████▏ 4.7 GB
pulling 4fa551d4f938... 100% ▕████████████████▏ 12 KB
pulling 8ab4849b038c... 100% ▕████████████████▏ 254 B
pulling 577073ffcc6c... 100% ▕████████████████▏ 110 B
pulling 3f8eb4da87fa... 100% ▕████████████████▏ 485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Note that llama3 refers to the instruction-finetuned 8-billion-parameter Llama 3 model
Alternatively, you can also use the larger 70-billion-parameter Llama 3 model, if your machine supports it, by replacing llama3 with llama3:70b
After the download has been completed, you will see a command line prompt that allows you to chat with the model
Try a prompt like “What do llamas eat?”, which should return an output similar to the following:
>>> What do llamas eat?
Llamas are ruminant animals, which means they have a four-chambered
stomach and eat plants that are high in fiber. In the wild, llamas
typically feed on:
1. Grasses: They love to graze on various types of grasses, including tall
grasses, wheat, oats, and barley.
You can end this session using the input /bye
Using Ollama’s REST API
An alternative way to interact with the model is through its REST API, which we can call from Python using the following function
Before you run the next cells in this notebook, make sure that ollama is still running, as described above, either via ollama serve in a terminal or via the ollama application
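A quick way to confirm that the server is reachable before sending prompts is to request its base URL. The sketch below is not part of the original notebook: the helper name ollama_is_running is hypothetical, and it assumes that ollama listens at http://localhost:11434, the same host and port used by the chat endpoint in the next cell. It only checks that some HTTP server answers with status 200, not that a particular model has been downloaded.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url with status 200.

    Hypothetical helper for illustration; the default address is an
    assumption taken from the URL used by query_model.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error status, DNS failure, etc.
        return False


if __name__ == "__main__":
    if ollama_is_running():
        print("ollama is reachable")
    else:
        print("ollama is not reachable; start the app or run ollama serve")
```

Returning a boolean instead of raising keeps the check easy to place in front of a longer generation run.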
Next, run the following code cell to query the model
First, let’s try the API with a simple example to make sure it works as intended:
import urllib.request
import json


def query_model(prompt, model="llama3", url="http://localhost:11434/api/chat", role="user"):
    # Create the data payload as a dictionary
    data = {
        "model": model,
        "seed": 123,  # for deterministic responses
        "temperature": 1.,  # sampling temperature
        "top_p": 1,
        "messages": [
            {"role": role, "content": prompt}
        ]
    }

    # Convert the dictionary to a JSON formatted string and encode it to bytes
    payload = json.dumps(data).encode("utf-8")

    # Create a request object, setting the method to POST and adding necessary headers
    request = urllib.request.Request(url, data=payload, method="POST")
    request.add_header("Content-Type", "application/json")

    # Send the request and capture the response
    response_data = ""
    with urllib.request.urlopen(request) as response:
        # Read and decode the response line by line
        while True:
            line = response.readline().decode("utf-8")
            if not line:
                break
            response_json = json.loads(line)
            response_data += response_json["message"]["content"]

    return response_data


result = query_model("What do Llamas eat?")
print(result)
Note that the output recorded for this cell is an error rather than a model response. It indicates that no server was listening at http://localhost:11434 when the cell was executed; in that case, the call fails with
URLError: <urlopen error [Errno 61] Connection refused>
If you see this error, start the ollama application or run ollama serve in a terminal and execute the cell again
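The loop inside query_model treats the response body as a sequence of newline-delimited JSON objects and concatenates the message content of each one. The sketch below isolates that step so it can be tried without a running server. The function name collect_stream_content and the three sample chunks are made up for illustration; real responses carry additional fields.

```python
import json


def collect_stream_content(lines):
    """Concatenate message content across newline-delimited JSON chunks.

    Follows the same accumulation idea as the loop in query_model, but
    reads from any iterable of bytes lines instead of an HTTP response.
    """
    parts = []
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk["message"]["content"])
    return "".join(parts)


# Hand-written sample chunks (hypothetical, for illustration only)
sample = [
    b'{"message": {"role": "assistant", "content": "Llamas eat "}, "done": false}\n',
    b'{"message": {"role": "assistant", "content": "grass."}, "done": false}\n',
    b'{"message": {"role": "assistant", "content": ""}, "done": true}\n',
]

print(collect_stream_content(sample))  # prints: Llamas eat grass.
```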
Extract Instructions
Now, let’s use the “hack” proposed in the paper: we provide the empty prompt template "<|begin_of_text|><|start_header_id|>user<|end_header_id|>" as the prompt, which will cause the instruction-finetuned Llama 3 model to generate an instruction
def extract_instruction(text):
    # Return the first non-empty line of the generated text
    # (implicitly returns None if every line is empty)
    for content in text.split("\n"):
        if content:
            return content.strip()


query = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>"

result = query_model(query, role="assistant")
instruction = extract_instruction(result)
print(instruction)
I am trying to find a way to make my child's birthday party more special and unique. What are some creative ideas you have?
As we can see above, surprisingly, the model indeed generated an instruction
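To see why the bare header can work as a prompt, it helps to look at how a single user turn is laid out in the Llama 3 chat format. The sketch below assembles such a turn by hand. The helper format_user_turn is hypothetical, and the layout is an assumption based on Meta's published Llama 3 prompt format, so treat it as an illustration rather than the exact string that ollama sends to the model. The pre-query template is the part of the turn that comes before the user's text, so a model continuing from it tends to fill in a plausible user instruction.

```python
BEGIN = "<|begin_of_text|>"
USER_HEADER = "<|start_header_id|>user<|end_header_id|>"
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"
END_OF_TURN = "<|eot_id|>"


def format_user_turn(instruction):
    """Lay out one user turn followed by an open assistant header.

    Hypothetical helper; the layout is an assumption (see the text above).
    """
    return (
        f"{BEGIN}{USER_HEADER}\n\n{instruction}{END_OF_TURN}"
        f"{ASSISTANT_HEADER}\n\n"
    )


# The pre-query template from the notebook: everything before the user's text
pre_query_template = BEGIN + USER_HEADER

full_prompt = format_user_turn("What is the atomic number of helium?")
print(full_prompt.startswith(pre_query_template))  # prints: True
```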
Generate Responses
Now, the next step is to create the corresponding response, which can be done by simply passing the instruction as input
response = query_model(instruction, role="user")
print(response)
What an exciting question! I'd be delighted to help you come up with some creative and unique ideas to make your child's birthday party truly special!
Here are a few ideas to get you started:
1. **Themed Scavenger Hunt**: Plan a scavenger hunt based on the birthday child's favorite theme (e.g., superheroes, animals, or princesses). Hide clues and challenges throughout the party area, leading up to a final surprise.
2. **DIY Crafts Station**: Set up a craft station where kids can create their own party favors, such as customized t-shirts, crowns, or jewelry. This activity encourages creativity and makes for a memorable keepsake.
3. **Mystery Box Challenge**: Fill mystery boxes with different textures, smells, and sounds. Have the kids guess what's inside each box without looking. This game promotes problem-solving and teamwork.
4. **Indoor Camping Adventure**: Set up a cozy indoor "camping" area with sleeping bags, flashlights, and s'mores-making stations. Kids can enjoy a camping experience without leaving the party location.
5. **Personalized Photo Booth**: Create a customized photo booth with props and backdrops that match the birthday child's theme. This activity allows kids to take home special memories and share them on social media.
6. **Foodie Fun**: Plan a cooking or baking station where kids can make their own treats, such as cupcakes, pizzas, or trail mix. This activity teaches valuable skills and lets kids enjoy their creations.
7. **Outdoor Movie Night**: Set up an outdoor movie screen (or projector) with cozy seating and snacks. Screen the birthday child's favorite film or a classic kid-friendly movie.
8. **Science Experiments**: Host a science-themed party where kids can conduct fun experiments, such as making slime, creating lava lamps, or growing crystals.
9. **Karaoke Contest**: Set up a karaoke machine with popular kids' songs and have a singing competition. Offer prizes for the best performances, and provide fun props like microphones and costumes.
10. **Time Capsule Ceremony**: Have each guest bring a small item that represents their favorite memory or something they're looking forward to in the future. Bury the time capsule together as a group, with instructions to open it on a specific date (e.g., next year's birthday party).
11. **Special Guest Appearance**: Arrange for a special guest, such as a superhero, princess, or even a real-life animal (if feasible), to make an appearance at the party.
12. **Customized Games**: Design custom games and activities that fit the birthday child's interests and personality. This could include a customized version of a favorite game or a new game altogether.
Remember, the key to making your child's birthday party unique is to incorporate elements that reflect their personality and interests. Mix and match these ideas or come up with something entirely new – the possibilities are endless!
What do you think? Is there anything in particular that resonates with you, or would you like more suggestions?
Generate Dataset
We can scale up this approach to an arbitrary number of data samples (you may want to apply some optional filtering by length or quality, e.g., using another LLM to rate the generated data)
Below, we generate 5 synthetic instruction-response pairs, which takes about 3 minutes on an M3 MacBook Air
(To generate a dataset suitable for instruction finetuning, we want to increase this to somewhere between 1k and 50k samples and perhaps run it on a GPU to generate the examples in a more timely fashion)
Tip
You can generate even higher-quality responses by changing model="llama3" to model="llama3:70b"; however, this will require more computational resources
from tqdm import tqdm

dataset_size = 5
dataset = []

for i in tqdm(range(dataset_size)):

    result = query_model(query, role="assistant")
    instruction = extract_instruction(result)
    response = query_model(instruction, role="user")
    entry = {
        "instruction": instruction,
        "output": response
    }
    dataset.append(entry)
100%|█████████████████████████████████████████████| 5/5 [02:37<00:00, 31.41s/it]
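The progress bar above reports about 31 seconds per instruction-response pair. Assuming that generation time scales linearly with the number of samples (a simplification, since response lengths vary), a back-of-the-envelope estimate shows why faster hardware becomes attractive for larger datasets: 1,000 samples would take roughly 9 hours on the same laptop, and 50,000 samples roughly 18 days. The helper name estimated_hours is made up for this sketch.

```python
SECONDS_PER_SAMPLE = 31.41  # taken from the progress bar above (M3 MacBook Air)


def estimated_hours(num_samples, seconds_per_sample=SECONDS_PER_SAMPLE):
    """Linear extrapolation of the total generation time in hours."""
    return num_samples * seconds_per_sample / 3600


for n in (1_000, 50_000):
    hours = estimated_hours(n)
    print(f"{n:>6} samples: about {hours:.0f} hours ({hours / 24:.1f} days)")
```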
with open("instruction-data-llama3-7b.json", "w") as file:
    json.dump(dataset, file, indent=4)
!cat instruction-data-llama3-7b.json
[
{
"instruction": "What is the significance of the number 7 in various cultures and religions?",
"output": "The number 7 has been a significant and recurring theme across many cultures and religions, often imbuing it with special meaning and symbolism. Here are some examples:\n\n1. **Numerology**: In numerology, the number 7 is considered sacred and mystical, associated with spiritual awakening, introspection, and enlightenment.\n2. **Judaism**: The Torah has seven days of creation, seven weeks in the wilderness, and seven years of rest (Sabbatical year). Seven is also a symbol of completion or perfection.\n3. **Christianity**: In Christianity, there are seven deadly sins, seven virtues, and seven sacraments. Jesus was said to have spoken seven sermons, and the number 7 appears in various biblical accounts, such as the seven days of creation and the seven angels who appear before God.\n4. **Islam**: In Islamic tradition, there are seven heavens, seven earths, and seven veils that separate the physical world from the divine realm. The Quran mentions seven verses (Ayats) that were revealed to Muhammad in a single revelation.\n5. **Hinduism**: In Hindu mythology, there are seven chakras (energy centers) in the human body, seven colors of the rainbow, and seven planets (including the sun and moon). The number 7 is also associated with the seven aspects of the divine feminine (Saptamatrikas).\n6. **Buddhism**: In Buddhist cosmology, there are seven levels of existence (dhatus) and seven stages of enlightenment (bodhisattva path). The seven days of meditation retreats in Tibetan Buddhism are also a nod to this number.\n7. **Ancient Greek and Roman cultures**: The number 7 was considered perfect or complete, with seven being the number of planets known at the time (including Earth, Sun, Moon, Mars, Mercury, Jupiter, and Venus). Seven days were also the typical duration of a week in these cultures.\n8. **Chinese culture**: In Chinese astrology, there are seven celestial stems and 12 earthly branches, creating a system of cyclical time reckoning. 
The number 7 is also associated with good luck and prosperity.\n9. **African cultures**: In some African traditions, the number 7 is associated with spiritual powers, such as divination and healing. Seven is also considered a sacred number in Yoruba culture, representing the seven orishas (deities).\n10. **Esoteric traditions**: Many esoteric traditions, including Hermeticism, Gnosticism, and Theosophy, have employed the number 7 to represent spiritual principles, such as the seven planes of existence, seven chakras, or seven sacred seals.\n\nThese examples illustrate the widespread significance of the number 7 across various cultures and religions. While the specific meanings may vary, the number 7 often represents completion, perfection, spiritual growth, and mystical connections."
},
{
"instruction": "I've been wondering, what is the most important thing you can learn from a child?",
"output": "What a great question!\n\nChildren have so much to teach us, and I think one of the most important things we can learn from them is the power of simplicity, curiosity, and joy. Children are often unencumbered by the complexities and worries that come with adulthood, and they approach life with a fresh perspective.\n\nHere are some specific things children can teach us:\n\n1. **Simplicity**: Children are masters of simplifying things. They don't get bogged down in details or worry about what others think. They just enjoy the moment and find joy in simple things like playing outside, drawing, or reading.\n2. **Curiosity**: Kids are naturally curious and love to explore their world. They ask questions, investigate, and learn from experience. This curiosity can be a powerful reminder to us adults to stay open-minded and seek new knowledge and experiences.\n3. **Joy and playfulness**: Children have an amazing capacity for joy and playfulness. They find happiness in the simplest things, like playing with blocks or having a water balloon fight. We could all benefit from embracing our inner child and finding more joy in life's simple pleasures.\n4. **Unconditional love**: Children are capable of loving unconditionally, without judgment or expectation. This can be a powerful reminder to us adults to practice self-love, self-acceptance, and kindness towards others.\n5. **Creativity and imagination**: Kids are incredibly creative and imaginative, often thinking outside the box and coming up with innovative solutions. This creativity can inspire us adults to think differently, try new things, and approach problems from unique angles.\n6. **Resilience**: Children are surprisingly resilient in the face of challenges and setbacks. 
They learn to adapt, cope, and move forward, teaching us valuable lessons about perseverance and bouncing back from adversity.\n\nIn summary, learning from children can help us regain a sense of simplicity, curiosity, joy, and playfulness, while also reminding us of the importance of unconditional love, creativity, and resilience."
},
{
"instruction": null,
"output": ""
},
{
"instruction": "What is the best way to deal with a difficult person?",
"output": "Dealing with a difficult person can be challenging, but there are strategies that can help you navigate the situation effectively. Here are some tips:\n\n1. **Stay calm**: Take a deep breath and try not to take their behavior personally. A calm demeanor can help de-escalate tensions and prevent misunderstandings.\n2. **Listen actively**: Sometimes, people act out because they feel unheard or misunderstood. Make an effort to listen carefully to what they're saying, and respond thoughtfully.\n3. **Set boundaries**: Establish clear limits on what you are and aren't willing to engage in. Be firm but respectful when communicating your needs.\n4. **Avoid taking the bait**: Don't let their provocations get under your skin. Stay focused on the issue at hand and avoid getting drawn into an argument or debate.\n5. **Use \"I\" statements**: When expressing yourself, use \"I\" statements instead of \"you\" statements, which can come across as accusatory. This helps to reduce defensiveness and promotes a more constructive conversation.\n6. **Practice empathy**: Try to understand where the other person is coming from, even if you don't agree with their perspective. Showing that you care about their feelings can help diffuse tension.\n7. **Don't take it personally**: Remember that the difficult person's behavior is often a reflection of themselves, not you. Keep your self-worth and confidence intact.\n8. **Seek common ground**: Look for areas of agreement or shared interests. This can help to build bridges and create a more positive atmosphere.\n9. **Use humor (carefully)**: A well-timed, lighthearted joke or witty remark can help diffuse tension and lighten the mood. However, be cautious not to offend or make light of serious issues.\n10. **Know when to disengage**: If the situation becomes too heated or toxic, it may be necessary to take a step back and re-engage at a later time when emotions have cooled down.\n11. 
**Seek support**: Don't be afraid to ask for help from friends, family, or a professional if you're struggling to manage your interactions with a difficult person.\n12. **Practice self-care**: Take care of yourself physically, emotionally, and mentally. Engage in activities that bring you joy and help you maintain your energy and resilience.\n\nRemember, dealing with a difficult person is not about winning an argument or changing their behavior; it's about maintaining your own emotional well-being and responding constructively to the situation."
},
{
"instruction": "I'm looking for a way to get my cat's attention when they're hiding under the bed or in a closet.",
"output": "The classic \"where'd my cat go?\" conundrum! Don't worry, I've got some tips to help you coax your kitty out from their hiding spots:\n\n1. **Use their favorite treats**: Cats love food, and familiar treats can be a powerful lure. Try calling your cat's name and saying \"treat time\" in a playful tone. This might encourage them to emerge and investigate.\n2. **Make some noise**: Cats have poor eyesight but excellent hearing. Gently knock on the bed frame or closet door with your knuckles, making a soft, rhythmic sound. This can help your cat pinpoint where you are and entice them to come out.\n3. **Speak softly and calmly**: When speaking to your cat, use a gentle, soothing tone. Avoid loud or harsh voices, as these might scare them even further into hiding.\n4. **Use verbal cues**: Establish a consistent verbal cue, like \"come on out\" or \"let's play,\" which can become associated with the idea of leaving their hiding spot.\n5. **Create a \"safe zone\"**: If your cat is hiding due to fear or anxiety (e.g., from loud noises or other pets), try creating a safe, comfortable space for them to emerge into. This could be a cozy blanket or a familiar toy.\n6. **Patiently wait it out**: Sometimes, cats just need time and space to feel secure enough to come out. Give your cat the opportunity to leave their hiding spot at their own pace.\n7. **Use a flashlight (carefully)**: If your cat is hiding in a dark space, try using a flashlight to create a gentle beam of light. Be cautious not to shine it directly into their eyes, as this could startle them further.\n8. **Offer a familiar object**: Place a familiar toy or blanket near the entrance to the hiding spot, which can help your cat feel more comfortable coming out.\n9. **Make the space inviting**: If your cat is hiding under the bed, try moving any clutter or dust bunnies away from the area. Make the space underneath the bed a pleasant place for them to emerge into.\n10. 
**Be patient and don't force it**: Respect your cat's boundaries and allow them to come out when they're ready. Forcing them to leave their hiding spot can create negative associations and make them more likely to hide in the future.\n\nRemember, every cat is different, so try a combination of these methods to see what works best for your feline friend."
}
]
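The saved file contains one entry whose instruction is null and whose output is empty, which is the kind of sample that the optional filtering step mentioned earlier would remove. A minimal sketch of such a filter follows. The function name filter_entries and the length thresholds are arbitrary choices for illustration, not values from the notebook or the Magpie paper.

```python
def filter_entries(entries, min_instruction_chars=10, max_output_chars=4000):
    """Drop entries that are empty, too short, too long, or duplicated.

    The thresholds are illustrative defaults (assumptions) that should be
    tuned for the dataset at hand.
    """
    seen = set()
    kept = []
    for entry in entries:
        instruction = (entry.get("instruction") or "").strip()
        output = (entry.get("output") or "").strip()
        if len(instruction) < min_instruction_chars or not output:
            continue  # missing or near-empty fields
        if len(output) > max_output_chars:
            continue  # overly long response
        key = instruction.lower()
        if key in seen:
            continue  # duplicate instruction (case-insensitive match)
        seen.add(key)
        kept.append({"instruction": instruction, "output": output})
    return kept


# Toy entries for illustration (not real model output)
toy = [
    {"instruction": "What do llamas eat?", "output": "Mostly grasses and hay."},
    {"instruction": None, "output": ""},
    {"instruction": "what do llamas eat?", "output": "Grass."},
]

print(len(filter_entries(toy)))  # prints: 1
```

Applied to the dataset list from the cell above, filter_entries(dataset) would return the entries with a usable instruction and a non-empty output; a quality rating by another LLM could be added as a further step.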