Defining a PGVector Retriever in DSPy
This notebook contains a simple example of how you can use DSPy's custom retrievers to create a dedicated retriever for pgvector.
First we'll define a vector store using LlamaIndex to wrap our database. In this example, the database is populated with some test data.
You need a Postgres instance with the pgvector extension running somewhere.
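Before going further, it can help to confirm that the database is reachable and that the pgvector extension is installed. The helper below is a minimal sketch and not part of DSPy or LlamaIndex: `has_pgvector` is a hypothetical name, and it assumes any DB-API cursor (for example one from `psycopg2`). It looks in the `pg_extension` catalog for the extension, which pgvector registers under the name `vector`.

```python
def has_pgvector(cur) -> bool:
    """Return True if the pgvector extension is installed in the current database.

    `cur` is any DB-API cursor, e.g. psycopg2.connect(...).cursor().
    """
    # pgvector registers itself in the catalog under the extension name "vector"
    cur.execute("SELECT 1 FROM pg_extension WHERE extname = %s", ("vector",))
    return cur.fetchone() is not None


# Usage against a live database (assumes the test credentials used in this notebook):
#   import psycopg2
#   with psycopg2.connect("postgresql://test:test@localhost:5432/test") as conn:
#       with conn.cursor() as cur:
#           print(has_pgvector(cur))
```

If the check comes back False, running `CREATE EXTENSION IF NOT EXISTS vector;` as a sufficiently privileged user is the usual fix.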
In [37]:
# import all the dependencies
from llama_index.core import SimpleDirectoryReader, StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore
import textwrap
import psycopg2
import psycopg2.sql  # the retriever below composes its query with psycopg2.sql
from sqlalchemy import make_url
import dspy
import os
from typing import List, Optional
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
from dspy.retrieve.pgvector_rm import PgVectorRM
In [2]:
# we'll use the classic Paul Graham dataset from LlamaIndex, but you can use whatever data you want
!mkdir -p 'data/paul_graham/'
!wget '' -O 'data/paul_graham/paul_graham_essay.txt'
--2024-10-16 08:11:46-- Resolving (,,, ... Connecting to (||:443... connected. HTTP request sent, awaiting response... 200 OK Length: 75042 (73K) [text/plain] Saving to: ‘data/paul_graham/paul_graham_essay.txt’ data/paul_graham/pa 100%[===================>] 73.28K --.-KB/s in 0.07s 2024-10-16 08:11:46 (1.04 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]
In [2]:
# read them in
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
print("Document ID:", documents[0].doc_id)
Document ID: deeb8de9-e597-4197-a4f8-ddc173813caa
In [11]:
# we'll use Hugging Face embeddings
HF_TOKEN: Optional[str] = os.getenv("HUGGING_FACE_TOKEN")
embedding_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.embed_model = embedding_model
# no llm for now
Settings.llm = None
LLM is explicitly disabled. Using MockLLM.
In [29]:
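# connect to the db
connection_string = "postgresql://test:test@localhost:5432/test"
db_name = "test"
conn = psycopg2.connect(connection_string)
conn.autocommit = True
url = make_url(connection_string)
# NOTE: the keyword arguments before hnsw_kwargs were lost from the original cell
# and are reconstructed here from the surrounding context
vector_store = PGVectorStore.from_params(
    database=db_name,
    host=url.host,
    password=url.password,
    port=url.port,
    user=url.username,
    table_name="paul_graham_essay",  # stored in Postgres as "data_paul_graham_essay"
    embed_dim=384,  # 384 is the dimension of the HF embeddings
    hnsw_kwargs={
        "hnsw_m": 16,
        "hnsw_ef_construction": 64,
        "hnsw_ef_search": 40,
        "hnsw_dist_method": "vector_cosine_ops",
    },
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, show_progress=True
)
query_engine = index.as_query_engine()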
Parsing nodes: 0%| | 0/1 [00:00<?, ?it/s]
Generating embeddings: 0%| | 0/22 [00:00<?, ?it/s]
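For reference, the connection string above packs the user, password, host, port and database name into a single URL, and `make_url` splits it back into those parts for `PGVectorStore.from_params`. The sketch below illustrates the same breakdown using only the standard library's `urllib.parse` as a stand-in for SQLAlchemy; `split_pg_url` is a hypothetical helper, not part of either library.

```python
from urllib.parse import urlsplit


def split_pg_url(connection_string: str) -> dict:
    """Split a postgresql:// URL into the parts a vector store needs."""
    parts = urlsplit(connection_string)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # the path is "/<database>", so drop the leading slash
        "database": parts.path.lstrip("/"),
    }


print(split_pg_url("postgresql://test:test@localhost:5432/test"))
```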
Now let's define the retriever.
We can inherit from PgVectorRM like so:
class DBRetriever(PgVectorRM):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
In [38]:
# now let's define a forward method, which DSPy will use to fetch the documents
class DBRetriever(PgVectorRM):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def forward(self, query: str, k: Optional[int] = None):
        # embed the query using the retriever's built-in helper
        query_embedding = self._get_embeddings(query)
        retrieved_docs = []
        fields = psycopg2.sql.SQL(",").join(
            [psycopg2.sql.Identifier(f) for f in self.fields]
        )
        if self.include_similarity:
            # cosine similarity: the closer to 1, the closer the match
            similarity_field = psycopg2.sql.SQL(",") + psycopg2.sql.SQL(
                "1 - ({embedding_field} <=> %s::vector) AS similarity",
            ).format(embedding_field=psycopg2.sql.Identifier(self.embedding_field))
            fields += similarity_field
            # the embedding is bound twice: once in SELECT, once in ORDER BY
            args = (query_embedding, query_embedding, k if k else self.k)
        else:
            args = (query_embedding, k if k else self.k)
        # our full sql query will look like this:
        # SELECT field1, field2, 1 - (embedding_field <=> %s::vector) AS similarity
        # FROM table_name
        # ORDER BY embedding_field <=> %s::vector
        # LIMIT %s
        sql_query = psycopg2.sql.SQL(
            "select {fields} from {table} order by {embedding_field} <=> %s::vector limit %s",
        ).format(
            fields=fields,
            table=psycopg2.sql.Identifier(self.pg_table_name),
            embedding_field=psycopg2.sql.Identifier(self.embedding_field),
        )
        with self.conn as conn:
            with conn.cursor() as cur:
                cur.execute(sql_query, args)
                rows = cur.fetchall()
                columns = [descrip[0] for descrip in cur.description]
                # post-process the rows into the shape DSPy expects
                for row in rows:
                    data = dict(zip(columns, row))
                    data["long_text"] = data[self.content_field]
                    retrieved_docs.append(dspy.Example(**data))
        # return the list of dspy.Example objects
        return retrieved_docs
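The `<=>` operator in the query above is pgvector's cosine distance, so `1 - distance` yields cosine similarity, and ordering by ascending distance returns the most similar rows first. The pure-Python sketch below reproduces that arithmetic on toy vectors to show the relationship. The function names and sample data are illustrative assumptions; in practice the database computes this itself.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], rows: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Mimic ORDER BY embedding <=> query LIMIT k, returning (id, similarity)."""
    # smallest distance first, exactly like the ORDER BY clause
    by_distance = sorted(
        (cosine_distance(vec, query), row_id) for row_id, vec in rows.items()
    )
    # similarity is reported as 1 - distance, as in the SELECT clause
    return [(row_id, 1.0 - dist) for dist, row_id in by_distance[:k]]


# toy "table" of two-dimensional embeddings (made-up data)
rows = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(top_k([1.0, 0.1], rows, k=2))
```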
In [39]:
# before we try it out we need to define an embedding function;
# PgVectorRM needs either this or an OpenAI client to embed queries
def db_embedding_func(query):
    return embedding_model.get_text_embedding(query)
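The embedding function has to return vectors whose length matches the dimension of the table's embedding column (384 for `BAAI/bge-small-en-v1.5`), otherwise the `<=>` comparison fails in Postgres. A small guard like the one below can catch a mismatch early. This is a sketch: `checked_embedding_func` is a hypothetical wrapper, and the stand-in embedder in the usage lines exists only for demonstration.

```python
from typing import Callable, List


def checked_embedding_func(
    embed: Callable[[str], List[float]], expected_dim: int = 384
) -> Callable[[str], List[float]]:
    """Wrap an embedding function so it raises if the vector size is unexpected."""

    def wrapper(query: str) -> List[float]:
        vector = list(embed(query))
        if len(vector) != expected_dim:
            raise ValueError(
                f"expected {expected_dim} dimensions, got {len(vector)}"
            )
        return vector

    return wrapper


# demonstration with a stand-in embedder that always returns 384 zeros
fake_embed = checked_embedding_func(lambda text: [0.0] * 384)
print(len(fake_embed("hello")))
```

In this notebook the wrapper could be applied to the function defined above, for example `checked_embedding_func(db_embedding_func)`.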
In [40]:
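# let's try it out
# NOTE: db_url, embedding_func and include_similarity were lost from the original
# cell and are reconstructed here from the surrounding context
retriever = DBRetriever(
    db_url=connection_string,
    pg_table_name="data_paul_graham_essay",  # this is the table
    embedding_func=db_embedding_func,  # the embedding function defined above
    content_field="text",  # this is the text column
    fields=["metadata_", "id", "text"],  # the fields we want returned
    include_similarity=True,  # adds the similarity score to each result
)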
In [42]:
retriever("How to start a startup?")
[Example({'metadata_': {'file_path': '/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2024-10-16', 'last_modified_date': '2024-10-16', '_node_content': '{"id_": "ba168434-1236-4273-8034-f7d996b7ee78", "embedding": null, "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "deeb8de9-e597-4197-a4f8-ddc173813caa", "node_type": "4", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "d582e8efd7877d91d8d9dee6fc273581522753aaaaa5ef6c016a7e3c58cdcc36", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "17619405-76ec-4333-8003-52c4994d3492", "node_type": "1", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "5fe5bd0fe231eb1ecd78be490b3ece301c2ecfe9eb81fe001bc574f42cf8d23b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "5e5c7f57-3800-4661-8cec-f48d0d56b997", "node_type": "1", "metadata": {}, "hash": "271a764bd9023019eff7a5fe0c31c597a095c64e13fbdd11fed69de1b34c118e", "class_name": "RelatedNodeInfo"}}, "text": "", "mimetype": "text/plain", "start_char_idx": 
52590, "end_char_idx": 56804, "text_template": "{metadata_str}\\n\\n{content}", "metadata_template": "{key}: {value}", "metadata_seperator": "\\n", "class_name": "TextNode"}', '_node_type': 'TextNode', 'document_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'ref_doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa'}, 'id': 60, 'text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. That had to be fair, because it was twice as good as the deal we ourselves had taken. 
Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. 
So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. [17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.', 'similarity': 0.5843820746058846, 'long_text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. 
You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. 
This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. 
[17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.'}) (input_keys=None), Example({'metadata_': {'file_path': '/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2024-10-16', 'last_modified_date': '2024-10-16', '_node_content': '{"id_": "c96fe7cb-f1a1-4234-8b3d-6622bf5d21b1", "embedding": null, "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "deeb8de9-e597-4197-a4f8-ddc173813caa", "node_type": "4", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "d582e8efd7877d91d8d9dee6fc273581522753aaaaa5ef6c016a7e3c58cdcc36", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "1a8be999-eaa6-40e5-bd7c-4361332ddb26", "node_type": "1", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "5fe5bd0fe231eb1ecd78be490b3ece301c2ecfe9eb81fe001bc574f42cf8d23b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "ec26eb2f-fb8e-4054-9baf-9a378e0b6c19", "node_type": "1", "metadata": {}, "hash": "271a764bd9023019eff7a5fe0c31c597a095c64e13fbdd11fed69de1b34c118e", 
"class_name": "RelatedNodeInfo"}}, "text": "", "mimetype": "text/plain", "start_char_idx": 52590, "end_char_idx": 56804, "text_template": "{metadata_str}\\n\\n{content}", "metadata_template": "{key}: {value}", "metadata_seperator": "\\n", "class_name": "TextNode"}', '_node_type': 'TextNode', 'document_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'ref_doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa'}, 'id': 82, 'text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. 
That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. 
So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. [17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.', 'similarity': 0.5843820746058846, 'long_text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. 
You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. 
This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. 
[17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.'}) (input_keys=None), Example({'metadata_': {'file_path': '/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2024-10-16', 'last_modified_date': '2024-10-16', '_node_content': '{"id_": "2c195200-6728-4bf5-90f2-b6377e7fe56c", "embedding": null, "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "deeb8de9-e597-4197-a4f8-ddc173813caa", "node_type": "4", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "d582e8efd7877d91d8d9dee6fc273581522753aaaaa5ef6c016a7e3c58cdcc36", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "93aa6793-6d5a-4436-bcb7-82af8df72e59", "node_type": "1", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "5fe5bd0fe231eb1ecd78be490b3ece301c2ecfe9eb81fe001bc574f42cf8d23b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "717629ba-1f00-4c13-b708-d2cf07ffb0f2", "node_type": "1", "metadata": {}, "hash": "271a764bd9023019eff7a5fe0c31c597a095c64e13fbdd11fed69de1b34c118e", 
"class_name": "RelatedNodeInfo"}}, "text": "", "mimetype": "text/plain", "start_char_idx": 52590, "end_char_idx": 56804, "text_template": "{metadata_str}\\n\\n{content}", "metadata_template": "{key}: {value}", "metadata_seperator": "\\n", "class_name": "TextNode"}', '_node_type': 'TextNode', 'document_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'ref_doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa'}, 'id': 16, 'text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. 
That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. 
So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. [17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.', 'similarity': 0.5843820746058846, 'long_text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. 
You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. 
This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. 
[17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.'}) (input_keys=None), Example({'metadata_': {'file_path': '/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2024-10-16', 'last_modified_date': '2024-10-16', '_node_content': '{"id_": "f972308c-1437-480b-989e-475bbcfba2e3", "embedding": null, "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "deeb8de9-e597-4197-a4f8-ddc173813caa", "node_type": "4", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "d582e8efd7877d91d8d9dee6fc273581522753aaaaa5ef6c016a7e3c58cdcc36", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "1251a22d-1692-42be-9009-dc043761e11e", "node_type": "1", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "5fe5bd0fe231eb1ecd78be490b3ece301c2ecfe9eb81fe001bc574f42cf8d23b", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "4bb86b44-432a-4583-adf7-d61752a9851b", "node_type": "1", "metadata": {}, "hash": "271a764bd9023019eff7a5fe0c31c597a095c64e13fbdd11fed69de1b34c118e", 
"class_name": "RelatedNodeInfo"}}, "text": "", "mimetype": "text/plain", "start_char_idx": 52590, "end_char_idx": 56804, "text_template": "{metadata_str}\\n\\n{content}", "metadata_template": "{key}: {value}", "metadata_seperator": "\\n", "class_name": "TextNode"}', '_node_type': 'TextNode', 'document_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'ref_doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa'}, 'id': 38, 'text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. 
That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. 
So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. [17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.', 'similarity': 0.5843820746058846, 'long_text': 'We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who\'d already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we\'d intended.\n\nWe invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit, Justin Kan and Emmett Shear, who went on to found Twitch, Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don\'t think it was entirely luck that the first batch was so good. 
You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.\n\nThe deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]\n\nFairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.\n\nAs YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another\'s customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.\n\nI had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.\n\nIn the summer of 2006, Robert and I started working on a new version of Arc. 
This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn\'t startup founders we wanted to reach. It was future startup founders. So I changed the name to Hacker News and the topic to whatever engaged one\'s intellectual curiosity.\n\nHN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I\'d had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one\'s work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had do with everything else combined. 
[17]\n\nAs well as HN, I wrote all of YC\'s internal software in Arc.'}) (input_keys=None), Example({'metadata_': {'file_path': '/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75042, 'creation_date': '2024-10-16', 'last_modified_date': '2024-10-16', '_node_content': '{"id_": "7b0c87e0-a6d8-42cf-862b-bce575e8b70a", "embedding": null, "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "deeb8de9-e597-4197-a4f8-ddc173813caa", "node_type": "4", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "d582e8efd7877d91d8d9dee6fc273581522753aaaaa5ef6c016a7e3c58cdcc36", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "539a2562-e32c-48af-89f5-3dffb8887f4a", "node_type": "1", "metadata": {"file_path": "/Users/hugo/git/notebooks/data/paul_graham/paul_graham_essay.txt", "file_name": "paul_graham_essay.txt", "file_type": "text/plain", "file_size": 75042, "creation_date": "2024-10-16", "last_modified_date": "2024-10-16"}, "hash": "385b97a987fa824e41ea3f3bf034bd59d40c941c976a80196d5db76100e17f4a", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "e3897c57-5e87-46a8-b994-9bb81c14400e", "node_type": "1", "metadata": {}, "hash": "1c5d05090f6921f2d28e3e12a6fb1bdee915b832287002ccc3a989b44c25376b", 
"class_name": "RelatedNodeInfo"}}, "text": "", "mimetype": "text/plain", "start_char_idx": 27804, "end_char_idx": 32197, "text_template": "{metadata_str}\\n\\n{content}", "metadata_template": "{key}: {value}", "metadata_seperator": "\\n", "class_name": "TextNode"}', '_node_type': 'TextNode', 'document_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa', 'ref_doc_id': 'deeb8de9-e597-4197-a4f8-ddc173813caa'}, 'id': 9, 'text': 'Now we felt like we were really onto something. I had visions of a whole new generation of software working this way. You wouldn\'t need versions, or ports, or any of that crap. At Interleaf there had been a whole group called Release Engineering that seemed to be at least as big as the group that actually wrote the software. Now you could just update the software right on the server.\n\nWe started a new company we called Viaweb, after the fact that our software worked via the web, and we got $10,000 in seed funding from Idelle\'s husband Julian. In return for that and doing the initial legal work and giving us business advice, we gave him 10% of the company. Ten years later this deal became the model for Y Combinator\'s. We knew founders needed something like this, because we\'d needed it ourselves.\n\nAt this stage I had a negative net worth, because the thousand dollars or so I had in the bank was more than counterbalanced by what I owed the government in taxes. (Had I diligently set aside the proper proportion of the money I\'d made consulting for Interleaf? No, I had not.) So although Robert had his graduate student stipend, I needed that seed funding to live on.\n\nWe originally hoped to launch in September, but we got more ambitious about the software as we worked on it. 
Eventually we managed to build a WYSIWYG site builder, in the sense that as you were creating pages, they looked exactly like the static ones that would be generated later, except that instead of leading to static pages, the links all referred to closures stored in a hash table on the server.\n\nIt helped to have studied art, because the main goal of an online store builder is to make users look legit, and the key to looking legit is high production values. If you get page layouts and fonts and colors right, you can make a guy running a store out of his bedroom look more legit than a big company.\n\n(If you\'re curious why my site looks so old-fashioned, it\'s because it\'s still made with this software. It may look clunky today, but in 1996 it was the last word in slick.)\n\nIn September, Robert rebelled. "We\'ve been working on this for a month," he said, "and it\'s still not done." This is funny in retrospect, because he would still be working on it almost 3 years later. But I decided it might be prudent to recruit more programmers, and I asked Robert who else in grad school with him was really good. He recommended Trevor Blackwell, which surprised me at first, because at that point I knew Trevor mainly for his plan to reduce everything in his life to a stack of notecards, which he carried around with him. But Rtm was right, as usual. Trevor turned out to be a frighteningly effective hacker.\n\nIt was a lot of fun working with Robert and Trevor. They\'re the two most independent-minded people I know, and in completely different ways. If you could see inside Rtm\'s brain it would look like a colonial New England church, and if you could see inside Trevor\'s it would look like the worst excesses of Austrian Rococo.\n\nWe opened for business, with 6 stores, in January 1996. It was just as well we waited a few months, because although we worried we were late, we were actually almost fatally early. 
There was a lot of talk in the press then about ecommerce, but not many people actually wanted online stores. [8]\n\nThere were three main parts to the software: the editor, which people used to build sites and which I wrote, the shopping cart, which Robert wrote, and the manager, which kept track of orders and statistics, and which Trevor wrote. In its time, the editor was one of the best general-purpose site builders. I kept the code tight and didn\'t have to integrate with any other software except Robert\'s and Trevor\'s, so it was quite fun to work on. If all I\'d had to do was work on this software, the next 3 years would have been the easiest of my life. Unfortunately I had to do a lot more, all of it stuff I was worse at than programming, and the next 3 years were instead the most stressful.\n\nThere were a lot of startups making ecommerce software in the second half of the 90s. We were determined to be the Microsoft Word, not the Interleaf. Which meant being easy to use and inexpensive. It was lucky for us that we were poor, because that caused us to make Viaweb even more inexpensive than we realized. We charged $100 a month for a small store and $300 a month for a big one.', 'similarity': 0.5729762315750122, 'long_text': 'Now we felt like we were really onto something. I had visions of a whole new generation of software working this way. You wouldn\'t need versions, or ports, or any of that crap. At Interleaf there had been a whole group called Release Engineering that seemed to be at least as big as the group that actually wrote the software. Now you could just update the software right on the server.\n\nWe started a new company we called Viaweb, after the fact that our software worked via the web, and we got $10,000 in seed funding from Idelle\'s husband Julian. In return for that and doing the initial legal work and giving us business advice, we gave him 10% of the company. Ten years later this deal became the model for Y Combinator\'s. 
We knew founders needed something like this, because we\'d needed it ourselves.\n\nAt this stage I had a negative net worth, because the thousand dollars or so I had in the bank was more than counterbalanced by what I owed the government in taxes. (Had I diligently set aside the proper proportion of the money I\'d made consulting for Interleaf? No, I had not.) So although Robert had his graduate student stipend, I needed that seed funding to live on.\n\nWe originally hoped to launch in September, but we got more ambitious about the software as we worked on it. Eventually we managed to build a WYSIWYG site builder, in the sense that as you were creating pages, they looked exactly like the static ones that would be generated later, except that instead of leading to static pages, the links all referred to closures stored in a hash table on the server.\n\nIt helped to have studied art, because the main goal of an online store builder is to make users look legit, and the key to looking legit is high production values. If you get page layouts and fonts and colors right, you can make a guy running a store out of his bedroom look more legit than a big company.\n\n(If you\'re curious why my site looks so old-fashioned, it\'s because it\'s still made with this software. It may look clunky today, but in 1996 it was the last word in slick.)\n\nIn September, Robert rebelled. "We\'ve been working on this for a month," he said, "and it\'s still not done." This is funny in retrospect, because he would still be working on it almost 3 years later. But I decided it might be prudent to recruit more programmers, and I asked Robert who else in grad school with him was really good. He recommended Trevor Blackwell, which surprised me at first, because at that point I knew Trevor mainly for his plan to reduce everything in his life to a stack of notecards, which he carried around with him. But Rtm was right, as usual. 
Trevor turned out to be a frighteningly effective hacker.\n\nIt was a lot of fun working with Robert and Trevor. They\'re the two most independent-minded people I know, and in completely different ways. If you could see inside Rtm\'s brain it would look like a colonial New England church, and if you could see inside Trevor\'s it would look like the worst excesses of Austrian Rococo.\n\nWe opened for business, with 6 stores, in January 1996. It was just as well we waited a few months, because although we worried we were late, we were actually almost fatally early. There was a lot of talk in the press then about ecommerce, but not many people actually wanted online stores. [8]\n\nThere were three main parts to the software: the editor, which people used to build sites and which I wrote, the shopping cart, which Robert wrote, and the manager, which kept track of orders and statistics, and which Trevor wrote. In its time, the editor was one of the best general-purpose site builders. I kept the code tight and didn\'t have to integrate with any other software except Robert\'s and Trevor\'s, so it was quite fun to work on. If all I\'d had to do was work on this software, the next 3 years would have been the easiest of my life. Unfortunately I had to do a lot more, all of it stuff I was worse at than programming, and the next 3 years were instead the most stressful.\n\nThere were a lot of startups making ecommerce software in the second half of the 90s. We were determined to be the Microsoft Word, not the Interleaf. Which meant being easy to use and inexpensive. It was lucky for us that we were poor, because that caused us to make Viaweb even more inexpensive than we realized. We charged $100 a month for a small store and $300 a month for a big one.'}) (input_keys=None)]
This is a simple way to define a retriever for pgvector in DSPy. The retrieved passages can now be passed to a language model as context for text generation.
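Before handing the passages to a model, it can help to tidy them up. In the output above, the rows with `id` 16 and `id` 38 carry the same text and the same similarity score, which suggests the essay was indexed more than once. The following is a minimal sketch, not part of DSPy, of one way to drop such duplicates and assemble a prompt. The helper names `build_context` and `build_prompt`, the `max_chars` budget, and the prompt wording are all assumptions made for illustration; plain dicts with shortened text stand in for the retrieved examples, and only the `long_text` field seen in the output is used.

```python
from typing import Iterable, List, Mapping


def build_context(passages: Iterable[Mapping[str, object]], max_chars: int = 4000) -> str:
    """Join unique passage texts into one numbered context string.

    Hypothetical helper: passages are assumed to arrive best match first
    and to expose their text under the "long_text" key.
    """
    seen = set()
    parts: List[str] = []
    used = 0
    for passage in passages:
        text = str(passage["long_text"]).strip()
        # Skip empty passages and exact repeats of text we already kept.
        if not text or text in seen:
            continue
        seen.add(text)
        remaining = max_chars - used
        if remaining <= 0:
            break
        # Truncate the last passage so the total stays within the budget.
        snippet = text[:remaining]
        parts.append(f"[{len(parts) + 1}] {snippet}")
        used += len(snippet)
    return "\n\n".join(parts)


def build_prompt(question: str, passages: Iterable[Mapping[str, object]], max_chars: int = 4000) -> str:
    """Wrap the de-duplicated context and the question in a simple prompt."""
    context = build_context(passages, max_chars)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Shortened stand-ins for the three retrieved rows shown in the output.
retrieved = [
    {"id": 16, "long_text": "We knew undergrads were deciding then about summer jobs..."},
    {"id": 38, "long_text": "We knew undergrads were deciding then about summer jobs..."},
    {"id": 9, "long_text": "Now we felt like we were really onto something..."},
]

prompt = build_prompt("How did the Summer Founders Program start?", retrieved)
print(prompt)
```

The resulting string could then be sent to whichever language model you have configured, for example as the context input of a DSPy module. De-duplicating on exact text is a deliberately crude choice; if the duplicates come from running the ingestion cell twice, clearing the table and re-indexing once is probably the cleaner fix.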