Fabrice Grinda

Internet entrepreneur and investor

Month: September 2024

Fabrice AI: The Technical Journey

As I mentioned in the previous post, developing Fabrice AI proved far more complex than expected, forcing me to explore many different approaches.

The Initial Approach: Llama Index – Vector Search

My first foray into enhancing Fabrice AI’s retrieval abilities used LlamaIndex for vector search. The concept was simple: take the content from my blog, convert it into Langchain documents, transform these into Llama documents, and store them in a vector index that I could then query for relevant information.
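
The flow can be sketched with a self-contained toy: the hashed bag-of-words "embedding" below is a stand-in for a real embedding model, and the class is a minimal illustration of the idea, not LlamaIndex's actual API.

```python
import math
from collections import Counter

def tokenize(text: str) -> list[str]:
    return [w.strip(".,?!") for w in text.lower().split()]

def embed(text: str, dim: int = 256) -> list[float]:
    """Toy embedding: hashed bag-of-words, normalized. A stand-in for a real model."""
    vec = [0.0] * dim
    for word, count in Counter(tokenize(text)).items():
        vec[hash(word) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

class VectorIndex:
    """Minimal vector index: store (text, vector) pairs, rank by cosine similarity."""
    def __init__(self):
        self.docs = []

    def add(self, text: str) -> None:
        self.docs.append((text, embed(text)))

    def query(self, question: str, top_k: int = 1) -> list[str]:
        qv = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

index = VectorIndex()
index.add("Marketplace founders often make the mistake of chasing GMV too early.")
index.add("My favorite hikes in Patagonia and the gear I brought along.")
print(index.query("What mistakes do marketplace founders make?"))
```

A real embedding model captures meaning rather than shared words, but the retrieval loop — embed the corpus, embed the query, rank by similarity — is the same.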

However, as I began to test the system, it became apparent that this approach was not yielding the results I had hoped for. Specifically, when I queried the system with context-heavy questions like “What are the biggest mistakes marketplace founders make?” the AI failed to provide meaningful answers. Instead of retrieving the nuanced content I knew was embedded in the data, it returned irrelevant or incomplete responses.

This initial failure led me to reconsider my approach. I realized that simply storing content in a vector index was not enough; the retrieval mechanism needed to understand the context and nuances of the questions being asked. This realization was the first of many lessons that would shape the evolution of Fabrice AI.

Storing Knowledge: MongoDB Document Storage and Retrieval

With the limitations of the Llama Index approach in mind, I next explored storing the Llama documents in MongoDB. MongoDB’s flexible schema and document-oriented structure seemed like a promising solution for managing the diverse types of content I had accumulated over the years.

The plan was to create a more dynamic and responsive search experience. However, this approach quickly ran into issues. The search functionality, which I had anticipated to be more robust, failed to perform as expected. Queries that should have returned relevant documents instead yielded no results or irrelevant content.

This setback was frustrating, but it also underscored a critical lesson: the storage method is just as important as the retrieval strategy. I began to consider other options, such as utilizing MongoDB Atlas for vector searches, which could potentially provide the precision and scalability I needed. However, before committing to this alternative, I wanted to explore other approaches to determine if there might be a more effective solution.

Metadata Retriever and Vector Store: Seeking Specificity

One of the next avenues I explored was the use of a metadata retriever combined with a vector store. The idea behind this approach was to categorize the vast array of information within Fabrice AI and then retrieve answers based on these categories. By structuring the data with metadata, I hoped to improve the AI’s ability to provide specific, targeted answers.
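
The idea can be sketched as follows (the schema and scoring are hypothetical; the real system used LlamaIndex retrievers): filter documents by a metadata category first, then rank the survivors against the question.

```python
# Toy metadata-filtered retrieval: category filter, then word-overlap ranking.
docs = [
    {"category": "optimism", "text": "I remain deeply optimistic about technology and humanity."},
    {"category": "marketplaces", "text": "Marketplace liquidity is the hardest problem to solve."},
]

def words(text: str) -> set[str]:
    return {w.strip(".,?!") for w in text.lower().split()}

def retrieve(question: str, category: str):
    candidates = [d for d in docs if d["category"] == category]
    if not candidates:
        return None  # the question mapped to a category with no documents
    return max(candidates, key=lambda d: len(words(question) & words(d["text"])))

print(retrieve("Is the author optimistic?", category="optimism"))
print(retrieve("Is the author optimistic?", category="happiness"))  # None
```

The weakness is visible in the last line: if the question is routed to the wrong category, the filter discards the relevant content before ranking even starts.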

Yet this method also had its limitations. While it seemed promising on the surface, the AI struggled to deliver accurate responses to all types of queries. For example, when I asked, “Is the author optimistic?”, the system failed to interpret the question in the context of the relevant content. Instead of providing an insightful analysis based on the metadata, it returned either vague answers or none at all.

This approach taught me a valuable lesson about the importance of context in AI. It is not enough to simply categorize information; the AI must also understand how these categories interact and overlap to form a cohesive understanding of the content. Without this depth of understanding, even the most sophisticated retrieval methods can fall short.

Structuring Knowledge: The SummaryTreeIndex

As I continued to refine Fabrice AI, I experimented with creating a SummaryTreeIndex. This approach aimed to summarize all the documents into a tree format, allowing the AI to navigate through these summaries and retrieve relevant information based on the structure of the content.
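
A minimal sketch of the idea (not LlamaIndex's actual implementation): leaves hold full documents, parents hold short summaries, and a query walks down the branch whose summary best matches.

```python
# Toy summary tree: route a query down the branch with the best-matching summary.
tree = {
    "summary": "Essays on startups and on living well.",
    "children": [
        {"summary": "Startup advice: fundraising and marketplaces.",
         "children": [], "doc": "Raise enough runway and obsess over liquidity..."},
        {"summary": "Life philosophy: happiness and big decisions.",
         "children": [], "doc": "Weigh big decisions with a written pros-and-cons matrix..."},
    ],
}

def overlap(a: str, b: str) -> int:
    strip = lambda t: {w.strip(".,?!:") for w in t.lower().split()}
    return len(strip(a) & strip(b))

def answer(node: dict, query: str) -> str:
    if not node["children"]:
        return node["doc"]  # reached a leaf: return the underlying document
    best = max(node["children"], key=lambda c: overlap(query, c["summary"]))
    return answer(best, query)

print(answer(tree, "How should I approach big decisions in life?"))
```

The structural risk is also visible here: every routing decision depends on a few-word summary, so nuance that did not survive summarization can never be retrieved.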

The idea was that by summarizing the documents, the AI could quickly identify key points and respond to queries with concise, accurate information. However, this method also faced significant challenges. The AI struggled to provide meaningful answers to complex queries, such as “How to make important decisions in life?” Instead of drawing from the rich, nuanced content stored within the summaries, the AI’s responses were often shallow or incomplete.

This experience underscored the difficulty of balancing breadth and depth in AI. While summaries can provide a high-level overview, they often lack the detailed context needed to answer more complex questions. I realized that any effective solution would need to integrate both detailed content and high-level summaries, allowing the AI to draw on both as needed.

This is why in the version of Fabrice AI that is currently live, I have the AI first give a summary of the answer, before going into more details.

Expanding Horizons: Knowledge Graph Index

Recognizing the limitations of the previous methods, I turned to a more sophisticated approach: the Knowledge Graph Index. This approach involved constructing a knowledge graph from unstructured text, enabling the AI to engage in entity-based querying. The goal was to create a more dynamic and interconnected understanding of the content, allowing Fabrice AI to answer complex, context-heavy questions more effectively.
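
Entity-based querying over a knowledge graph looks like this toy (the triples are hand-written here; real systems must extract them from unstructured text, which is exactly where the approach struggled):

```python
# Toy knowledge graph of (subject, relation, object) triples.
triples = [
    ("fabrice", "founded", "aucland"),
    ("fabrice", "founded", "olx"),
    ("olx", "operates_in", "emerging markets"),
]

def query(entity: str, relation: str) -> list[str]:
    return [obj for subj, rel, obj in triples if subj == entity and rel == relation]

print(query("fabrice", "founded"))         # entities the extraction captured
print(query("fair seed valuation", "is"))  # [] -- concept never extracted as an entity
```

The empty second result mirrors the valuation question: an abstract concept that the extraction step never turned into a graph entity simply cannot be queried.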

Despite its promise, the Knowledge Graph Index also faced significant hurdles. The AI struggled to produce accurate results, particularly for queries that required a deep understanding of the context. For example, when asked, “What are fair Seed & Series A valuations?” the AI again failed to provide a relevant answer, highlighting the difficulty of integrating unstructured text into a coherent knowledge graph.

This approach, while ultimately unsuccessful, provided important insights into the challenges of using knowledge graphs in AI. The complexity of the data and the need for precise context meant that even a well-constructed knowledge graph could struggle to deliver the desired results. Another drawback was speed: retrieving related documents from the Knowledge Graph Index took far longer than from a vector store index.

Re-evaluating the Data: Gemini

After several setbacks, I decided to take a different approach by leveraging Google’s AI, Gemini. The idea was to create datasets from JSON and CSV files and then train a custom LLM on this data. I hoped that by using structured data and a robust training model, I could overcome some of the challenges that had plagued previous attempts.

However, this approach also encountered difficulties. The training process was halted due to incorrect data formatting, which prevented the model from being trained effectively. This setback underscored the importance of data integrity in AI training. Without properly formatted and structured data, even the most advanced models can fail to perform as expected.
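
The kind of validation that would have caught the problem is cheap to do up front. The sketch below converts a CSV of Q&A pairs into JSONL training examples, dropping malformed rows; the `{"input", "output"}` schema is a generic stand-in, not Gemini's actual tuning format, which should be taken from the API's documentation.

```python
import csv
import io
import json

raw = """question,answer
What is FJ Labs?,A venture fund focused on marketplaces.
,this row has no question and should be dropped
"""

def to_jsonl(csv_text: str) -> str:
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["question"] or not row["answer"]:
            continue  # skip malformed rows instead of poisoning the training set
        lines.append(json.dumps({"input": row["question"], "output": row["answer"]}))
    return "\n".join(lines)

print(to_jsonl(raw))
```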

This experience led me to consider the potential of using BigQuery to store JSON data, providing a more scalable and reliable platform for managing the large datasets needed to train Fabrice AI effectively.

Combining Strengths: Langchain Documents with Pinecone

Despite the challenges faced so far, I was determined to find a solution that would allow Fabrice AI to effectively store and retrieve knowledge. This determination led me to experiment with Langchain documents and Pinecone. The approach involved creating a Pinecone vector store using Langchain documents and OpenAI embeddings, then retrieving the top similar documents based on the query.

This method showed promise, particularly when the query included the title of the document. For example, when asked, “What is the key to happiness?” the AI was able to retrieve and summarize the relevant content accurately. However, there were still limitations, particularly when the query lacked specific keywords or titles.
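
The behavior can be illustrated with a toy ranking in which title terms that appear in the query get a large boost (illustrative only; the real system used OpenAI embeddings with Pinecone's similarity search, and the weights here are arbitrary):

```python
# Toy title-boosted ranking: queries that name a post rank it highly,
# vaguer queries without title keywords do not.
docs = [
    {"title": "The key to happiness", "body": "Gratitude, flow, and relationships."},
    {"title": "Marketplace liquidity", "body": "Build supply first, then demand."},
]

def words(text: str) -> set[str]:
    return {w.strip(".,?!") for w in text.lower().split()}

def rank(query: str) -> list[dict]:
    q = words(query)
    score = lambda d: 3 * len(q & words(d["title"])) + len(q & words(d["body"]))
    return sorted(docs, key=score, reverse=True)

print(rank("What is the key to happiness?")[0]["title"])
```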

This approach demonstrated the potential of combining different technologies to enhance AI performance. By integrating Langchain documents with Pinecone’s vector store, I was able to improve the relevance and accuracy of the AI’s responses, albeit with some limitations.

Achieving Consistency: GPT Builder OpenAI

After exploring various methods and technologies, I turned to OpenAI’s GPT Builder to consolidate and refine the knowledge stored within Fabrice AI. By uploading all the content into a GPT knowledge base, I aimed to create a more consistent and reliable platform for retrieving and interacting with my knowledge.

This approach proved to be one of the most successful, with the AI able to provide better results across a range of queries. The key to this success was the integration of all the knowledge into a single, cohesive system, allowing the AI to draw on the full breadth of content when answering questions.

As mentioned in my previous post, I could not get it to run on my website, and it was only available to paid subscribers of ChatGPT, which I felt was too limiting. Also, while the results were better, I still did not love the quality of the answers and was not comfortable releasing it to the public.

Final Refinement: GPT Assistants Using GPT-4o

The final piece of the puzzle in developing Fabrice AI came with the introduction of GPT Assistants using GPT-4o. This approach represented the culmination of everything I had learned throughout the project. By utilizing a vector database and refining the prompts, I aimed to achieve the highest possible level of accuracy and contextual understanding in the AI’s responses.

This method involved uploading all the knowledge I had accumulated into a vector database, which was then used as the foundation for the AI’s interactions. The vector database allowed the AI to perform more sophisticated searches, retrieving information based on the semantic meaning of queries rather than relying solely on keyword matching. This marked a significant advancement over previous approaches, enabling the AI to better understand and respond to complex, nuanced questions.
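
A sketch of what the setup looks like, showing only the configuration payload with no network calls; the vector store ID is a placeholder, and the exact request shape should be checked against the OpenAI Assistants API documentation. The instructions also encode the summary-first answer style mentioned earlier.

```python
# Hypothetical assistant configuration for file-search-backed retrieval.
def build_assistant_payload(vector_store_id: str) -> dict:
    return {
        "model": "gpt-4o",
        "instructions": (
            "You are Fabrice AI. Answer from the attached knowledge base. "
            "Start with a short summary, then go into more detail."
        ),
        "tools": [{"type": "file_search"}],
        "tool_resources": {"file_search": {"vector_store_ids": [vector_store_id]}},
    }

payload = build_assistant_payload("vs_PLACEHOLDER")
print(payload["tools"])
```

In the live system, the blog's content is uploaded into the vector store first, and the assistant then retrieves passages by semantic similarity at query time.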

One of the key innovations of this approach was the careful refinement of prompts. By meticulously crafting and testing different prompts, I was able to guide the AI towards providing more accurate and relevant answers. This involved not only tweaking the wording of the prompts but also experimenting with different ways of structuring the queries to elicit the best possible responses.

The results were impressive. The AI was now able to handle a wide range of queries with high accuracy, even when the questions were open-ended or required a deep understanding of context. For example, when asked, “How to make the most important decisions in your life?”, the AI provided a comprehensive and insightful answer, drawing on a variety of sources and perspectives to deliver a well-rounded response.

This success was the culmination of hundreds of hours of work and countless experiments. It demonstrated that, with the right combination of technology and refinement, it was possible to create an AI that could not only store and retrieve information effectively but also engage with it in a meaningful way. The development of GPT Assistants using GPT-4o marked the point at which Fabrice AI truly came into its own, achieving the level of sophistication and accuracy that I had envisioned from the start. The GPT Assistants API was then integrated into my blog to allow end users to interact with Fabrice AI in the way you see it on the blog right now.

Reflecting on the Journey

The process of developing Fabrice AI highlighted the complexities of working with AI, particularly when it comes to understanding and contextualizing information. It taught me that there are no shortcuts in AI development—every step, every iteration, and every experiment is a necessary part of the journey towards creating something truly effective.

Looking ahead, I’m excited to continue refining and expanding Fabrice AI. As mentioned in the last post, I will review the questions asked to complete the knowledge base where there are gaps. I am also hoping to eventually release an interactive version that looks and sounds like me that you can talk to.

Author: Rose Brown | Posted on September 4, 2024 (updated September 5, 2024) | Categories: Personal Musings, Tech Gadgets | 5 Comments on Fabrice AI: The Technical Journey
