<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Julia Community 🟣: Jan Siml</title>
    <description>The latest articles on Julia Community 🟣 by Jan Siml (@svilupp).</description>
    <link>https://forem.julialang.org/svilupp</link>
    <image>
      <url>https://forem.julialang.org/images/xsJxdUrmmfk1cteKNHXU96f3JbB7z69pdyULa0Jmj9E/rs:fill:90:90/g:sm/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L3VzZXIvcHJvZmls/ZV9pbWFnZS83NC9m/OTU5ZDNiNy1jYzIx/LTQ1MDUtODUyMi1l/NjEzZWUwZjBiMzAu/anBlZw</url>
      <title>Julia Community 🟣: Jan Siml</title>
      <link>https://forem.julialang.org/svilupp</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.julialang.org/feed/svilupp"/>
    <language>en</language>
    <item>
      <title>AIHelpMe.jl Pt.2: Instant Expertise, Infinite Solutions for Julia Developers</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Tue, 30 Apr 2024 08:36:09 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/aihelpmejl-pt2-instant-expertise-infinite-solutions-for-julia-developers-a31</link>
      <guid>https://forem.julialang.org/svilupp/aihelpmejl-pt2-instant-expertise-infinite-solutions-for-julia-developers-a31</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/svilupp/AIHelpMe.jl"&gt;AIHelpMe.jl&lt;/a&gt; is a Julia package that harnesses the power of AI models to provide tailored coding guidance, integrating seamlessly with PromptingTools.jl to offer a unique approach to answering coding queries directly in Julia's environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Frustration of Searching for Julia Answers
&lt;/h2&gt;

&lt;p&gt;We've all been there. You're stuck on a problem, and you need some guidance on how to implement a specific functionality in Julia. You head to your trusty search engine, type in your query, and... wait, why are all these results in Python? You tweak your query, adding "Julia" this and "Julia language" that, but still, the results are scattered and unclear.&lt;/p&gt;

&lt;p&gt;After 5 minutes of searching, you finally stumble upon a Discourse post from 2018. But wait, is this even relevant to Julia 1.0? Should you bother opening it? You take a deep breath and dive in, hoping that the answer lies within.&lt;/p&gt;

&lt;p&gt;Another 5 minutes pass, and you finally find what you're looking for. You copy out the necessary snippet, and after a few more minutes of customizing it to your needs, you have what you need. The process is tedious, to say the least.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to AIHelpMe's Official Release
&lt;/h2&gt;

&lt;p&gt;AIHelpMe, a Julia package designed to enhance coding assistance with AI, has been officially registered and released. This milestone makes it easier to install directly from the Julia registry and introduces new functionality and new knowledge packs! Developers can now access sophisticated, AI-powered coding guidance more efficiently than ever.&lt;/p&gt;

&lt;h2&gt;
  
  
  How AIHelpMe Works
&lt;/h2&gt;

&lt;p&gt;AIHelpMe uses a Retrieval-Augmented Generation (RAG) pattern to provide accurate and relevant answers to your coding questions. It preprocesses the provided documentation, converting text snippets into numerical embeddings. When you ask a question, AIHelpMe looks up the most relevant documentation snippets, feeds them into the AI model, and generates an answer tailored to Julia's ecosystem and best practices.&lt;/p&gt;

&lt;p&gt;Get started with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Pkg&lt;/span&gt;
&lt;span class="n"&gt;Pkg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AIHelpMe"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;AIHelpMe&lt;/span&gt;
&lt;span class="n"&gt;aihelp&lt;/span&gt;&lt;span class="s"&gt;"How do I implement quicksort in Julia?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: At minimum, you need an OpenAI API key (&lt;code&gt;ENV["OPENAI_API_KEY"]&lt;/code&gt;), but I would strongly recommend also getting a FREE Cohere API key to enable re-ranking (the silver pipeline).&lt;/p&gt;
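&lt;p&gt;For example, you can set the keys as environment variables before loading the package. The &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; name comes from the note above; the Cohere variable name follows the usual PromptingTools convention but is my assumption, so check the docs for your version:&lt;/p&gt;

```julia
# Placeholder values shown; use your own keys
ENV["OPENAI_API_KEY"] = "sk-..."   # required for the default pipeline
ENV["COHERE_API_KEY"] = "..."      # optional; enables re-ranking (silver pipeline)
```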

&lt;h2&gt;
  
  
  How AIHelpMe Differs from Other Solutions
&lt;/h2&gt;

&lt;p&gt;Compared to chatbots, AIHelpMe offers several advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Grounded in actual resources&lt;/strong&gt;: AIHelpMe's answers are based on actual, up-to-date Julia resources, not outdated training data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customizable knowledge&lt;/strong&gt;: You choose what knowledge to include, allowing you to improve precision and recall.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexible trade-offs&lt;/strong&gt;: You choose the trade-off between cost, performance, and time, giving you greater control over your coding experience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State-of-the-art RAG methods&lt;/strong&gt;: AIHelpMe leverages the latest RAG methods, and full customization is possible, ensuring that you get the most accurate and relevant answers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note: Only a limited number of packages have been pre-processed so far: the Julia docs, the Tidier ecosystem, and the Makie ecosystem. It's still experimental, but it works! The starter example shows how to load them:&lt;/p&gt;

&lt;h2&gt;
  
  
  Starter Example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;AIHelpMe&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;AIHelpMe&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_result&lt;/span&gt;

&lt;span class="c"&gt;# ideally, switch to better pipeline for proper results, requires setting up Cohere API key&lt;/span&gt;
&lt;span class="n"&gt;AIHelpMe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update_pipeline!&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;silver&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# load tidier index, others available: :julia, :makie&lt;/span&gt;
&lt;span class="n"&gt;AIHelpMe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_index!&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;tidier&lt;/span&gt;&lt;span class="x"&gt;);&lt;/span&gt;

&lt;span class="c"&gt;# Ask a question&lt;/span&gt;
&lt;span class="n"&gt;aihelp&lt;/span&gt;&lt;span class="s"&gt;"How do you add a regression line to a plot in TidierPlots?"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://forem.julialang.org/images/faBQBVouq4IGJYEyWmwiflkqwFTIPEm33jqWzLBMDrw/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL2ky/bWhoNzEzOGd3MDll/c2Q3ajZsLnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/faBQBVouq4IGJYEyWmwiflkqwFTIPEm33jqWzLBMDrw/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL2ky/bWhoNzEzOGd3MDll/c2Q3ajZsLnBuZw" alt="Quick answer over Tidier docs" width="800" height="99"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Or show the highlighted answer (you can customize to add the actual source docs, or remove the scores/highlights):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;# See highlighted answer (optional)&lt;/span&gt;
&lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_result&lt;/span&gt;&lt;span class="x"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://forem.julialang.org/images/xsOj3_BGTAxyLVFcS2sBSaDC2UCwmcfMbMdGhnsfb-w/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL3g2/ZWE2bmEwbnFjZzl1/bDd0dnNqLnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/xsOj3_BGTAxyLVFcS2sBSaDC2UCwmcfMbMdGhnsfb-w/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL3g2/ZWE2bmEwbnFjZzl1/bDd0dnNqLnBuZw" alt="Highlighted answer over Tidier docs" width="800" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're an infrequent user of AlgebraOfGraphics like me, you'll certainly appreciate the &lt;code&gt;:makie&lt;/code&gt; knowledge pack -- it's now much faster to find the right keywords to customize your plot!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;# Load all knowledge packs&lt;/span&gt;
&lt;span class="n"&gt;AIHelpMe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_index!&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;julia&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;makie&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;tidier&lt;/span&gt;&lt;span class="x"&gt;]);&lt;/span&gt;
&lt;span class="n"&gt;aihelp&lt;/span&gt;&lt;span class="s"&gt;"How to set the label of a y-axis in Makie?"&lt;/span&gt;&lt;span class="n"&gt;gpt3t&lt;/span&gt; &lt;span class="c"&gt;# testing a weak model&lt;/span&gt;
&lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_result&lt;/span&gt;&lt;span class="x"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://forem.julialang.org/images/K8DTFgnTOBE0Tq578jo9PR-zn6wdMtUxp5lhm0-XFxg/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzNw/eTd0Ym02emFnbjI5/dnRsaml1LnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/K8DTFgnTOBE0Tq578jo9PR-zn6wdMtUxp5lhm0-XFxg/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzNw/eTd0Ym02emFnbjI5/dnRsaml1LnBuZw" alt="Highlighted answer over Makie docs" width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note: There can still be some issues with the quality of the answers (it's GenAI!), especially for the bronze pipeline and weaker models, but, hopefully, it's already good enough to create value for you!&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Usage of AIHelpMe
&lt;/h2&gt;

&lt;p&gt;AIHelpMe offers several advanced features for experienced users wanting more control and deeper insights:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Response Insights:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;pprint&lt;/code&gt; to highlight potential hallucinations, show sources, their scores, and context snippets.&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;aihelp("Explain Julia's multiple dispatch system", return_all=true)|&amp;gt;AIHelpMe.pprint&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note: You'll always get better responses with better pipelines - see below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customizing the AI Pipeline:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adjust the complexity of AI responses with &lt;code&gt;update_pipeline!&lt;/code&gt;, choosing from the bronze, silver, or gold levels.&lt;/li&gt;
&lt;li&gt;Specify AI models, including local options like Ollama.&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;AIHelpMe.update_pipeline!(:silver; model="gllama370")&lt;/code&gt; # gllama370 is Groq.com-hosted Llama 3 70b that you can access for free!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Safe Code Execution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Execute AI-generated code safely with &lt;code&gt;PromptingTools.AICode&lt;/code&gt; struct.&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;aihelp("How to create a named tuple from a dictionary?")|&amp;gt;PromptingTools.AICode&lt;/code&gt; &lt;/li&gt;
&lt;/ul&gt;
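&lt;p&gt;As a minimal sketch of what this gives you (the field names reflect my understanding of the PromptingTools API; treat them as assumptions and check &lt;code&gt;?PromptingTools.AICode&lt;/code&gt; in your version):&lt;/p&gt;

```julia
using PromptingTools

# Evaluate a code string in a sandboxed module (sketch; field names assumed
# from the PromptingTools docs)
cb = PromptingTools.AICode("""
x = [1, 2, 3]
sum(x)
""")
cb.success   # whether the code parsed and ran without error
cb.output    # the value of the last expression
```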

&lt;p&gt;For more details on these advanced features, please refer to the &lt;a href="https://svilupp.github.io/svilupp.github.io/AIHelpMe.jl/dev/advanced"&gt;AIHelpMe Advanced Documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Over the summer, we hope to improve answer quality and add more knowledge packs. &lt;/p&gt;

&lt;p&gt;We're also working on making it super easy for you to develop your own knowledge packs for the packages you use, regardless of whether they're public or private. This will enable you to tailor AIHelpMe to your specific needs and workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AIHelpMe is the solution to your Julia coding woes. With its AI-powered assistance, flexible querying system, and cost-effective approach, AIHelpMe is poised to revolutionize the way you code in Julia. Try it out today and experience the future of coding assistance!&lt;/p&gt;

&lt;p&gt;Credit for the title image: DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>aihelpme</category>
    </item>
    <item>
      <title>ProToPortal: The Portal to the Magic of PromptingTools</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Sun, 28 Apr 2024 16:02:47 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/protoportal-the-portal-to-the-magic-of-promptingtools-47n2</link>
      <guid>https://forem.julialang.org/svilupp/protoportal-the-portal-to-the-magic-of-promptingtools-47n2</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;ProToPortal streamlines your interaction with any LLM tasks on the go, offering customizable templates for automatic replies, direct code evaluation, and an intuitive, multi-device interface. Explore its full capabilities and enhance your productivity on &lt;a href="https://github.com/svilupp/ProToPortal.jl"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Unveiling ProToPortal: A Julia-First LLM GUI built with Stipple.jl!
&lt;/h2&gt;

&lt;p&gt;Hello, fellow GenAI enthusiasts! Today, I'm thrilled to introduce &lt;strong&gt;ProToPortal&lt;/strong&gt;, a nifty tool born from my own need to simplify my daily interactions with Julia and AI models. It's a small but mighty project aimed at boosting productivity and minimizing those pesky prompting hassles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why ProToPortal?
&lt;/h2&gt;

&lt;p&gt;Ever found yourself on a leisurely walk with your dog, struck by a sudden coding inspiration that just couldn't wait? That's exactly where ProToPortal comes into play. This tool isn't just another coding interface; it's your on-the-go, in-your-pocket coding companion, ready to tackle tasks from simple prompts to complex code evaluations—all before you've even finished your walk!&lt;/p&gt;

&lt;h2&gt;
  
  
  Cool Features to Check Out:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Accessible Anywhere:&lt;/strong&gt; Whether you're on a train or in your comfy home office, ProToPortal is there. It works seamlessly across all devices, ensuring that your brilliant ideas never slip away just because you're not at your desk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Code Evaluation and Fixing:&lt;/strong&gt; Forget about flipping between screens to debug. ProToPortal lets you directly evaluate and fix Julia code within the GUI. Imagine tweaking your code with just a few clicks—yes, it's that easy!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automatic Replies:&lt;/strong&gt; Streamline your workflow even further with automated responses. Set up ProToPortal to handle repetitive tasks, leaving you more time to focus on the creative aspects of coding.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  More Handy Features:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Editing Code Cells:&lt;/strong&gt; Quickly edit any of the messages right within the chat tab—just click and modify.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deleting Code Cells:&lt;/strong&gt; Made a mistake? No worries! Easily remove any unwanted messages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Saving Conversations:&lt;/strong&gt; Keep a history of your sessions for future reference or continued experimentation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a complete list of features, including detailed explanations and how-tos, be sure to check out &lt;a href="https://svilupp.github.io/ProToPortal.jl/dev"&gt;ProToPortal's Documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  See It in Action:
&lt;/h2&gt;

&lt;p&gt;Curious to see how it works in real time? &lt;/p&gt;

&lt;p&gt;Check out this video where it auto-fixes the generated code: &lt;a href="https://github.com/svilupp/ProToPortal.jl/blob/main/docs/src/videos/screen-capture-code-fixing.webm"&gt;Code fixing recording&lt;/a&gt; (webm format, so not all browsers may play it).&lt;/p&gt;

&lt;p&gt;And here's another cool feature—watch how ProToPortal handles editing a conversation (gif, easy to play): &lt;a href="https://github.com/svilupp/ProToPortal.jl/blob/9eb18346d056dd1b3b4d2202ce5da0b7be7ef8cb/docs/src/videos/screen-capture-plain.gif"&gt;Editing a conversation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Give It a Try!
&lt;/h2&gt;

&lt;p&gt;Ready to dive in? Visit &lt;a href="https://github.com/svilupp/ProToPortal.jl"&gt;ProToPortal's GitHub&lt;/a&gt; to get started. It's open source, and I'm eager to hear your thoughts or see your contributions!&lt;/p&gt;

&lt;p&gt;Thanks for checking out ProToPortal. Happy coding, and let the code be with you!&lt;/p&gt;




&lt;p&gt;Big thank you to the &lt;a href="https://genieframework.com/"&gt;Genie.jl&lt;/a&gt; team! This tool wouldn't exist without their amazing packages!&lt;/p&gt;

&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>generativeai</category>
      <category>genai</category>
      <category>genie</category>
    </item>
    <item>
      <title>Automatically Saving Conversations with PromptingTools.jl and AIHelpMe.jl</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Thu, 25 Apr 2024 20:36:42 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/automatically-saving-conversations-with-promptingtoolsjl-and-aihelpmejl-5fke</link>
      <guid>https://forem.julialang.org/svilupp/automatically-saving-conversations-with-promptingtoolsjl-and-aihelpmejl-5fke</guid>
      <description>&lt;h2&gt;
  
  
  Update 20/5
&lt;/h2&gt;

&lt;p&gt;From PromptingTools v0.26 onward, you can enable auto-saving of your conversations by running this one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;register_model!&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"gpt-3.5-turbo"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenAISchema&lt;/span&gt;&lt;span class="x"&gt;()&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TracerSchema&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SaverSchema&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See &lt;code&gt;?TracerSchema&lt;/code&gt; and &lt;code&gt;?SaverSchema&lt;/code&gt; for more details.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Learn how to automatically save conversations with PromptingTools.jl. By saving conversations, you can contribute to building a dataset for fine-tuning a Julia-specific language model. This tutorial provides code examples to get you started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Recently, there have been exciting discussions about fine-tuning a language model for the Julia programming language (see &lt;a href="https://discourse.julialang.org/t/an-llm-fine-tuned-for-julia-call-for-comments-help/113462/8"&gt;here&lt;/a&gt;). &lt;/p&gt;

&lt;p&gt;As part of this effort, we need a high-quality dataset of GOOD conversations related to Julia. One way to contribute to this effort is to start logging conversations with Large Language Models (LLMs) that are relevant to Julia. &lt;/p&gt;

&lt;p&gt;In this blog post, we will explore how to automatically save conversations using PromptingTools.jl and AIHelpMe.jl, a powerful Julia package for interacting with language models. By saving these conversations, we can build a valuable dataset for fine-tuning a Julia-specific language model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Defining a Custom Schema for Saving Conversations
&lt;/h2&gt;

&lt;p&gt;A lesser-known feature: PromptingTools has a custom callback system that lets you define custom schemas, which then call your own functions before and after each LLM call (it's used mostly for observability).&lt;/p&gt;

&lt;p&gt;To save conversations, we need to define a custom schema that wraps our normal prompt schema. We can do this by creating a new struct &lt;code&gt;SaverSchema&lt;/code&gt; that inherits from &lt;code&gt;PT.AbstractTracerSchema&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Dates&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;JSON3&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;SAVE_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"finetune_julia"&lt;/span&gt;

&lt;span class="nd"&gt;@kwdef&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="nc"&gt; SaverSchema&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;:&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AbstractTracerSchema&lt;/span&gt;
    &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AbstractPromptSchema&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any call through this schema triggers a call to the function &lt;code&gt;initialize_tracer&lt;/code&gt; before the LLM call and to &lt;code&gt;finalize_tracer&lt;/code&gt; after it.&lt;/p&gt;

&lt;p&gt;In our case, we want to overload the &lt;code&gt;finalize_tracer&lt;/code&gt; function to save the conversation after the LLM call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="nf"&gt; PT.finalize_tracer&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tracer_schema&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;SaverSchema&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;tracer&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;msg_or_conv&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; 
    &lt;span class="n"&gt;tracer_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kt"&gt;NamedTuple&lt;/span&gt;&lt;span class="x"&gt;(),&lt;/span&gt; 
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;# We already captured all kwargs, they are already in `tracer`, we can ignore tracer_kwargs in this implementation&lt;/span&gt;

    &lt;span class="n"&gt;time_received&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Dates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="x"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;"YYYYmmdd_HHMMSS"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;joinpath&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SAVE_DIR&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"conversation__&lt;/span&gt;&lt;span class="si"&gt;$(model)&lt;/span&gt;&lt;span class="s"&gt;__&lt;/span&gt;&lt;span class="si"&gt;$(time_received)&lt;/span&gt;&lt;span class="s"&gt;.json"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg_or_conv&lt;/span&gt; &lt;span class="k"&gt;isa&lt;/span&gt; &lt;span class="kt"&gt;AbstractVector&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="n"&gt;msg_or_conv&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;msg_or_conv&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save_conversation&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;msg_or_conv&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 1: Saving Conversations with &lt;code&gt;aigenerate&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Now that we have defined our custom schema, we can use it to save conversations with &lt;code&gt;aigenerate&lt;/code&gt;. We need to explicitly provide the &lt;code&gt;SaverSchema&lt;/code&gt; instance to &lt;code&gt;aigenerate&lt;/code&gt; along with the input prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SaverSchema&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OpenAISchema&lt;/span&gt;&lt;span class="x"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Say hi"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt3t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_all&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you call this function, it will save the conversation to the folder defined in &lt;code&gt;SAVE_DIR&lt;/code&gt;.&lt;/p&gt;
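&lt;p&gt;One practical detail (my assumption; &lt;code&gt;save_conversation&lt;/code&gt; may not create the folder for you): make sure the target folder exists before the first call:&lt;/p&gt;

```julia
const SAVE_DIR = "finetune_julia"   # same constant as defined above

# Create the save directory if it does not exist yet (no-op if it does)
mkpath(SAVE_DIR)
```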

&lt;p&gt;One gotcha: if you send multiple messages in the same saved conversation, every turn will be saved to a separate file.&lt;br&gt;
The easiest approach is to ignore this and resolve it in post-processing (&lt;code&gt;AIMessage&lt;/code&gt;s have unique IDs, so continued conversations are easy to detect).&lt;br&gt;
Alternatively, you can include a hash of the content of the first 2-3 messages in the filename to clearly mark continued conversations.&lt;/p&gt;
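&lt;p&gt;The hashing idea could be sketched like this, using the SHA standard library. The &lt;code&gt;conversation_id&lt;/code&gt; helper, the choice of the first two messages, and the 8-character prefix are all my own choices for illustration, not part of PromptingTools:&lt;/p&gt;

```julia
using SHA

# Derive a short, stable ID from the first two message contents so that all
# turns of one conversation share a filename prefix (hypothetical helper)
conversation_id(contents::Vector{String}) =
    bytes2hex(sha1(join(first(contents, 2), "|")))[1:8]

id = conversation_id(["You are a helpful assistant.", "Say hi"])
# e.g. use it as: "conversation__$(model)__$(id)__$(time_received).json"
```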
&lt;h3&gt;
  
  
  Example 2: Registering a Traced Model
&lt;/h3&gt;

&lt;p&gt;Instead of providing the custom schema every time, we can register a traced model with the custom schema. This way, we can use the model name instead of the schema instance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;# Overwrite the schema for this model and define a nice alias&lt;/span&gt;
&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;register_model!&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt-3.5-turbo"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MODEL_ALIASES&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"gpt3t"&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"gpt-3.5-turbo"&lt;/span&gt;

&lt;span class="c"&gt;# Notice the return_all -&amp;gt; we need to return ALL messages, it would be a useless record otherwise&lt;/span&gt;
&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Say hi"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt3t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_all&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The conversation is saved automatically, with no extra arguments needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Loading Conversations
&lt;/h3&gt;

&lt;p&gt;Once we have saved conversations, we can load them back into Julia using &lt;code&gt;load_conversation&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;conv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;load_conversation&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"finetune_julia/conversation__gpt3t__20240425_205853.json"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Exporting Conversations in ShareGPT Format
&lt;/h3&gt;

&lt;p&gt;Once we have enough conversations, we will want to export them so our fine-tuning tool can use them.&lt;br&gt;
I would highly recommend Axolotl (see an example from &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard/blob/main/experiments/cheater-7b-finetune/lora.yml"&gt;my finetune&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Axolotl can work with instructions (conversations) in ShareGPT format. This is how you can export multiple conversations into the required JSONL file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"System message 1"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; 
         &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"User message"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; 
         &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AIMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AI message"&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;conv2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"System message 2"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; 
         &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"User message"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; 
         &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AIMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AI message"&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;joinpath&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"finetune_julia"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"export_sharegpt.jsonl"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;save_conversations&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conv2&lt;/span&gt;&lt;span class="x"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Saving AIHelpMe Conversations
&lt;/h2&gt;

&lt;p&gt;If you use AIHelpMe, you're also generating loads of interesting data!&lt;br&gt;
The simplest thing for auto-logging your questions is to wrap the entry function &lt;code&gt;aihelp&lt;/code&gt; and serialize the whole &lt;code&gt;RAGResult&lt;/code&gt; (it has all the diagnostics and underlying information):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="nf"&gt; aih&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aihelp&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;return_all&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Dates&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="x"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;"YYYYmmdd_HHMMSS"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;JSON3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;joinpath&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SAVE_DIR&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"aihelp__&lt;/span&gt;&lt;span class="si"&gt;$(dt)&lt;/span&gt;&lt;span class="s"&gt;.json"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To use it, you would replace &lt;code&gt;aihelp("some question...")&lt;/code&gt; with &lt;code&gt;aih("some question...")&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The serialized &lt;code&gt;RAGResult&lt;/code&gt; is roughly 200 kB, but it provides a lot of helpful detail about your question.&lt;br&gt;
If you want to save space, save just the individual conversations in &lt;code&gt;result.conversations&lt;/code&gt;.&lt;/p&gt;
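
&lt;p&gt;As a minimal sketch (assuming the entries of &lt;code&gt;result.conversations&lt;/code&gt; are vectors of messages; adjust the field access to your version):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;# Sketch: persist just the message exchanges instead of the full RAGResult
# (assumes `result.conversations` holds vectors of messages)
path = joinpath(SAVE_DIR, "aihelp_conversations.jsonl")
PT.save_conversations(path, collect(values(result.conversations)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;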

&lt;h2&gt;
  
  
  Sharing The Conversations
&lt;/h2&gt;

&lt;p&gt;Where to share these? To be discussed. Come join us on &lt;a href="https://discourse.julialang.org/t/an-llm-fine-tuned-for-julia-call-for-comments-help/113462/8"&gt;Discourse&lt;/a&gt; or on Julia Slack in #generative-ai.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this blog post, we have seen how to automatically save conversations using PromptingTools.jl. By defining a custom schema and overloading the &lt;code&gt;finalize_tracer&lt;/code&gt; function, we can save conversations to files. We can also register a traced model and use it to generate text. Finally, we can load and export conversations in ShareGPT format for finetuning. With AIHelpMe.jl, we can serialize the whole &lt;code&gt;RAGResult&lt;/code&gt; with JSON3.&lt;/p&gt;

&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>finetune</category>
    </item>
    <item>
      <title>The Hidden Cost of Locally Hosted Models: A Case Study</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Sat, 20 Apr 2024 10:08:04 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/the-hidden-cost-of-locally-hosted-models-a-case-study-5871</link>
      <guid>https://forem.julialang.org/svilupp/the-hidden-cost-of-locally-hosted-models-a-case-study-5871</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Locally-hosted AI models may appear free, but they cost you valuable time—over 10 hours a year in our case study. Switch to a commercial API like Groq to save time, boost productivity, and gain nearly three extra days of coding annually for a dollar!&lt;/p&gt;

&lt;h2&gt;
  
  
  Would You Pay a Dollar to Buy 3 Extra Days This Year?
&lt;/h2&gt;

&lt;p&gt;Imagine you could buy time. Not in a metaphorical sense, but literally reclaim hours of your life lost to waiting. For those of us using locally hosted models for ad-hoc productivity tasks like coding assistance, this isn't just a daydream—it's a decision we face every day.&lt;/p&gt;

&lt;h2&gt;
  
  
  Appreciating the Open-Source AI Ecosystem
&lt;/h2&gt;

&lt;p&gt;First, let's give credit where it's due. The thriving open-source ecosystem in generative AI deserves a massive shoutout. Organizations like Meta and Mistral have opened up their models, and platforms like Ollama and Llama.cpp have made these tools accessible for local use. This democratization of technology is nothing short of revolutionary. However, it's crucial to discuss the true cost of operating these technologies locally (by individuals, for ad-hoc tasks).&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hidden Costs of Local Hosting
&lt;/h2&gt;

&lt;p&gt;While the price tag on locally-hosted models might read "free," the reality is anything but. These models often underperform compared to their cloud-hosted counterparts (especially if you're GPU-poor), or they make you wait longer, and sometimes both. For example, using a locally-hosted model like Mixtral on Ollama, you might wait 20 seconds for a response that a commercial provider like Groq or Together could deliver in less than a second.&lt;/p&gt;

&lt;h3&gt;
  
  
  Case Study: Daily Coding Assistance
&lt;/h3&gt;

&lt;p&gt;Let's break it down with a simple case study. Assume you're a developer making three LLM calls per hour during a three-hour coding session, each day for 250 days a year. That's 2250 LLM calls.&lt;/p&gt;

&lt;p&gt;With Ollama, a 20-second wait per call accumulates to over 12 hours spent just waiting annually. &lt;/p&gt;

&lt;p&gt;In contrast, using Groq's API, even with an extremely conservative 3-second wait (Llama 3 70b, a GPT-4-level model), you'd spend less than 2 hours waiting over the same period.&lt;/p&gt;

&lt;p&gt;The difference? &lt;strong&gt;More than 10 hours&lt;/strong&gt; saved—or, put another way, over 3 extra days of productive coding time each year. &lt;/p&gt;
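
&lt;p&gt;The arithmetic is easy to check for your own usage pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;calls_per_year = 3 * 3 * 250                  # 3 calls/hour x 3 hours/day x 250 days = 2250
local_wait_h   = calls_per_year * 20 / 3600   # 20 s per call -&amp;gt; 12.5 hours
cloud_wait_h   = calls_per_year * 3 / 3600    # 3 s per call -&amp;gt; ~1.9 hours
saved_days     = (local_wait_h - cloud_wait_h) / 3   # ~3.5 three-hour coding days
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;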

&lt;p&gt;And the cost of this extra time? Right now, it's FREE! Even under the announced pricing, it would be &lt;strong&gt;about \$1.5&lt;/strong&gt; per year.&lt;/p&gt;

&lt;p&gt;Moreover, with Groq, we assumed a GPT-4-level model! So you would likely benefit even more from much better answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Choose Cloud Providers?
&lt;/h2&gt;

&lt;p&gt;Given &lt;strong&gt;you have roughly 4,000 weeks on this earth&lt;/strong&gt;, spending any of them waiting on your GPU seems like a poor use of time. In a way, time is the scarcest resource, yet you throw it away to save fractions of a cent.&lt;/p&gt;

&lt;p&gt;Furthermore, you might lose out on innovations. Cloud providers continually upgrade their services with faster and more powerful models without requiring any effort on your part. Meanwhile, changing your local setup is a significant investment and it has its limits (VRAM...).&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Start?
&lt;/h2&gt;

&lt;p&gt;Switching is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Sign up for the &lt;a href="https://console.groq.com/keys"&gt;Groq API&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Set up your environment variable &lt;code&gt;GROQ_API_KEY&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Use PromptingTools.jl with a Groq-hosted Llama3 70b, which I aliased with "gl70" (Groq Llama 70). This alias helps save time even when typing!&lt;/li&gt;
&lt;/ol&gt;
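
&lt;p&gt;For step 3, you can register the alias yourself, following the &lt;code&gt;MODEL_ALIASES&lt;/code&gt; pattern from PromptingTools.jl (the Groq model identifier below is an assumption; check the model registry for the exact name):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using PromptingTools
const PT = PromptingTools

# Register a short alias for the Groq-hosted Llama 3 70b
# ("llama3-70b-8192" is Groq's model id at the time of writing -- verify it)
PT.MODEL_ALIASES["gl70"] = "llama3-70b-8192"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;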

&lt;h3&gt;
  
  
  Example Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;
&lt;span class="c"&gt;# Assumes you have set the environment variable GROQ_API_KEY&lt;/span&gt;

&lt;span class="n"&gt;ai&lt;/span&gt;&lt;span class="s"&gt;"In Julia, write a function `clean_names` that cleans up column names of a DataFrame"&lt;/span&gt;&lt;span class="n"&gt;gl70&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Info: Tokens: 411 @ Cost: \$0.0003 in 2.7 seconds
AIMessage("Here is a Julia function `clean_names` that cleans up column names of a DataFrame:
```

julia
using DataFrames
&amp;lt;...continues&amp;gt;


```
```

`

This simple setup can drastically cut down your waiting time, freeing up days for you to spend on more fulfilling activities or further innovation.

If you're familiar with the [PromptingTools.jl](https://github.com/svilupp/PromptingTools.jl) package, you know you can even set up an auto-fixing loop that will execute the generated code, analyze the error for feedback and retry automatically to fix any errors with Monte Carlo Tree Search (see `?airetry!` for more details).

```julia
using PromptingTools.Experimental.AgentTools: AIGenerate, run!, AICode
using PromptingTools.Experimental.AgentTools: airetry!, aicodefixer_feedback

result = AIGenerate(
    "In Julia, write a function `clean_names` that cleans up column names of a DataFrame";
    model = "gl70") |&amp;gt; run!

success_func(aicall) = AICode(aicall.conversation[end]) |&amp;gt; isvalid
feedback_func(aicall) = aicodefixer_feedback(aicall.conversation).feedback
airetry!(success_func, result, feedback_func; max_retries = 3)
```


## In Conclusion

While the allure of "free" local hosting is strong, the hidden costs in time can be substantial. By opting for a commercial solution like Groq's API, not only do you reclaim time lost to waiting, but you also benefit from superior model performance. The investment is minimal compared to the time you buy back—time that could be spent innovating, creating, or just enjoying life. Isn't that worth considering?

If you're looking to try, do it now while Groq is free!! [Get your API key here](https://console.groq.com/keys).

## Appendix

I made a claim that Llama 3 70b is a GPT-4 level model, check out our Leaderboard [here](https://svilupp.github.io/Julia-LLM-Leaderboard/dev/examples/summarize_results_local/#Model-Comparison) to see the results in an out-of-sample benchmark.

Credit for the title image goes to DALL-E 3.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>The Latest Scoop on PromptingTools.jl</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Fri, 05 Apr 2024 20:36:34 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/the-latest-scoop-on-promptingtoolsjl-n2n</link>
      <guid>https://forem.julialang.org/svilupp/the-latest-scoop-on-promptingtoolsjl-n2n</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;PromptingTools.jl just got a hefty update with several new versions, packed with new models, enhanced AI tools, and easier dataset prep, all thanks to a coffee-fueled solo developer who's now inviting others to join the coding party through GitHub issues. Dive in, contribute, and let's make magic together!&lt;/p&gt;

&lt;h2&gt;
  
  
  Dive Into the Latest and (Maybe) Greatest PromptingTools.jl Updates!
&lt;/h2&gt;

&lt;p&gt;Hello, Julia enthusiasts and AI aficionados! We've been busy tinkering in the Julia workshop, and guess what? We’ve rolled out not one, not two, but EIGHT new sub-versions of &lt;a href="https://github.com/svilupp/PromptingTools.jl"&gt;PromptingTools.jl&lt;/a&gt;! That’s right, we’ve been on a coding spree, fueled by too much coffee and an unyielding passion for making your lives a tad easier (and ours a bit more caffeinated).&lt;/p&gt;

&lt;p&gt;Start with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Pkg&lt;/span&gt;
&lt;span class="n"&gt;Pkg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"PromptingTools"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;
&lt;span class="n"&gt;ai&lt;/span&gt;&lt;span class="s"&gt;"How could I have lived without PromptingTools.jl for so long?"&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 138 @ Cost: $0.0002 in 5.4 seconds&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("It's great to hear that you've found PromptingTools.jl to be a valuable tool! PromptingTools.jl is designed to streamline your workflow...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;strong&gt;What’s Fresh in PromptingTools.jl?&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Supercharged Model Shenanigans&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AIGenerate Meets Anthropic:&lt;/strong&gt; Ready for a text generation party? Anthropic API is now on the guest list with its cool aliases – bring on the AI prose!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Extraction Gets Anthropic:&lt;/strong&gt; Ever wished for Claude from Anthropic to help you with data extraction? Wish granted! Now you can summon Claude 3 to pull out data like a digital magician. In other news, &lt;strong&gt;Claude 3 Haiku is amazing at parties&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GoogleGenAI Enters the Chat:&lt;/strong&gt; With a GOOGLE_API_KEY, you can now conjure up content with Google's Gemini model. It’s like having a Google genie but for AI text generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model Registry Bonanza:&lt;/strong&gt; We’ve added some fancy new models to our registry like “nomic-embed-text” and “mxbai-embed-large.” Because who doesn’t like more toys in their AI sandbox?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Revamped RAGTools: Sleek, Fast, and Flexible
&lt;/h3&gt;

&lt;p&gt;RAGTools has received a major upgrade, making your AI adventures smoother and speedier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;RAGTools Goes Binary:&lt;/strong&gt; Think binary is just for computers and that one friend who can only answer in yes/no? Well, now RAGTools speaks binary too, for embeddings that zip and zoom faster than you can say “BinaryCosineSimilarity()”. You should read this &lt;a href="https://huggingface.co/blog/embedding-quantization#binary-quantization-in-vector-databases"&gt;blog&lt;/a&gt;. There is a benchmark blog post upcoming!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Customizable RAG Interface:&lt;/strong&gt; For those who love to tweak and tailor, the new RAG interface is a game-changer. With &lt;code&gt;retrieve&lt;/code&gt; and &lt;code&gt;generate!&lt;/code&gt; functions and all their sub-steps properly separated and documented, you now have the power to craft a RAG pipeline that perfectly fits your project's needs, offering unparalleled flexibility in how you approach AI-driven tasks. See the documentation for more details.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Debugging &amp;amp; Analysis Tools:&lt;/strong&gt; We’ve introduced pretty-printing and support annotations because sometimes you need to read AI-generated content without squinting and sometimes you want to know when the model is lying to you.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Dataset Prep &amp;amp; Nifty Utilities&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Easier Dataset Prep:&lt;/strong&gt; We made dataset prep as easy as pie. Sadly, it doesn’t come with actual pie. But, JSONL format export? Yum!&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Docs &amp;amp; Tools Galore:&lt;/strong&gt; Dive into our expanded docs with an "Extra Tools" section that’s like finding an extra fry at the bottom of the bag. Plus, FAQs to guide you through the thicket of common woes.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;For the Adventurous Souls&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dabble in AI Art:&lt;/strong&gt; Feeling artsy? Our experimental support for image generation with DALL-E models lets you channel your inner digital Picasso.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Improvements &amp;amp; Bug Squashing&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We’ve polished, tweaked, and outright cajoled PromptingTools.jl into a better version of itself, all while fixing those pesky bugs that love to play hide and seek.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Looking Forward (With Goggles On)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We're as excited as a kid in a candy store about these updates and can’t wait to see what you'll build, debug, or accidentally break with them. Your projects are the real MVPs here, and we’re just here to supply the tools (and occasionally entertain).&lt;/p&gt;

&lt;p&gt;So, grab the latest version of PromptingTools.jl, unleash your creativity, and remember: in the world of coding, the journey is half the fun, and the other half? Well, that’s debugging. Happy coding, and may your coffee be strong and your bugs few!&lt;/p&gt;

&lt;h2&gt;
  
  
  A Solo Journey (But Open to Hitchhikers):
&lt;/h2&gt;

&lt;p&gt;Plot twist: PromptingTools.jl has been mostly a solo quest. However, I’ve started mapping out all my wild ideas and to-dos on GitHub as &lt;a href="https://github.com/svilupp/PromptingTools.jl/issues"&gt;issues&lt;/a&gt;, inviting anyone keen to pitch in or tackle something that sparks their interest. It’s a chance to dive into the fray and help shape the future of this project. So, if you’re up for a bit of coding camaraderie, come join the adventure! The ultimate goal is to stabilize the functionality and interfaces and transfer the package to the JuliaGenAI organization for faster development.&lt;/p&gt;

&lt;p&gt;EDIT: The longer-term hope is to write an agent that will create all the PRs automatically :)&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>promptingtools</category>
    </item>
    <item>
      <title>Empowering AI with Knowledge: The New RAG Interface in PromptingTools</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Fri, 05 Apr 2024 20:05:12 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/empowering-ai-with-knowledge-the-new-rag-interface-in-promptingtools-3n5n</link>
      <guid>https://forem.julialang.org/svilupp/empowering-ai-with-knowledge-the-new-rag-interface-in-promptingtools-3n5n</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;The new RAGTools module in the PromptingTools.jl package introduces enhanced modularity and straightforward extension capabilities, enabling developers and researchers to easily customize and build Retrieval-Augmented Generation (RAG) systems tailored to their specific needs in Julia.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The introduction of Retrieval-Augmented Generation (RAG) systems addresses key challenges in generative AI, notably the tendency to lack crucial information and produce hallucinated content. By integrating external knowledge, RAG systems significantly enhance the accuracy and reliability of AI-generated responses. &lt;/p&gt;

&lt;p&gt;The &lt;code&gt;RAGTools&lt;/code&gt; module within the PromptingTools.jl package enables the creation of such systems, offering a path to mitigate these issues. As this module matures, plans are in place to transition it into its own dedicated package, further facilitating the development and adoption of RAG systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  RAGTools Module: A Primer
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;RAGTools&lt;/code&gt; offers an experimental but formidable suite of utilities designed to facilitate the crafting of RAG applications with minimal fuss. Central to its arsenal is the &lt;code&gt;airag&lt;/code&gt; function, a master orchestrator that seamlessly combines AI insights with user-curated knowledge, unlocking new dimensions of accuracy and relevance in answers.&lt;/p&gt;

&lt;p&gt;You can get started very quickly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;# required dependencies to load the necessary extensions&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;LinearAlgebra&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SparseArrays&lt;/span&gt; 
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Experimental&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RAGTools&lt;/span&gt;
&lt;span class="c"&gt;# to access unexported functionality&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;RT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Experimental&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RAGTools&lt;/span&gt;

&lt;span class="c"&gt;## Sample data&lt;/span&gt;
&lt;span class="n"&gt;sentences&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;
    &lt;span class="s"&gt;"Search for the latest advancements in quantum computing using Julia language."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"How to implement machine learning algorithms in Julia with examples."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Looking for performance comparison between Julia, Python, and R for data analysis."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Find Julia language tutorials focusing on high-performance scientific computing."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Search for the top Julia language packages for data visualization and their documentation."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"How to set up a Julia development environment on Windows 10."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Discover the best practices for parallel computing in Julia."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Search for case studies of large-scale data processing using Julia."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Find comprehensive resources for mastering metaprogramming in Julia."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Looking for articles on the advantages of using Julia for statistical modeling."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"How to contribute to the Julia open-source community: A step-by-step guide."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Find the comparison of numerical accuracy between Julia and MATLAB."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Looking for the latest Julia language updates and their impact on AI research."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"How to efficiently handle big data with Julia: Techniques and libraries."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Discover how Julia integrates with other programming languages and tools."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Search for Julia-based frameworks for developing web applications."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Find tutorials on creating interactive dashboards with Julia."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"How to use Julia for natural language processing and text analysis."&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"Discover the role of Julia in the future of computational finance and econometrics."&lt;/span&gt;
&lt;span class="x"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;sources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="s"&gt;"Doc&lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;i"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## Build the index&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;build_index&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;chunker_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## Generate an answer&lt;/span&gt;
&lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"What are the best practices for parallel computing in Julia?"&lt;/span&gt;

&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;airag&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="c"&gt;# short for airag(RAGConfig(), index; question)&lt;/span&gt;
&lt;span class="c"&gt;## Output:&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Done with RAG. Total cost: \$0.0&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("Some best practices for parallel computing in Julia include us...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Unveiling New Functionalities
&lt;/h2&gt;

&lt;p&gt;The latest update to the &lt;code&gt;RAGTools&lt;/code&gt; module introduces key features that enhance the creation and analysis of RAG systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Modular Interface&lt;/strong&gt;: The RAG pipeline is now broken down into distinct components (see details below), allowing users to customize and extend each phase with ease. Simply define a new type and method for only the components you wish to modify.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pipeline Transparency&lt;/strong&gt;: Users can now view detailed background information on the RAG pipeline, including the sources selected and the process at each stage (use &lt;code&gt;return_all=true&lt;/code&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Advanced RAG Functionality&lt;/strong&gt;: Default pipeline configuration now comes with question rephrasing, reranking results, and two-step answer refinement. There is even a &lt;code&gt;postprocess&lt;/code&gt; placeholder, so you can add some logging or other transformations. You can easily switch between different implementations thanks to Julia's method dispatch while calling the same top-level function.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Answer Annotation&lt;/strong&gt;: The final answers can be annotated with hallucination scores, showing the overlap with source materials and indicating the origin of specific information within the answer.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
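
&lt;p&gt;To illustrate the modular interface, here is a rough sketch of swapping in a custom rephrasing step. The abstract type and function names below are assumptions based on the pattern described; consult the RAGTools documentation for the exact interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using PromptingTools
const RT = PromptingTools.Experimental.RAGTools

# 1) Define a new type for the one step you want to customize
#    (type and function names are assumptions; check the docs)
struct KeywordRephraser &amp;lt;: RT.AbstractRephraser end

# 2) Add a method only for that step (toy logic: keep the longer words)
function RT.rephrase(::KeywordRephraser, question::AbstractString; kwargs...)
    return [join(filter(w -&amp;gt; length(w) &amp;gt; 3, split(question)), " ")]
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The rest of the pipeline stays untouched; thanks to method dispatch, you would simply pass your rephraser through the retriever's configuration.&lt;/p&gt;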

&lt;h3&gt;
  
  
  Answer Annotation Example
&lt;/h3&gt;

&lt;p&gt;With a small change, you can see which sources were used for each sentence in the answer (&lt;code&gt;[1]&lt;/code&gt;), how strongly they were supported (&lt;code&gt;[..,0.9]&lt;/code&gt;), and which "unknown" words were highlighted in magenta:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;airag&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_all&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://forem.julialang.org/images/pks7k2FFxLXxulHMMaqj3x-zwMY9OtSOkdQugqz70zQ/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzZj/Mm13bDg1NGVlZzRw/MmNodmY0LnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/pks7k2FFxLXxulHMMaqj3x-zwMY9OtSOkdQugqz70zQ/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzZj/Mm13bDg1NGVlZzRw/MmNodmY0LnBuZw" alt="Annotated answer" width="800" height="315"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You immediately see that while a lot of the package names and macros look sensible, they did NOT come from our trusted knowledge base (all highlighted in magenta). In real life, we would also have clearly labelled source links that we could verify with one click.&lt;/p&gt;

&lt;p&gt;The annotation system is fully customizable (bring your own logic, styles, etc.).&lt;br&gt;
You can also obtain this information in HTML format to easily show it in your Genie apps!&lt;/p&gt;
&lt;h2&gt;
  
  
  A Closer Look at the Modular Interface
&lt;/h2&gt;

&lt;p&gt;At the heart of the new &lt;code&gt;RAGTools&lt;/code&gt; interface is its modular design, encouraging the interchange of pipeline components. This approach allows for extensive customization at every stage, from data preparation to answer generation, ensuring that developers can easily adapt the system to meet their specific needs.&lt;/p&gt;

&lt;p&gt;This system is designed for information retrieval and response generation, structured in three main phases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Preparation&lt;/strong&gt;, when you create an instance of &lt;code&gt;AbstractIndex&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt;, when you surface the most relevant chunks/items in the index and return an &lt;code&gt;AbstractRAGResult&lt;/code&gt;, which contains the references to the chunks (&lt;code&gt;AbstractCandidateChunks&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation&lt;/strong&gt;, when you generate an answer based on the context built from the retrieved chunks, returning either an &lt;code&gt;AIMessage&lt;/code&gt; or an &lt;code&gt;AbstractRAGResult&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The associated methods are: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;build_index&lt;/code&gt;&lt;/strong&gt;: Indexes relevant documents for retrieval.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;retrieve&lt;/code&gt;&lt;/strong&gt;: Selects pertinent information chunks based on the query.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;generate!&lt;/code&gt;&lt;/strong&gt;: Produces the final answer using the retrieved data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;airag&lt;/code&gt; is simply a wrapper around &lt;code&gt;retrieve&lt;/code&gt; and &lt;code&gt;generate!&lt;/code&gt;, providing a convenient way to execute the entire RAG pipeline in one go.&lt;/p&gt;
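&lt;p&gt;To make the split concrete, here is a minimal sketch of the two-step pipeline versus the &lt;code&gt;airag&lt;/code&gt; wrapper. The file paths and the question are purely illustrative, and it assumes the API key for your default model is already configured:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using PromptingTools.Experimental.RAGTools

# Preparation: index your documents (illustrative file paths)
index = build_index(["docs/intro.txt", "docs/api.txt"])

question = "How do I define a custom show method?"

# Retrieval + Generation, step by step...
result = retrieve(index, question)
result = generate!(index, result)

# ...or in one go via the convenience wrapper
result = airag(index; question, return_all = true)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;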

&lt;p&gt;&lt;a href="https://forem.julialang.org/images/oiD_v0fxpNQ6oFQlfL9X3NvXDrQZyRBZ-o1iv3YBuhA/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9zaW1s/LmVhcnRoL1Byb21w/dGluZ1Rvb2xzLmps/L2Rldi9hc3NldHMv/cmFnX2RpYWdyYW1f/aGlnaGxldmVsLkRf/YUx1Z01MLnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/oiD_v0fxpNQ6oFQlfL9X3NvXDrQZyRBZ-o1iv3YBuhA/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9zaW1s/LmVhcnRoL1Byb21w/dGluZ1Rvb2xzLmps/L2Rldi9hc3NldHMv/cmFnX2RpYWdyYW1f/aGlnaGxldmVsLkRf/YUx1Z01MLnBuZw" alt="High-level diagram" width="800" height="277"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that the first argument is always the main dispatching parameter that you can use to customize the behavior of the pipeline. This design ensures that users can easily swap out components or extend the system without disrupting the overall functionality.&lt;/p&gt;
&lt;h2&gt;
  
  
  RAG Pipeline Workflow
&lt;/h2&gt;

&lt;p&gt;The RAG pipeline is structured into distinct stages, each comprising several critical sub-steps to ensure the generation of accurate and relevant answers.&lt;/p&gt;

&lt;p&gt;If you want to change the behavior of any step, you can define a new type and method for that step.&lt;/p&gt;

&lt;p&gt;All customizations are subtypes of the corresponding abstract types, so use the &lt;code&gt;subtypes&lt;/code&gt; function to discover the currently available implementations, eg, &lt;code&gt;subtypes(AbstractReranker)&lt;/code&gt;.&lt;/p&gt;
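&lt;p&gt;For instance, to see which rerankers are currently available (the exact list depends on your installed version):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using PromptingTools
const RT = PromptingTools.Experimental.RAGTools

# Discover the available implementations of a given step
subtypes(RT.AbstractReranker)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;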
&lt;h3&gt;
  
  
  Preparation Phase
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;build_index&lt;/code&gt;&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;get_chunks&lt;/code&gt;: Segments documents into manageable chunks.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_embeddings&lt;/code&gt;: Generates embeddings for similarity searches.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_tags&lt;/code&gt;: Tags chunks for efficient filtering.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Retrieval Phase
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;retrieve&lt;/code&gt;&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;rephrase&lt;/code&gt;: Optionally rephrases queries for better matching.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;find_closest&lt;/code&gt;: Identifies the most relevant document chunks.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;find_tags&lt;/code&gt;: Filters chunks based on specific tags.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rerank&lt;/code&gt;: Reranks chunks to prioritize the best matches.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Generation Phase
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;generate!&lt;/code&gt;&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;build_context!&lt;/code&gt;: Constructs the context from selected chunks for the answer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;answer!&lt;/code&gt;: Generates a preliminary answer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;refine!&lt;/code&gt;: Refines the answer for clarity and relevance.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;postprocess!&lt;/code&gt;: Applies final touches to prepare the response.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
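&lt;p&gt;The same type-and-method pattern applies to these sub-steps. As a sketch, here is a refiner that skips the refinement step entirely (the &lt;code&gt;AbstractRefiner&lt;/code&gt; name and the &lt;code&gt;refine!&lt;/code&gt; signature are assumed from the pipeline's naming pattern; verify them with &lt;code&gt;subtypes&lt;/code&gt; in your version):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;import PromptingTools.Experimental.RAGTools: refine!
const RT = PromptingTools.Experimental.RAGTools

# A no-op refiner: keep the first answer as-is
struct NoRefine &amp;lt;: RT.AbstractRefiner end
refine!(::NoRefine, index, result; kwargs...) = result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;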

&lt;p&gt;A visual summary with the corresponding types:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://forem.julialang.org/images/C5em0T_--QZwn1cgdD4JXvHL6cVJ1VTcuSVZt_ZPZHw/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL2Zl/a3Q3am83MTkycGRv/ZTJkcnp1LnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/C5em0T_--QZwn1cgdD4JXvHL6cVJ1VTcuSVZt_ZPZHw/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL2Zl/a3Q3am83MTkycGRv/ZTJkcnp1LnBuZw" alt="Detailed diagram" width="800" height="1071"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Where to Start: Quick, Experiment, or Customize
&lt;/h3&gt;

&lt;p&gt;To operate the RAG system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Quick Start&lt;/strong&gt;: Utilize &lt;code&gt;airag&lt;/code&gt; for an immediate, out-of-the-box solution, suitable for rapid testing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Experimentation&lt;/strong&gt;: Leverage &lt;code&gt;RAGConfig&lt;/code&gt; to try out different implementations of &lt;code&gt;airag&lt;/code&gt;, tweaking the system for better performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customization&lt;/strong&gt;: Dive into &lt;code&gt;retrieve&lt;/code&gt; and &lt;code&gt;generate!&lt;/code&gt; for detailed customization, tailoring the process to your precise requirements.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  How to Customize the Pipeline
&lt;/h3&gt;

&lt;p&gt;If you want to customize the behavior of any step, define a new type and a corresponding method for the step you're changing, eg, to introduce a new reranker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;PromptingTools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Experimental&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RAGTools&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rerank&lt;/span&gt;

&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="nc"&gt; MyReranker&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;:&lt;/span&gt; &lt;span class="n"&gt;AbstractReranker&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="n"&gt;rerank&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;MyReranker&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then you would set the &lt;code&gt;retrieve&lt;/code&gt; step to use your custom &lt;code&gt;MyReranker&lt;/code&gt; via the &lt;code&gt;reranker&lt;/code&gt; keyword argument, eg, &lt;code&gt;retrieve(....; reranker = MyReranker())&lt;/code&gt; (or customize the top-level dispatching &lt;code&gt;AbstractRetriever&lt;/code&gt; struct).&lt;/p&gt;

&lt;h3&gt;
  
  
  Passing Keyword Arguments to Customize the Pipeline
&lt;/h3&gt;

&lt;p&gt;When you need to adjust specific aspects of the RAG pipeline, keyword arguments (kwargs) allow for targeted modifications. This approach is especially useful for customizing individual components within the system.&lt;/p&gt;

&lt;p&gt;To pinpoint the right keyword arguments (kwargs) for customization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consult the Diagram&lt;/strong&gt;: Review the RAG pipeline diagram or documentation. Identify the component you want to adjust.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use the Format&lt;/strong&gt;: Apply &lt;code&gt;&amp;lt;dispatch_type&amp;gt;&lt;/code&gt; + &lt;code&gt;_kwargs&lt;/code&gt; for direct customizations. For nested adjustments, use prefixes that reflect the hierarchy (e.g., &lt;code&gt;retriever_kwargs -&amp;gt; rephraser_kwargs -&amp;gt; template&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach allows for precise tweaks at any level of the pipeline, ensuring your modifications target exactly what you need.&lt;/p&gt;

&lt;p&gt;Practically, for a broad configuration, you might start with a &lt;code&gt;RAGConfig&lt;/code&gt; instance, specifying components like the &lt;code&gt;AdvancedRetriever&lt;/code&gt; to enhance retrieval capabilities. Preparing kwargs in advance facilitates managing the intricacies of nested configurations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;RAGConfig&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AdvancedRetriever&lt;/span&gt;&lt;span class="x"&gt;())&lt;/span&gt;

&lt;span class="c"&gt;# Organize kwargs for clarity and manageability&lt;/span&gt;
&lt;span class="n"&gt;kwargs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AdvancedRetriever&lt;/span&gt;&lt;span class="x"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;retriever_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;top_n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;rephraser_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="n"&gt;RAGQueryHyDE&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"custom-model"&lt;/span&gt;
        &lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="x"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;generator_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;answerer_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"custom-answer-model"&lt;/span&gt;
        &lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="x"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:8080"&lt;/span&gt;
    &lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Execute with prepared arguments&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;airag&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In scenarios where direct interaction with components like the retriever is needed, configure its kwargs similarly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;retriever_kwargs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;top_n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rephraser_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="n"&gt;RAGQueryHyDE&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"custom-model"&lt;/span&gt;
    &lt;span class="x"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:8080"&lt;/span&gt;
    &lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Apply to the retriever function directly&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;retrieve&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AdvancedRetriever&lt;/span&gt;&lt;span class="x"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;retriever_kwargs&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Delving deeper into the pipeline, for tasks such as rephrasing, specific kwargs can be directly applied to fine-tune the operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;rephrase_kwargs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"custom-model"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=:&lt;/span&gt;&lt;span class="n"&gt;RAGQueryHyDE&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"http://localhost:8080"&lt;/span&gt;
    &lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Customize the rephrase step&lt;/span&gt;
&lt;span class="n"&gt;rephrased_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rephrase&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SimpleRephraser&lt;/span&gt;&lt;span class="x"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;rephrase_kwargs&lt;/span&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structured approach to passing kwargs ensures that each stage of the RAG pipeline can be precisely controlled and customized, allowing for a tailored question-answering system that meets specific needs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using Custom Indexes or Vector Databases
&lt;/h3&gt;

&lt;p&gt;The default &lt;code&gt;RAGTools&lt;/code&gt; implementation uses an in-memory index suitable for datasets of up to 100,000 chunks. For larger datasets or specific indexing needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Define a Custom Index&lt;/strong&gt;: Create a new index by extending &lt;code&gt;AbstractChunkIndex&lt;/code&gt;. Use the &lt;code&gt;ChunkIndex&lt;/code&gt; as a guide for required fields.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Customize Interaction Methods&lt;/strong&gt;: Implement new methods for your index to integrate with the retrieval process of the RAG pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Share Your Implementation&lt;/strong&gt;: Contributions of integrations with common vector databases are welcome. They enrich the community's resources, enabling more versatile RAG applications.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You would use the same approach to build a hybrid index (semantic search + BM25).&lt;/p&gt;
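&lt;p&gt;A hedged skeleton of what such an extension could look like (all names here are hypothetical, and the exact &lt;code&gt;find_closest&lt;/code&gt; signature may differ in your version, so check the &lt;code&gt;ChunkIndex&lt;/code&gt; methods first):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using PromptingTools
const RT = PromptingTools.Experimental.RAGTools

# Hypothetical index backed by an external vector database
struct MyVectorDBIndex &amp;lt;: RT.AbstractChunkIndex
    # eg, a connection handle and a collection name
end

# Hook into the retrieval step of the pipeline
function RT.find_closest(index::MyVectorDBIndex, query_emb::AbstractVector; top_k = 100, kwargs...)
    # query the database and return the candidate chunks
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;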

&lt;p&gt;This approach allows &lt;code&gt;RAGTools&lt;/code&gt; to accommodate a broader range of applications, from large-scale datasets to specialized indexing strategies, enhancing its utility and adaptability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The latest enhancements in the &lt;code&gt;RAGTools&lt;/code&gt; module are a leap forward in democratizing the development of RAG systems. By blending ease of use with deep customizability, we open new avenues for developers and researchers to explore AI-driven question-answering possibilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  We Want to Hear from You!
&lt;/h2&gt;

&lt;p&gt;Your feedback and use cases are crucial as we refine &lt;code&gt;RAGTools&lt;/code&gt; and prepare to carve it out into its own package. Whether you're exploring the vanilla implementation or integrating vector databases, share your insights with us. Your contributions are key to enhancing this interface, making it more robust and versatile for the community. Help shape the future of &lt;code&gt;RAGTools&lt;/code&gt;—join us in this exciting journey towards a more powerful and user-friendly generative AI toolkit.&lt;/p&gt;




&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>rag</category>
    </item>
    <item>
      <title>A 7 Billion Parameter Model that Beats GPT-4 on Julia Code?</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Thu, 14 Mar 2024 09:16:40 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/a-7-billion-parameter-model-that-beats-gpt-4-on-julia-code-51j2</link>
      <guid>https://forem.julialang.org/svilupp/a-7-billion-parameter-model-that-beats-gpt-4-on-julia-code-51j2</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Fine-tuning AI models for specialized tasks is both cost-effective and straightforward, needing only a few examples and less than a dollar, especially when leveraging tools like Axolotl to simplify the process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;What if I told you that a David-sized AI model just outsmarted Goliath GPT-4 in Julia code generation? Welcome to the tale of Cheater-7B, our pint-sized hero, whose adventure into fine-tuning showcases the might of focused AI training. The best part? This entire transformation took just 1 hour and cost less than fifty cents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cheater-7B
&lt;/h2&gt;

&lt;p&gt;Cheater-7B is a nimble 7 billion parameter model, fine-tuned to perfection on its task. Despite its size, even the quantized version (GGUF Q5) beats GPT-4 in Julia code generation on our &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard"&gt;leaderboard&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://forem.julialang.org/images/KOpDWsEARmywLd1MnGKD1AhTmigD668fs13oGHblqOc/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL25q/dHMwZ2htZnltamRp/azNhdWU1LnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/KOpDWsEARmywLd1MnGKD1AhTmigD668fs13oGHblqOc/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL25q/dHMwZ2htZnltamRp/azNhdWU1LnBuZw" alt="Cheater-7b Performance" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How Is It Possible? Fine-tuning + Cheating!
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Fine-tuning
&lt;/h3&gt;

&lt;p&gt;Yes, this blog post is a bit of a joke! It is not about the model itself; it's actually a brief introduction to &lt;strong&gt;fine-tuning&lt;/strong&gt;, which allows you to “tune” a smaller model to perform like a big one on a &lt;strong&gt;specific task&lt;/strong&gt; (this part is very important).&lt;/p&gt;

&lt;p&gt;Fine-tuning Cheater-7B on a select 11 problems demonstrates that you don't need vast datasets to achieve significant improvements. This little giant not only excelled in familiar territory but also showed promising signs of learning from new, unseen challenges (see the Appendix!)&lt;/p&gt;

&lt;h3&gt;
  
  
  Beyond Cheating
&lt;/h3&gt;

&lt;p&gt;Yes, Cheater-7B got a head start by "cheating" on the test! We fine-tuned it on 11/14 test cases in our leaderboard (the one we compare models on) - this happens more often than you think in the real world (often unconsciously).&lt;/p&gt;

&lt;p&gt;But the real story here is the power of fine-tuning - our model turned out to be better than the base model (the model we fine-tuned from) in some of the unseen test cases as well! Clearly, it picked up some Julia knowledge along the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fine-tuning 101
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Is It?
&lt;/h3&gt;

&lt;p&gt;Fine-tuning a model involves adjusting a pre-trained machine learning model's parameters so it can better perform on a specific task, effectively leveraging the model's learned knowledge (probability distribution of the next token) and adapting it to new, related challenges with a relatively small dataset.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Fine-Tuning Should Be Your New Best Friend
&lt;/h3&gt;

&lt;p&gt;Fine-tuning stands out for specific tasks (ie, narrow domains) that demand efficiency and privacy. It's akin to sharpening your tools to ensure they cut cleaner and faster, all while keeping the costs astonishingly low.&lt;/p&gt;

&lt;p&gt;Once you build your Generative AI system, sooner or later you will have to route some of the simple requests to smaller fine-tuned models as part of the optimization process. Everyone does that, even the big players.&lt;/p&gt;

&lt;h3&gt;
  
  
  Understanding the Limits of Fine-Tuning
&lt;/h3&gt;

&lt;p&gt;While fine-tuning can transform a general AI model into a specialist, it's not a silver bullet. This process excels at refining a model's existing knowledge to perform specific tasks (eg, adjusting the format or style, embedding certain prompts or examples) with greater accuracy or up-weighting/surfacing certain knowledge (eg, Julia) to be used more.&lt;/p&gt;

&lt;p&gt;However, it does have many limitations. It's not very effective for tasks that require the model to learn entirely new information or skills from scratch. For such challenges, you might need to incorporate additional learning methods, like Retrieval Augmented Generation (RAG), to supplement the model's capabilities. In essence, fine-tuning adjusts the focus of the lens but doesn't replace the lens altogether.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting Started with Fine-Tuning: Easier Than You Think
&lt;/h3&gt;

&lt;p&gt;Diving into fine-tuning is more accessible than ever, thanks to user-friendly tools like Axolotl. This approach not only simplifies the process but also opens the door to a collaborative effort in building specialized, efficient AI models for specific needs.&lt;/p&gt;

&lt;p&gt;You need very little data to get started - we used just 11 test cases.&lt;/p&gt;

&lt;p&gt;You can find all the required resources and recipes &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard/tree/main/experiments/cheater-7b-finetune"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Cheater-7B Experiment: Fast, Affordable, Enlightening
&lt;/h3&gt;

&lt;p&gt;The journey of creating Cheater-7B was a lesson in efficiency itself: just 1 hour of processing on a cloud GPU, with an investment that didn't even hit the half-dollar mark. This experiment underscores the practicality and accessibility of fine-tuning for AI enthusiasts and professionals alike.&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting Started with Fine-Tuning Data
&lt;/h3&gt;

&lt;p&gt;Your first step in fine-tuning is to gather examples, specifically AI conversations that align with the skills you're aiming to enhance (eg, good Julia conversations/exchanges). To save these conversations for later use, you can employ &lt;code&gt;save_conversation&lt;/code&gt; from the PromptingTools package, which saves a conversation to JSON.&lt;/p&gt;
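&lt;p&gt;For example, you could capture and save a conversation like this (the prompt and file path are illustrative; assumes a configured API key):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using PromptingTools

# Keep the full conversation, not just the last message
conversation = aigenerate("Write a function that sums the columns of a matrix."; return_all = true)

# Save it as JSON for later fine-tuning
PromptingTools.save_conversation("julia_conversations/sum_columns.json", conversation)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;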

&lt;p&gt;If you're looking for a communal space to store and share these conversations, consider contributing to an open-source project. Open a pull request at &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard/tree/main/julia_conversations"&gt;Julia-LLM-Leaderboard's Julia Conversations&lt;/a&gt; to add your valuable data to the collective repository. &lt;br&gt;
This folder also shows example code snippets on how to save your conversations from PromptingTools.&lt;/p&gt;

&lt;p&gt;I hope to write a detailed walkthrough of the process soon, but for now, you can find all the required resources and recipes &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard/tree/main/experiments/cheater-7b-finetune"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Cheater-7B's story is more than a quirky anecdote; it's a compelling illustration of how fine-tuning can unlock the potential of AI models, transforming them into task-specific powerhouses. As we continue to explore and share our experiences, the possibilities for innovation and improvement in AI are boundless. &lt;/p&gt;

&lt;p&gt;Got a cool idea or breakthrough with your fine-tuning experiments? Share it in the &lt;a href="https://julialang.slack.com/archives/C06G90C697X"&gt;generative-ai channel on Julia Slack&lt;/a&gt; and inspire the community with your innovation!&lt;/p&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Discover Axolotl: &lt;a href="https://github.com/OpenAccess-AI-Collective/axolotl"&gt;Axolotl&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Explore the Julia LLM Leaderboard: &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard"&gt;Julia LLM Leaderboard&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Resources to Train Your Cheater-7B: &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard/tree/main/experiments/cheater-7b-finetune"&gt;Cheater-7B experiment&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Saving Conversations with PromptingTools: &lt;a href="https://github.com/svilupp/Julia-LLM-Leaderboard/tree/main/julia_conversations"&gt;Julia Conversations folder&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Trained Cheater-7B Model: &lt;a href="https://huggingface.co/svilupp/cheater-7b/tree/main"&gt;Cheater-7B Model&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Extra Questions
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Is it expensive?&lt;/strong&gt;&lt;br&gt;
The process of fine-tuning Cheater-7B was surprisingly affordable, costing less than half a dollar. By renting a cloud GPU from Jarvislabs.io and opting for a spot instance outside of peak hours, the entire fine-tuning operation on an RTX A5000 was completed in about an hour for just $0.39.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;How was Cheater-7b trained? Is it difficult?&lt;/strong&gt;&lt;br&gt;
Training Cheater-7B was streamlined and accessible, thanks to the Axolotl tool. &lt;/p&gt;

&lt;p&gt;Axolotl simplifies the fine-tuning process, making it approachable even for those new to machine learning. With just a few commands in the CLI, a configuration YAML file, and the selected dataset, Cheater-7B was fine-tuned efficiently. This ease of use demystifies the process, making advanced AI techniques available to a broader audience.&lt;/p&gt;

&lt;p&gt;See the example configuration in the Resources section.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Where did you get the data?&lt;/strong&gt;&lt;br&gt;
The data for fine-tuning Cheater-7B came from the Julia LLM Leaderboard, focusing on solutions that demonstrated excellence and diversity. Specifically, we took the top 50 solutions that scored full points (100 points) for 11 out of the 14 test cases across different prompts. &lt;/p&gt;

&lt;p&gt;The associated code is available in the Resources section.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Can I try/use the model?&lt;/strong&gt;&lt;br&gt;
Yes, of course. Download the LORA adapter or the quantized version from &lt;a href="https://huggingface.co/svilupp/cheater-7b/tree/main"&gt;here&lt;/a&gt;.&lt;br&gt;
I'd recommend using &lt;code&gt;llama.cpp&lt;/code&gt; or &lt;code&gt;Llama.jl&lt;/code&gt; to run it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Did we not just memorize the results?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Well, partially! See the performance of each model (and GPT4 for comparison) on various test cases below.&lt;/p&gt;

&lt;p&gt;We fine-tuned our model on the first 11 test cases. It has never seen any of the last 3 test cases: &lt;code&gt;q_and_a_extractor&lt;/code&gt;, &lt;code&gt;pig_latinify&lt;/code&gt;, and &lt;code&gt;extract_julia_code&lt;/code&gt;. These are the hardest test cases in our leaderboard, and you can see that even GPT4 struggles to produce "executable" code (&amp;gt;50 points) for these.&lt;/p&gt;

&lt;p&gt;The 11 training cases didn't teach our model much about &lt;code&gt;pig_latinify&lt;/code&gt; (requires knowledge of multi-threading and associated libraries) and &lt;code&gt;extract_julia_code&lt;/code&gt; (requires large models because there can be multiple nested levels of triple backticks and strings in the inputs, which trips up most models).&lt;/p&gt;

&lt;p&gt;However, the performance on &lt;code&gt;q_and_a_extractor&lt;/code&gt; has increased significantly compared to both GPT4 and the base model! It's likely because the model learned how to do Regex operations in Julia and how to navigate the return types better.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://forem.julialang.org/images/dCVU5-82sWf9UwyTyKt2G2tMX9zt27b2yJMZ_VWU8cI/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL21j/MHl4aHExemVyMTRp/ZXc4OGxzLnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/dCVU5-82sWf9UwyTyKt2G2tMX9zt27b2yJMZ_VWU8cI/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL21j/MHl4aHExemVyMTRp/ZXc4OGxzLnBuZw" alt="Test Case Comparison" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>finetuning</category>
    </item>
    <item>
      <title>Six Steps to Success: Designing and Delivering Your First Generative AI Demo</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Fri, 08 Mar 2024 20:42:57 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/six-steps-to-success-designing-and-delivering-your-first-generative-ai-demo-25k4</link>
      <guid>https://forem.julialang.org/svilupp/six-steps-to-success-designing-and-delivering-your-first-generative-ai-demo-25k4</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Discover how to create an engaging and effective Generative AI demo in Julia with six crucial tips, focusing on simplifying technical complexities, crafting a compelling narrative, and enhancing user experience with stunning UI and prompt caching. This guide ensures your demo captures the imagination of your stakeholders and showcases the potential of GenAI technology.&lt;/p&gt;

&lt;h2&gt;
  
  
  Crafting Your First Generative AI Demo: A Guide to Wow Your Stakeholders
&lt;/h2&gt;

&lt;p&gt;In the rapidly evolving world of Generative AI (GenAI), demonstrating the capabilities of your solution in a way that captivates and convinces stakeholders is more crucial than ever. A well-crafted demo can serve as a powerful tool to showcase technical possibilities and ignite the imagination of your audience. However, the goal here is not to present a polished, ready-to-use product but to illuminate the potential applications of GenAI in a vivid, engaging manner. &lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Purpose of Your Demo
&lt;/h2&gt;

&lt;p&gt;Before diving into the mechanics of building your demo, it's essential to distinguish between a demo and a Minimum Viable Product (MVP). A demo is a showcase, designed to highlight what's possible with GenAI, helping stakeholders envision how they might use such technology in their own contexts. It’s about painting a picture of the future, not delivering the final product for immediate use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Six Essential Tips for a Successful GenAI Demo
&lt;/h2&gt;

&lt;p&gt;Crafting a demo that stands out requires more than just technical know-how. It demands strategic planning, creativity, and a focus on the end-user experience. Let’s explore six tips that can make your GenAI demo a resounding success.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tip 1: Balance Your Focus Away from the Technical
&lt;/h2&gt;

&lt;p&gt;When preparing your demo, follow the 33/33/33 rule:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;33% Good Planning&lt;/strong&gt;: Dedicate a third of your efforts to planning. Identify the core feature or capability — your "wow" factor — and develop a "screenplay" that showcases it compellingly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;33% Simplifying Technical Aspects&lt;/strong&gt;: Don’t get bogged down in technical perfection. Your demo should be simple yet effective, highlighting GenAI’s capabilities without unnecessary complexity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;33% Polishing the UI&lt;/strong&gt;: Aesthetics matter. Spend a third of your time ensuring the user interface (UI) is clean, engaging, and intuitive. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's dive into the specifics of each step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tip 2: Find Your “Wow” and Write a “Screenplay” for it
&lt;/h2&gt;

&lt;p&gt;Determine what aspect of your GenAI solution will most impress your audience. Is it the interface, the novel insights it generates, or its ability to synthesize and summarize complex information? &lt;/p&gt;

&lt;p&gt;Once identified, craft a &lt;strong&gt;detailed screenplay&lt;/strong&gt; for your demo. This script should outline every step of the demo, simulating a real user interaction (it will help you with the technical simplifications!). Focus on showcasing this core feature and simplify everything else.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tip 3: Break Bigger Tasks into Individual “Skills”
&lt;/h2&gt;

&lt;p&gt;Instead of striving to create a GenAI solution that can do everything, break down the larger workflow/conversation into smaller, discrete tasks or “skills.”&lt;/p&gt;

&lt;p&gt;For example, is there a web search feature? A set of specific questions it can answer? An email draft? Each one of these is a "skill" that can be built independently for faster iteration and more reliability (think "input -&amp;gt; output").&lt;/p&gt;

&lt;p&gt;Your demo can then call on these skills separately (without any preceding conversation history) to keep things simple. It will just look like a big conversation, but it's actually a series of smaller, more reliable interactions.&lt;/p&gt;

&lt;p&gt;This approach allows you to highlight specific strengths of your solution without overcomplicating the demo. Think of each skill as a standalone feature that, when combined, showcases the versatility and power of your GenAI solution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tip 4: Remove Any Unnecessary Complexity
&lt;/h2&gt;

&lt;p&gt;Your demo should be as straightforward as possible. Avoid complex setups like chained Large Language Model calls, which can introduce unnecessary points of failure, and don't waste time building things you don't need!&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do you need some data from the database? Pick the 20 most interesting records that will deliver the "wow".&lt;/li&gt;
&lt;li&gt;Do you need some web scraping? Copy &amp;amp; paste the few pages you need manually.&lt;/li&gt;
&lt;li&gt;Do you need some LLM router (to pick the right "skill")? You could use &lt;code&gt;aiclassify&lt;/code&gt; to do that, but simple IF conditions with &lt;code&gt;occursin()&lt;/code&gt; are often good enough. Thanks to your screenplay, you know exactly what to expect and when, so you can keep it simple.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To be clear, you don't need to follow the "screenplay" word for word, but it's a good guide to keep things simple and focused. This focus on the user experience over technical complexity will make your demo more accessible and impactful.&lt;/p&gt;
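
&lt;p&gt;For illustration, a keyword-based router along these lines could be sketched as follows (the "skills" here are hypothetical placeholders, not part of any package):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;# Hypothetical "skills" -- each is a simple input -&amp;gt; output function
search_web(query) = "Top results for: $query"
draft_email(query) = "Draft email about: $query"

# Route to the right skill with plain IF conditions and `occursin`
function route(user_input)
    input = lowercase(user_input)
    if occursin("search", input)
        search_web(user_input)
    elseif occursin("email", input)
        draft_email(user_input)
    else
        "Sorry, I don't have that skill yet."
    end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each branch is a standalone "input -&amp;gt; output" function, so you can build, cache, and test it independently.&lt;/p&gt;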

&lt;h2&gt;
  
  
  Tip 5: Enhance Your Demo with a Stunning UI
&lt;/h2&gt;

&lt;p&gt;Leverage tools like GenieFramework's &lt;a href="https://github.com/GenieFramework/stipple.jl"&gt;Stipple.jl&lt;/a&gt; to quickly develop a beautiful UI for your demo. &lt;br&gt;
With just a few lines of code, you can create an application that not only functions well but also looks professional and engaging. &lt;/p&gt;

&lt;p&gt;You can find a basic example of a Stipple app below - it's less than 100 lines of code (50 active lines)! Code is provided in the Appendix and you can run it from your Julia REPL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://forem.julialang.org/images/gQCFeigcjRX9M2jB4f2waHNxkdgx5L-9zpJtUl1N1Jg/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzFv/NG9mZGJzbHU1ZWts/eGE1MzJjLnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/gQCFeigcjRX9M2jB4f2waHNxkdgx5L-9zpJtUl1N1Jg/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzFv/NG9mZGJzbHU1ZWts/eGE1MzJjLnBuZw" alt="Stipple.jl UI Example" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In other news, the team behind GenieFramework has just launched their new no-code builder. Make sure to check it out: &lt;a href="https://info.juliahub.com/web-applications-in-julia-with-genie-builder"&gt;Web Applications in Julia with Genie Builder&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Tip 6: Utilize Prompt Caching for a Smoother Experience
&lt;/h2&gt;

&lt;p&gt;Implement prompt caching to eliminate latency and ensure a fluid demo experience. This strategy involves storing and quickly retrieving responses for common queries or inputs, thus avoiding the need for real-time generation during the demo. It's not about deceiving your audience but about showcasing your GenAI solution's potential without technical hitches or delays (you would optimize the latency in production use cases anyway).&lt;/p&gt;

&lt;p&gt;There are &lt;a href="https://github.com/marius311/Memoization.jl"&gt;Memoization.jl&lt;/a&gt; and &lt;a href="https://github.com/JuliaCollections/Memoize.jl"&gt;Memoize.jl&lt;/a&gt;, but neither of them supports caching to disk, so the cache won't survive a REPL restart.&lt;/p&gt;

&lt;p&gt;I prefer to use a simple &lt;code&gt;Dict&lt;/code&gt; and an if-else statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;# Remember the conversation via key: `hash(conversation)`&lt;/span&gt;
&lt;span class="n"&gt;CACHE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;Dict&lt;/span&gt;&lt;span class="x"&gt;{&lt;/span&gt;&lt;span class="kt"&gt;UInt64&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;&lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;}()&lt;/span&gt;
&lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;last&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;# mock-up only, you would need to convert MyMessage to PromptingTools types for it to work&lt;/span&gt;

&lt;span class="c"&gt;# Conversation 1&lt;/span&gt;
&lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;&lt;span class="s"&gt;"I am a user"&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;&lt;span class="s"&gt;"I am Genie"&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;output1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;&lt;span class="s"&gt;"Nice to meet you, Genie!"&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conv1&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;output1&lt;/span&gt;

&lt;span class="c"&gt;# Example use&lt;/span&gt;
&lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conv1&lt;/span&gt; &lt;span class="c"&gt;# known conversation&lt;/span&gt;
&lt;span class="c"&gt;## conversation = [MyMessage("New conversation", true)] # unknown conversation&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;haskey&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;
    &lt;span class="nd"&gt;@info&lt;/span&gt; &lt;span class="s"&gt;"&amp;gt; Cache hit!"&lt;/span&gt;
    &lt;span class="n"&gt;output_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="nd"&gt;@info&lt;/span&gt; &lt;span class="s"&gt;"&amp;gt; Cache miss! Generating response..."&lt;/span&gt;
    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;# Save the response for later&lt;/span&gt;
    &lt;span class="n"&gt;CACHE&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The beauty of this approach is that:&lt;/p&gt;

&lt;p&gt;1) You can decide whether to cache the whole conversation or only the last user message (keep it simple as per Tip 3!)&lt;br&gt;
2) You can then serialize the &lt;code&gt;Dict&lt;/code&gt; to disk and load it back when you restart your REPL. &lt;/p&gt;
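
&lt;p&gt;For the second point, Julia's built-in Serialization standard library is enough; a minimal sketch (the file name is arbitrary):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;using Serialization

# Save the cache to disk before closing the REPL
serialize("demo_cache.jls", CACHE)

# ...and load it back after a restart
CACHE = deserialize("demo_cache.jls")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;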
&lt;h3&gt;
  
  
  Bonus Tip: Implement Quick Actions
&lt;/h3&gt;

&lt;p&gt;To make your demo even more engaging, incorporate dynamic "quick action" buttons that guide users through predefined next steps or use cases. &lt;/p&gt;

&lt;p&gt;This feature not only makes the demo more feature-rich but also ensures a smoother experience by reducing the uncertainty of open-ended interactions. Quick action buttons can be easily implemented in Stipple, enhancing the flow of your presentation and making it easier for your audience to understand the full capabilities of your GenAI solution. &lt;/p&gt;

&lt;p&gt;Additionally, by defining these actions in advance, you can more effectively leverage prompt caching, ensuring that each demonstration runs smoothly and without delay.&lt;/p&gt;
&lt;h2&gt;
  
  
  Appendix: GenieFramework UI Example
&lt;/h2&gt;

&lt;p&gt;If you want to see how easy it is to create a stunning UI for your GenAI demo, here's a basic example using GenieFramework's Stipple.jl.&lt;/p&gt;

&lt;p&gt;First, install GenieFramework (PromptingTools is not required; just comment it out!).&lt;br&gt;
Second, run the below code in your Julia REPL (or save it to a script and run it from there).&lt;br&gt;
Once the server starts, it will tell you to navigate to &lt;code&gt;http://127.0.0.1:8000&lt;/code&gt; in your browser to see the UI (or just click on the link in the REPL).&lt;/p&gt;

&lt;p&gt;If you have any questions, there is a dedicated Genie channel on the JuliaLang Slack and the Genie team also runs a great &lt;a href="https://discord.gg/fHa9GVaP"&gt;Discord server&lt;/a&gt; where you can get help!&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="n"&gt;App&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;GenieFramework&lt;/span&gt; &lt;span class="c"&gt;# GenieFramework v2.1.0&lt;/span&gt;
&lt;span class="nd"&gt;@genietools&lt;/span&gt;

&lt;span class="c"&gt;# ! Params&lt;/span&gt;
&lt;span class="n"&gt;GENIE_IMG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://easydrawingguides.com/wp-content/uploads/2021/10/how-to-draw-genie-from-aladdin-featured-image-1200.png"&lt;/span&gt;
&lt;span class="n"&gt;INTRO_MESSAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;
    &lt;span class="s"&gt;"Welcome back, Jan!"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"What can I help you with today? Eg, `example ABC`"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
&lt;span class="x"&gt;]&lt;/span&gt;

&lt;span class="c"&gt;### Helpful functions&lt;/span&gt;
&lt;span class="s"&gt;"MyMessage is a struct that represents a message in the chat"&lt;/span&gt;
&lt;span class="nd"&gt;@kwdef&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="nc"&gt; MyMessage&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Genie"&lt;/span&gt;
    &lt;span class="n"&gt;avatar&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;Union&lt;/span&gt;&lt;span class="x"&gt;{&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;&lt;span class="kt"&gt;Nothing&lt;/span&gt;&lt;span class="x"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;nothing&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;AbstractVector&lt;/span&gt;&lt;span class="x"&gt;{&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;:&lt;/span&gt;&lt;span class="kt"&gt;AbstractString&lt;/span&gt;&lt;span class="x"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="x"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;from_user&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;Bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="s"&gt;"Create a `MyMessage` from a user or from Genie"&lt;/span&gt;
&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="nf"&gt; MyMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;AbstractVector&lt;/span&gt;&lt;span class="x"&gt;{&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;:&lt;/span&gt;&lt;span class="kt"&gt;AbstractString&lt;/span&gt;&lt;span class="x"&gt;},&lt;/span&gt; &lt;span class="n"&gt;from_user&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;Bool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;from_user&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;avatar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;from_user&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nb"&gt;nothing&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;GENIE_IMG&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;from_user&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"me"&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Genie"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;AbstractString&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;from_user&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;Bool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;from_user&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;### Dashboard logic&lt;/span&gt;
&lt;span class="nd"&gt;@appname&lt;/span&gt; &lt;span class="n"&gt;MyDemoApp&lt;/span&gt;

&lt;span class="nd"&gt;@app&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
    &lt;span class="nd"&gt;@in&lt;/span&gt; &lt;span class="n"&gt;btn_send&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;
    &lt;span class="nd"&gt;@in&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
    &lt;span class="nd"&gt;@in&lt;/span&gt; &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;[]&lt;/span&gt;
    &lt;span class="nd"&gt;@onchange&lt;/span&gt; &lt;span class="n"&gt;isready&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
        &lt;span class="nd"&gt;@info&lt;/span&gt; &lt;span class="s"&gt;"&amp;gt; Dashboard is ready"&lt;/span&gt;
        &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;INTRO_MESSAGE&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="nd"&gt;@onbutton&lt;/span&gt; &lt;span class="n"&gt;btn_send&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
        &lt;span class="nd"&gt;@info&lt;/span&gt; &lt;span class="s"&gt;"&amp;gt; User said: &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;user_input"&lt;/span&gt; &lt;span class="c"&gt;# for tracking in REPL&lt;/span&gt;
        &lt;span class="c"&gt;# Easy way to reset conversation -&amp;gt; just send "reset"&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lowercase&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"reset"&lt;/span&gt;
            &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;INTRO_MESSAGE&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
        &lt;span class="k"&gt;elseif&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;isempty&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
            &lt;span class="c"&gt;## New converation message&lt;/span&gt;
            &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;push!&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;
            &lt;span class="c"&gt;# Genie's response logic goes BELOW, eg, `aigenerate(user_input)`&lt;/span&gt;
            &lt;span class="n"&gt;genie_says&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Hey... I'm still learning. I don't know how to respond to that yet."&lt;/span&gt;
            &lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt; &lt;span class="c"&gt;# empty the user input&lt;/span&gt;
            &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;push!&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MyMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;genie_says&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c"&gt;### Dashboard UI&lt;/span&gt;
&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="nf"&gt; ui&lt;/span&gt;&lt;span class="x"&gt;()&lt;/span&gt;
    &lt;span class="x"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;heading&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"My First Genie Demo"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt;
        &lt;span class="c"&gt;## Row 1: Chat&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
            &lt;span class="x"&gt;[&lt;/span&gt;
                &lt;span class="n"&gt;cell&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"st-module"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
                    &lt;span class="x"&gt;[&lt;/span&gt;
                        &lt;span class="c"&gt;## awesome trick that allows to pass a vector of messages (=`conversation`) and generates an object for each&lt;/span&gt;
                        &lt;span class="n"&gt;chatmessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="s"&gt;"message.text"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="s"&gt;"message.name"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="s"&gt;"message.from_user"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;avatar&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="s"&gt;"message.avatar"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;@for&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"message in conversation"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;R&lt;/span&gt;&lt;span class="s"&gt;"message.id"&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt;
                    &lt;span class="x"&gt;]),&lt;/span&gt;
            &lt;span class="x"&gt;]),&lt;/span&gt;

        &lt;span class="c"&gt;## Row 2: Input From User&lt;/span&gt;
        &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;
            &lt;span class="n"&gt;cell&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"st-module"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
                &lt;span class="x"&gt;[&lt;/span&gt;
                    &lt;span class="n"&gt;Html&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;div&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"input-group"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
                        &lt;span class="x"&gt;[&lt;/span&gt;
                            &lt;span class="n"&gt;textfield&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Waiting for your requests... Try: `&amp;lt;example command&amp;gt;`"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
                                &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
                                &lt;span class="nd"&gt;@on&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"keyup.enter"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"btn_send = !btn_send"&lt;/span&gt;&lt;span class="x"&gt;)),&lt;/span&gt;
                            &lt;span class="n"&gt;btn&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Send"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="nd"&gt;@click&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;btn_send&lt;/span&gt;&lt;span class="x"&gt;)),&lt;/span&gt;
                        &lt;span class="x"&gt;])]),&lt;/span&gt;
        &lt;span class="x"&gt;]),&lt;/span&gt;
    &lt;span class="x"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="nd"&gt;@page&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ui&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;# Start the server&lt;/span&gt;
&lt;span class="n"&gt;Genie&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isrunning&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;webserver&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;up&lt;/span&gt;&lt;span class="x"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="c"&gt;# end of module&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>demo</category>
    </item>
    <item>
      <title>Navigating Your First GenAI Project: A Blueprint for Success</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Tue, 13 Feb 2024 10:01:38 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/navigating-your-first-genai-project-a-blueprint-for-success-5fo9</link>
      <guid>https://forem.julialang.org/svilupp/navigating-your-first-genai-project-a-blueprint-for-success-5fo9</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Embarking on your first Generative AI project can be daunting. Avoid common pitfalls by following these five practical tips: 1) Start simple and build iteratively, 2) Start from the end, 3) Use the best model available and manage costs wisely, 4) Start with a commercial API, and 5) Prepare your "vibe" check. Each tip is designed to streamline your project’s development, ensuring efficiency and effectiveness from inception to execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I've observed numerous individuals repeating the same errors in their projects, so I hope the guidance provided here will help you avoid these common pitfalls and steer your project toward success.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Start Simple and Build Iteratively
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Idea&lt;/strong&gt;: Break your grand vision into manageable, discrete tasks. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical Application&lt;/strong&gt;: If designing an AI to generate news articles, start by focusing on creating compelling headlines. Once mastered, expand to introductory paragraphs, and so forth. This step-by-step approach mitigates risk and builds towards complexity gradually. &lt;/p&gt;

&lt;p&gt;If there are multiple GenAI steps, start with just one. Why? If each of the three consecutive steps has a 70% chance of success, the overall probability of succeeding at all three drops to around 1 in 3 - that's not a good starting point! So go step by step.&lt;/p&gt;
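&lt;p&gt;That compounding can be checked with a quick sketch (&lt;code&gt;pipeline_success&lt;/code&gt; is a hypothetical helper; the 70% per-step rate is just the illustrative figure from above):&lt;/p&gt;

```julia
# Probability that every step of a pipeline succeeds, assuming the
# steps are independent and each succeeds with probability p.
pipeline_success(p, n_steps) = p^n_steps

p = 0.7                        # illustrative per-step success rate
prob3 = pipeline_success(p, 3)
println(prob3)                 # ≈ 0.343, i.e., roughly 1 in 3
```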

&lt;h2&gt;
  
  
  2. Start from the End
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Idea&lt;/strong&gt;: Visualize each step's inputs and outputs to guide development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical Application&lt;/strong&gt;: For an AI-powered fitness app, sketch out the final user interaction—say, providing personalized workout plans based on user input (e.g., available equipment, fitness level). Create an example of one conversation, or one input &amp;amp; output set and start by getting that to work.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Use the Best Model Available and Manage Costs Wisely
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Idea&lt;/strong&gt;: Opt for the highest quality AI model to test your project's potential, but be smart about data usage to keep costs in check.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical Application&lt;/strong&gt;: Using GPT-4 Turbo might reveal that your idea is feasible, whereas starting with a smaller model could lead to unnecessary troubleshooting, blaming the idea's failure on the model's limitations. If you're worried about costs, start with a small dataset and see what you can learn from it.&lt;/p&gt;

&lt;p&gt;People often overestimate how much the best models cost. For example, the blog post I wrote about analyzing themes in the City of Austin survey covered roughly 3,000 verbatims (~400K characters), and embedding ALL of them cost ~$0.002! Generating the topics with GPT-4 Turbo cost less than half a cent!&lt;/p&gt;
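&lt;p&gt;For the curious, here is a back-of-the-envelope sketch of that embedding cost (the ~4 characters-per-token heuristic and the per-token price are assumptions; check your provider's current pricing):&lt;/p&gt;

```julia
# Rough embedding-cost estimate: characters, then tokens, then dollars.
chars = 400_000                # ~3000 verbatims from the survey
tokens = chars / 4             # heuristic: ~4 characters per token
price_per_1m_tokens = 0.02     # USD per 1M tokens (assumed; verify!)
cost = tokens / 1_000_000 * price_per_1m_tokens
println(cost)                  # ≈ $0.002, matching the figure above
```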

&lt;h2&gt;
  
  
  4. Start with a Commercial API
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Idea&lt;/strong&gt;: Commercial APIs save time and offer efficiency, outweighing the cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical Application&lt;/strong&gt;: Using OpenAI instead of Ollama might be cheaper and much faster!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Consider the &lt;strong&gt;hidden cost of locally hosted models&lt;/strong&gt;: Suppose you're choosing between Ollama Mixtral, which takes 30 seconds for a task, and GPT-3.5 Turbo, which takes 2 seconds; the latter often provides better results. If you value your time at $20 an hour, using Ollama Mixtral, despite being free, effectively costs you about 15 cents due to the longer wait, compared to the negligible 0.5 cents for GPT-3.5 Turbo's quicker completion.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimize your experiment cycle time&lt;/strong&gt;: The duration to test a new idea or modification, known as "experiment cycle time," is crucial. Opting for a commercial API justifies its cost by enabling the parallelization of tasks—what might take one GPU with Ollama Mixtral a considerable time to process can be done almost instantaneously with commercial APIs. For instance, you could execute 100 calls in the time it takes Ollama to complete just one, significantly accelerating development and reducing the effective cost of your time even further.&lt;/li&gt;
&lt;/ul&gt;
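&lt;p&gt;The time-cost argument in the first bullet can be made concrete with a small sketch (&lt;code&gt;effective_cost&lt;/code&gt; is a hypothetical helper; the latencies and the $20/hour rate are the illustrative figures from the bullet, and the $0.005 API fee is an assumed ballpark):&lt;/p&gt;

```julia
# Effective cost of one call: the API fee plus the value of your waiting time.
effective_cost(api_fee, latency_s; hourly_rate = 20.0) =
    api_fee + latency_s / 3600 * hourly_rate

local_cost  = effective_cost(0.0, 30.0)   # "free" local model, 30 s per call
hosted_cost = effective_cost(0.005, 2.0)  # commercial API, 2 s per call
println((local_cost, hosted_cost))        # the "free" model costs roughly 10x more
```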

&lt;h2&gt;
  
  
  5. Prepare Your "Vibe" Check
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Key Idea&lt;/strong&gt;: Establish a mini benchmark for your project's core functionality and continuously assess progress against it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical Application&lt;/strong&gt;: Identify 2-3 key input-output pairs that encapsulate each task's essence. They should be &lt;strong&gt;challenging and complementary&lt;/strong&gt; (i.e., not duplicative, since you wouldn't learn anything new). Regularly review your application against these pairs to notice when performance drops. While thorough evaluations will come later, these early checks are crucial for maintaining direction and focus.&lt;/p&gt;
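&lt;p&gt;A "vibe check" can be as simple as a handful of input-output pairs run after every change. A minimal sketch (&lt;code&gt;my_pipeline&lt;/code&gt; is a hypothetical stand-in for your actual GenAI step):&lt;/p&gt;

```julia
# Mini benchmark: a few input/expected pairs to catch regressions early.
my_pipeline(x) = uppercase(x)   # hypothetical placeholder for your GenAI step

vibe_checks = [
    ("hello", "HELLO"),
    ("julia", "JULIA"),
]

results = [my_pipeline(input) == expected for (input, expected) in vibe_checks]
println("Passed $(count(results))/$(length(results)) vibe checks")
```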

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Starting your first GenAI project is an exciting venture filled with opportunities and challenges. By adhering to these five tips, you position your project for success from the outset. Simplify your approach, prioritize quality, manage resources wisely, and maintain a clear vision of your project's objectives. This blueprint will guide you through the complexities of GenAI development, ensuring a smooth and productive journey from concept to completion.&lt;/p&gt;

&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>generativeai</category>
      <category>genai</category>
    </item>
    <item>
      <title>The Quest for Ultimate Productivity: Building an LLM-Powered Assistant</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Mon, 05 Feb 2024 10:01:33 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/the-quest-for-ultimate-productivity-building-an-llm-powered-assistant-21nk</link>
      <guid>https://forem.julialang.org/svilupp/the-quest-for-ultimate-productivity-building-an-llm-powered-assistant-21nk</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;A quick preview of my journey in developing a proof of concept for a personalized LLM-powered assistant, aiming to streamline daily productivity tasks. You can do the same!&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Hello, fellow productivity enthusiasts! 🌟 Ever find yourself drowning in a sea of tasks, with your desk looking more like a paper warehouse and your inbox resembling a bottomless pit? Yeah, me too. It’s the 21st-century dilemma: so much to do, yet so little time. 🕰️&lt;/p&gt;

&lt;p&gt;Over the years, I've devoured productivity books and experimented with every app under the sun, from GTD to Timeboxing, and from Wunderlist to Motion. Despite my efforts, something always felt off. 📚✖️&lt;/p&gt;

&lt;p&gt;One day I saw a &lt;a href="https://x.com/MilesCranmer/status/1738222999063650474?s=20"&gt;tweet&lt;/a&gt; from Miles Cranmer about his Notion &amp;amp; LangChain project.&lt;br&gt;
That's when a lightbulb went off 💡: what I need is a super "narrow" LLM-powered assistant, fine-tuned just for me! Imagine an assistant that knows only 2-3 tasks but executes them with unparalleled precision, because it knows you! &lt;/p&gt;
&lt;h2&gt;
  
  
  🛠️ Building the Dream Assistant
&lt;/h2&gt;

&lt;p&gt;In the past, I encountered a few key challenges, so I wanted to address them head-on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ease of Use&lt;/strong&gt;: Annotating tasks felt like a chore. Who has 5 minutes to fill out a task form? 🤷 My goal was to enable the assistant to understand tasks from just a simple sentence, removing the need for detailed input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fresh Starts&lt;/strong&gt;: I ditched rolling over unfinished tasks to keep each day's slate clean, focusing solely on the present day's priorities. That was the failure point of auto-scheduling in Motion - at some point, the conflict backlog explodes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Realistic Boundaries&lt;/strong&gt;: To counteract my tendency to stretch myself too thin, the assistant now evaluates my daily capacity, scheduling tasks only when there is space left and ensuring I set achievable goals by warning me when I'm overloading my schedule. Duration estimation is out of my hands, which will hopefully improve the quality of the estimates.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Integrating these improvements, the assistant now supports a streamlined approach to productivity: it focuses on which tasks are essential and when they'll be tackled, while keeping daily planning realistic. This refined tool is all about enhancing daily productivity without the overhead, starting each day anew, and keeping ambitions in check.&lt;/p&gt;
&lt;h2&gt;
  
  
  🤖 Integration and Automation
&lt;/h2&gt;

&lt;p&gt;Connecting Notion was a no-brainer since my to-do list resides in Notion, making for a seamless transition. Now with their Calendar app, it's an even stronger proposition. I rolled up my sleeves and crafted quick macros: &lt;code&gt;@tadd&lt;/code&gt; for adding tasks (because who doesn’t love a good abbreviation?) and &lt;code&gt;@cadd&lt;/code&gt; for calendar additions (tasks meant to be auto-scheduled). It was like giving my assistant its own language. 🗣️💬&lt;/p&gt;

&lt;p&gt;Here’s a sneak peek of how it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;## Call out the tasks to schedule for today:&lt;/span&gt;
&lt;span class="nd"&gt;@cadd&lt;/span&gt; &lt;span class="s"&gt;"Hack up POC for GolemScheduler.jl"&lt;/span&gt;
&lt;span class="nd"&gt;@cadd&lt;/span&gt; &lt;span class="s"&gt;"LLMTextAnalysis.jl: add a warning if the provided strings are empty or if length is &amp;gt;10K"&lt;/span&gt;
&lt;span class="nd"&gt;@cadd&lt;/span&gt; &lt;span class="s"&gt;"Ping James about the progress"&lt;/span&gt;
&lt;span class="nd"&gt;@cadd&lt;/span&gt; &lt;span class="s"&gt;"Analyze the data for project XYZ"&lt;/span&gt;

&lt;span class="c"&gt;## [ Info: Processing task...&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 1282 @ Cost: \$0.0159 in 17.4 seconds&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduling task&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduled for 2024-02-05T08:00:00 - 2024-02-05T10:00:00&lt;/span&gt;
&lt;span class="c"&gt;## CreatedPage @ https://www.notion.so/Develop-Proof-of-Concept-for-GolemScheduler-jl-732e8e0b0f4e4b0aa7d07ae3911f99fd&lt;/span&gt;

&lt;span class="c"&gt;## [ Info: Processing task...&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 1274 @ Cost: \$0.0154 in 11.8 seconds&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduling task&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduled for 2024-02-05T10:00:00 - 2024-02-05T11:00:00&lt;/span&gt;
&lt;span class="c"&gt;## CreatedPage @ https://www.notion.so/Add-warning-functionality-to-LLMTextAnalysis-jl-0e9ebf4a13234424a247fac1256d4285&lt;/span&gt;

&lt;span class="c"&gt;## [ Info: Processing task...&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 1207 @ Cost: \$0.0138 in 8.5 seconds&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduling task&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduled for 2024-02-05T11:00:00 - 2024-02-05T11:15:00&lt;/span&gt;
&lt;span class="c"&gt;## CreatedPage @ https://www.notion.so/Ping-James-about-the-progress-00f9acf18e014eca89ca40d136c81a43&lt;/span&gt;

&lt;span class="c"&gt;## [ Info: Processing task...&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 1221 @ Cost: \$0.0141 in 7.7 seconds&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Scheduling task&lt;/span&gt;
&lt;span class="c"&gt;## ┌ Warning: No available slot found. `overflow` will be set to `true`.&lt;/span&gt;
&lt;span class="c"&gt;## └ @ Main ~/golem_scheduler/api_services.jl:181&lt;/span&gt;
&lt;span class="c"&gt;## CreatedPage @ https://www.notion.so/Analyze-the-data-for-project-XYZ-2e8f182d7c454f38a02c53b330b08367&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is a snapshot of my day in Notion:&lt;br&gt;
&lt;a href="https://forem.julialang.org/images/2ND8MkJ8DkUKOQFvONd2YKU23BGGzLM63bqa5yGYsK0/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL3F3/enFrdjB5aDc4OWQ1/bWRjM2I1LnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/2ND8MkJ8DkUKOQFvONd2YKU23BGGzLM63bqa5yGYsK0/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzL3F3/enFrdjB5aDc4OWQ1/bWRjM2I1LnBuZw" alt="A Notion screenshot of my day" width="800" height="997"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  📅 When Plans Overflow
&lt;/h2&gt;

&lt;p&gt;As you can see, it happened again - I was too ambitious and the last task simply didn't fit in the allocated time.&lt;/p&gt;

&lt;p&gt;So my assistant nudged me with: “Warning: No available slot found. &lt;code&gt;overflow&lt;/code&gt; will be set to &lt;code&gt;true&lt;/code&gt;.” This is my cue to hop into Notion and play Tetris with my tasks, filtering on “overflow=true.” 🚫📆&lt;/p&gt;

&lt;p&gt;You can see that it automatically slots tasks into the corresponding section of My Day in Notion.&lt;/p&gt;

&lt;p&gt;The best part? The links next to "CreatedPage" in the logs are clickable, taking me directly to the task in Notion if I want to quickly edit anything. It’s like having a personal assistant who knows exactly what I need when I need it. 🤖👩‍💼&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;So, there you have it—a glimpse into my journey of building a personalized LLM-powered assistant. It’s still early days, but the potential to revolutionize personal productivity is immense.&lt;/p&gt;

&lt;p&gt;Are you intrigued by the idea of crafting your own productivity sidekick? Or perhaps you’re just here for the techy talk and tales of trial and error. Either way, I’d love to hear your thoughts! 🗨️💭&lt;/p&gt;

&lt;p&gt;If there’s enough interest, I might just package this up and share it with the world. Until then, let’s keep pushing the boundaries of what’s possible, one task at a time. 🚀&lt;/p&gt;




&lt;p&gt;Remember, the journey to productivity is as much about the tools we use as it is about the mindset we cultivate. Stay curious, stay inventive, and most importantly, stay productive! 🌈✨&lt;/p&gt;

&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>productivity</category>
      <category>golem</category>
    </item>
    <item>
      <title>Duplicate No More Pt. 2: Mastering LLM-as-a-Judge Scoring</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Fri, 26 Jan 2024 23:55:34 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/duplicate-no-more-pt-2-mastering-llm-as-a-judge-scoring-51ff</link>
      <guid>https://forem.julialang.org/svilupp/duplicate-no-more-pt-2-mastering-llm-as-a-judge-scoring-51ff</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;Explore three LLM judge scoring techniques - additive scoring, linguistic calibration scales, and categorical scoring - applied to the art of data deduplication, enhancing accuracy and consistency in identifying duplicates in contact datasets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Welcome back to our journey into the world of data deduplication using Language Model (LLM) judges. In our last episode, we navigated the basics; now, we're diving deeper to stabilize and tune our LLM's judgment capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  The LLM-as-a-Judge Challenge
&lt;/h2&gt;

&lt;p&gt;LLMs as judges are increasingly popular, yet their calibration remains a topic of hot debate. A recent &lt;a href="https://twitter.com/aparnadhinak/status/1748368364395721128?s=46&amp;amp;t=LqkQn2Q2J-NjCeYA4p2Dbg"&gt;Twitter post&lt;/a&gt; highlighted how uncalibrated LLMs can be. In our own deduplication experiment last time, we faced similar challenges, prompting us to seek more stable and consistent methods. In particular, GPT-3.5 struggled to provide consistent results aligned with our expectations, while GPT-4 performed well but was still volatile across subsequent runs, with scores clumped around the same numbers (instead of spanning the full 0-100 range).&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting the Stage
&lt;/h2&gt;

&lt;p&gt;Let's revisit the FEBRL 1 dataset. We'll continue using this as our testing ground with the same setup as in our previous episode.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;DataFramesMeta&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;LinearAlgebra&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dot&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Statistics&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PromptingTools&lt;/span&gt;

&lt;span class="c"&gt;# Load the FEBRL 1 dataset.&lt;/span&gt;
&lt;span class="c"&gt;# The Freely Extensible Biomedical Record Linkage (Febrl) package is distributed with a dataset generator and four datasets are generated with the generator. This function returns the first Febrl dataset as a pandas.DataFrame.&lt;/span&gt;
&lt;span class="c"&gt;# “This data set contains 1000 records (500 original and 500 duplicates, with exactly one duplicate per original record.”&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;File&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"febrl-dataset1.csv"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DataFrame&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;rename&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;)))&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nd"&gt;@chain&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
    &lt;span class="n"&gt;transform&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;AbstractString&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;.=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ByRow&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;renamecols&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Contact details: &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:given_name) &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:surname), living at &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:street_number) &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:address_1), &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:address_2), &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:suburb), Postcode: &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:postcode), State: &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(:state)"&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c"&gt;## embed the texts&lt;/span&gt;
&lt;span class="n"&gt;embs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aiembed&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

&lt;span class="c"&gt;# pairwise distances -- you could do it much faster with Distances.jl package&lt;/span&gt;
&lt;span class="n"&gt;dists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;let&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;
    &lt;span class="n"&gt;dists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Float32&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;
    &lt;span class="nd"&gt;@inbounds&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;
            &lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@view&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;])&lt;/span&gt; &lt;span class="o"&gt;.*&lt;/span&gt; &lt;span class="nd"&gt;@view&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="x"&gt;]))&lt;/span&gt;
            &lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="n"&gt;dists&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c"&gt;# for a given record, find the top 10 closest records&lt;/span&gt;
&lt;span class="n"&gt;let&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="n"&gt;dupe_idxs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sortperm&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;rev&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@chain&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
        &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
        &lt;span class="nd"&gt;@transform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;given_name&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;surname&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;street_number&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;address_1&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;address_2&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;suburb&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;postcode&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Example for record 3 and its closest "candidate" for a duplicate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Contact details: deakin sondergeld, living at 48 goldfinch circuit, kooltuo, canterbury, Postcode: 2776, State: vic"

"Contact details: deakin sondergeld, living at 231 goldfinch circuit, kooltuo, canterbury, Postcode: 2509, State: vic"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Temperature
&lt;/h2&gt;

&lt;p&gt;The first lesson is small but important. The &lt;code&gt;temperature&lt;/code&gt; parameter is a key setting for most practical LLM applications. It controls the randomness of the outputs: a higher temperature produces more varied outputs, which is useful for creative tasks but not for our deduplication task. We want consistent results, so we should set the temperature a bit lower, e.g., 0.3. We can set this in PromptingTools with &lt;code&gt;aigenerate(...; api_kwargs = (;temperature = 0.3))&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Play around with the temperature and see how it affects the results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scoring Methods
&lt;/h2&gt;

&lt;p&gt;You'll notice that we rarely write our scoring system from scratch. We take an existing prompt from elsewhere and ask GPT-4 to adapt it to our needs with a standard Chain-of-Thought (CoT) approach.&lt;/p&gt;

&lt;p&gt;We'll explore three different scoring methods today:&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 1: Additive Scoring System
&lt;/h2&gt;

&lt;p&gt;Based on the &lt;a href="https://arxiv.org/pdf/2401.10020.pdf"&gt;"Self-Rewarding Language Models"&lt;/a&gt; paper, we asked GPT-4 to tailor a prompt for our deduplication task (we don't show the full process here for brevity).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;## notice that added information about the task and then simply copied the scoring system from the Appendix of the paper&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"""
You're a professional record linkage engineer. 

Your task is to design clear evaluation criteria to compare a pair of contact details and judge whether they are duplicates or not.
Prepare an additive 0-5 points system, where more points indicate higher likelihood of being duplicates.

Example contact: 
"&lt;/span&gt;&lt;span class="n"&gt;Contact&lt;/span&gt; &lt;span class="n"&gt;details&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;james&lt;/span&gt; &lt;span class="n"&gt;waller&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;living&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="n"&gt;tullaroop&lt;/span&gt; &lt;span class="n"&gt;street&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;willaroo&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt; &lt;span class="n"&gt;james&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Postcode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4011&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;WA&lt;/span&gt;&lt;span class="s"&gt;". So you can see there is a full name, address, postcode and a state.

Adapt the following template criteria to match our use case of matching two contact records.
---
Review the user’s question and the corresponding response using the additive 5-point scoring system described below. Points are accumulated based on the satisfaction of each criterion:

- Add 1 point if the response is relevant and provides some information related to the user’s inquiry, even if it is incomplete or contains some irrelevant content.

- Add another point if the response addresses a substantial portion of the user’s question, but does not completely resolve the query or provide a direct answer.

- Award a third point if the response answers the basic elements of the user’s question in a useful way, regardless of whether it seems to have been written by an AI Assistant or if it has elements typically found in blogs or search results.

- Grant a fourth point if the response is clearly written from an AI Assistant’s perspective, addressing the user’s question directly and comprehensively, and is well-organized and helpful, even if there is slight room for improvement in clarity, conciseness or focus.

- Bestow a fifth point for a response that is impeccably tailored to the user’s question by an AI Assistant, without extraneous information, reflecting expert knowledge, and demonstrating a high-quality, engaging, and insightful answer.

User: &amp;lt;INSTRUCTION_HERE&amp;gt;

&amp;lt;response&amp;gt;&amp;lt;RESPONSE_HERE&amp;gt;&amp;lt;/response&amp;gt;

After examining the user’s instruction and the response:

- Briefly justify your total score, up to 100 words.

- Conclude with the score using the format: “Score: &amp;lt;total points&amp;gt;”

Remember to assess from the AI Assistant perspective, utilizing web search knowledge as necessary. To evaluate the response in alignment with this additive scoring model, we’ll systematically attribute points based on the outlined criteria.
---

First, think through step by step how one recognizes two duplicate records, what are the situations in which two pairs of records refer to the same person but differ in various fields.

Second, write a brief and concise 5-point system to evaluate a pair of contacts

"""&lt;/span&gt;
&lt;span class="c"&gt;## remember to return the whole conversation, so you can iterate on it and improve it&lt;/span&gt;
&lt;span class="n"&gt;conv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"gpt4t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_all&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We ended up with the following prompt (after a few inline edits):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;dedupe_template1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"""
        You're a world-class record linkage engineer. 

        Compare two contact records and determine whether they refer to the same person using the additive 5-point scoring system described below. 

        Points are accumulated based on the satisfaction of each criterion:

        1. **Name Match (1 point):** Award 1 point if the names are exact matches or plausible variations/aliases of each other (e.g., "&lt;/span&gt;&lt;span class="n"&gt;Jim&lt;/span&gt;&lt;span class="s"&gt;" and "&lt;/span&gt;&lt;span class="n"&gt;James&lt;/span&gt;&lt;span class="s"&gt;").

        2. **Address Similarity (1 point):** Add +1 point if the addresses are identical or have minor discrepancies that could be typographical errors or data entry errors or formatting differences.

        3. **Postcode Consistency (1 point):** Add +1 point if the postcodes are the same. Postcodes are less prone to variation, so a mismatch here could indicate different individuals.

        4. **State Agreement (1 point):** Add +1 point if the state information matches. Mismatched states can be a strong indicator of different individuals unless there is evidence of a recent move.

        5. **Overall Cohesion (1 point):** Add 1 point if the overall comparison of the records suggests they are referring to the same person. This includes considering any supplementary information that supports the likelihood of a match, such as similar contact numbers or email addresses.

        This system allows for a maximum of 5 points, with a higher score indicating a greater likelihood that the two records are duplicates. Points cannot be deducted.
        Each criterion should be evaluated with the understanding that real-world data can have inconsistencies and errors, requiring a balance between exact matches and reasonable allowances for differences.
        Keep track of the accumulated points so far with each criterion.
                """&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"""
&amp;lt;record-1&amp;gt; {{record1}} &amp;lt;/record-1&amp;gt;
&amp;lt;record-2&amp;gt; {{record2}} &amp;lt;/record-2&amp;gt;

After detailed examination of the two records:
- Briefly justify your total score, up to 100 words.
- Conclude with the total score.
- Use the following output format: "&lt;/span&gt;&lt;span class="n"&gt;Justification&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;justify&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;go&lt;/span&gt; &lt;span class="n"&gt;criterion&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt; &lt;span class="n"&gt;criterion&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;\&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;\&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Score&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;”&lt;/span&gt;

&lt;span class="n"&gt;To&lt;/span&gt; &lt;span class="n"&gt;evaluate&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;alignment&lt;/span&gt; &lt;span class="n"&gt;with&lt;/span&gt; &lt;span class="n"&gt;this&lt;/span&gt; &lt;span class="n"&gt;additive&lt;/span&gt; &lt;span class="n"&gt;scoring&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;we’ll&lt;/span&gt; &lt;span class="n"&gt;systematically&lt;/span&gt; &lt;span class="n"&gt;attribute&lt;/span&gt; &lt;span class="n"&gt;points&lt;/span&gt; &lt;span class="n"&gt;based&lt;/span&gt; &lt;span class="n"&gt;on&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;outlined&lt;/span&gt; &lt;span class="n"&gt;criteria&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
            &lt;span class="s"&gt;""")]

## get the closest candidate for a duplicate
dupe_idxs = sortperm(dists[3, :], rev=true)
msg = aigenerate(dedupe_template1; 
    record1=df[3, :text_blob], record2=df[dupe_idxs[2], :text_blob], model="&lt;/span&gt;&lt;span class="n"&gt;gpt&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;turbo&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1106&lt;/span&gt;&lt;span class="s"&gt;", 
    api_kwargs=(; temperature=0.3))

## GPT-3.5 Turbo Outputs
##
## [ Info: Tokens: 644 @ Cost: \&lt;/span&gt;&lt;span class="si"&gt;$0.0008&lt;/span&gt;&lt;span class="s"&gt; in 4.0 seconds
##
## AIMessage("&lt;/span&gt;&lt;span class="n"&gt;Justification&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; 
&lt;span class="c"&gt;## - Name Match: The names are an exact match, so 1 point is awarded.&lt;/span&gt;
&lt;span class="c"&gt;## - Address Similarity: The addresses have a minor discrepancy in the street number, but the rest of the address is identical, so 1 point is awarded.&lt;/span&gt;
&lt;span class="c"&gt;## - Postcode Consistency: The postcodes are different, indicating a potential mismatch, so no points are awarded.&lt;/span&gt;
&lt;span class="c"&gt;## - State Agreement: The states match, so 1 point is awarded.&lt;/span&gt;
&lt;span class="c"&gt;## - Overall Cohesion: There is no additional information to support a match, so no points are awarded.&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## Score: 3")&lt;/span&gt;

&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template1&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; 
    &lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt4t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## GPT-4 Turbo Outputs&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("Justification: Starting with the Name Match, both records have the exact same name "deakin sondergeld," which earns them 1 point. For Address Similarity, although both addresses are on Goldfinch Circuit in Kooltuo, Canterbury, the house numbers are significantly different (48 vs. 231), suggesting they might not be typographical errors, so no point is awarded here. The Postcode Consistency criterion is not met, as the postcodes are different (2776 vs. 2509), resulting in no point added. State Agreement is present, with both records listing "vic" as the state, adding 1 point. Lastly, the Overall Cohesion does not strongly suggest these are the same person due to significant address and postcode discrepancies, so no additional point is awarded.&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## Score: 2")&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system showed potential in reasoning about data similarities, offering a nuanced approach to score assignments. It grounds the model better, so the scores from different models are more consistent.&lt;/p&gt;

&lt;p&gt;However, the results were not as aligned with our deduplication goals as we had hoped: from time to time, the models decided to deduct points, even though the prompt states that points cannot be deducted.&lt;/p&gt;
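&lt;p&gt;Since the templates ask the model to finish with a line of the form "Score: N", the numeric score can be recovered from the raw response with a small regex. A minimal sketch (the helper name is ours, not part of PromptingTools.jl):&lt;/p&gt;

```julia
# Pull the final "Score: N" out of a judge response.
# `extract_score` is a hypothetical helper, not part of PromptingTools.jl.
function extract_score(response::AbstractString)
    m = match(r"Score:\s*(\d+)", response)
    isnothing(m) ? nothing : parse(Int, m.captures[1])
end

extract_score("Justification: names match...\n\nScore: 3")  # -> 3
extract_score("no score emitted")                           # -> nothing
```

In practice you would call it on `msg.content` and keep the justification text alongside the score for auditing.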

&lt;h2&gt;
  
  
  Method 2: Linguistic Calibration Scales
&lt;/h2&gt;

&lt;p&gt;Inspired by &lt;a href="https://arxiv.org/pdf/2305.14975.pdf"&gt;"Just Ask for Calibration"&lt;/a&gt; we adapted their approach using linguistic scales for better calibration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;## We copied the example from the Appendix and adapted it to our use case&lt;/span&gt;
&lt;span class="n"&gt;dedupe_template2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"""
        You're a world-class record linkage engineer. 

        Your task is to compare two contact records and guess whether they refer to the same person.

        Provide your best guess ("&lt;/span&gt;&lt;span class="n"&gt;Duplicate&lt;/span&gt;&lt;span class="s"&gt;" vs "&lt;/span&gt;&lt;span class="n"&gt;Not&lt;/span&gt; &lt;span class="n"&gt;duplicate&lt;/span&gt;&lt;span class="s"&gt;") and describe how likely it is that your guess is correct as one of the following expressions: "&lt;/span&gt;&lt;span class="n"&gt;Almost&lt;/span&gt; &lt;span class="n"&gt;Certain&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;Highly&lt;/span&gt; &lt;span class="n"&gt;Likely&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;Likely&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;Probably&lt;/span&gt; &lt;span class="n"&gt;Even&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;Unlikely&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;Highly&lt;/span&gt; &lt;span class="n"&gt;Unlikely&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;Almost&lt;/span&gt; &lt;span class="n"&gt;No&lt;/span&gt; &lt;span class="n"&gt;Chance&lt;/span&gt;&lt;span class="s"&gt;"

        Give ONLY the guess and your confidence, no other words or explanation. 

        For example:

        Guess: &amp;lt;most likely guess, as short as possible; not a complete sentence, just the guess!&amp;gt;
        Confidence: &amp;lt;description of confidence, without any extra
        commentary whatsoever; just a short phrase!&amp;gt;
                """&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"""
Are the following two records duplicates?

# Record 1

{{record1}}

# Record 2

{{record2}}
            """&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template2&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; 
&lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt-3.5-turbo-1106"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## GPT-3 Turbo Outputs&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 269 @ Cost: \$0.0003 in 1.8 seconds&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("Guess: Not duplicate&lt;/span&gt;
&lt;span class="c"&gt;## Confidence: Likely")&lt;/span&gt;


&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template2&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt4t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## GPT-4 Turbo Outputs&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 268 @ Cost: \$0.0028 in 1.1 seconds&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("Guess: Duplicate&lt;/span&gt;
&lt;span class="c"&gt;## Confidence: Likely")&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As always, GPT-4 demonstrated a better understanding and provided more accurate responses, suggesting a stronger alignment with our deduplication requirements.&lt;/p&gt;

&lt;p&gt;Conversely, GPT-3.5 struggled with this approach, often delivering answers that deviated from our expectations. &lt;/p&gt;

&lt;p&gt;Overall, we see results similar to those in the original paper, but we lose the reasoning trace that would allow for potential audits.&lt;/p&gt;
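&lt;p&gt;If you want to aggregate these verbalized confidences downstream, one option is to map each expression to a rough probability, in the spirit of the paper's calibration setup. The numeric values below are illustrative assumptions on our part, not taken from the paper:&lt;/p&gt;

```julia
# Illustrative mapping from linguistic confidence expressions to rough
# probabilities; the exact values are our assumption, not from the paper.
const CONFIDENCE_MAP = Dict(
    "Almost Certain"   => 0.95,
    "Highly Likely"    => 0.85,
    "Likely"           => 0.70,
    "Probably Even"    => 0.50,
    "Unlikely"         => 0.30,
    "Highly Unlikely"  => 0.15,
    "Almost No Chance" => 0.05,
)

# Parse the two-line "Guess: ... / Confidence: ..." format from the template.
function parse_guess(response::AbstractString)
    guess = match(r"Guess:\s*(.+)", response)
    conf = match(r"Confidence:\s*(.+)", response)
    (isnothing(guess) || isnothing(conf)) && return nothing
    (guess = strip(guess.captures[1]),
     probability = get(CONFIDENCE_MAP, strip(conf.captures[1]), missing))
end

parse_guess("Guess: Not duplicate\nConfidence: Likely")
```

This gives you a numeric handle on the model's confidence while keeping the prompt itself purely linguistic.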

&lt;h2&gt;
  
  
  Method 3: Categorical Scoring
&lt;/h2&gt;

&lt;p&gt;Using a "traditional" categorical system where we define several categories and a point scale per category. We loosely follow the example in the &lt;a href="https://cookbook.openai.com/examples/evaluation/how_to_eval_abstractive_summarization"&gt;OpenAI cookbook&lt;/a&gt;. One difference is to limit the maximum points within each category - it's easier to explain and tends to bring more consistent results.&lt;/p&gt;

&lt;p&gt;Again, we asked GPT-4 to write the prompt for us:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"""
You're a professional record linkage engineer. 

Your task is to design clear evaluation criteria to compare a pair of contact details and judge whether they are duplicates or not.
Prepare a scoring system with 5 categories with 0-2 points each, where more points indicate higher likelihood of being duplicates. Maximum is 10 points.

Example contacts: 
- "&lt;/span&gt;&lt;span class="n"&gt;james&lt;/span&gt; &lt;span class="n"&gt;waller&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;living&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="n"&gt;tullaroop&lt;/span&gt; &lt;span class="n"&gt;street&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;willaroo&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt; &lt;span class="n"&gt;james&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Postcode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4011&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;WA&lt;/span&gt;&lt;span class="s"&gt;"
- "&lt;/span&gt;&lt;span class="n"&gt;lachlan&lt;/span&gt; &lt;span class="n"&gt;berry&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;living&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="mi"&gt;69&lt;/span&gt; &lt;span class="n"&gt;giblin&lt;/span&gt; &lt;span class="n"&gt;street&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;killarney&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bittern&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Postcode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4814&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;QLD&lt;/span&gt;&lt;span class="s"&gt;"
You can see here the available fields for the scoring system: name, address, postcode and state.

First, think through step by step what is a robust method to judge two potentially duplicate records and what the situations are in which two pairs of records refer to the same person but differ in various fields. Design your system around this knowledge.

Second, write a brief and concise explanation for your 10-point system.
"""&lt;/span&gt;
&lt;span class="c"&gt;## remember to return the whole conversation, so you can iterate on it and improve it&lt;/span&gt;
&lt;span class="n"&gt;conv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"gpt4t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_all&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ultimately, we ended up with the following prompt (after a few inline edits):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;dedupe_template3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SystemMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;"""
        You're a world-class record linkage engineer. 

        Your task is to compare two contact records and score whether they refer to the same person (=are a duplicate).

        Apply the following scoring system to the two records.

        ### Duplicate Record Scoring System (0-10 Points)

**1. Name Matching:**
   - **2 points** for exact match.
   - **1 point** for partial match (nicknames, misspellings).
   - **0 points** for no match.

**2. Address Matching:**
   - **2 points** for exact match.
   - **1 point** for partial match (same street, minor errors).
   - **0 points** for no match.

**3. Postcode Matching:**
   - **2 points** for exact match.
   - **1 point** for first digits match.
   - **0 points** for no match.

**4. State Matching:**
   - **2 points** for exact match.
   - **1 point** for neighboring states or common errors.
   - **0 points** for no match.

**5. Other Fields (if available):**
   - **2 points** for exact match in fields like phone or email.
   - **1 point** for partial match.
   - **0 points** for no match or not available.

#### Guidelines
- **Maximum Score:** 10 points.
- **Higher Score:** Indicates higher likelihood of being duplicates.
- **Consider Context:** Adjust scoring based on the context and known data quality issues.

### Output Format

Record 1: &amp;lt;details of record 1&amp;gt;

Record 2: &amp;lt;details of record 2&amp;gt;

After detailed examination of the two records:

Justification: &amp;lt;justify the total score, go criterion by criterion. 100 words max&amp;gt;

Score: &amp;lt;total score&amp;gt;
                """&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserMessage&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"""
Record 1: {{record1}}

Record 2: {{record2}}

After detailed examination of the two records:

Justification:"""&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;

&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template3&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; 
    &lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt-3.5-turbo-1106"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## GPT-3.5 Turbo Outputs&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 565 @ Cost: \$0.0006 in 3.1 seconds&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("Name Matching: 2 points. The names are an exact match.&lt;/span&gt;
&lt;span class="c"&gt;## Address Matching: 1 point. The street name is similar, but the house numbers are different.&lt;/span&gt;
&lt;span class="c"&gt;## Postcode Matching: 0 points. The postcodes are completely different.&lt;/span&gt;
&lt;span class="c"&gt;## State Matching: 2 points. The states are an exact match.&lt;/span&gt;
&lt;span class="c"&gt;## Other Fields: 0 points. No other fields are available for comparison.&lt;/span&gt;

&lt;span class="c"&gt;## Score: 5 points")&lt;/span&gt;

&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aigenerate&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template3&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt4t"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;

&lt;span class="c"&gt;## GPT-4 Turbo Outputs&lt;/span&gt;
&lt;span class="c"&gt;##&lt;/span&gt;
&lt;span class="c"&gt;## [ Info: Tokens: 564 @ Cost: \$0.0073 in 7.8 seconds&lt;/span&gt;
&lt;span class="c"&gt;## AIMessage("Justification: Both records have an exact name match, earning 2 points. &lt;/span&gt;
&lt;span class="c"&gt;## The addresses have a partial match since they are on the same street but have different numbers, earning 1 point. &lt;/span&gt;
&lt;span class="c"&gt;## The postcodes do not match exactly or at the first digits, so they earn 0 points. &lt;/span&gt;
&lt;span class="c"&gt;## The state matches exactly, earning 2 points. &lt;/span&gt;
&lt;span class="c"&gt;## No other fields are provided for comparison. &lt;/span&gt;

&lt;span class="c"&gt;## Score: 5")&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is good! We have a clear scoring system and the results are consistent between GPT-3.5 and GPT-4.&lt;/p&gt;
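&lt;p&gt;With a numeric score in hand, the final duplicate/not-duplicate call reduces to a threshold. The cut-off of 7 below is an assumption to be tuned on labeled pairs, not a value from this article:&lt;/p&gt;

```julia
# Classify a 0-10 judge score; the threshold is a tunable assumption.
is_duplicate(score::Integer; threshold::Integer = 7) = score >= threshold

is_duplicate(5)                # the pair above scores 5 -> not a duplicate
is_duplicate(5; threshold = 5) # a looser cut-off would flag it
```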

&lt;p&gt;Let's test it on a few more records. &lt;/p&gt;

&lt;p&gt;We'll use structured extraction to make it easier to work with data in the DataFrame:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="s"&gt;"Apply the scoring system, go criterion by criterion, and justify your score. Maximum 10 points."&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="nc"&gt; DuplicateJudgement&lt;/span&gt;
    &lt;span class="n"&gt;justification&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aiextract&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template3&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt-3.5-turbo-1106"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DuplicateJudgement&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Evaluation
&lt;/h2&gt;

&lt;p&gt;Now, let's apply Method 3 to 100 random contacts, always judging the 3 closest candidates for each. We'll ignore self-consistency for now (e.g., whether swapping the order of the record and its candidate changes the verdict).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="c"&gt;## Utility functions&lt;/span&gt;
&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="nf"&gt; find_candidates&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;## Find the top k most similar records to the i-th record&lt;/span&gt;
    &lt;span class="n"&gt;dupe_idxs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sortperm&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="nd"&gt;@view&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="x"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;rev&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;# the first item is the record itself&lt;/span&gt;
    &lt;span class="n"&gt;dupe_idxs&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;.+&lt;/span&gt; &lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="x"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;function&lt;/span&gt;&lt;span class="nf"&gt; judge_duplicates&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text1&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text2&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;## when we make a lot of network calls, we will often get errors. Let's make sure we handle them gracefully&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;
        &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aiextract&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dedupe_template3&lt;/span&gt;&lt;span class="x"&gt;;&lt;/span&gt; &lt;span class="n"&gt;record1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text1&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;record2&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text2&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"gpt-3.5-turbo-1106"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;return_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DuplicateJudgement&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;http_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="x"&gt;(;&lt;/span&gt; &lt;span class="n"&gt;readtimeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="x"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
        &lt;span class="nd"&gt;@warn&lt;/span&gt; &lt;span class="s"&gt;"Failed to generate a judgement for &lt;/span&gt;&lt;span class="si"&gt;$(i) &lt;/span&gt;&lt;span class="s"&gt;and &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(dupe_idxs[1+i])"&lt;/span&gt;
        &lt;span class="nb"&gt;missing&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c"&gt;# We'll run our system for random 100 data points, pick the top 3 most similar records and judge them.&lt;/span&gt;
&lt;span class="n"&gt;rand_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Xoshiro&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;unique&lt;/span&gt; &lt;span class="o"&gt;|&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Base&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fix2&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;## Let's run the experiment -- this takes ~1-2 minutes&lt;/span&gt;
&lt;span class="n"&gt;df_dupes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nd"&gt;@chain&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
    &lt;span class="nd"&gt;@select&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;rec_id&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="n"&gt;eachindex&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="n"&gt;rand_ids&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
    &lt;span class="c"&gt;## find candidates&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;candidate_idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;find_candidates&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dists&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;flatten&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;candidate_idx&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;## bring the candidate data&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;rec_id_candidate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rec_id&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;candidate_idx&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob_candidate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;candidate_idx&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
    &lt;span class="c"&gt;## judge duplicates // we run them in parallel and just wait until they all finish&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Threads&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nd"&gt;@spawn&lt;/span&gt; &lt;span class="n"&gt;judge_duplicates&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;text_blob_candidate&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;## bring the true labels&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;is_duplicate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="s"&gt;"(\d+)"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;rec_id&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;captures&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="s"&gt;"(\d+)"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;rec_id_candidate&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;captures&lt;/span&gt;&lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="x"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c"&gt;## Let's check if all tasks are done&lt;/span&gt;
&lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="n"&gt;istaskdone&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df_dupes&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, let's analyze the results. As a reminder, the best-case scenario would be to find a duplicate for each record, i.e., 100 duplicates in total.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="nd"&gt;@chain&lt;/span&gt; &lt;span class="n"&gt;df_dupes&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fetch&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dropmissing&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PT&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call_cost&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"gpt-3.5-turbo-1106"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;
    &lt;span class="nd"&gt;@aside&lt;/span&gt; &lt;span class="nd"&gt;@info&lt;/span&gt; &lt;span class="s"&gt;"Number of duplicates found: &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(count(_.is_duplicate))/&lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;(length(rand_ids)), Total cost: \&lt;/span&gt;&lt;span class="si"&gt;$$&lt;/span&gt;&lt;span class="s"&gt;(sum(_.cost))"&lt;/span&gt;
    &lt;span class="nd"&gt;@by&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;is_duplicate&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score_std&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Info: Number of duplicates found: 100/100, Total cost: \$0.193309
2×3 DataFrame
 Row │ is_duplicate  score    score_std 
     │ Bool          Float64  Float64   
─────┼──────────────────────────────────
   1 │        false     2.23    1.76996
   2 │         true     5.86    1.93855
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We successfully identified all duplicates, clearly distinguishing them based on their scores. &lt;/p&gt;
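With group means around 2.2 (non-duplicates) and 5.9 (duplicates), a simple cutoff turns the judge's score into a yes/no decision. Here is a minimal sketch, assuming a hypothetical threshold of 4 (roughly halfway between the two means; nothing in the method prescribes this value):

```julia
# Hypothetical cutoff halfway between the group means reported above (~2.2 vs ~5.9)
is_dupe(score; threshold=4) = score >= threshold

# Illustrative judge scores for five candidate pairs
scores = [2, 6, 1, 7, 5]
flags = is_dupe.(scores)   # broadcast the decision over all pairs
```

In practice, you would tune the threshold on a handful of labeled pairs to trade off precision against recall.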

&lt;p&gt;Let's visualize the distribution of scores: duplicates score consistently higher than non-duplicates, and the two groups are well separated.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;StatsPlots&lt;/span&gt;

&lt;span class="n"&gt;pl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nd"&gt;@chain&lt;/span&gt; &lt;span class="n"&gt;df_dupes&lt;/span&gt; &lt;span class="k"&gt;begin&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fetch&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dropmissing&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@rtransform&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;judgement&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;
    &lt;span class="nd"&gt;@df&lt;/span&gt; &lt;span class="n"&gt;boxplot&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;is_duplicate&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Score"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Is duplicate?"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Scores from the Auto-Judge"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;yformatter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;round&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Int&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="x"&gt;),&lt;/span&gt; &lt;span class="n"&gt;legend&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dpi&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;xticks!&lt;/span&gt;&lt;span class="x"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="x"&gt;],&lt;/span&gt; &lt;span class="x"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Not duplicate"&lt;/span&gt;&lt;span class="x"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Duplicate"&lt;/span&gt;&lt;span class="x"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://forem.julialang.org/images/oDMBVgWepULg160nv5GFsHnHrCfmXCxDamvDtFB9AYc/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzIw/NTB1aWNjZmRnYmpo/bGR2NDlwLnBuZw" class="article-body-image-wrapper"&gt;&lt;img src="https://forem.julialang.org/images/oDMBVgWepULg160nv5GFsHnHrCfmXCxDamvDtFB9AYc/rt:fit/w:800/g:sm/q:0/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L2FydGljbGVzLzIw/NTB1aWNjZmRnYmpo/bGR2NDlwLnBuZw" alt="Distribution of Scores from the LLM-Judge" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost-Efficiency?
&lt;/h2&gt;

&lt;p&gt;Amazingly, the entire process cost just about $0.20 for 300 calls, demonstrating the method's affordability and efficiency.&lt;/p&gt;
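The arithmetic behind that claim is easy to check (numbers taken from the run above):

```julia
total_cost = 0.193309        # USD, as reported by the @info line above
n_calls    = 3 * 100         # 3 candidate comparisons for each of 100 records
cost_per_call = total_cost / n_calls   # well under a tenth of a cent per judged pair
```

At that rate, even judging thousands of candidate pairs stays in the single-dollar range.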

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Our exploration demonstrated three diverse approaches to crafting scoring criteria for LLM judges in data deduplication. While Method 3 proved most effective for our needs, you might find that the other methods better suit your specific scenarios. This journey underscores the power and versatility of the LLM-as-a-Judge pattern, opening doors to numerous practical applications in business.&lt;/p&gt;

&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>generativeai</category>
      <category>prompting</category>
    </item>
    <item>
      <title>AIHelpMe.jl: AI-Enhanced Coding Assistance for Julia</title>
      <dc:creator>Jan Siml</dc:creator>
      <pubDate>Tue, 23 Jan 2024 09:07:55 +0000</pubDate>
      <link>https://forem.julialang.org/svilupp/aihelpmejl-ai-enhanced-coding-assistance-for-julia-42a2</link>
      <guid>https://forem.julialang.org/svilupp/aihelpmejl-ai-enhanced-coding-assistance-for-julia-42a2</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/svilupp/AIHelpMe.jl"&gt;AIHelpMe&lt;/a&gt;, a new Julia package, transforms your existing docstrings into an interactive AI-powered guide, offering personalized insights directly from your code's documentation. It's in the early stages and seeks community feedback, promising a unique, low-cost way to interact with your documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Announcing AIHelpMe Pre-Release
&lt;/h2&gt;

&lt;p&gt;Welcome to &lt;a href="https://github.com/svilupp/AIHelpMe.jl"&gt;AIHelpMe&lt;/a&gt;, a new Julia package that transforms your detailed docstrings into a rich source of insights. It's not about writing code for you; rather, it's about shining a light on the valuable documentation you and others have already created. Think of it as having a chat with your code's documentation, enhanced by AI's clever touch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Motivation and Value
&lt;/h2&gt;

&lt;p&gt;Why write great docstrings? AIHelpMe gives you a compelling reason, turning them into an interactive, insightful guide. It's a subtle, yet powerful way to connect your queries with tailored, documentation-driven answers.&lt;/p&gt;

&lt;p&gt;There are a few things that set this package apart from generic chatbots:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct Access to Your Work&lt;/strong&gt;: AIHelpMe uniquely utilizes the latest information and modules directly from your laptop, ensuring up-to-date and relevant assistance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full Control Over Searches&lt;/strong&gt;: Tailor your search scope and methods with AIHelpMe, aligning AI insights precisely with your needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contextual Understanding&lt;/strong&gt;: Go beyond typical chatbot responses; AIHelpMe offers deep insights, revealing the sources behind each answer, so you can continue your research 🧠📚&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Simply add AIHelpMe to your Julia environment (not registered yet) and get ready to interact with your code's documentation in a whole new way. Remember, API keys from Cohere and OpenAI are required, but the cost per query is just a tiny fraction of a cent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;Pkg&lt;/span&gt;
&lt;span class="n"&gt;Pkg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="x"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://github.com/svilupp/AIHelpMe.jl"&lt;/span&gt;&lt;span class="x"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;AIHelpMe&lt;/span&gt; &lt;span class="c"&gt;# automatically some downloads pre-processed documentation&lt;/span&gt;

&lt;span class="c"&gt;# alias gpt3t sends the query to the latest GPT-3.5 Turbo&lt;/span&gt;
&lt;span class="n"&gt;aihelp&lt;/span&gt;&lt;span class="s"&gt;"In Julia, how to create a named tuple from a dictionary? Give me an example"&lt;/span&gt;&lt;span class="n"&gt;gpt3t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Info: Done generating response. Total cost: $0.002
AIMessage("You can use the splatting operator to create a named tuple from a dictionary in Julia. Here's an example:

d = Dict(:a =&amp;gt; 1, :b =&amp;gt; 2)
nt = (; d...)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
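The suggested splatting trick is easy to verify locally. One caveat worth knowing: `Dict` iteration order is unspecified, so the field order of the resulting named tuple may vary:

```julia
d = Dict(:a => 1, :b => 2)
nt = (; d...)              # splat the key => value pairs as keyword arguments
# Field access by name works regardless of the field order:
nt.a + nt.b
```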



&lt;p&gt;You can also ask GPT-4 Turbo to weigh in on your question (notice the "!" in &lt;code&gt;aihelp!"..."&lt;/code&gt;) when you need deeper insights. For example, you can ask GPT-4 Turbo to explain the constructors for &lt;code&gt;NamedTuple&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;aihelp!&lt;/span&gt;&lt;span class="s"&gt;"What other constructors does NamedTuple have?"&lt;/span&gt;&lt;span class="n"&gt;gpt4t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Info: Tokens: 1023 @ Cost: \$0.0154 in 30.7 seconds
AIMessage("Aside from creating named tuples from pairs or a dictionary, `NamedTuple` offers other constructors as well:

1. **Tuple literal with keys**:

   nt = (a=1, b=2)


2. **Tuple literal with semicolon**:

   nt = (; a=1, b=2)


3. **Using `NamedTuple` type as constructor**:

   nt = NamedTuple{(:a, :b)}((1, 2))


4. **With specified types**:

   nt = NamedTuple{(:a, :b), Tuple{Int, Int}}((1, 2))


5. **From another named tuple or iterable**:
   You can create a new named tuple by splatting an existing named tuple or any iterable of pairs:

   existing_nt = (a=1, b=2)
   nt = NamedTuple((:c =&amp;gt; 3, :d =&amp;gt; 4, existing_nt...))

   &amp;lt;Author: this example is incorrect. See the next section&amp;gt;


These constructors allow for flexibility in creating `NamedTuple`s programmatically or from existing data structures in Julia.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: Nested Julia code fences have been manually removed to enable correct parsing on Forem.&lt;/p&gt;

&lt;h2&gt;
  
  
  LLMs will make mistakes...
&lt;/h2&gt;

&lt;p&gt;... and that's okay! It's usually easy to check and iterate, so the overall workflow still ends up being faster and easier.&lt;/p&gt;

&lt;p&gt;The original example used the default chat model and contained a mistake, as pointed out by @oxinabox. We switched to the "gpt3t" model for better performance. The prefix and suffix also changed, but that was not required; it's simply a habit in how I write prompts and questions for LLMs.&lt;/p&gt;

&lt;p&gt;Similarly, the last constructor in the examples from GPT4 Turbo throws an error. We can ask the LLM to fix it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight julia"&gt;&lt;code&gt;&lt;span class="n"&gt;aihelp&lt;/span&gt;&lt;span class="s"&gt;"How to fix `nt = NamedTuple((:c =&amp;gt; 3, :d =&amp;gt; 4, existing_nt...))`. I get error &lt;/span&gt;&lt;span class="si"&gt;$&lt;/span&gt;&lt;span class="s"&gt;err"&lt;/span&gt;&lt;span class="n"&gt;gpt4t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Info: Done generating response. Total cost: $0.002
AIMessage("You can fix the code snippet by using the `merge` function properly, to merge the `NamedTuple` with the key-value pairs:

existing_nt = (a=1, b=2)
nt = merge(existing_nt, (c=3,), (d=4,))


This code merges the existing named tuple `existing_nt` with two additional key-value pairs, `(c=3,)` and `(d=4,)`. The result `nt` will be a named tuple including all four key-value pairs.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Pre-Release Testing
&lt;/h2&gt;

&lt;p&gt;As AIHelpMe is in its early stages, we're eager for community involvement to test and refine its capabilities. &lt;/p&gt;

&lt;p&gt;Is it valuable? What are its limitations?&lt;/p&gt;

&lt;p&gt;Your feedback is invaluable in shaping this tool’s future, so join us on this journey!&lt;/p&gt;




&lt;p&gt;Credit for the title image goes to DALL-E 3.&lt;/p&gt;

&lt;p&gt;Thanks to @oxinabox for pointing out that the original example had an error in it!&lt;/p&gt;

</description>
      <category>generativeai</category>
      <category>genai</category>
      <category>help</category>
    </item>
  </channel>
</rss>
